SCIPRESS J. Japan Statist. Soc.

J. Japan Statist. Soc., Vol. 32 (No. 1), pp. 15-42, 2002

A Knowledge-Based Variable Selection Method for Box-Cox Transformation

Haruo Onishi

Abstract. In practical applications of regression analysis, users face two difficult problems: finding the most appropriate functional form, and searching for the best subset of a given set of all possible explanatory variables. Variable selection for the Box-Cox transformation may help solve both problems concurrently. The purpose of this paper is to (1) formulate concretely the j-th OLS-best subset problem for the Box-Cox transformation, (2) introduce a knowledge-based computational method to solve it, and (3) propose, as a solution to a variable selection problem for the Box-Cox transformation, either the solution to the (first) OLS-best subset problem (j = 1) or one that the user selects from among the solutions to the first j OLS-best subset problems (j > 1) solved in a single computer run. The integer j is specified by the user and depends on the user's scientific knowledge, criteria for statistical and data-analytic tests, and model-building experience.
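To make the combined problem concrete, the following is a minimal sketch of choosing a functional form and a variable subset together. It is not the knowledge-based method of the paper. It assumes the standard Box-Cox transform applied to a strictly positive dependent variable only, a brute-force search over all subsets of a fixed size, and a ranking by the profile log-likelihood (which includes the Jacobian term so that fits with different values of lambda are comparable). The abstract does not state the criterion that defines "OLS-best", so this ranking rule and the names box_cox, profile_loglik and rank_subsets are illustrative assumptions.

```python
from itertools import combinations

import numpy as np


def box_cox(y, lam):
    """Standard Box-Cox transform of a strictly positive array."""
    y = np.asarray(y, dtype=float)
    return np.log(y) if lam == 0 else (y ** lam - 1.0) / lam


def profile_loglik(y, X, lam):
    """Profile log-likelihood of lam for OLS of box_cox(y, lam) on X plus an intercept."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    z = box_cox(y, lam)
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, z, rcond=None)
    rss = np.sum((z - A @ beta) ** 2)
    # The second term is the log-Jacobian of the transform.
    return -0.5 * n * np.log(rss / n) + (lam - 1.0) * np.sum(np.log(y))


def rank_subsets(y, X, lambdas, size):
    """Rank every column subset of the given size, best first.

    Each entry is (log-likelihood, best lambda on the grid, column indices).
    """
    results = []
    for cols in combinations(range(X.shape[1]), size):
        sub = X[:, list(cols)]
        ll, lam = max((profile_loglik(y, sub, lam), lam) for lam in lambdas)
        results.append((ll, lam, cols))
    return sorted(results, reverse=True)
```

Under these assumptions, the first j entries of the list returned by rank_subsets play the role of the j candidate solutions from which a user would choose. Exhaustive enumeration is only feasible for a small number of candidate variables.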

Key words and phrases: Regression analysis, Box-Cox transformation, variable selection, j-th best subset problem, variable classification, meaningful subset, practically best regression equation, intellectual statistical system OEPP.
