Statistical model-selection criteria provide answers to the questions, "How much improvement in fit should be achieved to justify the inclusion of an additional parameter in a model, and on what scale should this improvement in fit be measured?" Mathematically, statistical model-selection criteria are defined as estimates of suitable functionals of the probability distributions corresponding to alternative models. This paper discusses different approaches to model-selection criteria, with a view toward illuminating their similarities and differences. The approaches discussed range from explicit, small-sample criteria for highly specific problems to general, large-sample criteria such as Akaike's information criterion and variants thereof. Special emphasis is given to criteria derived from a Bayesian approach, as this presents a unified way of viewing a variety of criteria. In particular, the approach to model-selection criteria by asymptotic expansion of the log posterior probabilities of alternative models is reviewed. An information-theoretic approach to model selection, through minimum-bit data representation, is explored. The similarity between the asymptotic form of Rissanen's criterion, obtained from the minimum-bit data representation approach, and criteria derived from a Bayesian approach is discussed.

KEY WORDS: model selection, model evaluation, Akaike's information criterion, AIC, Schwarz's criterion, Kashyap's criterion, Bayesian inference, posterior probabilities
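
As an illustration of the opening question (a sketch that is not taken from the paper itself), large-sample criteria such as AIC and Schwarz's criterion are commonly written in the penalized form -2 log L + a(n) m, where L is the maximized likelihood, m is the number of free parameters, and a(n) is the cost charged per parameter: a(n) = 2 for AIC and a(n) = ln n for Schwarz's criterion. The function names and the log-likelihood values below are hypothetical and chosen only to show how the two penalties can disagree.

```python
import math


def penalized_criterion(log_lik, m, a):
    """Return -2 * log_lik + a * m for a model with m free parameters."""
    return -2.0 * log_lik + a * m


def aic(log_lik, m):
    """Akaike's information criterion: penalty of 2 per parameter."""
    return penalized_criterion(log_lik, m, 2.0)


def schwarz(log_lik, m, n):
    """Schwarz's criterion: penalty of ln(n) per parameter, n = sample size."""
    return penalized_criterion(log_lik, m, math.log(n))


def preferred(scores):
    """Return the name of the model with the smallest criterion value."""
    return min(scores, key=scores.get)


# Hypothetical maximized log-likelihoods for two nested models, n = 100.
# The larger model improves -2 log L by 4 at the cost of one extra parameter.
n = 100
models = {
    "small": {"log_lik": -152.0, "m": 3},
    "large": {"log_lik": -150.0, "m": 4},
}

aic_scores = {name: aic(v["log_lik"], v["m"]) for name, v in models.items()}
schwarz_scores = {
    name: schwarz(v["log_lik"], v["m"], n) for name, v in models.items()
}

print("AIC prefers:", preferred(aic_scores))
print("Schwarz prefers:", preferred(schwarz_scores))
```

Under these assumed numbers, an improvement of 4 in -2 log L exceeds the AIC cost of 2 for the extra parameter but falls short of the Schwarz cost of ln 100 (about 4.6), so the two criteria select different models.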