# Bayesian model selection using the median probability model

Focus Article

Published Online: Mar 31 2015

DOI: 10.1002/wics.1352


In the Bayesian approach to model selection, models and model-specific parameters are treated as unknown quantities, and uncertainty about them is expressed through prior distributions. Given the observed data, the prior distribution is updated to the posterior distribution via Bayes' theorem. The posterior probability of a given model may be interpreted as the support it receives from the observed data. The highest probability model (HPM), which receives the maximum support from the data, is a possible choice for model selection. For large model spaces, Markov chain Monte Carlo (MCMC) algorithms are commonly used to estimate the posterior distribution over models. However, estimates of the posterior probabilities of individual models based on MCMC output are not reliable, because the number of MCMC samples is typically far smaller than the size of the model space. Thus, the HPM is difficult to estimate, and for large model spaces it often has a very small posterior probability. An alternative to the HPM is the median probability model (MPM) of Barbieri and Berger, which has been shown to be the optimal model for prediction under a squared error loss function, under certain conditions. In this article we review some of the conditions under which the MPM is optimal, and provide real data examples to evaluate the performance of the MPM under small and large model spaces. We also discuss the behavior of the MPM under collinearity.

WIREs Comput Stat 2015, 7:185–193. doi: 10.1002/wics.1352

This article is categorized under:

- Statistical and Graphical Methods of Data Analysis > Bayesian Methods and Theory
- Statistical and Graphical Methods of Data Analysis > Markov Chain Monte Carlo (MCMC)
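To make the distinction between the HPM and the MPM concrete, the following is a minimal sketch in Python, not code from the article. It assumes the standard definition of the MPM as the model containing exactly those predictors whose posterior inclusion probability is at least 1/2. The posterior probabilities are made-up numbers for three hypothetical predictors, and the function names are illustrative only.

```python
# Toy posterior over models for three candidate predictors (made-up numbers).
# Each key is an inclusion indicator vector: 1 means the predictor is in the model.
posterior = {
    (1, 0, 0): 0.30,
    (1, 1, 0): 0.27,
    (0, 1, 1): 0.23,
    (0, 1, 0): 0.20,
}


def highest_probability_model(posterior):
    """Return the model with the largest posterior probability (the HPM)."""
    return max(posterior, key=posterior.get)


def inclusion_probabilities(posterior):
    """Sum the posterior probabilities of all models containing each predictor."""
    n_vars = len(next(iter(posterior)))
    return [
        sum(prob for model, prob in posterior.items() if model[j])
        for j in range(n_vars)
    ]


def median_probability_model(posterior, threshold=0.5):
    """Keep the predictors whose inclusion probability reaches the threshold (the MPM)."""
    return tuple(int(p >= threshold) for p in inclusion_probabilities(posterior))


hpm = highest_probability_model(posterior)
incl = inclusion_probabilities(posterior)
mpm = median_probability_model(posterior)

print("HPM:", hpm)
print("Inclusion probabilities:", [round(p, 2) for p in incl])
print("MPM:", mpm)
```

In this toy posterior the two selections differ: the HPM keeps only the first predictor, while the MPM keeps the first two, because each of them appears in models carrying more than half of the total posterior mass. With MCMC output, the same inclusion probabilities could be estimated as the fraction of sampled models that contain each predictor, which pools information across many models rather than relying on the visit frequency of any single one.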