How to cite this WIREs title:
WIREs Comp Stat

Challenges in model‐based clustering

Abstract

Model‐based clustering is an increasingly popular area of cluster analysis that relies on a probabilistic description of data by means of finite mixture models. Mixture distributions prove to be a powerful technique for modeling heterogeneity in data. In model‐based clustering, each data group is seen as a sample from one or several mixture components. Despite its attractive interpretation, model‐based clustering poses many challenges. This paper discusses some of the most important problems a researcher might encounter while applying model‐based cluster analysis.

WIREs Comput Stat 2013, 5:135–148. doi: 10.1002/wics.1248

This article is categorized under:
Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification
Statistical and Graphical Methods of Data Analysis > Density Estimation

Performance of the EM algorithm based on different initialization procedures, with the log‐likelihood value reached by the EM algorithm in each case: (a) ΣEM, −10,643; (b) emEM, −10,665; (c) mclust, −10,667; (d) k‐means & EM, −10,837.

Four solutions obtained by Rnd‐EM: (a) and (b) are nonspurious, (c) and (d) are spurious. Associated values: (a) 25.69, (b) 26.61, (c) 27.14, (d) 27.58.

Model‐based clustering with and without influential observations denoted as 1 and 2. Symbols represent the true assignment of data points simulated from a four‐component mixture model; colors represent obtained classification.

Bivariate datasets with three distinct clusters: (a) one variable is informative and the other is irrelevant for clustering, (b) both variables carry clustering information.
