How to cite this WIREs title:
WIREs Comp Stat

Hierarchical clustering for histogram data


Clustering methods for classical data are well established, though the associated algorithms primarily focus on partitioning methods and agglomerative hierarchical methods. With the advent of massively large data sets, too large to be analyzed by traditional techniques, new paradigms are needed. Symbolic data methods form one solution to this problem. While symbolic data can be important and arise naturally in their own right, they are particularly relevant when faced with data that emerged from aggregation of (larger) data sets. One format is when the data are histogram‐valued in ℝ^p, instead of points in ℝ^p as in classical data. This paper looks at the problem of constructing hierarchies using a divisive polythetic algorithm based on dissimilarity measures derived for histogram observations. WIREs Comput Stat 2017, 9:e1405. doi: 10.1002/wics.1405
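
To make the idea of histogram-valued observations concrete, the following is a minimal sketch in Python of how classical points might be aggregated into histograms over shared bins and then compared through their cumulative distributions. The function names (to_histogram, cdf_dissimilarity), the bin edges, and the sample values are all hypothetical, and the width-weighted sum of absolute CDF differences used here is an illustrative stand-in, not the CDF dissimilarity defined in the paper.

```python
from bisect import bisect_right
from itertools import accumulate


def to_histogram(values, edges):
    """Aggregate raw values into relative frequencies over shared bin edges."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        if v < edges[0] or v > edges[-1]:
            continue  # ignore values outside the histogram's support
        # bisect_right locates the bin; clamp so v == edges[-1] lands in the last bin
        i = min(bisect_right(edges, v) - 1, len(counts) - 1)
        counts[i] += 1
    total = sum(counts)
    return [c / total for c in counts]


def cdf_dissimilarity(h1, h2, edges):
    """Illustrative measure: bin-width-weighted sum of |F1 - F2| at each upper bin edge."""
    F1 = list(accumulate(h1))  # cumulative relative frequencies of h1
    F2 = list(accumulate(h2))  # cumulative relative frequencies of h2
    widths = [b - a for a, b in zip(edges, edges[1:])]
    return sum(w * abs(a - b) for w, a, b in zip(widths, F1, F2))


# Hypothetical data: two groups of classical points aggregated on common bins.
edges = [0, 1, 2, 3]
hist_a = to_histogram([0.2, 0.5, 1.5, 2.5], edges)
hist_b = to_histogram([1.1, 1.9, 2.2, 2.8], edges)
d_ab = cdf_dissimilarity(hist_a, hist_b, edges)
```

A matrix of such pairwise values over all histogram observations is the kind of input a divisive algorithm would work from when deciding how to split a cluster; the splitting rule itself is not sketched here because the abstract does not specify it.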

Simulated data: all S = 5 data sets.
Validity indices for diabetes data.
Hierarchy tree for diabetes data: (a) CDF, (b) Euclidean Ichino–Yaguchi, (c) extended Gowda–Diday and (d) extended de Carvalho.
Hierarchy tree for simulated data: (a) CDF and (b) Euclidean Ichino–Yaguchi.

Browse by Topic

Data Mining > Clustering and Classification
Statistical Methods > Statistical Theory and Applications
