How to cite this WIREs title:
WIREs Data Mining Knowl Discov
Impact Factor: 2.541

Robust clustering


Abstract

Historical and recent developments in the field of robust clustering and their applications are reviewed. The discussion focuses on the different strategies that have been developed to reduce the sensitivity of clustering methods to outliers in the data, while pointing out the need for both efficient partitioning and simultaneous robust model fitting. Although all clustering methods and algorithms have good partitioning capabilities when the data are clean and free of outliers, they break down in the presence of outliers. This is because classical development in the field of clustering has rested on the assumptions that the data are free of noise and well distributed. Robust model fitting, while retaining the partitioning power, involves the development of methods and algorithms that drop these classical assumptions, either by explicitly incorporating robust statistical methods (often regression based) or by recasting the clustering problem in a way that does so implicitly. In this review, the robust model fitting aspect is identified in pertinent methodological and algorithmic advances and tied to related developments in robust statistics wherever possible. The paper also includes representative samples of various applications of robust clustering methods to both synthetic and real-world datasets. © 2011 Wiley Periodicals, Inc.

This article is categorized under:
Algorithmic Development > Structure Discovery
Technologies > Structure Discovery and Clustering

Univariate sample of ten points (also known as the Cushny and Peebles data) showing the mean, median, and trimmed mean of the sample.
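The contrast the figure illustrates is easy to reproduce. A minimal sketch, using the ten paired differences commonly quoted for the Cushny and Peebles sleep data (the values here are my assumption of the sample the figure uses, not taken from the article); the single large value 4.6 acts as an outlier that pulls the mean but not the median or trimmed mean:

```python
import statistics

# Ten paired differences as commonly quoted for Cushny and Peebles (1905).
x = sorted([1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4])

mean = sum(x) / len(x)                   # pulled toward the outlier 4.6
median = statistics.median(x)            # resistant to the outlier
trimmed = sum(x[1:-1]) / len(x[1:-1])    # 10% trimmed mean: drop min and max
```

With these values the mean is 1.58, while the median (1.3) and trimmed mean (1.4) stay close to the bulk of the sample.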


Attraction subtree for a cluster in the Iyer data.


(a) Image segmentation of ‘Autumn Leaves’ using noise clustering; left: original image; center: segmentation using FCM; right: segmentation using noise clustering. (Reprinted with permission from Elsevier.) (b) Image segmentation of MRI scan images using noise clustering; left: original image; center: segmentation using FCM; right: segmentation using noise clustering. (Reprinted with permission from Elsevier. Copyright 2006 Elsevier)
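The noise clustering idea behind these segmentations can be sketched compactly: fuzzy c-means is augmented with a virtual noise prototype that sits at a fixed distance delta from every point, so outliers shed membership to the noise cluster instead of dragging the good cluster centers. This is my own minimal implementation of that idea, not the article's code; the one-dimensional data, delta, and initialization are illustrative assumptions:

```python
import numpy as np

def noise_fcm(X, init_centers, delta, m=2.0, n_iter=100):
    """Fuzzy c-means with a noise cluster: a virtual prototype at fixed
    distance delta from every point absorbs membership from outliers."""
    centers = np.asarray(init_centers, dtype=float).copy()
    p = 1.0 / (m - 1.0)
    for _ in range(n_iter):
        d2 = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) ** 2
        d2 = np.maximum(d2, 1e-12)            # guard against zero distance
        inv = d2 ** (-p)                      # d^(-2/(m-1)) per good cluster
        u = inv / (inv.sum(axis=1) + delta ** (-2.0 * p))[:, None]
        w = u ** m                            # fuzzifier-weighted memberships
        centers = (w.T @ X) / w.sum(axis=0)[:, None]
    u_noise = 1.0 - u.sum(axis=1)             # membership in the noise cluster
    return centers, u, u_noise

# Two tight 1-D clusters plus one gross outlier at 50.
X = np.array([[0.0], [0.5], [1.0], [10.0], [10.5], [11.0], [50.0]])
centers, u, u_noise = noise_fcm(X, init_centers=[[0.0], [10.0]], delta=5.0)
```

Here the outlier ends up with nearly all of its membership in the noise cluster, so the two centers stay near 0.5 and 10.5 instead of being pulled toward 50.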


Monochrome image segmentation results using DBSCAN. (Reprinted with permission from Ref 114. Copyright 2005 Stein and Busch.)


Color image segmentation results. (a) Input images; segmentation using (b) k‐means, (c) spectral clustering, and (d) robust path‐based spectral clustering. (Reprinted with permission from Elsevier. Copyright 2008 Elsevier)


Core, border, and outlier points, as defined by densities in DBSCAN.
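The three point types in the figure follow directly from DBSCAN's density definitions: a core point has at least min_pts neighbors within radius eps (itself included); a border point is not core but lies in the eps-neighborhood of some core point; everything else is an outlier. A minimal sketch of just this classification step (the data, eps, and min_pts below are illustrative assumptions):

```python
import numpy as np

def dbscan_point_types(X, eps, min_pts):
    """Label each point 'core', 'border', or 'outlier' per DBSCAN's
    density definitions (no cluster assignment, just the point types)."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    neigh = d <= eps                                  # eps-neighborhood, self included
    core = neigh.sum(axis=1) >= min_pts               # dense enough to be core
    border = ~core & (neigh & core[None, :]).any(axis=1)  # near a core point
    return np.where(core, "core", np.where(border, "border", "outlier"))

# Four tightly packed points, one fringe point, one far-away point.
X = np.array([[0, 0], [0.1, 0], [0, 0.1], [0.1, 0.1], [0.35, 0], [5, 5]],
             dtype=float)
labels = dbscan_point_types(X, eps=0.3, min_pts=4)
```

The all-pairs distance matrix makes this O(n²) in memory, which is fine for a sketch; practical implementations use a spatial index for the neighborhood queries.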


Chameleon clustering methodology. First, finer subdivisions in the data are identified using k‐nearest neighbor graphs, followed by cluster merging using measures of relative closeness and relative interconnectivity [Eqs (1) and (2)]. (Reprinted with permission from The Institute of Electrical and Electronics Engineers. Copyright 1999 IEEE)


Illustration of weight functions showing no rejection, hard rejection, and smooth rejection of outliers.


Weight as a function of the scaled residual for (a) the Huber M‐estimator with a = 1 [Eq. (2)], and (b) Hampel's three‐part redescending M‐estimator with a = 1, b = 3, and c = 5 [Eq. (3)].

