How to cite this WIREs title:
WIREs Comp Stat

Community detection in large‐scale networks: a survey and empirical evaluation


Community detection is a common problem in graph data analytics that consists of finding groups of densely connected nodes with few connections to nodes outside of the group. In particular, identifying communities in large‐scale networks is an important task in many scientific domains. In this review, we evaluated eight state‐of‐the‐art and five traditional algorithms for overlapping and disjoint community detection on large‐scale real‐world networks with known ground‐truth communities. These 13 algorithms were empirically compared using goodness metrics that measure the structural properties of the identified communities, as well as performance metrics that evaluate these communities against the ground truth. Our results show that these two types of metrics are not equivalent. That is, an algorithm may perform well in terms of goodness metrics, but poorly in terms of performance metrics, or vice versa.

WIREs Comput Stat 2014, 6:426–439. doi: 10.1002/wics.1319

This article is categorized under:
Algorithms and Computational Methods > Algorithms
Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification
Data: Types and Structure > Graph and Network Data
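
The distinction the abstract draws between goodness metrics and performance metrics can be illustrated with a minimal sketch. The abstract does not name the specific metrics used, so the choices here are assumptions made purely for illustration: conductance stands in for a goodness metric (lower means fewer connections leaving the group) and the F1 score stands in for a performance metric (higher means closer agreement with the ground truth). The toy graph, the function names, and the ground-truth set are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical toy representation: undirected graph as adjacency sets.
Graph = Dict[int, Set[int]]


def conductance(graph: Graph, community: Set[int]) -> float:
    """Illustrative goodness metric: cut edges divided by the smaller
    of the community's volume and the volume of the rest of the graph."""
    cut = 0     # edge endpoints that leave the community
    volume = 0  # sum of degrees of the community's nodes
    for u in community:
        for v in graph[u]:
            volume += 1
            if v not in community:
                cut += 1
    total_volume = sum(len(neighbors) for neighbors in graph.values())
    denominator = min(volume, total_volume - volume)
    return cut / denominator if denominator > 0 else 0.0


def f1_score(found: Set[int], truth: Set[int]) -> float:
    """Illustrative performance metric: harmonic mean of precision and
    recall of a found community against a ground-truth community."""
    overlap = len(found & truth)
    if overlap == 0:
        return 0.0
    precision = overlap / len(found)
    recall = overlap / len(truth)
    return 2 * precision * recall / (precision + recall)


# Two triangles (0-1-2 and 3-4-5) joined by the single edge 2-3.
graph: Graph = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}

found = {0, 1, 2}  # a structurally tight group (one triangle)
truth = {2, 3}     # assumed ground truth that ignores that structure

print("conductance of found community:", conductance(graph, found))
print("conductance of ground truth:   ", conductance(graph, truth))
print("F1 of found vs. ground truth:  ", f1_score(found, truth))
```

In this toy case the found community has a lower (better) conductance than the ground-truth community, yet its F1 score against that ground truth is low, which is the kind of disagreement between the two metric families that the abstract reports.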
Goodness metrics for overlapping community detection. Missing boxplots indicate that the corresponding algorithm did not finish on that graph within 4 h.
Run‐times of the disjoint and overlapping community detection algorithms (including read and write times) per graph. The algorithms were terminated if they did not finish within 4 h. For the nondeterministic algorithms, the average of 10 run‐times was taken.
Matrices of pairwise similarity scores for the community detection algorithms and the ground‐truth. Dots indicate that a similarity score could not be computed, because one of the algorithms did not finish on that graph within 4 h.
Goodness metrics for disjoint community detection. Missing boxplots indicate that the corresponding algorithm did not finish on that graph within 4 h.

