Home
This Title All WIREs
WIREs RSS Feed
How to cite this WIREs title:
WIREs Comp Stat

Community detection in complex networks: From statistical foundations to data science applications

Full article on Wiley Online Library:   HTML PDF

Can't access this content? Tell your librarian.

Abstract Identifying and tracking community structures in complex networks are one of the cornerstones of network studies, spanning multiple disciplines, from statistics to machine learning to social sciences, and involving even a broader range of application areas, from biology to politics to blockchain. This survey paper aims to provide an overview of some most popular approaches in statistical network community detection as well as the newly emerging research directions such as community extraction with higher‐order features and community discovery in multilayer and multiscale networks. Our goal is to offer a unified view at methodological interconnections and the wide spectrum of interdisciplinary data science applications of network community analysis. This article is categorized under: Data: Types and Structure > Graph and Network Data Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification
Communities in social interaction networks. (a) Community structure in Zachary's Karate Club network. (b) Community structure among Twitter users sharing the hashtags #BBC and #FoxNews. Reprinted (Twitter) figure with permission from Weng et al. (2013)
[ Normal View | Magnified View ]
The five zones of the Italian power grid network. (a) Spectral clustering with k‐means, (b) spectral clustering with k‐depth with L1, (c) spectral clustering with k‐depth with MhD. The k‐way expansion, a performance metric in spectral clustering (a lower k‐way expansion indicates a better clustering algorithm), of these three algorithms are 1.95%, 1.65%, and 1.75%, respectively. Reprinted figure with permission from Dey et al. (2017)
[ Normal View | Magnified View ]
Clustering with the presence of outliers (in red). Green stars represent community centers yielded by k‐means (left panel) and depth‐based clustering (right panel). Reprinted figure with permission from (Tian & Gel, 2019)
[ Normal View | Magnified View ]
Examples of the music style (left) and tag (right) hypergraphs. Reprinted figure with permission from Wang et al. (2009)
[ Normal View | Magnified View ]
A hypergraph with four hyperedges: E = {{v1, v2, v3, v6}, {v2, v3, v4, v5}, {v6, v7, v8, v9}, {v5, v8, v9}}. Reprinted figure with permission from Gao et al. (2015)
[ Normal View | Magnified View ]
Multiscale brain networks. The networks are organized across multiple spatio‐temporal scales and also can be analyzed at topological (network) scales ranging from individual nodes to the network as a whole. Reprinted (adapted) figure with permission from Garcia et al. (2018)
[ Normal View | Magnified View ]
Communities in a multilayer network. Reprinted figure with permission from Ashourvan et al. (2019)
[ Normal View | Magnified View ]
A multilayer network (left) with (unweighted) intralayer connections (solid lines) and (weighted, ω) interlayer connections (dashed curves) and its “supra‐adjacency matrix” adjacency matrix (right). Reprinted figure with permission from Bazzi et al. (2016) (Copyright ©2016 Society for Industrial and Applied Mathematics. All rights reserved)
[ Normal View | Magnified View ]
Clustering of a network based on directed motif M7. Reprinted figure with permission from Benson et al. (2016)
[ Normal View | Magnified View ]
All 13 connected three‐node directed motifs in a network. Reprinted figure with permission from Benson et al. (2016)
[ Normal View | Magnified View ]
Top row shows an initiator graph (K1) and its Kronecker product with itself. (a) Graph K1. (b) Intermediate stage. (c) Graph K2 = K1 ⊗ K1. Bottom row represents corresponding adjacency matrices. (d) Adjacency matrix of K1. (d) Adjacency matrix of K2 = K1 ⊗ K1. Reprinted figure with permission from Leskovec et al. (2010)
[ Normal View | Magnified View ]
Divisions of the karate club network based on estimated stochastic block model (SBM) with r = 2. (a) Uncorrected SBM and (b) corrected SBM. The size of each node is proportional to its degree. The node color reflects inferred group membership and the dashed line indicates the split observed in real life (truth). Reprinted figure with permission from Karrer and Newman (2011)
[ Normal View | Magnified View ]
(a) Block decomposition of an undirected network on 15 nodes (numbered from 1 to 15), where the blue circles mark the presence of connection, the empty circles mark the absence of connection. (b) Stochastic representation of block A in the network's block decomposition. In block A, each node having a 50% chance to form an edge with another node in Block A (stochastically equivalence), a 42% chance to form an edge with another node in block B, and an 80% chance to form an edge with another node in block C. Reprinted figure with permission from Pavlović et al. (2014)
[ Normal View | Magnified View ]
Stochastic block model (SBM) defines a probability distribution over networks Pr(G|θ), where θ is the SBM's parameters. If the values of θ are chosen one can draw instances from the corresponding distribution. In turn, for a given network G, the values of θ can be estimated using statistical inference that best explain or reproduce the observed pattern of connectivity. Reprinted figure with permission from Clauset (2013)
[ Normal View | Magnified View ]

Browse by Topic

Statistical Learning and Exploratory Methods of the Data Sciences > Clustering and Classification
Data: Types and Structure > Graph and Network Data

Access to this WIREs title is by subscription only.

Recommend to Your
Librarian Now!

The latest WIREs articles in your inbox

Sign Up for Article Alerts