Community Detection — The Science of Finding Hidden Groups in Graphs

Introduction — Why Find "Communities"?

Real-world networks almost always contain clusters of closely connected nodes. In social networks, these are friend groups; on the Web, topically related pages; in protein interaction networks, functional modules; in citation networks, research fields. We call such densely interconnected groups of nodes communities.

Community Detection is the problem of partitioning the vertex set $V$ of a graph $G = (V, E)$ into groups that are "internally dense and externally sparse." While this sounds simple, deep mathematical and computational difficulties lie beneath the surface. This article traces the historical development of the field, explains the core ideas behind major algorithms, and explores why this problem is "fundamentally hard."

Historical Context — From Sociograms to Leiden

Community Detection has a nearly century-long history. The following timeline surveys the key milestones.

History of Community Detection Research

Key milestones from Moreno's sociogram (1927) to the Leiden algorithm (2019)

1927

Social Network Analysis Begins

Moreno invents the sociogram — the first attempt to visualize human relationships as graphs.

1970

Graph Partitioning

Kernighan-Lin algorithm (1970). Iteratively improves a bisection of a graph for circuit design. Assumes balanced partition.

2002

Girvan-Newman

Top-down method based on edge betweenness centrality. Revolutionized social network analysis.

2004

Modularity Q Formalized

Newman & Girvan define Modularity Q — establishing a quantitative criterion for good community structure.

2008

NP-Hardness Proved

Brandes et al. prove modularity maximization is NP-hard (published in IEEE TKDE). Exact solutions intractable for large graphs.

2007

Resolution Limit Discovered

Fortunato & Barthélemy discover the resolution limit. Small communities become undetectable depending on graph size.

2008

Louvain Algorithm

Blondel et al.'s greedy method. Alternates local moving and aggregation, scaling to millions of nodes. Became the de facto standard.

2009

LFR Benchmark

Lancichinetti-Fortunato-Radicchi benchmark. Generates synthetic graphs with planted community structure for algorithm evaluation.

2011

SBM Revival

Karrer & Newman propose Degree-Corrected SBM. Statistical inference-based detection gains momentum.

2011

Detectability Threshold

Decelle et al. derive the detectability threshold via the cavity method, showing a phase transition below which communities are fundamentally undetectable (rigorous proofs followed in 2013–2015).

2019

Leiden Algorithm

Traag et al. fix Louvain's flaw (disconnected communities). An improved version with quality guarantees.

Prehistory: Graph Partitioning

The direct ancestor of Community Detection is the graph partitioning problem from VLSI circuit design. The Kernighan-Lin (KL) algorithm of 1970 partitions a graph's vertices into two balanced halves while minimizing the number of cut edges.

Kernighan-Lin Algorithm (1970):
  Input: Graph G = (V, E), initial bisection (A, B) with |A| = |B|
  Repeat:
    1. Mark all nodes as "unlocked"
    2. Perform n/2 passes:
       a. Find the unlocked pair (a ∈ A, b ∈ B) with maximum swap gain g(a,b)
       b. Swap a and b, then "lock" both (they cannot move again this pass)
       c. Record the cumulative gain
    3. Find the prefix k that maximizes total cumulative gain
       → Only keep the first k swaps; undo the rest (backtrack)
    4. Repeat until no improvement

The key insight of KL is this look-ahead mechanism: even swaps that locally worsen the cut are explored, and the algorithm backtracks to the best prefix. This enables escaping local optima that simple greedy cannot.

However, graph partitioning and Community Detection differ fundamentally:

	Graph Partitioning	Community Detection
# of partitions	Pre-specified (usually 2)	Unknown — determined by algorithm
Partition sizes	Balanced	May be unequal
Objective	Minimize cut edges	Maximize internal density
Applications	Circuit design, parallel computing	Social network analysis, biology

This distinction is precisely what established Community Detection as its own field. Automatically finding meaningful groups when you don't know how many there are or how large they should be — that is the core challenge.

Turning Point: Girvan-Newman (2002)

In 2002, Michelle Girvan and Mark Newman revolutionized social network analysis with an inverted approach: instead of finding edges within communities, remove the bridges between them.

The key concept is Edge Betweenness Centrality. For all pairs of nodes, consider all shortest paths; an edge's betweenness is the number of shortest paths passing through it. Bridge edges between communities concentrate many shortest paths, so they have high betweenness.

Girvan-Newman Algorithm:
  1. Compute betweenness for all edges
  2. Remove the edge with highest betweenness
  3. Recompute betweenness
  4. Repeat until the graph is fully disconnected
  → Produces a hierarchical dendrogram

The complexity is $O(m^2 n)$ , making it impractical for large graphs, but it pioneered the idea of quantitatively evaluating community structure.

Modularity Q — What Is a "Good" Partition?

A natural question arose from Girvan-Newman's work: "At which level should we cut the dendrogram?" To answer this, Newman and Girvan defined Modularity Q in 2004.

Definition

Q = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)

Where:

$A_{ij}$ : adjacency matrix entry (1 if edge exists, 0 otherwise)
$k_i$ : degree of vertex $i$
$m$ : total number of edges
$c_i$ : community assignment of vertex $i$
$\delta(c_i, c_j)$ : 1 if $c_i = c_j$ , 0 otherwise

Understanding the intuition behind this formula is crucial.

Intuition Behind Modularity Q

Compare actual intra-community edges with expected values under a random null model

Good partition (high Q)

Bad partition (low Q)

Modularity Q = (fraction of intra-community edges) − (expected fraction under a random null model). Higher Q means intra-community connectivity is denser than random. Range is [-0.5, 1]; real networks typically yield 0.3–0.7.

$\frac{k_i k_j}{2m}$ is the expected number of edges between vertices $i$ and $j$ under the Configuration Model, a random null model that preserves the degree sequence. Thus Modularity Q measures:

Q = \text{(actual fraction of intra-community edges)} - \text{(expected fraction under random rewiring)}

It quantifies "how much denser intra-community connectivity is compared to random." $Q > 0$ means "more structure than random," and real networks typically yield $Q \approx 0.3$ – $0.7$ .

Achievements and Limitations of Modularity

Modularity gave Community Detection a quantitative evaluation criterion and catalyzed explosive growth in the field. However, two fundamental problems emerged:

NP-Hardness (Brandes et al., 2008): Modularity maximization is NP-hard; exact solutions are intractable
Resolution Limit (Fortunato & Barthélemy, 2007): A structural flaw that causes small communities to be overlooked

We examine these in detail in later sections.

Notably, the first practical optimization algorithm for Modularity was the Clauset-Newman-Moore (CNM) agglomerative greedy method (2004). With complexity $O(md \log n)$ (where $d$ is dendrogram depth), CNM bridged the gap between Girvan-Newman (2002) and Louvain (2008).

Major Algorithms in Detail

Algorithm Comparison Table

Method	Year	Approach	Complexity	Scale	Strengths
Girvan-Newman	2002	Edge betweenness (removal)	O(m²n)	★☆☆☆☆	Intuitive, hierarchical dendrogram
Louvain	2008	Greedy modularity optimization	O(n log n)	★★★★★	Fast, scales to large graphs
Leiden	2019	Improved Louvain + refinement	O(n log n)	★★★★★	Connectivity guarantee, high quality
Label Propagation	2007	Majority-vote label spreading	O(m)	★★★★★	One of the fastest methods
Infomap	2008	Random walk compression (map equation)	O(m)	★★★★☆	Information-theoretic, captures flow
Spectral Clustering	~2000	Laplacian eigenvectors	O(n³)	★★☆☆☆	Strong theoretical foundation
SBM (統計的推論)	2011	Stochastic Block Model inference	O(n log²n)	★★★☆☆	Model selection, respects detectability

1. Louvain Algorithm (2008)

Proposed by Blondel, Guillaume, Lambiotte, and Lefebvre in 2008, the Louvain algorithm became the de facto standard for Community Detection. Its success lies in combining greedy optimization with hierarchical aggregation for outstanding scalability.

The Two Phases

Phase 1 — Local Moving:

For each node $i$ , compute the modularity change $\Delta Q$ from moving it to the community $C$ of each neighboring node:

\Delta Q = \left[ \frac{\Sigma_{in} + 2k_{i,in}}{2m} - \left(\frac{\Sigma_{tot} + k_i}{2m}\right)^2 \right] - \left[ \frac{\Sigma_{in}}{2m} - \left(\frac{\Sigma_{tot}}{2m}\right)^2 - \left(\frac{k_i}{2m}\right)^2 \right]

Where:

$\Sigma_{in}$ : sum of edge weights inside community $C$
$\Sigma_{tot}$ : sum of edge weights incident to community $C$
$k_{i,in}$ : sum of edge weights from node $i$ to community $C$
$k_i$ : degree of node $i$

Move the node to the community yielding the maximum positive $\Delta Q$ . Process all nodes in sequence and repeat until no moves occur.

Note: In practice, $\Delta Q$ is computed in two steps: the cost of removing a node from its current community, then the gain of inserting it into the target community. The formula above shows the gain for inserting an isolated node (one in its own singleton community) into community $C$ .

Phase 2 — Aggregation:

Collapse each community from Phase 1 into a single "super-node." Internal edges become self-loops; inter-community edges become super-edges. Apply Phase 1 again on this coarsened graph.

Walk through both phases in this interactive demo:

Louvain Algorithm Step-Through

Walk through Louvain's two phases (local moving → aggregation) on a 10-node graph

初期化

Q = 0.0000

Louvain algorithm begins. Phase 1 (local moving): Initialize each node in its own community. A 10-node, 14-edge graph with 3 hidden clusters.

1 / 101 / 10

Why It's Fast

Louvain's effective complexity is $O(n \log n)$ . Computing $\Delta Q$ for each node is proportional to its degree, and aggregation rapidly shrinks the graph, enabling processing of million-node graphs in minutes.

Louvain's Flaw

However, Louvain has a fundamental issue. Because Phase 1 moves nodes individually, communities can become disconnected — nodes assigned to the same community may lack any connecting path within that community. This counterintuitive result occurs in practice.

2. Leiden Algorithm (2019)

Traag, Waltman, and van Eck proposed the Leiden algorithm in 2019 to fix Louvain's flaw. It has three phases:

Local Moving — Similar to Louvain but with a fast queue
Refinement — Attempts to sub-partition within communities, guaranteeing connected communities
Aggregation — Coarsens using refined communities

Leiden maintains Louvain's speed while guaranteeing that all communities are connected. It is becoming the new gold standard.

3. Label Propagation (2007)

The Label Propagation Algorithm (LPA) by Raghavan, Albert, and Kumara is extremely simple and fast:

Label Propagation:
  1. Assign each node a unique label
  2. Process nodes in random order:
     - Update to the most frequent label among neighbors
  3. Repeat until no labels change
  → Nodes sharing the same label form a community

With $O(m)$ complexity it is among the fastest methods, but results are non-deterministic. Outcomes depend on node processing order and tie-breaking. With synchronous updates (all nodes updated simultaneously), convergence is not guaranteed and oscillation can occur; asynchronous updates (one node at a time) guarantee convergence.

4. Spectral Clustering

A linear algebra-based approach. Decompose the graph Laplacian $L = D - A$ (where $D$ is the degree matrix, $A$ the adjacency matrix) into eigenvectors and embed nodes in a low-dimensional space for clustering with k-means.

L \mathbf{v} = \lambda \mathbf{v}

The eigenvector corresponding to the smallest non-zero eigenvalue (the Fiedler vector) encodes information about the optimal bisection. Using multiple eigenvectors extends to $k$ -way partitioning.

Strengths: Strong theoretical foundation (equivalence with graph cuts, Cheeger inequality guarantees)

Weaknesses: Full eigendecomposition is $O(n^3)$ , but in practice iterative methods (Lanczos/ARPACK) compute the needed $k$ eigenvectors in $O(k \cdot m)$ time for sparse graphs. Requires pre-specifying $k$ .

In practice, the normalized Laplacian $L_{sym} = D^{-1/2}LD^{-1/2}$ or $L_{rw} = D^{-1}L$ is more commonly used than the combinatorial Laplacian (Shi & Malik, 2000). The normalized version produces more stable results on graphs with heterogeneous degree distributions.

Furthermore, Newman (2006) showed that Modularity can be optimized via the leading eigenvector of the modularity matrix $B_{ij} = A_{ij} - k_i k_j / 2m$ . This reveals that spectral methods and Modularity are "different facets of the same mathematical structure."

5. Infomap — Information-Theoretic Approach (2008)

Proposed by Rosvall and Bergstrom, Infomap finds the community structure that most efficiently compresses (describes) a random walker's trajectory.

At its heart is the Map Equation:

L(M) = q_{\curvearrowleft} H(\mathcal{Q}) + \sum_{i=1}^{c} p_{\circlearrowleft}^i H(\mathcal{P}^i)

$c$ : number of communities
First term: cost of describing inter-community transitions
Second term: cost of describing intra-community transitions

Intuitively, if community structure is well-captured, a random walker stays within communities for long stretches, so short codewords can be assigned to intra-community nodes, compressing the overall description length. This is analogous to telephone area codes: within the same city (community) you dial short numbers, but calling another city (community) requires an area code.

While Modularity takes a statistical perspective ("comparison with a random model"), Infomap takes an information-theoretic perspective ("information compression"). It is particularly well-suited for directed graphs and networks with flow structure.

6. Stochastic Block Model (SBM) — Statistical Inference

A fundamentally different paradigm from Modularity-based methods is the statistical inference approach.

The SBM is a generative model:

Each node belongs to community $c_i \in \{1, ..., k\}$
Edge probability between communities $r$ and $s$ is $\omega_{rs}$
Infer the most likely community assignments and parameters from the observed graph

Likelihood:

P(G | \{c_i\}, \omega) = \prod_{i < j} \omega_{c_i c_j}^{A_{ij}} (1 - \omega_{c_i c_j})^{1-A_{ij}}

The Degree-Corrected SBM (Karrer & Newman, 2011) extends this to account for heterogeneous node degrees. Real networks often follow power-law degree distributions that vanilla SBM cannot properly model.

SBM Inference:
  Parameters: community assignments {c_i}, edge probability matrix ω
  Objective: maximize posterior P({c_i}, ω | G)
  Methods: EM, variational inference, MCMC, Belief Propagation

The most important feature of SBM is model selection. It treats the number of communities $k$ as an inference target, using criteria like Minimum Description Length (MDL) to avoid overfitting. This allows answering the fundamental question: "Does community structure truly exist?"

Furthermore, Peixoto's (2014) Nested SBM recursively block-structures the SBM itself, enabling nonparametric model selection and avoiding Modularity's resolution limit.

The Fundamental Hardness of Community Detection

Having surveyed the algorithms, let's dig into why Community Detection is "fundamentally hard."

1. Ill-Definedness

The most fundamental difficulty is that no universal definition of "community" exists.

The intuition that communities are "internally dense and externally sparse" is shared, but formalizing it mathematically admits countless approaches:

Cut minimization: the graph partitioning tradition
Internal density maximization: the Modularity perspective
Random walk residence time: the Infomap perspective
Probabilistic model fit: the SBM perspective
Clique-like structure: subgraph density focus

These can yield different "correct answers" for the same graph, and which is right depends on context. Community Detection is not an objectively defined problem but one that depends on implicit assumptions about what constitutes a community.

2. NP-Hardness

That Modularity maximization is NP-hard (Brandes et al., 2008) means efficient exact solutions are impossible (assuming P ≠ NP). All practical algorithms are approximations perpetually at risk of local optima.

Yet NP-hardness is in some sense a "surface-level" problem. The choice of what to optimize is itself arbitrary, so the truly hard question is the problem formulation itself.

3. Resolution Limit

The Resolution Limit

Fortunato & Barthélemy (2007): Modularity maximization misses small communities

As total edges m grows, the modularity-optimal partition merges small cliques with fewer than ~√(2m) edges. In other words, the larger the graph, the more small true communities become invisible. This is the Resolution Limit.

The resolution limit discovered by Fortunato and Barthélemy (2007) is a fundamental critique of Modularity-based methods. Given total edges $m$ , communities whose total degree (sum of all node degrees within the community) is less than $\sqrt{2m}$ get merged with adjacent communities in the Modularity-optimal solution, even if they are complete cliques. In terms of internal edge count, this means communities with fewer than approximately $\sqrt{m/2}$ edges are affected.

This is severe for real networks with multi-scale structure. Social networks harbor groups ranging from a handful of close friends to communities of thousands.

Countermeasures include:

Resolution parameter $\gamma$ : $Q_\gamma = \frac{1}{2m} \sum_{ij} \left[ A_{ij} - \gamma \frac{k_i k_j}{2m} \right] \delta(c_i, c_j)$
Multi-scale analysis: detect repeatedly at different $\gamma$
Moving to SBM-based methods: avoiding null-model constraints

4. Detectability Threshold — Fundamentally Undetectable Regimes

Detectability Threshold

The information-theoretic limit below which community structure is fundamentally undetectable

In the SBM, let c_in be the average intra-community degree and c_out the inter-community degree. When (c_in − c_out)² < 2(c_in + c_out), no algorithm can detect communities better than random (Decelle et al., 2011). This is deeply connected to phase transitions in physics.

The detectability threshold derived by Decelle et al. (2011) via the cavity method (a non-rigorous technique from statistical physics) and later rigorously proven by Mossel, Neeman & Sly (2018) and Massoulié (2014), is one of the most profound results in Community Detection.

In a 2-community SBM with edge probability parameters $c_{in}/n$ within communities and $c_{out}/n$ between communities (each node has $\approx c_{in}/2$ within-community neighbors and $\approx c_{out}/2$ between-community neighbors, giving average degree $(c_{in} + c_{out})/2$ ):

(c_{in} - c_{out})^2 < 2(c_{in} + c_{out})

When this holds, no algorithm can detect communities better than random (Decelle et al., 2011; rigorous proof by Mossel, Neeman & Sly, 2018).

This result is deeply connected to spin glass theory in statistical physics, showing that a phase transition exists in community detectability. Above the threshold, detection suddenly becomes possible; below it, it is fundamentally impossible. This is not a computational hardness issue but an information-theoretic limit.

5. Absence of Ground Truth

Unlike classification in machine learning, Community Detection generally lacks ground-truth labels.

Benchmarks (LFR, etc.) provide artificial ground truth but don't reflect real network complexity. The LFR benchmark (Lancichinetti-Fortunato-Radicchi, 2008) generates synthetic graphs with power-law distributions for both degree and community size, partially mimicking real-network statistics, and is the standard evaluation tool
Metadata (paper fields, user attributes) is sometimes used as "ground truth," but metadata-based groups don't necessarily align with network-derived communities (Peel, Larremore & Clauset, 2017)
Evaluation metrics (NMI, ARI) assume ground truth exists

SBM-based approaches enable autonomous evaluation via model fit, but there's no guarantee that the model assumptions themselves are correct.

6. Overlapping Communities

In real networks, nodes naturally belong to multiple communities simultaneously. A university researcher might belong to both the "machine learning" and "computational theory" communities.

Algorithms allowing overlap (COPRA, BigCLAM, Mixed-Membership SBM, etc.) exist, but formulation and evaluation become even more challenging. How to measure overlap and what constitutes "correct" overlapping structure remain open questions.

Beyond Modularity — Why So Many Approaches Exist

As we've seen, Modularity is the most widely used metric but is not universal. Different methods use different objective functions because they conceptualize "community" differently:

Objective	Perspective	Methods
Modularity Q	Comparison with random model	Louvain, Leiden
Map Equation	Random walk compression	Infomap
Likelihood / Bayes	Probabilistic model fit	SBM
Normalized Cut	Graph cut value / volume	Spectral Clustering
Conductance $\phi(S) = \frac{\text{cut}(S, \bar{S})}{\min(\text{vol}(S), \text{vol}(\bar{S}))}$	Boundary "leakiness"	Local methods

Which objective is appropriate depends on the network's properties and the analysis goal. This is not a weakness but a reflection that no single "correct answer" exists in network analysis.

Modern Developments and Open Problems

Consensus Clustering

To address the non-determinism of Label Propagation and Louvain, consensus clustering (Lancichinetti & Fortunato, 2012) was proposed. The algorithm is run multiple times, and a consensus matrix is built from "how often nodes $i$ and $j$ are placed in the same community." Community detection is then applied to this consensus matrix to produce stable results.

Higher-Order / Motif-Based Detection

Traditional approaches consider only pairwise edges, but motif-based Community Detection (Benson, Gleich & Leskovec, 2016) has emerged. For example, communities defined by triangle density reveal structure that simple edge density cannot capture.

GNN-Based Approaches

Graph Neural Networks (GNNs) for Community Detection have gained significant attention (Tsitsulin et al., 2023). These approaches learn node features for end-to-end community assignment optimization. Methods include making unsupervised objectives (like Modularity) differentiable for gradient descent and contrastive learning approaches. Challenges remain in model interpretability and generalization.

Dynamic and Multiplex Networks

Community detection in time-varying networks requires tracking community birth, growth, splitting, and death. Mucha et al. (2010) provided a unified formulation for time-dependent, multiscale, multiplex networks. Temporal modularity and inter-snapshot correspondence are active research areas.

Large-Scale Distributed Processing

Billion-node networks (social graphs, Web graphs) cannot be processed on a single machine. Distributed Louvain/Leiden implementations based on MapReduce and Pregel are advancing.

Practical Algorithm Selection Guide

Here is a practical guide for choosing the right tool:

Situation	Recommended Approach
Large undirected graph, unknown $k$	Leiden (best quality/speed balance)
Directed graph / flow structure	Infomap (high flow sensitivity)
Need to determine if structure exists at all	Nested SBM (model selection capable)
Fastest possible, quality secondary	Label Propagation ( $O(m)$ )
Known $k$ , theoretical guarantees needed	Spectral Clustering
Exploring multi-scale structure	Resolution parameter $\gamma$ scan or Nested SBM
Result stability is critical	Consensus clustering

Summary

Community Detection is a seemingly simple problem — "find hidden structure in a graph" — yet it harbors these fundamental difficulties:

Definitional ambiguity: No universal definition of "community" exists
Computational hardness: Modularity maximization is NP-hard
Resolution Limit: Scale-dependent detection blind spots
Detectability threshold: Information-theoretically undetectable regimes
Absence of ground truth: Solving a problem without knowing the answer
Overlapping structure: Reality resists hard partitions

After a century of research, we've come to understand that "perfect Community Detection is fundamentally impossible" and instead adopted a pragmatic stance of choosing the right tool for the purpose. Louvain/Leiden's practicality, SBM's theoretical robustness, Infomap's flow sensitivity — understanding each method's strengths and matching them to the network's properties and analysis goals is what modern network science demands.

References

Kernighan, B. W., & Lin, S. (1970). An efficient heuristic procedure for partitioning graphs. Bell System Technical Journal, 49(2), 291–307.
Shi, J., & Malik, J. (2000). Normalized cuts and image segmentation. IEEE TPAMI, 22(8), 888–905.
Girvan, M., & Newman, M. E. J. (2002). Community structure in social and biological networks. PNAS, 99(12), 7821-7826.
Newman, M. E. J., & Girvan, M. (2004). Finding and evaluating community structure in networks. Physical Review E, 69(2), 026113.
Clauset, A., Newman, M. E. J., & Moore, C. (2004). Finding community structure in very large networks. Physical Review E, 70(6), 066111.
Newman, M. E. J. (2006). Modularity and community structure in networks. PNAS, 103(23), 8577-8582.
Brandes, U., et al. (2008). On modularity clustering. IEEE TKDE, 20(2), 172-188.
Fortunato, S., & Barthélemy, M. (2007). Resolution limit in community detection. PNAS, 104(1), 36-41.
Raghavan, U. N., Albert, R., & Kumara, S. (2007). Near linear time algorithm to detect community structures in large-scale networks. Physical Review E, 76(3), 036106.
Blondel, V. D., et al. (2008). Fast unfolding of communities in large networks. JSTAT, P10008.
Rosvall, M., & Bergstrom, C. T. (2008). Maps of random walks on complex networks reveal community structure. PNAS, 105(4), 1118-1123.
Lancichinetti, A., Fortunato, S., & Radicchi, F. (2008). Benchmark graphs for testing community detection algorithms. Physical Review E, 78(4), 046110.
Fortunato, S. (2010). Community detection in graphs. Physics Reports, 486(3-5), 75-174.
Mucha, P. J., et al. (2010). Community structure in time-dependent, multiscale, multiplex networks. Science, 328(5980), 876–878.
Karrer, B., & Newman, M. E. J. (2011). Stochastic blockmodels and community structure in networks. Physical Review E, 83(1), 016107.
Decelle, A., et al. (2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. Physical Review E, 84(6), 066106.
Lancichinetti, A., & Fortunato, S. (2012). Consensus clustering in complex networks. Scientific Reports, 2, 336.
Massoulié, L. (2014). Community detection thresholds and the weak Ramanujan property. STOC 2014, 694–703.
Peixoto, T. P. (2014). Hierarchical block structures and high-resolution model selection in large networks. Physical Review X, 4(1), 011047.
Benson, A. R., Gleich, D. F., & Leskovec, J. (2016). Higher-order organization of complex networks. Science, 353(6295), 163–166.
Fortunato, S., & Hric, D. (2016). Community detection in networks: A user guide. Physics Reports, 659, 1-44.
Peel, L., Larremore, D. B., & Clauset, A. (2017). The ground truth about metadata and community detection in networks. Science Advances, 3(5), e1602548.
Mossel, E., Neeman, J., & Sly, A. (2018). A proof of the block model threshold conjecture. Combinatorica, 38(3), 665–708.
Traag, V. A., Waltman, L., & van Eck, N. J. (2019). From Louvain to Leiden: guaranteeing well-connected communities. Scientific Reports, 9, 5233.
Tsitsulin, A., et al. (2023). Graph clustering with graph neural networks. JMLR, 24(127), 1–21.