On the Complexity of Newman’s Community Finding Approach for Biological and Social Networks
Journal contribution posted on 2013-12-03, 00:00, authored by Bhaskar DasGupta, Devendra Desai
Given a graph of interactions, a module (also called a community or cluster) is a subset of nodes whose fitness is a function of the statistical significance of the pairwise interactions of nodes in the module. The topic of this paper is a model-based community finding approach, commonly referred to as modularity clustering, that was originally proposed by Newman (Leicht and Newman, 2008) and has subsequently been extremely popular in practice (e.g., see Agarwal and Kempe, 2008; Guimerà et al., 2007; Newman, 2006; Newman and Girvan, 2004; Ravasz et al., 2002). Various heuristic methods are currently employed for finding the optimal solution. However, as observed in Agarwal and Kempe (2008), the exact computational complexity of this approach is still largely unknown. To this end, we initiate a systematic study of the computational complexity of modularity clustering. Due to the specific quadratic nature of the modularity function, it is necessary to study its value on sparse graphs and dense graphs separately. Our main results include a (1 + ε)-inapproximability for dense graphs and a logarithmic approximation for sparse graphs. We make use of several combinatorial properties of modularity to get these results. These are the first non-trivial approximability results beyond the NP-hardness results in Brandes et al. (2007). © 2012 Elsevier Inc. All rights reserved.
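The abstract does not state the modularity function itself. As an illustration only, the sketch below assumes the standard definition for undirected, unweighted graphs from the cited literature (Newman and Girvan, 2004): the sum over communities c of m_c/m - (D_c/(2m))^2, where m is the number of edges, m_c the number of edges inside c, and D_c the total degree of the nodes in c. The squared degree term is what makes the function quadratic. The function name, argument names, and example graph are hypothetical, and the paper's own definition (for instance for directed or weighted graphs) may differ.

```python
from collections import defaultdict


def modularity(edges, community):
    """Modularity of a partition of an undirected, unweighted graph.

    Assumed definition: sum over communities c of
        m_c / m - (D_c / (2 * m)) ** 2
    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("modularity is undefined for a graph with no edges")

    degree = defaultdict(int)    # node -> degree
    internal = defaultdict(int)  # label -> number of edges inside the community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1

    degree_sum = defaultdict(int)  # label -> total degree of its nodes
    for node, d in degree.items():
        degree_sum[community[node]] += d

    # Observed fraction of internal edges minus the quadratic expected term.
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Hypothetical example: two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_modules = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_module = {node: "a" for node in range(6)}

q_two = modularity(edges, two_modules)  # 2 * (3/7 - (7/14)**2) = 5/14
q_one = modularity(edges, one_module)   # 7/7 - (14/14)**2 = 0
```

Under this assumed definition, splitting the graph into its two triangles scores 5/14, while placing every node in a single module scores 0. Modularity clustering, as described in the abstract, seeks the partition that maximizes this value.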