Community recovery in weighted stochastic block models

Duration: 46 mins 49 secs
Share this media item:
Embed this media item:


About this item
Image inherited from collection
Description: Loh, P
Thursday 6th July 2017 - 11:45 to 12:30
 
Created: 2017-07-24 10:15
Collection: Scalable inference; statistical, algorithmic, computational aspects
Publisher: Isaac Newton Institute
Copyright: Loh, P
Language: eng (English)
Distribution: World     (downloadable)
Explicit content: No
Aspect Ratio: 16:9
Screencast: No
Bumper: UCS Default
Trailer: UCS Default
 
Abstract: Co-authors: Min Xu (University of Pennsylvania), Varun Jog (University of Wisconsin - Madison)

Identifying communities in a network is an important problem in many fields, including social science, neuroscience, military intelligence, and genetic analysis. In the past decade, the Stochastic Block Model (SBM) has emerged as one of the most well-studied and well-understood statistical models for this problem. Yet, the SBM has an important limitation: it assumes that each network edge is drawn from a Bernoulli distribution. This is rather restrictive, since weighted edges are fairly ubiquitous in scientific applications, and disregarding edge weights naturally results in a loss of valuable information. In this paper, we study a weighted generalization of the SBM, where observations are collected in the form of a weighted adjacency matrix, and the weight of each edge is generated independently from a distribution determined by the community membership of its endpoints. We propose and analyze a novel algorithm for community estimation in the weighted SBM based on various su broutines involving transformation, discretization, spectral clustering, and appropriate refinements. We prove that our procedure is optimal in terms of its rate of convergence, and that the misclassification rate is characterized by the Renyi divergence between the distributions of within-community edges and between-community edges. In the regime where the edges are sparse, we also establish sharp thresholds for exact recovery of the communities. Our theoretical results substantially generalize previously established thresholds derived specifically for unweighted block models. Furthermore, our algorithm introduces a principled and computationally tractable method of incorporating edge weights to the analysis of network data.
Available Formats
Format Quality Bitrate Size
MPEG-4 Video 640x360    1.93 Mbits/sec 680.95 MB View Download
WebM 640x360    528.38 kbits/sec 181.25 MB View Download
iPod Video 480x270    522.17 kbits/sec 179.05 MB View Download
MP3 44100 Hz 249.75 kbits/sec 85.73 MB Listen Download
Auto * (Allows browser to choose a format it supports)