Langevin MCMC: theory and methods

43 mins 14 secs,  174.98 MB,  WebM  640x360,  29.97 fps,  44100 Hz,  552.59 kbits/sec

About this item
Description: Moulines, E
Friday 7th July 2017 - 09:00 to 09:45
 
Created: 2017-07-24 12:16
Collection: Scalable inference; statistical, algorithmic, computational aspects
Publisher: Isaac Newton Institute
Copyright: Moulines, E
Language: eng (English)
Distribution: World     (downloadable)
Explicit content: No
Aspect Ratio: 16:9
Screencast: No
Bumper: UCS Default
Trailer: UCS Default
 
Abstract: Nicolas Brosse, Ecole Polytechnique, Paris
Alain Durmus, Telecom ParisTech and Ecole Normale Supérieure Paris-Saclay
Marcelo Pereira, Heriot-Watt University, Edinburgh


The complexity and sheer size of modern datasets, together with the ever more demanding questions posed of them, give rise to major challenges. Traditional simulation methods often scale poorly with data size and model complexity, and thus fail for the most complex of modern problems.
We consider the problem of sampling from a log-concave distribution. Many problems in machine learning fall into this framework,
such as linear ill-posed inverse problems with sparsity-inducing priors, or large-scale Bayesian binary regression.
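In standard notation (a generic formulation, not quoted from the abstract), this means sampling from a density of the form
\[
\pi(x) \;\propto\; \mathrm{e}^{-U(x)}, \qquad U \colon \mathbb{R}^d \to \mathbb{R} \ \text{convex},
\]
where the potential U typically combines a smooth negative log-likelihood with a convex, possibly non-smooth, prior term.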



The purpose of this lecture is to explain how ideas that have proven very useful in the machine learning community for solving
large-scale optimization problems can be used to design efficient sampling algorithms.
Most of the efficient algorithms known so far may be seen as variants of gradient descent,
most often coupled with « partial updates » (coordinate descent algorithms). This, of course, suggests studying methods derived from the Euler discretization of the Langevin diffusion; partial updates may be interpreted in this context as « Gibbs steps ». The algorithm may be generalized to the non-smooth case by « regularizing » the objective function, and the Moreau-Yosida inf-convolution is an appropriate candidate in such cases.
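As a rough sketch of these ideas (illustrative only: the quadratic likelihood, the l1 prior, the step size gamma and the smoothing parameter lam below are assumptions, not details from the talk), an unadjusted Langevin step X_{k+1} = X_k - gamma * grad U(X_k) + sqrt(2 gamma) Z_{k+1} and a Moreau-Yosida regularized variant for the non-smooth part could be written as follows in Python:

import numpy as np

# Illustrative target: pi(x) proportional to exp(-U(x)) with U = f + g,
# f smooth (quadratic data-fit term), g convex but non-smooth (l1 prior).
# Dimensions, data and tuning parameters below are arbitrary placeholders.
rng = np.random.default_rng(0)
d = 10
A = rng.standard_normal((20, d))
y = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(20)
alpha = 1.0                       # weight of the l1 prior

def grad_f(x):
    # Gradient of the smooth part f(x) = 0.5 * ||A x - y||^2.
    return A.T @ (A @ x - y)

def prox_g(x, lam):
    # Proximal map of lam * g with g(x) = alpha * ||x||_1 (soft-thresholding).
    return np.sign(x) * np.maximum(np.abs(x) - lam * alpha, 0.0)

def ula_step(x, gamma):
    # Unadjusted Langevin step: Euler discretization of the Langevin diffusion
    # for the smooth part alone.
    return x - gamma * grad_f(x) + np.sqrt(2.0 * gamma) * rng.standard_normal(d)

def myula_step(x, gamma, lam):
    # Moreau-Yosida regularized step: g is replaced by its Moreau envelope,
    # whose gradient is (x - prox_g(x, lam)) / lam.
    drift = grad_f(x) + (x - prox_g(x, lam)) / lam
    return x - gamma * drift + np.sqrt(2.0 * gamma) * rng.standard_normal(d)

# Short chain with arbitrarily chosen step size and smoothing parameter.
x = np.zeros(d)
samples = []
for _ in range(5000):
    x = myula_step(x, gamma=1e-3, lam=1e-2)
    samples.append(x.copy())
samples = np.array(samples)
print("posterior mean estimate:", samples[1000:].mean(axis=0))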

We will prove convergence results for these algorithms, with explicit convergence bounds both in Wasserstein distance and in total variation. Numerical illustrations (computation of Bayes factors for model choice, Bayesian analysis of high-dimensional regression, aggregation of estimators) will be presented to illustrate our results.
Available Formats
Format          Quality     Bitrate             Size
MPEG-4 Video    640x360     1.94 Mbits/sec      628.97 MB
WebM *          640x360     552.59 kbits/sec    174.98 MB
iPod Video      480x270     522.26 kbits/sec    165.31 MB
MP3             44100 Hz    249.77 kbits/sec    79.15 MB