Joris Bierkens - Piecewise Deterministic Monte Carlo
On this website I aim to collect references to papers on the use and analysis of Piecewise Deterministic Markov Processes (PDMPs) for use in MCMC. I will try to list all papers that I find relevant, will put years of first appearance on arXiv or elsewhere, and list in reverse chronological order. Inevitably this page will reflect some of my own judgements. If you want to bring some paper to my attention, or feel I should correct something, please contact me.
Main idea
Markov Chain Monte Carlo (MCMC) is a probabilistic computational method which is of vital importance in data science (statistics, machine learning) as well as in the physical sciences (physics, chemistry, biology). For example, MCMC is applied in Bayesian statistics, for the training of (deep) neural networks, as well as for the simulation of many-particle systems.
In recent years it has been established that PDMPs may play a very useful role in designing new efficient MCMC methods which have good convergence properties and allow various clever tricks in dealing with large data sets (subsampling of the data, for example).
In the animation below you see two-dimensional example trajectories of three different PDMP samplers: the Boomerang Sampler, the Zig-Zag Sampler and the Bouncy Particle Sampler (BPS). The trajectories can be seen to be continuous (in contrast to most MCMC algorithms). They consist of deterministic dynamics, interrupted by random changes of velocity. These random changes of velocity can be informed by the target density, or can be random refreshments (to ensure ergodicity of the process).
Where to start?
If you are looking for a gentle introduction into this class of algorithms, which does not assume too much mathematical background (no measure theory...) and covers both Bouncy Particle Sampler and Zig-Zag, I can recommend Fearnhead et al. (2016). (Note: although it is a very interesting topic, Section 4 on Continuous-Time Sequential Importance Sampling can be skipped if you wish to understand the BPS and Zig-Zag algorithms.)
A Jupyter tutorial [github link] with a very accessible introduction to MCMC and Zig-Zag together with Python code has been developed by Bernardita Ried (University of Chile).
An R package is available to apply Zig-Zag and BPS to Gaussian targets, Student t distribution, logistic regression. It can be installed via the R command install.packages("RZigZag").
Mathematics of PDMPs
Here papers are listed which concern the mathematical analysis of the stochastic processes underlying PDMP based algorithms.
J. Bierkens, S. M. Verduyn Lunel, Spectral analysis of the zigzag process (2019). Spectral analysis of the one-dimensional zigzag process, similar to Monmarché (2012) but for non-constant switching intensities. Results are obtained for a one-dimensional unimodal zigzag process with canonical switching intensities, i.e. without refreshments.
C. Andrieu, A. Durmus, N. Nüsken, J. Roussel, Hypocoercivity of Piecewise Deterministic Markov Process-Monte Carlo (2018). This important paper gives results in a very general context for scaling with dimension of various PDMP samplers. The obtained results seem to be sharp, for example in the sense that they agree with the results obtained for the specialized case of a factorized distribution in Bierkens, Kamatani and Roberts (2018) discussed below.
G. Deligiannidis, D. Paulin, A. Doucet, Randomized Hamiltonian Monte Carlo as Scaling Limit of the Bouncy Particle Sampler and Dimension-Free Convergence Rates (2018). For a constant refreshment rate (relative to standard normal velocities in stationarity), the first coordinate of the BPS has a scaling limit as the dimension d goes to infinity, implying that the mixing time of the first coordinate process is dimension-independent. Note however that in Bierkens, Kamatani and Roberts (2018) we show that in order for the radial component to mix we require the refreshment rate to grow at the same speed as (the norm of) the velocity.
J. Bierkens, K. Kamatani, G. Roberts, High-dimensional scaling limits of piecewise deterministic sampling algorithms (2018). In the case of a standard normal distribution on R^d, the computational efficiency of Zig-Zag and BPS (with constant positive refreshment) are compared by means of scaling limits for certain summary statistics.
A. Durmus, A. Guillin, P. Monmarché, Geometric ergodicity of the bouncy particle sampler (2018). Compared to Deligiannidis et al. (2017) the conditions for exponential ergodicity of BPS are weakened, based on a coupling argument.
J. Bierkens, G. Roberts, P.-A. Zitt, Ergodicity of the zigzag process (2017). In this paper it is shown that under mild conditions the zigzag process is (exponentially) ergodic, even without refreshments of the velocity.
J. Bierkens, Gareth Roberts, A piecewise deterministic scaling limit of Lifted Metropolis-Hastings in the Curie-Weiss model (2015). In this work we obtained the zigzag process as a scaling limit of Lifted Metropolis-Hastings, a discrete non-reversible MCMC algorithm (see the paper by Turitsyn et al. (2008) listed below). We obtain exponential ergodicity of the Zig-Zag process in the one-dimensional case.
J. Fontbona, H. Guérin, F. Malrieu, Long time behavior of telegraph processes under convex potentials (2015). Analysis of a one-dimensional process with piecewise linear dynamics and space dependent switching rate for a chemotaxis model, essentially what I call the one-dimensional zigzag process.
P. Monmarché, Piecewise deterministic simulated annealing (2014). Most of the building blocks for sampling in the one-dimensional case are here (in a later version, this is extended to multiple dimensions). The focus is on using PDMPs for optimization by means of simulated annealing.
J. Bierkens, S. Grazzi, K. Kamatani, G. Roberts, The Boomerang Sampler (2020). Combining elliptical trajectories with approximate Gaussianity of the target distribution to improve efficiency.
J. Bierkens, S. Grazzi, M. Schauer, F. van der Meulen, A piecewise deterministic Monte Carlo method for diffusion bridges (2020). The Zig-Zag method is applied to the simulation of diffusino bridges. It benefits significantly from the sparse structury that is present due to a particular choice of basis functions.