Gaussian Expectation Propagation

Multivariate Normal Distribution in the Exponential Family

Remember from a previous blog post that a pdf p(x)=N(x;μ,Σ) can be written as p(x)=exp{Σ1μ,x+vec(12Σ1),vec(xx)12[dlog2π+log|Σ|+μΣ1μ]}

Following the work of Barthelmé, Chopin, and Cottet (2015) we can relabel the expression above as follows p(x)exp{r,x+vec(12Q),vec(xx)} where we have defined r:=Σ1μandQ:=Σ1

Target Distribution

We assume the target distribution is intractable, but can be factorized into a product of K+1 terms f(γ)k=0Kfk(γk) where usually

  • f(γ)=p(γx) is a posterior distribution.
  • f0(γ0)=p(γ;θ0) is a prior distribution.
  • f1(γ1),,fK(γk) are likelihood terms, which are intractable.

From here onwards we assume that the prior distribution f0(γ) is a multivariate gaussian distribution N(γ;r0,Q0) (with natural parameters θ0=(r0,12Q0)). f0(γ)=p(γ;r0,Q0)=∝exp{r0,x+vec(12Q0),vec(xx)}

Note: The parameters γ0,,γk are not the components of γ, they are different parameters.

Global Approximation

The global approximation is defined as g(θ)k=0Kgk(θk) where we set the first term to be equal to the (tractable) prior, which is a multivariate gaussian distribution g0(θ)exp{r0,x+vec(12Q0),vec(xx)} Furthermore, we assume that each factor gk(θk) for k=1,,K also follows a Multivariate Normal Distribution with natural parameter θk=(rk,Qk). Luckily, the product of Gaussians is again a Gaussian distribution. In particular we have g(θ)k=0Kexp{rkx+vec(12Qk)vec(xx)}=exp{(k=0Krk)x+vec(12k=0KQk)vec(xx)}:=exp{rx+vec(12Q)vec(xx)} where we have defined the global natural parameters to be r:=k=0KrkandQ:=k=0KQk In other words, the natural parameters of the global approximation are found by summing the natural parameters of all the sites.

Note: The parameters θ0,,θk are not the components of θ, they are different parameters.

Cavity Distribution

The cavity distribution at the kth site is given by the product of all but the kth Multivariate Gaussian. This means that we can write the cavity distribution as gk(θθk)exp{(rrk)x+vec(12(QQk))vec(xx)} In other words, the natural parameters of the cavity distribution are found by taking the difference between the global natural parameters and the natural parameters of the kth site.

Tilted Distribution

The tilted distribution, also called pseudo-posterior, is found by multiplying the cavity distribution by the kth local likelihood term. In other words, the tilted distribution is a (pseudo-)posterior where we use the cavity distribution as a (pseudo-)prior and the single local likelihood term as the likelihood.

In general, computing moments of this distribution will be intractable, however we how that calculating moments of this distribution will be easier than calculating moments of the entire target distribution. gk(γ~k)fk(γk)exp{(rrk)x+vec(12(QQk))vec(xx)}

Gaussian Expectation Propagation Updates

Initialization

  • Choose natural parameters for the prior distribution r0 and Q0.
  • For every site k=1,,K choose the parameters rk and Qk.
  • Compute the natural parameters of the global approximation rk=0KrkandQk=0KQk

Updates

  • At the kth site subtract the kth natural parameters from the global ones, to obtain the natural parameters of the cavity distribution rkrrkandQkQQk
  • Sample from the tilted distribution using an MCMC sampler to obtain N samples γ~k(1),,γ~k(N) from gk(γ~k)fk(γk)exp{(rrk)x+vec(12(QQk))vec(xx)}
  • Perform moment-matching by computing the mean and the variance-coviariance matrix from the samples, and assign those values to the mean and the variance-covariance matrix of the newly-found global gaussian approximation. μnew,k1Nj=1Nγ~i(j)Σnew,k1N1j=1N(γ~i(j)μnew,k)(γ~i(j)μnew,k) Notice that every site k=1,,K has found (possibly) different new mean and variance-covariance parameters. We denote by μnew,k the new mean parameter for the multivariate normal global approximation, found at site k.
  • Next, at every site i=1,,K, convert the global mean and variance-covariance parameters into natural parameters Qnew,k(Σnew,k)1rnew,kQnew,kμnew,k
  • At site k find the new natural parameters for the kth approximating factor gk(θk) by taking the difference between the new global natural parameters and the natural parameters of the old cavity distribution. QknewQnew,kQkrknewrnew,krk
  • From each site, send rknew and Qknew to the posterior server and update global approximation rparallelnewk=1KrknewQparallelnewk=1KQknew

Bibliography

Barthelmé, Simon, Nicolas Chopin, and Vincent Cottet. 2015. “Divide and Conquer in ABC: Expectation-Progagation Algorithms for Likelihood-Free Inference.” https://arxiv.org/abs/1512.00205.
Avatar
Mauro Camara Escudero
Machine Learning Engineer

My research interests include approximate manifold sampling and generative models.