Sequential Monte Carlo Samplers

Problem Set-Up

This is all taken from *Sequential Monte Carlo Samplers* (Del Moral, Doucet and Jasra, 2006). We have a collection of target distributions $\pi_n(x) = \frac{\gamma_n(x)}{Z_n}$, where $\gamma_n$ can be evaluated pointwise but the normalizing constant $Z_n$ is unknown, and we would like to sample from them sequentially in order to approximate expectations.

Importance Sampling

We write target expectations using the Importance Sampling (IS) trick for a proposal density $\eta_n$:

$$\mathbb{E}_{\pi_n}[\phi] = \int_E \phi(x)\,\pi_n(x)\,dx = \frac{1}{Z_n}\int_E \phi(x)\,\gamma_n(x)\,dx = \frac{1}{Z_n}\int_E \phi(x)\,w_n(x)\,\eta_n(x)\,dx$$

$$Z_n = \int_E \gamma_n(x)\,dx = \int_E w_n(x)\,\eta_n(x)\,dx$$

where we have defined the importance weights as

$$w_n(x) = \frac{\gamma_n(x)}{\eta_n(x)}.$$

Therefore importance sampling uses the following particle approximation, built from samples $X_n^{(1:N)} \sim \eta_n$:

$$\eta_n^N(dx) = \frac{1}{N}\sum_{i=1}^N \delta_{X_n^{(i)}}(dx).$$

Plugging this into the two expressions above we obtain

$$\mathbb{E}_{\pi_n}[\phi] = \frac{\mathbb{E}_{\eta_n}[\phi\, w_n]}{\mathbb{E}_{\eta_n}[w_n]} \approx \frac{\mathbb{E}_{\eta_n^N}[\phi\, w_n]}{\mathbb{E}_{\eta_n^N}[w_n]} = \frac{\sum_{i=1}^N \phi(X_n^{(i)})\, w_n(X_n^{(i)})}{\sum_{j=1}^N w_n(X_n^{(j)})} = \sum_{i=1}^N \phi(X_n^{(i)})\, W_n(X_n^{(i)})$$

where we have defined the normalized importance weights

$$W_n(X_n^{(i)}) = \frac{w_n(X_n^{(i)})}{\sum_{j=1}^N w_n(X_n^{(j)})}.$$
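To make this concrete, here is a minimal NumPy sketch of the self-normalized IS estimator above. The specific target and proposal (a standard Gaussian proposal and an unnormalized $N(1, 0.5^2)$ target) are illustrative choices of mine, not anything prescribed by the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unnormalized target gamma(x), proportional to N(1, 0.5^2): we can evaluate it
# pointwise but pretend its normalizing constant Z is unknown.
def gamma(x):
    return np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

# Proposal density eta(x) = N(0, 1), which we can both sample and evaluate.
def eta(x):
    return np.exp(-0.5 * x**2) / np.sqrt(2 * np.pi)

N = 100_000
x = rng.standard_normal(N)       # X^(i) ~ eta
w = gamma(x) / eta(x)            # unnormalized weights w(X^(i))
W = w / w.sum()                  # normalized weights W(X^(i))

phi = lambda x: x                # test function, here the identity
print(np.sum(phi(x) * W))        # self-normalized estimate of E_pi[phi], approx. 1.0
```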

Sequential Importance Sampling

In importance sampling, for each different target $\pi_n$ we would sample the particles afresh from $\eta_n$. This assumes that we can sample from $\eta_n \approx \pi_n$ and that we can evaluate $\eta_n$ in order to compute the unnormalized IS weights $w_n(x) = \frac{\gamma_n(x)}{\eta_n(x)}$. In Sequential Importance Sampling (SIS) we start by using IS at $n=1$, but then we build $\eta_n$ from the previous iteration. Specifically, we do this:

  • At time $n=1$ our target is $\pi_1$ and we use an IS proposal $\eta_1$ which we choose to approximate $\pi_1$ well (often we choose $\eta_1 = \pi_1$). This means we sample particles $X_1^{(1:N)} \sim \eta_1$ and then compute the unnormalized IS weights $w_1(X_1^{(i)}) = \frac{\gamma_1(X_1^{(i)})}{\eta_1(X_1^{(i)})}$.
  • Suppose that at time $n-1$ we had a set of particles $\{X_{n-1}^{(i)}\}$ sampled from $\eta_{n-1}$. Our target at time $n$ is $\pi_n$. In order to propose a new set of particles we use a Markov kernel $K_n(x' \mid x)$. We call the resulting distribution $\eta_n$. Notice that this distribution can be found using the property that a kernel $K_n$ operates on measures on the left:
$$\eta_n = \eta_{n-1} K_n \qquad\Longleftrightarrow\qquad \eta_n(x') = \int_E \eta_{n-1}(x)\, K_n(x' \mid x)\,dx.$$
Once we have sampled from the kernel to move the particles forward, $X_n^{(i)} \sim K_n(\cdot \mid X_{n-1}^{(i)})$, we need to compute the weights to account for the discrepancy of sampling from $\eta_n$ rather than $\pi_n$:
$$w_n(X_n^{(i)}) = \frac{\gamma_n(X_n^{(i)})}{\eta_n(X_n^{(i)})}.$$
However, notice we can only do this if we can evaluate $\eta_n$.

In general, we cannot evaluate $\eta_n$ because it is defined in terms of an integral with respect to $x_{1:n-1}$. Indeed, consider $\eta_3$:

$$\eta_3(x_3) = \int_E \eta_2(x_2)\, K_3(x_3 \mid x_2)\,dx_2 = \int_E \left[\int_E \eta_1(x_1)\, K_2(x_2 \mid x_1)\,dx_1\right] K_3(x_3 \mid x_2)\,dx_2 = \int_{E^2} \eta_1(x_1)\, K_3(x_3 \mid x_2)\, K_2(x_2 \mid x_1)\,dx_1\,dx_2$$

In general, $\eta_n$ will be
$$\eta_n(x_n) = \int_{E^{n-1}} \eta_1(x_1) \prod_{k=2}^n K_k(x_k \mid x_{k-1})\,dx_{1:n-1}$$
which is clearly intractable.

SMC sampler

Since the problem is integration with respect to $x_{1:n-1}$, we “open up” the integral and instead consider its integrand. Rather than considering $\eta_n(x_n)$, which proposes a new set of particles $\{X_n^{(i)}\}$ from $\{X_{n-1}^{(i)}\}$, we consider the proposal distribution $\eta_n(x_{1:n})$ defined as
$$\eta_n(x_{1:n}) = \eta_1(x_1)\prod_{k=2}^n K_k(x_k \mid x_{k-1}).$$
We would now like to perform importance sampling. To do this, we need to extend the target from $\pi_n(x_n)$ to $\tilde{\pi}_n(x_{1:n})$. We do this by introducing backward kernels $L_{n-1}(x_{n-1} \mid x_n)$:
$$\tilde{\pi}_n(x_{1:n}) = \frac{1}{Z_n}\tilde{\gamma}_n(x_{1:n}) = \frac{1}{Z_n}\gamma_n(x_n)\prod_{k=1}^{n-1} L_k(x_k \mid x_{k+1}).$$
The IS weights then become
$$
\begin{aligned}
w_n(x_{1:n}) &= \frac{\tilde{\gamma}_n(x_{1:n})}{\eta_n(x_{1:n})} = \frac{\gamma_n(x_n)\prod_{k=1}^{n-1} L_k(x_k \mid x_{k+1})}{\eta_1(x_1)\prod_{k=2}^n K_k(x_k \mid x_{k-1})} \\
&= \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)\, L_{n-2}(x_{n-2} \mid x_{n-1}) \cdots L_1(x_1 \mid x_2)}{\eta_1(x_1)\, K_n(x_n \mid x_{n-1})\, K_{n-1}(x_{n-1} \mid x_{n-2}) \cdots K_2(x_2 \mid x_1)} \\
&= \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)}{\gamma_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})} \cdot \frac{\gamma_{n-1}(x_{n-1})\, L_{n-2}(x_{n-2} \mid x_{n-1}) \cdots L_1(x_1 \mid x_2)}{\eta_1(x_1)\, K_{n-1}(x_{n-1} \mid x_{n-2}) \cdots K_2(x_2 \mid x_1)} \\
&= \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)}{\gamma_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}\, w_{n-1}(x_{1:n-1}) \\
&= \widetilde{w}_n(x_{n-1}, x_n)\, w_{n-1}(x_{1:n-1})
\end{aligned}
$$

where we have defined the incremental weight as
$$\widetilde{w}_n(x_{n-1}, x_n) = \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)}{\gamma_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}.$$
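In code, the incremental weight is just a ratio of four pointwise evaluations, usually computed in log space for numerical stability. A minimal sketch, where `log_gamma_prev`, `log_gamma_curr`, `log_K`, and `log_L` are user-supplied callables (hypothetical names of mine, not from the paper):

```python
def log_incremental_weight(x_prev, x_curr,
                           log_gamma_prev, log_gamma_curr,
                           log_K, log_L):
    """log w~_n(x_{n-1}, x_n).

    log_K(x_new, x_old) should return log K_n(x_new | x_old), and
    log_L(x_old, x_new) should return log L_{n-1}(x_old | x_new).
    """
    return (log_gamma_curr(x_curr) + log_L(x_prev, x_curr)
            - log_gamma_prev(x_prev) - log_K(x_curr, x_prev))
```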

To summarize:

  • Importance Sampling at time $n$ targets $\pi_n$. It samples particles $\{X_n^{(i)}\}$ afresh from a proposal $\eta_n$ and computes weights afresh as $w_n(X_n^{(i)}) = \gamma_n(X_n^{(i)})/\eta_n(X_n^{(i)})$. For this to work, however, we need to be able to find proposals $\eta_n \approx \pi_n$, which is in general very hard.
  • Sequential Importance Sampling also targets $\pi_n$ at time $n$. It tries to fix the problem of finding $\eta_n$ by using a local Markov kernel $K_n(\cdot \mid X_{n-1}^{(i)})$ to sample a new set of particles $\{X_n^{(i)}\}$ starting from $\{X_{n-1}^{(i)}\}$. This, at time $n$, gives rise to the following proposal distribution:
$$\eta_n(x_n) = \int_{E^{n-1}} \eta_1(x_1)\prod_{k=2}^n K_k(x_k \mid x_{k-1})\,dx_{1:n-1}.$$
We can now sample from $\eta_n(x_n)$ but we cannot evaluate $\eta_n(\cdot)$ due to the integral with respect to $x_{1:n-1}$. Evaluating $\eta_n$ is needed to compute the IS weights $w_n(X_n^{(i)}) = \gamma_n(X_n^{(i)})/\eta_n(X_n^{(i)})$.
  • SMC samplers overcome the problem of integrating over $x_{1:n-1}$ by working with the integrand directly. The proposal and the target distributions are then $\eta_n(x_{1:n})$ and $\tilde{\pi}_n(x_{1:n})$. Notice the difference with respect to IS and SIS: in IS and SIS we get new particles at each time step, that is, at time step $n-1$ we have $X_{n-1}^{(1:N)}$ and at time step $n$ we have $X_n^{(1:N)}$. In an SMC sampler, instead, we extend the particles at time $n-1$, $X_{1:n-1}^{(1:N)}$, by sampling from a kernel, $X_n^{(i)} \sim K_n(\cdot \mid X_{n-1}^{(i)})$, and then appending the result to the current particles to obtain $X_{1:n}^{(1:N)}$. Since we have appended $X_n^{(1:N)}$ to the previous particles, we need to update the weights, and these are updated using the incremental weight
$$\widetilde{w}_n(x_{n-1}, x_n) = \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)}{\gamma_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}.$$
Importantly, this requires us to introduce backward kernels, which essentially allow us to approach the problem from an auxiliary-variable perspective.

Since the variance of the weights increases as $\eta_n$ and $\tilde{\pi}_n$ become further apart, one often resamples the particles according to
$$\tilde{\pi}_n^N(dx_{1:n}) = \sum_{i=1}^N W_n^{(i)}\, \delta_{X_{1:n}^{(i)}}(dx_{1:n}).$$
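A minimal sketch of multinomial resampling from this particle approximation (one of several standard schemes; systematic or stratified resampling are common lower-variance alternatives). In practice one usually triggers it only when the effective sample size $1/\sum_i (W_n^{(i)})^2$ falls below a threshold:

```python
import numpy as np

def multinomial_resample(particles, W, rng):
    """Draw N ancestor indices with probabilities W and return the
    resampled particles (a NumPy array) with uniform weights 1/N."""
    N = len(W)
    ancestors = rng.choice(N, size=N, p=W)   # index i drawn with probability W[i]
    return particles[ancestors], np.full(N, 1.0 / N)
```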

The algorithm is summarized below.

[Figure: the SMC sampler algorithm of Del Moral et al. (2006).]
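Putting the pieces together, here is a self-contained sketch of the whole loop on a toy tempering problem. Every concrete choice below is an illustrative assumption of mine, not the paper's prescription: Gaussian tempering targets, a symmetric random-walk $K_n$ with $L_{n-1} = K_n$ (so the kernel densities cancel and the incremental weight reduces to $\gamma_n(x_n)/\gamma_{n-1}(x_{n-1})$), and adaptive multinomial resampling.

```python
import numpy as np

rng = np.random.default_rng(0)

# Tempering from N(0, 3^2) towards N(0, 1):
# log gamma_n(x) = (1 - beta_n) * log N(x; 0, 3^2) + beta_n * log N(x; 0, 1), unnormalized.
betas = np.linspace(0.0, 1.0, 11)

def log_gamma(x, beta):
    return (1 - beta) * (-0.5 * x**2 / 9.0) + beta * (-0.5 * x**2)

N, tau = 5_000, 1.0                       # number of particles, random-walk step size

x = 3.0 * rng.standard_normal(N)          # X_1^(i) ~ eta_1 = pi_1 = N(0, 3^2)
logw = np.zeros(N)                        # eta_1 = pi_1, so the initial weights are 1

for n in range(1, len(betas)):
    # Move: X_n ~ K_n(. | X_{n-1}), a symmetric Gaussian random walk.
    x_new = x + tau * rng.standard_normal(N)
    # Incremental weight with L_{n-1} = K_n symmetric:
    # w~_n = gamma_n(x_n) / gamma_{n-1}(x_{n-1}).
    logw += log_gamma(x_new, betas[n]) - log_gamma(x, betas[n - 1])
    x = x_new
    # Normalize and resample (multinomial) when the effective sample size is low.
    W = np.exp(logw - logw.max()); W /= W.sum()
    if 1.0 / np.sum(W**2) < N / 2:
        x = x[rng.choice(N, size=N, p=W)]
        logw = np.zeros(N)

W = np.exp(logw - logw.max()); W /= W.sum()
print("E_pi[x^2] approx.", np.sum(W * x**2))   # final target is N(0, 1), so approx. 1
```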

A few notes:

  • The particle estimate of the $n$th target is
$$\pi_n^N(dx) = \sum_{i=1}^N W_n(X_{1:n}^{(i)})\, \delta_{X_n^{(i)}}(dx).$$
  • It is helpful to remember the distributions of $X_n^{(i)}$ and $X_{1:n}^{(i)}$ (using sloppy notation):
$$X_n^{(i)} \sim \int_{E^{n-1}} \eta_1(x_1) \prod_{k=2}^n K_k(x_k \mid x_{k-1})\,dx_{1:n-1}, \qquad X_{1:n}^{(i)} \sim \eta_1(x_1) \prod_{k=2}^n K_k(x_k \mid x_{k-1}).$$
  • The optimal backward kernel takes us back to IS on $E$:
$$L_{n-1}^{\mathrm{opt}}(x_{n-1} \mid x_n) = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}{\eta_n(x_n)}.$$
It is difficult to use this kernel as it relies on $\eta_{n-1}$ and $\eta_n$, which are intractable (indeed, this intractability is the reason why we went from SIS to SMC samplers).
  • Sub-optimal kernel: substitute $\pi_{n-1}$ for $\eta_{n-1}$. This is motivated by the fact that, if $\eta_{n-1}$ is a good proposal for $\pi_{n-1}$, then the two should be sufficiently close. First, rewrite the optimal kernel:
$$L_{n-1}^{\mathrm{opt}}(x_{n-1} \mid x_n) = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}{\eta_n(x_n)} = \frac{\eta_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}{\int_E \eta_{n-1}(x_{n-1}')\, K_n(x_n \mid x_{n-1}')\,dx_{n-1}'}.$$
Now substitute $\pi_{n-1}$ for $\eta_{n-1}$:
$$L_{n-1}(x_{n-1} \mid x_n) = \frac{\pi_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})}{\int_E \pi_{n-1}(x_{n-1}')\, K_n(x_n \mid x_{n-1}')\,dx_{n-1}'}.$$
The incremental weights become
$$\widetilde{w}_n(x_{n-1}, x_n) = \frac{\gamma_n(x_n)\, L_{n-1}(x_{n-1} \mid x_n)}{\gamma_{n-1}(x_{n-1})\, K_n(x_n \mid x_{n-1})} = \frac{\gamma_n(x_n)\, \pi_{n-1}(x_{n-1})}{\gamma_{n-1}(x_{n-1}) \int_E \pi_{n-1}(x_{n-1}')\, K_n(x_n \mid x_{n-1}')\,dx_{n-1}'},$$
where $K_n(x_n \mid x_{n-1})$ has cancelled between numerator and denominator. A commonly used special case is recorded just after this list.
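A further standard simplification from Del Moral et al. (2006): if $K_n$ is an MCMC kernel with invariant distribution $\pi_n$ and the backward kernel is chosen as its time reversal with respect to $\pi_n$, then the incremental weight no longer depends on the proposed point $x_n$, so it can be computed before the particles are even moved:
$$L_{n-1}(x_{n-1} \mid x_n) = \frac{\pi_n(x_{n-1})\, K_n(x_n \mid x_{n-1})}{\pi_n(x_n)} \quad\Longrightarrow\quad \widetilde{w}_n(x_{n-1}, x_n) = \frac{\gamma_n(x_n)\,\pi_n(x_{n-1})\,K_n(x_n \mid x_{n-1})}{\gamma_{n-1}(x_{n-1})\,K_n(x_n \mid x_{n-1})\,\pi_n(x_n)} = \frac{\gamma_n(x_{n-1})}{\gamma_{n-1}(x_{n-1})}.$$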

IS Measure Theory

Suppose $\pi_1 : \mathcal{E} \to [0,1]$ is our target probability distribution and $\eta_1 : \mathcal{E} \to [0,1]$ is our IS proposal probability distribution. Suppose that they admit densities with respect to the Lebesgue measure $dx$ on $(E, \mathcal{E})$:
$$\frac{d\pi_1}{dx}(x) = \frac{\tilde{p}_{\pi_1}(x)}{\int_E \tilde{p}_{\pi_1}(y)\,dy} = p_{\pi_1}(x), \qquad \frac{d\eta_1}{dx}(x) = \frac{\tilde{p}_{\eta_1}(x)}{\int_E \tilde{p}_{\eta_1}(y)\,dy} = p_{\eta_1}(x).$$
Suppose that $\pi_1 \ll \eta_1$; then the Radon-Nikodym derivative of $\pi_1$ with respect to $\eta_1$ exists and, writing $w_1(x) := \tilde{p}_{\pi_1}(x)/\tilde{p}_{\eta_1}(x)$ for the unnormalized weight, it equals
$$\frac{d\pi_1}{d\eta_1}(x) = \frac{w_1(x)}{\int_E w_1(y)\,d\eta_1(y)}.$$
We can then approximate the expectation as follows:
$$
\begin{aligned}
\mathbb{E}_{\pi_1}[\phi] &= \int_E \phi(x)\,d\pi_1(x) = \int_E \phi(x)\,\frac{d\pi_1}{d\eta_1}(x)\,d\eta_1(x) = \int_E \phi(x)\,\frac{d\pi_1}{d\eta_1}(x)\,\frac{d\eta_1}{dx}(x)\,dx \\
&= \frac{\int_E \phi(x)\,w_1(x)\,p_{\eta_1}(x)\,dx}{\int_E w_1(y)\,d\eta_1(y)} = \frac{\int_E \phi(x)\,w_1(x)\,p_{\eta_1}(x)\,dx}{\int_E w_1(y)\,\frac{d\eta_1}{dy}(y)\,dy} = \frac{\int_E \phi(x)\,w_1(x)\,p_{\eta_1}(x)\,dx}{\int_E w_1(y)\,p_{\eta_1}(y)\,dy}.
\end{aligned}
$$

Now suppose you have samples $X_1^{(1:N)} \sim \eta_1$; then we can construct the particle approximation
$$\eta_1^N(dx) = \frac{1}{N}\sum_{i=1}^N \delta_{X_1^{(i)}}(dx).$$
Substituting this into the expression for the expectation we get
$$
\mathbb{E}_{\pi_1}[\phi] = \frac{\mathbb{E}_{\eta_1}[\phi\, w_1]}{\mathbb{E}_{\eta_1}[w_1]} \approx \frac{\mathbb{E}_{\eta_1^N}[\phi\, w_1]}{\mathbb{E}_{\eta_1^N}[w_1]} = \frac{\frac{1}{N}\sum_{i=1}^N \int_E \phi(x)\,w_1(x)\,\delta_{X_1^{(i)}}(dx)}{\frac{1}{N}\sum_{i=1}^N \int_E w_1(x)\,\delta_{X_1^{(i)}}(dx)} = \frac{\sum_{i=1}^N \phi(X_1^{(i)})\,w_1(X_1^{(i)})}{\sum_{j=1}^N w_1(X_1^{(j)})} = \sum_{i=1}^N \phi(X_1^{(i)})\,W_1(X_1^{(i)})
$$
where
$$W_1(X_1^{(i)}) = \frac{w_1(X_1^{(i)})}{\sum_{j=1}^N w_1(X_1^{(j)})}.$$

SIS Proposal

Now let $K_n : E \times \mathcal{E} \to [0,1]$ be a Markov kernel. Such kernels operate on the left with measures:
$$\eta_n = \eta_{n-1} K_n \qquad\Longleftrightarrow\qquad \eta_n(A) = \int_E K_n(x, A)\,d\eta_{n-1}(x) = \int_E d\eta_{n-1}(x) \int_A K_n(x, dy).$$
Denote by $K_{n,x} : \mathcal{E} \to [0,1]$ the probability measure in the second argument of the kernel, i.e. $K_{n,x}(A) = K_n(x, A)$. Suppose that $K_{n,x}$ admits a density with respect to the Lebesgue measure:
$$\frac{dK_{n,x}}{dy}(y) = k_n(y \mid x).$$
Then the new measure can be written as
$$
\begin{aligned}
\eta_n(A) &= \int_E \frac{d\eta_{n-1}}{dx}(x)\left[\int_A K_n(x, dy)\right]dx &&\eta_{n-1} \ll dx \\
&= \int_E \frac{d\eta_{n-1}}{dx}(x)\left[\int_A dK_{n,x}(y)\right]dx &&\text{definition of } K_{n,x} \\
&= \int_E p_{\eta_{n-1}}(x)\left[\int_A \frac{dK_{n,x}}{dy}(y)\,dy\right]dx &&K_{n,x} \ll dy \\
&= \int_E p_{\eta_{n-1}}(x)\left[\int_A k_n(y \mid x)\,dy\right]dx \\
&= \int_E \int_A p_{\eta_{n-1}}(x)\,k_n(y \mid x)\,dy\,dx \\
&= \int_A \left[\int_E p_{\eta_{n-1}}(x)\,k_n(y \mid x)\,dx\right]dy.
\end{aligned}
$$
Then, by definition, the expression in brackets must be the density of $\eta_n$ with respect to $dy$:
$$\frac{d\eta_n}{dy}(y) = \int_E p_{\eta_{n-1}}(x)\,k_n(y \mid x)\,dx.$$
Or, more precisely, identifying $x = x_{n-1}$ and $y = x_n$, we have
$$p_{\eta_n}(x_n) = \frac{d\eta_n}{dx_n}(x_n) = \int_E p_{\eta_{n-1}}(x_{n-1})\,k_n(x_n \mid x_{n-1})\,dx_{n-1}.$$
Indeed we have
$$\eta_n(A) = \int_A d\eta_n(x_n) = \int_A \frac{d\eta_n}{dx_n}(x_n)\,dx_n = \int_A p_{\eta_n}(x_n)\,dx_n.$$
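A quick numerical sanity check of $p_{\eta_n}(x_n) = \int_E p_{\eta_{n-1}}(x_{n-1})\,k_n(x_n \mid x_{n-1})\,dx_{n-1}$, using Gaussians (my choice, purely so the convolution is available in closed form): pushing $\eta_{n-1} = N(0, 1)$ through the random-walk kernel $k_n(y \mid x) = N(y;\, x, \tau^2)$ gives $\eta_n = N(0, 1 + \tau^2)$.

```python
import numpy as np

rng = np.random.default_rng(0)
tau = 0.7

# Sample X ~ eta_{n-1} = N(0, 1), then Y ~ K_n(. | X) = N(X, tau^2).
x = rng.standard_normal(200_000)
y = x + tau * rng.standard_normal(x.size)

# The density of eta_n = eta_{n-1} K_n should be N(0, 1 + tau^2).
print(y.mean(), y.var())   # approx. 0 and approx. 1 + tau^2 = 1.49
```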

SMC Proposal

The proposal in the SMC sampler is not given by $\eta_n = \eta_{n-1} K_n$. Indeed, in SMC we don't simply propose $X_n^{(i)}$: we propose $X_n^{(i)}$ and then append it to $X_{1:n-1}^{(i)}$, which was generated according to $\eta_{n-1}$, and so on.

SMC Steps

  • Step $n=1$: Our target is $\pi_1$ and the proposal $\eta_1$ is given. Sample $X_1^{(1:N)} \sim \eta_1$. The weights are the RN-derivative $w_1 \propto \frac{d\pi_1}{d\eta_1}$.
  • Step $n=2$: Move particles forward using a kernel $K_2 : E \times \mathcal{E} \to [0,1]$. We sample $X_2^{(i)} \sim K_2(X_1^{(i)}, \cdot)$. Marginally, each new particle $X_2^{(i)}$ is distributed as $X_2^{(i)} \sim \eta_2 = \eta_1 K_2$, which can be written as
$$
\begin{aligned}
\eta_2(A) &= \int_A d\eta_2 = \int_A d(\eta_1 K_2) = \int_E K_2(x_1, A)\,d\eta_1(x_1) = \int_E \int_A K_2(x_1, dx_2)\,d\eta_1(x_1) \\
&= \int_E \left[\int_A \frac{dK_2(x_1, \cdot)}{dx_2}(x_2)\,dx_2\right]\frac{d\eta_1}{dx_1}(x_1)\,dx_1 = \int_E \left[\int_A k_2(x_2 \mid x_1)\,dx_2\right] p_{\eta_1}(x_1)\,dx_1 \\
&= \int_E \int_A k_2(x_2 \mid x_1)\,p_{\eta_1}(x_1)\,dx_2\,dx_1 = \int_A \left[\int_E k_2(x_2 \mid x_1)\,p_{\eta_1}(x_1)\,dx_1\right]dx_2
\end{aligned}
$$
from which we can see that the density of $\eta_2$ is $\frac{d\eta_2}{dx_2}(x_2) = \int_E k_2(x_2 \mid x_1)\,p_{\eta_1}(x_1)\,dx_1$. In SMC we then append $X_2^{(i)}$ to $X_1^{(i)}$ to get $X_{1:2}^{(i)}$, so our aim is now to find a measure for it.

Define $\eta_{1:2} := \eta_1 \otimes K_2$ to be the following measure on $E \times E$:
$$\eta_{1:2}(A \times B) := (\eta_1 \otimes K_2)(A \times B) = \int_A K_2(x_1, B)\,d\eta_1(x_1), \qquad A \times B \in \mathcal{E} \otimes \mathcal{E}.$$
Since $\eta_1 \ll dx_1$ and $K_2(x_1, \cdot) \ll dx_2$, by a standard result on product measures we have that $\eta_{1:2} \ll d(x_1 \times x_2)$ and
$$\frac{d\eta_{1:2}}{d(x_1 \times x_2)}(x_{1:2}) = \frac{d\eta_1}{dx_1}(x_1)\,\frac{dK_2(x_1, \cdot)}{dx_2}(x_2) = p_{\eta_1}(x_1)\,k_2(x_2 \mid x_1).$$
If we also define the extended target as the measure
$$\pi_{1:2}(A \times B) := (\pi_2 \otimes L_1)(B \times A) = \int_B L_1(x_2, A)\,d\pi_2(x_2)$$
then by the same arguments as above its Radon-Nikodym derivative will be given by
$$\frac{d\pi_{1:2}}{d(x_1 \times x_2)}(x_{1:2}) = \frac{dL_1(x_2, \cdot)}{dx_1}(x_1)\,\frac{d\pi_2}{dx_2}(x_2) = \ell_1(x_1 \mid x_2)\,p_{\pi_2}(x_2)$$
where $\ell_1$ is the density of $L_1(x_2, \cdot)$ with respect to $dx_1$. As long as $\pi_{1:2} \ll \eta_{1:2}$, the weights are given by
$$
\begin{aligned}
w_{1:2}(x_{1:2}) &= \frac{d\pi_{1:2}}{d\eta_{1:2}}(x_{1:2}) \\
&= \frac{d\pi_{1:2}}{d(x_1 \times x_2)}\,\frac{d(x_1 \times x_2)}{d\eta_{1:2}}(x_{1:2}) &&\text{chain rule} \\
&= \frac{d\pi_{1:2}}{d(x_1 \times x_2)}\left(\frac{d\eta_{1:2}}{d(x_1 \times x_2)}\right)^{-1}(x_{1:2}) &&\text{see below} \\
&= \frac{p_{\pi_2}(x_2)\,\ell_1(x_1 \mid x_2)}{p_{\eta_1}(x_1)\,k_2(x_2 \mid x_1)} \\
&= \frac{p_{\pi_2}(x_2)\,\ell_1(x_1 \mid x_2)}{p_{\pi_1}(x_1)\,k_2(x_2 \mid x_1)}\cdot\frac{p_{\pi_1}(x_1)}{p_{\eta_1}(x_1)} \\
&\propto \frac{\tilde{p}_{\pi_2}(x_2)\,\ell_1(x_1 \mid x_2)}{\tilde{p}_{\pi_1}(x_1)\,k_2(x_2 \mid x_1)}\cdot\frac{\tilde{p}_{\pi_1}(x_1)}{\tilde{p}_{\eta_1}(x_1)}
\end{aligned}
$$
where on line 3 we have used the fact that $\frac{d\mu}{d\nu} = \left(\frac{d\nu}{d\mu}\right)^{-1}$ when both derivatives exist, and on line 5 we have basically multiplied by $1 = \frac{d\pi_1}{d\pi_1} = \frac{d\pi_1}{dx_1}\left(\frac{d\pi_1}{dx_1}\right)^{-1}$.

  • Step $n$: Target is $\pi_{1:n}$. Perform importance sampling using $\eta_{1:n}$, which is built up from $\eta_1$ and the forward kernels,
$$\eta_{1:n} = \eta_1 \otimes K_2 \otimes \cdots \otimes K_n,$$
with density given by the product rule
$$\frac{d\eta_{1:n}}{dx_{1:n}} = p_{\eta_1}(x_1)\prod_{k=2}^n k_k(x_k \mid x_{k-1}).$$
Similarly, the extended target is built up from $\pi_n$ and the backward kernels,
$$\pi_{1:n} = \pi_n \otimes L_{n-1} \otimes \cdots \otimes L_1,$$
with density
$$\frac{d\pi_{1:n}}{dx_{1:n}} = p_{\pi_n}(x_n)\prod_{k=1}^{n-1} \ell_k(x_k \mid x_{k+1}).$$
The weights are then given by
$$w_{1:n} = \frac{d\pi_{1:n}}{d\eta_{1:n}} \propto \frac{\tilde{p}_{\pi_n}(x_n)\,\ell_{n-1}(x_{n-1} \mid x_n)}{\tilde{p}_{\pi_{n-1}}(x_{n-1})\,k_n(x_n \mid x_{n-1})}\, w_{1:n-1}.$$
A small numerical sanity check of this path-space weight follows the list.
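Here is a small sketch of one extend-and-reweight step on path space. All concrete choices are mine for illustration: $\eta_1 = \pi_1 = N(0,1)$, a symmetric random-walk $K_2$, the backward kernel $L_1 = K_2$, and target $\pi_2 = N(1, 0.5^2)$. With symmetric kernels the $\ell_1 / k_2$ ratio cancels, leaving $w_{1:2} \propto \tilde{p}_{\pi_2}(x_2)/\tilde{p}_{\pi_1}(x_1)$:

```python
import numpy as np

rng = np.random.default_rng(0)
N, tau = 200_000, 1.0

log_p1 = lambda x: -0.5 * x**2                    # unnormalized log pi_1 = N(0, 1)
log_p2 = lambda x: -0.5 * ((x - 1.0) / 0.5) ** 2  # unnormalized log pi_2 = N(1, 0.5^2)

x1 = rng.standard_normal(N)               # X_1 ~ eta_1 = pi_1, so w_1 = 1
x2 = x1 + tau * rng.standard_normal(N)    # X_2 ~ K_2(. | X_1), appended to the path

# Path-space weight with L_1 = K_2 symmetric: the kernel densities cancel.
logw = log_p2(x2) - log_p1(x1)
W = np.exp(logw - logw.max()); W /= W.sum()
print(np.sum(W * x2))                     # approx. E_{pi_2}[x] = 1
```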