Approximating Tensor Contractions with Annealed Importance Sampling

Tensors are ubiquitous in modern information processing. In classical information processing, tensors represent high-dimensional data such as images, videos, and text. In quantum computing, tensors encode quantum states and operations, and contracting networks of them is a standard way to simulate quantum circuits.

For example, consider a quantum circuit represented by a sequence of unitaries \( \mathcal{U} = \{U_1, U_2, \ldots, U_m\} \). The probability of measuring a particular bitstring \( x = (x_1, x_2, \ldots, x_n) \) is given by \[ \mathrm{Pr}(x) = \left| \langle x | U_m \cdots U_2 U_1 | 0^{\otimes n} \rangle \right|^2, \] where \( |0^{\otimes n} \rangle \) is the initial state of the circuit. The amplitude corresponds to a tensor network contraction: each gate \( U_i \in \mathcal{U}\) is represented as a low-rank tensor, and the circuit structure defines the connectivity. Writing \( \mathcal{V} \) for the set of tensors in the network (gates, the initial state, and the measurement projectors), the contraction sums over all internal indices connecting them: \[ \langle x | U_m \cdots U_2 U_1 | 0^{\otimes n} \rangle = \sum_{\text{internal indices}}\prod_{v \in \mathcal{V}} T_v[\text{indices}]. \] Evaluating this quantity exactly is \(\mathsf{\#P}\)-hard in general, as it requires summing over an exponential number of intermediate states. However, many practical applications only require approximate estimates of these contractions, which can be obtained with Monte Carlo methods.
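
As a concrete (and deliberately tiny) illustration, the following NumPy sketch contracts the network for a two-qubit Bell-state circuit, a Hadamard followed by a CNOT. The gate tensors, index labels, and wire conventions here are illustrative choices, not tied to any particular tensor network library.
\begin{verbatim}
# Amplitude <x| CNOT (H on qubit 0) |00> as a tensor network contraction.
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)        # Hadamard, shape (2, 2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]]).reshape(2, 2, 2, 2)  # (out0, out1, in0, in1)
zero = np.array([1.0, 0.0])                          # |0> on each wire

# Sum over internal indices: a, b are the input wires, i is the wire
# between H and CNOT; c, d are the open output indices fixed by x.
psi_out = np.einsum('cdib,ia,a,b->cd', CNOT, H, zero, zero)

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(f"Pr({x}) = {abs(psi_out[x])**2:.3f}")     # 0.5 for 00 and 11, else 0
\end{verbatim}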

Annealed importance sampling (AIS), introduced by Radford Neal in 2001, is a Monte Carlo method that combines importance sampling with Markov chain Monte Carlo (MCMC) transitions to estimate partition functions or marginal likelihoods of high-dimensional, multimodal distributions. Suppose we want to estimate the intractable normalization constant \(Z\) of a target distribution \[ \pi_K (x) = \frac{\psi(x)}{Z}. \] AIS interpolates between a simple proposal distribution \(\pi_0(x)\) and the target \(\pi_K(x)\) by introducing a sequence of intermediate distributions \[ \pi_k(x) \propto \psi(x)^{\beta_k},\quad \beta_0 = 0 < \beta_1 < \cdots < \beta_K = 1. \]

By gradually "annealing" through these distributions using MCMC moves that leave each \(\pi_k\) invariant, AIS constructs a sequence of states \(\mathbf{x}^{(0)}, \mathbf{x}^{(1)}, \ldots, \mathbf{x}^{(K)}\) and corresponding importance weights; the resulting estimator is unbiased even for a finite number of annealing steps. In practice we run \(N\) independent chains, each initialized from \(\pi_0\), and at every step \(k\) we update each chain with MCMC moves targeting \(\pi_k\) while accumulating a weight factor. After \(K\) steps, we average the product of weight factors across chains to obtain an estimate of the normalization constant: \[ \hat Z = Z_0 \cdot \frac{1}{N} \sum_{i=1}^N \prod_{k=1}^K \psi\left(\mathbf{x}_i^{(k-1)}\right)^{\beta_k - \beta_{k-1}}, \] where \(Z_0\) is the known normalizing constant of \(\pi_0\) and each factor estimates a ratio of successive normalizing constants \(Z_k\), written as an expectation under the previous intermediate distribution: \[ \frac{Z_k}{Z_{k-1}} = \mathbb{E}_{\mathbf{x} \sim \pi_{k-1}} \left[ \psi \left(\mathbf{x}\right)^{\beta_k - \beta_{k-1}}\right]. \]
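
Multiplying these ratios telescopes back to the target normalization, which is the identity the estimator above exploits: \[ Z = Z_K = Z_0 \prod_{k=1}^{K} \frac{Z_k}{Z_{k-1}}, \] so each annealing step only has to estimate a ratio between two nearby distributions rather than \(Z\) directly.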

        \begin{algorithm}
        \caption{Annealed Importance Sampling}
        \begin{algorithmic}
        \Require \(N\) chains, annealing ladder \(\beta_0 = 0 < \beta_1 < \cdots < \beta_K = 1\)
        \State Sample \(x^{(0,i)}\sim \pi_0\) for \(i=1,\dots,N\)
        \For{\(k=1\) to \(K\)}
        \For{\(i=1\) to \(N\)}
            \State \(w_k^{(i)} \gets \psi(x^{(k-1,i)})^{\beta_k-\beta_{k-1}}\) \Comment{weight uses the state before the transition}
            \State Obtain \(x^{(k,i)}\) from \(x^{(k-1,i)}\) via MCMC moves that leave \(\pi_k\) invariant
        \EndFor
        \EndFor
        \State \(\hat Z \gets Z_0 \cdot \frac{1}{N} \sum_{i=1}^N \prod_{k=1}^K w_k^{(i)}\)
        \end{algorithmic}
        \end{algorithm}
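
To make the procedure concrete, here is a minimal NumPy sketch of the algorithm for a toy target \(\psi(x) = e^{\sum_i x_i}\) over \(n\)-bit strings, for which \(Z = (1+e)^n\) is known in closed form; the target, the linear \(\beta\) ladder, and the single-site Metropolis updates are illustrative assumptions, not part of AIS itself.
\begin{verbatim}
# Minimal AIS sketch: anneal from the uniform distribution pi_0 (Z_0 = 2**n)
# to the toy target psi(x) = exp(sum_i x_i) over n-bit strings.
import numpy as np

rng = np.random.default_rng(0)
n, N, K = 10, 200, 50                       # bits, chains, annealing steps
betas = np.linspace(0.0, 1.0, K + 1)        # beta_0 = 0 < ... < beta_K = 1

def log_psi(x):
    """log psi(x) = sum_i x_i, so Z = (1 + e)**n exactly."""
    return x.sum(axis=-1).astype(float)

x = rng.integers(0, 2, size=(N, n))         # N chains initialized from pi_0
log_w = np.zeros(N)                         # accumulated log importance weights

for k in range(1, K + 1):
    # Weight factor uses the state *before* the transition at step k.
    log_w += (betas[k] - betas[k - 1]) * log_psi(x)
    # One sweep of single-site Metropolis updates targeting pi_k.
    for j in range(n):
        prop = x.copy()
        prop[:, j] ^= 1                     # flip bit j in every chain
        log_accept = betas[k] * (log_psi(prop) - log_psi(x))
        accept = np.log(rng.random(N)) < log_accept
        x[accept] = prop[accept]

# hat Z = Z_0 * mean_i exp(log_w_i); combine in log space for stability.
log_Z_hat = n * np.log(2.0) + np.logaddexp.reduce(log_w) - np.log(N)
print(f"AIS log Z estimate: {log_Z_hat:.3f}, exact: {n * np.log(1 + np.e):.3f}")
\end{verbatim}
Working in log space and combining the chains with a log-sum-exp keeps the products of weight factors from underflowing, which matters once \(K\) grows.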