5. July 2023
QCD partition function \[ \mathcal{Z}\ =\ \int\mathcal{D}\bar\psi\,\mathcal{D}\psi\,\mathcal{D}U\, e^{-S[\bar\psi,\psi,U]} \]
Euclidean lattice action \(S\)
fermion fields Grassmann valued, difficult to represent on a computer
\(\Rightarrow\quad\) integrate out the fermion fields
\[ \mathcal{Z}\ =\ \int\,\mathcal{D}U\,[\det(D)]^2\, e^{-S_g[U]} \]
and represent the determinant stochastically
\[ \mathcal{Z}\ =\ \int\mathcal{D}\phi^\dagger\,\mathcal{D}\phi\,\mathcal{D}U\, e^{-\phi^\dagger\frac{1}{D[U]^2}\phi - S_g[U]} \]
with pure gauge action \(S_g\)
\(D\) the Lattice Dirac operator
\(\phi^\dagger,\phi\) so-called pseudo-fermion (bosonic) fields
Traded the Grassmann-valued fields for a highly non-local term \[ S_\mathrm{eff} = S_\mathrm{PF} + S_g\,,\qquad S_\mathrm{PF}\ =\ \phi^\dagger\,D[U]^{-2}\,\phi \] due to the matrix inverse!
\(\Rightarrow\quad\) local Metropolis updates unrealistic
random global updates?
Recall the steps for the local Metropolis algorithm
Starting from initial \(x\)
the proposal needs to be ergodic
Metropolis algorithm fulfils detailed balance \[ g(x)\ P_\mathrm{acc}(x\to x') = g(x')\ P_\mathrm{acc}(x'\to x) \]
\(\Rightarrow\quad\) algorithm generates a Markov chain
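A minimal sketch of the local Metropolis step for a 1-d toy density (function names and the Gaussian example are illustrative, not from the lecture):

```python
import math
import random

def metropolis(g, x0, n_steps, step=1.0):
    """Local Metropolis sampling of an (unnormalised) density g.

    Proposal: x' = x + uniform(-step, step); it is symmetric, so the
    min(1, g(x')/g(x)) acceptance fulfils detailed balance.
    """
    x = x0
    chain = []
    for _ in range(n_steps):
        xp = x + random.uniform(-step, step)
        # accept/reject with probability min(1, g(x')/g(x))
        if random.random() < min(1.0, g(xp) / g(x)):
            x = xp
        chain.append(x)
    return chain

# toy example: standard normal target, g need not be normalised
chain = metropolis(lambda x: math.exp(-0.5 * x * x), x0=0.0, n_steps=20000)
```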
this works well for pure gauge theory, where the action is local
again, with fermions this is no longer the case
Goal: smartly replace the local update by a global one!
Introduce an artificial Hamiltonian \[ \mathcal{H}(p, x)\ =\ \frac{1}{2}\sum p^2 - \ln(g(x)) \] with canonical momenta \(p\in\mathbb{R}^n\) and a one-parameter family of deterministic maps \[ \varphi^\tau(p, x): (p, x) \to (p', x') \] with \(\tau\in\mathbb{R}\) which is reversible, i.e. for all \(p, x\) \[ \varphi^\tau(p, x)=(p', x')\quad \Rightarrow\quad \varphi^\tau(-p',x') = (-p, x) \]
Then the HMC algorithm is defined by the following steps
Starting from initial \(x\in\mathbb{R}^n\)
1. heatbath: \(p\ \sim\ \mathcal{N}_{0,1}\)
2. set \((p', x') = \varphi^\tau(p, x)\) with \(\tau\neq 0\)
3. accept/reject \((p', x')\) with probability \[ P_\mathrm{acc}\ =\ \min\{1, \exp(-\Delta\mathcal{H})\}\,,\quad \Delta\mathcal{H} = \mathcal{H}(p', x') - \mathcal{H}(p,x) \]
4. restart at 1.
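The steps above can be sketched for a 1-d target; here \(\varphi^\tau\) is the leap-frog map discussed further below, and the Gaussian target is an illustrative choice of mine:

```python
import math
import random

def leapfrog(p, x, grad_U, dtau, n_steps):
    """Reversible, symplectic leap-frog for H(p,x) = p^2/2 + U(x),
    with U(x) = -ln g(x)."""
    p = p - 0.5 * dtau * grad_U(x)       # initial half kick
    for _ in range(n_steps - 1):
        x = x + dtau * p
        p = p - dtau * grad_U(x)
    x = x + dtau * p
    p = p - 0.5 * dtau * grad_U(x)       # final half kick
    return p, x

def hmc(U, grad_U, x0, n_traj, dtau=0.1, n_steps=10):
    x = x0
    chain = []
    for _ in range(n_traj):
        p = random.gauss(0.0, 1.0)                      # 1. heatbath
        H_old = 0.5 * p * p + U(x)
        p_new, x_new = leapfrog(p, x, grad_U, dtau, n_steps)  # 2. map
        H_new = 0.5 * p_new * p_new + U(x_new)
        # 3. accept/reject with min(1, exp(-(H_new - H_old)))
        if random.random() < min(1.0, math.exp(H_old - H_new)):
            x = x_new
        chain.append(x)                                 # 4. restart
    return chain

# toy target: U(x) = x^2/2, i.e. a standard normal distribution
chain = hmc(lambda x: 0.5 * x * x, lambda x: x, 0.0, n_traj=5000)
```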
Don’t need to know the normalisation: let \(g(x)\propto f(x)\)
The HMC transition probability \[ \begin{split} \mathbb{P}_\mathrm{HMC}(x, x') = \int d p\ &\mathcal{N}_{0,1}(p)\, \delta(x'-\varphi^\tau_x(p, x))\\ &\mathbb{P}_\mathrm{acc}[p, x \to \varphi^\tau(p,x)]\,.\\ \end{split} \] We can introduce an integral over \(p'\) as well, which gives \[ \begin{split} \mathbb{P}_\mathrm{HMC}(x, x') = \int d p\,d p'\ &\mathcal{N}_{0,1}(p)\, \delta((p',x')-\varphi^\tau(p, x))\\ &\mathbb{P}_\mathrm{acc}[p, x \to p',x']\,.\\ \end{split} \]
Due to the Metropolis accept/reject step, we have detailed balance with respect to \(\exp(-\mathcal{H})\), and thus \[ \begin{split} e^{-\mathcal{H}(p, x)}\,&\mathbb{P}_\mathrm{acc}[p, x \to p',x']\\ & = e^{-\mathcal{H}(p', x')}\,\mathbb{P}_\mathrm{acc}[p', x' \to p,x]\,.\\ \end{split} \] Moreover, \[ f(x)\, \mathcal{N}_{0,1}(p) = C e^{-\mathcal{H}(p, x)} \] with a normalisation factor \(C\).
Using these properties and the reversibility of \(\varphi^\tau\) it follows \[ \begin{split} f(x)\, &\mathbb{P}_\mathrm{HMC}(x, x') \\ = C\int &d p\,d p'\, e^{-\mathcal{H}(-p', x')}\, \delta((-p,x)-\varphi^\tau(-p', x'))\\ &\quad\mathbb{P}_\mathrm{acc}[-p', x' \to -p,x]\,,\\ \end{split} \] where we have also used that \(\mathcal{H}\) as well as the acceptance probability are agnostic to the sign of \(p\). Since \(d pd p' = d(-p)d(-p')\), detailed balance follows.
… so, what about this \(\varphi^\tau\)??
\(\Rightarrow\) HMC generates a Markov Chain for distribution \(g(x)\)!
The only option I know is the map \(\varphi^\tau\) induced by Hamilton’s EOM \[ \dot p_i \equiv \frac{d p_i}{d\tau} = -\frac{\partial\mathcal{H}}{\partial x_i}\,,\qquad \dot x_i \equiv \frac{d x_i}{d\tau} = \frac{\partial\mathcal{H}}{\partial p_i}\,. \]
in practice: need to solve EOMs numerically
use a symplectic, reversible integration scheme like for instance the Leap-Frog integration scheme
these conserve a shadow Hamiltonian \(\mathcal{H}_s\neq\mathcal{H}\), which differs from \(\mathcal{H}\) by corrections in powers of the integration step size \(\Delta\tau\)
Perform a \(\chi^2\) fit using the HMC \[ \mathcal{H} = \frac{1}{2}(p_a^2 + p_b^2) + \chi^2\,,\qquad \chi^2 = \frac{1}{2}\frac{(y - at - b)^2}{\Delta y^2} \]
Derivatives \[ \frac{d\mathcal{H}}{da} = -t\frac{y-at-b}{\Delta y^2}\,,\qquad\frac{d\mathcal{H}}{db}=-\frac{y-at-b}{\Delta y^2} \]
Leap-Frog \[ \begin{split} p_{a,b}^{i+1/2} &= p_{a,b}^i - \frac{d\mathcal{H}}{d\{a,b\}}\,\Delta\tau/2\\ \{a,b\}^{i+1} &= \{a,b\}^{i} + p_{a,b}^{i+1/2}\Delta\tau\\ p_{a,b}^{i+1} &= p_{a,b}^{i+1/2} - \frac{d\mathcal{H}}{d\{a,b\}}\,\Delta\tau/2\\ \end{split} \]
pseudo-fermion fields \(\phi^\dagger,\phi\) fixed for each trajectory
\(\mathrm{SU}(N)\) generators \(t_a\)
momenta \(p_a(\mu, x)\) conjugate to the \(\alpha_a(\mu, x)\)
derivative in direction \(a\) given by (left) Lie derivative \[ D_a f(U) = -i\frac{d}{d\epsilon} f(e^{i\epsilon t_a} U)|_{\epsilon = 0} \]
large acceptance guaranteed by symplectic integration scheme!
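The Lie-derivative definition can be checked numerically for \(\mathrm{SU}(2)\), where the exponential is known in closed form (the normalisation \(t_a=\sigma_a/2\) and the test function are assumptions of mine):

```python
import numpy as np

# su(2): generators t_a = sigma_a / 2 (assumed normalisation)
sigma = [np.array([[0, 1], [1, 0]], complex),
         np.array([[0, -1j], [1j, 0]], complex),
         np.array([[1, 0], [0, -1]], complex)]

def exp_i_eps_t(a, eps):
    # exact exponential exp(i eps t_a) for su(2)
    return np.cos(eps / 2) * np.eye(2) + 1j * np.sin(eps / 2) * sigma[a]

def lie_deriv(f, U, a, eps=1e-5):
    # central finite difference of D_a f(U) = -i d/deps f(e^{i eps t_a} U)|_0
    return -1j * (f(exp_i_eps_t(a, eps) @ U)
                  - f(exp_i_eps_t(a, -eps) @ U)) / (2 * eps)

# a fixed SU(2) element via the exact exponential, U = exp(i theta n.sigma)
theta, n = 0.8, np.array([0.36, 0.48, 0.8])       # |n| = 1
U = (np.cos(theta) * np.eye(2)
     + 1j * np.sin(theta) * sum(ni * si for ni, si in zip(n, sigma)))

f = lambda V: np.trace(V)    # simple test function
# analytic result: D_a tr(U) = -i tr(i t_a U) = tr(t_a U)
exact = [np.trace(0.5 * sigma[a] @ U) for a in range(3)]
```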
\(\Rightarrow\quad\) in 2001: simulation at the physical point totally unrealistic!
Since then, there have been two key improvements!
Hasenbusch, 2001; Lüscher, 2005; CU et al., 2005; Clark and Kennedy, 2006, …
Lüscher, 2007; Babich et al., 2010; Frommer et al., 2011, …
\[ \det(Q^2)\ =\ \frac{\det(Q^2)}{\det(Q^2+\rho_1^2)}\cdot \frac{\det(Q^2+\rho_1^2)}{\det(Q^2+\rho_2^2)}\cdot\ldots\cdot\det(Q^2+\rho_n^2) \]
\(\Rightarrow\) four effects of this preconditioning
in particular: tune \(\rho_i\) parameters such that
Note: there are other such preconditioners available!
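The effect of one such ratio can be seen on a toy Hermitian matrix standing in for \(Q\) (presumably the Hermitian form of \(D\)); the size, seed and shift \(\rho\) below are illustrative choices of mine:

```python
import numpy as np

# toy Hermitian stand-in for Q; sizes and values are illustrative only
rng = np.random.default_rng(0)
A = rng.normal(size=(8, 8))
Q = 0.5 * (A + A.T)
Q2 = Q @ Q
rho = 1.0                        # assumed Hasenbusch shift parameter

ev = np.linalg.eigvalsh(Q2)      # eigenvalues of Q^2, all positive
cond_full = ev[-1] / ev[0]       # condition number of Q^2

# eigenvalues of the ratio operator Q^2 (Q^2 + rho^2)^{-1} are l/(l + rho^2)
ev_ratio = ev / (ev + rho**2)
cond_ratio = ev_ratio[-1] / ev_ratio[0]
# cond_ratio < cond_full: each ratio is much better conditioned,
# giving smaller pseudo-fermion forces and cheaper solves
```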
Assume: \(\mathcal{H}=\frac{1}{2}\sum p^2 + S_0(x) + S_1(x) + \ldots + S_n(x)\)
Define (\(j=0,1, \ldots n\)): \[ \begin{split} T_\mathrm{x}(\Delta\tau)&:\quad x\quad\to\quad x' = x + \Delta\tau p \\ T_{\mathrm{S}_j}(\Delta\tau)&:\quad p\quad\to\quad p' = p - \Delta\tau\frac{\partial S_j}{\partial x}\\ \end{split} \]
then recursively: \[ \begin{split} \Bigl.\Bigr.T_0 &= T_{\mathrm{S}_0}(\Delta\tau_0/2)\ T_x(\Delta\tau_0)\ T_{\mathrm{S}_0}(\Delta\tau_0/2)\, ,\\ \Bigl.\Bigr.T_j &= T_{\mathrm{S}_j}(\Delta\tau_j/2)\ [T_{j-1}]^{N_{j-1}}\ T_{\mathrm{S}_j}(\Delta\tau_j/2)\\ \end{split} \]
trajectory of length \(\tau\): \(\bigl[ T_n\bigr]^{N_n}\)
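The recursion can be sketched for scalar \(p, x\) (the example split into a "fast" \(S_0\) and a "slow" \(S_1\), and all step sizes, are my illustrative choices):

```python
def nested_leapfrog(p, x, grads, dtaus, nsteps):
    """Recursive multiple-time-scale leap-frog following the slide:
        T_0 = T_S0(dt_0/2) T_x(dt_0) T_S0(dt_0/2),
        T_j = T_Sj(dt_j/2) [T_{j-1}]^{N_{j-1}} T_Sj(dt_j/2),
    trajectory = [T_n]^{N_n}.
    grads[j](x) returns dS_j/dx; dtaus[j] = dtau_j, consistent via
    dtau_{j-1} = dtau_j / N_{j-1}; nsteps[j] = N_j."""
    def T(j, p, x):
        p -= 0.5 * dtaus[j] * grads[j](x)
        if j == 0:
            x += dtaus[0] * p
        else:
            for _ in range(nsteps[j - 1]):
                p, x = T(j - 1, p, x)
        p -= 0.5 * dtaus[j] * grads[j](x)
        return p, x

    n = len(grads) - 1
    for _ in range(nsteps[n]):
        p, x = T(n, p, x)
    return p, x

# toy split: S_0 = x^2/2 integrated on the fine scale,
#            S_1 = 0.05 x^4 on the coarse scale
grads = [lambda x: x, lambda x: 0.2 * x**3]
dtaus = [0.02, 0.1]      # dtau_0 = dtau_1 / N_0 with N_0 = 5
nsteps = [5, 10]         # trajectory length tau = N_1 * dtau_1 = 1.0
```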
Hasenbusch plus multiple time scales
simulations at the physical point become realistic!
quark mass dependence still challenging!
any further improvement would require reducing the effort for \(D^{-2}\)
one could compute the spectrum of \(D\) up to some threshold \(\Rightarrow\) too many modes for this to be feasible (let alone faster)
divide the lattice into \(N_b\) blocks
denote them \(\Lambda(\vec b)\) with block coordinates \(\vec b\)
2d example in the right panel
generate \(N=\mathcal{O}(10)\) approximate lowest eigenvectors \(\psi_l\) of \(D\) (global)
restrict to blocks \[ \phi_l^{\vec b}(x) = \begin{cases} \psi_l(x) & \textrm{if}\ x \in \Lambda(\vec b)\, ,\\ 0 & \textrm{otherwise}\,. \end{cases} \] and orthonormalise
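A sketch of the restriction and block-wise orthonormalisation for a 2d lattice (scalar per site for illustration; in QCD each site also carries spin and colour indices, and the sizes below are assumed):

```python
import numpy as np

def block_basis(psis, bx, by):
    """Restrict N global vectors to blocks and orthonormalise per block.

    psis: (N, Lx, Ly) array of approximate low modes of D.
    Returns N_b * N vectors, each supported on a single block.
    """
    N, Lx, Ly = psis.shape
    basis = []
    for x0 in range(0, Lx, bx):
        for y0 in range(0, Ly, by):
            # restrict the N global vectors to this block
            blk = psis[:, x0:x0 + bx, y0:y0 + by].reshape(N, -1)
            # orthonormalise on the block via QR (Gram-Schmidt)
            q, _ = np.linalg.qr(blk.T)
            for l in range(q.shape[1]):
                v = np.zeros((Lx, Ly), dtype=psis.dtype)
                v[x0:x0 + bx, y0:y0 + by] = q[:, l].reshape(bx, by)
                basis.append(v)
    return basis
```

Vectors on different blocks have disjoint support, so the full set of \(N_b\cdot N\) vectors is orthonormal.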
this generates \(N_b\cdot N\) vectors
key points of this construction so far
a relatively cheap-to-generate set of basis vectors \(\phi_l^{\vec b}\)
local coherence: they can approximate the space of low modes
\(\Rightarrow\) this can be used as a preconditioner
roughly speaking:
deflate \(D\): little Dirac operator \(A\)
solve \(A\) on the sub-space
use this as an approximate preconditioner for a Krylov subspace method on the full space
\(A\) can be deflated with the global modes
\(\Rightarrow\) a cycle of restriction and prolongation \(\Rightarrow\) Multigrid
result:
the concept has been further improved
the multigrid solver roughly changes \[ m_q^{-3}\quad\to\quad m_q^{-1} \] in the cost formula
example plot for \(a=0.08\ \mathrm{fm}\), \(L=1.92\ \mathrm{fm}\) and 1000 trajectories
only illustrative; in practice the details will differ
\[ \mathrm{Cost}\propto a^{-10}\ ? \]
some quantities not affected by large \(\tau_\mathrm{int}\)-values
however, how to interpret results if \(Q\) not properly sampled?
\(\Rightarrow\) normalising flows:
learn a diffeomorphism between e.g. a Gaussian distribution and \(e^{-S}\)
learning takes time, but sampling is then basically for free
problematic volume scaling