8

In QAOA algorithm, two terms are being discussed; 1) clause or cost (C) Hamiltonian and 2) mixer consisting of pauli X gates.

What is the role of this mixer? Not clear why it comes after the C. Doesn't it cause the state to flip after evaluating C?

John Parker
  • 1,081
  • 3
  • 11
  • 1
    The purpose of the mixer term is to make sure that if somehow your trial state is an eigenstate of the cost Hamiltonian $H_C$, that is $H_C|\psi (\theta) \rangle = E |\psi (\theta) \rangle$, then you can get out of this state. If you don't have this mixer, and you continue applying $e^{i\alpha_k H_C}$ to the state $|\psi \rangle$, you will remain in the state $|\psi\rangle$. And $H_C$ have many eigenstates so you might very well stuck in an undesirable eigenstate! In fact, you need to make sure the mixer Hamiltonian, $H_M$ anti-commute with $H_C$. Otherwise you will still be stuck... – KAJ226 May 18 '21 at 23:13
  • The reason for this is because if $H_M$ and $H_C$ commute then they have common eigenstates... so the eigenstate of $H_C$ will be the eigenstate of $H_M$. That is why you often see QAOA picked their $H_M$ to be something like $\bigotimes\sigma_X$. But you can choose it to be whatever... as long as $H_M$ anti-commute with $H_C$. – KAJ226 May 18 '21 at 23:16
  • Thanks. Kinda understood. sorry due to lack of experience. HC has many egienstates? why? because HC has many terms? – John Parker May 19 '21 at 01:57
  • $H_C$ is some hermitian operator, in this case, a hermitian matrix, and and $n \times n$ hermitian matrix will have $n$ eigenvectors. – KAJ226 May 19 '21 at 05:23

1 Answers1

12

Probably the easiest way to understand this is to pretend that the mixer is NOT there and see what happens. So, let's assume you have some initial state $\lvert \psi \rangle = \sum_x \psi_x \lvert x \rangle$ and you want to use QAOA to find the ground state of some cost Hamiltonian $H_C$. I'm using the notation $\big\{\lvert x \rangle : x \in \{\pm 1\}^n \big\}$ for the $\sigma^z$-basis (the computational basis). Note that the cost Hamiltonian $H_C$ will be diagonal in this basis. $$ H_C = \sum_x E_x \lvert x \rangle \langle x \rvert $$ Applying the phase-shifter $U(\gamma) = \exp(-i \gamma H_C)$ to $\lvert \psi \rangle$ one obtains $$ \lvert \psi(\gamma) \rangle \equiv U(\gamma) \lvert \psi \rangle = \sum_x e^{-i\gamma E_x} \psi_x \lvert x \rangle $$ Let's stop here for now. If you measure the new state $\lvert \psi(\gamma) \rangle$ with respect to the computational basis you get a configuration $x \in \{\pm 1\}^n$ with probability $$ p_x = \lvert \langle x|\psi(\gamma) \rangle \rvert^2 = \lvert e^{-i\gamma E_x} \psi_x \rvert^2 = \lvert \psi_x \rvert^2 $$ which is exactly the probability you would have got if you had measured the state $\lvert \psi \rangle$ to begin with. Evolution with respect to $H_C$ has not changed this probability distribution so we haven't really gained anything.

Note that this implies that for any observable that is diagonal in the $\sigma^z$-basis, e.g. for the cost Hamiltonian $H_C$, we have $$ \langle \psi(\gamma) \rvert H_C \lvert \psi(\gamma) \rangle = \langle \psi \rvert U^{\dagger}(\gamma) H_C U(\gamma) \lvert \psi \rangle = \langle \psi \rvert H_C \lvert \psi \rangle $$ since $\langle x \rvert H_C \lvert y \rangle = E_x \delta_{xy}$. This is bad since the average energy functional (here I'm suppressing the dependence on the mixing time $\beta$ since we're pretending we're not using a mixer) $$ f(\gamma) = \langle \psi(\gamma) \rvert H_C \lvert \psi(\gamma) \rangle $$ is used by the classical optimization part of the QAOA in order to optimize the state $ \lvert \psi(\gamma) \rangle$: you want to find a value $\gamma$ that minimizes $f(\gamma)$. But we just saw that $f(\gamma)$ is a constant function of $\gamma$ so there's nothing to optimize here. You can't go down in energy (that is, in "cost") by choosing different values of $\gamma$.

From a physical point of view this is perfectly obvious: the system is evolving under closed-system dynamics and the energy (represented here by $H_C$) is conserved.

All of this changes if you then apply a mixer (where e.g. $B= \sum_i \sigma_i^x$) $$ U(\beta) \equiv \exp(-i\beta B) $$ to your state $\lvert \psi(\gamma) \rangle$ so that you get $$ \lvert \psi(\gamma,\beta) \rangle \equiv \exp(-i\beta B)\exp(-i\gamma H_C) \lvert \psi \rangle $$ Since $H_C$ and $B$ do not commute, the dynamics generated by $B$ will not in general conserve the "energy" (i.e. the cost) $H_C$.

Now the (correct) QAOA energy functional $$ f(\gamma,\beta) = \langle \psi(\gamma,\beta) \rvert H_C \lvert \psi(\gamma,\beta) \rangle $$ is no longer a constant function of $\gamma,\beta$ and you can use your favourite classical optimizer to minimize its value. That is, you will (in general) be able to find values $\gamma^*,\beta^*$ such that $$ \langle \psi(\gamma^*,\beta^*) \rvert H_C \lvert \psi(\gamma^*,\beta^*) \rangle < \langle \psi \rvert H_C \lvert \psi \rangle. $$ The exact same argument applies to any depth $p$ of the QAOA variational Ansatz.

Gianni Mossi
  • 450
  • 1
  • 6
  • 10
  • Thank you for the great explanation!. One basic follow-up question, ⟨ψ|U†(γ)HCU(γ)|ψ⟩=⟨ψ|HC|ψ⟩ in this expression. Why U(γ)=exp(−iγHC) is moved out by passing HC in the middle? – John Parker May 20 '21 at 12:27
  • Glad you found it helpful. Because $H_C = \sum_x E_x \lvert x \rangle \langle x \rvert$ and $U(\gamma) = \sum_x e^{-i\gamma E_x} \lvert x \rangle \langle x \rvert$ commute due to the fact that they are diagonal in the same basis. Then you have $U^{\dagger}(\gamma)U(\gamma) = I$ since $U(\gamma)$ is a unitary operator. – Gianni Mossi May 20 '21 at 15:40
  • Thanks again. I got it. this means Hc consists of only with Pauli I and Z's. I assume if it has X or Y, it wouldn't work? Thank you again for your help! – John Parker May 20 '21 at 17:24
  • Well, the exact statement is as follows. For arbitrary Hermitian operators $H,Q$ define the quantity $q(t) \equiv \langle \psi \rvert \exp(iHt), Q \exp(-iHt) \lvert \psi \rangle$. If $[H,Q]=0$ then $q(t)$ is a constant function of $t$ because $Q$ and $\exp(-iHt)$ will commute. If $[H,Q] \neq 0$ then this will not happen and $q(t)$ will generically depend on $t$, even though you might find very specific choices of the initial state $\lvert \psi \rangle$ where it does not (e.g. the eigenvectors of $H$). – Gianni Mossi May 21 '21 at 05:57
  • In my answer above we have $H = Q = H_C$ so $f(\gamma)$ does not depend on $\gamma$ simply because every operator commutes with itself: $[H_C, H_C]=0$. Note that the determining factor here is really commutativity between $H$ and $Q$, not whether they are made of different Pauli operators. For example $\bigotimes_i \sigma_i^x$ and two-spin interactions like $\sum_{ij} J_{ij}\sigma_i^z \sigma_j^z$ do commute, even though one is $\sigma^x$-diagonal and the other is $\sigma^z$-diagonal. – Gianni Mossi May 21 '21 at 05:59
  • 1
    Every once in while, you encounter a post on SE where you just want to stop, upvote and go ahead and say "I love you so much". – niCk cAMel Apr 05 '22 at 06:35