Is it possible to speed up the generation of the weighting matrix using a quantum algorithm?

Question

In this paper [1], on page 2, the authors mention that they generate the weighting matrix as follows:

$$W = \frac{1}{Md}[\sum_{m=1}^{m=M} \mathbf{x}^{(m)}\left(\mathbf{x}^{(m)}\right)^{T}] - \frac{\Bbb I_d}{d}$$

where the $\mathbf{x}^{(m)}$ are the $d$-dimensional training samples (i.e. $\mathbf{x} := (x_1,x_2,\dots,x_d)^{T}$ with $x_i \in \{1,-1\} \ \forall \ i\in \{1,2,\dots,d\}$) and there are $M$ training samples in total. Generating the weighting matrix this way, as a sum over $M$ outer products, seems to be a costly operation in terms of time complexity: I guess around $O(Md^2)$, since each outer product has $d^2$ entries (?).
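For reference, here is a minimal classical sketch of this construction in NumPy. The function name `weighting_matrix` and the random test patterns are illustrative and not taken from the paper. Each outer product has $d^2$ entries, so the direct sum costs on the order of $Md^2$ operations:

```python
import numpy as np

def weighting_matrix(patterns: np.ndarray) -> np.ndarray:
    """Return W = (1/(M d)) * sum_m x^(m) x^(m)^T - I_d / d.

    `patterns` is an (M, d) array whose rows are the +/-1 training samples.
    """
    M, d = patterns.shape
    # patterns.T @ patterns sums the M outer products in one matrix product.
    return patterns.T @ patterns / (M * d) - np.eye(d) / d

rng = np.random.default_rng(0)
X = rng.choice([-1, 1], size=(3, 4))  # M = 3 samples, d = 4 (illustrative)
W = weighting_matrix(X)
```

Because every $x_i^2 = 1$, the diagonal of the sum is exactly cancelled by the $\Bbb I_d/d$ term, so `W` is symmetric with a zero diagonal.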

Does there exist any quantum algorithm which can offer a substantial speed-up for the generation of the weighting matrix? I think the main speed-up in the paper comes from the quantum matrix inversion algorithm (which is mentioned later in the paper), but the authors don't seem to have taken this aspect of the weighting matrix generation into account.

[1]: Lloyd et al., A Quantum Hopfield Neural Network (2018)

Mithrandir24601 · Accepted Answer · 2018-06-18T10:02:13.760

Taking the density matrix $$\rho=W+\frac{I_d}{d}=\frac 1M \sum_{m=1}^M\left|x^{\left(m\right)}\rangle\langle x^{\left(m\right)}\right|,$$ most of the details are contained in the following paragraph on page 2:

Crucial for quantum adaptations of neural networks is the classical-to-quantum read-in of activation patterns. In our setting, reading in an activation pattern $x$ amounts to preparing the quantum state $|x\rangle$. This could in principle be achieved using the developing techniques of quantum random access memory (qRAM) [33] or efficient quantum state preparation, for which restricted, oracle based, results exist [34]. In both cases, the computational overhead is logarithmic in terms of $d$. One can alternatively adapt a fully quantum perspective and take the activation patterns $|x\rangle$ directly from a quantum device or as the output of a quantum channel. For the former, our preparation run time is efficient whenever the quantum device is composed of a number of gates scaling at most polynomially with the number of qubits. Instead, for the latter, we typically view the channel as some form of fixed system-environment interaction that does not require a computational overhead to implement.

The references in the above are:

[33]: V. Giovannetti, S. Lloyd, L. Maccone, Quantum random access memory, Physical Review Letters 100, 160501 (2008).

[34]: A. N. Soklakov, R. Schack, Efficient state preparation for a register of quantum bits, Physical Review A 73, 012307 (2006).

Without going into the details, these are schemes for, respectively, implementing an efficient qRAM and performing efficient state preparation; both recreate the state $\left|x\right>$ in time $\mathcal O\left(\log_2 d\right)$.

However, this only gets us so far: this can be used to create the state $\rho^{\left(m\right)} = \left|x^{\left(m\right)}\rangle\langle x^{\left(m\right)}\right|$, while we want a sum over all the possible $m$'s.

Crucially, $\rho = \sum_m\rho^{\left(m\right)}/M$ is mixed, so cannot be represented by a single pure state, so the second of the above two references on recreating pure states doesn't apply and the first requires the state to already be in qRAM.
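As a small numerical sanity check of the two statements above, the sketch below builds $\rho$ from three illustrative patterns (chosen here, not taken from the paper), assuming the normalisation $|x^{(m)}\rangle = \mathbf{x}^{(m)}/\sqrt d$. It then compares $\rho$ with $W + I_d/d$ and computes the purity $\operatorname{Tr}\rho^2$, which equals 1 only for a pure state:

```python
import numpy as np

# Three illustrative +/-1 patterns (M = 3, d = 4); not taken from the paper.
X = np.array([[1, 1, 1, 1],
              [1, 1, 1, -1],
              [1, -1, 1, -1]])
M, d = X.shape

kets = X / np.sqrt(d)                   # assumed normalisation |x> = x / sqrt(d)
rho = sum(np.outer(k, k) for k in kets) / M

W = X.T @ X / (M * d) - np.eye(d) / d   # weighting matrix from the question
trace = np.trace(rho)                   # should be 1 for a density matrix
purity = np.trace(rho @ rho)            # equals 1 only for a pure state
```

For these patterns the purity comes out strictly below 1, which is what rules out a description by a single pure state.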

As such, the authors make one of three possible assumptions:

1. They have a device that just so happens to give them the correct input state.
2. They have the states $\rho^{\left(m\right)}$ in qRAM.
3. They are able to create those states at will, using the second of the above references. The mixed state is then created using a quantum channel (i.e. a completely positive, trace-preserving (CPTP) map).

Setting aside the first two of the above options for the moment (the first magically solves the problem anyway), the channel could be either:

- An engineered system, created for a specific instance in something akin to an analogue simulation. In other words, you've got a physical channel that takes a physical length of time $t$ (as opposed to having some time complexity). This is the "fixed system-environment interaction that does not require a computational overhead to implement."
- A channel that is itself simulated. There are a few papers on this, such as Bény and Oreshkov's Approximate simulation of quantum channels (on arXiv; this looks like a thorough paper, but I couldn't find any time complexity statements), Lu et al.'s Experimental quantum channel simulation (no arXiv version seems to exist) and Wei, Xin and Long's arXiv preprint Efficient universal quantum channel simulation in IBM's cloud quantum computer, which (for number of qubits $n=\lceil\log_2 d\rceil$) gives a time complexity of $\mathcal O\left(\left(8n^3+n+1\right)4^{2n}\right)$. Stinespring dilation can also be used, with a complexity of $\mathcal O\left(27n^34^{3n}\right)$.
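To get a feel for how these two scalings grow, here is a rough sketch that evaluates the quoted expressions as if they were literal operation counts. That is an assumption: big-O notation hides constant factors, so only the growth rate is meaningful. The function names are illustrative. Note that for $d = 2^n$ we have $4^{2n} = d^4$ and $4^{3n} = d^6$, so both are polynomial in $d$ rather than in $\log d$.

```python
import math

def wei_xin_long_cost(n: int) -> int:
    """Evaluate (8 n^3 + n + 1) * 4^(2n), the scaling quoted above."""
    return (8 * n**3 + n + 1) * 4 ** (2 * n)

def stinespring_cost(n: int) -> int:
    """Evaluate 27 n^3 * 4^(3n), the scaling quoted above."""
    return 27 * n**3 * 4 ** (3 * n)

# Illustrative pattern dimensions d, with n = ceil(log2 d) qubits.
for d in (2, 4, 16, 256):
    n = math.ceil(math.log2(d))
    print(d, n, wei_xin_long_cost(n), stinespring_cost(n))
```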

Now looking at option 2¹, one possibly more efficient method would be to transfer the states from the address register to the data register in the usual way: for addresses in register $a$, $\sum_j\psi_j\left|j\right>_a$, transferring this to the data register $d$ gives the joint state $\sum_j\psi_j\left|j\right>_a\left|D_j\right>_d$. It should be possible to simply decohere the address and data registers to turn this into a mixed state, at the cost of a small time overhead but no extra computational complexity. Given a qRAM holding the states $\left|x^{\left(m\right)}\right>$, this gives a much improved complexity of $\mathcal O\left(n\right)$ for producing $\rho$. As this is also the complexity of creating the states $\left|x^{\left(m\right)}\right>$ in the first place, the overall complexity of producing $\rho$ is potentially $\mathcal O\left(n\right)$.

¹ With thanks to @glS for pointing out this possibility in chat.
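The address-register construction can be checked numerically on a toy example. The sketch below makes several assumptions that are not in the paper: uniform amplitudes $\psi_j = 1/\sqrt M$, data states $|D_j\rangle = \mathbf{x}^{(j)}/\sqrt d$, three illustrative patterns, and an address register of dimension $M$. It traces out the address register and compares the result with the target mixture $\rho$:

```python
import numpy as np

# Three illustrative +/-1 patterns (M = 3, d = 4); not taken from the paper.
X = np.array([[1, 1, 1, 1],
              [1, 1, 1, -1],
              [1, -1, 1, -1]])
M, d = X.shape
kets = X / np.sqrt(d)  # assumed data states |D_j> = x^(j) / sqrt(d)

# Joint state (1/sqrt(M)) * sum_j |j>_a |D_j>_d stored as an (M, d) array:
# the row index labels the address register, the column index the data register.
psi = kets / np.sqrt(M)

# Partial trace over the address register:
# rho_data[k, l] = sum_j psi[j, k] * conj(psi[j, l])
rho_data = np.einsum("jk,jl->kl", psi, psi.conj())

# Target mixture (1/M) * sum_m |x^(m)><x^(m)|
rho_target = sum(np.outer(k, k) for k in kets) / M
```

In this toy case the reduced state of the data register already equals the target mixture, with no operation applied to the address register beyond ignoring it.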

This density matrix is then fed into 'qHop' (quantum Hopfield), where it is used to simulate $e^{-iAt}$ for $$A=\begin{pmatrix}W-\gamma I_d & P\\ P & 0\end{pmatrix}$$ as per the "Efficient Hamiltonian Simulation of A" subsection on page 8.

Just a small note about your edit: you don't really need to "decohere" the address register, or do anything at all, really. Simply not using it makes the content of the data register indistinguishable from a mixture of the various $|D_j\rangle$. - glS, Jun 18 '18 at 13:12
