6

When decomposing the 4-qubit Toffoli in the Clifford+T universal gate set with 1 ancilla qubit, what is the most efficient implementation one can get in terms of T-count? I can only find papers that handle the problem for general n-qubit Toffoli gates, but I am not sure if there exists some better implementation specifically for small n. For example, the following paper mentions they can do it with a T-count of 32 * n + 96 (for n = 4, this is 32), but this seems quite poor given that the T-count of the 3-qubit Toffoli (without ancilla) is just 7.

Paper: (Decompositions of n-qubit Toffoli Gates with Linear Circuit Complexity, https://link.springer.com/article/10.1007/s10773-017-3389-4)

Ocelot
  • 93
  • 5

2 Answers2

4

The best known decomposition of the CCCX gate into Clifford+T (allowing feedback) uses 6 T gates: https://arxiv.org/abs/2106.11513

enter image description here

Craig Gidney
  • 36,389
  • 1
  • 29
  • 95
4

Just for posterity, the construction in the paper you referenced can be improved from $32n \pm O(1)$ to $16n \pm O(1)$, without using feedback and while still using only one ancilla:

enter image description here

Note that all but a constant number of the Toffolis come in compute/uncompute pairs, so they can be done in 4 T gates each instead of 7 each. Further note there are 8 "sweeps", with each sweep having $n/2 \pm O(1)$ Toffolis. So the total cost is $4 \cdot 8 \cdot n/2 \pm O(1) = 16n \pm O(1)$ T gates.

Note that $16n \pm O(1)$ is four times as expensive as the $4n \pm O(1)$ T count that can be achieved using feedback and a linear number of ancilla.

Craig Gidney
  • 36,389
  • 1
  • 29
  • 95