In "Distillation with sublogarithmic overhead", Hastings and Haah present a magic state distillation factory with a yield parameter $\gamma \approx 0.7$, meaning that the number of fixed-error-rate input magic states they need in order to reach a target error rate $\epsilon$ grows like $O\left(\left(\log \frac{1}{\epsilon}\right)^{0.7}\right)$.
How can this be possible? If you have fewer than $O(\log \frac{1}{\epsilon})$ input states, then the chance of all of them failing simultaneously will be larger than $\epsilon$. But you can't possibly get a correct result when all of your inputs are bad, so it seems impossible to do better than $\Omega(\log \frac{1}{\epsilon})$.
For example, suppose the fixed initial error rate is $f = 10^{-10}$, the target error rate is $\epsilon = 10^{-10^{10}}$, and the constant factor hidden by the O notation is such that the logarithm is natural. Then we start with $(\ln \frac{1}{\epsilon})^{0.7} \approx 2 \cdot 10^7$ states. The chance of them all failing simultaneously is $f^{2 \cdot 10^7} \approx 10^{-2 \cdot 10^8}$, which is far higher than the target failure rate. It seems like for any choice of constant factor and initial error rate, I can find a target error rate demanding enough that this problem occurs. Why is this kind of thing not fatal to the construction?
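To make the arithmetic concrete, here is a quick sanity check of the numbers above (the constant factor of 1 in front of $(\ln \frac{1}{\epsilon})^{0.7}$ is my illustrative assumption; everything is done in $\log_{10}$ space since the probabilities underflow floats):

```python
import math

f_log10 = -10          # fixed initial error rate f = 10^-10, as log10
eps_log10 = -10**10    # target error rate eps = 10^(-10^10), as log10

# ln(1/eps) = log10(1/eps) * ln(10)
ln_inv_eps = -eps_log10 * math.log(10)

# Number of input states with gamma = 0.7 and (assumed) unit constant factor.
n = ln_inv_eps ** 0.7
print(f"input states: {n:.2e}")  # roughly 1.8e7

# log10 of the probability that all n inputs fail simultaneously: n * log10(f).
all_fail_log10 = n * f_log10
print(f"log10 Pr[all fail]: {all_fail_log10:.2e}")  # roughly -1.8e8

# The all-fail probability (~10^(-2e8)) dwarfs the target (~10^(-10^10)).
print(all_fail_log10 > eps_log10)  # True
```

This confirms the apparent paradox in the question: with these numbers the all-fail probability really is astronomically larger than the target error rate.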