To compute a threshold, you need to figure out the probability of an error occurring despite the error-correction ($\bar{p}$), and then compare it to the chance without error-correction ($p$). Error correction helps whenever $\bar{p}<p$, and the threshold is the value of $p$ at which $\bar{p}=p$. For the Shor code, we know that it can correct an arbitrary single-qubit error on any one of the 9 qubits, so, assuming each qubit suffers an error independently with probability $p$, the chance that the code functions correctly is
\begin{align}
(1-p)^9 \quad[\text{no error}] + 9p(1-p)^8 \quad[\text{9 ways of having 1 error}]
\end{align}
The chance that the code fails is the complement of this probability, i.e. the probability that two or more errors occur:
\begin{align}
\bar{p} &= 1 - ((1-p)^9 + 9p(1-p)^8)\\
&= 1 -((1-p)^8 - p(1-p)^8 + 9p(1-p)^8)\\
&= 1 - (1+8p)(1-p)^8
\end{align}
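which is the bound that Shor computes in his original paper. The threshold is the break-even point $\bar{p}=p$, below which the logical error rate beats the physical error rate. Rearranging $\bar{p}=p$ to $1-p=(1+8p)(1-p)^8$ and dividing by $(1-p)$ gives $(1+8p)(1-p)^7=1$, whose nontrivial solution (besides $p=0$) comes out numerically to $p\approx 0.0323$, i.e. around 3.23%.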
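The break-even point has no neat closed form, so it is easiest to find numerically. The following is a minimal sketch, not taken from any paper or library: the function names `logical_error_rate` and `find_threshold` are made up for illustration, and it assumes independent errors with probability $p$ per qubit, as in the formula above. It bisects on $\bar{p}-p$, which is negative below the threshold and positive above it.

```python
def logical_error_rate(p: float) -> float:
    """Probability of two or more errors among 9 qubits,
    each failing independently with probability p."""
    return 1.0 - (1.0 + 8.0 * p) * (1.0 - p) ** 8


def find_threshold(lo: float = 1e-6, hi: float = 0.5, tol: float = 1e-12) -> float:
    """Bisect on f(p) = logical_error_rate(p) - p over [lo, hi].

    The bracket is an assumption chosen for this sketch: it excludes
    the trivial root at p = 0 and contains exactly one sign change.
    """
    def f(p: float) -> float:
        return logical_error_rate(p) - p

    if not (f(lo) < 0.0 < f(hi)):
        raise ValueError("bracket does not straddle the threshold")

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid  # still below threshold: encoding helps here
        else:
            hi = mid  # at or above threshold: encoding hurts here
    return 0.5 * (lo + hi)


threshold = find_threshold()
print(f"threshold = {threshold:.4f}")
```

Evaluating `logical_error_rate` on either side of the result is a quick sanity check: at $p=0.01$ the encoded error rate is below $p$, while at $p=0.05$ it is above.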