The binomial distribution $B(n,p)$ is just the distribution of a sum of $n$ independent Bernoulli variables, each with success probability $p$. Therefore the Central Limit Theorem applies, and if $n$ is "large enough" you can approximate the binomial distribution by a normal distribution with the same mean $np$ and the same variance $np(1-p)$. This means $\frac{X}{n}$ can be approximated by a normal distribution with mean $p$ and variance $\frac{p(1-p)}{n}$. The corresponding standard deviation is then $\sqrt{\frac{p(1-p)}{n}}$.
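To make the approximation concrete, here is a minimal sketch that compares an exact binomial probability with its normal counterpart. The values `n = 100`, `p = 0.3` and `k = 35` are arbitrary illustrations, and the helper names `binom_cdf` and `normal_cdf` are made up for this example.

```python
import math

def binom_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def normal_cdf(x, mean, sd):
    """P(Y <= x) for Y ~ Normal(mean, sd^2), via the error function."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

n, p = 100, 0.3                         # illustrative values (assumption)
mean_count = n * p                      # mean of X: np
sd_count = math.sqrt(n * p * (1 - p))   # sd of X: sqrt(np(1-p))
se_prop = math.sqrt(p * (1 - p) / n)    # sd of X/n: sqrt(p(1-p)/n)

k = 35
exact = binom_cdf(k, n, p)
approx = normal_cdf(k, mean_count, sd_count)

print(f"sd of X/n:        {se_prop:.4f}")
print(f"exact  P(X <= {k}): {exact:.4f}")
print(f"normal P(X <= {k}): {approx:.4f}")
```

For these values the two probabilities should come out close but not identical, which is all the approximation promises.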
The question then becomes when $n$ is "large enough", and there is always a lot of discussion about that. One shortcoming of the normal approximation is that it is always symmetric, whereas the binomial distribution is not for $p \neq 0.5$. As a result, the confidence interval above may include values larger than 1 or smaller than 0, which obviously does not make sense. More seriously, the interval not only includes values that do not make sense, it also does not have the stated coverage of $1-\alpha$.
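Both problems are easy to reproduce for a small sample. The sketch below assumes a nominal 95% level (so `z = 1.96`) and the arbitrary values `n = 10`, `p = 0.1`. The names `normal_interval` and `coverage` are hypothetical helpers, not part of any library.

```python
import math

def normal_interval(x, n, z=1.96):
    """Normal-approximation interval for a proportion: p_hat +/- z * sqrt(p_hat(1-p_hat)/n)."""
    p_hat = x / n
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

def coverage(n, p, z=1.96):
    """Exact probability that the interval contains the true p when X ~ B(n, p)."""
    total = 0.0
    for x in range(n + 1):
        lower, upper = normal_interval(x, n, z)
        if lower <= p <= upper:
            total += math.comb(n, x) * p**x * (1 - p)**(n - x)
    return total

# One success in ten trials: the lower limit falls below 0.
lo, hi = normal_interval(1, 10)
print(f"interval for x=1, n=10: ({lo:.3f}, {hi:.3f})")

# Actual coverage of the nominal 95% interval at n=10, p=0.1.
cov = coverage(10, 0.1)
print(f"actual coverage: {cov:.3f}")
```

In this setting the actual coverage is far below the nominal 95%, largely because observing `x = 0` yields the degenerate interval from 0 to 0, which cannot contain the true `p`.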
There is also a lot of discussion about when the normal approximation makes sense, which depends not only on $n$ but also on $p$. This Wikipedia article, http://en.wikipedia.org/wiki/Binomial_proportion_confidence_interval, is a very good starting point and also discusses better alternatives. The normal approximation is still widely used because it usually does a decent job and because it requires little calculation, which was important in the days before computers.
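As one example of such an alternative, the Wilson score interval is among those the Wikipedia article covers. The sketch below is only an illustration of that formula, again assuming a 95% level (`z = 1.96`). The function name `wilson_interval` is made up here.

```python
import math

def wilson_interval(x, n, z=1.96):
    """Wilson score interval for a binomial proportion with x successes in n trials."""
    p_hat = x / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = (z / denom) * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2))
    return center - half_width, center + half_width

# Same data as a small-sample example: one success in ten trials.
w_lo, w_hi = wilson_interval(1, 10)
print(f"Wilson interval for x=1, n=10: ({w_lo:.4f}, {w_hi:.4f})")
```

Unlike the plain normal approximation, this interval is not centered on the observed proportion and stays within the range from 0 to 1.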