Prediction intervals seem to be discussed most often in the context of regression, but I want to reduce the problem to a single random variable to understand the reasoning. Assume you are sampling from a normal distribution $N(\mu, \sigma^2)$.
Wikipedia says the prediction interval for a new observation $X_{n+1}$ will be $\overline{X}_n + s_n\sqrt{1+1/n}\cdot T^{n-1}$, where $T^{n-1}$ denotes a Student's $t$ random variable with $n-1$ degrees of freedom (so the interval endpoints use the appropriate $t$ quantiles).
I am wondering specifically about the $s_n\sqrt{1+1/n}$ part of the expression. If you square it to get the variance, it's $s_n^2\,(1+1/n)$.
Why is the variance $s_n^2\,(1+1/n)$ instead of just $s_n^2$? Isn't $s_n^2$ supposed to be an unbiased estimator of $\sigma^2$ in $N(\mu, \sigma^2)$, from which all the samples (including a hypothetical $X_{n+1}$) are drawn?
So why wouldn't a new data point $X_{n+1}$ also have a variance of $s_n^2$? If I had to guess, it has something to do with the uncertainty around $\overline{X}_n$, hence the extra $s_n^2/n$ term.
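To make that guess concrete, the calculation I have in mind (please correct me if this is the wrong decomposition) is: since $X_{n+1}$ is independent of the first $n$ observations,
$$\operatorname{Var}\!\left(X_{n+1}-\overline{X}_n\right) = \operatorname{Var}(X_{n+1}) + \operatorname{Var}(\overline{X}_n) = \sigma^2 + \frac{\sigma^2}{n} = \sigma^2\left(1+\frac{1}{n}\right),$$
and substituting $s_n^2$ for $\sigma^2$ would give the $s_n^2(1+1/n)$ factor.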
Intuitively, it doesn't make sense to me that there is more uncertainty around a new data point (i.e. a variance of $s_n^2(1+1/n)$) when you already have some sample data to go off, compared to if you just blindly drew a new data point without any prior sampling (i.e. a variance of $s_n^2$). Would appreciate corrections to my thinking and reasoning about this.
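For what it's worth, here is a quick simulation sketch I put together to check the coverage numerically (my own code, not taken from the Wikipedia article; it assumes NumPy and SciPy are available). It seems to show the interval with the $\sqrt{1+1/n}$ factor hitting the nominal 95%, while the interval that drops the factor falls short:

```python
# Sketch: empirical coverage of the 95% prediction interval for a new draw,
# with and without the sqrt(1 + 1/n) factor. My own code, under the stated
# assumption that the data are i.i.d. N(mu, sigma^2).
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu, sigma, n, reps = 10.0, 2.0, 5, 50_000
t_crit = stats.t.ppf(0.975, df=n - 1)  # two-sided 95%, n-1 degrees of freedom

hits_pred, hits_naive = 0, 0
for _ in range(reps):
    x = rng.normal(mu, sigma, size=n)   # observed sample of size n
    x_new = rng.normal(mu, sigma)       # the future observation X_{n+1}
    xbar, s = x.mean(), x.std(ddof=1)
    half_pred = t_crit * s * np.sqrt(1 + 1 / n)  # half-width with the extra factor
    half_naive = t_crit * s                      # half-width ignoring uncertainty in X-bar
    hits_pred += abs(x_new - xbar) <= half_pred
    hits_naive += abs(x_new - xbar) <= half_naive

print("coverage with sqrt(1 + 1/n):   ", hits_pred / reps)   # close to 0.95
print("coverage without the factor:   ", hits_naive / reps)  # below the nominal 0.95
```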