Sample means in general don't have t-distributions.
For example, if I am drawing iid samples from an exponential distribution, the sample mean has a gamma distribution (which is skewed: lighter-tailed than the normal on the left and heavier-tailed than the normal on the right). On the other hand, if I am drawing iid samples from a uniform distribution, the sample mean has a scaled Irwin-Hall distribution (a Bates distribution), which is symmetric but lighter-tailed than the normal; so a t-distribution (which is heavier-tailed than the normal) would always be a worse approximation to the distribution of $\bar{X}$ than the normal is in that case.
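If you want to check the exponential case numerically, here is a minimal simulation sketch (my own illustration, not part of the argument; the rate, sample size and seed are arbitrary choices):

```python
# Check: the mean of n iid Exponential(rate = lam) draws has a gamma distribution.
# The sum of n such draws is Gamma(shape = n, rate = lam), so the mean is
# Gamma(shape = n, rate = n*lam), i.e. scale = 1/(n*lam).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
n, lam, reps = 5, 2.0, 200_000

means = rng.exponential(scale=1 / lam, size=(reps, n)).mean(axis=1)
gamma_law = stats.gamma(a=n, scale=1 / (n * lam))   # claimed distribution of the mean

print("KS distance to that gamma:", stats.kstest(means, gamma_law.cdf).statistic)  # tiny
print("skewness of the sample means:", stats.skew(means))                          # clearly positive
```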
The usual t-distribution arises when you divide a normally distributed numerator (with zero mean) by an independent estimate of the standard deviation of that numerator (provided that the variance estimate has a scaled chi-squared distribution). Under iid sampling from a normal distribution, the usual t-statistics have these properties and so have t-distributions.
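To see that construction concretely, here is a small simulation sketch (my own illustration; the degrees of freedom, sample size and seed are arbitrary choices):

```python
# Two ways to get a t-distributed quantity:
#  (1) directly: a standard normal divided by sqrt(V/df), with V ~ chi-squared(df) independent;
#  (2) from data: (xbar - mu) / (s / sqrt(n)) computed on iid normal samples, df = n - 1.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
reps = 200_000

# (1) normal numerator over an independent chi-squared-based scale estimate
df = 4
z = rng.standard_normal(reps)
v = rng.chisquare(df, size=reps)
t_direct = z / np.sqrt(v / df)

# (2) the usual one-sample t-statistic under iid normal sampling
n, mu, sigma = 5, 10.0, 3.0
x = rng.normal(mu, sigma, size=(reps, n))
t_stat = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

print(stats.kstest(t_direct, stats.t(df).cdf).statistic)   # close to 0: matches t(4)
print(stats.kstest(t_stat, stats.t(n - 1).cdf).statistic)  # close to 0: also matches t(4)
```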
Why would you divide a normally distributed numerator by an estimate of its standard deviation? Every normal distribution is different: they all have different variances, so you don't have a way to directly tell whether a sample mean is consistent with some population mean (for example). Is $\bar{x}-\mu$ unusually far from 0 or not? Well, that depends on the population standard deviation.
You can standardize things like $\bar{x}-\mu$ by dividing by the standard deviation of $\bar{x}-\mu$ (if you know it, but you generally don't). However, if you use an estimate of the standard deviation, the variability in that denominator makes the ratio ($\bar{x}-\mu$ divided by its estimated standard deviation) heavier-tailed than the normal. See the intuition offered at "Why does the t-distribution become more normal as sample size increases?" for how this happens.
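A quick numerical illustration of those heavier tails (my own check; the cutoff of 2 and the degrees of freedom shown are arbitrary):

```python
# Two-sided tail probability beyond 2 standard errors: t versus normal.
from scipy import stats

for df in (3, 10, 30, 100):
    print(f"t with df={df:>3}: P(|T| > 2) = {2 * stats.t.sf(2.0, df):.4f}")
print(f"standard normal: P(|Z| > 2) = {2 * stats.norm.sf(2.0):.4f}")
# The t tail probability exceeds the normal one, and shrinks toward it as df grows.
```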
That sort of sounds like we didn't get very far, but actually we have: the standardized distribution depends only on the degrees of freedom, which derive from the sample size. This means we can make tables for it, for example.
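For instance, a small piece of the familiar t-table can be reproduced directly (a sketch; the particular quantiles and degrees of freedom shown are arbitrary):

```python
# Upper-tail critical values of the t-distribution: they depend only on df.
from scipy import stats

quantiles = (0.95, 0.975, 0.995)
print("df    " + "   ".join(f"{q:>6}" for q in quantiles))
for df in (1, 2, 5, 10, 30, 1000):
    row = "   ".join(f"{stats.t.ppf(q, df):6.3f}" for q in quantiles)
    print(f"{df:<5} " + row)
```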
Consequently we can now perform inference (typically about the population mean) in cases where we don't know the standard deviation, as long as we're sampling (at least to a sufficiently good approximation) from a normal distribution.
If you're not sampling from a normal distribution, there's no general rule I'm aware of that would make a t-statistic have a particular distribution (except asymptotically, but then you'd be arguing for convergence to normality, not to a t-distribution).
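As a rough illustration of that caveat (my own simulation sketch; exponential data and arbitrary sample sizes and seed), the actual two-sided error rate of a nominal 5% t-based test is off at small $n$ and drifts toward the nominal level as $n$ grows:

```python
# t-statistics computed on exponential (non-normal) data: rejection rate of a
# nominal 5% two-sided test using t critical values, for increasing n.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
mu, reps = 1.0, 40_000          # Exp(rate 1) has mean 1

for n in (5, 20, 80, 320):
    x = rng.exponential(scale=mu, size=(reps, n))
    t_stat = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))
    reject = np.mean(np.abs(t_stat) > stats.t.ppf(0.975, n - 1))
    print(f"n = {n:>3}: rejection rate = {reject:.3f}")   # drifts toward 0.05 as n grows
```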
Consider a CI for the population mean, where we have a sample assumed to be independently drawn from a normal population (but one where we don't know the population mean $\mu$ or variance $\sigma^2$). How are we to give an interval for the mean?
A common way to find a confidence interval proceeds from finding a pivotal quantity. This is a function of the data and the unknown (here the parameter $\mu$) whose distribution is known and doesn't depend on any unknown parameters.
If we knew $\sigma$, we could write $Z=\frac{\bar{x}-\mu}{\sigma/\sqrt{n}}$, which would then be pivotal (since $Z$ has a standard normal distribution). We could write an interval for $Z$ with the desired coverage and then back out an interval for $\mu$ (since everything else is known and we can just rearrange to rewrite the inequalities in the probability statement to leave $\mu$ isolated).
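Spelled out (writing $z_{\alpha/2}$ for the standard normal critical value, so that the interval has coverage $1-\alpha$), the rearrangement is just:

$$P\!\left(-z_{\alpha/2} \le \frac{\bar{x}-\mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = 1-\alpha,$$

and isolating $\mu$ in the inequalities gives

$$P\!\left(\bar{x} - z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} \;\le\; \mu \;\le\; \bar{x} + z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}\right) = 1-\alpha.$$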
But when we don't know $\sigma$, we can estimate it by $s$, writing $Q=\frac{\bar{x}-\mu}{s/\sqrt{n}}$. It's still pivotal (the distribution of $Q$ doesn't depend on $\mu$, nor on the unknown $\sigma$), but we now have a different distribution at each sample size (specifically, a t-distribution with $n-1$ degrees of freedom). We can still write an interval for $Q$ with the desired coverage properties and so back out an interval for $\mu$.
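The resulting interval is $\bar{x} \pm t_{n-1,\,\alpha/2}\, s/\sqrt{n}$, where $t_{n-1,\,\alpha/2}$ is the upper-$\alpha/2$ critical value of the t-distribution with $n-1$ degrees of freedom. Here is a minimal sketch of the computation (my own illustration on simulated normal data; the sample, seed and confidence level are arbitrary):

```python
# t-based confidence interval for the mean of a normal sample with unknown sigma.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
x = rng.normal(loc=5.0, scale=2.0, size=12)     # simulated normal sample

n = len(x)
xbar, s = x.mean(), x.std(ddof=1)
alpha = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)

lo, hi = xbar - t_crit * s / np.sqrt(n), xbar + t_crit * s / np.sqrt(n)
print(f"95% CI for mu: ({lo:.3f}, {hi:.3f})")

# the same interval via scipy's helper
print(stats.t.interval(0.95, df=n - 1, loc=xbar, scale=s / np.sqrt(n)))
```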