SPSS labels part of its output the "confidence interval of the difference in means." I have read in some places that it means "95 times out of 100, our sample mean difference will fall between these bounds," but I find this unclear. Can anyone suggest clearer wording to explain "confidence interval of the difference in means"? This output appears in the context of a one-sample t-test.
-
What is your interpretation? – mpiktas Oct 07 '11 at 12:18
-
Note that there is nothing special about this being a proportion: a CI for the estimate of anything will be interpreted in a similar manner. (However, different procedures may be used to construct the CI, depending on what is being estimated.) Consequently, this question is exactly the same as previous questions asking for interpretations of CIs. – whuber Oct 07 '11 at 14:34
7 Answers
This is not an easy thing, even for respected statisticians. Look at one recent attempt by Nate Silver:
... if I asked you to tell me how often your commute takes 10 minutes longer than average — something that requires some version of a confidence interval — you’d have to think about that a little bit, ...
(from the FiveThirtyEight blog in the New York Times, 9/29/10.) This is not a confidence interval. Depending on how you interpret it, it's either a tolerance interval or a prediction interval. (Otherwise there's nothing the matter with Mr. Silver's excellent discussion of estimating probabilities; it's a good read.) Many other web sites (particularly those with an investment focus) similarly confuse confidence intervals with other kinds of intervals.
The New York Times has made efforts to clarify the meaning of the statistical results it produces and reports on. The fine print beneath many polls includes something like this:
In theory, in 19 cases out of 20, results based on such samples of all adults will differ by no more than three percentage points in either direction from what would have been obtained by seeking to interview all American adults.
(e.g., How the Poll Was Conducted, 5/2/2011.)
A little wordy, perhaps, but clear and accurate: this statement characterizes the variability of the sampling distribution of the poll results. That's getting close to the idea of a confidence interval, but it is not quite there. One might consider using such wording in place of confidence intervals in many cases, however.
When there is so much potential confusion on the internet, it is useful to turn to authoritative sources. One of my favorites is Freedman, Pisani, & Purves' time-honored text, Statistics. Now in its fourth edition, it has been used at universities for over 30 years and is notable for its clear, plain explanations and focus on classical "frequentist" methods. Let's see what it says about interpreting confidence intervals:
The confidence level of 95% says something about the sampling procedure...
[at p. 384; all quotations are from the third edition (1998)]. It continues,
If the sample had come out differently, the confidence interval would have been different. ... For about 95% of all samples, the interval ... covers the population percentage, and for the other 5% it does not.
[p. 384]. The text says much more about confidence intervals, but this is enough to help: its approach is to move the focus of discussion onto the sample, at once bringing rigor and clarity to the statements. We might therefore try the same thing in our own reporting. For instance, let's apply this approach to describing a confidence interval of [34%, 40%] around a reported percentage difference in a hypothetical experiment:
"This experiment used a randomly selected sample of subjects and a random selection of controls. We report a confidence interval from 34% to 40% for the difference. This quantifies the reliability of the experiment: if the selections of subjects and controls had been different, this confidence interval would change to reflect the results for the chosen subjects and controls. In 95% of such cases the confidence interval would include the true difference (between all subjects and all controls) and in the other 5% of cases it would not. Therefore it is likely--but not certain--that this confidence interval includes the true difference: that is, we believe the true difference is between 34% and 40%."
(This is my text, which surely can be improved: I invite editors to work on it.)
A long statement like this is somewhat unwieldy. In actual reports most of the context--random sampling, subjects and controls, possibility of variability--will already have been established, making half of the preceding statement unnecessary. When the report establishes that there is sampling variability and exhibits a probability model for the sample results, it is usually not difficult to explain a confidence interval (or other random interval) as clearly and rigorously as the audience needs.
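(A small simulation sketch may make the coverage statement concrete. It is not part of the original answer; the "true" population proportions, the sample sizes, and the use of a simple large-sample Wald interval are all assumptions made purely for illustration.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population proportions for subjects and controls (illustration only)
p_subjects, p_controls = 0.62, 0.25
n_subjects, n_controls = 500, 500
true_diff = p_subjects - p_controls

n_replications = 10_000
covered = 0
for _ in range(n_replications):
    # One replication = one fresh random selection of subjects and controls
    x1 = rng.binomial(n_subjects, p_subjects)
    x2 = rng.binomial(n_controls, p_controls)
    p1_hat, p2_hat = x1 / n_subjects, x2 / n_controls
    diff_hat = p1_hat - p2_hat
    # Large-sample (Wald) 95% CI for the difference in proportions
    se = np.sqrt(p1_hat * (1 - p1_hat) / n_subjects
                 + p2_hat * (1 - p2_hat) / n_controls)
    lower, upper = diff_hat - 1.96 * se, diff_hat + 1.96 * se
    covered += (lower <= true_diff <= upper)

print(f"Fraction of replications whose CI covers the true difference: "
      f"{covered / n_replications:.3f}")   # typically close to 0.95
```

Each replication produces a different interval; what is (approximately) guaranteed is the long-run fraction of intervals that cover the true difference, not anything about one particular interval.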
-
Thanks Whuber, I understand confidence intervals for a mean quite well. It is the CI for a difference in means (between a sample and pop) where I become confused. – Anne Jun 13 '11 at 14:50
-
@Anne What are you referring to? Neither your question nor any of the replies refers to a difference between a sample mean and a population mean, as far as I can tell. Your question appears to refer to the difference between two sample means (perhaps between the mean of a group of experimental subjects and a group of controls). – whuber Jun 13 '11 at 14:55
-
The example I am thinking of is where you are looking at a difference between a sample and population mean. In this case, what exactly does the CI between the sample and pop mean mean? We have used the sample mean to estimate the pop standard deviation and thus from that we are estimating the CI around the mean estimate. The difference of means isn't the difference between the pop mean we have provided and the sample mean. So what is it? – Anne Jun 13 '11 at 15:02
-
Your explanation of the CI is, however, very useful, as it is the clearest explanation I have seen. I like the quote from page 384. – Anne Jun 13 '11 at 15:03
-
@Anne Is the "population mean" the hypothetical, unknown mean of the population being sampled or is it the measured mean of another population that has been exhaustively sampled? Also, in what sense did you use the "sample mean" to estimate the population standard deviation? Is that perhaps a typo? – whuber Jun 13 '11 at 15:06
-
@whuber I interpreted the statement to mean they used the sample mean ($\bar{x}$) to calculate $s$ which is the estimate of $\sigma$. Not sure if that is correct. – DQdlM Jun 13 '11 at 17:48
-
This is rather confusing. Anne, I suggest you clarify your question... The fact that you used $s$, and thus a t-test (not a normal test), doesn't change the interpretation of the confidence interval. And when we talk about a difference of means, we are referring to the difference between the means of two groups (say, mean wages of men and women). And remember that the whole idea of a confidence interval is to quantify how uncertain we are about our inferential task, i.e., inferring the population value of interest from the sample. – Manoel Galdino Jun 13 '11 at 21:16
-
@whuber - your last block quote could be greatly improved by stating that this is all based on a model for the sampling distribution, and the assumption that future samples (which we have not observed) will behave according to that sampling distribution. Your current quote sounds like this is certain to happen. – probabilityislogic Jun 13 '11 at 22:52
-
(cont'd) - something like "If our sampling distribution is correct, and future samples behave like the current sample, then in 95% of such cases the confidence interval...". I would also add that for that last sentence to hold true "...therefore it is likely - but not certain - that this confidence interval..." your confidence interval must use sufficient statistics, and condition on all ancillary statistics. Otherwise the CI will have notably different coverage (recognizable from the sample) in specific "subclasses" of samples. – probabilityislogic Jun 13 '11 at 22:57
-
I am sorry for being unclear. The quote I provided was given in the context of the following example. The average GPA at a college is 3. The education department has been accused of grade inflation. You take a sample of 200 education students and find their GPA is 3.5. Q: based on this do you agree there is grade inflation? They then go on to show how you compare the sample mean to the population mean, and then provide the explanation I quoted above. – Anne Jun 14 '11 at 01:03
-
@Anne - this is a simple one-sample t-test. There is no uncertainty in the population mean, so you would just have the one-sided hypothesis test $H_{0}:\mu_{edu}=3$ versus $H_{A}:\mu_{edu}>3$. If you assume a normal distribution (tenuous maybe?), then to get a 5% significance level, you require at least a difference of 1.65 standard errors. So this requires the sample standard deviation to satisfy $1.65\frac{s}{\sqrt{200}}<0.5$ which gives $s<4.27$. For 1% significance we require $s<3.01$, for 0.1% significance we require $s<2.25$. – probabilityislogic Jun 14 '11 at 13:05
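(A quick numerical check of these cut-offs, using the one-sided normal approximation described in the comment above; the snippet is an editorial illustration, not part of the discussion.)

```python
from math import sqrt
from statistics import NormalDist

n = 200
observed_diff = 3.5 - 3.0   # sample mean minus the hypothesized population mean

# Largest sample standard deviation s for which the observed difference is still
# significant at each level (one-sided z approximation; with n = 200 the t and z
# critical values are nearly identical).
for alpha in (0.05, 0.01, 0.001):
    z = NormalDist().inv_cdf(1 - alpha)
    s_max = observed_diff * sqrt(n) / z
    print(f"alpha = {alpha}: significant only if s < {s_max:.2f}  (z = {z:.2f})")

# Prints roughly s < 4.30, s < 3.04, s < 2.29 -- close to the rounded figures above.
```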
-
@anne - but remember, this just tells you that there exists some hypothesis which is supported by the data more than "equal means". The inflation of grades is just one hypothesis. The education department may also just attract the people with better skills in terms of GPA. The significance test does not tell you which of these alternatives (or any other which leads to a higher GPA) to accept in place of $H_{0}$. It just tells you that $H_0$ does not explain the difference well. – probabilityislogic Jun 14 '11 at 13:10
-
@probability Nothing in my message refers to "future" samples. It refers to all samples of a population. That is the model. It is explicit (although, for brevity, I have not digressed about how one takes a sample so that all possible ones are equally likely). Your second point about having varying coverage among recognizable subclasses of samples is excellent, but takes us beyond the scope of this question. – whuber Jun 14 '11 at 13:12
-
@whuber - point taken (always easy to confuse "logically later" with "actually later"). You are still modelling though, and this does not appear clear to me from the quote. I can't see the "qualifying assumptions" so to speak. You can't unconditionally guarantee 95% coverage because you are making statements about quantities which will never be observed, so obviously some sort of assumptions are there somewhere. – probabilityislogic Jun 14 '11 at 13:35
-
@probability Yes: the assumption (for this particular concept of a CI) is that we are working with a dataset that either is, or is viewed as being like, a random sample with replacement from a definite population. That is not always true, but as a foundation for understanding and describing a CI, it has an unmatched simplicity and clarity, IMO. The calculation of the CI requires additional assumptions. In a report, one does not usually repeat all such assumptions in the explanation of the CI; but if you are suggesting such assumptions should be made evident, I agree wholeheartedly. – whuber Jun 15 '11 at 03:20
-
@whuber. I understand what a confidence interval means. But what is a confidence interval for the "difference in means"? I am imagining it in these terms: each time I take a sample, its mean is most likely different from the population mean, and I can plot these differences. But what then is a confidence interval for this difference of means? – Anne Jun 15 '11 at 04:22
-
@Anne I won't be able to answer this question to your satisfaction within the scope of a single comment. Perhaps you should solicit answers by posting it as a separate question. – whuber Jun 15 '11 at 14:18
-
@Whuber. Thanks. I was referring to the confidence interval for the difference in means in my initial question. I have edited it to make this even clearer. Is this correct or should I post an entirely new question? – Anne Jun 17 '11 at 03:19
-
@Anne Fortunately, my original answer applies unchanged: each possible sample (that is, replication of your experiment) yields a CI for the difference of means (just dump its results into SPSS and read off the results). There exists (by hypothesis) a true difference of means. The CIs computed for 95% of all samples (that is, 95% of all possible replications) will cover that true difference. – whuber Jun 17 '11 at 13:23
-
@whuber thanks. Your line "The CIs computed for 95% of all samples (that is, 95% of all possible replications) will cover that true difference." is clearer to me than "95 times out of 100, our sample mean difference will fall between these bounds" and your explanation makes logical sense. – Anne Jun 17 '11 at 16:10
From a pedantic technical viewpoint, I personally don't think there is a "clear wording" of the interpretation of confidence intervals.
I would interpret a confidence interval as: there is a 95% probability that the 95% confidence interval covers the true mean difference.
An interpretation of this is that if we were to repeat the whole experiment $N$ times, under the same conditions, then we would have $N$ different confidence intervals. The confidence level is the proportion of these intervals which contain the true mean difference.
My own personal quibble with the logic of such reasoning is that this explanation of confidence intervals requires us to ignore the other $N-1$ samples when calculating our confidence interval. For instance if you had a sample size of 100, would you then go and calculate 100 "1-sample" 95% confidence intervals?
But note that this is all philosophy. I think confidence intervals are best left somewhat vague in the explanation; they give good results when used properly.
-
Starting a new sentence after "N different confidence intervals." doesn't flow well with "you can further interpret this as saying...". I suggest modifying the third paragraph. – Bogdan Lataianu Jun 15 '11 at 20:11
-
Your third paragraph is much better than the second. Conditional on the observed data, the confidence interval either contains the true parameter value or it doesn't. – cardinal Jun 16 '11 at 18:10
-
@probabilityislogic: Since this answer has been accepted, please consider editing your second paragraph. Also, can you please clarify what you mean in your second to last paragraph? As it reads, I'm not quite sure what argument you are making. – cardinal Jun 17 '11 at 23:03
-
if we interpret confidence intervals in terms of "repetition" of the experiment then we must ignore previous experiments in these repetitions. My point is: why is ignorance of previous experiments in these "repetitions" of confidence intervals good for those data sets that we have not observed, but we must pool the data together for data we have observed? Would it not make just as much sense (from what I understand about the CI interpretation) to produce as many CIs as you can with the data you have? – probabilityislogic Jun 18 '11 at 02:39
-
There is a whole theory, largely parallel to optimal decision theory, on uniformly most accurate confidence sets. Maybe that is the piece of the puzzle missing for you. (?) – cardinal Jun 18 '11 at 02:51
-
So suppose our actual experiment gave $(\text{mean}\pm \text{standard deviation})$ for each group of $(3\pm 0.5)$ and $(10\pm 0.7)$ with a sample size of $10$. Then if we were to actually repeat this experiment, under the same conditions, and get $(4\pm 0.7)$ and $(9\pm 0.3)$, wouldn't it make sense to pool these data to get a more reliable estimate of the difference? So, if it makes sense in this case, why does it not make sense for the explanation/interpretation of confidence intervals? – probabilityislogic Jun 18 '11 at 02:52
-
@probabilityislogic: My remark above to the "second paragraph" is the one beginning: "I would put the wording as...". What follows that is incorrect. – cardinal Jun 18 '11 at 02:52
-
@cardinal - my point is that confidence intervals often make an appeal to "the long run", yet the process by which they demonstrate good long run behaviour is to make your statistical procedure constantly forget/ignore what has happened in previous experiments! How can that be a desirable property or a desirable way to evaluate long run performance? – probabilityislogic Jun 18 '11 at 03:00
-
@cardinal - woops, I see you meant the "other" second paragraph. – probabilityislogic Jun 18 '11 at 03:02
-
@probabilityislogic: Sorry, to be more explicit, I object to the wording: "...there is a 95% probability that this [emphasis mine] 95% confidence interval covers the true mean difference." When I read "this confidence interval", I interpret that to mean the one reported conditional on the observed data. That confidence interval either contains the true parameter or it doesn't. There is no probability involved. – cardinal Jun 18 '11 at 12:00
-
@cardinal - thanks for that. Confused definitions of probability for a minute. It would be much better if the frequentist referred to long run frequencies instead of probabilities (although that could cause confusion in "fourier space"). Or perhaps the Bayesians should start using the word implicability. Although if you object to my use here, then presumably you should also object to @whuber's use in his answer "...therefore - it is likely, but not certain - that this confidence interval contains the true difference..." – probabilityislogic Jun 18 '11 at 12:31
-
@probabilityislogic: Actually, I almost posted a comment to that effect on @whuber's post as well. But, I haven't figured out what to say about it, yet. The problem is it is hard to come up with a better suggestion in that case. But, I'm still thinking about it. Part of it is, I believe, a language problem as opposed to a statistical one. For me, the heart of the matter is that the probability statement is about the procedure established to construct the interval (and defined a priori) rather than the interval eventually obtained. – cardinal Jun 18 '11 at 15:36
-
@cardinal - I think this is true. However, if this is the case, then the CI is a pre-data answer to a post-data question. Talking about the procedure is pre-data thinking, and I think is the answer to an experimental design type of question (which perhaps explains why CIs depend on the sample design). – probabilityislogic Jun 19 '11 at 03:03
-
@probability @cardinal I would be interested in your comments about my use of "likely, but not certain." Please note I did not use any form of "probability" in that statement! – whuber Jun 22 '11 at 14:52
-
@whuber - if you are saying that "it is likely but not certain" and not having it be interpreted as probability, then what are we to make of the word "likely"? The first question that comes to mind is "just how likely?", which I thought was the whole point of creating an interval. You give a precise numerical answer to a question that wasn't asked, and a qualitative answer to the important question. Wouldn't it be much more efficient to simply construct an interval by eye/intuition? Your intuition can do pretty good qualitative reasoning in most cases; it's the quantitative stuff we need. – probabilityislogic Jun 23 '11 at 05:55
The rough answer to the question is that a 95% confidence interval allows you to be 95% confident that the true parameter value lies within the interval. However, that rough answer is both incomplete and inaccurate.
The incompleteness lies in the fact that it is not clear that "95% confident" means anything concrete, or if it does, then that concrete meaning would not be universally agreed upon by even a small sample of statisticians. The meaning of confidence depends on what method was used to obtain the interval and on what model of inference is being used (which I hope will become clearer below).
The inaccuracy lies in the fact that many confidence intervals are not designed to tell you anything about the location of the true parameter value for the particular experimental case that yielded the confidence interval! That will be surprising to many, but it follows directly from the Neyman-Pearson philosophy that is clearly stated in this quote from their 1933 paper "On the Problem of the Most Efficient Tests of Statistical Hypotheses":
We are inclined to think that as far as a particular hypothesis is concerned, no test based upon the theory of probability can by itself provide any valuable evidence of the truth or falsehood of that hypothesis.
But we may look at the purpose of tests from another view-point. Without hoping to know whether each separate hypothesis is true or false, we may search for rules to govern our behaviour with regard to them, in following which we insure that, in the long run of experience, we shall not be too often wrong.
Intervals that are based on the 'inversion' of N-P hypothesis tests will therefore inherit from that test the nature of having known long-run error properties without allowing inference about the properties of the experiment that yielded them! My understanding is that this protects against inductive inference, which Neyman apparently considered to be an abomination.
Neyman explicitly lays claim to the term ‘confidence interval’ and to the origin of the theory of confidence intervals in his 1941 Biometrika paper “Fiducial argument and the theory of confidence intervals”. In a sense, then, anything that is properly a confidence interval plays by his rules and so the meaning of an individual interval can only be expressed in terms of the long run rate at which intervals calculated by that method contain (cover) the relevant true parameter value.
We now need to fork the discussion. One strand follows the notion of ‘coverage’, and the other follows non-Neymanian intervals that are like confidence intervals. I will defer the former so that I can complete this post before it becomes too long.
There are many different approaches that yield intervals that could be called non-Neymanian confidence intervals. The first of these is Fisher’s fiducial intervals. (The word ‘fiducial’ may scare many and elicit derisive smirks from others, but I will leave that aside...) For some types of data (e.g. normal with unknown population variance) the intervals calculated by Fisher’s method are numerically identical to the intervals that would be calculated by Neyman’s method. However, they invite interpretations that are diametrically opposed. Neymanian intervals reflect only long run coverage properties of the method, whereas Fisher’s intervals are intended to support inductive inference concerning the true parameter values for the particular experiment that was performed.
The fact that one set of interval bounds can come from methods based on either of two philosophically distinct paradigms leads to a really confusing situation--the results can be interpreted in two contradictory ways. From the fiducial argument there is a 95% likelihood that a particular 95% fiducial interval will contain the true parameter value. From Neyman’s method we know only that 95% of intervals calculated in that manner will contain the true parameter value, and have to say confusing things about the probability of the interval containing the true parameter value being unknown but either 1 or 0.
To a large extent, Neyman’s approach has held sway over Fisher’s. That is most unfortunate, in my opinion, because it does not lead to a natural interpretation of the intervals. (Re-read the quote above from Neyman and Pearson and see if it matches your natural interpretation of experimental results. Most likely it does not.)
If an interval can be correctly interpreted in terms of global error rates but also correctly in local inferential terms, I don’t see a good reason to bar interval users from the more natural interpretation afforded by the latter. Thus my suggestion is that the proper interpretation of a confidence interval is BOTH of the following:
Neymanian: This 95% interval was constructed by a method that yields intervals that cover the true parameter value on 95% of occasions in the long run (...of our statistical experience).
Fisherian: This 95% interval has a 95% probability of covering the true parameter value.
(Bayesian and likelihood methods will also yield intervals with desirable frequentist properties. Such intervals invite slightly different interpretations that will both probably feel more natural than the Neymanian.)
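(A concrete instance of the numerical coincidence mentioned above, with invented data; this sketch is an editorial illustration rather than part of the answer. For a normal sample with unknown variance, the familiar t-based interval is the same set of numbers under either reading.)

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=20)   # made-up normal data, variance unknown to the analyst

n = len(x)
mean = x.mean()
se = x.std(ddof=1) / np.sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)
lower, upper = mean - t_crit * se, mean + t_crit * se
print(f"95% interval for the mean: [{lower:.2f}, {upper:.2f}]")

# The same pair of numbers supports both interpretations: "intervals built this way
# cover the true mean in 95% of samples" (Neyman) and "this interval has 95%
# fiducial probability of covering the true mean" (Fisher).
```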
-
@Michael - the place where they will differ is that a fiducial interval must be based on a sufficient statistic, and condition on all ancillary quantities. Neyman's confidence intervals do not require this property, and so are subject to the "95% confidence interval" having varying coverage for particular sub-classes of samples. – probabilityislogic Jun 23 '11 at 07:25
-
@probability - Can you expand on that? Do you mean that there are circumstances where a 95% Neymanian confidence interval is a confidence interval but it is not a 95% interval? What would those circumstances be? Would the Fisherian interval have the same bounds in those circumstances? – Michael Lew Jun 24 '11 at 03:57
-
You can show cases where you can tell from the sample that a "95%" confidence interval doesn't contain the true value. Examples 5 and 6 in Jaynes' paper give two cases where not using sufficient statistics in CIs will give the long-run coverage, but the coverage will vary over certain classes of samples. It is analogous to having two variables with the same average (long-run coverage) but different variance (coverage in a specific case). – probabilityislogic Jun 24 '11 at 11:30
The meaning of a confidence interval is: if you were to repeat your experiment in the exact same way (i.e.: the same number of observations, drawing from the same population, etc.), and if your assumptions are correct, and you would calculate that interval again in each repetition, then this confidence interval would contain the true prevalence in 95% of the repetitions (on average).
So, you could say you are 95% certain (if your assumptions are correct etc.) that you have now constructed an interval that contains the true prevalence.
This is typically stated as: with 95% confidence, between 4.5 and 8.3% of children of mothers who smoked throughout pregnancy become obese.
Note that this is typically not interesting in itself: you probably want to compare this to the prevalence in children of mothers who didn't smoke (odds ratio, relative risk, etc.).
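(To show where numbers like "4.5% to 8.3%" come from, here is a minimal sketch using a large-sample Wald interval for a single proportion. The counts are hypothetical, chosen only so that the result roughly matches the interval quoted above; they are not from any actual study.)

```python
from math import sqrt

# Hypothetical counts (not from the answer's example): 41 obese children out of 640
obese, n = 41, 640
p_hat = obese / n

# Large-sample (Wald) 95% confidence interval for a single proportion
se = sqrt(p_hat * (1 - p_hat) / n)
lower, upper = p_hat - 1.96 * se, p_hat + 1.96 * se
print(f"Estimated prevalence: {p_hat:.1%}; 95% CI: {lower:.1%} to {upper:.1%}")
# -> roughly "6.4%; 95% CI: 4.5% to 8.3%"
```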
-
(This reply, which arrived here after a merger of two threads, is responding to a duplicate question framed in terms of a CI of a proportion.) – whuber Oct 07 '11 at 14:37
If the true mean difference is outside of this interval, then there is only a 5% chance that the mean difference from our experiment would be so far away from the true mean difference.
-
What do you mean by "this far away"? Is this the upper bound of the CI that is far away or the observed mean? – probabilityislogic Jun 16 '11 at 09:44
-
The distance between the true mean and the observed mean is what I mean by "this far away". I'm going to change it to "so far away"; I think that is a little more clear. – Avery Richardson Jun 16 '11 at 19:50
My interpretation: if you conduct the experiment N times (where N tends to infinity), then out of this large number of experiments, 95% of the experiments will have confidence intervals which lie within these 95% limits. More clearly, let's say those limits are "a" and "b"; then 95 out of 100 times your sample mean difference will lie between "a" and "b". I assume that you understand that different experiments can have different samples drawn from the whole population.
-
@Ayush, thanks. That is helpful. Sorry, I don't quite follow your final sentence. – Anne Jun 13 '11 at 05:21
-
@anne -- OK. What I mean is: if you want to compare the means of two samples, and let's say each sample has 1000 people, you can draw a great many samples out of it (of, let's say, 40 people from each). I wrote this to explain why the different experiments -- the experiments in which we are observing the confidence interval -- differ from each other. – ayush biyani Jun 13 '11 at 05:33
-
@ayush - this is not the correct interpretation in your second-to-last sentence. Or at least you should add subscripts to "a" and "b", which makes it clear that it is these quantities which are varying over the 100 times. Your current notation makes it seem like "a" and "b" are fixed quantities. – probabilityislogic Jun 13 '11 at 06:06
-
@anne -- are you clear about the meaning now? Please keep the subscripts in mind. – ayush biyani Jun 13 '11 at 06:33
-
@Ayush (-1) The characterization that currently appears in your reply can be interpreted in several ways, most of which (therefore) are incorrect. For example, confidence intervals $[a,b]$ are usually constructed so as to contain the "sample mean difference," implying that this difference will lie between the limits 100% of the time no matter what. – whuber Jun 13 '11 at 13:28
-
@Ayush, No I do not understand. The example I am thinking of is where you are looking at a difference between a sample and population mean. In this case, what exactly does the CI between the sample and pop mean mean? We have used the sample mean to estimate the pop standard deviation and thus from that we are estimating the CI around the mean estimate. But what then is the difference of means? It isn't the difference between the pop mean we have provided and the sample mean. So what is it? Is my question clear? – Anne Jun 13 '11 at 14:59
-
@whuber -- if the confidence intervals are constructed such that the difference will lie there 100% of the time, what is the 95% for, then? Please clarify this for me. – ayush biyani Jun 14 '11 at 05:12
-
@ayush Consider how a 95% CI for the mean of a large sample of size $N$ from a population with mean $\mu$ is computed: one estimates the mean $\hat{m}$ and standard error $\widehat{se}$ from the sample and reports the interval $[\hat{a},\hat{b}]$ = $[\hat{m}-z_{0.05}\widehat{se}, \hat{m}+z_{0.05}\widehat{se}]$. Whence, by construction, $\hat{a} \le \hat{m} \le \hat{b}$. But you assert $\Pr[\hat{a} \le \hat{m} \lt \hat{b}] = 1 - 0.05$! The correct probability statement is $\Pr[\hat{a} \le \mu \le \hat{b}] = 1 - 0.05$. – whuber Jun 14 '11 at 13:20
-
I stress that $a$ and $b$ are random variables, so when @whuber writes Pr[$a\leq\mu\leq b$]=1-0.05, he means $a$ and $b$ are random, not fixed, and both are defined by him above. Perhaps adding hats Pr[$\hat{a}\leq\mu\leq \hat{b}$]=1-0.05 for them would make it clearer. – Bogdan Lataianu Jun 15 '11 at 19:33
-
@Bogdan Interesting idea! I made the changes you suggest. I think it's great to visually distinguish parameters from random variables. I'm a little uneasy using the hat to do it, though, because neither $a$ nor $b$ is actually estimating anything. – whuber Jun 15 '11 at 22:05
"95 times out of 100, your value will fall within one standard deviation of the mean"
-
Welcome to the site, @beginnerstat. I wonder if you meant to say, "*two* standard deviations of the mean"? In addition, I'm not sure I see how this wording improves on what the OP has read elsewhere. Would you like to elaborate a bit? – gung - Reinstate Monica Feb 18 '13 at 21:05
-
Yes to @gung's comment: I am particularly interested in understanding the sense in which "mean" and "SD" are used here. Are these referring to underlying parameters or to sample estimates? Do they refer to the distribution of an underlying random variable or to the sampling distribution of the mean of iid variates from such a distribution? – whuber Feb 18 '13 at 21:39