
Given several samples, each containing thousands of data points, I want to identify which sample is drastically different from the others. Since the sample sizes are large, almost any statistical test will declare the differences significant.

The five sample means are [27.5, 22.3, 20.2, 18.9, 18.4]. What I really want to model is the amount of change between adjacent means. For example, the change from 27.5 to 22.3 is faster than from 18.9 to 18.4, so there might be some "significance" between the first two samples, but not between the last two. I'm looking for a way to give statistical meaning to this fast/slow decrease, and potentially to show that a faster rate of change is more significant than a slower one. I believe simply subtracting the means and saying something like "27.5 − 22.3 = 5.2 is bigger than 22.3 − 20.2 = 2.1" is not enough.

What is the most appropriate approach to this question?

  • Welcome to Cross Validated! Does this answer your question? Significance test for large sample sizes If not, please say what confusion remains. – Dave Jul 13 '23 at 13:38
  • Re the edits: could you please explain why the differences between the means (that is, subtracting them) don't fulfill your requirements for assessing the "amount of change between adjacent means"? – whuber Jul 14 '23 at 18:50
  • @whuber Because this will involve defining a threshold to quantify the difference. If subtracting, how big of a difference is considered a big effect? For example, although 27.5 − 22.3 > 22.3 − 20.2, and 22.3 − 20.2 > 20.2 − 18.9, and the difference is actually decreasing and reaching a plateau (5.2, 2.1, 1.3, 0.5), how significantly is 5.2 different from 2.1? Moreover, from 5.2 to 2.1, it seems like the "jump" is big, but what if there are other differences, say 4.6 and 3.2? I'm wondering if there are some statistical measures that can be used to describe this behavior. – sensationti Jul 14 '23 at 19:29
  • Why do you need a threshold? You appear to be using the word "significant" in the sense of important, but that's not something we can provide opinions about: it depends on what these numbers mean to you. There's no universal statistical way to tell you the answer. – whuber Jul 14 '23 at 20:54

2 Answers


Another (better, in my opinion) approach is to do one-way ANOVA, followed by Tukey tests to compare each mean with every other mean. This automatically corrects for multiple comparisons. (But with such tiny p-values by t-test, this won't change your conclusions at all.)

Rather than focus on p-values, I'd focus on the CI of the differences between means, ideally computed with correction for multiple comparisons so the "95%" applies to the whole family of comparisons, not to each comparison individually.
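As a rough illustration (mine, not part of the original answer), the ANOVA-then-Tukey approach can be sketched in Python with `scipy` and `statsmodels`, using simulated data that matches the question's means, standard deviations, and sample sizes:

```python
# Sketch: one-way ANOVA followed by Tukey HSD on simulated data
# matching the question's five means, SDs, and n = 4000 per group.
import numpy as np
from scipy.stats import f_oneway
from statsmodels.stats.multicomp import pairwise_tukeyhsd

rng = np.random.default_rng(0)
means = [27.5, 22.3, 20.2, 18.9, 18.4]
sds = [0.28, 0.26, 0.29, 0.26, 0.24]
n = 4000
samples = [rng.normal(m, s, n) for m, s in zip(means, sds)]

# One-way ANOVA: is at least one group mean different?
F, p = f_oneway(*samples)

# Tukey HSD: simultaneous 95% CIs for all 10 pairwise differences,
# so the "95%" applies to the whole family of comparisons.
values = np.concatenate(samples)
groups = np.repeat(np.arange(5), n)
result = pairwise_tukeyhsd(values, groups, alpha=0.05)
print(result.summary())
```

The `summary()` table reports, for each of the 10 pairs, the difference in means and a confidence interval already adjusted for the family of comparisons, which is exactly the output the answer recommends focusing on instead of p-values.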

  • Thanks! I understand the part where CIs of the differences between means would be a better option, but can you please explain what "with correction for multiple comparisons" means? For example, if I do one-to-one comparisons between 5 samples, there will be 10 CIs. What kind of correction are you referring to? – sensationti Jul 14 '23 at 10:30
  • If you do one-way ANOVA with Tukey follow-up testing, you'll get a set of 10 CIs where the 95% confidence level applies to the whole set, not to each individually. – Harvey Motulsky Jul 14 '23 at 14:30

I have some measurements collected from 4000 randomly selected pedestrians under 5 conditions (5 samples of 4000 data points each). Each sample is independent of the others (the 4000 pedestrians in the first sample are different from those in the second, the second from the third, etc.).

Yes, running individual t-tests is appropriate here, but the preferred approach is to run an ANOVA first and only look at differences between individual groups if the ANOVA shows a significant main effect.

Under condition 1 to 5, the five sample means are [27.5,22.3,20.2,18.9,18.4], and the standard deviations are almost the same [0.28,0.26,0.29,0.26,0.24].

Yes, with these means, standard deviations, and sample sizes, you would expect every group to be significantly different from every other group.

However, all resulting p-values are 0. This means that at the 95% confidence level, the null hypothesis (no difference between the means) has been rejected under all conditions.

The p-values are actually very small but non-zero, e.g. $p < .0000000000001$. This indicates a strongly statistically significant result.
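To make "tiny but non-zero" concrete, here is a sketch of my own (not from the original answer), simulating the first two groups with the question's means and SDs. With $|t|$ in the hundreds, the two-sided p-value underflows double precision and prints as exactly 0, but on the log scale it is finite and negative, i.e. astronomically small yet strictly positive:

```python
# Sketch: why software reports p = 0 here. Simulated data matching
# the first two groups in the question (means 27.5 and 22.3, n = 4000).
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(27.5, 0.28, 4000)
b = rng.normal(22.3, 0.26, 4000)

res = stats.ttest_ind(a, b, equal_var=False)
# |t| is in the hundreds, so the two-sided p-value underflows the
# smallest positive double (~5e-324) and is reported as exactly 0.0:
print(res.statistic, res.pvalue)

# With ~8000 degrees of freedom the t distribution is essentially
# normal; the normal log-survival-function stays finite, showing the
# p-value is minuscule but strictly positive:
log_p = stats.norm.logsf(abs(res.statistic)) + np.log(2)
print(log_p)
```

So the "0" in the software output is a floating-point limitation of the report, not a mathematical statement that the p-value equals zero.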

Moreover, although many people say sample size does not matter in a t-test, if I were able to collect more data, the sample mean would eventually become the population mean. In that case, wouldn't performing a t-test be meaningless, since the population means could be compared directly?

I'm afraid this objection doesn't make any sense. Everything you've done here is perfectly normal.

– Eoin
  • It's unclear how "running individual t-tests" might be appropriate, given that (1) there are multiple tests, requiring an adjustment for multiple comparisons; (2) the tests are fairly strongly correlated; and (3) ANOVA automatically handles both issues. (This is moot in light of your point about the obviousness of the differences.) – whuber Jul 14 '23 at 12:59