
I've seen some discussion of using an 83.4% confidence interval to compare two parameter estimates, so that non-overlapping confidence intervals correspond to a statistically significant difference between the two parameter estimates at the alpha = 0.05 level.

However, these examples almost always use the means of two groups as the example.

Is this a general result for parameter estimates, based only on the properties of the distribution of the estimator (independent and normal)?

I'm wondering if the 83.4% CI method can be applied for other parameter estimates, particularly effect size statistics. Say, r in correlation, or Cramer's V for the association of two categorical variables.

Knol et al., "The (mis)use of overlap of confidence intervals to assess effect modification", suggests this can be used for the odds ratio, risk ratio, hazard ratio, and risk difference.
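For context, the 83.4% figure comes from normal theory: with two independent estimates having equal standard errors, the difference is significant at alpha = 0.05 when it exceeds z_{0.025} · sqrt(2) · SE, while two individual level-c intervals fail to overlap when it exceeds 2 · z_c · SE. Equating the two thresholds gives the interval level. A minimal sketch of that calculation (assuming equal standard errors and SciPy available):

```python
import math
from scipy.stats import norm

alpha = 0.05
# Difference of two independent, equal-SE normal estimates is
# significant when |d| > z_{alpha/2} * sqrt(2) * SE.
# Two individual CIs at level c fail to overlap when |d| > 2 * z_c * SE.
# Equating the thresholds: z_c = z_{alpha/2} / sqrt(2).
z = norm.ppf(1 - alpha / 2) / math.sqrt(2)
level = 2 * norm.cdf(z) - 1
print(round(level, 4))  # about 0.834
```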

Sal Mangiafico
  • That's an interesting reference you found. Its result differs from the one I obtained at https://stats.stackexchange.com/a/18259/919 because it assumes Normal distributions for the test statistics, whereas my analysis used Student t distributions. By examining the derivation in that article's supplemental materials you can find the answer to your question: it assumes Normality and equal variances. – whuber Oct 26 '23 at 14:17
  • Thanks, @whuber. That post is very helpful. ... So, if I did the algebra correctly, if one wanted to mimic a hypothesis test at alpha = 0.05, the corresponding non-overlapping confidence intervals would be about 91.4% ? – Sal Mangiafico Oct 26 '23 at 18:47
  • But the other part of my question was about the normality assumption for the parameters, thinking about applying this principle to various effect size statistics. I did some simulations to see the distribution of these statistics. ... phi is a signed statistic, a measure of association for two binomial variables. Because it's signed, its distribution approaches normal for any value of population phi, and is fairly normal even at a relatively small sample size, say, 16 or greater. – Sal Mangiafico Oct 26 '23 at 18:52
  • But Cramer's V, a similar statistic for contingency tables greater than 2 x 2, isn't signed. That is, it can never be less than 0. Because of this, it's only normally distributed if the population V is, say, greater than 0.25 or 0.3 with a sample size of, say, 30. With a sample size of 100, population V could be as low as, say, 0.15 or 0.20 and still be relatively normally distributed. ... Am I thinking about this correctly ? – Sal Mangiafico Oct 26 '23 at 19:00
  • Simulation can be used to investigate properties of using a "non-overlap" type of rule under whatever set of assumptions you like. I'd suggest trying examples where the intervals are not based on the normal and where the widths of the intervals may be quite different, both under circumstances where you have an interval that uses normality and where you don't. Are you only contemplating a two-sided alternative in your test? – Glen_b Oct 27 '23 at 00:22
  • An example of an interval that's not based directly on a normal would be an interval for a ratio of two variances (there are many other possibilities), albeit that the underlying model for the original observations may be normal. While you might not often care about testing equality of variance ratios, it's a simple example where the endpoints of the usual "symmetric" interval (symmetric in tail probability terms) are not equally distant from the point estimate. Intervals based on quite different sets of sample sizes should suffice to tend to yield substantively different-width intervals. – Glen_b Oct 27 '23 at 00:32
  • @Glen Excellent points all. But I would add that even in many cases where CIs are asymmetric, the Normal-theory calculations of overlap thresholds tend to be pretty accurate because the error made in one direction tends to compensate for the error made in the other direction. Thus, for purposes of quick-and-dirty calculations, thinking on your feet, etc., using Normal-theory calculations (not assumptions!) can be remarkably effective. – whuber Oct 27 '23 at 12:22
  • Thanks, @Glen_b . I think the cases I'm thinking about would all be two-sided. ... So far, I've run some simulations on t-tests vs. traditional confidence intervals of the means. For equal-width intervals, the non-overlapping 83.4% confidence intervals pretty much always reflect the t-test at alpha = 0.05. If I get the CI widths to differ by a factor of 4, the non-overlapping CI's agree with the test about 95% of the time, with the errors always being on the side of non-overlapping CI's and a p-value > 0.05. – Sal Mangiafico Oct 27 '23 at 14:05
  • More to do, but so far this is pretty reassuring. I have some ideas on simulations I can do for e.g. phi and Cramer's V. – Sal Mangiafico Oct 27 '23 at 14:06
  • In the case of location, Tukey et al engaged in a discussion of robust estimation (albeit partly based around normality) and selection of compromise when designing notched boxplots so that non-overlap of notch-intervals corresponds roughly to significant differences at alpha=0.05. See sec 7 of McGill, Tukey & Larsen (1978) Variations of Box Plots, The American Statistician, 32:1, 12-16 http://dx.doi.org/10.1080/00031305.1978.10479236 ... The discussion of the issues there might be useful. – Glen_b Oct 28 '23 at 00:29
  • Thanks, @Glen_b . The situation I'm interested in isn't really comparing location, as we have tests for that (and methods for multiple comparisons and whatnot). It's really about a question that pops up sometimes about comparing effect size statistics across groups, e.g. Cramer's V, maybe Glass rank biserial correlation, where I don't know of any applicable hypothesis test. That is, is it practical to use 83.4% non-overlapping confidence intervals ? – Sal Mangiafico Oct 29 '23 at 01:39
  • I'm far from having any general recommendations, but at least for some of the effect size statistics I'm interested in, it appears that using the Fisher r-to-z transformation and test, or using 83.4% confidence intervals, work for the (two-sample) cases I'm interested in for effect size statistics. ... This question pops up relatively frequently in discussion forums. – Sal Mangiafico Oct 29 '23 at 01:42
  • I believe the discussion there is relevant to thinking about the issue with differing variance on the things being compared. – Glen_b Oct 29 '23 at 03:29
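The equal-width simulation described in the comments above can be sketched as follows. This is a hypothetical reconstruction, not the actual code from the discussion: the sample size, effect size, and number of replications are arbitrary choices, and equal population variances are used so the two CIs have (approximately) equal widths.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
level = 0.834            # nominal CI level for the non-overlap rule
n, n_sims = 50, 2000
agree = 0
for _ in range(n_sims):
    x = rng.normal(0.0, 1.0, n)
    y = rng.normal(0.3, 1.0, n)  # equal variances -> roughly equal-width CIs
    p = stats.ttest_ind(x, y).pvalue
    tcrit = stats.t.ppf(0.5 + level / 2, df=n - 1)
    ci_x = (x.mean() - tcrit * stats.sem(x), x.mean() + tcrit * stats.sem(x))
    ci_y = (y.mean() - tcrit * stats.sem(y), y.mean() + tcrit * stats.sem(y))
    non_overlap = ci_x[1] < ci_y[0] or ci_y[1] < ci_x[0]
    # Does "83.4% CIs don't overlap" match "t-test rejects at alpha = 0.05"?
    agree += (non_overlap == (p < 0.05))
print(f"agreement rate: {agree / n_sims:.3f}")
```

With equal-width intervals the agreement rate should be close to 1; making the widths very different (e.g. unequal variances or sample sizes) is where the rule starts to diverge from the test.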

0 Answers