
Suppose we have two independent Poisson-distributed variables $X_1$ and $X_2$. We want to test whether the Poisson parameters are equal, i.e. whether $\lambda_1=\lambda_2$.

Now we have four distinct exact statistical tests to choose from:

  1. E-test (see Krishnamoorthy and Thomson, A more powerful test for comparing two Poisson means, Journal of Statistical Planning and Inference 119 (2004) 23–35; see also Checking if two Poisson samples have the same mean on Cross Validated)
  2. poisson.exact(tsmethod="central")
  3. poisson.exact(tsmethod="minlike")
  4. poisson.exact(tsmethod="blaker") (from the exactci R package)

Now, given that all those tests are labeled "exact", one would expect them all to yield the same p-values. Contrary to this, the quoted paper clearly illustrates that methods 2-4 can give different significance. Furthermore, I personally implemented the E-test and found that it gives yet another, distinct result. Why is that? (A minimal sketch of the comparison I mean follows below.)
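
For concreteness, a minimal sketch with made-up counts ($X_1 = 10$, $X_2 = 2$) and equal observation times; the E-test itself is not part of exactci, so only methods 2-4 appear:

```r
# Hypothetical data: the counts below are made up for illustration.
library(exactci)

x  <- c(10, 2)   # observed counts
tt <- c(1, 1)    # equal observation times
poisson.exact(x, tt, tsmethod = "central")$p.value
poisson.exact(x, tt, tsmethod = "minlike")$p.value
poisson.exact(x, tt, tsmethod = "blaker")$p.value
```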

  • I am not familiar with all the methods so I am reluctant to make this comment an answer. It is really just a guess on my part. It is possible to have different tests for the same null hypothesis that use different test statistics. It is the distribution of the test statistic under the null hypothesis that determines the p-value for the specific test. Saying that a test is exact at level alpha just means that it is constructed using the exact distribution of the test statistic under the null hypothesis rather than an asymptotic approximation. – Michael R. Chernick Jul 26 '12 at 22:47
  • So it shouldn't be surprising that different tests on the same data set give different p-values. Now if the p-value is less than 0.05 for one test and greater for another, one will reject and the other will not (at the 5% level). Some tests are more powerful than others (this could be universal or under specific conditions). So this can happen and is not rare. – Michael R. Chernick Jul 26 '12 at 22:51
  • @MichaelChernick Are you sure about the meaning of exactness? I think an exact confidence interval at the nominal $\alpha$ level is a confidence interval whose effective confidence level is at least $\alpha$ (but it is true that such tests are usually (always?) based on the exact distribution of the test statistic and not an asymptotic approximation) – Stéphane Laurent Jul 27 '12 at 07:27
  • Adam, where do you see an illustration of a "significant" difference between the methods? The paper gives an example for which the three methods yield close confidence intervals. – Stéphane Laurent Jul 27 '12 at 07:29
  • @StéphaneLaurent My point about exact tests is that a 0.05 level test means that the exact significance level is less than or equal to 0.05. For absolutely continuous distributions an exact test will have exact level 0.05 for all sample sizes. But for test statistics that have discrete distributions for most sample sizes, the exact level will be less than 0.05 and yet we will call the test exact. Examples would be an exact binomial test using the Clopper-Pearson method or Fisher's exact test for contingency tables. – Michael R. Chernick Jul 27 '12 at 08:41
  • @MichaelChernick Ok, we are in agreement – Stéphane Laurent Jul 27 '12 at 08:45
  • @StéphaneLaurent True. But as I said, given that all the methods are exact I thought they would yield exactly the same p-value. As a side note, the Monte Carlo simulation I ran on the problem gave results similar to the E-test... I drew both X1 and X2 independently from their Poisson distribution (assuming $\lambda=k$), and checked how often one is greater than the other – Adam Ryczkowski Jul 28 '12 at 07:11

1 Answer


The p-value of a hypothesis test, or a corresponding confidence interval, depends on the choices made for two issues:

1. Treatment of the nuisance parameter

To preserve the size at the exact level, the type 1 error needs to be less than or equal to alpha for all possible values of the nuisance parameter. The null hypothesis that the rates are equal does not constrain the common rate itself, so that common rate is a nuisance parameter.

Conditional tests like the exact conditional Poisson test or Fisher's exact test remove the nuisance parameter by conditioning on a summary statistic.
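
To see the conditioning at work (a sketch with the same made-up counts as above): under the null with equal observation times, $X_1$ given $X_1+X_2=n$ is binomial with success probability 1/2, and base R's two-sample poisson.test reduces to exactly this binomial test.

```r
# Conditioning removes the nuisance parameter: under H0 with equal
# observation times, X1 | X1 + X2 = n ~ Binomial(n, 1/2).
x1 <- 10; x2 <- 2
poisson.test(c(x1, x2), c(1, 1))$p.value   # base R two-sample exact test
binom.test(x1, x1 + x2, p = 0.5)$p.value   # identical by construction
```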

Unconditional exact tests need to guarantee that the size is correct by taking the max or sup over all possible values of the nuisance parameter. The Berger-Boos approach restricts the maximization to a confidence set for the nuisance parameter and adds a correction term so that the test remains exact, i.e. preserves size alpha.
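
A sketch of the sup construction, assuming a simple Wald-type statistic and a truncated outcome grid (both are simplifications chosen for illustration, not the statistic any particular package uses):

```r
# Unconditional exact p-value, sketched: reject for large |T(y1, y2)|;
# sum the joint null probabilities of all outcomes at least as extreme
# as the observed one; then take the sup over the nuisance parameter.
ymax <- 60                                    # truncation of the outcome grid
outcomes <- expand.grid(y1 = 0:ymax, y2 = 0:ymax)
tstat <- function(y1, y2) (y1 - y2) / sqrt(pmax(y1 + y2, 1))

p_at <- function(x1, x2, lam) {
  extreme <- abs(tstat(outcomes$y1, outcomes$y2)) >= abs(tstat(x1, x2)) - 1e-12
  sum(dpois(outcomes$y1[extreme], lam) * dpois(outcomes$y2[extreme], lam))
}

# sup over a grid of values of the common rate lambda
max(sapply(seq(0.05, 20, by = 0.05), function(lam) p_at(10, 2, lam)))
```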

The Poisson E-test is not an exact test in this sense. It uses the "exact" distribution of the test statistic, but it plugs in the estimated value of the nuisance parameter instead of taking the sup.
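
By contrast, a sketch of the E-test idea, reusing p_at from the snippet above (the actual Krishnamoorthy-Thomson statistic is standardized differently, so this only conveys the flavor):

```r
# E-test flavor: plug in the pooled estimate of the common rate
# instead of maximizing over it (equal observation times assumed).
p_at(10, 2, lam = (10 + 2) / 2)
```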

2. Location of two-sided rejection region

A two-sided test has a rejection region in both the lower and the upper tail. The requirement that the size of the test is at most alpha constrains the total probability of landing in either of the two tails, but it does not pin down the probability in each tail separately.

"central" or equal tail methods limit the probability of the each tail to be less than or equal to half the size, alpha / 2.

"minlike" uses the likelihood value (based on likelihood ratio test) to find the non-rejection region. The corresponding profile confidence interval will not have equal tails in skewed distributions like Poisson or Binomial.

One point that Michael P. Fay points out and emphasizes is that the hypothesis test and the confidence interval reported together are often not consistent with each other.

For example, the exact Poisson test in R (poisson.test) computes a "minlike" exact p-value but reports "central", i.e. equal-tail, exact confidence intervals.
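
A quick illustration of the mismatch, and of exactci reporting a matching pair (same made-up counts as before):

```r
# Base R: "minlike" p-value paired with "central" confidence limits.
poisson.test(c(10, 2), c(1, 1))
# exactci: p-value and confidence interval from the same tsmethod.
poisson.exact(c(10, 2), c(1, 1), tsmethod = "minlike")
```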

For a one-sided test the location of the rejection region is fixed, so the distinction between "minlike" and "central" becomes irrelevant. Because there is only one tail, an exact test merely needs to preserve size alpha for that tail.

Josef
  • I'm working on getting more Poisson inference into statsmodels https://www.statsmodels.org/dev/generated/statsmodels.stats.rates.test_poisson_2indep.html – Josef Mar 29 '22 at 16:13