
Let's say we have a bivariate normal distribution. The two means $\mu_1$ and $\mu_2$ are both equal to 0, and the covariance matrix is the 2x2 identity matrix, diag(1, 2, 2) in R.

I want to use this distribution to perform a sample size calculation, so I am trying to calculate its quantiles. The mvtnorm R package lets me do this quite easily. In my example it would be:

library(mvtnorm)

alpha <- 0.05
mean_null <- c(0, 0)
mean_alternative <- c(0, 0)
sigma <- diag(1, 2, 2)

crit_lower <- qmvnorm(p=(alpha/2), mean=mean_null, sigma=sigma)$quantile
crit_upper <- qmvnorm(p=1 - (alpha/2), mean=mean_null, sigma=sigma)$quantile

This works properly (I think). However, as a test of my understanding, I wanted to calculate the power of the test under the null hypothesis (which should be equal to $\alpha$). But this doesn't seem to be the case. See the following code:

# probability of both test statistics being < lower crit
# approximately 0.025, makes sense
pmvnorm(lower=rep(-Inf, 2), upper=rep(crit_lower, 2),
        mean=mean_alternative,
        sigma=sigma)

# probability of both test statistics being > upper crit
# This however is way smaller than 0.025
pmvnorm(lower=rep(crit_upper, 2), upper=rep(Inf, 2),
        mean=mean_alternative,
        sigma=sigma)

The sum of both terms should be the power, i.e. 0.05, but it is way smaller. What am I misunderstanding here?

Denzo

1 Answer


In the univariate case, you can use the complement rule

$$P(X_1 \leq \text{crit}) = 1 - P(X_1 > \text{crit})$$

However, in the multivariate case this is not true, and

$$P(X_1 \leq \text{crit}, X_2 \leq \text{crit}) \neq 1- P(X_1 > \text{crit}, X_2 > \text{crit})$$

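The events are not complements: the complement of "both components are at most crit" is "at least one component exceeds crit", which also covers the cases where only one of the two is above crit.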
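As a rough numerical check, here is a minimal sketch in base R. It assumes the identity covariance from the question, so the two components are independent and every joint probability is a product of univariate ones. Under that assumption, the equicoordinate lower-tail quantile (what qmvnorm computes with its default tail) can be written in closed form, and mvtnorm is not needed. The variable names other than alpha and crit_upper are made up for this illustration.

```r
# Assumption: identity covariance as in the question, so X1 and X2 are
# independent standard normals and joint probabilities factorise.
alpha <- 0.05

# Equicoordinate quantile c with P(X1 <= c, X2 <= c) = p solves pnorm(c)^2 = p.
crit_upper <- qnorm(sqrt(1 - alpha / 2))

p_one_below <- pnorm(crit_upper)  # P(X_i <= crit_upper) for a single component

# The plane splits into four quadrants around (crit_upper, crit_upper).
p_both_below        <- p_one_below^2
p_both_above        <- (1 - p_one_below)^2
p_exactly_one_above <- 2 * p_one_below * (1 - p_one_below)

# The complement of "both below" is "at least one above", not "both above".
p_at_least_one_above <- 1 - p_both_below

print(c(both_below         = p_both_below,
        both_above         = p_both_above,
        exactly_one_above  = p_exactly_one_above,
        at_least_one_above = p_at_least_one_above))
```

By construction, the "both below" probability is 0.975 and the "at least one above" probability is 0.025. The "both above" probability is only the corner quadrant, which is much smaller. That corner is what the second pmvnorm call in the question measures.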

  • Thanks a lot! This makes me question the whole point of what I am trying but at least it answers the question here.. – Denzo Jan 10 '24 at 12:34
  • @Denzo using some region is not necessarily wrong, but your underlying problem is not clear enough for more precise comments to be made about it. You might ask a question where your problem is explained more precisely. – Sextus Empiricus Jan 10 '24 at 13:09
  • @Denzo, the standard extension of a CDF to multiple dimensions is to have $P(X\le c)$ apply to all vector values, so the complement is $1-P(X_1>c\vee X_2>c)$. It really depends on what you consider a successful outcome: the union (either) or the intersection (both) of your statistics exceeding $c$? Since the distributions are symmetric you can flip the signs to switch where you want the 'and'. – PBulls Jan 10 '24 at 13:11
  • This question reminds me of https://stats.stackexchange.com/questions/469795/ . At the same time, it might be interesting to wonder whether a square or some other shape of region might be useful, especially when the variables are correlated (see also: https://stats.stackexchange.com/a/364723/). – Sextus Empiricus Jan 10 '24 at 13:12