Say I have a finite, real-valued random variable $X(\theta)$, where $\theta$ is some set of parameters such that the distribution of $X$ depends only on $\theta$. In particular, depending on $\theta$, $E[X(\theta)]$ may be finite or infinite.

Now, let's say I have a fixed but unknown $\theta$ and a finite sample $X_1, X_2, \ldots, X_n$. Is there a way to test whether $E[X(\theta)] = \infty$?

My instinct is to do something like take increasingly large subsamples and see if the sample mean goes to infinity, but I'm not sure if that works and/or how to do it. Is there a way to test this?

  • This is a little too general to provide any specific answer. Your parameter space $\Theta$ is the disjoint union $\Theta=\Theta_0\cup\Theta_\infty$ where $\Theta_0=\{\theta\mid E[X(\theta)]\lt \infty\}.$ You ask how to test the hypothesis $\Theta_0$ against the alternative $\Theta_\infty.$ We can't possibly tell you any more without specifics of this parameterization. – whuber Jan 20 '24 at 18:14

1 Answer

Not really.

Your suggestion might give some hints, but with a finite sample size the sample mean may appear to converge even if the expectation is infinite, or may appear not to converge even if the expectation is finite.

As an illustration, here are two sets of Pareto distribution sample means generated in R from the same random numbers by the inverse CDF method, with larger sample sizes to the right. The higher red points come from a distribution with infinite expectation (exponent $0.99$), and the lower blue points from one with a finite expectation of $101$ (exponent $1.01$). In the blue case, fewer than $1\%$ of sample observations will exceed the distribution's expectation, and sample means of any reasonable size will exceed it even less often.
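
For reference, the $101$ and $1\%$ figures can be checked from the standard Pareto formulas. This is only a sketch, assuming a Pareto distribution with minimum $1$ and shape $\alpha$, sampled as $X = U^{-1/\alpha}$ with $U$ uniform on $(0,1)$:

$$
\begin{aligned}
P(X > x) &= x^{-\alpha} \quad \text{for } x \ge 1, \\
E[X] &= 1 + \int_1^\infty x^{-\alpha}\,dx =
\begin{cases}
\dfrac{\alpha}{\alpha-1} & \text{if } \alpha > 1, \\
\infty & \text{if } \alpha \le 1,
\end{cases} \\
\alpha = 1.01:\quad E[X] &= \frac{1.01}{0.01} = 101, \qquad P(X > 101) = 101^{-1.01} \approx 0.0095 < 1\%.
\end{aligned}
$$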

I do not think you can meaningfully distinguish from this chart whether you will get convergence or not despite a sample size of up to $100000$.

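set.seed(1)
maxsamplesize <- 10^5
redexponent  <- 0.99   # shape <= 1: infinite expectation
blueexponent <- 1.01   # shape > 1: finite expectation of 101
# the same uniform draws feed both samples
x <- runif(maxsamplesize)
# inverse CDF method for a Pareto distribution with minimum 1
redsample  <- (1/x)^(1/redexponent)
bluesample <- (1/x)^(1/blueexponent)
# cumulative sample means for sample sizes 1, 2, ..., maxsamplesize
redmean  <- cumsum(redsample)  / (1:maxsamplesize)
bluemean <- cumsum(bluesample) / (1:maxsamplesize)
plot(redmean, col="red", xlab="sample size", ylab="cumulative sample mean")
points(bluemean, col="blue")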

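[Figure: cumulative sample mean against sample size for seed 1; red points (exponent 0.99) lie above blue points (exponent 1.01)]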

You get a different visual effect just by changing the seed, but the same difficulty in distinguishing between the two cases.

set.seed(2024)
maxsamplesize <- 10^5
redexponent  <- 0.99
blueexponent <- 1.01
x <- runif(maxsamplesize)
redsample  <- (1/x)^(1/redexponent)
bluesample <- (1/x)^(1/blueexponent) 
redmean  <- cumsum(redsample)  / (1:maxsamplesize)
bluemean <- cumsum(bluesample) / (1:maxsamplesize)
plot(redmean, col="red", xlab="sample size", ylab="cumulative sample mean")
points(bluemean, col="blue")

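[Figure: cumulative sample mean against sample size for seed 2024; red points (exponent 0.99) lie above blue points (exponent 1.01)]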
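
To look at more seeds without reading a plot each time, the same simulation can be wrapped in a function that returns only the final sample means at the largest sample size. This is a rough sketch rather than part of the original code: `final_means` and the extra seeds `3`, `4`, `5` are illustrative choices, and no particular output is claimed.

```r
# Hypothetical helper: final sample means (at sample size n) for one seed,
# using the same inverse CDF construction as the code above.
final_means <- function(seed, n = 10^5, redexponent = 0.99, blueexponent = 1.01) {
  set.seed(seed)
  x <- runif(n)                            # shared uniform draws
  c(red  = mean((1/x)^(1/redexponent)),    # infinite-expectation case
    blue = mean((1/x)^(1/blueexponent)))   # finite-expectation case (101)
}

seeds <- c(1, 2024, 3, 4, 5)               # 3, 4, 5 are arbitrary extra seeds
results <- t(sapply(seeds, final_means))   # one row per seed
rownames(results) <- seeds
print(results)
```

Comparing the `blue` column with $101$, and the spread of both columns across seeds, gives a rough sense of how little the final means reveal about which exponent produced them.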

Henry
  • You seem to be answering an analog of the question at https://stats.stackexchange.com/questions/2504. But your interpretation here seems a little too narrow. For instance, consider the family of Pareto$(\alpha,\alpha)$ distributions (using the (scale,shape) parameterization at https://en.wikipedia.org/wiki/Pareto_distribution). When the smallest data value is $1$ or less, you can conclude with certainty that the expectation is infinite. Variants of this with parameters $(\alpha^c,\alpha)$ with large positive values of $c$ have high power. – whuber Jan 20 '24 at 19:31
  • @whuber: Here I am considering Pareto distributions with minima of $1$ in both cases and shape/exponent $\alpha$ of $0.99$ or $1.01$. In terms of my answer there, the simulations here are not showing visual convergence (the blue sample means are nowhere near $101$). – Henry Jan 20 '24 at 20:55
  • My point is that testing for an infinite expectation is possible and can even be powerful, depending on the distributional assumptions. – whuber Jan 20 '24 at 22:49
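
The Pareto$(\alpha,\alpha)$ example from the comments can be sketched in R. This assumes the (scale, shape) parameterization with both equal to $\alpha$, so every observation is at least $\alpha$ and the expectation is infinite exactly when $\alpha \le 1$. The names `rpareto_ss` and `reject_finite_mean`, and the sample size `n = 100`, are made up for illustration.

```r
# Hypothetical sampler: Pareto(scale, shape) by the inverse CDF method.
rpareto_ss <- function(n, scale, shape) scale * runif(n)^(-1 / shape)

# Hypothetical test under the ASSUMED model Pareto(alpha, alpha):
# all data are >= alpha, so min(x) <= 1 implies alpha <= 1 (infinite mean).
reject_finite_mean <- function(x) min(x) <= 1

set.seed(1)
n <- 100
x_inf <- rpareto_ss(n, 0.99, 0.99)   # infinite expectation
x_fin <- rpareto_ss(n, 1.01, 1.01)   # finite expectation

reject_finite_mean(x_inf)            # may be TRUE or FALSE
reject_finite_mean(x_fin)            # always FALSE: every value exceeds 1.01

# Under this model, P(min <= 1) = 1 - alpha^(alpha * n) when alpha <= 1.
power_exact <- 1 - 0.99^(0.99 * n)
power_exact
```

Under this assumed family the test never rejects when the expectation is finite, and its power grows with `n`. None of this carries over to the case in the answer, where both distributions share a minimum of $1$.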