
I noted a strange fact. Let X be a set of Pareto-distributed random numbers with shape $\alpha$ and threshold $x_{\min}$ fixed a priori. Now let $\alpha'$ be the estimated value of the shape and $x_{\min}'$ a new threshold chosen a posteriori.

If $x_{\min}'\rightarrow \max(x)$ then $\alpha'\rightarrow +\infty$.

where $\max(x)$ is the largest value in the sample. Let me try to explain better. Take the code from Comparing Pareto fitting methods and imagine adding, after the line

hh1 <- (matrix(rpareto(100,alpha,0.1),ncol=1)) 

(where 0.1 is $x_{\min}$) this piece of code

hh1 <- subset(hh1,hh1 > xmin) 

Here xmin${}=x_{\min}'$, namely the point where I start the fit, or, if you prefer, a cut-off. Of course $x_{\min}'$ cannot be greater than $\max(x)$. Now imagine putting this code in a loop in order to see the behavior of $\alpha'$ as you increment the cut-off $x_{\min}'$.

What I would expect is that as $x_{\min}'\rightarrow \max(x)$, $\alpha'\rightarrow\alpha_0$, because the two parameters are independent. But this does not happen: the estimate $\alpha'$ tends to infinity. Why do we have such strange behavior during the estimation? I understand the behavior for the MLE method, but what about the others?
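The experiment can be reproduced without the R packages. Below is a minimal self-contained sketch in Python; the inverse-transform sampling via numpy's `pareto` and the explicit MLE formula for the shape are my own stand-ins for `rpareto` and the fitting code in the linked question, not taken from it:

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, xmin, n = 1.5, 0.1, 100_000

# numpy's pareto() samples Pareto II on [0, inf); shifting by 1 and scaling
# by xmin gives the classical Pareto(alpha, xmin) on [xmin, inf).
x = xmin * (1.0 + rng.pareto(alpha, size=n))

def alpha_mle(sample, cutoff):
    """MLE of the shape fitted to the subsample exceeding `cutoff`."""
    tail = sample[sample > cutoff]
    return len(tail) / np.log(tail / cutoff).sum()

# Raise the cut-off toward max(x): while many tail points remain the
# estimate hovers near the true alpha, but its variance grows as the
# surviving subsample shrinks.
for q in (0.0, 0.90, 0.99, 0.999):
    cutoff = xmin if q == 0.0 else np.quantile(x, q)
    print(f"cut-off at quantile {q}: alpha' = {alpha_mle(x, cutoff):.3f}")

# Degenerate limit: a cut-off just below max(x) keeps a single point,
# the log-sum in the denominator collapses toward 0, and alpha' blows up.
print(f"near max(x): alpha' = {alpha_mle(x, x.max() * (1 - 1e-12)):.3g}")
```

With a healthy tail the estimate stays close to the true shape (the truncated sample is again Pareto with the same $\alpha$), but once only a handful of points survive the cut-off the estimate becomes wildly unstable, and in the limit it diverges.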

The same happens with alpha-stable distributions when we try to measure the shape parameter in the tails.

emanuele
  • What do you mean by "defined a priori" and "fixed a posteriori"? –  Jul 02 '12 at 09:15
  • Defined a priori means: I choose the parameters and generate the random numbers. Fixed a posteriori means: then I choose a threshold where to start fitting. – emanuele Jul 02 '12 at 09:38
  • Yes, I have done it in a different "experiment", but what I am asking here is different: why do we have this behaviour even though the two parameters are independent? It is a sort of conceptual experiment. I think it is important, because in general power-law behaviour appears asymptotically in nature and we have to know where to start fitting. – emanuele Jul 02 '12 at 09:47
  • I agree about the estimators, but what I would expect is a sort of "plateau" in the behavior of $\alpha$, namely a behavior of the kind $\alpha\rightarrow\alpha_0$. I disagree with the last part of your comment: I can choose different Pareto distributions with the same $\alpha$ and different $x_{min}$; in this sense they are independent. – emanuele Jul 02 '12 at 10:03
  • In this case (speaking of N parameters) it means that the distribution has N degrees of freedom. – emanuele Jul 02 '12 at 10:10
  • In what sense does "$x_{min}'\rightarrow +\infty$"? After all, necessarily $x_{min}$ is smaller than the smallest observed value! Are you perhaps discussing truncation of the distribution to values of $x_{min}$ or greater? Even so, $x_{min}'$ cannot exceed the largest observed value: truncation at larger values leaves one with no observations at all, whence no possibility of estimating $\alpha'$. – whuber Jul 02 '12 at 12:00
  • You're right, $x_{min}'\rightarrow +\infty$ can be misleading. Yes, I am talking about truncation of the distribution. – emanuele Jul 02 '12 at 12:29
  • Could you add a bit more detail to your question. What I would like to see is an explicit statement of the parameters, such as $(x_i\mid\alpha, x_m)\sim \mathrm{Pareto}(\alpha,x_m)$ for $i=1,\dots,n$. Then define $x_{min}$: is it a statistic (i.e. a function of the $x_i$) or a parameter? Similarly, is $x_{min}^{'}$ a statistic or a parameter? What about $\alpha^{'}$? – probabilityislogic Jul 02 '12 at 12:48
  • Emanuele, to add to @prob's query, I suspect the answer also depends on how one estimates $\alpha'$, so could you please indicate that, too? – whuber Jul 02 '12 at 13:01
  • $x_{min}$ is defined for creating the Pareto-distributed sample of numbers; $x_{min}'$ is an arbitrary point where I truncate the sample and start to fit, so $x_{min}' \ge x_{min}$ always. The behavior of $\alpha'$ is independent of the method I choose for fitting, i.e. MLE, mean (if $\alpha >1$), median, or fitting a straight line on a log-log scale. – emanuele Jul 02 '12 at 13:41
  • I think what @whuber is trying to get at is: (1) How does $x_{\mathrm{min}}'$ "converge" to infinity? Is it as a function of the sample size? Or in what (other) sense? (2) What means are you using to estimate $\alpha'$; for example, maximum likelihood? (Clearly, if you took as your estimate, say, $\hat\alpha' := 2$ by ignoring the data altogether, you would not have this problem! So, the estimation procedure does matter.) – cardinal Jul 02 '12 at 13:46
  • you need to edit the question, not bury the details in a comment – probabilityislogic Jul 02 '12 at 14:22

1 Answer


If $x_{\min}^{'}\to\infty$ then we also have $x_i\to\infty$, and the type of convergence is "sure" convergence. But note that just because two numbers both diverge does not mean that their limiting ratio is $1$. This becomes clear once it is recognised that $x_{\min}$ is a scale parameter, because we have: $$\log\left(\frac{x_i}{x_{\min}}\right)\sim \operatorname{Expo}(\alpha) $$ where $\operatorname{Expo}(\cdot)$ is the exponential distribution. Now, because this distribution is independent of $x_{\min}$, it is also the limiting distribution as $x_{\min}\to\infty$.
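A one-line check of the exponential claim, using the Pareto survival function $\Pr(X>x)=(x_{\min}/x)^{\alpha}$ for $x\geq x_{\min}$:

$$\Pr\left(\log\frac{X}{x_{\min}}>t\right)=\Pr\left(X>x_{\min}e^{t}\right)=\left(\frac{x_{\min}}{x_{\min}e^{t}}\right)^{\alpha}=e^{-\alpha t},\qquad t\geq 0,$$

which is exactly the survival function of $\operatorname{Expo}(\alpha)$ and contains no trace of $x_{\min}$.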

So for a small sample the estimated value of $\alpha$ could be anything; as the sample size increases, it will converge to the true value.

Update

In response to the revised question, the limit you are actually asking about is $x_{\min}^{'}\to x_{\max}$, not to infinity. Now you ask why the estimate for $\alpha$ is infinite in this case. Well, the reason is that this limit corresponds to using a sample in which all of the values are equal: $x_1=x_2=\dots=x_n=x_{\min}^{'}=x_{\max}$. But in this case, the likelihood is exactly fitted by a Dirac delta function. The Wikipedia page states that: $$\lim_{\alpha\to\infty}f(x\mid x_{\min}^{'},\alpha)=\delta(x-x_{\min}^{'})$$ So the MLE procedure is not breaking down; it is quite properly doing what it should: fitting the data as hard as it can within the class of distributions you give it. You do get a warning, though, as the MLE has infinite variance for all finite samples.
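To see the divergence explicitly, write out the maximum-likelihood estimator for the shape (the standard formula for a Pareto sample with known threshold $x_{\min}^{'}$):

$$\hat{\alpha}=\frac{n}{\sum_{i=1}^{n}\log\left(x_{i}/x_{\min}^{'}\right)}.$$

As $x_{\min}^{'}\to x_{\max}$, every retained observation satisfies $x_{\min}^{'}<x_{i}\leq x_{\max}$, so each logarithm in the denominator shrinks to $0$ and $\hat{\alpha}\to\infty$.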

  • I did not understand. Why do you say $\log\left(\frac{x_i}{x_{min}}\right)\sim \operatorname{Expo}(\alpha)$? – emanuele Jul 02 '12 at 11:26
  • Probability, could you explain the sense in which you understand "$x_{min}'\rightarrow +\infty$"? (See my related comment to the question itself.) – whuber Jul 02 '12 at 12:01
  • @whuber - I was treating $x_{min}^{'}$ as a fixed, known parameter of the Pareto distribution. You can show that the sampling distribution of the MLE for $\alpha$ is invariant to both the sample minimum, and the lower bound parameter. Only the values relative to the sample minimum matter, and this limit is arbitrary - a limit of a ratio of two divergent terms. – probabilityislogic Jul 02 '12 at 12:52
  • I'm afraid I don't follow this at all--must be Monday morning. :-) What is the relationship, then, between $x_{min}$ and $x_{min}'$ in your answer? – whuber Jul 02 '12 at 13:01
  • The Pareto distribution is "invariant" to the distinction: truncating a Pareto sample to $x\geq x_{min}^{'}$, where $x\sim \mathrm{Pareto}(\alpha,x_{min})$, just replaces $x_{min}$ with $x_{min}^{'}$. Similar to the memoryless property of the exponential distribution. – probabilityislogic Jul 02 '12 at 13:07
  • There are so many things that we don't know about the sampling process (yet), that it's hard to see how this gives a definitive answer. While it is true that $\newcommand{\xmin}{x_{\mathrm{min}}^{'}}\mathbb P(X > x \mid X > \xmin) = (\xmin / x)^{\alpha}$ for $x \geq \xmin$, the resulting sample size is random and we do not know what $\xmin \to \infty$ means. If it means that it is a function of the (latent) sample size $n$, then this response may provide some insight into the behavior of $\hat\alpha'$, or it may not relate at all to the behavior! – cardinal Jul 02 '12 at 13:28
  • (Imagine, e.g., as @whuber hints at, that $\xmin := \xmin(n)$ grows faster than the quantiles of the maximum of the (latent) sample, which is easy to compute.) – cardinal Jul 02 '12 at 13:32
  • I think this means the question really needs to be edited. I will edit my answer when that happens; if this takes too long, I think I'll remove my answer. Note that if $\alpha$ is estimated by the Wikipedia formula (as the comments indicate), then the only way $x_{min}^{'}$ can influence the estimate is through $n$. I assumed that the a posteriori statement meant that we always have some sample, which basically means the estimate does not converge to anything unless $n\to 1$ or $n\to\infty$. – probabilityislogic Jul 02 '12 at 13:52
  • Fair enough. I completely agree that we need more info from the OP. But, I don't think that last comment is true, prob. For example, how do you "choose" $\xmin$ to "guarantee" you have a sample without looking at the data? And, if $\xmin$ is chosen by some data-driven procedure of the latent sample, then it can induce dependencies on the resultant truncated sample, in which case it will not "influence the estimate (only) through $n$". – cardinal Jul 02 '12 at 13:57
  • What I interpreted "a posteriori" to mean is that you look at the data to choose $x_{min}^{'}$; what else could be meant by "a posteriori"? The estimate for $\alpha$ can be written as $\hat{\alpha}=\alpha\frac{n}{\sum_{i=1}^{n}[Y_{i}-Y_{(1)}]}$ where $Y_{i}\overset{iid}{\sim}\operatorname{Expo}(1)$ and $Y_{(1)}$ is the minimum of the $Y_{i}$; the only "data"-based part of the estimator is $n$. – probabilityislogic Jul 02 '12 at 14:29
  • Unfortunately, the last sentence of your most recent comment is not true without knowing how $\xmin$ was arrived at. – cardinal Jul 02 '12 at 14:42
  • Yeah, I just realised that now: $1$ should be replaced by $k$, which depends on $x_{min}^{'}$. – probabilityislogic Jul 02 '12 at 14:51