From Wikipedia, I have the complement of the CDF parameterized for fat-tailed distributions:
$$ \Pr[X>x] \sim x^{- \alpha}\text{ as }x \to \infty,\qquad \alpha > 0.\, $$
Here $\alpha$ is the fatness parameter. According to Taleb, $\alpha \leq 2.5$ is forecastable, but $\alpha > 2.5$ is not.
I would like to fit $\alpha$ given my data so I can mark it as forecastable or not.
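Taking logs of the relation above gives $\log \Pr[X>x] \approx c - \alpha \log x$ for large $x$, so in principle the slope of the log empirical survival function against $\log x$ in the upper tail should be close to $-\alpha$. Here is a minimal sketch of that idea on simulated data, not on my real data. The Pareto sample, the value `alpha_true = 1.5`, the 10% tail cutoff, and the `0.5` plotting-position offset are all assumptions chosen for illustration.

```r
# Sketch: recover a known tail index from simulated Pareto data.
set.seed(42)
alpha_true <- 1.5   # assumed value, for illustration only
n <- 10000

# Inverse-transform sampling: if U ~ Uniform(0, 1), then U^(-1/alpha)
# satisfies Pr[X > x] = x^(-alpha) for x >= 1.
x <- sort(runif(n)^(-1 / alpha_true))

# Empirical survival probabilities for the sorted sample.
# The 0.5 offset keeps every value strictly positive, so log() stays finite.
surv <- (n - seq_len(n) + 0.5) / n

# Keep only the largest 10% of observations (an arbitrary cutoff).
tail_idx <- seq_len(n) > 0.9 * n

# Regress log survival on log x; the negated slope estimates alpha.
fit <- lm(log(surv[tail_idx]) ~ log(x[tail_idx]))
alpha_hat <- -unname(coef(fit)[2])
alpha_hat
```

The estimate depends on where the tail is cut off, so with real data it would be worth checking several cutoffs before trusting a single number.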
I thought I would start by trying to fit my data to a linear model.
library(tidyverse)
set.seed(42)
df_tails <- tibble(y = 1 - seq(0.01, 1, 0.01),
                   norm = sort(rnorm(n = 100, 0, 1)),
                   cauchy = sort(rcauchy(n = 100, 0, 1)))
lm(log(y) ~ norm - 1, data = df_tails)
lm(log(y) ~ cauchy - 1, data = df_tails)
The problem is that I end up with many NAs, so I think I am coding something wrong.
Try 2
library(tidyverse)
set.seed(42)
df_tails_raw <- tibble(y = log(1 - seq(0.01, 1, 0.01)),
                       norm = log(sort(rnorm(n = 100, 0, 1))),
                       cauchy = log(sort(rcauchy(n = 100, 0, 1))))
df_tails <- na.omit(df_tails_raw)
df_tails |>
  ggplot() +
  geom_point(aes(x = norm, y = y), color = 'tomato', size = 2, stroke = 2, shape = 1) +
  geom_point(aes(x = cauchy, y = y), color = 'grey50', size = 2, stroke = 2, shape = 1) +
  theme_classic() +
  labs(title = 'Red is normal and Grey is Cauchy')
lm(y ~ norm, data = df_tails)
lm(y ~ cauchy - 1, data = df_tails)
My error is
Error in lm.fit(x, y, offset = offset, singular.ok = singular.ok, ...) : NA/NaN/Inf in 'y'
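A quick way to see where the non-finite values come from is to inspect the transformed columns directly. This is only a diagnostic sketch in base R, using the same grid and seed as above. The names `y`, `z`, and `keep` are placeholders for illustration.

```r
# Sketch: locate the non-finite values produced by log().
y <- log(1 - seq(0.01, 1, 0.01))

tail(y, 1)           # last grid point is 1, so this is log(0)
sum(!is.finite(y))   # count of non-finite entries in y

set.seed(42)
# log() of a negative draw is NaN (R also emits a warning, silenced here).
z <- suppressWarnings(log(sort(rnorm(n = 100, 0, 1))))
sum(is.nan(z))       # number of negative draws turned into NaN

# na.omit() drops NaN rows but keeps -Inf, because is.na(-Inf) is FALSE.
# Filtering on is.finite() removes both kinds of problem value.
keep <- is.finite(y) & is.finite(z)
sum(keep)
```

Filtering with `is.finite()` only makes `lm` run. It does not say whether a straight line is a sensible model for the points that remain.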



y includes a zero in the range and that gives an error when you take the logarithm and pass the result to lm. – Sextus Empiricus Nov 30 '22 at 22:15

y = log(1- seq(0.01,0.99, 0.01)) instead. This removes log(0) from your tibble. (Then adjust the number of random draws you're using.) – Sycorax Nov 30 '22 at 22:48

lm fit lines to the inverse CDF (the quantile function), which cannot possibly be remotely close to polynomial: it has vertical asymptotes at 0 and 1. Second, any polynomial approximation of a CDF of any degree is doomed to fail dramatically in the tails and, moreover, is unlikely to be monotonic. For the third time: a scatterplot will reveal all. Superimpose your lm fit on that to see just how bad it is. – whuber Dec 01 '22 at 14:43