Programming inverse-transformation sampling for Pareto distribution

Question

I am having trouble deriving a formula, and running a simulation with its distribution. The Pareto distribution has CDF:

$$F(x) = 1 - \bigg( \frac{k}{x} \bigg)^\gamma \quad \quad \quad \text{for } x \geqslant k,$$

where $k>0$ is the scale parameter and $\gamma>0$ is the shape parameter. I need to derive the probability inverse transformation 'quantile':

$$X = F^{-1}(U) = Q(U).$$

I tried deriving the equation and ended up with $X = k/\text{gammaroot}(1-U)$. Does that make sense? If so, how would I do a $\text{gammaroot}$ function in R?

Please explain what you mean by "gammaroot." Would it be something like taking the $1/\gamma$ power? — whuber, Nov 21 '20 at 20:35
Yes, that would be. If that was the case, did I derive the equation correctly to obtain X? — John Huang, Nov 21 '20 at 20:49
If you want to write $\sqrt[\gamma]{x}$, use \sqrt[\gamma]{x}. — J.G., Nov 22 '20 at 14:40

forecaster · Accepted Answer · 2020-11-21T21:09:10.347

I'm assuming you are referecning to Inverse Transform Sampling method. Its very straight forward. Refer Wiki article and this site.

Pareto CDF is given by: $$ F(X) = 1 -(\frac{k}{x})^\gamma; x\ge k>0 \ and \ \gamma>0 $$

All you do is equate to uniform distribution and solve for $x$

$$ F(X) = U \\ U \sim Uniform(0,1) \\ 1 -(\frac{k}{x})^\gamma = u \\ x = k(1-u)^{-1/\gamma} $$

Now in R:


#N = number of samples
#N = number of sample
rpar <- function(N,g,k){
if (k < 0 | g <0){
    stop("both k and g >0")
  }
k*(1-runif(N))^(-1/g)
}
rand_pareto <- rpar(1e5,5, 16)
hist(rand_pareto, 100, freq = FALSE)
#verify using built in random variate rpareto in package extrDistr
x <- (extraDistr::rpareto(1e5,5, 16))
hist(x, 100, freq = FALSE)

This will give you the random variate for Pareto distribution. I'm not sure about where you are getting gammaroot?

Ben · Answer 2 · 2020-11-21T21:25:13.690

Using the quantile $p=F(x)$ and inverting the CDF equation gives the quantile function:

$$Q(p) = \frac{k}{(1-p)^{1/\gamma}} \quad \quad \quad \text{for all } 0 \leqslant p \leqslant 1.$$

The corresponding log-quantile function is:

$$\log Q(p) = \log k - \frac{1}{\gamma} \log (1-p) \quad \quad \quad \text{for all } 0 \leqslant p \leqslant 1.$$

The probability functions for the Pareto distribution are already available in R (see e.g., the EnvStats package). However, it is fairly simple to program this function from scratch if preferred. Here is a vectorised version of the function.

qpareto <- function(p, scale, shape = 1, lower.tail = TRUE, log.p = FALSE) {
#Check input p
  if (!is.vector(p))                { stop('Error: p must be a numeric vector') }
  if (!is.numeric(p))               { stop('Error: p must be a numeric vector') }
  if (min(p) < 0)                   { stop('Error: p must be between zero and one') }
  if (max(p) > 1)                   { stop('Error: p must be between zero and one') }
n   <- length(p)
  OUT <- numeric(n)
  for (i in 1:n) { 
    P      <- ifelse(lower.tail, 1-p[i], p[i])
    OUT[i] <- log(scale) - log(P)/shape) }
if (log.p) OUT else exp(OUT) }

Once you have the quantile function it is simple to generate random variables using inverse-transformation sampling. Again, this is already done in the existing Pareto functions in R, but if you want to program it from scratch, that is quite simple to do.

rpareto <- function(n, scale, shape = 1) {
#Check input n
  if (!is.vector(n))                { stop('Error: n must be a single positive integer') }
  if (!is.numeric(n))               { stop('Error: n must be a single positive integer') }
  if (length(n) != 1)               { stop('Error: n must be a single positive integer') }
  if (as.integer(n) != n)           { stop('Error: n must be a single positive integer') }
  if (n <= 0)                       { stop('Error: n must be a single positive integer') }
qpareto(runif(n), scale = scale, shape = shape) }

Programming inverse-transformation sampling for Pareto distribution

2 Answers2