Let $X$ be a sample from $N(0,1)$, and let $m$, $v$, $s$, $k$ denote the sample mean, variance, skewness and kurtosis of $X$. I want to transform the sample $X$ so that its sample moments equal the true population moments, e.g.
- sample mean = 0
- sample variance = 1
- sample skewness = 0
- sample kurtosis = 3
- ...
Using z-scores, $\frac{X-m}{\sqrt{v}}$, I can match the first two moments perfectly.
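The z-score claim can be checked with a minimal Python sketch (standard library only). The helper names `sample_moments` and `zscore` are made up for illustration, and the sketch assumes the $1/n$ definitions of variance, skewness and kurtosis; the exact match of the variance holds as long as the same convention is used for standardizing and for checking.

```python
import math
import random

def sample_moments(x):
    """Return (mean, variance, skewness, kurtosis) using 1/n normalization (an assumption)."""
    n = len(x)
    m = sum(x) / n
    d = [xi - m for xi in x]
    v = sum(di ** 2 for di in d) / n
    s = sum(di ** 3 for di in d) / n / v ** 1.5
    k = sum(di ** 4 for di in d) / n / v ** 2
    return m, v, s, k

def zscore(x):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m, v, _, _ = sample_moments(x)
    return [(xi - m) / math.sqrt(v) for xi in x]

random.seed(0)  # arbitrary seed, only for reproducibility
x = [random.gauss(0.0, 1.0) for _ in range(1000)]

print("raw     :", sample_moments(x))
print("z-scores:", sample_moments(zscore(x)))
```

Skewness and kurtosis are invariant under affine maps, so the z-scores have exactly the same skewness and kurtosis as the raw sample (up to rounding). That is why matching further moments needs a nonlinear transformation.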
I am looking for a (nonlinear) transformation that lets my sample match further population moments as well.
Online I found the sinh-arcsinh transformation, $$Z=\sinh\left((4-k)\sinh^{-1}\left(\frac{X-m}{\sqrt{v}}\right)-s\right),$$
which should result in a match of the first four sample moments with the true population moments.
However, if I compare this transformation with the plain z-scores, $\frac{X-m}{\sqrt{v}}$, then that simpler approach yields better results (sample moments match population moments more closely). How can I transform the data correctly to match the moments?
Let $Z\sim N(0,1)$. Then, $$X=\mu+\sigma\sinh\left(\frac{\sinh^{-1}\left(Z\right)+\varepsilon}{\delta}\right)$$ has mean $\mu$, variance $\sigma^2$, skewness $\varepsilon$ and kurtosis $4-\delta$.
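To make the comparison described in the question reproducible, here is a minimal Python sketch (standard library only). It applies the sinh-arcsinh formula exactly as quoted above, with $\delta = 4-k$ and $\varepsilon = s$ plugged in from the sample, and prints the resulting sample moments next to those of the plain z-scores. The function names and the $1/n$ moment definitions are assumptions made for illustration; the sketch does not attempt to correct the formula.

```python
import math
import random

def sample_moments(x):
    """Return (mean, variance, skewness, kurtosis) using 1/n normalization (an assumption)."""
    n = len(x)
    m = sum(x) / n
    d = [xi - m for xi in x]
    v = sum(di ** 2 for di in d) / n
    s = sum(di ** 3 for di in d) / n / v ** 1.5
    k = sum(di ** 4 for di in d) / n / v ** 2
    return m, v, s, k

def zscore(x):
    """Subtract the sample mean and divide by the sample standard deviation."""
    m, v, _, _ = sample_moments(x)
    return [(xi - m) / math.sqrt(v) for xi in x]

def sinh_arcsinh_as_quoted(x):
    """Apply Z = sinh((4 - k) * asinh(z) - s) to the z-scores, as written in the question."""
    _, _, s, k = sample_moments(x)
    return [math.sinh((4.0 - k) * math.asinh(zi) - s) for zi in zscore(x)]

random.seed(1)  # arbitrary seed, only for reproducibility
x = [random.gauss(0.0, 1.0) for _ in range(1000)]

for label, y in (("z-scores", zscore(x)),
                 ("sinh-arcsinh", sinh_arcsinh_as_quoted(x))):
    m, v, s, k = sample_moments(y)
    print(f"{label:>12}: mean={m:+.4f} var={v:.4f} skew={s:+.4f} kurt={k:.4f}")
```

Changing the seed or the sample size shows how the two approaches compare across different draws.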
randn(matlab). Could you perhaps elaborate on what such a transformation would look like? The probability integral transform merely states $F_X(X)\sim U(0,1)$, right? – Alex May 16 '20 at 22:22

randn. That sample should have mean 0, variance 1, skewness 0, etc. But the sample moments differ slightly from those population moments. I want to "improve" that sample by matching its moments with the true population moments. E.g., when I use z-scores (subtract the sample mean and divide by the sample standard deviation), the first two moments are matched perfectly. I wonder what transformation helps me match higher sample moments to the population moments of $N(0,1)$. The more moments match, the better the quality of my sample. – Alex May 17 '20 at 15:27