Given the formula for the standardized third moment that defines the skewness: $$\text{skewness}=E\Bigg[\bigg(\frac{x_i-\bar{X}}{\sigma}\bigg)^3 \Bigg] = \frac{\mu_3}{\sigma^3} $$

I understand that this formula measures how far the data lie from the mean, relative to the standard deviation, and that cubing preserves the sign of each deviation.

So let's say I have a distribution for which this formula gives 1.5, which means that it is positively skewed and the right tail is longer than the left. Is it correct to say that the right tail is 0.5 times longer than the left, if we want to describe the spread in units of one standard deviation for this distribution?

tmo
  • How would you measure "longer," especially for distributions with unbounded support? – whuber Mar 07 '22 at 01:08
  • I'm thinking of the terms $(x_i - \bar{x})^3$. The sum of these terms should indicate whether one side is spread out further than the other. Am I understanding it right? – tmo Mar 07 '22 at 02:21
  • In the sense of the mean absolute cubed deviation, yes: that's just another way of stating the skewness is positive. But not necessarily in the standard deviation sense. – whuber Mar 07 '22 at 02:30
  • And when we divide it by $\sigma$, doesn't it become a proportion of the standard deviation? My goal is basically to understand what a number such as the 1.5 in my example means, not just to conclude that the distribution is left-skewed, right-skewed, or symmetric. Most documents seem to skip this. – tmo Mar 07 '22 at 02:40
  • One of the best ways to know what a distributional property means is to examine how it varies among many distributions. A good place to start is a family of location-scale distributions, for then often a shape parameter (or several shape parameters) is directly related to skewness. See, for instance, a Pearson plot. – whuber Mar 07 '22 at 03:30
  • I think it’s a mistake to look for good equivalents for the formula $skewness=1/2$. Third-moment skewness has some advantages (ease of analytical calculations with exact distributions, and tests of normality), but for most datasets it’s a poor choice of descriptive statistic. You could instead calculate and analyze $mean-median$ or $(Q1-Q2)-(Q2-Q3)$ directly, and then you don’t have to worry about third-moment skewness at all. – Matt F. Mar 07 '22 at 03:59
  • Thank you. Can you post it as an answer so that I can accept it? Otherwise, I think the question would be left open. – tmo Mar 07 '22 at 04:26
  • Or you could make a plot of graphical moments for skewness, as done at https://stats.stackexchange.com/questions/84158/how-is-the-kurtosis-of-a-distribution-related-to-the-geometry-of-the-density-fun/362745#362745 for kurtosis! – kjetil b halvorsen Mar 07 '22 at 11:34
  • The difference between mean and median is almost always worth a look, but it is easy to find distributions for which mean and median are equal but otherwise you would call them skew. Comparisons based on differences of quantiles can help too, as can L-moments https://en.wikipedia.org/wiki/L-moment – Nick Cox Mar 07 '22 at 12:33
  • Skewness doesn’t tell you how much longer the right tail is unless you define this length somehow and then relate it to skewness. – Aksakal Mar 07 '22 at 13:31

1 Answer

A precise interpretation of the skewness value 1.5 is given as follows:

  1. Construct the $z$-scores, $z_i =(x_i-\bar x)/s$.
  2. Calculate the values $s_i =z_i^3$.
  3. Draw the dot plot (like a histogram, but with dots instead of bars) of the $s_i$.
  4. Since skewness is the mean of the $s_i$ values, and since a distribution's point of balance is the mean, the dot plot of the $s_i$ numbers balances at 1.5.
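The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data values are made up, the helper name `cubed_z_scores` is hypothetical, and the standard deviation divides by $n$ (dividing by $n-1$ instead would rescale the result slightly).

```python
from statistics import fmean, pstdev


def cubed_z_scores(xs):
    """Return s_i = z_i**3, where z_i = (x_i - mean) / sd."""
    mean = fmean(xs)
    sd = pstdev(xs)  # assumption: divide by n, not n - 1
    return [((x - mean) / sd) ** 3 for x in xs]


# Made-up, right-skewed sample used purely for illustration.
data = [1, 2, 2, 3, 3, 3, 4, 5, 9, 14]

s = cubed_z_scores(data)   # steps 1 and 2
skewness = fmean(s)        # step 4: the mean of the s_i

# The mean is the balance point: deviations from it sum to (nearly) zero.
imbalance = sum(si - skewness for si in s)

print(f"skewness = {skewness:.3f}")
print(f"sum of (s_i - skewness) = {imbalance:.1e}")
```

Plotting the values in `s` as a dot plot (step 3) would show most points clustered just below zero and a few large positive values far to the right, with the whole plot balancing at `skewness`.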

The interpretation above refers to sample skewness, but a nearly identical procedure gives the precise interpretation of "population" skewness: drop the subscript, replace the sample mean and standard deviation with their "population" counterparts, and draw the probability distribution graph of the resulting $s$. The point of balance of this graph is the "population" skewness.
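As a worked illustration of the population version (the choice of distribution here is only an example, not part of the question): take $X$ exponential with rate 1, so the population mean and standard deviation are both 1, and $s = (X-1)^3$. Using $E[X^k] = k!$ for this distribution,

$$E[s] = E[X^3] - 3E[X^2] + 3E[X] - 1 = 6 - 6 + 3 - 1 = 2,$$

so the probability distribution of $s$ balances at 2, which is the skewness of the exponential distribution.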

BigBendRegion