I was looking at some newly gathered data consisting of reading times measured in ms, and I decided to plot the observed values before cleaning the data to see to what extent they are normally distributed. I was quite surprised to see this pattern. While I gather from the plot that the data are right-skewed, I do not really understand why the line is flat. I have searched for explanations in stats books and on the internet, but none of the examples I have found resemble the pattern I observed.
1 Answer
One way to understand qqplots is to simulate data with particular properties and look at the qqplots they produce. Below is a simulation that produces some flat lines in the qqplot:
In each of the horizontal lines, the theoretical quantile varies while the sample quantile is constant. The only way the sample quantile can be constant is for the sample value itself to be constant, that is, for the same value to occur many times in the data. And indeed, the R code for the simulation was
sample(1:5, 1000, replace=TRUE)
which samples, with equal probability, each of the values 1,2,3,4 or 5.
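To see the effect directly, here is a minimal sketch that wraps the sampling call above in qqnorm. The seed, the plot title, and the variable names x and qq are my own choices for illustration, not part of the original simulation.

```r
# Reproduce the flat segments: a sample with only five distinct values.
set.seed(1)  # arbitrary seed, only for reproducibility
x <- sample(1:5, 1000, replace = TRUE)

# qqnorm() draws the plot and invisibly returns the plotted coordinates:
# qq$x holds the theoretical normal quantiles, qq$y the sample values.
qq <- qqnorm(x, main = "Normal Q-Q plot of a five-valued sample")
qqline(x)

# Each distinct value is repeated many times, so each one forms a
# horizontal run of points spread over a range of theoretical quantiles.
print(table(x))
print(length(unique(qq$y)))
```

The table shows how many points sit in each horizontal run. Heavily tied data produce the same step pattern in any qqplot.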
In your case, the left part of the plot is only almost flat, meaning that the data values vary slowly, over a small range, while in the right part of the plot they vary more rapidly, over a large range. This indicates an asymmetric distribution with a long right tail. So simulate data from distributions like the gamma, Weibull or lognormal and look at their qqplots!
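A rough sketch of that suggestion follows. The parameter values are arbitrary assumptions, chosen only to give positive, right-skewed values on a scale of a few hundred (loosely like reading times in ms). They are not fitted to your data.

```r
# Simulate three right-skewed distributions and compare their normal qqplots.
set.seed(42)  # arbitrary seed, only for reproducibility
n <- 1000

# All parameter values below are illustrative assumptions.
sims <- list(
  gamma     = rgamma(n, shape = 2, rate = 0.01),
  weibull   = rweibull(n, shape = 1.2, scale = 300),
  lognormal = rlnorm(n, meanlog = 5.5, sdlog = 0.6)
)

# One panel per distribution, each with the default reference line
# through the first and third quartiles.
op <- par(mfrow = c(1, 3))
for (nm in names(sims)) {
  qqnorm(sims[[nm]], main = nm)
  qqline(sims[[nm]])
}
par(op)  # restore the previous plotting layout
```

In each panel the points should lie close to the line (or flatter than it) on the left and bend upward away from it on the right, which is the shape described above for a right-skewed sample.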


qqline states: "qqline adds a line to a “theoretical”, by default normal, quantile-quantile plot which passes through the probs quantiles, by default the first and third quartiles." Does this answer your question? – COOLSerdash Jul 23 '21 at 12:01
Try qqnorm(rnorm(1000,mean=100,sd=10)) to see an example. Your chart shows severe right skewness (and possibly infinite population variance). Something like qqnorm(200*abs(rt(1000,2))) might be similar – Henry Jul 23 '21 at 12:03