So I'm solving old exams in preparation for my statistical data analysis exam and came across this question.

"b) Using the QQ-plot of the location-scale family that you have selected under part a, determine the location a and scale b approximately."

The given QQ-Plots can be seen below.


[Figure: the given QQ-plots]


Also given is a summary:

[Figure: summary of the data]

From my understanding, location can be described by the mean or median, whereas scale can be described by the variance or standard deviation. These can be found in the summary provided; however, I do not understand how one would approximate them from the QQ-plot. Thanks in advance for any help!

1 Answer

Your QQ-Plots have theoretical quantiles on the horizontal axis and sample quantiles on the vertical axis. (This style is the default in R.)

The following R code generates a random sample of size 200 from a standard normal population $\mathsf{Norm}(\mu = 0,\, \sigma=1),$ makes a normal QQ-plot and plots (left panel) the reference line $y = a + bx,$ where $a = \mu = 0$ and $b = \sigma = 1.$

Then it generates a random sample of size 200 from the population $\mathsf{Norm}(\mu = 100,\, \sigma=15),$ makes a normal QQ-plot and plots (right panel) the reference line $y = a + bx,$ where $a = \mu = 100$ and $b = \sigma = 15.$

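set.seed(2019)
par(mfrow=c(1,2))    # two panels side by side
# Left panel: standard normal sample, reference line y = 0 + 1*x
z = rnorm(200);  qqnorm(z)
abline(a=0, b=1, col="red", lwd=2)
# Right panel: Norm(100, 15) sample, reference line y = 100 + 15*x
z = rnorm(200, 100, 15); qqnorm(z)
abline(a=100, b=15, col="blue", lwd=2)
par(mfrow=c(1,1))    # restore the single-panel layout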

[Figure: normal QQ-plots of the two samples, with the red (left) and blue (right) reference lines]

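In general, a reference line with intercept $\mu$ (often estimated by $\bar X)$ and slope $\sigma$ (often estimated by the sample standard deviation $S)$ is a reasonable fit to the points in a QQ-plot. This is because, in a location-scale family, $X = a + bZ$ with $b > 0,$ so each quantile of $X$ equals $a$ plus $b$ times the corresponding quantile of $Z.$ To read the values off a plot, take $a$ as the height of the line at theoretical quantile $0$ and $b$ as its slope (the rise in sample quantile per unit of theoretical quantile).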
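If you want to check an eyeballed line numerically, one rough way is to fit a least-squares line through the QQ-plot points. This is only a sketch: the least-squares fit is my own illustrative choice (not something the exam question prescribes), and the names `qq`, `fit`, `a_hat` and `b_hat` are made up for this example.

```r
set.seed(2019)
z <- rnorm(200, mean = 100, sd = 15)   # sample with known a = 100, b = 15

# qqnorm() returns the plotted coordinates:
# x = theoretical quantiles, y = sample quantiles
qq <- qqnorm(z, plot.it = FALSE)

# Least-squares line through the QQ-plot points (illustrative choice)
fit <- lm(y ~ x, data = qq)
a_hat <- unname(coef(fit)[1])   # intercept: approximate location
b_hat <- unname(coef(fit)[2])   # slope: approximate scale

# Compare with the usual estimates from the summary statistics
round(c(a_hat = a_hat, mean = mean(z), b_hat = b_hat, sd = sd(z)), 2)
```

Because the theoretical quantiles are symmetric about 0, the fitted intercept coincides with the sample mean, and the slope should land close to the sample standard deviation.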

Note: On request, R will also draw a reference line (via qqline) through two points: the lower quartiles (theoretical and sample) and the corresponding upper quartiles. Such a line may be useful in many situations, but I do not believe it is directly relevant to your question.

BruceET
  • Certainly the reference line is relevant; its slope estimates the slope of the line the points should lie along if they were drawn from a normal distribution, and its intercept estimates that line's intercept. It's effectively using k*(Q3-Q1) to estimate the standard deviation and (Q3+Q1)/2 to estimate the mean. – Glen_b Jun 12 '19 at 02:15
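To make the comment concrete, here is a small sketch of the quartile-based line for a normal QQ-plot. As far as I know this mirrors how qqline builds its default line, but treat that as an assumption; the variable names (`q_sample`, `q_theory`, `k`) are invented for illustration.

```r
set.seed(2019)
z <- rnorm(200, mean = 100, sd = 15)

probs <- c(0.25, 0.75)
q_sample <- unname(quantile(z, probs))   # Q1 and Q3 of the data
q_theory <- qnorm(probs)                 # standard normal quartiles

# Line through (q_theory[1], q_sample[1]) and (q_theory[2], q_sample[2])
slope     <- diff(q_sample) / diff(q_theory)
intercept <- q_sample[1] - slope * q_theory[1]

# k = 1 / (2 * qnorm(0.75)), roughly 0.7413 for the normal distribution
k <- 1 / diff(q_theory)
c(slope = slope, k_times_IQR = k * diff(q_sample),
  intercept = intercept, midhinge = mean(q_sample))
```

The slope equals k*(Q3-Q1) and, because the normal quartiles are symmetric about 0, the intercept equals (Q3+Q1)/2, so this line gives quartile-based estimates of b and a respectively.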