The dataset I'm using for my thesis is right-skewed. It contains lead times (in days). I tried a log10 transform in SPSS, but the data still do not meet the requirement p > 0.05 (Shapiro-Wilk), so after the log transform the dataset is still not normally distributed. Do you know of any further options that would let me say something about a confidence interval? (Or do you have any other advice?)
See if this helps: https://stats.stackexchange.com/questions/112829/how-do-i-calculate-confidence-intervals-for-a-non-normal-distribution. If p > 0.05 then you have no grounds to reject the null hypothesis of a normal distribution, i.e. your data are compatible with a normal distribution. Maybe it's a typo for p < 0.05? Also, you may get p < 0.05 and still be fairly close to normal if you have lots of data. Finally, I would prefer to apply a transformation to make the results more meaningful rather than to fit a statistical assumption. – dariober May 09 '22 at 09:01
In addition to the useful link from @dariober, you might want to read the thread: Is normality testing 'essentially useless'?. – EdM May 09 '22 at 20:26
1 Answer
Usually integer data, like days, is modeled using a Poisson (or Negative Binomial) regression model, both instances of what is called a generalized linear model (GLM).
$$Y \sim \text{Poisson}(\lambda)$$ $$\log(\lambda) = \beta_0 + \beta_1 x_1 + \dots$$
I'm not sure about SPSS but a quick search turns up the following for SPSS GLMs.
https://stats.oarc.ucla.edu/spss/library/spss-librarymanova-and-glm-2/
From here you can appeal to asymptotic distributions of $\hat{\beta}$ to get a 95% CI.
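As a rough illustration of that last step, here is a minimal sketch in Python (not SPSS) that fits a Poisson regression with one predictor by Newton-Raphson and builds a Wald 95% CI from the inverse Fisher information. The data, the binary predictor `x`, and the helper names `information` and `fit_poisson` are all made up for the example; they are not from the question.

```python
import math
from statistics import NormalDist


def information(x, mu):
    """Entries of the Fisher information for log(mu_i) = b0 + b1 * x_i."""
    i00 = sum(mu)
    i01 = sum(xi * mi for xi, mi in zip(x, mu))
    i11 = sum(xi * xi * mi for xi, mi in zip(x, mu))
    return i00, i01, i11


def fit_poisson(x, y, max_iter=100, tol=1e-12):
    """Fit a one-predictor Poisson GLM (log link) by Newton-Raphson.

    Returns (b0, b1, cov), where cov is the asymptotic covariance matrix,
    i.e. the inverse Fisher information evaluated at the estimates.
    """
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum(xi * (yi - mi) for xi, yi, mi in zip(x, y, mu))
        i00, i01, i11 = information(x, mu)
        det = i00 * i11 - i01 * i01
        # Newton step: inverse information times score (2x2 solved by hand).
        step0 = (i11 * g0 - i01 * g1) / det
        step1 = (i00 * g1 - i01 * g0) / det
        b0, b1 = b0 + step0, b1 + step1
        if abs(step0) + abs(step1) < tol:
            break
    mu = [math.exp(b0 + b1 * xi) for xi in x]
    i00, i01, i11 = information(x, mu)
    det = i00 * i11 - i01 * i01
    cov = [[i11 / det, -i01 / det], [-i01 / det, i00 / det]]
    return b0, b1, cov


# Hypothetical lead times (days) for two groups, coded x = 0 and x = 1.
x = [0, 0, 0, 0, 1, 1, 1, 1]
y = [2, 3, 1, 4, 5, 7, 6, 8]

b0, b1, cov = fit_poisson(x, y)
z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
se_b1 = math.sqrt(cov[1][1])
ci_b1 = (b1 - z * se_b1, b1 + z * se_b1)
# Exponentiating gives an interval for the ratio of mean lead times.
ci_ratio = (math.exp(ci_b1[0]), math.exp(ci_b1[1]))
print("beta_1:", b1, "95% CI:", ci_b1)
print("mean ratio:", math.exp(b1), "95% CI:", ci_ratio)
```

The interval is only as good as the Poisson assumption (variance equal to the mean). If the lead times are overdispersed, the standard errors will be too small, which is where the Negative Binomial model comes in.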
Edit: note that I might be reading too much into your question, and it's not about regression but simply about estimating a 95% confidence interval for the mean. This can be done using Poisson regression (with just an intercept), but you can also take a look at more precise measures: http://ms.mcmaster.ca/peter/s743/poissonalpha.html
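For the intercept-only case the fit has a closed form, so a sketch needs no model-fitting code: the estimate of $\lambda$ is the sample mean, and the asymptotic standard error of $\hat{\beta}_0 = \log(\bar{y})$ is $1/\sqrt{\sum_i y_i}$. The lead times below are made up for illustration, and this is only the Wald interval, not one of the more precise methods from the link.

```python
import math
from statistics import NormalDist

# Hypothetical lead times in days (illustrative only).
lead_times = [2, 3, 1, 4, 5, 7, 6, 8]

total = sum(lead_times)
mean = total / len(lead_times)   # MLE of lambda in the intercept-only model
beta0 = math.log(mean)           # the intercept on the log scale
se_beta0 = 1 / math.sqrt(total)  # asymptotic standard error of beta0

z = NormalDist().inv_cdf(0.975)  # about 1.96 for a 95% interval
# Build the interval on the log scale, then back-transform to days.
lower = math.exp(beta0 - z * se_beta0)
upper = math.exp(beta0 + z * se_beta0)
print("mean:", mean, "95% CI:", (lower, upper))
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the mean and can never go below zero, which suits right-skewed lead times.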