Is there a distribution for use with generalized linear models that captures both heavy tails and "pointiness" near the mean?

Question

If I fit a regular linear mixed model to my data with lmer, I get a pattern of residuals that, at a glance, looks to me to deviate from Gaussian in two ways. The residuals are obviously very heavy-tailed, and there are a lot of discussions on Cross Validated of how to address heavy-tailed distributions. But, for example, a commonly used distribution to address heavy tails, the Student's t distribution, doesn't capture this histogram at all. The residuals are both heavy-tailed and pointy -- there are extreme outliers and a lot of tight clustering around the mean. Is there a distribution I can use as the response distribution in a generalized linear model (or some other approach) that would capture the actual shape of these residuals?

Here's a density plot for the dependent variable:

Here's one for the residuals in a linear mixed model:

Here's the Q-Q plot for the residuals:

EDIT:

Here's a residual versus fitted plot, to show the heteroskedasticity:

ANOTHER EDIT:

Here are density plots of subsets of the residuals, split at fitted = 1.1, in an effort to show the two distributions of the residuals with roughly homogeneous variance:

The first thing that comes to mind is the double exponential distribution. It looks like there's a nimble package that might help (rdocumentation.org/packages/nimble/). — gung - Reinstate Monica, Sep 29 '22 at 15:39
I'm pretty sure you couldn't use that with the generalized linear model, though. For estimation (fitting the model), a linear model would probably work. For testing, you'd want to use something other than the p-values that come by default. E.g., bootstrapping might work. — gung - Reinstate Monica, Sep 29 '22 at 15:47
Thanks! I felt like I had seen a distribution that looked like mine before and I think that was it. It looks like maybe I could also use brms to model data with a double exponential: https://www.rdocumentation.org/packages/brms/versions/1.3.1/topics/set_prior — Katie, Sep 29 '22 at 16:21
we could try implementing the Generalized Gaussian in glmmTMB ... https://en.wikipedia.org/wiki/Generalized_normal_distribution — Ben Bolker, Sep 29 '22 at 19:18
Oops, generalized normal won't work in glmmTMB because the log-likelihood is non-differentiable with respect to mu. I wonder how brms handles it ... ??? — Ben Bolker, Sep 29 '22 at 20:10
https://en.wikipedia.org/wiki/Kurtosis#The_Pearson_type_VII_family ???? — Ben Bolker, Sep 29 '22 at 20:13
Presumably you need a generalized linear mixed model? If you just want a log-linear model with a conditional distribution like this (no random effects), then it's a whole lot easier ... Can you clarify? — Ben Bolker, Sep 29 '22 at 21:16
Yes, I have a mixed model with nested random effects. In fact, I have two random effects that are crossed and nested within a third. We were discussing my model here (https://stackoverflow.com/questions/73855415/how-to-specify-a-model-with-more-than-one-nested-random-effect-in-heavy-packag) before I realized that Student's t didn't capture the true shape of my distribution at all. — Katie, Sep 29 '22 at 21:26
@dipetkov is this helpful? https://docs.google.com/document/d/1BeM9k6jxYpHWp9ElSkQnB0J5nUxbctqsQ-JfZkeAToA/edit?usp=sharing — Katie, Sep 30 '22 at 15:32
Thanks. I admit I don't understand this... So I'm not sure whether/how you can implement this but if I were you, I would consider simplifying the model instead of making it more complex with fancy distributions, by analyzing summaries (instead of the "raw" data). In the simplest case, the summaries are averages (perhaps average over clusters? any grouping of measurements where the fixed effects of interest do not vary might be a good candidate for the analysis of summaries approach). — dipetkov, Sep 30 '22 at 15:55
Hi @dipetkov, if I can be clearer in any specific way, please let me know. I already average over a single cluster, and I have also tried averaging over all clusters (which doesn't really change the distribution of the data). It also eliminates a variable needed to compute the relevant contrasts. I would be very happy to have a simpler model with fewer fixed effects, but from experience in various experiments with lmer and rlmer, all the fixed effects are important. There are complicated interactions in this experiment (which shouldn't really surprise me theoretically). — Katie, Sep 30 '22 at 16:16
@Katie any chance you could share a sample of your data? Asking to see if a simple Lambert W x Gaussian transform + OLS could help you here. — Georg M. Goerg, Oct 26 '23 at 03:31

score 5 · Answer 1 · answered Sep 30 '22 at 08:29

You can't interpret the shape of the residuals without checking the conditional mean and variance assumptions (e.g. by residuals vs fitted); if the model for the conditional mean was wrong or the residuals were heteroskedastic, you could see a residual pattern like that even though the errors were normal.
Assuming that's all fine, there's not a GLM that will do it, but an L1 regression (least absolute deviations regression) model might work reasonably well for conditional distributions close to a Laplace (you might want to check that the logs of the bin counts decrease roughly linearly either side of the mode; it can sometimes be hard to judge directly from the histogram, but it looks reasonable).
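
The log-bin-count check mentioned above could be sketched as follows. This is only an illustration on simulated data: `res` is a stand-in for the real residuals, and the cutoff of 5 counts per bin is an arbitrary choice to keep noisy tail bins out of the picture.

```r
set.seed(42)

# Stand-in residuals (an assumption for illustration): the difference of two
# Exp(1) draws follows a Laplace(0, 1) distribution.
# In practice, replace `res` with the residuals of the fitted model.
res <- rexp(5000) - rexp(5000)

h <- hist(res, breaks = 40, plot = FALSE)
keep <- h$counts >= 5              # drop near-empty tail bins (very noisy on the log scale)
mids <- h$mids[keep]
log_counts <- log(h$counts[keep])

# Laplace: log counts fall off linearly on either side of the mode (a tent shape).
# Gaussian: log counts fall off quadratically (an inverted parabola).
plot(mids, log_counts, xlab = "residual", ylab = "log(bin count)")

# Rough numeric summary: slope of log count against distance from the mode (0 here).
tent_fit <- lm(log_counts ~ abs(mids))
coef(tent_fit)
```

For Laplace-like residuals the points should trace two roughly straight lines meeting at the mode; visible curvature suggests something closer to Gaussian, and tails that flatten out suggest something heavier than Laplace.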

For an identity link and constant variance function, L1 regression is easy to do in R with quantreg::rq (with tau at its default value of 0.5, i.e. median regression). There are other packages that could do this, but that's the one I'd look at first.
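
A minimal sketch of what that looks like, on simulated data with Laplace errors (the variable names and the true coefficients 1 and 2 are made up for illustration; note that this fits fixed effects only, with no random effects):

```r
library(quantreg)

set.seed(1)
n <- 300
x <- runif(n, 0, 10)
# Laplace(0, 1) errors, generated as the difference of two Exp(1) draws
y <- 1 + 2 * x + (rexp(n) - rexp(n))
d <- data.frame(x = x, y = y)

# tau = 0.5 is the default: median (least absolute deviations) regression
fit_lad <- rq(y ~ x, tau = 0.5, data = d)
coef(fit_lad)

# Bootstrap standard errors rather than relying on Gaussian-theory ones
summary(fit_lad, se = "boot")
```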

Glen_b

I'll edit with a residuals vs. fitted plot. The distribution is in fact somewhat heteroskedastic, particularly on the right side.
I had started looking at quantreg, and then lqmm, yesterday after something I read (I don't remember what). Can quantreg be used for mixed models? The vignette doesn't make that clear. lqmm apparently can, although I was struggling to figure out how to specify my model; but that seems a more appropriate question for Stack Overflow.

Katie

Sep 30 '22 at 12:55

Yeah, that will tend to make the marginal distribution look peakier and heavier-tailed than the conditional distribution (which is what the assumptions relate to) would look. I was concerned that your tails might even be a bit heavier than Laplace, but now, with the look of that heteroskedasticity, perhaps the conditionals will be somewhat less heavy-tailed than it. You seem to have a lot of data, so you might consider slicing it into a few more-or-less homogeneous sections and looking at the distributions of residuals within those. ... ctd

– Glen_b Oct 01 '22 at 01:44

gung - Reinstate Monica · Answer 2 · 2022-09-29T20:51:04.367

2

The first thing that comes to mind is the double exponential distribution. It looks like there's a nimble package that might help (rdocumentation.org/packages/nimble/).

I'm pretty sure you couldn't use that with the generalized linear model, though. For estimation (fitting the model), a linear model would probably work. For testing, you'd want to use something other than the p-values that come by default. For example, bootstrapping might work.
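
One way the bootstrapping idea might look for a mixed model is a cluster (case) bootstrap: resample whole groups with replacement, refit, and take percentile intervals. The sketch below uses lme4's built-in `sleepstudy` data and a single grouping factor purely for illustration; a design with crossed and nested random effects would need a more careful resampling scheme, and 200 replicates is on the low side.

```r
library(lme4)

set.seed(1)
subjects <- unique(sleepstudy$Subject)

boot_slope <- replicate(200, {
  # Resample whole subjects with replacement
  picked <- sample(subjects, replace = TRUE)
  d <- do.call(rbind, lapply(seq_along(picked), function(i) {
    di <- sleepstudy[sleepstudy$Subject == picked[i], ]
    di$Subject <- i   # relabel so a subject drawn twice counts as two clusters
    di
  }))
  d$Subject <- factor(d$Subject)
  # Refit the linear mixed model and keep the fixed-effect slope
  fixef(lmer(Reaction ~ Days + (1 | Subject), data = d))[["Days"]]
})

# Percentile bootstrap interval for the Days effect
quantile(boot_slope, c(0.025, 0.975))
```

Because only the point estimates from the Gaussian fit are reused, the interval does not lean on the normality assumption behind the default standard errors.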

edited Sep 29 '22 at 20:51

answered Sep 29 '22 at 16:43

gung - Reinstate Monica

I don't think that's going to work -- it's a double-exponential prior, not a double-exponential conditional distribution (brms relies on having a posterior distribution that is differentiable with respect to the model parameters ...) – Ben Bolker Sep 29 '22 at 20:11
Yeah, I realized this in the past few hours. Or rather, I realized the part outside of the parenthetical. – Katie Sep 29 '22 at 20:27
Good catch, @BenBolker, I've deleted that comment. – gung - Reinstate Monica Sep 29 '22 at 20:51

Sextus Empiricus · Answer 3 · 2022-09-30T14:45:24.897

You cannot use a GLM

To fit a generalized linear model (GLM), one uses an iterative algorithm that solves a weighted least squares problem at each step while updating the weights and working values of the observations (for mixed models one can likewise use an iterative algorithm to approximate the solution, for example Wolfinger & O'Connell 1993).

The conditional distribution of your errors looks like a Laplace distribution or something similar (this is clearer when you plot it on a log scale), and it cannot be used as the distribution in a GLM (which needs to be in the exponential family).
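
To spell this out (the symbols $\mu_i$ for the location and $b$ for the scale are chosen here for illustration), the Laplace density is

$$
f(y_i \mid \mu_i, b) = \frac{1}{2b}\exp\!\left(-\frac{|y_i - \mu_i|}{b}\right),
$$

so the log-likelihood of $n$ independent observations is

$$
\ell(\mu, b) = -n\log(2b) - \frac{1}{b}\sum_{i=1}^{n} |y_i - \mu_i| .
$$

An exponential-family density needs the data and the parameter to interact through a product of the form $\theta(\mu_i)\,T(y_i)$ in the exponent. The term $|y_i - \mu_i|$ cannot be factored that way when the location $\mu_i$ is unknown, which is why the usual GLM machinery does not apply. The same expression also shows that, for fixed $b$, maximizing $\ell$ over the $\mu_i$ is the same as minimizing $\sum_i |y_i - \mu_i|$.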

Alternative

However, you can still use an alternative iterative algorithm. If your error distribution is the Laplace distribution, then maximum likelihood estimation of the model amounts to minimizing the sum of absolute residuals. To minimize absolute residuals instead of squared residuals, one can use iteratively reweighted least squares.
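
A rough sketch of that reweighting scheme for an ordinary (fixed-effects) regression, in base R. The function name `lad_irls`, the weight floor `eps`, and the simulated data are assumptions made for illustration; this is not a tuned implementation.

```r
# Least absolute deviations via iteratively reweighted least squares.
# Each pass solves a weighted least squares problem with weights 1 / |residual|,
# so that sum(w * r^2) approximates sum(|r|).
lad_irls <- function(X, y, max_iter = 200, tol = 1e-8, eps = 1e-6) {
  beta <- lm.fit(X, y)$coefficients            # start from ordinary least squares
  for (i in seq_len(max_iter)) {
    r <- drop(y - X %*% beta)
    w <- 1 / pmax(abs(r), eps)                 # floor avoids division by zero
    beta_new <- lm.wfit(X, y, w)$coefficients
    converged <- max(abs(beta_new - beta)) < tol
    beta <- beta_new
    if (converged) break
  }
  beta
}

set.seed(1)
n <- 500
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + (rexp(n) - rexp(n))         # Laplace(0, 1) errors
X <- cbind(Intercept = 1, x = x)

beta_lad <- lad_irls(X, y)
beta_ols <- lm.fit(X, y)$coefficients

# The IRLS solution should have a smaller sum of absolute residuals than OLS
l1_loss <- function(b) sum(abs(y - X %*% b))
c(lad = l1_loss(beta_lad), ols = l1_loss(beta_ols))
```

Since `lmer` also accepts a `weights` argument, the same loop could in principle wrap `lmer` instead of `lm.wfit`, though whether that yields sensible estimates has not been checked here.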

Maybe this is a silly idea (one would have to back this up with some references), but I imagine that you can extend the iterative algorithm by replacing ordinary least squares regression (the function lm) with the mixed-effects regression (the function lmer) that you want to perform.

If there are no software packages that do this already, then it should not be too difficult to create a function yourself (here is an example for GLM).

In R, the package lqmm combines mixed-effects models with quantile regression.
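
A minimal sketch of an lqmm call, using the `Orthodont` data that ships with the package and a random intercept per subject. As far as I know, `lqmm` takes a single grouping factor through its `group` argument, so a design with crossed or multiply nested random effects may not map onto it directly; treat this as a starting point only.

```r
library(lqmm)

data(Orthodont)

# Median (tau = 0.5) regression with a random intercept for each Subject
fit_lqmm <- lqmm(fixed = distance ~ age,
                 random = ~ 1,
                 group = Subject,
                 tau = 0.5,
                 data = Orthodont)

# Fixed-effect estimates at the median
coef(fit_lqmm)
```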
