Is it possible to fit repeated measure data to Gaussian density curve

Question

I collected my data (insect biomass) fortnightly from the same trees (six trees) from February to August. The biomass curve looks like Gaussian density curve. I know that I can use optim() to fit the data.

However, As I sampled the same tree repeatedly, to account for the lack of independence of the observations, I want to include 'tree id' as random effect. How can I fit the data? nlme or any other regression?

Yes, lme from nlme package or lmer from lme4 package, both can do random effecets (linear models). — user2974951, Oct 18 '18 at 10:28
Thank you. However, the function fitting Gaussian density curve is not linear, and involved more than one parameters, like this: k * exp(-0.5 * ((x - m)/sd)^2). If I want to fit like this post, and consider random effect. How can I do? https://stats.stackexchange.com/questions/83022/how-to-fit-data-that-looks-like-a-gaussian — Ming-Tang Shiao, Oct 18 '18 at 11:41

Galen · Answer 1 · 2023-08-04T03:18:04.520

Between the question and comments, it looks like you might want to try a non-linear mixed effect model using tree id as a random effect.

The curve $$k \exp \left( -\frac{1}{2} \left( \frac{x - m}{\sigma} \right)^2 \right)$$ is only a "Gaussian density" when $k = \frac{1}{\sigma \sqrt{2 \pi}}$. But if you're treating $k$ as a free parameter then that will readily not be the case. Instead what you describe is a Gaussian function.

We can begin with the following

$$Y = k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma} \right)^2 \right) + \epsilon$$

where $Y$ is biomass, and $t$ is time.

We have three parameters $k$, $m$, and $\sigma$. We could convert any, or all, of these parameters to mixed effects. The general trick for a given parameter $\theta$ is to substitute it with a sum $$\theta := \underbrace{\gamma}_{\text{Fixed Effect}} + \sum_{l=1}^q \mathbb{1}_l(\text{Group l}=l) \underbrace{\psi_l}_{\text{Random Effect}}$$ where $\gamma$ is the fixed effect, $\mathbb{1}_l(\text{Group k}=l)$ is an indicator function for group $l$, and $\psi_l$ is a random effect parameter for group $l$.

Just for an example, let's make this substitution for $\sigma$. But again you could do this for any or all of them. Let's make the substitution

$$\sigma := \sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r.$$

$$Y = k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r} \right)^2 \right) + \epsilon.$$

There are different estimation procedures one could take here. Let's go Bayesian, specifying priors on our parameters. I'm going to drop $\epsilon$ and assume that $Y$ is a lognormal variable to get non-negativity, so we'll have $\tau$ instead.

$$Y \sim \text{Lognormal}(\mu, \tau^2)$$

$$\tau \sim \text{Exponential}(1)$$

$$\mu := k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r} \right)^2 \right)$$

$$k \sim \text{Exponential}(1)$$

$$m \sim \mathcal{N}(0,1)$$

$$\sigma_f \sim \text{Exponential}(1)$$

$$\sigma_r \sim \text{Exponential}(1) \forall r \in \{1, \ldots, 6 \}$$

These priors are just using what I find to be useful defaults, but you should perform prior predictive simulations to ensure that they are useful. Likewise, changing the priors above to be better match background/external knowledge is appropriate. See Gelman et al. 2020 for more information on developing Bayesian models.

I would build this in PyMC, but it looks like you would prefer something based on R. RStan is one option.

Is it possible to fit repeated measure data to Gaussian density curve

1 Answers1