1

I collected my data (insect biomass) fortnightly from the same trees (six trees) from February to August. The biomass curve looks like Gaussian density curve. I know that I can use optim() to fit the data.

However, As I sampled the same tree repeatedly, to account for the lack of independence of the observations, I want to include 'tree id' as random effect. How can I fit the data? nlme or any other regression?

Galen
  • 8,442
  • Yes, lme from nlme package or lmer from lme4 package, both can do random effecets (linear models). – user2974951 Oct 18 '18 at 10:28
  • Thank you. However, the function fitting Gaussian density curve is not linear, and involved more than one parameters, like this: k * exp(-0.5 * ((x - m)/sd)^2). If I want to fit like this post, and consider random effect. How can I do? https://stats.stackexchange.com/questions/83022/how-to-fit-data-that-looks-like-a-gaussian – Ming-Tang Shiao Oct 18 '18 at 11:41

1 Answers1

0

Between the question and comments, it looks like you might want to try a non-linear mixed effect model using tree id as a random effect.

The curve $$k \exp \left( -\frac{1}{2} \left( \frac{x - m}{\sigma} \right)^2 \right)$$ is only a "Gaussian density" when $k = \frac{1}{\sigma \sqrt{2 \pi}}$. But if you're treating $k$ as a free parameter then that will readily not be the case. Instead what you describe is a Gaussian function.

We can begin with the following

$$Y = k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma} \right)^2 \right) + \epsilon$$

where $Y$ is biomass, and $t$ is time.

We have three parameters $k$, $m$, and $\sigma$. We could convert any, or all, of these parameters to mixed effects. The general trick for a given parameter $\theta$ is to substitute it with a sum $$\theta := \underbrace{\gamma}_{\text{Fixed Effect}} + \sum_{l=1}^q \mathbb{1}_l(\text{Group l}=l) \underbrace{\psi_l}_{\text{Random Effect}}$$ where $\gamma$ is the fixed effect, $\mathbb{1}_l(\text{Group k}=l)$ is an indicator function for group $l$, and $\psi_l$ is a random effect parameter for group $l$.

Just for an example, let's make this substitution for $\sigma$. But again you could do this for any or all of them. Let's make the substitution

$$\sigma := \sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r.$$

$$Y = k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r} \right)^2 \right) + \epsilon.$$

There are different estimation procedures one could take here. Let's go Bayesian, specifying priors on our parameters. I'm going to drop $\epsilon$ and assume that $Y$ is a lognormal variable to get non-negativity, so we'll have $\tau$ instead.

$$Y \sim \text{Lognormal}(\mu, \tau^2)$$

$$\tau \sim \text{Exponential}(1)$$

$$\mu := k \exp \left( -\frac{1}{2} \left( \frac{t-m}{\sigma_f + \sum_{r=1}^6 \mathbb{1}_{r}(\text{tree id}=r) \sigma_r} \right)^2 \right)$$

$$k \sim \text{Exponential}(1)$$

$$m \sim \mathcal{N}(0,1)$$

$$\sigma_f \sim \text{Exponential}(1)$$

$$\sigma_r \sim \text{Exponential}(1) \forall r \in \{1, \ldots, 6 \}$$

These priors are just using what I find to be useful defaults, but you should perform prior predictive simulations to ensure that they are useful. Likewise, changing the priors above to be better match background/external knowledge is appropriate. See Gelman et al. 2020 for more information on developing Bayesian models.

I would build this in PyMC, but it looks like you would prefer something based on R. RStan is one option.

Galen
  • 8,442