I'm interested in how growth of a fish species is related to several environmental variables. Those variables are measured at the time of capture, but several fish can be captured in a single sample (tow), so all fish from the same tow share identical environmental values. I'd like to nest fish within towid to account for this, but I'm having trouble figuring out the syntax for doing so with mgcv in R.

My current best guess is the following formulation, where mar (marginal growth) is the response variable; sal, temp, and secchi are the three environmental variables of interest; and towid is the ID of the sampling event during which potentially multiple fish were captured.

model = gam(mar ~ s(sal, by = towid) + s(temp, by = towid) + s(secchi, by = towid), 
            data = df)

However, I'm unsure if every variable needs to have that by = towid argument or if I just need to add in a fishid variable that has it, for example:

model = gam(mar ~ s(sal) + s(temp) + s(secchi) + s(fishid, by = towid),
            data = df)

Thanks for your help.

C. Denney
  • If you are interested in fish growth, doesn't that mean that you have growth measurements for each fish at several time points? I don't see this reflected in your current model. – Isabella Ghement Oct 08 '19 at 01:18
  • @IsabellaGhement We do, but for the current analysis we are only using the recent growth (marginal growth: mar in the above example), which is growth in the few days before capture, so we only have a single measure of growth. That is why we are using environmental variables at the point of capture: we assume that the point-of-capture environment is representative of recent environmental conditions. – C. Denney Oct 08 '19 at 01:28
  • So you have multiple fish nested within a tow; for each of those fish you have a single growth measurement and for each tow you have a single measurement of sal, temp and secchi? – Isabella Ghement Oct 08 '19 at 01:39
  • Yes, exactly, so I think all my metrics need that by = towid argument – C. Denney Oct 08 '19 at 01:50
  • Your random grouping factor is tow. Because you measured your environmental variables once per tow, each of these variables is a between-tow variable. As such, you can't have varying effects across tows associated with any of these variables - ruling out your first model. Only within-tow variables would have varying effects across tows. For example, if you had measured the age of each fish within a tow, you could have allowed the relationship between age and marginal growth within a tow to vary randomly across tows. – Isabella Ghement Oct 08 '19 at 01:58
  • When you use the by = towid expression in a smooth of age, you would allow the relationship between age and marginal growth to vary across tows. But with an environmental variable (e.g., temp), you only have one value of that variable per tow, so you can't even estimate the relationship between it and marginal growth within a tow - let alone allow the relationship to vary across tows! – Isabella Ghement Oct 08 '19 at 02:06
  • To sum up, your environmental variables are tow-level variables (rather than fish-level variables), hence they can't have nonlinear effects on marginal fish growth which vary from tow to tow. In other words, your model cannot include terms like s(temp, by = towid). Only fish-level variables - such as age - could be included in your model using syntax like s(age, by = towid). – Isabella Ghement Oct 08 '19 at 02:09
  • So a model that might work for you would look something like: gam(mar ~ s(towid, bs = "re") + s(sal) + s(temp) + s(secchi), data = df). The model allows for random intercepts for each tow, thereby capturing the correlation among marginal growth measurements of fish within the same tow. – Isabella Ghement Oct 08 '19 at 02:11
  • We do in fact have fish age and will probably try including it in the model, so in that case our model might be something like

    gam(mar ~ s(age, by = towid) + s(towid, bs = 're') + s(temp) + s(sal) + s(secchi), data = df)

    Is that correct?

    – C. Denney Oct 08 '19 at 02:15
  • Yes, if you have fish age, then its effect on marginal growth can vary across tows so a model like the one you suggested would make sense. See https://peerj.com/articles/6876/ for more insights into how you can formulate such a model. – Isabella Ghement Oct 08 '19 at 02:17
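
For anyone who wants to try the random-intercept model suggested in the comments, here is a minimal sketch on simulated data. Everything about the data is an assumption made for illustration (40 tows, 5 fish per tow, the effect shapes, the helper column tow_eff); only the variable names mar, sal, temp, secchi, age and towid come from the question. The choice of method = "REML" is also mine, not something stated in the thread. Note that towid has to be a factor for s(towid, bs = "re") to be treated as a random effect.

```r
library(mgcv)

set.seed(1)

# Simulated stand-in for df: 40 tows, 5 fish per tow (all values are made up)
n_tow  <- 40
n_fish <- 5
tow <- data.frame(
  towid   = factor(seq_len(n_tow)),  # must be a factor for bs = "re"
  sal     = runif(n_tow, 0, 30),
  temp    = runif(n_tow, 10, 25),
  secchi  = runif(n_tow, 0.2, 3),
  tow_eff = rnorm(n_tow, sd = 0.3)   # hypothetical tow-level random intercept
)

# Repeat each tow row once per fish, so fish in a tow share sal/temp/secchi
df <- tow[rep(seq_len(n_tow), each = n_fish), ]
df$age <- runif(nrow(df), 30, 120)   # fish-level covariate (not used in the fit below)
df$mar <- 1 + sin(df$temp / 4) + 0.02 * df$sal + df$tow_eff +
  rnorm(nrow(df), sd = 0.2)

# Check that each environmental variable takes a single value within a tow,
# which is why terms like s(temp, by = towid) cannot be estimated
n_temp_per_tow <- tapply(df$temp, df$towid, function(x) length(unique(x)))
all(n_temp_per_tow == 1)

# Shared smooths for the tow-level covariates plus a random intercept per tow
m_re <- gam(mar ~ s(sal) + s(temp) + s(secchi) + s(towid, bs = "re"),
            data = df, method = "REML")
summary(m_re)
```

The same tapply() check can be run on the real data to confirm which covariates are tow-level before deciding whether a by = towid term makes sense for them.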
