I'm new to statistics and modelling and am currently struggling to understand how to answer the following example question.
I'm trying to find out whether the abundance of snails on a particular food source results from a preference for that food source or merely from its prevalence. More simply: do snails display a significant preference for one food source over another?
Is the number of snails found on lettuce vs carrots vs broccoli significantly different once the relative abundance of each food is taken into account? Do I find more snails on lettuce because there is more lettuce, or because snails prefer it?
Given the (example) data:
| Site | Snails | Food | Food_Percentage_Abundance |
|---|---|---|---|
| A | 4 | lettuce | 10% |
| A | 48 | carrot | 80% |
| A | 12 | broccoli | 4% |
| B | 34 | lettuce | 10% |
| C | 5 | lettuce | 13% |
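For reproducibility, the example rows above can be written as an R data frame. This is only a sketch: the object name `snails` is my own choice, and storing the percentage as a plain number (10 rather than "10%") is an assumption about how the column should be encoded.

```r
# Example rows from the table above; `snails` is a hypothetical object name.
snails <- data.frame(
  Site   = c("A", "A", "A", "B", "C"),
  Snails = c(4, 48, 12, 34, 5),
  Food   = c("lettuce", "carrot", "broccoli", "lettuce", "lettuce"),
  Food_Percentage_Abundance = c("10%", "80%", "4%", "10%", "13%"),
  stringsAsFactors = FALSE
)

# Strip the percent sign so abundance is numeric (assumed encoding: 10 means 10%).
snails$Food_Percentage_Abundance <- as.numeric(
  sub("%", "", snails$Food_Percentage_Abundance, fixed = TRUE)
)

# Treat site and food type as categorical predictors.
snails$Site <- factor(snails$Site)
snails$Food <- factor(snails$Food)

str(snails)
```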
What would be the best way to model this in R?
How would I then test the interaction between abundance and Site as well (to demonstrate / test that it is a general pattern across sites)?
If I fit this formula as a gam model, e.g.
`gam(SNAILS ~ 0 + FOOD + FOOD:FOOD_PERCENT_ABUNDANCE)`
I get the following summary:
| Term | Estimate | Std. Error | t value | Pr(>\|t\|) | Signif. |
|---|---|---|---|---|---|
| FOODother | 1.64706 | 0.59263 | 2.779 | 0.049858 | \* |
| FOODbroccoli | 15.60870 | 2.88580 | 5.409 | 0.005659 | \*\* |
| FOODcarrots | -1.13942 | 1.13841 | -1.001 | 0.373520 | |
| FOODlettuice | 23.96296 | 2.54035 | 9.433 | 0.000704 | \*\*\* |
| FOODother:FOOD_PERCENT_ABUNDANCE | -0.17647 | 0.54867 | -0.322 | 0.763836 | |
| FOODbroccoli:FOOD_PERCENT_ABUNDANCE | -0.28986 | 0.09464 | -3.063 | 0.037564 | \* |
| FOODcarrots:FOOD_PERCENT_ABUNDANCE | 0.24038 | 0.03140 | 7.656 | 0.001564 | \*\* |
| FOODlettuice:FOOD_PERCENT_ABUNDANCE | 0.74074 | 0.12093 | 6.125 | 0.003599 | \*\* |
To confirm I'm reading this correctly...
For any given site (as SITE isn't included in the formula), lettuce is the food type that has the most significant impact on the number of snails observed; however, when accounting for percent abundance, carrots have a greater effect (due to the lower p-value)?
Finally, if I wanted to model the impact of SITE as well, would I then use the formula:
`SNAILS ~ 0 + SITE:FOOD:FOOD_PERCENT_ABUNDANCE`
to model the interaction between all three variables, i.e. to see whether one food source has a greater impact at Site A vs Site B?
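To see which terms a formula like this actually contains, base R's `terms()` can list them without fitting anything. The sketch below compares the formula above with a `*` version; the `*` formula is only my assumption for comparison and is not part of the model I fitted.

```r
# Formula from the question: only the three-way interaction term, no intercept.
f_three_way <- SNAILS ~ 0 + SITE:FOOD:FOOD_PERCENT_ABUNDANCE

# Hypothetical alternative for comparison: `*` expands to all main effects
# plus all two-way and three-way interactions.
f_full <- SNAILS ~ SITE * FOOD * FOOD_PERCENT_ABUNDANCE

labels_three_way <- attr(terms(f_three_way), "term.labels")
labels_full      <- attr(terms(f_full), "term.labels")

print(labels_three_way)  # a single term
print(labels_full)       # seven terms, main effects first
```

The `:` operator adds only the interaction term itself, whereas `*` also brings in the lower-order terms.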
Use `Anova` from the `car` package to check if the interaction variable is significant (for that, you should use a type III ANOVA, I would say). In addition, if what you want down the line is to understand how snail abundance varies with the type of food, you should use `emmeans` from the `emmeans` package. If you also want to check how food availability influences snail abundance per type of food, `emtrends` from the same package might be helpful. – André Barros Jan 15 '23 at 20:37
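A rough sketch of how the three functions mentioned in the comment might be called. Everything about the data here is an assumption: `sim` is simulated purely for illustration, the model uses `lm` rather than `gam` (a `gam` with no smooth terms and the default Gaussian family amounts to a linear model), and sum-to-zero contrasts are set because type III tests are usually paired with them. The `car` and `emmeans` calls only run if those packages are installed.

```r
# Simulated data purely for illustration (not real observations).
set.seed(1)
sim <- expand.grid(
  SITE = c("A", "B", "C"),
  FOOD = c("lettuce", "carrot", "broccoli"),
  REP  = 1:4,
  stringsAsFactors = TRUE
)
sim$FOOD_PERCENT_ABUNDANCE <- runif(nrow(sim), min = 5, max = 80)
sim$SNAILS <- rpois(nrow(sim), lambda = 5 + 0.3 * sim$FOOD_PERCENT_ABUNDANCE)

# Linear model with a food-by-abundance interaction and a site main effect.
fit <- lm(
  SNAILS ~ FOOD * FOOD_PERCENT_ABUNDANCE + SITE,
  data = sim,
  contrasts = list(FOOD = "contr.sum", SITE = "contr.sum")
)

# Type III tests, including the FOOD:FOOD_PERCENT_ABUNDANCE interaction.
if (requireNamespace("car", quietly = TRUE)) {
  print(car::Anova(fit, type = 3))
}

if (requireNamespace("emmeans", quietly = TRUE)) {
  # Estimated marginal mean snail count per food type.
  print(emmeans::emmeans(fit, ~ FOOD))
  # Slope of snail count against abundance, separately for each food type.
  print(emmeans::emtrends(fit, ~ FOOD, var = "FOOD_PERCENT_ABUNDANCE"))
}
```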