I have a dataset with measurements of an animal color pattern from a body part, measured under multiple conditions. In brief, each animal was treated on one side of the body but not on the other, so I assume that I have paired treatment and control data. Animals were reared at either high or low temperature. The pattern values were measured on both sides, and the body part values were also measured on both sides as a covariate. I'm interested in the effects of temperature, treatment, and their interaction on pattern values.

Here's an example of the structure of my dataset.

    pattern     body    treatment  temperature  ID
    54.19922    785456  Y          low          L1
    142.38281   754103  Y          low          L2
    465.91797   810738  N          low          L1
    531.25000   754103  N          low          L2
    2217.67578  624083  Y          high         H1
    481.83594   396332  Y          high         H2
    3883.10547  636011  N          high         H1
    777.73438   377092  N          high         H2

Each ID indicates one unique animal, so each ID has both paired treated and untreated measurements.
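A quick way to check this structure is sketched below in base R. It assumes the eight example rows above are typed by hand into a data frame named `df` (the name used in the model calls further down); the helper names `n_treatment` and `n_temperature` are made up for illustration.

```r
# The eight example rows from the table above, typed in by hand.
df <- data.frame(
  pattern     = c(54.19922, 142.38281, 465.91797, 531.25000,
                  2217.67578, 481.83594, 3883.10547, 777.73438),
  body        = c(785456, 754103, 810738, 754103,
                  624083, 396332, 636011, 377092),
  treatment   = c("Y", "Y", "N", "N", "Y", "Y", "N", "N"),
  temperature = c("low", "low", "low", "low", "high", "high", "high", "high"),
  ID          = c("L1", "L2", "L1", "L2", "H1", "H2", "H1", "H2")
)

# Number of distinct levels of each factor observed within each animal.
n_treatment   <- tapply(df$treatment,   df$ID, function(x) length(unique(x)))
n_temperature <- tapply(df$temperature, df$ID, function(x) length(unique(x)))

print(n_treatment)    # 2 for every ID: treatment varies within animals
print(n_temperature)  # 1 for every ID: temperature varies between animals
```

In these rows every ID appears under both treatment levels but only one temperature level.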

This is my model using the anova_test function from the rstatix package.

    res = anova_test(data = df, dv = pattern, wid = ID, between = c(temperature, treatment), covariate = body, type = 3)

However, I got a warning message:

Warning: The 'wid' column contains duplicate ids across between-subjects variables. Automatic unique id will be created

Is there a problem with the IDs assigned to the samples? Is it OK to proceed? Also, is this model equivalent to the one below?

    aov(pattern ~ body + temperature * treatment + Error(ID/(treatment)), df)

  • Welcome to Cross Validated! I don't believe that the anova_test() function is part of the basic R distribution that people reading this question might be familiar with. Please edit the question to indicate the package from which it was obtained so that others can evaluate what's going on. On this site, please provide additional information like that by editing the question itself; comments are easy to overlook and can even be deleted. – EdM Jul 02 '23 at 19:45
  • Hi, thx! I’ve edited the question – shenTTT Jul 03 '23 at 04:56

1 Answer

What you have seems to be a within-individual treatment predictor and a between-individual temperature predictor, with the body predictor as a continuous covariate and pattern as a continuous outcome. If that's the case then your aov() model seems OK, as it specifies that only the within-individual treatment is nested within ID in the Error() term. See this page, for example. Your aov() model doesn't allow for interactions of those predictors with the continuous body predictor, so I assume that you don't want to evaluate such potential interactions.
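As a minimal sketch of that specification, the block below fits the `aov()` model to the eight example rows from the question, typed by hand into a data frame named `df`. Converting `ID`, `treatment`, and `temperature` to factors is an assumption here, since the question doesn't show the column types. With so few rows the output only illustrates how the terms are split across error strata; it is not a basis for inference.

```r
# The eight example rows from the question, typed in by hand.
df <- data.frame(
  pattern     = c(54.19922, 142.38281, 465.91797, 531.25000,
                  2217.67578, 481.83594, 3883.10547, 777.73438),
  body        = c(785456, 754103, 810738, 754103,
                  624083, 396332, 636011, 377092),
  treatment   = c("Y", "Y", "N", "N", "Y", "Y", "N", "N"),
  temperature = c("low", "low", "low", "low", "high", "high", "high", "high"),
  ID          = c("L1", "L2", "L1", "L2", "H1", "H2", "H1", "H2")
)

# Assumption: grouping variables are stored as factors, so that
# Error() treats ID as a grouping factor and not as a number or string.
df$ID          <- factor(df$ID)
df$treatment   <- factor(df$treatment)
df$temperature <- factor(df$temperature)

# treatment is nested within ID in the Error() term (within-animal factor);
# temperature is left out of Error() because it varies between animals.
fit <- aov(pattern ~ body + temperature * treatment + Error(ID / treatment),
           data = df)

# One ANOVA table per error stratum.
print(summary(fit))
```

In a balanced design like this one, the temperature effect should be tested in the between-animal (`ID`) stratum, while treatment and the temperature-by-treatment interaction should be tested in the within-animal stratum.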

I can't say whether your specification of the model to the anova_test() function in the rstatix package is equivalent. For one, highly software-specific questions like that are technically off-topic on this site, given that you seem to have correctly identified the statistical structure of the analysis you seek to perform. For another, I'm not familiar with that package. I have nearly 500 R packages installed on this computer, but rstatix isn't among them.

Furthermore, I think that you should be wary of using packages like that, purporting to make things "simple and intuitive" without necessarily having had a lot of vetting. There is a single author for the package in question, whose idea of "simple and intuitive" might not correspond with others' ideas of "simple and intuitive" in terms of translating the anova_test() function arguments into the underlying aov() structure. The author of that particular package also maintains the survminer package, which for some time had a serious but simple unfixed coding error that made some of its output unusable.

You have to be very careful when using open-source R statistical software, as quality control can vary greatly among packages. Many packages have had decades of use and contributions from multiple authors, enhancing the ability to discover errors. Even then, some errors in unusual use cases might remain unfixed until discovered by an astute user. If a package mainly exists to make things "simple and intuitive" instead of providing otherwise unavailable computational tools, I tend to avoid it and concentrate on the better-vetted and more established packages that are less likely to lead you unexpectedly astray.

EdM