
I am currently arguing with someone about how to correctly treat data with multiple observations per subject. More specifically, data were gathered from 100 participants 8 times per day for 5 days (resulting in 40 observations per participant for each variable of interest).

So now we have come up with two ways of analyzing the data:

1) Aggregate the data per variable and analyze it further with the PROCESS macro by Hayes. (Here is my problem, however: wouldn't observations from the same subject fail to be completely independent?)

2) Aggregate the data to a mean for each variable PER subject and then analyze it further.

I am firmly in favor of the second option; however, I am not 100% sure it is a valid one. Every opinion will be appreciated.

1 Answer


I suppose it depends on what you want to do with the data.

I know that in insurance we used grouped data. That is, we would group the data by each combination of explanatory variables, then use the number of claims in each group as the response for the frequency model (Poisson GLM), or the average cost of claims in each group as the response for the cost model (Gamma GLM). Thus we aggregated the data over each group, not per policy, so the same policy might contribute more than one claim to a group.
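To make the grouping step concrete, here is a minimal sketch in plain Python. The records, the field names (`age_band`, `region`) and the helper `group_claims` are all made up for illustration; only the aggregation is shown, not the GLM fitting itself.

```python
from collections import defaultdict

# Hypothetical claim records: (policy_id, age_band, region, claim_cost).
# Policy P1 has two claims, and both land in the same group.
claims = [
    ("P1", "18-25", "north", 1200.0),
    ("P1", "18-25", "north", 300.0),
    ("P2", "18-25", "north", 900.0),
    ("P3", "26-40", "south", 500.0),
]


def group_claims(records):
    """Aggregate claims per combination of explanatory variables, not per policy."""
    costs = defaultdict(list)
    for _policy_id, age_band, region, cost in records:
        costs[(age_band, region)].append(cost)
    return {
        key: {"n_claims": len(values), "mean_cost": sum(values) / len(values)}
        for key, values in costs.items()
    }


grouped = group_claims(claims)
for key, summary in grouped.items():
    print(key, summary)
```

In this sketch `n_claims` would play the role of the response in the frequency model and `mean_cost` the response in the cost model.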

From your question it sounds like you are dealing with longitudinal data. I have not dealt much with such designs, but these topics might be of some interest to you: panel data and repeated measures.
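For comparison, the per-subject averaging described in the question could look roughly like the sketch below. The record layout, the variable names `x` and `y`, and the helper `subject_means` are assumptions for illustration. Note that this reduces each subject to a single row, so any variation within a subject across days and measurement occasions is discarded.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-format records: (subject_id, day, occasion, x, y).
# The real data would have 100 subjects with 40 rows each.
records = [
    (1, 1, 1, 2.0, 3.0),
    (1, 1, 2, 4.0, 5.0),
    (2, 1, 1, 10.0, 1.0),
    (2, 1, 2, 12.0, 3.0),
]


def subject_means(rows):
    """Collapse repeated observations to one mean per variable per subject."""
    by_subject = defaultdict(lambda: {"x": [], "y": []})
    for subject_id, _day, _occasion, x, y in rows:
        by_subject[subject_id]["x"].append(x)
        by_subject[subject_id]["y"].append(y)
    return {
        subject_id: {"x": mean(values["x"]), "y": mean(values["y"])}
        for subject_id, values in by_subject.items()
    }


per_subject = subject_means(records)
for subject_id, means in per_subject.items():
    print(subject_id, means)
```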

StatGrrl