Examining group mean differences in time series using regression and HAC-robust standard errors

Question

I’ve got a clinical time series dataset containing (hourly) heart frequencies from deceased and survived patients. I need to investigate whether there is a significant difference of means between those two groups (deceased vs. survived). By assessing the ACF the time series clearly turns out to be auto correlated. Also, normal distribution assumption needs to be rejected. So, plain vanilla t-testing is not possible due to the violation of the main conditions.

According to some articles about HAC-robust testing (Newey-West (1987) t-stats and How to efficiently do rank-sum tests on autocorrelated time-series? ) I came along with following idea:

Estimate a linear regression model globally where the variable ‘heart frequency’ depends on the two outcome groups (deceased & survived).

# Fit lm for heart frequency (HF) globally for Outcome groups (survived & deceased)
HF_lm <- lm(data = HF_data, HF~Outcome)

Adjusting the model by using HAC-robust standard errors (Newey-West / Andrews)

library(sandwich)
library(lmtest)
Testing the coefficients with coeftest using HAC standard errors
coeftest(HF_lm, vcov. = vcovHAC(HF_lm))

Assessing the slope coefficient for significance

> coeftest(HF_lm, vcov. = vcovHAC(HF_lm))
t test of coefficients:
            Estimate Std. Error t value Pr(&gt;|t|)    

(Intercept)      86.8819     2.4534 35.4132   <2e-16 ***
Outcomesurvived  -3.8165     2.8435 -1.3422   0.1796

Signif. codes:  0 '*' 0.001 '' 0.01 '*' 0.05 '.' 0.1 ' ' 1

From my perspective, the Outcomesurvived -3.8165 represents the difference of means of both time series. For comparison:

> as.data.frame(HF_data %>% group_by(Outcome) %>% summarise(Group_mean = round(mean(HF),4)))
   Outcome Group_mean
1 deceased    86.8819
2 survived    83.0654

The p-Value (in this case 0.1796) then tells me about the significance of the result. (compare also R: Anova and Linear Regression)

My question is whether this is a fair and valid approach for comparing time series means or am I missing something? Would you suggest a different approach?

score 1 · Answer 1 · answered Feb 06 '24 at 16:00

A survival analysis approach would be more interesting for these data. You could try a descriptive Kaplan-Meier with a log-rank test to compare the survival of both groups. Then you could try a competing risks model or a Cox proportional hazards model. The type of analysis will depend on your context and research question.

One of the important aspects for this type of approach is to consider whether censoring exists and whether such censoring is informative or not.

I recommend the book: Collett, D. (2023). Modeling Survival Data in Medical Research. CRC Press.

Examining group mean differences in time series using regression and HAC-robust standard errors

Testing the coefficients with coeftest using HAC standard errors

1 Answers1