I have a longitudinal dataset with several binary variables, along with an id variable and a time variable. In my research project, these binary measures are behaviors observed during a videotaped task (0 = absent, 1 = present). I am interested in examining how the behaviors correlate over time and which variables (if any) cluster or vary together across time.

My first instinct is to conduct factor analysis (FA) since, conceptually, these observed behaviors are similar to items on a questionnaire and I want to see what kind of "subscales" appear in my data. I have found resources/posts for FA or PCA on longitudinal data and binary data, but none on longitudinal binary data.

Is FA or PCA appropriate, or even possible, for longitudinal binary data? Based on one of the prior posts, it seems that PCA would be more appropriate than FA, but I am not sure whether the technique still holds for longitudinal binary data. If neither FA nor PCA is appropriate for these data, I am open to other methods. For now, I want to run exploratory/descriptive analyses before moving on to model building, which will likely take the form of vector autoregression or generalized estimating equations.

The R code below creates an example dataset that resembles my actual dataset. I would prefer to conduct the analyses in R but am open to other programs.

library(tidyverse)
set.seed(123)

x_1 <- rbinom(n=100, size=1, prob=0.20)
x_2 <- rbinom(n=100, size=1, prob=0.25)
x_3 <- rbinom(n=100, size=1, prob=0.15)
y_4 <- rbinom(n=100, size=1, prob=0.80)
y_5 <- rbinom(n=100, size=1, prob=0.75)
y_6 <- rbinom(n=100, size=1, prob=0.75)

df <- data.frame(x_1,x_2,x_3,y_4,y_5,y_6) %>%
  mutate(time = c(rep(1:20), rep(1:25), rep(1:15), rep(1:15), rep(1:25))) %>%
  mutate(id = c(rep("001", 20), rep("002", 25), rep("003", 15), rep("004", 15), rep("005", 25))) %>%
  select(id, time, everything()) %>%
  mutate(across(where(is.integer), as.numeric))
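To make the kind of descriptive summary I have in mind concrete, here is a minimal base-R sketch, not a recommended method. It rebuilds a simulated stand-in with the same layout, then computes (a) pooled phi coefficients, which are just Pearson correlations of the 0/1 columns, and (b) correlations after centering each variable at its person mean, so that stable between-person differences do not drive the result. The names `n_obs`, `vars`, `phi_pooled`, `centered`, and `within_cor` are illustrative, and both summaries ignore the time ordering within a person.

```r
# Simulated stand-in with the same layout as the example data (base R only).
set.seed(123)
n_obs <- c(20, 25, 15, 15, 25)            # observations per person
df <- data.frame(
  id   = rep(sprintf("%03d", 1:5), times = n_obs),
  time = sequence(n_obs),                 # 1..20, 1..25, 1..15, ...
  x_1  = rbinom(100, 1, 0.20),
  x_2  = rbinom(100, 1, 0.25),
  x_3  = rbinom(100, 1, 0.15),
  y_4  = rbinom(100, 1, 0.80),
  y_5  = rbinom(100, 1, 0.75),
  y_6  = rbinom(100, 1, 0.75)
)
vars <- setdiff(names(df), c("id", "time"))

# (a) Pooled phi coefficients: Pearson correlation of binary variables.
#     This treats all 100 rows as independent, which they are not.
phi_pooled <- cor(df[vars])

# (b) Within-person correlations: subtract each person's mean from each
#     variable, then correlate the centered values.
centered   <- sapply(df[vars], function(v) v - ave(v, df$id))
within_cor <- cor(centered)

round(phi_pooled, 2)
round(within_cor, 2)
```

Because the stand-in columns are simulated independently, both matrices should show only small off-diagonal values here; on real data, a large gap between the two would suggest that between-person differences account for much of the pooled association.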

  • I think this might be helpful: https://stats.stackexchange.com/questions/18617/can-i-do-a-pca-on-repeated-measures-for-data-reduction – Spätzle Nov 15 '22 at 10:23
  • Maybe also look into multiple correspondence analysis – kjetil b halvorsen Nov 15 '22 at 15:45
  • On doing FA or PCA on binary data https://stats.stackexchange.com/q/16331/3277. Please be sensitive to the differences between the two. – ttnphns Nov 16 '22 at 23:15
  • @kjetil b halvorsen, MCA looks very appropriate for binary data, do you have any resources about applying it to a longitudinal dataframe? – jrcalabrese Nov 19 '22 at 17:22
  • @Spätzle, it looks like MFA is only applicable to continuous variables, am I mistaken? – jrcalabrese Nov 19 '22 at 17:30
  • @jcalabrese: Here is a paper: https://www.researchgate.net/profile/Peter-Gm-Heijden/publication/301327542_Correspondence_Analysis_of_Longitudinal_Data/links/57120a2e08aeff315ba0ac9b/Correspondence-Analysis-of-Longitudinal-Data.pdf – kjetil b halvorsen Nov 19 '22 at 18:56
