Analyzing (hierarchical?) clustered pair-matched binary data

Question

We teach supplementary lessons in nearly two dozen local schools, and have two data sets of approximately four hundred records each from pre-post tests given at these schools. Each record contains pre and post values (correct, incorrect) for questions on 12 topics as well as whether or not there was an intervention (lesson taught) relating to that topic. Our goal is of course to assess the impact of the lessons taught.

The pre- and post-test responses are pair-matched by student and clustered by school. Clustering by school reflects the fact that the lessons taught vary by school, along with the other factors generally accepted as valid for clustering at this level, i.e., similar socioeconomic background, school culture, common instructors, etc.

Without clustering, McNemar is the accepted test for analysis of this data, and several authors have explored various modifications to McNemar to allow for clustering including:

Methods for the Analysis of Pair-Matched Binary Data from School-Based Intervention Studies. Vaughan & Begg. 1999. doi: 10.3102/10769986024004367
Analysis of clustered matched-pair data. Durkalski et al. 2003. doi: 10.1002/sim.1438
Methods for the Statistical Analysis of Binary Data in Split-Cluster Designs. Donner, Klar, Zou. 2004. doi: 10.1111/j.0006-341X.2004.00247.x
Adjustment to the McNemar’s Test for the Analysis of Clustered Matched-Pair Data. McCarthy. 2007. http://biostats.bepress.com/cobra/ps/art29/ (Free to download)
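For reference, the unclustered baseline these papers start from can be sketched as follows. This is a minimal illustration of the classical McNemar chi-square (no continuity correction); the function name and the example counts are hypothetical, not taken from our data.

```python
import math

def mcnemar_test(b, c):
    """Classical McNemar test for one 2x2 table of matched pairs.

    b: pairs that went incorrect (pre) -> correct (post)
    c: pairs that went correct (pre) -> incorrect (post)
    Concordant pairs carry no information and are not needed.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs, statistic is undefined")
    stat = (b - c) ** 2 / (b + c)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical counts, for illustration only.
stat, p_value = mcnemar_test(b=30, c=12)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

This treats every pair as independent, which is exactly the assumption that clustering by school violates.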

I have since experimented with the Durkalski method as documented in McCarthy, since it appears to be regarded as fairly robust and is the simplest for me to understand and code. However, none of the documented methods fit our case exactly: they use the matched pairs for either pre-post or control-treatment only, and treat the clusters as a single class. We actually have matched pre-post pairs in multiple control and treatment clusters, but these methods do not use that additional level of information, and discarding the ~50% of our data points that come from the control groups seems sub-optimal. Is anyone aware of a technique designed to analyze this data configuration, or of someone who might be interested in exploring this area?
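To make the current approach concrete, here is a minimal sketch of the Durkalski statistic as I understand it from McCarthy's summary: each cluster contributes (b_k - c_k) / n_k, and the squared sum of those terms is divided by the sum of their squares, then compared to a chi-square with 1 degree of freedom. The function name and the per-school counts are hypothetical, and the formula should be checked against the paper before being relied on.

```python
import math

def durkalski_statistic(clusters):
    """Clustered McNemar-type statistic (sketch, see assumptions above).

    clusters: iterable of (n_k, b_k, c_k) tuples, one per school, where
      n_k = number of matched pairs in the cluster,
      b_k = pairs going incorrect -> correct,
      c_k = pairs going correct -> incorrect.
    """
    terms = [(b - c) / n for n, b, c in clusters]
    denominator = sum(t * t for t in terms)
    if denominator == 0:
        raise ValueError("no net discordance in any cluster")
    stat = sum(terms) ** 2 / denominator
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value

# Hypothetical per-school counts, for illustration only.
schools = [(20, 6, 2), (25, 8, 3), (18, 4, 4), (22, 7, 1)]
stat, p_value = durkalski_statistic(schools)
print(f"chi-square = {stat:.3f}, p = {p_value:.4f}")
```

Note that the input is one pre-post table per cluster, so there is no place in it to say which clusters received the intervention; that is the limitation described above.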

Thanks in advance!

In case it helps anyone, an R implementation of Durkalski, as well as a couple of other related tests, has been uploaded to CRAN in clust.bin.pair – user12341234, Feb 10 '17 at 02:02

score 1 · Accepted Answer · answered Jul 22 '11 at 18:01

I would use conditional logistic regression, which incorporates the matched-pair design into a regression model along with the covariates you want to test.
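As a rough illustration of why this uses the control pairs too, consider the simplest case: each pre-post pair is its own stratum and the only covariate is whether the topic was taught. Under that assumption only discordant pairs contribute to the conditional likelihood, the estimated post-vs-pre odds ratio within each group is b/c, and the intervention effect is the ratio of those two odds ratios. The sketch below computes that ratio with a Wald interval; the function name and counts are hypothetical, and it ignores clustering by school, which a fitted model would still have to address.

```python
import math

def interaction_odds_ratio(b_treated, c_treated, b_control, c_control):
    """Ratio of pre->post improvement odds, treated vs. control pairs.

    b_*: discordant pairs going incorrect -> correct
    c_*: discordant pairs going correct -> incorrect
    Returns (odds_ratio, (ci_low, ci_high)) with a 95% Wald interval.
    """
    counts = (b_treated, c_treated, b_control, c_control)
    if min(counts) <= 0:
        raise ValueError("all four discordant counts must be positive")
    log_or = math.log(b_treated / c_treated) - math.log(b_control / c_control)
    # Standard error of a difference of two log(b/c) estimates.
    se = math.sqrt(sum(1.0 / x for x in counts))
    z = 1.959963984540054  # 97.5th percentile of the standard normal
    interval = (math.exp(log_or - z * se), math.exp(log_or + z * se))
    return math.exp(log_or), interval

# Hypothetical counts, for illustration only.
odds_ratio, (low, high) = interaction_odds_ratio(40, 10, 24, 12)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({low:.2f}, {high:.2f})")
```

With more covariates, or with strata larger than a single pair, there is no closed form and the model has to be fitted numerically.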

CDX

Based on various lecture notes I've been able to find, it looks like this should work, if I can figure out how to implement it with a FL/OSS package. My first crack at doing it in R seems to have caused it to eat max CPU: clogit(intervene~pre+post+strata(school),data=tableLoadedInDeduceR) – Simon Jul 22 '11 at 20:30

Turns out the model does not converge on our data (discovered after eventually finding a copy of SPSS to run it through), so R spins out of control unless clogit is supplied with the argument method=c("approximate"). I also found this video helpful: http://www.youtube.com/watch?v=be-M0tMMKyU – Simon Jul 27 '11 at 21:08
