
I conducted a cognitive experiment where I coded several behavioral patterns. One pattern is the latency between the start of the experiment and the first behavioral response of the participant. I would like to estimate a probability model for this duration data using a Gamma family.

However, some participants did not react at all; I coded these values as N/A. They are not missing values per se: the participants simply showed no response.

How do I incorporate these observations in analyses?

AdamO
  • 62,637
Tony
  • 11
  • Without knowing the stimulus it's hard to say too much. Was the lack of reaction a deliberate suppression, or is it possible the reaction is below the limit of (your) detection, such as a muted gasp, or other subtle quickening? We did a cold pressor test and found that men had a "macho" tendency to try to hold their hands in ice water for as long as physically possible, making it impossible for us to measure any pain threshold in them. – AdamO Dec 07 '17 at 13:33
  • can you say what the 'mixed' aspect of this analysis would be, i.e. what grouping variable would be included? – Ben Bolker Dec 07 '17 at 13:50
  • @AdamO thanks so much, this already helped me a lot! – Tony Dec 12 '17 at 09:02

1 Answer


The lack of an observed response is called censoring. You may assume that participants would have shown some response had you waited long enough, but you administratively censored further observation after, say, 2 minutes. In R you can code censored data as a Surv object with the survival package. To do so, you supply a pair of values for each subject: the observation time (either the event time or the censoring time) and the event indicator (1 = event, 0 = censored). A censored time is displayed as 2.0+ if a subject was observed for 2 minutes (say) and observation was halted for lack of a timely response. The + indicates that the event had not occurred by that time; under the censoring assumption, it would have occurred at some later time had you observed the subject indefinitely.
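As a minimal sketch of the coding step described above: the latencies, the 2-minute cutoff, and the variable names below are made up for illustration, with NA standing in for the "no response" entries from the question.

```r
library(survival)

# Hypothetical latencies in minutes; NA means no response was seen
# before observation stopped (the 2-minute window is an assumption).
latency <- c(0.4, 1.1, NA, 0.7, NA, 1.8)
cutoff  <- 2.0

responded <- !is.na(latency)                     # TRUE = response observed
time      <- ifelse(responded, latency, cutoff)  # non-responders get the cutoff
event     <- as.integer(responded)               # 1 = event, 0 = censored

# Surv() pairs each time with its event indicator; censored entries
# are printed with a trailing "+".
y <- Surv(time, event)
print(y)
```

If observation windows differed between participants, `cutoff` would be a vector holding each participant's own stopping time rather than a single number.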

Competing risks are not often discussed in such scenarios. For instance, in cancer survival analysis, it is assumed that death due to other causes censors cancer death, i.e. had a subject not died of *heart disease* (or Alzheimer's disease, or pneumonia, and so on) they would eventually have died of cancer. I think ignoring competing risks is okay when the event in question is in some sense imminent and prevalent. In your case, it's hard to conceive of a competing risk when a subject has been asked to complete a specific task.

Probability models for censored data are generally estimated using maximum likelihood. The survreg function in the survival package fits such models, estimating the parameters (for example, shape and scale) of many parametric survival models. These methods assume independent observations. If each participant contributes several observations, you can use a single observation from each participant to obtain an independent subsample. An alternative approach is to include a frailty component which, in essence, acts like a random intercept in a GLMM.
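A rough sketch of the maximum likelihood step, assuming one observation per participant and simulated data (the sample size, true parameters, cutoff, and function names are all made up). Observed latencies contribute the density and censored ones contribute the survival probability at the cutoff. As far as I know, survreg's built-in distributions do not include the Gamma, so the Gamma likelihood is written out by hand with base R and a Weibull fit from survreg is shown alongside it; packages such as flexsurv offer a Gamma option as well.

```r
library(survival)

set.seed(1)
n      <- 200
cutoff <- 2.0                                 # assumed observation window (minutes)
latent <- rgamma(n, shape = 2, rate = 1.5)    # simulated "true" latencies
event  <- as.integer(latent <= cutoff)        # 1 = response seen, 0 = censored
time   <- pmin(latent, cutoff)

# Negative log-likelihood of a Gamma model under right censoring.
# Parameters are optimized on the log scale to keep them positive.
negloglik <- function(par, time, event) {
  shape <- exp(par[1])
  rate  <- exp(par[2])
  ll_obs  <- dgamma(time[event == 1], shape = shape, rate = rate, log = TRUE)
  ll_cens <- pgamma(time[event == 0], shape = shape, rate = rate,
                    lower.tail = FALSE, log.p = TRUE)
  -(sum(ll_obs) + sum(ll_cens))
}

opt       <- optim(c(0, 0), negloglik, time = time, event = event)
shape_hat <- exp(opt$par[1])
rate_hat  <- exp(opt$par[2])

# The same data fitted with survreg, here with a Weibull distribution.
# survreg works on the log-time (location-scale) form, so the usual
# Weibull shape is 1 / fit$scale and the Weibull scale is exp(intercept).
fit_wb   <- survreg(Surv(time, event) ~ 1, dist = "weibull")
wb_shape <- 1 / fit_wb$scale
wb_scale <- exp(unname(coef(fit_wb)))
```

Treating the non-responders as censored rather than dropping them matters here: discarding them would bias the estimated latencies downward, because only the faster responders would remain.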

AdamO
  • 62,637
  • good answer; my only quibble is that you don't necessarily need EM to analyze censored data (I don't think survival uses EM). See any survival analysis book, e.g. the relevant chapter of Harrell's Regression Modeling Strategies ... – Ben Bolker Dec 07 '17 at 13:51
  • @BenBolker thanks for pointing that out, I had never realized that likelihoods for location-scale failure data with censoring were so amenable to Newton-Raphson estimation. – AdamO Dec 07 '17 at 15:42