10

I have a dataset in which people were asked whether they have been to certain places (e.g. A, B, C, D); they can choose more than one. A specimen is then taken from their nose to see whether they are infected with some disease.

I need to find the relative risk of infection for someone who went to a given place. I can only think of logistic regression right now; are there any other suggestions?

Thanks.

Stephan Kolassa
lokheart

1 Answer

2

You can still use logistic regression because your outcome is dichotomous (infected vs. not infected). I would simply take a dummy-variable approach with no travel as the reference category: for each place, include a variable coded 1 if the person visited that place and 0 if they did not. If you then exponentiate the beta coefficients (the log odds ratios), the result for location A's dummy is the odds ratio of infection for those who visited A versus those who did not, controlling for the other places visited. Also note that multicollinearity is a concern with this approach: if many of the people who travel to A also travel to B, the two coefficients become hard to separate and their estimates unstable.
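As a rough illustration of the dummy-variable approach described above, here is a minimal sketch in Python. Everything in it is an assumption made for the example: the data are simulated, the "true" coefficients are arbitrary, and `fit_logistic` is a hypothetical helper that fits the model by plain Newton-Raphson rather than calling a statistics package. It only shows how exponentiating the fitted coefficients gives one odds ratio per place.

```python
import numpy as np

def fit_logistic(X, y, n_iter=25, tol=1e-10):
    """Hypothetical helper: logistic regression via Newton-Raphson.

    Returns the coefficient vector with the intercept first.
    """
    X1 = np.column_stack([np.ones(len(X)), X])    # prepend intercept column
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X1 @ beta))      # fitted probabilities
        w = p * (1.0 - p)                         # Bernoulli variances
        grad = X1.T @ (y - p)                     # score vector
        hess = X1.T @ (X1 * w[:, None])           # observed information
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated data (an assumption for illustration, not the poster's data).
rng = np.random.default_rng(0)
n = 5000
visits = rng.integers(0, 2, size=(n, 4))          # 0/1 dummies for A, B, C, D
true_beta = np.array([-2.0, 1.0, 0.0, 0.5, 0.0])  # intercept, A, B, C, D
logit = true_beta[0] + visits @ true_beta[1:]
infected = (rng.random(n) < 1.0 / (1.0 + np.exp(-logit))).astype(float)

beta_hat = fit_logistic(visits, infected)
odds_ratios = np.exp(beta_hat[1:])                # one odds ratio per place

for place, odds_ratio in zip("ABCD", odds_ratios):
    print(f"Place {place}: odds ratio of infection = {odds_ratio:.2f}")
```

Each printed value is an odds ratio of infection for visiting that place versus not visiting it, holding the other three dummies fixed. It is not a relative risk, although the two are close when the disease is rare.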

Andy W
  • 5
    This model assumes the response is an additive function of traveling to each place, which is highly unlikely. It can still be made to work by including interaction terms. A full set of all possible interactions might be needed (beyond just the two-way interactions). (That would be mathematically identical to providing a separate dummy for each possible combination of destinations.) – whuber Oct 14 '10 at 04:59
  • 4
    Better have a lot of data if you use all interactions (15 parameters) rather than just the main effects (4 parameters)... – Stephan Kolassa Oct 14 '10 at 06:56
  • @whuber and @Stephan, thanks for the responses; I agree completely with each of you. I personally would be OK with the main-effects dummy variable approach if multiple responses weren't all that common, which may not be a tenable assumption given the original poster's concerns. I might propose other designs if the original poster were interested in the risk of travelling to A vs. B (such as some type of matching procedure). And I agree additive risk does not make sense unless some selection bias is occurring. – Andy W Oct 14 '10 at 12:31
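The parameter counts mentioned in the comments (4 main effects versus 15 for the full set of interactions) can be checked with a short sketch. It assumes the four places A to D from the question and "no travel" as the reference pattern; `combination_dummies` is a hypothetical helper illustrating the equivalent coding with one dummy per combination of destinations.

```python
from itertools import product

places = ["A", "B", "C", "D"]

# Every possible visit pattern: each place is either visited (1) or not (0).
patterns = list(product([0, 1], repeat=len(places)))
reference = (0, 0, 0, 0)                      # "no travel" reference category
non_reference = [p for p in patterns if p != reference]

def combination_dummies(visit_row):
    """Hypothetical helper: one indicator per non-reference visit pattern."""
    return [1 if tuple(visit_row) == p else 0 for p in non_reference]

n_main_effects = len(places)                  # one coefficient per place
n_saturated = len(non_reference)              # one coefficient per combination

print(f"Main-effects model: {n_main_effects} parameters (plus intercept)")
print(f"All interactions:   {n_saturated} parameters (plus intercept)")
```

With four places there are 2^4 = 16 visit patterns, so the saturated model needs 15 parameters beyond the intercept, and each one is estimated only from the people who share that exact pattern.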