Combining two survey samples

Question

I have two surveys of business owners. Sample 1 is a sample of business owners who were not members of the association, drawn using random digit dialing. Sample 2 is a sample of business owners who were members of the association, with the association's member list as the sampling frame. Both samples were stratified by state. Sample 1 was weighted by state and gender. Sample 2 was weighted for non-response and to reflect the association's characteristics. As it turns out, there is a greater proportion of males in the association data (sample 2) than in the non-member data (sample 1). At first look, it appears that the association appeals to males more than females. We have 500 responses for each sample.

I want to use a probit regression to find the factors (mostly demographic) influencing association membership. What is the best way to do this? I was hoping I could just combine the samples. However, the samples were drawn independently. Clearly there is a huge difference in the size of the populations: the population behind sample 1 is huge (all owners in the country), while sample 2 covers just the members. The survey instrument (questions) was almost identical; there were some extra questions for the members of the association (sample 2) about their satisfaction with the association.

I'm wondering why you want to use probit. Logistic regression is standard for relating covariates to a discrete binary response. Should you be interested, I wrote a good deal about all that here: difference-between-logit-and-probit-models. — gung - Reinstate Monica, Aug 25 '12 at 18:18

score 2 · Answer 1 · answered Aug 25 '12 at 19:39

This looks like a case-control study (all cases are sampled, while a similar number of controls are sampled at a much lower rate). Read up on Alastair Scott and Chris Wild's work on this (book chapter, invited lecture). I second gung's opinion that logistic regression is somewhat more suitable (the theoretical advantages being the exponential family and sufficient statistics).
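To illustrate the case-control point, here is a minimal sketch on simulated data. It is not taken from the work cited above, and it ignores the survey weights and stratification described in the question. The population size, the coefficients `true_b0` and `true_b1`, and the helper `fit_logit` are all made up for the example. It shows the standard case-control result for logistic regression: fitting an ordinary logit to the pooled members and non-members recovers the log odds ratio for a covariate, while the intercept is shifted by the log of the ratio of the two sampling fractions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population of business owners (all numbers are invented).
N = 200_000
male = rng.integers(0, 2, size=N)
true_b0, true_b1 = -4.0, 0.7  # assumed membership model on the logit scale
p_member = 1.0 / (1.0 + np.exp(-(true_b0 + true_b1 * male)))
member = rng.random(N) < p_member

# Case-control style draw: 500 members and 500 non-members, sampled separately.
cases = rng.choice(np.flatnonzero(member), size=500, replace=False)
controls = rng.choice(np.flatnonzero(~member), size=500, replace=False)
idx = np.concatenate([cases, controls])

X = np.column_stack([np.ones(idx.size), male[idx]])  # intercept + male dummy
y = member[idx].astype(float)


def fit_logit(X, y, n_iter=25):
    """Unweighted logistic regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))
        w = mu * (1.0 - mu)
        grad = X.T @ (y - mu)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


b0_cc, b1_cc = fit_logit(X, y)

# Sampling fractions for members (f1) and non-members (f0).
f1 = 500 / member.sum()
f0 = 500 / (~member).sum()

# The slope needs no correction; the intercept is off by log(f1 / f0).
b0_adj = b0_cc - np.log(f1 / f0)

print("slope (log odds ratio):", b1_cc, "assumed truth:", true_b1)
print("raw intercept:", b0_cc, "adjusted:", b0_adj, "assumed truth:", true_b0)
```

With real data the sampling fractions would have to come from the known size of the association and an estimate of the number of non-member owners, and the design weights would need to be handled as in the Scott and Wild references.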

StasK

Thanks to everyone for the answers. I'm now clear on what I need to do and I've found a couple of very useful papers. You're correct, it is logistic regression I want, to get the odds ratios out. Thanks!! – eliz Aug 26 '12 at 21:20
