3

I have two surveys of business owners. Sample 1 covers business owners who were not members of the association and was collected by random digit dialing. Sample 2 covers business owners who were members of the association, with the membership list used as the sampling frame. Both samples were stratified by state. Sample 1 was weighted by state and gender. Sample 2 was weighted for non-response and to reflect the association's characteristics. As it turns out, the proportion of males is greater in the association data (sample 2) than in the non-member data (sample 1). At first glance, it appears that the association appeals to males more than to females. We have 500 responses in each sample.

I want to use a probit regression to find the factors (mostly demographic) that influence association membership. What is the best way to do this? I was hoping I could just combine the samples. However, the samples were drawn independently, and the underlying populations differ hugely in size: the population behind sample 1 is huge (all non-member owners in the country), while the population behind sample 2 is just the members. The survey instruments were almost identical, except that the members (sample 2) got some extra questions about their satisfaction with the association.

chl
eliz

1 Answer

2

This looks like a case-control study (all cases are sampled, and a similar number of controls are sampled at a much lower rate). Read up on Alastair Scott and Chris Wild's work on this (book chapter, invited lecture). I second gung's opinion that logistic regression is somewhat more suitable here (its theoretical advantages being the exponential family and sufficient statistics).
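To see why the case-control view makes combining the two samples workable for odds ratios, here is a minimal sketch with made-up population counts (they are not taken from the surveys). It ignores the survey weights and stratification, which a real analysis would still have to handle. It shows that sampling members and non-members at very different rates leaves the male/female odds ratio unchanged, and that only the intercept of a logistic model would shift, by the log of the ratio of the two sampling rates.

```python
from fractions import Fraction
import math

def odds_ratio(case_exposed, case_unexposed, control_exposed, control_unexposed):
    """Cross-product odds ratio from a 2x2 table, kept exact with Fraction."""
    return Fraction(case_exposed * control_unexposed,
                    case_unexposed * control_exposed)

# Hypothetical population counts (invented for illustration only).
members = {"male": 6_000, "female": 4_000}
non_members = {"male": 450_000, "female": 550_000}

# Each group is sampled at its own rate so that both samples have 500 respondents.
rate_members = Fraction(500, sum(members.values()))
rate_non_members = Fraction(500, sum(non_members.values()))

# Expected sample counts under those sampling rates.
sample_members = {k: v * rate_members for k, v in members.items()}
sample_non_members = {k: v * rate_non_members for k, v in non_members.items()}

# Odds ratio of membership for males vs. females, in the population and in the sample.
or_population = odds_ratio(members["male"], members["female"],
                           non_members["male"], non_members["female"])
or_sample = odds_ratio(sample_members["male"], sample_members["female"],
                       sample_non_members["male"], sample_non_members["female"])

# The logistic intercept fitted to the combined sample is offset by this amount
# relative to the population intercept; the slopes (log odds ratios) are not.
intercept_shift = math.log(float(rate_members / rate_non_members))

print("population odds ratio:", or_population)
print("sample odds ratio:    ", or_sample)
print("intercept shift:      ", intercept_shift)
```

In this toy setup both odds ratios come out identical, which is the property that lets you read odds ratios off a logistic regression on the pooled data. The same invariance does not hold for probit coefficients.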

StasK
  • Thanks to everyone for the answers. I'm now clear on what I need to do, and I've found a couple of very useful papers. You're correct, it is logistic regression I want, to get the odds ratios out. Thanks!! – eliz Aug 26 '12 at 21:20