I'm in charge of contacting customers of a company in order to analyse their satisfaction.
The problem is that I contact them by phone, and the people I reach (the sample) are not representative of the full population.
So I am considering post-stratification, but I need the reweighted sample to be representative on several quantitative variables at the same time:
- age
- amount spent.
I know that post-stratification needs a qualitative (categorical) variable.
How can I combine these 2 quantitative variables into a single qualitative stratification variable?
1st idea:
Split the sample into 4 groups by age quartile.
Split each of these 4 groups by the quartiles of its own amount spent.
This gives 16 groups (6.25% of the sample each) that can be used as strata.
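To make the first idea concrete, here is a minimal sketch in plain Python. The data are simulated, and the function names (`nested_quartile_strata`, `poststrat_weights`) and the `population_share` values are hypothetical. In practice the population share of each cell would have to come from the company's full customer base, with the same cut points applied to it.

```python
import random
from statistics import quantiles


def quartile_index(value, cuts):
    """Return 0..3: how many of the three cut points lie strictly below value."""
    return sum(value > c for c in cuts)


def nested_quartile_strata(ages, amounts):
    """Label each respondent with (age quartile, spend quartile within that age group)."""
    age_cuts = quantiles(ages, n=4)  # three cut points computed on the sample
    age_q = [quartile_index(a, age_cuts) for a in ages]
    strata = [None] * len(ages)
    for q in range(4):
        members = [i for i, g in enumerate(age_q) if g == q]
        # Spend quartiles are recomputed inside each age group (nested split).
        spend_cuts = quantiles([amounts[i] for i in members], n=4)
        for i in members:
            strata[i] = (q, quartile_index(amounts[i], spend_cuts))
    return strata


def poststrat_weights(strata, population_share):
    """Weight of a respondent = population share of its cell / sample share of its cell."""
    n = len(strata)
    counts = {}
    for s in strata:
        counts[s] = counts.get(s, 0) + 1
    return [population_share[s] / (counts[s] / n) for s in strata]


# Simulated sample (assumption: 400 respondents, no real data involved).
random.seed(0)
ages = [random.uniform(18, 80) for _ in range(400)]
amounts = [random.lognormvariate(4, 1) for _ in range(400)]

strata = nested_quartile_strata(ages, amounts)

# Hypothetical population shares per cell (they sum to 1); here the
# population is assumed to contain more customers in the older age groups.
population_share = {(a, s): (a + 1) / 40 for a in range(4) for s in range(4)}
weights = poststrat_weights(strata, population_share)
```

Because the cut points are computed on the sample, every cell holds the same share of the sample by construction. The weights only move away from 1 where the population share of a cell differs from that sample share.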
2nd idea: Run a clustering algorithm on the two variables and use the resulting k groups as strata.
Which of the two is more commonly used when analysts need to post-stratify a sample on several quantitative variables?
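The second idea could look like the sketch below: a plain k-means (Lloyd's algorithm) on the standardized variables, written with the standard library only. The choice of k-means, of k = 4, of standardizing first, and the simulated data are all assumptions for illustration. To use the clusters as strata, the share of the population falling in each cluster would also have to be known, which requires age and amount spent for the whole customer base.

```python
import random
from statistics import mean, pstdev


def standardize(values):
    """Rescale to mean 0 and standard deviation 1 so both variables count equally."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def squared_distance(p, c):
    return sum((a - b) ** 2 for a, b in zip(p, c))


def kmeans(points, k, iters=50, seed=0):
    """Basic Lloyd's algorithm; returns one cluster label per point and the centers."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initial centers drawn from the data
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k), key=lambda j: squared_distance(pt, centers[j]))
            for pt in points
        ]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(mean(dim) for dim in zip(*members)))
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Simulated sample (assumption: 400 respondents, no real data involved).
random.seed(0)
ages = [random.uniform(18, 80) for _ in range(400)]
amounts = [random.lognormvariate(4, 1) for _ in range(400)]

points = list(zip(standardize(ages), standardize(amounts)))
labels, centers = kmeans(points, k=4)  # labels would serve as the stratum variable
```

Unlike the quartile grid, the clusters are not guaranteed to have similar sizes, so some strata may end up with few respondents and therefore unstable weights.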