I am trying to set up a data sheet for my project, which is on the number of men and women on each floor in a dorm. My hypothesis is that each floor (and therefore the dorm as well) will have more women than men.

I went to 9 floors and counted the number of men and women. Now I'm confused about how to set up my data sheet. My data consist of the floor number and the number of men and women on each floor. I'm doing a 1-way ANOVA, so do I need to convert my data into percentages (of women on each floor), or should I record the number of men and women on each floor in my data sheet? I'm not sure what to do.

Lindsay
  • not clear how this is an anova design: the number of men and women can hardly be considered independent and it does not seem to be replicated. A paired, by floor, t-test on % might be a better alternative. Wait, datasheet to record data, not for analyses? – katya Nov 26 '14 at 23:26
  • This is ANOVA-ish, in the sense that your explanatory variables (floors) are all categorical instead of continuous, but your response is binary (female vs male), not normal. You should look into using logistic regression. – gung - Reinstate Monica Nov 26 '14 at 23:44

1 Answer

You should keep the data as counts, and not reduce them to percentages. The data then have the format of a contingency table (specifically, a 2 × 9 table: 2 genders and 9 floors), and you can use a chi-squared test of independence.
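As a rough illustration of that layout and test, here is a minimal Python sketch. The counts are invented for the example (they are not your data), and the function names are my own. The sheet has one row per floor with the two counts side by side. The p-value helper uses a closed form that only holds for even degrees of freedom, which happens to cover the 8 degrees of freedom of a 2 × 9 table. Note that this test asks whether the gender split differs between floors.

```python
import math

# Hypothetical counts, invented purely for illustration (not the asker's data).
# One row per floor: (floor, men, women).
data_sheet = [
    (1, 12, 18),
    (2, 15, 15),
    (3, 10, 20),
    (4, 14, 16),
    (5, 11, 19),
    (6, 13, 17),
    (7, 16, 14),
    (8, 9, 21),
    (9, 12, 18),
]


def chi_squared_independence(rows):
    """Pearson chi-squared statistic and degrees of freedom for a floors x 2 table."""
    men_total = sum(m for _, m, _ in rows)
    women_total = sum(w for _, _, w in rows)
    grand_total = men_total + women_total
    statistic = 0.0
    for _, men, women in rows:
        floor_total = men + women
        for observed, column_total in ((men, men_total), (women, women_total)):
            # Expected count if gender were independent of floor.
            expected = floor_total * column_total / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (2 - 1)
    return statistic, dof


def chi2_sf_even_dof(x, dof):
    """Upper-tail chi-squared probability; this closed form is valid only for even dof."""
    if dof % 2 != 0:
        raise ValueError("closed form only holds for even degrees of freedom")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(dof // 2))


statistic, dof = chi_squared_independence(data_sheet)
p_value = chi2_sf_even_dof(statistic, dof)
print(f"chi-squared = {statistic:.3f}, dof = {dof}, p = {p_value:.3f}")
```

In practice a statistics package would do this for you; the sketch is only meant to show that the test works directly on the raw counts, so there is no need to convert them to percentages first.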

As pointed out by user @gung in a comment, you can also use logistic regression, which in its most basic form corresponds closely to the chi-squared test, but allows for extensions (like using the floor number as a predictor). For the relationship between logistic regression and the chi-squared test, see for instance "Logistic regression vs chi-square in a 2x2 and Ix2 (single factor - binary response) contingency tables?" and "chi square test vs logistic regression result".
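To make the correspondence a little more concrete, here is a hedged Python sketch, again with invented counts and my own variable names. It compares two logistic models for "is this resident a woman?": an intercept-only model (one common proportion for all floors) and a model with floor as a categorical predictor (one proportion per floor). For these two models the maximum likelihood estimates are just the observed proportions, so no iterative fitting is needed; a model using floor number as a numeric predictor would need a proper GLM routine. The sketch assumes every count is positive, since it takes logarithms.

```python
import math

# Hypothetical counts (floor, men, women), invented for illustration only.
counts = [
    (1, 12, 18),
    (2, 15, 15),
    (3, 10, 20),
    (4, 14, 16),
    (5, 11, 19),
    (6, 13, 17),
    (7, 16, 14),
    (8, 9, 21),
    (9, 12, 18),
]

men_total = sum(m for _, m, _ in counts)
women_total = sum(w for _, _, w in counts)
grand_total = men_total + women_total


def log_likelihood(rows, prob_woman):
    """Binomial log-likelihood (constants dropped); prob_woman maps floor -> P(woman)."""
    return sum(
        w * math.log(prob_woman[floor]) + m * math.log(1.0 - prob_woman[floor])
        for floor, m, w in rows
    )


# Intercept-only model: the same probability of "woman" on every floor.
null_probs = {floor: women_total / grand_total for floor, _, _ in counts}

# Floor as a categorical predictor: each floor gets its own probability.
floor_probs = {floor: w / (m + w) for floor, m, w in counts}

# Fitted log-odds per floor under the categorical model.
floor_log_odds = {floor: math.log(w / m) for floor, m, w in counts}

# Likelihood-ratio statistic comparing the two models.
lr_statistic = 2.0 * (log_likelihood(counts, floor_probs) - log_likelihood(counts, null_probs))
dof = len(counts) - 1

print(f"likelihood-ratio statistic = {lr_statistic:.3f} on {dof} degrees of freedom")
```

The likelihood-ratio statistic is referred to the same chi-squared distribution (8 degrees of freedom here) as the Pearson statistic, and the two are usually close in value, which is the sense in which the basic logistic regression matches the chi-squared test.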