6

I have six ratios (females-males). These are the observed values:

1.78 - 1.17 - 0.53 - 1 - 0.85 - 0.56

I need to run a chi-squared test to show that these ratios are significantly different.

I expect all these ratios to be equal. But as for the “equal” value, there is no expectation. It could be any value, providing that this value is the same for all the cases. How can I run a chi-squared test in this condition? What do I plug into the “expected value” of the formula? Thanks

These are the data:

Group    1    2    3    4    5    6        Total

Men :    9   17   13   12   11   19         81
Women:  16    9   11   14   11    5         66

Total:  25   26   24   26   22   24        147
Gwen
  • 583

3 Answers

3

Testing for equality of the ratio of females to males across groups is the same as testing for equality of the proportion of females across groups. With $r_i = f_i/m_i$ and $p_i = f_i/(f_i+m_i)$, we have $p_i = r_i/(1+r_i)$ and $r_i = p_i/(1-p_i)$. Because this mapping is one-to-one, the $p_i$ differ exactly when the $r_i$ differ.

So if you show that the proportions $p_i$ differ significantly, you can conclude that the ratios $r_i$ differ significantly as well.

So you are testing for equality of proportion across groups.
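As a quick check of the ratio/proportion equivalence, here is a small R sketch using the counts from the question. The variable names (`men`, `women`, `r`, `p`) are only illustrative.

```r
# Counts per group, taken from the table in the question
men   <- c( 9, 17, 13, 12, 11, 19)
women <- c(16,  9, 11, 14, 11,  5)

r <- women / men            # female:male ratio in each group
p <- women / (women + men)  # proportion of females in each group

# The two quantities are one-to-one transformations of each other
all.equal(p, r / (1 + r))   # TRUE
all.equal(r, p / (1 - p))   # TRUE
```

Since each $p_i$ determines its $r_i$ and vice versa, a test on the proportions answers the question about the ratios.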

This is actually the same as testing for independence in the two-way table.

As a result, for the data,

Group    1    2    3    4    5    6        Total

Men :    9   17   13   12   11   19         81
Women:  16    9   11   14   11    5         66

Total:  25   26   24   26   22   24        147

your expected values are just $E_i = \text{row total}\times\text{column total}/\text{overall total}$. E.g. the expected values to go with the first group are:

 25 x 81 / 147 = 13.78

 25 x 66 / 147 = 11.22

The table of expecteds is:

Group    1     2     3     4     5     6     
Men    13.78 14.33 13.22 14.33 12.12 13.22
Women  11.22 11.67 10.78 11.67  9.88 10.78
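The whole table of expected counts can be produced in one step. The following is a minimal R sketch, assuming the counts are typed in as a 2 x 6 matrix (the names `obs` and `expected` are illustrative). The row and column totals are computed from the counts themselves, so group 6 contributes 19 + 5 = 24.

```r
# Observed 2 x 6 table of counts from the question
obs <- rbind(Men   = c( 9, 17, 13, 12, 11, 19),
             Women = c(16,  9, 11, 14, 11,  5))
colnames(obs) <- 1:6

# Expected count = row total * column total / overall total
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)
round(expected, 2)
```

The expected counts add up to the same grand total as the observed counts, which is a handy sanity check.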

You can then calculate the chi-square statistic for the table: just find

(observed - expected)^2/expected 

for all $6\times 2$ numbers and add the 12 terms up.

The first column:

(9- 13.78)^2/13.78 = 1.66

(16 - 11.22)^2/11.22 = 2.03

though it's better if you keep more than two decimal places for all the intermediate calculations. If you do it right you should get a chi-square of somewhere close to 11.5 on 5 df. Looking at the Pearson residuals, almost all of that is coming from the first and sixth groups (especially the sixth group).
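To avoid rounding in the intermediate steps, the same calculation can be handed to R's built-in `chisq.test`. This is only a sketch of one way to do it; `obs` and `fit` are illustrative names.

```r
obs <- rbind(Men   = c( 9, 17, 13, 12, 11, 19),
             Women = c(16,  9, 11, 14, 11,  5))
colnames(obs) <- 1:6

fit <- chisq.test(obs)   # continuity correction only applies to 2 x 2 tables
fit$statistic            # Pearson chi-square statistic
fit$parameter            # degrees of freedom: (2 - 1) * (6 - 1) = 5

# Pearson residuals: (observed - expected) / sqrt(expected)
round(fit$residuals, 2)

# Share of the statistic contributed by each group (column)
round(colSums(fit$residuals^2) / fit$statistic, 2)
```

The last line shows how much each group contributes to the total, which is how the remark about the first and sixth groups can be checked.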

Glen_b
  • 282,281
2

It depends on your null hypothesis. If your hypothesis is that males and females are equally numerous, then the expected proportion of each is 0.5.

EDIT:

So your data should look something like this:

       1    2    3    4    5    6      TOTAL
MEN                                   | a
WOMEN                                 | b
---------------------------------------  
TOTAL  c    d    e    f    g    h

$\chi^2 = \sum_{i=1}^2\sum_{j=1}^6 \dfrac{(O_{i,j} - E_{i,j})^2}{E_{i,j}}$

where the entries in your table are your observed values (e.g. the number of men in group 1 is $O_{1,1}$). Under the null hypothesis of independence, $E_{1,1} = ac/N$, $E_{1,2} = ad/N$, etc., where $N$ is the total number of observations.

In R you can do

men   <- c( 9, 17, 13, 12, 11, 19)
women <- c(16,  9, 11, 14, 11,  5)

# test whether the proportion of men is the same in all six groups
prop.test(x = men, n = men + women)

6-sample test for equality of proportions without
continuity correction

data:  men out of (men + women)
X-squared = 11.4978, df = 5, p-value = 0.04236
alternative hypothesis: two.sided
sample estimates:
   prop 1    prop 2    prop 3    prop 4    prop 5 
0.3600000 0.6538462 0.5416667 0.4615385 0.5000000 
   prop 6 
0.7916667 

So since the p-value is below 0.05 I would say we have enough evidence to reject the null hypothesis that ALL 6 proportions are equal. At least one of the proportions is not equal to the rest.
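The question is phrased in terms of females, while the call above counts men as the "successes". A short sketch (with illustrative names `by_men` and `by_women`) suggests this makes no difference to the test, and shows where the p-value comes from:

```r
men   <- c( 9, 17, 13, 12, 11, 19)
women <- c(16,  9, 11, 14, 11,  5)

by_men   <- prop.test(x = men,   n = men + women)
by_women <- prop.test(x = women, n = men + women)

# Same statistic whichever sex is counted as a "success"
all.equal(unname(by_men$statistic), unname(by_women$statistic))  # TRUE

# The p-value is the upper tail of a chi-squared distribution on 5 df
pchisq(unname(by_men$statistic), df = 5, lower.tail = FALSE)
```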

bdeonovic
  • 10,127
  • 1
    That is the issue, the hypothesis is that the ratio is equal, but there is no hypothesis as for the value of this ratio – Gwen Oct 03 '13 at 00:43
  • Ah, my mistake, I did not read the question carefully enough. – bdeonovic Oct 03 '13 at 00:44
  • So....is it possible to conduct the test? I have been told it is, but I can't figure out how.... – Gwen Oct 03 '13 at 01:07
  • If you know the number of males and females that produced the ratios I think it would be quite easy to do a chi-squared test. – bdeonovic Oct 03 '13 at 01:08
  • Yes, I do know the numbers of the females and males that generated the ratios. – Gwen Oct 03 '13 at 01:34
  • You need an $N$ in your expected formulas (eg, "$E_{1,1}=a*c$" should be, $E_{1,1}=ac/N$). – gung - Reinstate Monica Oct 03 '13 at 02:19
  • @Gwen, your data didn't come through. If you have the numbers of males & females in each condition, you can run the chi-squared test. – gung - Reinstate Monica Oct 03 '13 at 02:20
  • So, I get two sums of squares: one for the males, one for the females, then add up these two SS and get the chi-squared statistic. Is this correct? – Gwen Oct 03 '13 at 02:33
  • Any source where I can read about this procedure (i.e., the nested summations). Thanks – Gwen Oct 03 '13 at 02:34
  • @Ben So N should be 12? – Gwen Oct 03 '13 at 02:49
  • @Gwen No, $N$ should be the total number of people in your data (so in my made-up example above it would be N=83+90+129+70+22+87+86+136+82+25+85). For a good reference on the chi-squared test, see Agresti's "An Introduction to Categorical Data Analysis", page 35: http://books.google.com/books?id=YRtEiDevAi0C&printsec=frontcover&source=gbs_ge_summary_r&cad=0#v=onepage&q&f=false – bdeonovic Oct 03 '13 at 13:35
0

Is it mandatory to run a test based on the chi-squared statistic? If not, I would suggest a likelihood ratio test, which properly accounts for the fact that you do not know the expected fraction of females $p$. If I write $p_i = p + \Delta p_i$ (with $\Delta p_0 = 0$) for the expected fraction of females in category $i$, where $i = 0, \dots, 5$ indexes the six groups, then the null hypothesis can be written as:

$$ H_0 : p_0 = p_1 = \dots = p_5 = p $$

or equivalently

$$ H_0 : \Delta p_1 = \Delta p_2 = \dots = \Delta p_5 = 0 $$

with $p$ an unknown nuisance parameter.

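With a likelihood ratio test, you would build your test statistic as

$$ D = -2 \log L({\bf n},{\bf N}, {\bf \Delta p} = 0, \hat{\hat{p}}) + 2 \log L({\bf n},{\bf N}, {\bf \hat{\Delta p}}, \hat{p}) $$

where $\hat{\hat{p}}$ is the estimate of $p$ under $H_0$ and $\hat{p}$, ${\bf \hat{\Delta p}}$ are the unconstrained estimates, with

$$ \log L({\bf n},{\bf N}, {\bf \Delta p}, p) = \sum_i \log f_i(n_i, N_i, p_i) $$

Here $n_i$ is the number of females and $N_i$ the total count in group $i$.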

Now if you believe that your data are normally distributed (I do not think so; I would rather choose a binomial distribution), then $-2 \log L$ simplifies, up to an additive constant, to

$$ -2 \log L = \sum_i \frac{(n_i - p_i N_i)^2}{n_i} = \chi^2 $$

so the test statistic would be a difference of $\chi^2$ values:

$$ D = \chi^2({\bf n},{\bf N}, {\bf \Delta p} = 0, \hat{\hat{p}}) - \chi^2({\bf n},{\bf N}, {\bf \hat{\Delta p}}, \hat{p}) $$

In one case you run your least-squares fit with all parameters free, and in the other you run it with the constraint ${\bf \Delta p} = 0$. If $H_0$ holds, $D$ should asymptotically follow a $\chi^2$ distribution with degrees of freedom equal to the number of groups minus one. But as I said, I think it is much better to work with a likelihood function in which $f_i(n_i, N_i, p_i) = C_{N_i}^{n_i} p_i^{n_i} (1 - p_i)^{N_i - n_i}$ is the binomial probability.
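For the binomial version, the likelihood ratio statistic can be sketched directly in R. This is a minimal illustration using the counts from the question, not a full implementation. The names `n`, `N`, `p_null` and `p_free` are illustrative. The maximum-likelihood estimates are taken to be the pooled fraction under $H_0$ and the per-group fractions otherwise.

```r
# n = number of females, N = group size (counts from the question)
n <- c(16,  9, 11, 14, 11,  5)
N <- n + c( 9, 17, 13, 12, 11, 19)

# Null model: one common proportion, estimated by the pooled fraction
p_null <- sum(n) / sum(N)
# Free model: one proportion per group, estimated by the group fraction
p_free <- n / N

loglik_null <- sum(dbinom(n, N, p_null, log = TRUE))
loglik_free <- sum(dbinom(n, N, p_free, log = TRUE))

D  <- 2 * (loglik_free - loglik_null)   # likelihood ratio statistic
df <- length(n) - 1                     # parameters constrained by H0
pchisq(D, df = df, lower.tail = FALSE)  # asymptotic p-value
```

The binomial coefficients cancel in the difference, so `D` depends only on the fitted proportions.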

Mr Renard
  • 176
  • 4