Questions tagged [hypergeometric-distribution]

A discrete distribution used to model sampling without replacement.

The hypergeometric distribution is a discrete distribution. It is used to model sampling without replacement from a collection of objects regarded as being of two types, for example when drawing otherwise identical colored balls from an urn.

Specifically, in that situation, it gives the probability of drawing exactly $k$ red balls ("successes") in a sample of $n$ balls drawn without replacement from an urn containing $K$ red balls out of $N$ balls in total.

The probability mass function of the distribution is:

$$P(X=k) = \frac{ {K \choose k} {N-K \choose n-k} }{N \choose n}$$
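As a minimal sketch, the mass function can be evaluated exactly with Python's standard library. The code uses the standard form in which ${K \choose k}$ counts the ways to choose the red balls and ${N-K \choose n-k}$ the ways to choose the remaining non-red balls. The function name `hypergeom_pmf` and the urn sizes are assumptions chosen only for illustration. Because `math.comb(a, b)` returns 0 when `b > a`, values of $k$ outside the support contribute nothing, so summing over $0 \le k \le n$ still gives 1.

```python
from fractions import Fraction
from math import comb

def hypergeom_pmf(k, N, K, n):
    """P(X = k): k successes in n draws without replacement
    from N items, K of which are successes."""
    if k < 0 or k > n:
        return Fraction(0)
    # comb() is 0 when the lower argument exceeds the upper one,
    # so impossible values of k automatically get probability 0.
    return Fraction(comb(K, k) * comb(N - K, n - k), comb(N, n))

# Hypothetical urn: N = 20 balls, K = 7 red, sample of n = 5.
N, K, n = 20, 7, 5
probs = [hypergeom_pmf(k, N, K, n) for k in range(n + 1)]
total = sum(probs)                               # exact sum of the pmf
mean = sum(k * p for k, p in enumerate(probs))   # exact expected value
print(total, mean)
```

Using `Fraction` keeps the arithmetic exact, so the total and the mean $nK/N$ can be compared without floating-point tolerance.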

It arises in a number of contexts in probability and statistics including the analysis of 2x2 contingency tables when the margins are conditioned on, as is the case with Fisher's exact test.
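To illustrate the connection with contingency tables, here is a small sketch of a one-sided exact p-value for a 2x2 table with all margins fixed, computed as an upper hypergeometric tail. The function name `fisher_one_sided`, the cell labels `a, b, c, d`, and the example counts are assumptions made for this illustration. Only the "greater" alternative is shown, not the two-sided test.

```python
from fractions import Fraction
from math import comb

def fisher_one_sided(a, b, c, d):
    """One-sided (greater) exact p-value for the table [[a, b], [c, d]]
    with margins fixed: P(X >= a) under the hypergeometric null."""
    N = a + b + c + d   # grand total
    K = a + b           # first-row total (the "successes" in the urn)
    n = a + c           # first-column total (the sample size)
    upper = min(K, n)   # largest count the top-left cell can take
    tail = sum(comb(K, k) * comb(N - K, n - k) for k in range(a, upper + 1))
    return Fraction(tail, comb(N, n))

# Hypothetical table, chosen only for illustration.
p = fisher_one_sided(3, 1, 1, 3)
print(p, float(p))
```

For this table the tail contains the two outcomes with 3 or 4 in the top-left cell, giving $(16 + 1)/70 = 17/70$.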

Reference: Wikipedia - Hypergeometric distribution

206 questions
6
votes
1 answer

How to use hyper-geometric test

My professor wrote some things very quickly on the board and I had a very hard time interpreting what arguments are being made. I am trying to test the conclusion. I read this post but I'm still not quite grasping it. If I could be pointed in the…
Christian
  • 1,872
  • 4
  • 21
  • 28
5
votes
2 answers

How does the hypergeometric distribution sum to 1?

The hypergeometric distribution is defined for $\max(0, n+K-N)\leq k\leq \min(K,n).$ But, when we use Vandermonde's identity to prove that probabilities sum to $1$, then we use the range of $0\leq k \leq n.$ I wonder how this is justified?
Silent
  • 489
2
votes
1 answer

An inequality for a bi-modal hypergeometric distribution

Say $X$ has a hypergeometric distribution with parameters $m$, $n$ and $k$, with $k\leq n<\frac12m$. I know that $X$ has a dual mode if and only if $d=\frac{(k+1)(n+1)}{m+2}$ is integer. In that case $P(X=d)=P(X=d-1)$ equals the maximum…
Michel de Ruiter
  • 267
  • 3
  • 16
1
vote
0 answers

What distribution does a ratio of successes to failures (without replacement) follow?

Suppose you realize $n$ draws—with no replacement—from a sample of $N$ marbles, in which I know there are $K$ white marbles and $N - K$ black marbles. The probability of getting $k \leq n$ white marbles is given by a hypergeometric distribution. Now…
1
vote
0 answers

Using hypergeometric test for presence/absence of different conditions

My dataset consists of 3 conditions, with different numbers of samples in each condition (30 in condition 1, 80 in condition 2, 50 in condition 3). I have measured the presence or absence of a gene in each sample. I would like to know if the…
Elliot
  • 11
1
vote
2 answers

How can I calculate the continuous cumulative density function for a hypergeometric distribution?

I have a sample from a large population where it is possible to select either a 1 or a 5. I want to find the probability that the average in any sample is greater than 4.76, and I want to chart the probability of selecting a sample with average…
Mr. A
  • 191
  • 4
1
vote
0 answers

2x2 contingency table hyper-geometric test

I'm having trouble using the hyper-geometric test on a 2x2 contingency table. I have listed the null hypothesis below. Assuming the null hypothesis is: $$P = \frac{A}{A+B} = \frac{C}{C+D}$$ From here, how do I proceed with the hyper-geometric…
Christian
  • 1,872
  • 4
  • 21
  • 28
1
vote
0 answers

Hypergeometric dist. with three balls

We have 13 black balls, 3 red balls and 16 white balls. We want to calculate the probability for getting 16 balls where we want only the white and red balls i.e. zero black balls. We draw without replacement. Here is the rule: If we get a white…
0
votes
1 answer

Closed form solution for the expectation of the square root of a hypergeometric variate

As the title states, is there a closed form formula for the expectation of the square root of a hypergeometric variable? Edit: Closed form approximate solutions based on related distributions or expansions are also welcome.
hyper
  • 3
-1
votes
1 answer

Overlap between two lists, hypergeometric test

I used two different methods to retrieve a list of 1808 and 867 elements from a list of 3431 elements. The two lists (1808 and 867) have 683 elements in common. This shows that the two approaches produce results that highly overlap but I want to…