
Is there a standard name for the situation where a random variable follows a distribution whose parameter is itself a random variable? For example, a binomial(15, p) variable where p is distributed as beta(1,2), or a Poisson(Y) variable where Y is distributed as exponential(2).

Is this called a compound distribution, or something else?

Then my real question is: given that Y is distributed according to some pdf with parameter X (say pdf1), while X is distributed according to another distribution (say pdf2), how do I use Bayes rule $$ f_{X|Y}(x|y)=\frac{f_{Y|X}(y|x) \, f_X(x)}{f_Y(y)} \,? $$

$f_X(x)$ must just be pdf2, right?

Is $f_{Y|X}(y|x)$ just the pdf of Y (that is, pdf1) with the pdf of X substituted in place of X?

How do I work out $f_Y(y)$?

I hope it isn't asking too much for someone to tell me the general approach and also give an example of this, not necessarily one of those I mentioned above.

I have looked in several statistics books but I didn't find the answer.

Joe King
  • For an example, check out the negative binomial as a Gamma mixture of Poisson distributions. – Momo Oct 14 '12 at 22:54
  • @Momo Thanks, that seems like exactly the kind of example I am thinking of. http://en.wikipedia.org/wiki/Negative_binomial_distribution#Gamma.E2.80.93Poisson_mixture But I am still confused: In that case, what is $f_{Y|X}(y|x)$ and what is $f_Y(y)$ ? – Joe King Oct 15 '12 at 06:47
  • I'm not sure I can help you -- that's why I gave no answer; I'd rather let the people who know more about Bayesian statistics do that. In the case of the negative binomial, to me that is just a marginal distribution with, in your notation, $f(Y|X)$ being $\text{Poisson}(X)$ and $X\sim \text{Gamma}(r,p/(1-p))$, so $f(Y)=\int_X f(Y|X)f(X)\,dX$ is the negative binomial distribution. $f(Y)$ is just the marginal distribution, that's all. However, I'd rather have one of the probability wizards confirm that I'm not talking BS. – Momo Oct 15 '12 at 18:01
  • Back in the day, models like these were referred to as hierarchical Bayes. – Placidia Oct 16 '12 at 15:06
  • This is far from trivial and raises more questions than it answers. Does not the concept of a random variable depend on it being associated with a probability distribution whose moments converge to some unknown but finite values? If these values are themselves random variables, then probability distributions should be associated with those random variables too, and surely this could go on forever. I am not happy about the state of probability theory; it seems to me to be in a real mess of confused terms, where our ignorance gets cloaked in more terminology. I wish I could give a si… –  Aug 13 '13 at 02:14

3 Answers


There is nothing Bayesian (in the sense of "inverse" probability calculations) in this problem, only the law of total probability. Of course, the law of total probability requires assumptions about a priori probabilities....

Using the illustrations in the question, suppose that there are random variables $Y$ and $X$ where $Y$ has a binomial distribution $\text{Binom}(15,X)$. (Note that $X$ must take on values in $[0,1]$ only.) What this says is that, conditioned on the value of $X$, $Y$ is a binomial random variable: the conditional distribution of $Y$ given the value of $X$ is $\text{Binom}(15,X)$. Perhaps this is the name that you are looking for when you ask "Is this called a compound distribution, or ..."?

The unconditional distribution of $Y$ is, in general, not a binomial distribution; it is a mixture distribution. This is particularly visible when $X$ is a discrete random variable, because then the unconditional distribution of $Y$ is a weighted sum of the conditional distributions. For our particular example, we have that for $0 \leq n \leq 15$,
$$P\{Y = n\} = \begin{cases}\sum_i \binom{15}{n}\alpha_i^n (1-\alpha_i)^{15-n}\cdot P\{X = \alpha_i\}, & X ~\text{a discrete random variable,}\\ \int_0^1 \binom{15}{n}\alpha^n (1-\alpha)^{15-n}\cdot f_X(\alpha)\,\mathrm d\alpha, & X ~\text{a continuous random variable.} \end{cases}$$
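
To make the continuous branch concrete with the question's own example, here is a worked evaluation of that integral (nothing deeper than the formula above) for $X \sim \text{Beta}(1,2)$, whose density is $f_X(\alpha) = 2(1-\alpha)$ on $[0,1]$:
$$P\{Y = n\} = \int_0^1 \binom{15}{n}\alpha^n(1-\alpha)^{15-n}\cdot 2(1-\alpha)\,\mathrm d\alpha = 2\binom{15}{n}B(n+1,\,17-n) = 2\binom{15}{n}\frac{n!\,(16-n)!}{17!} = \frac{16-n}{136}$$
for $0 \leq n \leq 15$. This unconditional (beta-binomial) pmf decreases linearly in $n$, sums to $1$, and is clearly not a binomial pmf.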

Dilip Sarwate

The standard way to handle this is through transforms (e.g. moment or probability generating functions). One starts with the transform of the outer variable conditional on a given outcome of the inner one, then averages that expression over the distribution of the inner variable, and finally goes back from the resulting transform to the unconditional distribution. See for example page 77 in A. Gut, "An Intermediate Course in Probability".

This is a standard undergraduate problem.
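
For concreteness, here is a sketch of that route for the question's second example (writing the rate as $\Lambda$ to keep it distinct from the notation used elsewhere on this page): if $Y \mid \Lambda \sim \text{Poisson}(\Lambda)$ and $\Lambda \sim \text{Exp}(2)$, the conditional transform is the probability generating function $E[z^Y \mid \Lambda] = e^{\Lambda(z-1)}$, and averaging over $\Lambda$ gives
$$E[z^Y] = \int_0^\infty e^{\lambda(z-1)}\,2e^{-2\lambda}\,\mathrm d\lambda = \frac{2}{3-z} = \frac{2/3}{1 - z/3},$$
which one then "goes back" from by recognizing it as the generating function of a $\text{Geometric}(2/3)$ distribution on $\{0,1,2,\dots\}$, i.e. $P\{Y=n\} = \frac{2}{3}\left(\frac{1}{3}\right)^n$.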

Nick

I'm a novice at this myself, but I have gotten a lot of mileage out of Data Analysis: A Bayesian Tutorial by Devinderjit Sivia and John Skilling.

What I think you have described, however, is Bayesian parameter estimation for a parameter $p$, say the probability of a coin coming up heads. The function $f_{X|Y}$ is the distribution of that parameter.

If this is the case, you would call $f_{Y|X}$ your likelihood function, which we could take to be binomial, since our evidence would be a series of coin flips. Note that the parameter of $f_{X|Y}$ is not so much $y$ as it is the number of heads and tails thrown (i.e. $f_{X|Y}(x;\#\text{heads}, \#\text{tails})$), as this is what you need to properly parameterize your binomial.

As for $f_X$, it is our prior, which we could take to be uniform. $f_{X|Y}$ is our posterior, which ends up being a beta distribution for the case I just described.

As for $f_Y$, it doesn't really matter if you are just trying to find the best value for the parameter, because it's a normalization factor. However, if you need the distribution, then it's the integral of the likelihood times the prior over the range of the possible values of $x$, in this case $[0,1]$.
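
As a small worked case of that integral (assuming $n$ flips, $y$ heads, and the uniform prior described above):
$$f_Y(y) = \int_0^1 f_{Y|X}(y \mid x)\,f_X(x)\,\mathrm dx = \int_0^1 \binom{n}{y} x^{y}(1-x)^{n-y}\cdot 1\,\mathrm dx = \frac{1}{n+1},$$
and Bayes rule then gives
$$f_{X|Y}(x \mid y) = (n+1)\binom{n}{y} x^{y}(1-x)^{n-y},$$
which is exactly the $\text{Beta}(y+1,\,n-y+1)$ density mentioned above.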

This article on Wikipedia may help as well. http://en.wikipedia.org/wiki/Checking_whether_a_coin_is_fair
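
If it helps, here is a minimal numerical sketch of the same calculation; the 10 flips / 7 heads are made-up numbers chosen only for illustration, and it assumes scipy is available.

```python
from scipy import stats
from scipy.integrate import quad

# Hypothetical data: 10 flips, 7 heads (made-up numbers, just for illustration).
n_flips, n_heads = 10, 7

def likelihood(x):
    # f_{Y|X}(y|x): Binomial(n_flips, x) pmf evaluated at y = n_heads.
    return stats.binom.pmf(n_heads, n_flips, x)

# f_Y(y) = integral over x of likelihood * prior; the uniform prior on [0, 1]
# contributes a factor of 1, and the result is exactly 1 / (n_flips + 1).
f_y, _ = quad(likelihood, 0, 1)
print(f_y, 1 / (n_flips + 1))          # both ~ 0.0909

# The posterior from Bayes' rule matches the Beta(heads + 1, tails + 1) density.
x0 = 0.6
print(likelihood(x0) / f_y)                                     # ~ 2.36
print(stats.beta.pdf(x0, n_heads + 1, n_flips - n_heads + 1))   # ~ 2.36
```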

  • Thank you, though I am still unsure about it. Is there a general approach to this? Can you give an example? How would it work for the other example I mentioned, or for Normal(0, X) with X ~ Exp(1)? You say $f_Y$ is an integral, but what is the integrand? – Joe King Oct 14 '12 at 23:05
  • Also, why isn't $f_X$ just pdf2? – Joe King Oct 14 '12 at 23:11
  • The Wikipedia article on Coin Flipping fully derives the example I described. Have a look at that. Like I said, I'm a novice as well. As for $f_X$, the way you framed your question, it could be $pdf2$. The term you might be looking for is "conjugate prior". – Craig Wright Oct 14 '12 at 23:22
  • What I said about $pdf2$ isn't quite right. The prior distribution of $X$, $f_X$ is the distribution you think $X$ has before you have gathered evidence. Your posterior $f_{X|Y}$ is the distribution of $X$ given the evidence. In your notation $y$ is the evidence. – Craig Wright Oct 14 '12 at 23:25
  • Thanks again. I have studied Bayesian statistics a little. But the question I asked seems to me to be separate from a Bayesian approach to conditional probability. While I can see that a Bayesian approach could be adopted, I don't think it's necessary. Bayes rule is an irrefutable law of probability whereas the Bayesian approach uses Bayes rule in a particular way (posterior $\propto$ likelihood $\times$ prior). Also, I don't see where the wikipedia article fully derives the example. – Joe King Oct 15 '12 at 06:25
  • Sorry. Wrong URL. This is the correct one. http://en.wikipedia.org/wiki/Checking_whether_a_coin_is_fair – Craig Wright Oct 15 '12 at 12:44