
Okay, so, in the traditional Bernoulli urn problem we have an urn with some number N (possibly infinite) of coloured balls, each of one of k possible colours. That one I grok.

However, what if I don't actually know what k is? That is, what if I have an urn with N balls and an unknown but finite and strictly positive number of possible colours?

The main question is, in fact, what my priors should be. What's the prior that there is exactly one colour? Exactly two? At least two? How do I update on the relative frequencies of each colour? Is this problem even solvable?

My first line of thinking is to have a vector of parameters $\vec \theta \in \mathbb R^\infty$ such that the first parameter is the number of colours in the urn (let's call it $\alpha$) and the remaining ones are the relative frequencies of each colour. If $P(A=n|\vec\theta)$ is the probability that the next draw will have colour $n$ given the knowledge contained in $\vec\theta$, we'd have:

  • $\vec\theta = (\alpha, p_1, p_2, p_3, ...)$
  • $\alpha \in \mathbb N^*$
  • $\left(\sum\limits_{n=1}^\infty P(\alpha = n) \right)= 1$
  • $\left(\sum\limits_{n=1}^\infty p_n\right) = 1$
  • $\forall n > \alpha : p_n = 0$
  • $\forall n \in \mathbb N^* : P(A=n|\vec\theta) = p_n$

However, this is just wild speculation on my part. I'm mostly curious about whether this is even solvable in principle. What I'd want to know is a way to compute both the prior (objective/uninformative) and posterior distributions of $\vec\theta$ or, in other words, the distributions $P(\alpha)$, $P(p_1)$, $P(p_2)$, etc.: how to start with them and how to update them as draws come in.
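For what it's worth, here is a minimal computational sketch of one way the updating could work, under two assumptions that are purely illustrative (not the only possible choices): a geometric prior $P(\alpha = k) = (1-q)^{k-1}q$ on the number of colours, and, given $\alpha$, a symmetric Dirichlet$(1,\dots,1)$ prior on $(p_1,\dots,p_\alpha)$. With those choices, the marginal probability of having observed $m$ distinct colours with counts $c_1,\dots,c_m$ in $T$ draws works out to $\frac{\alpha!}{(\alpha-m)!}\cdot\frac{(\alpha-1)!\,\prod_i c_i!}{(\alpha+T-1)!}$, so the posterior over $\alpha$ is that quantity times the prior, renormalised. The function names and the cutoff `k_max` below are mine, not standard.

```python
import math

def log_marginal_likelihood(counts, alpha):
    """Log P(observed colour pattern | alpha) under a symmetric
    Dirichlet(1,...,1) prior on the alpha colour frequencies.
    `counts` are the counts of the m distinct colours actually seen."""
    m, T = len(counts), sum(counts)
    if alpha < m:
        return float("-inf")          # can't have seen more colours than exist
    # ways to match the m observed colours to alpha labels: alpha!/(alpha-m)!
    log_assign = math.lgamma(alpha + 1) - math.lgamma(alpha - m + 1)
    # Dirichlet-multinomial term for any one such assignment
    log_dirmult = (math.lgamma(alpha) - math.lgamma(alpha + T)
                   + sum(math.lgamma(c + 1) for c in counts))
    return log_assign + log_dirmult

def posterior_over_alpha(counts, prior_log_pmf, k_max=200):
    """Posterior P(alpha | data) over alpha = 1..k_max (k_max is a
    purely computational cutoff) for any prior pmf on alpha."""
    log_post = [prior_log_pmf(k) + log_marginal_likelihood(counts, k)
                for k in range(1, k_max + 1)]
    mx = max(log_post)
    w = [math.exp(lp - mx) for lp in log_post]
    z = sum(w)
    return [wi / z for wi in w]

# Illustrative geometric prior P(alpha = k) = (1-q)^(k-1) * q
q = 0.1
geom_log_pmf = lambda k: (k - 1) * math.log(1 - q) + math.log(q)

counts = [7, 2, 1]                    # three distinct colours seen in 10 draws
post = posterior_over_alpha(counts, geom_log_pmf)
print("P(alpha = 3 | data) =", post[2])
print("P(alpha > 3 | data) =", sum(post[3:]))
```

Given $\alpha$, the posterior on the frequencies is again Dirichlet, namely Dirichlet$(1+c_1,\dots,1+c_m,1,\dots,1)$, so for example the posterior mean frequency of an observed colour is $(c_i+1)/(\alpha+T)$ and the probability that the next draw shows a never-before-seen colour is $(\alpha-m)/(\alpha+T)$; both can then be averaged over the posterior on $\alpha$. Whether the geometric prior (or any other proper prior on $\alpha$) counts as "uninformative" is exactly what the comments below argue about.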

Red
  • "what my priors should be" -- we can't tell you your own priors. You tell us what your prior distribution over the k's are; this will depend in the exact details of the situation you're in (not just the part you describe). – Glen_b Dec 06 '13 at 05:57
  • And if the part I describe is exactly the totality of the situation I'm in? Also, how about the posteriors, how do I compute them? – Red Dec 06 '13 at 07:59
  • With all information, the priors would still be subjective - the additional information would inform your priors. – Glen_b Dec 06 '13 at 08:55
  • So there is no way to calculate an objective/uninformative prior, then? And that still doesn't answer the question about how to calculate the posterior. – Red Dec 06 '13 at 09:27
  • But how can I tell you to use an uninformative prior, unless you said you wanted one? – Glen_b Dec 06 '13 at 10:18
  • I apologise, I thought it was clear. I'll edit the question. – Red Dec 06 '13 at 13:28
  • Maybe you should not use a prior at all! Read about the Ellsberg Paradox. – whuber Dec 06 '13 at 14:55
  • Okay, so I read about the paradox and I'm not sure how it relates to this. In the paradox case I was indifferent between the two gambles; the pairs had the same expected value under MaxEnt. But even so, that doesn't really relate to a problem that is strictly about figuring out the prior and the posterior in the first place. – Red Dec 06 '13 at 17:37
  • Maybe you could tell us why you are interested in this question and where it came from; then maybe we can advance some more. It might (or might not) have some aspects in common with the problem of an unknown binomial $N$, where some Bayesian approaches have the problem that the prior info still has influence in the limit of infinite data. As presently stated the Q is very broad, so try to specialize it. – kjetil b halvorsen Aug 02 '15 at 22:00
  • One interpretation of your question leads to http://stats.stackexchange.com/questions/87494/estimating-n-in-coupon-collectors-problem as a related question. – kjetil b halvorsen Aug 02 '15 at 22:06
  • I was interested in this question for purely academic reasons, i.e. because it entertained me to think about it. – Red Aug 03 '15 at 23:02

0 Answers