
We are interested in approximating the individual probabilities of $n$ (possibly dependent) events $\{e_1,\dots, e_n\}$, but we can only estimate the probability of an intersection of any $q$ of them:

$$P(e_{k_1}\cap e_{k_2}\cap \dots \cap e_{k_q})$$

The problem's model gives us the following identity. We have an algebraic function $f:\underset{q \text{ times}}{\underbrace{[0,1]\times \dots \times [0,1]}}\rightarrow [0,1]$ where:

$$f\left(P(e_{k_1}),P(e_{k_2}),\dots, P(e_{k_q})\right) = P(e_{k_1}\cap e_{k_2}\cap \dots \cap e_{k_q}).$$

$q$ is fairly small (it's 9), but $n$ can be on the order of dozens to several thousand.

If we took enough samples, we would have a chance of solving for each $P(e_i)$ individually using algebra alone, but the number of samples and the computation time needed to do this seem huge.
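For concreteness, the algebraic inversion can be cast as a nonlinear least-squares problem. In the sketch below, the model `f` (a plain product, i.e. treating the events as independent) is only a stand-in for the problem's actual algebraic $f$; the subset choice, the small $n$ and $q$, and the noise-free "observations" are all illustrative assumptions:

```python
# Sketch: recover per-event marginals from q-wise intersection estimates
# via nonlinear least squares. f here is a placeholder (independence),
# NOT the question's actual model.
import itertools
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)
n, q = 12, 3                        # small sizes for the sketch (question has q = 9)
p_true = rng.uniform(0.2, 0.9, n)   # hypothetical true marginals

subsets = list(itertools.combinations(range(n), q))[:5 * n]  # a few q-subsets

def f(p_sub):
    # Placeholder model: probability of the intersection under independence.
    return np.prod(p_sub)

# "Measured" intersection probabilities (noise-free here).
obs = np.array([f(p_true[list(s)]) for s in subsets])

def residuals(p):
    return np.array([f(p[list(s)]) for s in subsets]) - obs

sol = least_squares(residuals, x0=np.full(n, 0.5), bounds=(0.0, 1.0))
print(np.max(np.abs(sol.x - p_true)))  # worst-case recovery error
```

Even in this toy setting the number of residuals grows with the number of subsets sampled, which hints at why the full problem feels expensive.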

Is there a good way to approach such a problem?

  • Several thousand is small computationally; maybe you need to profile and analyze your code, or consider parallel computing. – pyCthon Sep 28 '12 at 17:13

1 Answer


I'm not clear from your question what type of model you're using, but in linear modelling there is a class of problems known as "large $p$, small $n$". In the variable-selection literature, the best-known solution is probably the Lasso; see Tibshirani's Lasso page.
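As a hedged illustration of how the Lasso behaves in a "large $p$, small $n$" setting, here is a minimal sketch on synthetic data (the sizes, coefficients, and penalty `alpha` are all made-up for the example):

```python
# Sketch: Lasso on a problem with far more features than samples.
# The L1 penalty drives most coefficients exactly to zero,
# performing variable selection as a by-product of fitting.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n_samples, n_features = 40, 200          # "large p, small n"
X = rng.normal(size=(n_samples, n_features))
coef = np.zeros(n_features)
coef[:5] = [3.0, -2.0, 1.5, 4.0, -1.0]   # only 5 truly active features
y = X @ coef + 0.1 * rng.normal(size=n_samples)

model = Lasso(alpha=0.1).fit(X, y)
active = np.flatnonzero(model.coef_)
print(active)  # indices the Lasso kept; most coefficients are zeroed out
```

Whether this maps onto the original question depends on whether the intersection model can be linearized (e.g. the product model becomes linear in the log marginals).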

Sean
  • it appears the OP is doing something along the lines of inverting a function of several variables and is finding that it becomes quite difficult as the number of variables increases. – Macro Jun 30 '12 at 06:36
  • ah... if that is the case my answer is probably a delete... – Sean Jun 30 '12 at 06:44