I have observed N pairs (a_i, p_i), each corresponding to a draw from a different Bernoulli distribution. Here a_i is the observed amplitude and p_i is the observed probability of success for the i-th draw.
I would like to model the full likelihood (not just find the MLE), with a view to identifying which draws belong to the successful class, and hence the statistics of just those draws, especially any irregular uncertainty distributions.
For example, I have
a_i    p_i     p_true
10     0.2     0
100    0.9     1
11     0.1     0
99     0.93    1
12     0.25    0
I know the PMFs of the model and of the data (both are obviously Bernoulli distributions), but how do I combine them to obtain residuals that I can use to explore the joint distribution? What distribution does the joint distribution follow, and how can it be derived?
I have tried to unpack the cross-entropy between the data and the model, but I don't have a clear solution.
Is the continuous Bernoulli distribution a complete red herring?
Note that ordering doesn't matter in my example. Some other points in response to comments:
The p_i values come from an oracle; they represent the probabilities that the data points belong to a common class.
The amplitudes a_i are just weights for the corresponding Bernoulli distributions. The higher the amplitude, the more that Bernoulli draw contributes to the overall process.
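To make the setup concrete, here is a minimal sketch of one possible reading of it, not a definitive model. It assumes the draws X_i ~ Bernoulli(p_i) are independent and that the quantity of interest is the weighted sum S = sum_i a_i X_i (both are assumptions; the function name `weighted_bernoulli_pmf` is hypothetical). It enumerates all 2^N outcomes to get the exact distribution of S, which is only feasible for small N such as the five example rows.

```python
from collections import defaultdict
from itertools import product

# Example values copied from the table in the question.
a = [10, 100, 11, 99, 12]          # amplitudes (weights)
p = [0.2, 0.9, 0.1, 0.93, 0.25]    # oracle probabilities of success


def weighted_bernoulli_pmf(a, p):
    """Exact PMF of S = sum_i a_i * X_i with X_i ~ Bernoulli(p_i).

    Assumes the X_i are independent. Brute-force enumeration over all
    2^N outcomes, so only suitable for small N.
    """
    pmf = defaultdict(float)
    for outcome in product((0, 1), repeat=len(a)):
        prob = 1.0
        total = 0
        for a_i, p_i, x_i in zip(a, p, outcome):
            # Probability of this draw's outcome under its own Bernoulli.
            prob *= p_i if x_i else 1.0 - p_i
            total += a_i * x_i
        # Different outcomes can give the same sum, so accumulate.
        pmf[total] += prob
    return dict(pmf)


pmf = weighted_bernoulli_pmf(a, p)
mean = sum(s * q for s, q in pmf.items())   # should match sum(a_i * p_i)
mode = max(pmf, key=pmf.get)                # most probable value of S

for s in sorted(pmf):
    print(f"S = {s:4d}  P = {pmf[s]:.5f}")
```

Under these assumptions the most probable value of S corresponds to exactly the two high-amplitude draws succeeding, which matches the p_true column in the example.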
See also:
"Weighted" Poisson binomial distribution
Weighted sum of Bernoulli distributions
https://math.stackexchange.com/questions/3481907/sum-of-weighted-independent-bernoulli-rvs
What is the CDF of the sum of weighted Bernoulli random variables?
Please let me know if you need any more information to assist in answering. Thanks as ever!