
I constructed a Bayesian network that encodes the conditional independence structure among $N$ random variables. Each $X_i$ is a Bernoulli random variable with an associated success probability $\pi_i$ (the probabilities $\pi_i$ are the parameters of the Bayesian network).

I'm interested in the sum $S=X_1+X_2+\dots+X_N$, that is, the sum over the nodes of the network. I know that a Bayesian network lets us compute the joint probability distribution $p(X_1,X_2,X_3,\dots,X_N)$, but I don't see how to get the distribution of the sum from it.
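
To make the target precise, here is a sketch under the assumption that the network factorizes in the usual way, with $\mathrm{pa}(i)$ (notation not used elsewhere in this post) denoting the parents of $X_i$. The distribution of $S$ is the joint summed over all binary assignments whose entries add up to $k$:

$$P(S=k)=\sum_{x\in\{0,1\}^N:\ \sum_i x_i=k}\ \prod_{i=1}^{N} p\big(x_i \mid x_{\mathrm{pa}(i)}\big)$$

Written this way, the quantity is well defined by the joint; the practical difficulty is that the outer sum can have up to $2^N$ terms.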

Are there any references or ideas that deal with this kind of sum on a Bayesian network?

I found this page: "Decomposing dependent Bernoulli random variables into independent Bernoulli random variables".

Can someone tell me whether the approach on that page can be used to compute the sum $S$?

Fred
  • More specifically, I want to find the probability distribution $p(S=k)$ or its argmax. For independent Bernoulli variables, this is called the Poisson binomial distribution. However, when there are conditional dependences and independences among the variables (modeled by the Bayesian network), I don't know how to obtain this distribution. – Fred Dec 29 '16 at 18:34
  • Why don't you just create a new deterministic node $S=\sum_i X_i$ in your network? Then you can sample from it using MCMC, Gibbs, etc., or even do exact inference on it. – jkt Dec 30 '16 at 00:10
  • Thanks YBE for your answer, but for now I don't see exactly what you mean. – Fred Dec 30 '16 at 02:07
  • Thanks YBE for your answer. Sorry, I couldn't work out the details of how your suggestion works. Please note that I need to estimate/predict the value of S before observing its actual value in the experiment. That is, I will not wait for the results of the trials in order to compute S; I want to quantify S given the success probabilities on the Bayesian network. – Fred Dec 30 '16 at 02:15
  • You have a Bayes net with variables X1,..., XN, right? What I am saying is: introduce an additional node to your Bayes net, namely the S you defined in your question. Then, using whatever inference routine you are utilizing, compute the joint posterior p(X1,...,XN,S). Then you can marginalize if you just want p(S). – jkt Dec 30 '16 at 02:23
  • I will try to do it. Do you have some references or links that provide more details? I haven't used a deterministic node before to compute the sum of the other nodes in a Bayes network. – Fred Dec 30 '16 at 14:06
  • I am looking for a good basic to intermediate paper that uses the solution proposed by YBE. I have a computer science background and basic Bayesian knowledge. – Fred Dec 30 '16 at 16:28
  • Do you want to do things by hand? Any probabilistic programming language will handle such things automatically. – jkt Dec 30 '16 at 18:57
  • Sorry YBE. These may be trivial things for you, but I'm very new to the area, which is why I'm not familiar with everything you have said. – Fred Dec 30 '16 at 19:16
  • No worries. Is this a homework question where you are expected to do the computations by hand, or is this a project where you would like to do inference by computer since the model is big? I am just trying to understand your requirements. – jkt Dec 30 '16 at 19:32
  • This is a project. I would like to do inference via computer (software), the model is quite big. – Fred Dec 30 '16 at 19:42
  • A probabilistic programming language (PPL) is a high-level programming language in which you define your probabilistic model and use available black-box inference engines such as MCMC, Gibbs, variational inference, etc. Some well-known options are BUGS, STAN, Church, and Anglican. – jkt Dec 30 '16 at 20:12
  • OK. Which free software for Windows that uses a PPL do you recommend? – Fred Dec 30 '16 at 20:26
  • STAN seems the best bet. – jkt Dec 30 '16 at 21:12
  • OK, thanks YBE, I installed STAN with R. Now, to construct the graph model, I plan to use Bayesian network constraint-based structure learning algorithms (like PC and IC) with some modifications. Is the PC algorithm already implemented in STAN? – Fred Dec 31 '16 at 03:49
  • However, I still don't see how to get the sum S = X1+X2+...+XN. In fact, using your suggestion means S will be a node in the graph:
    1. What are the parents of S? If I suppose its parents are all the Xi, do I need to define relations among the Xi given S?
    2. Each Xi is a Bernoulli random variable, and its parameter pi represents the conditional probability distribution of that node (random variable). However, I need the binary value of Xi, not its parameter pi.

    ==> I think a deterministic node is not enough to compute S. I think there should be some way to quantify the dependence among the Xi random variables.

    – Fred Dec 31 '16 at 15:43
  • In your question you said: "I constructed a Bayesian network". Ideally that network encodes all the dependence/independence information over the Xi's. I am just telling you to add a node variable S whose parents are all the Xi's, with the conditional density $p(S \mid X_1,\dots,X_N)=\delta(S-(X_1+\dots+X_N))$. You do not need to define relations among the Xi given S; that's the job of the inference engine. You define a forward generative model respecting the causality and then do inference on it. A Bernoulli variable takes values 0 or 1 in your model. – jkt Dec 31 '16 at 19:01
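
The suggestion in the comments (treat S as a deterministic function of all the Xi, then marginalize) can be sketched in plain Python for a toy network. Everything below is an assumption made for illustration: the three-node structure, the conditional probability numbers, and the names `parents`, `cpt`, `order`, `joint_prob`, `exact_sum_distribution` and `sampled_sum_distribution` are hypothetical and do not come from the thread. The exact version enumerates all 2^N assignments, so it is only feasible for small N; the sampling version draws each Xi given its parents (ancestral sampling) and simply adds up the 0/1 values.

```python
import itertools
import random
from collections import Counter

# Hypothetical toy network: X1 -> X2 and X1 -> X3 (illustrative numbers only).
parents = {"X1": (), "X2": ("X1",), "X3": ("X1",)}
# cpt[node][parent values] = P(node = 1 | parents)
cpt = {
    "X1": {(): 0.3},
    "X2": {(0,): 0.2, (1,): 0.9},
    "X3": {(0,): 0.5, (1,): 0.7},
}
# Topological order: every node appears after its parents.
order = ["X1", "X2", "X3"]


def joint_prob(x):
    """P(X = x) as the product of the local conditional probabilities."""
    p = 1.0
    for node in order:
        p_one = cpt[node][tuple(x[q] for q in parents[node])]
        p *= p_one if x[node] == 1 else 1.0 - p_one
    return p


def exact_sum_distribution():
    """P(S = k) for k = 0..N by enumerating all 2^N assignments."""
    dist = [0.0] * (len(order) + 1)
    for values in itertools.product((0, 1), repeat=len(order)):
        x = dict(zip(order, values))
        # S is deterministic given x, so all of the mass goes to k = sum(x).
        dist[sum(values)] += joint_prob(x)
    return dist


def sampled_sum_distribution(n_samples=100_000, seed=0):
    """Monte Carlo estimate of P(S = k) via ancestral sampling."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        x = {}
        for node in order:
            p_one = cpt[node][tuple(x[q] for q in parents[node])]
            x[node] = 1 if rng.random() < p_one else 0
        counts[sum(x.values())] += 1
    return [counts[k] / n_samples for k in range(len(order) + 1)]


exact = exact_sum_distribution()
approx = sampled_sum_distribution()
# Most probable value of S under the exact distribution.
mode = max(range(len(exact)), key=exact.__getitem__)
```

No relations among the Xi given S are specified anywhere: the dependence between the Xi enters only through the conditional probability tables, and S is computed from the sampled or enumerated binary values, not from the parameters. For a large network, the sampling function (or an inference engine in a PPL) would be the realistic route, since the enumeration grows as 2^N.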
