Distribution estimation

Question

What would be a good strategy for estimating the joint distribution of a bunch of measurements?

For example, if I had drawn from a 2D Gaussian, I would be given vectors like:

[[ 3.30598028  4.42541811]  
 [ 2.53505053  1.29456389]  
 [ 0.66794753 -0.36475196]  
 ...,                       
 [ 2.54780722 -0.19608394]  
 [ 2.99712014  4.57796175]  
 [ 3.07760632  3.35089218]]

and I would somehow like to figure out that this somewhat looks like a 2D Gaussian. Except in my case the vectors are actually 16 bytes and I don't know what the bits are. I am hoping to gain some insight into the format by looking at the joint distribution.

When I say I have vectors of 16 bytes, I mean that each observation consists of 128 values, each of which is either 0 or 1. The distribution is not from a specific family.
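For concreteness, one way to get from 16-byte records to a 0/1 matrix is sketched below. It assumes NumPy is available; the function name `records_to_bits` and the two sample records are hypothetical, and the bit order within each byte (most significant bit first) is an assumption, since the real format is unknown.

```python
import numpy as np

def records_to_bits(records):
    """Turn an iterable of 16-byte records into an (n, 128) array of 0/1 values.

    Assumption: bits are numbered most-significant-bit first within each byte.
    """
    raw = np.frombuffer(b"".join(records), dtype=np.uint8).reshape(-1, 16)
    # unpackbits expands every byte into 8 separate 0/1 entries
    return np.unpackbits(raw, axis=1)

# Hypothetical sample data: two made-up 16-byte records
records = [bytes(range(16)), b"\xff" * 16]
bits = records_to_bits(records)
print(bits.shape)
```

Each row of `bits` is then one observation of the 128 binary components.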

Essentially I am looking for things like

"the second bit is always the same as the fourth" or
"every eighth bit is always zero (maybe this is ASCII encoded)" or
"the third bit is always the XOR of the first two bits"
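The three kinds of relation above can at least be checked by brute force. The following is a rough sketch under stated assumptions, not a full answer to the estimation problem: it assumes the data is an (n, 128) NumPy array of 0/1 values, the function name `find_bit_relations` and the synthetic demo data are hypothetical, and indices are 0-based (so "the second bit" is index 1). It only reports relations that hold exactly on the observed sample, so with few observations some of them may be coincidental.

```python
import itertools
import numpy as np

def find_bit_relations(bits):
    """Report exact constant, equal, complementary and XOR relations between columns."""
    bits = np.asarray(bits, dtype=np.uint8)
    n, d = bits.shape
    col_sum = bits.sum(axis=0)
    # Columns that never change: all zeros (sum 0) or all ones (sum n)
    constant = {j: int(bits[0, j]) for j in range(d) if col_sum[j] in (0, n)}
    free = [j for j in range(d) if j not in constant]

    equal, complement = [], []
    for i, j in itertools.combinations(free, 2):
        diff = bits[:, i] ^ bits[:, j]
        if not diff.any():
            equal.append((i, j))        # column i always equals column j
        elif diff.all():
            complement.append((i, j))   # column i is always the negation of column j

    # Triples whose XOR is always zero, i.e. any one bit is the XOR of the other two
    xor_triples = [
        (i, j, k)
        for i, j, k in itertools.combinations(free, 3)
        if not (bits[:, i] ^ bits[:, j] ^ bits[:, k]).any()
    ]
    return {"constant": constant, "equal": equal,
            "complement": complement, "xor": xor_triples}

# Synthetic demo data (made up): 8 bits per row with planted structure
rng = np.random.default_rng(0)
demo = rng.integers(0, 2, size=(200, 8), dtype=np.uint8)
demo[:, 3] = demo[:, 1]               # bit 3 copies bit 1
demo[:, 2] = demo[:, 0] ^ demo[:, 1]  # bit 2 is the XOR of bits 0 and 1
demo[:, 7] = 0                        # bit 7 is always zero
found = find_bit_relations(demo)
print(found)
```

For 128 columns the triple loop visits about 340,000 combinations, which is still feasible. Relations involving more bits, or relations that hold only approximately, would need a different approach.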

How do I best approach this?

I think we are going to need more detail here. Do you have a family of distributions to select from? Do you mean the vectors are of length 16 and the contents lie in [0, 255], or something else? — mdewey, Nov 09 '17 at 15:33
It's still confusing--there seems to be little connection between your discussion of 2D Gaussian distributions and your description of "vectors" and "bytes." Are you trying to say that each observation is a vector of 128 binary (0/1) values and that you're trying to estimate the joint distribution of all 128 components? — whuber, Nov 09 '17 at 15:52
@whuber Yes, I want the joint distribution of all 128 components. Sorry about the confusion; I just wanted to add an example of what I am roughly trying to do. — Elias, Nov 09 '17 at 16:01
You really need to give us much more context! To estimate a 128-dim joint distribution is a very ambitious goal, do you really need it? What is your ultimate goal, which question do you ask from the data? — kjetil b halvorsen, Jul 08 '18 at 10:41
