
What would be a good strategy for estimating the joint distribution of a bunch of measurements?

For example, if I had drawn samples from a 2D Gaussian, I would have vectors like these:

[[ 3.30598028  4.42541811]  
 [ 2.53505053  1.29456389]  
 [ 0.66794753 -0.36475196]  
 ...,                       
 [ 2.54780722 -0.19608394]  
 [ 2.99712014  4.57796175]  
 [ 3.07760632  3.35089218]] 

and I would somehow like to figure out that this somewhat looks like a 2D Gaussian. Except in my case the vectors are actually 16 bytes and I don't know what the bits are. I am hoping to gain some insight into the format by looking at the joint distribution.

When I say I have vectors of 16 bytes, I mean that each observation consists of 128 values, each of which is either 0 or 1. The distribution is not from a specific family.
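To make the representation concrete, here is a minimal sketch, assuming the records are available as raw 16-byte strings and that NumPy is used. The function name `records_to_bits`, the example records, and the most-significant-bit-first order are assumptions for illustration; the real bit order of the format is unknown.

```python
import numpy as np

def records_to_bits(records):
    """Turn an iterable of 16-byte records into an (n, 128) array of 0/1 values."""
    raw = np.frombuffer(b"".join(records), dtype=np.uint8).reshape(-1, 16)
    # np.unpackbits expands each byte into 8 bits,
    # most significant bit first by default (an assumed convention here).
    return np.unpackbits(raw, axis=1)

# Hypothetical example data: two made-up records.
example = [bytes(range(16)), bytes([0xFF] * 16)]
bits = records_to_bits(example)
print(bits.shape)
```

Each row of `bits` is then one observation of the 128 binary components.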

Essentially I am looking for things like

  • "the second bit is always the same as the fourth" or
  • "every eighth bit is always zero (maybe this is ASCII encoded)" or
  • "the third bit is always the xor of the first two bits"
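The three kinds of patterns above can at least be checked by brute force. The following is a minimal sketch, assuming the observations are stored as a NumPy array of shape (n, k) with 0/1 entries; the function name `find_simple_relations` and the synthetic data are hypothetical. A relation that holds on every observed row is only a candidate, since with few samples it could hold by chance.

```python
import numpy as np
from itertools import combinations

def find_simple_relations(bits):
    """Report constant bits, equal pairs, and XOR triples that hold on every row."""
    bits = np.asarray(bits, dtype=np.uint8)
    n, k = bits.shape
    # Bits that take the same value in every observation.
    constant = [(j, int(bits[0, j])) for j in range(k)
                if np.all(bits[:, j] == bits[0, j])]
    const_idx = {j for j, _ in constant}
    free = [j for j in range(k) if j not in const_idx]
    # Pairs of non-constant bits that always agree.
    equal = [(i, j) for i, j in combinations(free, 2)
             if np.array_equal(bits[:, i], bits[:, j])]
    # Triples whose XOR is always 0, i.e. any one is the XOR of the other two.
    xor = [(a, b, c) for a, b, c in combinations(free, 3)
           if not np.any(bits[:, a] ^ bits[:, b] ^ bits[:, c])]
    return constant, equal, xor

# Synthetic data with the three patterns planted (0-based indices).
rng = np.random.default_rng(0)
data = rng.integers(0, 2, size=(500, 8), dtype=np.uint8)
data[:, 3] = data[:, 1]               # fourth bit equals second bit
data[:, 7] = 0                        # eighth bit is always zero
data[:, 2] = data[:, 0] ^ data[:, 1]  # third bit is XOR of the first two

constant, equal, xor = find_simple_relations(data)
print(constant, equal, xor)
```

For 128 bits the triple search covers about 340,000 combinations, which is still feasible; relations involving more bits would need a different approach, such as solving for linear dependencies over GF(2).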

How do I best approach this?

Elias
  • I think we are going to need more detail here. Do you have a family of distributions to select from? Do you mean the vectors are of length 16 and the contents lie in [0, 255], or something else? – mdewey Nov 09 '17 at 15:33
  • @mdewey I edited the question a bit. Does that help? – Elias Nov 09 '17 at 15:45
  • It's still confusing--there seems to be little connection between your discussion of 2D Gaussian distributions and your description of "vectors" and "bytes." Are you trying to say that each observation is a vector of 128 binary (0/1) values and that you're trying to estimate the joint distribution of all 128 components? – whuber Nov 09 '17 at 15:52
  • @whuber Yes, I want the joint distribution of all 128 components. Sorry about the confusion; I just wanted to add an example of roughly what I am trying to do. – Elias Nov 09 '17 at 16:01
  • You really need to give us much more context! Estimating a 128-dimensional joint distribution is a very ambitious goal. Do you really need it? What is your ultimate goal, and which question are you asking of the data? – kjetil b halvorsen Jul 08 '18 at 10:41
