I have two finite-sampled signals, $x_1$ and $x_2$, and I want to check for statistical independence.
I know that for two statistically independent signals, their joint probability distribution is a product of the two marginal distributions.
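Written out, with $p_{12}$ denoting the joint density and $p_1$, $p_2$ the marginals (notation introduced here only for illustration), the condition reads

$$p_{12}(a, b) = p_1(a)\, p_2(b) \quad \text{for all } a, b.$$

For binned data with $N$ samples, the analogous statement would be that the joint relative frequency of cell $(i, j)$ is approximately the product of the two marginal relative frequencies:

$$\frac{n_3(i, j)}{N} \approx \frac{n_1(i)}{N} \cdot \frac{n_2(j)}{N}.$$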
I have been advised to use histograms in order to approximate the distributions. Here's a small example.
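    x1 = rand(1, 50);        % 50 uniform samples
    x2 = randn(1, 50);       % 50 standard normal samples
    n1 = hist(x1);           % marginal counts of x1 (10 bins by default)
    n2 = hist(x2);           % marginal counts of x2 (10 bins by default)
    n3 = hist3([x1' x2']);   % joint counts on a 10x10 grid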
Since I am using the default number of bins, n1 and n2 are 10-element vectors, and n3 is a 10x10 matrix.
My question is this: How do I check whether n3 is in fact a product of n1 and n2?
Do I use an outer product? And if I do, should I use x1'*x2 or x1*x2'? And why?
Also, I have noticed that hist returns the number of elements (frequency) in each bin. Should this be normalized in any way? (I haven't exactly understood how hist3 works either.)
Thank you very much for your help. I'm really new to statistics so some explanatory answers would really help.
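The outer-product and normalization questions above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the Statistics Toolbox `hist3` with its default 10x10 grid, and it takes the marginals from the joint table itself so that all three histograms are guaranteed to share the same bins. The names `p1`, `p2`, `p3`, `pIndep` and `maxAbsDiff` are introduced here for illustration only.

```matlab
rng(0);                       % fix the seed so the run is repeatable
x1 = rand(1, 50);
x2 = randn(1, 50);

n3 = hist3([x1' x2']);        % 10x10 counts: rows follow x1 bins, columns follow x2 bins
N  = sum(n3(:));              % total number of samples (50 here)

p3 = n3 / N;                  % joint relative frequencies, sum to 1
p1 = sum(p3, 2);              % 10x1 marginal of x1 (summed over x2 bins)
p2 = sum(p3, 1);              % 1x10 marginal of x2 (summed over x1 bins)

pIndep = p1 * p2;             % outer product: pIndep(i,j) = p1(i) * p2(j)

% Largest cell-wise gap between the observed joint table and the product form
maxAbsDiff = max(abs(p3(:) - pIndep(:)));
fprintf('largest deviation from the product form: %.4f\n', maxAbsDiff);
```

The order matters because a column times a row gives a matrix (the outer product), while a row times a column collapses to a single number (the inner product). The product is taken over the normalized counts rather than the raw samples `x1` and `x2`; dividing by `N` turns frequencies into relative frequencies so that both sides are on the same scale. With only 50 samples spread over 100 cells the two tables will never match exactly, even for independent data, so the deviation needs a statistical test rather than a check for equality.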
cor.test() in R will give an appropriate test; I'm sure there are Matlab commands to do the same. For the plot, one simple approach is to plot several histograms of $X_1$, each using only data where $X_2$ lies in some specified range. If those histograms look different, this suggests dependence. Alternatively, just scatterplot $X_1$ vs $X_2$ and look for a trend; increasing or decreasing ones are easiest to spot. – guest Mar 12 '12 at 00:50
chisq.test() in R) of independence; the null hypothesis being tested is that the joint distribution of the cell counts in your 2-dimensional contingency table is the product of the row and column marginals. – guest Mar 12 '12 at 16:48
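A chi-squared test of independence like the one mentioned in the comments could be sketched in MATLAB along these lines. It is only a sketch under stated assumptions: the sample size (500) and the coarse 4x4 grid are choices made here so that the expected cell counts are not tiny, and `chi2cdf` requires the Statistics Toolbox. The variable names `O`, `E`, `chi2stat`, `dof` and `pValue` are illustrative.

```matlab
rng(0);                                % fix the seed so the run is repeatable
x1 = rand(1, 500);                     % more samples than the original 50 (assumption)
x2 = randn(1, 500);

O = hist3([x1' x2'], [4 4]);           % observed counts in a coarse 4x4 contingency table
O = O(sum(O, 2) > 0, sum(O, 1) > 0);   % drop empty rows/columns, if any

N = sum(O(:));
E = sum(O, 2) * sum(O, 1) / N;         % expected counts under independence (outer product)

chi2stat = sum((O(:) - E(:)).^2 ./ E(:));          % Pearson chi-squared statistic
dof      = (size(O, 1) - 1) * (size(O, 2) - 1);    % degrees of freedom
pValue   = 1 - chi2cdf(chi2stat, dof);             % chi2cdf: Statistics Toolbox

fprintf('chi2 = %.3f, dof = %d, p = %.3f\n', chi2stat, dof, pValue);
```

A small p-value would be evidence against independence; a large one only means the test found no evidence of dependence at this bin resolution. For a test of linear correlation alone, `[R, P] = corrcoef(x1, x2)` returns both the correlation matrix and the associated p-values.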