I have two distinct sets of ~1500 intervals each, with differing interval lengths - let’s call them Interval Set 1 and Interval Set 2.
I have another set of data that contains values that fall within each of the sets of intervals - I’ll call it Data Values.
I want to compare how the Data Values are distributed across Interval Set 1 and Interval Set 2 - are the values generally higher or lower at the beginning or end of the intervals? Is there a difference in this distribution between Interval Set 1 and Interval Set 2?
To approach this, I divided each interval into 100 equal-width bins (% Interval). I summed the Data Values that fall within each bin for each interval set, generating two matrices of [100 bin values x ~1500 intervals]. I then averaged each bin across the intervals to get the average Data Value per bin [100 averages] and plotted this against % Interval for each interval set.
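For concreteness, the binning step above can be sketched roughly as follows. This is a minimal illustration, not my exact code: it assumes each interval is a (start, end) pair and each Data Value has a single position, and the names `bin_sums`, `starts`, `ends`, `positions` and `values` are made up for the example.

```python
import numpy as np

def bin_sums(starts, ends, positions, values, n_bins=100):
    """Return an (n_bins, n_intervals) matrix of summed values per % Interval bin.

    Assumes half-open intervals [start, end) and one position per data value.
    """
    starts = np.asarray(starts, dtype=float)
    ends = np.asarray(ends, dtype=float)
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)

    out = np.zeros((n_bins, len(starts)))
    for j, (s, e) in enumerate(zip(starts, ends)):
        inside = (positions >= s) & (positions < e)
        # Relative position within the interval, in [0, 1)
        frac = (positions[inside] - s) / (e - s)
        # Map relative position to a bin index 0..n_bins-1
        idx = np.minimum((frac * n_bins).astype(int), n_bins - 1)
        # Sum the values landing in each bin; empty bins stay at 0
        out[:, j] = np.bincount(idx, weights=values[inside], minlength=n_bins)
    return out

def average_profile(bin_matrix):
    """Average each bin across intervals, giving one value per % Interval bin."""
    return np.asarray(bin_matrix, dtype=float).mean(axis=1)
```

Calling `average_profile(bin_sums(...))` once per interval set gives the two 100-point curves that get plotted against % Interval.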
This approach has shown me that the Data Values are not evenly distributed along the intervals, and it appears that this distribution differs between Interval Set 1 and Interval Set 2.
However, I am having trouble directly comparing the results, primarily because the distribution of interval lengths is different between Interval Set 1 and Interval Set 2. When plotted on the same graph, the Interval Set 1 averages are always much lower than those for Interval Set 2, but I suspect this is simply because of the difference in interval length.
Interval Set 1 lengths have a mean of ~4000
Interval Set 2 lengths have a mean of ~10000
So, in general, more Data Values fall inside each % Interval bin for Interval Set 2, because those bins are wider. There are definitely some bins, particularly for Interval Set 1, that contain no Data Values at all, so the bin for that interval ends up with a value of 0. I suspect that I am getting lower averages for Interval Set 1 simply because more of its bins end up with 0 values.
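A quick way to check this suspicion might look like the sketch below. It assumes the bin sums are stored as a [bins x intervals] matrix as described above, plus a vector of interval lengths. The helper names `zero_bin_fraction` and `per_unit_length` are hypothetical, and dividing by bin width is only one possible way of putting the two sets on a common scale, not something I have validated.

```python
import numpy as np

def zero_bin_fraction(bin_matrix):
    """Fraction of (bin, interval) cells whose summed value is exactly 0.

    Note: this counts empty bins and bins whose values happen to sum to 0 alike.
    """
    bin_matrix = np.asarray(bin_matrix, dtype=float)
    return float(np.mean(bin_matrix == 0))

def per_unit_length(bin_matrix, lengths, ):
    """Divide each interval's bin sums by that interval's bin width.

    bin_matrix has shape (n_bins, n_intervals); lengths has one entry per interval.
    """
    bin_matrix = np.asarray(bin_matrix, dtype=float)
    lengths = np.asarray(lengths, dtype=float)
    n_bins = bin_matrix.shape[0]
    bin_widths = lengths / n_bins      # one bin width per interval (column)
    return bin_matrix / bin_widths     # broadcasts the widths across all rows
```

If `zero_bin_fraction` is much larger for Interval Set 1 than for Interval Set 2, that would be consistent with the empty bins pulling its averages down. Averaging the output of `per_unit_length` across intervals gives a value per unit of interval length rather than per bin, which should be less sensitive to the ~4000 vs ~10000 difference in mean length.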
Does anyone know of a better way that I can quantitatively compare the Data Values distributions between my two Interval sets? Do I just need to use smaller bins?