
I need a statistic or metric that identifies which vector changes the most between consecutive values. In the example below I would like to pick vector b, because each of its values differs from its neighbours. The values are not necessarily binary; they may lie anywhere in the 0-255 range:

a = [0,0,0,0,0,1,1,1,1,1]
b = [1,0,1,0,1,0,1,0,1,0]

As you can see, both vectors have the same number of values and the same range, but a different ordering. I have applied the variance and the coefficient of variation, but they give the same results for both vectors.

import numpy as np
print('Var:', np.var(a), 'CV:', np.std(a) / np.mean(a))
print('Var:', np.var(b), 'CV:', np.std(b) / np.mean(b))
Var: 0.25 CV: 1.0
Var: 0.25 CV: 1.0

What other statistic or measure can I try? Any other suggestions?

1 Answer


Here is a suggested metric: the mean of the absolute value of the "diff" of the vector, where diff is the vector of n-1 elements formed by taking the differences of successive elements of the original n-element vector.

With your example data, mean(abs(diff(a))) = 1/9 ≈ 0.11111 (a changes only once across its 9 successive differences) and mean(abs(diff(b))) = 1.

You could instead consider some other function of the diff if this is not exactly what you want.
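As a rough illustration of the metric described above, here is a minimal sketch in plain Python. The function name mean_abs_diff is made up for this example and is not from any library; with NumPy the same quantity should be obtainable as np.mean(np.abs(np.diff(v))).

```python
def mean_abs_diff(values):
    """Mean absolute difference between consecutive elements.

    Hypothetical helper name, chosen for illustration only.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    # Pair each element with its successor and take the absolute step size.
    steps = [abs(later - earlier) for earlier, later in zip(values, values[1:])]
    return sum(steps) / len(steps)


a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
b = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

print(mean_abs_diff(a))  # 1/9, roughly 0.111
print(mean_abs_diff(b))  # 1.0
```

Unlike the variance, this depends on the order of the values, so it separates a from b even though both contain five zeros and five ones.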

Mark L. Stone
  • A more general form is "norm of the difference". Absolute value is one norm, but the L2-norm, all even positive L-norms, and the L-infinity norm are also acceptable. – EngrStudent Jan 19 '16 at 01:06
  • @EngrStudent, agreed. I thought of describing it as such, but decided to keep it non-technical, and indicate that variations were possible. I suppose I should have, so thanks for the comment. – Mark L. Stone Jan 19 '16 at 01:08
  • There is great power in being able to describe it so your grandma would understand it. My grandma is smart, but has no idea what an L-norm is. – EngrStudent Jan 19 '16 at 12:13
  • @MarkL.Stone Thank you. Good solution, but I see that it is affected by large differences between consecutive values, so I will modify it slightly: count the positions where consecutive values differ, then divide that count by the vector length. – Jaime Lopez Jan 19 '16 at 23:03
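
The two variations raised in the comments can be sketched in the same way. This is only one possible reading of them: diff_norm and change_fraction are hypothetical names, and dividing the count of changes by the full vector length (rather than by n-1) follows the wording of the last comment.

```python
import math


def successive_diffs(values):
    """Differences between each element and its predecessor (n-1 values)."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


def diff_norm(values, p=2):
    """L-p norm of the successive differences (hypothetical helper).

    p=1 is the sum of absolute differences, p=2 the Euclidean norm,
    and p=math.inf the largest single jump.
    """
    magnitudes = [abs(d) for d in successive_diffs(values)]
    if p == math.inf:
        return max(magnitudes)
    return sum(m ** p for m in magnitudes) ** (1.0 / p)


def change_fraction(values):
    """Count of positions whose value differs from the previous one,
    divided by the vector length (assumption: length n, not n-1)."""
    changes = sum(1 for d in successive_diffs(values) if d != 0)
    return changes / len(values)


a = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
b = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]

print(diff_norm(a, 2), diff_norm(b, 2))        # 1.0 3.0
print(change_fraction(a), change_fraction(b))  # 0.1 0.9
```

Because change_fraction only checks whether a change occurred, a single jump from 0 to 255 counts the same as a jump from 0 to 1, which is the insensitivity to large differences that the last comment asks for. The norm-based versions keep the magnitude information instead.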