I don't know exactly how to describe what I'm looking for, so I will try to give some examples. Let's take three different data series:

  • Series A: 1,2,3,4,5,6,7,8,9,10,9,8,7,6,5,4,3,2,1
  • Series B: 1,2,1,2,1,2,1,2,1,2,1,2,1,2,1
  • Series C: 1,2,3,2,1,2,3,2,1,2,3,2,1

The change from point to point is:

  • Series A: +1,+1,+1,+1,+1,+1,+1,+1,...,-1,-1,-1,-1,-1,-1,-1,-1...
  • Series B: +1,-1,+1,-1,+1,-1,...
  • Series C: +1,+1,-1,-1,+1,+1,...

Or, simplified to binary, with 1 for +1 and 0 for -1:

  • Series A: 11111111111111...00000000000...
  • Series B: 10101010101010...
  • Series C: 11001100110011...
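
The two steps above (point-to-point changes, then a binary encoding) can be sketched in Python as follows. This is only an illustration: the function names `changes` and `to_binary` are made up for this example, and the encoding assumes that no two neighbouring values are equal, which holds for the three example series.

```python
def changes(series):
    """Point-to-point differences of a numeric sequence."""
    return [b - a for a, b in zip(series, series[1:])]


def to_binary(diffs):
    """Encode each change as '1' (increase) or '0' (decrease).

    Assumption: no change is exactly zero, as in the example series.
    """
    return "".join("1" if d > 0 else "0" for d in diffs)


# The three example series from the question.
series_a = list(range(1, 11)) + list(range(9, 0, -1))
series_b = [1, 2] * 7 + [1]
series_c = [1, 2, 3, 2] * 3 + [1]

for name, series in (("A", series_a), ("B", series_b), ("C", series_c)):
    print("Series", name, to_binary(changes(series)))
# Series A 111111111000000000
# Series B 10101010101010
# Series C 110011001100
```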

I'm looking for a function that returns

  • the highest value for Series A (each change is the same as the previous one)
  • the lowest value for Series B (each change differs from the previous one)
  • something in between for Series C (the change is sometimes the same as the previous one, sometimes different)
Emre
  • So by "highest value" you mean the longest run of increments in the sequence? Do you have a particular programming language in mind? – MånsT May 11 '12 at 11:06
  • The return value should ideally be between 0 and 1, so the highest value would be 1 if the change is always in the same direction. My Series A above would return something close to 1, say 0.97, since it changes from +1 to -1 partway through. As for the programming language, I am fluent in Java, but I have also worked with Matlab in the past. For quick tests I use Excel/VBA or LibreOffice Calc. – Jens Roth May 11 '12 at 11:32
  • A real life example for the data would be temperature during seasons (almost steadily increasing during spring/summer and almost steadily decreasing during fall/winter). The opposite example might be stock market data or just random data. Maybe there is also a standard statistics function available which I don't know of. – Jens Roth May 11 '12 at 15:42

2 Answers

I would start with the autocorrelation of the +1/-1 sequence with a lag of 1. It has a range of -1 to 1, but you can easily transform it to the range 0 to 1, for example with (r + 1)/2. Here is a quick example in R:

(note: head(x,-1) drops the last value, tail(x,-1) drops the first)

> x1 <- c(1,1,1,1,1,1,-1,-1,-1,-1,-1)
> x2 <- c(1,-1,1,-1,1,-1,1,-1,1,-1,1)
> x3 <- c(1,1,-1,1,1,-1,1,1,-1,1,1)
> cor(head(x1,-1), tail(x1,-1))
[1] 0.8164966
> cor(head(x2,-1), tail(x2,-1))
[1] -1
> cor(head(x3,-1), tail(x3,-1))
[1] -0.4285714
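
For readers without R, the same lag-1 calculation can be sketched in plain Python. The helper names `pearson`, `lag1_autocorrelation` and `to_unit_interval` are made up for this illustration, and the (r + 1) / 2 rescaling is just one possible way to map the result onto 0 to 1. Note that the correlation is undefined for a sequence that never changes sign, because its variance is zero.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either sequence is constant.
    return sxy / sqrt(sxx * syy)


def lag1_autocorrelation(signs):
    """Correlate the sequence without its last value against
    the sequence without its first value (lag of 1)."""
    return pearson(signs[:-1], signs[1:])


def to_unit_interval(r):
    """Assumed rescaling from [-1, 1] to [0, 1]."""
    return (r + 1) / 2


# Same +1/-1 sequences as in the R example.
x1 = [1, 1, 1, 1, 1, 1, -1, -1, -1, -1, -1]
x2 = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1, 1]
x3 = [1, 1, -1, 1, 1, -1, 1, 1, -1, 1, 1]

for x in (x1, x2, x3):
    r = lag1_autocorrelation(x)
    print(round(r, 7), round(to_unit_interval(r), 7))
```

The first column should agree with the R output above; the second column is the rescaled value.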
Aniko
  • Wow, autocorrelation it is :-) The range of -1 to 1 is just perfect. I just installed R and will play with my real data over the weekend to see whether the results are what I expected. Thanks Aniko! (Oops, I just saw I need a reputation of 15 to vote up) – Jens Roth May 11 '12 at 22:12

Judging from your answer to my comment, it seems that you're looking for a function that gives you the proportion of steps in which the direction of change stays the same as in the previous step. In pseudocode, with mySequence being a vector of 0's and 1's, that could look like

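count = 0
index = 2                      // the sequence is indexed from 1
while (index <= length(mySequence))
{
   // count the positions whose value repeats the previous one
   if mySequence[index] == mySequence[index-1] then count = count + 1
   index = index + 1
}
return count / length(mySequence)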

The result is 0 if every value of mySequence differs from the previous one (as in Series B) and close to 1 if the value never changes, that is, if the original series is monotone.
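
As a rough check, the pseudocode above can be translated to Python and applied to the binary encodings of the three series from the question. The function name `same_direction_fraction` is made up for this sketch. It divides by the sequence length, as the pseudocode does, so the largest possible result is (n - 1)/n rather than exactly 1, and it assumes a non-empty sequence.

```python
def same_direction_fraction(bits):
    """Count positions whose value equals the previous one,
    divided by the sequence length (as in the pseudocode).

    Assumption: bits is non-empty.
    """
    count = sum(1 for prev, cur in zip(bits, bits[1:]) if prev == cur)
    return count / len(bits)


# Binary encodings of the changes in Series A, B and C.
bits_a = "1" * 9 + "0" * 9
bits_b = "10" * 7
bits_c = "1100" * 3

for name, bits in (("A", bits_a), ("B", bits_b), ("C", bits_c)):
    print("Series", name, round(same_direction_fraction(bits), 3))
# Series A 0.889
# Series B 0.0
# Series C 0.5
```

Under these assumptions the three series are ranked in the order the question asks for: A highest, B lowest, C in between.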

MånsT
  • Your function would give me the same value for all of the above series, since the starting and ending values are the same. Thus, we have the same number of increases and decreases, which results in something near 0.5. I thought of a real-life example and added a comment above. – Jens Roth May 11 '12 at 15:42
  • You're right! I think I've fixed that with my edit though :) – MånsT May 11 '12 at 20:30
  • I also love your code and the speed I received the answer - was my first question on this site! It is very simple to rewrite to Java and I will look over the weekend how useful the results are with my data. Thanks MansT! (Oops, I just saw I need a reputation of 15 to vote up) – Jens Roth May 11 '12 at 22:14