
I have the following data.frame:

mydata <- data.frame(x=c(0,2,8,5,7,9,0,1,4,6,6,10,14,12),y=c(33,22,74,25,75,109,10,1,4,5,62,10,14,17))

I want to split it into groups, where each group starts at a row where x is 0. After that, I want to fit a regression for each group, with x as the input and y as the output. How do I split the data efficiently into groups so that I can work with the grouped data afterwards?

My inefficient idea is to loop over x after sorting, keep a counter, and add the counter as a third column, but that is going to be awfully slow in R, where you should avoid loops if possible.

Here is an example of what I expect from the grouping:

mydata_out <- data.frame(x=c(0,2,8,5,7,9,0,1,4,6,6,10,14,12),
                         y=c(33,22,74,25,75,109,10,1,4,5,62,10,14,17),
                         group=c(1,1,1,1,1,1,2,2,2,2,2,2,2,2))
  • just `cumsum(x$x == 0)` and then [follow this](http://stackoverflow.com/questions/1169539/linear-regression-and-group-by-in-r) – Sotos Apr 11 '17 at 13:59
  • @Sotos: I think my question was not clear. I added an example of what I expect as output. I am not sure how `cumsum` would help here. – Make42 Apr 11 '17 at 14:22
  • Did you try it? It produces the exact same result as your `group` column – Sotos Apr 11 '17 at 14:26
  • @Sotos: My bad. I made a mistake in my code. Yes, it worked. – Make42 Apr 11 '17 at 14:28
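
For readers landing here, the suggestion from the comments can be sketched roughly as follows. This is a minimal sketch, not a definitive answer: it assumes the data frame is called `mydata` as in the question (the comment writes `x$x`), and the names `fits` and `coefs` are made up for illustration. Fitting with `lm` on each piece returned by `split` is one possible way to do the per-group regression; the linked question discusses others.

```r
mydata <- data.frame(
  x = c(0, 2, 8, 5, 7, 9, 0, 1, 4, 6, 6, 10, 14, 12),
  y = c(33, 22, 74, 25, 75, 109, 10, 1, 4, 5, 62, 10, 14, 17)
)

# A new group starts at every row where x is 0.
# cumsum() counts how many zeros have been seen so far, which is the group id.
mydata$group <- cumsum(mydata$x == 0)

# Split into one data frame per group and fit y ~ x in each
# (fits is a hypothetical name; it is a list of lm objects keyed by group id).
fits <- lapply(split(mydata, mydata$group), function(d) lm(y ~ x, data = d))

# One row per group, with the intercept and slope as columns.
coefs <- t(sapply(fits, coef))
```

Note that if the first row of the data does not have `x == 0`, the rows before the first zero end up in a group numbered 0 under this approach.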
