
I have a simple dataframe that I would like to bin in groups of 3 rows, taking the mean of each group.

It looks like this:

    col1
0      2
1      1
2      3
3      1
4      0

and I would like to turn it into this:

    col1
0      2
1    0.5

I have already posted a similar question here, but I have no idea how to adapt that solution to my current use case.

Can you help me out?

Many thanks!

TheChymera

3 Answers


In Python 2, where `/` on integers performs floor division, group by the index divided by 3:

>>> df.groupby(df.index / 3).mean()
   col1
0   2.0
1   0.5
TankorSmash
Roman Pekar
  • such a simple and elegant solution! – Constantino Oct 26 '15 at 14:43
  • I get 0.000000 2, 0.333333 1, 0.666667 3, 1.000000 1, 1.333333 0 with the latest Python and Pandas version. Probably has to do with integer division. *Edit*: Yes, Python 3 users, use `df.index // 3` – sougonde Feb 24 '16 at 19:49
  • Is there an equivalent way to do this if your dataframe has a datetime index, and you were insisting on doing every `n` rows? – Seth May 15 '20 at 16:56
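Regarding the datetime-index question in the last comment: one possible approach, sketched here under assumptions, is to group by row position instead of by index value. The frame `df_dt`, its dates, and the variable names below are hypothetical and not taken from the thread; `np.arange(len(df_dt)) // n` simply numbers the rows and floor-divides those numbers by `n`.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a datetime index (dates chosen for illustration only).
idx = pd.date_range("2020-01-01", periods=5, freq="D")
df_dt = pd.DataFrame({"col1": [2, 1, 3, 1, 0]}, index=idx)

n = 3
# Position-based group keys 0, 0, 0, 1, 1, regardless of what the index holds.
positions = np.arange(len(df_dt)) // n
binned_dt = df_dt.groupby(positions).mean()
print(binned_dt)
```

The result is indexed by bin number (0, 1, ...), so the original timestamps are not carried over. The same trick should also apply to any non-default index, such as one left over after filtering rows.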

Roman Pekar's answer did not work for me, presumably because of the difference between Python 2 and Python 3: in Python 3, `/` is always true division. This worked for me in Python 3:

>>> df.groupby(df.index // 3).mean()
   col1
0   2.0
1   0.5
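For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. It assumes the question's frame has the default integer index 0..4; the variable name `binned` is just for illustration.

```python
import pandas as pd

# Frame from the question: five rows with a default integer index 0..4.
df = pd.DataFrame({"col1": [2, 1, 3, 1, 0]})

# Floor-dividing the index by 3 yields the group keys 0, 0, 0, 1, 1.
binned = df.groupby(df.index // 3).mean()
print(binned)
```

Note that the last bin holds only the two leftover rows, so its mean is taken over two values rather than three.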
ShadowUC
ojunk

For Python 2 (2.2+) users who have "true division" enabled (e.g. via `from __future__ import division`), use the `//` operator for floor division:

df.groupby(df.index // 3).mean()
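To see why the operator matters, the following sketch (run under Python 3, where `/` is true division) compares the group keys the two operators produce. The frame is the one from the question, assuming a default integer index; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col1": [2, 1, 3, 1, 0]})

true_keys = df.index / 3    # true division: five distinct float keys
floor_keys = df.index // 3  # floor division: keys 0, 0, 0, 1, 1

# Every row gets its own group under true division, so nothing is binned.
n_true_groups = df.groupby(true_keys).ngroups
n_floor_groups = df.groupby(floor_keys).ngroups
print(n_true_groups, n_floor_groups)
```

With true division each row ends up in its own group, which matches the unbinned output reported in the comments on the first answer.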
mohaseeb