What is the fastest way to compute the number of occurrences of elements within a Pandas series?

My fastest solution so far is .groupby(columnname).size(). Is there anything faster within Pandas? For example, I want output like the following:

In [41]: from pandas import DataFrame

In [42]: df = DataFrame(['a', 'b', 'a'])

In [43]: df.groupby(0).size()
Out[43]: 
0
a    2
b    1
dtype: int64
MRocklin
  • Worrying about optimizations on this level seems like a waste of time, but you could try `value_counts`: it should have less overhead. – DSM Apr 27 '14 at 00:26
  • Possible duplicate of [what is the most efficient way of counting occurrences in pandas?](http://stackoverflow.com/questions/20076195/what-is-the-most-efficient-way-of-counting-occurrences-in-pandas) – Noah Apr 27 '14 at 22:12

1 Answer

The value_counts() method of a pandas Series does exactly this.

Call it on the column you want, e.g.:

df['column_i_want'].value_counts()
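As a rough sketch, here is how the two approaches compare on the toy data from the question. The variable names (`by_group`, `by_counts`, `same`) and the repetition count are my own choices for illustration, and timings on such a tiny frame mostly measure call overhead, so measure on your own data before drawing conclusions.

```python
import timeit

import pandas as pd

# Same toy data as in the question; 0 is the default column label.
df = pd.DataFrame(['a', 'b', 'a'])

by_group = df.groupby(0).size()     # result is ordered by key
by_counts = df[0].value_counts()    # result is ordered by count, descending

# Once the ordering is aligned, both approaches give the same counts.
same = by_group.sort_index().to_dict() == by_counts.sort_index().to_dict()
print(same)

# Rough timing only; the numbers depend on the machine and pandas version.
t_group = timeit.timeit(lambda: df.groupby(0).size(), number=100)
t_counts = timeit.timeit(lambda: df[0].value_counts(), number=100)
print(f"groupby.size: {t_group:.4f}s  value_counts: {t_counts:.4f}s")
```

Note that the two results differ in ordering: groupby(...).size() sorts by the group key, while value_counts() sorts by count with the most frequent value first.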
cwharland