
I want to count the occurrence of a string in a grouped pandas dataframe column.

Assume I have the following Dataframe:

catA    catB    scores
A       X       6-4 RET
A       X       6-4 6-4
A       Y       6-3 RET
B       Z       6-0 RET
B       Z       6-1 RET

First, I want to group by catA and catB. Then, for each of these groups, I want to count the occurrences of RET in the scores column.

The result should look something like this:

catA    catB    RET
A       X       1
A       Y       1
B       Z       2

The grouping by two columns is easy: grouped = df.groupby(['catA', 'catB'])

But what's next?

beta

1 Answer


Call apply on the 'scores' column of the groupby object, use the vectorised str method contains to filter each group, and then call count:

In [34]:    
df.groupby(['catA', 'catB'])['scores'].apply(lambda x: x[x.str.contains('RET')].count())

Out[34]:
catA  catB
A     X       1
      Y       1
B     Z       2
Name: scores, dtype: int64
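For a self-contained version, here is a minimal sketch that rebuilds the example frame from the question and gets the same counts without a lambda, by summing a boolean mask per group. The variable names `has_ret` and `ret_counts` are illustrative, and `regex=False` assumes RET should be matched as a literal substring.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "catA": ["A", "A", "A", "B", "B"],
    "catB": ["X", "X", "Y", "Z", "Z"],
    "scores": ["6-4 RET", "6-4 6-4", "6-3 RET", "6-0 RET", "6-1 RET"],
})

# Boolean mask: True where the score contains the literal substring "RET".
has_ret = df["scores"].str.contains("RET", regex=False)

# Summing booleans per group counts the True values in each (catA, catB) group.
ret_counts = (
    has_ret.groupby([df["catA"], df["catB"]])
    .sum()
    .rename("RET")
    .reset_index()
)
print(ret_counts)
```

The `reset_index()` call turns the grouped index back into ordinary catA and catB columns, which matches the layout asked for in the question.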

To assign the result as a column, use transform so that the aggregation returns a Series with its index aligned to the original df:

In [35]:
df['count'] = df.groupby(['catA', 'catB'])['scores'].transform(lambda x: x[x.str.contains('RET')].count())
df

Out[35]:
  catA catB   scores count
0    A    X  6-4 RET     1
1    A    X  6-4 6-4     1
2    A    Y  6-3 RET     1
3    B    Z  6-0 RET     2
4    B    Z  6-1 RET     2
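Once assigned, the count column is an ordinary column of df, so it can be used to filter rows. The following is a rough sketch under two assumptions: the mask is cast to integers before `transform("sum")`, which is a swapped-in alternative to the lambda, and `threshold` is a hypothetical cut-off chosen only for illustration.

```python
import pandas as pd

# Example data copied from the question.
df = pd.DataFrame({
    "catA": ["A", "A", "A", "B", "B"],
    "catB": ["X", "X", "Y", "Z", "Z"],
    "scores": ["6-4 RET", "6-4 6-4", "6-3 RET", "6-0 RET", "6-1 RET"],
})

# 1 where the score contains "RET", 0 otherwise.
has_ret = df["scores"].str.contains("RET", regex=False).astype(int)

# transform broadcasts each group's total back onto that group's rows.
df["count"] = has_ret.groupby([df["catA"], df["catB"]]).transform("sum")

# Hypothetical threshold: keep only rows whose group count exceeds it.
threshold = 1
frequent = df[df["count"] > threshold]
print(frequent)
```

With the example data only the two B/Z rows remain, because that is the only group with more than one RET.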
EdChum
  • Is this then permanently stored in a new column? If not, how can it be stored as a new column? I only want to display the output if the count is greater than a certain number. – beta Jul 27 '15 at 09:48
  • How can I search for two different strings, so that the string can contain `RET` or `ASDF`? I need a regex then, right? – beta Jul 27 '15 at 09:58
  • Use `x.str.contains('RET|ASDF')`. Also, you should post your full requirement, update your question, and keep it to one problem per question rather than adding to your problem incrementally – EdChum Jul 27 '15 at 10:00
  • sorry. i did not know about this requirement when asking the question. it's fine now... – beta Jul 27 '15 at 10:05
  • No worries, but you have to understand that SO is not a forum site, it's a Q+A site so to help others help you you need to fully define your problem with enough information that helps everyone. If my answer fully resolved your issue then you can accept it, there will be an empty tick mark at the top left of my answer – EdChum Jul 27 '15 at 10:07
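The `'RET|ASDF'` pattern suggested in the comments can be sketched as follows. The score `6-4 ASDF` is made-up data, added only so that the second alternative matches something, and the `terms` list and the use of `re.escape` are assumptions meant to keep the search terms literal if they ever contain regex metacharacters.

```python
import re

import pandas as pd

df = pd.DataFrame({
    "catA": ["A", "A", "A", "B", "B"],
    "catB": ["X", "X", "Y", "Z", "Z"],
    # "6-4 ASDF" is invented for illustration; it is not in the question's data.
    "scores": ["6-4 RET", "6-4 ASDF", "6-3 RET", "6-0 RET", "6-1 6-2"],
})

# Build the alternation "RET|ASDF", escaping each term so it is matched literally.
terms = ["RET", "ASDF"]
pattern = "|".join(re.escape(term) for term in terms)

# str.contains treats the pattern as a regular expression by default.
matches = df["scores"].str.contains(pattern)

# Count the rows per (catA, catB) group that match either term.
counts = matches.groupby([df["catA"], df["catB"]]).sum()
print(counts)
```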