1

I can successfully fill my new column with group counts, but I suspect there is a simpler way:

# How do I simplify this?
def f(gr):
    # Repeat the group's non-null count once per row, keeping the group's index
    return pd.Series([gr['class_name'].count()] * gr.shape[0], index=gr.index)

df['class_size'] = df.groupby("class_name").apply(f).reset_index(level=0, drop=True)
column_list = ['class_name', 'class_size']
df[column_list].head(5)

Gets:

This is just the first few rows of data - see how the same class name has the same class count?

Dave Babbitt

2 Answers

1

I think you need transform:

df['class_size'] = df.groupby('class_name')['class_name'].transform('size')

Or:

df['class_size'] = df.groupby('class_name')['class_name'].transform('count')

What is the difference between size and count in pandas?
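In short, `size` counts every row in a group, while `count` counts only non-null values. Since the grouping column itself is never null inside a group (rows with a missing key are dropped by default), both give the same result for the `class_name` case above. A minimal sketch of where they differ, assuming a hypothetical `score` column with one missing value (the names `score`, `n_rows` and `n_scores` are made up for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical data: 'score' has one missing value in class 'a'
df = pd.DataFrame({
    'class_name': ['a', 'a', 'b', 'a', 'b'],
    'score': [1.0, np.nan, 3.0, 4.0, 5.0],
})

grouped = df.groupby('class_name')['score']

# size: number of rows in each group, missing values included
n_rows = grouped.transform('size')

# count: number of non-null values in each group
n_scores = grouped.transform('count')

df['n_rows'] = n_rows
df['n_scores'] = n_scores
print(df)
```

Class `a` has three rows but only two non-null scores, so the two columns disagree for that class only.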

Graham
jezrael
0

Depending on your DataFrame shape, you can also just do a count on the groupby, which returns one row per class:

import pandas as pd
df = pd.DataFrame({'class names': list('abracadabra'), 'class count': 1})
df.groupby('class names').count().reset_index()
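Note that this gives one row per class rather than one value per original row. If the per-row column from the question is still wanted, the per-class result can be looked up for each row. A rough sketch, assuming the same toy data (the names `sizes` and `class_size` are illustrative):

```python
import pandas as pd

df = pd.DataFrame({'class names': list('abracadabra')})

# One entry per class: a Series indexed by class name
sizes = df.groupby('class names').size()

# Look up each row's class in that Series to get a per-row column
df['class_size'] = df['class names'].map(sizes)
print(df)
```

This ends up equivalent to the `transform` approach, just in two steps.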
Sebastiaan