How to count id's per name in a dataframe

Question

I have a list of names:

lst = ['Albert', 'Carl', 'Julian', 'Mary']

and I have a DF:

target     id      name
  A       100     Albert
  A       110     Albert
  B       200     Carl
  D       500     Mary
  E       235     Mary

I want to make another dataframe counting how many id per name in lst:

lst_names   Count
Albert       2
Carl         1
Julian       0
Mary         2

What's the most efficient way to do this considering the list of names has 12k unique names on it?

score 3 · Accepted Answer · answered Mar 20 '22 at 01:22

3

Check with value_counts

pd.Categorical(df['name'],lst).value_counts()
Out[894]: 
Albert    2
Carl      1
Julian    0
Mary      2
dtype: int64

Or

df['name'].value_counts().reindex(lst,fill_value=0)
Out[896]: 
Albert    2
Carl      1
Julian    0
Mary      2
Name: name, dtype: int64

answered Mar 20 '22 at 01:22

BENY

score 2 · Answer 2 · answered Mar 20 '22 at 01:24

You can use value_counts, and then create an empty Series with lst as the index, and then add them together, filling NaN with 0:

(df['name'].value_counts() + pd.Series(index=lst, dtype=int)).fillna(0).astype(int)

Output:

>>> df
Albert    2
Carl      1
Julian    0
Mary      2
Name: count, dtype: int64

2 Answers2