Vectorize a grouped-by dataframe

Asked Aug 26 '20 at 14:39

Active Aug 26 '20 at 16:24

I have a dataframe in which one of the columns is item and another, id, is a non-unique field. So first, I group by id:

grouped = df.groupby('id')

Now I can iterate each group like so:

for name, group in grouped:
    ...  # name is the id value, group is the sub-dataframe for that id

I can also have a list of all unique items with

all_items = df['item'].unique()

For each group, I'd like to get a list/vector of size len(all_items) in which each entry is the number of times the corresponding item appears in that group. My end goal is a NumPy matrix of these vectors that I can process with scikit-learn models.

How can I do that?

edited Aug 26 '20 at 16:24 by desertnaut

asked Aug 26 '20 at 14:39 by IsaacLevon

Do you mean `pd.crosstab(df['id'], df['item'])`? Can you show us a small example and the expected output? – anky Aug 26 '20 at 14:43
Exactly what I meant! Thanks @anky (You may write this as an answer if you'd like) – IsaacLevon Aug 26 '20 at 14:45
It has actually been asked before, hence closing :-) Glad my comment helped! – anky Aug 26 '20 at 14:50
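
For reference, here is a minimal sketch of the `pd.crosstab` approach suggested in the comments. The toy data is made up for illustration; only the column names `id` and `item` come from the question, and the variable names `counts`, `X`, `row_ids` and `item_names` are just placeholders.

```python
import pandas as pd

# Hypothetical toy data; only the column names `id` and `item` come from the question.
df = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 3],
    "item": ["a", "b", "a", "b", "c", "a"],
})

# Rows are the unique ids, columns are the unique items, cells are occurrence counts.
counts = pd.crosstab(df["id"], df["item"])

# Plain NumPy array of shape (n_ids, n_items), usable as a feature matrix.
X = counts.to_numpy()

row_ids = counts.index.to_numpy()       # which id each row of X belongs to
item_names = counts.columns.to_numpy()  # which item each column of X counts
```

Note that `crosstab` sorts the columns by item value, which may differ from the order of `df['item'].unique()`. If that order matters, something like `counts.reindex(columns=all_items, fill_value=0)` before converting to NumPy should line the columns up with `all_items`.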
