
I have a pandas DataFrame in which one column is item and another column, id, holds non-unique values. First, I group by id:

grouped = df.groupby('id')

Now I can iterate over the groups like so:

for name, group in grouped:
    ...  # name is the id value, group is the sub-DataFrame for that id

I can also get an array of all unique items with

all_items = df['item'].unique()

For each group, I'd like to get a list/vector of length len(all_items) in which each entry is the number of times the corresponding item appears in that group. My main goal is a NumPy matrix with one such vector per group, so I can process it with scikit-learn models.

How can I do that?
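To make the desired output concrete, here is a minimal sketch on made-up data (the real df is not shown, so the values below are purely illustrative). It uses pd.crosstab as one possible way to build the count matrix; whether that is the best approach for large data is an open assumption.

```python
import pandas as pd

# Made-up data for illustration only; the real df is not shown.
df = pd.DataFrame({
    "id":   [1, 1, 1, 2, 2, 3],
    "item": ["a", "b", "a", "c", "a", "b"],
})

all_items = df["item"].unique()  # order of first appearance: a, b, c

# Rows are ids, columns are items, cells are occurrence counts.
counts = pd.crosstab(df["id"], df["item"])

# Align the columns with all_items (crosstab sorts them by default).
counts = counts.reindex(columns=all_items, fill_value=0)

X = counts.to_numpy()                # shape: (number of groups, len(all_items))
group_ids = counts.index.to_numpy()  # row i of X belongs to group_ids[i]
```

With this toy data, the row for id 1 would be [2, 1, 0]: item a appears twice, b once, and c never.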

desertnaut
IsaacLevon
