I have this pandas DataFrame (df):

        doc_id       token
0       0           'token1'
1       0           'token2'
2       1           'token1'
3       1           'token2'
4       2           'token1'
5       3           'token1'
6       3           'token2'
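
To make the example reproducible, here is a minimal sketch that rebuilds the frame above. It assumes the tokens are plain strings and doc_id is an integer column; the real data may differ.

```python
import pandas as pd

# Rebuild the sample frame shown above (values copied from the table).
df = pd.DataFrame({
    "doc_id": [0, 0, 1, 1, 2, 3, 3],
    "token": ["token1", "token2", "token1", "token2",
              "token1", "token1", "token2"],
})

print(df)
```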

I want to collect all the tokens for each doc_id into a single row, like this:

        doc_id      token
0       0           ['token1', 'token2']
1       1           ['token1', 'token2']
2       2           ['token1']
3       3           ['token1', 'token2']

I tried df_group = pd.DataFrame(df.groupby(['doc_id','token'])), but it returns this:

        0                   1
0       (0, token1)         doc_id token    
1       (0, token2)         doc_id token    
2       (1, token1)         doc_id token
3       (1, token2)         doc_id token
4       (2, token1)         doc_id token
5       (3, token1)         doc_id token
6       (3, token2)         doc_id token

Can someone help me with this?
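
For reference, a sketch of one direction that seems to give the shape I am after. Iterating over a GroupBy object yields (key, sub-frame) pairs, which is probably why wrapping it in pd.DataFrame produced two odd columns. Grouping on doc_id alone and aggregating token with list looks closer to the goal. The sample frame is rebuilt here so the snippet runs on its own; whether this is the best approach for larger data is an open assumption.

```python
import pandas as pd

# Sample frame, copied from the table in the question.
df = pd.DataFrame({
    "doc_id": [0, 0, 1, 1, 2, 3, 3],
    "token": ["token1", "token2", "token1", "token2",
              "token1", "token1", "token2"],
})

# Group on doc_id only, then collect each group's tokens into a Python list.
# reset_index() turns doc_id back into a regular column.
df_group = df.groupby("doc_id")["token"].agg(list).reset_index()

print(df_group)
```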

Jazzy