I have this data frame (df):
   doc_id  token
0  0       'token1'
1  0       'token2'
2  1       'token1'
3  1       'token2'
4  2       'token1'
5  3       'token1'
6  3       'token2'
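For reproducibility, here is a minimal sketch that builds the data frame shown above. It assumes doc_id is an integer column and token holds plain strings; the real data may differ.

```python
import pandas as pd

# Assumed reconstruction of the example data: integer doc_id, string token.
df = pd.DataFrame({
    "doc_id": [0, 0, 1, 1, 2, 3, 3],
    "token": ["token1", "token2", "token1", "token2",
              "token1", "token1", "token2"],
})

print(df)
```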
I want to group all tokens by doc_id, so that each doc_id gets a single row containing the list of its tokens:
   doc_id  token
0  0       ['token1', 'token2']
1  1       ['token1', 'token2']
2  2       ['token1']
3  3       ['token1', 'token2']
I tried df_group = pd.DataFrame(df.groupby(['doc_id','token'])), but it returns this:
   0            1
0  (0, token1)  doc_id token
1  (0, token2)  doc_id token
2  (1, token1)  doc_id token
3  (1, token2)  doc_id token
4  (2, token1)  doc_id token
5  (3, token1)  doc_id token
6  (3, token2)  doc_id token
Can someone help me with this?
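For context, iterating over a GroupBy object yields (key, sub-frame) pairs, so passing it to pd.DataFrame produces one column of keys and one column of sub-frames. Grouping by both doc_id and token also makes every (doc_id, token) pair its own group. One possible approach, sketched here on an assumed reconstruction of the example data, is to group by doc_id only and collect the token column into a list:

```python
import pandas as pd

# Assumed reconstruction of the example data from the question.
df = pd.DataFrame({
    "doc_id": [0, 0, 1, 1, 2, 3, 3],
    "token": ["token1", "token2", "token1", "token2",
              "token1", "token1", "token2"],
})

# Group by doc_id only, gather each group's tokens into a Python list,
# then turn the doc_id index back into a regular column.
df_group = df.groupby("doc_id")["token"].apply(list).reset_index()

print(df_group)
```

Using .agg(list) in place of .apply(list) should give the same result here. Note that the token column then holds Python list objects.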