In my DataFrame I have a column named authors.
Each cell in this column contains a list of elements. I want to split that list into multiple columns.
The reason is that I want to use groupby() and other pandas analysis methods more easily. In particular, my next goal is to see which author has the most publications in my dataset, and which author has published the most in each journal.
What I have:
   authors                                  journal
0  ['Savola', 'Petri Heinonen', 'Miller']   2011 Information...
1  ['Mariana Gerber', 'Rossouw von Solms']  Some Journal
2  ['Cyril Onwubiko']                       Some other Journal
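To make this reproducible, here is a minimal sketch that builds the frame shown above. It assumes the cells hold real Python lists (not strings that merely look like lists), and the first journal title stays truncated exactly as in the printout.

```python
import pandas as pd

# Sample data copied from the printout above.
# Assumption: each "authors" cell is an actual Python list.
df = pd.DataFrame(
    {
        "authors": [
            ["Savola", "Petri Heinonen", "Miller"],
            ["Mariana Gerber", "Rossouw von Solms"],
            ["Cyril Onwubiko"],
        ],
        # The first title is truncated in the original output and is kept as-is.
        "journal": ["2011 Information...", "Some Journal", "Some other Journal"],
    }
)

print(df)
```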
What I want:
   authors                                          journal
   0                 1                    2
0  'Savola'          'Petri Heinonen'     'Miller'  '2011 Information...'
1  'Mariana Gerber'  'Rossouw von Solms'  NaN       'Some Journal'
2  'Cyril Onwubiko'  NaN                  NaN       'Some other Journal'
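To be concrete about the layout I mean, here is the target written out by hand as a frame with two-level column labels. This is only a sketch of the desired result, not a way of producing it from the original data. The name `wanted` and the empty-string sub-label under journal are my own assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Two-level column labels: "authors" has subcolumns 0, 1, 2;
# "journal" gets an empty second-level label (an assumption for illustration).
columns = pd.MultiIndex.from_tuples(
    [("authors", 0), ("authors", 1), ("authors", 2), ("journal", "")]
)

# Hand-written target frame; `wanted` is a hypothetical name.
wanted = pd.DataFrame(
    [
        ["Savola", "Petri Heinonen", "Miller", "2011 Information..."],
        ["Mariana Gerber", "Rossouw von Solms", np.nan, "Some Journal"],
        ["Cyril Onwubiko", np.nan, np.nan, "Some other Journal"],
    ],
    columns=columns,
)

print(wanted)
```

With this shape, `wanted["authors"]` would select just the three author subcolumns.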
What I've tried so far is creating a new dataframe from the authors column:
df2 = df["authors"].apply(pd.Series)
df2
But I can't work out how to insert this new DataFrame into my original DataFrame.
How do I add df2 to my original DataFrame as subcolumns under authors?