I need to reindex the second level of a pandas DataFrame so that, for each first-level index value, the second level becomes np.arange(N). I tried to follow this, but unfortunately it only creates an index with as many rows as already exist. What I want is for new rows to be inserted (with NaN values) for each new second-level index value.
In [79]:
import pandas as pd

df = pd.DataFrame({
    'first': ['one', 'one', 'one', 'two', 'two', 'three'],
    'second': [0, 1, 2, 0, 1, 1],
    'value': [1, 2, 3, 4, 5, 6]
})
print(df)
   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
5  three       1      6
In [80]:
df['second'] = df.reset_index().groupby(['first']).cumcount()
print(df)
   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
5  three       0      6
My desired result is:
   first  second  value
0    one       0      1
1    one       1      2
2    one       2      3
3    two       0      4
4    two       1      5
4    two       2    NaN
5  three       0      6
5  three       1    NaN
5  three       2    NaN
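For reference, here is a minimal sketch of one way this might be done, assuming N = 3 and that a fresh 0..8 row index in the result is acceptable (my desired output above repeats the old row labels, which this does not reproduce). The names `N`, `full_index`, and `result` are just placeholders. The idea is to build the full product of first-level values and np.arange(N) with `pd.MultiIndex.from_product`, then `reindex` against it so the missing combinations show up as NaN rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'first': ['one', 'one', 'one', 'two', 'two', 'three'],
    'second': [0, 1, 2, 0, 1, 1],
    'value': [1, 2, 3, 4, 5, 6],
})

# Renumber 'second' within each 'first' group, as in the attempt above.
df['second'] = df.groupby('first').cumcount()

N = 3  # assumed target length of the second level

# Every (first, second) combination that should exist in the result.
full_index = pd.MultiIndex.from_product(
    [df['first'].unique(), np.arange(N)],
    names=['first', 'second'],
)

# Reindexing against the full product inserts NaN rows for the
# combinations that are missing from the original data.
result = (
    df.set_index(['first', 'second'])
      .reindex(full_index)
      .reset_index()
)
print(result)
```

Note that 'value' is upcast to float so that it can hold NaN, and that `reindex` requires the (first, second) pairs to be unique, which the `cumcount` step guarantees here.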