I have a CSV file with a funny structure:
date,level,indicatorOfParameter,temp,value
2019-01-01 12:00:00,0,1,295.2,101355.125
2019-01-01 12:00:00,2,11,295.2,294.978515625
2019-01-01 12:00:00,2,51,295.2,0.0070480406284332
2019-01-02 12:00:00,0,1,294.2,101531.25
2019-01-02 12:00:00,2,11,294.2,295.42236328125
2019-01-02 12:00:00,2,51,294.2,0.0049711901228874
Note that date and temp only change together. value is the value for indicatorOfParameter, which in turn is just a label for a variable. The idea is that this CSV should represent a MultiIndex DataFrame where temp is a function of the other variables. But I find this extremely counterintuitive and inefficient. There are plenty of values repeated, and different rows represent information for the same "data point".
I think it would be more sensible to have a regular Dataframe with date as the only Index, and then turn all rows within the same date into columns, where indicatorOfParameter becomes the column name, removing the value column. This would reduce the number of rows to have as many as different dates (i.e. actual data points). In return, it would increase the number of columns to have as many as explaining variables for temp. And I'd get rid of the nasty value column. I think this is more sensible, and indeed the format I need to train ML models with sklearn.
In a nutshell, I'd like to get this structure:
date,level,1,11,51,temp
2019-01-01 12:00:00,0,101355.125,294.978515625,0.0070480406284332,295.2
2019-01-02 12:00:00,0,101531.25,295.42236328125,0.0049711901228874,294.2
How can I achieve this transformation with Pandas?