4

I have a dataframe with dates as columns. I'd like to average the values from daily to a monthly level. I've tried with Time Grouper and Resample, but it doesn't like the columns names are strings and I can seem to figure out how to make columns into something like a DatetimeIndex.

My starting dataframe:

import pandas as pd

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]],
                  columns=['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04'], 
                  index=['A', 'B'])

Desired Output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.6         7.5
MaxU - stop genocide of UA
  • 191,778
  • 30
  • 340
  • 375
user2242044
  • 7,943
  • 23
  • 91
  • 154

5 Answers5

8

You can using resample

df.columns = pd.to_datetime(df.columns)
df.T.resample('M').mean().T
Out[409]: 
   2013-01-31  2013-02-28
A         1.5         3.5
B         5.5         7.5

Or groupby one

axis=1 
df.groupby(pd.to_datetime(df.columns).to_period('M'),1).mean()
Out[412]: 
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5
BENY
  • 296,997
  • 19
  • 147
  • 204
4

First, convert column index to datetime with pd.to_datetime, then use T and groupby with pd.Grouper (note pd.TimeGerouper is deprecated use pd.Grouper):

df.columns = pd.to_datetime(df.columns)
df.T.groupby(by=pd.Grouper(freq='MS')).mean().T

Output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.5         7.5
Scott Boston
  • 133,446
  • 13
  • 126
  • 161
4

you can make use of pd.PeriodIndex:

In [145]: df.groupby(pd.PeriodIndex(df.columns, freq='M'), axis=1).mean()
Out[145]:
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5
MaxU - stop genocide of UA
  • 191,778
  • 30
  • 340
  • 375
1

Try converting the column names to date first:

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]], columns=pd.to_datetime(['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04']), index=['A', 'B'])

Hope it helps!

1
import pandas as pd

list=df.columns
df_new = pd.DataFrame()

for i in range(int(0.5*len(list))):
    df_new[list[2*i]] = (df[[list[2*i], list[2*i+1]]].mean(axis=1))

Output

       2013-01-01  2013-02-03
A         1.5         3.5
B         5.5         7.5

I dont understand your Desired Output:

Manuel
  • 688
  • 4
  • 8
  • it's the monthly mean. – stucash Dec 07 '17 at 23:10
  • Yes, for '2013-01-01' the average is 1.5 ((1+2)/2) and 5.5 ((5+6)/2), but for 2013-02-01 the average is 2.5 ((2+3)/2) and 6.5 ((6+7)/2), and you use 3.5 ((3+4)/2) and 7.5 ((7+8)/2). – Manuel Dec 07 '17 at 23:15
  • it's not daily. it's monthly. have a look below answers, the index is on 'Month'. i.e. Jan and Feb. January, column A has an avarege of (1+2)/2 = 1.5. February, column A has an average of (3+4)/2 = 3.5. you get the rest I believe. – stucash Dec 07 '17 at 23:24