I am using the following code to group by a column and then compute counts, sums, etc.

groups = df[df['isTrade'] == 1].groupby('dateTime')
grouped = groups.agg({'tradeBid': [np.sum, lambda x: (x > 0).sum()]})

The output is giving me:

tradeBid  tradeBid
sum       <lambda>

79        46
7         6
4         4
20        6

How can I change the output's column headers so that my end user will know what the data represents?

Giladbi

1 Answer

You can provide names like this:

groups.agg({'tradeBid': [('sum', np.sum), ('other', lambda x: (x > 0).sum())]})

You used to be able to pass a dict instead of a list of 2-tuples, but that is now deprecated (probably because the ordering of the columns is then arbitrary).
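
As a minimal sketch of the approach above, here it is on a small made-up DataFrame. Only the column names (`dateTime`, `isTrade`, `tradeBid`) come from the question; the data values and the output names `positive`, `bid_sum` and `bid_positive` are hypothetical. The string `'sum'` is used in place of `np.sum`, which should give the same result here. The second call uses named aggregation, which is not part of the answer above: it is available in pandas 0.25 and later and produces flat, single-level column names.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'dateTime': ['09:00', '09:00', '09:01', '09:01', '09:01'],
    'isTrade':  [1, 1, 1, 0, 1],
    'tradeBid': [5, 0, 3, 9, 4],
})

groups = df[df['isTrade'] == 1].groupby('dateTime')

# List of (name, function) tuples: each name becomes the second-level
# column label under 'tradeBid' instead of 'sum' / '<lambda>'.
renamed = groups.agg({
    'tradeBid': [('sum', 'sum'), ('positive', lambda x: (x > 0).sum())]
})

# Named aggregation (pandas >= 0.25): keyword = (column, function).
# The keywords become flat column names in the result.
flat = groups.agg(
    bid_sum=('tradeBid', 'sum'),
    bid_positive=('tradeBid', lambda x: (x > 0).sum()),
)

print(renamed)
print(flat)
```

The tuple form keeps the two-level header, so the end user still sees which source column each value came from; the named form may be easier to read when a single header row is preferred.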

John Zwinck
  • Is this documented somewhere? – ayhan Jan 28 '18 at 10:56
  • @ayhan: The docs say `agg()` accepts "dict of column names -> functions (or list of functions)" but does not say that a list of 2-tuples is an acceptable substitute. Nor does it say that using a dict now results in a deprecation warning. But I know that many things in NumPy/Pandas which could use a dict can also use a list of (name, value) tuples. So I tried it and it worked. So no, it's not documented. :) – John Zwinck Jan 28 '18 at 10:59
  • Yeah I have never seen it so I thought maybe they added this after deprecating dict renaming but it seems it was always possible. Good to know. :) – ayhan Jan 28 '18 at 11:01