1

I have a table that is looks like follows:

name  type    val 
A     online  12
B     online  24
A     offline 45
B     online  32
A     offline 43
B     offline 44

I want to dataframe in such a manner that it can be groupby with multiple cols name & type, which also have additional columns that return the count of the record with val being added of the same type records. It should be like follows:

name    type    count   val
A       online  1       12
        offline 2       88
B       online  2       56
        offline 1       44

I have tried pd.groupby(['name', 'type'])['val'].sum() that gives the addition but unable to add the count of records.

abhi1610
  • 679
  • 13
  • 26

2 Answers2

1

Add parameter sort=False to groupby for avoid default sorting and aggregate by agg with tuples with new columns names and aggregate functions, last reset_index for MultiIndex to columns:

df1 = (df.groupby(['name', 'type'], sort=False)['val']
        .agg([('count', 'count'),('val', 'sum')])
        .reset_index())
print (df1)
  name     type  count  val
0    A   online      1   12
1    B   online      2   56
2    A  offline      2   88
3    B  offline      1   44
jezrael
  • 729,927
  • 78
  • 1,141
  • 1,090
0

You can try pivoting i.e

df.pivot_table(index=['name','type'],aggfunc=['count','sum'],values='val')

             count sum
               val val
name type             
A    offline     2  88
     online      1  12
B    offline     1  44
     online      2  56
Bharath
  • 28,527
  • 5
  • 52
  • 95
  • what if there are many cols rather than these 3 columns? Do I have to somehow mention particular columnname for aggfunc? – abhi1610 Dec 06 '18 at 06:54
  • 1
    Yeah its simple. There's a values parameter in pivot table. You can simply add `values='val'` – Bharath Dec 06 '18 at 06:55