I want to aggregate rows, applying a different aggregation function to each of two columns.

When I run df.groupby('a').agg('count'), I get output 1.

When I run df.groupby('a').agg('mean'), I get output 2.

Is there a way to do a single aggregation that shows output 1 for column b and output 2 for column c?

Bill Armstrong
Oalvinegro

1 Answer

The code below should work. It computes each aggregate separately and then merges the results on a:

# Import libraries
import pandas as pd

# Create sample dataframe
df = pd.DataFrame({'a': ['A1', 'A1', 'A2', 'A3', 'A4', 'A3'],
                   'value': [1,2,3,4,5,6]})

# Calculate count and mean of 'value' per group
temp1 = df.groupby(['a'])['value'].count().reset_index().rename(columns={'value':'count'})
temp2 = df.groupby(['a'])['value'].mean().reset_index().rename(columns={'value':'mean'})

# Add columns to existing dataframe
df.merge(temp1, on='a', how='inner').merge(temp2, on='a', how='inner')

# Add columns to a new dataframe
df2 = temp1.merge(temp2, on='a', how='inner')
df2
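
As an alternative to merging two intermediate frames, the question's case (count for one column, mean for another) can probably be handled in a single agg call. The sketch below assumes the columns are literally named 'a', 'b' and 'c', as in the question. The sample values and the output names b_count and c_mean are made up for illustration, and the named-aggregation form needs pandas 0.25 or later.

```python
import pandas as pd

# Hypothetical sample data; column names 'a', 'b', 'c' follow the question
df = pd.DataFrame({
    'a': ['A1', 'A1', 'A2', 'A3', 'A4', 'A3'],
    'b': [1, 2, 3, 4, 5, 6],
    'c': [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
})

# Named aggregation: count of 'b' and mean of 'c' per group in one call
# (b_count and c_mean are illustrative output column names)
result = df.groupby('a').agg(
    b_count=('b', 'count'),
    c_mean=('c', 'mean'),
).reset_index()

# Dict form: same aggregation, but keeps the original column names
result_dict = df.groupby('a').agg({'b': 'count', 'c': 'mean'}).reset_index()

print(result)
```

The dict form is the shorter of the two. Named aggregation is useful when the output columns should carry names that say which function was applied.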

Nilesh Ingle