0

i recently looked at Get statistics for each group (such as count, mean, etc) using pandas GroupBy?. It does not address my issue.

Is there a way to get separate count of values for multiple df columns?

See my code:

Import pandas as pd
Import numpy as np
fle1 = r’k:\file1.xlsx’

df1 = pd.read_excel(fle1, sheet_name=”Sheet1”)
df2 = df1.select_dtypes(exclude=np.number)
col_names = df2.columns.values.tolist()

col_names contains ['Column1', 'Column2', 'Column3']

My dataframe looks like.

Column1     Column2          Column3  
Checking    Car              House  
Checking    Car              House  
Checking    Car              House  
House       Checking         Car 
House       Checking         Car 

I'm Looking for output like:

Column1   Count  Column2    Count   Column3 Count
Checking    3    Car        3       House    3
House       2    Checking   2       Car      2

Not sure what next steps are, any help would be greatly appreciated.

marc_s
  • 704,970
  • 168
  • 1,303
  • 1,425

1 Answers1

1

You can use pd.concat and groupby:

(
    pd.concat([df[x].groupby(df[x]).size().to_frame('Count').reset_index() 
               for x in df.columns],1)
)

    Column1     Count   Column2     Count   Column3 Count
0   Checking    3       Car         3       Car     2
1   House       2       Checking    2       House   3
Allen Qin
  • 18,332
  • 6
  • 47
  • 59