How to check how often every combination of unique values appears in a DataFrame

Question

I'd be grateful if someone could help me out.

My goal is the following. Given a DataFrame:

import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "age": [46, 48, 55, 55],
                   "gender": ['female', 'female', 'male', 'male'],
                   "overweight": ['y', 'n', 'y', 'y']},
                  index=[0, 1, 2, 3])

I have already collected the unique values of each column, "age": [46, 48, 55], "gender": ['female', 'male'] and "overweight": ['y', 'n'], in a dictionary; let's call it "dict_unique_values" for now. What I want to achieve is to check how often every combination of age, gender and overweight appears in the DataFrame. The output I want is an array containing the frequencies of all the combinations. Here is an example for three combinations:

values are: age = 46, gender = female, overweight = y --> this combi appears only once
values are: age = 48, gender = female, overweight = n --> this combi appears only once
values are: age = 55, gender = male, overweight = y --> this combi appears twice

So the output for these three example combis would be: [1, 1, 2]

The problem: I have to do this for an unknown number of columns, each with its own unique values (the number of possible combinations is the product of the number of unique values per column), and I have no idea how to do that :D

But maybe you do? :)

the expected output is unclear, please provide one if the linked duplicate is not what you want — mozway, May 20 '22 at 09:03
The Output should be an array containing all the frequencies of the combinations — peter.bucher, May 20 '22 at 09:08
I meant provide the **explicit output** (but the duplicate should do what you want) — mozway, May 20 '22 at 09:11
then, as indicated in the link: `df.groupby(['age', 'gender', 'overweight']).size()` — mozway, May 20 '22 at 09:30
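
For illustration, here is a minimal sketch of how the `groupby(...).size()` suggestion from the comment could be extended to cover every possible combination, including those that never occur in the data. It assumes that `dict_unique_values` maps each column name to a list of its unique values, as described in the question, and that the desired order of the output follows the order of those lists. The names `observed`, `all_combos`, `counts` and `frequencies` are made up for this example.

```python
import pandas as pd

df = pd.DataFrame({"ID": [1, 2, 3, 4],
                   "age": [46, 48, 55, 55],
                   "gender": ["female", "female", "male", "male"],
                   "overweight": ["y", "n", "y", "y"]})

# Assumed shape of the dictionary from the question:
# column name -> list of unique values in that column.
dict_unique_values = {"age": [46, 48, 55],
                      "gender": ["female", "male"],
                      "overweight": ["y", "n"]}

cols = list(dict_unique_values)  # works for any number of columns

# Frequencies of the combinations that actually occur in the DataFrame.
observed = df.groupby(cols).size()

# Every possible combination of the unique values (Cartesian product).
all_combos = pd.MultiIndex.from_product(list(dict_unique_values.values()),
                                        names=cols)

# Align the observed counts to all combinations; missing ones get 0.
counts = observed.reindex(all_combos, fill_value=0)

# Plain NumPy array of frequencies, one entry per combination.
frequencies = counts.to_numpy()
```

Here `counts` keeps the combination labels as a MultiIndex, so a single frequency can be looked up with, for example, `counts.loc[(55, "male", "y")]`, while `frequencies` is the bare array. If only the combinations that actually occur are of interest, `observed.to_numpy()` alone may already be enough.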
