-1

I need to group the dataframe and add a new column 'Amount' where the number of records repetitions will be displayed. Some like fun. count() in sql or summarise (https://dplyr.tidyverse.org/reference/summarise.html) in R

How I can do this in pandas?

a = [{'order': '789', 'name': 'A', 'date': 20220501}, {'order': '456', 'name': 'B', 'date': 20220502}, {'order': '704', 'name': 'C', 'date': 20220503}, {'order': '789', 'name': 'A', 'date': 20220501}, {'order': '789', 'name': 'A', 'date': 20220501}, {'order': '704', 'name': 'C', 'date': 20220503}]

df = pd.DataFrame(a)

df = df.groupby(['order', 'name', 'date']).sum().reset_index()

print(df)

I need the next result:

enter image description here

Mayank Porwal
  • 31,737
  • 7
  • 30
  • 50
Aleksandr
  • 91
  • 5
  • You want to group the dataframe on which column? Question's unclear. – Mayank Porwal May 13 '22 at 10:39
  • Sorry, across all columns – Aleksandr May 13 '22 at 10:42
  • If you are grouping all columns, shouldn't the `Amount` column be `1,1,1`, because there's no repetitions for any group? – Mayank Porwal May 13 '22 at 10:43
  • there are repetitions, just print the dataframe -> a = [{'order': '789', 'name': 'A', 'date': 20220501}, {'order': '456', 'name': 'B', 'date': 20220502}, {'order': '704', 'name': 'C', 'date': 20220503}, {'order': '789', 'name': 'A', 'date': 20220501}, {'order': '789', 'name': 'A', 'date': 20220501}, {'order': '704', 'name': 'C', 'date': 20220503}] df = pd.DataFrame(a) – Aleksandr May 13 '22 at 10:47
  • Use `df = df.groupby(['order', 'name', 'date']).size().reset_index(name='Amount')` – jezrael May 13 '22 at 10:55
  • Here is a solution: df["Amount"] = df["name"] df = df.groupby(['order', 'name', 'date'], as_index=False).agg({'Amount': 'count'}) – Gokhan Gerdan May 13 '22 at 11:03

0 Answers0