Keep the earliest and latest date in each group of a DataFrame

Question

I have a DataFrame that looks like this:

import pandas as pd

data = {'key_part_1': ["Jack", "Jack", "Lisa"],
        'key_part_2': ["Mayer", "Mayer", "Osten"],
        'subscription_date': ['2021-05-01', '2021-08-01', '2019-08-01'],
        'cancelation_date': ['2021-08-01', '2021-11-01', '2019-09-25'],
        }
df = pd.DataFrame(data)

As you can see, the group ['Jack', 'Mayer'] appears twice, with subscription and cancelation dates that coincide between the two rows: the cancelation date of the first row equals the subscription date of the second. I want to fix that by producing this:

data = {'key_part_1': ["Jack", "Lisa"],
        'key_part_2': ["Mayer", "Osten"],
        'subscription_date': ['2021-05-01', '2019-08-01'],
        'cancelation_date': ['2021-11-01', '2019-09-25'],
        }

Essentially, I want to keep only one occurrence of the group ['Jack', 'Mayer'] in my DataFrame, taking the earliest subscription date and the latest cancelation date. There is no need to check that the subscription and cancelation dates match from one row to the next; that is always the case for repeated keys.

I want to do this for every person who is repeated in the DataFrame. I know I should sort the dates within the groups first, but I couldn't find out how.

Thanks!


Your code is missing in this question. :) – gajendragarg, Feb 24 '22 at 09:58
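
For reference, the aggregation described in the question (earliest subscription date and latest cancelation date per key pair) could be sketched with a pandas groupby, roughly as below. This is only one possible approach, not a confirmed answer from the thread. It assumes the dates are ISO-formatted strings, and it assumes '2019-09-025' in the sample data is a typo for '2019-09-25'. The names date_cols and result are illustrative.

```python
import pandas as pd

# Sample data from the question. '2019-09-25' is an assumed correction
# of the '2019-09-025' value in the original post.
data = {
    "key_part_1": ["Jack", "Jack", "Lisa"],
    "key_part_2": ["Mayer", "Mayer", "Osten"],
    "subscription_date": ["2021-05-01", "2021-08-01", "2019-08-01"],
    "cancelation_date": ["2021-08-01", "2021-11-01", "2019-09-25"],
}
df = pd.DataFrame(data)

# Parse the strings as datetimes so min/max compare dates, not text.
date_cols = ["subscription_date", "cancelation_date"]
df[date_cols] = df[date_cols].apply(pd.to_datetime)

# One row per key pair: earliest subscription, latest cancelation.
result = (
    df.groupby(["key_part_1", "key_part_2"], as_index=False)
      .agg(
          subscription_date=("subscription_date", "min"),
          cancelation_date=("cancelation_date", "max"),
      )
)
print(result)
```

With this approach no explicit sort is needed, because min and max do not depend on row order within each group.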


0 Answers