
I have a dataframe of the form:

import pandas as pd

df_1 = pd.DataFrame({
  'A': [0, 0, 1, 1, 1, 2],
  'B': [0, 1, 0, 1, 2, 1],
  'C': ['a', 'a', 'b', 'b', 'c',  'c']
}) 

What I want to do is drop the rows where the pair of values from columns 'A' and 'B' is a duplicate once the pair is sorted, so that for example (0, 1) and (1, 0) count as the same pair and only the first occurrence is kept.

So what I want is:

df_1 = pd.DataFrame({
  'A': [0, 0, 1, 1],
  'B': [0, 1, 1, 2],
  'C': ['a', 'a', 'b', 'c']
}) 

My idea was to add a column containing the sorted pair as a string and then use the dataframe's drop_duplicates function, but since I'm working with a very large dataframe this solution is very expensive.
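To make the goal concrete, here is a minimal sketch of a variant of that idea that avoids building strings. It assumes both columns are numeric, sorts each (A, B) pair row-wise with NumPy, and then flags repeated pairs. The names `pairs`, `is_dup` and `result` are only illustrative, and whether this is fast enough on a very large dataframe is untested here.

```python
import numpy as np
import pandas as pd

df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 1, 2],
    'B': [0, 1, 0, 1, 2, 1],
    'C': ['a', 'a', 'b', 'b', 'c', 'c'],
})

# Sort each (A, B) pair row-wise so that (1, 0) and (0, 1) give the same key.
pairs = np.sort(df_1[['A', 'B']].to_numpy(), axis=1)

# Flag rows whose sorted pair has already appeared (the first occurrence is kept).
is_dup = pd.DataFrame(pairs, index=df_1.index).duplicated()

# Keep the unflagged rows with their original A, B and C values.
result = df_1[~is_dup.to_numpy()].reset_index(drop=True)
print(result)
```

On the small example above this keeps the rows with (A, B) equal to (0, 0), (0, 1), (1, 1) and (1, 2), which is the output I am after.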

Do you have any suggestions? Thanks for the answers.
