I have a dataframe of the form:
import pandas as pd
df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 1, 2],
    'B': [0, 1, 0, 1, 2, 1],
    'C': ['a', 'a', 'b', 'b', 'c', 'c']
})
What I want to do is drop the rows of that dataframe where the pair of values from columns 'A' and 'B' duplicates an earlier pair, ignoring order, so that (1, 0) counts as a duplicate of (0, 1).
So what I want is:
df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1],
    'B': [0, 1, 1, 2],
    'C': ['a', 'a', 'b', 'c']
})
My idea was to add a column containing the sorted pair as a string and then use the dataframe's drop_duplicates method, but since I'm working with a very large dataframe, this solution is very expensive.
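For concreteness, here is a minimal sketch of that idea. The temporary column name 'key' and the underscore separator are placeholders chosen for illustration, and the smaller and larger value of each pair are taken with min/max rather than an explicit sort:

```python
import pandas as pd

df_1 = pd.DataFrame({
    'A': [0, 0, 1, 1, 1, 2],
    'B': [0, 1, 0, 1, 2, 1],
    'C': ['a', 'a', 'b', 'b', 'c', 'c'],
})

# Smaller and larger value of each (A, B) pair, computed row-wise.
low = df_1[['A', 'B']].min(axis=1)
high = df_1[['A', 'B']].max(axis=1)

# Hypothetical helper column: the sorted pair encoded as a string, e.g. '0_1'.
df_1['key'] = low.astype(str) + '_' + high.astype(str)

# Keep the first row for each key, then remove the helper column.
result = (
    df_1.drop_duplicates(subset='key')
        .drop(columns='key')
        .reset_index(drop=True)
)
print(result)
```

On the small example this keeps the rows with pairs (0, 0), (0, 1), (1, 1) and (1, 2), which is the result I'm after; the string conversion is the part I suspect is slow on a large dataframe.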
Do you have any suggestions? Thanks for your answers.