I have a dataframe df like this:

    x   
1   paris   
2   paris  
3   lyon  
4   lyon   
5   toulouse 

I would like to keep only the rows that are not duplicated. In the example above, I would like to keep only the row 'toulouse'.

I tried the pandas drop_duplicates function, but it doesn't give the result I want:

df.drop_duplicates(subset=['x'], inplace=True)

Expected output:

      x   
 5 toulouse

How can I do this?

jos97

1 Answer

From the documentation:

keep : {'first', 'last', False}, default 'first'
    Determines which duplicates (if any) to keep.
    - first : Drop duplicates except for the first occurrence.
    - last : Drop duplicates except for the last occurrence.
    - False : Drop all duplicates.
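To see the three options side by side, here is a small sketch that rebuilds the dataframe from the question (the index values 1 to 5 are copied from the example; the variable names `first`, `last` and `none` are only for illustration):

```python
import pandas as pd

# Rebuild the example dataframe from the question, with index 1..5
df = pd.DataFrame(
    {"x": ["paris", "paris", "lyon", "lyon", "toulouse"]},
    index=[1, 2, 3, 4, 5],
)

# Default behaviour: the first row of each group of duplicates survives
first = df.drop_duplicates(subset=["x"], keep="first")

# The last row of each group of duplicates survives
last = df.drop_duplicates(subset=["x"], keep="last")

# Every row that has a duplicate is dropped
none = df.drop_duplicates(subset=["x"], keep=False)

print(first.index.tolist())  # [1, 3, 5]
print(last.index.tolist())   # [2, 4, 5]
print(none.index.tolist())   # [5]
```

This is why the call in the question still returns 'paris' and 'lyon': with the default keep='first', one copy of each is kept.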

As the documentation says, keep=False drops all duplicates, so you can do:

df.drop_duplicates(subset=['x'], keep=False, inplace=True)
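If you would rather not modify df in place, an equivalent approach is to build a boolean mask with `DataFrame.duplicated` and filter on it. A minimal sketch, assuming the same data as in the question (`mask` and `unique_rows` are illustrative names):

```python
import pandas as pd

# Same example data as in the question
df = pd.DataFrame(
    {"x": ["paris", "paris", "lyon", "lyon", "toulouse"]},
    index=[1, 2, 3, 4, 5],
)

# keep=False marks every member of a duplicate group as True
mask = df.duplicated(subset=["x"], keep=False)

# Invert the mask to keep only the values that occur exactly once
unique_rows = df[~mask]

print(unique_rows)
```

The mask is also handy if you want to look at the dropped rows, since `df[mask]` gives you exactly those.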

Related Post: Drop all duplicate rows across multiple columns in Python Pandas

anky