
Nothing happens when I try to delete duplicate values.

           1           2     3            4
0          1  7733739797  НИКА  Александров
1          2  7733771790  НИКА  Александров
2          3  7733783323  НИКА  Александров
3          4  7733739797  НИКА  Александров
4          5  7733739797  НИКА  Александров

This is what my xlsx data looks like, and I have 12,000 rows. All I want is to delete every row whose value in the second column is a duplicate. Is there another way to do that, or another function I should use?

import pandas as pd

df = pd.read_excel(r'/Users/gfidarov/Downloads/TEST FOR GIRLS/ИНН аптечные сети1.xlsx')
df.drop_duplicates()
print(df)

George
  • What is the error? It seems you are not getting the deduplicated results back. If so, you just need to assign the result back: `df = df.drop_duplicates()` – anky May 26 '20 at 13:54
  • Use `df = df.drop_duplicates(subset=[2])` – jezrael May 26 '20 at 13:54
  • @anky Thanks a lot, I hadn't thought of that. Problem solved. – George May 26 '20 at 13:58
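
To make the two suggestions in the comments concrete, here is a minimal sketch. It assumes the column labels are the integers 1 to 4, as the printed frame suggests, and it rebuilds the five sample rows by hand instead of reading the Excel file; the names `unchanged_len` and `deduped` are only illustrative. `drop_duplicates` returns a new DataFrame by default, so the result has to be assigned. Note also that in this sample the first column differs in every row, so a whole-row comparison would find no duplicates; `subset=[2]` is what restricts the comparison to the second column.

```python
import pandas as pd

# Sample rows copied from the question (assumption: integer column labels 1..4).
df = pd.DataFrame(
    {
        1: [1, 2, 3, 4, 5],
        2: [7733739797, 7733771790, 7733783323, 7733739797, 7733739797],
        3: ["НИКА"] * 5,
        4: ["Александров"] * 5,
    }
)

# Calling the method without using its return value leaves df untouched.
df.drop_duplicates()
unchanged_len = len(df)

# Assign the result and compare only column 2; the first occurrence is kept.
deduped = df.drop_duplicates(subset=[2])
print(deduped)
```

With the default `keep="first"`, the rows at index 3 and 4 are dropped because their value in column 2 already appears at index 0.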
