
In pandas, you can do this by:

df.loc[:,~df.columns.duplicated()]

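df.columns.duplicated() returns a boolean array that is True for every column whose name has already appeared earlier (the first occurrence stays False), so negating it with ~ keeps only the first column of each name.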
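To make the pandas behaviour concrete, here is a minimal sketch. The frame and its column names are made up for illustration; only `DataFrame.loc` and `Index.duplicated` come from the snippet above.

```python
import pandas as pd

# Hypothetical frame in which the column name "a" appears twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# True for each column whose name was already seen further left.
mask = df.columns.duplicated()

# Negate the mask to keep only the first column of each name.
deduped = df.loc[:, ~mask]

print(mask.tolist())          # [False, False, True]
print(list(deduped.columns))  # ['a', 'b']
```

Note that this compares column names only, not the values in the columns, and it is the later column that gets dropped.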

python pandas remove duplicate columns

Luka
  • Do you mean duplicated column names? They are not possible in polars. If you create a DataFrame with duplicate column names, you will get an error. – ritchie46 Oct 22 '21 at 16:54
  • d'oh! thanks. actually I guess it silently drops the first duplicate column? which sort of achieves the no duplicates – Luka Oct 22 '21 at 18:35
  • I don't know how you materialized the DataFrame in the first place, but it may be that you overwrote the duplicates? In any case, no need to remove duplicate names. ;) – ritchie46 Oct 23 '21 at 06:46
  • They actually come from a `pd.read_sql()`. So I was wondering how to make it work with ConnectorX and Polars, but ConnectorX does not support downloading dups either so no worries – Luka Oct 25 '21 at 14:50

0 Answers