How to remove duplicate columns in Python Polars?

Asked Oct 22 '21 at 16:21

In pandas, you can do this by:

df.loc[:, ~df.columns.duplicated()]

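`df.columns.duplicated()` returns a boolean array that marks every column whose name has already appeared earlier, so negating it keeps only the first occurrence of each name. How can I do the same in Polars?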
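For reference, here is a minimal sketch of the pandas behaviour described above. The frame and its column names are made up for illustration; only `DataFrame.loc` and `Index.duplicated` come from the question.

```python
import pandas as pd

# Hypothetical frame in which the column name "a" appears twice.
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "a"])

# True for every column whose name was already seen further left.
mask = df.columns.duplicated()

# Negate the mask to keep only the first occurrence of each name.
deduped = df.loc[:, ~mask]

print(mask.tolist())          # [False, False, True]
print(list(deduped.columns))  # ['a', 'b']
```

By default `duplicated()` uses `keep="first"`, so the leftmost column of each name survives; passing `keep="last"` would keep the rightmost one instead.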

asked Oct 22 '21 at 16:21

Luka

Do you mean duplicated column names? They are not possible in polars. If you create a DataFrame with duplicate column names, you will get an error. – ritchie46 Oct 22 '21 at 16:54
D'oh, thanks! Actually, I guess it silently drops the first duplicate column, which sort of achieves the no-duplicates result? – Luka Oct 22 '21 at 18:35
I don't know how you materialized the DataFrame in the first place, but it may be that you overwrote the duplicates? In any case, no need to remove duplicate names. ;) – ritchie46 Oct 23 '21 at 06:46
They actually come from a `pd.read_sql()` call. So I was wondering how to make it work with ConnectorX and Polars, but ConnectorX does not support downloading duplicates either, so no worries. – Luka Oct 25 '21 at 14:50