There are a lot of cells with data that looks like this (I'm analyzing an E-Commerce dataset):

Mode d'été Flare Sleeve Plus Size T-shirt hors de l'épaule 

How would you recommend cleaning this dataset? I know string replacement is an option, but is there a better or more efficient approach? I only really need "Flare Sleeve Plus Size T-shirt".

thorntonc
sjoelly
  • Maybe some different encoding when reading/importing the data would help. – Quang Hoang Aug 31 '20 at 20:34
  • Does this answer your question? [How to convert text in pandas dataframe (delete punctuation, split text into one word per entry)](https://stackoverflow.com/questions/53762880/how-to-convert-text-in-pandas-dataframe-delete-punctuation-split-text-into-one) – Trenton McKinney Aug 31 '20 at 21:33
  • [How to clean a string to get value_counts for words of interest by date?](https://stackoverflow.com/questions/62236140) – Trenton McKinney Aug 31 '20 at 21:35
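The first comment's point about the import encoding can be illustrated with a minimal sketch. It assumes the file on disk is UTF-8 and was read as Windows-1252; the sample bytes and the `read_text` helper are hypothetical and not taken from the asker's dataset.

```python
import io

# Hypothetical sample: the bytes a UTF-8 encoded CSV file would contain.
raw = "title\nMode d'été Flare Sleeve Plus Size T-shirt\n".encode("utf-8")

def read_text(data: bytes, encoding: str) -> str:
    """Decode file bytes the way open(path, encoding=...) would."""
    wrapper = io.TextIOWrapper(io.BytesIO(data), encoding=encoding, newline="")
    return wrapper.read()

# Reading UTF-8 bytes with the wrong codec garbles the accented letters.
wrong = read_text(raw, "cp1252")
# Reading with the codec the file was written in keeps them intact.
right = read_text(raw, "utf-8")

print(wrong)
print(right)
```

If the data is loaded with pandas, the same choice is made through the `encoding` argument of `pandas.read_csv`.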

1 Answer

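If the accented characters show up garbled (for example é as Ã©), the text is UTF-8 that was decoded as Windows-1252. Re-encoding the garbled string as Windows-1252 and decoding it as UTF-8 restores the accents:

print("Mode d'Ã©tÃ© Flare Sleeve Plus Size T-shirt hors de l'Ã©paule ".encode('WINDOWS-1252').decode('utf-8'))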
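The encode/decode round trip can raise an exception on strings that are not garbled in this particular way, so applying it to many cells needs a guard. The following is a minimal sketch under that assumption; `fix_mojibake` and the sample `titles` list are hypothetical names introduced for illustration.

```python
def fix_mojibake(text: str) -> str:
    """Undo UTF-8 text that was mis-decoded as Windows-1252.

    Returns the input unchanged when the round trip does not apply.
    """
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Either the text has characters outside Windows-1252, or its
        # bytes are not valid UTF-8, so it was not garbled this way.
        return text

# Build a garbled sample the same way the damage usually happens.
garbled = "Mode d'été".encode("utf-8").decode("cp1252")

titles = [garbled, "Flare Sleeve Plus Size T-shirt", "Mode d'été"]
cleaned = [fix_mojibake(t) for t in titles]
print(cleaned)
```

With a pandas DataFrame, the same function could be applied to a text column with something like `df["title"].map(fix_mojibake)`, where the column name `title` is an assumption.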

Martin J