I am scraping a page and ran into a problem: some of the text shows up as '' and ''. I searched the web but found no answer on how to decode it, so I am asking here. Is there any way to decode it?

M_Sea

1 Answer

These are called "HTML entities". Searching for that name turns up many ways to parse them in Python (see "Decode HTML entities in Python string?").

import html
print(html.unescape(''))

P.S. The Unicode code points U+E091 and U+E3C4 are in the Private Use Area, so they have no standard meaning. What they display as depends on whoever defines them (e.g. a web font used by the page).
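To check whether a decoded character falls in the Private Use Area, something like the sketch below may help. It assumes the page contains numeric character references for the two code points mentioned above; the exact entity text in the question is not known, so `sample` is a guess, and `decode_and_flag` is just a hypothetical helper name.

```python
import html
import unicodedata


def decode_and_flag(raw):
    """Decode HTML entities, then list each character's code point
    and whether Unicode classifies it as private use (category 'Co')."""
    decoded = html.unescape(raw)
    return [
        (f"U+{ord(ch):04X}", unicodedata.category(ch) == "Co")
        for ch in decoded
    ]


# Assumed input: hex character references for U+E091 and U+E3C4.
sample = "&#xE091;&#xE3C4;"

for code_point, is_private in decode_and_flag(sample):
    print(code_point, "private use" if is_private else "standard")
```

If every character comes back as private use, decoding alone will not give readable text. The mapping from code point to glyph would have to come from the site's own font or stylesheet.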

John Kugelman
Coxxs