
Some Unicode data is stored in a file as literal escape sequences such as '\u84b8\u6c7d\u5730', without any encoding.

Is there a way to convert them back in Python?

dda
lucemia
  • Do you mean `'\\u84b8\\u6c7d\\u5730'` or as `u'\u84b8\u6c7d\u5730'`? – Chris Morgan Jun 19 '12 at 04:34
  • @Chris: No need to escape the backslashes, as `\u` isn't a valid escape in bytestrings. – Ignacio Vazquez-Abrams Jun 19 '12 at 04:37
  • @IgnacioVazquez-Abrams: I know; I put it with the doubled backslashes to make my meaning more obvious – Chris Morgan Jun 19 '12 at 04:37
  • As you've accepted Ignacio's answer, this must be a duplicate of [How do I treat an ASCII string as unicode and unescape the escaped characters in it in python?](http://stackoverflow.com/questions/267436/how-do-i-treat-an-ascii-string-as-unicode-and-unescape-the-escaped-characters-in) – Chris Morgan Jun 19 '12 at 04:47
  • I agree. I just cannot find out the right article for this issue. – lucemia Jun 19 '12 at 04:48

1 Answer

>>> print '\u84b8\u6c7d\u5730'.decode('unicode-escape')
蒸汽地
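
The snippet above is Python 2, where a plain string literal is a bytestring. As a minimal sketch for Python 3, assuming the file content is pure ASCII escape text, the same codec can be applied to bytes. The helper name `unescape_unicode` is hypothetical and not part of the original answer.

```python
def unescape_unicode(raw):
    """Turn literal \\uXXXX escape text into the characters it names.

    Hypothetical helper, assuming the input is pure ASCII escape text.
    Accepts bytes (as read from a file opened in binary mode) or str.
    """
    if isinstance(raw, str):
        # Raises UnicodeEncodeError if the text already contains
        # non-ASCII characters, instead of silently mangling them.
        raw = raw.encode('ascii')
    return raw.decode('unicode_escape')


# The doubled backslashes stand for the single literal backslashes in the file.
decoded = unescape_unicode(b'\\u84b8\\u6c7d\\u5730')
# decoded is now the three-character string 蒸汽地
```

Note that `unicode_escape` interprets any non-ASCII bytes as Latin-1, so this approach is only safe when the input holds escapes and plain ASCII, not a mix of escapes and UTF-8 encoded text.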
Ignacio Vazquez-Abrams