1

Possible Duplicate:
Python and character normalization

Does anyone know how to drop the umlauts and other diacritics above letters such as ā, ä, å, so that they become plain ASCII characters like a, a, a in Python?

Peter Krumins
  • See http://stackoverflow.com/questions/517923/what-is-the-best-way-to-remove-accents-in-a-python-unicode-string – zoo Jun 29 '11 at 19:22
  • See http://stackoverflow.com/questions/4162603/python-and-character-normalization – Andreas Jung Jun 29 '11 at 19:27

1 Answer

-2

I don't think they can be trivially converted, since the letters ā, ä, å are valid characters in their own right as far as Unicode is concerned. You would need to write your own mapping from, say, ā, ä, å to a, if that's what you are looking to do.
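For what it's worth, a hand-written mapping like that could look roughly like the sketch below. It assumes Python 3 (where `str` is Unicode), and the names `ACCENT_MAP` and `strip_with_map` are made up for illustration. The table only covers the three letters from the question, so anything else passes through unchanged.

```python
# Hypothetical hand-written table; extend it with whatever characters you need.
ACCENT_MAP = str.maketrans({
    "\u0101": "a",  # a with macron
    "\u00e4": "a",  # a with diaeresis (umlaut)
    "\u00e5": "a",  # a with ring above
})


def strip_with_map(text):
    """Replace every character found in ACCENT_MAP; leave the rest untouched."""
    return text.translate(ACCENT_MAP)


print(strip_with_map("\u0101 \u00e4 \u00e5"))
```

The obvious drawback is that the table has to list every accented character you expect to meet.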

sparkymat
  • This isn't correct; you can use Unicode normalization to convert "unitary" characters such as á into an a followed by a `COMBINING ACUTE ACCENT`, using `unicodedata.normalize`, for instance. –  Oct 16 '11 at 07:14
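
To illustrate the normalization approach from the comment above, here is a minimal sketch, assuming Python 3. The function name `strip_accents` is hypothetical; `unicodedata.normalize` and `unicodedata.combining` are standard-library calls. The idea is to decompose each character into a base letter plus combining marks (form NFD) and then drop the combining marks.

```python
import unicodedata


def strip_accents(text):
    """Decompose characters (NFD), then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    # unicodedata.combining() returns a non-zero class for combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(strip_accents("\u0101 \u00e4 \u00e5"))
```

Note that this only helps for characters that actually have a decomposition. Letters such as ø or ß are not a base letter plus an accent in Unicode, so they come out unchanged and would still need an explicit mapping.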