0
import re
test = unicode("شدَد", encoding='utf-8')
test = test.replace(u"\u064e", "")

This is the code to remove one character. I would like to replace any of the following unicode characters: 0622, 0623, 0625 with 0627. This is for the Arabic language. I know how to do it in multiple lines but is there a way to do it in one?

Klaus D.
  • 12,804
  • 4
  • 37
  • 48
aQaddoumi
  • 43
  • 1
  • 5

1 Answers1

2

If you want multiple characters (unicode code points) to be replaced in a oneliner, you can use a simple alternation regex:

import re
test = unicode("شدَد", encoding='utf-8')
test = re.sub(u"\u064e|\u0634", "", test,  flags=re.UNICODE)

Or, with a range regex:

test = re.sub(u"[\u064e\u0634]", "", test,  flags=re.UNICODE)
randomir
  • 17,069
  • 1
  • 37
  • 52