>>> import re
>>> re.match(u'^[一二三四五六七]、', u'一、')

If the pattern and the text are stored in variables (for example, they were read from text files),

>>> myregex='^[一二三四五六七]、'
>>> mytext='一、'

How should I pass myregex and mytext to re.match so that it behaves the same way as re.match(u'^[一二三四五六七]、', u'一、')? Thanks.

Tim
  • Your working example uses Unicode strings while your non-working example uses byte strings and that's wrong in your case. – dlask Jun 16 '15 at 03:41
  • Did you just create a duplicate of [your own question](http://stackoverflow.com/questions/30857742/unicode-regex-to-match-a-character-class-of-chinese-characters)? – Raniz Jun 16 '15 at 03:44

1 Answer


Decode both byte strings to Unicode before passing them to re.match (this assumes they are UTF-8 encoded):

re.match(myregex.decode('utf-8'), mytext.decode('utf-8'))
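
For illustration, here is a minimal self-contained sketch of the same idea. It assumes the pattern and the text arrive as UTF-8 encoded bytes (as they would when read from a file in binary mode); the names raw_regex, raw_text, pattern and text are hypothetical and not from the question.

```python
import re

# Simulate data read from files in binary mode: UTF-8 encoded bytes.
# (Assumption: the files are UTF-8; use the real encoding otherwise.)
raw_regex = u'^[一二三四五六七]、'.encode('utf-8')
raw_text = u'一、'.encode('utf-8')

# Decode to Unicode so that each Chinese character is a single element
# of the character class instead of a run of separate bytes.
pattern = raw_regex.decode('utf-8')
text = raw_text.decode('utf-8')

match = re.match(pattern, text)
no_match = re.match(pattern, u'八、')  # 八 is not in the character class

print(match is not None)
print(no_match is None)
```

If the files use a different encoding, pass that encoding name to decode instead of 'utf-8'.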
styvane