I understand that to match a literal backslash, it must be escaped in the regular expression. With raw string notation, this means r"\\". Without raw string notation, one must use "\\\\".
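A minimal sketch to confirm this understanding. The sample string "a\\b" is a made-up example, not from the original code; it holds the three characters a, backslash, b.

```python
import re

raw_pattern = r"\\"      # raw string: two characters, backslash + backslash
plain_pattern = "\\\\"   # normal string: Python's own escaping also leaves two backslashes

# Both spellings hand the same two-character pattern to the regex engine.
same_pattern = raw_pattern == plain_pattern

# Hypothetical sample containing exactly one real backslash.
sample = "a\\b"
found = re.search(raw_pattern, sample)

print(same_pattern, len(raw_pattern), found.group())
```

The regex engine sees the two-character pattern and treats it as one literal backslash.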
When I saw the code string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string), I wondered what the backslash means in \' and \`, since the pattern works just as well with a plain ' and `, as in string = re.sub(r"[^A-Za-z0-9(),!?'`]", " ", string). Is there any need to add the backslash?
Then I tried some examples in Python.
1) str1 = "\'s"
print(str1)
str2 = "'s"
print(str2)
Both print the same result: 's. I think this might be the reason why the earlier code uses \'\` in string = re.sub(r"[^A-Za-z0-9(),!?\'\`]", " ", string). Is there any difference between "\'s" and "'s"?
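A small sketch that compares the two spellings directly, both as ordinary Python strings and inside the character class. The test sentence in text is a made-up example. In a normal string literal, \' is an escape sequence for a single quote; in the raw-string pattern, the backslash reaches the regex engine, which reads \' and \` as the literal characters ' and `.

```python
import re

escaped = "\'s"   # \' is Python's escape sequence for a single quote
plain = "'s"

strings_equal = escaped == plain   # the two literals build the same string
length = len(escaped)              # 2 characters: ' and s

# With and without the backslashes inside the character class.
pattern_escaped = r"[^A-Za-z0-9(),!?\'\`]"
pattern_plain = r"[^A-Za-z0-9(),!?'`]"

text = "it's `ok` #1"   # hypothetical input
result_escaped = re.sub(pattern_escaped, " ", text)
result_plain = re.sub(pattern_plain, " ", text)

print(strings_equal, length, result_escaped == result_plain)
```

In both patterns only the # is replaced by a space, so for this class the backslashes appear to be optional.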
2) string = 'adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .'
re.match(r"\\", string)
re.match returns None, which suggests there is no backslash in the string. However, I do see backslashes in the source code. Is the backslash in \' not actually a backslash?
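A sketch that separates two things the test above mixes together: whether the string contains a backslash at all, and the fact that re.match only tries to match at the start of the string, while re.search scans the whole string. The raw string with_backslash is a made-up counterexample.

```python
import re

string = 'adequately describe co-writer/director peter jackson\'s expanded vision of j . r . r . tolkien\'s middle-earth .'

# \' in a normal string literal becomes a plain ', so no backslash is stored.
has_backslash_char = "\\" in string

match_at_start = re.match(r"\\", string)     # only checks position 0
search_anywhere = re.search(r"\\", string)   # scans the whole string

# Hypothetical counterexample: a raw string keeps the backslash as a character.
with_backslash = r"jackson\'s"
found = re.search(r"\\", with_backslash)
missed = re.match(r"\\", with_backslash)     # backslash is not at position 0

print(has_backslash_char, match_at_start, search_anywhere, found.start(), missed)
```

For a "does it appear anywhere" check, re.search (or the in operator) seems the better fit, since re.match can return None even when the character occurs later in the string.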
Thanks for your help!