I am trying to use re.split in Python. I want to remove characters such as " , ; < > { } [ ] / \ ? ! . and am currently doing something like this:

re.split("[, \_!?,;:-]+", word)

How can I add characters like " ( ) or < > ' so that they can also be removed?

Edit

re.split('\W+',word)

This works fine, but it does not remove the underscore. How can I remove underscores as well?

DilithiumMatrix
Noober

2 Answers

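Check out the str.translate method. For example, in Python 2.6+:

line = line.translate(None, " ?.!/;:")

or in Python 3+, where translate takes a single table argument that you build with str.maketrans:

line = line.translate(str.maketrans("", "", " ?.!/;:"))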

See also: Remove specific characters from a string in Python
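To make the Python 3 form concrete, here is a minimal sketch of the str.translate approach. The character set in UNWANTED and the helper name strip_chars are assumptions for illustration, not something from the question; adjust the set to whatever you need removed.

```python
# Characters to delete; this exact set is an assumption for illustration.
UNWANTED = "\",;<>{}[]/\\?!.()'_"

# str.maketrans("", "", chars) builds a table that maps each listed
# character to None, which str.translate treats as "delete".
DELETE_TABLE = str.maketrans("", "", UNWANTED)


def strip_chars(text):
    """Return text with every character in UNWANTED removed (hypothetical helper)."""
    return text.translate(DELETE_TABLE)


print(strip_chars("he<l>l_o, [wor]ld!"))  # hello world
```

Note that this deletes characters rather than splitting on them, so whitespace is kept unless you add it to the set.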

Community
pwilmot

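Try:

re.split(r'[\W_]+', word)

\W does not match the underscore because _ counts as a word character, so it has to be added to the character class explicitly. You can also just remove them:

re.sub(r'[\W_]+', '', word)

Take a look at the re module documentation for more details.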
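As a rough sketch of how the split and the removal behave on a sample string (the helper names split_words and remove_separators, and the sample inputs, are made up for illustration):

```python
import re

# [\W_] matches any non-word character or an underscore;
# the + treats a run of such characters as a single separator.
SEPARATORS = re.compile(r"[\W_]+")


def split_words(text):
    """Split on punctuation, whitespace, and underscores, dropping empty pieces."""
    return [piece for piece in SEPARATORS.split(text) if piece]


def remove_separators(text):
    """Delete every non-word character and underscore from text."""
    return SEPARATORS.sub("", text)


print(split_words('foo_bar, "baz" <qux>!'))  # ['foo', 'bar', 'baz', 'qux']
print(remove_separators("a_b (c)"))          # abc
```

The filter on empty pieces is there because re.split returns an empty string when the input starts or ends with a separator.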

Remi Guan