-1
def pre_process(t):
    """ (str) -> str
    Return a copy of the string with all punctuation removed and all letters set to lowercase. The only characters in the output are lowercase letters, digits, and whitespace.

    """
Wolph
Eric Choi

3 Answers

1

Try the following code.

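import re

string = 'This is an example sentence.'
# re.sub takes (pattern, replacement, string); the empty replacement deletes each match.
string = re.sub(r'[^a-zA-Z\d]', '', string)

print(string)

This prints Thisisanexamplesentence. Note that this pattern also removes the spaces and leaves the case unchanged, so it does not lowercase the text.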

Zak
0

Just rebuild your string with only alphanumeric characters (here _str stands for the input string):

''.join(_char for _char in _str.lower() if _char.isalnum())
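
One caveat: isalnum() is false for whitespace, so the expression above also drops spaces. If whitespace should survive, as the docstring in the question asks, a minimal sketch could add an isspace() check. The function name pre_process is taken from the question and the sample string is made up for illustration. Keep in mind that isalnum() also accepts non-ASCII letters and digits.

```python
def pre_process(t):
    # Lowercase first, then keep characters that are alphanumeric or whitespace.
    return ''.join(ch for ch in t.lower() if ch.isalnum() or ch.isspace())

print(pre_process('Hello, World! 123'))  # hello world 123
```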
midori
0

This is the simplest function using regex I could put together to achieve your requirement.

import re
def pre_process(t):
    # Lowercase the argument t (not the built-in str), then delete disallowed characters.
    return re.sub(r'[^a-z\d ]', '', t.lower())

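It returns the input string in lowercase and omits any character that is not a lowercase letter, a digit, or a space. Other whitespace, such as tabs and newlines, is removed as well; replace the literal space in the character class with \s if you want to keep it.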
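
As a sketch of the whitespace-preserving variant, the pattern can use \s instead of a literal space. This assumes that "whitespace" in the question includes tabs and newlines; the sample input is invented for illustration.

```python
import re

def pre_process(t):
    # Lowercase first, then delete everything that is not a-z, a digit, or whitespace.
    return re.sub(r'[^a-z\d\s]', '', t.lower())

print(repr(pre_process("Don't panic: it's 42!\tOK")))  # 'dont panic its 42\tok'
```

Lowercasing before the substitution is what lets the character class list only a-z; doing it the other way round would delete every uppercase letter.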

maze88