25
code: df['review'].head()

output:
index  review
0      These flannel wipes are OK, but in my opinion

I want to remove punctuation from the 'review' column of the dataframe and store the result in a new column.

code: import string

      def remove_punctuations(text):
          return text.translate(None, string.punctuation)

      df["new_column"] = df['review'].apply(remove_punctuations)

Error:
  return text.translate(None,string.punctuation)
  AttributeError: 'float' object has no attribute 'translate'

I am using Python 2.7. Any suggestions would be helpful.

cs95
data_person

3 Answers

58

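Use the pandas str.replace method with a regular expression that matches anything other than word characters and whitespace. The .str accessor leaves missing values (NaN) untouched, which avoids the AttributeError. Pass regex=True explicitly, because pandas 2.0 and later treat the pattern as a literal string by default. Note that \w includes the underscore, so underscores are kept:

df["new_column"] = df['review'].str.replace(r'[^\w\s]', '', regex=True)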
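To see why this sidesteps the error in the question, here is a minimal sketch. It assumes Python 3 and a recent pandas, and the two-row DataFrame is made-up sample data in which the second review is missing. pandas stores a missing value as a float NaN, which is the likely source of the "'float' object has no attribute 'translate'" message.

```python
import pandas as pd

# Hypothetical sample data: the second review is missing, stored as a float NaN.
df = pd.DataFrame(
    {"review": ["These flannel wipes are OK, but in my opinion", float("nan")]}
)

# The .str accessor skips missing values instead of raising AttributeError.
# regex=True is spelled out so the pattern is not treated as a literal string.
df["new_column"] = df["review"].str.replace(r"[^\w\s]", "", regex=True)

print(df["new_column"].tolist())
```

The missing row stays missing in new_column, so it can be handled later with fillna or dropna if needed.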
nalzok
Bob Haffner
26

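You can build a regex character class from the string module's punctuation list. Escaping the list with re.escape keeps characters such as ] and \ from being read as regex syntax, and regex=True makes pandas treat the pattern as a regex:

import re
import string

df['review'].str.replace('[{}]'.format(re.escape(string.punctuation)), '', regex=True)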
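As a quick check of the character-class idea, the sketch below builds the pattern with the standard library only and applies it to a made-up test string containing the characters most likely to cause trouble inside square brackets. It assumes Python 3; the names pattern, sample and cleaned are illustrative.

```python
import re
import string

# Escape every character so that ']', '\', '^' and '-' are taken literally
# inside the character class.
pattern = re.compile("[{}]".format(re.escape(string.punctuation)))

sample = r"a-b [c] d\e ^f_g!"  # hypothetical test string
cleaned = pattern.sub("", sample)
print(cleaned)  # ab c de fg
```

Unlike the [^\w\s] pattern, this one also removes underscores, because string.punctuation includes the underscore.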
David C
12

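I solved the problem by looping through string.punctuation and removing each character in turn:

import string

def remove_punctuations(text):
    # Strip one punctuation character per pass.
    for punctuation in string.punctuation:
        text = text.replace(punctuation, '')
    return text

You can call the function the same way you did, and it should work as long as every value in the column is a string. Missing values (NaN) are floats and would still raise an AttributeError.

df["new_column"] = df['review'].apply(remove_punctuations)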
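If the column may contain missing values, one option is to guard the function so that non-string values pass through unchanged. The following is only a sketch under the assumption of Python 3, where str.translate takes a table built with str.maketrans. On Python 2.7 the text.translate(None, string.punctuation) form from the question would take its place. The name _PUNCT_TABLE is illustrative.

```python
import string

# Translation table that deletes every ASCII punctuation character (Python 3 API).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)

def remove_punctuations(text):
    # Missing values arrive as float NaN; return them unchanged instead of failing.
    if not isinstance(text, str):
        return text
    return text.translate(_PUNCT_TABLE)

print(remove_punctuations("These flannel wipes are OK, but in my opinion"))
```

This version can be passed to df['review'].apply in the same way, and a single translate call avoids one replace pass per punctuation character.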
Arthur Gouveia