
How to remove a trailing ellipsis ("dots") from a pandas Series?

My attempt

import numpy as np
import pandas as pd

pd.set_option('max_colwidth',1000)

s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])


s.str.replace(r'(\w)\.+',r'\1',regex=True)

My results

Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagramcom/p/YGEt5JC6JM/


Wanted:
Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias http://instagram.com/p/YGEt5JC6JM/

BhishanPoudel

3 Answers

3

Those aren't periods; they're the ellipsis character, a single Unicode character (U+2026, written \u2026 in a Python string or regex). See "How should I write three dots?"

s.str.replace(r'(\w)\u2026+',r'\1',regex=True)
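One way to check this kind of thing is to ask Python for the character's Unicode name. The following is only a minimal sketch using the standard library; it works on a shortened copy of the sample string, and the variable names are made up for illustration.

```python
import re
import unicodedata

# Shortened copy of the sample text from the question.
text = "#sonyexperias… http://instagram.com/p/YGEt5JC6JM/"

# Take the character just before the first space and look up what it is.
suspect = text[text.index(" ") - 1]
name = unicodedata.name(suspect)
codepoint = f"U+{ord(suspect):04X}"

# The question's pattern targets ASCII periods, so it hits the URL's dot instead.
ascii_attempt = re.sub(r"(\w)\.+", r"\1", text)

# Targeting U+2026 removes only the ellipsis and leaves the URL alone.
fixed = re.sub(r"(\w)\u2026+", r"\1", text)

print(name, codepoint)
print(ascii_attempt)
print(fixed)
```

The same patterns behave the same way when passed to s.str.replace(..., regex=True).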
Barmar
2

Could you please try the following, written for the sample shown.

pd.set_option('max_colwidth',1000)
s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])
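# regex=True is required on pandas 2.0+, where str.replace defaults to literal matching
s.str.replace(r'…+', r'', regex=True)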
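Because the ellipsis is one literal character, a regular expression is not strictly needed. The sketch below assumes a pandas version whose Series.str.replace accepts the regex parameter; the second row of the Series is hypothetical data added to show that ordinary periods are left untouched.

```python
import pandas as pd

s = pd.Series([
    "#yay #sonyexperias… http://instagram.com/p/YGEt5JC6JM/",
    "no ellipsis here.",  # hypothetical extra row, not from the question
])

# regex=False treats the pattern as a plain substring, so nothing needs escaping.
cleaned = s.str.replace("\u2026", "", regex=False)

print(cleaned.tolist())
```

Unlike the (\w)-anchored patterns, this removes every ellipsis character, including any that do not follow a word character.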
RavinderSingh13
0

Following Barmar's suggestion:

s = pd.Series(["""Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias… http://instagram.com/p/YGEt5JC6JM/"""])


s.str.replace(r'(\w)…',r'\1',regex=True)

Gives:
Finally a transparant silicon case ^^ Thanks to my uncle :) #yay #Sony #Xperia #S #sonyexperias http://instagram.com/p/YGEt5JC6JM/
BhishanPoudel