How to use str.replace to replace multiple pairs at once?

Question

Currently I am using the following code to make replacements, which is a little cumbersome:

df1['CompanyA'] = df1['CompanyA'].str.replace('.','')
df1['CompanyA'] = df1['CompanyA'].str.replace('-','')
df1['CompanyA'] = df1['CompanyA'].str.replace(',','')
df1['CompanyA'] = df1['CompanyA'].str.replace('ltd','limited')
df1['CompanyA'] = df1['CompanyA'].str.replace('&','and')
df1['Address1A'] = df1['Address1A'].str.replace('.','')
df1['Address1A'] = df1['Address1A'].str.replace('-','')
df1['Address1A'] = df1['Address1A'].str.replace('&','and')
df1['Address1A'] = df1['Address1A'].str.replace(r'\brd\b', 'road', regex=True)
df1['Address2A'] = df1['Address2A'].str.replace('.','')
df1['Address2A'] = df1['Address2A'].str.replace('-','')
df1['Address2A'] = df1['Address2A'].str.replace('&','and')
df1['Address2A'] = df1['Address2A'].str.replace(r'\brd\b', 'road', regex=True)

To make on-the-fly changes easier, my ideal scenario would be something like:

df1['CompanyA'] = df1['CompanyA'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address1A'] = df1['Address1A'].str.replace(('&','and'), ('.', ''), ('-','')....)
df1['Address2A'] = df1['Address2A'].str.replace(('&','and'), ('.', ''), ('-','')....)

This is so I could just input/change what I wanted to replace for a particular column without having to adjust multiple lines of code.

Is this possible at all?

Did you try writing a loop? – mkrieger1 Jun 17 '20 at 13:11

Celius Stingher · Accepted Answer · 2020-06-19T00:56:39.760

16

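You can create a dictionary and pass it to the pandas replace() method, without chaining or repeating the call. Pass regex=True so that substrings are replaced rather than only whole cell values; the keys are then regular expressions, so escape metacharacters such as the period.

replacers = {',': '', r'\.': '', '-': '', 'ltd': 'limited'}  # etc....
df1['CompanyA'] = df1['CompanyA'].replace(replacers, regex=True)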
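
As a quick check of the dictionary approach, here is a minimal sketch on made-up data (the sample company names are hypothetical, not from the question). It assumes substring replacement is wanted, which requires regex=True; without it, Series.replace only swaps cells whose whole value equals a key. Because the keys are then regular expressions, the period is escaped.

```python
import pandas as pd

# Hypothetical sample data, not from the original question.
s = pd.Series(["acme, ltd.", "smith-jones & co."])

# With regex=True the keys are regex patterns, so the literal period is escaped.
replacers = {",": "", r"\.": "", "-": "", "ltd": "limited", "&": "and"}

# Substring replacement: every match inside each cell is substituted.
cleaned = s.replace(replacers, regex=True)

# Without regex=True only cells equal to a key would change, so nothing does here.
whole_value_only = s.replace(replacers)

print(cleaned.tolist())
print(whole_value_only.tolist())
```

The second call is included only to show why the regex flag matters for this use case.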

edited Jun 19 '20 at 00:56

answered Jun 17 '20 at 13:10

Celius Stingher


2

I like this neat answer. Also, there is nothing wrong with a for loop; it is more readable and friendlier for Python rookies – E.Serra Jun 17 '20 at 13:11
3

Yes, it might be more readable and friendly for rookies; however, I don't think passing a simple dictionary is too complex. I'd say dictionaries should be learnt before for loops and pandas together, but that's an opinion of course. Thanks for your comment. – Celius Stingher Jun 17 '20 at 13:14
1

yes, but knowing that you can pass a dictionary to replace has nothing to do with pandas or dictionaries themselves, it is just some internal magic, that was my point, it looks really clean though – E.Serra Jun 17 '20 at 13:16
1

Amazing! This is exactly the sort of functionality I was after. – Manesh Halai Jun 17 '20 at 13:27
6

Can you pass a dictionary to `replace()`? I'm confused here. With Python 3.8.5, evaluating `"abcdefghi".replace({'b':'B', 'g':'G'})` gives `TypeError: replace expected at least 2 arguments, got 1`. Am I missing something? – bitinerant Apr 06 '21 at 18:26
1

This is the pandas replace we are using, not Python's built-in replace :) – Celius Stingher Apr 06 '21 at 20:25
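
To make the distinction in this exchange concrete, the sketch below contrasts the two methods on made-up values. Python's built-in str.replace takes an old and a new substring, while the pandas Series.replace method accepts a mapping; by default the pandas method matches whole cell values, not substrings. The str.translate line is an extra shown only as a plain-Python alternative for single-character swaps.

```python
import pandas as pd

text = "abcdefghi"

# Built-in str.replace takes (old, new); passing a dict raises TypeError.
try:
    text.replace({"b": "B", "g": "G"})
    builtin_accepts_dict = True
except TypeError:
    builtin_accepts_dict = False

# For plain strings, str.translate handles several single-character swaps at once.
translated = text.translate(str.maketrans({"b": "B", "g": "G"}))

# pandas Series.replace accepts a dict; here each key equals a whole cell value.
s = pd.Series(["b", "g", "abc"])
replaced = s.replace({"b": "B", "g": "G"})

print(builtin_accepts_dict, translated, replaced.tolist())
```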

score 2 · Answer 2 · answered Jun 17 '20 at 13:06

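You could chain the replacements; each call needs the .str accessor, otherwise the later calls fall back to Series.replace, which only matches whole cell values:

df1['CompanyA'] = df1['CompanyA'].str.replace('.','').str.replace('-','').str.replace(',','').str.replace('ltd','limited').str.replace('&','and')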
...

answered Jun 17 '20 at 13:06

bigbear3001


score 1 · Answer 3 · answered Jun 17 '20 at 13:11

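The replace function also accepts a nested dictionary keyed by column name. With regex=True the keys are regular expressions, so the period must be escaped, and the result has to be assigned back:

df1 = df1.replace({'CompanyA': {'&': 'and', r'\.': '', '-': ''}}, regex=True)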
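
Since the question asks for a different replacement set per column, the nested-dictionary form can carry one mapping per column. The following is a minimal sketch: the column names follow the question, the sample rows are invented, and the keys are regex patterns (so the period is escaped).

```python
import pandas as pd

# Invented rows; only the column names come from the question.
df1 = pd.DataFrame({
    "CompanyA": ["a.b-c & d", "x ltd"],
    "Address1A": ["1 high rd.", "2 low-lane"],
})

# One mapping per column; keys are regex patterns because regex=True.
per_column = {
    "CompanyA": {"&": "and", r"\.": "", "-": "", r"\bltd\b": "limited"},
    "Address1A": {r"\.": "", "-": "", r"\brd\b": "road"},
}

# DataFrame.replace returns a new frame, so the result is assigned back.
df1 = df1.replace(per_column, regex=True)
print(df1)
```

Changing what gets replaced in a given column then only means editing that column's entry in per_column.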

answered Jun 17 '20 at 13:11

LazyCoder


score 1 · Answer 4 · answered Jun 17 '20 at 13:12

You can use a dictionary to map the characters for each column:

to_replace = {'.': '',
              ',': '',
              'foo': 'bar'
             }

for k, v in to_replace.items():
    # regex=False treats every key as literal text, so '.' is not a wildcard
    df1['CompanyA'] = df1['CompanyA'].str.replace(k, v, regex=False)
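
To reuse one mapping across several columns, the loop can be wrapped in a small helper. This is a sketch, assuming literal (non-regex) replacements; the function name replace_many and the sample data are hypothetical.

```python
import pandas as pd


def replace_many(series, mapping):
    """Apply each literal old -> new pair in turn with Series.str.replace."""
    for old, new in mapping.items():
        series = series.str.replace(old, new, regex=False)
    return series


# Invented rows; only the column names come from the question.
df1 = pd.DataFrame({
    "CompanyA": ["foo, inc.", "b.a.r"],
    "Address1A": ["1, foo st.", "n/a"],
})

to_replace = {".": "", ",": "", "foo": "bar"}

# The Python-level loop runs once per pair and per column, not once per row.
for col in ["CompanyA", "Address1A"]:
    df1[col] = replace_many(df1[col], to_replace)

print(df1)
```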

answered Jun 17 '20 at 13:12

PApostol


Never use loops in pandas; they kill all the performance there – Igor Tischenko Jun 17 '20 at 13:33

Igor Tischenko · Answer 5 · 2020-06-17T14:22:38.323

Most likely you are using a pd.DataFrame, so I suggest writing a universal remover:

def remover(row, replacers):
    # Apply every old -> new pair to a single cell value
    for k, v in replacers.items():
        if k in row:
            row = row.replace(k, v)
    return row


replacers = {',': '',
             '.': '',
             '-': '',
             'ltd': 'limited'
            }

for column in df.columns:
    df[column] = df[column].apply(lambda row: remover(row, replacers))

Or you can list specific column names to modify.
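
Restricting the loop to selected columns might look like the sketch below. The columns_to_clean list and the sample frame are hypothetical, and the isinstance guard is an added assumption so that missing or numeric cells pass through unchanged instead of raising.

```python
import pandas as pd


def remover(value, replacers):
    """Apply every replacement to one cell; non-string cells pass through."""
    if not isinstance(value, str):
        return value
    for old, new in replacers.items():
        value = value.replace(old, new)
    return value


replacers = {",": "", ".": "", "-": "", "ltd": "limited"}

# Invented data with a missing value and a numeric column.
df = pd.DataFrame({
    "CompanyA": ["a-b ltd.", None],
    "Count": [1, 2],
})

columns_to_clean = ["CompanyA"]  # hypothetical selection of columns
for column in columns_to_clean:
    df[column] = df[column].apply(remover, args=(replacers,))

print(df)
```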
