-2

I have a pandas DataFrame with a single string column:

df
columnA
Apple Banana
Orange Citron Pineapple

How can I reverse the order of the substrings, splitting on whitespace? The result should be:

columnA
Banana Apple
Pineapple Citron Orange

Right now, I am only using:

df['columnA'] = df['columnA'].replace(r'(\S+)\s+(\S+)\s+(\S+)', r'\3 \2 \1', regex=True)

but this only works if I know the number of substrings, which I do not know up front.

yatu
PV8

2 Answers

2

I'd go with a list comprehension for this task and avoid the str accessor, which is slower here:

df['new'] = [' '.join(s.split()[::-1]) for s in df['columnA']]
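One caveat with the comprehension: it calls `s.split()` on every cell, so a missing value (NaN is a float) would raise an AttributeError. A minimal sketch of a guard, assuming you want non-string cells passed through unchanged; `reverse_words` is a hypothetical helper name, not part of pandas:

```python
def reverse_words(value):
    """Reverse whitespace-separated words; return non-strings unchanged."""
    if not isinstance(value, str):
        # Assumed policy: leave NaN/None cells as they are.
        return value
    return ' '.join(value.split()[::-1])

# Plain list standing in for df['columnA'], with one missing cell.
values = ['Apple Banana', 'Orange Citron Pineapple', float('nan')]
result = [reverse_words(v) for v in values]
```

With a DataFrame this would be used as `df['new'] = [reverse_words(s) for s in df['columnA']]`.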

Timings on 20,000 rows:

df = pd.concat([df]*10000)
%timeit [' '.join(s.split()[::-1]) for s in df.columnA]
100 loops, best of 3: 12.9 ms per loop

%timeit df.columnA.str.split().apply(lambda x: ' '.join(x[::-1]))
10 loops, best of 3: 25.3 ms per loop

%timeit df.columnA.str.split().str[::-1].agg(' '.join)
10 loops, best of 3: 27.4 ms per loop

%timeit df.columnA.str.split().apply(reversed).apply(' '.join)
10 loops, best of 3: 28.7 ms per loop
PV8
rafaelc
1

The three steps you need are:

  1. Split the string on whitespace
  2. Reverse the resulting list of substrings
  3. Join the substrings back into one string
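As a rough illustration, the three steps look like this on a single plain Python string (the variable names are only for illustration):

```python
text = 'Orange Citron Pineapple'

parts = text.split()               # step 1: split on runs of whitespace
reversed_parts = parts[::-1]       # step 2: reverse the list of substrings
joined = ' '.join(reversed_parts)  # step 3: join with single spaces
```

Note that `split()` without arguments collapses repeated whitespace, so the joined result always uses single spaces between substrings.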

The first and third steps can be done with str.split and join, and the reversal with the slice [::-1], so you could do (here the column is named A):

 df.A.str.split().apply(lambda x: ' '.join(x[::-1]))

Output

0               Banana Apple
1    Pineapple Citron Orange
Name: A, dtype: object

Another alternative is to use reversed:

df.A.str.split().apply(reversed).apply(' '.join)
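The reason the second `apply` works is that `reversed` returns a lazy iterator rather than a list, and `' '.join` accepts any iterable of strings. A small sketch on a plain list, outside pandas:

```python
parts = 'Apple Banana'.split()

it = reversed(parts)             # a reverse iterator, not a list
is_list = isinstance(it, list)   # False

joined = ' '.join(it)            # join consumes the iterator
leftover = list(it)              # the iterator is now exhausted
```

Because the iterator can be consumed only once, the intermediate Series of iterators is useful only as input to the final join.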
Dani Mesejo