421

What would be the best way to split a string on the first occurrence of a delimiter?

For example:

"123mango abcd mango kiwi peach"

splitting on the first mango to get:

"abcd mango kiwi peach"
wjandrea
Acorn

5 Answers

727

From the docs:

str.split([sep[, maxsplit]])

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).

s.split('mango', 1)[1]
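
As a small illustration of the `maxsplit` behaviour quoted above, the sketch below compares one split with two on the question's string. The variable names are only illustrative.

```python
s = "123mango abcd mango kiwi peach"

# maxsplit=1: split only at the first "mango"; the remainder is left intact.
one_split = s.split("mango", 1)

# maxsplit=2: split at the first two occurrences.
two_splits = s.split("mango", 2)

print(one_split)   # ['123', ' abcd mango kiwi peach']
print(two_splits)  # ['123', ' abcd ', ' kiwi peach']
```

In both cases the list has at most maxsplit+1 elements, and the last element holds whatever was not split.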
Community
Ignacio Vazquez-Abrams
  • Note: if more splits can be performed after reaching the `maxsplit` count, the last element in the list will contain the remainder of the string (inclusive of any `sep` chars/strings). – BuvinJ Sep 10 '19 at 13:01
88
>>> s = "123mango abcd mango kiwi peach"
>>> s.split("mango", 1)
['123', ' abcd mango kiwi peach']
>>> s.split("mango", 1)[1]
' abcd mango kiwi peach'
utdemir
  • @Swiss: So what. The technique is still the same. – Ignacio Vazquez-Abrams Aug 01 '11 at 19:55
  • @Ignacio: I'm just pointing it out. No reason to have a partially correct answer in place of a completely correct one. – Swiss Aug 01 '11 at 19:57
  • Technically, this assumes the correct delimiter. The 'first' is the [1] index. The one we are all referencing would of course be the zeroth index. :D Semantics. –  Nov 15 '17 at 13:19
  • I got this error with `s.split("mango", 1)[1]`: `"value" parameter must be a scalar or dict, but you passed a "list"` – yuliansen Sep 28 '20 at 06:14
  • In my case I had to use s.split("mango", 1, expand=True)[1] in Pandas, because I was getting an error. – Alvaro Parra Dec 15 '21 at 13:55
35

For me, the better approach is:

s.split('mango', 1)[-1]

...because with [1], if the delimiter does not occur in the string, you'll get "IndexError: list index out of range".

Using -1 does no harm, because the number of splits is already limited to one: the last element is either the text after the first delimiter or, if the delimiter is absent, the whole string.
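
To make the difference concrete, here is a minimal sketch using a string that does not contain the delimiter. The sample string and the variable names are assumptions chosen for illustration.

```python
s = "123 abcd kiwi peach"  # assumed sample: contains no "mango"

# With no delimiter present, split() returns a one-element list.
parts = s.split("mango", 1)

# Index 1 does not exist in that list, so [1] raises IndexError.
try:
    tail = parts[1]
except IndexError:
    tail = None

# Index -1 always exists; here it is simply the whole string.
last = parts[-1]

print(parts)  # ['123 abcd kiwi peach']
print(tail)   # None
print(last)   # 123 abcd kiwi peach
```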

Alex
  • As written before, the second argument is the maximum number of splits that split() performs. The method will find and split on only the first 'mango' string. – Alex Jul 01 '17 at 06:57
  • Attention, this really depends on what you are going to use the result for. In many cases you will need to know whether the string was split, or you will need the program to fail if the split did not happen. With your implementation the problem would be silently skipped, and it would be much more complicated to find its cause. – pabouk - Ukraine stay strong May 17 '22 at 15:19
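
If failing loudly is what you want, one possible sketch is a small wrapper that checks for the delimiter first. The helper name `after_first` and the choice of `ValueError` are assumptions made for illustration, not something from the answers above.

```python
def after_first(text: str, sep: str) -> str:
    """Return the part of text after the first sep; raise if sep is absent."""
    if sep not in text:
        # Hypothetical choice: signal the missing delimiter explicitly.
        raise ValueError(f"separator {sep!r} not found in {text!r}")
    return text.split(sep, 1)[1]


print(after_first("123mango abcd mango kiwi peach", "mango"))
```

Calling `after_first("kiwi peach", "mango")` raises `ValueError` instead of quietly returning the whole string.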
15

You can also use str.partition:

>>> text = "123mango abcd mango kiwi peach"

>>> text.partition("mango")
('123', 'mango', ' abcd mango kiwi peach')

>>> text.partition("mango")[-1]
' abcd mango kiwi peach'

>>> text.partition("mango")[-1].lstrip()  # if whitespace strip-ing is needed
'abcd mango kiwi peach'

The advantage of using str.partition is that it's always gonna return a tuple in the form:

(<pre>, <separator>, <post>)

So this makes unpacking the output really flexible as there's always going to be 3 elements in the resulting tuple.
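
A short sketch of that unpacking, including the case where the separator is missing. The second sample string is an assumption added for illustration.

```python
text = "123mango abcd mango kiwi peach"

# Three-way unpacking works whether or not the separator is found.
before, sep, after = text.partition("mango")
print(before)  # 123
print(sep)     # mango
print(after)   #  abcd mango kiwi peach (with a leading space)

# When the separator is absent, the whole string comes back first,
# followed by two empty strings.
missing = "kiwi peach".partition("mango")
print(missing)  # ('kiwi peach', '', '')
```

An empty middle element is therefore a simple way to detect that no split took place.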

heemayl
  • This is really useful for creating key value pairs from a line of text, if some of the lines only have a key, since, as you pointed out, you always get a tuple: `key, _, value = text_line.partition(' ')` – Enterprise Dec 05 '20 at 16:51
  • You could even ignore the separator in the tuple with a one-liner using slices: `key, value = text_line.partition(' ')[::2]` – giuliano-oliveira Sep 16 '21 at 18:02
-6
df.columnname[1].split('.', 1)

This splits a single value from a DataFrame column (here the one at index label 1, assumed to be a string) on the first occurrence of '.'.

Shay
  • If anyone's looking for the operation on a Pandas DataFrame, it should be this: `df["column_name"].str.split('.', 1)` Docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.split.html – Nisse Knudsen May 19 '21 at 13:05