Splitting on first occurrence

Question

What would be the best way to split a string on the first occurrence of a delimiter?

For example:

"123mango abcd mango kiwi peach"

splitting on the first mango to get:

"abcd mango kiwi peach"

Remember `first, *rest = my_list` exists – P i Aug 30 '21 at 14:30
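
The starred-unpacking hint in the comment above combines with a split limited to one cut. A minimal sketch on the question's sample string (the variable names and the second sample string are illustrative only):

```python
s = "123mango abcd mango kiwi peach"

# Limit the split to one cut, then unpack: `rest` is a list that holds
# the remainder when the delimiter was found and is empty otherwise.
first, *rest = s.split("mango", 1)
print(first)  # 123
print(rest)   # [' abcd mango kiwi peach']

# Without the delimiter nothing is raised; the starred name is just empty.
missing_first, *missing_rest = "no fruit here".split("mango", 1)
print(missing_first)  # no fruit here
print(missing_rest)   # []
```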

score 727 · Accepted Answer · edited Jun 20 '20 at 09:12


From the docs:

str.split([sep[, maxsplit]])

Return a list of the words in the string, using sep as the delimiter string. If maxsplit is given, at most maxsplit splits are done (thus, the list will have at most maxsplit+1 elements).

s.split('mango', 1)[1]
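
To make the effect of `maxsplit` concrete, here is a small sketch comparing an unlimited split with a split limited to one cut, using the string from the question (the variable names are illustrative):

```python
s = "123mango abcd mango kiwi peach"

unlimited = s.split("mango")   # cut at every occurrence
limited = s.split("mango", 1)  # cut at the first occurrence only

print(unlimited)  # ['123', ' abcd ', ' kiwi peach']
print(limited)    # ['123', ' abcd mango kiwi peach']

# With maxsplit=1 the list has at most 2 elements, and the last one
# keeps any later occurrences of the delimiter untouched.
remainder = limited[1]
print(repr(remainder))  # ' abcd mango kiwi peach'
```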

edited Jun 20 '20 at 09:12 by Community · answered Aug 01 '11 at 19:48 by Ignacio Vazquez-Abrams

1

Note: if more splits can be performed after reaching the `maxsplit` count, the last element in the list will contain the remainder of the string (inclusive of any `sep` chars/strings). – BuvinJ Sep 10 '19 at 13:01

utdemir · Answer 2 · 2011-08-01T19:56:28.297

88

>>> s = "123mango abcd mango kiwi peach"
>>> s.split("mango", 1)
['123', ' abcd mango kiwi peach']
>>> s.split("mango", 1)[1]
' abcd mango kiwi peach'

edited Aug 01 '11 at 19:56 · answered Aug 01 '11 at 19:47 by utdemir

8

@Swiss: So what. The technique is still the same. – Ignacio Vazquez-Abrams Aug 01 '11 at 19:55
7

@Ignacio: I'm just pointing it out. No reason to have a partially correct answer in place of a completely correct one. – Swiss Aug 01 '11 at 19:57
Technically assumes the correct delimiter. The 'first' is the [1] index. The one we are all referencing would of course be the zero-ith index. :D Semantics. – Nov 15 '17 at 13:19
I got ``"value" parameter must be a scalar or dict, but you passed a "list"`` returned with ``s.split("mango", 1)[1]`` – yuliansen Sep 28 '20 at 06:14
In my case I had to use `s.str.split("mango", n=1, expand=True)[1]` in Pandas, because I was getting an error – Alvaro Parra Dec 15 '21 at 13:55

score 35 · Answer 3 · answered Jun 09 '14 at 08:26


For me the better approach is:

s.split('mango', 1)[-1]

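...because if the delimiter does not occur in the string, indexing with [1] raises "IndexError: list index out of range".

Using -1 does no harm, because the number of splits is already limited to one: the list has at most two elements, so [-1] is the remainder when the delimiter is found and the whole string when it is not.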
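
A short sketch of the difference between the two indexes; the second sample string is made up for illustration:

```python
present = "123mango abcd mango kiwi peach"
absent = "123 abcd kiwi peach"

# Delimiter present: [1] and [-1] refer to the same element.
assert present.split("mango", 1)[1] == present.split("mango", 1)[-1]

# Delimiter absent: the list has a single element, so [1] raises...
try:
    absent.split("mango", 1)[1]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: list index out of range

# ...while [-1] quietly returns the whole, unsplit string.
fallback = absent.split("mango", 1)[-1]
print(repr(fallback))  # '123 abcd kiwi peach'
```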

answered Jun 09 '14 at 08:26 by Alex

2

As written before, the second argument is the number of splits that split() performs, so the method finds and splits on only the first 'mango'. – Alex Jul 01 '17 at 06:57
Attention, this really depends on what you are going to use the result for. In many cases you will need to know whether the string was split, or you will need the program to fail if the split did not happen. With your implementation the problem would be silently skipped and its cause would be much harder to find. – pabouk - Ukraine stay strong May 17 '22 at 15:19
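
If a missing delimiter should fail loudly rather than be skipped, one option is an explicit length check. This is only a sketch: the helper name `after_first` and its error message are hypothetical, not part of any answer here.

```python
def after_first(text, sep):
    """Return what follows the first `sep` in `text`; raise if `sep` is absent."""
    parts = text.split(sep, 1)
    if len(parts) == 1:
        # No cut was made, so the delimiter is not in the string.
        raise ValueError(f"delimiter {sep!r} not found in {text!r}")
    return parts[1]


print(repr(after_first("123mango abcd mango kiwi peach", "mango")))
# ' abcd mango kiwi peach'
```

Calling `after_first("123 abcd", "mango")` raises `ValueError`, so the caller finds out immediately that no split happened.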

score 15 · Answer 4 · answered Oct 30 '19 at 16:40


You can also use str.partition:

>>> text = "123mango abcd mango kiwi peach"

>>> text.partition("mango")
('123', 'mango', ' abcd mango kiwi peach')

>>> text.partition("mango")[-1]
' abcd mango kiwi peach'

>>> text.partition("mango")[-1].lstrip()  # if whitespace stripping is needed
'abcd mango kiwi peach'

The advantage of using str.partition is that it's always gonna return a tuple in the form:

(<pre>, <separator>, <post>)

So this makes unpacking the output really flexible as there's always going to be 3 elements in the resulting tuple.
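
One detail worth checking is what the three elements look like when the separator is missing. A minimal sketch; the second sample string is made up for illustration:

```python
text = "123mango abcd mango kiwi peach"

# Separator found: (before, separator, after)
before, sep, after = text.partition("mango")
print((before, sep, after))  # ('123', 'mango', ' abcd mango kiwi peach')

# Separator missing: the whole string comes first, then two empty strings.
missing = "123 abcd kiwi peach".partition("mango")
print(missing)  # ('123 abcd kiwi peach', '', '')

# So [-1] is '' here, whereas split("mango", 1)[-1] would give the whole
# string. The middle element tells you whether a split actually happened.
was_split = bool(missing[1])
print(was_split)  # False
```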

answered Oct 30 '19 at 16:40 by heemayl

2

This is really useful for creating key value pairs from a line of text, if some of the lines only have a key, since, as you pointed out, you always get a tuple: `key, _, value = text_line.partition(' ')` – Enterprise Dec 05 '20 at 16:51
1

You could even ignore the separator in the tuple with a one-liner using slices: `key, value = text_line.partition(' ')[::2]` – giuliano-oliveira Sep 16 '21 at 18:02
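
A sketch of the key/value idea from the two comments above. The sample lines and the `pairs` dictionary are invented for illustration:

```python
lines = ["name Alice", "verbose", "path /tmp/some dir"]

pairs = {}
for line in lines:
    # partition always yields three items, so unpacking never fails;
    # a line without a space gets an empty string as its value.
    key, _, value = line.partition(" ")
    pairs[key] = value

print(pairs)
# {'name': 'Alice', 'verbose': '', 'path': '/tmp/some dir'}

# The slice variant keeps elements 0 and 2, dropping the separator.
sliced_key, sliced_value = "name Alice".partition(" ")[::2]
print(sliced_key, sliced_value)  # name Alice
```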

score -6 · Answer 5 · edited Oct 30 '19 at 17:23


df.columnname[1].split('.', 1)

This splits a single value, either a plain string or one entry of a data frame column, at the first occurrence of '.'.

edited Oct 30 '19 at 17:23 by Shay · answered Oct 30 '19 at 16:25 by himanshu arora

1

_If_ anyone's looking for the operation on a Pandas DataFrame, it should be this: `df["column_name"].str.split('.', n=1)` Docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.str.split.html – Nisse Knudsen May 19 '21 at 13:05
