228

How to split this string where __ is the delimiter

MATCHES__STRING

To get an output of ['MATCHES', 'STRING']?

shgnInc
Hulk
  • 6
    http://docs.python.org/library/stdtypes.html#str.split – getekha Aug 13 '10 at 08:50
  • 9
    It is worth reading the Python standard documentation and trying to understand a few programs others have written to start grasping the basics of Python. Practice and copying/modifying are great tools for learning a language. – Tony Veijalainen Aug 13 '10 at 09:00

5 Answers

378

You can use the str.split method: string.split('__')

>>> "MATCHES__STRING".split("__")
['MATCHES', 'STRING']
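
A short sketch of how str.split behaves in a few neighboring cases. Only the first input comes from the question; the other strings are made up for illustration.

```python
s = "MATCHES__STRING"

# Basic split on a multi-character delimiter.
parts = s.split("__")
print(parts)  # ['MATCHES', 'STRING']

# maxsplit limits the number of splits, counting from the left.
limited = "A__B__C".split("__", 1)
print(limited)  # ['A', 'B__C']

# If the delimiter does not occur, the whole string comes back as a one-item list.
missing = "NODELIM".split("__")
print(missing)  # ['NODELIM']

# Adjacent delimiters produce empty strings in the result.
adjacent = "A____B".split("__")
print(adjacent)  # ['A', '', 'B']
```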
MendelG
adamk
  • 1
    I was wondering, what is the difference between the first example (simply using split()) and the second example (with a for loop)? – EndenDragon Jun 26 '16 at 18:21
  • 4
    @EndenDragon The for loop will automatically apply `x.strip()` and return a list of matches without whitespace on either side. The devil is in the details. – Sébastien Vercammen Jun 29 '16 at 13:59
  • Hey, since this is a very popular question, I edited it to ask only 1 specific question and removed the part with the spaces around the delimiter because it wasn't clear what the OP even expected to happen (Since there never was a question in the question). I think the question (and answers) are more useful this way, but feel free to rollback all the edits if you disagree. – Aran-Fey Oct 09 '18 at 14:55
  • Often you just want one part of the `splitted` string. Get it with `'match'.split('delim')[0]` for the first one, etc. – Timo Mar 15 '22 at 12:37
4

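You may be interested in the csv module, which is designed for comma-separated files but can be configured to use a custom delimiter. Note that csv only accepts a one-character delimiter, so it cannot split on "__" directly (passing delimiter="__" raises a TypeError); it fits when the separator is a single character.

import csv

# "|" stands in for a single-character delimiter; other dialect
# options can be passed as additional keyword arguments.
csv.register_dialect("myDialect", delimiter="|")
lines = ["MATCHES|STRING"]

# The dialect must be named explicitly, otherwise the default (comma) is used.
for row in csv.reader(lines, dialect="myDialect"):
    print(row)  # ['MATCHES', 'STRING']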
Aran-Fey
Katriel
2

When the string splits into a known number of parts (three in the example below), you can unpack the result directly into comma-separated variable names:

date, time, event_name = ev.get_text(separator='@').split("@")

After this line of code, the three variables will have values from three parts of the variable ev.

So, if the variable ev contains this string and we apply separator @:

Sa., 23. März@19:00@Klavier + Orchester: SPEZIAL

Then, after the split operation:

  • date will have value Sa., 23. März
  • time will have value 19:00
  • event_name will have value Klavier + Orchester: SPEZIAL
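
The same unpacking can be sketched with a plain string. Here `ev_text` is a hypothetical stand-in for whatever `ev.get_text(separator='@')` returns in the answer above, and passing `maxsplit=2` is an added precaution, not part of the original code.

```python
# Hypothetical stand-in for the text returned by ev.get_text(separator='@').
ev_text = "Sa., 23. März@19:00@Klavier + Orchester: SPEZIAL"

# maxsplit=2 yields at most three parts, so an "@" inside the event name
# would not break the three-way unpacking.
date, time, event_name = ev_text.split("@", 2)

print(date)        # Sa., 23. März
print(time)        # 19:00
print(event_name)  # Klavier + Orchester: SPEZIAL
```

If the string contains fewer than two separators, the unpacking raises a ValueError, so this pattern suits input whose shape is known in advance.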
Gino Mempin
Sergey Nasonov
0

For Python 3.8, you don't actually need the get_text method; you can just use ev.split("@"). In fact, calling get_text on a plain string raises an AttributeError, because str has no such method. So if you have a string variable, for example:

filename = 'file/foo/bar/fox'

You can split it into separate variables with commas, as suggested in the answer above, but with a correction:

W, X, Y, Z = filename.split('/')
# W == 'file'
# X == 'foo'
# Y == 'bar'
# Z == 'fox'
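
When the number of parts is not known in advance, starred unpacking is one option. This is a minimal sketch using the same `filename` value; the names `first`, `rest`, `head`, and `last` are made up for illustration.

```python
filename = "file/foo/bar/fox"

# Take the first part and collect the remainder in a list.
first, *rest = filename.split("/")
print(first)  # file
print(rest)   # ['foo', 'bar', 'fox']

# Or take the last part and collect everything before it.
*head, last = filename.split("/")
print(head)   # ['file', 'foo', 'bar']
print(last)   # fox
```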
Gnai
0

Besides split and rsplit, there are partition and rpartition. They split the string only once, at the first (or last) occurrence of the delimiter, but given how the question was asked, they may apply as well.

Example:

>>> "MATCHES__STRING".partition("__")
('MATCHES', '__', 'STRING')

>>> "MATCHES__STRING".partition("__")[::2]
('MATCHES', 'STRING')

It is also a bit faster than split("_", 1):

$ python -m timeit "'validate_field_name'.split('_', 1)[-1]"
2000000 loops, best of 5: 136 nsec per loop

$ python -m timeit "'validate_field_name'.partition('_')[-1]"
2000000 loops, best of 5: 108 nsec per loop

Timeit lines are based on this answer
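
One difference worth keeping in mind is what partition and rpartition return when the delimiter is absent. A small sketch, with input strings other than the question's made up for illustration:

```python
# Delimiter present: a 3-tuple of (before, delimiter, after).
found = "MATCHES__STRING".partition("__")
print(found)  # ('MATCHES', '__', 'STRING')

# Delimiter absent: partition keeps the whole string in the first slot.
absent = "MATCHES".partition("__")
print(absent)  # ('MATCHES', '', '')

# rpartition splits at the last occurrence instead of the first.
right = "A__B__C".rpartition("__")
print(right)  # ('A__B', '__', 'C')

# Delimiter absent: rpartition keeps the whole string in the last slot.
absent_r = "MATCHES".rpartition("__")
print(absent_r)  # ('', '', 'MATCHES')
```

Because the result is always a 3-tuple, unpacking it never fails, unlike unpacking the list returned by split.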

topin89