16

Using Python, I want to split the following string:

a=foo, b=bar, c="foo, bar", d=false, e="false"

This should result in the following list:

['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"']

When using shlex in posix mode and splitting on ", ", the argument for c gets treated correctly. However, it removes the quotes. I need them because false is not the same as "false", for instance.

My code so far:

import shlex

mystring = 'a=foo, b=bar, c="foo, bar", d=false, e="false"'

splitter = shlex.shlex(mystring, posix=True)
splitter.whitespace += ','
splitter.whitespace_split = True
print(list(splitter))  # ['a=foo', 'b=bar', 'c=foo, bar', 'd=false', 'e=false']
Remo

2 Answers

26
>>> import re
>>> s = r'a=foo, b=bar, c="foo, bar", d=false, e="false", f="foo\", bar"'
>>> re.findall(r'(?:[^\s,"]|"(?:\\.|[^"])*")+', s)
['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"', 'f="foo\\", bar"']
  1. The regex pattern "[^"]*" matches a simple quoted string.
  2. "(?:\\.|[^"])*" matches a quoted string and skips over escaped quotes because \\. consumes two characters: a backslash and any character.
  3. [^\s,"] matches a non-delimiter.
  4. Combining patterns 2 and 3 inside (?: | )+ matches a sequence of non-delimiters and quoted strings, which is the desired result.
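The steps above can be wrapped in a small helper. This is only a sketch: the names TOKEN_RE and split_keep_quotes are made up for illustration, while the pattern itself is the one built up in the list.

```python
import re

# Hypothetical names for illustration; the pattern is the one explained above.
# Either a non-delimiter character, or a whole quoted string that may
# contain backslash-escaped characters; one or more of these form a token.
TOKEN_RE = re.compile(r'(?:[^\s,"]|"(?:\\.|[^"])*")+')


def split_keep_quotes(text):
    """Return the comma/whitespace-separated tokens of text, quotes included."""
    return TOKEN_RE.findall(text)


tokens = split_keep_quotes('a=foo, b=bar, c="foo, bar", d=false, e="false"')
print(tokens)
# ['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"']
```

Compiling the pattern once is just a convenience if the helper is called many times; calling re.findall directly behaves the same.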
Janne Karila
0

Regex can solve this easily enough:

import re

mystring = 'a=foo, b=bar, c="foo, bar", d=false, e="false"'

splitString = re.split(r',?\s(?=\w+=)', mystring)

The pattern matches an optional comma and one whitespace character, but only where a lookahead finds one or more word characters followed by an equals sign. Splitting at those points gives the pieces you want and leaves any quotes untouched.
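For reference, here is the same split with the pattern annotated piece by piece and the result on the string from the question. The variable name parts is only illustrative.

```python
import re

mystring = 'a=foo, b=bar, c="foo, bar", d=false, e="false"'

# ,?        an optional comma
# \s        one whitespace character
# (?=\w+=)  lookahead: a key and "=" must follow (they are not consumed)
parts = re.split(r',?\s(?=\w+=)', mystring)
print(parts)
# ['a=foo', 'b=bar', 'c="foo, bar"', 'd=false', 'e="false"']
```

The comma inside "foo, bar" is not a split point because what follows it is bar", which has no equals sign after the word characters.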

ydaetskcoR
  • 1
    This would split `'c="foo, bar="'` – Janne Karila May 23 '13 at 10:04
  • Fair point. I guess that's the problem with regex: a less explicit pattern always seems to get caught out by unexpected cases, yet an explicit one can be horrible to read and understand. – ydaetskcoR May 23 '13 at 10:07
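To make the edge case from the comments concrete, here is a quick comparison of the two patterns from this thread on an input whose quoted value contains something that looks like a key. The input string and variable names are made up for illustration.

```python
import re

# Hypothetical input: the quoted value for c contains ", bar=".
tricky = 'a=1, c="foo, bar=", d=2'

# Lookahead-based split: treats ", bar=" inside the quotes as a boundary.
by_split = re.split(r',?\s(?=\w+=)', tricky)
print(by_split)    # ['a=1', 'c="foo', 'bar="', 'd=2']

# Token-matching findall: consumes the quoted string as a single unit.
by_findall = re.findall(r'(?:[^\s,"]|"(?:\\.|[^"])*")+', tricky)
print(by_findall)  # ['a=1', 'c="foo, bar="', 'd=2']
```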