I'm creating a class that renames a file using a user-specified format. This format will be a simple string whose str.format method will be called to fill in the blanks.

It turns out that my procedure will require extracting variable names contained in braces. For example, a string may contain {user}, which should yield user. Of course, there will be several sets of braces in a single string, and I'll need to get the contents of each, in the order in which they appear and output them to a list.

Thus, "{foo}{bar}" should yield ['foo', 'bar'].

I suspect that the easiest way to do this is to use re.split, but I know nothing about regular expressions. Can someone help me out?

Thanks in advance!

Louis Thibault
  • If you know all possible variables *beforehand*, you can just pass them all to `str.format`; it will ignore those not in the pattern. `'{user}_{bar}'.format(user='Mike', foo=1, bar=2)` will output `Mike_2`. I happened to have the allowed vars fixed in a dict, so I could skip looking for vars in the pattern. Anyway, knowing about `string.Formatter()` is useful. – yentsun Mar 11 '13 at 10:10
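
A minimal sketch of the approach described in the comment above, assuming the allowed variables live in a dict. The names `allowed` and `pattern` and their values are hypothetical, chosen only for illustration:

```python
# Hypothetical set of allowed variables; the values are illustrative only.
allowed = {"user": "Mike", "foo": 1, "bar": 2}

# Hypothetical user-supplied pattern that uses only some of the variables.
pattern = "{user}_{bar}"

# str.format silently ignores keyword arguments the pattern does not use,
# so "foo" can be passed without causing an error.
result = pattern.format(**allowed)
print(result)  # Mike_2
```

Note that the reverse case is not forgiving: a pattern that names a variable missing from the dict raises `KeyError`.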

2 Answers

Another possibility is to use Python's actual Formatter itself to extract the field names for you:

>>> import string
>>> s = "{foo} spam eggs {bar}"
>>> string.Formatter().parse(s)
<formatteriterator object at 0x101d17b98>
>>> list(string.Formatter().parse(s))
[('', 'foo', '', None), (' spam eggs ', 'bar', '', None)]
>>> field_names = [name for text, name, spec, conv in string.Formatter().parse(s)]
>>> field_names
['foo', 'bar']

or (shorter but less informative):

>>> field_names = [v[1] for v in string.Formatter().parse(s)]
>>> field_names
['foo', 'bar']
DSM
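
One caveat worth sketching for the `Formatter` approach: when the format string ends with literal text, or contains escaped braces, `parse` yields tuples whose field name is `None`, so a plain comprehension would put `None` in the list. The helper name `extract_field_names` below is hypothetical, not part of the answer above:

```python
import string


def extract_field_names(fmt):
    """Return the replacement-field names of fmt, in order of appearance."""
    return [
        name
        for _text, name, _spec, _conv in string.Formatter().parse(fmt)
        # Chunks that are pure literal text have no field, so name is None.
        if name is not None
    ]


print(extract_field_names("{foo} spam eggs {bar} trailing"))  # ['foo', 'bar']
```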

Using re.findall():

In [5]: import re

In [8]: strs = "{foo} spam eggs {bar}"

In [9]: re.findall(r"{(\w+)}", strs)
Out[9]: ['foo', 'bar']
Ashwini Chaudhary
  • Just a quick question. Are the results from `re.findall` guaranteed to be listed in the same order as they appear in the string? – Louis Thibault Dec 27 '12 at 21:53
  • @blz yes, as the string is parsed from left to right. – Ashwini Chaudhary Dec 27 '12 at 21:56
  • Beware, this does not account for format specifiers such as `{spam:3f}`. @DSM's answer should be the accepted one. Modifying the `\w` to include more characters until it matches the full spec of `str.format` could work, but using the formatter itself is better (and not prone to breakage if the syntax evolves) – ewen-lbh Apr 18 '21 at 08:55
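
To illustrate the point in the last comment, here is a small side-by-side sketch. The format string is made up for the example, and the variable names `regex_names` and `parsed_names` are hypothetical:

```python
import re
import string

# Illustrative format string with a format spec (:3f) and a conversion (!r).
fmt = "{spam:3f} and {eggs!r}"

# The regex only matches braces containing nothing but word characters,
# so fields carrying a spec or conversion are missed entirely.
regex_names = re.findall(r"{(\w+)}", fmt)

# The formatter splits each field into name, spec, and conversion,
# so the bare names are still recovered.
parsed_names = [
    name
    for _text, name, _spec, _conv in string.Formatter().parse(fmt)
    if name is not None
]

print(regex_names)   # []
print(parsed_names)  # ['spam', 'eggs']
```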