I'm fairly inexperienced with regex, but I need one to match the parameter of a function. This function will appear multiple times in the string, and I would like to return a list of all parameters.

The regex must match:

  1. Alphanumeric and underscore
  2. Inside quotes directly inside parentheses
  3. After a specific function name

Here's an example string:

Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])

and I would like this as output:

['_xyx', 'y', 'z_', 'x_1', 'x', 'y']

What I have so far:

(?<=Atom\(')[\w|_]*

I'm calling this with:

import re

s = "Generic3(p, [Generic3(g, [Atom('x'), Atom('y'), Atom('z')]), Atom('x'), Generic2(f, [Atom('x'), Atom('y')])])"
print(re.match(r"(?<=Atom\(')[\w|_]*", s))

But this just prints None. I feel like I'm nearly there, but I'm missing something, maybe on the Python side to actually return the matches.

bendl

1 Answer

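Your regex is close. \w already matches the underscore, so [\w|_] can be just \w (inside a character class, | is a literal pipe, not alternation). The actual problem is re.match, which only matches at the very start of the string; use re.findall to collect every match:

import re

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"

r = r"(?<=Atom\(')\w+"

final_data = re.findall(r, s)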
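
To see why the call in the question printed None, here is a minimal sketch comparing re.match, re.search, and re.findall on the same lookbehind pattern. The shortened sample string and the variable names are illustrative assumptions, not taken from the question.

```python
import re

# Shortened, made-up sample in the same shape as the question's string.
s = "Generic2(f, [Atom('x_1'), Atom('_y')])"
pattern = r"(?<=Atom\(')\w+"

# re.match only tries position 0, where the lookbehind cannot succeed.
at_start = re.match(pattern, s)

# re.search scans forward but stops at the first hit.
first = re.search(pattern, s)

# re.findall collects every non-overlapping hit as a list of strings.
all_hits = re.findall(pattern, s)

print(at_start)       # None
print(first.group())  # x_1
print(all_hits)       # ['x_1', '_y']
```

So the pattern itself was fine; only the function used to apply it needed to change.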

You can also use a capture group instead of a lookbehind; re.findall then returns only the captured text:

import re

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"

# Raw string, so \( and \) reach the regex engine without escape warnings.
new_data = re.findall(r"Atom\('(.*?)'\)", s)

Output:

['_xyx', 'y', 'z_', 'x_1', 'x', 'y']
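
If the function name varies, the capture-group approach can be wrapped in a small helper. This is only a sketch: extract_args and its parameters are hypothetical names introduced here, and it assumes each call takes exactly one single-quoted argument made of word characters.

```python
import re

def extract_args(text, func_name):
    """Return every single-quoted word-character argument passed to func_name."""
    # re.escape guards against regex metacharacters in the name;
    # the leading word boundary keeps "Atom" from matching inside "SubAtom".
    pattern = rf"\b{re.escape(func_name)}\('(\w+)'\)"
    return re.findall(pattern, text)

s = "Generic3(p, [Generic3(g, [Atom('_xyx'), Atom('y'), Atom('z_')]), Atom('x_1'), Generic2(f, [Atom('x'), Atom('y')])])"
print(extract_args(s, "Atom"))  # ['_xyx', 'y', 'z_', 'x_1', 'x', 'y']
```

Using (\w+) rather than (.*?) restricts matches to the alphanumeric-and-underscore rule from the question.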
Ajax1234