28

Is there a way to check whether a line contains words that match any of a set of regex patterns? If I have [regex1, regex2, regex3] and want to see if a line matches any of them, how would I do this? Right now I am using re.findall(regex1, line), but that only tests one regex at a time.

egidra

4 Answers

51

You can use the built-in function any (or all, if every regex has to match) with a generator expression to cycle through all the regex objects.

any(regex.match(line) for regex in [regex1, regex2, regex3])

(or any(re.match(regex_str, line) for regex_str in [regex_str1, regex_str2, regex_str3]) if the regexes are not pre-compiled regex objects, of course)
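As a minimal sketch of the any() approach: the three pattern strings and the helper name line_matches_any are made up for illustration and are not from the question. It uses search rather than match, on the assumption that the goal is to find a match anywhere in the line; match only tests at the start of the string.

```python
import re

# Hypothetical patterns, chosen only for illustration.
patterns = [re.compile(p) for p in (r"\bfoo\b", r"\d{3}", r"ba[rz]")]

def line_matches_any(line, compiled=patterns):
    """Return True if at least one compiled pattern is found anywhere in line."""
    # any() short-circuits on the first successful search.
    return any(p.search(line) for p in compiled)

print(line_matches_any("order 123 shipped"))
print(line_matches_any("nothing here"))
```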

However, testing each pattern in turn is less efficient than combining your regexes into a single expression. If this code is time- or CPU-critical, try instead to compose a single regular expression that covers all your needs, using the regex alternation operator | to separate the original expressions.

A simple way to combine all the regexes is to use the string join method:

re.match("|".join([regex_str1, regex_str2, regex_str3]), line)

A warning about combining the regexes in this way: it can produce wrong expressions if the original ones already use the | operator.

iff_or
jsbueno
  • 3
    You can make the join method less likely to fail if you wrap each expression in parentheses. `'(' + ')|('.join(['foo', 'bar', 'baz']) + ')'` gives `'(foo)|(bar)|(baz)'`. – Brigand Jan 17 '12 at 02:02
  • 10
    Better yet, wrap in `(?:...)`, and put the string together in a way that highlights its logical structure. `'|'.join('(?:{0})'.format(x) for x in ('foo', 'bar', 'baz'))` for example. – Karl Knechtel Jan 17 '12 at 02:53
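The non-capturing variant from the comment above can be sketched as follows. The helper name combine is hypothetical, and the literal patterns are the ones used in the comments.

```python
import re

def combine(regex_strings):
    """Join pattern strings into one alternation, each wrapped in (?:...)."""
    return "|".join("(?:{0})".format(s) for s in regex_strings)

combined = re.compile(combine(["foo", "bar", "baz"]))
print(combined.pattern)   # (?:foo)|(?:bar)|(?:baz)
print(bool(combined.search("a line with baz in it")))
```

The (?:...) wrappers add no capture groups of their own, so the combined pattern only has whatever groups the original expressions already contained.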
7

Try this new regex: (regex1)|(regex2)|(regex3). This will match a line with any of the 3 regexes in it.
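With capturing groups like these, the match object can also tell you which alternative matched. This is only a sketch: the three sub-patterns are hypothetical stand-ins for regex1, regex2 and regex3, and it assumes none of them contains capturing groups of its own.

```python
import re

# Hypothetical stand-ins for regex1, regex2, regex3.
combined = re.compile(r"(foo\d+)|(bar)|(baz)")

m = combined.search("xx bar yy")
if m:
    # lastindex is the number of the last matched group: 1, 2 or 3 here.
    print(m.lastindex, m.group(m.lastindex))
```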

jok3rnaut
7

You could loop through the regex items and do a search.

import re

regexList = [regex1, regex2, regex3]

line = 'line of data'
gotMatch = False
for regex in regexList:
    # search scans the whole line, not just the start
    s = re.search(regex, line)
    if s:
        gotMatch = True
        break

if gotMatch:
    doSomething()
tharen
1
I'm quite new to Python but had the same problem. I made this to find all matches for multiple regular expressions.

    import re

    regex1 = r"your regex here"
    regex2 = r"your regex here"
    regex3 = r"your regex here"
    regexList = [regex1, regex2, regex3]

    your_string = "your string here"
    found_regex_list = []  # collects every match from every pattern

    for x in regexList:
        # findall returns an empty list when nothing matches
        for y in re.findall(x, your_string):
            found_regex_list.append(y)
Harry Pye