I want to match the digits between "000" and "000", between \b and "000", or between "000" and \b, in a string like this:

11101110001011101000000011101010111

I have tried with expressions like this:

(?<=000)\d+(?=000)

but I only get the single longest match.

I expect to get:

1110111
1011101
0
11101010111
Vikash Chauhan
  • Possible duplicate of [The 'g' flag in regular expressions](http://stackoverflow.com/questions/12993629/the-g-flag-in-regular-expressions) – dorukayhan Oct 07 '16 at 01:43

2 Answers

You can use the regex package and the .findall() method:

In [1]: s = "11101110001011101000000011101010111"

In [2]: import regex

In [3]: regex.findall(r"(?<=000|^)\d+?(?=000|$)", s)
Out[3]: ['1110111', '1011101', '0', '00011101010111']

The alternations 000|^ and 000|$ let the lookbehind match either a preceding 000 or the start of the string, and the lookahead match either a following 000 or the end of the string. Also note the ? after \d+, which makes the quantifier non-greedy.

Note that the regular re.findall() would fail with the following error in this case:

error: look-behind requires fixed-width pattern

This is because re requires a lookbehind pattern to have a fixed width, whereas regex supports variable-width lookbehinds.
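
If installing the third-party regex package is not an option, the same pattern can probably be adapted for the standard re module by moving the ^ alternative out of the lookbehind, so that the lookbehind itself has a fixed width. The following is a minimal sketch under that assumption; the helper name find_chunks is hypothetical and not part of either library. It should reproduce the output shown above, including the leading 000 in the last item: lookarounds do not consume the delimiter, so a new match can start inside the run of seven zeros.

```python
import re

s = "11101110001011101000000011101010111"

# Original pattern: the lookbehind mixes a 3-character alternative (000)
# with a zero-width one (^), which the stdlib re module rejects.
try:
    re.compile(r"(?<=000|^)\d+?(?=000|$)")
    variable_width_ok = True
except re.error:
    variable_width_ok = False

# Workaround: keep only the fixed-width part inside the lookbehind and
# put the ^ anchor next to it in a non-capturing group.
FIXED_WIDTH = re.compile(r"(?:(?<=000)|^)\d+?(?=000|$)")

def find_chunks(text):
    """Hypothetical helper: return all matches of FIXED_WIDTH in text."""
    return FIXED_WIDTH.findall(text)

print(find_chunks(s))
```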

alecxe

You can do it with the re module like this:

re.findall(r'(?:\b|(?<=000))(\d+?)(?:000|\b)', s)
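
To check this against the sample string from the question, the call can be run as a small self-contained script. This is only a sketch, and the variable names are illustrative. Here the trailing 000 is consumed by the match rather than merely looked ahead at, so the next match cannot start inside the same delimiter.

```python
import re

s = "11101110001011101000000011101010111"

# (?:\b|(?<=000))  start at a word boundary or right after a "000"
# (\d+?)           capture as few digits as possible
# (?:000|\b)       then consume a "000" delimiter or stop at a word boundary
pattern = re.compile(r"(?:\b|(?<=000))(\d+?)(?:000|\b)")

# With a single capturing group, findall returns the captured digits only.
chunks = pattern.findall(s)
print(chunks)  # ['1110111', '1011101', '0', '11101010111']
```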
Casimir et Hippolyte