This is my input string:

str = '32 -3.723 +98.6 .357 0.86'

And this is my regex:

import re
print(re.findall(r'[+-]?\d*\.?\d*', str))

It returns :

['32', '', '-3.723', '', '+98.6', '', '.357', '', '0.86', '']

What I cannot understand is why all these empty strings appear in between the numbers.

MrGeek
Rakesh kumar

1 Answer

What I could not understand is why all these empty strings come in between

All of the elements of your regex are optional, which means the regex can (and does) match the empty string.

[+-]? - ZERO or one matches
\d*   - ZERO or more matches
\.?   - ZERO or one matches
\d*   - ZERO or more matches

re.findall scans the input from left to right, and at each position where it resumes, the greedy quantifiers consume as much as they can. For example, here

'32 -3.723 +98.6 .357 0.86'
   ^

the next character is a space, which none of the elements can consume, so the longest match is the empty string.
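To see where each match falls, here is a small sketch using re.finditer, which reports the start and end offsets of every match. The names text, pattern and spans are illustrative and not part of the original question; the code is written for Python 3.

```python
import re

text = '32 -3.723 +98.6 .357 0.86'
pattern = r'[+-]?\d*\.?\d*'

# Collect (start, end, matched text) for every match the scanner reports.
spans = [(m.start(), m.end(), m.group()) for m in re.finditer(pattern, text)]

for start, end, matched in spans:
    # A zero-width match has start == end and an empty matched string.
    print(start, end, repr(matched))
```

The zero-width matches should show up at the offset of each space and once more at the very end of the string, which is where the pattern has nothing it can consume.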

There are several ways to work around this. Rather than trying to shoehorn the regex into not matching empty strings, I personally would filter them out post-matching.
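A minimal sketch of the post-matching filter, together with one possible stricter pattern for comparison. The names text, matches, numbers and strict are illustrative, and the stricter pattern is just one way to require at least one digit, not something taken from the question; the code is written for Python 3.

```python
import re

text = '32 -3.723 +98.6 .357 0.86'

# Option 1: keep the original pattern and drop the empty matches afterwards.
matches = re.findall(r'[+-]?\d*\.?\d*', text)
numbers = [m for m in matches if m]  # empty strings are falsy
print(numbers)

# Option 2 (an alternative, shown only for comparison): require at least
# one digit, either before an optional decimal point or after a mandatory one.
strict = re.findall(r'[+-]?(?:\d+\.?\d*|\.\d+)', text)
print(strict)
```

Both options give the same five numbers for this input. Note that filtering only removes empty matches: on other inputs the original pattern can still match a lone sign or a lone dot, which the stricter pattern would reject.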

NPE
  • Thank you dear NPE for such a great insight. My mind boggled over this while searching for reasons, but failed. BTW, I am also going to filter them out post-matching. – Rakesh kumar Sep 05 '17 at 13:08