I have a Regex which extracts German mobile phone numbers from a website:

[^\d]((\+49|0049|0)1[567]\d{1,2}([ \-/]*\d){7})(?!\d)

As you can see in the demo it works quite well. The only pattern which doesn't match yet is:

+49 915175461907

Please see more examples in the linked demo. The problem is the whitespace after +49.

How do I need to change the current regex pattern so that it also matches this kind of pattern?

martineau
PParker

3 Answers

A better regex would be:

(?<!\d)(?:\+49|0049|0) *[19][1567]\d{1,2}(?:[ /-]*\d){7,8}(?!\d)

Updated RegEx Demo

Changes:

  • (?<!\d): Make sure the previous character is not a digit
  • " *" after the prefix: Allow optional spaces after +49, 0049 or 0
  • [19][1567]: Match 1 or 9 followed by one of the digits 1, 5, 6 or 7
  • {7,8}: Match 7 or 8 repetitions of the given construct
  • It is better to keep an unescaped hyphen at the first or last position in a character class
  • Avoid capturing text that you don't need by using non-capturing groups
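
As a quick check of the points above, here is a minimal Python sketch that applies this pattern to a few strings. Python and the sample strings other than the one from the question are assumptions made for illustration; the pattern itself is copied unchanged from this answer.

```python
import re

# Pattern copied from the answer; only the sample inputs are invented.
PATTERN = re.compile(
    r"(?<!\d)(?:\+49|0049|0) *[19][1567]\d{1,2}(?:[ /-]*\d){7,8}(?!\d)"
)

samples = [
    "+49 915175461907",  # the case from the question (space after +49)
    "+4915175461907",    # no space after the prefix
    "0151 7546 1907",    # national format with separators
    "12345",             # not a phone number
]

for text in samples:
    match = PATTERN.search(text)
    # Print the full match, or None when the pattern does not apply.
    print(text, "->", match.group(0) if match else None)
```

Because the prefix group is non-capturing, `match.group(0)` is the whole number and there are no extra groups to skip over.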
anubhava

The no-brainer approach: remove the space from the text before applying the regex.

Otherwise, whitespace is matched in a regex with \s, so (possibly with more parentheses than needed):

[^\d](((\+49|0049|0)([\s]{0,1})1)[567]\d{1,2}([ \-/]*\d){7})(?!\d)
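
The first suggestion (removing the space before matching) might look roughly like this in Python. This is only a sketch: the helper name `strip_prefix_space` and the sample sentence are hypothetical, and the pattern is the original one from the question, so numbers starting with 9 are still not covered.

```python
import re

# Original pattern from the question; it expects no space after the prefix.
ORIGINAL = re.compile(r"[^\d]((\+49|0049|0)1[567]\d{1,2}([ \-/]*\d){7})(?!\d)")


def strip_prefix_space(text: str) -> str:
    """Hypothetical helper: drop whitespace directly after +49 or 0049."""
    return re.sub(r"(\+49|0049)\s+", r"\1", text)


text = "Call +49 1517 5461907 today"  # invented sample input

# Without preprocessing, the space after +49 prevents a match.
print(ORIGINAL.search(text))

# After preprocessing, the unchanged pattern finds the number in group 1.
cleaned = strip_prefix_space(text)
match = ORIGINAL.search(cleaned)
print(match.group(1) if match else None)
```

The trade-off is that the extracted number no longer has exactly the same spacing as in the source text.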
Mayot
  • Thanks a lot. Works great except for numbers that start with a ``9``, like this one: ``+49 915175461907`` – PParker Oct 18 '21 at 14:04

Add an optional whitespace character:

[^\d]((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)

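Update: matching at the beginning of a line

The leading [^\d] requires some non-digit character before the number, so a number at the very start of the text is not matched. If you also want to match such numbers, you can use this. It accepts either the start of the input or any non-digit character before the phone number: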

(^|[^\d])((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)
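
To illustrate the difference between the two patterns in this answer, here is a small Python sketch. The variable names are made up for the example, and the input is the number from the question placed at the very start of the string.

```python
import re

# First pattern of this answer: needs a non-digit character before the number.
WITHOUT_ANCHOR = re.compile(
    r"[^\d]((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"
)
# Updated pattern: also accepts the start of the input.
WITH_ANCHOR = re.compile(
    r"(^|[^\d])((\+49|0049|0)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"
)

text = "+49 915175461907"  # nothing precedes the number

# No character exists before "+", so the first pattern finds nothing.
print(WITHOUT_ANCHOR.search(text))

# With (^|[^\d]) as group 1, the phone number itself moves to group 2.
match = WITH_ANCHOR.search(text)
print(match.group(2) if match else None)
```

In Python, `^` matches only at the start of the whole string unless `re.MULTILINE` is passed. A number at the start of a later line is still found without that flag, because the preceding newline is itself a non-digit character.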

test it here

KZiovas