
I am trying to extract the German VAT number (Umsatzsteuer-Identifikationsnummer) from a text.

string = "I want to get this DE813992525 number."

I know that the correct regex for this problem is (?xi)^( (DE)?[0-9]{9}|)$. It works great according to my demo.

What I tried is:

import re
string = "I want to get this DE813992525 number."
match = re.compile(r'(?xi)^( (DE)?[0-9]{9}|)$')
print(match.findall(string))

>>>>>> []

What I would like to get is:

print(match.findall(string))
>>>>>  DE813992525
Wiktor Stribiżew
PParker
  • Why not just `^DE[0-9]{9}$`? https://regex101.com/r/FDuzNE/1 See https://ideone.com/nRaAXx – The fourth bird Sep 10 '20 at 11:44
  • No, it's [not the correct one](https://regex101.com/r/yMFxD7/1), e.g. the `$` anchor means end of string, and the VAT number in your test string is not at the end. – buran Sep 10 '20 at 11:44

1 Answer


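When searching within a string, don't use ^ and $. They anchor the match to the start and end of the string, so the pattern can only match when the string consists of the VAT number alone. Drop the anchors: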

import re
string = """I want to get this DE813992525 number.
I want to get this DE813992526 number.
"""
match = re.compile(r'DE[0-9]{9}')
print(match.findall(string))

Out:

['DE813992525', 'DE813992526']
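
As a possible refinement (not part of the original answer), the unanchored pattern can be wrapped in `\b` word boundaries so that it does not pick up a `DE` prefix followed by more than nine digits. This is a minimal sketch; the names `anchored` and `vat_pattern` and the second test string are made up for illustration, and it only checks the shape of the number, not whether it is a valid VAT ID:

```python
import re

text = "I want to get this DE813992525 number."

# Pattern from the question: ^ and $ force the match to span the whole
# string, so nothing is found inside a longer sentence.
anchored = re.compile(r'(?xi)^( (DE)?[0-9]{9}|)$')
print(anchored.findall(text))  # []

# Unanchored pattern with word boundaries (an assumption, see above):
# \b keeps the match from starting or ending in the middle of a longer
# run of letters or digits.
vat_pattern = re.compile(r'\bDE[0-9]{9}\b')
print(vat_pattern.findall(text))  # ['DE813992525']

# A longer digit run is rejected instead of being truncated to 9 digits.
print(vat_pattern.findall("ID DE8139925251234"))  # []
```

Note that the pattern has no capturing groups, so `findall` returns the full matches as plain strings rather than tuples of groups.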
Maurice Meyer