I have the following text:

txt = 'Lithium 0.25 (7/11/77).  LFTS wnl.  Urine tox neg.  Serum tox + fluoxetine 500; otherwise neg.  TSH 3.28.  BUN/Cr: 16/0.83.  Lipids unremarkable.  B12 363, Folate >20.  CBC: 4.9/36/308 Pertinent Medical Review of Systems Constitutional:'

I want to extract the date from the text above, and I have written the following expression:

re.findall(r'(?:[\d{1,2}]+)(?:[/-]\d{0,}[/-]\d{2,4})', txt)

When I execute the expression above, I get the following output:

['7/11/77', '9/36/308']

I don't want the match taken from "4.9/36/308" to be included. How should I change the regular expression to exclude it?

Kindly help.

venkysmarty

1 Answer

You may fix the current regex as follows:

\b(?<!\.)\d{1,2}[/-]\d+[/-]\d{2,4}\b

See the regex demo

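The \b matches a word boundary, and the (?<!\.) negative lookbehind fails the match if there is a . immediately before the first digit matched, which is what excludes the 9/36/308 part of 4.9/36/308. Note also that [\d{1,2}]+ in the original pattern is a character class (it matches one or more digits, braces, or commas), not "one or two digits"; \d{1,2} without the brackets is the intended quantified form.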

See the Python demo.
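As a quick check, here is a minimal sketch that applies the pattern to the text from the question. The variable names `pattern` and `dates` are only illustrative.

```python
import re

txt = ('Lithium 0.25 (7/11/77).  LFTS wnl.  Urine tox neg.  '
       'Serum tox + fluoxetine 500; otherwise neg.  TSH 3.28.  '
       'BUN/Cr: 16/0.83.  Lipids unremarkable.  B12 363, Folate >20.  '
       'CBC: 4.9/36/308 Pertinent Medical Review of Systems Constitutional:')

# \b            word boundary before the first digit
# (?<!\.)       the first digit must not be preceded by a dot
# \d{1,2}       one or two digits
# [/-]\d+       separator and one or more digits
# [/-]\d{2,4}   separator and two to four digits
# \b            word boundary after the last digit
pattern = re.compile(r'\b(?<!\.)\d{1,2}[/-]\d+[/-]\d{2,4}\b')

dates = pattern.findall(txt)
print(dates)  # ['7/11/77']
```

Since the pattern has no capturing groups, `findall` returns the full matches as plain strings.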

Note that the regex only checks the shape of the string (for example, it would also accept 13/45/99), so you will have to use a non-regex method afterwards if you need a list of only valid dates.
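One possible way to do that filtering step is to pass each candidate to `datetime.strptime` and keep only the ones that parse. This is a rough sketch, not the only approach: the helper name `valid_dates` is hypothetical, and it assumes the dates are written month/day/year with a two- or four-digit year, which the question does not state.

```python
import re
from datetime import datetime

DATE_RE = re.compile(r'\b(?<!\.)\d{1,2}[/-]\d+[/-]\d{2,4}\b')

def valid_dates(text, formats=('%m/%d/%y', '%m/%d/%Y')):
    """Return the regex matches in text that parse as real calendar dates.

    Assumption: month/day/year order; adjust `formats` if your data differs.
    """
    found = []
    for candidate in DATE_RE.findall(text):
        # Normalize '-' to '/' so one set of formats covers both separators
        # (this also accepts mixed separators such as 7/11-77).
        normalized = candidate.replace('-', '/')
        for fmt in formats:
            try:
                datetime.strptime(normalized, fmt)
            except ValueError:
                continue  # wrong format or impossible date, try the next one
            found.append(candidate)
            break
    return found

print(valid_dates('Seen on 7/11/77, not on 13/45/99.'))  # ['7/11/77']
```

Here 13/45/99 matches the regex but is dropped because there is no month 13.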

Wiktor Stribiżew