35

Is it possible to use a regex to match "February 2009", for example?

Jeremy

4 Answers

54

Along the lines of

\b(?:Jan(?:uary)?|Feb(?:ruary)?|...|Dec(?:ember)?) (?:19[7-9]\d|2\d{3})(?=\D|$)

Broken down, that's:

\b                  # a word boundary
(?:                 # non-capturing group
  Jan(?:uary)?      # Jan(uary)
  |Feb(?:ruary)?    # Feb(ruary)
  |...              # and so on
  |Dec(?:ember)?    # Dec(ember)
)                   # end group
                    # a space
(?:                 # non-capturing group
  19[7-9]\d|2\d{3}  # 1970-2999
)                   # end group
(?=\D|$)            # followed by: anything but a digit or the end of string
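
A minimal Python sketch of the commented pattern above, using re.VERBOSE so the comments can live inside the regex. The "..." is filled in here with the remaining English month names, which is an assumption on my part, and the helper name find_month_years is hypothetical. Note that in verbose mode a bare space is ignored, so the literal space is written as [ ].

```python
import re

# Assumption: the elided "..." is expanded to the remaining month names.
MONTH_YEAR = re.compile(
    r"""
    \b                      # a word boundary
    (?:                     # non-capturing group for the month
        Jan(?:uary)?
      | Feb(?:ruary)?
      | Mar(?:ch)?
      | Apr(?:il)?
      | May
      | June?
      | July?
      | Aug(?:ust)?
      | Sep(?:tember)?
      | Oct(?:ober)?
      | Nov(?:ember)?
      | Dec(?:ember)?
    )
    [ ]                     # a literal space (must be explicit in verbose mode)
    (?:19[7-9]\d|2\d{3})    # 1970-2999
    (?=\D|$)                # followed by a non-digit or the end of the string
    """,
    re.VERBOSE,
)


def find_month_years(text):
    """Hypothetical helper: return every 'Month YYYY' substring in text."""
    return MONTH_YEAR.findall(text)


print(find_month_years("Released in February 2009, patched Mar 2010."))
# → ['February 2009', 'Mar 2010']
```

Because the pattern has no capturing groups, findall returns the full matched text for each hit.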
Tomalak
30

I had to work on this to match a few fringe examples, but I ended up using

(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})

to capture dates in which the month is written out as a word.
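
For anyone wanting to try it, here is a small Python sketch that runs the pattern above against a few date shapes. The pattern is copied as-is and only split across lines; the helper name first_date and the sample strings are my own, chosen for illustration. It reads group(0) because the pattern contains capturing groups, so the full match is the safest thing to look at.

```python
import re

# The pattern from the answer, split across string literals for readability.
DATE = re.compile(
    r"(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?"
    r"|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?"
    r"|(Nov|Dec)(?:ember)?)\D?(\d{1,2}\D?)?\D?((19[7-9]\d|20\d{2})|\d{2})"
)


def first_date(text):
    """Hypothetical helper: return the first full match, or None."""
    match = DATE.search(text)
    return match.group(0) if match else None


for sample in ("12 February 2009", "Feb 12, 2009", "Jul-09"):
    print(repr(first_date(sample)))
```

Each of the three samples is matched in full, including the two-digit year in "Jul-09".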

Beerswiller
  • Just a minor thing: for the months, instead of (Nov|Dec) it should be (?:Nov|Dec). At least, I had to change that to get it to work with Python; otherwise it was returning an empty [''] match. – Walter R Aug 03 '17 at 19:04
  • You can add (?i)(regex_part_to_make_case_insensitive) or (?i)regex_part_to_make_case_insensitive(?-i) depending on the regex processor you are using. – Onyr Apr 22 '21 at 16:04
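
To illustrate the two comments above in Python: re.findall returns the contents of capturing groups instead of the whole match, which is one likely reason a stray (Nov|Dec) group gives surprising results. The two-month patterns below are deliberately simplified, hypothetical stand-ins for the full regex, and the sample text is made up.

```python
import re

# Simplified, hypothetical two-month patterns used only to show the effect
# of a capturing versus a non-capturing group on re.findall.
CAPTURING = r"\b(Nov|Dec)(?:ember)? \d{4}"
NON_CAPTURING = r"\b(?:Nov|Dec)(?:ember)? \d{4}"

text = "Due november 2017 or Dec 2018"

# With a capturing group, findall returns only what the group captured.
captured = re.findall(CAPTURING, text, flags=re.IGNORECASE)

# Without capturing groups, findall returns the full matched text.
full = re.findall(NON_CAPTURING, text, flags=re.IGNORECASE)

# An inline (?i) at the very start of the pattern acts like re.IGNORECASE.
inline = re.findall("(?i)" + NON_CAPTURING, text)

print(captured)  # → ['nov', 'Dec']
print(full)      # → ['november 2017', 'Dec 2018']
print(inline)    # → ['november 2017', 'Dec 2018']
```

As far as I know, Python's re module has no standalone (?-i) switch; the scoped form (?i:...) is the closest equivalent there.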
4

Modifying Beerswiller's answer to also match ordinal suffixes ("st", "nd", "rd", "th"):

(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?(\d{1,2}(st|nd|rd|th)?)?(([,.\-\/])\D?)?((19[7-9]\d|20\d{2})|\d{2})*
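
A quick Python sketch of how this behaves, with the pattern above copied unchanged and only split across lines. The helper name first_date and the sample strings are hypothetical. Since the year part ends in *, the year is optional here, so a bare "December 25th" also matches.

```python
import re

# The ordinal-aware pattern from this answer, split for readability.
ORDINAL_DATE = re.compile(
    r"(\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?"
    r"|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?"
    r"|(Nov|Dec)(?:ember)?)\D?(\d{1,2}(st|nd|rd|th)?)?(([,.\-\/])\D?)?"
    r"((19[7-9]\d|20\d{2})|\d{2})*"
)


def first_date(text):
    """Hypothetical helper: return the first full match, or None."""
    match = ORDINAL_DATE.search(text)
    return match.group(0) if match else None


for sample in ("March 3rd, 2015", "2nd March 2015", "December 25th"):
    print(repr(first_date(sample)))
```

All three samples come back as the complete input string.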

3

This regex also allows optional whitespace around the comma, since the comma isn't always placed right after the day.

((\b\d{1,2}\D{0,3})?\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)\D?)(\d{1,2}(st|nd|rd|th)?)?((\s*[,.\-\/]\s*)\D?)?\s*((19[0-9]\d|20\d{2})|\d{2})*
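
To see the difference the extra \s* makes, here is a small Python comparison between the previous answer's pattern and this one. Both are copied unchanged, with the shared month alternation factored into a MONTHS string purely to keep the lines short; the names STRICT, SPACED and full_match and the sample inputs are my own.

```python
import re

# Month alternation shared by both patterns (same text as in the answers).
MONTHS = (
    r"(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?"
    r"|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|(Nov|Dec)(?:ember)?)"
)

# Previous answer: a separator must directly follow the day.
STRICT = re.compile(
    r"(\b\d{1,2}\D{0,3})?\b" + MONTHS
    + r"\D?(\d{1,2}(st|nd|rd|th)?)?(([,.\-\/])\D?)?((19[7-9]\d|20\d{2})|\d{2})*"
)

# This answer: whitespace is tolerated around the separator and before the year.
SPACED = re.compile(
    r"((\b\d{1,2}\D{0,3})?\b" + MONTHS
    + r"\D?)(\d{1,2}(st|nd|rd|th)?)?((\s*[,.\-\/]\s*)\D?)?\s*"
    + r"((19[0-9]\d|20\d{2})|\d{2})*"
)


def full_match(pattern, text):
    """Hypothetical helper: return the first full match, or None."""
    match = pattern.search(text)
    return match.group(0) if match else None


sample = "March 3rd , 2015"
print(repr(full_match(STRICT, sample)))  # → 'March 3rd'
print(repr(full_match(SPACED, sample)))  # → 'March 3rd , 2015'
```

The same applies when there is no comma at all: on "Nov 22nd 1999" the spaced pattern picks up the year, while the stricter one stops after "Nov 22nd".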
Acapulco
Todd