I am trying to write a regular expression that matches capitalized proper nouns, such as "Oreo," "Snickers Bar," and "McFlurry".
import re
text = "George Washington, known as the \"Father of His Country,\" was an American soldier and statesman who served from 1789 to 1797 as the first President of the United States. He was commander-in-chief of the Continental Army during the American Revolutionary War and presided over the 1787 Constitutional Convention. As one of the leading Patriots, he was among the nation's Founding Fathers. Yankee Hotel Foxtrot Yankee Hotel Foxtrot."
reg = r"[A-Z]\w+(\s*[A-Z]\w+)*"
re.findall(reg, text)
This gives me the following output:
[' Washington', '', ' Country', '', '', ' States', '', ' Army', ' War', ' Convention', '', '', ' Fathers', ' Foxtrot']
These are clearly the kinds of matches I'm looking for, except that each one is missing its first word, and single-word matches come back as empty strings. Any idea why my regex seems to match the [A-Z]\w+ at the beginning but does not return it as part of the result?
edit: I should add that this expression works as intended on regex-testing sites like pythex.org, but behaves as shown above in my Google Colab notebook.
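A likely explanation, offered as a sketch rather than a definitive diagnosis: when a pattern contains exactly one capturing group, re.findall returns what that group captured instead of the whole match, and a repeated group only keeps its last repetition. That would account for the missing first word and the empty strings, and it would also be consistent with a testing site highlighting the full match. The shortened sample text and the variable names below are made up for illustration; the two usual workarounds are a non-capturing group (?:...) or re.finditer with group(0).

```python
import re

# Shortened sample text, chosen only for illustration.
text = "George Washington led the Continental Army. Yankee Hotel Foxtrot."

# Pattern from the question: the parentheses form a capturing group,
# so findall returns only the last thing that group captured per match.
capturing = r"[A-Z]\w+(\s*[A-Z]\w+)*"

# Same pattern with a non-capturing group, so findall returns whole matches.
non_capturing = r"[A-Z]\w+(?:\s*[A-Z]\w+)*"

group_only = re.findall(capturing, text)
whole = re.findall(non_capturing, text)

# Alternative: keep the original pattern and read the full match explicitly.
whole_via_finditer = [m.group(0) for m in re.finditer(capturing, text)]

print(group_only)          # [' Washington', ' Army', ' Foxtrot']
print(whole)               # ['George Washington', 'Continental Army', 'Yankee Hotel Foxtrot']
print(whole_via_finditer)  # same as the line above
```

Note that either workaround still matches any capitalized word, including sentence-initial words such as "He" or "As", so it does not by itself distinguish proper nouns.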