Python extract after and before certain conditions

Question

I'm looking to extract only the text that appears after '/g/' and before the first '+' or '?':

import re

urls = ["https://www.google.com/es/g/Dmitry+Kharchenko?searchterm=isometrico",
       "https://www.google.com/es/g/Irina+Strelnikova?searchterm=isom%C3%A9trico",
       "https://www.google.com/es/g/ParabolStudio?searchterm=auto"]

for i in urls:
    print(re.findall(r'g/(.*)[\+|\??]', i))


Current result:

['Dmitry+Kharchenko']
['Irina+Strelnikova']
['ParabolStudio']

Desired result:

'Dmitry'
'Irina'
'ParabolStudio'

Try `(?<=\/g\/)[^+?]+`, `(?<=\/g\/)` being a *positive lookbehind*. [Demo](https://regex101.com/r/R2peEP/1/) – Cary Swoveland, Mar 29 '20 at 01:28
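
The pattern from this comment can be tried in Python roughly as follows. This is a minimal sketch against the question's URLs; the names `pattern` and `names` are illustrative, and the slashes are left unescaped because Python's `re` does not require escaping them.

```python
import re

urls = ["https://www.google.com/es/g/Dmitry+Kharchenko?searchterm=isometrico",
        "https://www.google.com/es/g/Irina+Strelnikova?searchterm=isom%C3%A9trico",
        "https://www.google.com/es/g/ParabolStudio?searchterm=auto"]

# (?<=/g/) asserts that "/g/" comes right before the match without consuming it;
# [^+?]+ then takes every character up to (not including) the first '+' or '?'.
pattern = re.compile(r'(?<=/g/)[^+?]+')

names = [pattern.search(url).group() for url in urls]
print(names)  # ['Dmitry', 'Irina', 'ParabolStudio']
```

Because the lookbehind does not consume anything, the whole match is the name itself and no capture group is needed.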

score 0 · Accepted Answer · answered Mar 29 '20 at 01:28

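You need the non-greedy pattern `.*?`, which matches up to the first `+` or `?` it encounters, whereas the greedy `.*` matches up to the last one. To match `+` or `?` with a character class, `[+?]` is enough: neither character needs escaping inside the brackets, and a `|` there would be a literal pipe rather than alternation.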

# Lazy .*? stops at the first '+' or '?' that follows 'g/'
for i in urls:
    print(re.findall(r'g/(.*?)[+?]', i))

# ['Dmitry']
# ['Irina']
# ['ParabolStudio']
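
Note that this pattern requires a `+` or `?` after the name, so a URL that ends right after the name gives an empty list. A sketch of a variant that tolerates this is below; the helper `extract_name` and the second and third test URLs are made up for illustration and are not part of the question.

```python
import re

def extract_name(url):
    """Return the text after '/g/' up to the first '+' or '?', or None."""
    # [^+?]* cannot run past a '+' or '?', so no lazy quantifier is needed,
    # and the match still succeeds when the URL simply ends after the name.
    match = re.search(r'/g/([^+?]*)', url)
    return match.group(1) if match else None

print(extract_name("https://www.google.com/es/g/Dmitry+Kharchenko?searchterm=isometrico"))  # Dmitry
print(extract_name("https://www.google.com/es/g/ParabolStudio"))  # ParabolStudio
print(extract_name("https://www.google.com/es/"))  # None
```

Using `re.search` with a `None` check also avoids indexing into an empty `findall` result when a URL has no `/g/` segment at all.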
