from urllib.request import urlopen
import re

urlpath = urlopen("http://blablabla.com/file")
string = urlpath.read().decode('utf-8')

pattern = re.compile('*.docx"')
onlyfiles = pattern.findall(string)

print(onlyfiles)

Target output

['http://blablabla.com/file/1.docx','http://blablabla.com/file/2.docx']

But I got this

[]

I get this error message when trying this.

re.error: nothing to repeat at position 0
John Kugelman
Nurdin

1 Answer


The problem is the leading star in this line:

pattern = re.compile('*.docx"')

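This is not a Python bug. In a regular expression the star is a quantifier meaning "zero or more of the preceding item", not a shell-style wildcard. Here nothing precedes it, so re.compile raises "nothing to repeat at position 0".

See this related answer: regex error - nothing to repeat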
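To see the error in isolation, here is a minimal sketch that compiles the pattern from the question and then a variant with something in front of the star. The names message and fixed are only illustrative, and the dot-star variant is one possible stand-in for the shell-style wildcard, not necessarily the pattern you want in the end.

```python
import re

message = None
try:
    re.compile('*.docx"')  # the pattern from the question
except re.error as exc:
    # The star has no preceding item to repeat, so compilation fails.
    message = str(exc)
    print(message)

# With an item in front of the star (here "." for any character),
# the pattern compiles; the dot before docx is escaped to match literally.
fixed = re.compile(r'.*\.docx"')
print(fixed.search('href="1.docx"') is not None)
```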

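Try putting something in front of the star, such as \w (word characters) or an explicit a-z range:

# Raw strings keep the backslashes intact; \. matches a literal dot.
pattern = re.compile(r'\w*\.docx"')
# or
pattern = re.compile(r'[a-zA-Z0-9]*\.docx"')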
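Note that the patterns above match only the file name plus the closing quote, not the full URLs shown in the target output. The following is a rough sketch of one way to get there, assuming the page lists the files as relative href attributes. The sample_html string and base_url value are made up for illustration; in the real script the HTML would come from urlopen(...).read().decode('utf-8').

```python
import re
from urllib.parse import urljoin

# Hypothetical page content standing in for the downloaded HTML.
sample_html = (
    '<a href="1.docx">one</a> '
    '<a href="2.docx">two</a> '
    '<a href="notes.txt">notes</a>'
)
# Assumed base URL; the trailing slash matters for urljoin.
base_url = "http://blablabla.com/file/"

# Capture the href value only when it ends in .docx.
# The surrounding quotes stay outside the capture group.
pattern = re.compile(r'href="([^"]*\.docx)"')

# Resolve each captured name against the base URL.
onlyfiles = [urljoin(base_url, name) for name in pattern.findall(sample_html)]
print(onlyfiles)
# ['http://blablabla.com/file/1.docx', 'http://blablabla.com/file/2.docx']
```

For anything beyond a simple directory listing, an HTML parser is usually a more robust choice than a regular expression.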
V. Sambor