
I'm trying to download an MP3 file, via its URL, using Python's urllib2.

import urllib2

mp3file = urllib2.urlopen(url)
# The with-block closes the output file even if the write fails.
with open(dst, 'wb') as output:
    output.write(mp3file.read())

This raises urllib2.HTTPError: HTTP Error 403: Forbidden. Using urllib instead also fails, but silently:

urllib.urlretrieve(url, dst)

However, if I use wget, I can download the file successfully.

I've noted the general differences between the two methods mentioned in "Difference between Python urllib.urlretrieve() and wget", but they don't seem to apply here.

Is wget doing something to negotiate permissions that urllib2 doesn't do? If so, what, and how do I replicate this in urllib2?

Richard Horrocks
  • That is completely dependent on the server. Have you tried `wget --verbose` to see what's happening? – Jasper Apr 16 '14 at 12:14
  • Have you tried adding headers? See http://stackoverflow.com/questions/13303449/urllib2-httperror-http-error-403-forbidden – etna Apr 16 '14 at 12:17
  • It appears `wget`'s default output level is verbose, so it's not giving me anything extra when the flag is given explicitly. I'll try playing around with the headers... – Richard Horrocks Apr 16 '14 at 19:34

1 Answer


This could be something on the server side, for example the server blocking Python's default user agent. Try sending wget's user agent instead: Wget/1.13.4 (linux-gnu).

In Python 2:

import urllib

# Override the User-Agent header that urllib sends.
class AppURLopener(urllib.FancyURLopener):
    version = "Wget/1.13.4 (linux-gnu)"

url = "http://www.example.com/test_file"
fname = "test_file"

# Install the custom opener so urlretrieve() uses it.
urllib._urlopener = AppURLopener()
urllib.urlretrieve(url, fname)
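
Since the question asks about urllib2 specifically, the same idea can be sketched there by passing a headers dict to Request. This is only a minimal sketch, assuming the 403 really is caused by User-Agent filtering (it may just as well come from a missing Referer, cookies, or something else on the server). The helper names build_request and download are made up for illustration, and the import fallback is there so the snippet also runs on Python 3, where Request and urlopen live in urllib.request.

```python
import shutil

try:
    import urllib2  # Python 2
except ImportError:
    import urllib.request as urllib2  # Python 3 exposes the same Request/urlopen names

# User agent string taken from the answer above; any wget version string could be tried.
WGET_USER_AGENT = "Wget/1.13.4 (linux-gnu)"


def build_request(url, user_agent=WGET_USER_AGENT):
    """Return a Request that sends the given User-Agent (hypothetical helper)."""
    return urllib2.Request(url, headers={"User-Agent": user_agent})


def download(url, dst, user_agent=WGET_USER_AGENT):
    """Fetch url with a custom User-Agent and stream the body to dst (hypothetical helper)."""
    response = urllib2.urlopen(build_request(url, user_agent))
    try:
        with open(dst, "wb") as output:
            # Copy in chunks instead of reading the whole file into memory.
            shutil.copyfileobj(response, output)
    finally:
        response.close()
```

Calling download("http://www.example.com/test_file", "test_file") would then mirror the urlretrieve call above. If the server still answers 403, urlopen raises HTTPError as before, which suggests the block is based on something other than the user agent.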
VirtualScooter
WeaselFox