
I have

import urllib2
try:
   urllib2.urlopen("some url")
except urllib2.HTTPError:
   <whatever>

but what I end up doing is catching every kind of HTTP error. I want to catch the error only if the specified webpage doesn't exist (404?).

Piper
Arnab Sen Gupta
  • Have you tried the recipe in this post? http://stackoverflow.com/questions/1308542/how-to-catch-404-error-in-urllib-urlretrieve – John P Jul 07 '10 at 08:29

3 Answers


Python 3

from urllib.error import HTTPError

Python 2

from urllib2 import HTTPError

Just catch HTTPError, handle it, and if it's not Error 404, simply use raise to re-raise the exception.

See the Python tutorial.

Here is a complete example for Python 2:

import urllib2
from urllib2 import HTTPError
try:
   urllib2.urlopen("some url")
except HTTPError as err:
   if err.code == 404:
       pass  # <whatever>: handle the missing page here
   else:
       raise  # re-raise every other HTTP error unchanged
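
For Python 3 the same pattern might look like the sketch below. The function name `open_unless_missing`, the `opener` parameter, and the `fake_opener` helper are not from the answer; they are assumptions added here so the behaviour can be tried without touching the network.

```python
import io
from email.message import Message
from urllib.error import HTTPError
from urllib.request import urlopen


def open_unless_missing(url, opener=urlopen):
    """Return the response for url, or None if the server answers 404.

    `opener` is a hypothetical hook (defaults to urlopen) used for testing.
    """
    try:
        return opener(url)
    except HTTPError as err:
        if err.code == 404:
            return None  # the page does not exist
        raise  # any other HTTP error propagates unchanged


def fake_opener(code):
    """Hypothetical stand-in for urlopen that always fails with `code`."""
    def opener(url):
        raise HTTPError(url, code, "simulated", Message(), io.BytesIO(b""))
    return opener
```

With `fake_opener(404)` the function returns `None`; with `fake_opener(500)` the `HTTPError` still reaches the caller, which is the point of the bare `raise`.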
Wouter
Tim Pietzcker
  • Can I do urllib2.urlopen("*") to handle any 404 errors and route them to my 404.html page? –  Oct 01 '15 at 15:36
  • @TobiasKolb: Since the question is tagged `urllib2` (after all, it's over 9 years old) and `urllib3` is not part of the standard library, I think that wouldn't fit here. If there isn't a duplicate already, maybe open a new question? Or use `urllib` as outlined in Lazik's answer below. – Tim Pietzcker Oct 28 '19 at 21:27
  • I'm writing regression tests, so I want access to the urlopen response even if it was a 404. Even if I assign the value from the `urllib2.urlopen("some url")` , I can't use that value inside the exception -- it will cause another exception. So, how do I get the response text of the 404 page that was returned? – TaiwanGrapefruitTea Jul 10 '21 at 09:13
  • I found the answer: You can use the HTTPError instance as a response. https://docs.python.org/3/howto/urllib2.html#httperror – TaiwanGrapefruitTea Jul 10 '21 at 09:42
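
A minimal Python 3 sketch of what the last two comments describe: the `HTTPError` instance can be read like a response, so the body of the 404 page is still available. The name `read_even_on_error`, the `opener` parameter, and the simulated 404 are assumptions made for illustration, not something from the thread.

```python
import io
from email.message import Message
from urllib.error import HTTPError
from urllib.request import urlopen


def read_even_on_error(url, opener=urlopen):
    """Return (status, body) for url, including for HTTP error responses.

    `opener` is a hypothetical hook (defaults to urlopen) used for testing.
    """
    try:
        resp = opener(url)
        return resp.status, resp.read()
    except HTTPError as err:
        # HTTPError is file-like: read() yields the body of the error page.
        return err.code, err.read()


def simulated_404(url):
    """Hypothetical opener that fails the way a server 404 would."""
    body = io.BytesIO(b"<h1>Not Found</h1>")
    raise HTTPError(url, 404, "Not Found", Message(), body)
```

In a regression test, `read_even_on_error(url, opener=simulated_404)` hands back both the status code and the error page bytes, so assertions can be made on either.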

For Python 3.x

import urllib.request
import urllib.error
try:
    urllib.request.urlretrieve(url, fullpath)
except urllib.error.HTTPError as err:
    print(err.code)
Lazik

Tim's answer seems misleading to me, especially when urllib2 does not return the expected code. For example, this error will be fatal (believe it or not, it is not an uncommon one when downloading URLs):

AttributeError: 'URLError' object has no attribute 'code'

A quick, though maybe not the best, solution is to use a nested try/except block:

import urllib2
try:
    urllib2.urlopen("some url")
except urllib2.HTTPError as err:
    try:
        if err.code == 404:
            pass  # Handle the error
        else:
            raise
    except:
        pass  # ...

For more information on nested try/except blocks, see "Are nested try/except blocks in Python a good programming practice?"
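
As an alternative to nesting, here is a hedged Python 3 sketch that relies on `HTTPError` being a subclass of `URLError`: catching `HTTPError` first gives access to `code`, and a second clause handles a plain `URLError`, which only carries `reason`. The function name `describe_failure`, the `opener` parameter, the returned strings, and `failing_with` are all assumptions chosen for illustration.

```python
import io
from email.message import Message
from urllib.error import HTTPError, URLError
from urllib.request import urlopen


def describe_failure(url, opener=urlopen):
    """Classify the outcome of opening url as a short string.

    `opener` is a hypothetical hook (defaults to urlopen) used for testing.
    """
    try:
        opener(url)
    except HTTPError as err:
        # Must come before URLError, because HTTPError subclasses it.
        if err.code == 404:
            return "missing"
        return "http error %d" % err.code
    except URLError as err:
        # No HTTP response at all (DNS failure, refused connection, ...),
        # so there is no `code` attribute here, only `reason`.
        return "unreachable: %s" % err.reason
    return "ok"


def failing_with(exc_factory):
    """Hypothetical opener that raises whatever exc_factory(url) builds."""
    def opener(url):
        raise exc_factory(url)
    return opener
```

Ordering the clauses this way avoids the `AttributeError` quoted above without a bare `except`, since `code` is only read on the exception type that defines it.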

NelsonGon
sonavolob