How do I catch a specific HTTP error in Python?

Question

I have

import urllib2
try:
   urllib2.urlopen("some url")
except urllib2.HTTPError:
   pass  # <whatever>

but what I end up doing is catching every kind of HTTP error. I want to catch the error only if the specified webpage doesn't exist (404?).

Have you tried the recipe in this post? http://stackoverflow.com/questions/1308542/how-to-catch-404-error-in-urllib-urlretrieve – John P, Jul 07 '10 at 08:29

score 145 · Accepted Answer · edited Nov 10 '21 at 06:04

Python 3

from urllib.error import HTTPError

Python 2

from urllib2 import HTTPError

Just catch HTTPError, handle it, and if it's not Error 404, simply use raise to re-raise the exception.

See the Python tutorial.

Here is a complete example for Python 2:

import urllib2
from urllib2 import HTTPError
try:
    urllib2.urlopen("some url")
except HTTPError as err:
    if err.code == 404:
        pass  # <whatever>: handle the missing page here
    else:
        raise  # re-raise every other HTTP error unchanged
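
For Python 3, the same pattern might look like the sketch below. The names fetch_or_none, opener, and simulated_error are hypothetical and introduced only for illustration; none of them is part of urllib. The opener parameter exists so the example can be exercised without network access.

```python
import io
import urllib.request
from email.message import Message
from urllib.error import HTTPError


def fetch_or_none(url, opener=urllib.request.urlopen):
    """Return the response body, or None if the server answers 404."""
    try:
        with opener(url) as response:
            return response.read()
    except HTTPError as err:
        if err.code == 404:
            return None  # the page does not exist
        raise  # any other HTTP error propagates unchanged


def simulated_error(status):
    """Hypothetical stand-in for urlopen that always fails with `status`."""
    def opener(url):
        raise HTTPError(url, status, "simulated", Message(), io.BytesIO(b""))
    return opener
```

With the stand-in, fetch_or_none("http://example.invalid/x", opener=simulated_error(404)) returns None, while a simulated 500 still propagates as an HTTPError.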

edited Nov 10 '21 at 06:04 by Wouter

answered Jul 07 '10 at 09:14 by Tim Pietzcker

Can I do urllib2.urlopen("*") to handle any 404 errors and route them to my 404.html page? – Oct 01 '15 at 15:36
@TobiasKolb: Since the question is tagged `urllib2` (after all, it's over 9 years old) and `urllib3` is not part of the standard library, I think that wouldn't fit here. If there isn't a duplicate already, maybe open a new question? Or use `urllib` as outlined in Lazik's answer below. – Tim Pietzcker Oct 28 '19 at 21:27
I'm writing regression tests, so I want access to the urlopen response even if it was a 404. Even if I assign the value from `urllib2.urlopen("some url")`, I can't use that value inside the exception handler -- it will cause another exception. So, how do I get the response text of the 404 page that was returned? – TaiwanGrapefruitTea Jul 10 '21 at 09:13
I found the answer: You can use the HTTPError instance as a response. https://docs.python.org/3/howto/urllib2.html#httperror – TaiwanGrapefruitTea Jul 10 '21 at 09:42
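
A minimal Python 3 sketch of what the last comment describes: the HTTPError instance doubles as a response object, so the body of the error page can be read from it. The names status_and_body, opener, and simulated_404 are hypothetical and used only for illustration; the stand-in opener lets the example run without a network connection.

```python
import io
import urllib.request
from email.message import Message
from urllib.error import HTTPError


def status_and_body(url, opener=urllib.request.urlopen):
    """Return (status code, body bytes) for successful and error responses."""
    try:
        with opener(url) as response:
            return response.status, response.read()
    except HTTPError as err:
        # HTTPError is also a file-like response: it carries the status
        # code, the headers, and the body the server sent with the error.
        return err.code, err.read()


def simulated_404(url):
    """Hypothetical stand-in for urlopen that answers 404 with a small body."""
    body = io.BytesIO(b"<h1>Not Found</h1>")
    raise HTTPError(url, 404, "Not Found", Message(), body)


code, body = status_and_body("http://example.invalid/missing", opener=simulated_404)
```

Here code holds the status and body holds the bytes of the simulated error page, which is the piece a regression test would typically assert on.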

score 45 · Answer 2 · edited May 27 '22 at 16:41

For Python 3.x

import urllib.request
import urllib.error
try:
    urllib.request.urlretrieve(url, fullpath)
except urllib.error.HTTPError as err:
    print(err.code)

edited May 27 '22 at 16:41

answered Oct 04 '13 at 02:27 by Lazik

urllib.request.urlretrieve() is a legacy interface ported from Python 2: https://docs.python.org/3/library/urllib.request.html#legacy-interface – TaiwanGrapefruitTea Jul 10 '21 at 08:47
Shouldn't the except line be "except HTTPError as err:" since you imported a specific thing? Or am I just such a beginner that I'm even wrong on that? – JeopardyTempest May 26 '22 at 06:53
@JeopardyTempest I have fixed the import – Lazik May 27 '22 at 16:42
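
Since urlretrieve is flagged above as a legacy interface, one possible alternative is to combine urlopen with shutil.copyfileobj. This is only a sketch: download, opener, serve_payload, and serve_404 are hypothetical names, and the stand-in openers replace real network access so the example can run anywhere.

```python
import io
import shutil
import tempfile
import urllib.request
from email.message import Message
from pathlib import Path
from urllib.error import HTTPError


def download(url, fullpath, opener=urllib.request.urlopen):
    """Save `url` to `fullpath`. Return False, without raising, on a 404."""
    try:
        # opener() runs first, so no file is created if the request fails.
        with opener(url) as response, open(fullpath, "wb") as out:
            shutil.copyfileobj(response, out)
    except HTTPError as err:
        if err.code != 404:
            raise  # other HTTP errors still propagate
        return False
    return True


# Offline demonstration with stand-in openers (hypothetical, no network).
def serve_payload(url):
    return io.BytesIO(b"payload")


def serve_404(url):
    raise HTTPError(url, 404, "Not Found", Message(), io.BytesIO(b""))


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "page.bin"
    saved_ok = download("http://example.invalid/file", target, opener=serve_payload)
    saved_bytes = target.read_bytes()
    missing = Path(tmp) / "missing.bin"
    saved_missing = download("http://example.invalid/gone", missing, opener=serve_404)
    missing_exists = missing.exists()
```

Opening the URL before the output file means a 404 leaves no empty file behind, which is why the two context managers are listed in that order.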

score 5 · Answer 3 · edited Jun 04 '21 at 02:09

Tim's answer seems misleading to me, especially when urllib2 does not return the expected code. For example, this error will be fatal (believe it or not, it is not an uncommon one when downloading URLs):

AttributeError: 'URLError' object has no attribute 'code'

A quick, though perhaps not the best, solution is to use a nested try/except block:

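import urllib2
try:
    urllib2.urlopen("some url")
except urllib2.HTTPError as err:
    try:
        if err.code == 404:
            pass  # Handle the error
        else:
            raise
    except Exception:
        # Fallback handling. Note that this also catches the re-raise above,
        # so non-404 errors end up here instead of propagating.
        pass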

For more on nested try/except blocks, see: Are nested try/except blocks in Python a good programming practice?
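
As an alternative to nesting, here is a Python 3 sketch that uses two except clauses. HTTPError is a subclass of URLError, and a plain URLError has a reason attribute but no code, so the AttributeError quoted above would come from a handler that catches URLError rather than HTTPError. Listing HTTPError first keeps the two cases apart. The names classify, opener, and failing_with are hypothetical and exist only for this illustration.

```python
import io
import urllib.request
from email.message import Message
from urllib.error import HTTPError, URLError


def classify(url, opener=urllib.request.urlopen):
    """Return a short label describing how the request for `url` ended."""
    try:
        with opener(url):
            return "ok"
    except HTTPError as err:
        # The server replied, so a status code is available.
        if err.code == 404:
            return "not found"
        return "http error %d" % err.code
    except URLError as err:
        # No HTTP response at all (DNS failure, refused connection, ...):
        # only `reason` is available here, not `code`.
        return "unreachable: %s" % err.reason


def failing_with(exc):
    """Hypothetical stand-in for urlopen that always raises `exc`."""
    def opener(url):
        raise exc
    return opener


def http_error(status):
    """Build an HTTPError with an empty body for the demonstration."""
    return HTTPError("http://example.invalid/", status, "simulated",
                     Message(), io.BytesIO(b""))
```

The order of the clauses matters: if URLError came first, it would also swallow every HTTPError, and the code attribute could no longer be relied on inside that handler.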
