
Hi Stackoverflow community,

I'm trying to get familiar with the urllib.request standard library module and use it in my scripts at work instead of wget. However, I'm unable to get the detailed HTTP messages displayed, whether in IDLE, from a script file, or by typing the commands manually into cmd (py).

I'm using Python on Windows 7 x64, and tried 3.5 and 3.6 including 3.6.1rc1 without success.

The messages are supposedly turned on using this command:

http.client.HTTPConnection.debuglevel = 1

Here is my sample code. It runs, but no details are displayed:

import http.client
import urllib.request
http.client.HTTPConnection.debuglevel = 1
response = urllib.request.urlopen('http://stackoverflow.com')
content = response.read()
with open("stack.html", "wb") as file:
    file.write(content)

I have also tried .set_debuglevel(1) without success. There is a years-old question here, "Turning on debug output for python 3 urllib", but it uses the same approach I have, and it's not working. In a comment on that question, user Yen Chi Hsuan says it's a bug and reported it at https://bugs.python.org/issue26892

The bug was closed in June 2016 so I would expect this is corrected in recent Python versions.

Maybe I'm missing something (e.g. something else needs to be enabled / installed etc..) but I spent some time on this and reached a dead end.

Is there a working way to have the detailed HTTP messages displayed with urllib on Python 3 on Windows?

Thank you

EDIT: The approach suggested by pvg works on the simple example, but I cannot make it work when a login is needed. HTTPBasicAuthHandler has no debuglevel attribute, and combining multiple handlers into the opener does not work for me either.

userName = 'mylogin'
passWord  = 'mypassword'
top_level_url = 'http://page-to-login.com'

# create an authorization handler
passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
passman.add_password(None, top_level_url, userName, passWord)

auth_handler = urllib.request.HTTPBasicAuthHandler(passman)
opener = urllib.request.build_opener(auth_handler)
urllib.request.install_opener(opener)

result = opener.open(top_level_url)
content = result.read()
alleby
  • You can try copying exactly what's in the test case for the bug or just switching to urllib3 and using `urllib3.add_stderr_logger()` – pvg Mar 18 '17 at 16:12

1 Answer

The example in the issue you linked shows working code; a version is reproduced below:

import urllib.request

handler = urllib.request.HTTPHandler(debuglevel=10)
opener = urllib.request.build_opener(handler)
content = opener.open('http://stackoverflow.com').read()

print(content[0:120])
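For the login case added in the question's edit, one option that may work is to pass the debug-enabled HTTPHandler and the HTTPBasicAuthHandler to build_opener together. This is a minimal sketch, not tested against a real login page: `build_debug_opener` is a hypothetical helper name, and the URL and credentials are the placeholders from the question. As far as I can tell, build_opener drops its default HTTPHandler when you pass your own instance, and the handler applies its own debuglevel to every connection it creates, which would also explain why setting the class attribute on http.client.HTTPConnection had no effect.

```python
import urllib.request


def build_debug_opener(top_level_url, user_name, pass_word, debuglevel=1):
    """Hypothetical helper: an opener with basic auth and http.client debug output."""
    passman = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    passman.add_password(None, top_level_url, user_name, pass_word)
    handlers = [
        urllib.request.HTTPBasicAuthHandler(passman),
        # Debug output is configured on the handler that creates the connection.
        urllib.request.HTTPHandler(debuglevel=debuglevel),
    ]
    # HTTPSHandler only exists when Python was built with SSL support.
    if hasattr(urllib.request, "HTTPSHandler"):
        handlers.append(urllib.request.HTTPSHandler(debuglevel=debuglevel))
    return urllib.request.build_opener(*handlers)


opener = build_debug_opener("http://page-to-login.com", "mylogin", "mypassword")
# content = opener.open("http://page-to-login.com").read()  # performs the real request
```

The auth handler itself never needs a debuglevel; it only reacts to the 401 response and retries through the same opener, so the retried request should go through the debug-enabled handler as well.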

This is pretty clunky. Another option is to use a friendlier library like urllib3 (http://urllib3.readthedocs.io/en/latest/).

import urllib3

urllib3.add_stderr_logger()
http = urllib3.PoolManager()
r = http.request('GET', 'http://stackoverflow.com')
print(r.status)

If you decide to use the requests library instead, the following answer describes how to set up logging:

How can I see the entire HTTP request that's being sent by my Python application?

pvg
  • Thank you! The basic example I provided works fine, but when I want to combine it with HTTPBasicAuthHandler to access a page where login is needed, I'm not able to set the debuglevel, as it has no such attribute. So I was hoping for a way to turn on the debuglevel "globally" for HTTP requests, as the original example shows :/ – alleby Mar 19 '17 at 11:19
  • I'm not sure if this comment means this is an acceptable answer or if your question has other requirements. If it's the latter, update your question to explain. – pvg Mar 19 '17 at 17:28
  • Hi, yes, you answered my question, but the method does not always work. I edited my question as suggested. If needed I can open a new question and close this one. Thx – alleby Mar 20 '17 at 08:51
  • Well, that's a pretty different question, you really should be asking about the specific thing you're trying to accomplish. Additionally, providing some context as to your motivation is helpful, clearly you're not trying to debug urllib itself so why are you so keen on getting its internal debug messages? There are more general ways to trace an http flow if that's what you're after. And again, a higher-level library like requests might be appropriate. So perhaps you should indeed write a new question that actually describes what you're after. – pvg Mar 20 '17 at 09:01
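
If the goal is simply to see what goes over the wire regardless of which handlers are combined, one possible approach within urllib itself is a small pre/post-processing handler. The sketch below is an illustration only: `TraceHandler` and its `sink` parameter are hypothetical names, not part of the standard library. It relies on the documented BaseHandler convention that methods named `http_request` and `http_response` are called for every HTTP request and response passing through the opener.

```python
import urllib.request


class TraceHandler(urllib.request.BaseHandler):
    """Hypothetical helper: reports each request and response seen by the opener."""

    # Above HTTPHandler's default order (500), so headers it adds are visible;
    # below HTTPErrorProcessor (1000), so error responses such as 401 are seen too.
    handler_order = 900

    def __init__(self, sink=print):
        self.sink = sink  # any callable that accepts one string

    def http_request(self, req):
        self.sink("> {} {}".format(req.get_method(), req.full_url))
        for name, value in req.header_items():
            self.sink("> {}: {}".format(name, value))
        return req  # must return the request so processing continues

    def http_response(self, req, response):
        self.sink("< {} {}".format(response.status, response.reason))
        for name, value in response.headers.items():
            self.sink("< {}: {}".format(name, value))
        return response  # must return the response so processing continues

    # Reuse the same hooks for HTTPS traffic.
    https_request = http_request
    https_response = http_response


# Combine with any other handlers, e.g. an HTTPBasicAuthHandler instance.
opener = urllib.request.build_opener(TraceHandler())
# content = opener.open("http://stackoverflow.com").read()  # performs the real request
```

This shows less detail than the http.client debug output (no raw socket traffic), but it does not depend on any handler exposing a debuglevel.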