8

I have installed the nltk package and am now trying to download the supporting data packages with nltk.download(), but I get this error:

[Errno 11001] getaddrinfo

My machine / software details are:

OS: Windows 8.1
Python: 3.3.4
NLTK Package: 3.0

Below are the commands run in python:

Python 3.3.4 (v3.3.4:7ff62415e426, Feb 10 2014, 18:13:51) [MSC v.1600 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.

>>> import nltk
>>> nltk.download()
showing info http://nltk.github.com/nltk_data/
True
>>> nltk.download("all")
[nltk_data] Error loading all: <urlopen error [Errno 11001]
[nltk_data]     getaddrinfo failed>
False

It looks like it is going to http://nltk.github.com/nltk_data/, whereas ideally it should try to get the data from http://www.nltk.org/nltk_data/.

On another machine, typing http://nltk.github.com/nltk_data/ into the browser redirects to http://www.nltk.org/nltk_data/. I do not understand why the redirection is not happening on my laptop.

I feel that this might be the issue.

Kindly help.

I have added a screenshot of the command prompt.

[Screenshot: command prompt]

Regards, Bonson

Bonson
  • Hello @elyase I do not have http_proxy as a variable. Also this is a home computer so I do not have a firewall. Is there anything specific I should check in the DNS? – Bonson Jan 03 '15 at 11:42

6 Answers

9

Try the code below. With it, the package downloaded as expected.

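import nltk
import ssl

# Workaround: make HTTPS requests skip certificate verification.
# This disables a security check, so use it only on a network you trust.
try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    # Older Python versions have no such function; nothing to change.
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()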

It looks like the link was failing before, and the ssl workaround fixed it.

Note: this was run on a Mac.

Swarit Agarwal
4

I got this error because of a network restriction. Here is how I solved it.

I browsed to http://www.nltk.org/nltk_data/ and downloaded the required corpora from the corresponding links.

Then I placed the downloaded files under C:/ on Windows (or another relevant directory such as C:/ProgramData/Anaconda3), using the same folder structure as in https://github.com/nltk/nltk_data/tree/gh-pages/packages
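The manual placement described above can be sketched in Python. This is only an illustration under assumptions: `install_package_zip` is a hypothetical helper (it is not part of NLTK), and the file name, target directory and category in the usage note are placeholders for whatever you actually downloaded.

```python
import zipfile
from pathlib import Path


def install_package_zip(zip_path, data_dir, category):
    """Unpack a manually downloaded package zip into data_dir/category.

    Hypothetical helper, not part of NLTK. `category` is the folder the
    package sits in on the nltk_data packages page (for example "corpora").
    """
    target = Path(data_dir) / category
    # Create data_dir/category if it does not exist yet.
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target
```

For example, `install_package_zip("stopwords.zip", "C:/nltk_data", "corpora")` would unpack a downloaded stopwords archive. If NLTK still cannot find the data, the directory can be added to its search list with `nltk.data.path.append("C:/nltk_data")`.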

Avijit Das
3

Got the solution. The issue in my case was that when the NLTK downloader started, its server index was set to http://nltk.github.com/nltk_data/

This needs to be changed to http://nltk.org/nltk_data/

You can change it in the NLTK Downloader window under File -> Change Server Index.

Regards, Bonson

Bonson
  • 1
    Hi, I overcame this problem in the NLTK downloader by changing the server, but how do I do it in code? I am getting [nltk_data] Error loading all: Error while running the code – user3207655 Nov 18 '21 at 19:15
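One way to set the server index from code, as a rough sketch rather than a confirmed fix: NLTK's `Downloader` class accepts a `server_index_url` argument. The wrapper `download_from_index` is a hypothetical name, and the default URL is simply the one quoted in the answer above; it is an assumption that this URL is still the right index for your NLTK version.

```python
def download_from_index(package, index_url="http://nltk.org/nltk_data/"):
    """Download `package` using an explicit server index URL.

    Hypothetical wrapper. The default URL is the one quoted in the
    answer above and may not be correct for current NLTK releases.
    """
    # Imported lazily so the function can be defined without NLTK installed.
    from nltk.downloader import Downloader

    downloader = Downloader(server_index_url=index_url)
    # Returns True on success and False on failure, like nltk.download().
    return downloader.download(package)
```

Calling `download_from_index("all")` would then mirror `nltk.download("all")` from the question, but against the chosen index.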
1

Setting the http and https proxies in environment variables resolved the issue for me:

set http_proxy=http://IPN:PWD@ipaddress:port
set https_proxy=https://IPN:PWD@ipaddress:port

Ask your network or admin team for the proxy IP address.
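The same variables can also be set from inside Python, for the current process only. This is a minimal sketch: `use_proxy` is a hypothetical helper, and the proxy URL in the usage note is a placeholder. It relies on `urllib` reading these variables, which the `urlopen error` in the question suggests is what NLTK uses for downloads.

```python
import os
import urllib.request


def use_proxy(http_url, https_url=None):
    """Set http_proxy/https_proxy for this Python process only.

    Hypothetical helper. If https_url is omitted, the same proxy URL
    is used for both schemes (an assumption; check with your admin).
    """
    os.environ["http_proxy"] = http_url
    os.environ["https_proxy"] = https_url or http_url
    # Return what urllib now sees, so the setting can be checked.
    return urllib.request.getproxies()
```

For example, call `use_proxy("http://user:password@proxy.example:8080")` before `nltk.download()`.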

0

The error might be caused by the proxy your system uses. See the following link, where I have posted the answer:

Error in downloading NLTK data: [Errno 11004] getaddrinfo failed

Ranjeet
0

I was facing this issue in my Jupyter notebook as well. The code snippet below, from another Stack Overflow answer, helped. Posting it in case it helps someone else:

import socket
socket.getaddrinfo('localhost', 8080)

Ref: "getaddrinfo failed", what does that mean?
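"getaddrinfo failed" means the host name could not be resolved to an address; on Windows, error 11001 is the "host not found" code. A small check like the one below can show whether name resolution works at all from your Python session. `can_resolve` is a hypothetical helper, and port 80 is just a default chosen for illustration.

```python
import socket


def can_resolve(host, port=80):
    """Return True if `host` resolves to an address, False otherwise."""
    try:
        socket.getaddrinfo(host, port)
    except socket.gaierror as err:
        # Same family of error that nltk.download() reported.
        print(f"Could not resolve {host!r}: {err}")
        return False
    return True
```

Trying `can_resolve("nltk.github.com")` and `can_resolve("www.nltk.org")` shows which of the two hosts from the question your machine can look up.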