42

While running a Python script using NLTK I got this:

Traceback (most recent call last):
  File "cpicklesave.py", line 56, in <module>
    pos = nltk.pos_tag(words)
  File "/usr/lib/python2.7/site-packages/nltk/tag/__init__.py", line 110, in pos_tag
    tagger = PerceptronTagger()
  File "/usr/lib/python2.7/site-packages/nltk/tag/perceptron.py", line 140, in __init__
    AP_MODEL_LOC = str(find('taggers/averaged_perceptron_tagger/'+PICKLE))
  File "/usr/lib/python2.7/site-packages/nltk/data.py", line 641, in find
    raise LookupError(resource_not_found)
LookupError:
**********************************************************************
  Resource u'taggers/averaged_perceptron_tagger/averaged_perceptro
  n_tagger.pickle' not found.  Please use the NLTK Downloader to
  obtain the resource:  >>> nltk.download()
  Searched in:
    - '/root/nltk_data'
    - '/usr/share/nltk_data'
    - '/usr/local/share/nltk_data'
    - '/usr/lib/nltk_data'
    - '/usr/local/lib/nltk_data'
**********************************************************************

Can anyone explain the problem?

erip
Shiv Shankar

9 Answers

63

Use

>>> nltk.download()

to install the missing resource (the perceptron tagger model).

(See also the answers to "Failed loading english.pickle with nltk.data.load".)

user2314737
  • 2
    nltk.download starts a large download of data: all, all-corpora, all-nltk, book, popular, test, and thirdparty – Golden Lion Dec 01 '20 at 15:51
40

The first answer calls the missing resource 'the Perceptron Tagger'; its actual name in nltk.download is 'averaged_perceptron_tagger'.

You can fix the error with:

import nltk
nltk.download('averaged_perceptron_tagger')
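
If the script runs repeatedly, it may be worth downloading only when the resource is actually missing. The following is a minimal sketch, not part of the original answer: `ensure_resource` is a hypothetical helper name, and the resource path is taken from the error message in the question. It relies on `nltk.data.find` raising `LookupError` when a resource is absent, which is what the traceback shows. Newer NLTK releases may use a different resource name, so copy the name from your own error message.

```python
def ensure_resource(resource_path, package_id, find=None, download=None):
    """Download `package_id` only if `resource_path` cannot be found.

    Returns True if a download was attempted, False if the resource
    was already present. `find` and `download` default to NLTK's own
    functions; they are parameters only so the helper can be tested.
    """
    if find is None or download is None:
        import nltk  # imported lazily (hypothetical design choice)
        find = find or nltk.data.find
        download = download or nltk.download
    try:
        find(resource_path)      # raises LookupError when missing
        return False
    except LookupError:
        download(package_id)     # fetch just this one package
        return True
```

Calling `ensure_resource('taggers/averaged_perceptron_tagger', 'averaged_perceptron_tagger')` before `nltk.pos_tag(words)` should then avoid both the error and repeated downloads.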

Posuer
    To download it from the command line, use `python -m nltk.downloader averaged_perceptron_tagger`. – Papples Aug 04 '17 at 14:12
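
The command-line form in the comment above can also be launched from Python. This is a rough sketch with hypothetical helper names (`downloader_command`, `run_downloader`). It assumes that using `sys.executable` is desirable so the download runs under the same interpreter, and therefore the same NLTK installation, as the script.

```python
import subprocess
import sys


def downloader_command(package, python=sys.executable):
    """Build the argument list for `python -m nltk.downloader <package>`."""
    return [python, "-m", "nltk.downloader", package]


def run_downloader(package):
    """Run the downloader in a child process.

    subprocess raises CalledProcessError if the child exits
    with a non-zero status.
    """
    subprocess.run(downloader_command(package), check=True)
```

For example, `run_downloader("averaged_perceptron_tagger")` would run the same command as the comment, using the current interpreter.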
23

TL;DR

import nltk
nltk.download('averaged_perceptron_tagger')

Or, to download all packages, data, and docs:

import nltk
nltk.download('all')

See How do I download NLTK data?

alvas
  • Hi, where is the content saved after downloading all the NLTK data with `nltk.download("all")`? – Pyd Feb 20 '18 at 07:11
  • 1
    See https://stackoverflow.com/questions/22211525/how-do-i-download-nltk-data and more specifically https://stackoverflow.com/a/36383314/610569 – alvas Feb 20 '18 at 11:10
  • Hi, it just downloads all the packages and stops; the rest of the code doesn't execute. – Partha Paul Aug 27 '21 at 12:14
10

Install all nltk resources in one line:

python3 -c "import nltk; nltk.download('all')"

The data will be saved in ~/nltk_data.

You can also substitute "averaged_perceptron_tagger" for "all" to install only this resource.
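
To check where the data actually landed, you can look through the directories NLTK reports searching. The sketch below uses only the standard library. The directory list is copied from the "Searched in:" part of the error message in the question, with ~/nltk_data standing in for /root/nltk_data; your own list may differ. `dirs_containing` is a hypothetical helper name, and the check assumes the package was unzipped into a folder, which may not hold for every resource.

```python
import os

# Directories taken from the "Searched in:" section of the error
# message; ~/nltk_data is the per-user location mentioned above.
CANDIDATE_DIRS = [
    os.path.expanduser("~/nltk_data"),
    "/usr/share/nltk_data",
    "/usr/local/share/nltk_data",
    "/usr/lib/nltk_data",
    "/usr/local/lib/nltk_data",
]


def dirs_containing(resource, candidates=CANDIDATE_DIRS):
    """Return the candidate directories that contain `resource`,
    given as a path relative to the data directory."""
    return [d for d in candidates
            if os.path.exists(os.path.join(d, resource))]


# Prints an empty list if the tagger has not been downloaded yet.
print(dirs_containing(os.path.join("taggers", "averaged_perceptron_tagger")))
```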

Lucas Azevedo
  • 1
    nltk.download() works fine; it is already marked as the correct answer. – Shiv Shankar Mar 29 '19 at 03:46
  • @ShivShankar with that approach you have to start `python3`, then run `import nltk` and `nltk.download()`, which puts you at a download prompt asking for the name of the package you want. It is not a wrong solution, it's just not as simple as the one I'm proposing. – Lucas Azevedo Mar 29 '19 at 10:52
2

Problem: a LookupError when fitting a CountVectorizer from scikit-learn. The code snippet is below.

from sklearn.feature_extraction.text import CountVectorizer
bow_transformer = CountVectorizer(analyzer=text_process).fit(X)

Solution: run the code below, then install the stopwords corpus from the NLTK downloader.

import nltk
nltk.download()
2

You can download the missing NLTK resource with:

import nltk
nltk.download()

This shows the NLTK download screen. If it fails with an SSL "certificate verify failed" error, disabling the SSL check with the code below should work:

import nltk
import ssl

try:
    _create_unverified_https_context = ssl._create_unverified_context
except AttributeError:
    pass
else:
    ssl._create_default_https_context = _create_unverified_https_context

nltk.download()
ishwardgret
0

Sometimes calling nltk.download('module_name') does not actually download anything. In that case, open Python in interactive mode and run nltk.download('module_name') there.

Lucky Sunda
-1

If you have not installed nltk, install it first, then run nltk.download('punkt'); that should give you the result.

kk.
  • I don't believe the punkt tokenizer is the missing piece here. See, for example, Lucas' answer. – Andy Sep 09 '20 at 13:42
-1
import nltk
nltk.download('vader_lexicon')

Try this; it might work.