
I'm trying the code from this question: "How can I translate this XPath expression to BeautifulSoup?", but I get an error. Can someone help me understand why I get this error:

spider = self.crawler.spiders.create(spname, **opts.spargs)
File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\spidermanager.py", line 43, in create
  raise KeyError("Spider not found: %s" % spider_name) 
KeyError: 'Spider not found: app'

I installed pyparsing.

This is the code:

from pyparsing import makeHTMLTags, withAttribute, SkipTo 
import urllib

# get the HTML from your URL 
url = "http://www.whitecase.com/Attorneys/List.aspx?LastName=&FirstName="
page = urllib.urlopen(url) 
html = page.read() 
page.close()

# define opening and closing tag expressions for <td> and <a> tags
# (makeHTMLTags also comprehends tag variations, including attributes, 
# upper/lower case, etc.) 
tdStart,tdEnd = makeHTMLTags("td") 
aStart,aEnd = makeHTMLTags("a")

# only interested in tdStarts if they have "class=altRow" attribute 
tdStart.setParseAction(withAttribute(("class","altRow")))

# compose total matching pattern (add trailing tdStart to filter out 
# extraneous <td> matches) 
patt = tdStart + aStart("a") + SkipTo(aEnd)("text") + aEnd + tdEnd + tdStart

# scan input HTML source for matching refs, and print out the text and 
# href values 
for ref,s,e in patt.scanString(html):
    print ref.text, ref.a.href

Thanks in advance! Floriano

  • How are you running this code? The error you get comes from Scrapy not finding a spider called "app". But I don't see scrapy code (spider, rules etc.) – paul trmbrth Jul 31 '13 at 14:37
  • The spider is app_spider.py. I have run other code in app_spider.py and it worked. – Floriano Jul 31 '13 at 14:45
  • Is your spider's `name` attribute "app"? – paul trmbrth Jul 31 '13 at 14:48
  • Please post the code that is actually causing the error instead of this pyparsing code, including details about how the spider is being run. – Talvalin Jul 31 '13 at 15:00
  • Yes, bot_name is 'app' – Floriano Jul 31 '13 at 15:04
  • `bot_name` is one thing, but the spider to run is referenced by its `name` attribute, e.g. "dmoz" in Scrapy's tutorial http://doc.scrapy.org/en/latest/intro/tutorial.html#using-our-item, which matches the `scrapy crawl dmoz` command – paul trmbrth Jul 31 '13 at 15:07
  • Now I get a different error: File "app\spiders\app_spider.py", line 6, in <module> page = urllib.urlopen(url) NameError: name 'urllib' is not defined – Floriano Jul 31 '13 at 15:10
  • Um, how can we help if you won't post your spider code? – Talvalin Jul 31 '13 at 15:17
  • ckages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py", line 76, in _run_print_help func(*a, **kw) File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\cmdline.py", line 138, in _run_command cmd.run(args, opts) File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\commands\crawl.py", line 43, in run spider = self.crawler.spiders.create(spname, **opts.spargs) File "C:\Python27\lib\site-packages\scrapy-0.16.5-py2.7.egg\scrapy\spidermanager.py", line 43, in create raise KeyError("Spider not found: %s" % spider_name) KeyError: 'Spider not found: app' – Floriano Jul 31 '13 at 15:19
  • As @Talvalin suggested, to help find the issue, you should probably post your spider code, Scrapy settings, and console commands and output to a pastebin, https://gist.github.com/ or similar – paul trmbrth Jul 31 '13 at 15:38
  • I cleaned up your pyparsing sample code; I see that you copied it from the linked question. But it doesn't look like you are even getting to the HTML parsing part yet; your spider setup/code is messed up. – PaulMcG Aug 02 '13 at 15:59
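
To illustrate the point made in the comments about the `name` attribute, here is a minimal sketch of a name-based spider lookup. It is a simplified stand-in and not Scrapy's actual implementation: `WhitecaseSpider`, `build_registry` and `create` are hypothetical names, and the spider name "whitecase" is an assumption made only for the example. It shows why a spider that lives in app_spider.py, in a project whose bot name is 'app', can still fail with "Spider not found: app" if its class declares a different `name`.

```python
# Simplified stand-in for a name-based spider lookup.
# NOT Scrapy's real code; all class and function names here are hypothetical.


class WhitecaseSpider(object):
    # Imagine this class lives in app/spiders/app_spider.py.
    # Only this attribute is used for the lookup, not the file name
    # and not the project's bot name.
    name = "whitecase"  # assumed value, for illustration only


def build_registry(spider_classes):
    """Map each class's `name` attribute to the class itself."""
    return dict((cls.name, cls) for cls in spider_classes)


def create(registry, spider_name):
    """Return a spider instance, or raise the same kind of KeyError
    that appears in the traceback in the question."""
    if spider_name not in registry:
        raise KeyError("Spider not found: %s" % spider_name)
    return registry[spider_name]()


registry = build_registry([WhitecaseSpider])

# Looking up the declared name succeeds.
spider = create(registry, "whitecase")
print(type(spider).__name__)

# Looking up the file or bot name fails, as in the question.
try:
    create(registry, "app")
except KeyError as exc:
    print(exc)
```

Under these assumptions, the fix is either to run the crawl command with the name the class actually declares, or to set `name = "app"` on the spider class.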

0 Answers