Hi guys, I'm trying to get the citation counts for a number of papers from Google. This is my code:

import urllib
import mechanize
from bs4 import BeautifulSoup

import csv
import os #change directory
import re #for regular expressions



br = mechanize.Browser()

br.set_handle_equiv(False)
br.set_handle_robots(False)   # ignore robots

br.addheaders = [('User-agent', 'Firefox')]             # [()]
br.open('http://google.com/')

br.select_form(name='f')   # Note: select the form named 'f' here
term = "Multinational Study of the Efficacy and Safety of Humanized Anti-HER2 Monoclonal Antibody in Women Who Have HER2-Overexpressing Metastatic Breast Cancer That Has Progressed After Chemotherapy for Metastatic Disease".replace(" ","+")
br.form['q'] = term # query
data = br.submit()

soup = BeautifulSoup(data)


cite= soup.findAll('div',{'class': 'f slp'})
ref = str(cite[1])
print ref

However, I keep getting errors. I want the number of citations this paper has.

Jayanth Koushik
Naus

1 Answer

The problem is that there is no citation info on the page you get back after the form submit. In other words, there are no div elements with the class f slp, so soup.findAll returns an empty list and cite[1] raises an IndexError.
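To make the failure mode above easier to handle, here is a minimal sketch that returns None when nothing matches instead of indexing blindly into the result list. It assumes the citation snippet reads something like "Cited by 123"; that wording, the helper name citation_count, and the sample string are illustrative assumptions, not something taken from your page.

```python
import re

# Assumption: the snippet text contains something like "Cited by 1,234".
CITED_BY = re.compile(r"Cited by\s+([\d,]+)")


def citation_count(snippets):
    """Return the first citation count found in a list of strings, or None."""
    for text in snippets:
        match = CITED_BY.search(text)
        if match:
            # Drop thousands separators before converting to int.
            return int(match.group(1).replace(",", ""))
    return None


# With the soup from the question, the input could be built like this:
#   snippets = [div.get_text() for div in soup.find_all('div', {'class': 'f slp'})]

# An empty list (no matching divs) gives None rather than an IndexError.
print(citation_count([]))

# Hypothetical snippet text, for illustration only.
print(citation_count(["A Author - 2001 - Cited by 1,234 - Related articles"]))
```

Checking the result for None tells you right away whether the page you fetched contains the citation block at all, which is the first thing to verify here.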

You have several options to solve it:

Hope that helps.

Community
alecxe