6

I noticed that when running wget https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=foo and similar queries, I get the Google homepage instead of the search results.

There seems to be some redirect within the Google page. Does anyone know how to make wget work for this?

2 Answers

12

You can use these curl commands to pull Google query results:

curl -sA "Chrome" -L 'http://www.google.com/search?hl=en&q=time' -o search.html

For an https URL:

curl -k -sA "Chrome" -L 'https://www.google.com/search?hl=en&q=time' -o ssearch.html

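The -A option sets a custom user-agent string (Chrome) in the request to Google. Of the other options, -s silences progress output, -L follows redirects, -o names the output file, and -k skips TLS certificate verification.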

anubhava
7

#q=foo is your hint, as that's a fragment ID, which never gets sent to the server. I'm guessing you just took this URL from your browser URL-bar when using the live-search function. Since it is implemented with a lot of client-side magic, you cannot rely on it to work; try using Google with live search disabled instead. A URL pattern that seems to work looks like this: http://www.google.com/search?hl=en&q=foo.
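As a rough illustration of the fragment point, this shell sketch splits the URL from the question at the first #. The variable names sent and fragment are made up for the example, and it only mimics what an HTTP client does before sending a request; it is not how wget is implemented.

```shell
# URL copied from the question; single quotes keep the shell from
# interpreting '&' and '#'.
url='https://www.google.com/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=foo'

# Everything before the first '#' is what a client actually requests.
sent="${url%%#*}"

# Everything after the first '#' is the fragment, which stays client-side.
fragment="${url#*#}"

echo "sent to server:   $sent"
echo "kept client-side: $fragment"
```

The search term lives only in the fragment, so the server never sees q=foo and answers with the plain homepage.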

However, I do notice that Google returns 403 Forbidden when called naïvely with wget, indicating that they don't want automated requests of this kind. You can easily get past it by setting some other user-agent string, but do consider all the implications before doing so on a regular basis.
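A minimal sketch of what that could look like with wget, assuming the /search URL pattern mentioned above still behaves as described. The function name fetch_results, the output file search.html, and the user-agent value Chrome are illustrative choices, not anything Google or wget prescribes. -U sets the user-agent string and -O names the output file.

```shell
# Illustrative values; adjust as needed.
query='foo'
user_agent='Chrome'

# Plain search URL with the term in the query string, not in a fragment.
url="http://www.google.com/search?hl=en&q=${query}"

# Hypothetical helper: defined here, not called automatically.
fetch_results() {
    # Quoting "$url" matters: an unquoted '&' would make the shell
    # run wget in the background and cut the URL short.
    wget -U "$user_agent" -O search.html "$url"
}
```

Calling fetch_results performs the download. Whether the response is usable depends on how Google treats the chosen user-agent at the time.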

Dolda2000