Possible Duplicate:
What's a good tool to screen-scrape with Javascript support?
I’m trying to do some screen-scraping of my bank’s website. (I know, I’m probably onto a loser, but bear with me.)
The site seems to be setting several cookies, with varying session-related values, via JavaScript, and then redirecting to the home page if it can’t find those values.
I’ve been trying to work out the values of those cookies by reading the HTML/JavaScript source of the pages, but the relevant code is heavily obfuscated, so I’m having a hard time doing it.
Is there a Python library that simulates a web browser with JavaScript enabled? I was thinking of something like mechanize that also:
- parses the HTML page returned (e.g. with something like lxml)
- parses and executes any JavaScript on the HTML page
- sets any cookies set by the JavaScript
- amends the parsed HTML page with any DOM modifications made by the JavaScript
Basically a web browser that’s programmable in Python. Failing that, a solution in any other language.
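To make the requirement concrete, here is a rough sketch of the interface I’m after. It assumes a real browser driven through Selenium WebDriver, which is only one possible approach and not something I’ve confirmed works against my bank’s site. The helper name `fetch_rendered`, the `demo` function and the `https://bank.example/login` URL are placeholders of mine.

```python
def fetch_rendered(driver, url):
    """Load `url` in a JavaScript-capable browser and return (cookies, html).

    `driver` is assumed to be a Selenium WebDriver instance, or anything
    else exposing get(), get_cookies() and page_source.
    """
    # The browser runs the page's scripts, so cookies set from JavaScript
    # end up in its cookie store. get() returns once the page has loaded;
    # scripts that finish later may need an explicit wait.
    driver.get(url)

    # get_cookies() returns a list of dicts with "name" and "value" keys
    # (among others); flatten it into a simple name -> value mapping.
    cookies = {c["name"]: c["value"] for c in driver.get_cookies()}

    # page_source is the browser's current serialization of the page, which
    # typically includes DOM changes made by scripts up to this point.
    html = driver.page_source
    return cookies, html


def demo():
    # Imported here so fetch_rendered() stays usable without Selenium.
    from selenium import webdriver

    driver = webdriver.Firefox()  # assumes Firefox and its driver are installed
    try:
        cookies, html = fetch_rendered(driver, "https://bank.example/login")
        print(sorted(cookies))
        print(len(html))
    finally:
        driver.quit()
```

Calling `demo()` needs Selenium and a local Firefox. The returned `html` string could then be handed to something like lxml for parsing, and the cookies copied into a lighter HTTP client for later requests.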