I am trying to detect the Google reCAPTCHA page while scraping Google search results with Selenium. Here is part of the scraping code I wrote:

def spider(search_term, intext_term, include_term, target_site):
    driver = open_webdriver()
    driver.implicitly_wait(10)
    num_records_scraped = 0

    for page in range(0, Max_Page, 10):
        search_url = target_url(search_term, intext_term, include_term, target_site, page)
        driver.get(search_url)
        items = select_wholePage(driver)
        for item in items:
            record = get_result(item)
            if record:
                records.append(record)
                num_records_scraped += 1
        time_interval()

    driver.quit()

The page offset starts at start=0 and increases by 10 on each iteration (for page in range(0, Max_Page, 10)). After moving through the next several pages, Google usually redirects to the CAPTCHA page. I verified that the CAPTCHA page contains an element with the ID "recaptcha-token", so I decided to use this:

recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))

and tried it like this:

    for page in range(0, Max_Page, 10):
        search_url = target_url(search_term, intext_term, include_term, target_site, page)
        driver.get(search_url)
        recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
        if recaptcha :
            print('This is recaptcha')
        else:
            items = select_wholePage(driver)
            for item in items:
                record = get_result(item)
                if record:
                    records.append(record)
                    num_records_scraped += 1
            time_interval()

    driver.quit()

but it raises a timeout error on the pages that are not the CAPTCHA page:

    recaptcha = WebDriverWait(driver, 10).until(EC.presence_of_element_located((By.ID, "recaptcha-token")))
in until
    raise TimeoutException(message, screen, stacktrace)
selenium.common.exceptions.TimeoutException: Message:

I think there is a problem in my logic for detecting the CAPTCHA element by its ID. Please help me.
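
For reference, WebDriverWait(...).until(...) raises TimeoutException when the condition is never met; it does not return a falsy value, so the else branch above is never reached on a normal results page. A minimal sketch of a check that does not raise, assuming the element ID "recaptcha-token" from above is correct and reachable from the top-level document (not verified here). The names is_captcha_page and FakeDriver are hypothetical and exist only for this illustration:

```python
RECAPTCHA_ID = "recaptcha-token"  # element ID reported in the question (assumption)


def is_captcha_page(driver):
    """Return True if an element with the reCAPTCHA ID is present on the page."""
    # find_elements returns an empty list when nothing matches, whereas
    # WebDriverWait(...).until(...) raises TimeoutException in that case.
    # "id" is the string value of Selenium's By.ID locator strategy.
    return len(driver.find_elements("id", RECAPTCHA_ID)) > 0


class FakeDriver:
    """Hypothetical stand-in for a WebDriver, used only to demonstrate the check."""

    def __init__(self, ids):
        self.ids = set(ids)

    def find_elements(self, by, value):
        # Mimic Selenium: a list of matches, or an empty list if none.
        return [value] if by == "id" and value in self.ids else []


if __name__ == "__main__":
    print(is_captcha_page(FakeDriver(["recaptcha-token"])))
    print(is_captcha_page(FakeDriver([])))
```

Inside the loop this could be used as `if is_captcha_page(driver): ... else: ...` in place of the WebDriverWait call. Because driver.implicitly_wait(10) is set, find_elements may still wait up to 10 seconds on a page without the element before returning an empty list.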

SY Moon