
I'm trying to get Selenium to work with Tor on an AWS EC2 instance running Ubuntu 20.04. Here are the steps I tried, with /home/ubuntu as the working directory:

  1. Install pre-requisites:
sudo apt update
sudo apt install unzip libnss3 python3-pip
  2. Install geckodriver:
sudo apt-get install firefox-geckodriver
  3. Install the Tor Browser Bundle:
sudo wget https://www.torproject.org/dist/torbrowser/10.5.8/tor-browser-linux64-10.5.8_en-US.tar.xz
sudo tar -xf tor-browser-linux64-10.5.8_en-US.tar.xz
sudo chmod 775 tor-browser_en-US
  4. Create test.py:
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.firefox.options import Options
from selenium import webdriver

tor = '/home/ubuntu/tor-browser_en-US/Browser/firefox'
firefox_binary = FirefoxBinary(tor)

options = Options()
options.headless = True

driver = webdriver.Firefox(executable_path='/usr/bin/geckodriver',
                            firefox_binary=firefox_binary,
                            options=options)  # pass the headless options; otherwise they are never used

driver.get("http://google.com/")
print(driver.page_source)
driver.quit()

The structure of test.py is taken from this SO answer.

Here's where the problems begin: running python3 test.py returns the following error:

Traceback (most recent call last):
  File "test.py", line 11, in <module>
    driver = webdriver.Firefox(executable_path='/usr/bin/geckodriver', 
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
    RemoteWebDriver.__init__(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: binary is not a Firefox executable

By binary I assume the error refers to tor = '/home/ubuntu/tor-browser_en-US/Browser/firefox'. I obtained this path by looking at the contents of tor-browser_en-US, the result of Step 3, and finding firefox. I also tried replacing firefox with firefox.real, start-tor-browser, and start-tor-browser.desktop, but all resulted in the same error.
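
A minimal sketch of how each candidate path could be inspected before handing it to Selenium. It assumes the error might come from the path not being a regular, executable file for the current user; that is only a guess, since the error message itself does not say so. The helper name inspect_candidate is hypothetical, and the paths are the ones tried above.

```python
import os
import stat


def inspect_candidate(path):
    """Report basic facts about a path: existence, file type, exec access, owner."""
    info = {"path": path, "exists": os.path.exists(path)}
    if not info["exists"]:
        return info
    st = os.stat(path)  # follows symlinks
    info["is_regular_file"] = stat.S_ISREG(st.st_mode)
    info["executable_by_me"] = os.access(path, os.X_OK)
    info["owner_uid"] = st.st_uid  # 0 would mean the file is owned by root
    info["mode"] = stat.filemode(st.st_mode)
    return info


if __name__ == "__main__":
    # Directory and file names as tried in the question.
    base = "/home/ubuntu/tor-browser_en-US/Browser"
    for name in ("firefox", "firefox.real",
                 "start-tor-browser", "start-tor-browser.desktop"):
        print(inspect_candidate(os.path.join(base, name)))
```

Since the archive was downloaded and extracted with sudo, the owner and mode fields may be worth comparing against the user that runs test.py.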

I also tried removing executable_path, thinking that might help given that Step 2 puts geckodriver on the PATH. Same error.
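
To confirm which geckodriver would be picked up when executable_path is omitted, something like the following could be used. It is only a sketch: resolve_driver is a hypothetical helper, and it relies on shutil.which from the standard library, which performs roughly the same PATH lookup a shell would.

```python
import shutil


def resolve_driver(name="geckodriver", search_path=None):
    """Return the full path of the first matching executable, or None.

    When search_path is None, the PATH environment variable is used.
    """
    return shutil.which(name, path=search_path)


if __name__ == "__main__":
    print(resolve_driver())
```

A result of None would suggest that geckodriver is not actually reachable through PATH for the user running the script.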

Perhaps my most direct question would be: Given my project structure, how can I find Tor's Firefox binary on Linux? Are additional manipulations of tor-browser_en-US needed? Should tor-browser_en-US be located in a different directory, such as /usr/bin/?
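
For the "how can I find the binary" part, a rough sketch of a search over the extracted directory is below. It assumes the binary is an executable regular file whose name starts with "firefox"; that naming assumption comes from the files seen in tor-browser_en-US and may not hold for other bundle versions. The function name find_firefox_candidates is hypothetical.

```python
import os


def find_firefox_candidates(root):
    """Walk root and return executable regular files named firefox*."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            full = os.path.join(dirpath, filename)
            if (filename.startswith("firefox")
                    and os.path.isfile(full)
                    and os.access(full, os.X_OK)):
                matches.append(full)
    return sorted(matches)


if __name__ == "__main__":
    for candidate in find_firefox_candidates("/home/ubuntu/tor-browser_en-US"):
        print(candidate)
```

An empty result would mean no file under that directory is both named firefox* and executable by the current user.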
