I'm trying to get Selenium to work with Tor on an AWS EC2 instance running Ubuntu 20.04. Here are the steps I tried, with /home/ubuntu as the working directory:
- Install pre-requisites:
sudo apt update
sudo apt install unzip libnss3 python3-pip
- Install geckodriver:
sudo apt-get install firefox-geckodriver
- Install the Tor Browser Bundle:
sudo wget https://www.torproject.org/dist/torbrowser/10.5.8/tor-browser-linux64-10.5.8_en-US.tar.xz
sudo tar -xf tor-browser-linux64-10.5.8_en-US.tar.xz
sudo chmod 775 tor-browser_en-US
- Create test.py:
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.firefox.options import Options
from selenium import webdriver
tor = '/home/ubuntu/tor-browser_en-US/Browser/firefox'
firefox_binary = FirefoxBinary(tor)
options = Options()
options.headless = True
driver = webdriver.Firefox(executable_path='/usr/bin/geckodriver',
                           firefox_binary=firefox_binary)
driver.get("http://google.com/")
print(driver.page_source)
driver.quit()
The structure of test.py is taken from this SO answer.
Here's where the problems begin: running python3 test.py returns the following error:
Traceback (most recent call last):
  File "test.py", line 11, in <module>
    driver = webdriver.Firefox(executable_path='/usr/bin/geckodriver',
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/firefox/webdriver.py", line 170, in __init__
    RemoteWebDriver.__init__(
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 157, in __init__
    self.start_session(capabilities, browser_profile)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 252, in start_session
    response = self.execute(Command.NEW_SESSION, parameters)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/ubuntu/.local/lib/python3.8/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.InvalidArgumentException: Message: binary is not a Firefox executable
I assume "binary" in the error refers to tor = '/home/ubuntu/tor-browser_en-US/Browser/firefox'. I obtained this path by looking through tor-browser_en-US (the directory extracted in the third step) and finding a file named firefox. I also tried replacing firefox with firefox.real, start-tor-browser, and start-tor-browser.desktop, but each one produced the same error.
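For reference, here is a minimal sketch of how the four candidate paths above could be inspected before handing one to Selenium. The helper name describe_candidate is hypothetical. It only reports whether each file exists, whether the current user may execute it, and whether it starts with the ELF magic bytes (a compiled executable) or a "#!" line (a script). It does not claim to reproduce the check geckodriver performs.

```python
import os


def describe_candidate(path):
    """Report basic facts about a file that might be the Firefox binary.

    Hypothetical helper for diagnosis only; it does not mirror geckodriver's
    own validation of the binary.
    """
    info = {"path": path, "exists": os.path.isfile(path),
            "executable": False, "kind": None}
    if not info["exists"]:
        return info
    # Can the current (non-root) user execute this file?
    info["executable"] = os.access(path, os.X_OK)
    try:
        with open(path, "rb") as f:
            head = f.read(4)
    except OSError:
        # For example, the file is owned by root and not readable.
        info["kind"] = "unreadable"
        return info
    if head == b"\x7fELF":
        info["kind"] = "elf"      # compiled executable
    elif head[:2] == b"#!":
        info["kind"] = "script"   # shell or other interpreted wrapper
    else:
        info["kind"] = "other"
    return info


if __name__ == "__main__":
    # Directory and file names are the ones from the steps above.
    browser_dir = "/home/ubuntu/tor-browser_en-US/Browser"
    for name in ("firefox", "firefox.real", "start-tor-browser",
                 "start-tor-browser.desktop"):
        print(describe_candidate(os.path.join(browser_dir, name)))
```

Because the archive was extracted with sudo, the "executable" and "unreadable" results may differ between the ubuntu user and root, which is why the sketch reports them separately.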
I also removed the executable_path argument, thinking that might help given that the second step puts geckodriver on the PATH. Same error.
Perhaps my most direct question would be: Given my project structure, how can I find Tor's Firefox binary on Linux? Are additional manipulations of tor-browser_en-US needed? Should tor-browser_en-US be located in a different directory, such as /usr/bin/?
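To make the first question concrete, this is a small sketch of how the extracted tree could be searched for files whose names start with "firefox". The function name find_firefox_candidates is hypothetical, and matching on the file name prefix is an assumption on my part; I don't know whether the binary Selenium expects follows that naming.

```python
import os


def find_firefox_candidates(root):
    """Return sorted paths under root whose file name starts with 'firefox'.

    Hypothetical helper; matching on the name prefix is an assumption.
    """
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.startswith("firefox"):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # Directory created by extracting the Tor Browser archive above.
    for path in find_firefox_candidates("/home/ubuntu/tor-browser_en-US"):
        print(path)
```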