I've made a Flask web app that does some machine learning with Python, and I'm currently trying to deploy it to Heroku (using the free plan). Everything works on the development server and when I run heroku local, but the live version fails once the app is deployed.
The app consists of two dynos: a web dyno that runs the main application, and a worker. The worker serves a multiprocessing.managers.BaseManager at the address 127.0.0.1:22109, which is used to manage a shared Python object. That object handles all the state and functionality underlying the machine learning task.
When I launch a machine learning task from the web app, the process in the web dyno should connect to the Manager on the worker dyno so the app can interact with it, but I'm getting a ConnectionRefusedError:
2021-08-18T13:25:55.657204+00:00 app[web.1]: ERROR:neuralnetworkwebapp:Exception on /network-widget/train [GET]
2021-08-18T13:25:55.657214+00:00 app[web.1]: Traceback (most recent call last):
2021-08-18T13:25:55.657231+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 2070, in wsgi_app
2021-08-18T13:25:55.657233+00:00 app[web.1]:     response = self.full_dispatch_request()
2021-08-18T13:25:55.657240+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1515, in full_dispatch_request
2021-08-18T13:25:55.657240+00:00 app[web.1]:     rv = self.handle_user_exception(e)
2021-08-18T13:25:55.657240+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1513, in full_dispatch_request
2021-08-18T13:25:55.657241+00:00 app[web.1]:     rv = self.dispatch_request()
2021-08-18T13:25:55.657241+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/site-packages/flask/app.py", line 1499, in dispatch_request
2021-08-18T13:25:55.657241+00:00 app[web.1]:     return self.ensure_sync(self.view_functions[rule.endpoint])(**req.view_args)
2021-08-18T13:25:55.657242+00:00 app[web.1]:   File "/app/neuralnetworkwebapp/views.py", line 134, in train_network
2021-08-18T13:25:55.657242+00:00 app[web.1]:     manager = get_manager(current_app)
2021-08-18T13:25:55.657243+00:00 app[web.1]:   File "/app/neuralnetworkwebapp/persistence.py", line 29, in get_manager
2021-08-18T13:25:55.657244+00:00 app[web.1]:     manager.connect()
2021-08-18T13:25:55.657244+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/multiprocessing/managers.py", line 522, in connect
2021-08-18T13:25:55.657245+00:00 app[web.1]:     conn = Client(self._address, authkey=self._authkey)
2021-08-18T13:25:55.657245+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/multiprocessing/connection.py", line 507, in Client
2021-08-18T13:25:55.657245+00:00 app[web.1]:     c = SocketClient(address)
2021-08-18T13:25:55.657246+00:00 app[web.1]:   File "/app/.heroku/python/lib/python3.9/multiprocessing/connection.py", line 635, in SocketClient
2021-08-18T13:25:55.657246+00:00 app[web.1]:     s.connect(address)
2021-08-18T13:25:55.657246+00:00 app[web.1]: ConnectionRefusedError: [Errno 111] Connection refused
These are the two functions in persistence.py that handle the creation of the manager (init_manager) and subsequent connections to it (get_manager):
def init_manager(host, port, key):
    # Worker side: create the manager's server and block while serving requests.
    manager = StateManager(address=(host, port), authkey=key)
    server = manager.get_server()
    print(f'HOST: {host}\nPORT: {port}')
    server.serve_forever()


def get_manager(app):
    # Web side: connect to the already-running manager using the app config.
    manager = StateManager(
        address=(app.config['MANAGER_HOST'], app.config['MANAGER_PORT']),
        authkey=app.config['MANAGER_AUTHKEY'],
    )
    manager.register('TrainNetwork')
    manager.connect()
    return manager
Here, StateManager is just my subclass of BaseManager.
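For anyone who wants to reproduce the pattern without my full app, here is a minimal single-process sketch of it. Everything in it is a stand-in rather than my real code: the TrainNetwork class is a hypothetical placeholder for the shared object, and ServerManager and ClientManager are hypothetical names that play the role of StateManager on the worker and web sides respectively (two separate subclasses are used only because both ends live in one process here and would otherwise share one registry). The server runs in a background thread and binds to port 0 so the OS picks a free port.

```python
import threading
from multiprocessing.managers import BaseManager


class TrainNetwork:
    """Hypothetical stand-in for the shared object that holds the ML state."""

    def __init__(self):
        self.epochs = 0

    def train(self, n):
        self.epochs += n
        return self.epochs


class ServerManager(BaseManager):
    """Stand-in for StateManager as used on the worker side."""


class ClientManager(BaseManager):
    """Stand-in for StateManager as used on the web side."""


def start_server(host, port, key):
    # Register a callable that always hands back the same shared instance.
    shared = TrainNetwork()
    ServerManager.register('TrainNetwork', callable=lambda: shared)
    manager = ServerManager(address=(host, port), authkey=key)
    server = manager.get_server()
    # The real worker calls serve_forever() directly and blocks; a daemon
    # thread is used here only so the sketch fits in one process.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server.address  # the actual (host, port) the server is bound to


def connect(address, key):
    # Client side: register the typeid without a callable, then connect.
    ClientManager.register('TrainNetwork')
    manager = ClientManager(address=address, authkey=key)
    manager.connect()
    return manager
```

Calling start_server('127.0.0.1', 0, b'secret') and then connect(...) with the returned address gives a manager whose TrainNetwork() proxy forwards method calls to the single shared instance, which is the behaviour I get locally.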
The error arises when I try to call get_manager in the web dyno. I've been trying to work out a solution, but I'm quite new to this and I'm a bit stumped. I'm not sure if it's useful, but here's the Procfile I'm using to run the app:
worker: python runmanager.py
web: waitress-serve --call --port=$PORT neuralnetworkwebapp:create_app
And this is how I'm calling init_manager in runmanager.py:
import os

from neuralnetworkwebapp.persistence import init_manager

MANAGER_HOST = '127.0.0.1'
MANAGER_PORT = 22109
MANAGER_AUTHKEY = bytes(os.environ.get('MANAGER_AUTHKEY'), 'utf8')
init_manager(MANAGER_HOST, MANAGER_PORT, MANAGER_AUTHKEY)
I know that when serving the web process with waitress, the port is assigned by Heroku and made available via the PORT environment variable. Does something similar happen with the worker process, meaning I shouldn't be hard-coding the address as 127.0.0.1:22109? I also read a comment somewhere about possible connection limits on the free plan, but I haven't found any further information or potential fixes to try.
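To narrow down whether the address itself is the problem, a small reachability check along these lines could be run from each dyno. This is only a diagnostic sketch: can_connect is a hypothetical helper of my own, not part of multiprocessing or Heroku, and it only tests whether a plain TCP connection to the given host and port succeeds.

```python
import socket


def can_connect(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts and unreachable hosts.
        return False
```

If can_connect('127.0.0.1', 22109) returns True on the worker dyno but False on the web dyno while the manager is running, that would suggest the two dynos don't share a loopback interface, rather than pointing to a bug in get_manager.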
Any help would be much appreciated.