60

I am writing a Python crawler for Twitter using Twitter-py. I have set the crawler to sleep for a while (2 seconds) between requests to api.twitter.com. However, after running for some time (around 1), and before Twitter's rate limit has been exceeded, I get this error:

[Errno 10054] An existing connection was forcibly closed by the remote host.

What are the possible causes of this problem, and how can I solve it?

I have searched and found that the Twitter server itself may forcibly close the connection when it receives too many requests.

Thank you very much in advance.

Kara
Nama Keru

5 Answers

22

This can be caused by the two sides of the connection disagreeing over whether the connection timed out during a keepalive. (Your code tries to reuse the connection just as the server is closing it because it has been idle for too long.) You should basically just retry the operation over a new connection. (I'm surprised your library doesn't do this automatically.)
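
A minimal sketch of the retry idea, under some assumptions: `call_with_retry` and its parameters are hypothetical names, and `operation` stands for any zero-argument callable that opens a fresh connection each time it runs. It catches Python's built-in `ConnectionResetError` and `ConnectionAbortedError`; if your HTTP library wraps the error in its own exception type, you would catch that type instead.

```python
import time


def call_with_retry(operation, attempts=3, delay=1.0, sleep=time.sleep):
    """Call operation(); if the connection is reset, wait and try again.

    `operation` is a hypothetical zero-argument callable that is assumed
    to open a new connection on every call (for example, one API request).
    """
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (ConnectionResetError, ConnectionAbortedError):
            if attempt == attempts:
                raise  # out of attempts: let the caller see the error
            sleep(delay * attempt)  # wait a little longer after each failure
```

This is only safe when repeating the operation has no unwanted side effects, such as a read-only request.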

David Schwartz
  • 2
    I have the same problem, using the bottle library and sending with httplib. I can't really retry, because the original call was already executed on the server. The connection was closed when I tried to read the response data. This doesn't happen all the time, usually only when I spam the server with requests. Do you know of any parameters I can tweak to make the communication stable? – Roman Hwang Nov 12 '13 at 14:47
  • 1
    @RomanHwang You either need a way to check on the previous operation without repeating it or you need to make your operations [idempotent](http://stackoverflow.com/questions/1077412/what-is-an-idempotent-operation). – David Schwartz Nov 12 '13 at 20:43
  • 2
    Thanks for the hint. I also found out why I get the error so often. It's because of how bottle's default development server is implemented: it is single-threaded and not suited to handling many requests at a time. – Roman Hwang Nov 21 '13 at 15:23
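
One common way to make an operation idempotent, as suggested in the comments above, is to tag each request with a client-generated ID and have the server remember the result for that ID. The sketch below is only an illustration: `IdempotentHandler`, `handle`, and the in-memory dictionary are hypothetical, and a real server would need persistent storage and expiry of old IDs.

```python
import uuid


class IdempotentHandler:
    """Remember results by request ID so a retried request is not re-executed."""

    def __init__(self, action):
        self._action = action  # the real, side-effecting operation
        self._results = {}     # request_id -> result already produced

    def handle(self, request_id, payload):
        # Run the action only the first time this request ID is seen.
        if request_id not in self._results:
            self._results[request_id] = self._action(payload)
        return self._results[request_id]


# Client side: generate the ID once and reuse it for every retry of the
# same logical request.
request_id = str(uuid.uuid4())
```

With this in place, a client that lost the connection while reading the response can resend the same request ID and get the stored result without the action running twice.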
13

There are many possible causes, such as:

  • The network link between the server and the client may be going down temporarily.
  • The system may be running out of resources.
  • The client may be sending malformed data.

To examine the problem in detail, you can use Wireshark.

Alternatively, you can simply retry the request or reconnect.

12

I know this is a very old question but it may be that you need to set the request headers. This solved it for me.

For example, 'user-agent', 'accept', etc. Here is an example with user-agent:

    import requests

    # Send a browser-like User-Agent header with the request.
    url = 'your-url-here'
    headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.75 Safari/537.36'}
    r = requests.get(url, headers=headers)

PythonIsBae
  • 2
    Can you add just some details? – Don Oct 12 '20 at 10:10
  • 2
    **Extra details:** Imagine writing a crawler to poll twitter, and since the crawler isn't a browser it won't have the user-agent by default. So the website is saying please trick us into thinking you're using a real browser with established user-agent settings, like Mozilla, AppleWebKit, Chrome, etc browser. – Jeremy Thompson Feb 01 '21 at 06:34
2

For me, this problem arose while trying to connect to an SAP HANA database. When I got this error,

OperationalError: Lost connection to HANA server (ConnectionResetError(10054, 'An existing connection was forcibly closed by the remote host', None, 10054, None))

I reran the connection code that had produced the error (shown below), and it worked.


    import pyhdb
    connection = pyhdb.connect(host="example.com",port=30015,user="user",password="secret")
    cursor = connection.cursor()
    cursor.execute("SELECT 'Hello Python World' FROM DUMMY")
    cursor.fetchone()
    connection.close()

It was because the server refused the connection. You might need to wait for a while and try again. Try closing HANA Studio by logging off and then logging in again, and keep rerunning the code a few times.

Sreeja
  • A separate question, please. Any chance you know where Windows10 stores connection strings? I thought it was in C:\Users\User-Name\AppData\Roaming\Microsoft\MicrosoftSQL_Server\\110\Tools\Shell\RegServer.xml (This is for SQL Server, of course) – MSIS Dec 03 '20 at 02:11
1

I got the same error ([WinError 10054] An existing connection was forcibly closed by the remote host) with websocket-client after setting ping_interval = 2 in websocket.run_forever(). (I had multiple threads connecting to the same host.)

Setting ping_interval = 10 and ping_timeout = 9 solved the issue. You may need to reduce the number of requests so that you stop keeping the host busy; otherwise it will forcibly disconnect you.
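
As a rough sketch of reducing the request rate in your own code, the hypothetical `Throttle` class below enforces a minimum interval between calls. The class name, its parameters, and the 10-second value in the usage note are assumptions for illustration. It is not thread-safe, so with multiple threads you would need a lock or one throttle per thread.

```python
import time


class Throttle:
    """Enforce a minimum interval (in seconds) between calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested
        self._sleep = sleep
        self._last = None    # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                # Too soon after the previous call: sleep off the difference.
                self._sleep(remaining)
                now = self._clock()
        self._last = now
```

For example, creating `throttle = Throttle(10)` and calling `throttle.wait()` before each request keeps requests at least 10 seconds apart.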