I know there are quite a few SE questions on this topic, and I believe I read all the relevant ones before arriving at this point.
By "server-side TIME_WAIT" I mean the state of a server-side socket pair that had its close() initiated on the server side.
I often see these statements that sound contradictory to me:
- Server-side TIME_WAIT is harmless
- You should design your network apps to have clients initiate close(), therefore having the client bear the TIME_WAIT
The reason I find these contradictory is that TIME_WAIT on the client can be a problem -- the client can run out of available ports -- so in essence the above recommends moving the burden of TIME_WAIT to the client side, where it can be a problem, from the server side, where it is not a problem.
Client-side TIME_WAIT is of course only a problem for a limited number of use cases. Most client-server solutions involve one server and many clients; clients usually don't deal with a high enough volume of connections for it to be a problem; and even if they do, there are a number of recommendations for combating client-side TIME_WAIT "sanely" (as opposed to SO_LINGER with a 0 timeout -- see the sketch after this list -- or meddling with the tcp_tw sysctls), namely by avoiding creating too many connections too quickly. But that's not always feasible, for example for classes of applications like:
- monitoring systems
- load generators
- proxies
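For reference, the "non-sane" option mentioned above -- SO_LINGER with a 0 timeout -- looks roughly like this (a minimal sketch of my own, not from any linked material). With linger enabled and a zero timeout, close() sends an RST instead of a FIN, so the closing side skips TIME_WAIT entirely, at the cost of discarding unsent data and giving up the very protection TIME_WAIT provides:

```c
#include <sys/socket.h>
#include <unistd.h>

/* Abortive close: with l_onoff=1 and l_linger=0, close() sends an RST
 * instead of a FIN, so this end never enters TIME_WAIT. Any unsent data
 * is discarded -- which is why this is not a "sane" fix. */
static int close_with_rst(int fd)
{
    struct linger lg = { .l_onoff = 1, .l_linger = 0 };
    if (setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof lg) < 0)
        return -1;
    return close(fd);
}
```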
On the other side, I don't even understand how server-side TIME_WAIT is helpful at all. The reason TIME_WAIT exists is that it prevents injecting stale TCP segments into streams they no longer belong to. For client-side TIME_WAIT, this is accomplished by simply making it impossible to create a connection with the same ip:port pair that the stale connection could have had (the used pairs are locked out by TIME_WAIT). But for the server side this can't be prevented, since the local address will have the accepting port and will always be the same, and the server can't (AFAIK, I only have the empirical proof) deny a connection simply because the incoming peer would create the same address pair that already exists in the socket table.
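To make the tuple argument concrete, here is a self-contained sketch (my own illustration, separate from the test program below) that prints the identifying 4-tuple of a loopback connection from both ends. The server's local half is always ip:listening-port, so only the remote half can distinguish connections:

```c
/* Prints the 4-tuple from both ends of a loopback connection.
 * Error handling omitted for brevity. */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

static void print_tuple(const char *tag, int fd)
{
    struct sockaddr_in loc, rem;
    socklen_t len = sizeof loc;
    getsockname(fd, (struct sockaddr *)&loc, &len);
    len = sizeof rem;
    getpeername(fd, (struct sockaddr *)&rem, &len);
    /* inet_ntoa uses a static buffer, so print the two halves separately */
    printf("%s: local %s:%u", tag, inet_ntoa(loc.sin_addr), ntohs(loc.sin_port));
    printf(" <-> remote %s:%u\n", inet_ntoa(rem.sin_addr), ntohs(rem.sin_port));
}

int main(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    int lfd = socket(AF_INET, SOCK_STREAM, 0);
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    bind(lfd, (struct sockaddr *)&addr, sizeof addr); /* port 0: kernel picks */
    listen(lfd, 1);
    getsockname(lfd, (struct sockaddr *)&addr, &len); /* learn the chosen port */

    int cfd = socket(AF_INET, SOCK_STREAM, 0);
    connect(cfd, (struct sockaddr *)&addr, sizeof addr); /* loopback: completes at once */

    int sfd = accept(lfd, NULL, NULL);
    print_tuple("server", sfd); /* local side is always ip:listening_port */
    print_tuple("client", cfd); /* local side is the ephemeral source port */

    close(sfd); close(cfd); close(lfd);
    return 0;
}
```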
I did write a program that shows that server-side TIME_WAIT is ignored. Moreover, because the test was done on 127.0.0.1, the kernel must have a special bit that even tells it whether a socket is a server-side or a client-side one (since otherwise the tuple would be the same).
Source: http://pastebin.com/5PWjkjEf, tested on Fedora 22, default net config.
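In case the pastebin goes away, this is roughly what the test does (a condensed sketch, not the original source; the client's fixed source port of port+1 and its use of SO_REUSEADDR are inferred from the ss output and the connect error below):

```c
#include <netinet/in.h>
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

static int srv_port;
static int server_closes; /* 's' => server initiates close() */

static void *server(void *arg)
{
    int lfd = socket(AF_INET, SOCK_STREAM, 0), one = 1;
    setsockopt(lfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
    struct sockaddr_in a = { .sin_family = AF_INET, .sin_port = htons(srv_port),
                             .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    bind(lfd, (struct sockaddr *)&a, sizeof a);
    listen(lfd, 1);
    for (;;) {
        int fd = accept(lfd, NULL, NULL);
        if (server_closes) {
            close(fd);           /* server closes first => server holds TIME_WAIT */
        } else {
            char c;
            read(fd, &c, 1);     /* wait for the client's FIN */
            close(fd);           /* close second: no server-side TIME_WAIT */
        }
    }
    return NULL;
}

int main(int argc, char **argv)
{
    if (argc != 3) { fprintf(stderr, "usage: %s port s|c\n", argv[0]); return 1; }
    srv_port = atoi(argv[1]);
    server_closes = (argv[2][0] == 's');
    printf("Will initiate %s close\n", server_closes ? "server" : "client");

    pthread_t t;
    pthread_create(&t, NULL, server, NULL);
    usleep(100000); /* crude: give the listener time to start */

    struct sockaddr_in src = { .sin_family = AF_INET, .sin_port = htons(srv_port + 1),
                               .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    struct sockaddr_in dst = { .sin_family = AF_INET, .sin_port = htons(srv_port),
                               .sin_addr.s_addr = htonl(INADDR_LOOPBACK) };
    for (int i = 0; i < 20; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0), one = 1;
        /* SO_REUSEADDR lets bind() succeed even with a local TIME_WAIT entry;
         * connect() then still refuses to reuse a 4-tuple that is in
         * TIME_WAIT locally, failing with EADDRNOTAVAIL. */
        setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
        if (bind(fd, (struct sockaddr *)&src, sizeof src) < 0) { perror("bind"); return 1; }
        printf("connecting...\n");
        if (connect(fd, (struct sockaddr *)&dst, sizeof dst) < 0) { perror("connect"); return 1; }
        if (server_closes) {
            char c;
            read(fd, &c, 1);     /* wait for the server's FIN */
            close(fd);           /* close second: no client-side TIME_WAIT */
        } else {
            close(fd);           /* client closes first => client holds TIME_WAIT */
        }
    }
    return 0;
}
```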
```
$ gcc -o rtest rtest.c -lpthread
$ ./rtest 44400 s # will do server-side close
Will initiate server close
... iterates ~20 times successfully
^C
$ ss -a|grep 44400
tcp    TIME-WAIT  0  0  127.0.0.1:44400  127.0.0.1:44401
$ ./rtest 44500 c # will do client-side close
Will initiate client close
... runs once and then
connecting...
connect: Cannot assign requested address
```
So for server-side TIME_WAIT, connections on the exact same port pair could be re-established immediately and successfully, while for client-side TIME_WAIT, connect() rightly failed on the second iteration.
To summarize, the question is twofold:
- Does server-side TIME_WAIT really not do anything, and is it just left that way because the RFC requires it?
- Is the recommendation for the client to initiate close() made because server-side TIME_WAIT is useless?