I have a host that is a VM on ESXi.
On this host I run Docker and start several containers.
I want these containers to be reachable from the outside network, so, following the advice here, I added new IPs to the host's interface (let's call it ens160) and I start each container with -p ${IP}:8888:8888, where ${IP} is one of the IPs I added to ens160 (using ip addr add...).
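To make the setup concrete, here is a minimal sketch of what I do per container. It is a dry run that only prints the commands. The /24 prefix length and the image name my-image are placeholders, not my real values, and publish_cmds is just an illustrative helper.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# IFACE matches the interface name used in this question;
# PREFIX and IMAGE are placeholders (assumptions).
IFACE="ens160"
PREFIX="24"
IMAGE="my-image"

publish_cmds() {
    ip_addr="$1"
    # Add the extra address to the host interface.
    echo "ip addr add ${ip_addr}/${PREFIX} dev ${IFACE}"
    # Publish container port 8888 on that address only.
    echo "docker run -d -p ${ip_addr}:8888:8888 ${IMAGE}"
}

publish_cmds 192.168.201.58
```

Each container gets its own address but they all share the default Docker bridge network.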
This way someone from outside the host can send a request to a container inside the host, towards port 8888.
I have another server on the same subnet as the host, which sends requests to the containers.
When there is more than one container, there are very frequent connectivity issues between the server and the containers. I see timeouts on the server: it considers the containers down (one at a time). I'm also running tcpdump on the host and see the following:
Every couple of requests from the server towards one of the containers, there is an ICMP message of type 3 (Destination unreachable), code 1 (Host unreachable).
192.168.201.58 is the container. 192.168.201.195 is the server that is making requests to the containers.
According to here:
The ICMP destination unreachable message is generated by a router to inform the source host that the destination unicast address is unreachable.
So I understand that something is sending this message to the server.
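To narrow down who sends it, a capture limited to these ICMP messages could look like the sketch below. The helper name icmp_unreach_filter is only illustrative. The script prints the tcpdump command rather than running it, since the capture needs root on the host. The -e flag prints the link-level header, so the source MAC of each ICMP message is visible.

```shell
#!/bin/sh
# Builds a pcap filter for ICMP type 3 (destination unreachable),
# code 1 (host unreachable), restricted to traffic involving one host.
icmp_unreach_filter() {
    printf 'icmp[icmptype] == icmp-unreach and icmp[icmpcode] == 1 and host %s' "$1"
}

# 192.168.201.195 is the server from the capture described above.
FILTER="$(icmp_unreach_filter 192.168.201.195)"

# -n: no name resolution, -e: print link-level (MAC) headers.
echo "tcpdump -n -e -i ens160 '${FILTER}'"
```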
Another interesting article related to the issue is "A reason for unexplained connection timeouts on Kubernetes/Docker", whose bottom line is that iptables SNAT needs --random-fully when generating NAT source ports. But I don't think that's the issue in my setup, because I have a different source IP for each container, so there's no need for SNAT.
I've tried tuning Linux parameters on the host, such as increasing max connections, etc., according to this question, but none of it seems to help.
I use:
$ docker --version
Docker version 20.10.7, build 20.10.7-0ubuntu1~20.04.1
My questions are: why are the containers sometimes unreachable?
There's almost no traffic (viewed with iftop).
Where can the limitation be?
How can it be solved?
--- Edit ---
I narrowed it down a bit.
It looks like if I start the containers on different Docker networks the issue does not happen, so I believe it is related to the Linux bridges Docker creates when a new Docker network is created.
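For reference, the variant that does not show the problem looks roughly like this: one user-defined network per container. Again a dry-run sketch that only prints the commands. The network names net-1, net-2, the image name, and the second IP are placeholders, and per_network_cmds is just an illustrative helper.

```shell
#!/bin/sh
# Dry-run sketch: prints one "docker network create" and one
# "docker run" per IP, so each container sits on its own bridge.
IMAGE="my-image"   # placeholder image name (assumption)

per_network_cmds() {
    i=1
    for ip_addr in "$@"; do
        net="net-${i}"   # placeholder network name
        echo "docker network create ${net}"
        echo "docker run -d --network ${net} -p ${ip_addr}:8888:8888 ${IMAGE}"
        i=$((i + 1))
    done
}

# 192.168.201.59 is a made-up second address for illustration.
per_network_cmds 192.168.201.58 192.168.201.59
```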