
I am trying to have GitLab CI deploy a Docker image to my server, but it worked only once¹ and now fails with the error message below. Here is the relevant .gitlab-ci.yml snippet:

deploy-image:
  image: docker:latest
  services:
    - docker:dind
  variables:
    DOCKER_HOST: ssh://gitlab-ci@$DEPLOY_HOST
  before_script:
    - eval $(ssh-agent -s)
    - echo "${SSH_KEY}" | ssh-add -
    - mkdir -p ~/.ssh
    - '[[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config'
  script:
    - echo $CI_JOB_TOKEN | docker login --username gitlab-ci-token --password-stdin $CI_REGISTRY;
    - docker pull "$CI_REGISTRY_IMAGE"
    - docker run --name mycontainer --detach $CI_REGISTRY_IMAGE
  after_script:
    - docker logout $CI_REGISTRY

The error in the job log looks like this:

Executing "step_script" stage of the job script 00:04
Using docker image sha256:51453dcdd9bd51e503f75e6d42a4071469ad2ba816321781985041b6bc7776db for docker:latest with digest docker@sha256:ddf0d732dcbc3e2087836e06e50cc97e21bfb002a49c7d0fe767f6c31e01d65f ...
$ eval $(ssh-agent -s)
Agent pid 18
$ echo "${SSH_KEY}" | ssh-add -
Identity added: (stdin) (gitlab-ci@jacks)
$ mkdir -p ~/.ssh
$ [[ -f /.dockerenv ]] && echo -e "Host *\n\tStrictHostKeyChecking no\n\n" > ~/.ssh/config
$ echo $CI_JOB_TOKEN | docker login --username gitlab-ci-token --password-stdin $CI_REGISTRY;
error during connect: Post "http://docker.example.com/v1.24/auth": command [ssh -l gitlab-ci -- xxx.xxx.xxx.xxx docker system dial-stdio] has exited with exit status 255, please make sure the URL is valid, and Docker 18.09 or later is installed on the remote host: stderr=ssh: connect to host xxx.xxx.xxx.xxx port 22: Connection refused

I x-ed out the IP address of the destination server in this post (it is the value of the DEPLOY_HOST env var), but the error message literally says docker.example.com.

So my question is: where does that example string come from?

This is on a self-hosted GitLab instance, and the runner is also self-hosted, on a server that is neither the destination host nor the one GitLab runs on.

¹ The connection issue itself seems to be a problem with the SSH connection and not related to the CI/CD pipeline config.

mcnesium
  • Post "http://docker.example.com/v1.24/auth" is likely coming from docker login --username gitlab-ci-token --password-stdin $CI_REGISTRY. What's the value of $CI_REGISTRY? – Sathyajith Bhat Sep 12 '21 at 20:13
  • The value of $CI_REGISTRY is the container registry. It is a predefined variable that is specific to every project and set automatically by GitLab CI, see https://docs.gitlab.com/ee/ci/variables/predefined_variables.html (sorry for answering late) – mcnesium Nov 24 '21 at 16:34

1 Answer


It's a placeholder domain name for connecting over SSH. It doesn't mean that it's connecting to docker.example.com.

I ran into a similar error when using a Docker client over SSH. The error message was:

ERROR: Cannot connect to the Docker daemon at http://docker.example.com. Is the docker daemon running?

Reading the Docker source code, it looks like "docker.example.com" is a placeholder host name when connecting over SSH, resulting in the confusing error message.
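
To make the point concrete: the host that is actually contacted appears in the ssh command inside the error message, not in the URL. Below is a rough sketch of how an ssh:// DOCKER_HOST value maps to that command. The function name docker_ssh_command and the address 203.0.113.10 are made up for illustration. The argument order is copied from the error message in the question; the -p handling for a port is my assumption, and paths or IPv6 literals in the URL are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of Docker): print the ssh command that
# corresponds to an ssh:// DOCKER_HOST value, following the shape seen in
# the error message: ssh -l USER -- HOST docker system dial-stdio
docker_ssh_command() {
  url=$1
  case "$url" in
    ssh://*) rest=${url#ssh://} ;;
    *) echo "not an ssh:// URL: $url" >&2; return 1 ;;
  esac

  user=""
  port=""
  # Split off an optional "user@" prefix.
  case "$rest" in
    *@*) user=${rest%%@*}; rest=${rest#*@} ;;
  esac
  # Split off an optional ":port" suffix (assumption: passed to ssh as -p).
  case "$rest" in
    *:*) port=${rest##*:}; rest=${rest%:*} ;;
  esac

  cmd="ssh"
  if [ -n "$user" ]; then cmd="$cmd -l $user"; fi
  if [ -n "$port" ]; then cmd="$cmd -p $port"; fi
  echo "$cmd -- $rest docker system dial-stdio"
}

docker_ssh_command "ssh://gitlab-ci@203.0.113.10"
# prints: ssh -l gitlab-ci -- 203.0.113.10 docker system dial-stdio
```

Note that docker.example.com never appears in the generated command. When debugging, the stderr part of the error message (here "Connection refused" on port 22) describes the real failure.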

In my case, the error was caused by the user I was logging in as not being part of the docker group, and therefore not having permission to use docker.
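
If you suspect the same cause, one way to check is to look at the remote user's groups. This is only a sketch: in_docker_group is a hypothetical helper, the user name gitlab-ci and the DEPLOY_HOST variable are taken from the question, and group membership is just the most common way to get access to the Docker socket, not the only one.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) if the word "docker" appears
# in a space-separated list of group names, such as the output of `id -nG`.
in_docker_group() {
  for group in $1; do
    if [ "$group" = "docker" ]; then
      return 0
    fi
  done
  return 1
}

# Usage against the deploy host (needs a working SSH connection, so it is
# left commented out here):
#   remote_groups=$(ssh -l gitlab-ci -- "$DEPLOY_HOST" id -nG)
#   in_docker_group "$remote_groups" || echo "gitlab-ci is not in the docker group"
```

Running ssh with a plain command like id -nG also confirms that SSH itself works, independently of Docker.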

Nick ODell
  • Thank you for investigating this. So do you think this behavior is unexpected and therefore worth reporting as a bug? I think it is at least misleading when the actual problem is something else. – mcnesium Dec 19 '21 at 11:56
  • @mcnesium Yes, I think that's confusing enough to count as a bug. Also, here's another person who ran into the same problem: https://forums.docker.com/t/docker-compose-through-ssh-failing-and-referring-to-docker-example-com/115165 – Nick ODell Dec 19 '21 at 16:33
  • When I saw this, I was copy-pasting an example into my terminal, so I thought I must have forgotten to change some placeholder. Spent a bunch of time looking for where I forgot to replace 'docker.example.com.' – Nick ODell Dec 19 '21 at 16:59
  • This can occur when you have a multi-platform buildx setup that includes a remote machine. I do it to do x86+arm64 builds from an M1 Mac, with a remote x86 box doing the cross-platform build while my M1 does the arm64 build in parallel, each doing their thing without emulation. If I don't have Docker up on the remote host, I get this error. – Warren Young Jul 14 '23 at 06:15