
I am building a Singularity container on a server that doesn't have GPUs. I want to run this container on another server that does have GPUs, so I am installing the CUDA development libraries in the container.

However, this gives me an error along the lines of:

*****************************************************************************
*** Reboot your computer and verify that the NVIDIA graphics driver can   ***
*** be loaded.                                                            ***
*****************************************************************************

I am wondering if there's a way to get around this.

Thank you for the help!

  • Is the error message a real problem? I've got a similar setup (non-container but still a build server without GPU), and it works for me – MSalters Mar 03 '22 at 08:47
  • Depending on what you are installing, the answer is either yes, or no. You can install the CUDA toolkit (which includes various CUDA libraries) but you cannot (properly) install the GPU driver. It looks to me like you are using an improper method for installing "the libraries" but since you haven't shown the method, that's just a guess. If you are actually installing the driver in the container (which also includes certain CUDA libraries) that is a really bad idea. There is not enough info in your question to provide a sensible answer. – Robert Crovella Mar 03 '22 at 14:33
  • Which "libraries" do you want to install? – einpoklum Mar 03 '22 at 16:57
  • It may not fit every use-case, but the best practice seems to be to use NVIDIA's own [container images](https://registry.hub.docker.com/r/nvidia/cuda#!) as a base and build everything else on top. I had no problems converting them to Singularity and using them. – paleonix Mar 03 '22 at 17:48
  • I get the error while trying to install `cuda-10-1`. Other libraries I will be installing are `libcudnn7` and `libcudnn7-dev`. @einpoklum – Keshav Agrawal Mar 03 '22 at 22:07
  • @RobertCrovella - In fact, it looks like the nvidia-driver-460 is installed. It is stuck on the next step which is to install the cuda libraries. As for the installation method, the installation is done during the building of the container (from a container definition file). It uses `apt-get` to install these. I wonder why installing the driver in the container is a bad idea. – Keshav Agrawal Mar 03 '22 at 22:12
  • So, in that case, I believe your question has the same answer as [this question](https://stackoverflow.com/q/27306724/1593077). – einpoklum Mar 03 '22 at 22:13
  • I see what you are saying. I believe it should work! Thank you. – Keshav Agrawal Mar 03 '22 at 22:18
  • You shouldn't be attempting to install `nvidia-driver-460`. Your method of installing `cuda-10-1` (whatever it may be) is not correct/appropriate for container usage. I would recommend learning what the implications are for using the nvidia container toolkit. – Robert Crovella Mar 03 '22 at 22:18
  • okay @RobertCrovella. Yes, I am quite new to this - I will look into this. – Keshav Agrawal Mar 03 '22 at 22:22
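
The suggestions in the comments (start from an NVIDIA CUDA image, and keep the driver out of the container) could look roughly like the definition file below. This is only a sketch: the image tag is an assumption and may no longer be published, so check the registry for a tag that matches the CUDA and cuDNN versions you need. The `build-essential` package is just a placeholder for whatever else the container requires.

```
Bootstrap: docker
From: nvidia/cuda:10.1-cudnn7-devel-ubuntu18.04

%post
    # Assumption: the base image tag above already ships the CUDA 10.1
    # toolkit plus the cuDNN 7 runtime and development packages.
    # Deliberately no nvidia-driver-* and no cuda-10-1 here: the driver
    # stays on the GPU host and is not installed inside the container.
    apt-get update
    apt-get install -y --no-install-recommends build-essential
    rm -rf /var/lib/apt/lists/*
```

With the file saved under a hypothetical name such as `cuda.def`, it can be built on the GPU-less server with `singularity build cuda.sif cuda.def` (this may need root or `--fakeroot`, depending on the setup). On the GPU server, running it with `singularity exec --nv cuda.sif nvidia-smi` makes the host's driver libraries available inside the container.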
