Hello Neural Explorers!
Coding with LLMs is quite easy. There are plenty of libraries available that simplify AI usage, and basic Python knowledge will suffice.
However, installing dependencies can be challenging, as it requires coordinating several library packages. The first time I set up my environment, it took me an entire weekend to get everything working correctly. NVIDIA resources can be a bit messy to navigate, especially when trying to figure out which driver versions will work with your specific CUDA version as well as with PyTorch… But since then, I’ve discovered a trick that makes the whole process much easier. With Debian 12, all you have to do is add the non-free repositories and install nvidia-smi.
So, let’s begin!
If you don’t have a Debian 12 installation, install it. In the apt configuration, add non-free, contrib, and non-free-firmware after main. Depending on your specific installation, the file is located at either /etc/apt/sources.list or /etc/apt/sources.list.d/debian.sources.
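For reference, here is roughly what the relevant part looks like in each format (the mirror URL and release name may differ on your installation; keep whatever is already there and just extend the list of components):

```
# Classic one-line format (/etc/apt/sources.list):
deb http://deb.debian.org/debian bookworm main contrib non-free non-free-firmware

# Deb822 format (/etc/apt/sources.list.d/debian.sources), edit the Components field:
Components: main contrib non-free non-free-firmware
```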
Update your package lists and make sure your system is up to date, then install the kernel headers (the NVIDIA driver needs them to build its module, even though no package dependency pulls them in, and they may not be present on your system) and nvidia-smi.
sudo apt update
sudo apt upgrade -y
sudo apt install -y linux-headers-$(uname -r)
sudo apt install -y nvidia-smi
nvidia-smi
Normally, you should have something like this:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17    Driver Version: 525.105.17    CUDA Version: 12.0   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000    On   | 00000000:00:03.0 Off |                  Off |
| 30%   40C    P5    75W / 230W |      1MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
Interdependencies between Python packages can be complex when working with AI systems, which often requires multiple installation environments for different tasks. To simplify this, virtual environments (venv) or Docker containers are the recommended options. Here I chose Docker because venv is rarely used in the corporate world, and Docker is a more useful skill to cultivate.
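For completeness, the venv route would look roughly like this (the environment name llm-env is arbitrary):

```shell
# Create an isolated Python environment (the name llm-env is arbitrary)
python3 -m venv llm-env
# Activate it; subsequent pip installs stay inside this directory
. llm-env/bin/activate
# Confirm the environment's own interpreter is now first on the PATH
which python
```

Each project then gets its own directory of packages, at the cost of managing one environment per task.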
for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo docker run hello-world
Normally, you will see a message that begins with:
Hello from Docker!
Enable the Docker service and add your non-administrator user to the Docker group.
sudo systemctl enable docker.service
sudo systemctl enable containerd.service
sudo usermod -aG docker $USER
Close and reopen your session, then try Docker again as a normal user. You can also remove the hello-world image.
docker run hello-world
docker rmi hello-world -f
Install Docker Compose. If you want a later version, you can find it on the Docker Compose releases page on GitHub and replace the version number in the URL.
sudo curl -L "https://github.com/docker/compose/releases/download/v2.23.3/docker-compose-linux-x86_64" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose
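As a small convenience, you can keep the version in a single variable so that upgrading only means changing one line (the version shown is the one used above; newer tags exist on the releases page):

```shell
# Pin the Compose version in one place; change this line to upgrade
COMPOSE_VERSION="v2.23.3"
# Build the download URL from the pinned version
COMPOSE_URL="https://github.com/docker/compose/releases/download/${COMPOSE_VERSION}/docker-compose-linux-x86_64"
# Print the URL that would be passed to curl
echo "$COMPOSE_URL"
```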
Install the NVIDIA Container Toolkit. There’s a trick to doing this on Debian 12: you need to use the Debian 11 repository (in practice, the NVIDIA Container Toolkit works the same across all versions from Debian 10 onward).
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/debian11/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
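The nvidia-ctk command above edits /etc/docker/daemon.json for you; after it runs, the file should contain an entry along these lines (the exact contents may vary slightly between toolkit versions):

```
{
    "runtimes": {
        "nvidia": {
            "path": "nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
```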
And now, you can try running nvidia-smi within a container.
docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi
Finally, you can delete that temporary image.
docker rmi ubuntu -f
Now you have a Linux installation with NVIDIA drivers, CUDA, and Docker, all working together. In the next lesson, you’ll learn how to set up your Docker development environment.