Lesson 1: Install your environment

Hello Neural Explorers!

Coding with LLMs is quite easy. Plenty of libraries simplify AI usage, and basic Python knowledge will suffice.

However, installing dependencies can be challenging, because the versions of several components have to line up. The first time I attempted to install my environment, it took me an entire weekend to get everything working correctly. NVIDIA resources can be a bit messy to navigate, especially when trying to figure out which driver versions will work with your specific CUDA version as well as with PyTorch… But since then, I’ve discovered a trick that makes the whole process much easier: with Debian 12, all you have to do is add the non-free repositories and then install nvidia-smi.

So, let’s begin!

If you don’t have a Debian 12 installation, install one. In your apt configuration, add non-free, contrib, and non-free-firmware after main. Depending on your specific installation, the configuration might be located at either /etc/apt/sources.list or /etc/apt/sources.list.d/debian.sources.
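For reference, here is what the result could look like in both formats on a standard bookworm install — the mirror URLs and suite names may differ on your system, so adapt rather than copy blindly:

```
# /etc/apt/sources.list (one-line format) -- mirror URL may differ:
deb http://deb.debian.org/debian/ bookworm main contrib non-free non-free-firmware

# /etc/apt/sources.list.d/debian.sources (deb822 format) -- edit the Components: line:
Types: deb
URIs: http://deb.debian.org/debian/
Suites: bookworm bookworm-updates
Components: main contrib non-free non-free-firmware
```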

Update your package lists and make sure your system is up to date, then install linux-headers (the NVIDIA driver needs it to build its kernel module, but no package dependency declares this, so it may not yet be present on your system) and nvidia-smi.

sudo apt update
sudo apt upgrade -y
sudo apt install -y linux-headers-$(uname -r)
sudo apt install -y nvidia-smi

Normally, you should have something like this:

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTX A5000    On   | 00000000:00:03.0 Off |                  Off |
| 30%   40C    P5    75W / 230W |      1MiB / 24564MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Interdependencies between Python packages can get complex in AI work, and you will often need several isolated environments for different tasks. To simplify this, the two usual options are virtual environments (venv) and Docker containers. Here I choose Docker, because venv is not widely used in the corporate world and Docker is a more useful skill to cultivate.

for pkg in docker.io docker-doc docker-compose podman-docker containerd runc; do sudo apt-get remove $pkg; done
sudo apt install -y ca-certificates curl gnupg
sudo install -m 0755 -d /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/debian/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
sudo chmod a+r /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/debian \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
sudo docker run hello-world

Normally, you would see this message:

Hello from Docker!

Enable the Docker service and add your non-administrator user to the Docker group.

sudo systemctl enable docker.service
sudo systemctl enable containerd.service
sudo usermod -aG docker $USER

Close and reopen your session, then try Docker again as a normal user. You can also remove the hello-world image.

docker run hello-world
docker rmi hello-world -f
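If the docker command still fails with a permission error, a quick way to check whether your new session has picked up the group change is the following sketch (it only inspects the current session; the usermod call above takes effect for sessions started afterwards):

```shell
# Check whether the current session already has docker group membership;
# group changes only apply to sessions started after the usermod call.
if id -nG | grep -qw docker; then
    echo "docker group: OK"
else
    echo "not in docker group yet: log out and back in, then retry"
fi
```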

Install Docker Compose. If you want a later version, check the releases page on GitHub and replace the version number in the URL.

sudo curl -L "https://github.com/docker/compose/releases/download/v2.23.3/docker-compose-linux-x86_64" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

Install NVIDIA Container Toolkit. There’s a trick to doing this on Debian 12: you need to use the Debian 11 repository (in practice, NVIDIA Container Toolkit works the same across all versions from Debian 10 onward).

curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/debian11/libnvidia-container.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt update
sudo apt install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
sudo systemctl restart docker
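After nvidia-ctk runtime configure runs, /etc/docker/daemon.json should contain a runtime entry similar to this (the exact layout can vary between toolkit versions):

```json
{
    "runtimes": {
        "nvidia": {
            "args": [],
            "path": "nvidia-container-runtime"
        }
    }
}
```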

And now, you can try running nvidia-smi within a container.

docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi

Finally, you can delete that temporary image.

docker rmi ubuntu -f

Now you have a Linux installation with NVIDIA drivers, CUDA, and Docker, all working together. In the next lesson, you’ll learn how to set up your Docker development environment.