Managing hardware-related aspects of Kubernetes clusters is hard. Accessing host hardware often requires additional kernel modules and deep integration with, or even modification of, container runtimes. Meanwhile, nodes come and go, and provisioning tools like coreos/ignition or canonical/cloud-init don’t cover the whole node lifecycle. It becomes even harder on specialised container operating systems like Fedora CoreOS, RHEL CoreOS, and Flatcar because of their largely immutable or ephemeral filesystems. Oftentimes, using an additional IT automation stack like Ansible or Chef seems necessary.

Fortunately, use cases that require direct hardware access, like dedicated networking, are rare. There is just one notable exception to this rule: GPUs. While GPU integration in Kubernetes has become much better over the years, managing drivers and runtimes can still be challenging. Therefore, I was excited when I discovered the NVIDIA/gpu-operator. The gpu-operator is a Kubernetes-native operator that automates almost all work related to managing nodes with GPUs: it installs drivers, runtimes, device plugins, and metric exporters, labels node objects, and so on.

This might sound almost too good to be true and, of course, there was a major problem for my use cases. At the time of this writing, the operator only supports a very limited number of platforms, namely Ubuntu, RHCOS and CentOS¹, and all our nodes run on Flatcar Linux. But don’t despair! As I discovered, Flatcar support seems to be in development² and there are some flatcar-tagged driver container images in the NVIDIA Container Registry.

Setup

The installation is still not as smooth as on the supported operating systems and there are no regular driver builds on the official CI. But otherwise, it works! Obviously, there is no official documentation on the prerequisites and node preparation steps yet, but other than that you can follow the official Getting Started guide.

Node Preparation

Two things need to be done before the GPU Operator can provision a Flatcar node. First, you need to activate the i2c_core and ipmi_msghandler kernel modules³. If you want the driver container to be able to dynamically build the driver modules after kernel upgrades, you also need to activate the loop module⁴. You can load and permanently enable the modules using the following commands:

sudo modprobe -a loop i2c_core ipmi_msghandler
echo -e "loop\ni2c_core\nipmi_msghandler" | sudo tee /etc/modules-load.d/driver.conf

The second preparation step concerns Flatcar’s filesystem. By default, most system directories are read-only filesystems on Flatcar. Unfortunately, the GPU Operator tries to put its artefacts, like the driver and container runtimes, in such a location. I couldn’t find a solution for changing that behaviour that didn’t involve maintaining some kind of fork of the official operator resources. Therefore, I figured the easiest solution would be to make the destination writable. The best way to do this seems to be to create a writable overlay for the needed directories⁵.

You can do this using a simple systemd mount unit. Note that systemd requires mount units to be named after their mount point, so this one must be installed as usr-local.mount. The overlay’s upper and work directories (/opt/usr-local-overlay and /opt/usr-local-overlay.wd) also have to exist before the unit runs, since overlayfs refuses to mount otherwise.

[Unit]
Description=Writable nvidia driver location
Before=local-fs.target
ConditionPathExists=/opt/usr-local-overlay

[Mount]
Type=overlay
What=overlay
Where=/usr/local
Options=lowerdir=/usr/local,upperdir=/opt/usr-local-overlay,workdir=/opt/usr-local-overlay.wd

[Install]
WantedBy=local-fs.target

I am using Container Linux Configs to activate the kernel modules and install the systemd unit. These get transpiled to Ignition configs and are supplied to the nodes when they are provisioned. If that is not possible in your setup, I suggest running the necessary commands from a privileged Kubernetes DaemonSet.
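The preparation steps above can be combined into a single Container Linux Config. The snippet below is a sketch to merge into your own provisioning config; the paths match the earlier commands:

```yaml
# Sketch of a Container Linux Config covering the node preparation steps.
# The overlay upper/work directories are created here because overlayfs
# requires both to exist before the mount unit can run.
storage:
  directories:
    - path: /opt/usr-local-overlay
      filesystem: root
    - path: /opt/usr-local-overlay.wd
      filesystem: root
  files:
    - path: /etc/modules-load.d/driver.conf
      filesystem: root
      mode: 0644
      contents:
        inline: |
          loop
          i2c_core
          ipmi_msghandler
systemd:
  units:
    # Mount units must be named after their mount point (/usr/local).
    - name: usr-local.mount
      enabled: true
      contents: |
        [Unit]
        Description=Writable nvidia driver location
        Before=local-fs.target
        ConditionPathExists=/opt/usr-local-overlay

        [Mount]
        Type=overlay
        What=overlay
        Where=/usr/local
        Options=lowerdir=/usr/local,upperdir=/opt/usr-local-overlay,workdir=/opt/usr-local-overlay.wd

        [Install]
        WantedBy=local-fs.target
```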

Driver Containers

As mentioned above, there are no official, regularly pre-built driver containers available for Flatcar yet. However, there are some builds in the NVIDIA Container Registry, e.g. nvcr.io/nvidia/driver:460.73.01-5.10.32-flatcar, so one option is to use one of those.

If you want to use a more recent driver version or make use of the precompiled kernel interfaces feature, you need to build the container yourself. Fortunately, this is as straightforward as cloning the repository, choosing a driver version, and issuing the docker build command:

git clone https://gitlab.com/nvidia/container-images/driver.git
cd driver/flatcar
DRIVER_VERSION=470.57.02
docker build --pull \
    --tag your-registry/nvidia-driver:${DRIVER_VERSION} \
    --build-arg DRIVER_VERSION=${DRIVER_VERSION} \
    --file Dockerfile .

The current source uses an outdated URL for getting release information for Flatcar. The URL seems to have changed recently when Kinvolk (the creators of Flatcar) was acquired by Microsoft. Although the old link is still online, it seems to cause problems on more recent versions of Flatcar, so I suggest replacing it with the new one before building the container:

-COREOS_ALL_RELEASES="https://kinvolk.io/flatcar-container-linux/releases-json/releases-${COREOS_RELEASE_CHANNEL}.json"
+COREOS_ALL_RELEASES="https://www.flatcar-linux.org/releases-json/releases-${COREOS_RELEASE_CHANNEL}.json"

Precompiled kernel interfaces

The default behaviour of the driver container is to download the Flatcar development container and build the kernel modules on the fly. This has the advantage that you can use the same driver container across nodes running Flatcar versions with different kernels. The downside is that, depending on your hardware, it can take a couple of minutes for the nodes to become available.

A faster solution is to package precompiled kernel interfaces into the container. The only thing you need to do for that is to run the container with the update command, then commit it and push the additional layers to your registry. You need to do this every time a new OS version is released. The GPU Operator can automatically select the correct tag when pulling the image for a node. If the container detects a kernel mismatch, it will simply fall back to building the module on the node.
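The update-commit-push cycle looks roughly like this. The registry name and Flatcar version are placeholders, and the exact docker flags may differ in your setup:

```shell
# Sketch of baking precompiled kernel interfaces into the driver image.
# "your-registry" and the Flatcar version below are placeholders; the
# -flatcarVERSION tag format is what the GPU Operator resolves per node.
DRIVER_VERSION=470.57.02
FLATCAR_VERSION=2905        # major version of the target OS release
IMAGE=your-registry/nvidia-driver
TAG="${DRIVER_VERSION}-flatcar${FLATCAR_VERSION}"

# 1. Run the container's update command to compile the kernel interfaces
docker run --privileged --name driver-update "${IMAGE}:${DRIVER_VERSION}" update
# 2. Commit the result, capturing the compiled interfaces as a new layer
docker commit driver-update "${IMAGE}:${TAG}"
# 3. Push the version-specific tag so the operator can pull it per node
docker push "${IMAGE}:${TAG}"
docker rm driver-update
```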

You can read more about this feature in the driver container’s README.

Deployment

For the deployment of the GPU Operator, you can follow its official Getting Started guide. Just make sure that you select the right driver container. If you are using Helm, the corresponding values are the following:

driver:
  repository: your-repo
  image: nvidia-driver
  version: "sha256:XXXXX"

You’ll notice that I am using the SHA digest of the container instead of a tag here. This is required if you are not using the precompiled kernel interfaces feature. You can also specify an image tag; in this case, the GPU Operator will append -flatcarVERSION (e.g. -flatcar2905) to the tag, so you need to make sure to provide tagged images for all Flatcar versions you are running. Otherwise, your nodes will fail to pull the driver image.

And that’s all that is necessary to run the GPU Operator on Flatcar nodes :)


  1. Platform support - NVIDIA Cloud Native Technologies ↩︎

  2. Merge Request ↩︎

  3. Overview - NVIDIA Cloud Native Technologies ↩︎

  4. Driver Container README.md ↩︎

  5. Flatcar Container Linux Documentation - Building custom kernel modules ↩︎