
// index //

declarative GPU containers. vast.ai. runpod. bare-metal. zero dockerfile cope.


// start here //


// high-level //

  1. declare containers under perSystem.nix2gpu.<name> (sketch below)
  2. each container config is a nix module (like nixos modules)
  3. nix2gpu assembles:
    • root filesystem with nix store + your packages
    • startup script for runtime environment
    • service graph via Nimi
  4. helper commands:
    • nix build .#<name> — build image
    • nix run .#<name>.copy-to-container-runtime — load into docker/podman
    • nix run .#<name>.copy-to-github — push to ghcr
    • nix run .#<name>.copy-to-runpod — push to runpod
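
a minimal sketch of such a declaration, lifted from the architecture overview later in this book (my-container and my-api are placeholder names):

perSystem.nix2gpu."my-container" = {
  cuda.packages = pkgs.cudaPackages_12_8;
  services."api".process.argv = [ (lib.getExe pkgs.my-api) "--port" "8080" ];
};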

// cloud targets //

platform      status      notes
vast.ai       ✅ stable    nvidia libs at /lib/x86_64-linux-gnu
runpod        ✅ stable    network volumes, template support
lambda labs   ✅ works     standard docker
bare-metal    ✅ works     just run the container
kubernetes    🚧 wip       gpu operator integration

// where to go //

// ComfyUI setup guide //

This guide walks through setting up ComfyUI inside a nix2gpu container and deploying it to vast.ai. Much of it should also be useful if you are deploying other software.

Installing Nix

First of all, you’ll need to install Nix.

There are a couple of easy ways to get it: the official installer from nixos.org, or the Determinate Systems installer.

Creating a flake to develop out of:

With nix installed you can now run:

mkdir my-nix2gpu-project
cd my-nix2gpu-project
git init
nix flake init
git add .
git commit -m "nix flake init"

Adding Inputs

You will now have a new git repository with an empty flake.nix. Edit this to add

nix2gpu.url = "github:weyl-ai/nix2gpu?ref=baileylu/public-api";
systems.url = "github:nix-systems/default";
flake-parts.url = "github:hercules-ci/flake-parts";

Into the inputs section.

No additional inputs are required to use services or home-manager; nix2gpu bundles Nimi and nix2container internally.

Replace the outputs section with this:

outputs =
  inputs@{ flake-parts, self, ... }:
  flake-parts.lib.mkFlake { inherit inputs; } {
    imports = [
      inputs.nix2gpu.flakeModule
    ];

    systems = import inputs.systems;

    # This is where nix2gpu config goes
    # More on this later
    perSystem.nix2gpu = {};
  };

Select a nix2gpu starter config to use

Take a look in the examples folder and pick one which looks useful.

Going forward, we will use the comfyui.nix example.

We can run this in nix2gpu like so (replacing the perSystem.nix2gpu from earlier):

{ lib, ... }:
{
  perSystem = { pkgs, ... }: {
    nix2gpu."comfyui-service" = {
      services.comfyui."comfyui-example" = {
        imports = [ (lib.modules.importApply ../services/comfyui.nix { inherit pkgs; }) ];
        # You'll need to use the nixified-ai overlay for this
        # Check them out - https://github.com/nixified-ai/flake
        models = [ pkgs.nixified-ai.models.stable-diffusion-v1-5 ];
      };

      registries = [ "ghcr.io/weyl-ai" ];

      exposedPorts = {
        "22/tcp" = { };
        "8188/tcp" = { };
        "8188/udp" = { };
      };
    };
  };
}

Getting the ComfyUI instance onto vast

You can now build and copy your service to the GitHub package registry (ghcr.io) with

nix run .#comfyui-service.copyToGithub

Next, go to the vast.ai web UI and create a new template, using the GitHub package we just pushed as the source when prompted.

Now, reserve a new instance on vast using the template and give it a few minutes to start up (wait for “Running…”).

Now, use the web UI to add an SSH key. You can find your public key with this guide. Once you have it, click the key-shaped button on your running vast instance and paste the key (starting with ssh-rsa or ssh-ed25519) into the keys for the instance.

You can now connect with the command and IP address it gives you. Make sure you include -L 8188:localhost:8188 so you can view the ComfyUI web interface in your browser.

Other options

You can also run a nix2gpu image locally if you have docker or podman installed:

nix run .#my-service.copyToDockerDaemon
nix run .#shell

Options Reference

_module.args

Additional arguments passed to each module, in addition to the standard ones like lib, config, pkgs, and modulesPath.

This option is also available to all submodules. Submodules do not inherit args from their parent module, nor do they provide args to their parent module or sibling submodules. The sole exception to this is the argument name which is provided by parent modules to a submodule and contains the attribute name the submodule is bound to, or a unique generated name if it is not bound to an attribute.

Some arguments are already passed by default, of which the following cannot be changed with this option:

  • lib: The nixpkgs library.

  • config: The results of all options after merging the values from all modules together.

  • options: The options declared in all modules.

  • specialArgs: The specialArgs argument passed to evalModules.

  • All attributes of specialArgs

    Whereas option values can generally depend on other option values thanks to laziness, this does not apply to imports, which must be computed statically before anything else.

    For this reason, callers of the module system can provide specialArgs which are available during import resolution.

    For NixOS, specialArgs includes modulesPath, which allows you to import extra modules from the nixpkgs package tree without having to somehow make the module aware of the location of the nixpkgs or NixOS directories.

    { modulesPath, ... }: {
      imports = [
        (modulesPath + "/profiles/minimal.nix")
      ];
    }
    

For NixOS, the default value for this option includes at least this argument:

  • pkgs: The nixpkgs package set according to the nixpkgs.pkgs option.

Type: lazy attribute set of raw value
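
Example (an illustrative sketch; myHelper and ./helper.nix are hypothetical names):

{ ... }:
{
  # make myHelper available as an argument to every other module in this evaluation
  _module.args.myHelper = import ./helper.nix;
}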

Declared by:

age

The agenix configuration for the container’s user environment.

This module is a port of agenix to nix2gpu, and allows simplified management of age-encrypted secrets in nix.

To use this, you must first enable the optional agenix integration:

Then enable the module with:

age.enable = true;

Type: submodule

Default: { }

Declared by:

age.enable

Whether to enable agenix integration.

Type: boolean

Default: false

Example: true

Declared by:

age.package

The rage package to use.

Type: package

Default: pkgs.rage

Declared by:

age.identityPaths

Paths to SSH keys to be used as identities in age decryption.

Type: list of absolute path

Default:

[
  "/etc/ssh/ssh_host_rsa_key"
  "/etc/ssh/ssh_host_ed25519_key"
]

Declared by:

age.secrets

Attrset of secrets.

Type: attribute set of (submodule)

Default: { }
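
Example (a minimal sketch; the secret name and .age file are hypothetical):

age.secrets.api-token = {
  file = ./secrets/api-token.age;
  mode = "0400";
};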

Declared by:

age.secrets.<name>.file

Age file the secret is loaded from.

Type: absolute path

Declared by:

age.secrets.<name>.mode

Permissions mode of the decrypted secret in a format understood by chmod.

Type: string

Default: "0400"

Declared by:

age.secrets.<name>.name

Name of the file used in ${cfg.secretsDir}

Type: string

Default: "‹name›"

Declared by:

age.secrets.<name>.path

Path where the decrypted secret is installed.

Type: string

Default: "\${XDG_RUNTIME_DIR}/agenix/‹name›"

Declared by:

age.secrets.<name>.symlink

Whether to enable symlinking secrets to their destination.

Type: boolean

Default: true

Example: true

Declared by:

age.secretsDir

Folder where secrets are symlinked to

Type: string

Default:

"${XDG_RUNTIME_DIR}"/${dir}.

Declared by:

age.secretsMountPoint

Where secrets are created before they are symlinked to ${cfg.secretsDir}

Type: unspecified value

Default:

"${XDG_RUNTIME_DIR}"/${dir}.

Declared by:

copyToRoot

A list of packages to be copied to the root of the container.

This option allows you to specify a list of Nix packages that will be symlinked into the root directory of the container. This is useful for making essential packages and profiles available at the top level of the container’s filesystem.

The default value includes the base system, the container’s profile, and the Nix store profile, which are essential for the container to function correctly.

If you want to add extra packages without replacing the default set, use the extraCopyToRoot option instead.

This is a direct mapping to the copyToRoot attribute from nix2container.

Type: list of package

Default: The generated base system from the other config options

Example:

copyToRoot = with pkgs; [
  coreutils
  git
];

Declared by:

cuda.enable

Whether nix2gpu's CUDA integration should be enabled.

Type: boolean

Default: true

Example:

cuda.enable = false;

Declared by:

cuda.packages

The set of CUDA packages to be used in the container.

This option allows you to select a specific version of the CUDA toolkit to be installed in the container. This is crucial for ensuring compatibility with applications and machine learning frameworks that depend on a particular CUDA version.

The value should be a package set from pkgs.cudaPackages. You can find available versions by searching for cudaPackages in Nixpkgs.

Type: package

Default: pkgs.cudaPackages_13_0

Example:

cuda.packages = pkgs.cudaPackages_11_8;

Declared by:

env

A set of environment variables to set inside the container.

This option allows you to define the environment variables that will be available within the container.

The default value provides a comprehensive set of environment variables for a typical development environment, including paths for Nix, CUDA, and other essential tools.

If you want to add extra environment variables without replacing the default set, use the extraEnv option instead.

This is a direct mapping to the Env attribute of the oci container spec.

Type: attribute set of string

Default:

CURL_CA_BUNDLE = "/etc/ssl/certs/ca-bundle.crt";
HOME = "/root";
LANG = "en_US.UTF-8";
LC_ALL = "en_US.UTF-8";
LD_LIBRARY_PATH = "/lib/x86_64-linux-gnu:/usr/lib64:/usr/lib";
LOCALE_ARCHIVE = "glibc/lib/locale/locale-archive";
NIXPKGS_ALLOW_UNFREE = "1";
NIX_PATH = "nixpkgs=/nix/var/nix/profiles/per-user/root/channels";
NIX_SSL_CERT_FILE = "/etc/ssl/certs/ca-bundle.crt";
PATH = "/root/.nix-profile/bin:/nix/var/nix/profiles/default/bin:/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin";
SSL_CERT_FILE = "/etc/ssl/certs/ca-bundle.crt";
TERM = "xterm-256color";
USER = "root";

Example:

env = {
  MY_CUSTOM_VARIABLE = "hello";
  ANOTHER_VARIABLE = "world";
};

Declared by:

exposedPorts

A set of ports to expose from the container.

This option allows you to specify which network ports should be exposed by the container. The keys are the port and protocol (e.g., “80/tcp”), and the values are empty attribute sets.

By default, port 22 is exposed for SSH access.

This is a direct mapping to the ExposedPorts attribute of the oci container spec.

Type: attribute set of anything

Default:

{
  "22/tcp" = { };
}

Example:

exposedPorts = {
  "8080/tcp" = {};
  "443/tcp" = {};
};

Declared by:

extraEnv

A set of extra environment variables to set inside the container.

This option allows you to add more environment variables without overriding the defaults in env. The variables defined here are merged into the main env attribute set.

This is the recommended way to add your own custom environment variables.

Type: attribute set of string

Default: { }

Example:

extraEnv = {
  DATABASE_URL = "postgres://user:password@host:port/db";
  NIXPKGS_ALLOW_UNFREE = "1";
};

Declared by:

extraLabels

A set of extra labels to apply to the container.

This option allows you to add custom metadata to the container in the form of labels. These labels can be used for organizing and filtering containers, or for storing information about the container’s contents or purpose.

The labels defined here will be merged with the default labels set.

This is the recommended way to add more labels to your project rather than overriding labels.

Type: attribute set of string

Default: { }

Example:

extraLabels = {
  "com.example.vendor" = "My Company";
  "com.example.project" = "My Project";
};

Declared by:

extraStartupScript

A string of shell commands to be executed at the end of the container’s startup script.

This option provides a way to run custom commands every time the container starts. The contents of this option will be appended to the main startup script, after the default startup tasks have been completed.

This is useful for tasks such as starting services, running background processes, or printing diagnostic information.

Type: string (concatenated when merged)

Default: ""

Example:

extraStartupScript = ''
  echo "Hello world"
'';

Declared by:

home

The home-manager configuration for the container’s user environment.

This option allows you to define the user’s home environment using home-manager. You can configure everything from shell aliases and environment variables to user services and application settings.

By default, a minimal set of useful modern shell packages is included to provide a comfortable and secure hacking environment on your machines.

home-manager is bundled with nix2gpu, so no additional flake inputs are required to use this option.

Type: lazy attribute set of raw value

Default: A sample home manager config with some nice defaults from nix2gpu

Example:

home = home-manager.lib.homeManagerConfiguration {
  inherit pkgs;
  extraSpecialArgs = { inherit inputs; };
  modules = [
    ./home
  ];
};

Declared by:

labels

A set of labels to apply to the container.

This option allows you to define metadata for the container in the form of labels. These labels can be used for organizing and filtering containers, or for storing information about the container’s contents or purpose.

The default value includes several labels that provide information about the container’s origin, runtime, and dependencies.

If you want to add extra labels without replacing the default set, use the extraLabels option instead.

This is a direct mapping to the Labels attribute of the oci container spec.

Type: attribute set of string

Default:

"ai.vast.gpu" = "required";
"ai.vast.runtime" = "nix2gpu";
"org.opencontainers.image.source" = "https://github.com/weyl-ai/nix2gpu";
"org.opencontainers.image.description" = "Nix-based GPU container";

Example:

labels = {
  "my.custom.label" = "some-value";
  "another.label" = "another-value";
};

Declared by:

maxLayers

The maximum number of layers to use when creating the container image.

This option sets the upper limit on the number of layers used to build the container image. More layers generally improve cache reuse, since unchanged layers can be skipped on rebuilds and shared between images in a registry.

See this blog post for some nice information on layers in a nix context.

This is a direct mapping to the maxLayers attribute from nix2container.

Type: signed integer

Default: 50

Example:

maxLayers = 100;

Declared by:

meta

meta attributes to include in the output of generated nix2gpu containers

Type: lazy attribute set of raw value

Default: { }

Example:

{
  meta = {
    description = "My cool nimi package";
  };
}

Declared by:

nimiSettings

Bindings to nimi.settings for this nix2gpu instance.

Use this to tune Nimi runtime behavior (restart policy, logging, startup hooks, and container build settings) beyond the defaults provided by nix2gpu.

Type: module

Default: { }
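
Example (a sketch using the restart and logging settings shown in the services & runtime chapter):

nimiSettings = {
  restart.mode = "up-to-count";
  logging.enable = true;
};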

Declared by:

nixConfig

The content of the nix.conf file to be used inside the container.

This option allows you to provide a custom nix.conf configuration for the Nix daemon running inside the container. This can be used to configure things like custom binary caches, experimental features, or other Nix-related settings.

By default, a standard nix.conf is provided which is suitable for most use cases.

Type: string

Default:

sandbox = false
build-users-group =
experimental-features = nix-command flakes
trusted-users = root
max-jobs = auto
cores = 0
trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= nix-community.cachix.org-1:mB9FSh9qf2dCimDSUo8Zy7bkq5CX+/rkCWyvRCYg3Fs= cuda-maintainers.cachix.org-1:0dq3bujKpuEPMCX6U4WylrUDZ9JyUG0VpVZa7CNfq5E= cache.garnix.io:CTFPyKSLcx5RMJKfLo5EEPUObbA78b0YQ2DTCJXqr9g=
substituters = https://cache.nixos.org https://nix-community.cachix.org https://cuda-maintainers.cachix.org https://cache.garnix.io
keep-outputs = true
keep-derivations = true
accept-flake-config = true

Example:

nixConfig = ''
  experimental-features = nix-command flakes
  substituters = https://cache.nixos.org/ https://my-cache.example.org
  trusted-public-keys = cache.nixos.org-1:6NCHdD59X431o0gWypbMrAURkbJ16ZPMQFGspcDShjY= my-cache.example.org-1:abcdef...
'';

Declared by:

passthru

passthru attributes to include in the output of generated nix2gpu containers

Type: lazy attribute set of raw value

Default: { }

Example:

{
  passthru = {
    doXYZ = pkgs.writeShellApplication {
      name = "xyz-doer";
      text = ''
        xyz
      '';
    };
  };
}

Declared by:

registries

The container registries to push your images to.

This option specifies a list of the full registry paths, including the repository and image name, where the container image will be pushed. This is a mandatory field if you intend to publish your images via <container>.copyToGithub.

Type: list of string

Default: [ ]

Example:

registries = [ "ghcr.io/my-org/my-image" ];

Declared by:

services

Services to run inside the nix2gpu container via Nimi.

Each attribute defines a named NixOS modular service (Nix 25.11): import a service module and override its options per instance. This keeps service definitions composable and reusable across projects.

For the upstream model, see the NixOS manual section on Modular Services.

Type: lazy attribute set of module

Default: { }
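
Example (the minimal service from the services & runtime chapter):

services."hello" = {
  process.argv = [
    (lib.getExe pkgs.bash)
    "-lc"
    "echo hello from Nimi"
  ];
};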

Declared by:

sshdConfig

The content of the sshd_config file to be used inside the container.

This option allows you to provide a custom configuration for the OpenSSH daemon (sshd) running inside the container. This can be used to customize security settings, authentication methods, and other SSH-related options.

By default, a standard sshd_config is provided that is suitable for most use cases, with password authentication disabled in favor of public key authentication.

Type: string

Default: nix2gpu generated sshd config

Example:

sshdConfig = builtins.readFile ./my-sshd-config;

Declared by:

systemPackages

A list of system packages to be copied into the container.

This option allows you to specify a list of Nix packages that will be added to the container.

Type: list of package

Default: [ ]

Example:

systemPackages = with pkgs; [
  coreutils
  git
];

Declared by:

tag

The tag to use for your container image.

This option specifies the tag that will be applied to the container image when it is built and pushed to a registry. Tags are used to version and identify different builds of your image.

The default value is “latest”, which is a common convention for the most recent build. However, it is highly recommended to use more descriptive tags for production images, such as version numbers or git commit hashes.

Type: string

Default: "latest"

Example:

tag = "v1.2.3";

Declared by:

tailscale

The tailscale configuration to use for your nix2gpu container.

Configure the tailscale daemon to run on your nix2gpu instance, giving your instances easy and secure connectivity.

Type: submodule

Default: { }

Example:

tailscale = {
  enable = true;
};

Declared by:

tailscale.enable

Whether to enable the tailscale daemon.

Type: boolean

Default: false

Example: true

Declared by:

tailscale.authKey

Runtime path to valid tailscale auth key

Type: string

Default: ""

Example: /etc/default/tailscaled

Declared by:

user

The default user for the container.

This option specifies the username of the user that will be used by default when running commands or starting services in the container.

The default value is “root”. While this is convenient for development, it is strongly recommended to create and use a non-root user for production environments to improve security. You can create users and groups using the users and groups options in your home-manager configuration.

Type: string

Default: root

Example:

user = "appuser";

Declared by:

workingDir

The working directory for the container.

This option specifies the directory that will be used as the current working directory when the container starts. It is the directory where commands will be executed by default.

The default value is “/root”. You may want to change this to a more appropriate directory for your application, such as /app or /srv.

Type: string

Default: /root

Example:

workingDir = "/app";

Declared by:

// architecture overview //

how nix2gpu works under the hood.


// the big picture //

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   nix flake     │────│      Nimi       │────│   OCI image     │
│   definition    │    │ mkContainerImage│    │   (layered)     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                        │                        │
         ▼                        ▼                        ▼
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ container config│    │  startup script │    │ docker/podman   │
│   (nix modules) │    │ + Nimi runtime  │    │    runtime      │
└─────────────────┘    └─────────────────┘    └─────────────────┘

nix2gpu transforms declarative nix configurations into reproducible GPU containers through a multi-stage build process.


// build pipeline //

1. nix evaluation

perSystem.nix2gpu."my-container" = {
  cuda.packages = pkgs.cudaPackages_12_8;
  tailscale.enable = true;
  services."api" = {
    process.argv = [ (lib.getExe pkgs.my-api) "--port" "8080" ];
  };
};

The nix module system processes your configuration, applying defaults, validating options, and computing the final container specification.

2. dependency resolution

nix build .#my-container

Nix builds the entire dependency graph:

  • Base system packages (bash, coreutils, etc.)
  • CUDA toolkit and drivers
  • Your application packages
  • Service configurations
  • Startup scripts

3. image assembly

nimi.mkContainerImage {
  name = "my-container";
  copyToRoot = [ baseSystem cudaPackages userPackages ];
}

Nimi builds the OCI image (via nix2container) with:

  • Layered filesystem for efficient caching
  • Only necessary dependencies included
  • Reproducible layer ordering

4. container execution

docker run --gpus all my-container:latest

The startup script orchestrates initialization, then Nimi runs services.


// filesystem layout //

/
├── nix/
│   └── store/          # immutable package store
│       ├── cuda-*      # CUDA toolkit
│       ├── startup-*   # initialization script  
│       └── packages-*  # your applications
├── etc/
│   ├── ssh/            # SSH daemon config
│   └── ld.so.conf.d/   # library search paths
├── run/
│   └── secrets/        # mounted secret files
├── workspace/          # default working directory
└── tmp/                # temporary files

Key principles:

  • Immutable system: /nix/store contains all software, never modified at runtime
  • Mutable state: /workspace, /tmp, /run for runtime data
  • Secrets: mounted at /run/secrets from external sources
  • Library paths: dynamic loader configured for both nix store and host-mounted drivers

// startup sequence //

The container initialization follows a precise sequence:

1. environment setup

# startup.sh
export PATH="/nix/store/.../bin:$PATH"
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/lib/x86_64-linux-gnu"
export CUDA_PATH="/nix/store/...-cuda-toolkit"

  • Sets up PATH for nix store binaries
  • Configures library search for both nix store and host-mounted NVIDIA drivers
  • Establishes CUDA environment

2. runtime detection

if [[ -d "/lib/x86_64-linux-gnu" ]]; then
  echo "vast.ai runtime detected"
  # patch nvidia utilities for host drivers
elif [[ -n "$RUNPOD_POD_ID" ]]; then
  echo "runpod runtime detected"  
  # configure network volumes
else
  echo "bare-metal/docker runtime detected"
fi

Adapts configuration based on detected cloud provider or bare-metal environment.

3. GPU initialization

# link host drivers to expected locations
ldconfig

# test GPU access
nvidia-smi || echo "GPU not available"

Ensures GPU toolchain works with both nix store CUDA and host-mounted drivers.

4. network setup

# tailscale daemon (if enabled)
if [[ -n "$TAILSCALE_AUTHKEY_FILE" ]]; then
  tailscaled --state-dir=/tmp/tailscale &
  tailscale up --authkey="$(cat $TAILSCALE_AUTHKEY_FILE)"
fi

# SSH daemon
mkdir -p /var/empty /var/log
sshd

Starts networking services: Tailscale for mesh networking, SSH for remote access.

5. service orchestration

# Nimi is the container entrypoint
nimi --config /nix/store/.../nimi.json

Nimi runs the startup hook and then launches your modular services.


// service management //

Nimi

nix2gpu uses Nimi, a tiny process manager for NixOS modular services (Nix 25.11):

services."api" = {
  process.argv = [ (lib.getExe pkgs.my-api) "--port" "8080" ];
};

nimiSettings.restart.mode = "up-to-count";

Benefits over systemd:

  • No init system complexity
  • Modular service definitions
  • JSON config generated by Nix
  • Predictable restart behavior

service lifecycle

  1. Dependency resolution: services start in correct order
  2. Health monitoring: automatic restart on failure
  3. Log aggregation: all service logs to stdout for docker logs
  4. Graceful shutdown: proper signal handling for container stops

// networking architecture //

standard mode (docker/podman)

┌─────────────┐    ┌─────────────┐    ┌─────────────┐
│    host     │────│  container  │────│   service   │
│ localhost:* │    │   bridge    │    │ localhost:* │
└─────────────┘    └─────────────┘    └─────────────┘

Standard container networking with port forwarding.

tailscale mode (mesh networking)

┌─────────────┐    ┌─────────────────┐    ┌─────────────┐
│   host-a    │    │   tailscale     │    │   host-b    │
│ container-a ├────┤  mesh network   ├────┤ container-b │
│10.0.0.100:22│    │                 │    │10.0.0.101:22│
└─────────────┘    └─────────────────┘    └─────────────┘

Direct container-to-container communication across hosts via Tailscale.
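
A sketch of enabling this for a container, using the tailscale options from the reference (the secret path is illustrative):

tailscale = {
  enable = true;
  authKey = "/run/secrets/ts-key";
};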

Key advantages:

  • No port forwarding needed
  • Works across clouds and networks
  • End-to-end encryption
  • DNS-based service discovery
  • ACL-based access control

// GPU integration //

driver compatibility

# nix store CUDA toolkit
/nix/store/...-cuda-toolkit/
├── bin/nvcc
├── lib/libcuda.so       # stub library
└── include/cuda.h

# host-mounted real drivers  
/lib/x86_64-linux-gnu/
├── libcuda.so.1         # actual GPU driver
├── libnvidia-ml.so.1
└── libnvidia-encode.so.1

The challenge: CUDA applications need both:

  • CUDA toolkit (development headers, nvcc compiler) from nix store
  • Actual GPU drivers from the host system

The solution: dynamic library path configuration

export LD_LIBRARY_PATH="/nix/store/...-cuda/lib:${LD_LIBRARY_PATH}:/lib/x86_64-linux-gnu"
ldconfig

This allows nix store CUDA to find host-mounted drivers at runtime.

cloud provider adaptations

vast.ai: NVIDIA drivers mounted at /lib/x86_64-linux-gnu

# startup.sh detects vast.ai and configures paths
patchelf --set-rpath /lib/x86_64-linux-gnu /nix/store/.../nvidia-smi

runpod: Standard nvidia-docker integration

# uses nvidia-container-toolkit mounts
# drivers available via standard paths

bare-metal: Host nvidia-docker setup

# relies on proper nvidia-container-toolkit configuration
# GPU access via device mounts: --gpus all

// secret management //

security principles

  1. Secrets never enter nix store (nix store is world-readable)
  2. Runtime-only access (secrets mounted at container start)
  3. File-based injection (not environment variables)
  4. Minimal exposure (secrets only accessible to specific processes)

agenix integration

nix2gpu."my-container" = {
  age.enable = true;
  age.secrets.tailscale-key = {
    file = ./secrets/ts-key.age;
    path = "/run/secrets/ts-key";
  };

  tailscale.authKey = config.age.secrets.tailscale-key.path;
};

Flow:

  1. Host system decrypts secrets to /run/secrets/
  2. Container mounts /run/secrets as volume
  3. Container references secrets by path, never by value

// caching & performance //

layer optimization

# nix2container creates efficient layers
[
  layer-01-base-system     # coreutils, bash, etc.
  layer-02-cuda-toolkit    # large but stable
  layer-03-python-packages # frequently changing  
  layer-04-app-code        # most frequently changing
]

Frequently changing components go in higher layers to maximize cache hits.

build caching

# first build: downloads everything
nix build .#my-container  # ~15 minutes

# subsequent builds: only changed layers
nix build .#my-container  # ~30 seconds

Nix’s content-addressed store ensures perfect reproducibility with efficient incremental builds.

registry layer sharing

# multiple containers share base layers
my-container:v1    # 2GB total (8 layers)
my-container:v2    # +100MB (only top layer changed)
other-container:v1 # +500MB (shares 6 bottom layers)

OCI registries deduplicate shared layers across images.


// extending the system //

custom services

# services/my-service.nix
{ lib, pkgs, ... }:
{ config, ... }:
let
  inherit (lib) mkOption types;
  cfg = config.myService;
in
{
  _class = "service";

  options.myService = {
    port = mkOption { type = types.port; default = 8080; };
    # ... other options
  };
  
  config.process.argv = [
    (lib.getExe pkgs.my-service)
    "--port"
    (toString cfg.port)
  ];
}

custom cloud targets

# modules/container/scripts/copy-to-my-cloud.nix
{
  perSystem = { pkgs, self', ... }: {
    perContainer = { container, ... }: {
      scripts.copy-to-my-cloud = pkgs.writeShellApplication {
        name = "copy-to-my-cloud";
        text = ''
          # implement your cloud's container registry push
        '';
      };
    };
  };
}

The modular architecture makes it straightforward to add new cloud providers or service types.

// services & runtime //

managing long-running processes inside nix2gpu containers.


// overview //

nix2gpu uses Nimi, a tiny process manager for NixOS modular services (Nix 25.11). Define services under services.<name> and nix2gpu will build an OCI image with Nimi as the entrypoint.

No extra flake inputs are required to enable services.


// defining services //

using existing modular services

services."ghostunnel" = {
  imports = [ pkgs.ghostunnel.services ];
  ghostunnel = {
    listen = "0.0.0.0:443";
    cert = "/root/service-cert.pem";
    key = "/root/service-key.pem";
    disableAuthentication = true;
    target = "backend:80";
    unsafeTarget = true;
  };
};

a minimal custom service

services."hello" = {
  process.argv = [
    (lib.getExe pkgs.bash)
    "-lc"
    "echo hello from Nimi"
  ];
};

For a full custom module example, see defining custom services.


// runtime behavior //

When the container starts, Nimi runs the nix2gpu startup hook and then launches all configured services. You can still drop into a shell for debugging with:

$ docker run -it --entrypoint bash my-container:latest

// restart policies //

Nimi controls service restarts. Tune it with nimiSettings.restart:

nimiSettings.restart = {
  mode = "up-to-count"; # never | up-to-count | always
  time = 2000;           # delay in ms
  count = 3;             # max restarts when using up-to-count
};

// logging //

Logs always stream to stdout/stderr for docker logs. You can also enable per-service log files:

nimiSettings.logging = {
  enable = true;
  logsDir = "nimi_logs";
};

At runtime, Nimi creates a logs-<n> directory under logsDir and writes one file per service.


// config data //

Modular services can provide config files via configData. Nimi exposes these files under a temporary directory and sets XDG_CONFIG_HOME for the service. This lets services read configs from $XDG_CONFIG_HOME/<path> without writing to the Nix store.

See nix2gpu service modules like services/comfyui.nix for real-world usage.

// defining a custom service //

Use NixOS modular services (Nix 25.11) to describe long-running processes. nix2gpu runs them through Nimi, so there is no extra flake input to enable.


// example: a simple HTTP server //

This example defines a tiny service module that runs python -m http.server.

simple-http.nix

{ lib, pkgs, ... }:
{ config, ... }:
let
  inherit (lib) mkOption types;
  cfg = config.simpleHttp;
in
{
  _class = "service";

  options.simpleHttp = {
    port = mkOption {
      type = types.port;
      default = 8080;
    };
    bind = mkOption {
      type = types.str;
      default = "0.0.0.0";
    };
    directory = mkOption {
      type = types.str;
      default = "/workspace";
    };
  };

  config.process.argv = [
    (lib.getExe pkgs.python3)
    "-m"
    "http.server"
    "--bind"
    cfg.bind
    "--directory"
    cfg.directory
    (toString cfg.port)
  ];
}

server.nix

{ lib, ... }:
{
  perSystem = { pkgs, ... }: {
    nix2gpu."simple-http" = {
      services."web" = {
        imports = [ (lib.modules.importApply ./simple-http.nix { inherit pkgs; }) ];
        simpleHttp = {
          port = 8080;
          directory = "/workspace/public";
        };
      };

      exposedPorts = {
        "8080/tcp" = {};
        "22/tcp" = {};
      };
    };
  };
}

// using existing modular services //

If a package ships a modular service module, you can import it directly. For example, ghostunnel from nixpkgs:

services."ghostunnel" = {
  imports = [ pkgs.ghostunnel.services ];
  ghostunnel = {
    listen = "0.0.0.0:443";
    cert = "/root/service-cert.pem";
    key = "/root/service-key.pem";
    disableAuthentication = true;
    target = "backend:80";
    unsafeTarget = true;
  };
};

Need a fuller reference? See services/comfyui.nix for a real-world service module.

// integrations //

nix2gpu ships its integrations as part of the flake. You do not enable them by adding extra inputs upstream; if you are using the nix2gpu flake, they are already available.


// Nimi (modular services) //

Nimi is the runtime that powers services in nix2gpu. It runs NixOS modular services (Nix 25.11) without requiring a full init system.

  • Define services under services.<name> using modular service modules.
  • Tune runtime behavior with nimiSettings (restart, logging, startup).

See services & runtime for the workflow and examples.


// nix2container //

nix2gpu builds OCI images through Nimi’s mkContainerImage, which uses nix2container under the hood. You do not need to add a separate nix2container input to use nix2gpu containers.


// home-manager //

home-manager is integrated for user environment configuration. Use the home option to describe shell configuration, tools, and dotfiles in a modular way.

If you are porting an existing home-manager config that targets a non-root user, nix2gpu includes a convenience module:

inputs.nix2gpu.homeModules.force-root-user

See the home option for details.