Luka Dekanozishvili

Student, developer & DevOps enthusiast


Rescuing NixOS stuck on Stage 1 boot

2025-10-27 · 3 minute read

Introduction

One of the main selling points of NixOS is that it lets several versions of the same program coexist on one machine. Together with program isolation and dependency management, this allows the system to keep multiple complete configurations, the so-called NixOS generations, side by side.

How it works is simple: when applying any change to the system, like installing a program or changing an option declaratively, you have to rebuild the system. You can either switch to it immediately like so:

sudo nixos-rebuild switch

or switch to it only when you reboot the machine:

sudo nixos-rebuild boot

If the rebuild succeeds, it creates a new generation. The previous generations remain available, which comes in very handy when a rebuild succeeds but leaves the system unusable, or when something hangs or breaks after a reboot: you can simply boot into an older generation from the boot menu.
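For reference, each generation is a numbered symlink next to the system profile. Below is a minimal sketch for listing them, assuming the standard /nix/var/nix/profiles location; list_generations is a hypothetical helper name, not a NixOS tool.

```shell
# List the system generation numbers found in a profiles directory, oldest first.
# The default path is an assumption: the usual NixOS system profile location.
list_generations() {
    profiles_dir="${1:-/nix/var/nix/profiles}"
    for link in "$profiles_dir"/system-*-link; do
        # Skip the unexpanded glob when no generation links exist.
        [ -e "$link" ] || [ -L "$link" ] || continue
        name="${link##*/}"       # e.g. system-42-link
        num="${name#system-}"    # e.g. 42-link
        printf '%s\n' "${num%-link}"
    done | sort -n
}
```

Calling list_generations without arguments on a NixOS machine should print one generation number per line. To go back to the previous generation without rebooting, sudo nixos-rebuild switch --rollback does the job.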

The issue

I have a service enabled that garbage-collects previous generations older than 2 weeks:

nix.gc = {
  automatic = true;
  dates = "Fri *-*-* 04:00:00";
  options = "--delete-older-than 14d";
};

This means that if I make a change that breaks the boot sequence of my system and only notice it more than 2 weeks later, the working generations may already be gone, leaving me with an unbootable system. While this might sound scary, it doesn't happen often. Unluckily for me, this is exactly what happened.
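To get a feeling for how exposed a system is, one can look at which generation links already fall outside the 14-day window. This is only a rough sketch: it uses the modification time of the profile symlinks as a stand-in for generation age, and stale_generations is a hypothetical helper name.

```shell
# Print the system generation links whose modification time is more than DAYS days old.
# Approximation only: the garbage collector applies its own rules when deleting.
stale_generations() {
    days="$1"
    profiles_dir="${2:-/nix/var/nix/profiles}"
    find "$profiles_dir" -maxdepth 1 -name 'system-*-link' -mtime +"$days" | sort
}
```

For example, stale_generations 14 lists the links older than two weeks. If every known-good generation shows up in that list, the next garbage collection run may remove the last safety net.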

My issue was that the system was waiting indefinitely for the kernel module vfio_pci to load, which wasn't responding. After making sure there were no timeouts in place, I rebooted the system into a NixOS minimal installation image and began diagnosing.

Fixing the issue

I'm using ZFS for my boot drive, so the first step was to import the pool and mount its datasets.

# Escalate your privileges
sudo su

# Load the zfs kernel module
modprobe zfs

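# List all pools that are available for import (this does not import anything yet)
zpool import

# Import the pool by name (-f because it was last used by a different system)
zpool import -f nvmepool

# Show all available datasets
zfs list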

Output:

NAME                           USED  AVAIL  REFER  MOUNTPOINT
nvmepool                      51.3G   406G    96K  none
nvmepool/home                  156M   406G   156M  legacy
nvmepool/nix                  16.7G   406G  16.7G  legacy
nvmepool/root                 11.8M   406G  11.8M  legacy
nvmepool/var                  1.35G   406G  1.35G  legacy

Here, nvmepool is the name of my zpool.

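Now mount the root dataset first, so that the remaining mountpoints are created inside it rather than in the live system:

mount -t zfs nvmepool/root /mnt
mkdir -p /mnt/home /mnt/nix /mnt/var /mnt/boot

and mount the other datasets:

mount -t zfs nvmepool/home /mnt/home
mount -t zfs nvmepool/nix /mnt/nix
mount -t zfs nvmepool/var /mnt/var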
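With more datasets, typing the mounts by hand gets tedious. Below is a dry-run sketch that only prints the commands, assuming a layout like mine where POOL/root is the root filesystem and every other dataset belongs at the path matching its name. The helper name plan_mounts is hypothetical.

```shell
# Print (do not run) the mount commands for a pool laid out as POOL/root, POOL/home, ...
# Dataset names are read from stdin, one per line.
plan_mounts() {
    pool="$1"
    target="${2:-/mnt}"
    datasets=$(cat)

    # The root dataset goes first so that the other mountpoints live inside it.
    for ds in $datasets; do
        if [ "$ds" = "$pool/root" ]; then
            printf 'mount -t zfs %s %s\n' "$ds" "$target"
        fi
    done

    # Every other dataset under the pool is mounted at TARGET/<dataset name>.
    for ds in $datasets; do
        case "$ds" in
            "$pool" | "$pool/root") ;;
            "$pool"/*)
                sub=${ds#"$pool"/}
                printf 'mkdir -p %s/%s\n' "$target" "$sub"
                printf 'mount -t zfs %s %s/%s\n' "$ds" "$target" "$sub"
                ;;
        esac
    done
}
```

On the live system it could be fed with zfs list -H -o name -r nvmepool | plan_mounts nvmepool. Review the printed commands before running any of them.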

Note: nvmepool/root refers to the / directory, also known as the root directory, and not to the /root/ directory, which is the root user's home directory.

The boot partition also has to be mounted explicitly so that NixOS can write the boot entries for the new generation, so find out which partition it is:

$ lsblk --fs
NAME        FSTYPE     FSVER LABEL    UUID                FSAVAIL FSUSE% MOUNTPOINTS
nvme0n1
├─nvme0n1p1 vfat       FAT32 boot     6445-6413            906.6M    11% /boot
└─nvme0n1p2 zfs_member 5000  nvmepool 6733023062829592015

Here, look for a vfat filesystem, or for a label such as EFI or ESP (mine is labelled boot). The boot partition is usually no larger than a few gigabytes.
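The same lookup can be scripted. Here is a small sketch, assuming lsblk's raw output with the device name in the first column and the filesystem type in the second; find_vfat is a hypothetical helper name.

```shell
# Read "NAME FSTYPE" lines on stdin and print the device path of every vfat partition.
find_vfat() {
    awk '$2 == "vfat" { print "/dev/" $1 }'
}

# On the live system this would be fed by lsblk:
#   lsblk --raw --noheadings --output NAME,FSTYPE | find_vfat
```

The installer medium may show up with a vfat partition of its own, so double-check the result against the disk that holds the zpool.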

Afterward, mount it:

mount /dev/nvme0n1p1 /mnt/boot

In my case, I also had to bind-mount these virtual filesystems from the live system:

mount --rbind /dev /mnt/dev
mount --rbind /sys /mnt/sys
mount --rbind /proc /mnt/proc

Finally, enter the mounted system:

nixos-enter --root /mnt

Make sure networking works (for cache.nixos.org):

ping google.com

Then, navigate to your NixOS config directory, revert the offending changes by hand, and run:

nixos-rebuild boot

Or this, if you're using flakes:

nixos-rebuild boot --flake .#

If that succeeds, exit the chrooted environment:

exit

and export the zpool cleanly:

zpool export nvmepool

Note: the export is only strictly required if boot.zfs.forceImportRoot is set to false. With boot.zfs.forceImportRoot = true; the pool is force-imported at boot even if it was not exported cleanly.

Finally, reboot the system (and make sure not to boot into the live image again). You should now be able to boot into your system normally.