ZFS on NetBSD

This page attempts to do two things: provide enough orientation and pointers to standard ZFS documentation for NetBSD users who are new to ZFS, and describe NetBSD-specific ZFS information. It is emphatically not a tutorial or an introduction to ZFS.

Many things are marked with \todo because they need a better explanation, and some have question marks.

This HOWTO describes the most recent state of branches, and does not attempt to describe formal releases. Take this as a hint: if you are using ZFS on NetBSD 9, you should update along the netbsd-9 branch.

Status of ZFS in NetBSD

NetBSD 8

NetBSD 8 has an old version of ZFS, and it is not recommended for use at all. There is no evidence that anyone is interested in helping with ZFS on 8. Those wishing to use ZFS on NetBSD 8 should therefore update to NetBSD 9.

NetBSD 9

NetBSD-9 has ZFS that is considered to work well. There have been fixes since 9.0_RELEASE. As always, people running NetBSD 9 are likely best served by the most recent version of the netbsd-9 stable branch. As of 2021-03, ZFS in the NetBSD 9.1 release is very close to netbsd-9, except that the mkdir fix is newly in netbsd-9.

There was a crash with mkdir over NFS with maproot, resolved in March 2021 in 9 and current. See http://gnats.netbsd.org/55042

There is a workaround where removing a file will commit the ZIL (normally this would not be done), to avoid crashes due to vnode reclaims. \todo Link to PR.

There has been a report of an occasional panic somewhere in zfs_putpages.

NetBSD-current

NetBSD-current (as of 2021-03) has similar ZFS code to 9.

There is initial support for ZFS root, via booting from ffs and pivoting.

NetBSD/xen special issues

Summary: if you are using NetBSD, xen and zfs, use NetBSD-current.

In NetBSD-9, MAXPHYS is 64KB in most places, but because of xbd(4) it is set to 32KB for XEN kernels. Thus the standard zfs kernel modules do not work under xen. In NetBSD-current, xbd(4) supports 64 KB MAXPHYS and this is no longer an issue. Xen and zfs on current are reported to work well together, as of 2021-02.

Architectures

Most people seem to be using amd64.

To build zfs, one puts MKZFS=yes in mk.conf. This is default on amd64 and aarch64 on netbsd-9. In current, it is also default on sparc64.

More or less, zfs can be enabled on an architecture when it is known to build and run reliably. (Of course, users are welcome to build it and report.)

Quick Start

See the FreeBSD Quickstart Guide; only the first item (enabling ZFS in rc.conf) is OS-specific. For NetBSD, see rc.conf below.

Documentation Pointers

See the man pages for zfs(8), zpool(8). Also see zdb(8), if only for seeing pool config info when run with no arguments.

NetBSD-specific information

rc.conf

The main configuration is to put zfs=YES in rc.conf, so that the rc.d scripts bring up ZFS and mount ZFS file systems.
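As a minimal sketch, the rc.conf entry looks like this. The comment about starting the script by hand assumes the rc.d script is installed as /etc/rc.d/zfs; check your system before relying on that path.

```shell
# /etc/rc.conf fragment. rc.conf is read as shell variable assignments.
zfs=YES

# Assumption: the rc.d script lives at /etc/rc.d/zfs. If so, ZFS can be
# brought up without rebooting by running, as root:
#   /etc/rc.d/zfs start
```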

pool locations

One can add disks or parts of disks into pools. Areas to be included can be specified as entire disks (e.g. /dev/wd0), wedges (e.g. /dev/dk0), or disklabel partitions (e.g. /dev/wd0f).

Information about created or imported pools is stored in /etc/zfs/zpool.cache.

Conventional wisdom is that a pool that is more than 80% used gets unhappy; so far there is no NetBSD-specific wisdom to confirm or refute that.
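To keep an eye on the 80% figure, one might filter the output of "zpool list". The following is a sketch only: the function name is made up for illustration, and it assumes "zpool list -H -o name,capacity" prints one tab-separated "name capacity%" line per pool.

```shell
# Print pools whose used capacity exceeds a threshold (percent, default 80).
# Reads "name<TAB>capacity%" lines on stdin, as assumed to be produced by:
#   zpool list -H -o name,capacity
pools_over_threshold() {
    threshold="${1:-80}"
    while read -r name cap; do
        cap="${cap%\%}"                 # strip the trailing percent sign
        case "$cap" in
            ''|*[!0-9]*) continue ;;    # skip "-" or unparseable values
        esac
        if [ "$cap" -gt "$threshold" ]; then
            printf '%s %s%%\n' "$name" "$cap"
        fi
    done
}

# Typical use:
#   zpool list -H -o name,capacity | pools_over_threshold 80
```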

pool native blocksize mismatch

ZFS attempts to find out the native blocksize for a disk when using it in a pool; this is almost always 512 or 4096. Somewhere between 9.0 and 9.1, at least some disks on some controllers that used to report 512 now report 4096. This provokes a blocksize mismatch warning.

Given that the native blocksize of the disk didn't change, and things seemed OK using the 512 emulated blocks, the warning is likely not critical. However, rebuilding the pool with the 4096 blocksize is likely to result in better behavior, because ZFS will then only try to do 4096-byte writes. \todo Verify this and find the actual change and explain better.

pool importing problems

While one can "zpool create pool0 /dev/wd0f" and have a working pool, this pool cannot be exported and imported straightforwardly. "zpool export" works fine, and deletes zpool.cache. "zpool import", however, only looks at entire disks (e.g. /dev/wd0), and might look at wedges (e.g. /dev/dk0). It does not look at partitions like /dev/wd0f, and there is no way on the command line to ask that specific devices be examined. Thus, export/import fails for pools with disklabel partitions.

One can make wd0 be a link to wd0f temporarily, and the pool will then be importable. However, "wd0" is then stored in zpool.cache, and the system will attempt to use it on the next boot. This is obviously not a good approach.

One can mkdir e.g. /etc/zfs/pool0 and in it have a symlink to /dev/wd0f. Then, zpool import -d /etc/zfs/pool0 will scan /etc/zfs/pool0/wd0f and succeed. The resulting zpool.cache will have that path, but having symlinks in /etc/zfs/POOLNAME seems acceptable.
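The symlink-directory approach can be sketched as a small helper. The function name is hypothetical; the paths and pool name are the ones used in the text above.

```shell
# Create a directory of symlinks that "zpool import -d" can scan.
# usage: make_import_dir DIR DEVICE...
make_import_dir() {
    dir="$1"; shift
    mkdir -p "$dir" || return 1
    for dev in "$@"; do
        # Name each link after the device node, e.g. DIR/wd0f -> /dev/wd0f.
        ln -sf "$dev" "$dir/$(basename "$dev")" || return 1
    done
}

# Example (as root):
#   make_import_dir /etc/zfs/pool0 /dev/wd0f
#   zpool import -d /etc/zfs/pool0          # list pools found there
#   zpool import -d /etc/zfs/pool0 pool0    # import the pool by name
```

Keeping one directory per pool matches the /etc/zfs/POOLNAME convention suggested above and keeps the path recorded in zpool.cache stable across boots.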

\todo Determine a good fix, perhaps man page changes only, fix it upstream, in current, and in 9, before removing this discussion.

mountpoint conventions

By default, datasets are mounted as /poolname/datasetname. One can also set a mountpoint; see zfs(8).

There does not appear to be any strong reason to prefer either explicit mountpoints or the default. With the default, one either uses the data in place or symlinks to it.

mount order

NetBSD 9 mounts other file systems and then ZFS file systems. This can be a problem if /usr/pkgsrc is on ZFS and /usr/pkgsrc/distfiles is on NFS. A workaround is to use noauto and do the mounts in /etc/rc.local.
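A sketch of the noauto workaround for the NetBSD 9 case follows. The server name "nfsserver" and the exported path are hypothetical; only /usr/pkgsrc/distfiles comes from the example above.

```shell
# /etc/fstab entry. "noauto" keeps the boot-time mount pass from attempting
# this NFS mount before the ZFS file system under it is mounted:
#   nfsserver:/distfiles /usr/pkgsrc/distfiles nfs rw,noauto 0 0

# /etc/rc.local fragment: by the time this runs, ZFS file systems are
# mounted, so the NFS mount can land on top of /usr/pkgsrc.
mount /usr/pkgsrc/distfiles
```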

NetBSD current after 20200301 mounts ZFS first. The same issues and workarounds apply in different circumstances.

NFS

zfs filesystems can be exported via NFS, simply by placing them in /etc/exports like any other filesystem.

The "zfs share" command adds a line to /etc/zfs/exports for each filesystem with the sharenfs property set, and "zfs unshare" removes it. This file is ignored on NetBSD-9 and current before 20210216; on current after 20210216 those filesystems should be exported (assuming NFS is enabled). It does not appear to be possible to set options like maproot and network restrictions via this method.

On current before 20210216, a remote mkdir on a filesystem exported with -maproot=0:10 caused a kernel NULL pointer dereference. This is now fixed.

zvol

Within a ZFS pool, the standard approach is to have file systems, but one can also create a zvol, which is a block device of a certain size.

As an example, "zfs create -V 16G tank0/xen-netbsd-9-amd64" creates a zvol (intended to be a virtual disk for a domU).

The zvol in the example will appear as /dev/zvol/rdsk/tank0/xen-netbsd-9-amd64 and /dev/zvol/dsk/tank0/xen-netbsd-9-amd64 and can be used like a disklabel partition or wedge. However, the system will not read disklabels and gpt labels from a zvol.
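The naming scheme can be illustrated with a tiny helper that prints both device nodes for a given zvol. The helper name is made up; the commands in the comments are the ones from the example above plus "zfs get volsize", which reports the volume size.

```shell
# Print the raw and block device nodes for a zvol, following the
# /dev/zvol/rdsk/... and /dev/zvol/dsk/... layout described above.
# usage: zvol_nodes POOL/VOLUME
zvol_nodes() {
    printf '/dev/zvol/rdsk/%s\n' "$1"
    printf '/dev/zvol/dsk/%s\n' "$1"
}

# Example (as root):
#   zfs create -V 16G tank0/xen-netbsd-9-amd64
#   zvol_nodes tank0/xen-netbsd-9-amd64
#   zfs get volsize tank0/xen-netbsd-9-amd64
```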

Doing "swapctl -a" on a zvol device node fails. \todo Is it really true that NetBSD can't swap on a zvol? (When using a zvol for swap, standard advice is to avoid the "-s" option, which skips reserving the allocated space. Standard advice is also to consider using a dedicated pool.)

\todo Explain that one can export a zvol via iscsi.

One can use ccd to create a normal-looking disk from a zvol. This allows reading a GPT label from the zvol, which is useful in case the zvol had been exported via iscsi and some other system created a label.

Memory usage

Basically, ZFS uses lots of memory and most people run it on systems with large amounts of memory. NetBSD works well on systems with comparatively small amounts of memory. So a natural question is how well ZFS works on one's VAX with 2M of RAM :-) More seriously, one might ask if it is reasonable to run ZFS on an RPI3 with 1G of RAM, or if it is reasonable on a system with 4G.

The prevailing wisdom is more or less that ZFS consumes 1G plus 1G per 1T of disk. 32-bit architectures are viewed as too small to run ZFS.
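As a worked example of that rule of thumb (and nothing more than the rule of thumb), the arithmetic is simply 1G plus the disk size in T. The function name is made up for illustration.

```shell
# Rule-of-thumb RAM estimate from the text: 1G base plus 1G per 1T of disk.
# usage: zfs_ram_rule_of_thumb DISK_TERABYTES   (whole numbers only)
zfs_ram_rule_of_thumb() {
    disk_tb="$1"
    echo "$(( 1 + disk_tb ))"
}

zfs_ram_rule_of_thumb 4    # a 4T pool suggests about 5G of RAM
```

By this estimate even a very small pool wants about 1G, which is why the RPI3 case above is borderline.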

Besides RAM, zfs requires a kernel stack of at least 12KB; some operations overflow an 8KB kernel stack. On NetBSD, the architectures with a 16KB kernel stack are amd64, sparc64, powerpc, and experimental ia64, hppa. mac68k and sh3 have a 12KB kernel stack. All others use only an 8KB stack, which is not enough to run zfs.

NetBSD has many statistics provided via sysctl; see "sysctl kstat.zfs".
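Because the kstat.zfs tree is large, it can help to pull out a single value. The helper below is a sketch that parses "name = value" lines as printed by sysctl; the node name arcstats.size in the usage comment is an assumption and may not exist under that name on a given system.

```shell
# Read "name = value" lines on stdin and print the value of the first node
# whose name ends with the given suffix.
# usage: sysctl kstat.zfs | kstat_value SUFFIX
kstat_value() {
    awk -v suffix="$1" -F ' = ' '
        { n = length($1) - length(suffix) }
        n >= 0 && substr($1, n + 1) == suffix { print $2; exit }
    '
}

# Example (the node name is an assumption, see above):
#   sysctl kstat.zfs | kstat_value arcstats.size
```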

FreeBSD has tunables that NetBSD does not seem to have, described in FreeBSD Handbook ZFS Advanced section.

Interoperability with other systems

Modern ZFS uses pool version 5000 and feature flags.

It is in general possible to export a pool and then import the pool on some other system, as long as the other system supports all the used features.
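Before moving a pool, it can be useful to list which feature flags it actually uses, so they can be compared against what the target system supports. The sketch below filters the default output of "zpool get all"; the helper name and the pool name tank0 are only illustrative.

```shell
# Read "zpool get all POOL" output (NAME PROPERTY VALUE SOURCE columns) on
# stdin and print each feature flag that is not disabled, with its state.
pool_features() {
    awk '$2 ~ /^feature@/ && $3 != "disabled" {
        sub(/^feature@/, "", $2)
        print $2, $3
    }'
}

# Example (as root):
#   zpool get all tank0 | pool_features
#   zpool export tank0        # on the old system
#   zpool import              # on the new system: list importable pools
#   zpool import tank0        # then import by name
```

Features shown as "enabled" are available but not yet in use, and features shown as "active" have changed the on-disk format. Whether a particular target system can import the pool depends on that system's support for each listed feature.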

\todo Explain how to do this and what is known to work.

\todo Explain feature flags relationship to FreeBSD, Linux, illumos, macOS.

Sources of ZFS code

Currently, there are multiple ZFS projects and codebases:

OpenZFS is a coordinating project to align open ZFS codebases. There is a notion of a shared core codebase and OS-specific adaptation code.