File:  [NetBSD Developer Wiki] / wikisrc / zfs.mdwn
Revision 1.42
Thu Mar 25 22:59:39 2021 UTC (22 months, 1 week ago) by gdt
Branches: MAIN
CVS tags: HEAD
zfs: clarify rm/zil-commit issue

    1: # ZFS on NetBSD
    3: This page attempts to do two things: provide enough orientation and
    4: pointers to standard ZFS documentation for NetBSD users who are new to
    5: ZFS, and describe NetBSD-specific ZFS information.  It is
    6: emphatically not a tutorial or an introduction to ZFS.
    8: Many things are marked with \todo because they need a better
    9: explanation, and some have question marks.
   11: This HOWTO describes the most recent state of branches, and does not
   12: attempt to describe formal releases.  This is a clue; if you are using
   13: NetBSD 9 and ZFS, you should update along the branch.
   15: # Status of ZFS in NetBSD
   17: ## NetBSD 8
   19: NetBSD 8 has an old version of ZFS, and it is not recommended for use
   20: at all.  There is no evidence that anyone is interested in helping
   21: with ZFS on 8.  Those wishing to use ZFS on NetBSD 8 should therefore
   22: update to NetBSD 9.
   24: ## NetBSD 9
   26: NetBSD-9 has ZFS that is considered to work well.  There have been
   27: fixes since 9.0_RELEASE.  As always, people running NetBSD 9 are
   28: likely best served by the most recent version of the netbsd-9 stable
   29: branch.  As of 2021-03, ZFS in the NetBSD 9.1 release is very close to
   30: netbsd-9, except that the mkdir fix is newly in netbsd-9.
   32: There was a crash with mkdir over NFS with maproot, resolved in March
   33: 2021 in 9 and current.  See
   35: There is a workaround where removing a file will commit the ZIL
   36: (normally this would not be done), to avoid crashes due to vnode
   37: reclaims.  \todo Link to PR.
   39: There has been a report of an occasional panic somewhere in
   40: zfs_putpages.
   42: ## NetBSD-current
   44: NetBSD-current (as of 2021-03) has similar ZFS code to 9.
   46: There is initial support for [[ZFS root|wiki/RootOnZFS]], via booting
   47: from ffs and pivoting.
   49: ## NetBSD/xen special issues
   51: Summary: if you are using NetBSD, xen and zfs, use NetBSD-current.
   53: In NetBSD-9, MAXPHYS is 64KB in most places, but because of xbd(4) it
   54: is set to 32KB for XEN kernels.  Thus the standard zfs kernel modules
   55: do not work under xen.  In NetBSD-current, xbd(4) supports 64 KB
   56: MAXPHYS and this is no longer an issue.  Xen and zfs on current are
   57: reported to work well together, as of 2021-02.
   59: ## Architectures
   61: Most people seem to be using amd64.
   63: To build zfs, one puts MKZFS=yes in mk.conf.  This is the default on amd64
   64: and aarch64 on netbsd-9.  In current, it is also the default on sparc64.
   66: More or less, zfs can be enabled on an architecture when it is known
   67: to build and run reliably.  (Of course, users are welcome to build it
   68: and report.)
   70: # Quick Start
   72: See the [FreeBSD Quickstart
   73: Guide](; only
   74: the first item is NetBSD specific.
   76:   - Put zfs=YES in rc.conf.
   78:   - Create a pool as "zpool create pool1 /dev/dk0".
   80:   - Run df and see /pool1.
   82:   - Create a filesystem mounted on /n0 as "zfs create -o
   83:     mountpoint=/n0 pool1/n0".
   85:   - Read the documentation referenced in the next section.
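The pool and file system steps above can be sketched as a small shell function.  This is a minimal sketch, not an official procedure: the helper names `run` and `zfs_quickstart` and the `DRY_RUN` variable are invented for illustration, and by default the commands are only printed, so the device name can be checked before anything is written to disk.

```shell
#!/bin/sh
# run: print the command when DRY_RUN is 1 (the default), else execute it.
# (Hypothetical helper, not part of NetBSD.)
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# zfs_quickstart POOL DEVICE FSNAME MOUNTPOINT
# Mirrors the list above: create the pool, look at it with df,
# then create a file system with an explicit mountpoint.
zfs_quickstart() {
    pool=$1 dev=$2 fs=$3 mnt=$4
    run zpool create "$pool" "$dev"
    run df "/$pool"
    run zfs create -o "mountpoint=$mnt" "$pool/$fs"
}
```

Calling `zfs_quickstart pool1 /dev/dk0 n0 /n0` prints the commands from the list; the same call with `DRY_RUN=0`, as root, would execute them.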
   87: ## Documentation Pointers
   89: See the man pages for zfs(8), zpool(8).  Also see zdb(8), if only for
   90: seeing pool config info when run with no arguments.
   92:   - [OpenZFS Documentation](
   93:   - [OpenZFS admin docs index page](
   94:   - [FreeBSD Handbook ZFS Chapter](
   95:   - [Oracle ZFS Administration Manual](
   96:   - [Wikipedia](
   98: # NetBSD-specific information
  100: ## rc.conf
  102: The main configuration is to put zfs=YES in rc.conf, so that the rc.d
  103: scripts bring up ZFS and mount ZFS file systems.
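Because rc.conf is a shell fragment, the setting can be added without duplicating it on repeated runs.  A minimal sketch, assuming a hypothetical helper named `ensure_line` (it is not part of NetBSD):

```shell
#!/bin/sh
# ensure_line FILE LINE
# Append LINE to FILE unless an identical line is already present.
# (Hypothetical helper for illustration.)
ensure_line() {
    file=$1 line=$2
    # -x: match whole lines only, -F: treat LINE as a fixed string
    grep -qxF -e "$line" "$file" 2>/dev/null ||
        printf '%s\n' "$line" >> "$file"
}
```

As root, `ensure_line /etc/rc.conf zfs=YES` would add the setting once; calling it again leaves the file unchanged.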
  105: ## pool locations
  107: One can add disks or parts of disks into pools.  Ways to specify the
  108: areas to include are:
  110:   - entire disks (e.g., /dev/wd0d on amd64, or /dev/wd0 which has the same major/minor)
  111:   - disklabel partitions (e.g., /dev/sd0e)
  112:   - wedges (e.g., /dev/dk0)
  114: Information about created or imported pools is stored in
  115: /etc/zfs/zpool.cache.
  117: Conventional wisdom is that a pool that is more than 80% used gets
  118: unhappy; so far there is no NetBSD-specific wisdom to confirm or
  119: refute that.
  121: ## pool native blocksize mismatch
  123: ZFS attempts to find out the native blocksize for a disk when using it
  124: in a pool; this is almost always 512 or 4096.  Somewhere between 9.0
  125: and 9.1, at least some disks on some controllers that used to report
  126: 512 now report 4096.  This provokes a blocksize mismatch warning.
  128: Given that the native blocksize of the disk didn't change, and things
  129: seemed OK using the 512 emulated blocks, the warning is likely not
  130: critical.  However, rebuilding the pool with the 4096 blocksize is
  131: likely to result in better behavior because ZFS will then only try
  132: to do 4096-byte writes.  \todo Verify this and find the
  133: actual change and explain better.
  135: ## pool importing problems
  137: While one can "zpool create pool0 /dev/wd0f" and have a working pool, this
  138: pool cannot be exported and imported straightforwardly.  "zpool
  139: export" works fine, and deletes zpool.cache.  "zpool import", however,
  140: only looks at entire disks (e.g. /dev/wd0), and might look at wedges
  141: (e.g. /dev/dk0).  It does not look at partitions like /dev/wd0f, and
  142: there is no way on the command line to ask that specific devices be
  143: examined.  Thus, export/import fails for pools with disklabel
  144: partitions.
  146: One can make wd0 be a link to wd0f temporarily, and the pool will then
  147: be importable.  However, "wd0" is stored in zpool.cache and on the
  148: next boot ZFS will attempt to use it.  This is obviously not a good
  149: approach.
  151: One can mkdir e.g. /etc/zfs/pool0 and in it have a symlink to
  152: /dev/wd0f.  Then, zpool import -d /etc/zfs/pool0 will scan
  153: /etc/zfs/pool0/wd0f and succeed.  The resulting zpool.cache will have
  154: that path, but having symlinks in /etc/zfs/POOLNAME seems acceptable.
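The symlink-directory approach just described can be sketched as follows.  The function name `prepare_import_dir` is hypothetical; the paths are the ones used in the text, and passing the pool name to `zpool import` is ordinary zpool(8) usage.

```shell
#!/bin/sh
# prepare_import_dir DIR DEVICE
# Create DIR and put in it a symlink, named after the device node,
# that points at DEVICE.  (Hypothetical helper for illustration.)
prepare_import_dir() {
    dir=$1 dev=$2
    mkdir -p "$dir" || return 1
    ln -sf "$dev" "$dir/$(basename "$dev")"
}

# Intended use on a real system, as root:
#   prepare_import_dir /etc/zfs/pool0 /dev/wd0f
#   zpool import -d /etc/zfs/pool0 pool0
```

Keeping one directory per pool under /etc/zfs means the path recorded in zpool.cache stays valid across reboots, unlike the temporary wd0 link.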
  156: \todo Determine a good fix, perhaps man page changes only, fix it
  157: upstream, in current, and in 9, before removing this discussion.
  159: ## mountpoint conventions
  161: By default, datasets are mounted as /poolname/datasetname.  One can
  162: also set a mountpoint; see zfs(8).
  164: There does not appear to be any reason to choose explicit mountpoints
  165: vs the default (and either using data in place or symlinking to it).
  167: ## mount order
  169: NetBSD 9 mounts other file systems and then ZFS file systems.  This can
  170: be a problem if /usr/pkgsrc is on ZFS and /usr/pkgsrc/distfiles is on
  171: NFS.  A workaround is to use noauto and do the mounts in
  172: /etc/rc.local.
  174: NetBSD current after 20200301 mounts ZFS first.  The same issues and
  175: workarounds apply in different circumstances.
  177: ## NFS
  179: zfs filesystems can be exported via NFS, simply by placing them in
  180: /etc/exports like any other filesystem.
  182: The "zfs share" command adds a line to /etc/zfs/exports for each
  183: filesystem with the sharenfs property set, and "zfs unshare" removes
  184: it.  This file is ignored on NetBSD-9 and current before 20210216; on
  185: current after 20210216 those filesystems should be exported (assuming
  186: NFS is enabled).  It does not appear to be possible to set options
  187: like maproot and network restrictions via this method.
  189: On current before 20210216, a remote mkdir on a filesystem exported with
  190: -maproot=0:10 causes a kernel NULL pointer dereference.  This is now
  191: fixed.
  193: ## zvol
  195: Within a ZFS pool, the standard approach is to have file systems, but
  196: one can also create a zvol, which is a block device of a certain size.
  198: As an example, "zfs create -V 16G tank0/xen-netbsd-9-amd64" creates a
  199: zvol (intended to be a virtual disk for a domU).
  201: The zvol in the example will appear as
  202: /dev/zvol/rdsk/tank0/xen-netbsd-9-amd64 and
  203: /dev/zvol/dsk/tank0/xen-netbsd-9-amd64 and can be used like a
  204: disklabel partition or wedge.  However, the system will not read
  205: disklabels and gpt labels from a zvol.
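The mapping from dataset name to device nodes is mechanical, so it can be sketched in shell.  The function name `zvol_paths` is an invention for illustration; the path layout is the one given above.

```shell
#!/bin/sh
# zvol_paths DATASET
# Print the raw and block device nodes for a zvol, in that order.
# (Hypothetical helper; it only formats names and touches no devices.)
zvol_paths() {
    ds=$1
    printf '/dev/zvol/rdsk/%s\n/dev/zvol/dsk/%s\n' "$ds" "$ds"
}

# Intended use on a real system, as root:
#   zfs create -V 16G tank0/xen-netbsd-9-amd64
#   zvol_paths tank0/xen-netbsd-9-amd64
```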
  207: Doing "swapctl -a" on a zvol device node fails.  \todo Is it really
  208: true that NetBSD can't swap on a zvol?  (When using a zvol for swap,
  209: standard advice is to avoid the "-s" option which avoids reserving the
  210: allocated space.  Standard advice is also to consider using a
  211: dedicated pool.)
  213: \todo Explain that one can export a zvol via iscsi.
  215: One can use ccd to create a normal-looking disk from a zvol.  This
  216: allows reading a GPT label from the zvol, which is useful in case the
  217: zvol had been exported via iscsi and some other system created a
  218: label.
  220: # Memory usage
  222: Basically, ZFS uses lots of memory and most people run it on systems
  223: with large amounts of memory.  NetBSD works well on systems with
  224: comparatively small amounts of memory.  So a natural question is how
  225: well ZFS works on one's VAX with 2M of RAM :-) More seriously, one
  226: might ask if it is reasonable to run ZFS on an RPI3 with 1G of RAM, or
  227: if it is reasonable on a system with 4G.
  229: The prevailing wisdom is more or less that ZFS consumes 1G plus 1G per
  230: 1T of disk.  32-bit architectures are viewed as too small to run ZFS.
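As a worked example of that rule of thumb, the arithmetic can be written as a one-line shell function.  The name `zfs_ram_estimate_gb` is hypothetical, and the result is only the folk estimate quoted above, not a measured requirement.

```shell
#!/bin/sh
# zfs_ram_estimate_gb DISK_TB
# Rule of thumb from the text: 1G baseline plus 1G per 1T of disk.
# DISK_TB must be a whole number of terabytes.
zfs_ram_estimate_gb() {
    echo $(( 1 + $1 ))
}
```

For a 4T pool, `zfs_ram_estimate_gb 4` prints 5, so by this estimate a 4G system is already below the suggested amount.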
  232: Besides RAM, zfs requires that the architecture's kernel stack size be at
  233: least 12KB -- some operations cause stack overflow with 8KB
  234: kernel stack. On NetBSD, the architectures with 16KB kernel stack are
  235: amd64, sparc64, powerpc, and experimental ia64, hppa. mac68k and sh3
  236: have 12KB kernel stack. All others use only 8KB stack, which is not
  237: enough to run zfs.
  239: NetBSD has many statistics provided via sysctl; see "sysctl
  240: kstat.zfs".
  242: FreeBSD has tunables that NetBSD does not seem to have, described in
  243: [FreeBSD Handbook ZFS Advanced
  244: section](
  246: # Interoperability with other systems
  248: Modern ZFS uses pool version 5000 and feature flags.
  250: It is in general possible to export a pool and then import the pool on
  251: some other system, as long as the other system supports all the used
  252: features.
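One way to see which features a pool actually uses before moving it is to filter the output of `zpool get all`.  This is a sketch under assumptions: the function name `active_features` is hypothetical, and it assumes the usual four-column output (NAME, PROPERTY, VALUE, SOURCE), in which feature flags appear as `feature@name` with a value of disabled, enabled or active.

```shell
#!/bin/sh
# active_features
# Read "zpool get all POOL" output on stdin and print the names of
# the feature flags whose value is "active".
# (Hypothetical helper for illustration.)
active_features() {
    awk '$2 ~ /^feature@/ && $3 == "active" {
        sub(/^feature@/, "", $2)
        print $2
    }'
}

# Intended use on a real system:
#   zpool get all pool1 | active_features
#   zpool export pool1        # then, on the other system:
#   zpool import pool1
```

The resulting list can be compared with the features the other system documents as supported.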
  254: \todo Explain how to do this and what is known to work.
  256: \todo Explain feature flags relationship to FreeBSD, Linux, illumos,
  257: macOS.
  259: # Sources of ZFS code
  261: Currently, there are multiple ZFS projects and codebases:
  263:   - [OpenZFS](
  264:   - [openzfs repository](
  265:   - [zfsonlinux](
  266:   - [OpenZFS on OS X ]( [repo](
  267:   - proprietary ZFS in Solaris (not relevant in open source)
  268:   - ZFS as released under the CDDL (common ancestor, now of historical interest)
  270: OpenZFS is a coordinating project to align open ZFS codebases.  There
  271: is a notion of a shared core codebase and OS-specific adaptation code.
  273:   - [zfsonlinux relationship to OpenZFS](
  274:   - FreeBSD more or less imports code from openzfs and pushes back fixes. \todo Verify this.
  275:   - NetBSD has imported code from FreeBSD.
  276:   - The status of ZFS on macOS is unclear (2021-02).
