
ZFS Interactions

exarkun edited this page Sep 10, 2014 · 1 revision

Overview

This is an in-progress document describing how and why Flocker's ZFS interactions might be changed from ad hoc execution of zfs(8) as a child process to an in-process Python library binding the libzfs_core C API.

Current State

Flocker 0.1's node software (flocker-reportstate, flocker-changestate, flocker-volume, etc.) interrogates ZFS to determine the current state of the system (e.g., which filesystems exist and which snapshots of them exist) and changes that state to support Flocker's volume functionality (e.g., creating a new filesystem or generating a data stream to replicate a filesystem). These operations are performed by running the zfs command line program with the appropriate arguments. Where information is being retrieved from the system, the program's output is also parsed.

Whether the child process is launched and tracked in a blocking or non-blocking way is arbitrary and essentially reflects what was expedient at the time that particular interaction was implemented.
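
As an illustration of the child-process approach described above, here is a minimal sketch. The function names are hypothetical and are not taken from Flocker's source. It assumes `zfs list -H`, which prints tab-separated fields without a header line.

```python
import subprocess


def parse_zfs_list(output):
    """Parse the output of ``zfs list -H -o name,used`` into (name, used) tuples."""
    rows = []
    for line in output.splitlines():
        if not line:
            continue
        # -H output separates fields with a single tab.
        name, used = line.split("\t")
        rows.append((name, used))
    return rows


def list_filesystems():
    """Run zfs(8) as a child process (blocking) and parse what it prints."""
    output = subprocess.check_output(
        ["zfs", "list", "-H", "-o", "name,used", "-t", "filesystem"],
        universal_newlines=True,
    )
    return parse_zfs_list(output)
```

Even in this small case the caller depends on the exact field order and separator of a text format.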

Problems

  • The ZFS command line is primarily oriented towards use by humans and neither its arguments nor its output are tailored for programmatic usage.
  • The diverging ways the child process is handled (primarily Popen vs spawnProcess) result in more complexity and code.
  • Performance is probably somewhat less than it could be (in practical terms the difference here may be negligible; it hasn't been measured yet).
  • The complexity of error handling is greatly increased because errors may be signaled by a special exit status, a human-readable error message, or - most commonly - both.
  • The complexity of testing is increased because of the breadth of the interface by which zfs is invoked (a list of bytes - there are many ways this can go wrong) and unit tests typically come down to the repetitive, low-value "If I tell you to do X then do you do X" style.
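
The error-handling problem in the list above can be sketched as follows. This is an illustrative wrapper, not Flocker's code; `CommandError` and `run` are hypothetical names. It shows that a caller receives both an exit status and free-form text, and must decide how to interpret each.

```python
import subprocess


class CommandError(Exception):
    """Raised when a child process exits with a non-zero status."""

    def __init__(self, argv, status, stderr):
        super().__init__("%r exited with %d: %s" % (argv, status, stderr))
        self.argv = argv
        self.status = status
        # Human-readable text; its wording is not a stable interface.
        self.stderr = stderr


def run(argv):
    """Run a command, returning its stdout or raising CommandError."""
    process = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    stdout, stderr = process.communicate()
    if process.returncode != 0:
        raise CommandError(argv, process.returncode, stderr.strip())
    return stdout
```

Telling "dataset does not exist" apart from "permission denied" would still require matching on the message text, which is the fragility an in-process API with structured error codes avoids.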

The Solution

Use an in-process library. Upstream OpenZFS is tinkering with a library called libzfs_core. This should eventually supersede libzfs, which upstream complains was never meant for public consumption and has an awful interface (whereas libzfs_core will have a "committed" interface).

libzfs_core is a C library. We will develop Python bindings for it (probably using cffi). Ideally we will contribute this upstream to OpenZFS where it will receive more visibility and where the pool of maintainers and developers is larger.
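
To give a rough idea of what such a binding involves, here is a sketch that uses the standard library's ctypes rather than cffi, purely so that the example has no third-party dependencies. It assumes the installed libzfs_core exports `libzfs_core_init`, `libzfs_core_fini` and `lzc_exists` with the signatures shown. `load_libzfs_core` and `dataset_exists` are hypothetical helper names.

```python
import ctypes
import ctypes.util


def load_libzfs_core():
    """Return a ctypes handle to libzfs_core, or None if it is not installed."""
    path = ctypes.util.find_library("zfs_core")
    if path is None:
        return None
    lib = ctypes.CDLL(path, use_errno=True)
    # Assumed C signatures:
    #   int libzfs_core_init(void);
    #   void libzfs_core_fini(void);
    #   boolean_t lzc_exists(const char *dataset);
    lib.libzfs_core_init.argtypes = []
    lib.libzfs_core_init.restype = ctypes.c_int
    lib.libzfs_core_fini.argtypes = []
    lib.libzfs_core_fini.restype = None
    lib.lzc_exists.argtypes = [ctypes.c_char_p]
    lib.lzc_exists.restype = ctypes.c_int
    return lib


def dataset_exists(lib, name):
    """Ask libzfs_core whether a dataset exists, without running a child process."""
    error = lib.libzfs_core_init()
    if error != 0:
        raise OSError(error, "libzfs_core_init failed")
    try:
        return bool(lib.lzc_exists(name.encode("utf-8")))
    finally:
        lib.libzfs_core_fini()
```

A real binding would initialize the library once rather than on every call, and would need to wrap the nvlist types that most other libzfs_core functions take.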

libzfs_core presents a blocking interface. Ultimately we will need to invoke it from multiple threads so that other logic can proceed concurrently (e.g., logic in support of a "node liveness" protocol) instead of every other operation on a node blocking for each ZFS interaction. libzfs_core is thread-safe. We may want to explore the idea of a non-blocking interface with upstream and work with them to design and implement one if they are amenable and the idea proves technically feasible.
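
A minimal sketch of the threading idea, using only the standard library: `blocking_zfs_call` is a hypothetical stand-in for a blocking libzfs_core call, and the event exists only to make the example deterministic. In a Twisted-based program the analogous tool would be `deferToThread`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def blocking_zfs_call(release):
    # Stand-in for a blocking library call: waits until told to finish.
    release.wait()
    return "snapshot created"


def run_concurrently():
    """Start a blocking call in a worker thread while the caller keeps working."""
    release = threading.Event()
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(blocking_zfs_call, release)
        # The calling thread is not blocked here; this is where other logic
        # (such as liveness heartbeats) could run while the call is in flight.
        still_in_flight = not future.done()
        release.set()
        result = future.result()
    return still_in_flight, result
```

The calling thread observes the call as still in flight and carries on until it chooses to collect the result.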

libzfs_core is not as featureful as the zfs command line. For example, libzfs_core cannot rename filesystems. We will contribute implementations of missing functionality upstream. Meanwhile Flocker can be ported to use the parts of libzfs_core that already exist and satisfy some requirement we have (in other words, this is an incremental process).
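
One way to picture the incremental port is a facade that prefers the library for operations it already supports and falls back to the command line for the rest. All class and method names below are hypothetical, and both backends are stubs that only record which path was taken.

```python
class LibraryBackend(object):
    """Stub for operations already available through libzfs_core."""

    def create(self, name):
        return ("library", "create", name)


class CommandLineBackend(object):
    """Stub for the existing zfs(8) child-process implementation."""

    def create(self, name):
        return ("command line", "create", name)

    def rename(self, old, new):
        return ("command line", "rename", old, new)


class IncrementalBackend(object):
    """Prefer the library backend; fall back to the CLI for missing operations."""

    def __init__(self, library, command_line):
        self._library = library
        self._command_line = command_line

    def __getattr__(self, name):
        # Only called for attributes not found on the instance itself.
        operation = getattr(self._library, name, None)
        if operation is None:
            operation = getattr(self._command_line, name)
        return operation


backend = IncrementalBackend(LibraryBackend(), CommandLineBackend())
```

As operations such as rename are added to the library backend, the facade picks them up without changes to its callers.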
