ZUFS: Ultra Low-Latency Filesystem Development in User Space
Sagi Manole, NetApp
Emerging persistent memory (PM) devices, such as Intel Optane DCPMM, operate at near-memory speed and thus require novel filesystems and kernel-to-userspace bridges that are optimized for extremely low latency. We present such a zero-copy bridge, called ZUFS, and two filesystems that use it: the open-source PMFS2 and the feature-rich NetApp MAX FS.
ZUFS was architected for synchronous, zero-copy, direct-mapped access from the application to the user space filesystem and vice versa. It leverages CPU core locality and direct memory mapping to achieve a latency of several microseconds for a persistent write through the software stack to persistent media and back. ZUFS's efficient, modern direct-mapped access may also be used by other user-space services and is not limited to filesystems.
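To put the persistent-write latency figure in context, the following is a minimal sketch of how an application could time a synchronous write through whatever filesystem backs a given path. It does not use any ZUFS-specific interface; it relies only on standard POSIX calls exposed by Python (`os.pwrite`, `os.fsync`). The function names, the `probe.dat` file name, and the 4 KiB block size are assumptions made for illustration, and Python's interpreter overhead means the numbers are only indicative.

```python
import os
import tempfile
import time


def measure_sync_write_latency(path, block_size=4096, iterations=100):
    """Return per-operation latencies in seconds for write-then-fsync.

    Each sample covers one pwrite() followed by one fsync(), i.e. the time
    until the filesystem reports the data as persistent.
    """
    buf = b"\0" * block_size
    latencies = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for i in range(iterations):
            start = time.perf_counter()
            os.pwrite(fd, buf, i * block_size)  # write at an explicit offset
            os.fsync(fd)                        # wait for persistence
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
    return latencies


def run_demo(iterations=100):
    """Run the probe against a temporary directory (hypothetical target)."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "probe.dat")
        return measure_sync_write_latency(target, iterations=iterations)


if __name__ == "__main__":
    samples = sorted(run_demo())
    median_us = samples[len(samples) // 2] * 1e6
    print(f"median write+fsync latency: {median_us:.1f} us")
```

To probe a particular filesystem, pass a path on its mount point to `measure_sync_write_latency` instead of using the temporary directory.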
ZUFS interfaces closely follow the Linux VFS API. This keeps the shim layer minimal and makes ZUFS convenient for kernel filesystem developers to use. ZUFS relies on VFS locking mechanisms for some system calls, such as directory modifications. Otherwise, ZUFS implements a relatively thin locking infrastructure, leaving the user space filesystem free to choose its own implementation.
This talk describes two filesystems developed in user space on top of ZUFS, each with its own internal data structures, locking mechanisms, and limitations. NetApp MAX FS is a commercial filesystem that supports snapshots, auto-tiering, mmap, mirroring, and other advanced data services at low latency. PMFS2 is an open-source filesystem derived from the older in-kernel pmfs implementation. Both ZUFS-based filesystems were measured to sustain more than a million random IOPS with an average latency under ten microseconds.
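The two headline figures can be related through Little's law (average operations in flight = throughput x average latency). The sketch below is only back-of-the-envelope arithmetic on the quoted bounds; the abstract does not state the queue depth or thread count used in the measurements, and the function names are hypothetical.

```python
def implied_concurrency(iops, avg_latency_s):
    """Little's law: average number of operations in flight."""
    return iops * avg_latency_s


def per_thread_iops(avg_latency_s):
    """Upper bound on IOPS for one thread issuing synchronous I/O."""
    return 1.0 / avg_latency_s


if __name__ == "__main__":
    iops = 1_000_000     # "more than a million random IOPS"
    latency_s = 10e-6    # "under ten microseconds", taken at its upper bound

    print(f"operations in flight: {implied_concurrency(iops, latency_s):.1f}")
    print(f"max IOPS per synchronous thread: {per_thread_iops(latency_s):,.0f}")
```

At those bounds, roughly ten operations are in flight on average, which is consistent with a synchronous design that spreads requests across CPU cores rather than relying on deep asynchronous queues.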
Sagi Manole is the engineering manager for the MAX Data File System team and ZUFS at NetApp. His responsibilities are to develop the product and act as a co-maintainer of the ZUFS layer.
Sagi brings a rich development background to NetApp and has over 10 years of experience in the storage industry. He was the senior SW engineer and co-architect at Plexistor and joined NetApp as part of the Plexistor acquisition. Prior to Plexistor, Sagi held a variety of SW engineering roles at Primary Data and IBM Research Lab. In addition to his work in the business, Sagi holds over 20 patents and publications in prestigious engineering journals.
Sagi received his B.Sc. in EE and his M.Sc. in Computer Engineering from Tel Aviv University. His thesis involved workload optimization of proteomics pattern matching using an embedded accelerator.