36th International Conference
on Massive Storage Systems
and Technology (MSST 2020)
May 4–8, 2020

Sponsored by Santa Clara University,
School of Engineering

Since the conference was founded, in 1974, by the leading national laboratories, MSST has been a venue for massive-scale storage system designers and implementers, storage architects, researchers, and vendors to share best practices and discuss building and securing the world's largest storage systems for high-performance computing, web-scale systems, and enterprises.

Hosted at

Santa Clara University
Santa Clara, CA

ZUFS: Ultra Low-Latency Filesystem Development in User Space

Sagi Manole, NetApp

Emerging persistent memory (PM) devices, such as Intel Optane DCPMM, operate at near-memory speed, and thus require novel filesystems and kernel-to-userspace bridges that are optimized for extremely low latencies. We present such a zero-copy bridge, called ZUFS, and two filesystems that use it: the open-source PMFS2 and the feature-rich NetApp MAX FS.

ZUFS was architected for synchronous, zero-copy, direct-mapped access from the application to the user-space filesystem and vice versa. It leverages CPU core locality and direct memory mapping to achieve a latency of several microseconds for a persistent write through the software stack to persistent media and back. ZUFS's efficient, modern direct-mapped access may also be used by other user-space services and is not limited to filesystems.

ZUFS interfaces closely follow the Linux VFS API. This keeps the shim layer minimal and makes it convenient for kernel filesystem developers to use. ZUFS relies on VFS locking mechanisms for some system calls, such as directory modifications. Otherwise, ZUFS implements a relatively thin locking infrastructure, leaving the user-space filesystem freedom of implementation.

This talk describes two filesystems developed in user space on top of ZUFS, each with its own internal data structures, locking mechanisms, and limitations. NetApp MAX FS is a commercial filesystem that supports snapshots, auto-tiering, mmap, mirroring, and other advanced data services at low latency. PMFS2 is an open-source filesystem derived from the older in-kernel pmfs implementation. Both ZUFS-based filesystems were measured at more than a million random IOPS with an average latency under ten microseconds.

Sagi Manole is the engineering manager for the MAX Data File System team and ZUFS at NetApp. His responsibilities are to develop the product and act as a co-maintainer of the ZUFS layer.

Sagi brings a rich development background to NetApp and has over 10 years of experience in the storage industry. He was a senior SW engineer and co-architect at Plexistor and joined NetApp as part of the Plexistor acquisition. Prior to Plexistor, Sagi held a variety of SW engineering roles at Primary Data and IBM Research Lab. In addition to his work in the business, Sagi holds over 20 patents and publications in prestigious engineering journals.

Sagi received his B.Sc. in EE and his M.Sc. in Computer Engineering from Tel Aviv University. His thesis involved workload optimization of proteomics pattern matching using an embedded accelerator.

Page Updated March 20, 2020