International Conference on
Massive Storage Systems
and Technology

Sponsored by Santa Clara University,
School of Engineering

Since the conference was founded, in 1974, by the leading national laboratories, MSST has been a venue for massive-scale storage system designers and implementers, storage architects, researchers, and vendors to share best practices and discuss building and securing the world's largest storage systems for high-performance computing, web-scale systems, and enterprises.

Hosted at
Santa Clara University
Santa Clara, CA

The 37th MSST storage conference will be held the week of May 22, 2023 at Santa Clara University. As an initial resumption of the traditional in-person conference, MSST 2023 will hold three days of events, described below. We look forward to you joining us in Santa Clara this year, as we continue to prepare for a return to our full-week format in 2024—look for an announcement of dates and research-paper submission details early this summer.

The beautiful Santa Clara University campus

Subscribe to our email list for (infrequent) information along the way.


We understand that the economy is putting pressure on budgets, so we have attempted to keep registration costs low:
Tutorial/BoF (Monday)                     $90.00
Invited Talks (Tuesday and Wednesday)    $180.00
All three days (slightly discounted)     $260.00

These rates are available (and refundable) through April 30th.

Registration is open at:

Tutorials—Birds of a Feather, Monday, May 22

(Draft Schedule)

  7:30 — 8:30 Registration / Breakfast
  8:30 — 10:00 Large Scale File/Storage System Indexing with the Grand Unified File Index
Jason Lee, Dominic Manno, Gary Grider, LANL
As filesystems become larger and larger, management of the data stored in these filesystems becomes more and more complex and time consuming. Querying filesystem metadata with standard tools can take hours or even days due to the sheer amount of metadata to be processed and how standard tools go about processing the metadata. Some filesystems provide custom tools that assist in querying their metadata, but there is no unified set of tools to do so. Additionally, such tools are usually reserved for administrators, not users. The Grand Unified File Index (GUFI) solves all these issues and more. GUFI provides a set of highly performant, parallel tools that allow for complex queries to be applied to the metadata of arbitrary filesystems. The GUFI toolset can be used by both administrators and users without violating permission semantics. Indices from disconnected filesystems can be combined, allowing for queries across multiple filesystems at once. This session will provide a high-level overview of GUFI as well as a tutorial on using GUFI.
10:00 — 10:30 Break
10:30 — 12:00 Secure NFS and Unstructured Storage Solutions Including Multi-Level Security
RackTop Systems
12:00 — 1:00 Lunch
  1:00 — 2:30 OpenZFS New Features Including Direct/IO and Compression/Erasure Offloads
Brian Atkinson, LANL
Kelly Ursenbach, Eideticom
ZFS is an open source volume manager and file system that provides many built-in data integrity and data transformation features. We will discuss two projects that LANL has been working on, with industry partners, to improve overall ZFS performance with NVMe devices. The first project adds support for O_DIRECT in ZFS, reducing the number of memory copies that occur in the ZFS code path which, in turn, reduces memory bandwidth pressure. The ZFS Interface for Accelerators (Z.I.A.) project offloads CPU- and memory-bandwidth-intensive operations to local and remote computational storage devices.

We seek to get a sense of what operations ZFS users would like to see offloaded to computational storage devices other than the ones that have already been offloaded.
  2:30 — 3:00 Break
  3:00 — 4:30 Standards-Based Parallel Global File Systems and Automated Data Orchestration with NFS
David Flynn, Trond Myklebust, Douglas Fallstrom, Hammerspace
High-performance computing applications, web-scale storage systems, and modern enterprises increasingly need a data architecture that unifies data at the edge, in data centers, and in clouds. These organizations, with their massive-scale data requirements, need the performance of a parallel file system coupled with a standards-based solution that is easy to deploy on machines with diverse security and build environments.

Join this BoF to discuss:

Invited Track, Tuesday, May 23

(Draft Schedule)

  7:30 — 8:30 Registration / Breakfast
  8:30 — 9:30 Keynote
Data Impact on the Environment
Erik Riedel, Carnegie Mellon
  9:30 — 10:15 Lustre / Large Site Report
Lustre Metadata Writeback Caching
Oleg Drokin, Whamcloud
Storage and Data Management for Science at the Large Hadron Collider at CERN
Dr. Andreas-Joachim Peters, CERN IT Storage Group
10:15 — 10:45 Break
10:45 — 12:00 Storage Class Memory
Samsung’s CXL-Era Memory Expansion Devices with NAND Flash Media
Rekha Pitchumani, Samsung
Post Optane: What comes next for Storage Class Memory
Jongryool Kim, SK Hynix
Advanced Storage and Memory Hierarchy in AI and HPC with DAOS Storage
Andrey Kudryavtsev, Intel
12:00 — 1:00 Lunch
  1:00 — 3:00 Computational Storage
Accelerated Disks and Flashes: LANL's early experience in Speeding Up Analytics Workloads Using Smart Devices
Qing Zheng, LANL
NVMe/Computational Storage Standards Update
Alan Blumgarner, Solidigm
Computational Storage Solutions Over Fabrics for ZFS
Kelly Ursenbach, Eideticom
Current and future SSD architectures for Computational Storage
Ramdas Kamchare, Samsung
  3:00 — 3:30 Break
  3:30 — 4:30 Future Cloud
Hyperscale Perspectives on Storage
Ross Stenfort, Meta
Fifty Shades of S3: Navigating the Gray Areas of API Implementation
Gregory Touretsky, Seagate

Invited Track, Wednesday, May 24

(Draft Schedule)

  7:30 — 8:30 Registration / Breakfast
  8:30 — 9:30 CXL
CXL in Data Storage Applications
Siamak Tavallaei, Google
Rethinking Byte-Accessibility of SSDs from a CXL-Attached Memory and Storage System
John Kim, SK Hynix
  9:30 — 10:00 Break
10:00 — 12:00 Inexpensive Large Capacity Storage
Open Questions Regarding Future Archival and Backup Storage Systems
Bruce Montague, Veritas
Magnetic Tape Industry Roadmaps
Matt Ninesling, Spectra
Tape Technology Provider Talk
12:00 — 1:00 Lunch
  1:00 — 3:00 Failure at Scale
What Ten Years of Drive Stats Data Can Tell Us
Andy Klein, Backblaze
Analysis and Design Considerations of Multi-level Erasure Coding in Hierarchical Data Centers
Meng Wang, University of Chicago
MarFS as a Multi-Level Erasure Archive
Garrett Ransom, LANL
Improving Data Reliability in Exascale Storage Clusters
Saurabh Kadekodi, Google
  3:00 — 3:30 Break
  3:30 — 4:00 DNA Storage
DNA Storage Erasure Encoding
Dominic Manno, LANL
  4:00 — 4:15 Wrap-Up

2023 Organizers
Conference Chair     Prof. Ahmed Amer, SCU
Program Committee     Gary Grider, John Bent, Alex Parga
Industry Co-Chairs     Alex Parga, Adam Manzanares
Registration Chair     Prof. Shiva Jahangiri, SCU

Page Updated March 22, 2023