Joy Arulraj

The Design and Implementation of a Non-Volatile Memory Database Management System
Degree Type: Ph.D. in Computer Science
Advisor(s): Andy Pavlo
Graduated: August 2018

Abstract:

This dissertation explores the implications of non-volatile memory (NVM) for database management systems (DBMSs). The advent of NVM will fundamentally change the dichotomy between volatile memory and durable storage in DBMSs. These new NVM devices are almost as fast as volatile memory, but all writes to them are persistent even after power loss. Existing DBMSs are unable to take full advantage of this technology because their internal architectures are predicated on the assumption that memory is volatile. With NVM, many of the components of legacy DBMSs are unnecessary and will degrade the performance of data-intensive applications.

We present the design and implementation of DBMS architectures that are explicitly tailored for NVM. The dissertation focuses on three aspects of a DBMS: (1) logging and recovery, (2) storage and buffer management, and (3) indexing. First, we present a logging and recovery protocol that enables the DBMS to support near-instantaneous recovery. Second, we propose a storage engine architecture and buffer management policy that leverage the durability and byte-addressability properties of NVM to reduce data duplication and data migration. Third, the dissertation presents the design of a range index tailored for NVM that is latch-free yet simple to implement. Altogether, the work described in this dissertation illustrates that rethinking the fundamental algorithms and data structures employed in a DBMS for NVM improves performance and availability, reduces operational cost, and simplifies software development.

Thesis Committee:
Andrew Pavlo (Chair)
Todd Mowry
Greg Ganger
Samuel Madden (Massachusetts Institute of Technology)
Donald Kossmann (Microsoft Research)

Srinivasan Seshan, Head, Computer Science Department
Andrew W. Moore, Dean, School of Computer Science

Keywords:
Non-Volatile Memory, Database Management System, Logging and Recovery, Storage Management, Buffer Management, Indexing