Doctoral Speaking Skills Talk - William Zhang

December 3, 2024, 12:00pm - 1:00pm
Location: In Person - Blelloch-Skees Conference Room, Gates Hillman 8115
Speaker: WILLIAM ZHANG, Ph.D. Student, Computer Science Department, Carnegie Mellon University
https://17zhangw.github.io/

The Holon Approach to Holistic Database Optimization

The optimal configuration of a database management system (DBMS) depends on its workload, database contents, and hardware, all of which change constantly over time. As the complexity and variety of tunable DBMS components grow, it is increasingly difficult for humans to reason about those changes. This necessitates automated machine learning agents to tune the DBMS.

The challenge is that a tuning agent must explore a large action space to construct and try out promising configurations while learning how to optimize for a given workload and objective function (e.g., minimize latency). As the tuner reasons across DBMS tunable aspects (e.g., knobs and indexes), the number of available combined actions grows combinatorially. To manage this explosion, existing techniques optimize each tunable aspect individually and in isolation from the others. They then attempt to compose the local optima discovered by each tuner into a holistic configuration. However, this process does not guarantee finding global optima.

Rather than composing bespoke tuners, we should use a holistic model to reason across multiple configuration spaces simultaneously (i.e., consider multiple tuning decisions at once). To manage the space's complexity, we make a critical insight: although the number of unique actions in the space is large, many share similar properties. This similarity, derived from performance estimates or domain knowledge, enables an agent to reduce the effective space by transferring observations from one action to similar actions.
With this holistic technique, which considers orders of magnitude more complex and varied tunable options, we achieve up to 53% workload reduction over state-of-the-art tuners for tuning PostgreSQL on analytical workloads. We will conclude the talk with a brief discussion of future directions and next steps.

Presented in Partial Fulfillment of the CSD Speaking Skills Requirement