Doctoral Speaking Skills Talk - Kaiyang Zhao April 28, 2025 10:00am — 11:00am Location: In Person - Traffic21 Classroom, Gates Hillman 6501 Speaker: KAIYANG ZHAO , Ph.D. Student, Computer Science Department, Carnegie Mellon University https://www.cs.cmu.edu/~kaiyang2/ Contiguitas: The Pursuit of Physical Memory Contiguity in Datacenters The unabating growth of the memory needs of emerging datacenter applications has exacerbated the scalability bottleneck of virtual memory. However, reducing the excessive overhead of address translation will remain onerous until the physical memory contiguity predicament gets resolved. To address this problem, this paper presents Contiguitas, a novel redesign of memory management in the operating system and hardware that provides ample physical memory contiguity. We identify that the primary cause of memory fragmentation in Meta's datacenters is unmovable allocations scattered across the address space that impede large contiguity from being formed. To provide ample physical memory contiguity by design, Contiguitas first separates regular movable allocations from unmovable ones by placing them into two different continuous regions in physical memory and dynamically adjusts the boundary of the two regions based on memory demand. Drastically reducing unmovable allocations is challenging because the majority of unmovable pages cannot be moved with software alone given that access to the page cannot be blocked for a migration to take place. Furthermore, page migration is expensive as it requires a long downtime to (a) perform TLB shootdowns that scale poorly with the number of victim TLBs, and (b) copy the page. To this end, Contiguitas eliminates the primary source of unmovable allocations by introducing hardware extensions in the last-level cache to enable the transparent and efficient migration of unmovable pages even while the pages remain in use. We build the operating system component of Contiguitas into the Linux kernel and run our experiments in a production environment at Meta's datacenters. Our results show that Contiguitas's OS component successfully confines unmovable allocations, drastically reducing unmovable 2MB blocks from an average of 31% scattered across the address space down to 7% confined in the unmovable region, leading to significant performance gains. Specifically, we show that for three major production services, Contiguitas achieves end-to-end performance improvements of 2-9% for partially fragmented servers, and 7-18% for highly fragmented servers, which account for nearly a quarter of Meta's fleet. We further use full-system simulations to demonstrate the effectiveness of the hardware extensions of Contiguitas. Our evaluation shows that Contiguitas-HW enables the efficient migration of unmovable allocations, scales well with the number of victim TLBs, and does not affect application performance. Presented in Partial Fulfillment of the CSD Speaking Skills Requirement Add event to Google Add event to iCal