Vijay R. Vasudevan

Energy-efficient Data-intensive Computing with a Fast Array of Wimpy Nodes

Degree Type: Ph.D. in Computer Science
Advisor(s): David Andersen
Graduated: December 2011

Abstract:

Large-scale data-intensive computing systems have become a critical foundation for Internet-scale services. Their widespread growth during the past decade has raised datacenter energy demand and created an increasingly large financial burden and scaling challenge: Peak energy requirements today are a significant cost of provisioning and operating datacenters. In this thesis, we propose to reduce the peak energy consumption of datacenters by using a FAWN: A Fast Array of Wimpy Nodes. FAWN is an approach to building datacenter server clusters using low-cost, low-power servers that are individually optimized for energy efficiency rather than raw performance alone. FAWN systems, however, have a different set of resource constraints than traditional systems that can prevent existing software from reaping the improved energy efficiency benefits FAWN systems can provide.

This dissertation describes the principles behind FAWN and the software techniques necessary to unlock its energy efficiency potential. First, we present a deep study into building FAWN-KV, a distributed, log-structured key-value storage system designed for an early FAWN prototype. Second, we present a broader classification and workload analysis showing when FAWN can be more energy-efficient and under what workload conditions a FAWN cluster would perform poorly in comparison to a smaller number of high-speed systems. Last, we describe modern trends that portend a narrowing gap between CPU and I/O capability and highlight the challenges endemic to all future balanced systems. Using FAWN as an early example, we demonstrate that pervasive use of "vector interfaces" throughout distributed storage systems can improve throughput by an order of magnitude and eliminate the redundant work found in many data-intensive workloads.

Thesis Committee:
David G. Andersen (Chair)
Gregory R. Ganger
Garth A. Gibson
Luiz A. Barroso (Google)
Michael E. Kaminsky (Intel Labs)

Jeannette Wing, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Keywords:
Energy Efficiency, Low Power, Cluster Computing, Flash

CMU-CS-11-131.pdf (1.2 MB) (154 pages)