Daniel K. Blandford

Compact Data Structures with Fast Queries Degree Type: Ph.D. in Computer Science
Advisor(s): Guy Blelloch
Graduated: December 2005

Abstract:

Many applications dealing with large data structures can benefit from keeping them in compressed form. Compression has many benefits: it can allow a representation to fit in main memory rather than swapping out to disk, and it improves cache performance since it allows more data to fit into the cache. However, a data structure is only useful if it allows the application to perform fast queries (and updates) to the data.

This thesis describes compact representations of several types of data structures including variable-bit-length arrays and dictionaries, separable graphs, ordered sets, text indices, and meshes. All of the representations support fast queries; most support fast updates as well. Several structures come with strong theoretical results. All of the structures come with experimental results showing good compression results. The compressed data structures are usually close to as fast as their uncompressed counterparts, and sometimes are faster due to caching effects.

Thesis Committee:
Guy E. Blelloch (Chair)
Christos Faloutsos
Danny Sleator
Ian Munro (Waterloo)

Jeannette Wing, Head, Computer Science Department
Randy Bryant, Dean, School of Computer Science

Keywords:
Data compression, text indexing, meshing

CMU-CS-05-196.pdf (725.57 KB) ( 126 pages)
Copyright Notice