Douglas Clark

List Structure: Measurements, Algorithms, and Encodings Degree Type: Ph.D. in Computer Science
Advisor(s): Samuel Fuller
Graduated: August 1976

Abstract:

This thesis is about list structures: how they are used tn practice, how they can be moved .and copied efficiently, and howthey can .be represented by space-saving encodings. The approach taken to these subjects is mainly empirical. Measurement results are based on five large pl_ograms written in Interlisp, a sophisticated Lisp system that runs on the PDP-l O. Static data were collected at the end of typical runs of the programs, and all list structure used as data by them was measured (about 50,000 cells each). Strong regularities were discovered. In each program, about one-third of all cars pointed to lists, the rest mainly to literal atoms and small integers; roughly three-fourths of cdrs pointed to lists, the rest mainly to the atom NIL. List pointers generally pointed to a location physically nearby .in memory, a condition that appears to depend only on the sequential allocation of new list cells.

Atom pointers were distributed approximately according to Zipf's law, which models word occurrence tn natural ]anguage text. Less agreement was found among the programs when dynamic references to list structure were considered. During short runs of the programs, most of these references were due to the functions car and cdr, which occurred about equally often. Dynamic data type distributions only roughly corresponded to the static ones and were different from program to program= dynamic list pointer distances tended to be small, but did not closely match the static resultsl and the ' distributions of dynamic atom references were much more skewed than Zlpf"s law " would predict. Further measurements showed that repeated reference to cells !s w likely: half or more of all references were to one of the ten most recently referenced cells.

Thesis Committee:
Samuel Fuller (Chair)

Joseph Traub, Head, Computer Science Department