Andrew W. Appel

Compile-time Evaluation and Code Generation for Semantics-directed Compilers Degree Type: Ph.D. in Computer Science
Advisor(s): Merrick Furst
Graduated: May 1985

Abstract:

A semantics-directed compiler translates programs into machine language using a formal specification of the semantics of the programming language.

The first programming languages designed for digital computers had syntax and semantics which were designed ad hoc , e.g. Fortran. The language Algol-60 had at least a formally-defined syntax, which was described in the notation of context-free grammars. This formal syntax definition turned out to be an advantage not only in that its human users could read it, but that it was possible to construct parsers automatically from the formal definition.

This stimulated interest in the formal definition of programming language semantics. An early and still useful technique was that of attribute-grammars Knuth 68 oot Knuth, Donald E., Semantics of context-free languages, Mathematical Systems Theory 2 2 :127-145, 1968. which associates with each non-terminal in the grammar a set of attributes . Attributes may be synthesized computed from the attributes of the symbols produced by the non-terminal or inherited computed from the attributes of the symbol producing the non-terminal . Attribute grammars are amenable to reasonably efficient automatic translation.

Unfortunately, attribute grammars without higher-order functions are not sufficiently powerful to represent the entire semantics of programming languages. They are best suited to representing those semantic aspects which "stick close to the parse tree" such as variable declarations.

Denotational semantics Scott 71 oot Scott, D.S. and Strachey, C., Towards a mathematical semantics for computer languages, in Proc. Symp. Computers and Automata , pages 19-46. Polytechnic Press, NY, 1971. , a more powerful formal description which is sufficiently powerful to represent any computable function. The been a lot of work done in automatic translation from denotational descriptions.

a program, between the source programming language typically a descendent of Algol and the machine language. The work in the field has consisted of finding elegant ways to represent programming-language constructs in expressions into the target language and sometimes the choices made in representation affect the code-generation techniques . However, it has generally been the case that it has been easier to define the into machine-code: code generation techniques lag behind formal definitions.

This thesis describes a code-generation technique designed to produce machine code for conventional von Neumann computers. It is powerful enough to handle the full semantics of languages like Pascal, and can use a reasonably natural denotational specification. On the other hand, this semantics-directed compilation technique is not magical: it is ill-suited to languages that have no natural implementation on von Neumann architectures.