Implementing Programming Languages

An Introduction to Compilers and Interpreters
by Aarne Ranta, with an appendix coauthored by Markus Forsberg

Publication Details

Aarne Ranta, Implementing Programming Languages. An Introduction to Compilers and Interpreters, with an appendix coauthored by Markus Forsberg, College Publications, London, 2012. ISBN 978-1-84890-064-6

Publisher's web page

Where to buy

amazon.com $18.00

amazon.co.uk £11.00

adlibris.se SEK 163

cremona.se SEK 135

College Publications, Volume 16 in Computing series

Authors

Aarne Ranta, http://www.cse.chalmers.se/~aarne/

Markus Forsberg, http://spraakbanken.gu.se/personal/markus/

University of Gothenburg, Sweden

Purpose

This book is meant as a text book for Computer Science students, years 2 and later.

It is also usable as a self-study book, and to some extent, as a manual to the BNFC tool (BNF Converter).

The book is based on material created for the course Programming Language Technology at Chalmers University of Technology and University of Gothenburg, Sweden. The course has been given by the author yearly since 2007. The book also covers some material for the continuation course Compiler Construction, which the author taught in 2002 to 2006.

Description

Implementing a programming language means bridging the gap from the programmer's high-level thinking to the machine's zeros and ones. If this is done in an efficient and reliable way, programmers can concentrate on the actual problems they have to solve, rather than on the details of machines. But understanding the whole chain from languages to machines is still an essential part of the training of any serious programmer. It will result in a more competent programmer, who will moreover be able to develop new languages. A new language is often the best way to solve a problem, and less difficult than it may sound.

This book follows a theory-based practical approach, where theoretical models serve as blueprint for actual coding. The reader is guided to build compilers and interpreters in a well-understood and scalable way. The solutions are moreover portable to different implementation languages. Much of the actual code is automatically generated from a grammar of the language, by using the BNF Converter tool. The rest can be written in Haskell or Java, for which the book gives detailed guidance, but with some adaptation also in C, C++, C#, or OCaml, which are supported by the BNF Converter.

The main focus of the book is on standard imperative and functional languages: a subset of C++ and a subset of Haskell are the source languages, and Java Virtual Machine is the main target. Simple Intel x86 native code compilation is shown to completes the chain from language to machine. The last chapter leaves the standard paths and explores the space of language design ranging from minimal Turing-complete languages to human-computer interaction in natural language.

Table of Contents

Assignments

Here are preliminary versions of some assignments; full versions coming soon. In particular, the test suites and the code templates will be extended.

1. Parser for C++

2. Type Checker for CPP

3. Interpreter for CPP

4. Code Generator for CPP

5. Interpreter for Fun

6. A Special-Purpose Language

Slides

Chapter 1, Compilation Phases

Chapter 2, Grammars

Chapter 3, Lexing and parsing

Chapter 4, Type checking

Chapter 5, Interpreters

Chapter 6, Code generation

Chapter 7, Functional programming languages

Chapter 8, The language design space

Exercises

Electronic self-study exercises

A set of exercises from a related course, and solutions to some of them.

Code examples

CPP grammar (Chapter 2)

Query language (Chapter 8); also as a web demo

Links to software

BNFC

Precompiled Java tools (Cup and JLex)

Jasmin

GF

Links to on-line reference publications

Paul Carter, A Tutorial on PC assembly.

Paul Graham, essays

Mark Jones, Typing Haskell in Haskell

Xavier Leroy, Compiling Functional Languages

Simon Peyton Jones and David Lester, Implementing Functional Languages: a Tutorial,

Eric Raymond, The Art of Unix Programming,

R. Stallman, Using and Porting the GNU Compiler Collection

S. Thompson, Type Theory and Functional Programming

Errata

p. 10 (and also later): it is stated that Python is an untyped language. By this we mean that Python has no compile-time type checking. But it does have a run-time notion of types, known as dynamic typing.

p. 27: too many classes in the Java example have the name EAdd. Should be EAdd, EASub, EMul, EDiv.

p. 73: in checkExp code, if (typ2 = typ) should be if (typ2 == typ)

p. 92: in the last rule for ifeq L, the code pointer should become P+1 when v != 0

p. 170: the lin rules for TAll and TAny generate a bogus condition. The proper rules are:

    TAll kind = parenth ("\\p -> and [p x | x <-" ++ kind ++ "]") ;
    TAny kind = parenth ("\\p -> or  [p x | x <-" ++ kind ++ "]") ;

Please send your errata to Aarne Ranta!