Module 12: The untyped lambda calculus

This module is due at 1:15pm on Thursday, October 20.

The language

The language we will consider in this module has the following syntax:

<expr> ::=
  | <var>
  | <int>
  | <expr> '+' <expr>
  | '^' <var> '->' <expr>
  | <expr> <expr>
  | '(' <expr> ')'

It has integers and addition, just to give us something basic to compute with. It also has variables, and lambda expressions of the form ^x -> e, where x is a variable name and e is an expression (where x is bound in e). This represents a function which takes the parameter x as input and yields e as its output.1 Finally, our language has function application which is written like f x, just as in Haskell. Also as in Haskell, we consider function application to associate to the left, so f x y is the same as (f x) y. If you take away integers and addition, this language is known as the lambda calculus. (It turns out that you can encode integers and addition, and much else besides, using only functions and function application, but we won’t get into that!) It is a very simple model of a language with (one-argument) functions.

Parsing

> import Parsing2
> 
> lexer :: TokenParser u
> lexer = makeTokenParser emptyDef
> 
> parens :: Parser a -> Parser a
> parens = getParens lexer
> 
> identifier :: Parser String
> identifier = getIdentifier lexer
> 
> reservedOp :: String -> Parser ()
> reservedOp = getReservedOp lexer
> 
> whiteSpace :: Parser ()
> whiteSpace = getWhiteSpace lexer
> 
> integer :: Parser Integer
> integer = getInteger lexer
> 
> parseAtom :: Parser Expr
> parseAtom
>   =   undefined  <$> identifier
>   <|> undefined  <$> integer
>   <|> undefined          -- lambda
>   <|> parens parseExpr
> 
> parseExpr :: Parser Expr
> parseExpr = undefined    -- addition and application
> 
> expr :: Parser Expr
> expr = whiteSpace *> parseExpr <* eof

You can use this function for testing:

> p :: String -> Expr
> p s = case parse expr s of
>   Left err -> error (show err)
>   Right e  -> e

Semantics and substitution

The central idea behind the semantics for the lambda-calculus is what happens when a lambda is applied to an argument: in (^x -> e1) e2, the function ^x -> e1 is expecting an argument x, and it is applied to an argument e2, so (^x -> e1) e2 should reduce to e2 with e1 substituted for x. For example,

(^x -> x + x + 3) 6 --> 6 + 6 + 3 --> 15.

This is clear enough in theory, and one can go on to define a small-step interpreter for the lambda calculus along these lines, as you did in the previous module. However, it is very tricky to get substitution right! In fact, your implementation of subst in the previous module is almost certainly wrong, although it did not cause a problem because of the limited way we made use of it.

To see the problem, consider the following expression:

let x = 6 in (let y = x + 2 in (let x = 3 in x + y))

Now suppose we tried to evaluate this expression a different way, by first reducing the subexpression let y = x + 2 in (let x = 3 in x + y) before reducing the outer let.

One solution to this problem is to consistently rename things in order to avoid name conflicts. For example:

let x = 6 in (let y = x + 2 in (let x = 3 in x + y))
let x = 6 in (let y = x + 2 in (let q = 3 in q + y))
let x = 6 in (let q = 3 in q + (x + 2))
let x = 6 in 3 + (x + 2)
3 + (6 + 2)
3 + 8
11

This works, but is inefficient; there are also many other solutions to this problem as well!

A big-step interpreter

In any case, we are going to sidestep the whole substitution morass by taking a different approach. Instead of a small-step interpreter which relies on substitution, you will write a big-step interpreter. This has its own difficulties: in particular, since a big-step interpreter typically turns an expression into some sort of value, we have to decide what kind of values our interpreter can generate. Of course, integers should be one potential kind of value. But we also have functions to contend with.

The approach we will take today is to represent function values in our language as actual Haskell functions, of type Value -> Either InterpError Value. That is, a function value is something which takes a fully evaluated argument as input, and either produces a fully evaluated Value as a result, or crashes with an InterpError. Note that this only works since Haskell has first-class functions! (Another approach we will explore later, which works even in languages without first-class functions, is to use closures.)

Note that you will not be able to derive Show for your Value type, because it is impossible to inspect a function in any way other than by applying it to an argument. This is actually a problem, since it makes it difficult to generate good error messages or to display the output of the interpreter (using closures will solve this problem as well).

Feedback


  1. In Haskell, a backslash is used for lambda expressions (it is supposed to look similar-ish to a lower case Greek lambda (λ), the traditional notation). However, in our context it would be annoying to use a backslash since it also serves as an escape character in string literals, so to write something like \x -> \y -> x + y as a Haskell string you have to write "\\x -> \\y -> x + y".