Module 14: Type checking first-class functions

This module is due on Tuesday, November 1 at 1:15pm.

Write your team names here:

In this module we will continue exploring how to type check a language with first-class functions. This language, together with its type system, is known as the simply-typed lambda calculus, or STLC¹ (extended with integers and addition).

Pick an initial driver and write their name here:
You should begin by copying in your code from the previous module, including the interpreter and the parser extended with optional type annotations on lambda expressions.

Types for lambdas and applications

You already know how to typecheck numeric literals, arithmetic operations such as addition, and variables; the things we need to focus on are the language features specifically having to do with functions, namely, lambdas and applications. Let’s think more carefully about how to give these things types. In answering the following questions you may want to refer back to the examples in the previous module for inspiration.

In the following, the letter e will stand for any expression, and T will stand for any type.

For a function application e1 e2 to be well-typed,
- what must be true about the type of e1?
- what must be true about the type of e2?
- In general, given e1 e2, how can you figure out what type it should have?
Consider the lambda expression ^x [T] -> e, where the variable x is annotated with a type T, and e is some expression. What type should it have?
Consider a lambda expression ^x -> e (without a type annotation on x). Suppose someone tells you that this lambda expression is supposed to have a particular type T. Let’s try a few examples first:
- Does ^x -> 3 have type Int?
- Does ^x -> 3 have type (Int -> Int) -> Int?
- Does ^x -> x + 2 have type (Int -> Int) -> Int?
- Does ^x -> (^y -> 3) have type Int -> (Int -> Int)?
Now generalize the previous examples. How can you check whether a lambda expression ^x -> e has type T? (Hint: what must be true of T? Assuming T is OK, how will you check that T is a valid type for ^x -> e?)

So we know how to infer the type of an application or a lambda with a type annotation, and how to check the type of a lambda without a type annotation.

What about inferring the type of a lambda without a type annotation? As you saw in the previous module, this can sometimes be done, but it is tricky. For example, it is possible to infer that (^x -> (^f -> f x + x)) has type Int -> (Int -> Int) -> Int, but this requires some nontrivial deduction. For the purposes of this module, we are not going to do this. Instead, we will just throw up our hands and refuse to infer the type of a lambda without a type annotation. (However, if you are interested to know how this “nontrivial deduction” (called type reconstruction) works in general, I would be happy to explain it. In fact, implementing this for some nontrivial language could be a nice part of a final project.)

A type checker for STLC

ROTATE ROLES and write the name of the new driver here:
Use the above insights to implement a type checker. Some notes:
- Again, you should write two mutually recursive functions,
```
infer :: Ctx -> Expr -> Either TypeError Type
```
  and
```
check :: Ctx -> Expr -> Type -> Either TypeError ()
```
  (your functions might have slightly different types depending on how you decide to set things up).
- In previous type checkers we have written, the check function was fairly trivial: all it did was infer the type of the given expression and make sure it matched the given type. This time, however, check will be a bit more interesting: it should first see if it is given a lambda with no type annotation, and do something appropriate. In all other cases, it should fall back to inferring the type and making sure it matches the given type, as before.
  
  (As an aside, the reason we structured our previous type checkers using both an infer function and a check function was for exactly this reason: more complex languages tend to have some expressions whose types can be checked but not inferred, so both the check and infer functions do nontrivial work. Structuring a type system in this way, with mutually recursive check and infer functions, is called bidirectional type checking.)
- You will need to create a TypeError type and populate it with appropriate constructors representing the various possible sorts of errors your type checker may encounter.

Updating the interpreter

ROTATE ROLES and write the name of the new driver here:

Now that we have a type checker, the interpreter doesn’t have to do so much work. However, unlike with previous interpreters, we can’t really get rid of the Value type. When we interpreted Arith with booleans and integers, for example, we could just represent all values as integers; but now we have integers and closures, which are so different that there’s no appropriate type we can use to easily represent both. So your interpreter will still need to pattern-match on Values. Nonetheless, it can still be simplified quite a bit:

If an expression type checks, interpreting it should never cause a runtime error: type errors and unbound variables are of course impossible, and this language doesn’t have division by zero either. So you should change the type of your interpreter to just return a Value instead of an Either InterpError Value.
If you see, for example, a Plus operation, you can assume that the Values of the two subexpressions will be integers. You do not even need to include cases for the other possibilities (or, if you like, you can have a catch-all case that simply calls error with a panicked message exhorting the user to report a bug in your type checker).
Finally, if you have not already, update an appropriate eval :: String -> IO () function which accepts an expression as input and prints either a parse error, a type error, or the final value resulting from interpreting a correctly type-checked expression.
Here are some examples you can use to test your implementation:
- 3 should evaluate to 3.
- 4 5 should be a type error.
- (^x -> 3) 4 should be a type error, since we can’t infer the type of the lambda.
- (^x [Int] -> 3) 4 should evaluate to 3.
- (^x [Int -> Int] -> 3) 4 should be a type error.
- (^x [Int] -> x+2) (^x -> 5) should be a type error.
- x + 3 should give an undefined variable error (not a type error).
- (^f [Int] -> f x + f 2) 5 should give an undefined variable error or a type error (either one is OK).
- (^f [Int] -> f 1 + f 2) 5 should be a type error.
- (^f [Int -> Int] -> f 1 + f 2) 5 should be a type error.
- (^f [Int -> Int] -> f 1 + f 2) (^x [Int] -> x + 8) should evaluate to 24.
- (^f [Int -> Int] -> f 1 + f 2) (^x -> x + 8) should also evaluate to 24. It should not be a type error.
- (^f : (Int -> Int) -> (Int -> Int). f (^y. y + 10) 6) (^g. ^x. g (g (g x))) should evaluate to 36.

Correcting a few issues

ROTATE ROLES and write the name of the new driver here:
What is the type of show?
What is the result of show "hello"?
What is the result of show (show "hello")?
Can you explain what is going on in the above results?
What is the type of putStrLn?
What is the type of print?
Just by looking at their types, make a guess as to the difference between these two functions.
Test your guess by trying things like
- putStrLn 3
- putStrLn (show 3)
- print 3
- putStrLn "hello"
- putStrLn (show "hello")
- print "hello"
Explain when you should use print and when you should use putStrLn.

Explain what is wrong with this hypothetical code:

case interp M.empty e of
  Right v  -> print (showValue v)
  Left err -> print (showError err)

How would you fix it?
Explain what is wrong with this hypothetical code:
```
case interp env e of
  Right (IntValue i) -> doStuff i
  _                  -> Left $ NotAnIntError e
```
(Hint: if we expect to see an int value and the Expr is actually a variable “x”, where x is not in scope, what error should be generated? What error would actually be generated by the above code?)
How would you fix it?
Go back and make sure your module is free of these issues.

Lambda calculus in the real world

Take a look at this page from the GHC wiki. Haskell syntax is complicated, but it all ultimately gets desugared to this very simple data type Expr.

Which parts of Expr look familiar? Which are unfamiliar?
Go learn something about one of the unfamiliar parts, and explain it here.
Take a look at this function from the GHC source. You won’t understand a lot of it, but that’s OK. What is this function doing?

Feedback

How long would you estimate that you spent working on this module?
Were any parts particularly confusing or difficult?
Were any parts particularly fun or interesting?
Record here any other questions, comments, or suggestions for improvement.

I always get STLC and SLTC mixed up.↩