Module 08: Variables

This module will be due at 1:15pm on Tuesday, October 3.

In this module, we will explore how to support a language with variables.

We’ll start with the familiar Arith language.

> {-# LANGUAGE GADTs #-}
> 
> import Prelude hiding ((<$>), (<$), (<*>), (<*), (*>))
> import Parsing
> 
> data Arith where
>   Lit :: Integer -> Arith
>   Bin :: Op -> Arith -> Arith -> Arith
>   deriving (Show)
> 
> data Op where
>   Plus  :: Op
>   Minus :: Op
>   Times :: Op
>   deriving (Show, Eq)
> 
> interpArith :: Arith -> Integer
> interpArith (Lit i)           = i
> interpArith (Bin Plus e1 e2)  = interpArith e1 + interpArith e2
> interpArith (Bin Minus e1 e2) = interpArith e1 - interpArith e2
> interpArith (Bin Times e1 e2) = interpArith e1 * interpArith e2
> 
> lexer :: TokenParser u
> lexer = makeTokenParser $ emptyDef
>   { reservedNames = ["let", "in"] }
>     -- tell the lexer that "let" and "in" are reserved keywords
>     -- which may not be used as variable names
> 
> parens :: Parser a -> Parser a
> parens     = getParens lexer
> 
> reservedOp :: String -> Parser ()
> reservedOp = getReservedOp lexer
> 
> reserved :: String -> Parser ()
> reserved = getReserved lexer
> 
> integer :: Parser Integer
> integer    = getInteger lexer
> 
> whiteSpace :: Parser ()
> whiteSpace = getWhiteSpace lexer
> 
> identifier :: Parser String
> identifier = getIdentifier lexer
> 
> parseArithAtom :: Parser Arith
> parseArithAtom = (Lit <$> integer) <|> parens parseArith
> 
> parseArith :: Parser Arith
> parseArith = buildExpressionParser table parseArithAtom
>   where
>     table = [ [ Infix (Bin Times <$ reservedOp "*") AssocLeft ]
>             , [ Infix (Bin Plus  <$ reservedOp "+") AssocLeft
>               , Infix (Bin Minus <$ reservedOp "-") AssocLeft
>               ]
>             ]
> 
> arith :: Parser Arith
> arith = whiteSpace *> parseArith <* eof
> 
> eval :: String -> Maybe Integer
> eval s = case parse arith s of
>   Left _  -> Nothing
>   Right e -> Just (interpArith e)

We will now add variables to this language. In particular, we’re going to add let-expressions, which look something like this:

 >>> let x = 4*12 in x*x + 3-x
 2259

This locally defines x as a name for the value of 4*12 within the expression x*x + 3-x. Substituting 48 for each occurrence of x and evaluating the result yields the final value of 2259.

Haskell has let-expressions too, so you can try typing the above expression at a GHCi prompt.

Syntax

We need to make two changes to the syntax of the language.

where <var> is a variable name (a String) and the two occurrences of <arith> are Arith expressions.

Semantics

Now we will extend the interpreter appropriately. The presence of variables introduces two new wrinkles to our interpreter:

  1. The interpreter will need to keep track of the current values of variables, so that when we encounter a variable we can substitute its value.
  2. Variables introduce the possibility of runtime errors. For example, let x = 2 in x + y will generate a runtime error, since y is never defined.

For now, you should deal with undefined variables simply by calling the error function with an appropriate error message, which will make the interpreter crash. In a later section we will explore a better way to handle runtime errors.

To keep track of the values of variables, we will use a mapping from variable names to values, called an environment. To represent such a mapping, we can use the Data.Map module, which works similarly to Python dictionaries or Java’s TreeMap. Add the following line to the imports at the top of this file:

import qualified Data.Map as M

You can now refer to things from Data.Map by prefixing them with M., for example, M.empty, M.insert, M.lookup. (The reason for prefixing with M like this is that otherwise there would be conflicts with functions from the Prelude.)

Dealing with errors, take 1

The interpreter works, but we have introduced the possibility of runtime errors, and we are not dealing with them in a nice way: if a variable is undefined, the interpreter simply crashes. Ideally, the interpreter should never crash, but instead return a useful error value that can be caught and further processed however we wish.

To do this we will need to change the type of the interpreter again. Currently, its type promises that it will always return an Integer, but this is no longer possible: in case of a runtime error it will not be able to return an Integer. (Of course, we could arrange for it to return some default Integer such as 0 in the case of a runtime error, but there would then be no way to distinguish between an expression that legitimately evaluated to 0 and one that resulted in a runtime error.)

As we have seen (for example, in the type of parseSome), the possibility of errors can be represented by the Either type.

Dealing with errors, take 2

The annoying thing about interpArith2 is that it had to mix together the actual work of interpreting with the work of doing case analysis to figure out when a runtime error had occurred. In this section we will explore ways of hiding all the case analysis inside a few general combinators which we can then use to implement the interpreter in a nicer way.

Now we only have the Let case remaining. It turns out that (<<$>>) and (<<*>>) are not enough to implement the Let case, since the two recursive calls to interpArith3 are not independent (the second recursive call needs to use an environment extended with the result of the first recursive call).

> (>>>=) :: Either e a -> (a -> Either e b) -> Either e b
> Left e1 >>>= _ = undefined
> Right a >>>= f = undefined

Dealing with errors, take 3

As you have probably noticed, (<<$>>) and (<<*>>) are similar to the (<$>) and (<*>) operators for parsers. In fact, they are both instances of a more general pattern.

Feedback