Module 08: Variables

This module will be due at 1:15pm on Tuesday, October 3.

In this module, we will explore how to support a language with variables.

We’ll start with the familiar Arith language.

{-# LANGUAGE GADTSyntax #-}

import Prelude hiding ((<$>), (<$), (<*>), (<*), (*>))
import Parsing

data Arith where
  Lit :: Integer -> Arith
  Bin :: Op -> Arith -> Arith -> Arith
  deriving (Show)

data Op where
  Plus  :: Op
  Minus :: Op
  Times :: Op
  deriving (Show, Eq)

interpArith :: Arith -> Integer
interpArith (Lit i)           = i
interpArith (Bin Plus e1 e2)  = interpArith e1 + interpArith e2
interpArith (Bin Minus e1 e2) = interpArith e1 - interpArith e2
interpArith (Bin Times e1 e2) = interpArith e1 * interpArith e2

lexer :: TokenParser u
lexer = makeTokenParser $ emptyDef
  { reservedNames = ["let", "in"] }
    -- tell the lexer that "let" and "in" are reserved keywords
    -- which may not be used as variable names

parens :: Parser a -> Parser a
parens     = getParens lexer

reservedOp :: String -> Parser ()
reservedOp = getReservedOp lexer

reserved :: String -> Parser ()
reserved = getReserved lexer

integer :: Parser Integer
integer    = getInteger lexer

whiteSpace :: Parser ()
whiteSpace = getWhiteSpace lexer

identifier :: Parser String
identifier = getIdentifier lexer

parseArithAtom :: Parser Arith
parseArithAtom = (Lit <$> integer) <|> parens parseArith

parseArith :: Parser Arith
parseArith = buildExpressionParser table parseArithAtom
  where
    table = [ [ Infix (Bin Times <$ reservedOp "*") AssocLeft ]
            , [ Infix (Bin Plus  <$ reservedOp "+") AssocLeft
              , Infix (Bin Minus <$ reservedOp "-") AssocLeft
              ]
            ]

arith :: Parser Arith
arith = whiteSpace *> parseArith <* eof

eval :: String -> Maybe Integer
eval s = case parse arith s of
  Left _  -> Nothing
  Right e -> Just (interpArith e)

We will now add variables to this language. In particular, we’re going to add let-expressions, which look something like this:

 >>> let x = 4*12 in x*x + 3-x
 2259

This locally defines x as a name for the value of 4*12 within the expression x*x + 3-x. Substituting 48 for each occurrence of x and evaluating the result yields the final value of 2259.

Haskell has let-expressions too, so you can try typing the above expression at a GHCi prompt.

Syntax

We need to make two changes to the syntax of the language.

Semantics

Now we will extend the interpreter appropriately. The presence of variables introduces two new wrinkles to our interpreter:

  1. The interpreter will need to keep track of the current values of variables, so that when we encounter a variable we can substitute its value.
  2. Variables introduce the possibility of runtime errors. For example, let x = 2 in x + y will generate a runtime error, since y is never defined.

For now, you should deal with undefined variables simply by calling the error function with an appropriate error message, which will make the interpreter crash. In a later section we will explore a better way to handle runtime errors.

To keep track of the values of variables, we will use a mapping from variable names to values, called an environment. To represent such a mapping, we can use the Data.Map module, which works similarly to Python dictionaries or Java’s Map interface. Add the following line to the imports at the top of this file:

import qualified Data.Map as M

You can now refer to things from Data.Map by prefixing them with M., for example, M.empty, M.insert, M.lookup. (The reason for prefixing with M like this is that otherwise there would be conflicts with functions from the Prelude.)

Dealing with errors, take 1

The interpreter works, but we have introduced the possibility of runtime errors, and we are not dealing with them in a nice way: if a variable is undefined, the interpreter simply crashes. Ideally, the interpreter should never crash, but instead return a useful error value that can be caught and further processed however we wish.

To do this we will need to change the type of the interpreter again. Currently, its type promises that it will always return an Integer, but this is no longer possible: in case of a runtime error it will not be able to return an Integer. (Of course, we could arrange for it to return some default Integer such as 0 in the case of a runtime error, but there would then be no way to distinguish between an expression that legitimately evaluated to 0 and one that resulted in a runtime error.)

As we have seen (for example, in the type of parseSome), the possibility of errors can be represented by the Either type.

Dealing with errors, take 2

The annoying thing about interpArith2 is that it had to mix together the actual work of interpreting with the work of doing case analysis to figure out when a runtime error had occurred. In this section we will explore ways of hiding all the case analysis inside a few general combinators which we can then use to implement the interpreter in a nicer way.

Now we only have the Let case remaining. It turns out that (<<$>>) and (<<*>>) are not enough to implement the Let case, since the two recursive calls to interpArith3 are not independent (the second recursive call needs to use an environment extended with the result of the first recursive call).

Dealing with errors, take 3

As you have probably noticed, (<<$>>) and (<<*>>) are similar to the (<$>) and (<*>) operators for parsers. In fact, they are both instances of a more general pattern.

Feedback