Module 04: Syntax and semantics
Write your team names here:
You may again choose whoever you want to start as the driver. Write your choice here:
Remember:
- You should cite at least two sources on each assignment. If you don’t have anything to cite, go learn something and cite it.
- Be sure that your module loads into GHCi with no errors before turning it in.
> {-# LANGUAGE GADTs #-}
>
> import Data.Char -- so you can use the 'isDigit' function later
Some examples
Example 1
<mirror> ::= '.'
| 'L' <mirror> 'R'
Example strings that match <mirror>
:
"."
"L.R"
"LL.RR"
Example strings that do not match <mirror>
:
"Q"
"LR"
"LL.RRR"
Example 2
<tree> ::= '#'
| '(' <tree> ')'
| '(' <tree> <tree> ')'
Example strings that match <tree>
:
"(##)"
"((#)(#(##)))"
Example strings that do not match <tree>
:
"((#)"
"(###)"
Example 3
<digit> ::= '0' | '1' | '2' | '3' | '4' | '5' | '6' | '7' | '8' | '9'
<natural> ::= <digit> | <digit> <natural>
<integer> ::= <natural> | '-' <natural>
Example strings that match <integer>
:
"0023"
"-25"
Example strings that do not match <integer>
:
"x27"
"--25"
"4-2"
For each of
<mirror>
,<tree>
, and<integer>
, give three more examples of strings that match, and three more examples that do not match.What does
|
mean?What is the difference between something in single quotes (like ’*’) and something in angle brackets (like
<tree>
)?
The things in single quotes are usually called terminals, and the things in brackets are nonterminals. These sorts of definitions are known as (context-free) grammars, written in Backus-Naur Form or Backus Normal Form (BNF), named for John Backus and Peter Naur.
In what context was BNF first developed?
What else are John Backus and Peter Naur known for?
Now, back to your regularly scheduled grammars…
Does
<natural>
match the empty string “”?An alternative, equivalent way to define
<natural>
is as follows:<natural> ::= <digit>+
Given that this is equivalent to the original definition, what do you think + means?
Technically this sort of + notation was not included in the original form of BNF, but it is a common extension.
<natural> ::= <digit>*
would match all the same strings as<digit>+
, but also matches the empty string. What do you think*
means in this context?Describe how to modify the definition of
<natural>
so it does not allow unnecessary leading zeroes. That is,"203"
should still match but"0023"
should not; however,"0"
should still be a valid<natural>
. If you wish, you can also introduce more definitions or modify definitions besides<natural>
.Write down a grammar (as concisely as possible) that matches all these strings:
"(XX)Y" "(XXXX)YZ" "(XX)ZZZZZ" "(XXXXXX)YZZ"
…but does not match any of these strings:
"()Y" "(XXX)YZ" "(X)ZZZ" "(XX)YY"
Mirror, mirror
- ROTATE ROLES and write the name of the new driver here:
> {-
> <mirror> ::= '.'
> | 'L' <mirror> 'R'
> -}
>
> data Mirror where
> Middle :: Mirror
> Layer :: Mirror -> Mirror
> deriving Show
>
> prettyMirror :: Mirror -> String
> prettyMirror Middle = "."
> prettyMirror (Layer m) = "L" ++ prettyMirror m ++ "R"
>
> parseMirror :: String -> (Mirror, String)
> parseMirror ('.' : rest) = (Middle, rest)
> parseMirror ('L' : rest) =
> case parseMirror rest of
> (m, 'R' : rest') -> (Layer m, rest')
Write down three different example values of type
Mirror
.Try calling
prettyMirror
on your exampleMirror
values above, and record the results.For this language, how are the concrete syntax (represented by the grammar
<mirror>
) and abstract syntax (represented by the data typeMirror
) different?Try calling
parseMirror
on five different example inputs. An example input can be anyString
. Try to pick a variety of examples that show the range of behavior ofparseMirror
. Record the results here.Describe the behavior of
parseMirror
. Your answer should refer to the grammar<mirror>
.Why does
parseMirror
return a(Mirror, String)
pair instead of just aMirror
? (Hint: if you are not sure, try writing a functionparseMirror2 :: String -> Mirror
which behaves the same asparseMirror
but does not return the extraString
.)Modify
parseMirror
so that it has typeString -> Maybe (Mirror, String)
and never crashes. Instead of crashing on inputs that do not match<mirror>
, it should returnNothing
. Call your modified functionparseMirrorSafe
and write it below.
BIN (aka Your First Language)
- ROTATE ROLES and write the name of the new driver here:
Consider the following BNF grammar:
<bin> ::= '#'
| '(' <bin> <bin> ')'
Write an algebraic data type called
Bin
which corresponds to<bin>
. That is,Bin
should encode the abstract syntax trees corresponding to the concrete syntax<bin>
.Write a function
prettyBin
which turns an abstract syntax tree into concrete syntax.Write a function
parseBin :: String -> Maybe (Bin, String)
which turns concrete syntax into abstract syntax.
We can give a semantics (meaning) to abstract Bin
trees as follows: a leaf has value 1; the value of a branch with left and right subtrees is the value of the left subtree plus twice the value of the right subtree.
- Write a function
interpBin :: Bin -> Integer
which implements this semantics.
We can think of this as a little programming language for computing integers; let’s call it BIN. (Of course, BIN is not a very useful language, but don’t let it hear you say that because you might hurt its feelings.) For example, "((##)(##))"
is a program to compute the value 9. parseBin
is a parser for the language, and interp
is an interpreter.
Put the pieces together to create a function
evalBin :: String -> Maybe Integer
. For example,evalBin "#" --> Just 1 evalBin "((##)(##))" --> Just 9 evalBin "(##" --> Nothing
At the beginning of class we considered three different sources of error in a programming language. Which kind(s) of errors are possible in BIN?
Why can’t the other kind(s) of errors occur in BIN?
EBIN
- ROTATE ROLES and write the name of the new driver here:
The language EBIN (which stands, of course, for extended BIN) is just like BIN except it also allows single digits, which can stand in for an entire tree. In other words, a digit can go anywhere a #
can go in a BIN program. For example, "((4#)(#2))"
is a valid EBIN program. Note that "((##)(42))"
is also a valid EBIN program, which does not contain the number 42 but rather the single digits 4
and 2
.
The digit n
stands for a full binary tree of height n
. For example,
0
stands for#
1
stands for(##)
2
stands for((##)(##))
and so on. Thus, EBIN is not a completely new, separate language, but just adds some syntax sugar to BIN. That is, single digits are just a convenient shorthand for more cumbersome but equivalent expressions in BIN. We define EBIN in terms of BIN, and we can think of EBIN programs as abbreviated BIN programs.
Write down a BNF grammar for EBIN.
Make an algebraic data type
EBin
to represent EBIN abstract syntax.Write a parser for EBIN. (Hints: you may find the
isDigit
andread
functions useful. Note thatData.Char
was imported at the beginning of the module so you can useisDigit
. You are always welcome to import other library modules as needed.read
has a more general type but for the purposes of this assignment you can think of it as having typeString -> Integer
; for example,read "34" = 34
.)Write a function
desugar :: EBin -> Bin
which desugars EBIN abstract syntax into equivalent BIN abstract syntax. (Hint: you will probably find it useful to write a separate functiongrow :: Integer -> Bin
which converts an integer into a full tree of the given height.)Could you also do the desugaring phase before the parsing phase? If so, what would be the type of
desugar
?Now put all the pieces together to define an evaluator
evalEBin :: String -> Maybe Integer
. That is,evalEBin
should encompass the following phases:Concrete syntax \(\rightarrow\) parse \(\rightarrow\) EBIN AST \(\rightarrow\) desugar \(\rightarrow\) BIN AST \(\rightarrow\) interpret \(\rightarrow\) Integer
For example,
evalEBin "(#2)"
should yield 19, and"((4#)(#2))"
should evaluate to 121.
As an (optional) fun aside, make a conjecture about the value of EBIN programs consisting of a single digit. That is, what can you say about evalEBin "0"
, evalEBin "1"
, evalEBin "2"
, and so on? Can you prove your conjecture?