CSci 230: Computing Systems Organization

Assignment 12: Minc compiler

Due: 5:00pm, Tuesday, November 25. Value: 50 pts.

The following three files implement most of a compiler to translate a program written in a simple language called Minc into ARM assembly language.

main.c: Loads a file and builds an internal representation before calling generate_code. You should not modify this file.
codegen.c: Defines generate_code, which, given a representation of a program as a C struct, should print the corresponding ARM assembly language. This is the file you should modify.
codegen.h: A header file defining structures and functions useful for both main.c and codegen.c.

Your assignment is to complete codegen.c to handle the entirety of the Minc language described below; you should not add or modify other files, and your submitted solution should be entirely in codegen.c.

The Minc language

The name Minc comes from minimal C, as Minc is a very basic C-style language. It supports just four statement types: print, assignment, if, and while.

Each of these statements includes an arithmetic expression. Minc expressions are very constrained:

Despite these severe constraints, one can still write a reasonable program. The following two basic examples illustrate.

Count down 10 to 1:

cur = 10;
while (cur > 0) {
    cur = cur - 1;
    print 1 + cur;
}

Greatest common divisor:

b = 40;
a = 25;
while (b != a) {
    if (b > a) {
        b = b - a;
        print b;
    }
    if (a > b) {
        a = a - b;
        print a;
    }
}
print a;

The following links are to legal Minc programs:

count.mc: Count down 10 to 1 (example above)
gcd.mc: Greatest common divisor (example above)
hailstone.mc: Iterate through hailstone sequence
factor.mc: Show all factors of a number

Memory representation

The distributed code already converts a given file into an internal memory representation that is easier to process than the original program text. This internal representation uses a type named struct statement.

struct statement {
    int stype;        /* one of: STMT_PRINT, STMT_ASSN, STMT_IF, STMT_WHILE */
    int line;         /* line number in source file where statement starts */
    int dest_var;     /* used for STMT_ASSN only, indicating var id to save */
    int op_type;      /* one of: OP_LT, OP_EQ, ..., OP_NONE, OP_ADD, OP_SUB */
    int left_is_var;  /* 0 if constant, 1 if variable */
    int left_val;     /* if constant, number; if variable, variable id */
    int right_is_var; /* 0 if constant, 1 if variable */
    int right_val;    /* if constant, number; if variable, variable id */
    struct statement *body; /* used for STMT_IF and STMT_WHILE only, */
                            /* pointing to first statement in body. */
    struct statement *next; /* next statement to execute after this one, */
                            /* or NULL if final statement in list. */
};

As indicated by the final next field, this is a linked list of statements, each object in the list representing a single statement on the same level of the program. The other fields indicate the various parts of each individual statement, starting with stype indicating which of the four statement types it is.

This is probably easiest to comprehend using an example. Below is a Minc program accompanied by an illustration showing how the distributed program represents this Minc program in memory.

n = 10;
sum = 0;
while (n > 0) {
    sum = sum + n;
    n = n - 1;
}
print sum;

As the distributed program reads the Minc program, it assigns a number between 0 and 9 to each unique variable; in this case, it chose 0 for n and 1 for sum.

Testing the program

First, compile your program with a command such as “gcc main.c codegen.c”. Ensure that you have a Minc program saved in a file, which in our examples we'll presume is named test.mc.

The distributed program includes a working interpreter that you can use if you simply want to execute a Minc program to see what it does. To execute the Minc program, enter “./a.out -x test.mc”.

But if you want the program to generate ARM assembly code, omit the -x flag: “./a.out test.mc”. That will display the ARM assembly translation to the screen. You can redirect this output to a file instead: “./a.out test.mc > test.s”.

Once you have saved the ARM assembly code, you can then load it within aas and execute it. (You'll want to ensure that io.s from the malloc assignment is available in the same directory.) Executing the generated code within aas should behave identically to “./a.out -x test.mc”.

Example output

In case it's helpful, below is the sample program illustrated above along with the ARM program that a correct solution might generate. This is not the only correct solution, though: The real test is how the generated code behaves within aas, as explained in the previous section.

Minc program:

n = 10;
sum = 0;
while (n > 0) {
    sum = sum + n;
    n = n - 1;
}
print sum;

ARM assembly:

    MOV SP, #0xFF00
    BL ttyStart
    MOV R4, #10
    MOV R5, #0
go001
    MOV R1, #0
    CMP R4, R1
    BLE go002
    ADD R5, R5, R4
    MOV R1, #1
    SUB R4, R4, R1
    B go001
go002
    MOV R0, R5
    BL printInt
    MOV R0, #'\n'
    BL printChar
halt B halt
    INCLUDE "io.s"

By the way, when you want a C program to display a backslash '\' in the output, you will need to include a double-backslash. For example, in the statement “printf("#'\\n'\n");”, the first double-backslash ends up translating to a single backslash in the output, whereas the lone backslash followed by n translates to a newline (with no backslash).