Due: 5:00pm, Friday, October 31. Value: 30 pts.
Physics, unfortunately, exists. Whenever we transmit a bit, or store a bit magnetically on a disk, or compute a bit using a transistor, physics can intervene: Electrons change spin, gamma rays strike, or electrical signals are disrupted by lightning or solar flares. Basically, bits are not as reliable as we'd like to think.
However, we tend to think of them as reliable, and the reason we get away with this is that our devices overcome the physics complexities using error-correcting codes: You think you're working with a byte, but in fact the computer works with more than eight bits, knowing that a bit could easily flip due to physics. As we read the “byte,” the computer examines the redundant bits to check whether they indicate that a bit has flipped; if one has, then the error-correcting bits provide information on how to recover the original value.
(This section provides background information, which is helpful but not necessary to completing the assignment.)
The simplest error-correcting code is to store each bit in triplicate. When we read a conceptual bit, we check whether all three physical bits match; if they don't, we conclude that one must have inadvertently flipped due to a lightning strike or a gamma ray, and so we use the majority vote to determine which is the original value. Of course, it could be that two bits flipped, and so the majority is wrong, but even one flip is quite rare, and two flips is extraordinarily unlikely.
Storing bits in triplicate, of course, triples the amount of data we have to work with. It is quite inefficient. In this assignment we'll use a more efficient scheme, called the Hamming(7,4) code [Wikipedia description], which encodes each group of four conceptual bits with seven physical bits, according to the following scheme:
conceptual bits | physical code | conceptual bits | physical code |
0000 | 0000000 | 1000 | 1110000 |
0001 | 1101001 | 1001 | 0011001 |
0010 | 0101010 | 1010 | 1011010 |
0011 | 1000011 | 1011 | 0110011 |
0100 | 1001100 | 1100 | 0111100 |
0101 | 0100101 | 1101 | 1010101 |
0110 | 1100110 | 1110 | 0010110 |
0111 | 0001111 | 1111 | 1111111 |
Crucially, none of the seven-bit physical codes are adjacent: That is, you can't flip a bit in one of the seven-bit physical codes and end up with another physical code found in the table. In fact, you can't even take a physical code in the table and flip two bits to arrive at another code in the table. This is the key property that gives this encoding its error-correcting property: If somehow we see a seven-bit physical code that isn't in the table, then a bit must have flipped. Trying to flip each one of the bits will allow us to find a code that is in the table — but we'll find only one such code, and we'll conclude that this was the original code before the gamma ray struck (assuming, again, that gamma rays didn't coincidentally strike two bits at the same time).
There are far more complex schemes that are yet more efficient. For example, commercially available RAM frequently uses a code where 120 conceptual bits are encoded in 127 physical bits, which still allows for the error-correcting property while using only 6% extra space (as opposed to the 75% extra space required by Hamming(7,4)).
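Both percentages follow from counting check bits. As a short worked calculation, assuming the 127-bit code mentioned above is the Hamming code with r = 7 check bits (just as Hamming(7,4) is the one with r = 3):

```latex
\[
n = 2^r - 1, \qquad k = 2^r - 1 - r, \qquad
\text{overhead} = \frac{n - k}{k} = \frac{r}{2^r - 1 - r}
\]
\[
r = 3:\ \frac{3}{4} = 75\%, \qquad\qquad r = 7:\ \frac{7}{120} \approx 5.8\%
\]
```

Here n is the number of physical bits and k the number of conceptual bits in each group.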
In this assignment, you don't have to worry about writing a program to encode data. That is achieved by the JavaScript form below: Enter text in the “Original Text” field along with the noise rate at which bits are corrupted, and the form generates a hexadecimal string representing the Hamming code for the text, with some bits flipped randomly.
[Interactive form; output fields: Uncorrupted, Corrupted, Decoded.]
The encoded text is found by taking each character of “Original Text” (plus a trailing NUL character) and dividing its 7-bit ASCII code into two 4-bit portions called “nibbles”: the upper three bits, with a 0 bit added on top, and the lower four bits. Each nibble is translated into a seven-bit value using the Hamming codes tabulated above, and the seven-bit value is written as two hexadecimal digits. Since a single character translates into two nibbles, and each nibble translates into two hexadecimal digits, the encoded string is four times as long as the original (counting the trailing NUL).
The JavaScript form also shows the result of decoding the “corrupted” text using the program below. As long as no seven-bit code has suffered more than one flipped bit, this is simply the same as “Original Text.”
Your assignment is to write an ARM assembly language program that reads a corrupted Hamming code generated by the above JavaScript form and reconstructs the original text. Fortunately, you don't have to figure out the algorithm to do this. I am helpfully providing you with a C program that accomplishes the task, and it's your job to translate it to ARM assembly language.
#include <stdlib.h>
#include <stdio.h>

int main() {
    int c0, c1, c2;
    int b, corrupt;
    int prev, nib;

    prev = -1;
    while (1) {
        // read two hexadecimal digits, combine into a byte
        c0 = getchar();
        c1 = getchar();
        b = ((c0 - (c0 <= '9' ? '0' : 'W')) << 4)
            | (c1 - (c1 <= '9' ? '0' : 'W'));

        // determine which bit is corrupt, if any
        c0 = (b ^ (b >> 2) ^ (b >> 4) ^ (b >> 6)) & 0x1;
        c1 = (b ^ (b >> 1) ^ (b >> 4) ^ (b >> 5)) & 0x1;
        c2 = (b ^ (b >> 1) ^ (b >> 2) ^ (b >> 3)) & 0x1;
        corrupt = c0 | (c1 << 1) | (c2 << 2);

        // flip corrupt bit if it exists, and retrieve original nibble
        if (corrupt != 0) {
            b ^= 1 << (7 - corrupt);
        }
        nib = ((b >> 1) & 0x8) | (b & 0x7);

        if (prev == -1) {
            // save nibble to combine with next nibble
            prev = nib;
        } else if (prev == 0 && nib == 0) {
            // we found trailing NUL - exit program
            putchar('\n');
            return 0;
        } else {
            // display combination of two nibbles
            putchar((prev << 4) | nib);
            prev = -1;
        }
    }
}
You should use aas to test your programs; the section “Using aas” provides some examples of using this program. The program you write should include and use the io.s library of subroutines, described in “The io.s library.”
Commenting programs is particularly important when you're dealing with assembly language. The io.s library illustrates good commenting style.
The file io.s [Download] contains several subroutines helpful for displaying to the screen and for reading from the keyboard.
ttyStart | Initializes the terminal. No other io.s subroutines will work unless this is called first. | (Changes R0, R3.) |
getInt | Reads digits from the user until finding a non-digit character, interprets the digits as a base-10 number, which it places into R0. The first non-digit character read is consumed. | (Changes R0, R1, R2, R3.) |
getChar | Reads a single character from the user and places it into R0. | (Changes R0, R3.) |
printStr | Displays all characters in a NUL-terminated array whose first address is found in R0. | (Changes R0, R1, R2, R3, R12.) |
printInt | Displays base-10 representation of integer found in R0. | (Changes R0, R1, R2, R3, R12.) |
printChar | Displays character whose ASCII code is in the lower byte of R0. | (Changes R0, R1, R3.) |
Below are two programs illustrating how to use io.s. The first just echoes everything the user types; the second repeatedly reads integers from the user and displays their squares.
        BL ttyStart
again   BL getChar          ; load one character into R0
        BL printChar        ; and echo it
        B again
        INCLUDE "io.s"

        BL ttyStart
again   BL getInt           ; read an integer into R0,
        MUL R5, R0, R0      ; square it
        ADD R0, PC, #sqris  ; display string "square is " first.
        BL printStr
        MOV R0, R5          ; place square into R0 to be displayed
        BL printInt         ; and show integer's square
        MOV R0, #'\n'       ; followed by a newline
        BL printChar
        B again
sqris   DCB "square is ", 0
        INCLUDE "io.s"
The program aas includes an ARM assembler and simulator. It is already installed on the laboratory computers, so you have only to execute the aas command in the terminal window to start it. If you want to use it on other computers, you can download it from the aas home page.
The transcript below illustrates using aas. It uses the squares.s file, which we're imagining contains the second program given above, for displaying the square of each number the user types. For this to work, you would need to execute aas from within a directory that contains io.s and squares.s.
In the transcript, after starting aas, we first tell it to assemble the code found in squares.s and load the resulting machine code into the simulated computer's memory. Then we tell it that when the program executes, it should send the program a line containing the digit 5, followed by a line containing the digits 1, 4, 5. We then start the simulated computer running using the step command, telling it to execute five million instructions, which is enough for the simulated computer to read and process both numbers; the program spends most of those five million instructions simply waiting for further input. Finally, we exit aas.
linux$ aas
Welcome to aas 0.0.5 (http://www.toves.org/aas/).
Type 'help' to list commands.
(c) 2008, Carl Burch. For copyright information, enter 'print license'.
(aas) load squares.s
(aas) input
input: 5
(aas) input
input: 145
(aas) step 5m
square is 25
square is 21025
(aas) quit
Several additional commands are available within aas that are helpful for debugging a program. In the following transcript, we set a breakpoint so that the computer stops just before executing line 3 of squares.s, which contains the MUL instruction. After executing the program so that it reaches this point, we display the value of R5, then execute just a single instruction (MUL), and then we display R5 again.
linux$ aas
Welcome to aas 0.0.5 (http://www.toves.org/aas/).
Type 'help' to list commands.
(c) 2008, Carl Burch. For copyright information, enter 'print license'.
(aas) load squares.s
(aas) list 0 10
0x0000 0xeb00000b line 1: BL ttyStart
0x0004 0xeb00000f line 2: BL getInt
0x0008 0xe0050090 line 3: MUL R5, R0, R0
0x000c 0xe28f0014 line 4: ADD R0, PC, #sqris
(aas) break 3
breakpoint added at 0x8
(aas) input
input: 25
(aas) step 1m
execution reached breakpoint 0x8 after 49 steps; stopped
(aas) print r5
R5 0x00000000 [0]
(aas) step
(aas) print r5
R5 0x00000271 [625]
(aas) step 1m
square is 625
(aas) quit