Assignment 9: ARM error correction

Due: 5:00pm, Friday, October 31. Value: 30 pts.

Physics, unfortunately, exists. Whenever we transmit a bit, or store a bit magnetically on a disk, or compute a bit using a transistor, physics can intervene: Electrons change spin, gamma rays strike, or electrical signals are disrupted by lightning or solar flares. Basically, bits are not as reliable as we'd like to think.

However, we tend to think of them as reliable, and the reason we get away with this is that our devices overcome the physical complications using error-correcting codes: You think you're working with a byte, but in fact the computer works with more than eight bits, knowing that a bit could easily flip due to physics. As we read the “byte,” the computer examines the redundant bits to check whether they indicate that a bit has flipped; if one has, then the error-correcting bits provide information on how to recover the original value.

Error-correcting codes

(This section provides background information, which is helpful but not necessary to completing the assignment.)

The simplest error-correcting code is to store each bit in triplicate. When we read a conceptual bit, we check whether all three physical bits match; if they don't, we conclude that one must have inadvertently flipped due to a lightning strike or a gamma ray, and so we use the majority vote to determine which is the original value. Of course, it could be that two bits flipped, and so the majority is wrong, but even one flip is quite rare, and two flips is extraordinarily unlikely.

Storing bits in triplicate, of course, triples the amount of data we have to work with. It is quite inefficient. In this assignment we'll use a more efficient scheme, called the Hamming(7,4) code [Wikipedia description], which encodes each group of four conceptual bits with seven physical bits, according to the following scheme:

conceptual	physical	conceptual	physical
0000	0000000	1000	1110000
0001	1101001	1001	0011001
0010	0101010	1010	1011010
0011	1000011	1011	0110011
0100	1001100	1100	0111100
0101	0100101	1101	1010101
0110	1100110	1110	0010110
0111	0001111	1111	1111111

Crucially, none of the seven-bit physical codes are adjacent: That is, you can't flip a bit in one of the seven-bit physical codes and end up with another physical code found in the table. In fact, you can't even take a physical code in the table and flip two bits to arrive at another code in the table. This is the key property that gives this encoding its error-correcting property: If somehow we see a seven-bit physical code that isn't in the table, then a bit must have flipped. Trying to flip each one of the bits will allow us to find a code that is in the table — but we'll find only one such code, and we'll conclude that this was the original code before the gamma ray struck (assuming, again, that gamma rays didn't coincidentally strike two bits at the same time).

There are far more complex schemes that are yet more efficient. For example, commercially available RAM frequently uses a code where 120 conceptual bits are encoded in 127 physical bits, which still allows for the error-correcting property while using only 6% extra space (as opposed to the 75% extra space required by Hamming(7,4)).

Encoding & decoding

In this assignment, you don't have to worry about writing a program to encode data. That is achieved by the JavaScript form below: Enter text in the “Original Text” field and the noise rate at which bits are corrupted, and it generates a hexadecimal string representing the Hamming code corresponding to the text, with some bits flipped randomly.


Original Text:
Noise Rate:

Uncorrupted:
Corrupted:

Decoded:

The encoded text is found by taking each character of “Original Text” (plus a trailing NUL character) and dividing its 7-bit ASCII code into two 4-bit portions called “nibbles”: The upper three bits, to which is added a topmost 0 bit, and the lower four bits. Each nibble is translated into a seven-bit value using the Hamming codes tabulated above, and the seven-bit value is represented using two hexadecimal digits. Since a single letter translates into two nibbles, and each nibble translates into two hexadecimal digits, the encoded string is four times as long as the original.

The JavaScript form also shows the result of decoding the “corrupted” text using the program below. Of course, this should simply be the same as “Original Text,” unless the noise happened to flip two or more bits within a single seven-bit code.

Assignment

Your assignment is to write an ARM assembly language program that reads a corrupted Hamming code generated by the above JavaScript form and reconstructs the original text. Fortunately, you don't have to figure out the algorithm to do this. I am helpfully providing you with a C program that accomplishes the task, and it's your job to translate it to ARM assembly language.

#include <stdlib.h>
#include <stdio.h>

int main() {
    int c0, c1, c2;
    int b, corrupt;
    int prev, nib;

    prev = -1;
    while (1) {
        // read two hexadecimal digits, combine into a byte
        c0 = getchar();
        c1 = getchar();
        b = ((c0 - (c0 <= '9' ? '0' : 'W')) << 4)
            | (c1 - (c1 <= '9' ? '0' : 'W'));

        // determine which bit is corrupt, if any
        c0 = (b ^ (b >> 2) ^ (b >> 4) ^ (b >> 6)) & 0x1;
        c1 = (b ^ (b >> 1) ^ (b >> 4) ^ (b >> 5)) & 0x1;
        c2 = (b ^ (b >> 1) ^ (b >> 2) ^ (b >> 3)) & 0x1;
        corrupt = c0 | (c1 << 1) | (c2 << 2);

        // flip corrupt bit if it exists, and retrieve original nibble
        if (corrupt != 0) {
            b ^= 1 << (7 - corrupt);
        }
        nib = ((b >> 1) & 0x8) | (b & 0x7);

        if (prev == -1) {
            // save nibble to combine with next nibble
            prev = nib;
        } else if (prev == 0 && nib == 0) {
            // we found trailing NUL - exit program
            putchar('\n');
            return 0;
        } else {
            // display combination of two nibbles
            putchar((prev << 4) | nib);
            prev = -1;
        }
    }
}

You should use aas to test your programs; the section “Using aas” provides some examples of using this program. The program you write should include and use the io.s library of subroutines, described in “The io.s library.”

Commenting programs is particularly important when you're dealing with assembly language. The io.s library illustrates good commenting style.

The `io.s` library

The file io.s [Download] contains several subroutines helpful for displaying to the screen and for reading from the keyboard.

`ttyStart`	Initializes the terminal. No other `io.s` subroutines will work unless this is called first.	(Changes R0, R3.)
`getInt`	Reads digits from the user until finding a non-digit character, interprets the digits as a base-10 number, which it places into R0. The first non-digit character read is consumed.	(Changes R0, R1, R2, R3.)
`getChar`	Reads a single character from the user and places it into R0.	(Changes R0, R3.)
`printStr`	Displays all characters in a NUL-terminated array whose first address is found in R0.	(Changes R0, R1, R2, R3, R12.)
`printInt`	Displays base-10 representation of integer found in R0.	(Changes R0, R1, R2, R3, R12.)
`printChar`	Displays character whose ASCII code is in the lower byte of R0.	(Changes R0, R1, R3.)

Below are two programs illustrating how to use io.s. The first just echoes everything the user types; the second repeatedly reads integers from the user and displays their squares.

        BL ttyStart
again   BL getChar          ; load one character into R0
        BL printChar        ; and echo it
        B again
        INCLUDE "io.s"

        BL ttyStart
again   BL getInt           ; read an integer into R0,
        MUL R5, R0, R0      ; square it
        ADD R0, PC, #sqris  ; display string "square is " first.
        BL printStr
        MOV R0, R5          ; place square into R0 to be displayed
        BL printInt         ; and show integer's square
        MOV R0, #'\n'       ; followed by a newline
        BL printChar
        B again
sqris   DCB "square is ", 0
        INCLUDE "io.s"

Using `aas`

The program aas includes an ARM assembler and simulator. It is already installed on the laboratory computers, so you have only to execute the aas command in the terminal window to start it. If you want to use it on other computers, you can download it from the aas home page.

The transcript below illustrates using aas. It uses the squares.s file, which we're imagining contains the second program given above, for displaying the square of each number the user types. For this to work, you would need to execute aas from within a directory that contains io.s and squares.s.

In the transcript, after starting aas, we first tell it to assemble the code found in squares.s and load the resulting machine code into the simulated computer's memory. Then we tell it that when the program executes, it should send a line containing the digit 5 to the program, and then it should send a line containing the digits 145 to the program. We then start the simulated computer running using the step command, where we tell the computer to execute five million instructions, which is more than enough for the simulated computer to read and process each of the two numbers. The program spends most of those five million instructions simply waiting for further input, so we then exit aas.

linux$ aas
Welcome to aas 0.0.5 (http://www.toves.org/aas/). Type 'help' to list commands.
(c) 2008, Carl Burch. For copyright information, enter 'print license'.
(aas) load squares.s
(aas) input
input: 5
(aas) input
input: 145
(aas) step 5m
square is 25
square is 21025
(aas) quit

Several additional commands are available within aas that are helpful for debugging a program. In the following transcript, we set a breakpoint so that the computer stops just before executing line 3 of squares.s, which contains the MUL instruction. After executing the program so that it reaches this point, we display the value of R5, then execute just a single instruction (MUL), and then we display R5 again.

linux$ aas
Welcome to aas 0.0.5 (http://www.toves.org/aas/). Type 'help' to list commands.
(c) 2008, Carl Burch. For copyright information, enter 'print license'.
(aas) load squares.s
(aas) list 0 10
0x0000 0xeb00000b  line 1: BL ttyStart
0x0004 0xeb00000f  line 2: BL getInt
0x0008 0xe0050090  line 3: MUL R5, R0, R0
0x000c 0xe28f0014  line 4: ADD R0, PC, #sqris
(aas) break 3
breakpoint added at 0x8
(aas) input
input: 25
(aas) step 1m
execution reached breakpoint 0x8 after 49 steps; stopped
(aas) print r5
R5  0x00000000 [0]
(aas) step
(aas) print r5
R5  0x00000271 [625]
(aas) step 1m
square is 625
(aas) quit