last modified 3/12/02

Toy programming

In the TOY machine, there are 256 memory locations, each of which holds a 4 digit hex number (16 bits.) You can think of TOY memory as being an array of size 256: if you want to access a particular element in an array, you just need to know its index. To access a particular word in memory, you need to know its address. In TOY, all our numbers are hexadecimal, so the addresses range from 00 to FF (0 to 255 in base-10.) These memory locations can hold actual data (like the values that would be stored in simple variables, structs, or arrays in C.) And the instructions of the TOY program itself are stored in memory (usually in some part of memory separate from the part(s) being used for storing data.)

The TOY machine also has 16 registers which each hold a 4 digit hex number. Most of the TOY instructions operate on the values stored in these registers. If, for example, you want to add two of the values you have stored in memory, you first need to 'load' the values into registers from memory before you can add them (and put the result into a register.) These 16 registers are numbered 0 to F. You can refer to register 2, for example, by writing R2 or R[2]. (You will see both versions in lectures, precepts, exercises, etc. Just know that R2 is the same as R[2].) Important Note: Register 0 (R0) always holds the value zero. So you cannot use this register for any other value.

There is one additional register which is called the program counter, PC for short. It holds the address of the next instruction to be executed by the TOY machine. (Since all the TOY instructions are stored in memory, the PC will have a value between 00 and FF.)

In order to actually run any TOY programs, you need to use either the C or Java simulator (see the course webpage to find links to these.) I highly recommend that you take a look at the C code to see how it 'runs' a TOY program. Notice that after memory and the registers are intialized, the program is basically one big loop that continues until a HALT instruction is reached. The value of the PC increases by 1 each time through the loop, and the instruction at the address equal to the value of the PC is executed.


Example 1: Sum of integers from 1 to N

TOY         COMMANDS                C PROGRAM  (R1=N, R2=sum, R3=constant 1)
10: 81FF    R1 <- stdin             scanf("%4hX", &N);
11: 7200    R2 <- 0000              sum=0;
12: 7301    R3 <- 0001              do {
13: 1221    R2 <- R2 + R1               sum = sum + N;
14: 2113    R1 <- R1 - R3               N = N - 1;
15: D113    if (R1 > 0) PC <- 13    } while (N > 0);
16: 92FF    stdout <- R2            printf("%04hX\n", sum);
17: 0000    HALT                    return 0;
The leftmost column of numbers above is a sequence of TOY memory addresses. This program is stored in memory locations 10 to 17. Both (C and Java) TOY simulators assume that the first instruction of any TOY program is located at memory location 10. You can think of instruction 10 as being a bit like the first statement in the main() function of a C program. It may not be the first line in a file, but it is the first line that is executed when the program is run. If you look at the C code for the TOY simulator (toy.c), notice that the variable pc is initialized to 10 and that each time through the loop in the main() body of the program, pc is incremented by 1. When you trace through TOY code line-by-line, you're essentially doing 'the job' of the pc by keeping track of the line number (memory address).

In the COMMANDS column, I've given the meaning for each of the TOY instructions. You don't need to memorize any of the TOY instructions. Just use the TOY "cheat sheet" which lists the 16 instructions (0 - F). You can find a copy of this list at the very end of your course packet. For each instruction in my TOY program above, I looked at its 'opcode' (the first hex digit) to determine its operation. For example, the second instruction has opcode=7. My TOY cheat sheet tells me:

7: load address      R[d] <- addr

There are two types of instruction 'format'. I can see that this instruction uses Format 2 because it uses 'addr'. In a Format 2 instruction, the first hex digit is the opcode. The second hex digit is 'd', and the third and fourth hex digits combine to form the 'addr'. For instruction 7200, the second hex digit is 2, and the third and fourth are 00. Combining all this information, I have that this instruction puts the value 00 into register 2 (R[2]).

Whenever I'm given some TOY code, the first thing I do is go through and translate each command into 'what it does', as I've illustrated in the second column above. Then I can trace through the code with some actual values, just as I would if given some C code. In this case, I find that if I initialize R1 with the value 3, 0006 will print as a result. And by tracing through the code, I can determine that this TOY program will print the sum of the integers from 1 to N (which is input by the user.)

In the right-most column, I wrote a C program to do the same thing. Notice that if I assign variable names to my registers (R1=N and R2=sum), my TOY program becomes much easier to understand. (Note that I also needed a third register, R3, to store the constant value 1.) By comparing the TOY code to the C code, the structure of the TOY code becomes more apparent. Reading just the TOY code or even both the TOY code and the corresponding explanation of the COMMANDS, it might not have been completely obvious that I had a do-while loop. Writing C code that corresponds to the TOY code you're trying to read or write can often help.


Arithmetic and Logical Instructions

These instructions (1 - 6) are all Format 1 instructions. That means that the first hex digit encodes the opcode, the second hex digit is d, the third hex digit is s, and the fourth hex digit is t. Suppose I have initialized R1 with (39)16 and R2 with (3)16. And I want to know what value will be stored in R3 after each of these instructions:

The first two instructions are pretty simple: (39 + 3)16 = (3C)16 and (39 - 3)16 = (26)16.
The four logical instructions are bitwise operations. That means that I need 'bits', so I'm going to convert my hex numbers into binary numbers.
(39)16 = (0000 0000 0011 1001)2 (spaces for readability)
(3)16 = (0000 0000 0000 0011)2
For the and operation, I need to line up my two binary numbers and then apply the and operator to the pair of bits in each column. Recall that the and of 2 bits is 1 if they are both 1 and 0 otherwise. So I have:
0000000000111001
0000000000000011
----------------
0000000000000001
So R3 gets the value 1.
I'll do the same thing for xor but use the xor rule instead of the and rule. (The xor of 2 bits is 1 if one of the bits is 1 but not both.)
0000000000111001
0000000000000011
----------------
0000000000111010
So R3 is (0000 0000 0011 1010)2 = (003A)16
For shift left and shift right, I need the binary representation of R[s], which in this case is R1 = (0000 0000 0011 1001)2. R[t], which is R2 (= 3), is the number of bits to shift. To shift R1 left 3 bits, I want to move each bit from its current position to a position 3 bits to the left. That means that the leftmost 3 bits are just going to 'drop off' the number. And I need to shift 3 new bits in from the right. The new bits that get shifted in will always be all 0's. This operation is actually not as complicated as this explanation may make it sound. Instead of shifting all 3 bits at once (which is what one would normally do), here's what I get if I just shift 1 bit at a time:
Original bits:
0000000000111001
Shift left 1 bit:
0000000001110010
Shift left 2 bits:
0000000011100100
Shift left 3 bits:
0000000111001000
To summarize, R3 <- R1 << R2 = R3 <- (39)16 << 3 = R3 <- (0000000000111001)2 << 3 = R3 <- (0000000111001000)2 = R3 <- (01C8)16
I'll use the same approach to calculate (39)16 >> 3, except I'll shift to the right instead of to the left. Since the first digit of the 16-bit binary representation of (39)16 is 0, I'll shift in zero's. (If the first bit were a 1, I'd shift in 1's. This rule applies only to right shift. For left shift, always shift in 0's.)
R3 <- R1 >> R2 = R3 <- (39)16 >> 3 = R3 <- (0000000000111001)2 >> 3 = R3 <- (0000000000000111)2 = R3 <- (0007)16
Notice that I just 'lost' the three rightmost digits: 001 when I shifted right 3 bits.


Transfer Instructions

Here's the example code (which doesn't do anything useful) I gave in precept:

10: 7112    R1 <- 0012
11: 8210    R2 <- mem[10]
12: 9215    mem[15] <- R2
13: A201    R2 <- mem[R1]
14: B201    mem[R1] <- R2
15: 0000    halt
16: 0000    halt
If we step through the code line-by-line, we get
  1. R1 gets the value 0012
  2. R2 gets the value stored at memory location 10, which is 7112
  3. we store the value of R2 (which is now 7112) at memory location 15. This overwrites the instruction that's already there! So now our program looks like this:
    10: 7112    R1 <- 0012
    11: 8210    R2 <- mem[10]
    12: 9215    mem[15] <- R2
    13: A201    R2 <- mem[R1]
    14: B201    mem[R1] <- R2
    15: 7112    R1 <- 0012
    16: 0000    halt
    and we continue with the next instruction...
  4. R2 gets the value stored at the memory location addressed by the value in R1. R1 has the value 0012. So, R2 gets mem[12], which is 9215.
  5. we store the value of R2 (which is 9215) at the memory location addressed by the value in R1. R1 has the value 0012. So mem[12] gets 9215. We're overwriting this instruction with a new one, too, but it happens to be the same. So this has no effect.
  6. R1 gets the value 0012
  7. halt: program ends
Question: what happens if we change instruction 13 from A201 to A102?
(Hint: it's very bad... email me if you can't figure it out.)


implementing loops in TOY

In the TOY program given above that calculates the sum of the integers from 1, N, I used instruction D: branch positive to implement a do-while loop. In general, instructions C (branch zero) and D are the most useful instructions for implementing loops. Sometimes, you will need to manipulate the values in registers to test for a particular condition. For example, if you wanted a loop to repeat as long as the value in register A was less than 50, you could write something like this:

7B32    RB <- 32 (50: base 10)
2BBA	RB <- RB - RA
DBxx	if (RB > 0) jump to line xx.


implementing if-else in TOY

Consider the following C code, in which <A>, <B>, and <C> are each just 1 or more lines of code:

if (R1 > R2) {
    <A>
} else {
    <B>
}
<C>
If you want to implement this in TOY, you need to find a way to compare R1 and R2. Looking through the instructions, only 2 (C and D) compare values at all. And they both compare the value in a register to 0. So, just like the loop example above, we're going to subtract the value of R1 from R2 and then compare that difference to zero:
10: 2312    R3 <- R1 - R2
11: D320    if (R3 > 0) PC <- 20
12: code for <B>
 .
 .
 .
1x: C030    goto 30
 .
 .
 .
20: code for <A>
 .
 .
 .
2x: C030    goto 30
 .
 .
 .
30: code for <C>
Notice that in the TOY implementation, the code for <B> comes before the code for <A>. If R1 is greater than R2, then R3 will be greater than 0. So we jump to the if-code. Instructions 1x and 2x indicate the end of the else-code (<B>) and if-code (<A>), respectively. If you know exactly how many lines of TOY code you'll need for <A>, you can avoid jumping at line 2x (x is any hex digit > 0). But you will always need to jump after the <B> code to skip the <A> code. So that jump statement is necessary.


functions in TOY

... coming soon ...


arrays in TOY

... coming soon ...

 


Questions from Students

  1. I'm unsure of the meaning positive in the "branch positive" definition. Does it jump if the register holds a number which is negative according to two's compliment?
    Because strictly a negative number is still > 0