5.3. Instruction Set Architecture


This section under major construction.


The instruction set architecture (ISA) is the interface between the TOY programming langauge and the physical hardware that executes the program. The ISA specifies the size of main memory, number of registers, and number of bits per instruction. It also specifies exactly which instructions the machine is capable of performing and how each of the instruction bits is interpreted.

The TOY ISA. The TOY machine has 256 words of main memory, 16 registers, and 16-bit instructions. There are 16 different instruction types; each one is designated by one of the opcodes 0 through F. Each instruction manipulates the contents of memory, registers, or the program counter in a completely specified manner. The 16 TOY instructions are organized into three categories: arithmetic-logic, transfer between memory and registers, and flow control. The table below gives a brief summary. (Here is a text version of the TOY cheatsheet.) We describe them in more detail later.


OPCODE DESCRIPTION FORMAT PSEUDOCODE
0 halt - exit
1 add 1 R[d] <- R[s] + R[t]
2 subtract 1 R[d] <- R[s] - R[t]
3 and 1 R[d] <- R[s] & R[t]
4 xor 1 R[d] <- R[s] ^ R[t]
5 left shift 1 R[d] <- R[s] << R[t]
6 right shift 1 R[d] <- R[s] >> R[t]
7 load address 2 R[d] <- addr
8 load 2 R[d] <- mem[addr]
9 store 2 mem[addr] <- R[d]
A load indirect 1 R[d] <- mem[R[t]]
B store indirect 1 mem[R[t]] <- R[d]
C branch zero 2 if (R[d] == 0) pc <- addr
D branch positive 2 if (R[d] > 0) pc <- addr
E jump register - pc <- R[d]
F jump and link 2 R[d] <- pc; pc <- addr


Each TOY instruction consists of 4 hex digits (16 bits). The leading (left-most) hex digit encodes one of the 16 opcodes. The second (from the left) hex digit refers to one of the 16 registers, which we call the destination register and denote by d. The interpretation of the two rightmost hex digits depends on the opcode. With Format 1 opcodes, the third and fourth hex digits are each interpreted as the index of a register, which we call the two source registers and denote by s and t. For example, the instruction 1462 adds the contents of registers s = 6 and t = 2 and puts the result into register d = 4. With Format 2 opcodes, the third and fourth hex digits (the rightmost 8 bits) are interpreted as a memory address, which we denote by addr. For example, the instruction 9462 stores the contents of register d = 4 into memory location addr = 62. Note that there is no ambiguity between Format 1 and Format 2 instruction since each opcode has a unique format.

Format 1 and 2 TOY Instructions


Modern day microprocessors and ISAs. Today, there is a wide variety of ISAs used on modern day microprocessors: IA-32 (Intel, AMD), PowerPC (IBM, Motorola), PA-RISC (Hewlett-Packard), and SPARC (SUN Microsystems). These ISAs typically access millions of words of main memory, each of which is 32 or 64 bits wide. For example IA-32 has 8 32-bit general purpose registers; PowerPC has 32 64-bit general purpose registers. The number of instruction types varies greatly from half a dozen to several hundred. The ISA is chosen in order to make it easy to build the underlying hardware and compilers, while striving to maximize performance and minimize cost. Unfortunately, sometimes both goals are sacrificed to maintain backward compatibility with obsolete hardware. There are always tradeoffs.

The TOY machine possesses all of the essential features of modern day microprocessors. However, it is much easier to understand because it has only 16 instructions and two instruction formats. In contrast, the IA-32 (the one used on Intel PC's) has more than 100 instruction types and over a dozen different instruction formats. As a programming language, TOY is also much simpler than the Java programming language. This makes it easier to fully understand, but not necessarily to write code or debug. A Java compiler (e.g., javac) is a program that automatically converts Java code into the ISA of the computer on which it is to be executed. You will soon appreciate the convenience of working in a higher-level language like Java.

Q + A

Q. What's the difference between load address, load, and load indirect?

A. All three instructions load some value into a register. Load address is the simplest: 7C30 loads the value 0030 into register C. Both load and load indirect copy the contents of some memory cell into a register. Suppose that memory location 30 contains AAAA, memory location 31 contains BBBB, and that register A is storing the value 0031. Then 8C30 loads the value AAAA into register C, while AC0A loads the value BBBB into register C.

Q. What's the difference between E100 and E1CC?

A. Nothing. The last two hex digits are ignored with opcode E. Similarly, the third hex digit is ignored with opcodes A and B. The last three hex digits are ignored with opcode 0.

Q. What does 5ABC (left shift) do if register C contains a value bigger than 15?

A. Since it only makes sense to left shift 15 positions, TOY only considers the last hex digit of register C to determine the number of bits to shift. So it is not true that if register C is FFFD (-3) that the left shift would become a right shift by 3.

Q. What would happen if the program counter were FF?

A. The machine would try to read whatever value is stored in memory location FF. Since FF is used for standard input, the machine would fetch the next instruction from standard input. The program counter would then be incremented to 00. It would be extremely rare for a TOY programmer to write a program with this intent.

Exercises

  1. Give a single instruction that changes the program counter to memory address 15 regardless of the contents of any registers or memory cells.

    Answer: C015 or F015. Both instructions rely on the fact that register 0 is always 0000.

  2. List seven instructions (with different opcodes) that put 0000 into register A.

    Answer: 1A00, 2Axx, 3A0x, 4Axx, 5A0x, 6A0x, 7A00, where x is any hex digit.

  3. List three ways (different opcodes) to set the program counter to 00 without changing the contents of any register or memory cell.

    Answer: C000, E0xy, F000.

  4. List five instructions (with different opcodes) that are no-ops. Exclude cases where the destination register is 0.

    Answer: 1xx0, 1x0x, 2xx0, 3xxx, 5xx0, 6xx0, where x is any hex digit other than 0.

  5. List a Format 2 instruction that is a no-op.

    Answer: D0xy where xy is any two hex digit number.

  6. List six ways to assign the contents of register B into register A.

    Answer: 1AB0, 1A0B, 2AB0, 3ABB, 4A0B, 4AB0, 5AB0, and 6AB0. where x is any hex digit.

  7. There is no bitwise or operator in TOY. (In Java it is |.) Explain how to compute RC <- RA | RB via a sequence of three TOY instructions. That is, the ith bit of RC should equal 1 if and only if either the ith bit of RA or the ith bit of RB is 1.

    Answer: 3DAB 4EAB 1CDE.

  8. There is no bitwise nor operator in TOY (or Java). Explain how to compute RC <- RA nor RB via a sequence of TOY instructions. That is, the ith bit of RC should equal 0 if and only if either the ith bit of RA or the ith bit of RB is 1.
  9. There is no bitwise nand operator in TOY (or Java). Explain how to compute RC <- RA nor RB via a sequence of TOY instructions. That is, the ith bit of RC should equal 0 if and only if both the ith bits of RA and RB are 1.
  10. There is no bitwise not operator in TOY. (In Java it is ~.) Explain how to compute RB <- ~RA via a sequence of three TOY instructions. That is, RB should be equal to RA except that every bit is flipped.

    Answer: 7101 2B01 4BAB or 7101 2B01 2BBA.

  11. There is no absolute value function in TOY. Explain how to compute RB <- |RA| using as few instructions as possible.
  12. There is no branch if nonnegative operator in TOY. Explain how to jump to memory address 15 if register A is greater than or equal to 0.

    Answer: Use both branch is positive and branch if zero, one after the other: CA15 DA15.

  13. Show that the subtract operator is redundant. That is, explain how to compute RC <- RA - RB by using a sequence of TOY instructions that don't involve opcode 2.
  14. Which of the 16 TOY instructions do not use all 16 bits? Answer: halt (only uses first 4 bits), load indirect (does not use last 4 bits), store indirect (does not use last 4 bits), jump register (does not use last 4 bits).
  15. We interpret some TOY integers to be negative according to two's complement notation. What are the only instructions for which this matters?

    Answer: The branch if positive treats integers between 0001 and 7FFF as positive. The right shift is a signed shift so that if the leftmost bit is 1, then 1's are prepended to the beginning. All other instructions (even subtract!) do not depend on whether TOY has negative integers or not.

Creative Exercises

  1.