## **Instruction Pipelining**

• Instruction Pipeline

| Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | Cycle 5 | Cycle 6 | Cycle 7 | Cycle 8 | Cycle 9 |
|---------|---------|---------|---------|---------|---------|---------|---------|---------|
| Fetch   | Decode  | Operand | Execute | Store   |         |         |         |         |
|         | Fetch   | Decode  | Operand | Execute | Store   |         |         |         |
|         |         | Fetch   | Decode  | Operand | Execute | Store   |         |         |
|         |         |         | Fetch   | Decode  | Operand | Execute | Store   |         |
|         |         |         |         | Fetch   | Decode  | Operand | Execute | Store   |
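The staggered layout of the diagram can be reproduced with a short sketch. It assumes an ideal pipeline with no stalls, where instruction `i` enters stage `s` in cycle `i + s`; the function name `pipeline_schedule` is made up for this illustration.

```python
STAGES = ["Fetch", "Decode", "Operand", "Execute", "Store"]

def pipeline_schedule(num_instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [""] * total_cycles
        for s, stage in enumerate(STAGES):
            # Assumption: no stalls, so each instruction advances one stage per cycle.
            row[i + s] = stage
        rows.append(row)
    return rows

for row in pipeline_schedule(5):
    print(" ".join(f"{cell:8}" for cell in row))
```

With five instructions and five stages the schedule spans nine cycles, and once the pipeline is full one instruction completes per cycle.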

- PC is incremented by 4 at the Fetch stage to fetch the next instruction
- A jmp instruction causes a delay slot. Why?

| PC | PC' | Instruction  | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 | Cycle 5 |
|----|-----|--------------|---------|---------|---------|---------|---------|
| 8  | 12  | add          | add     | add     |         |         |         |
| 12 | 16  | jmp 40       |         | jmp     | jmp     |         |         |
| 16 | 40  | *delay slot* |         |         | delay   | delay   |         |
| 40 | 44  | sub          |         |         |         | sub     | sub     |
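The PC / PC' columns can be traced with a small sketch. It assumes the processor keeps a pair of registers, the current PC and the next PC (PC'), and that a jump only rewrites PC'. The `trace` function and the `nop` placed in the delay slot are illustrative assumptions; the table does not say which instruction occupies the slot.

```python
def trace(program, pc, steps):
    """Return the addresses executed, using a PC / PC' (next-PC) pair.

    program maps address -> ("jmp", target) or (name,) for anything else.
    Addresses missing from program are treated as non-branching instructions.
    """
    next_pc = pc + 4
    executed = []
    for _ in range(steps):
        executed.append(pc)
        instr = program.get(pc, ("other",))
        # The jump only changes PC'; PC already holds the delay-slot address,
        # so the instruction after the jump still runs.
        new_next = instr[1] if instr[0] == "jmp" else next_pc + 4
        pc, next_pc = next_pc, new_next
    return executed

# Hypothetical program matching the table: the delay slot holds a nop.
program = {8: ("add",), 12: ("jmp", 40), 16: ("nop",), 40: ("sub",)}
print(trace(program, 8, 4))  # [8, 12, 16, 40]
```

By the time the jump's target is known, the instruction at address 16 has already been fetched, so it executes before control reaches address 40.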

## **Delay-Slot Instructions**

- If this is a "feature," it certainly contradicts "normal" expectations
- If you think this is confusing, use a `nop` in all delay slots
- Optimizers may be able to take advantage of the delay slot

- The annul bit controls the execution of the delay-slot instruction

  `bg,a L1` with `mov a,c` in its delay slot

  The `,a` causes the `mov` instruction to be executed if the branch is taken, and not executed if the branch is not taken

- Exception

  `ba,a L` does not execute its delay-slot instruction
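The annul rules above fit in a small decision function. This is a sketch of the stated rules only; the function name and its parameters are invented for illustration.

```python
def delay_slot_executes(unconditional, taken, annul):
    """Decide whether the delay-slot instruction runs.

    unconditional: True for ba, False for a conditional branch such as bg
    taken:         whether the branch is taken (always True for ba)
    annul:         True when the branch carries the ",a" suffix
    """
    if not annul:
        return True      # without ",a" the delay slot always executes
    if unconditional:
        return False     # exception: ba,a never executes its delay slot
    return taken         # conditional ",a": executes only if the branch is taken

print(delay_slot_executes(unconditional=False, taken=True, annul=True))   # True
print(delay_slot_executes(unconditional=False, taken=False, annul=True))  # False
print(delay_slot_executes(unconditional=True, taken=True, annul=True))    # False
```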

• What is the advantage of this counterintuitive convention?

Copyright ©1997, Computer Science 217: Delay Slots, November 2, 1999

## Annul Bit, cont'd

• Optimized `for (i = 0; i < n; i++) 1; 2; ...; n`

| Label | Code      | Label | Better code (uses delay slots) |
|-------|-----------|-------|--------------------------------|
|       | `clr i`   |       | `clr i`                        |
|       | `ba L2`   |       | `ba,a L2`                      |
|       | `nop`     | `L1:` | `2`                            |
| `L1:` | `1`       |       | `...`                          |
|       | `2`       |       | `n`                            |
|       | `...`     |       | `inc i`                        |
|       | `n`       | `L2:` | `cmp i,n`                      |
|       | `inc i`   |       | `bl,a L1`                      |
| `L2:` | `cmp i,n` |       | `1`                            |
|       | `bl L1`   |       |                                |
|       | `nop`     |       |                                |
|       | n + 4 instructions/iteration | | n + 3 instructions/iteration |
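The per-iteration counts can be checked with a toy simulator. This is a sketch under several assumptions: instructions are addressed by list index rather than byte address, the loop bound (`bound`) is kept separate from the number of body instructions (`body_len`, assumed to be at least 1), and an annulled delay slot is not counted as executed. All function names are hypothetical.

```python
def ins(op, arg=None, annul=False, label=None):
    return (label, op, arg, annul)

def plain_loop(body_len):
    loop = [ins("body", k) for k in range(1, body_len + 1)] + [ins("inc")]
    loop[0] = ins(loop[0][1], loop[0][2], label="L1")
    return ([ins("clr"), ins("ba", "L2"), ins("nop")] + loop +
            [ins("cmp", label="L2"), ins("bl", "L1"), ins("nop")])

def delay_slot_loop(body_len):
    # Instruction 1 moves into the delay slot of bl,a; L1 now labels instruction 2.
    loop = [ins("body", k) for k in range(2, body_len + 1)] + [ins("inc")]
    loop[0] = ins(loop[0][1], loop[0][2], label="L1")
    return ([ins("clr"), ins("ba", "L2", annul=True)] + loop +
            [ins("cmp", label="L2"), ins("bl", "L1", annul=True), ins("body", 1)])

def count_executed(program, bound):
    """Run program on a PC / next-PC machine; return the instructions executed."""
    labels = {lab: idx for idx, (lab, _, _, _) in enumerate(program) if lab}
    pc, npc = 0, 1
    i, less, executed, annul_next = 0, False, 0, False
    while pc < len(program):
        _, op, arg, annul = program[pc]
        target = None
        if annul_next:
            annul_next = False              # annulled delay slot: skipped, not counted
        else:
            executed += 1
            if op == "clr":
                i = 0
            elif op == "inc":
                i += 1
            elif op == "cmp":
                less = i < bound
            elif op in ("ba", "bl"):
                taken = op == "ba" or less
                if taken:
                    target = labels[arg]
                # ",a" annuls the slot for ba, or for an untaken conditional branch
                annul_next = annul and (op == "ba" or not taken)
        pc, npc = npc, (target if target is not None else npc + 1)
    return executed

n = 3
for build in (plain_loop, delay_slot_loop):
    per_iteration = count_executed(build(n), 5) - count_executed(build(n), 4)
    print(build.__name__, per_iteration)
```

Subtracting the totals for two consecutive loop bounds isolates the cost of one iteration. With `n = 3` the difference is 7 for the plain loop (n + 4) and 6 for the delay-slot version (n + 3).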

## Programming Conventions

- Don't use annul unless absolutely necessary
- Place a `nop` after control-transfer instructions

• What happens when the delay-slot instruction is a control-transfer instruction?
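One way to explore this question is to run it through a simple PC / next-PC model. The sketch below shows only what that model predicts. It is not a statement of what any particular processor does, and the architecture manual should be consulted for which combinations are actually defined. The addresses and the `trace` helper are hypothetical.

```python
def trace(program, pc, steps):
    """Return the addresses executed under a PC / next-PC model.

    program maps address -> ("jmp", target); missing addresses do not branch.
    """
    next_pc = pc + 4
    executed = []
    for _ in range(steps):
        executed.append(pc)
        instr = program.get(pc, ("other",))
        # A jump only rewrites the next-PC register.
        new_next = instr[1] if instr[0] == "jmp" else next_pc + 4
        pc, next_pc = next_pc, new_next
    return executed

# Hypothetical: the delay slot of "jmp 40" at address 12 holds "jmp 60".
program = {12: ("jmp", 40), 16: ("jmp", 60)}
print(trace(program, 12, 5))  # [12, 16, 40, 60, 64]
```

In this model exactly one instruction at the first target (address 40) executes before control moves to the second target (address 60). That is hard to reason about, which is one reason for the convention of placing a `nop` after control-transfer instructions.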