### Problem Set Number 8, Computer Science 471

Due by 5 PM, Monday, Nov. 25, 1996

1. (5 points) Exercise 7.10 from the text.

2. (5 points) What's good about a direct-mapped cache whose size is less than or equal to the virtual memory page size? Hint: try adding such a cache to your picture from question 1. How can adding associativity achieve the same benefit for bigger caches?

3. (16 points) (An improved version of Exercise 7.17 from the text.) Consider the D-stream reference of a load instruction about to be executed in a system with a physically-addressed cache much larger than one page. This reference's address mapping is either in the TLB or not; its word of data is either in the cache or not; and its virtual memory page is either currently in physical memory or not. With these three two-choice possibilities, there would seem to be eight possible combinations for a load reference. Some of these eight cannot occur for various reasons, including their being forbidden by the memory-management system, while some of them represent perfectly normal situations.

For each of the eight combinations, please say whether or not the combination can occur, and if it can't, why not. For the ones that can occur, please describe what will happen as a result. Also, give the very rough frequency and the very, very rough cost (in cycles) for the possibilities that can occur. (Your very rough frequencies should sum to roughly 100 percent, of course.)
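As a starting point, the eight combinations can be enumerated mechanically. This is only a worksheet skeleton (the labels are mine, not part of the exercise); deciding which rows can actually occur is the substance of the question:

```python
from itertools import product

# The three independent two-way possibilities for one load reference.
dimensions = [
    ("TLB",   ["hit", "miss"]),
    ("cache", ["hit", "miss"]),
    ("page",  ["in memory", "not in memory"]),
]

# Enumerate all 2 * 2 * 2 = 8 combinations as a checklist.
combos = list(product(*[values for _, values in dimensions]))
for tlb, cache, page in combos:
    print(f"TLB {tlb:4} | cache {cache:4} | page {page}")
```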

4. (10 points) Set up an equation for CPU time (in cycles, of course) that expresses the effects of cache and TLB misses in a system with a 2-level cache. (In such a system, misses from the 1st-level caches are sent to the much bigger and slower 2nd-level cache. Only misses from the 2nd-level cache are sent to main memory.) There are separate physically-addressed 1st-level I- and D-caches, separate I-stream and D-stream TLBs, and one 2nd-level cache. Carefully define whatever terms you need, and try to make the equation simple (and accurate!). Treat loads and stores identically, and ignore write-back penalties and page faults. Assume that references that hit in the 1st-level caches and TLBs have the same CPI as non-references (like add). There are many ways to answer this question, but please pay special attention to giving precise definitions of the terms in your equation, because there are also many ways to get this equation slightly wrong! Hint: try using misses per instruction as your metric, rather than miss ratio.
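To see why the hint matters, here is the arithmetic that connects the two metrics, with made-up example numbers (these values are illustrations only, not data from the problem): misses per instruction is just the miss ratio scaled by references per instruction, which lets miss stalls be added directly to CPI.

```python
# Made-up example values, purely for illustration.
d_miss_ratio = 0.05           # fraction of D-stream references that miss
loads_stores_per_instr = 0.4  # D-stream references per instruction

# Misses per instruction = miss ratio * references per instruction.
d_misses_per_instr = d_miss_ratio * loads_stores_per_instr

# Each miss costs some penalty, so the stall contribution to CPI is:
miss_penalty_cycles = 20
stall_cpi = d_misses_per_instr * miss_penalty_cycles

print(d_misses_per_instr, stall_cpi)  # 0.02 misses/instr, 0.4 cycles/instr
```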

5. (10 points) Here's an idea for speeding up cache access without going all the way to a virtual cache: use a virtual index but a physical tag. In other words, use bits from a reference's virtual address as the index, but store, in the tag part of the cache, bits from the reference's physical address. Could this idea be made to work? Draw a picture of a TLB and cache using this idea, and show the circuit for the cache hit signal, including exactly which bits get compared with which other bits to produce the address match. Assume all this: 32-bit virtual (byte!) addresses, 30-bit physical addresses, 1 Kbyte pages, 64-byte cache blocks, direct-mapped cache and TLB, and cache size 128 Kbytes. What are the advantages and the drawbacks of this idea, in your view?
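Before drawing the picture, it may help to check the field widths implied by the stated parameters. This sketch computes only widths (the variable names are mine; deciding which bits belong in the physical tag and in the comparison is exactly what the question asks of you):

```python
from math import log2

# Parameters given in the problem statement.
VIRT_BITS   = 32          # virtual (byte) address bits
PHYS_BITS   = 30          # physical address bits
PAGE_BYTES  = 1 << 10     # 1 Kbyte pages
BLOCK_BYTES = 64          # 64-byte cache blocks
CACHE_BYTES = 128 << 10   # 128 Kbyte direct-mapped cache

page_offset_bits  = int(log2(PAGE_BYTES))    # bits unchanged by translation
block_offset_bits = int(log2(BLOCK_BYTES))   # byte-within-block bits
index_bits        = int(log2(CACHE_BYTES // BLOCK_BYTES))  # blocks in cache

# How many index bits lie above the page offset, i.e. come from the
# virtual page number rather than from the untranslated offset:
virtual_index_bits_above_page_offset = (
    index_bits + block_offset_bits) - page_offset_bits

print(page_offset_bits, block_offset_bits, index_bits,
      virtual_index_bits_above_page_offset)  # 10 6 11 7
```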

6. (1 point) How long did this take you, not counting the reading, and with whom did you work?