**Due midnight Thursday November 5**

Collaboration policy for COS 109:
Working together to really understand the material in problem sets and
labs is encouraged, but once you have things figured out, you must part
company and compose your written answers independently. That helps
you to be sure that you understand the material, and it obviates questions
of whether the collaboration was too close.
You must list any other class members with whom you collaborated. |

Problem set answers need not be long, merely clear enough that we can understand what you have done, though for computational problems, show enough of your work that we can see where your answer came from. Please type your answers if at all possible. Thanks.

Don't forget that if the data you start with is approximate, the results cannot be precise.

(a) The New York Times said a couple of years ago, "Imagine a bookshelf 150 miles long: the content of the books on that shelf would be roughly one terabyte of data." Is this estimate much too high, much too low, or sort of reasonable, and why? Assume the books are just text, like a novel.

(b) A story in the *Prince* on 11/4/14 said that in the 2008-2009
academic year, "Princeton students printed 11,040,632 sheets of paper,
enough to stretch from Princeton to Salt Lake City, or a stack 17 times
taller than Fine Hall." The number is obviously speciously precise, but...

(i) Using this factoid, compute approximately how far
Salt Lake City is from Princeton. (Don't look it up!)

(ii) Approximately how tall is Fine Hall? (Ditto.)

(c) On 11/17/19, the *NY Times* said that by 2025, "stored
data worldwide will total 175 zettabytes = the storage capacity
of ~1.75 billion human brains."
What is the capacity of a human brain in gigabytes?

(d) The same article said that one million petabytes is the capacity of 250 billion DVDs. What is the capacity of a DVD in gigabytes? (You can think about but need not report how your brain capacity compares with that of a DVD.)

(e) Old versions of Excel supported a maximum of 65,536 rows; columns are labeled A, B, ..., AA, AB, ..., through IV. Explain why the last column has that label.

(f) Microsoft's web page says that Excel 2007 "increases the maximum
number of rows per worksheet from 65,536 to over 1 million," and says
that the last column is now labeled XFD.

(i) How many bits are likely used to represent
the row number in Excel 2007?

(ii) Explain briefly where "XFD" likely comes from.

(a) Suppose that Alice, Bob and Carol have a relationship in which

- each person needs to encrypt their own personal information, for which each person needs an individual key;
- each person needs to carry on private conversations with each other person (i.e., Alice and Bob want to talk without Carol being able to know what they are saying, Alice and Carol want to talk without Bob, and so on);
- Alice, Bob and Carol need to talk among themselves while preventing outsiders from understanding those conversations as well.

(b) Suppose that David joins the group, and relationships are even more complicated: they need keys for every individual and every possible combination of private conversations among subgroups (e.g., Alice, Carol and David want to be able to talk among themselves while excluding Bob) as well as excluding outsiders. How many different secret keys are now needed? List the keys by the people who will share them.

(c) As the group grows, more and more keys will be needed. Give a general formula or way to express how the number of keys varies as a function of N, the number of people involved. If you don't see a formula, then give the numbers you have computed, for 1 through 6 people.

(Hint: figure out how many keys are needed for one person, for two
people and so on up to five people, combine that with what you did
above, and look for a pattern.
A different hint: make a table with a column for each person and a row
for each group.
For each group put a 1 in column *i* if person *i* is a member
of that subgroup and 0 otherwise. How many rows does the table have?)

Let's test its survivability. There may be multiple answers to some of these questions; you only need to give one.

(a) What is the smallest set of **links** that you would have
to destroy to completely disconnect Princeton from Stanford? List the links
by their endpoints, for example A-B or Stanford-F.

(b) If a router is destroyed, any link connected to it no longer
works. What is the smallest set of **routers** (not including
Stanford or Princeton themselves) that you would have to destroy to
disconnect Princeton from Stanford? List the routers by letter.

(c) What routers are exactly two "hops" away from Princeton? That is, what
routers can you reach from Princeton by following a path that goes
through *exactly two* links? Don't count direct loopbacks like
Princeton-D-Princeton; that is, Princeton is not two hops from
Princeton. (Note that there are some situations where a node can be
reached by paths of different lengths.)

(d) What is the shortest path (that is, the path with the fewest links) from Princeton to Stanford? Give the sequence of routers that make up the path.

(e) What is the longest path from Princeton to Stanford that doesn't have a loop, that is, that doesn't go through any router more than once?

(f) Suppose that you want to build a network in which every
router can communicate with every other router, either directly or indirectly
by passing information through other routers. For that kind of network,
what is the absolute minimum number of links necessary to connect
*R* routers?

Please use this Word template for your answers. It would be a great help if you could type your answers.

Submit your problem set ** in PDF format ** by uploading a file
called `pset7.pdf` to
https://tigerfile.cs.princeton.edu/COS109_F2020/Pset7.
You can submit as many times as you like; we will only look at the last one.