COS 109: Problem Set 2

Due 5:00 PM, Tuesday October 3, on paper, in the box outside my office (Room 419, 4th floor, CS building) or in class.

Collaboration policy for COS 109: Working together to really understand the material in problem sets and labs is encouraged, but once you have things figured out, you must part company and compose your written answers independently. That helps you to be sure that you understand the material, and it obviates questions of whether the collaboration was too close.

You must list any other class members with whom you collaborated.

To hand in the assignment use the drop box at https://dropbox.cs.princeton.edu/COS109_F2017/Problem_Set_2

Problem set answers need not be long, merely clear enough that we can understand what you have done, though for computational problems, show enough of your work that we can see where your answer came from. There is no need to repeat the question, and it saves paper if you just give the answers. PLEASE submit typed material, not hand-written, and keep a copy for yourself just in case something goes astray. Thanks.

And a reminder from the first problem set: Keep in mind that if the data you start with is approximate, the results cannot be precise.

 

1. Big Data, Big Numbers, Big Errors

The world is full of big numbers, some of which mean something, but most people have no understanding or intuitive grasp of how big such numbers are, nor are they comfortable doing arithmetic on them. The purpose of this problem is to get you to work with some big numbers. In these problems, and in the course in general, it's preferable to use exponents like 1012 rather than long decimal numbers like 100000000000; exponents are easier to read and compute with, so you are less likely to make arithmetic errors.

In what follows, we will follow the standard convention and when representing units, use B for bytes and b for bits. So, 1 Gb is one Gigabit and 1 GB is one Gigabyte. 1 Gbps represents a transfer rate of 1 gigabit per second.

(a) Companies like Google and Facebook have multiple data centers that store copies of their information. If a company builds a new data center, it must clone the data from an existing data center. It is estimated that Google has accumulated over 20 petabytes of data in Google Maps.

For parts (i) and (iii) below, ignore overhead and delays; this is entirely about transfer rates.

      (i) How long would it take to send the information to a new data center over a 100 Gbps data link?

      (ii) Alternatively, you could store the information on DVD's. Each DVD can hold about 5 GB of data and is 1.2 mm thick. If you were to store the information on DVD's, about how many would be needed? And, if you were to stack the DVD's, how high would the stack be?

      (iii) If you were to store all the information on disks which had very high capacity and drive a truck full of disks with the same information from California to New York, how would be the transfer rate compare to that computed in part (i). (N.B. Google maps tells us that this trip takes 42 hours.)

(b) A computer's CPU has a clock that synchronizes all circuitry. For this problem, we will assume that each gate of the circuit executes once each time the clock ticks and that these gate executions taken together enable the CPU to perform one instruction (e.g. adding or multiplying 2 numbers) with each tick of the clock.
The rate of clock ticking is given as a frequency. For example, a CPU with a frequency of 800 MHz has a clock that ticks 800 million times per second.

      (i) How many instructions could your computer be carrying out during the 75 minutes of a COS 109 lecture? If you don't know the clock speed of your computer, assume that it runs at 3 GHz.

      (ii) If passwords are chosen from a 26 letter alphabet and are 6 characters long, then there are 26 6 possible passwords. Assume that a password can be tested in about 100 instructions. How long will it take to try all passwords on your computer?

(c) Here are some facts that you will need to do this problem:

      (i) The websites https://zephoria.com/top-15-valuable-facebook-statistics/ and https://www.statista.com/statistics/398136/us-facebook-user-age-groups/ contain a number of statistics about facebook users. Among these facts is the estimate that facebook has 2.01 billion users of whom 1.32 billion use facebook every day. In the United States, the number of users is 214 million. You can assume that the fraction of facebook user in the United States who use facebook each day is the same as the fraction of all facebook users who do so. Further, you can assume that the average time spent on facebook each day is 39 minutes. Consider this in light of the following facts:

      (i) A person who walks a mile in 15 minutes burns about 100 calories and for every 3500 calories you burn without changing eating habits, you lose a pound.

      (ii) It has been estimated that the average American is 17 pounds over weight.

Put all of these numbers together and making the assumptions that:

      (i) people would walk instead of using facebook

      (ii) people would eat no more even though they were walking

      (iii) people who do not use facebook will not lose weight in this experiment.

With these assumptions, how long would it take before the average weight in the United States no longer was over weight?

 

2. Bits, Bytes, Binary and my T shirts

(a) Now that you get the joke, write out the decimal numbers 15, 32, 60, 127, 8191 in binary. To be sure that you understand what you're doing, do the conversions between decimal and binary by hand, not by a program or a calculator. Show your work in all cases.

(b) Some years ago, a graduating class gave me this shirt

Which class were they?

(c) It has been estimated that Fed Ex averages 3.4 million packages per day. Each package is assigned a code number.
    (i) If we assume that code numbers are not repeated for a month, how many bytes are required to provide unique code numbers to all packages?
    (ii) How does your answer to (i) change for UPS which averages 16 million packages per day
    (iii) What range of numbers can you represent with the fingers and thumbs of two hands if you treat each finger and thumb as a binary digit?
    (iv) How much larger does this number get if you use all fingers and all toes?

 

3. A quantitaive exercise

Estimate the number of spoons that the dining hall at Forbes college has. You can assume that they restock at the beginning of the academic year and your estimate should be a prediction of that number. You should also estimate how many spoons there are at the end of the academic year.
As with the problems of this sort that we have worked in class, it is not necessary to get the precise answer. What is important is to explain the assumptions you have made and to give the rationale for your assumptions. Feel free to work in round numbers as you develop your estimate.