COS 109: Problem Set 1

Sat Sep 20 14:50:11 EDT 2014

Due 5:00 PM, Wednesday Sept 24, on paper, in the box outside my office (Room 311, 3rd floor, CS building) or in class.

Collaboration policy for COS 109: Working together to really understand the material in problem sets and labs is encouraged, but once you have things figured out, you must part company and compose your written answers independently. That helps you to be sure that you understand the material, and it obviates questions of whether the collaboration was too close.

You must list any other class members with whom you collaborated.

Problem set answers need not be long, merely clear enough that we can understand what you have done, though for computational problems, show enough of your work that we can see where your answer came from. There is no need to repeat the question, and it saves paper if you just give the answers. PLEASE submit typed material, not hand-written, and keep a copy for yourself just in case something goes astray. Thanks.

And a reminder from the first problem set: Keep in mind that if the data you start with is approximate, the results cannot be precise.

 

1. Big Data, Big Numbers, Big Errors

The world is full of big numbers, some of which mean something, but most people have no understanding or intuitive grasp of how big such numbers are, nor are they comfortable doing arithmetic on them. The purpose of this problem is to get you to work with some big numbers. In these problems, and in the course in general, it's preferable to use exponents like 1012 rather than long decimal numbers like 100000000000; exponents are easier to read and compute with, so you are less likely to make arithmetic errors.

(a) Companies like Google and Facebook have multiple data centers that store copies of their information. If a company builds a new data center, it must clone the data from an existing data center. Suppose that a data center has 10 PB of data.

      (i) How long would it take to send the information to a new data center over a 100 Gbps data link?

      (ii) How long would it take you to drive a truck full of disks with the same information from California to New York?

For both parts, ignore overhead and delays; this is entirely about transfer rates. And pay attention to units: B is conventionally bytes and b is bits.

(b) An article about municipal fiber systems talks of "gigabit speeds, a rate that can download a 4.5 gigabyte movie in 36 seconds. The same file takes an hour at 10 megabits per second." How many gigabits per second does "gigabit speeds" mean here?

(c) An article on cloud storage in the New York Times on 8/25/14 says that 2 GB would hold 1000 books or 7 minutes of HDTV. How many GB would it take to store an episode of "Game of Thrones" in HDTV? How many GB (very roughly) would it take to store the script for the episode as text?

(d) A news story says "The first GB storage device in 1980 cost $120K, and weighed 250 kg." A few weeks ago I bought a 32 GB SD card for my camera; it cost me $18, and weighs 2 gm. SanDisk just announced a 512 GB SD card at $800. For each of these devices, compute the approximate cost per bit and the approximate weight per bit.

(e) A gizmodo.com article about Netflix says "If you had to take a 7.5TB update on your home internet, you would be screwed. The average download speed in the United States is 18.6Mbps. So with an average connection, that 7.5TB would take you about 40 days to chew through." Is 40 days much too long, much too short, or about right, and (reasoning quantitatively) why do you say so?

(f) A blog post about Facebook's "Look Back" video feature says "To store videos for everyone, we would need an estimated 25 petabytes of disk space. Visualized as a stack of standard 1TB laptop drives, this would be a tower measuring more than 775 feet tall -- two-thirds the height of the Empire State Building." How thick is a drive?

 

2. Bits, Bytes and Binary

"There are only 10 types of people in the world: those who can read binary and those who can't."


© Bill Amend, Foxtrot

(a) Now that you get the joke, write out the decimal numbers 15, 16, 17, 31, 32, 33, 63, 64, 65, 127, 128, 129, 255, 256, 257 in binary and hex. To be sure that you understand what you're doing, do the conversions between decimal, binary and hex by hand, not by a program or a calculator.

(b) An electronic car door key and its car must share some unique identification number so that my key won't unlock your car and vice versa.
    (i) Estimate roughly how many bits would be needed to provide each car in the United States with a unique identification number. Briefly explain your reasoning.
    (ii) Exactly how many bytes would this number require? [Note: there are no fractional bytes; it's always an integral number.]
    (iii) Suppose we include Canadian cars. How many bytes would now be required?

(c) (i) How many bits are needed to represent the current senior class year (i.e., 2015) as a binary number?
    (ii) How many bytes are needed?
    (iii) For what year will this number of bits increase?
    (iv) How many bytes will then be needed?

(d) (i) What range of numbers can you represent with the fingers and thumbs of two hands if you treat each finger and thumb as a binary digit?
    (ii) Draw a picture of a pair of hands displaying the number 132. (No artistic talent is required or expected.)