COS 109: Problem Set 2 1. Things That Go Round and Round (a) Approximately what disk capacity would be required to hold everything you have heard in your life so far, in MP3 format? To unify everyone's computations, assume that you are 20 years old, and that you're listening all the time even when you're asleep. 20 * 365 * 24 * 60 minutes is about 10.5 M minutes so it's about 10 or 11 TB; i would also take 10.5 TB (or the equivalent in any correct units) but it doesn't warrant any more precision than this. (b) A news story on 3/8/09 about software piracy said that the Pirate Bay's servers contained about 65 terabytes of files, corresponding to around 16,000 full-length movies. If 16,000 is right, about many gigabytes are there in a typical movie? What independent information do you have that would tend to support or contradict this value? ("Use the notes, Luke.") 65 * 10^12 / 16 * 10^3 is about 4 * 10^9, or 4 GB the notes say somewhere that a DVD holds 4.7 GB (c) Google has multiple data centers that store copies of their searchable information. If they build a new data center, they have to clone the data from an existing data center. Suppose that a center has 10 PB of disk space (I'm making this number up). Is it faster to ship a bunch of disks say from California to New York by truck or to send the information over a 100 Gbps data link? Explain your answer quantitatively. Ignore overhead and delays; this question is entirely about transfer rates. And pay attention to units: B is conventionally bytes and b is bits. 10 PB is 10^16 bytes 100 Gbps is about 10^10 bytes/second so 10^6 seconds so at least 10 days a truck ought to be able to drive say 2500-3000 miles in 5-6 days anything that gets these basic ideas right and does sensible arithmetic on them is acceptable. (d) If the disk in a classic iPod is 1 inch in diameter and rotates at 3,600 revolutions per minute, about how fast in inches per second is information traveling past the read head at the outermost edge of the disk? If a desktop disk is 3.5 inches in diameter and rotates at 7,200 rpm, how fast is the information traveling at its outermost edge? 3600 rpm is 60 rps circumference is pi 60 * 3.14 = 188 so about 190 inches/sec? anything in this range, and i don't mind a couple of significant figures here. 7 times faster than the previous, since diameter is 3.5x and speed is 2x. note that no real work is necessary: it's just a factor of 7. (e) An article in the New York Times on 2/7/09 said that the organization that manages the Harvard endowment grew by 33% under its previous manager, but the new manager is going to cut employment by 25%. If Harvard started with 75 employees in this group before the previous manager started hiring, how many will be left after the new manager finishes firing? 75 increased by 1/3 to 100 100 cut by 25% to 75 nothing magic here, but there are no fractional people! (f) Another article in the New York Times about the same time said that the ball used in a Chicago version of softball has a 16 inch diameter. What would be the circumference of this ball? A correction a couple of days later said that the circumference is 16"; in that case, what's the diameter of a Chicago softball? A regulation softball is 12 inches in circumference. What is the ratio of its diameter to a Chicago softball? 16 pi 16 / pi 12/16 or 3/4 2. Bits, Bytes and Binary (a) Now that you get the jokes, write out the decimal numbers 15, 16, 17, 31, 32, 47, 63, 64, 99, 100 in binary and hex. (To be sure that you understand what you're doing, do the conversions between decimal, binary and hex by hand, not by a program or a calculator.) 15 1111 F 16 10000 10 17 10001 11 31 11111 1F 32 100000 20 47 101111 2F 63 111111 3F 64 1000000 40 99 1100011 63 100 1100100 64 (b) How many bits are needed to represent the current senior class year (i.e., 2010) as a binary number? How many bytes are needed? For what year will this number of bits increase? How many bytes will then be needed? 11 bits, 2 bytes 2048 2 bytes (12 bits) answers related to the number of students in the class are wrong; the question is about the year itself, not the number of bodies. also, there are no fractional bytes; bytes are 8 bits so one has to round up. (c) "The human genome consists of some 2.9 billion of the letters AGCT, the equivalent of about 750 megabytes of data." (NY Times, 6/26/07) (i) If the letters are represented in ASCII, about how many megabytes would be required to store the representation of the genome? (ii) Give a binary representation of the letters AGCT that would make it possible to fit the genome into 750 MB. 2900 MB (or 2.9 GB) any representation with 2 bits/char, like 00 01 10 11 this was meant to remind people of the fact that when there are only 4 distinct things to encode (A G C T), 2 bits is all you need. think of freshman, sophomore, junior, senior. (d) What range of numbers can you represent with the fingers of two hands if you treat each finger as a binary digit but don't use your thumbs? What is the range if you can use your thumbs as well? 0 .. 255 0 .. 1023 these have to be exact. (e) Draw a picture of a pair of hands displaying the number 132. Assume that one uses fingers and thumbs. | | 00|00 00|00 no penalty for lack of artistic talent (and no credit for even amazing talent). Connelly's summary... Most mistakes were due to math errors, so simply double-checking math or comparing with a friend (if that's allowed) would improve everyone's grade! Common mistakes were: 1(a) Answers that were orders of magnitude too large or small, due to not carefully converting units. 1(c) Same problem 2(a) Most got the hex/binary conversions right, but as a sanity check people should've made sure the binary/hex numbers were also increasing. 2(b) Lots of confusion here. Incorrect answers were based on the number of students in the class, off-by-one errors (10 bits, 12 bits instead of 11 bits, 4096 as the year instead of 2048, or 2049 instead of 2048), and lots of people used fractional bytes. 2(c) This was probably the worst. Lots of people didn't seem to get that 1 byte = 1 letter of ASCII (unless using 7-bit encoding). Instead I'd get weird answers like "1 ASCII letter = 0.5 bytes, therefore..." or long mathematical conversions between ASCII, bits, bytes, etc that would come out with the wrong answer. Also some people didn't seem to get that the "binary representation of the letters AGCT" was something THEY get to make up (e.g. A=00, G=01, C=10, T=11), and instead they'd copy down the ASCII binary representation of the string "AGCT" (A=01000001, etc) and then get confused. A number of people also did A=01, G=10, C=11, T=100, which doesn't fit in 750 MB. 2(e) Most got the conversion to binary right, but then some of these people had confusion when drawing the fingers. I think people didn't realize they can left pad a number with zeros without changing its value...