Due midnight, Thursday Sep 10 Eastern time
Collaboration policy for COS 109:
Working together to really understand the material in problem sets and
labs is encouraged, but once you have things figured out, you must part
company and compose your written answers independently. That helps
you to be sure that you understand the material, and it obviates questions
of whether the collaboration was too close.
You must list any other class members with whom you collaborated.
Problem set answers need not be long, merely clear enough that we can understand what you have done, though for computational problems, show enough of your work that we can see where your answer came from. There is no need to repeat the question. Thanks.
Do not use Google or other search engines for these questions. Your task is to work from what you know and can reason about, not what you can look up.
Note the adverbs like approximately, roughly and about, which are meant to remind you that you are doing calculations on approximate values. You really should not have any computed value with more than two or maybe three digits of significance, since your computations are based on informed guesses, not precise measurements.
Princeton stores a lot of information about students, including academic records. This question is about making a ballpark estimate about how much data that might be. There is no "right" answer, so think about it, state your assumptions clearly, then do the simple arithmetic to come up with defensible answers.
(a) Roughly how many bytes would it take to store your academic record for one representative semester? That would include your name, your year (e.g., 2021), the 4 or 5 courses you took, and your grade in each. Suggestion: Think of the number of letters, digits and other characters you would have to write down if you were doing this by hand; each of those characters takes at most one byte of storage. Just to make it concrete, write out that entry for this fall. (You can guess about your grades.)
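The one-byte-per-character rule in part (a) can be checked mechanically. The following is a minimal sketch, not a required format: the helper name `record_size_bytes`, the sample student and the comma-separated layout are all made up for illustration, and it assumes the record uses only plain ASCII characters.

```python
def record_size_bytes(record: str) -> int:
    """Return the number of bytes needed to store `record` as plain ASCII text."""
    # Each ASCII letter, digit, space or punctuation mark occupies one byte.
    return len(record.encode("ascii"))


# Made-up entry (an assumption): name, class year, then course:grade pairs.
sample_record = "Jane Doe,2021,COS109:A,ENG101:B+,MAT103:A-,HIS201:B"
print(record_size_bytes(sample_record), "bytes")
```

Your own entry will differ in length; the point is only that counting characters and counting bytes give the same number here.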
(b) About how many bytes would it take to store this information for all current undergrads?
(c) If you had the data for all students for an entire academic year, you might be able to compress the data so it takes significantly less space for the whole thing than naively storing information in the form you used in part (a). Briefly describe one way that you might be able to do some compression. We're not looking for anything sophisticated here, just an idea that might reduce the total amount of space needed. It may help to think about what's repetitive and thus redundant and thus only needs to be stored once for a large group of related items.
(d) "The academic records of all students since Princeton was founded in 1746 would just about fit on the disk in a current laptop computer." Is this statement likely to be accurate (say, within a factor of 5 or 10), way too conservative (much more data would fit, say every full-time college student in the USA), or way too optimistic (it really would take a lot more disk space just for Princeton)? Be as quantitative as necessary to support your position, but not excessively so -- this is a question about ballpark figures. You can make any reasonable assumptions about squeezing that seem warranted from your answer to part (c). The "right" answer here may well depend on your assumptions, so state them clearly.
Here are some estimation questions in a different domain, inspired by a random conversation with friends at Google. Make your best estimates, explaining your assumptions briefly, but don't anguish over it -- again, there are no "right" answers. For parts (b) through (d), state clearly what values you are assuming from earlier answers, so we can give you credit for sound reasoning even if your starting numbers seem too far off. (If you're more comfortable in metric, use meters and km instead.)
(a) Google's goal for Street View is to have pictures taken along every street in the world. Thinking parochially for the moment, estimate how many miles of streets there are in the USA. Hint: imagine the country covered by a uniform grid with a road every mile in each direction. That would be too coarse in cities and suburbia but not bad in unpopulated areas. Compute with that, then refine it if necessary.
How big is the USA? You'll have to estimate that too if you don't already know. One way: imagine it's a rectangle that's W miles wide and H miles high; you can look at a map to get rough values for W and H. With apologies to their respective residents, ignore Alaska (no roads) and Hawaii (too small to matter).
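The grid hint in part (a) amounts to one line of arithmetic: a W-by-H rectangle with a road every mile in each direction has about H east-west roads of length W plus W north-south roads of length H. Here is a rough sketch of that calculation; the function name `grid_road_miles` is hypothetical, and the numbers in the example call are toy values, not estimates for the USA.

```python
def grid_road_miles(width_miles, height_miles, spacing_miles=1.0):
    """Total road length for a uniform grid laid over a rectangle."""
    # East-west roads: one every `spacing_miles` going north, each `width_miles` long.
    east_west = (height_miles / spacing_miles) * width_miles
    # North-south roads: one every `spacing_miles` going east, each `height_miles` long.
    north_south = (width_miles / spacing_miles) * height_miles
    return east_west + north_south


# Toy rectangle (an assumption for illustration only): 10 miles by 5 miles.
print(grid_road_miles(10, 5))
```

Shrinking `spacing_miles` is one simple way to refine the estimate for denser areas.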
(b) How much money would Google spend on gas for all this driving?
(c) Suppose that on average, Google takes one picture every foot (0.3 m) using ordinary cellphone cameras and stores JPEG images. How many terabytes would it take to store Street View images for the whole USA?
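The storage estimate in part (c) is a chain of multiplications: miles to feet, feet to pictures, pictures to bytes, bytes to terabytes. A sketch of that chain follows, assuming 5280 feet per mile and 10^12 bytes per terabyte. The function name `street_view_terabytes` is hypothetical, and the size of one JPEG is a value you must estimate yourself; the numbers in the example call are round placeholders, not answers.

```python
FEET_PER_MILE = 5280
BYTES_PER_TERABYTE = 1e12  # decimal terabyte


def street_view_terabytes(road_miles, bytes_per_image, feet_per_picture=1.0):
    """Terabytes needed to store one picture every `feet_per_picture` feet."""
    pictures = road_miles * FEET_PER_MILE / feet_per_picture
    total_bytes = pictures * bytes_per_image
    return total_bytes / BYTES_PER_TERABYTE


# Placeholder inputs (assumptions): 1000 miles of road, 1 MB per image.
print(street_view_terabytes(1000, 1_000_000))
```

Whatever values you plug in, say where they came from, including the road mileage carried over from part (a).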
(d) How much would it cost to buy ordinary external backup disks to store this data? You can check sites like Amazon or Best Buy to see how much disks cost if you haven't a clue.
(a) What is the capacity of the disk in your computer, in gigabytes? Your computer will tell you if you ask it; part of the exercise is figuring out how.
(b) How many gigabytes of RAM does your computer have?
(c) How many pixels does the screen of your computer have horizontally and vertically? How many total pixels is that?
(d) Assuming that the pixels on the screen of your cell phone are the same size as the pixels on your laptop, approximately how many pixels are there on your cell phone screen? (Hint: how many pixels are there per square inch or cm on your laptop screen?) If you can easily find out exactly how many pixels there really are, include that information too; it would be interesting to see whether the assumption of same-size pixels is valid.
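The hint in part (d) can be turned into a two-step calculation: find the laptop's pixels per square inch, then multiply by the area of the phone screen. The sketch below assumes you have measured both screens with a ruler; the function names and the dimensions in the example call are made up for illustration.

```python
def pixels_per_square_inch(h_pixels, v_pixels, width_in, height_in):
    """Pixel density of a screen, given its resolution and physical size."""
    return (h_pixels * v_pixels) / (width_in * height_in)


def estimate_phone_pixels(laptop_density, phone_width_in, phone_height_in):
    """Phone pixel count, assuming its pixels are the same size as the laptop's."""
    return laptop_density * phone_width_in * phone_height_in


# Made-up measurements (assumptions): a 1000 x 500 pixel laptop screen that is
# 10 x 5 inches, and a phone screen that is 2 x 4 inches.
density = pixels_per_square_inch(1000, 500, 10, 5)
print(estimate_phone_pixels(density, 2, 4))
```

Comparing this estimate with the phone's actual resolution shows how far off the same-size-pixel assumption is.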
Convert your answers to PDF and submit them by uploading a file called pset1.pdf to https://tigerfile.cs.princeton.edu/COS109_F2020/Pset1. You can submit as many times as you like; we'll only look at the last one.