### COS 109: Problem Set 1

Tue Sep 5 17:30:41 EDT 2023

Due midnight, Wednesday Sept 13

 Collaboration policy for COS 109: Working together to really understand the material in problem sets and labs is encouraged, but once you have things figured out, you must part company and compose your written answers independently. That helps you to be sure that you understand the material, and it obviates questions of whether the collaboration was too close. You must list any other class members with whom you collaborated.

Problem set answers need not be long, merely clear enough that we can understand what you have done, though for computational problems, show enough of your work that we can see where your answer came from. There is no need to repeat the question; use this Google Doc for your answers. Thanks.

Do not use Google or other search engines for these questions. Your task is to work from what you know and can reason about, not what you can look up. Similarly, don't use ChatGPT or similar generative AI programs. You won't learn nearly as much as if you do the work yourself, you won't know whether the AI answers are right or merely hallucinations, and of course you won't be able to use Google or AI on exams.

Estimation is a useful quantitative reasoning skill, and most problem sets will have some estimation questions. Webster's defines estimate as "to judge or determine generally but carefully; calculate approximately." Note the words generally and approximately. There isn't a correct answer; your task is to come up with a sensible ballpark or "back of the envelope" value. If the data you start with is approximate, the results cannot be precise. Just because a calculator displays 8 figures does not mean they are all significant; do not blindly copy down all the digits. In fact, it's good practice to do calculations like these by hand.

In the following, adverbs like approximately, roughly and about are meant to remind you that you are doing calculations on approximate values. You really should not have any computed value with more than two or maybe three digits of significance, since your computations are based on informed guesses, not precise measurements.
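If you do check your arithmetic with a calculator or a short program, trim the result before writing it down. The following is a minimal sketch of one way to do that in Python; the helper name `round_sig` and the sample number are made up for illustration and are not part of any question here.

```python
import math

def round_sig(value: float, digits: int = 2) -> float:
    """Round value to the given number of significant digits."""
    if value == 0:
        return 0.0
    # Position of the leading digit: 3 for 3703.7, -2 for 0.012.
    exponent = math.floor(math.log10(abs(value)))
    # round() accepts a negative digit count to round left of the decimal point.
    return round(value, digits - 1 - exponent)

# A made-up calculator display, not an answer to any problem.
raw = 3703.7034
print(round_sig(raw))       # 3700.0
print(round_sig(raw, 3))    # 3700.0 (three significant digits: 3, 7, 0)
```

Both calls print the same value here because the third significant digit happens to be 0; the point is only that the trailing ".7034" carries no real information when the inputs were guesses.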

#### 1. In and Out

(a) About how many times per day (total for all undergrads) are prox cards used to enter dorm buildings and individual rooms? This value might well be different on different days of the week, so come up with a suitable representative value.

(b) Roughly how many bytes of disk space might it take to store a record of a typical single transaction: a particular student opens a specific entry or room at a specific time on a specific date? Suggestion: Think of the number of letters, digits and other characters you would have to write down if you were doing this by hand; each of those characters takes at most one byte of storage. Include an example of one such transaction, just to make it concrete.

(c) If you had the data for a bunch of transactions, say an entire academic year, you might be able to compress the data so it takes significantly less space for the whole thing than naively storing information in the form you used in part (b). Briefly describe one way that you might be able to do some compression. We're not looking for anything sophisticated here, just an idea that might reduce the total amount of space needed. It may help to think about what's repetitive and thus redundant and thus only needs to be stored once for a large group of related transactions.

(d) "The records of all student entries to all dorms and rooms since Princeton was founded in 1746 would just about fit on the disk in a current laptop computer." Is this statement likely to be accurate (say, within a factor of 5 or 10), way too conservative (much more data would fit, say every full-time college student in the USA), or way too optimistic (it really would take a lot more disk space just for Princeton)? Be as quantitative as necessary to support your position, but not excessively so -- this is a question about ballpark figures. You can make any reasonable assumptions about compression that seem warranted from your answer to part (c). The "right" answer here may well depend on your assumptions, so be sure to state them clearly.

(e) Suppose a surveillance camera takes a picture of you each time you use your prox to enter a dorm or your room. How many gigabytes (very roughly) would be required to store these pictures of you over the course of one academic year? (To estimate the size of a typical image, you could take a selfie on your phone.)

#### 2. On the Street Where You Live

Here are some estimation questions in a different domain, inspired by a random conversation with friends at Google. Make your best estimates, explaining your assumptions briefly, but don't anguish over it -- again, there are no "right" answers. For parts (b) through (d), state clearly what values you are assuming from earlier answers, so we can give you credit for sound reasoning even if your starting numbers seem too far off. (If you're more comfortable in metric, use meters and km instead.)

(a) Google's goal for Street View is to have pictures taken along every street in the world. Thinking parochially for the moment, estimate how many miles of streets there are in the USA. Hint: imagine the country covered by a uniform grid with a road every mile in each direction. That would be too coarse in cities and suburbia but not bad in unpopulated areas, of which there are many. Compute with that, then refine it if necessary.

How big is the USA? You'll have to estimate that too if you don't already know. One way: imagine it's a rectangle that's W miles wide and H miles high; you can look at a map to get rough values for W and H. With apologies to their respective residents, ignore Alaska (no roads) and Hawaii (too small to matter).

(b) How much money would Google spend on gas for all this driving?

(c) Suppose that on average, Google takes one picture every yard (or meter) using ordinary cellphone cameras and stores JPEG images. How many terabytes of disk would it take to store Street View images for the whole USA?

(d) How much would it cost to buy ordinary external backup disks to store this data? You can check sites like Amazon or Best Buy to see how much disks cost if you haven't a clue. [You might find this cautionary tale interesting.]

Many of our electronic gadgets have associated capacity or speed numbers that we discuss in class. Answer the following for your own devices if possible, but if not, use a friend's. Be sure to get the units right.

(a) What is the capacity of the disk in your computer, in gigabytes? Your computer will tell you if you ask it; part of the exercise is figuring out how.

(b) How many gigabytes of RAM does your computer have?

(c) How many pixels does the screen of your computer have horizontally and vertically? How many total pixels is that?

(d) Speedtest.net measures download and upload speeds between your computer and the Internet. What speeds does it report for you, and where were you when you made the measurement? [Note the units: it's megabits per second. If you need to convert, one byte is 8 bits.] How reproducible are the values?

If you prefer, you can download Speedtest or similar programs as a phone app. Tell us which you did, and (even better) what the comparison is between phone and desktop speeds.
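The bits-versus-bytes conversion mentioned in part (d) can be sketched as below. This is only an illustration: the function name and the 200 megabits per second figure are hypothetical, not a measurement or an expected answer.

```python
BITS_PER_BYTE = 8  # one byte is 8 bits

def mbps_to_megabytes_per_sec(megabits_per_sec: float) -> float:
    """Convert a speed in megabits per second to megabytes per second."""
    return megabits_per_sec / BITS_PER_BYTE

# Hypothetical reading; substitute whatever your own test reports.
example_download_mbps = 200.0
print(mbps_to_megabytes_per_sec(example_download_mbps))  # 25.0
```

Dividing by 8 is the whole conversion; the usual mistake is to multiply instead, which overstates the speed by a factor of 64.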

#### Submission

Please use this Google Doc as a template for your answers. It would be a great help if you type your answers so we can read them, but if you prefer to write by hand, that's OK as long as you are very neat. (That is, not like me!)

Either way, convert your answers to PDF and submit them by uploading a file called pset1.pdf to the "Pset1" assignment in Gradescope. When submitting on Gradescope, you will be asked to label each question with the page of the PDF your answer appears on (feel free to reach out if this causes any issues). You can submit as many times as you like; we'll only look at the last one.