This is the last lab. Have fun.
Privacy is much in the news of late, with concerns ranging from identity theft through government surveillance to commercial exploitation of information about our purchases, our interests, our activities, our friends, and everything else. This lab will explore some issues of privacy and access to information.
This is a comparatively open-ended lab, so you may well find ambiguities and fuzzy bits. Don't worry about them, since this is meant to be for exploration rather than precise answers. But feel free to make suggestions so we can improve the lab for next time.
This lab is intended to be more than a Google and Wikipedia exercise; you must cast your net more widely, by using other search engines and other information sources. You will be graded partly on how well you do this, so tell us for each thing what tools you used and how well they worked. Among the alternative search engines you might try are (in alphabetical order): Altavista, Bing, Clusty, Dogpile, DuckDuckGo, Yahoo, Yandex, Yippy.
Which of these appear to be merely clones of each other, rather than being independent?
There are also sites that do telephone number lookup or that maintain public records; financial sites like Yahoo Finance and Google Finance provide access to holdings and insider trading; and of course social networks like Facebook, Google+ and LinkedIn reveal a lot about their users. Explore; that's part of the exercise. You might find it more productive to spread the lab over a couple of days so you have time to think about possibilities.
As you go along, we want you to collect your observations and comments in a Word document. You must use this template, lab8.docx, so we have some uniformity among the submissions. Please download this file now and begin to edit it. In the following, when we ask you to "report", we're looking for a reasonably organized but not too long description. The questions in the text are meant to start you thinking, but need not be answered literally.
We're not going to grade your writing, but you'll leave a better impression if there aren't too many spelling mistakes, flagrant grammar errors, random formatting, and so on. It's ok to summarize with lists rather than complete sentences, but try to distill the essence of what you've seen rather than just cutting and pasting.
You can do this lab anywhere. Some threats primarily affect PCs running Windows, but all users have to be suspicious about most things all the time no matter what system they are running. And of course the risks are similar for cellphones, which we're not discussing here. Keep that in mind.
For this section, you should use at least three sources, not just Google.
How much can you learn about someone just by searching online information? For yourself or members of your family or someone else close to you, see how much you can discover about that person online. Examples of the kind of information you might look for include home address, telephone number, age, birthday, education, employment, political contributions, sports and hobbies, organizations and memberships, price of their home, names of other family members (like mother's maiden name, for example), activities and interests. Can you find a picture? Was it one that you knew about?
It might be possible to get information by searching for a phone number or street address or social security number. (Read this Wikipedia article first about why it is a bad idea to search for your own SSN.) Do phone numbers or addresses reveal family names? Is information always consistent?
Can you find a good picture of your home (or a friend's) with Google Maps or Earth, or Microsoft Maps? Which one of these gives the best image? Can you make out your car or some other possession? How much might the house be worth? See, for example, Zillow. If you visit Zillow, what kinds of addresses does it show you without being asked? How does Zillow compare to Trulia: which one appears to reveal more information, or are they about the same?
What does your Facebook page reveal about you that you find surprising or worth thinking about?
There's no need to go overboard on this; the goal is definitely not to invade anyone's privacy, but to get a sense of the accessibility of ostensibly private information.
As we saw in class, the mere act of visiting a web site reveals some information about you. There are a variety of sites that report back to you about what information your visit reveals, or about what vulnerabilities your system appears to have. Visit some of these and see what they tell you. Here are some useful ones; you might find it instructive to try to find others like them.
To determine your IP address(es) and other identification info on Windows, open a command-line window with Start / Run / cmd, then type ipconfig /all. On a Mac, use System Preferences / Network / Ethernet / Advanced... / TCP/IP.
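If you would rather ask the operating system directly, the following is a minimal Python sketch, not part of the lab's required steps. The function name local_ip_address and the probe address 192.0.2.1 (a reserved documentation address) are assumptions chosen for illustration. The sketch reports the address your machine uses on its local network, which is often a private address and may differ from the public address that web sites see.

```python
import socket


def local_ip_address(probe_host="192.0.2.1", probe_port=80):
    """Return the local IPv4 address the OS would use to reach probe_host.

    probe_host is a hypothetical target taken from a reserved documentation
    range; nothing is actually sent to it.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only picks a route
        # and therefore a local interface address.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable network route: fall back to the loopback address.
        return "127.0.0.1"
    finally:
        sock.close()


if __name__ == "__main__":
    print("Local address:", local_ip_address())
```

Compare the result with what ipconfig or System Preferences shows, and with the address that the reporting web sites display.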
Search for some service or store with several search engines and see how accurately they geo-locate you. (You might turn on "private browsing" or "incognito mode" in your browser first.) Look for significant differences in apparent accuracy among Google, Bing and other search engines.
Send mail to yourself at Gmail or similar site. Include in the mail some words or phrases that might trigger a particular kind of advertisement. Examine the advertisements that you see when you read the mail and look for ones that appear to know your geographical location.
The specific combination of which browser you use, what fonts you have available, and half a dozen other tidbits can identify you uniquely, or almost so, to a surprising degree. Visit Panopticlick for an eye-opening demonstration. How unique are you? Try it with two different browsers.
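To see why a handful of tidbits can add up to a near-unique identifier, here is a minimal Python sketch. It is an illustration under stated assumptions, not how any particular site works: the attribute names and values are made up, the function names are hypothetical, and the uniqueness figure is assumed to be phrased as "one in N browsers share this fingerprint".

```python
import hashlib
import math


def fingerprint_hash(attributes):
    """Combine browser attributes into one stable identifier.

    attributes is a dict such as {"user_agent": ..., "fonts": ...}; the keys
    are sorted so that the same settings always give the same hash.
    """
    canonical = "\n".join(f"{key}={attributes[key]}" for key in sorted(attributes))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def bits_of_identifying_information(one_in_n):
    """Bits of information carried by a fingerprint shared by 1 in one_in_n browsers."""
    return math.log2(one_in_n)


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    browser = {
        "user_agent": "ExampleBrowser/1.0",
        "fonts": "Arial,Courier",
        "timezone": "UTC-5",
    }
    print(fingerprint_hash(browser))
    print(bits_of_identifying_information(1024))
```

Changing any single attribute, such as the font list, produces a completely different hash, which is why two browsers on the same machine usually look like two different visitors.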
We've talked about how cookies can be used to track what web sites you visit, especially "third-party" cookies (that is, cookies that come from someone other than the web site you accessed directly) that aggregate and correlate information about your visits to apparently unrelated sites.
First, how many cookies do you currently have? Record at least the rough count, and whether this is before or after you removed cookies after the lecture about them. The easiest way to find cookies is to use the browser. In Firefox on a Mac, use Preferences and the Privacy tab. In Chrome, "customize and control" icon / Settings / Use advanced settings / Privacy / Content settings... / All cookies and site data. In Safari, Preferences / Privacy. (These details change continuously.)
Now remove all cookies. Set your browser preferences to allow all cookies, then visit at least half a dozen major sites (media, sports and e-commerce sites are good for this, as are search engines and even universities). Check how many cookies a typical visit deposits. Track the cookies that are tracking you and look for evidence of linkage, e.g., updated third-party cookies or URLs after visiting independent sites. You might find it easiest to use a browser setting that asks about each cookie individually.
For sites that you visit regularly, see whether they deposit third-party cookies. See whether the cookies contain interesting information instead of just long strings of apparently random letters and numbers. Look at the dates when cookies expire.
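If you copy a cookie out of your browser or out of a Set-Cookie response header, a few lines of Python can pull out its domain and expiration date and apply a rough third-party test. This is a sketch under assumptions: the function names and the example domains are hypothetical, and the domain comparison is a simplification (real browsers consult a list of public suffixes to decide what counts as the same site).

```python
from http.cookies import SimpleCookie


def parse_set_cookie(header_value):
    """Parse one Set-Cookie header value into a list of small dicts."""
    jar = SimpleCookie()
    jar.load(header_value)
    cookies = []
    for name, morsel in jar.items():
        cookies.append({
            "name": name,
            "value": morsel.value,
            # A leading dot in the Domain attribute carries no meaning here.
            "domain": morsel["domain"].lstrip(".").lower(),
            "expires": morsel["expires"],
        })
    return cookies


def is_third_party(cookie_domain, site_domain):
    """Rough test: does the cookie's domain differ from the site you visited?"""
    cookie_domain = cookie_domain.lstrip(".").lower()
    site_domain = site_domain.lower()
    if not cookie_domain:
        raise ValueError("cookie domain is required")
    same_site = (
        cookie_domain == site_domain
        or cookie_domain.endswith("." + site_domain)
        or site_domain.endswith("." + cookie_domain)
    )
    return not same_site


if __name__ == "__main__":
    # Made-up header for illustration.
    header = "uid=abc123; Domain=.tracker.example; Expires=Wed, 01 Jan 2031 00:00:00 GMT; Path=/"
    for cookie in parse_set_cookie(header):
        print(cookie, is_third_party(cookie["domain"], "www.news.example"))
```

A value like uid=abc123 is the kind of opaque identifier to watch for: if the same identifier from the same domain shows up after visits to unrelated sites, that is evidence of linkage.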
Repeat some of the experiment with a different browser if you can do so without too much work. Determine what happens in the second browser if you remove the cookies from the first browser.
"Web bugs" are another way to track when someone visits a web site or accesses information using a program that interprets HTML; a web bug is typically an almost invisible 1x1 pixel image that includes a URL, like this one encountered at cnn.com:
<img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0" height="1" width="1" border="0">
When the image is retrieved, the server at 2o7.net, a large aggregator owned by Adobe, knows that you have visited the page that contained the img tag. (The AdBlock extension removes many advertising images both large and small and thus greatly reduces tracking.)
Find a web page that includes a web bug from a third party. You can often find candidate links in a web page by searching for things like "height=1" in various forms, and in the Security panels in Firefox. Look for a web bug in an email message.
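Searching the page source by hand works, but the same idea can be sketched in Python with the standard library's HTML parser. This is a minimal illustration, assuming you have saved the page source as a string; the names WebBugFinder and find_web_bugs are hypothetical, and the test only catches images whose width and height attributes are written as 0 or 1, so bugs sized by CSS or loaded by scripts will be missed.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect tiny images that are served from outside the visited site."""

    def __init__(self, site_domain):
        super().__init__()
        self.site_domain = site_domain.lower()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        tiny = (attributes.get("width") in ("0", "1")
                and attributes.get("height") in ("0", "1"))
        src = attributes.get("src") or ""
        host = (urlparse(src).hostname or "").lower()
        same_site = (host == self.site_domain
                     or host.endswith("." + self.site_domain))
        # A relative src has no host and is therefore first-party.
        if tiny and host and not same_site:
            self.candidates.append(src)


def find_web_bugs(html_text, site_domain):
    """Return the src URLs of likely third-party web bugs in html_text."""
    finder = WebBugFinder(site_domain)
    finder.feed(html_text)
    return finder.candidates


if __name__ == "__main__":
    page = ('<img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0" '
            'height="1" width="1" border="0">')
    print(find_web_bugs(page, "cnn.com"))
```

The same function can be pointed at the HTML source of an email message, using the sender's domain as site_domain.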
Flash deposits cookie-like data on your machine as well. Flash cookies are easy though tedious to get rid of, by visiting this site. Before removing them, check what Flash cookies you have. Determine how many sites are represented and how much total space is used.
As we discussed in class, there are ways to limit your risks and the amount of information that you reveal. Virus checkers are the most important, but there are plenty of others as well.
Many web sites insist that you provide a working email address before they will let you register or access some service. 10MinuteMail provides a useful service: it gives you an email address that's valid for 10 minutes and shows you whatever mail arrives during that time; that lets you retrieve the registration key or whatever, without giving away a real address. Try this service. Determine how long it takes for mail to arrive. Measure how long it takes for the temporary address to time out.
As an alternative, Yopmail lets you invent your own email address, and retains mail for that address for a week. Try Yopmail. Which one do you prefer, and why?
Check what plug-ins and add-ons are already installed in your browser. Among those you might consider adding are AdBlock, NoScript, Ghostery, FlashBlock, and Cookie Monster; each reduces your exposure to various kinds of tracking and potentially harmful content.
Word, Excel and other programs include a Visual Basic interpreter that can be used to silently run programs that are included in documents. (Office 2008 on Macs does not support VB, but Office 2011 on Macs does.) What level of macro protection are you running in Word and Excel? (Look under Tools / Macros, or Preferences / Security on Macs with Office 2011.) If you run Internet Explorer, determine the security level that is being applied to ActiveX controls.
Reconsider your Facebook privacy settings. Bear in mind that most Facebook information is readily available outside your own list of friends and networks.
Finally, if you saw anything interesting or suspicious that we didn't ask about specifically, or if you have any thoughts on how to improve this lab, we'd like to hear them. There are a couple of wrapup questions in lab8.docx that address this.
When you're all done, upload lab8.docx to the CS dropbox for Lab 8: https://dropbox.cs.princeton.edu/COS109_F2015/Lab8.