Lab 8: Privacy and Security

Sun Dec 6 11:27:32 EST 2009

This is the last lab. Have fun.

Privacy is much in the news of late, with concerns ranging from identity theft through government surveillance to commercial exploitation of information about our purchases, our interests, our activities, our friends, and everything else. This lab will explore some issues of privacy and access to information.

This is a comparatively open-ended lab, so you may well find ambiguities and fuzzy bits. Don't worry about them, since this is meant to be for exploration rather than precise answers. But feel free to make suggestions so we can improve the lab for next time.

This lab is intended to be more than a Google and Wikipedia exercise; you should cast your net more widely, by using other search engines and other information sources. You will be graded partly on how well you do this, so for each item tell us what tools you used and how well they worked. Among the search engines you might try are Yahoo, Microsoft, Ask, Clusty, Mooter and Cuil. There are also sites that do telephone number lookup or that maintain public records; financial sites like Yahoo Finance and Google Finance provide access to holdings and insider trading; and of course social networks like Facebook and Twitter can reveal a huge amount about their users. Explore; that's part of the exercise. You might find it more productive to spread the lab over a couple of days so you have time to think about possibilities.

As you go along, we want you to collect your observations and comments in a Word document. You must use this template, lab8.doc, so we have some uniformity among the submissions. Please download this file now and begin to edit it. In the following, when we ask you to "report", we're looking for a reasonably organized but not too long description. The questions in the text are meant to start you thinking, not necessarily be answered literally. We're not going to grade your writing, but you'll leave a better impression if there aren't too many spelling mistakes, flagrant grammar errors, random formatting, and so on. It's ok to summarize with lists rather than complete sentences, but do try to distill the essence of what you've seen rather than just cutting and pasting.

If you're using Office 2007, you must save and submit your lab8.doc file in Office 2003 format; we can't read the new .docx format. Please do this right the first time. Thanks.

You can do this lab anywhere. Some threats only affect PCs running Windows, but all users have to be suspicious about most things all the time.


Part 1: Personal Information

For this section, you should use at least three search engines, to cast a wider net than Google alone does.

Sometimes people state strong opinions forcefully, and the record lives on forever.

  • Who said "You have zero privacy. Get over it." and in what setting?
  • What is Bill Gates's holding of Microsoft stock and how much is that worth?
  • Where is Bill Gates's house?

How much can you learn about someone just by searching online information? For yourself or members of your family or someone else close to you, see how much you can discover about that person online. Examples of the kind of information you might look for include home address, telephone number, age, birthday, education, employment, political contributions, organizations and memberships, price of their home, names of other family members (the classic mother's maiden name, for example), activities and interests. Can you find a picture? Was it one that you knew about?

Do you get any information by searching for a phone number or street address or social security number? (Read this Wikipedia article first about why it is a bad idea to search for your own SSN.) Does a phone number or address reveal a family name? Did you find inconsistent information?

Can you find a good picture of your home (or someone else's) with Google Maps or Earth, or Microsoft Maps? Which one of these gives the best image? Can you make out your car or some other possession? How much might the house be worth? See, for example, Zillow. If you visit Zillow, what kinds of addresses does it show you without being asked?

What does your Facebook page reveal about you that you find surprising or worth thinking about?

There's no need to go overboard on this; the goal is definitely not to invade anyone's privacy but to get a sense of the accessibility of ostensibly private information.

  • Report on the nature of the information that you were able to find. Don't include actual phone numbers or street addresses in your report; other information like city of residence, political contributions, memberships, and so on is fair game.
  • What parts of the information that you found are ostensibly restricted, for example because they are available only to Princeton students or to your Facebook friends? Can you find any evidence that the restricted information is actually more widely available?


Part 2: What Else Do They Know About You

As we saw in class, the mere act of visiting a web site reveals some information about you. There are a variety of sites that report back to you about what information your visit reveals, or about what vulnerabilities your system appears to have. Visit some of these and see what they tell you. Here are some useful ones; can you find others like them?

The pages at Gibson Research are pretty technical but worth some study. And apropos of the geolocation example, this site might tell you how the information could be used.

To determine your IP address(es) and other identification info on Windows XP, choose Start / Run and enter cmd; in the resulting command-line window, type ipconfig /all. On a Mac, look under System Preferences / Show Airport / TCP/IP and related tabs. Are the values consistent with what is seen by the outside?
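One common reason the address shown on your own machine can differ from the one a web site reports is that many networks hand out private addresses and translate them to a shared public address on the way out. The following is a minimal Python sketch, not part of the lab itself, that uses the standard ipaddress module to check which kind of address you are looking at; the function name and the sample addresses are only illustrative.

```python
import ipaddress

def describe_address(text):
    """Classify an IP address string as loopback, private, or public."""
    addr = ipaddress.ip_address(text)
    if addr.is_loopback:
        return "loopback"   # e.g. 127.0.0.1, never seen by outside sites
    if addr.is_private:
        return "private"    # e.g. 10.x.x.x or 192.168.x.x, usually translated
    return "public"         # simplification: everything else is treated as public

if __name__ == "__main__":
    # Sample addresses only; substitute the values you actually observe.
    for sample in ["127.0.0.1", "192.168.1.10", "10.0.0.5", "128.112.1.1"]:
        print(sample, describe_address(sample))
```

If the address your machine reports is "private", the address that web sites display for you will almost certainly be a different one.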

Make a search for some service or store with several search engines and see how accurately they geo-locate you. Are there major differences in apparent accuracy among Google, Yahoo and Microsoft Bing?

Send mail to yourself at Gmail or similar site. Include in the mail some words or phrases that might trigger a particular kind of advertisement. What advertisements do you see when you read the mail? Do any of the advertisements appear to know your geographical location?

  • What potentially significant information about you and your computer does your browsing reveal to sites that you visit?
  • What IP address was displayed by the web pages? Is this the same IP address as you see by running ipconfig on Windows or System Preferences on a Mac?
  • What potential vulnerabilities are reported about your system?
  • Did you find any other sites that provide similar or analogous services?
  • What did you learn about geographical location data?


Part 3: Cookie Crumbs

We've talked about how cookies can be used to track what web sites you visit, especially "third-party" cookies (that is, cookies that come from someone other than the web site you accessed directly) that aggregate and correlate information about your visits to apparently unrelated sites.
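To make the "third-party" idea concrete, here is a rough Python sketch that reads the Domain attribute from a Set-Cookie header and compares it with the host of the page you visited. The host names and cookie values are made up for illustration, and the simple suffix comparison is an assumption: real browsers use more careful rules for deciding what counts as the same site.

```python
from http.cookies import SimpleCookie

def cookie_scope(set_cookie_header, responding_host):
    """Map each cookie name to the domain it belongs to.

    If a cookie has no Domain attribute, it belongs to the host that sent it.
    """
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    scope = {}
    for name, morsel in jar.items():
        domain = morsel["domain"].lstrip(".").lower()
        scope[name] = domain or responding_host.lower()
    return scope

def is_third_party(page_host, cookie_domain):
    """True if the cookie's domain does not cover the page you visited."""
    page_host = page_host.lower()
    cookie_domain = cookie_domain.lower()
    same_site = (page_host == cookie_domain
                 or page_host.endswith("." + cookie_domain))
    return not same_site

if __name__ == "__main__":
    # Hypothetical header, as if sent by ads.tracker.example while
    # you were reading a page on www.news.example.
    header = "uid=abc123; Domain=.tracker.example; Path=/"
    for name, domain in cookie_scope(header, "ads.tracker.example").items():
        print(name, domain, is_third_party("www.news.example", domain))
```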

First, how many cookies do you currently have? Report at least the rough count, and whether this is before or after you removed cookies after the lecture about them.

Now remove all cookies. Set your browser preferences to allow all cookies, then visit at least half a dozen major sites (media, sports and e-commerce sites are good for this, as are search engines and even universities). How many cookies does a typical visit deposit? Track the cookies that are tracking you: can you see any evidence of linkage, e.g., updated third-party cookies or URLs after visiting independent sites? You might find it easiest to use a browser setting that asks about each cookie individually.

What sites that you visit regularly deposit third-party cookies? Do any contain interesting information instead of just long strings of apparently random letters and numbers? What's the latest cookie expiration date you can find?
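If you copy down the expiration dates of a number of cookies, a few lines of Python can pick out the latest one. This is only a sketch: it assumes the dates are written in the usual "Wed, 09 Jun 2021 10:18:14 GMT" style, and the two sample values are invented rather than taken from real cookies.

```python
from email.utils import parsedate_to_datetime

def latest_expiry(expires_values):
    """Return the latest date among cookie Expires strings."""
    dates = [parsedate_to_datetime(value) for value in expires_values]
    return max(dates)

if __name__ == "__main__":
    # Invented sample values; replace with dates from your own cookies.
    samples = [
        "Wed, 09 Jun 2021 10:18:14 GMT",
        "Fri, 01 Jan 2038 00:00:00 GMT",
    ]
    print(latest_expiry(samples).isoformat())
```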

Repeat some of the experiment with a different browser if you can do so without too much work. If you remove the cookies from the first browser, what happens with the second browser?

You might find this paper and this paper interesting reading; they are technical in places but not much beyond what we've done in the class.

  • Report on what you found out about cookies. In particular, what evidence did you find of third-party tracking? Include a sample third-party cookie. It's ok to abbreviate long ones.

"Web bugs" are another way to track when someone visits a web site or accesses information using a program that interprets HTML; a web bug is typically an almost invisible 1x1 pixel image that includes a URL, like this one encountered at cnn.com:

     <img src="http://cnnglobal.122.2O7.net/b/ss/cnnglobal/1/H.1--NS/0"
          height="1" width="1" border="0">

When the image is retrieved, the server at 2o7.net (one of the largest aggregators) knows that you have visited the page that contained the img tag. (The Adblock extension in Firefox gets rid of a lot of third-party images both large and small.)

Find a web page that includes a web bug from a third party. You can often find candidate links in a web page by searching for things like "height=1" in various forms, and in the Security panels in Firefox. Can you find a web bug in an email message?
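Searching by hand works, but the same idea can be sketched in a few lines of Python using the standard html.parser module. The class and function names here are illustrative, and the rule it applies (an img tag whose width and height are both 0 or 1) is just one plausible heuristic; real web bugs may be hidden in other ways.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

class WebBugFinder(HTMLParser):
    """Collect the src of every img tag that is at most 1x1 pixel."""

    def __init__(self):
        super().__init__()
        self.bugs = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        tiny = ("0", "1")
        if attributes.get("width") in tiny and attributes.get("height") in tiny:
            self.bugs.append(attributes.get("src"))

def find_web_bugs(html_text):
    """Return the src URLs of candidate web bugs in a page's HTML."""
    finder = WebBugFinder()
    finder.feed(html_text)
    return finder.bugs

if __name__ == "__main__":
    # The sample tag is the cnn.com example quoted above.
    page = '''<p>Some story text</p>
    <img src="http://cnnglobal.122.2O7.net/b/ss/cnnglobal/1/H.1--NS/0"
         height="1" width="1" border="0">
    <img src="/photos/big.jpg" height="300" width="400">'''
    for src in find_web_bugs(page):
        print(urlparse(src).hostname, src)
```

Comparing the host name that is printed with the host of the page itself tells you whether the image comes from a third party.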

  • Report on what you found out about web bugs. Include a sample web bug URL and describe what it is.

Flash deposits cookie-like data on your machine as well. It is easy to get rid of by visiting this site. Before removing anything, check what Flash cookies you have. How many sites are represented and how much total space is used?

  • Report on what you found out about Flash cookies. Include the count, size, and a handful of site names. Are there sites that you do not think you ever visited directly?


Part 4: Defenses and Countermeasures

As we discussed in class, there are ways to limit your risks and the amount of information that you reveal. Virus checkers are the most important, but there are plenty of others as well.

Many web sites insist that you provide a working email address before they will let you register or access some service. 10MinuteMail provides a useful alternative: it provides an email address that's valid for 10 minutes and shows you whatever mail arrives during that time; that lets you retrieve the registration key or whatever, without giving away a real address. Try this service. How long does it take for mail to arrive? Empirically, how long does it take for the temporary address to time out?

Check your own environment. What browser do you routinely use? What are your default settings for cookies, filename extensions, Javascript, Java, popups, automatic updates, downloading, software installation, programs that start automatically, etc.? Does your mail reader provide a previewer that interprets HTML and thus is subject to web bugs? Do you read mail in HTML format by default? Try sending yourself mail with a reference to an image in your public_html directory, i.e., http://www.princeton.edu/~your_netid/your_file, to see whether the image is retrieved and displayed.
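Most mail programs will let you compose such a message directly, but if you are curious what it looks like underneath, here is a small Python sketch that builds one with the standard email package. It only constructs and prints the message; it does not send it. The address and subject are placeholders, and the image URL is the pattern given above, so substitute your own values.

```python
from email.message import EmailMessage

def build_test_message(address, image_url):
    """Build a message to yourself whose HTML part references a remote image."""
    msg = EmailMessage()
    msg["From"] = address
    msg["To"] = address
    msg["Subject"] = "HTML image test"
    # Plain-text part: a reader that ignores HTML will show only this.
    msg.set_content("Plain-text part: no image is referenced here.")
    # HTML part: a reader that interprets HTML may fetch the image.
    html = (
        "<html><body>\n"
        "<p>Image test</p>\n"
        f'<img src="{image_url}" alt="test image">\n'
        "</body></html>\n"
    )
    msg.add_alternative(html, subtype="html")
    return msg

if __name__ == "__main__":
    # Placeholder values; replace with your own address and file.
    message = build_test_message(
        "your_netid@princeton.edu",
        "http://www.princeton.edu/~your_netid/your_file",
    )
    print(message.as_string())
```

If the image appears when you open the message, your mail reader fetched it from the server, which is exactly the mechanism a web bug relies on.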

For Firefox, what plug-ins and add-ons are already installed? Among those you might consider adding are Adblock, NoScript, FlashBlock, and CSLite; each reduces your exposure to various kinds of tracking and potentially harmful content.

As we saw in class, Word, Excel and other programs include a Visual Basic interpreter that can be used to silently run programs that are included in documents. What level of macro protection are you running in Word and Excel? (Look under Tools / Macros.) If you run Internet Explorer, what security level is being applied to ActiveX controls?

Perhaps surprisingly, PDF files can contain Javascript code, which will be interpreted by Adobe Acrobat Reader. Does your version of Acrobat enable Javascript? If so, you can turn it off without much consequence.

What do you reveal on Facebook? Bear in mind that most of this information is readily available outside your own list of friends and networks.

  • What was the result of your experiments with 10MinuteMail? What email address did it assign to you? If you use it a second time a while later, what is the second address?
  • What operating system are you running? What browser do you normally use? What mail client do you normally use?
  • Report on how you have your defenses configured. Did you tighten up any defense as a result of your experiments during this lab?


Part 5: Submitting your Work

Finally, if you saw anything interesting or suspicious that we didn't ask about specifically, or if you have any thoughts on how to improve this lab, we'd like to hear them. Thanks.

When you're all done, don't put this lab in your public_html directory. Instead:

Reminder: If you're using Office 2007, you must save and submit your lab8.doc file in Office 2003 format; we can't read .docx files. Please do this right the first time. Thanks.