Lab 8: Privacy and Security

Thu Nov 20 10:57:20 EST 2014

This is the last lab. Have fun.

Privacy is much in the news of late, with concerns ranging from identity theft through government surveillance to commercial exploitation of information about our purchases, our interests, our activities, our friends, and everything else. This lab will explore some issues of privacy and access to information.

This is a comparatively open-ended lab, so you may well find ambiguities and fuzzy bits. Don't worry about them, since this is meant to be for exploration rather than precise answers. But feel free to make suggestions so we can improve the lab for next time.

This lab is intended to be more than a Google and Wikipedia exercise; you must cast your net more widely, by using other search engines and other information sources. You will be graded partly on how well you do this, so tell us for each thing what tools you used and how well they worked. Among the alternative search engines you might try are (in alphabetical order): Altavista, Bing, Clusty, Dogpile, DuckDuckGo, Yahoo, Yandex, Yippy.

Which of these appear to be merely clones of each other, rather than being independent?

There are also sites that do telephone number lookup or that maintain public records; financial sites like Yahoo Finance and Google Finance provide access to holdings and insider trading; and of course social networks like Facebook, Google+ and LinkedIn reveal a lot about their users. Explore; that's part of the exercise. You might find it more productive to spread the lab over a couple of days so you have time to think about possibilities.

As you go along, we want you to collect your observations and comments in a Word document. You must use this template, lab8.doc, so we have some uniformity among the submissions. Please download this file now and begin to edit it. In the following, when we ask you to "report", we're looking for a reasonably organized but not too long description. The questions in the text are meant to start you thinking, but need not be answered literally.

We're not going to grade your writing, but you'll leave a better impression if there aren't too many spelling mistakes, flagrant grammar errors, random formatting, and so on. It's ok to summarize with lists rather than complete sentences, but try to distill the essence of what you've seen rather than just cutting and pasting.

You can do this lab anywhere. Some threats primarily affect PCs running Windows, but all users have to be suspicious about most things all the time no matter what system they are running. And of course the risks are similar for cellphones, which we're not discussing here. Keep that in mind.

 


Part 1: Personal Information

For this section, you should use at least three sources, not just Google.

How much can you learn about someone just by searching online information? For yourself or members of your family or someone else close to you, see how much you can discover about that person online. Examples of the kind of information you might look for include home address, telephone number, age, birthday, education, employment, political contributions, sports and hobbies, organizations and memberships, price of their home, names of other family members (like mother's maiden name, for example), activities and interests. Can you find a picture? Was it one that you knew about?

It might be possible to get information by searching for a phone number or street address or social security number. (Read this Wikipedia article first about why it is a bad idea to search for your own SSN.) Do phone numbers or addresses reveal family names? Is information always consistent?

Can you find a good picture of your home (or a friend's) with Google Maps or Earth, or Microsoft Maps? Which one of these gives the best image? Can you make out your car or some other possession? How much might the house be worth? See, for example, Zillow. If you visit Zillow, what kinds of addresses does it show you without being asked? How does Zillow compare to Trulia: which one appears to reveal more information, or are they about the same?

What does your Facebook page reveal about you that you find surprising or worth thinking about?

There's no need to go overboard on this; the goal is definitely not to invade anyone's privacy, but to get a sense of the accessibility of ostensibly private information.

  • Who said "You have zero privacy. Get over it." and in what setting?
  • Who said "You have control over every single thing that you've shared on Facebook" and in what setting? (You might think about whether it's true.)
  • Which of the search engines listed above are using someone else's results?
  • List the information that you were able to find about individuals and what tools you used to find it. Don't include actual phone numbers or street addresses in your report; other information like city of residence, political contributions, memberships, and so on is fair game.

 


Part 2: What Else Do They Know About You

As we saw in class, the mere act of visiting a web site reveals some information about you. There are a variety of sites that report back to you about what information your visit reveals, or about what vulnerabilities your system appears to have. Visit some of these and see what they tell you. Here are some useful ones; you might find it instructive to try to find others like them.

The pages at Gibson Research are pretty technical but worth some study. And apropos of the geolocation example, this site might tell you how the information could be used.

To determine your IP address(es) and other identifying information on Windows, open a command-line window with Start / Run / cmd and type ipconfig /all. On a Mac, use System Preferences / Network / Ethernet / Advanced... / TCP/IP.
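If you would rather check from a program, the following is a minimal Python sketch that works on Windows, Macs and Linux. The function names local_ip and describe are invented for this illustration, and the sketch reports only the one IPv4 address your machine would use for outbound traffic, not everything that ipconfig /all shows.

```python
import ipaddress
import socket


def local_ip() -> str:
    """Return the IPv4 address this machine would use for outbound traffic."""
    s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # connect() on a UDP socket sends no packets; it only picks a route.
        # 192.0.2.1 is a reserved documentation address, chosen arbitrarily.
        s.connect(("192.0.2.1", 80))
        return s.getsockname()[0]
    except OSError:
        # No usable route (e.g., no network): fall back to loopback.
        return "127.0.0.1"
    finally:
        s.close()


def describe(addr: str) -> str:
    """Classify an address as loopback, private, public, or other."""
    ip = ipaddress.ip_address(addr)
    if ip.is_loopback:
        return "loopback"
    if ip.is_private:
        return "private"
    if ip.is_global:
        return "public"
    return "other special-purpose"


if __name__ == "__main__":
    addr = local_ip()
    print(addr, "is a", describe(addr), "address")
```

Keep the classification in mind when you compare this address with the one that web sites report for you.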

Search for some service or store with several search engines and see how accurately they geo-locate you. (You might turn on "private browsing" or "incognito mode" in your browser first.) Look for significant differences in apparent accuracy among Google, Bing and other search engines.

Send mail to yourself at Gmail or similar site. Include in the mail some words or phrases that might trigger a particular kind of advertisement. Examine the advertisements that you see when you read the mail and look for ones that appear to know your geographical location.

The specific combination of which browser you use, what fonts you have available, and half a dozen other tidbits can identify you uniquely, or almost so, to a surprising degree. Visit Panopticlick for an eye-opening demonstration. How unique are you? Try it with two different browsers.
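Panopticlick reports its result in bits of identifying information. As a rough sketch of what that unit means: if only one browser in N shares a given trait, the trait carries log2(N) bits. The function names below are invented for this illustration, and adding the bits of several traits assumes the traits are independent, which real browser traits are not, so treat the sum as an upper bound.

```python
import math


def bits_of_identity(one_in_n: float) -> float:
    """Bits of identifying information when one in `one_in_n` browsers shares a trait."""
    if one_in_n < 1:
        raise ValueError("one_in_n must be at least 1")
    return math.log2(one_in_n)


def combined_bits(*one_in_n_values: float) -> float:
    """Sum the bits of several traits.

    Assumption: the traits are statistically independent. Real traits
    (fonts, plug-ins, screen size) are correlated, so this overestimates.
    """
    return sum(bits_of_identity(n) for n in one_in_n_values)


if __name__ == "__main__":
    print(bits_of_identity(1024))   # a trait shared by one browser in 1024
    print(combined_bits(4, 8))      # two hypothetical traits combined
```

For scale, about 33 bits would be enough to single out one person among roughly 8 billion, since log2 of 8 billion is just under 33.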

  • List what you learned from Privacy.net (and grc.com if you are using Windows). What potentially significant information about you and your computer does your browsing reveal to sites that you visit? What potential vulnerabilities are reported about your system?
  • Was your IP address consistent when you checked it on your operating system and when you saw it on Privacy.net or other page that analyzed your traffic? If they were different, what could be the reason?
  • What did you learn about geographical location data? List the sites that you found that appeared to guess your location and report how precisely they had you located.
  • What did Panopticlick say? List the two browsers and the number of bits of identifying information reported for each.

 


Part 3: Cookie Crumbs

We've talked about how cookies can be used to track what web sites you visit, especially "third-party" cookies (that is, cookies that come from someone other than the web site you accessed directly) that aggregate and correlate information about your visits to apparently unrelated sites.

First, how many cookies do you currently have? Record at least a rough count, and note whether you had already removed cookies after the lecture about them. The easiest way to find cookies is to use the browser. In Firefox on a Mac, use Preferences and the Privacy tab. In Chrome, "customize and control" icon / Settings / Use advanced settings / Privacy / Content settings... / Cookies. In Safari, Preferences / Privacy. (These details change continuously.)

Now remove all cookies. Set your browser preferences to allow all cookies, then visit at least half a dozen major sites (media, sports and e-commerce sites are good for this, as are search engines and even universities). Check how many cookies a typical visit deposits. Track the cookies that are tracking you and look for evidence of linkage, e.g., updated third-party cookies or URLs after visiting independent sites. You might find it easiest to use a browser setting that asks about each cookie individually.

For sites that you visit regularly, see whether they deposit third-party cookies. See whether the cookies contain interesting information instead of just long strings of apparently random letters and numbers. Look at the dates when cookies expire.
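If you want to examine a cookie outside the browser, here is a minimal Python sketch that pulls the name, value, domain and expiration date out of a Set-Cookie header. The sample header, the domain tracker.example and both function names are made up for illustration. The third-party test is deliberately naive: it only compares host name suffixes and ignores the finer rules that real browsers apply.

```python
from email.utils import parsedate_to_datetime
from http.cookies import SimpleCookie

# A made-up Set-Cookie header value; real ones vary widely.
SAMPLE = ("uid=3f9a1c; Domain=.tracker.example; Path=/; "
          "Expires=Wed, 01 Jan 2025 00:00:00 GMT")


def parse_set_cookie(header_value):
    """Return a list of dicts describing each cookie in the header value."""
    jar = SimpleCookie()
    jar.load(header_value)
    cookies = []
    for name, morsel in jar.items():
        expires = morsel["expires"]
        cookies.append({
            "name": name,
            "value": morsel.value,
            "domain": morsel["domain"],
            # No Expires attribute means a session cookie.
            "expires": parsedate_to_datetime(expires) if expires else None,
        })
    return cookies


def is_third_party(page_host, cookie_domain):
    """Naive check: does the cookie's domain differ from the page's site?"""
    domain = cookie_domain.lstrip(".").lower()
    if not domain:
        raise ValueError("cookie_domain must not be empty")
    host = page_host.lower()
    return not (host == domain or host.endswith("." + domain))


if __name__ == "__main__":
    for cookie in parse_set_cookie(SAMPLE):
        print(cookie)
        print("third-party on www.cnn.com?",
              is_third_party("www.cnn.com", cookie["domain"]))
```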

Repeat some of the experiment with a different browser if you can do so without too much work. Determine what happens in the second browser if you remove the cookies from the first browser.

"Web bugs" are another way to track when someone visits a web site or accesses information using a program that interprets HTML; a web bug is typically an almost invisible 1x1 pixel image that includes a URL, like this one encountered at cnn.com:

     <img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0"
          height="1" width="1" border="0">

When the image is retrieved, the server at 2o7.net, a large aggregator owned by Adobe, knows that you have visited the page that contained the img tag. (The AdBlock extension removes many advertising images both large and small and thus greatly reduces tracking.)

Find a web page that includes a web bug from a third party. You can often find candidate links in a web page by searching for things like "height=1" in various forms, and in the Security panels in Firefox. Look for a web bug in an email message.
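Searching the page source by hand works, but the search can also be automated. The following is a minimal Python sketch that scans HTML for 1x1 images served from a host other than the page's own. The class and function names are invented for this illustration, and the host comparison is an exact match, so images from a sibling host of the same site would also be listed. Treat the results as candidates to inspect, not as proof of tracking.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# The cnn.com example from the text above.
SAMPLE = '''<p>Some story text.</p>
<img src="http://cnnglobal.122.2o7.net/b/ss/cnnglobal/1/H.1--NS/0"
     height="1" width="1" border="0">
<img src="http://www.cnn.com/logo.png" height="60" width="120">'''


class WebBugFinder(HTMLParser):
    """Collect src URLs of 1x1 images that come from another host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host.lower()
        self.candidates = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        if attributes.get("width") != "1" or attributes.get("height") != "1":
            return
        src = attributes.get("src") or ""
        host = urlparse(src).hostname or ""
        # Relative URLs have no host and belong to the page itself.
        if host and host != self.page_host:
            self.candidates.append(src)


def find_web_bugs(html, page_host):
    """Return candidate web bug URLs found in `html`."""
    finder = WebBugFinder(page_host)
    finder.feed(html)
    return finder.candidates


if __name__ == "__main__":
    for url in find_web_bugs(SAMPLE, "www.cnn.com"):
        print(url)
```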

Flash deposits cookie-like data on your machine as well. Flash cookies are easy though tedious to get rid of, by visiting this site. Before removing them, check what Flash cookies you have. Determine how many sites are represented and how much total space is used.

  • How many cookies were there on your computer when you started the lab? How many cookies does a typical visit deposit? What was the most distant expiration date?
  • Did you find third-party cookies? Provide a sample cookie.
  • Report on what you found out about web bugs. Include a sample web bug URL and describe what it is.
  • Report on what you found out about Flash cookies. Include the count, size, and a handful of site names. Are there sites that you don't think you have visited directly?

 


Part 4: Defenses and Countermeasures

As we discussed in class, there are ways to limit your risks and the amount of information that you reveal. Virus checkers are the most important, but there are plenty of others as well.

Many web sites insist that you provide a working email address before they will let you register or access some service. 10MinuteMail provides a useful service: it gives you an email address that's valid for 10 minutes and shows you whatever mail arrives during that time; that lets you retrieve the registration key or whatever, without giving away a real address. Try this service. Determine how long it takes for mail to arrive. Measure how long it takes for the temporary address to time out.

As an alternative, Yopmail lets you invent your own email address, and retains mail for that address for a week. Try Yopmail. Which one do you prefer, and why?

Check your own environment. For your regular browser, record your default settings for cookies, filename extensions, Javascript, Java, popups, automatic updates, downloading, software installation, programs that start automatically, etc. If your mail reader provides a previewer that interprets HTML and thus is subject to web bugs, try sending yourself mail with a reference to an image in your public_html directory, i.e., http://www.princeton.edu/~your_netid/lab6, to see whether the image is retrieved and displayed.

Check what plug-ins and add-ons are already installed in your browser. Among those you might consider adding are AdBlock, NoScript, Ghostery, FlashBlock, and Cookie Monster; each reduces your exposure to various kinds of tracking and potentially harmful content.

Install Ghostery, which works in Firefox, Chrome and Safari. This extension detects and disables Javascript trackers, which would otherwise report your page visits and activities to advertising aggregators. The Firefox version also disables third-party cookies. Determine how many trackers Ghostery reports that it is blocking. Visit some sites to see how many trackers are in use. For me, Princeton has one (Google Analytics); ITWorld.com has 10. Try to find the highest number possible; there might even be a small and worthless prize for the person who finds the worst offender.

Word, Excel and other programs include a Visual Basic interpreter that can be used to silently run programs that are included in documents. (Office 2008 on Macs does not support VB, but Office 2011 on Macs does.) What level of macro protection are you running in Word and Excel? (Look under Tools / Macros, or Preferences / Security on Macs with Office 2011.) If you run Internet Explorer, determine the security level that is being applied to ActiveX controls.

Perhaps surprisingly, PDF files can contain Javascript code, which will be interpreted by Adobe Acrobat Reader. Find out whether your version of Acrobat enables Javascript; if so, you can safely turn it off.

Reconsider your Facebook privacy settings. Bear in mind that most Facebook information is readily available outside your own list of friends and networks.

  • What was the result of your experiments with 10MinuteMail? What email address did it assign to you? If you use it a second time a while later, what is the second address? What was your experience with Yopmail? Which do you prefer and why?
  • What operating system are you running? What browser do you normally use? What mail client do you normally use?
  • What did you learn from Ghostery? What was the largest number of trackers, and at what site?
  • Report on how you have your defenses configured for Word, Excel, Acrobat Reader, and your most frequently used web browser. Summarize your privacy settings for your Facebook account. ("I don't have a Facebook account" is fine in the unlikely event that it's true.)

 


Part 5: Submitting your Work

Finally, if you saw anything interesting or suspicious that we didn't ask about specifically, or if you have any thoughts on how to improve this lab, we'd like to hear them. There are a couple of wrapup questions in lab8.doc that address this:

  • What changes if any did you make to your online settings and behavior as a result of doing this lab?
  • [optional] What changes might we make to the lab to improve it?

Thanks.

When you're all done:

  • Upload lab8.doc to the CS dropbox for Lab 8: https://dropbox.cs.princeton.edu/COS109_F2014/Lab8.
  • Be sure to upload the file as lab8.doc; do not use .docx format.