COS 435, Spring 2006 - Homework 1:  Search engine test

Due 5:00pm Monday, February 27, 2006.


General Collaboration Policy

You may discuss problems with other students in the class. However, each student must write up his or her own solution to each problem independently. That is, while you may formulate the solutions to problems in collaboration with classmates, you must be able to articulate the solutions on your own.


Lateness Policy

A late penalty will be applied unless there are extraordinary circumstances and/or prior arrangements.


The Assignment

This homework is our class experiment with evaluating search engines. This is only meant to be an exercise, so I do not expect we can do a thorough enough job to call the study valid. But it will have all the components of a full evaluation, and hopefully we will get something interesting.

Each of you has chosen one query, which you will run on each of three search engines: Google, Yahoo, and MSN Search. I chose these for their popularity and their distinct underlying search indexes. Consider only the regular search results, not sponsored links. Also, ignore the clustering by site in Google and MSN Search; count each result returned. If you get several results in languages other than English, you can go to the advanced search and choose English only, but then do this for all of the search engines. (In my trials, I did not get foreign-language results with a regular search, so this may not be an issue.) Be forewarned that I once got fewer than 10 results on the first results page from MSN Search when it was set to 10 results per page (even though there were more than 10 results in total).

Before running the query on the search engines, write a description of what you will consider a relevant document and what you will consider a highly relevant document for your own hand assessment of search engine results. Use the narrative section of a TREC topic specification as your model; this is how the TREC experiments define relevance (for examples, see the class presentation notes for February 9 on "Relevance by TREC method"). You will hand in this description.

After writing your description of relevance and high relevance, run your query on each search engine and record the first 20 results returned. To get a pool for hand assessment, take the first 15 results from each search engine. Collect the 45 results from the 3 search engines, remove duplicates, and visit each result to decide relevance. Score each result as one of:

   0 -- irrelevant
   1 -- relevant
   2 -- highly relevant
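
If it helps with the bookkeeping, here is a minimal sketch (in Python) of building the assessment pool by merging the top-15 lists and removing duplicate URLs. The result lists below are hypothetical placeholders; substitute the URLs you actually recorded.

    # Sketch of building the assessment pool: merge the top-15 lists from the
    # three engines and drop duplicate URLs.  These short lists are
    # hypothetical placeholders -- use the URLs you recorded.
    google_top15 = ["http://example.com/a", "http://example.com/b"]   # ... 15 URLs
    yahoo_top15  = ["http://example.com/b", "http://example.com/c"]   # ... 15 URLs
    msn_top15    = ["http://example.com/a", "http://example.com/d"]   # ... 15 URLs

    pool = []
    seen = set()
    for url in google_top15 + yahoo_top15 + msn_top15:
        if url not in seen:          # keep only the first appearance of each URL
            seen.add(url)
            pool.append(url)

    print(len(pool), "distinct results to assess")
    for url in pool:
        print(url)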

After constructing the pool, go back and rate each of the first 20 results returned by each search engine. If a result does not appear in the pool, it receives a rating of 0 (irrelevant). If a document appears twice under different URLs in the list for one search engine, count it only at its better ranking for that search engine and delete any additional appearances within the same list. In this case there will be fewer than 20 distinct documents returned by that search engine. Do not go back to the search engine to get more documents: keep only what was returned in the first 20, in the order ranked, and give the now-empty positions at the end of the list ratings of 0.

For each search engine, calculate the discounted cumulative gain (see the paper "Evaluation by highly relevant documents" in your readings) at each document rank from 1 through 20. You should turn in three length-20 lists of discounted cumulative gains -- one list for each search engine. Please do this by email so that I can combine them easily; I will do the averaging across queries for each search engine and compare the results. In your email also include the actual query used, your description of relevance and high relevance, and any observations you find interesting or relevant about the search results overall or for a particular engine. Also save all the data you have collected -- just in case.
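
As a sanity check on the arithmetic, here is a minimal sketch of the discounted cumulative gain computation in Python. It assumes the formulation from the Jarvelin and Kekalainen paper with discount base b = 2 (gains at ranks below b are added undiscounted; the gain at rank i >= b is divided by log_b(i)); if the reading or class notes specify a different base, adjust accordingly. The ratings list is a made-up example, not real data.

    import math

    def dcg_by_rank(ratings, base=2):
        """Return the list [DCG(1), DCG(2), ..., DCG(n)] for the given ratings.

        ratings[i] is the relevance score (0, 1, or 2) of the document at
        rank i + 1.  Following the Jarvelin & Kekalainen formulation with
        discount base b: gains at ranks below b are added as-is, and the
        gain at rank i >= b is divided by log_b(i).
        """
        dcg = []
        total = 0.0
        for i, gain in enumerate(ratings, start=1):
            total += gain if i < base else gain / math.log(i, base)
            dcg.append(total)
        return dcg

    # Hypothetical ratings for the 20 ranks returned by one search engine
    # (0 = irrelevant, 1 = relevant, 2 = highly relevant).
    ratings = [2, 1, 0, 2, 0, 1, 0, 0, 1, 0, 0, 0, 2, 0, 0, 1, 0, 0, 0, 0]
    print(dcg_by_rank(ratings))   # the length-20 list to report for this engine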

Search Engine Watch has a lot of useful information about search engines. See "Our Departments" below the news on its home page.