Evaluation report for COS435 Assignment 2

by Yiming Liu


Overview

In Assignment 2, we studied how to design an information need and how to use it to evaluate a search engine. Every student tested their information need on two search engines, Google and Bing, and used the pool method to measure the following scores over the first 30 search results: reciprocal rank, precision at rank 10, DCG(10), average precision at rank 20, and recall at rank 30.

As we know, Google and Bing are both excellent search engines, but which one performs better overall on the information needs we designed? Let's find out. Here we gather the search results of 22 students and compute the average value of each score.

Note: The scores displayed in the figures below were calculated by the TA from the relevance scores provided by the students. They may differ from the scores recorded by the students where the students' own calculations contain errors.
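To make the averaging step concrete, here is a minimal sketch of how per-query scores could be summarized into a mean and a standard deviation. The function name `summarize` and the example numbers are hypothetical, and the use of the sample standard deviation is an assumption; the report does not say which variant was used for the error bars.

```python
from statistics import mean, stdev

def summarize(scores):
    """Return (mean, sample standard deviation) of per-query scores.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    return mean(scores), stdev(scores)

# Hypothetical per-query scores for one engine (not the class data).
example_scores = [1.0, 0.5, 1.0, 0.5]
avg, spread = summarize(example_scores)
print(f"mean={avg:.3f}, std={spread:.3f}")
```

With 22 queries, the same call would take a list of 22 scores per engine and per measure.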


Reciprocal Rank

The following figure compares the mean reciprocal rank of Bing and Google across the 22 queries. For the detailed plots for each query, please click here. For each of the 22 queries, both Bing and Google have a reciprocal rank of at least 0.5, i.e., at least one of the top 2 results is relevant. Neither search engine beats the other on every query. However, Google has the better overall performance on this measure.
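As an illustration of the measure, reciprocal rank is 1 divided by the rank of the first relevant result, so a value of at least 0.5 means the first relevant result is at rank 1 or 2. The sketch below assumes relevance judgments are given as a list in rank order, where any value greater than 0 counts as relevant; the function name is hypothetical.

```python
def reciprocal_rank(relevance):
    """Return 1/rank of the first relevant result, or 0.0 if none is relevant.

    relevance: judgments in rank order; a value > 0 is treated as relevant
    (an assumption about how the judgments are encoded).
    """
    for rank, rel in enumerate(relevance, start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0

# Hypothetical judgments: first relevant result at rank 2.
print(reciprocal_rank([0, 1, 0, 1]))  # 0.5
```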


Precision at Rank 10

The following figure compares the mean precision at rank 10 of Bing and Google across the 22 queries. For the detailed plots for each query, please click here. As we can see, both search engines achieve a mean precision at rank 10 above 70%, with Google having the higher mean. Bing also has a larger standard deviation (see the error bars).
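For reference, precision at rank 10 is the fraction of the top 10 results that are relevant. A minimal sketch, assuming judgments in rank order where any value greater than 0 counts as relevant (the function name and example data are hypothetical):

```python
def precision_at_k(relevance, k=10):
    """Fraction of the top-k results judged relevant.

    Divides by k even if fewer than k results are supplied,
    so missing results count as non-relevant.
    """
    top = relevance[:k]
    return sum(1 for rel in top if rel > 0) / k

# Hypothetical judgments for one query: 8 of the top 10 are relevant.
judgments = [1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1]
print(precision_at_k(judgments, k=10))  # 0.8
```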


DCG(10)

The following figure compares the mean DCG(10) value of Bing and Google across the 22 queries. For the detailed plots for each query, please click here. Again, Google has a higher mean value and a smaller standard deviation, which suggests better overall performance.
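DCG has several common formulations, and the report does not state which one the assignment used. The sketch below assumes the variant that discounts the gain at rank i by log2(i + 1); another widely used variant leaves rank 1 undiscounted and divides by log2(i) from rank 2 onward. The function name and the graded judgments are hypothetical.

```python
import math

def dcg_at_k(gains, k=10):
    """Discounted cumulative gain over the top-k results.

    Assumed formula: sum over ranks i = 1..k of gain_i / log2(i + 1).
    gains: graded relevance scores in rank order.
    """
    return sum(
        gain / math.log2(rank + 1)
        for rank, gain in enumerate(gains[:k], start=1)
    )

# Hypothetical graded judgments (0 = not relevant, higher = more relevant).
print(round(dcg_at_k([3, 2, 0, 1], k=10), 4))
```

Because the discount grows with rank, moving a highly relevant result toward the top of the list raises the score even when the set of retrieved results is unchanged.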


Average Precision at Rank 20

The following figure compares the mean average precision at rank 20 of Bing and Google across the 22 queries. For the detailed plots for each query, please click here. Google again has the higher mean.
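Average precision at a cutoff can be normalized in more than one way, and the report does not specify the assignment's definition. The sketch below assumes one common choice: average the precision values at each rank where a relevant result appears within the top 20, dividing by the number of relevant results found in that range. The function name and example data are hypothetical.

```python
def average_precision_at_k(relevance, k=20):
    """Mean of precision@i over ranks i <= k that hold a relevant result.

    Assumption: normalized by the number of relevant results found in the
    top k, not by the total number of relevant documents in the pool.
    """
    hits = 0
    precisions = []
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0

# Hypothetical judgments: relevant results at ranks 1 and 3.
print(round(average_precision_at_k([1, 0, 1, 0], k=20), 4))  # 0.8333
```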


Recall at Rank 30

For none of the 22 queries does either Bing or Google retrieve all relevant documents within the top 30 results. Therefore, we compare the recall at rank 30 here. For the detailed plots for each query, please click here. The winner is still Google.
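A rough sketch of how recall at rank 30 could be computed under the pool method follows. It assumes the pool of relevant documents for a query is the union of the relevant results found in either engine's top 30, which is the usual reading of pooling; the function name, the URLs, and the judgments are all hypothetical.

```python
def recall_at_k(ranked_results, relevant_pool, k=30):
    """Fraction of the pooled relevant documents that appear in the top k."""
    if not relevant_pool:
        return 0.0
    found = set(ranked_results[:k]) & relevant_pool
    return len(found) / len(relevant_pool)

# Hypothetical result lists and relevance judgments for one query.
google_results = ["a", "x", "b", "c"]
bing_results = ["b", "y", "d"]
google_relevant = {"a", "b", "c"}
bing_relevant = {"b", "d"}

# Assumed pooling step: union of the relevant documents from both engines.
pool = google_relevant | bing_relevant

print(recall_at_k(google_results, pool))  # 0.75
print(recall_at_k(bing_results, pool))    # 0.5
```

Under this reading, an engine reaches a recall of 1.0 only if it returns every relevant document that either engine found, which is why neither engine needs to reach full recall on any query.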


Conclusion

In this year's evaluation, Google beats Bing on every score, averaged over the queries submitted by the 22 students.