Richard L. Smith '70 Freshman Seminar
Google and Ye Shall Find???
FRS 117
Fall 2007
General information pages for the remainder of the semester (subject to additions):
click here for weeks 4 through 6
Nov. 7: No Class
Week 7, Nov. 14:
Guest instructors Prof. Edward Felten, Director of the Center for
Information Technology Policy, and David Robinson,
Associate Director of the Center for Information Technology
Policy.
Topics:
Social Issues
focus on privacy
Week 8 and first half Week 9,
Nov. 21 and 28:
Guest instructor Prof. Moses Charikar of the Computer Science Dept.
Topics:
Quality of search engine results
trust in results
quality of results versus
goals of search
Improving search engine results
Comparing search engines
Class discussion: Think
about the quality of the searches you have done in the past year.
What kinds of searches have
been easiest? What have been most difficult for
you? Do the easy searches share common features? Do the
difficult searches? What improvements to search - either new
options for specifying the search or new ways to rank or present the
results - would be helpful? We discussed trusting the results of
search a bit earlier in the semester; what issues in trusting
results would you like to revisit or introduce?
For our discussion of improving search engine results, we will start
considering what search engines other than Google have to offer that is
different, beginning with some well-known techniques that Google
chooses not to implement. We will eventually move to how we might
fairly compare search engines. In preparation for these topics,
try out some of the search engines you don't usually use.
Certainly try some searches on
Yahoo
if you don't use it at least occasionally already. Other
well-known engines to try:
MSN
Windows Live and
Ask
(AOL Web search is "enhanced" by Google, so we don't expect to see organic search
results that differ much from Google's. Of course, the
user experience and the ads may be different.)
Written assignment due this week: NONE
Reading for discussion today:
* (Originally for week 6) Andrei Broder, A taxonomy of web search, ACM Special
Interest Group on Information Retrieval (SIGIR) Forum, Vol. 36 (2), Fall 2002, pp. 3-10. This
article is mentioned by Battelle in Chapter 2. Is this
paper, written over 5 years ago, still relevant today?
(Originally for week 6) Some articles on
Wikipedia as we think about trusting search and sources:
*The word on Wikipedia:
Trust but verify, MSNBC and NBC News, March 29, 2007.
*Why you can't cite Wikipedia in my class, Viewpoint piece in
Communications of the ACM, Vol. 50(9), Sept. 2007, by the Middlebury
College history professor referred to in the MSNBC
article above. This piece has some inaccuracies of its own
(can you spot them?), but presents
Professor Waters' view in his own words.
*Wikipedia 2.0 - now with added trust, NewScientist.com News Service, 20
September 2007.
*Why Wikipedia Must Jettison Its Anti-Elitism,
by lsanger, on site kuro5hin.org, Fri Dec 31, 2004.
This is the article by Larry Sanger
referred to in the MSNBC article above.
*Exploring the Digital Universe, eLearn Magazine,
an answer to Wikipedia.
*About the Open Directory
Project
References for technical material:
Second half Week 9, Nov. 28:
Guest instructor David Robinson, Associate Director of
the Center for Information Technology Policy.
Topics:
Intellectual Property and Copyright
Google Book Search
Class discussion: Consider
the claims of Google and those of the publishers and authors opposing
Google's copying of copyrighted books without permission as discussed
in the Salon article
below. What do you think and why?
Written assignment due this week: NONE
Reading for discussion today:
Week 10, Dec. 5:
Topics:
Future of search
Understanding Intent
Human interventions
Concept-based search
Using the "Database of Intentions"
Class discussion: Considering
the new information you have from class last
week and from the reading, what do you think are the most pressing
needs for search improvement? What approaches do you think are most
promising?
Written assignment due this week: NONE
Reading for discussion today:
* Battelle, Chapter 11 ("Perfect Search")
* Some articles on new search engine features:
* Two clustering meta-search engines to try:
Week 11, Dec. 12:
Topics:
The Semantic Web
Preserving digital content
Archiving the Web
Searching non-text media by features rather than keyword
Class discussion: Be
prepared to discuss the article "The Semantic Web" by Berners-Lee et al.
(Recall that Tim Berners-Lee is credited as the inventor of
the World Wide Web.) This article is over 6 years old. Have
you encountered any tools that approach the functionality described in
this article? What do you think would be needed to achieve
the vision in the article? Are you eager for such a
tool? What would you pay for it? (The other readings point
to some commercial projects.)
Consider the task of preserving knowledge in digital form. Should
everything be archived? If not, then what? This problem is
older than the Web. How many non-print forms of recording (any
media) have you encountered that have disappeared or are disappearing?
We'll look at some methods of searching visual and audio media without
depending on text labels. Look at and try out the sites
listed below.
Reading for discussion today:
The Semantic Web:
Archiving Projects:
Searching non-text media by non-text
features:
Week 12, Thursday January 10,
noon-2:50pm, Forbes Multi-purpose
room (across the hall from our usual room)
Topics:
The future of Google
Google serving all your needs
Monopoly of
information
"Computing in the Clouds"
compare peer-to-peer systems
Class discussion: What is Google emphasizing for
the near future? As a consumer, what would you like from Google
in the future?
One direction receiving a lot of press is "Computing in the Clouds" --
and not only as a big part of Google's future, but as a big part of
computing in general. Do you use "cloud computing" services? What
are the pros and cons?
Peer-to-peer networking has become famous for allowing users to share
content and infamous for allowing users to side-step copyright.
Do you use peer-to-peer systems (e.g., Kazaa, BitTorrent)? Do you expect
"peer-to-peer" services and "cloud computing" to conflict, co-exist, or
complement each other?
From Google itself:
“Google's mission is to
organize the world's
information and make it universally accessible and useful.”
What is Google's long-term vision for achieving this mission?
Written assignment due this week: NONE
Reading for discussion today:
* Review Battelle, Chapters 10, 11 and Afterword
* Look over Google's offerings beyond search:
Google Services & Tools and Google Labs
Computing in the Clouds:
* Software via the Internet: Microsoft in ‘Cloud’ Computing by John Markoff,
The New York Times, September 3, 2007.
* Google Gets Ready to Rumble With Microsoft by Steve Lohr and Miguel Helft,
The New York Times, December 16, 2007.
* I.B.M. to Push ‘Cloud Computing,’ Using Data From Afar by Steve Lohr,
The New York Times, November 15, 2007.
* Computing in the Cloud? I’ll Keep my Data, Thank You, Dec. 17, 2007, blog by
Michael Zimmer, 2007-2008 Microsoft Resident Fellow at the
Information Society Project at Yale Law School.
Peer-to-peer:
References for technical material:
Final paper due Tuesday, January 15, 2008 at 5pm!
last revised Mon Jan 7 12:05 EST 2008
Copyright 2007, 2008 Andrea S. LaPaugh