Princeton University
Computer Science Dept.

Computer Science 435
Information Retrieval, Discovery, and Delivery

Andrea LaPaugh

Spring 2006


Part 1: topics in information retrieval and manipulation

Tues. Feb. 7:
Overview of course topics and organization.  
Begin information retrieval topics: How do we model information retrieval and how do we evaluate it?
Thurs. Feb. 9:  Evaluation in the age of the Web
Tues. Feb. 14:
Classic text information retrieval.
Thurs. Feb. 16: Classic models continued: latent semantic indexing


Sunday Feb. 19:  Homework 1 is now available; due Feb. 27, 2006.

Tues. Feb. 21:
Classic models continued: the Bayesian inference network probabilistic model.
Thurs. Feb. 23:  Ranking documents using linked structures (social networks, hypertext, the Web)
Monday Feb. 27, 2006:   
Homework 1 due.

Tues. Feb. 28:  Hubs, authorities and communities;  relevance feedback;  a sample of experimental comparisons of IR algorithms.
Thurs. March 2: Web characterization for search engines; indexing
Friday, March 3:  Homework 2 is now available; due March 9, 2006.

Tues. March 7:
Indexing continued.

Thurs. March 9: Homework 2 due.
Thurs. March 9: Guest presentation: Professor Tom Funkhouser speaks on "A Search Engine for 3D Models"

Tues. March 14:
Compression of indexes (moved from Mar. 7).

Part 2: topics in information discovery

Thurs. March 16: Project proposal due. See course Project Page.
Thurs. March 16:
Introduction to information discovery topics; document similarity
Fri. March 17: Homework 3 available (CORRECTED Mar. 20).

Tues. March 21 and Thurs. March 23: SPRING BREAK
Tues.  March 28

Thurs. March 30: Clustering continued; Martin Makowiecki presents the k-means algorithm
Friday, March 31: Homework 3 due.   Note change of due date.

Tues.  April 4: Clustering continued (example of use of min-max cut, cluster comparison, cluster applications)  and question and answer session for the exam.

TAKE-HOME EXAM AVAILABLE in class 12:20pm THURSDAY April 6; DUE 11am in class TUESDAY April 11.
See the March 31 posting on announcements for  further information.

Thurs. April 6: Clustering based on minimum cut trees
Tues. April 11:  take-home exam due in class 11am.
Tues.  April 11
Data mining - associations within data

Thurs. April 13: Data mining cont.: beyond association rules; introduction to recommender systems

Tuesday April 18: presentations by Chris Richbourg and John Son on collaborative filtering methods. 
PROJECT  progress check and PRESENTATION topic DUE Wed, April 19. 
See the April 10 posting on announcements for  further information.

Part 3: systems issues in delivering digital information

Thursday April 20: Crawling the Web and other social networks

Tuesday April 25:
presentation by Joseph Bradley on latent Dirichlet allocation; overview of Web caching.

Wednesday April 26: Homework 4 (last) now available - due Wednesday May 3
Note change of due date.

Thursday April 27:  presentation by Frank Macreery on prediction of Web page access;  finish Web caching

Tuesday May 2: Content distribution networks;  very brief publish-subscribe overview
Wednesday  May 3:  Homework 4 due.   Note change of due date.
Thursday May 4: presentations by Michael Wenzel on music classification and Sergio Garza on the Cat-a-Cone search interface; wrap-up.

Project Report due 5:00 pm Dean's Date, Tuesday May 16, 2006

Project demonstration:  If you have implemented something that lends itself to live demonstration, I would like to see it after I receive your report and before 5pm Mon. May 22, 2006.

* on reserve in the Engineering Library

A.S. LaPaugh Thu May  4 17:13:27 EDT 2006