Princeton University
Computer Science Dept.

Computer Science 598b
Advanced Topics in Computer Science

Digital Information Access

Andrea LaPaugh

Spring 1999



Tues, Feb. 2: Organization and brief overview of topics
Reading:

Part 1: topics in information retrieval and manipulation

Thurs., Feb. 4: class cancelled.

Tues., Feb. 9: Review of indexing, hashing, basic searching.
Reading:

Thurs., Feb. 11: Handling large spaces in the vector space model: Latent Semantic Indexing.
Reading:

Also of interest:

Tues., Feb. 16: Handling large collections, continued: inverted files.
Reading:

Thurs., Feb. 18: No class due to conflict in my schedule

Tues., Feb. 23: Evaluating retrieval systems:
Reading:

Thurs., Feb. 25: Ranking documents
Reading:

Tues., Mar. 2: Using URL structure for Web document categorizing:
Reading:
The following three papers are available from Jon Kleinberg's publication list. The first paper presents a technique developed by Kleinberg that uses the URL structure of hypertext documents to deduce which documents might be authorities for a topic. The other two papers apply this technique. We will study the first paper in detail and then look more briefly at its applications as demonstrated by the follow-on papers. The papers appear in the order in which you should give your attention to them.
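For orientation before the reading, the hub/authority idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the papers: the function names, the dictionary representation of links, and the toy graph are all hypothetical. Here a page's authority score is the sum of the hub scores of pages linking to it, and its hub score is the sum of the authority scores of the pages it links to; both are rescaled to unit length on each round.

```python
import math


def normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {page: v / norm for page, v in scores.items()}


def hubs_and_authorities(links, iterations=50):
    """Iteratively estimate hub and authority scores.

    links: dict mapping a page name to the list of pages it links to
    (an assumed, illustrative representation of the link graph).
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)

    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}

    for _ in range(iterations):
        # Authority score: sum of hub scores of the pages that link here.
        new_auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for dst in targets:
                new_auth[dst] += hub[src]
        # Hub score: sum of authority scores of the pages linked to.
        new_hub = {
            p: sum(new_auth[dst] for dst in links.get(p, []))
            for p in pages
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)

    return hub, auth


if __name__ == "__main__":
    # Toy graph: "a" and "b" are link lists; "c" and "d" are the pages cited.
    toy = {"a": ["c", "d"], "b": ["c"]}
    hub, auth = hubs_and_authorities(toy)
    print("best authority:", max(auth, key=auth.get))
    print("best hub:", max(hub, key=hub.get))
```

In the toy graph, the page cited by both link lists ends up with the highest authority score, and the page that links to the most cited pages ends up as the strongest hub.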

Thurs., Mar. 4: Document similarity and clustering
Reading:

Also of interest:

Tues., Mar. 9: Finish "document clustering and similarity".
Class will end at 4:00pm so that people can attend the Program in Science, Technology and Ethics seminar:
Internet Privacy: A Right or a Contradiction, by Jason Catlett of Junkbusters Corporation. 4:30 pm in Bowl 5, Robertson Hall

Thurs., Mar. 11: Semantic and feedback techniques.
Reading:

Also of interest:

Tues., Mar. 16 and Thurs., Mar. 18: spring recess

Part 2: systems issues in delivering digital information

Tues., Mar. 23: Spiders
Reading:

Also of interest:

Thurs., Mar. 25: More on "crawling" the web:
We will finish the papers from Tuesday and discuss the Introna and Nissenbaum paper.
Class will end at 3:55pm so that people may attend the department colloquium.

Reading:

Also of interest:

Tues., Mar. 30: Web caching
Reading:

Also of interest:

Thurs., Apr. 1: Web prefetching
Reading:

Also of interest:

Tues., Apr. 6: meta-engines
Reading:

Also of interest:

Thurs., Apr. 8: agents
Reading:

Also of interest:

Tues., Apr. 13: more on interoperability
Following class is the Program in Science, Technology and Ethics seminar by Peter Arge: The Embedded Internet (co-sponsored by Computer Science Dept.)
Reading:

Thurs., Apr. 15: reliability and permanence
This class will end at 4pm due to another commitment that I need to keep. Let's try to start on time.
Reading:

Also of interest:

Tues., Apr. 20: Peter Yianilos: Archival Intermemory
Reading:

Also of interest:

Thurs., Apr. 22: Geliang Tong: The CMU video library project.
Reading:

Tues., Apr. 27: Jon Forsyth: audio retrieval
Reading:

Also of interest:

Thurs., Apr. 29: Peter Mei: multivalent documents
Reading:

Also of interest:

Mon., May 17: course projects due

Tues., May 18, 10:30-noon: Final meeting: presentation of class projects and discussion of economic issues.
Reading:

Also of interest:


* indicates handed out in class