Princeton University
Computer Science Dept.

Computer Science 435
Information Retrieval, Discovery, and Delivery

Andrea LaPaugh

Spring 2006

Information about the Course Project

Each student will do a final project of his or her choosing related to the material of the course.

Information on Project requirements:

Project Report due 5:00 pm Dean's Date, Tuesday May 16, 2006: 
You are required to submit a report that describes your project. The report must state the topic and goals of the project, your methodology, and your results.

If it is an experimental project, you need to describe what was implemented, the major implementation decisions, how you designed the experiments, and the experimental results. If you developed a system or tool, you may not have experiments per se, but you must describe how you are evaluating the project and the outcome. You should also relate your work to other work on the problem. Your code should be in an appendix or posted on a Web page with the URL provided (Web posting is preferred).

If your project is a theoretical study, you need to describe the problem, review what was known about the problem before your analysis, and give the details and the results of your theoretical analysis.

If your project is a literature-based project, you need to describe the major issues under study, summarize the major techniques and the theoretical and/or experimental results presented in the literature, and critically analyze the results.

For any type of project, be sure to include a bibliography of all the sources you used.

Projects will be graded on thoroughness and depth of thought. Difficulty will be taken into consideration. Keep in mind that evaluation is an important part of any project. Be clear on the goals of your project and how you demonstrate or measure success.

Project demonstration: If you have implemented something that lends itself to live demonstration, I would like to see it after I receive your report and before 5:00 pm Monday, May 22, 2006.

Project Proposal Due Thursday March 17, 2006.
Submit by email a paragraph describing your project.  Include as much detail as possible.  This will be the starting point of a discussion with Professor LaPaugh to make sure the project is of the appropriate scope for a class project.

List of suggested projects. This list will be expanded as the semester progresses.

These topics are fairly broad and need further refinement based on a student's particular interests. Students are encouraged to suggest other project topics based on their own interests.

  1. Do a literature search and analysis of the state of the art of video libraries and video retrieval (no implementation). Any other medium, such as images or music, can be substituted for video. Data mining activities can be substituted for retrieval activities.
  2. Experiment with recommender systems for a particular application -- learn the techniques and experiment with your own.
  3. Investigate the use of dependence among index terms (e.g. co-occurrence) in the literature and by your own experiments.
  4. Investigate the history and development of the SMART project at Cornell (originally Harvard) by Dr. G. Salton. This project produced many results in classic information retrieval (no implementation).
  5. Investigate the problems of digitizing old documents to build digital archives: optical character recognition (OCR) and searching "OCRed" documents.
  6. Investigate search on handheld displays. What special measures do companies providing the service take now? How do search engines perform? Are special ranking algorithms needed that do really well at getting the top few results (5? 7?) right? Is there something that can be done to improve matters? Propose one approach and test it.
  7. Investigate one or more variations on the use of inference networks for search beyond what we do in class. Test variations from the literature or suggest your own. Implement and test against a small document collection.
  8. Investigate inference networks for non-text collections searched using text queries or  non-text queries.
  9. Investigate the data mining of a data stream from a particular application, e.g. a particular set of sensors.
  10. Investigate the use of clustering in some application.
  11. Propose and implement a visualization of the relationships among a collection of objects (text documents, images, Web pages, etc.).
  12. Experiment with methods for predicting the next Web page a user will access.
  13. Investigate the efficiency of matching publications to subscriptions in various places within a publish-subscribe architecture.
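
As an illustration of how small a starting experiment can be, here is a minimal sketch for topic 3 (dependence among index terms). It counts, for each pair of terms, the number of documents in which both appear. The function name cooccurrence_counts, the whitespace tokenization, and the toy collection are assumptions made for this example only; they are not part of any course material or prescribed method.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(docs):
    """Count, for each unordered term pair, the documents containing both terms.

    Assumption: terms are lowercased, whitespace-separated tokens. A real
    project would likely add stemming, stop-word removal, and so on.
    """
    counts = Counter()
    for text in docs:
        # Use a set so each pair is counted at most once per document;
        # sort so each pair has one canonical (alphabetical) ordering.
        terms = sorted(set(text.lower().split()))
        counts.update(combinations(terms, 2))
    return counts


# Hypothetical toy collection, for illustration only.
docs = [
    "information retrieval systems",
    "video retrieval systems",
    "information discovery",
]
pair_counts = cooccurrence_counts(docs)
```

Here pair_counts[("retrieval", "systems")] gives the number of documents containing both terms. Raw counts like these are only a first step; a project would need to decide how to turn them into a dependence measure and how to evaluate its effect on retrieval.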

Online resources

A.S. LaPaugh Fri Apr 14 16:16:20 EDT 2006