Princeton University
Computer Science Dept.

Computer Science 425

Database & Information Management Systems

Andrea LaPaugh

Fall 2006



Project Overview

The goal of the course project is to have you explore in greater detail some aspect of the design or use of database systems. The choice of project is yours, but it does need to be approved. You may work individually or in pairs.

Examples

Most projects are primarily experimental (this includes application development), but theoretical analysis and in-depth literature research are also possible. If you choose to implement a database for an application, the database must have enough complexity to make it more than a straightforward exercise (see item 2 below). An experimental project may also study in depth one or more algorithms from database or search systems (see item 1 below). Theoretical projects analyzing some aspect of database design must add substantial depth to the textbook treatment of the subject. A project may be primarily in-depth literature research on the state of the art in an area of databases or information retrieval that we have not covered in class. However, such projects must include critical analysis of the results in the literature.
Important note: If you are a graduate student and wish to satisfy your programming requirement with the COS 425 project, you must choose an implementation project, and you must notify me in advance that you want to use the project to satisfy the requirement.

Here are some suggestions. You may choose one or use them as guides to specifying your own project. Note that many of these suggestions are framed in terms of either database systems or information retrieval (search) systems. In reality, database and information retrieval techniques are combined in many tools and applications; your project certainly can combine them.

  1. In earlier offerings of COS 425, the same project was done by all students. It was the implementation and evaluation of the main query optimization algorithm presented in Ramakrishnan and Gehrke, Database Management Systems (3rd edition). A detailed specification of that project can be found in the COS 425 spring 2001 project description.
  2. Implement an application that requires database support. Implement the user interface, the application interface to the database and the database. This application needs to have some complexity in functionality, constraint maintenance, reliability or user interface. The user interface may be minimalist if the focus of the project is elsewhere. The application should be something in which you are interested and for which you can obtain or generate a reasonable set of interactions and data for testing. The application need not be "serious": one previous student implemented a database and rule system for the game Warhammer. The database may be relational or in XML. Depending on the system being used and the student's background, learning the use and configuration of an API for a database system may be considered a substantial project goal. (See Chapters 6 and 7 of Ramakrishnan and Gehrke, Database Management Systems (3rd edition), for a discussion of application development for SQL. See Chapter 10 of Silberschatz, Korth and Sudarshan, Database System Concepts (5th edition), for a brief discussion of application development in XML. Problem sets 3 and 4 give you pointers to SQL and XML servers.)
  3. Database techniques and systems exist for special kinds of data. For example, Chapter 28 of our text presents techniques for spatial (geometric) data. Your project may focus on techniques for a special kind of data. Possibilities include:
  4. There are also customized information retrieval techniques for special kinds of data.
  5. Do an empirical study of two or three search services that use different search engines, i.e. the crawling, indexing and search software are different.  To the best of my knowledge Google, Yahoo, Ask  and MSN Search are all different, but you should check that nothing has changed in this fast-changing area (e.g. until 2004, Yahoo used Google for its Web search).   To do your study, read about well-established studies like those conducted by TREC.  Be forewarned that such studies involve testing the search engines on identical queries and analyzing the results, which can be very time consuming. 
  6. Explore methods for enhancing and changing queries that are specified by search terms -- i.e. the kind of queries we use every day. Such enhancements and changes attempt to deal with ambiguity, synonyms and other natural-language issues. Read about state-of-the-art methods for changing queries. Do some analysis of methods -- for example, do a comparative analysis from the literature or conduct experiments to evaluate methods. You may design and analyze a method of your own, but you must have knowledge of prior methods and results.
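
As a rough illustration of the kind of analysis item 5 involves, here is a minimal Python sketch. It assumes you have already collected each service's ranked result URLs for the same query and have judged by hand which of them are relevant. The function names and the sample data are hypothetical, and precision at k and top-k overlap are only two of many measures you might choose.

```python
def precision_at_k(results, relevant, k):
    """Fraction of the top-k positions that hold a result judged relevant.

    Dividing by k (not by the number of results returned) follows the
    usual definition, so a result list shorter than k is penalized.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for url in results[:k] if url in relevant)
    return hits / k


def overlap_at_k(results_a, results_b, k):
    """Jaccard overlap between the top-k result sets of two services."""
    a, b = set(results_a[:k]), set(results_b[:k])
    if not a and not b:
        return 1.0  # two empty lists are treated as identical
    return len(a & b) / len(a | b)


# Hypothetical ranked results for one query from two services,
# plus a hand-made set of relevance judgments (all made-up data).
engine_a = ["u1", "u2", "u3", "u4"]
engine_b = ["u2", "u5", "u1", "u6"]
relevant = {"u1", "u2", "u6"}

print("precision@3, service A:", precision_at_k(engine_a, relevant, 3))
print("precision@3, service B:", precision_at_k(engine_b, relevant, 3))
print("overlap@3:", overlap_at_k(engine_a, engine_b, 3))
```

In a real study you would average such measures over many queries, and the time-consuming part is producing the relevance judgments, not computing the numbers.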

Requirements

Each individual or pair must:
  1. By 5pm on Wed. Nov 8, 2006 send email to Professor LaPaugh containing a one-paragraph description of your project. Each individual must email Professor LaPaugh to confirm partnerships.
  2. During the week of Nov. 27, 2006 meet with Professor LaPaugh for 15-20 minutes to discuss project progress and issues.
  3. Submit by 5pm Tues. Jan 16, 2007 (Dean's date) a report that describes your project. This must include the goals of the project, your methodology and the results. If it is an application implementation, you need to describe the application, your design requirements, the major implementation decisions, and your assessment of the result. If it is an experimental algorithm study, you need to describe what was implemented, the major implementation decisions, how you designed the experiments, and the experimental results. If it is a theoretical study, you need to describe the problem, review what was known about the problem before your analysis, and give the details and the results of your theoretical analysis. If it is a literature-based project, you need to describe the major issues under study, summarize the major techniques and experiments presented in the literature and critically analyze the results; you must have a bibliography that includes recent research. For any project that involves programming, all source code you write should be in an appendix or made accessible on the Web.
  4. After the project report is submitted and before 5pm Mon. Jan. 22, 2007 each individual or pair must meet with Professor LaPaugh for a project demonstration (where applicable) and discussion.
Projects will be graded on thoroughness and depth of analysis. Difficulty will be taken into consideration. Keep in mind that evaluation is an important part of any project. Be clear on the goals of your project and how you demonstrate or measure success.

A.S. LaPaugh Mon Oct 30 16:05:26 EST 2006