Princeton University
Computer Science Dept.

Computer Science 597A

Principles of Database and Information Systems


Andrea LaPaugh

Schedule and Readings

Fall 2008



Readings

Each reading should be completed before its topic is begun in class, to facilitate class discussion.

Problem Sets

Click on the problem set number in the schedule below to reach the problem set due that week. (If there is no link, the problem set is not ready for distribution.)  Note that problem sets are not always due on the same day of the week.  Late penalties will be applied, unless there are extraordinary circumstances and/or prior arrangements.  The due date and time and the late penalties for each assignment are specified on the assignment.

Schedule

Subject to change. Be sure to check readings and assignments for changes up to 5 days before they are due.

prob. set due date | Topics | Readings in Silberschatz, Korth and Sudarshan
PART 1:  MODELS and QUERIES

Mon 9/15 Introduction
pdf of slides:  intro,  ER model
Chapter 1
9/17 The entity-relationship model
pdf of slides:  ER model continued
6.1 through 6.5

Mon 9/22 Extensions to the entity-relationship model
The relational model
Relating models
pdf of slides:  final ER model;  Relational model
6.7 (skip 6.7.6); 6.8;  NEW! 6.6
2.1
6.9
9/24  Relational algebra
pdf of 9/24 slides:
relational model and algebra;   supplement
pdf of 9/29 slides:  relational algebra cont.
2.2 through 2.5
PS 1 due Mon.  9/29
Mon 9/29
10/1 Relational Calculus
pdf of slides: relational calculus corrected
5.1, 5.2
PS 2 (pdf)
due Wed. 10/8
Mon 10/6 Overview of the SQL Language
pdf of slides: sql;  correction: slide 28
Chapter 3
10/8 Designing with and enforcing constraints
pdf of slides: constraints & decomposition (final)
4.2;  7.3 and 7.5
NEW! 7.4:  needed for definitions
PS 3 due Fri. 10/17
Mon 10/13
XML and the tree model
pdf of slides: XML (final)
NEW! XML "FOR vs LET" example (pdf)
NEW! XML Schema for books example (pdf)
Chapter 10 except 10.5 and 10.6
10/15

Mon 10/20
Information retrieval (IR)
pdf of slides: IR (final)
Chapter 19 except 19.2,  19.5
PART 2:  STORING, RETRIEVING, and MAINTAINING

PS  4 due Fri 10/24
10/22
Inverted indexes and search
19.5;  NEW!  Introduction to Information Retrieval (online version),  Chap. 1 and Section 2.4.2
fall break
project proposal
due Fri 11/7
Mon 11/3  File Organization
pdf of slides:  file organization
11.7 through 11.9 (helpful background: 11.2, 11.5, 11.6)
11/5 Indexing
pdf of slides:  indexing,
B+ tree insert, delete examples
NEW! dynamic hashing and costs
Chapter 12, except 12.9
test 1: out Mon 11/10
due Wed 11/12
Mon 11/10
 11/12 Relational Query Evaluation
pdf of slides: query evaluation
Chapter 13 

PS 5 due Wed 11/19
Sign-up for presentation
by Thurs. 11/20
Mon 11/17
11/19
Relational Query Optimization
pdf of slides:  query optimization (final)
14.1 through 14.4.3
declare paper for presentation
by Mon. 11/24
PS 6 due Wed 11/26
Mon  11/24 Indexes and evaluation for XML
pdf of slides: XML indexing
10.6;  NEW! Recommended:  Index-driven XQuery processing in the eXist XML database
11/26 Transactions
pdf of slides: transactions
Chapter 15;  NEW! 16.1.1 - 16.1.3; 16.6.   Recommended:  16.7
PART 3:  CURRENT RESEARCH:  Sample of advances in fundamentals and applications
Students present
project progress meetings
Mon 12/1
Ana Pop
Jeffrey K. Terrace
Addressing the problems with life-science databases for traditional uses and systems biology, Philippi and Kohler, Nature Reviews Genetics (09 May 2006).
Reorganizing compressed text, Brisaboa et al.,  SIGIR Conf., 2008.

12/3 Michael D. Golightly
Peng Jiang
Joe Jiang
Querying the Internet with PIER (pdf), Huebsch et al., VLDB Conf., 2003.
Mining significant graph patterns by leap search, Yan et al.,  SIGMOD Conf., 2008.
NEW! Models and issues in data stream systems, Babcock et al.,  PODS, 2002. (Original, more advanced paper:  Continuously Adaptive Continuous Queries over Streams, Madden et al., SIGMOD Conf., 2002.)

test 2: out Wed. 12/10
due Fri. 12/12
Mon 12/8 Jesse J. Farnham
Aaron K. Wong
Content-Based Music Information Retrieval: Current Directions and Future Challenges (embedded pdf),  Casey et al., Proceedings of the IEEE, April 2008.
Literature mining for the biologist: from information retrieval to biological discovery,  Jensen et al.,  Nature Reviews Genetics,  7(February 2006).
12/10
Zia Khan
Hao Y Liu
Final remarks
pdf of slides:  final remarks
MapReduce: simplified data processing on large clusters, Dean and Ghemawat, CACM,  51(1), Jan. 2008.
Management of probabilistic data: foundations and challenges, Dalvi and Suciu, PODS, 2007.
Project report due 5pm Tuesday, Jan. 13, 2009 (Dean's Date)
Before 5pm Mon. Jan. 19, 2009:  project demonstration (where applicable) and discussion.



A.S. LaPaugh Fri Dec 12 10:48:30 EST 2008