Princeton University
Computer Science Dept.

Computer Science 597A

Principles of Database and Information Systems


Andrea LaPaugh

Schedule and Readings

Fall 2011



Readings

Reading should be done before the topic is begun in class to facilitate class discussion.

Problem Sets

Click on the problem set number in the schedule below to reach the problem set due that week. (If there is no link, the problem set is not ready for distribution.) Note that problem sets may not always be due on the same day of the week. Late penalties will be applied unless there are extraordinary circumstances or prior arrangements. The due date and time and the late penalties for each assignment are specified on the assignment.

Schedule

Subject to change. Be sure to check readings and assignments for changes up to 5 days before they are due.

prob. set due date Topics Readings (in Ramakrishnan and Gehrke unless specified)
PART 1:  MODELS and QUERIES

Mon 9/19 Introduction
pdf of slides:  introduction
Chapter 1
9/21 The entity-relationship model
pdf of slides: ER model
Chapter 2

Mon 9/26 The relational model;
relational algebra and calculus

pdf of slides: 
Codd, E.F., A Relational Model of Data for Large Shared Data Banks, Comm. ACM 13, 6 (June 1970), 377-387.
Chapter 3
Chapter 4
9/28
PS 1 due Mon. 10/3
Mon 10/3
10/5
PS 2 due Wed. 10/12
Mon 10/10 Overview of the SQL Language
pdf of slides: SQL
Chapter 5
10/12 Designing with and enforcing constraints
pdf of slides: FDs and normal forms
Chapter 19, Sections 1-7 and 9
PS 3 due Wed. 10/19 Mon 10/17
XML and the tree model
pdf of slides: XML
Chapter 27, Sections 6 and 7. A more thorough presentation can be found in Silberschatz et al., Chapter 23, Sections 1-4 and 7.
10/19

PS 4 due Wed. 10/26
project prop. due Fri 10/28
Mon 10/24
Information retrieval (IR)
pdf of slides: information retrieval
Chapter 27, Sections 1-5 OR Silberschatz et al., Chapter 21, Sections 1-7
Introduction to Information Retrieval (online version),  Chap. 1 and Section 2.4.2
10/26
fall break
PART 2:  STORING, RETRIEVING, and MAINTAINING
exam 1 Wed 11/9 thru Fri. 11/11
Mon 11/7
File Organization
pdf of slides: file organization
Chapter 8, Sections 1-3 
(Chapter 9 background)
 11/9 Indexing
pdf of slides: indexing Part I, indexing Part II
Chapter 8, Sections 4 and 5
Chapter 10
Chapter 11

Mon 11/14
 11/16 Relational Query Evaluation
pdf of slides: query evaluation
Chapter 12, Sections 1-3
Chapter 13, Sections 1-3
Chapter 14
choice of paper for presentation due Monday 11/21: see description
 Mon 11/21
11/23
Relational Query Optimization
pdf of slides: query opt.
Chapter 12, Sections 4-6
PS 5 due Wed 11/30

project progress meetings 12/1-12/7
Mon 11/28 Transactions
pdf of slides: transactions
Chapter 16
Chapter 17, Sections 1-4
11/30
Crash Recovery
pdf of slides: crash recovery
Chapter 18
PART 3:  CURRENT RESEARCH:  Sample of advances in fundamentals and applications
Students present

Mon 12/5
Final Remarks (pdf of slides)

1. Donghun Lee
2. Arpan K. Ghosh
1. MCDB-R: risk analysis in the database, Proceedings of the VLDB Endowment, Sept 2010
2. Entangled queries: enabling declarative data-driven coordination, SIGMOD, 2011
12/7 1. Joshua Wetzel
2. Nanxi Kang
3. Xin Jin
4. Shilpa Nadimpalli
1. Semantic ranking and result visualization for life sciences publications.  ICDE 2010: 860-871
2. Flexible Aggregate Similarity Search, SIGMOD, 2011
3. Collaborative Topic Modeling for Recommending Scientific Articles, KDD 2011
4. WHAM: A High-throughput Sequence Alignment Method, SIGMOD, 2011
exam 2 Wed 12/14 thru Fri. 12/16
Mon 12/12 1. Katherine Edwards
2. Dushyant Arora
3. Peng Sun
4. Victoria Yao
1. Tedi: efficient shortest path query answering on graphs, SIGMOD, 2010
2. CrowdDB: Answering Queries with Crowdsourcing, SIGMOD, 2011
3. MultiRank: co-ranking for objects and relations in multi-relational data, KDD 2011
4. PathCase-SB Architecture and Database Design, BMC Systems Biology 2011
12/14
1. Nicholas Jones
2. Young-suk Lee
3. Qian Zhu
1. CryptDB: protecting confidentiality with encrypted query processing, SOSP, 2011
2. Relevance and ranking in online dating systems, SIGIR 2010
3. Structured digital tables on the Semantic Web: toward a structured digital literature, Molecular Systems Biology 6, 24 August 2010
winter break

Mon 1/9/12
3:00-4:20pm
in Rm 302

1. Danielle Bragg
2. Xiaozhou Li
3. Dana Hoffman
4. Dmitry Drutskoy
1. Extreme visualization: squeezing a billion records into a million pixels, SIGMOD, 2008
2. Cassandra - A Decentralized Structured Storage System, LADIS 2009
3. Automatic discovery of attributes in relational databases, SIGMOD 2011
4. Turbocharging DBMS Buffer Pool Using SSD, SIGMOD 2011

Project report due 5pm Tuesday, Jan. 17, 2012 (Dean's Date)

Before 5pm Mon. Jan. 23, 2012:  project demonstration (where applicable) and discussion.



A.S. LaPaugh Sat Dec 10 13:49:03 EST 2011