Computer Science 597A
Advanced Topics in Computer Science:
Managing and Exploring Large Datasets
Fall 2002
Syllabus and Readings
Week 1 (9/13): Trends in storage systems and data management.
Week 2 (9/20): Research projects
Week 3 (9/27): Disk archival storage
Guest lecture by Sean Quinlan, Bell Labs, Lucent Technologies.
Week 4 (10/4): Overview of commercial storage file systems (Zhifeng Chen)
Week 5 (10/11): Content-addressed storage systems
(Han Chen and Grant Wallace)
Week 6 (10/18): Global Storage Systems (Christine Lv and Wen Xu)
- Towards an Archival Intermemory, Andrew V. Goldberg and Peter N. Yianilos,
Advances in Digital Libraries ADL '98, April 1998.
- OceanStore: An Architecture for
Global-Scale Persistent Storage, John Kubiatowicz, David Bindel, Yan Chen, Steven
Czerwinski, Patrick
Eaton, Dennis Geels, Ramakrishna Gummadi, Sean Rhea, Hakim Weatherspoon,
Westley Weimer, Chris Wells, and Ben Zhao. In Proceedings of the Ninth
International Conference on Architectural Support for Programming Languages
and Operating Systems (ASPLOS 2000), November 2000.
- Storage management and caching in PAST, a
large-scale, persistent peer-to-peer storage utility, A. Rowstron
and P. Druschel, ACM Symposium on Operating Systems Principles
(SOSP'01), Banff, Canada, October 2001.
Week 7 (10/25): Low-Bandwidth, Disconnected and Personal Storage (Phil
Shilane and Victor Shnayder)
- A Low-bandwidth Network File System.
Athicha Muthitacharoen, Benjie Chen, and David Mazières. In Proceedings
of the ACM 18th Symposium on Operating Systems Principles. Banff,
Canada. October, 2001.
- Exploiting Weak Connectivity for Mobile
File Access. Lily B. Mummert, Maria R. Ebling, M. Satyanarayanan.
In Proceedings of the 15th ACM Symposium on Operating Systems Principles,
December 1995.
- Flexible Update Propagation for Weakly Consistent
Replication. K. Petersen, M. J. Spreitzer, D. B. Terry, M. M. Theimer, and A. J.
Demers. In Proceedings of the 16th ACM Symposium on Operating Systems Principles
(SOSP-16), Saint Malo,
France, October 5-8, 1997, pages 288-301.
- PersonalRAID: Mobile Storage for Distributed and Disconnected
Computers. S. Sobti, N. Garg, C. Zhang, X. Yu, A. Krishnamurthy, and R. Y.
Wang. In Proceedings of the USENIX First Conference on File and Storage
Technologies (FAST). January 2002.
Week 8 (11/8): SkyServer
Guest lecture by Prof. Alex Szalay of Johns Hopkins University
- Data Mining the SDSS SkyServer Database.
J. Gray, D. Slutz, A. Szalay, A. Thakar, P. Kunszt, C. Stoughton, MSR TR
2002-1, pp. 1-40, 2002.
- The SDSS SkyServer – Public Access to
the Sloan Digital Sky Server Data. A. Szalay, J. Gray, A. Thakar, P.
Kunszt, T. Malik, J. Raddick, C. Stoughton, J. Vandenberg, ACM SIGMOD 2002,
MSR TR 2001 104.
Week 9 (11/15): TerraServer (Zhiyan (Cathy) Liu, Ruoming Pang)
Week 10 (11/22): Data in Intensive Care Unit
Guest lecture by Dr. William Hanson at UPenn
Medical School (Director of ICU)
- Data Collection and Analysis in the ICU,
Charles G. Durbin, Jr. MD, Anesthesiology Clinics of North America,
15(4): 879-897. December 1997.
- Artificial Intelligence Applications in the
Intensive Care Unit, C. William Hanson III, MD, FCCM, Bryan E. Marshall,
MD, FRCP, FRCA, Critical Care Medicine, 29 (2): 427-435. 2001.
Week 11 (12/6): How Google Manages and Mines Data
Guest lecture by Rob Shillingsburg
at Google.
The guest lecture is cancelled due
to the weather. We will discuss the papers and talk about
projects.
- The Anatomy of a Large-Scale Hypertextual Web
Search Engine. Sergey Brin and Lawrence Page, The 7th World Wide Web
Conference, and also in Computer Networks 30(1-7): 107-117
(1998).
- Section 8.12, "Putting it All Together: The Google Cluster of
PCs," Computer Architecture, A Quantitative Approach, Third Edition.
John L. Hennessy and David A. Patterson, pages 855-862, 2003.
Week 12 (12/12): Student presentations of projects
Time: 3-5pm. Each project gives a 15-20 minute
presentation. Submit a web page about the project.
Final meeting (1/20): Show and tell
Each project group shows and tells
its project results. Tentative time is 1:30 to 4pm.