Princeton University
Computer Science Department

Computer Science 598F

Systems and Analytics for Big Data

Spring 2016


Course Summary

The total data in the digital universe exceeded 10 Zettabytes (10,000,000 Petabytes) in 2016.  In the world of big data, we are moving into an era of using ZB, EB, and PB as the units for describing the scale of datasets, instead of TB or GB.  How do we build systems to process data in this new era?  What data analytics do we perform on big data?


In this seminar, we will read and discuss papers related to big data in two major areas: systems and analytics.  Papers on systems cover topics such as architectural support and system designs for large-scale data centers, planet-scale distributed systems, storage systems for internet-scale systems, large-scale graph computing systems, and so on.  Papers on analytics cover topics such as data mining techniques for big data, deep learning techniques for big data, large-scale knowledge bases, and so on.


Students who are taking the course for credit are required to give a presentation on a specific topic (typically drawn from two or three closely related papers), read one paper each week, and work individually or jointly on a small project.  This course satisfies the programming requirement for our graduate program.


Administrative Information


Meeting time: Friday 3:0-5:20pm

Location: Room 301, CS building

Grading Policy

This graduate seminar will be graded roughly as follows:
Submissions are due at 11:59pm on the due date.  Late submissions of reading notes will not be accepted.  Late submissions of the warm-up exercise and the project will be graded using our typical formula:
                grade = original_grade * exp(-time_late/three_days)
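
As an illustration of how this formula behaves, the following is a minimal Python sketch. It assumes lateness is measured in hours and that work submitted on time receives no penalty; the function name late_grade and the constant THREE_DAYS_IN_HOURS are hypothetical names chosen for this example, not part of any course tooling.

```python
import math

# Assumption: lateness is measured in hours, so "three_days" is 72 hours.
THREE_DAYS_IN_HOURS = 72.0


def late_grade(original_grade: float, hours_late: float) -> float:
    """Apply grade = original_grade * exp(-time_late / three_days)."""
    # Assumption: on-time (or early) submissions keep their original grade.
    if hours_late <= 0:
        return original_grade
    return original_grade * math.exp(-hours_late / THREE_DAYS_IN_HOURS)


if __name__ == "__main__":
    # Show how a grade of 90 decays with increasing lateness.
    for hours in (0, 24, 72, 144):
        print(f"{hours:4d} hours late: {late_grade(90.0, hours):.1f}")
```

Under these assumptions, a submission that is exactly three days late keeps a factor of exp(-1), or about 37%, of its original grade, and one that is a single day late keeps about 72%.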

Reading Requirements

Students who are taking this course for credit are required to read all primary papers (the first paper of each presentation).   For each primary paper, please use no more than 5 sentences to answer each of the following questions.
Please submit your reading notes before each class by using the department dropbox for this course.  The specific dropbox link will be provided on the syllabus web page.

Presentation

Each student will give a presentation based on two or more papers and lead discussions about the papers.   See the syllabus for the presentation schedule.

After your presentation in class, you should submit your presentation to the dropbox of this class. 

Projects

The following are the tentative deadlines for your projects:

Submit project report: You should write your report in a double-column conference-paper format, as if you were writing a short conference paper.  Your report should have the following sections: abstract, introduction, related work, your idea, approach or design, evaluation, conclusion, and who-did-what.  Your report should be concise, no more than 10 pages.

Projects can be done by either an individual or a group of students.   If two or more students are working on one project, we expect you to state clearly who did what in the final report.