The total data in the digital universe exceeded 10 zettabytes
(10,000,000 petabytes) in 2016. In the world of big data, we are
moving into an era of using ZB, EB, and PB as units to talk about
the scale of datasets, instead of TB or GB. How do we build systems
to process data in such a new era? What data analytics do we perform
on big data?
In this seminar, we read and discuss papers related to big data in
two major areas: systems and analytics. Papers on systems include
topics such as architectural support and systems designs for
large-scale data centers, planet-scale distributed systems, storage
systems for Internet-scale systems, large-scale graph computing
systems, and so on. Papers on data analytics include data mining
techniques for big data, deep learning techniques for big data,
large-scale knowledge bases, and so on.
Students who are taking the course for credit are required to give
a presentation on a specific topic (typically from two or three
closely related papers), read one paper each week, and work
individually or jointly on a small project. This course satisfies
the programming requirement for our graduate program.
Administrative Information
Meeting time: Friday 3:00-5:20pm
Location: Room 301, CS building
Grading Policy
This graduate seminar will be graded roughly as
follows:
Presentation (30%)
Reading and participation (30%)
Project (40%)
Submissions are due at 11:59pm on the due date. Late submissions of
reading notes will not be accepted. Late submissions of the warm-up
exercise and the project will be graded using our typical formula:
grade = original_grade * exp(-time_late/three_days)
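As a rough illustration of how this penalty decays, the formula can be sketched in Python. The function name `late_grade` and the choice to measure lateness in hours are assumptions made for this example only; the course does not specify how `time_late` is measured or rounded.

```python
import math

# Assumption: lateness is measured in hours, so "three_days" is 72 hours.
THREE_DAYS_HOURS = 3 * 24


def late_grade(original_grade: float, hours_late: float) -> float:
    """Apply grade = original_grade * exp(-time_late / three_days)."""
    if hours_late <= 0:
        # Assumption: on-time (or early) submissions are not adjusted.
        return original_grade
    return original_grade * math.exp(-hours_late / THREE_DAYS_HOURS)


if __name__ == "__main__":
    # Print the adjusted score of a 100-point submission at several delays.
    for days in (0, 1, 3, 7):
        print(f"{days} day(s) late: {late_grade(100.0, days * 24):.1f}")
```

Under these assumptions, a submission one day late keeps about 72% of its score, and one three days late keeps about 37% (a factor of 1/e).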
Reading Requirements
Students who are taking this course for credit are required to read
all primary papers (the first paper of each presentation). For each
primary paper, please use no more than 5 sentences to answer each of
the following questions.
Summary using your own words
What are the strengths of the paper?
What are the weaknesses of the paper?
Your thoughts on future directions
Please submit your reading notes before each class using the
department dropbox for this course. The specific dropbox link will be
provided on the syllabus web page.
Presentation
Each student will give a presentation based on
two or more papers and lead discussions about the
papers. See the syllabus for the presentation
schedule.
After your presentation in class, you should submit your
presentation to the dropbox of this class.
Projects
The following are the tentative
deadlines for your projects:
3/4: Submit project proposals via dropbox. Use one page to outline
the main idea and the plan of your project. We suggest that you meet
with the faculty to discuss your project before submitting your
project proposal.
4/1: Submit project progress via dropbox. Use PowerPoint slides to
show the progress of your project. You should state where you are in
terms of project design and implementation.
5/6: Project demos or presentations (15 minutes for each project)
5/10: Submit a project report. You should write the report in a
double-column, conference-paper format, as if you were writing a
short conference paper. Your report should have the following
sections: abstract, introduction, related work, your idea, approach
or design, evaluation, conclusion, and who-did-what. Your report
should be concise, no more than 10 pages.
Projects can be done by either an individual or a group of students.
If two or more students are working on one project, we expect you to
state clearly who did what in the final report.