Princeton University
Computer Science Department

Computer Science 598F
Systems and Analytics for Big Data

 

Spring 2016

 

 




Tentative Schedule

Dates

Presenters

Topics

Readings

Presentations

To do

2/5

K. Li

General information




2/12

Guest lecture: Prof. Nick Turk-Browne, Princeton Neuroscience Institute

Real-time fMRI data analysis
Functional Interactions as Big Data in the Human Brain (Science, 2013)
Closed-loop training of attention with real-time brain imaging (Nature Neuroscience, 2015)
Full correlation matrix analysis (FCMA): An unbiased method for task-related functional connectivity (J. Neuroscience Methods, 2015)
Full Correlation Matrix Analysis of fMRI Data on Intel Xeon Phi Coprocessors (SC, 2015)

 

pdf


Submit notes

2/19


K. Li



Linpeng Tang

Some classic papers  Infrastructure



Warmup exercise

MapReduce (OSDI, 2004), GFS (SOSP, 2003),
BigTable (OSDI, 2006), Chubby (OSDI, 2006).

Spark (NSDI, 2012)
Spark on a local cluster
pdf


Submit notes

2/26

Guest lecture: Prof. Sebastian Seung


Challenges in brain mapping

3D CNN for boundary detection (NIPS, 2015),
A review paper (2012),
Image segmentation challenge (ISBI, 2012)

link


Submit notes

3/4

Prof. Eric Xing (CMU)

Dr. Zhifeng Chen
(Google Brain team)

Petuum

TensorFlow


Strategies and Principles of Distributed Machine Learning on Big Data (2015)
Petuum: A New Platform for Distributed Machine Learning on Big Data (KDD 2015)

TensorFlow: Large-scale machine learning on heterogeneous systems (Google white paper, 2015)

Large Scale Distributed Deep Networks (NIPS, 2012)



Submit
project proposal

Submit notes

3/11

Yang,
Bai

Dedup memory
DNA data compression
HICAMP: Architectural Support for Efficient Concurrency-safe Shared Structured Data Access (ASPLOS, 2012)
Log-Structured Memory for DRAM-based Storage.  (FAST 2014)
HICAMP bitmap: space-efficient updatable bitmap index for in-memory databases (2014)
A Survey of Techniques for Sequence Similarities Matching in Compression (2014)
Data Compression for sequencing data (AMB 2013)


Submit notes

3/25


Chang,


Naghib

Optimizations/High level features
Arch support for
Internet services

Making Sense of Performance in Data Analytics Frameworks (NSDI, 2015)
Efficient Coflow Scheduling with Varys (SIGCOMM, 2014)
Mastering the game of Go with deep neural networks and tree search.  (Nature 2016)


4/1

Kathpalia

Guest lecture: Dr. Bill Tang (PPPL)

ImageNet competition

Big Data in Fusion Research

ImageNet Classification with Deep Convolutional Neural Networks (NIPS, 2012).

Deep Residual Learning for Image Recognition (2015)

Big Data Machine Learning for Disruption Predictions

White paper



Submit notes

4/8

Tang,

Lin

Facebook social network

TAO: Facebook’s Distributed Data Store for the Social Graph (NSDI, 2013) 
Social Hash: An Assignment Framework for Optimizing Distributed Systems Operations on Social Networks (NSDI, 2016)
Building Watson: An Overview of the DeepQA Project.  (AI Magazine, 2011).

Submit notes,

Submit progress report

4/15


Qiu,


Suo

Sparse FFT,
Compressing DNA sequence data

Human genomes as email attachments (Bioinformatics, 2009)
The human genome contracts again (Bioinformatics, 2013)
Simple and Practical Algorithm for Sparse Fourier Transform (SODA, 2012)

Submit notes

4/22

Zeng,

Cheng



RGB-D Image data

Clustering

Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images (CVPR, 2016)

Submit notes

4/29

Ko,
Ravi

MOOC data
CNN structure

Modeling and Predicting Learning Behavior in MOOCs
(WSDM, 2016)
Exploiting Cyclic Symmetry in Convolutional Neural Networks
(ICML, 2016)

 


Submit notes

5/13


Project final presentation



Submit final presentations

5/16


Final report submission



Submit final reports

 

Some Topics on Big Data Systems

Some Topics on Analytics of Big Data