Princeton University
Computer Science Department

Computer Science 598D

Systems and Machine Learning

Spring 2020

 

Suggested Readings
(Papers in green color were read in Spring 2019)

Introductory Papers

 

A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution.
Jeff Dean, David Patterson, and Cliff Young, 
IEEE Micro, 38(2):21-29, 2018.

Deep Learning.

Yann LeCun, Yoshua Bengio, and Geoffrey Hinton, 

Nature 521, May 2015.

 

Human-level control through deep reinforcement learning.

V. Mnih et al., Nature, 2015. (Earlier version).

 

Reinforcement Learning: An Introduction. (book)
Richard S. Sutton and Andrew G. Barto.

MIT Press 2018.

Systems for ML

Library Framework

Torch7: A Matlab-like environment for machine learning.

Ronan Collobert, Koray Kavukcuoglu, Clément Farabet. NIPS Workshop, 2011.

PyTorch.

 

Caffe: Convolutional Architecture for Fast Feature Embedding,

Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell.

Proceedings of the 22nd ACM International Conference on Multimedia (MM '14), pages 675-678, 2014.

 

Caffe2 vs. Caffe.

 

TensorFlow: A System for Large-Scale Machine Learning.

Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng.

Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI). November 2–4, 2016

 

Google TPU

In-Datacenter Performance Analysis of a Tensor Processing Unit.
Norman P. Jouppi, et al.
ISCA 2017.

 

A Domain-Specific Architecture for Deep Neural Networks.

Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson.

CACM 2018.

 

Cloud TPU.

 

Tensor Processing Units for Financial Monte Carlo

Francois Belletti, Davis King, Kun Yang, Roland Nelet, Yusef Shafi, Yi-fan Chen, John Anderson.

arXiv, 2019.

 

Microsoft Brainwave

A configurable cloud-scale DNN processor for real-time AI.

Jeremy Fowers, Kalin Ovtcharov, Michael Papamichael, Todd Massengill, Ming Liu, Daniel Lo, Shlomi Alkalay, Michael Haselman, Logan Adams, Mahdi Ghandi, Stephen Heil, Prerak Patel, Adam Sapek, Gabriel Weisz, Lisa Woods, Sitaram Lanka, Steven K. Reinhardt, Adrian M. Caulfield, Eric S. Chung, Doug Burger.

ISCA, 2018.

 

Serving DNNs in Real Time at Datacenter Scale with Project Brainwave.

Eric Chung, Jeremy Fowers, Kalin Ovtcharov, Michael Papamichael, Adrian Caulfield, Todd Massengill, Ming Liu, Mahdi Ghandi, Daniel Lo, Steve Reinhardt, Shlomi Alkalay, Hari Angepat, Derek Chiou, Alessandro Forin, Doug Burger, Lisa Woods, Gabriel Weisz, Michael Haselman, Dan Zhang

IEEE Micro 2018.

 

Configurable Clouds

Adrian Caulfield, Eric Chung, Andrew Putnam, Hari Angepat, Daniel Firestone, Jeremy Fowers, Kalin Ovtcharov, Michael Haselman, Stephen Heil, Matt Humphrey, Daniel Lo, Todd Massengill, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger.

IEEE Micro, 37(3):52-61, June 2017.

 

Cloud-Scale Acceleration Architecture

Adrian Caulfield, Eric Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger

Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture (MICRO), October 2016.

 

Network Pruning

Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding.

Song Han, Huizi Mao, William J Dally

ICLR 2016.

 

EIE: Efficient Inference Engine on Compressed Deep Neural Network.

S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M. A. Horowitz, W. J. Dally.

ISCA 2016.

 

Learning structured sparsity in deep neural networks.

Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li.

NIPS 2016.

 

Rethinking the Value of Network Pruning

Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell

ICLR 2019.

 

AMC: AutoML for Model Compression and Acceleration on Mobile Devices

Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han

ECCV 2018

 

ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers

Ao Ren, Jiayu Li, Tianyun Zhang, Shaokai Ye, Wenyao Xu, Xuehai Qian, Xue Lin, Yanzhi Wang.
ASPLOS 2019

 

Some Recent Architecture Papers

 

SCNN: An accelerator for compressed-sparse convolutional neural networks.

Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, William J. Dally.

ISCA 2017

 

Sparse ReRAM Engine: Joint exploration of activation and weight sparsity on compressed neural network.

Tzu-Hsien Yang, Hsiang-Yun Cheng, Chia-Lin Yang, I-Ching Tseng, Han-Wen Hu, Hung-Sheng Chang, Hsiang-Pang Li.

ISCA 2019.

 

Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks

Xiaowei Wang, Jiecao Yu, Charles Augustine, Ravi Iyer, Reetuparna Das.

HPCA 2019

 

MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks.

Hanhwi Jang, Joonsung Kim, Jae-Eon Jo, Jaewon Lee, Jangwoo Kim.

ISCA 2019.

 

TIE: Energy-efficient tensor train-based inference engine for deep neural network.

Chunhua Deng, Fangxuan Sun, Xuehai Qian, Jun Lin, Zhongfeng Wang, Bo Yuan.

ISCA 2019.

 

Accelerating Distributed Reinforcement Learning with In-Switch Computing.

Youjie Li, Iou-Jen Liu, Deming Chen, Alexander Schwing, Jian Huang.

ISCA 2019.

 

Laconic Deep Learning Inference Acceleration.

Sayeh Sharify, Alberto Delmas Lascorz, Mostafa Mahmoud, Milos Nikolic, Kevin Siu, Dylan Malone Stuart, Zissis Poulos, Andreas Moshovos.

ISCA 2019.

 

DeepAttest: An End-to-End Attestation Framework for Deep Neural Networks.

Huili Chen, Cheng Fu, Bita Darvish Rouhani, Jishen Zhao, Farinaz Koushanfar.

ISCA 2019

 

PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference.

Aayush Ankit, Izzat El Hajj, Sai Rahul Chalamalasetti, Geoffrey Ndu, Martin Foltin, R. Stanley Williams, Paolo Faraboschi, Wen-mei W Hwu, John Paul Strachan, Kaushik Roy, Dejan S Milojicic.

ASPLOS 2019.

 

FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture.

Yu Ji, Youyang Zhang, Xinfeng Xie, Shuangchen Li, Peiqi Wang, Xing Hu, Youhui Zhang, Yuan Xie.

ASPLOS 2019

 

Bit-Tactical: A Software/Hardware Approach to Exploiting Value and Bit Sparsity in Neural Networks.

Alberto Delmas Lascorz, Patrick Judd, Dylan Malone Stuart, Zissis Poulos, Mostafa Mahmoud, Sayeh Sharify, Milos Nikolic, Kevin Siu, Andreas Moshovos.

ASPLOS 2019

 

TANGRAM: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators 

Mingyu Gao, Xuan Yang, Jing Pu, Mark Horowitz, Christos Kozyrakis (Stanford University)

ASPLOS 2019

 

Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization.

HT Kung, Bradley McDanel, Sai Qian Zhang (Harvard University)

ASPLOS 2019

 

Split-CNN: Splitting Window-based Operations in Convolutional Neural Networks for Memory System Optimization 

Tian Jin (IBM T.J. Watson Research Center); Seokin Hong (Kyungpook National University).

ASPLOS 2019

 

Optimizations of Deep Learning Computations

TVM: An automated end-to-end optimizing compiler for deep learning.

Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy.

OSDI 2018.

 

Dynamic Control Flow in Large-Scale Machine Learning

Yuan Yu, Martin Abadi, Paul Barham, Eugene Brevdo, Mike Burrows, Andy Davis, Jeff Dean, Sanjay Ghemawat, Tim Harley, Peter Hawkins, Michael Isard, Manjunath Kudlur, Rajat Monga, Derek Murray, Xiaoqiang Zheng.

EuroSys 2018

 

PipeDream: Generalized Pipeline Parallelism for DNN Training.

Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, Matei Zaharia.

SOSP 2019.

 

TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions.

Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, Alex Aiken.

SOSP 2019.

 

Astra: Exploiting Predictability to Optimize Deep Learning

Muthian Sivathanu, Tapan Chugh, Sanjay Srivallabh, Lidong Zhou
ASPLOS 2019

CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers

Alexandros Koliousis, Pijika Watcharapichat, Matthias Weidlich, Luo Mai, Paolo Costa, Peter Pietzuch

VLDB 2019.

Static Automatic Batching in TensorFlow

Ashish Agarwal

ICML (2019)

 

Nexus: A GPU Cluster Engine for Accelerating DNN-Based Video Analysis

Haichen Shen, Lequn Chen, Yuchen Jin, Liangyu Zhao, Bingyu Kong, Matthai Philipose, Arvind Krishnamurthy, Ravi Sundaram.

SOSP 2019.

 

Kelp: QoS for Accelerators in Machine Learning Platforms

Haishan Zhu, David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Mattan Erez.

HPCA 2019.

 

Security

Intriguing properties of neural networks.

C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus.

arXiv preprint arXiv:1312.6199, 2013.

 

Deep neural networks are easily fooled: High confidence predictions for unrecognizable images.

Anh Nguyen, Jason Yosinski, and Jeff Clune. 2015.

In CVPR. 427–436.

 

Explaining and harnessing adversarial examples.

Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy.

In International Conference on Learning Representations (ICLR). 2015.

 

Delving into Transferable Adversarial Examples and Black-box Attacks.

Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. 2017.

In International Conference on Learning Representations.

 

Robust Physical-World Attacks on Machine Learning Models.

Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, Dawn Song.

August, 2017.

 

Towards deep learning models resistant to adversarial attacks.

Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu.

ICLR 2018.

 

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Anish Athalye, Nicholas Carlini, David Wagner.

ICML 2018.

 

Mixup: Beyond Empirical Risk Minimization

Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz

ICLR 2018.

 

Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks

Tianyu Pang, Kun Xu, Jun Zhu

ICLR 2020.

 

Adversarial training for free!

Ali Shafahi, Mahyar Najibi, Mohammad Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein

NIPS 2019.

 

Privacy Preserving

 

Secure Multi-pArty Computation Grid LOgistic REgression (SMAC-GLORE)

Haoyi Shi, Chao Jiang, Wenrui Dai, Xiaoqian Jiang, Yuzhe Tang, Lucila Ohno-Machado, and Shuang Wang.

BMC Medical Informatics and Decision Making 2016

 

Practical Secure Aggregation for Privacy-Preserving Machine Learning

Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth.

CCS 2017

 

Helen: Maliciously Secure Coopetitive Learning for Linear Models

Wenting Zheng, Raluca Ada Popa, Joseph E. Gonzalez, and Ion Stoica

IEEE S&P 2019

 

Decentralized & Collaborative AI on Blockchain

Justin D. Harris, Bo Waggoner.

IEEE International Conference on Blockchain, July 2019.

 

Low Latency Privacy Preserving Inference

Alon Brutzkus, Oren Elisha, Ran Gilad-Bachrach

ICML 2019.

 

Differential Privacy: A Survey of Results

Cynthia Dwork. 

International Conference on Theory and Applications of Models of Computation. 2008.

 

Privacy-Preserving Deep Learning.
Reza Shokri and Vitaly Shmatikov,
CCS 2015.

 

Deep Learning with Differential Privacy.

Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang.

CCS 2016.

 


Honeycrisp: Large-Scale Differentially Private Aggregation Without a Trusted Core.

Edo Roth, Daniel Noble, Brett Hemenway Falk, Andreas Haeberlen.

SOSP 2019.

 

Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform

Mathias Lécuyer, Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu

SOSP 2019.

 

Federated Learning

 

Federated Learning: Strategies for Improving Communication Efficiency.
Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon, 2016.

 

Federated optimization: Distributed machine learning for on-device intelligence.
Jakub Konečný, H. Brendan McMahan, Daniel Ramage, and Peter Richtárik.
arXiv preprint arXiv:1610.02527, 2016.

 

Federated learning: Collaborative machine learning without centralized training data.
H. Brendan McMahan and Daniel Ramage, 2017.

 

Federated Learning for Mobile Keyboard Prediction

Andrew Hard, Chloé M. Kiddon, Daniel Ramage, Francoise Beaufays, Hubert Eichner, Kanishka Rao, Rajiv Mathews, Sean Augenstein.

arXiv, 2018.

 

Towards Federated Learning at Scale: System Design

Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloé M. Kiddon, Jakub Konečný, Stefano Mazzocchi, Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, Jason Roselander.

SysML 2019

 

Deep Leakage from Gradients
Ligeng Zhu, Zhijian Liu, Song Han
NIPS 2019.

 

Advances and Open Problems in Federated Learning

Peter Kairouz, et al. 

arXiv preprint arXiv:1912.04977

 

ML for Systems

Learned Compilation

Compiler Auto-Vectorization with Imitation Learning
Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin
NIPS 2019

 

Learned Data Structures

The Case for Learned Index Structures.

T. Kraska, A. Beutel, E. H. Chi, J. Dean, and N. Polyzotis.

SIGMOD 2018, pages 489-504.

 

A Model for Learned Bloom Filters and Optimizing by Sandwiching. 

Michael Mitzenmacher.

NIPS 2018.

 

Meta-Learning Neural Bloom Filters

Jack W Rae, Sergey Bartunov, Timothy P Lillicrap

ICML 2019.

 

Learning Space Partitions for Nearest Neighbor Search

Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner

ICLR 2020

 

Architecture

 

Perceptron-Based Prefetch Filtering

Eshan Bhatia, Daniel A. Jiménez, Paul Gratz, Elvira Teran, Seth Pugsley, Gino Chacon

ISCA 2019.

 

Post-Silicon CPU Adaptations Made Practical Using Machine Learning.

Stephen J. Tarsa, Rangeen Basu Roy Chowdhury, Julien Sebot, Gautham Chinya, Jayesh Gaur, Karthik Sankaranarayanan, Chit-Kwan Lin, Robert Chappell, Ronak Singhal, Hong Wang.

ISCA 2019.

 

Bit-Level Perceptron Prediction for Indirect Branch Prediction.

Elba Garza, Samira Mirbagher, Tahsin Ahmad Khan, Daniel A. Jiménez.

ISCA 2019.

 

Generative and Multi-phase Learning for Computer Systems Optimization.

Yi Ding, Nikita Mishra, Henry Hoffmann.

ISCA 2019.

 

Database Systems

Neo: A Learned Query Optimizer

Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul

VLDB 2019.

 

Learning to optimize join queries with deep reinforcement learning

S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein, and I. Stoica, 2019.

 

 

DeepBase: Deep Inspection of Neural Networks

Thibault Sellam, Kevin Lin, Ian Yiran Huang, Yiru Chen, Michelle Yang, Carl Vondrick, Eugene Wu

SIGMOD 2019.

 

Democratizing Data Science through Interactive Curation of ML Pipelines

Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann, Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, Tim Kraska.

SIGMOD 2019

 

 

Networking

Neural Packet Classification

Eric Liang, Hang Zhu, Xin Jin, Ion Stoica.

ACM SIGCOMM, 2019

 

Learning in situ: a randomized experiment in video streaming.

Francis Y. Yan, Hudson Ayers, Chenzhi Zhu, Sadjad Fouladi, James Hong, Keyi Zhang, Philip Levis, and Keith Winstein.

NSDI 2020.

 

Caching and Access Patterns

 

Back to the future: leveraging Belady’s algorithm for improved cache replacement.

Akanksha Jain and Calvin Lin.

ISCA 2016.

 

Rethinking Belady's Algorithm to Accommodate Prefetching
A. Jain and C. Lin.
ISCA 2018.

 

AViC: A Cache for Adaptive Bitrate Video.

Zahaib Akhtar, et al.

CoNEXT'19

 

Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification

Assaf Eisenman, et al.

NSDI'19

 

LHD: Improving Cache Hit Rate by Maximizing Hit Density

Nathan Beckmann, Haoxian Chen, and Asaf Cidon.

NSDI'18

 

 

Applying Deep Learning to the Cache Replacement Problem
Z. Shi, X. Huang, A. Jain, and C. Lin.
MICRO 2019.

 

Learning Relaxed Belady for Content Distribution Network Caching

Zhenyu Song, Daniel S. Berger, Kai Li, Wyatt Lloyd

NSDI 2020.

 

Image Compression

Variable Rate Image Compression with Recurrent Neural Networks

George Toderici, Sean M. O'Malley, Sung Jin Hwang, Damien Vincent, David Minnen, Shumeet Baluja, Michele Covell, Rahul Sukthankar.

ICLR 2016.

 

Full Resolution Image Compression with Recurrent Neural Networks.

George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, Michele Covell.

CVPR 2017.

 

Real-time adaptive image compression.

Oren Rippel, Lubomir Bourdev.

ICML 2017.

 

Resource Scheduling and Placement

 

Device placement optimization with reinforcement learning.

Azalia Mirhoseini, Hieu Pham, Quoc V Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean.
ICML 2017.

 

Learning Scheduling Algorithms for Data Processing Clusters.

Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh.  

SIGCOMM 2019.

 

Auto ML

 

Neural Architecture Search with Reinforcement Learning

Barret Zoph, Quoc V. Le.

ICLR 2017.

 

AMC: AutoML for Model Compression and Acceleration on Mobile Devices.

Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han

ECCV 2018

 

Learning Transferable Architectures for Scalable Image Recognition

Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le.

CVPR 2018. 

 

Beyond Data and Model Parallelism for Deep Neural Networks

Zhihao Jia, Matei Zaharia, Alex Aiken

SysML 2019