Computer Science 598D
Systems and Machine Learning
Spring 2020
A New Golden Age in Computer Architecture: Empowering the Machine-Learning Revolution.
Jeff Dean, David Patterson, and Cliff Young.
IEEE Micro, 38(2):21-29, 2018.
Deep Learning.
Yann LeCun, Yoshua Bengio, and Geoffrey Hinton.
Nature 521, May 2015.
Human-level control through deep reinforcement learning.
Volodymyr Mnih, et al.
Nature, 2015. (Earlier version).
Reinforcement Learning: An Introduction (book).
Richard S. Sutton and Andrew G. Barto.
MIT Press, 2018.
Torch7: A Matlab-like environment for machine learning.
Ronan Collobert, Koray Kavukcuoglu, Clément Farabet.
NIPS Workshop, 2011.
Caffe: Convolutional Architecture for Fast Feature Embedding.
Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell.
Proceedings of the 22nd ACM International Conference on Multimedia (MM '14), pages 675-678, 2014.
TensorFlow: A System for Large-Scale Machine Learning.
Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, Manjunath Kudlur, Josh Levenberg, Rajat Monga, Sherry Moore, Derek G. Murray, Benoit Steiner, Paul Tucker, Vijay Vasudevan, Pete Warden, Martin Wicke, Yuan Yu, and Xiaoqiang Zheng.
Proceedings of the 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), November 2016.
In-Datacenter Performance Analysis of a Tensor Processing Unit.
Norman P. Jouppi, et al.
ISCA 2017.
A Domain-Specific Architecture for Deep Neural Networks.
Norman P. Jouppi, Cliff Young, Nishant Patil, David Patterson.
CACM 2018.
Tensor Processing Units for Financial Monte Carlo.
Francois Belletti, Davis King, Kun Yang, Roland Nelet, Yusef Shafi, Yi-fan Chen, John Anderson.
arXiv, 2019.
A configurable cloud-scale DNN processor for real-time AI.
Jeremy Fowers, Kalin Ovtcharov, Michael Papamichael, Todd Massengill, Ming Liu, Daniel Lo, Shlomi Alkalay, Michael Haselman, Logan Adams, Mahdi Ghandi, Stephen Heil, Prerak Patel, Adam Sapek, Gabriel Weisz, Lisa Woods, Sitaram Lanka, Steven K. Reinhardt, Adrian M. Caulfield, Eric S. Chung, Doug Burger.
ISCA 2018.
Serving DNNs in Real Time at Datacenter Scale with Project Brainwave.
Eric Chung, Jeremy Fowers, Kalin Ovtcharov, Michael Papamichael, Adrian Caulfield, Todd Massengill, Ming Liu,
Mahdi Ghandi, Daniel Lo, Steve Reinhardt, Shlomi Alkalay, Hari Angepat, Derek Chiou, Alessandro Forin, Doug Burger, Lisa Woods, Gabriel Weisz, Michael Haselman, Dan Zhang
IEEE Micro 2018.
Adrian Caulfield, Eric Chung, Andrew Putnam, Hari Angepat, Daniel Firestone, Jeremy Fowers, Kalin Ovtcharov, Michael Haselman, Stephen Heil, Matt Humphrey, Daniel Lo, Todd Massengill, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger.
IEEE Micro, 37(3):52-61, June 2017.
A Cloud-Scale Acceleration Architecture.
Adrian Caulfield, Eric Chung, Andrew Putnam, Hari Angepat, Jeremy Fowers, Michael Haselman, Stephen Heil, Matt Humphrey, Puneet Kaur, Joo-Young Kim, Daniel Lo, Todd Massengill, Kalin Ovtcharov, Michael Papamichael, Lisa Woods, Sitaram Lanka, Derek Chiou, Doug Burger
Proceedings of the 49th Annual IEEE/ACM International Symposium on Microarchitecture, October 2016.
Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Coding.
Song Han, Huizi Mao, William J. Dally.
ICLR 2016.
EIE: Efficient Inference Engine on Compressed Deep Neural Network.
S. Han, X. Liu, H. Mao, J. Pu, A. Pedram, M.A. Horowitz, W.J. Dally.
ISCA 2016.
Learning structured sparsity in deep neural networks.
Wei Wen, Chunpeng Wu, Yandan Wang, Yiran Chen, Hai Li.
NIPS 2016.
Rethinking the Value of Network Pruning.
Zhuang Liu, Mingjie Sun, Tinghui Zhou, Gao Huang, Trevor Darrell
ICLR 2019.
AMC: AutoML for Model Compression and Acceleration on Mobile Devices.
Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han.
ECCV 2018.
ADMM-NN: An Algorithm-Hardware Co-Design Framework of DNNs Using Alternating Direction Methods of Multipliers.
Ao Ren, Jiayu Li, Tianyun Zhang, Shaokai Ye, Wenyao Xu, Xuehai Qian, Xue Lin, Yanzhi Wang.
ASPLOS 2019
SCNN: An accelerator for compressed-sparse convolutional neural networks.
Angshuman Parashar, Minsoo Rhu, Anurag Mukkara, Antonio Puglielli, Rangharajan Venkatesan, Brucek Khailany, Joel Emer, Stephen W. Keckler, William J. Dally.
ISCA 2017
Sparse ReRAM Engine: Joint exploration of activation and weight sparsity on compressed neural network.
Tzu-Hsien Yang, Hsiang-Yun Cheng, Chia-Lin Yang, I-Ching Tseng, Han-Wen Hu, Hung-Sheng Chang, Hsiang-Pang Li.
ISCA 2019.
Bit Prudent In-Cache Acceleration of Deep Convolutional Neural Networks.
Xiaowei Wang, Jiecao Yu, Charles Augustine, Ravi Iyer, Reetuparna Das.
HPCA 2019
MnnFast: A Fast and Scalable System Architecture for Memory-Augmented Neural Networks.
Hanhwi Jang, Joonsung Kim, Jae-Eon Jo, Jaewon Lee, Jangwoo Kim.
ISCA 2019.
TIE: Energy-efficient tensor train-based inference engine for deep neural network.
Chunhua Deng, Fangxuan Sun, Xuehai Qian, Jun Lin, Zhongfeng Wang, Bo Yuan.
ISCA 2019.
Accelerating Distributed Reinforcement Learning with In-Switch Computing.
Youjie Li, Iou-Jen Liu, Deming Chen, Alexander Schwing, Jian Huang.
ISCA 2019.
Laconic Deep Learning Inference Acceleration.
Sayeh Sharify, Alberto Delmas Lascorz, Mostafa Mahmoud, Milos Nikolic, Kevin Siu, Dylan Malone Stuart, Zissis Poulos, Andreas Moshovos.
ISCA 2019.
DeepAttest: An End-to-End Attestation Framework for Deep Neural Networks.
Huili Chen, Cheng Fu, Bita Darvish Rouhani, Jishen Zhao, Farinaz Koushanfar.
ISCA 2019
PUMA: A Programmable Ultra-efficient Memristor-based Accelerator for Machine Learning Inference.
Aayush Ankit, Izzat El Hajj, Sai Rahul Chalamalasetti, Geoffrey Ndu, Martin Foltin, R. Stanley Williams, Paolo Faraboschi, Wen-mei W Hwu, John Paul Strachan, Kaushik Roy, Dejan S Milojicic.
ASPLOS 2019.
FPSA: A Full System Stack Solution for Reconfigurable ReRAM-based NN Accelerator Architecture.
Yu Ji, Youyang Zhang, Xinfeng Xie, Shuangchen Li, Peiqi Wang, Xing Hu, Youhui Zhang, Yuan Xie.
ASPLOS 2019
Bit-Tactical: A Software/Hardware Approach to Exploiting Value and Bit Sparsity in Neural Networks.
Alberto Delmas Lascorz, Patrick Judd, Dylan Malone Stuart, Zissis Poulos, Mostafa Mahmoud, Sayeh Sharify, Milos Nikolic, Kevin Siu, Andreas Moshovos.
ASPLOS 2019
TANGRAM: Optimized Coarse-Grained Dataflow for Scalable NN Accelerators.
Mingyu Gao, Xuan Yang, Jing Pu, Mark Horowitz, Christos Kozyrakis.
ASPLOS 2019
Packing Sparse Convolutional Neural Networks for Efficient Systolic Array Implementations: Column Combining Under Joint Optimization.
H.T. Kung, Bradley McDanel, Sai Qian Zhang.
ASPLOS 2019
Split-CNN: Splitting Window-based Operations in Convolutional Neural Networks for Memory System Optimization.
Tian Jin, Seokin Hong.
ASPLOS 2019
TVM: An automated end-to-end optimizing compiler for deep learning.
Tianqi Chen, Thierry Moreau, Ziheng Jiang, Lianmin Zheng, Eddie Yan, Meghan Cowan, Haichen Shen, Leyuan Wang, Yuwei Hu, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy.
OSDI 2018.
Dynamic Control Flow in Large-Scale Machine Learning.
Yuan Yu, Martin Abadi, Paul Barham, Eugene Brevdo, Mike Burrows, Andy Davis, Jeff Dean, Sanjay Ghemawat, Tim Harley, Peter Hawkins, Michael Isard, Manjunath Kudlur, Rajat Monga, Derek Murray, Xiaoqiang Zheng.
EuroSys 2018
PipeDream: Generalized Pipeline Parallelism for DNN Training.
Deepak Narayanan, Aaron Harlap, Amar Phanishayee, Vivek Seshadri, Nikhil R. Devanur, Gregory R. Ganger, Phillip B. Gibbons, Matei Zaharia.
SOSP 2019.
TASO: Optimizing Deep Learning Computation with Automatic Generation of Graph Substitutions.
Zhihao Jia, Oded Padon, James Thomas, Todd Warszawski, Matei Zaharia, Alex Aiken.
SOSP 2019.
Astra: Exploiting Predictability to Optimize Deep Learning.
Muthian Sivathanu, Tapan Chugh, Sanjay Srivallabh, Lidong Zhou.
ASPLOS 2019
CROSSBOW: Scaling Deep Learning with Small Batch Sizes on Multi-GPU Servers.
Alexandros Koliousis, Pijika Watcharapichat, Matthias Weidlich, Luo Mai, Paolo Costa, Peter Pietzuch
VLDB 2019.
Static Automatic Batching in TensorFlow.
Ashish Agarwal.
ICML 2019.
Nexus: A GPU Cluster Engine for Accelerating DNN-Based Video Analysis.
Haichen Shen, Lequn Chen, Yuchen Jin, Liangyu Zhao, Bingyu Kong, Matthai Philipose.
SOSP 2019.
Kelp: QoS for Accelerators in Machine Learning Platforms.
Haishan Zhu, David Lo, Liqun Cheng, Rama Govindaraju, Parthasarathy Ranganathan, Mattan Erez.
HPCA 2019.
Intriguing properties of neural networks.
C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus.
arXiv preprint arXiv:1312.6199, 2013.
Deep neural networks are easily fooled: High confidence predictions for unrecognizable images.
Anh Nguyen, Jason Yosinski, and Jeff Clune.
CVPR 2015, pages 427–436.
Explaining and harnessing adversarial examples.
Ian J Goodfellow, Jonathon Shlens, and Christian Szegedy.
In International Conference on Learning Representations (ICLR). 2015.
Delving into Transferable Adversarial Examples and Black-box Attacks.
Yanpei Liu, Xinyun Chen, Chang Liu, and Dawn Song. 2017.
In International Conference on Learning Representations.
Robust Physical-World Attacks on Machine Learning Models.
Ivan Evtimov, Kevin Eykholt, Earlence Fernandes, Tadayoshi Kohno, Bo Li, Atul Prakash, Amir Rahmati, Dawn Song.
August 2017.
Towards deep learning models resistant to adversarial attacks.
Aleksander Madry, Aleksandar Makelov, Ludwig Schmidt, Dimitris Tsipras, and Adrian Vladu.
ICLR 2018.
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples
Anish Athalye, Nicholas Carlini, David Wagner.
ICML 2018.
Mixup: Beyond Empirical Risk Minimization
Hongyi Zhang, Moustapha Cisse, Yann N. Dauphin, David Lopez-Paz
ICLR 2018.
Mixup Inference: Better Exploiting Mixup to Defend Adversarial Attacks.
Tianyu Pang, Kun Xu, Jun Zhu
Adversarial training for free!
Ali Shafahi, Mahyar Najibi, Mohammad Amin Ghiasi, Zheng Xu, John Dickerson, Christoph Studer, Larry S. Davis, Gavin Taylor, Tom Goldstein.
NIPS 2019.
Secure Multi-pArty Computation Grid LOgistic REgression (SMAC-GLORE).
Haoyi Shi, Chao Jiang, Wenrui Dai, Xiaoqian Jiang, Yuzhe Tang, Lucila Ohno-Machado, and Shuang Wang.
BMC Medical Informatics and Decision Making, 2016.
Practical Secure Aggregation for Privacy-Preserving Machine Learning
Aaron Segal, Antonio Marcedone, Benjamin Kreuter, Daniel Ramage, H. Brendan McMahan, Karn Seth, Keith Bonawitz, Sarvar Patel, Vladimir Ivanov.
CCS 2017
Helen: Maliciously Secure Coopetitive Learning for Linear Models.
Wenting Zheng, Raluca Ada Popa, Joseph E. Gonzalez, and Ion Stoica
IEEE S&P 2019
Decentralized & Collaborative AI on Blockchain.
Justin D. Harris, Bo Waggoner.
2019 IEEE International Conference on Blockchain (Blockchain), July 2019.
Low Latency Privacy Preserving Inference.
Alon Brutzkus, Oren Elisha, Ran Gilad-Bachrach
ICML 2019.
Differential Privacy: A Survey of Results.
Cynthia Dwork.
International Conference on Theory and Applications of Models of Computation, 2008.
Privacy-Preserving Deep Learning.
Reza Shokri and Vitaly Shmatikov.
CCS 2015.
Deep Learning with Differential Privacy.
Martín Abadi, Andy Chu, Ian Goodfellow, H. Brendan McMahan, Ilya Mironov, Kunal Talwar, Li Zhang.
CCS 2016.
Practical secure aggregation for privacy preserving machine learning.
Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, H. Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, and Karn Seth.
In ACM Conference on Computer and Communications Security (ACM CCS), 2017.
Honeycrisp: Large-Scale Differentially Private Aggregation Without a Trusted Core.
Edo Roth, Daniel Noble, Brett Hemenway Falk, Andreas Haeberlen.
SOSP 2019.
Privacy Accounting and Quality Control in the Sage Differentially Private ML Platform
Mathias Lécuyer, Riley Spahn, Kiran Vodrahalli, Roxana Geambasu, and Daniel Hsu
SOSP 2019.
Federated Learning: Strategies for Improving Communication Efficiency.
Jakub Konečný, H. Brendan McMahan, Felix X. Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon, 2016.
Federated optimization: Distributed machine learning for on-device intelligence.
Jakub Konečný, H. Brendan McMahan, Daniel Ramage, and Peter Richtárik.
arXiv preprint arXiv:1610.02527, 2016.
Federated learning: Collaborative machine learning without centralized training data.
H. Brendan McMahan and Daniel Ramage, 2017.
Federated Learning for Mobile Keyboard Prediction.
Andrew Hard, Chloé M. Kiddon, Daniel Ramage, Françoise Beaufays, Hubert Eichner, Kanishka Rao, Rajiv Mathews, Sean Augenstein.
2018.
Towards Federated Learning at Scale: System Design.
Keith Bonawitz, Hubert Eichner, Wolfgang Grieskamp, Dzmitry Huba, Alex Ingerman, Vladimir Ivanov, Chloé M. Kiddon, Jakub Konečný, Stefano Mazzocchi, Brendan McMahan, Timon Van Overveldt, David Petrou, Daniel Ramage, Jason Roselander.
SysML 2019
Deep Leakage from Gradients
Ligeng Zhu, Zhijian Liu, Song Han
NIPS 2019.
Advances and Open Problems in Federated Learning.
Peter Kairouz, et al.
arXiv preprint arXiv:1912.04977, 2019.
Compiler Auto-Vectorization with Imitation Learning.
Charith Mendis, Cambridge Yang, Yewen Pu, Saman Amarasinghe, Michael Carbin.
NIPS 2019
The Case for Learned Index Structures.
T. Kraska, A. Beutel, E. H. Chi, J. Dean, and N. Polyzotis.
SIGMOD 2018, pages 489-504.
A Model for Learned Bloom Filters and Optimizing by Sandwiching.
Michael Mitzenmacher.
NIPS 2018.
Meta-Learning Neural Bloom Filters
Jack W Rae, Sergey Bartunov, Timothy P Lillicrap
ICML 2019.
Learning Space Partitions for Nearest Neighbor Search.
Yihe Dong, Piotr Indyk, Ilya Razenshteyn, Tal Wagner
ICLR 2020
Perceptron-Based Prefetch Filtering
Eshan Bhatia, Daniel A. Jiménez, Paul Gratz, Elvira Teran, Seth Pugsley, Gino Chacon
ISCA 2019.
Post-Silicon CPU Adaptations Made Practical Using Machine Learning.
Stephen J. Tarsa, Rangeen Basu Roy Chowdhury, Julien Sebot, Gautham Chinya, Jayesh Gaur, Karthik Sankaranarayanan, Chit-Kwan Lin, Robert Chappell, Ronak Singhal, Hong Wang.
ISCA 2019.
Bit-Level Perceptron Prediction for Indirect Branch Prediction.
Elba Garza, Samira Mirbagher, Tahsin Ahmad Khan, Daniel A. Jiménez.
ISCA 2019.
Generative and Multi-phase Learning for Computer Systems Optimization.
Yi Ding, Nikita Mishra, Henry Hoffmann.
ISCA 2019.
Neo: A Learned Query Optimizer.
Ryan Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, Nesime Tatbul
VLDB 2019.
Learning to optimize join queries with deep reinforcement learning.
S. Krishnan, Z. Yang, K. Goldberg, J. Hellerstein, Ion Stoica. 2019.
DeepBase: Deep Inspection of Neural Networks
Thibault Sellam, Kevin Lin, Ian Yiran Huang, Yiru Chen, Michelle Yang, Carl Vondrick, Eugene Wu
SIGMOD 2019.
Democratizing Data Science through Interactive Curation of ML Pipelines.
Zeyuan Shang, Emanuel Zgraggen, Benedetto Buratti, Ferdinand Kossmann,
Philipp Eichmann, Yeounoh Chung, Carsten Binnig, Eli Upfal, Tim Kraska
SIGMOD 2019
Neural Packet Classification
Eric Liang, Hang Zhu, Xin Jin, Ion Stoica.
ACM SIGCOMM, 2019
Learning in situ: a randomized experiment in video streaming.
Francis Y. Yan, Hudson Ayers, Chenzhi Zhu, Sadjad Fouladi, James Hong, Keyi Zhang, Philip Levis, and Keith Winstein
NSDI 2020.
Back to the future: leveraging Belady’s algorithm for improved cache replacement.
Akanksha Jain and Calvin Lin.
ISCA 2016.
Rethinking Belady's Algorithm to Accommodate Prefetching.
A. Jain and C. Lin.
ISCA 2018.
AViC: A Cache for Adaptive Bitrate Video.
Zahaib Akhtar, et al.
CoNEXT 2019.
Flashield: a Hybrid Key-value Cache that Controls Flash Write Amplification.
Assaf Eisenman, et al.
NSDI 2019.
LHD: Improving Cache Hit Rate by Maximizing Hit Density.
Nathan Beckmann, Haoxian Chen, and Asaf Cidon.
NSDI 2018.
Applying Deep Learning to the Cache Replacement Problem.
Z. Shi, X. Huang, A. Jain, and C. Lin.
MICRO 2019.
Learning Relaxed Belady for Content Distribution Network Caching
Zhenyu Song, Daniel S. Berger, Kai Li, Wyatt Lloyd
NSDI 2020.
Variable Rate Image Compression with Recurrent Neural Networks
George Toderici, Sean M. O'Malley, Sung Jin Hwang, Damien Vincent, David Minnen, Shumeet Baluja, Michele Covell, Rahul Sukthankar.
ICLR 2016.
Full Resolution Image Compression with Recurrent Neural Networks.
George Toderici, Damien Vincent, Nick Johnston, Sung Jin Hwang, David Minnen, Joel Shor, Michele Covell.
CVPR 2017.
Real-time adaptive image compression.
Oren Rippel, Lubomir Bourdev
ICML 2017.
Device placement optimization with reinforcement learning.
Azalia Mirhoseini, Hieu Pham, Quoc V. Le, Benoit Steiner, Rasmus Larsen, Yuefeng Zhou, Naveen Kumar, Mohammad Norouzi, Samy Bengio, and Jeff Dean.
ICML 2017.
Learning Scheduling Algorithms for Data Processing Clusters.
Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh.
SIGCOMM 2019.
Neural Architecture Search with Reinforcement Learning.
Barret Zoph, Quoc V. Le.
ICLR 2017.
AMC: AutoML for Model Compression and Acceleration on Mobile Devices.
Yihui He, Ji Lin, Zhijian Liu, Hanrui Wang, Li-Jia Li, Song Han
ECCV 2018
Learning Transferable Architectures for Scalable Image Recognition.
Barret Zoph, Vijay Vasudevan, Jonathon Shlens, Quoc V. Le.
CVPR 2018.
Beyond Data and Model Parallelism for Deep Neural Networks.
Zhihao Jia, Matei Zaharia, Alex Aiken
SysML 2018