Yingyu Liang

Princeton University

About Me


  • I will join the University of Wisconsin-Madison as an assistant professor in Fall 2017.

  • 2015-: Lecturer/Associate Research Scholar in Computer Science, Princeton University
    Host: Sanjeev Arora

  • 2014-2015: Postdoc in Computer Science, Princeton University
    Hosts: Sanjeev Arora, Moses Charikar

  • 2014: Ph.D. in Computer Science, Georgia Tech
    Advisor: Maria-Florina Balcan

  • 2010: M.S. in Computer Science, Tsinghua University
    Advisors: Bo Zhang, Jianmin Li

  • 2008: B.S. in Computer Science, Tsinghua University

  • Contact: yingyul at cs dot princeton dot edu
    Computer Science Building, Princeton University, NJ 08544

  • Research: Machine learning. In particular, analyzing machine learning models, designing efficient algorithms for real-world applications, and applying machine learning to interdisciplinary tasks.

Teaching

Publications

(authors are listed in alphabetical order, except for those papers with *)

Journal Publications

  • Scalable Influence Maximization for Multiple Products in Continuous-Time Diffusion Networks*
    Nan Du, Yingyu Liang, Maria-Florina Balcan, Manuel Gomez-Rodriguez, Hongyuan Zha, Le Song.
    To appear in Journal of Machine Learning Research (JMLR), 2017.
    [ARXIV] [CODE]

  • A Latent Variable Model Approach to PMI-based Word Embeddings
    Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski.
    Transactions of the Association for Computational Linguistics (TACL), 2016.
    [TACL] [ARXIV] [CODE] [Sanjeev's post]

  • Clustering Under Perturbation Resilience
    Maria-Florina Balcan, Yingyu Liang.
    SIAM Journal on Computing (SICOMP), 2016.
    [SICOMP] [PAPER]

  • Robust Hierarchical Clustering
    Maria-Florina Balcan, Pramod Gupta, Yingyu Liang.
    Journal of Machine Learning Research (JMLR), 2014.
    [ARXIV] [CODE]

Manuscripts

  • Optimal Sample Complexity for Matrix Completion and Related Problems via $\ell_2$-Regularization
    Maria-Florina Balcan, Yingyu Liang, David Woodruff, Hongyang Zhang.
    [ARXIV]

  • Mapping Between Natural Movie fMRI Responses and Word-Sequence Representations*
    Kiran Vodrahalli, Po-Hsuan Chen, Yingyu Liang, Janice Chen, Esther Yong, Christopher Honey, Peter Ramadge, Ken Norman, Sanjeev Arora.
    [ARXIV] [Appeared in NIPS'16 Workshop]

  • Linear Algebraic Structure of Word Senses, with Applications to Polysemy
    Sanjeev Arora, Yuanzhi Li, Yingyu Liang, Tengyu Ma, Andrej Risteski.
    [ARXIV] [CODE] [Sanjeev's post]

  • Why are Deep Nets Reversible: A Simple Theory, with Implications for Training
    Sanjeev Arora, Yingyu Liang, Tengyu Ma.
    [ARXIV] [Appeared in ICLR'16 Workshop]

Conference Publications

  • Generalization and Equilibrium in Generative Adversarial Nets (GANs)
    Sanjeev Arora, Rong Ge, Yingyu Liang, Tengyu Ma, Yi Zhang.
    International Conference on Machine Learning (ICML), 2017.
    [ARXIV]

  • Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations
    Yuanzhi Li, Yingyu Liang.
    International Conference on Machine Learning (ICML), 2017.

  • Differentially Private Clustering in High-Dimensional Euclidean Spaces
    Maria-Florina Balcan, Travis Dick, Yingyu Liang, Wenlong Mou, Hongyang Zhang.
    International Conference on Machine Learning (ICML), 2017.

  • A Simple but Tough-to-Beat Baseline for Sentence Embedding
    Sanjeev Arora, Yingyu Liang, Tengyu Ma.
    International Conference on Learning Representations (ICLR), 2017.
    [OPEN REVIEW] [CODE] [Preliminary version appeared in NIPS'16 Workshop]

  • Diversity Leads to Generalization in Neural Networks*
    Bo Xie, Yingyu Liang, Le Song.
    International Conference on Artificial Intelligence and Statistics (AISTATS), 2017.
    [ARXIV] [Preliminary version appeared in NIPS'16 Workshop]

  • Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
    Yuanzhi Li, Yingyu Liang, Andrej Risteski.
    Neural Information Processing Systems (NIPS), 2016.
    [ARXIV]

  • Recovery Guarantee of Weighted Low-Rank Approximation via Alternating Minimization
    Yuanzhi Li, Yingyu Liang, Andrej Risteski.
    International Conference on Machine Learning (ICML), 2016.
    [ARXIV]

  • Communication Efficient Distributed Kernel Principal Component Analysis
    Maria-Florina Balcan, Yingyu Liang, Le Song, David Woodruff, Bo Xie.
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016.
    [ARXIV]

  • Learning in Indefinite Proximity Spaces - Recent Trends*
    Frank-Michael Schleif, Peter Tino, Yingyu Liang.
    European Symposium on Artificial Neural Networks (ESANN), 2016.
    [PAPER]

  • Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients*
    Bo Xie, Yingyu Liang, Le Song.
    Neural Information Processing Systems (NIPS), 2015.
    [ARXIV]

  • Distributed Frank-Wolfe Algorithm: A Unified Framework for Communication-Efficient Sparse Learning*
    Aurelien Bellet, Alireza Bagheri Garakani, Yingyu Liang, Maria-Florina Balcan, Fei Sha.
    SIAM International Conference on Data Mining (SDM), 2015.
    [ARXIV] [PRESENTATION] [CODE]

  • Scalable Kernel Methods via Doubly Stochastic Gradients*
    Bo Dai, Bo Xie, Niao He, Yingyu Liang, Anant Raj, Maria-Florina Balcan, Le Song.
    Neural Information Processing Systems (NIPS), 2014.
    [ARXIV] [POSTER] [CODE]

  • Learning Time-Varying Coverage Functions*
    Nan Du, Yingyu Liang, Maria-Florina Balcan, Le Song.
    Neural Information Processing Systems (NIPS), 2014.
    [FULL VERSION] [POSTER]

  • Improved Distributed Principal Component Analysis
    Maria-Florina Balcan, Vandana Kanchanapally, Yingyu Liang, David Woodruff.
    Neural Information Processing Systems (NIPS), 2014.
    [ARXIV] [POSTER] [CODE]

  • Influence Function Learning in Information Diffusion Networks*
    Nan Du, Yingyu Liang, Maria-Florina Balcan, Le Song.
    The 31st International Conference on Machine Learning (ICML), 2014.
    [PAPER] [FULL VERSION] [POSTER] [CODE]

  • Distributed k-Means and k-Median Clustering on General Topologies
    Maria-Florina Balcan, Steven Ehrlich, Yingyu Liang.
    Neural Information Processing Systems (NIPS), 2013.
    [PAPER] [FULL VERSION] [SLIDES] [POSTER] [CODE]

  • Modeling and Detecting Community Hierarchies
    Maria-Florina Balcan, Yingyu Liang.
    The 2nd International Workshop on Similarity-Based Pattern Analysis and Recognition (SIMBAD), 2013.
    [PAPER] [SLIDES]

  • Efficient Semi-supervised and Active Learning of Disjunctions
    Maria-Florina Balcan, Christopher Berlind, Steven Ehrlich, Yingyu Liang.
    The 30th International Conference on Machine Learning (ICML), 2013.
    [PAPER] [SUPPLEMENTARY MATERIAL] [SPOTLIGHT] [POSTER]

  • Clustering under Perturbation Resilience
    Maria-Florina Balcan, Yingyu Liang.
    The 39th International Colloquium on Automata, Languages and Programming (ICALP), 2012.
    [PAPER] [SLIDES] [EXTENDED ARXIV VERSION] [POSTER]

  • Learning Vocabulary-based Hashing with AdaBoost*
    Yingyu Liang, Jianmin Li, Bo Zhang.
    The 16th International Conference on Multimedia Modeling (MMM), 2010.
    [PAPER]

  • Vocabulary-based Hashing for Image Search*
    Yingyu Liang, Jianmin Li, Bo Zhang.
    The ACM International Conference on Multimedia (MM), 2009.
    [PAPER]

Workshop Publications

  • Diversity Leads to Generalization in Neural Networks*
    Yingyu Liang, Bo Xie, Le Song.
    In Non-convex Optimization for Machine Learning Workshop in NIPS 2016.
    [ARXIV]

  • Mapping Between Natural Movie fMRI Responses and Word-Sequence Representations*
    Kiran Vodrahalli, Po-Hsuan Chen, Yingyu Liang, Janice Chen, Esther Yong, Christopher Honey, Peter Ramadge, Ken Norman, and Sanjeev Arora.
    In Representation Learning in Artificial and Biological Neural Networks Workshop in NIPS 2016.
    [ARXIV]

  • A Simple but Tough-to-Beat Baseline for Sentence Embedding
    Sanjeev Arora, Yingyu Liang, Tengyu Ma.
    In Multi-class and Multi-label Learning in Extremely Large Label Spaces Workshop in NIPS 2016.
    [OPEN REVIEW]

  • Why are Deep Nets Reversible: A Simple Theory, with Implications for Training
    Sanjeev Arora, Yingyu Liang, Tengyu Ma.
    International Conference on Learning Representations (ICLR), 2016.
    [ARXIV]

  • Distributed Frank-Wolfe Algorithm: A Unified Framework for Communication-Efficient Sparse Learning*
    Aurelien Bellet, Alireza Bagheri Garakani, Yingyu Liang, Maria-Florina Balcan, and Fei Sha.
    The Workshop on New Learning Frameworks and Models for Big Data in ICML 2014.
    [ARXIV] [PRESENTATION]

  • Distributed PCA and k-Means Clustering
    Maria-Florina Balcan, Vandana Kanchanapally, Yingyu Liang.
    The Big Learning Workshop in NIPS 2013.
    [PAPER] [PRESENTATION] [POSTER] [CODE]

  • Clustering Perturbation Resilient k-Median Instances
    Maria-Florina Balcan, Yingyu Liang.
    The Learning Faster from Easy Data Workshop in NIPS 2013.
    [PAPER] [SPOTLIGHT] [POSTER]

  • THU-IMG at TRECVID 2009*
    Yingyu Liang, Binbin Cao, Jianmin Li, Chenguang Zhu, Yongchao Zhang, Chenhao Tan, Ge Chen, Chen Sun, Jinhui Yuan, Mingxing Xu, and Bo Zhang.
    The TRECVID workshop, 2009.
    [REPORT]

  • THU and ICRC at TRECVID 2008*
    Yingyu Liang, Xiaobing Liu, Zhikun Wang, Jianmin Li, Binbin Cao, Zhichao Cao, Zhenlong Dai, Zhishan Guo, Wen Li, Leigang Luo, Zhaoshi Meng, Yinfeng Qin, Shi Qiu, Aibo Tian, Dong Wang, Qiuping Wang, Chenguang Zhu, Xiaolin Hu, Jinhui Yuan, Peijiang Yuan, Bo Zhang, Shi Chen, Jianguo Li, Tao Wang, and Yimin Zhang.
    The TRECVID workshop, 2008.
    [REPORT]

  • THU and ICRC at TRECVID 2007*
    Jinhui Yuan, Zhishan Guo, Li Lv, Wei Wan, Teng Zhang, Dong Wang, Xiaobing Liu, Cailiang Liu, Shengqi Zhu, Duanpeng Wang, Yang Pang, Nan Ding, Ying Liu, Jiangping Wang, Xiujun Zhang, Xiaozheng Tie, Zhikun Wang, Huiyi Wang, Tongchun Xiao, Yingyu Liang, Jianmin Li, Fuzong Lin, Bo Zhang, Jianguo Li, Weixin Wu, Xiaofeng Tong, Dayong Ding, Yurong Chen, Tao Wang, and Yimin Zhang.
    The TRECVID workshop, 2007.
    [REPORT]

Ph.D. Thesis

Master's Thesis

Activities

Presentations

  • Theory for New Machine Learning Problems and Applications

    • University of Texas at Dallas, February 2017
    • University of Massachusetts Amherst, February 2017
    • University of Illinois Urbana-Champaign, February 2017
    • University of Wisconsin-Madison, March 2017
    • Northwestern University, March 2017
    • University of California, Santa Barbara, March 2017
    • University of Pennsylvania, March 2017
    • Emory University, March 2017
    • Penn State University, April 2017
    • Cornell University, May 2017

  • Recovery guarantee of non-negative matrix factorization via alternating updates

    • Simons Workshop on Learning, Algorithm Design and Beyond Worst-Case Analysis, November 2016

  • Distributed kernel principal component analysis

    • KDD, August 2016
    • IMA Workshop on Transdisciplinary Foundations of Data Science, September 2016

  • Recovery guarantee of weighted low-rank approximation via alternating minimization

    • Rutgers University, DIMACS Theoretical Computer Science Seminar, April 2016
    • ICML, June 2016

  • Why are deep nets reversible: a simple theory, with implications for training

    • ICLR, May 2016

  • Two distributed optimization algorithms for machine learning

    • IMA Workshop on Convexity and Optimization: Theory and Applications, February 2015

  • Communication efficient algorithms for distributed unsupervised learning

    • CMU Machine Learning Lunch, October 2014

  • Modern aspects of unsupervised learning: stability and scalability

    • Princeton Theory Lunch, September 2014

  • Clustering under perturbation resilience

    • Learning Faster from Easy Data Workshop at NIPS, December 2013
    • University of Maryland, September 2013
    • ACO student seminar, Georgia Institute of Technology, August 2013
    • Theory group seminar, Georgia Institute of Technology, July 2012

  • Distributed PCA and k-means clustering

    • Georgia Scientific Computing Symposium, February 2014
    • Big Learning Workshop at NIPS, December 2013

  • Distributed k-median and k-means clustering on general topologies

    • NIPS, December 2013
    • HPArch Lab, Georgia Institute of Technology, November 2013
    • George Washington University, September 2013
    • MURI Symposium, University of Maryland, September 2013

  • Efficient semi-supervised and active learning of disjunctions

    • ICML, June 2013

Symposiums/Workshops

  • Simons Program on Foundations of Machine Learning, January to May 2017

  • Simons Workshop on Learning, Algorithm Design and Beyond Worst-Case Analysis, November 2016

  • IMA Workshop on Transdisciplinary Foundations of Data Science, September 2016

  • Information Theory and Applications Workshop (ITA), February 2015

  • IMA Workshop on Convexity and Optimization: Theory and Applications, February 2015

  • Georgia Scientific Computing Symposium, February 2014

  • MURI Symposium, September 2013

  • ARC Theory Day, April 2013

  • ARC-Yandex Workshop: Internet Topology and Economics, November 2012

  • CMU Summer School on Algorithmic Economics, August 2012

  • Center of Data Analytics Workshop on Big Data Research and Development, April 2012

  • ARC Submodular Workshop, March 2012

  • ARC Theory Day, November 2011

  • Machine Learning Summer School at Purdue, June 2011

Service

  • Area Chair for ICML 2016;
    PC member for Tsinghua University CS PhD Forum 2010, ICCI 2010, SIMBAD 2015, ESANN 2016 special session: Indefinite proximity learning

  • Reviewer for conferences: COLT 2012, UAI 2012, STACS 2013, SIMBAD 2015, SODA 2015, FOCS 2015, ITCS 2015, AISTATS 2015, COLT 2015, IJCAI 2015, UAI 2015, ICML 2015, AISTATS 2016, SODA 2016, WSDM 2016, COLT 2016, UAI 2016, NIPS 2016, AAAI 2017, STOC 2017, WWW 2017, ICML 2017, NIPS 2017;
    Reviewer for journals/collections: Data Mining and Knowledge Discovery 2011, IEEE Transactions on Information Theory 2014, Information & Computation 2015, Journal of the ACM 2016, Machine Learning 2016, Journey Through Discrete Mathematics: A Tribute to Jiri Matousek, Journal of Machine Learning Research 2016 and 2017.

  • Organizer of Machine Learning Reading Group at Georgia Tech, 2012-2014

  • ICML 2013 Volunteer, June 2013
