Yingyu Liang

Princeton University

About Me

  • Postdoc/Lecturer in Computer Science, Princeton University, 2014 -
    Host: Sanjeev Arora

  • Ph.D. in Computer Science, Georgia Tech, 2014
    Advisor: Maria-Florina Balcan

  • M.S. in Computer Science, Tsinghua University, 2010

  • B.S. in Computer Science, Tsinghua University, 2008

Contact

  • yingyul at cs dot princeton dot edu
    Computer Science Building, Princeton University, NJ 08544

Research

  • Machine learning, distributed algorithms, analysis of network and dynamic data

Publications

Manuscript

  • Linear Algebraic Structure of Word Senses, with Applications to Polysemy
    With Sanjeev Arora, Yuanzhi Li, Tengyu Ma, and Andrej Risteski.
    [ARXIV]

  • Why are Deep Nets Reversible: A Simple Theory, with Implications for Training
    With Sanjeev Arora and Tengyu Ma.
    [ARXIV]

Journal

  • A Latent Variable Model Approach to PMI-based Word Embeddings
    With Sanjeev Arora, Yuanzhi Li, Tengyu Ma, and Andrej Risteski.
    Transactions of the Association for Computational Linguistics (TACL), 2016.
    [ARXIV]

  • Scalable Influence Maximization for Multiple Products in Continuous-Time Diffusion Networks
    With Nan Du, Maria-Florina Balcan, Manuel Gomez-Rodriguez, Hongyuan Zha, and Le Song.
    To appear in Journal of Machine Learning Research (JMLR), 2016.

  • Clustering Under Perturbation Resilience
    With Maria-Florina Balcan.
    SIAM Journal on Computing (SICOMP), 2016.
    [PAPER]

  • Robust Hierarchical Clustering
    With Maria-Florina Balcan and Pramod Gupta.
    Journal of Machine Learning Research (JMLR), 2014.
    [ARXIV]

Conference

  • Recovery Guarantee of Non-negative Matrix Factorization via Alternating Updates
    With Yuanzhi Li and Andrej Risteski.
    Neural Information Processing Systems (NIPS), 2016.

  • Recovery Guarantee of Weighted Low-Rank Approximation via Alternating Minimization
    With Yuanzhi Li and Andrej Risteski.
    International Conference on Machine Learning (ICML), 2016.
    [ARXIV]

  • Distributed Kernel Principal Component Analysis
    With Maria-Florina Balcan, Le Song, David Woodruff, and Bo Xie.
    ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2016.
    [ARXIV]

  • Scale Up Nonlinear Component Analysis with Doubly Stochastic Gradients
    With Bo Xie and Le Song.
    Neural Information Processing Systems (NIPS), 2015.
    [ARXIV]

  • Distributed Frank-Wolfe Algorithm: A Unified Framework for Communication-Efficient Sparse Learning
    With Aurelien Bellet, Alireza Bagheri Garakani, Maria-Florina Balcan, and Fei Sha.
    SIAM International Conference on Data Mining (SDM), 2015.
    [ARXIV] [PRESENTATION]

  • Scalable Kernel Methods via Doubly Stochastic Gradients
    With Bo Dai, Bo Xie, Niao He, Anant Raj, Maria-Florina Balcan, and Le Song.
    Neural Information Processing Systems (NIPS), 2014.
    [ARXIV] [POSTER]

  • Learning Time-Varying Coverage Functions
    With Nan Du, Maria-Florina Balcan, and Le Song.
    Neural Information Processing Systems (NIPS), 2014.
    [FULL VERSION] [POSTER]

  • Improved Distributed Principal Component Analysis
    With Maria-Florina Balcan, Vandana Kanchanapally, and David Woodruff.
    Neural Information Processing Systems (NIPS), 2014.
    [ARXIV] [POSTER] [CODE]

  • Influence Function Learning in Information Diffusion Networks
    With Nan Du, Maria-Florina Balcan, and Le Song.
    The 31st International Conference on Machine Learning (ICML), 2014.
    [PAPER] [FULL VERSION] [POSTER]

  • Distributed k-Means and k-Median Clustering on General Topologies
    With Maria-Florina Balcan and Steven Ehrlich.
    Neural Information Processing Systems (NIPS), 2013.
    [PAPER] [FULL VERSION] [SLIDES] [POSTER] [CODE]

  • Modeling and Detecting Community Hierarchies
    With Maria-Florina Balcan.
    The 2nd International Workshop on Similarity-Based Pattern Analysis and Recognition (SIMBAD), 2013.
    [PAPER] [SLIDES]

  • Efficient Semi-supervised and Active Learning of Disjunctions
    With Maria-Florina Balcan, Christopher Berlind, and Steven Ehrlich.
    The 30th International Conference on Machine Learning (ICML), 2013.
    [PAPER] [SUPPLEMENTARY MATERIAL] [SPOTLIGHT] [POSTER]

  • Clustering under Perturbation Resilience
    With Maria-Florina Balcan.
    The 39th International Colloquium on Automata, Languages and Programming (ICALP), 2012.
    [PAPER] [SLIDES] [EXTENDED ARXIV VERSION] [POSTER]

  • Learning Vocabulary-based Hashing with AdaBoost
    With Jianmin Li and Bo Zhang.
    The 16th International Conference of Multimedia Modeling (MMM), 2010.
    [PAPER]

  • Vocabulary-based Hashing for Image Search
    With Jianmin Li and Bo Zhang.
    The ACM International Conference on Multimedia (MM), 2009.
    [PAPER]

Workshop

  • Why are Deep Nets Reversible: A Simple Theory, with Implications for Training
    With Sanjeev Arora and Tengyu Ma.
    International Conference on Learning Representations (ICLR), 2016.
    [ARXIV]

  • Distributed Frank-Wolfe Algorithm: A Unified Framework for Communication-Efficient Sparse Learning
    With Aurelien Bellet, Alireza Bagheri Garakani, Maria-Florina Balcan, and Fei Sha.
    The Workshop on New Learning Frameworks and Models for Big Data in ICML 2014.
    [ARXIV] [PRESENTATION]

  • Distributed PCA and k-Means Clustering
    With Maria-Florina Balcan and Vandana Kanchanapally.
    The Big Learning Workshop in NIPS 2013.
    [PAPER] [PRESENTATION] [POSTER] [CODE]

  • Clustering Perturbation Resilient k-Median Instances
    With Maria-Florina Balcan.
    The Learning Faster from Easy Data Workshop in NIPS 2013.
    [PAPER] [SPOTLIGHT] [POSTER]

  • THU-IMG at TRECVID 2009
    With Binbin Cao, Jianmin Li, Chenguang Zhu, Yongchao Zhang, Chenhao Tan, Ge Chen, Chen Sun, Jinhui Yuan, Mingxing Xu, and Bo Zhang.
    The TRECVID workshop, 2009.
    [REPORT]

  • THU and ICRC at TRECVID 2008
    With Xiaobing Liu, Zhikun Wang, Jianmin Li, Binbin Cao, Zhichao Cao, Zhenlong Dai, Zhishan Guo, Wen Li, Leigang Luo, Zhaoshi Meng, Yinfeng Qin, Shi Qiu, Aibo Tian, Dong Wang, Qiuping Wang, Chenguang Zhu, Xiaolin Hu, Jinhui Yuan, Peijiang Yuan, Bo Zhang, Shi Chen, Jianguo Li, Tao Wang, and Yimin Zhang.
    The TRECVID workshop, 2008.
    [REPORT]

  • THU and ICRC at TRECVID 2007
    With Jinhui Yuan, Zhishan Guo, Li Lv, Wei Wan, Teng Zhang, Dong Wang, Xiaobing Liu, Cailiang Liu, Shengqi Zhu, Duanpeng Wang, Yang Pang, Nan Ding, Ying Liu, Jiangping Wang, Xiujun Zhang, Xiaozheng Tie, Zhikun Wang, Huiyi Wang, Tongchun Xiao, Yingyu Liang, Jianmin Li, Fuzong Lin, Bo Zhang, Jianguo Li, Weixin Wu, Xiaofeng Tong, Dayong Ding, Yurong Chen, Tao Wang, and Yimin Zhang.
    The TRECVID workshop, 2007.
    [REPORT]

Ph.D. Thesis

Master's Thesis

Teaching

Activities

Presentation

  • Recovery guarantee of non-negative matrix factorization via alternating updates

    • Simons Workshop on Learning, Algorithm Design and Beyond Worst-Case Analysis, November 2016

  • Distributed kernel principal component analysis

    • KDD, August 2016
    • IMA Workshop on Transdisciplinary Foundations of Data Science, September 2016

  • Recovery guarantee of weighted low-rank approximation via alternating minimization

    • Rutgers University, DIMACS Theoretical Computer Science Seminar, April 2016
    • ICML, June 2016

  • Why are deep nets reversible: a simple theory, with implications for training

    • ICLR, May 2016

  • Two distributed optimization algorithms for machine learning

    • IMA Workshop on Convexity and Optimization: Theory and Applications, February 2015

  • Communication efficient algorithms for distributed unsupervised learning

    • CMU Machine Learning Lunch, October 2014

  • Modern aspects of unsupervised learning: stability and scalability

    • Princeton Theory Lunch, September 2014

  • Clustering under perturbation resilience

    • Learning Faster from Easy Data Workshop at NIPS, December 2013
    • University of Maryland, September 2013
    • ACO student seminar, Georgia Institute of Technology, August 2013
    • Theory group seminar, Georgia Institute of Technology, July 2012

  • Distributed PCA and k-means clustering

    • Georgia Scientific Computing Symposium, February 2014
    • Big Learning Workshop at NIPS, December 2013

  • Distributed k-median and k-means clustering on general topologies

    • NIPS, December 2013
    • HPArch Lab, Georgia Institute of Technology, November 2013
    • George Washington University, September 2013
    • MURI Symposium, University of Maryland, September 2013

  • Efficient semi-supervised and active learning of disjunctions

    • ICML, June 2013

Symposium/Workshop

  • Simons Workshop on Learning, Algorithm Design and Beyond Worst-Case Analysis, November 2016

  • IMA Workshop on Transdisciplinary Foundations of Data Science, September 2016

  • Information Theory and Applications Workshop (ITA), February 2015

  • IMA Workshop on Convexity and Optimization: Theory and Applications, February 2015

  • Georgia Scientific Computing Symposium, February 2014

  • MURI Symposium, September 2013

  • ARC Theory Day, April 2013

  • ARC-Yandex Workshop: Internet Topology and Economics, November 2012

  • CMU Summer School on Algorithmic Economics, August 2012

  • Center of Data Analytics Workshop on Big Data Research and Development, April 2012

  • ARC Submodular Workshop, March 2012

  • ARC Theory Day, November 2011

  • Machine Learning Summer School at Purdue, June 2011

Service

  • Area Chair for ICML 2016;
    PC member for Tsinghua University CS PhD Forum 2010, ICCI 2010, SIMBAD 2015, ESANN 2016 special session: Indefinite proximity learning, IJCAI 2015, UAI 2015, ICML 2015, NIPS 2016

  • Reviewer for conferences: COLT 2012, UAI 2012, STACS 2013, SIMBAD 2015, SODA 2015, FOCS 2015, ITCS 2015, AISTATS 2015, COLT 2015, IJCAI 2015, AISTATS 2016, SODA 2016, WSDM 2016, COLT 2016, UAI 2016;
    Reviewer for journals/collections: Data Mining and Knowledge Discovery 2011, IEEE Transactions on Information Theory 2014, Information and Computation 2015, Journal of the ACM 2016, Machine Learning 2016, Journey Through Discrete Mathematics: A Tribute to Jiri Matousek.

  • Organizer of Machine Learning Reading Group at Georgia Tech, 2012-2014

  • ICML 2013 Volunteer, June 2013

Machine Learning

Theoretical CS