Buenos Aires Machine Learning Summer School 2018
Optimization for Machine Learning

Elad Hazan

June 2018


We survey the optimization viewpoint on machine learning and the basic methods used to train learning machines. We describe stochastic and online gradient descent, as well as recent improvements: adaptive regularization, acceleration, and variance reduction. We also survey the state of the art in second-order stochastic optimization, conditional gradient algorithms, and provable guarantees for non-convex continuous optimization.



Bibliography: Books, Surveys and Research Papers

Textbooks and surveys:

Convex Optimization, by S. Boyd and L. Vandenberghe
Introductory Lectures on Convex Programming, by Y. Nesterov
Convex Optimization: Algorithms and Complexity, by S. Bubeck
Introduction to Online Convex Optimization, by E. Hazan
Online Learning and Online Convex Optimization, by S. Shalev-Shwartz
The Multiplicative Weights Update Method: a Meta-Algorithm and Applications, by S. Arora, E. Hazan and S. Kale

Research articles:

Variance reduction

Online gradient descent, logarithmic regret and applications to soft-margin SVM

Stochastic second-order methods

Non-convex optimization for ML


Adaptive regularization and AdaGrad

Projection-free algorithms

Lower bounds