Publications (by Year)

2024

Directional Smoothness and Gradient Methods: Convergence and Adaptivity [arxiv]

Is RLHF More Difficult than Standard RL? [arxiv]

Breaking the Curse of Multiagency: Provably Efficient Decentralized Multi-Agent RL with Function Approximation [arxiv]

Learning Rationalizable Equilibria in Multiplayer Games [arxiv]

Learning Markov Games with Adversarial Opponents: Efficient Algorithms and Fundamental Limits [arxiv]

V-Learning – A Simple, Efficient, Decentralized Algorithm for Multiagent RL [arxiv]

Chi Jin, Qinghua Liu, Yuanhao Wang, Tiancheng Yu (α-β order)
Mathematics of Operations Research; Best Paper at the ICLR 2022 “Gamification and Multiagent Solutions” workshop

An Exponential Lower Bound for Linearly Realizable MDP with Constant Suboptimality Gap [arxiv]

Near-optimal Local Convergence of Alternating Gradient Descent-Ascent for Minimax Optimization [arxiv]

Online Learning in Unknown Markov Games [arxiv]

On the Suboptimality of Negative Momentum for Minimax Optimization [arxiv]

Improved Algorithms for Convex-Concave Minimax Optimization [arxiv]

On Solving Minimax Optimization Locally: A Follow-the-Ridge Approach [arxiv]

Yuanhao Wang*, Guodong Zhang*, Jimmy Ba
ICLR 2020, invited talk at NeurIPS 2019 Smooth Games Optimization and Machine Learning Workshop

Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP [arxiv]

Distributed Bandit Learning: Near-Optimal Regret with Efficient Communication [arxiv]

16-qubit IBM universal quantum computer can be fully entangled [paper link] [arxiv]

Robust Linear Regression via Least Squares [pdf]

A standalone note on the robustness of linear regression for hypercontractive data distributions. Later used for the upper bound in our paper on linearly realizable MDPs with gaps.

What is Momentum for Minimax Optimization? [pdf]

For quadratic minimization, Chebyshev polynomials can be used to derive Polyak's momentum. A similar tactic for minimax optimization results in a peculiar algorithm.
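
As a rough illustration of the quadratic-minimization half of this description only, the sketch below runs Polyak's momentum (heavy ball) on a diagonal quadratic and compares it with plain gradient descent. It uses the classical heavy-ball step size and momentum for eigenvalues in [mu, L], which correspond to the stationary limit of the Chebyshev iteration; it is not taken from the note, and the function names (`heavy_ball`, `gradient_descent`, `norm`) and the test problem are hypothetical. The minimax algorithm mentioned above is not reproduced here.

```python
import math


def norm(x):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(xi * xi for xi in x))


def heavy_ball(eigs, x0, steps):
    """Polyak's momentum on f(x) = 0.5 * sum(eigs[i] * x[i]**2), minimized at 0.

    Assumes all eigenvalues are positive. Uses the classical tuning
    alpha = 4 / (sqrt(L) + sqrt(mu))**2 and
    beta = ((sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)))**2.
    """
    mu, L = min(eigs), max(eigs)
    s_mu, s_L = math.sqrt(mu), math.sqrt(L)
    alpha = 4.0 / (s_L + s_mu) ** 2
    beta = ((s_L - s_mu) / (s_L + s_mu)) ** 2
    x_prev, x = list(x0), list(x0)  # zero initial velocity
    for _ in range(steps):
        grad = [lam * xi for lam, xi in zip(eigs, x)]
        x_next = [
            xi - alpha * g + beta * (xi - xp)
            for xi, g, xp in zip(x, grad, x_prev)
        ]
        x_prev, x = x, x_next
    return x


def gradient_descent(eigs, x0, steps):
    """Gradient descent with the standard step size 2 / (L + mu), for comparison."""
    mu, L = min(eigs), max(eigs)
    alpha = 2.0 / (L + mu)
    x = list(x0)
    for _ in range(steps):
        x = [xi - alpha * lam * xi for lam, xi in zip(eigs, x)]
    return x


if __name__ == "__main__":
    eigs = [1.0, 10.0, 100.0]  # condition number 100 (illustrative choice)
    x0 = [1.0, 1.0, 1.0]
    print("heavy ball:      ", norm(heavy_ball(eigs, x0, 200)))
    print("gradient descent:", norm(gradient_descent(eigs, x0, 200)))
```

With condition number kappa, the momentum iterates contract at roughly (sqrt(kappa) - 1) / (sqrt(kappa) + 1) per step, against (kappa - 1) / (kappa + 1) for gradient descent, so the gap widens as the problem becomes more ill-conditioned.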

Non-asymptotic Analysis for Polyak's Momentum in Quadratic Functions [blog post]

Refined Analysis of FPL for Adversarial Markov Decision Processes [arxiv]