04-22
Learning Models of the Environment for Reinforcement Learning

Model-based algorithms for decision making have long held the promise of being more powerful and data-efficient than their model-free counterparts. However, the widespread application of model-based methods has been limited by the need for perfect models of the environment. The game of Go was mastered by AlphaGo using a combination of neural networks and the Monte Carlo tree search (MCTS) planning algorithm, but planning required a perfect simulator of the game rules. In domains such as robotics or natural language understanding, where no perfect simulators are available, model-based approaches are not easy to apply effectively. Addressing this limitation, I will describe algorithms (Dreamer and MuZero) that use deep neural networks to learn robust environment models. These models are used to imagine potential futures, which can in turn be used for planning and for learning policy and value functions. The advent and success of powerful model-based learning algorithms offer hints about the next wave of language models.

Bio: Timothy Lillicrap received an Hon. B.Sc. in Cognitive Science & Artificial Intelligence from the University of Toronto and a Ph.D. in Systems Neuroscience from Queen’s University in Canada. He moved to the University of Oxford in 2012, where he worked as a Postdoctoral Research Fellow. In 2014 he joined Google DeepMind as a Research Scientist and became a Director of Research in 2023. His research focuses on machine learning for optimal control and decision making, as well as using these mathematical frameworks to understand how the brain learns. He has developed new algorithms for exploiting deep networks in the context of reinforcement learning, and new recurrent memory architectures for one-shot learning problems. His projects have included applications of deep learning to robotics, to solving games such as Go and StarCraft, and to human interaction.

To request accommodations for a disability please contact Donna Ghilino, dg3548@princeton.edu, at least one week prior to the event.

Date and Time

Monday April 22, 2024 12:30pm - 1:30pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

CS Department Colloquium Series

Speaker

Timothy Lillicrap, from Google DeepMind

Host

Sanjeev Arora

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

