Quick links

Reinventing Partially Observable Reinforcement Learning

Date and Time
Wednesday, March 12, 2008 - 4:15pm to 5:45pm
Computer Science Small Auditorium (Room 105)
Eyal Amir, from University of Illinois, Urbana-Champaign
David Blei
Many complex domains offer limited information about their exact state and the way actions affect them. There, autonomous agents need to make decisions at the same time that they learn action models and track the state of the domain. This combined problem can be represented within the framework of reinforcement learning in POMDPs, and is known to be computationally difficult.

In this presentation I will describe a new framework for such decision making, learning, and tracking. This framework applies results that we achieved about updating logical formulas (belief states) after deterministic actions. It includes algorithms that represent and update the set of possible action models and world states compactly and tractably. It makes a decision with this set, and updates the set after taking the chosen action. Most importantly, and somewhat surprisingly, the number of actions that our framework takes to achieve a goal is bounded polynomially by the length of an optimal plan in a fully observable, fully known domain, under lax conditions. Finally, our framework leads to a new stochastic-filtering approach that has better accuracy than previous techniques.

* Joint work with Allen Chang, Hannaneh Hajishirzi, Stuart Russell, Dafna Shahaf, and Afsaneh Shirazi (IJCAI'03,IJCAI'05,AAAI'06,ICAPS'06,IJCAI'07,AAAI'07).

Follow us: Facebook Twitter Linkedin