Estimation Problems in Machine Learning
Date and Time
Wednesday, April 10, 2002 - 4:00pm to 5:30pm
Computer Science Small Auditorium (Room 105)
Peter Bartlett, from BIOwulf Technologies and Australian National University
Statistical estimation problems arise naturally in many areas of machine intelligence. In this talk, we consider problems of this kind from two areas of machine learning: supervised learning, where the goal is to learn to make decisions in an i.i.d. setting, and reinforcement learning, where the goal is to learn to make a sequence of related decisions.

In prediction problems, such as pattern classification and regression, estimates of prediction error are important for the analysis and design of learning algorithms. We first review classical error estimates, and then describe more recent "large margin" estimates, which give a better explanation of the success of some of the most popular pattern classification techniques. All of these estimates measure the complexity of a function class without exploiting any information about the process that generated the data. We describe recent work on data-dependent error estimates, which can be much more accurate because they use the training data to capture important properties of this process.

In reinforcement learning problems, an agent chooses actions to take in some environment, aiming to maximize a reward function. Many control, scheduling, optimization and game-playing tasks can be formulated in this way. Policy gradient methods consider agents that are restricted to some set of policies, and aim to move through policy space so as to improve the performance of the agent. The central problem for such methods is that of estimating performance gradients. We present algorithms for this problem, and show how their performance depends on properties of the controlled system, such as mixing times.
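To make the "large margin" idea concrete, the following is a minimal sketch (not from the talk; function names are illustrative) of the quantity such bounds are built on: the empirical fraction of training points whose signed margin falls below a threshold gamma, for a linear classifier with labels in {-1, +1}.

```python
# Illustrative sketch: margin-based empirical error for a linear classifier.
# Large-margin bounds relate this quantity, plus a complexity term for the
# function class, to the true misclassification error.
import math

def margins(w, X, y):
    """Signed margins y_i * <w, x_i> / ||w|| for labels y_i in {-1, +1}."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return [yi * sum(wj * xj for wj, xj in zip(w, xi)) / norm
            for xi, yi in zip(X, y)]

def margin_error(w, X, y, gamma):
    """Fraction of training points classified with margin below gamma.

    A point counts against the classifier not only when it is
    misclassified (negative margin) but also when it is classified
    correctly by less than gamma.
    """
    m = margins(w, X, y)
    return sum(1 for mi in m if mi < gamma) / len(m)
```

Raising gamma trades a larger empirical term for a smaller complexity term in the bound, which is one way to read why maximizing the margin helps.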
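The gradient-estimation problem can be illustrated with a standard score-function (REINFORCE-style) estimator; this is a generic sketch, not the specific algorithms presented in the talk. For a softmax policy over two actions, the gradient of the expected reward is the expectation of reward times the gradient of the log-probability of the sampled action, so averaging that product over sampled actions gives an unbiased estimate.

```python
# Illustrative sketch: Monte Carlo estimate of a performance gradient
# for a one-step softmax policy (score-function / REINFORCE estimator).
import math
import random

def softmax(theta):
    e = [math.exp(t) for t in theta]
    s = sum(e)
    return [x / s for x in e]

def grad_log_pi(theta, a):
    """Gradient of log pi(a | theta) for a softmax policy."""
    p = softmax(theta)
    return [(1.0 if i == a else 0.0) - p[i] for i in range(len(theta))]

def estimate_gradient(theta, reward, n_samples=10000, seed=0):
    """Estimate d/dtheta E[reward] as the sample average of
    reward(a) * grad log pi(a | theta) over actions drawn from pi."""
    rng = random.Random(seed)
    p = softmax(theta)
    g = [0.0] * len(theta)
    for _ in range(n_samples):
        a = 0 if rng.random() < p[0] else 1
        r = reward(a)
        glp = grad_log_pi(theta, a)
        for i in range(len(theta)):
            g[i] += r * glp[i] / n_samples
    return g
```

In a sequential setting the same idea applies along trajectories, and the variance of the estimate (and hence how much data is needed) grows with how slowly the controlled system mixes, which is the dependence on mixing times mentioned above.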