04-12
Learning, Adversaries, and Limited Feedback

A basic assumption often made in Machine Learning is that the data we observe are independent and identically distributed. But is this really necessary? Is it even realistic in typical scenarios? A number of recent results provide guarantees for arbitrary (or even adversarially-generated) data sequences. It turns out, for a wide class of problems, the learner's performance is no worse when the data is assumed to be arbitrary as opposed to i.i.d. This result involves a nice application of the Minimax theorem, which I'll briefly describe. I'll also delve into a more challenging problem: the "bandit" setting, in which the learner receives only very little feedback on each example. It was unknown for some time whether there exists an efficient algorithm that achieves the same guarantee for non-i.i.d. data for a general class of learning/decision problems. I'll present the first known such bandit algorithm, and I'll sketch the central ideas behind the technique, borrowing several tricks from the Interior Point optimization literature.

Date and Time

Monday April 12, 2010 4:30pm - 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

CS Department Colloquium Series

Speaker

Jake Abernethy, from UC Berkeley

Host

Robert Schapire

Contributions to and/or sponsorship of any event does not constitute departmental or institutional endorsement of the specific program, speakers or views presented.

CS Talks Mailing List

04-12 Learning, Adversaries, and Limited Feedback

04-12
Learning, Adversaries, and Limited Feedback