04-26
Learning and Recognizing Real-World Scenes and Objects

For both humans and machines, the ability to learn and recognize the semantically meaningful contents of the visual world is an essential capability. Among the many kinds of high-level visual tasks, we are particularly interested in natural scene categorization and generic object categorization.

In this talk, we will begin by showing a recent algorithm developed for learning and recognizing complex real-life images such as a busy city street, a beach, or a kitchen. To motivate this topic, I will present a series of recent human psychophysics studies on natural scene recognition. All these experiments converge on one prominent phenomenon of the human visual system: humans are extremely efficient and rapid in capturing the overall gist of natural images. Can we achieve such a feat in computer vision modeling? We propose here a generative Bayesian hierarchical model that learns to categorize natural images in a weakly supervised fashion. We represent an image by a collection of local regions, denoted as codewords obtained by unsupervised clustering. Each region is then represented as part of a "theme". In previous work, such themes were learnt from hand annotations by experts, while our method learns the theme distribution as well as the codeword distribution over the themes without such supervision. We report excellent categorization performance on a large set of complex scenes spanning 13 categories.
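As a rough illustration of the codeword representation mentioned above, the sketch below clusters toy patch descriptors into a small codebook and then describes one image by how many of its patches fall on each codeword. It covers only that representation step, not the Bayesian hierarchical theme model. The choice of k-means as the clustering method, the 2-D descriptors, and every function and variable name are assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter
from math import dist


def nearest(point, centers):
    """Index of the center closest to point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda k: dist(point, centers[k]))


def kmeans(points, centers, iterations=10):
    """Plain Lloyd's k-means starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            groups[nearest(p, centers)].append(p)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [
            tuple(sum(xs) / len(g) for xs in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers


def codeword_histogram(patches, codebook):
    """Count how many patches of one image are assigned to each codeword."""
    counts = Counter(nearest(p, codebook) for p in patches)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy 2-D "patch descriptors" pooled from training images (made-up values).
training_patches = [
    (0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
    (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
]
codebook = kmeans(training_patches, centers=[(0.0, 0.0), (1.0, 1.0)])

# Patches from one new image, summarized as counts over the two codewords.
image_patches = [(0.1, 0.1), (5.1, 5.0), (4.8, 5.2)]
histogram = codeword_histogram(image_patches, codebook)
```

In a setting like the one described, a per-image histogram of this kind would be the input over which themes and categories are modeled; real descriptors would be much higher-dimensional and the codebook far larger.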

If time permits, I will also briefly go over some work on generic object categorization in cluttered scenes, especially under constrained training conditions. We will highlight some progress made in one-shot learning of object categories. Learning visual models of object categories notoriously requires thousands of training examples; this is due to the diversity and richness of object appearance, which require models containing hundreds of parameters. We present a method for learning object categories from just a few images (1 to 5). It is based on incorporating "generic" knowledge which may be obtained from previously learnt models of unrelated categories. We operate in a variational Bayesian framework: object categories are represented by probabilistic models, and "prior" knowledge is represented as a probability density function on the parameters of these models. The "posterior" model for an object category is obtained by updating the prior in the light of one or more observations. Our ideas are demonstrated on objects belonging to 101 widely varied categories.
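The idea of updating a prior in the light of one or a few observations can be sketched with a much simpler model than the variational Bayesian framework named above: a conjugate Gaussian update for a single unknown mean with known noise variance. This is a stand-in chosen for brevity, not the method presented in the talk, and the function name, parameters, and numbers are all hypothetical.

```python
def update_gaussian_mean(prior_mean, prior_var, observations, noise_var):
    """Posterior (mean, variance) of an unknown mean under a Gaussian prior
    and Gaussian observation noise with known variance."""
    n = len(observations)
    # Precisions (inverse variances) add: prior precision plus one per observation.
    posterior_precision = 1.0 / prior_var + n / noise_var
    posterior_var = 1.0 / posterior_precision
    # The posterior mean is a precision-weighted blend of prior and data.
    posterior_mean = posterior_var * (
        prior_mean / prior_var + sum(observations) / noise_var
    )
    return posterior_mean, posterior_var


# Hypothetical prior, standing in for knowledge carried over from other categories.
prior_mean, prior_var = 0.0, 1.0

# One training example versus five identical ones (made-up values).
one_shot = update_gaussian_mean(prior_mean, prior_var, [2.0], noise_var=1.0)
five_shot = update_gaussian_mean(prior_mean, prior_var, [2.0] * 5, noise_var=1.0)
```

With a single observation the estimate sits halfway between the prior and the data; with five it moves closer to the data and its variance shrinks, which is the sense in which a useful prior lets a model be fit from very few examples.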

Date and Time

Wednesday, April 26, 2006, 4:00pm - 5:30pm

Location

Computer Science Small Auditorium (Room 105)

Event Type

CS Colloquium Series

Speaker

Fei-Fei Li, University of Illinois at Urbana-Champaign

Host

Szymon Rusinkiewicz

Contributions to and/or sponsorship of any event do not constitute departmental or institutional endorsement of the specific program, speakers or views presented.
