Discovering Meaning in the Visual World

Fei-Fei Li
Computer Science, Princeton University

When humans view images or videos of the visual world, our visual system extracts a wealth of information almost instantly. A large portion of this information is semantic: objects, scenes, purposeful motions, and events. Matching this ability remains a major challenge for today's computer vision algorithms.

In this talk, we begin by introducing a bag of words (BoW) model for natural scene categorization based on the Latent Dirichlet Allocation model. We show strong categorization results on a dataset of thirteen scene classes. In object recognition, we present a recent algorithm that aims to collect large object class datasets from the Internet via unsupervised incremental learning. This is achieved with a non-parametric latent topic model.

While bag of words models can characterize objects and images in a computationally efficient way, the classical model representations do not encode spatial information, which limits BoW models in more detailed image understanding. We show briefly that by introducing a hierarchical region-based representation to capture the spatial coherency of the image, we can achieve simultaneous segmentation and recognition of objects far better than traditional BoW models. We conclude the talk by showcasing a couple of recent and to-be-published works on human motion recognition in videos and human activity classification in static images.

1) S. Savarese and L. Fei-Fei. 3D generic object categorization,
2) J. Li and L. Fei-Fei. What, where and who? Classifying event by
3) L. Cao and L. Fei-Fei. Spatially coherent latent topic model for
4) L. Fei-Fei, R. Fergus and P. Perona. Learning generative visual
5) J. Li, G. Wang and L. Fei-Fei. OPTIMOL: automatic Object Picture
6) J.C. Niebles and L. Fei-Fei. A hierarchical model of shape and
7) L. Fei-Fei, A. Iyer, C. Koch and P. Perona. What do we perceive in
8) J.C. Niebles, H. Wang and L. Fei-Fei. Unsupervised learning of human
9) L. Fei-Fei and P. Perona. A Bayesian Hierarchical Model for Learning Natural Scene Categories. IEEE CVPR. 2005.
