What could computers learn about visual concepts, such as objects and actions, by watching large quantities of internet video, with minimal human supervision? In this talk, I will present several of our recent explorations in this area, including spatio-temporal object segmentation in video, weakly-supervised learning of human actions and unsupervised discovery of motion patterns in animal videos.
The talk includes research contributions from many colleagues at Google, interns and faculty collaborators: L. Del Pero, I. Essa, L. Fei-Fei, V. Ferrari, M. Grundmann, G. Hartmann, J. Hoffman, A. Karpathy, T. Leung, V. Kwatra, O. Madani, J. Rehg, S. Ricco, S. Shetty, K. Tang, G. Toderici, D. Tsai, J. Yagnik.
Rahul Sukthankar is a scientist at Google Research, an adjunct research professor in the Robotics Institute at Carnegie Mellon and courtesy faculty at UCF. He was previously a senior principal researcher at Intel Labs, a senior researcher at HP/Compaq and a research scientist at Just Research. Rahul received his Ph.D. in Robotics from Carnegie Mellon in 1997 and his B.S.E. in Computer Science from Princeton in 1991. His current research focuses on computer vision and machine learning, particularly in the areas of object recognition and video understanding.