3D-aware Representation Learning for Vision

Date and Time
Thursday, October 26, 2023 - 4:30pm to 5:30pm
Location
Computer Science Small Auditorium (Room 105)
Type
CS Department Colloquium Series
Speaker
Vincent Sitzmann
Host
Ellen Zhong

Given only a single picture, people can infer a mental representation that encodes rich information about the underlying 3D scene. We acquire this skill not through massive labeled datasets of 3D scenes, but through self-supervised observation and interaction. Building machines that can infer similarly rich neural scene representations is critical if they are to one day parallel people's ability to understand, navigate, and interact with their surroundings. In my talk, I will discuss how this motivates a 3D approach to self-supervised learning for vision. I will then present my research group's recent advances toward training self-supervised scene representation learning methods at scale, on uncurated video without pre-computed camera poses. I will further present recent advances in modeling uncertainty in 3D scenes, as well as progress on endowing neural scene representations with more semantic, high-level information.

Bio: Vincent is an Assistant Professor at MIT EECS, where he leads the Scene Representation Group. Previously, he completed his Ph.D. at Stanford University. He is interested in the self-supervised training of 3D-aware vision models: his goal is to train models that, given a single image or short video, can reconstruct a representation of the underlying scene that encodes information about materials, affordances, geometry, lighting, etc., a task that is simple for humans but currently impossible for AI.


To request accommodations for a disability, please contact Emily Lawrence at emilyl@cs.princeton.edu at least one week prior to the event.
This talk will be recorded and live-streamed via Zoom. Register for the webinar here.