Date | Topic/papers | Presenter (and link to slides)
Mon, Feb 5

Intro, and look at recognition datasets

Olga Russakovsky (logistics slides, lecture slides)

Module 1: Image segmentation, both strongly and weakly supervised

Wed, Feb 7

Large-scale object segmentation

Olga Russakovsky (ImageNet slides, segmentation propagation slides, graphcut slides)

Mon, Feb 12

Semantic segmentation

Rohan Doshi (slides, PDF)

Wed, Feb 14

Variations on segmentation supervision

Yannis Karakozis (slides, PDF) (some useful math notes: slides, PDF)

Mon, Feb 19

Instance segmentation

Berthy Feng + Riley Simmons-Edler (slides, PDF)

Wed, Feb 21

Combining semantic and instance segmentation

Stephanie Liu + Andrew Zhou (slides, PDF)

Mon, Feb 26

Intro to RNNs and cool annotation framework

William Hinthorn (slides, PDF)

For those wishing to brush up on LSTMs

Other cool papers we may not have a chance to cover

Module 2: Language + vision, including captioning, VQA, ...

Wed, Feb 28

Open-world annotation and recognition

Bharath Srivatsan (slides, PDF)

Mon, March 5

From recognition to captioning

Vikash + Qasim Nadeem (slides, PDF)

Wed, March 7

Captioning methods

Ryan McCaffrey + Alex Yue (slides, PDF)

Mon, March 12

No class -- midterms, ECCV deadline, CS PhD visit day

Wed, March 14

No class -- midterms, ECCV deadline, CS PhD visit day

Spring Break
Mon, March 26

Visual question answering

Prem Nair + Shayan Hassantabar (slides, PDF) (some paper notes: slides, PDF)

Wed, March 28

VQA method: simple baselines

Allen Wu (slides, PDF)

Tue, March 27th 12:30-1:30pm: Prof. Jia Deng (U of Michigan) colloquium on Visual Reasoning

Thu, March 29th 12:30-1:30pm: Justin Johnson (Stanford) colloquium on Language + Vision

Mon, April 2

Attention-based VQA methods

Nick Jiang (slides, PDF)

Wed, April 4

Neural module networks (presenter's choice)

Berthy Feng (slides, PDF)

Other cool papers we may not have a chance to cover

Module 3: Video understanding

Mon, April 9

Classic video datasets and algorithms



  • Action Recognition with Improved Trajectories by Wang and Schmid ICCV'13
  • Local handcrafted features are convolutional neural networks by Lan et al. ICLR'16

Divya Thuremella + Qasim Nadeem (slides, PDF)

Wed, April 11

Two classic deep learning frameworks for action classification

Haochen Li (slides, PDF)

April 11th in class: project title, selection of options 1-3, and (optional) partner name due

April 12th 12:30-1:30pm: Saurabh Gupta (Berkeley) colloquium on Vision+Robotics

April 13th: project milestone due

Mon, April 16

From classification to temporal localization with 3D convolutions

Austin Le (slides, PDF)

Wed, April 18

Two simple (relatively speaking) models for temporal action localization

Jiaqi Su (slides, PDF)

April 20th: feedback on milestones due

Mon, April 23

Action recognition in the spirit of object detection

Nicholas Turner + Sven Dorkenwald (slides, PDF)

Wed, April 25

Favorite video understanding paper. The presenters should take the lead on finalizing the topic. They can poll/discuss with others on Piazza, or just propose a topic themselves. Please do confirm with me before finalizing.

Suggestions: very recent work on a new architecture for action recognition

Or work on VQA or captioning in videos, some sample papers below

Julie LaChance + Vikash (slides, PDF)

Other cool video papers

Mon, April 30: Project Spotlights
Wed, May 2: Project Spotlights

Friday, May 11th: project report due

Tuesday, May 15th: report feedback due