COS598/PSY594

Vision: From Neuronal Mechanisms to Computational Models

Spring 2008

by Fei-Fei Li



 

Please scroll down for all the paper downloads.

 

| Lecture # | Date | Description | Readings | Presenter | Remarks |
|---|---|---|---|---|---|
| 1 | Mon, Feb 04 | Administrative matter; A case study: natural scene categorization | | | |
| 2 | Mon, Feb 11 | The primate visual pathway | Gross, 1992; Bear et al. Neuroscience (Chp 9 & 10), 2001 | Guest speaker: Prof. Charlie Gross | |
| 3 | Mon, Feb 18 | Object parts in IT; Faces in FFA | Tanaka, 1996; Kanwisher et al. 1997, 2007; Gauthier et al. 1999; Tarr & Gauthier 2000 | Dan; Peter | |
| 4 | Mon, Feb 25 | Object perception: object-centric vs. viewer-centric; Evidence from monkey physiology | Biederman et al. 1987, 2000; Tarr, 1995; Bulthoff et al. 1995; Logothetis 1995, 1996 | Phil; Andy | |
| 5 | Mon, Mar 3 | Visual Thinking with Graph Network (abstract below) | Shi & Malik, PAMI 2000; Belongie et al. PAMI 2002; Borenstein & Ullman 2002; Shashua & Ullman 1988 | Guest speaker: Prof. Jianbo Shi | |
| 6 | Mon, Mar 10 | Biologically inspired models for object recognition | Fukushima 1980; Poggio et al. 1999, 2007; LeCun et al. 1998 | Xiaobai; Doug | |
| | Mon, Mar 24 | No class | | | |
| 7 | Mon, Mar 31 | Computer vision models for object recognition | Ullman et al. 2002; Felzenszwalb et al. 2005 | Mike; Hanlin | |
| 8 | Mon, Apr 7 | Objects in scenes; Attention and objects | Treisman & Gelade 1980; Thorpe et al. 1996; Ahissar & Hochstein 2004 | Hanlin; Peter | FIT; RHT |
| 9 | Mon, Apr 14 | Modeling object-based attention | Itti et al. 1998; Lowe 2000, 2004 | Mike; Dan | |
| 10 | Mon, Apr 21 | Objects in context | Biederman 2004; Bar 2003, 2004 | Philip; Dan | |
| 11 | Mon, Apr 28 | Biological motion recognition | G. Rizzolatti & L. Craighero, 2004; R. Blake & M. Shiffrar, 2007; H. Jhuang et al. 2007 | Andy; Doug | |
| 12 | TBA | Course project presentation | | | |
| | Tue, May 13 | Dean's Day; Course project due | | | |

 

References

Lecture #2:
M. F. Bear, Neuroscience (2001) Chap. 9 & Chap. 10.
C. G. Gross, Representation of visual stimuli in inferior temporal cortex. Phil. Trans. Royal Soc. B, Lond., 1992, 335: 3-10.

 

Lecture #3:

K. Tanaka (1996). Inferotemporal cortex and object vision. Annual Review of Neuroscience. 19: 109-139.

N. Kanwisher, J. McDermott & M.M. Chun (1997). The Fusiform Face Area: A Module in Human Extrastriate Cortex Specialized for Face Perception. The Journal of Neuroscience, 17(11):4302–4311

McKone, E., Kanwisher, N., Duchaine, B. (2007). Can generic expertise explain special processing for faces? Trends in Cognitive Sciences. 11: 8-15.
I. Gauthier, M. Tarr, A. W. Anderson, P. Skudlarski, & J.C. Gore (1999). Activation of the middle fusiform 'face area' increases with expertise in recognizing novel objects. Nature Neuroscience. 2: 568-573.
M. Tarr & I. Gauthier (2000). FFA: a flexible fusiform area for subordinate-level visual processing automatized by expertise. Nature Neuroscience. 3: 764-769.

 

Lecture #4:
I. Biederman (1987). Recognition-by-Components: A theory of human image understanding. Psychological Review. 94(2): 115-147.
I. Biederman & M. Bar (2000). Differing views on views: response to Hayward and Tarr. Vision Research. 40: 3901-3905.
Tarr, M. J. (1995). Rotating objects to recognize them: a case study of the role of viewpoint dependency in the recognition of three-dimensional objects. Psychonomic Bulletin and Review, 2, 55–82.
H. Bulthoff, S. Edelman & M. Tarr (1995). How are three-dimensional objects represented in the brain? Cerebral Cortex. 5: 247-260.
N. Logothetis, J. Pauls & T. Poggio (1995). Shape representation in the inferior temporal cortex of monkeys. Current Biology. 5(5): 552-563.
N. Logothetis & D. Sheinberg (1996). Visual object recognition. Annual Review of Neuroscience. 19: 577-621.

 

Lecture #5:

Jianbo Shi and Jitendra Malik (2000). Normalized Cuts and Image Segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

Serge Belongie, Jitendra Malik, Jan Puzicha (2002). Shape matching and object recognition using shape contexts. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).

E Borenstein, S Ullman (2002). Class-specific, top-down segmentation. European Conference on Computer Vision.

A. Shashua and S. Ullman (1988). Structural Saliency: The Detection Of Globally Salient Structures using A Locally Connected Network. IEEE International Conference on Computer Vision, pages 321--327.

 

Visual Thinking with Graph Network
Jianbo Shi
Computer and Information Science
University of Pennsylvania

Many visual perception tasks are fundamentally NP-hard computational problems. Solving these problems robustly requires thinking through combinatorially many hypotheses. Despite this, our human visual system performs these tasks effortlessly. How is this done? I would like to make two points on this topic. First, formulating visual thinking as NP-hard computation tasks has an important advantage: visual routines can be analyzed precisely to identify their behaviors independently of their implementations. Second, I will show there is a class of graph optimization problems which can be implemented using a distributed network system with a physical (and biologically plausible) interpretation.

I will demonstrate this graph based approach for: 1) image segmentation using Normalized Cuts with explanations for illusory contours, visual pop out and attention; 2) salient contour grouping using Untangling Cycle; and 3) contour context selection for shape detection.

 

Lecture #6:
K. Fukushima (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biol. Cybernetics 36, 193-202.
Riesenhuber, M. & Poggio, T. (1999). Hierarchical Models of Object Recognition in Cortex. Nature Neuroscience 2: 1019-1025.
Serre, T., L. Wolf, S. Bileschi, M. Riesenhuber and T. Poggio. (2007). Object Recognition with Cortex-like Mechanisms, IEEE Transactions on Pattern Analysis and Machine Intelligence, 29, 3, 411-426.
Y. LeCun, L. Bottou, Y. Bengio and P. Haffner (1998). Gradient-Based Learning Applied to Document Recognition, Proceedings of the IEEE, 86(11):2278-2324.

 

Lecture #7:
Ullman, S., Vidal-Naquet, M. , and Sali, E. (2002) Visual features of intermediate complexity and their use in classification. Nature Neuroscience, 5(7), 1-6.
Felzenszwalb, P. and Huttenlocher, D. (2005). Pictorial Structures for Object Recognition. International Journal of Computer Vision, Vol. 61, No. 1, January 2005.

 

Lecture #8:
A. Treisman & G. Gelade (1980). A feature integration theory of attention. Cognitive Psychology. 12: 97-136.
S. Thorpe, D. Fize, C. Marlot (1996). Speed of processing in the human visual system. Nature. 381: 520-522.

M. Ahissar & S. Hochstein (2004). The reverse hierarchy theory of visual perceptual learning. Trends in Cognitive Sciences. 8(10).

 

Lecture #9:
Laurent Itti, Christof Koch, Ernst Niebur, (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 20,  no. 11,  pp. 1254-1259.
D. Lowe (2000). Towards a Computational Model for Object Recognition in IT Cortex. In Proceedings of the First IEEE international Workshop on Biologically Motivated Computer Vision (May 15 - 17, 2000). S. Lee, H. H. Bülthoff, and T. Poggio, Eds. Lecture Notes In Computer Science, vol. 1811. Springer-Verlag, London, 20-31.
(supplementary) D. Lowe (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision. 60(2): 91-110.

 

Lecture #10:
I. Biederman, R. Mezzanotte, and J. Rabinowitz (1982). Scene perception: detecting and judging objects undergoing relational violations. Cognitive Psychology, 14(2):143–77.
M. Bar, E. Aminoff (2003). Cortical Analysis of Visual Context. Neuron, Volume 38, Issue 2, Pages 347-358.
M. Bar (2004). Visual objects in context. Nature Reviews Neuroscience. 5: 617-629.

 

Lecture #11:

G. Rizzolatti & L. Craighero. (2004) The mirror-neuron system. Annu. Rev. Neurosci, 27:169-192.

R. Blake & M. Shiffrar. (2007). Perception of Human Motion. Annu. Rev. Psychol, 58:47-73.

H. Jhuang, T. Serre, L. Wolf, T. Poggio (2007). A biologically inspired system for action recognition. ICCV.

(supplementary) G. Rizzolatti, L. Fogassi, V. Gallese (2001). Neurophysiological mechanisms underlying the understanding and imitation of action. Nature Rev. Neurosci. 2:661-670.

(supplementary) L. Fogassi, P. F. Ferrari, B. Gesierich, S. Rozzi, F. Chersi, G. Rizzolatti (2005). Parietal Lobe: from action organization to intention understanding. Science. 308: 662-667.