COS 429 - Computer Vision

Fall 2005



Assignment 3

Due Tuesday, Nov. 22

Note that Thursday Nov. 24 and Friday Nov. 25 will not be counted as late days for this assignment.
Submitting on Saturday will count as two days late.


1. Colorspaces (30%)

(Note: the previous part (a) was a bad question; it has been removed from the assignment.)

For this question, you'll evaluate the use of RGB vs. HSV for skin color detection (a rough Matlab sketch follows the list):

  1. Look back at your database of faces from Assignment 2 and crop each image tightly around the face.
  2. Convert each cropped image to HSV using Matlab's rgb2hsv function. Assume that the image gamma is 2.2, so you need to convert the RGB values to the 0..1 range and raise them to this power before converting.
  3. Assemble all the pixels from all the images into 6 vectors, one for each of R, G, B, H, S, and V (hint: reshape).
  4. Plot the histogram of each of the 6 color channels (hint: use the hist command, but you probably want more bins than the default of 10).
  5. Comment on what you see and on the implications for skin color detection and tracking. Please include the histogram plots in your writeup.
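One possible way to implement these steps, assuming the cropped face images have been collected into a cell array named faces (that variable name, the bin count, and the plotting layout are just illustrative):

    % Accumulate R,G,B,H,S,V values over all cropped face images.
    allrgb = [];
    for i = 1:numel(faces)
        rgb = double(faces{i}) / 255;            % scale to the 0..1 range
        rgb = rgb .^ 2.2;                        % undo the assumed gamma of 2.2
        allrgb = [allrgb; reshape(rgb, [], 3)];  % one row per pixel
    end
    allhsv = rgb2hsv(allrgb);    % rgb2hsv also accepts an n-by-3 matrix

    labels = 'RGBHSV';
    vals = [allrgb allhsv];
    for c = 1:6
        subplot(2, 3, c);
        hist(vals(:, c), 50);    % 50 bins instead of the default 10
        title(labels(c));
    end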


2. Color histograms for tracking (20%)

The remainder of the assignment deals with implementing a blob tracker for objects in video sequences. The first cue you will investigate is color histograms. Implement the following (a rough Matlab sketch appears after the list):

  1. Read in the first image of the sequence (see datasets below), and allow the user to select the object(s) to be tracked (using the getrect function).
  2. Compile a histogram of whichever of R, G, B, H, S, and V you found to be most effective in section 1 (hint: if the answer wasn't H, go back and try again :-)
  3. Read in each subsequent frame from the sequence in turn. You can read frame number i using a call such as
    imread(sprintf('img%02d.jpg', i));
  4. For each pixel, look up its value in the target histogram to determine how well that pixel matches the foreground object's colors.

    (Note: simply using the histogram value looked up for each pixel's hue is good enough - you don't need the full Bayes's rule to evaluate a probability...)
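Here is a sketch of these steps, assuming hue turned out to be the channel of choice, the sequence has nframes frames named img01.jpg, img02.jpg, ..., and 32 bins suffice (all of these are illustrative assumptions; the gamma correction from section 1 is omitted for brevity):

    % Build a hue histogram of the selected object, then score each
    % pixel of later frames by looking up its hue in that histogram.
    nbins = 32;
    frame = imread('img01.jpg');
    imshow(frame);
    r = round(getrect);                        % user drags out the object
    obj = imcrop(frame, r);
    hsv = rgb2hsv(double(obj) / 255);
    bin = min(floor(hsv(:, :, 1) * nbins) + 1, nbins);
    h = histc(bin(:), 1:nbins);
    h = h / sum(h);                            % normalize to sum to 1

    for i = 2:nframes
        frame = imread(sprintf('img%02d.jpg', i));
        hsv = rgb2hsv(double(frame) / 255);
        bin = min(floor(hsv(:, :, 1) * nbins) + 1, nbins);
        match = h(bin);                        % per-pixel match score
        imagesc(match); axis image; drawnow;
    end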


3. Background subtraction (10%)

The next thing to implement is background subtraction: for each frame, find the absolute color difference between the frame and the background. For some of the datasets the background frame will be given, while for others you should determine the background from the video frames themselves by taking the median of all the frames.
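For example, assuming the frames have been stacked into a double H-by-W-by-3-by-N array named frames (the variable names are illustrative):

    % Median background over time, then per-frame absolute difference.
    bg = median(frames, 4);          % per-pixel median across all frames
    for i = 1:size(frames, 4)
        d = sum(abs(frames(:, :, :, i) - bg), 3);  % summed over R,G,B
        imagesc(d); axis image; colormap gray; drawnow;
    end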


4. Basic EM Tracking (30%)

Allow the user to specify any number of objects to be tracked in the first frame and, as shown in class, propagate their locations forward through the video using expectation maximization and an anisotropic Gaussian mixture model (see the lecture notes for details). You should be able to use the results of sections 2 or 3 above, or use a combination of both cues by multiplying the outputs together.
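As a sketch of what one EM iteration might look like for K blobs, treating the per-pixel cue from sections 2 and/or 3 as a weight image w (all names here are illustrative: mu{k} is a 2-by-1 mean, C{k} a 2-by-2 covariance, and pri(k) a mixing weight):

    % One EM iteration for K anisotropic Gaussian blobs over weight image w.
    [H, W] = size(w);
    [X, Y] = meshgrid(1:W, 1:H);
    xy = [X(:) Y(:)]';                        % 2-by-(H*W) pixel coordinates

    % E-step: responsibility of each blob for each pixel.
    p = zeros(K, H * W);
    for k = 1:K
        d = xy - repmat(mu{k}, 1, H * W);
        m = sum(d .* (C{k} \ d), 1);          % squared Mahalanobis distance
        p(k, :) = pri(k) * exp(-0.5 * m) / (2 * pi * sqrt(det(C{k})));
    end
    r = p ./ repmat(sum(p, 1) + eps, K, 1);   % normalize over the blobs
    r = r .* repmat(w(:)', K, 1);             % weight by the per-pixel cue

    % M-step: re-estimate each blob's mean, covariance, and weight.
    for k = 1:K
        s = sum(r(k, :)) + eps;
        mu{k} = xy * r(k, :)' / s;
        d = xy - repmat(mu{k}, 1, H * W);
        C{k} = (repmat(r(k, :), 2, 1) .* d) * d' / s;
        pri(k) = s / (sum(r(:)) + eps);
    end

Iterating these two steps a few times per frame, starting from the previous frame's estimates, is typically enough for the blobs to lock onto the objects.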


5. Dynamic model (10%)

Implement simple prediction by keeping track of each blob's velocity from frame to frame, updating it with a rolling average. Show that this lets you track objects more robustly than without prediction (if necessary, skip frames in the datasets, or construct synthetic datasets).
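A minimal sketch of the update for blob k, assuming mu{k} is its position fit in the current frame, prev{k} its position in the previous frame, and vel{k} its running velocity estimate (alpha and all names are illustrative):

    alpha = 0.8;                               % rolling-average weight
    vel{k} = alpha * vel{k} + (1 - alpha) * (mu{k} - prev{k});
    prev{k} = mu{k};
    mu{k} = mu{k} + vel{k};   % predicted position used to initialize
                              % EM in the next frame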


Writeup

Show sample outputs of each stage of the tracking, as well as at least one complete tracked sequence (where you have marked each object at each frame by overlaying a box on the video frame - make sure to use a different color for each object being tracked). If you want, you can draw nice ellipses using this .m file.
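For the boxes, something along these lines works, assuming boxes(k, :) holds [x y width height] for object k (names illustrative):

    colors = 'rgbcmy';
    imshow(frame); hold on;
    for k = 1:K
        rectangle('Position', boxes(k, :), ...
                  'EdgeColor', colors(mod(k - 1, 6) + 1), 'LineWidth', 2);
    end
    hold off;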


Data

Here are a few datasets. Stay tuned for more...


Last update 29-Dec-2010 12:00:22
smr at princeton edu