COS 429 - Computer Vision

Fall 2011



Assignment 3

Due Tuesday, Nov. 29


I. Questions (20%)

  1. Describe the probable effects of running a voxel coloring algorithm using pictures of a constant-color cube (against a different-colored background). How would these artifacts be affected by the camera positions?
  2. What happens when voxel coloring is run with the color similarity threshold set to be too large? Too small?
  3. Describe the effects of incorrect camera calibration on correspondence-based stereo and on voxel coloring. Which algorithm would you expect to produce worse results given a slightly miscalibrated camera?


II. Voxel Coloring (80%)

Implement a system for 3D reconstruction using voxel coloring, as described in Seitz and Dyer's 1997 paper.

As a review, the basic steps are as follows:

  1. Load a number of images, together with their camera positions and intrinsics.
  2. For each image, create a mask that is the same size as the image and initialize it to all zeros.
  3. For each voxel V in some volume of space:
    1. Project V into each of the camera images. Since the projection of a voxel (which is a cube) will, in general, be some sort of hexagon, it is simpler to approximate the set of pixels to be considered by projecting the eight corners of the voxel and taking all pixels within the image-space bounding box of these projections. Let P be this set of pixels.
    2. Remove from P all pixels that belong to the background (you may assume a pixel to be background if its red, green, and blue components are all less than thresh_bg).
    3. Remove from P all pixels for which the value of the mask is 1.
    4. Find the standard deviation of the colors of the pixels remaining in P.
    5. If the standard deviation is smaller than thresh_color:
      • Record the voxel being considered as "visible" and assign it the average color in P.
      • For each pixel in P, set the corresponding bit in the mask to 1.

Cube data set

Run the voxel coloring algorithm on these (computer-generated) images of a cube. For each image, there is also a ".xf" file containing a 4x4 matrix M representing the camera translation and rotation. To project a 3D point into a camera image, you first multiply its homogeneous coordinates (x,y,z,1) by the corresponding matrix:

        [x']     [x]
        [y'] = M [y]
        [z']     [z]
        [1 ]     [1]
then obtain camera (u,v) coordinates as follows:
        u = 256 - 443.405 * x' / z'
        v = 256 + 443.405 * y' / z'
All this assumes that the pixel origin is at the upper-left corner.
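
As a rough sketch, the projection above might look like this. The function names are hypothetical, and parse_xf assumes the ".xf" file is plain text holding four rows of four whitespace-separated numbers, which the description above suggests but does not state.

```python
FOCAL = 443.405  # scale factor from the formulas above
CENTER = 256.0   # image center offset from the formulas above

def parse_xf(text):
    """Parse a 4x4 matrix from text (assumed layout: 16 whitespace-separated
    numbers in row-major order)."""
    values = [float(tok) for tok in text.split()]
    if len(values) != 16:
        raise ValueError("expected 16 numbers for a 4x4 matrix")
    return [values[4 * r:4 * r + 4] for r in range(4)]

def project(M, point):
    """Project a 3D point (x, y, z) to image coordinates (u, v).

    M is a 4x4 matrix stored as a list of four rows. Only its first three
    rows are needed to compute x', y', and z'.
    """
    x, y, z = point
    p = (x, y, z, 1.0)
    xc, yc, zc = (sum(M[r][c] * p[c] for c in range(4)) for r in range(3))
    # Fails with ZeroDivisionError if z' is exactly zero.
    u = CENTER - FOCAL * xc / zc
    v = CENTER + FOCAL * yc / zc
    return u, v
```

The returned (u, v) values are not rounded, so rounding can be deferred until the image-space bounding box of a voxel's eight corners is computed.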

For this data set, use the following settings:

  1. Make your voxel grid extend from -0.1 to 1.1 in x, y, and z.
  2. Perform the voxel sweep in one pass, proceeding in the negative x direction. That is, you should first consider the voxels with x = 1.1, then consider planes of voxels with smaller and smaller x.
  3. Start with a small number of voxels (e.g. 10x10x10) and increase the resolution once your code is working.
  4. Use 5 for thresh_bg, and experiment with different values of thresh_color.
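
One way to generate the grid and the sweep order described above is sketched below. The function names are hypothetical, and sampling voxels at their centers is a design choice made here, not something the assignment prescribes.

```python
def voxel_centers(n, lo=-0.1, hi=1.1):
    """Yield the centers of an n x n x n grid covering [lo, hi] in x, y, z.

    Planes are visited in order of decreasing x, so the voxels with the
    largest x are considered first.
    """
    size = (hi - lo) / n
    coords = [lo + (k + 0.5) * size for k in range(n)]
    for x in reversed(coords):  # sweep in the negative x direction
        for y in coords:
            for z in coords:
                yield (x, y, z)

def voxel_corners(center, size):
    """Return the eight corners of a cubic voxel with the given edge length."""
    h = size / 2.0
    cx, cy, cz = center
    return [(cx + dx, cy + dy, cz + dz)
            for dx in (-h, h) for dy in (-h, h) for dz in (-h, h)]
```

For example, list(voxel_centers(10)) yields 1000 centers, and raising n increases the resolution without changing the rest of the loop.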

More data sets

... will be available shortly.

Visualizing your results

vxlview is a really simple viewer for visualizing your results. It is available here: vxlview.zip. Versions are available for Windows, Linux, and Mac.

The program takes the name of a file containing a list of voxels on the command line. Under some versions of Windows, you can also drag 'n drop a file of voxel data onto the application and have it run. Two sample files are included: simple.vxl has just a few voxels, so that you can understand the file format, while bunny.vxl is more complex.

When the program is running, you can

The format of the ".vxl" file is as follows:

Language

You may find that this assignment runs slowly in Matlab, especially for large numbers of voxels. If you wish, you can implement the code in another language of your choice (such as C or C++). To read in the images, we suggest converting them to PPM (djpeg cube1.jpg > cube1.ppm), which is a very simple format to read (see "man ppm" for details).
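
To illustrate how little is needed to read a PPM file, here is a minimal sketch of a reader, written in Python for brevity. The same logic carries over to C or C++. The function name is hypothetical, and the sketch handles only binary ("P6") files with a maximum value below 256.

```python
def parse_ppm(data):
    """Parse the bytes of a binary (P6) PPM file.

    Returns (width, height, rows), where rows[y][x] is an (r, g, b) tuple.
    """
    pos = 0
    tokens = []
    # The header holds four whitespace-separated tokens: magic number,
    # width, height, and maximum value. Lines starting with '#' are comments.
    while len(tokens) < 4:
        while data[pos:pos + 1].isspace():
            pos += 1
        if data[pos:pos + 1] == b"#":
            while data[pos:pos + 1] not in (b"\n", b""):
                pos += 1
            continue
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte separates header and pixel data

    if tokens[0] != b"P6":
        raise ValueError("not a binary PPM (P6) file")
    width, height, maxval = (int(t) for t in tokens[1:])
    if maxval > 255:
        raise ValueError("only one byte per sample is supported here")

    body = data[pos:pos + 3 * width * height]
    if len(body) != 3 * width * height:
        raise ValueError("file is truncated")
    rows = [[tuple(body[3 * (y * width + x):3 * (y * width + x) + 3])
             for x in range(width)]
            for y in range(height)]
    return width, height, rows
```

A file would be loaded with something like parse_ppm(open("cube1.ppm", "rb").read()). The rows come out top to bottom, which matches a pixel origin at the upper-left corner.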


Submitting

This assignment is due Tuesday, Nov. 29, 2011 at 11:59 PM. Please see the general notes on submitting your assignments, as well as the late policy and the collaboration policy.

Please submit a single .zip file containing:

The Dropbox link to submit the assignment is here.


Last update 20-Nov-2011 11:01:46
smr at princeton edu