COS 429: Computer Vision, Fall 2013
Assignment 1: Line Detection

Part 1: Thought Exercise

  1. In one paragraph, please discuss the relative advantages of using the Hough transform versus RANSAC algorithms when detecting lines in images. Under what conditions is one preferable to the other and vice-versa? How would this change for ellipses, triangles, or other more complex primitives?
     
  2. The finite size of an image implies that, on average, the length in pixels of the visible portions of lines close to the image center C is greater than that of lines distant from C. Please provide a mathematical formula for how the Hough transform is biased by this effect and explain in a sentence or two how you could counter this bias when computing the Hough transform.

 

Part 2: Programming Exercise

Your goal for this part of the assignment is to write a MATLAB program that predicts the locations of long, straight lines in an input image. This is just like the warmup exercise, except that now you are detecting lines rather than eyes.

As in the previous assignment, "runme()" should write gray-level images in the output directory where the brightness of each pixel is proportional to the predicted probability of finding a straight line at that pixel in the corresponding image in the input directory. For example, running runme() for the input shown on the left below might produce the output shown in the middle. The rightmost image shows an overlay of the two (where the output has been added to the red channel of the input). More examples can be found here.

Input Output Overlay

The general strategy we will use is to predict whether pixels appear to be on edges locally AND on long straight lines globally and then combine the two to produce an overall line detection prediction. This strategy can be implemented with the following four steps:

Step A: Gradient estimation
Implement a function [xgradient_image, ygradient_image] = computegradients(input_image) to find the x and y components of the gradient Fx and Fy of the image smoothed with a Gaussian (e.g., σ = 1). In your implementation, you should use the facts that a 2D Gaussian is separable and that finding the derivative of a function convolved with a Gaussian is the same as convolving with the derivative of a Gaussian to improve the efficieny (as described on slide 65 of the Image Analysis lecture).

Step B: Edge detection
Implement a function canny_image = detectedges( xgradient_image, ygradient_image) to detect edges using the Canny algorithm as follows:

  1. Gradient analysis:
    1. Compute the edge strength F (the magnitude of the gradient)
    2. Compute the edge orientation D = arctan(Fy/Fx) at each pixel.
     
  2. Nonmaximum suppression:
    Create a "thinned edge image" I(x,y) as follows:
    1. For each pixel, find the direction D* in (0, 45, 90, 135) that is closest to the orientation D at that pixel. (Or the equivalent in radians...)
    2. If the edge strength F(x,y) is smaller than at least one of its neighbors along D*, set I(x,y) = 0, else set I(x,y) = F(x,y).
     
  3. Hysteresis thresholding:
    Repeatedly do the following:
    1. Locate the next unvisited pixel (x,y) such that I(x,y) > T_h (e.g., T_h = 0.1).
    2. Starting from (x,y), follow the chain of connected local maxima in both directions as long as I(x,y) > T_l (e.g., T_l = 0.03), where two pixels are connected if they are adjacent in horizontal, vertical, or diagonal directions.
    3. Mark each pixel as it is visited.
     
  4. Output Edge image:
    Return a gray-level Edge image, E, where all pixels found with hysteresis thresholding have values equal to their gradient magnitudes and all other pixels are zero.

Step C: Hough Voting
Implement a function hough_transform = votelines( canny_image, xgradient_image, ygradient_image) that accumulates votes for lines using the Hough algorithm as follows:

  1. Initialization:
    Create a Hough image, H, where the rows represent distances and the columns represent angles -- i.e., each pixel (d,a) in this image represents a line in the input image with distance "d" from the image center and angle "a" with respect to the x-axis. You should choose a dimensionality of the rows large enough to accomodate the largest possible distance from the center (i.e., the diagonal radius of the input image in pixels). The dimensionality of the columns should be large enough to accomodate all possible angles (0 - 2*PI) with at most one degree increments. All values should be initialized to zero.

  2.  
  3. Voting:
    Accumulate votes for lines in the Hough image. For every pixel p = (x,y) of the input image: first compute (d,a) = L(p), where L(p) is the line through p perperpendicular to the image gradient (Fx,Fy), d is the shortest distance between L(p) and the center of the image (cx,cy), and a is the angle between L(p) and the x-axis (note that gradients in opposite directions at the same point should produce the same (d,a)). Then, add E(p) to the corresponding position (d,a) in the Hough image, using bilinear weights to augment the four closest pixels, since (d,a) probably will not fall on integer coordinates. The result will be a Hough image where the brightest pixels represent lines where many votes agree.

Step D: Combination
Implement a function output_image = predictlines( canny_image, hough_image, xgradient_image, ygradient_image, alpha, beta, gamma) that combines the Canny Edge and Hough images into a final output image containing larger gray values where lines are predicted with more confidence. There are many ways to do this, but you should at least implement the following process:

     for each pixel p in the input image
          output_image(p) = max  ( E(p)α · H(L)β · (N(p)·N(L))γ )
                             L
     end
where L is a line containing p, E(p) is the magnitude of the Edge image for point p; H(L) is the magnitude of the Hough image for a line L (again, use interpolation since L may not be on integer coordinates), (N(p)·N(L)) is a measure of how well the orientation of L agrees with the smoothed gradient at p, and the argmax provides the maximum value of the product of the three over all lines L containing p. This procedure predicts that long, straight lines appear where edges are detected locally, lines are detected globally, and the two agree in orientation, where α, β, and γ are exponents used to weight the contributions of the three factors (we will use α=1, β=1, and γ=2 as defaults for this assignment).

 

Part 3: Experimentation

The previous part of the assignment describes a complete pipeline for line detection, which should work for simple cases. However, it has many limitations, and there are many design choices that could have been different in each of the four steps. For example, in Step C, each point could have voted for all lines that pass through it, possibly weighted by how well they align with the local gradient. In Step D, the final prediction P(p) for each pixel p could have been a different function that is more robust to noisy local gradients. Many other algorithmic design choices are possible.

For this part of the assignment, please make a hypothesis about a particular modification or enhancement to at least one of the basic four steps, implement it, and then analyze how and why your modification improves or hurts your results. Your writeup should include a description of your modification, an explanation of why you chose it, images comparing results with and without it, and quantitative comparisons of results for the provided input images (see next part).

 

Part 4: Results and Analysis

You should execute your program using runme.m on a variety of input test images to investigate how well it works under different conditions. To get full credit for this part, you must show outputs of your program for all of the images in the "input" subdirectory of cos429_assignment1.zip, plus at least three images taken with your own camera, and at least three images downloaded from the web.

As in Assignment 0, please think about how to investigate how well your algorithm works. For example, you might consider a set of images with different levels of noise, radial distortion, dynamic range, etc. You could also show images demonstrating the impact of parameter settings in your program (e.g., the sigma used for a blur, the low and high threholds used for finding connected components in Canny, the resolution of the Hough image, etc.).

Please include the results of your program on the test data set and your own input images in a third section of your writeup titled "Results and Analysis." For each input image (or group of images), you should include a short discussion of how your algorithm works on that image. Overall, the goal of this section is to answer questions like: When does your program succeed? When does it fail? What are the key attributes of input images that affect its success? What parameter settings affect its success? You do not have to answer all of these questions, but you should discuss at least one characteristic of the input images and/or parameters that affects the quality of your results, demonstrated by results for a set of input images spanning cases where the algorithm does and doesn't work well.

Additionally, please analyze the results of your algorithm quantitatively on the provided input images using the binary ground truth results provided in the groundtruth subdirectory of the zip file. The evaluation metric will be computed using the 2D correlation coefficient between your predicted image and the ground truth image (see corr2 in MATLAB). This metric will produce a value S=1.0 if your prediction is perfect, S=-1.0 if it misses all ground truth pixels, and S in [-1.0,1.0] otherwise.

 

Extra credit (optional)

Use the same four steps to detect circles in images. Provide separate MATLAB functions for this feature and add a section titled "Extra Credit" to the writeup with examples.

 

Getting Started

You should start from cos429_assignment1.zip, which provides the directory structure to follow in your submission, a template for your writeup (writeup/writeup.html), a set of test input images (input/test*.jpg), corresponding ground truth images (groundtruth/test*.jpg), simple templates for the MATLAB functions you must implement (code/computegradients.m, code/detectedges.m, code/votelines.m, code/predictlines.m) and a script to ease creating results for your writeup (code/runme.m).

You should edit the MATLAB files (and possibly create new ones) to implement the algorithms, download new test images into the input subdirectory, execute runme.m to produce your results, and complete your writeup.

 

Submitting Your Solution

Please submit your solution via the dropbox link here.

Your submission should include a single file named "assignment1.zip" with the following structure:

Please follow the general policies for submitting assignments, including the late policy and collaboration policy.