COS 429 - Computer Vision

Fall 2019



Assignment 1: Image processing and feature detection

Due Thursday, Oct. 3

Changelog and clarifications


1. Pinhole camera model (5 pts)

Consider a pinhole camera with focal length $f$. The camera faces a whiteboard, with its image plane parallel to the board, and the pinhole is at distance $L$ from the board. A square of area $S$ is drawn on the whiteboard. What is the area of the square in the image? Justify your answer.
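As a reminder of the projection model, here is a minimal sketch of perspective projection through a pinhole. The function name `project_pinhole` is hypothetical, and the sketch assumes a virtual image plane in front of the pinhole (so the image is not flipped) and points given in camera coordinates with the optical axis along Z.

```python
import numpy as np


def project_pinhole(points, focal_length):
    """Project 3D points (X, Y, Z) in camera coordinates onto the image plane.

    Assumption: a virtual image plane at distance focal_length in front of
    the pinhole, so x = f * X / Z and y = f * Y / Z (no sign flip).
    """
    points = np.asarray(points, dtype=float)
    depth = points[..., 2:3]  # keep the last axis so broadcasting works
    return focal_length * points[..., :2] / depth


# A single illustrative point, unrelated to the whiteboard in the question.
example_point = np.array([2.0, 4.0, 10.0])
image_point = project_pinhole(example_point, focal_length=5.0)
print(image_point)
```

Projecting the corners of a shape this way is one way to sanity-check a hand-derived answer, but the written justification is still required.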


2. Linear filters (20 pts)

You are expected to do this question by hand. Show all steps for full credit.

In class we introduced 2D discrete space convolution. Consider an input image $I[i,j]$ with an $m \times n$ filter $F[i,j]$. The 2D convolution $I * F$ is defined as: \[ (I*F)[i,j] = \sum_{k,l}I[i-k,j-l]F[k,l] \] Note that the above operation is run for each pixel \((i,j)\) of the result.
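The definition above can be turned into a direct, unoptimized sketch that is handy for checking hand computations. The name `conv2d_same` is hypothetical. The sketch assumes an odd-sized filter whose origin is its center entry, zero-padding outside the image, and a 'same'-sized output. The example filter is illustrative and is not one of the filters in this question.

```python
import numpy as np


def conv2d_same(image, kernel):
    """Zero-padded 2D convolution with output the same shape as image.

    Assumption: kernel has odd height and width, and its center entry
    corresponds to index (0, 0) in the convolution sum.
    """
    image = np.asarray(image, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    m, n = kernel.shape
    pad_rows, pad_cols = m // 2, n // 2
    padded = np.pad(image, ((pad_rows, pad_rows), (pad_cols, pad_cols)))
    # Convolution (unlike correlation) flips the kernel in both directions.
    flipped = kernel[::-1, ::-1]
    out = np.zeros_like(image)
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            window = padded[i:i + m, j:j + n]
            out[i, j] = np.sum(window * flipped)
    return out


# A kernel with a single 1 to the right of center shifts the image right.
small_image = np.array([[1.0, 2.0], [3.0, 4.0]])
shift_kernel = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
shifted = conv2d_same(small_image, shift_kernel)
print(shifted)
```

The shift example makes the kernel flip visible: without the flip (that is, with correlation), the same kernel would shift the image left instead.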

  1. Convolve the 2x3 matrix \(I = [-1, 0, 2; 1, -2, 1] \) with the 3x3 matrix \(F = [-1, -1, -1; 1, 1, 1; 0, 0, 0] \). Use zero-padding when necessary. The output shape should be 'same' (same as the 2x3 matrix \(I\)).

  2. Note that \(F\) is separable, i.e., it can be written as a product of two 1D filters: \(F_1 = [-1; 1; 0]\) and \(F_2 = [1, 1, 1]\). Compute \((I*F_1)\) and \((I * F_1) * F_2\), i.e., first perform 1D convolution on each column, followed by another 1D convolution on each row.

  3. Prove that for any separable filter \(F = F_1F_2\): \[I*F = (I*F_1)*F_2\] Hint: expand the 2D convolution equation directly.
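A numerical check is not a proof, but it can catch algebra mistakes. The sketch below compares a single 2D convolution against two 1D passes on random integer data. The name `conv2d_full` is hypothetical, and the sketch uses the 'full' output size with the filter origin at its top-left entry so that boundary handling does not affect the comparison.

```python
import numpy as np


def conv2d_full(image, kernel):
    """Full 2D convolution by shift-and-add (filter origin at index [0, 0])."""
    height, width = image.shape
    m, n = kernel.shape
    out = np.zeros((height + m - 1, width + n - 1))
    for k in range(m):
        for l in range(n):
            # Each kernel entry adds a shifted, scaled copy of the image.
            out[k:k + height, l:l + width] += kernel[k, l] * image
    return out


rng = np.random.default_rng(0)
image = rng.integers(-3, 4, size=(4, 5)).astype(float)
column_filter = rng.integers(-2, 3, size=(3, 1)).astype(float)
row_filter = rng.integers(-2, 3, size=(1, 3)).astype(float)
separable_filter = column_filter @ row_filter  # outer product, 3x3

direct = conv2d_full(image, separable_filter)
two_pass = conv2d_full(conv2d_full(image, column_filter), row_filter)
print(np.allclose(direct, two_pass))
```

The two-pass version needs roughly m + n multiplications per output pixel instead of m * n, which is the practical reason separable filters matter.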


3. Difference-of-Gaussian (DoG) detector (25 pts)

  1. Recall that a 1D Gaussian is: \[g_{\sigma}(x) = \frac{1}{\sqrt{2\pi}\sigma}\exp \left (-\frac{x^2}{2\sigma^2} \right ) \] Calculate the 2nd derivative of the 1D Gaussian with respect to \(x\) and plot it in Python (use \(\sigma=1\)). Submit all steps of your derivation and the generated plot in the PDF file you turn in.
    Hint: Create a dense grid of \(x\) values with np.linspace and evaluate the function on that grid. You can then use Matplotlib to plot the result. If you are unfamiliar with Matplotlib, get started here.

  2. Use Python to plot the difference of Gaussians in 1D, given by \[D(x,\sigma,k) = \frac{g_{k\sigma}(x)-g_{\sigma}(x)}{k\sigma-\sigma}\] for \(k = 1.2, 1.4, 1.6, 1.8, 2.0\), assuming \(\sigma=1\). State which value of \(k\) gives the best approximation to the 2nd derivative of the Gaussian with respect to \(x\). Submit both your answer (with the generated plots) and your code in the PDF file that you turn in. The simplest way is to do this part in Jupyter and export the notebook as a PDF (File, Download As, PDF). Alternatively, copying and pasting the code as text into the PDF is fine as well.
  3. The 2D equivalents of the plots above are rotationally symmetric. To what type of image structure will the difference of Gaussians respond maximally?
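The functions in parts 1 and 2 can be set up along the following lines. This is a sketch only: all function names are hypothetical, the second derivative is approximated numerically with np.gradient (useful for checking a hand derivation, not a substitute for it), and Matplotlib is imported inside the plotting helper so that the numerical part runs without it.

```python
import numpy as np


def gaussian(x, sigma):
    """1D Gaussian g_sigma(x) as defined in part 1."""
    return np.exp(-x**2 / (2.0 * sigma**2)) / (np.sqrt(2.0 * np.pi) * sigma)


def numerical_second_derivative(values, x):
    """Approximate d^2/dx^2 by applying central differences twice."""
    first = np.gradient(values, x)
    return np.gradient(first, x)


def difference_of_gaussians(x, sigma, k):
    """D(x, sigma, k) as defined in part 2."""
    return (gaussian(x, k * sigma) - gaussian(x, sigma)) / (k * sigma - sigma)


def plot_curves(x, curves, title):
    """Plot a dict of {label: y-values} on one set of axes."""
    import matplotlib.pyplot as plt  # imported here so the math runs without it
    fig, ax = plt.subplots()
    for label, y in curves.items():
        ax.plot(x, y, label=label)
    ax.set_xlabel("x")
    ax.set_title(title)
    ax.legend()
    return fig


x = np.linspace(-5.0, 5.0, 2001)
reference = numerical_second_derivative(gaussian(x, 1.0), x)
```

For example, `plot_curves(x, {"k=1.2": difference_of_gaussians(x, 1.0, 1.2), "numerical g''": reference}, "DoG vs. 2nd derivative")` would overlay one difference-of-Gaussians curve on the numerical reference.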


4. Canny edge detector (50 pts)

Background: See the lecture slides, Sections 4.1-4.3 of Trucco & Verri, and Chapter 4 of your textbook. Make sure you've completed at least the first and second parts of Assignment 0 before beginning this question.
Hint #1: Take advantage of the fact that you're working with visual data, and visualize every step of your work. You can either do this with cv2, like in Assignment 0, or with Matplotlib using imshow.
Hint #2: Start by working with small images -- for example, by cropping out a 50x50-pixel part of a larger image.
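Both hints can be followed with a few lines of NumPy and Matplotlib. The sketch below uses hypothetical helper names and a synthetic step-edge image instead of a real photograph. Matplotlib is imported inside the display helper so that the cropping code runs on its own.

```python
import numpy as np


def crop_patch(image, top, left, size=50):
    """Return a size x size patch whose top-left corner is (top, left)."""
    return image[top:top + size, left:left + size]


def show_image(image, title=""):
    """Display a grayscale image with Matplotlib."""
    import matplotlib.pyplot as plt  # imported here so cropping runs without it
    plt.imshow(image, cmap="gray")
    plt.title(title)
    plt.axis("off")
    plt.show()


# Synthetic test image: dark on the left, bright on the right.
synthetic = np.zeros((200, 300))
synthetic[:, 150:] = 1.0

# Crop a 50x50 patch that straddles the vertical edge.
patch = crop_patch(synthetic, top=75, left=125)
```

Calling `show_image(patch, "step edge")` after each processing stage makes it easy to see whether smoothing, gradients, and thresholding behave as expected on a case where the correct edge location is known.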

  1. Implement the Canny edge detection algorithm, as described in class. The framework code you should start from is here. All functions should be implemented in a1p4_functions.py, while runme.py can be used to run the actual algorithm. This consists of several phases:
  2. Test your algorithm on images of your choosing, experimenting with different values of the parameters sigma (the width of the Gaussian used for smoothing), \(T_h\) (the "high" threshold), and \(T_l\) (the "low" threshold). Also run your algorithm on the following images:


Submitting

This assignment is due Thursday, October 3, 2019 at 11:59 PM. Please see the general notes on submitting your assignments, as well as the late policy and the collaboration policy.

Submissions this year are done through Gradescope. The code to register for our class and the submission instructions can be found in the "general notes on submitting..." link above. This assignment has two submissions:

  1. Assignment 1 Written: Submit a single PDF containing all written portions of the assignment. This includes all work for problems 1-3 (including derivations, plots, and code for problem 3), as well as experiments and findings for problem 4. This portion is worth 60 points (50 points from problems 1-3, 10 points for the written portion of problem 4).
  2. Assignment 1 Problem 4 Code: Submit your Canny edge detection implementation. This should be your version of a1p4_functions.py, with all functions filled in. Do not change the filename, the function names, or the function inputs and outputs. This portion is worth 40 points.
The submission link will be made available after Monday, 9/23.

You are expected to use good programming style, including meaningful variable names, a comment or three describing what the code is doing, etc. Also, all images created by your code must be saved with the "cv2.imwrite" function - do not submit screen captures of the image window.

Credit to Fei-Fei Li and Juan Carlos Niebles for several problems.


Last update 25-Sep-2019 20:40:00