COS 429 - Computer Vision

Fall 2019

Assignment 2: Face Detection and Model Fitting

Due Thursday, Oct. 17

Part IV. Multi-Scale Face Detection

For the grand finale of implementing a face detector, you will run your detector at multiple scales. Because the classifier was trained on 36x36 windows, image content at any other scale must be resized to 36x36 before it is classified. There are two ways to do this:

1. Repeatedly rescale the image, but keep a fixed window size, and run your single-scale detector on the ever-shrinking images.
2. Extract windows of ever-larger sizes, resizing each window before passing it in to the HoG computation.

Of these options, #2 is probably simpler to implement, while #1 will be faster. The choice is yours.
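With option #1, a window found in a resized image has to be mapped back to the original image before it is drawn or compared with anything. A minimal sketch of that mapping, assuming each window is identified by its top-left corner and that `scale` is the factor by which the image was resized (the function name and this convention are illustrative, not part of the starter code):

```python
def window_in_original(x, y, scale, window=36):
    """Map a window at top-left (x, y) in an image resized by `scale`
    back to (x_min, y_min, x_max, y_max) in the original image.

    Assumption: scale < 1 means the image was shrunk, so a fixed
    36x36 window covers window / scale pixels of the original.
    """
    x_min = x / scale
    y_min = y / scale
    size = window / scale
    return (x_min, y_min, x_min + size, y_min + size)


# A 36x36 window at (18, 9) in a half-size image covers a 72x72 region.
print(window_in_original(18, 9, 0.5))
```

With option #2 no such mapping is needed, because windows are extracted directly in original-image coordinates.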

Your output should look something like this:

Notice that, as before, no nonmaximum suppression has been performed, over either position or scale.

Do this:

Download the starter code and dataset (about 18 MB) for this part. It contains the following files:
- find_faces.py - you will modify this to perform the multi-scale detection.
- face_data/testing_scenes/*.jpg - 130 images from the CMU/MIT frontal face image database. Not all of these have faces, and some are rather difficult to detect: this is a dataset deliberately intended to stress-test face detectors!
- face_data/testing_scenes_bboxes.txt - Ground-truth locations of the faces in the testing_scenes. Each line in this file is of the form
  filename x_min y_min x_max y_max
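The ground-truth file can be read into a per-image dictionary of boxes. The sketch below assumes the five fields are separated by whitespace and that filenames contain no spaces; `parse_bboxes` is a hypothetical helper, not something the starter code provides.

```python
from collections import defaultdict


def parse_bboxes(lines):
    """Group ground-truth boxes by filename.

    Each input line is expected to look like:
        filename x_min y_min x_max y_max
    Lines that do not have exactly five fields are skipped.
    """
    boxes = defaultdict(list)
    for line in lines:
        parts = line.split()
        if len(parts) != 5:
            continue
        filename = parts[0]
        x_min, y_min, x_max, y_max = (float(v) for v in parts[1:])
        boxes[filename].append((x_min, y_min, x_max, y_max))
    return dict(boxes)


# Usage with the real file (path taken from the file list above):
#     with open("face_data/testing_scenes_bboxes.txt") as f:
#         ground_truth = parse_bboxes(f)
sample = ["a.jpg 10 20 46 56", "a.jpg 100 80 150 130", "", "b.jpg 5 5 41 41"]
print(parse_bboxes(sample))
```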

Do this and turn in:

Implement find_faces.py. No starter code is provided for you this time, so you will have to adapt from the code you wrote for part III. You may choose either of the approaches suggested above:
- If you implement option 1, shrink the image by 20% on each iteration and stop when the smallest dimension is below 36.
- If you implement option 2, start with 36x36 windows and grow them by 20% on each iteration, stopping when the window is larger than the smallest dimension of the image. Hint: don't forget to also increase the stride on each iteration, so that the stride is always the same fraction of the window size.
Submit your find_faces.py file.
Save and submit your face detection results for 2 images showing particularly good and 2 images showing particularly bad performance.
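The two stopping rules above can be sketched as schedules computed before any detection is run. This is one reading of the instructions, not the reference solution: "shrink by 20%" is taken to mean multiplying the scale by 0.8, "grow by 20%" to mean multiplying the window size by 1.2, and `stride_fraction` is a hypothetical parameter standing in for whatever stride-to-window ratio you used in part III.

```python
def image_scales(height, width, window=36, shrink=0.8):
    """Option 1: list (scale, new_height, new_width) for each pass,
    stopping once the smallest dimension drops below the window size."""
    schedule = []
    scale = 1.0
    while True:
        h = int(round(height * scale))
        w = int(round(width * scale))
        if min(h, w) < window:
            break
        schedule.append((scale, h, w))
        scale *= shrink
    return schedule


def window_schedule(height, width, window=36, grow=1.2, stride_fraction=1 / 6):
    """Option 2: list (window_size, stride) for each pass, stopping once
    the window is larger than the smallest image dimension.

    stride_fraction is an assumed value; the stride is kept at the same
    fraction of the window size on every pass.
    """
    schedule = []
    size = float(window)
    while size <= min(height, width):
        w = int(round(size))
        stride = max(1, int(round(w * stride_fraction)))
        schedule.append((w, stride))
        size *= grow
    return schedule


# For a 100x50 image, both options make two passes.
print(image_scales(100, 50))
print(window_schedule(100, 50))
```

In option 1 each scale would be passed to your image-resizing routine before running the single-scale detector; in option 2 each extracted window would be resized to 36x36 before the HoG computation.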

Optional extra credit (up to 2 points each):
In your writeup, clearly specify which extra credit items you attempted. Be sure to submit all code, along with a description of your method and an analysis of the results. The amount of credit will depend on the sophistication and thoroughness of your implementation and analysis.

Implement nonmaximum suppression, over both position and scale, to eliminate overlapping detections.
The ground-truth detections are stored in testing_scenes_bboxes.txt. How would you compare your detector's outputs to them? In your writeup, discuss how your evaluation function would handle the fact that, at a high threshold, the detector outputs very few but likely correct detections, whereas at a low threshold it outputs many more (but mostly incorrect) detections.
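One common starting point for both items is intersection over union (IoU) between boxes. The sketch below is only one possible approach, shown under several assumptions: detections are (score, box) pairs with boxes as (x_min, y_min, x_max, y_max) in original-image coordinates, suppression is greedy, and the thresholds 0.3 and 0.5 are arbitrary illustrative values. Because boxes from every scale share one coordinate frame, the same overlap test covers both position and scale.

```python
def iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def nms(detections, iou_threshold=0.3):
    """Greedy suppression: keep the highest-scoring box, drop any later
    box that overlaps an already-kept box by more than iou_threshold."""
    kept = []
    for score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        if all(iou(box, k[1]) <= iou_threshold for k in kept):
            kept.append((score, box))
    return kept


def precision_recall(detections, truths, score_threshold, match_iou=0.5):
    """Precision and recall at one score threshold, matching each
    detection to at most one ground-truth box (greedy, by score)."""
    chosen = sorted((d for d in detections if d[0] >= score_threshold),
                    key=lambda d: d[0], reverse=True)
    unmatched = list(truths)
    true_pos = 0
    for _, box in chosen:
        best = max(unmatched, key=lambda t: iou(box, t), default=None)
        if best is not None and iou(box, best) >= match_iou:
            unmatched.remove(best)
            true_pos += 1
    precision = true_pos / len(chosen) if chosen else 1.0
    recall = true_pos / len(truths) if truths else 1.0
    return precision, recall


dets = [(0.9, (0, 0, 10, 10)), (0.8, (1, 1, 11, 11)), (0.4, (50, 50, 60, 60))]
kept = nms(dets)
truths = [(0, 0, 10, 10)]
for t in (0.5, 0.3):
    print(t, precision_recall(kept, truths, t))
```

Sweeping `score_threshold` over a range and recording each (precision, recall) pair traces out a curve, which is one way to summarize the trade-off between few confident detections and many uncertain ones.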

Last update 3-Oct-2019 16:03:28