The objective of this project is to play around with the concepts involved in image warping and try out the application of image mosaicing. Using homographies and the morphing/blending techniques we've learned previously, we can take a collection of images and stitch them together into a wide-angle panorama. In this section, we learn how to automatically find point correspondences between pairs of mosaic images from a collection of detected feature points, and then compute the best homography matrix.
For automated point correspondences, the first step is to detect features in the images. In this project, we use Harris corners as points of interest. Here are some sample visualizations of the detected corners in a couple of images.
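A minimal sketch of the detection step is below, assuming NumPy/SciPy. The harmonic-mean corner response and the sigma/threshold/spacing values are my own illustrative choices, not necessarily the ones used in this project:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, maximum_filter

def harris_response(im, sigma=1.0):
    # Image gradients via finite differences
    Iy, Ix = np.gradient(im.astype(float))
    # Structure tensor entries, smoothed with a Gaussian window
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    # Harmonic-mean corner strength (avoids picking a k parameter)
    return det / (trace + 1e-10)

def get_harris_corners(im, threshold_rel=0.1, min_dist=3):
    h = harris_response(im)
    # Keep local maxima that are also above a relative threshold
    local_max = h == maximum_filter(h, size=2 * min_dist + 1)
    peaks = local_max & (h > threshold_rel * h.max())
    ys, xs = np.nonzero(peaks)
    return np.stack([ys, xs], axis=1), h
```

Flat regions and straight edges score near zero under this response, so only true two-directional corners survive the threshold.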
To suppress some of the "less important" corners from the huge set of detected corners, we implement Adaptive Non-Maximal Suppression to filter out some of the weaker interest points. Here are sample visualizations of the remaining chosen corners after running ANMS over the set of detected features from the images above.
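The ANMS step can be sketched as follows (a direct, unoptimized version; the robustness constant 0.9 follows the usual MOPS formulation, and the function name and defaults are my own):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression: each point's suppression
    radius is its distance to the nearest significantly stronger
    point; keep the n_keep points with the largest radii."""
    n = len(coords)
    radii = np.full(n, np.inf)
    for i in range(n):
        # Points that dominate point i by more than the robustness factor
        stronger = strengths > strengths[i] / c_robust
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    order = np.argsort(-radii)[:n_keep]
    return coords[order]
```

Because radii are large only for points far from any stronger competitor, the surviving corners end up both strong and spatially well spread, which is what we want for stable homography estimation.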
Matching Feature Descriptors
As a naive feature descriptor, for each corner point we extract an axis-aligned 8x8 patch, sampled every 5th pixel from a 40x40 Gaussian-blurred window around the interest point in the original image. The patches are bias/gain-normalized to zero mean and unit standard deviation. For each pair of images to be aligned, we match points by comparing their feature descriptors using SSD as the error metric. A correspondence is accepted only if the ratio of the SSD error of the best match to the SSD error of the second-best match is less than 0.4 (the recommended threshold). Here are some sample images where the point correspondences found by this method are highlighted.
A homography matrix is then estimated by running RANSAC over the remaining point correspondences. In each iteration (10,000 in my implementation), 4 correspondences are selected at random and a candidate homography is computed from them. The candidate is applied to all first-image interest points, and each result is compared with its corresponding second-image point. If the two agree closely (squared distance below some threshold), that correspondence counts as an inlier. The candidate with the largest number of inliers across all iterations wins, and the final homography is re-estimated via least squares over that inlier set. Here are some visualizations of the inlier points used to compute the final estimated homography matrix.
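The loop above can be sketched as follows, assuming a standard DLT/SVD homography solve (the function names, iteration count, and 2-pixel threshold are illustrative, not the project's exact values):

```python
import numpy as np

def compute_homography(src, dst):
    """Least-squares homography (DLT) mapping src -> dst, n >= 4 points."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    # Null-space solution: right singular vector of the smallest singular value
    _, _, Vt = np.linalg.svd(np.array(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

def apply_homography(H, pts):
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]  # divide out homogeneous coordinate

def ransac_homography(src, dst, n_iter=1000, thresh=2.0, seed=0):
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), 4, replace=False)
        H = compute_homography(src[idx], dst[idx])
        err = np.linalg.norm(apply_homography(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit with least squares over all inliers of the best candidate
    return compute_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

Because each candidate is fit to only 4 points, a single bad correspondence in the sample ruins that iteration but not the overall estimate; enough iterations make an all-inlier sample overwhelmingly likely.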
Manual vs. Automatic Stitching
Here are a few sample images comparing stitching results using manual point correspondences versus automatic ones. Automatic matching seems to pinpoint feature points that are more "visually important" and produces more aesthetically pleasing alignments after warping. Note how in the first set of images, the Campanile is better aligned in the automatic result than in the manual one. There also appears to be less image distortion, especially horizontally (zoom in to inspect).
Here are some sample mosaics that were automatically generated and stitched. The images were pulled from part A of the project.
Cylindrical Projection (Bells and Whistles #1)
We implement cylindrical projection by warping each image of the mosaic onto a cylinder. I used the equations provided in lecture, and the focal length f was determined empirically for the most aesthetically pleasing results (smartphone camera). Here are some sample results! Note the significant reduction in image distortion when stitching the mosaics.
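An inverse-warping sketch of the cylindrical projection, using the standard unprojection x = f·tan(θ), y = f·h/cos(θ) (nearest-neighbor sampling for brevity; the project's actual implementation and interpolation may differ):

```python
import numpy as np

def cylindrical_warp(im, f):
    """Inverse-warp an image onto a cylinder of radius f
    (focal length in pixels)."""
    h, w = im.shape[:2]
    yc, xc = np.indices((h, w)).astype(float)
    theta = (xc - w / 2) / f   # angle around the cylinder
    hcyl = (yc - h / 2) / f    # height on the cylinder
    # Unproject cylinder coordinates back to the flat image plane
    x = f * np.tan(theta) + w / 2
    y = f * hcyl / np.cos(theta) + h / 2
    xi, yi = np.round(x).astype(int), np.round(y).astype(int)
    # Only copy pixels whose source lands inside the original image
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(im)
    out[valid] = im[yi[valid], xi[valid]]
    return out
```

Pixels near the image edges get compressed toward the center of the cylinder, which is exactly why straight-line distortion at the mosaic's extremes is reduced.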
Rotation-Invariance (Bells and Whistles #2)
We implement rotation-invariant feature descriptors by computing the average gradient direction over the 40x40 axis-aligned patch around each feature point, rotating the patch to align with that average direction, and then sampling the rotated patch to extract the 8x8 feature vector.
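A sketch of this idea, assuming the dominant orientation is taken as the angle of the summed gradient vector (the rotation sign convention below is one consistent choice; in practice one would extract a larger window and crop to avoid the empty corners `scipy.ndimage.rotate` fills in):

```python
import numpy as np
from scipy.ndimage import rotate

def rotation_invariant_descriptor(im, y, x, win=20, spacing=5):
    """Rotate the 40x40 patch so its dominant gradient direction is
    canonical, then subsample to an 8x8 normalized descriptor."""
    patch = im[y - win:y + win, x - win:x + win].astype(float)
    gy, gx = np.gradient(patch)
    # Dominant orientation from the summed (i.e. averaged) gradient vector
    angle = np.degrees(np.arctan2(gy.sum(), gx.sum()))
    # Rotate the patch content so that direction lines up across descriptors
    aligned = rotate(patch, angle, reshape=False, order=1)
    desc = aligned[::spacing, ::spacing]  # 8x8 subsample
    return (desc - desc.mean()) / (desc.std() + 1e-10)
```

Since every patch is canonicalized to the same dominant direction before sampling, two views of the same feature under an in-plane rotation yield (approximately) the same descriptor.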
What I Learned
I actually started to learn some more awesome tips on how to vectorize operations and make everything faster. Also, RANSAC is a complicated (and intimidating) name for such a simple concept.