Overview
In this project, I take two or more photographs and create an image mosaic by registering, projectively warping, resampling, and compositing them. Along the way, I compute homographies and use them to warp images.
The steps are:
- Shoot and digitize pictures
- Recover homographies
- Warp the images
- Blend images into a mosaic
- Bells and Whistles
Shoot and digitize pictures
The following are the images I used in this project:
Recover Homographies
Understanding the Homography Matrix
The homography matrix is used to relate two planes in projective space. We express this as:
\[ \mathbf{q} = H \mathbf{p} \]
Here, both \(\mathbf{q}\) and \(\mathbf{p}\) are in homogeneous coordinates:
\[ \begin{pmatrix} wq_1 \\ wq_2 \\ w \end{pmatrix} = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix} \]
This implies that \(w = gp_1 + hp_2 + 1\). We substitute to get the relations:
\[ (gp_1 + hp_2 + 1)q_1 = ap_1 + bp_2 + c \]
\[ (gp_1 + hp_2 + 1)q_2 = dp_1 + ep_2 + f \]
Or equivalently:
\[ g(p_1 q_1) + h(p_2 q_1) + q_1 = ap_1 + bp_2 + c \]
\[ g(p_1 q_2) + h(p_2 q_2) + q_2 = dp_1 + ep_2 + f \]
These equations are linear in the eight unknowns \(a\) through \(h\), with coefficients built from \(p_i\), \(q_i\), and the products \(p_i q_j\). Stacking the two equations for a single correspondence into matrix form, we have:
\[ \begin{pmatrix} p_1 & p_2 & 1 & 0 & 0 & 0 & -p_1 q_1 & -p_2 q_1 \\ 0 & 0 & 0 & p_1 & p_2 & 1 & -p_1 q_2 & -p_2 q_2 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix} \]
Each correspondence contributes two such equations, so at least 4 point correspondences are needed to determine the eight unknowns; with more than 4, the overdetermined system can be solved in a least-squares sense.
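For instance, the stacked system can be solved with `np.linalg.lstsq`. This is a minimal sketch (the function name is illustrative, not necessarily the code I used):

```python
import numpy as np

def compute_homography(src, dst):
    """Estimate the 3x3 homography mapping src -> dst.

    src, dst: (N, 2) arrays of corresponding points, N >= 4.
    Builds the 2N x 8 system derived above and solves it in least squares.
    """
    A, b = [], []
    for (p1, p2), (q1, q2) in zip(src, dst):
        A.append([p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1])
        A.append([0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2])
        b.extend([q1, q2])
    h, *_ = np.linalg.lstsq(np.asarray(A, float), np.asarray(b, float), rcond=None)
    # The eight unknowns a..h, with the bottom-right entry fixed to 1.
    return np.append(h, 1.0).reshape(3, 3)
```

With exactly 4 correspondences the system is square and the fit is exact; with more, least squares averages out small clicking errors.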
Warp the Images
In this part, I implemented a procedure that warps an image with a given homography matrix \( H \) using inverse warping. The main steps include:
- Inverse Homography Calculation: Compute the inverse of the homography matrix \( H \) to map destination points back to the source image.
- Grid Generation: Create a grid of coordinates for the output image using `np.indices`, which defines the pixel locations to be mapped.
- Coordinate Transformation: Transform these output coordinates back to the source image space using the inverse homography.
- Interpolation: I used `map_coordinates` to interpolate pixel values at the transformed coordinates.
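The steps above can be sketched as follows; this is a simplified version (assuming a single-channel image and a caller-supplied output shape), not necessarily my exact implementation:

```python
import numpy as np
from scipy.ndimage import map_coordinates

def warp_image(img, H, out_shape):
    """Inverse-warp a single-channel image by homography H.

    For each pixel (x, y) of the output, map it back through H^-1 to a
    source location and bilinearly sample the input image there.
    """
    H_inv = np.linalg.inv(H)
    rows, cols = np.indices(out_shape)                 # output pixel grid
    ones = np.ones_like(rows)
    # Homogeneous output coordinates stacked as (x, y, 1) columns.
    coords = np.stack([cols.ravel(), rows.ravel(), ones.ravel()])
    xy = H_inv @ coords
    xs, ys = xy[0] / xy[2], xy[1] / xy[2]              # de-homogenize
    # map_coordinates expects (row, col) order; outside pixels become 0.
    sampled = map_coordinates(img, [ys, xs], order=1, cval=0.0)
    return sampled.reshape(out_shape)
```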
Image Rectification
Image rectification involves transforming a photo containing known rectangular objects, such as paintings or posters, to make one of them perfectly rectangular using a homography. I defined the destination points by mapping the corners of a known shape, such as a square tile, to a perfect square. For example, if the image contains tiles that are supposed to be square, I identify the corners of one tile and map them to a defined square. Then, I computed the homography and applied it, achieving the rectification.
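As a concrete sketch of the point setup (the tile corner coordinates below are made up for illustration; the solve reuses the linear system from the homography section):

```python
import numpy as np

# Hypothetical hand-clicked corners (x, y) of one tile that should be square,
# and the perfect 200x200 square we want it mapped to.
tile = np.array([[312., 410.], [455., 402.], [470., 540.], [300., 555.]])
square = np.array([[0., 0.], [200., 0.], [200., 200.], [0., 200.]])

# Stack the two linear equations per correspondence and solve for a..h.
A, b = [], []
for (p1, p2), (q1, q2) in zip(tile, square):
    A.append([p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1])
    A.append([0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2])
    b.extend([q1, q2])
h, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
H = np.append(h, 1.0).reshape(3, 3)   # rectifying homography; warp with it
```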
Blend the images into a mosaic
In this part, I blended images into a mosaic by warping them into alignment.
First, I calculated the homography that maps points from one image to the other.
Using `scipy.ndimage`, I warped the images and created an alpha mask so that the image edges fade out gently for smooth blending.
Then, I determined the bounding box of the mosaic, shifting the images as needed.
For blending, I used weighted averaging, combining the images according to their alpha values to avoid harsh seams, which produces a seamless mosaic.
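A minimal sketch of the feathering and weighted-average blending (function names are mine; `distance_transform_edt` gives each valid pixel its distance to the nearest invalid one, which makes a natural fade-out weight):

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def feather_alpha(mask):
    """Alpha weight: 1 deep inside the valid region, fading to ~0 at its edge.

    mask: boolean array, True where the warped image has valid pixels
    (False outside the warped footprint, including the canvas border).
    """
    dist = distance_transform_edt(mask)
    return dist / dist.max() if dist.max() > 0 else dist

def blend(img1, a1, img2, a2):
    """Weighted average of two warped images using their alpha masks."""
    total = a1 + a2
    total[total == 0] = 1.0        # avoid divide-by-zero where neither image covers
    return (img1 * a1 + img2 * a2) / total
```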
Hallway Results
Observatory Results
Mountain Results
Project 4 Part B: FEATURE MATCHING for AUTOSTITCHING
Corner Detection
In this step, I used the Harris Interest Point Detector to find keypoints in my images. These keypoints are areas in the image with strong corners, which will be useful later for aligning the images. I used provided sample code to implement the Harris detector.
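Since the detector itself came from the provided sample code, the following is only an illustrative sketch of what the Harris response computes, written with scipy (names and parameter values are mine):

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def harris_response(gray, sigma=1.0, k=0.05):
    """Harris corner strength: det(M) - k * trace(M)^2 per pixel,
    where M is the smoothed second-moment (gradient outer-product) matrix."""
    ix = sobel(gray, axis=1)                  # horizontal gradient
    iy = sobel(gray, axis=0)                  # vertical gradient
    ixx = gaussian_filter(ix * ix, sigma)     # smoothed M entries
    iyy = gaussian_filter(iy * iy, sigma)
    ixy = gaussian_filter(ix * iy, sigma)
    det = ixx * iyy - ixy ** 2
    trace = ixx + iyy
    return det - k * trace ** 2
```

Keypoints are then taken at local maxima of this response map.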
Adaptive Non-Maximal Suppression (ANMS)
After detecting the initial Harris corners, I applied Adaptive Non-Maximal Suppression (ANMS) to choose the most relevant keypoints. The Harris detector can often find too many corners, many of which are too close to each other. ANMS helps by selecting keypoints that are well-distributed across the image, ensuring that we don't have too many corners clustered in one area.
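A sketch of ANMS as described above, following the suppression-radius idea from Brown et al. (names and the robustness constant are illustrative):

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    Each corner's radius is the distance to the nearest corner that is
    sufficiently stronger (c_robust * other > own strength); keeping the
    corners with the largest radii spreads keypoints across the image.
    """
    coords = np.asarray(coords, float)
    strengths = np.asarray(strengths, float)
    radii = np.full(len(coords), np.inf)      # unsuppressed corners keep inf
    for i in range(len(coords)):
        stronger = strengths * c_robust > strengths[i]
        if stronger.any():
            radii[i] = np.linalg.norm(coords[stronger] - coords[i], axis=1).min()
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```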
Feature Descriptors
In this step, I implemented feature descriptor extraction to generate a numerical representation of the areas around the keypoints selected by ANMS. For each keypoint, I extracted an 8x8 patch of pixel values from a larger 40x40 window. Sampling from this larger window and blurring it helps create more stable descriptors. These descriptors capture the local appearance around each keypoint and are then normalized to ensure they are comparable between different images.
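A sketch of the descriptor extraction (simplified: axis-aligned patches, no rotation invariance; the blur sigma is an assumption, while the 8x8-from-40x40 sampling follows the text):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(gray, row, col, window=40, patch=8):
    """8x8 descriptor subsampled from a blurred 40x40 window around a keypoint."""
    half = window // 2
    win = gray[row - half:row + half, col - half:col + half]
    win = gaussian_filter(win, sigma=2.0)     # low-pass before subsampling
    step = window // patch
    desc = win[::step, ::step]                # subsample 40x40 -> 8x8
    desc = desc - desc.mean()                 # bias/gain normalization
    norm = desc.std()
    return (desc / norm if norm > 0 else desc).ravel()
```

The mean/std normalization is what makes descriptors comparable across images with different brightness and contrast.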
Feature Matching
In this step, I implemented feature matching to find pairs of keypoints between two images that look similar. To do this, I compared the descriptors extracted from each image and found the closest matches. I used the approach suggested by Lowe, where I compared the distance between the first and second nearest neighbors of each feature. If the ratio between these two distances was below a certain threshold, I considered it a good match.
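A sketch of the ratio-test matching (brute-force distance computation; the 0.8 threshold is illustrative):

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.8):
    """Lowe ratio test: keep match i -> j only when the best match is
    clearly better than the second best (d1/d2 < ratio)."""
    # Pairwise squared distances between the two descriptor sets.
    d = ((desc1[:, None, :] - desc2[None, :, :]) ** 2).sum(-1)
    matches = []
    for i in range(len(desc1)):
        order = np.argsort(d[i])
        nn1, nn2 = order[0], order[1]
        # Compare squared distances, so the ratio threshold is squared too.
        if d[i, nn1] < (ratio ** 2) * d[i, nn2]:
            matches.append((i, int(nn1)))
    return matches
```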
RANSAC (Random Sample Consensus)
In this step, I implemented the RANSAC (Random Sample Consensus) algorithm to compute a homography estimate between two sets of matched keypoints. The homography is a transformation that maps the points from one image to the corresponding points in another image. Since some matches might be incorrect or noisy, RANSAC helps by repeatedly selecting random subsets of four point pairs and computing a homography from them. For each homography, I checked how many points fit the transformation well (inliers). After 1000 iterations, I selected the homography with the largest set of inliers, which gave me a robust estimate of the correct transformation.
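A sketch of the RANSAC loop described above (the four-point solver reuses the linear system from the homography section; names are mine):

```python
import numpy as np

def solve_h(src, dst):
    """Homography from point correspondences via the stacked linear system."""
    A, b = [], []
    for (p1, p2), (q1, q2) in zip(src, dst):
        A.append([p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1])
        A.append([0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2])
        b.extend([q1, q2])
    h, *_ = np.linalg.lstsq(np.asarray(A), np.asarray(b), rcond=None)
    return np.append(h, 1.0).reshape(3, 3)

def ransac_homography(src, dst, n_iter=1000, tol=2.0, rng=None):
    """Largest-inlier-set homography from noisy matches."""
    rng = np.random.default_rng(rng)
    src_h = np.column_stack([src, np.ones(len(src))])
    best = np.zeros(len(src), bool)
    for _ in range(n_iter):
        idx = rng.choice(len(src), 4, replace=False)
        H = solve_h(src[idx], dst[idx])            # exact fit to 4 random pairs
        proj = src_h @ H.T
        err = np.linalg.norm(proj[:, :2] / proj[:, 2:3] - dst, axis=1)
        inliers = err < tol                        # reprojection error test
        if inliers.sum() > best.sum():
            best = inliers
    # Refit in least squares on all inliers of the best model.
    return solve_h(src[best], dst[best]), best
```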
Below, you can see the final matches along with the inliers. Green lines represent correct matches; red lines represent incorrect or noisy ones.
Results
Results from Part A
Failure Result
For the other images from Part A, feature matching and RANSAC failed to produce a good result. This is probably because the sky in these images is too uniform, which makes it hard to find good matches.
So I chose two other sets of images for feature matching and RANSAC.