CS 280A: Intro to Computer Vision and Computational Photography, Fall 2024

Project 4 Part A: IMAGE WARPING and MOSAICING

Jasper Liu

Overview

In this project, I take two or more photographs and create an image mosaic by registering, projectively warping, resampling, and compositing them. Along the way, I compute homographies and use them to warp images.

    The steps are:

  1. Shoot and digitize pictures
  2. Recover homographies
  3. Warp the images
  4. Blend images into a mosaic
  5. Bells and Whistles

Shoot and digitize pictures

The following are the images I used in this project:

My iPad

Hallway

Griffith Observatory in LA

White Mountain, NH

Image Rectification

Understanding the Homography Matrix

The homography matrix is used to relate two planes in projective space. We express this as:

\[ \mathbf{q} = H \mathbf{p} \]

Here, both \(\mathbf{q}\) and \(\mathbf{p}\) are in homogeneous coordinates:

\[ \begin{pmatrix} wq_1 \\ wq_2 \\ w \end{pmatrix} = \begin{pmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{pmatrix} \begin{pmatrix} p_1 \\ p_2 \\ 1 \end{pmatrix} \]

This implies that \(w = gp_1 + hp_2 + 1\). We substitute to get the relations:

\[ (gp_1 + hp_2 + 1)q_1 = ap_1 + bp_2 + c \]

\[ (gp_1 + hp_2 + 1)q_2 = dp_1 + ep_2 + f \]

Or equivalently:

\[ g(p_1 q_1) + h(p_2 q_1) + q_1 = ap_1 + bp_2 + c \]

\[ g(p_1 q_2) + h(p_2 q_2) + q_2 = dp_1 + ep_2 + f \]

These equations are linear in the eight unknowns \(a\) through \(h\), with coefficients built from \(p_i\), \(q_i\), and the products \(p_i q_j\). Stacking them into matrix form, we have:

\[ \begin{pmatrix} p_1 & p_2 & 1 & 0 & 0 & 0 & -p_1 q_1 & -p_2 q_1 \\ 0 & 0 & 0 & p_1 & p_2 & 1 & -p_1 q_2 & -p_2 q_2 \end{pmatrix} \begin{pmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{pmatrix} = \begin{pmatrix} q_1 \\ q_2 \end{pmatrix} \]

Each correspondence contributes two equations, and there are eight unknowns, so we need at least 4 point correspondences between the images to avoid an underdetermined system. With more than 4, the system is overdetermined and can be solved by least squares.
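The stacked linear system above can be solved directly with least squares. A minimal NumPy sketch (the function name and interface are mine, for illustration):

```python
import numpy as np

def compute_homography(pts_p, pts_q):
    """Estimate H mapping pts_p -> pts_q from n >= 4 correspondences.

    pts_p, pts_q: (n, 2) arrays of (x, y) points. Builds the stacked
    2n x 8 system from the derivation above and solves it in a
    least-squares sense for (a, b, c, d, e, f, g, h)."""
    pts_p = np.asarray(pts_p, dtype=float)
    pts_q = np.asarray(pts_q, dtype=float)
    n = len(pts_p)
    A = np.zeros((2 * n, 8))
    b = np.zeros(2 * n)
    for i, ((p1, p2), (q1, q2)) in enumerate(zip(pts_p, pts_q)):
        A[2 * i]     = [p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1]
        A[2 * i + 1] = [0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2]
        b[2 * i] = q1
        b[2 * i + 1] = q2
    h, *_ = np.linalg.lstsq(A, b, rcond=None)
    return np.append(h, 1.0).reshape(3, 3)   # fix H[2,2] = 1
```

With exactly four (non-collinear) correspondences the system is square and the solution exact; with more, the least-squares fit averages out clicking noise.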

Warp the Images

In this part, I implemented a process to warp an image using a given homography matrix \( H \). The key function takes an input image and the homography and applies inverse warping: for each pixel of the output, it maps the location back through \( H^{-1} \), samples the source image by interpolation, and leaves pixels that fall outside the source empty.

Image Rectification

Image rectification transforms a photo containing a known rectangular object, such as a painting or poster, so that the object becomes perfectly rectangular under a homography. I defined the destination points by mapping the corners of a known shape to a perfect rectangle. For example, if the image contains floor tiles that are supposed to be square, I identify the corners of one tile and map them to a defined square. I then computed the homography and applied the warp, achieving the rectification.

iPad

Warped iPad

Eminem Poster

Warped Eminem Poster

Blend the images into a mosaic

In this part, I blended images into a mosaic by warping them into alignment. First, I computed the homography mapping points from one image to the other. Using scipy.ndimage, I warped the images and created an alpha mask whose values fade out gently toward the image edges to ensure smooth blending. Then I determined the bounding box for the mosaic, shifting the images as needed. For blending, I used weighted averaging, combining the images according to their alpha values to avoid harsh seams and produce a seamless mosaic.
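The fade-out mask and the weighted average can be sketched as follows. Linear feathering is one simple way to build such a mask; my actual falloff profile may differ:

```python
import numpy as np

def feather_alpha(h, w):
    """Alpha mask that is 1 in the center and falls off linearly to 0
    at the borders, so overlapping image edges fade into each other."""
    ys = np.clip(np.minimum(np.arange(h), np.arange(h)[::-1]) / (h / 2), 0, 1)
    xs = np.clip(np.minimum(np.arange(w), np.arange(w)[::-1]) / (w / 2), 0, 1)
    return np.minimum(ys[:, None], xs[None, :])

def blend(im1, a1, im2, a2, eps=1e-8):
    """Weighted average of two aligned images by their alpha masks.

    Where only one image contributes, its pixel passes through; where
    both overlap, the result is the alpha-weighted mean."""
    num = im1 * a1 + im2 * a2
    den = a1 + a2
    return num / np.maximum(den, eps)   # eps avoids division by zero
```

In the full pipeline the two warped images and their warped alpha masks are first padded into the common bounding box before calling blend.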

Hallway Results

hallway_left

alpha_hallway_left

Warped_hallway_left

hallway_right

alpha_hallway_right

Warped_hallway_right

Warped_hallway_left

mosaic hallway

Warped_hallway_right

Observatory Results

observatory_left

alpha_observatory_left

Warped_observatory_left

observatory_right

alpha_observatory_right

Warped_observatory_right

Warped_observatory_left

mosaic observatory

Warped_observatory_right

Mountain Results

mountain_left

alpha_mountain_left

Warped_mountain_left

mountain_right

alpha_mountain_right

Warped_mountain_right

Warped_mountain_left

mosaic mountain

Warped_mountain_right

Project 4 Part B: FEATURE MATCHING for AUTOSTITCHING

Corner Detection

In this step, I used the Harris Interest Point Detector to find keypoints in my images. These keypoints are image locations with strong corner responses, which will be useful later for aligning the images. I used the provided sample code to implement the Harris detector.

Corners in the hallway
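For reference, the Harris response can be computed as below. This is a minimal scipy-only sketch in the spirit of the provided sample code, not the course's exact implementation; the constant k = 0.05 and sigma are typical values I chose for illustration:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def harris_response(img, sigma=1.0, k=0.05):
    """Harris corner strength map for a grayscale float image.

    Computes image gradients, Gaussian-smooths their products to form
    the structure tensor M at each pixel, and returns the response
    det(M) - k * trace(M)^2 (large and positive at corners)."""
    Iy, Ix = np.gradient(img.astype(float))
    Sxx = gaussian_filter(Ix * Ix, sigma)
    Syy = gaussian_filter(Iy * Iy, sigma)
    Sxy = gaussian_filter(Ix * Iy, sigma)
    det = Sxx * Syy - Sxy ** 2
    trace = Sxx + Syy
    return det - k * trace ** 2
```

Thresholding this map and taking local maxima yields the corner candidates that ANMS then prunes.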

Adaptive Non-Maximal Suppression (ANMS)

After detecting the initial Harris corners, I applied Adaptive Non-Maximal Suppression (ANMS) to choose the most relevant keypoints. The Harris detector can often find too many corners, many of which are too close to each other. ANMS helps by selecting keypoints that are well-distributed across the image, ensuring that we don't have too many corners clustered in one area.

ANMS keypoints in the hallway
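A sketch of the ANMS selection rule: each corner's suppression radius is its distance to the nearest sufficiently stronger corner, and we keep the corners with the largest radii. The robustness constant 0.9 and the keep count are typical values, not necessarily the ones I used:

```python
import numpy as np

def anms(coords, strengths, n_keep=500, c_robust=0.9):
    """Adaptive non-maximal suppression.

    For corner i, the radius r_i is the distance to the nearest corner j
    with strengths[j] * c_robust > strengths[i]. Keeping the n_keep
    corners with the largest radii yields points that are both strong
    and spatially well spread across the image."""
    coords = np.asarray(coords, dtype=float)        # (n, 2)
    strengths = np.asarray(strengths, dtype=float)  # (n,)
    n = len(coords)
    radii = np.full(n, np.inf)      # global maximum keeps radius = inf
    for i in range(n):
        stronger = strengths * c_robust > strengths[i]
        if stronger.any():
            d2 = np.sum((coords[stronger] - coords[i]) ** 2, axis=1)
            radii[i] = np.sqrt(d2.min())
    keep = np.argsort(-radii)[:n_keep]
    return coords[keep]
```

The O(n^2) loop is fine for a few thousand Harris corners; a KD-tree would speed it up if needed.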

Feature Descriptors

In this step, I implemented feature descriptor extraction to generate a numerical representation of the areas around the keypoints selected by ANMS. For each keypoint, I extracted an 8x8 patch of pixel values from a larger 40x40 window. Sampling from this larger window and blurring it helps create more stable descriptors. These descriptors capture the local appearance around each keypoint and are then normalized to ensure they are comparable between different images.

Features of hallway after normalization
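The descriptor extraction for one keypoint can be sketched as below. The blur sigma and the subsampling stride of 5 (40 / 8) are illustrative choices consistent with the description above, not necessarily my exact parameters:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def extract_descriptor(img, y, x):
    """8x8 descriptor around keypoint (y, x), assumed >= 20 px from
    the image border.

    Takes the surrounding 40x40 window, blurs it so the coarse
    subsampling does not alias, samples every 5th pixel to get an
    8x8 patch, then bias/gain-normalizes to zero mean and unit std."""
    window = img[y - 20:y + 20, x - 20:x + 20].astype(float)
    blurred = gaussian_filter(window, sigma=2.0)
    patch = blurred[::5, ::5]                     # 8x8 subsample
    patch = patch - patch.mean()                  # remove bias
    return (patch / (patch.std() + 1e-8)).ravel() # remove gain; length 64
```

The bias/gain normalization is what makes descriptors from differently exposed images comparable.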

Feature Matching

In this step, I implemented feature matching to find pairs of keypoints between two images that look similar. To do this, I compared the descriptors extracted from each image and found the closest matches. I used the approach suggested by Lowe, where I compared the distance between the first and second nearest neighbors of each feature. If the ratio between these two distances was below a certain threshold, I considered it a good match.

Feature matches between hallway images
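Lowe's ratio test can be sketched as follows; the 0.7 threshold is a common choice, not necessarily the exact value I tuned:

```python
import numpy as np

def match_features(desc1, desc2, ratio=0.7):
    """Lowe ratio matching between descriptor sets (n1 x d and n2 x d).

    A feature in image 1 is matched to its nearest neighbor in image 2
    only if the 1-NN distance is well below the 2-NN distance, which
    rejects ambiguous matches."""
    # pairwise squared distances via |a - b|^2 = |a|^2 - 2 a.b + |b|^2
    d2 = (np.sum(desc1 ** 2, axis=1)[:, None]
          - 2.0 * desc1 @ desc2.T
          + np.sum(desc2 ** 2, axis=1)[None, :])
    matches = []
    for i, row in enumerate(d2):
        j1, j2 = np.argsort(row)[:2]
        # squared distances, so compare against the squared ratio
        if row[j1] < ratio ** 2 * row[j2]:
            matches.append((i, int(j1)))
    return matches
```

Working with squared distances avoids n1 * n2 square roots; the ratio threshold is simply squared to compensate.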

RANSAC (Random Sample Consensus)

In this step, I implemented the RANSAC (Random Sample Consensus) algorithm to compute a homography estimate between two sets of matched keypoints. The homography is a transformation that maps the points from one image to the corresponding points in another image. Since some matches might be incorrect or noisy, RANSAC helps by repeatedly selecting random subsets of four point pairs and computing a homography from them. For each homography, I checked how many points fit the transformation well (inliers). After 1000 iterations, I selected the homography with the largest set of inliers, which gave me a robust estimate of the correct transformation.

Below, you can see the final matches along with the inliers. Green lines represent correct matches; red lines represent incorrect or noisy matches.

RANSAC homography between hallway images
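The RANSAC loop described above can be sketched as below. The 2-pixel inlier threshold is an assumption for illustration; the 4-point samples and 1000 iterations follow the description, and the final homography is refit on all inliers of the best model:

```python
import numpy as np

def ransac_homography(pts1, pts2, n_iters=1000, thresh=2.0, rng=None):
    """Robust homography estimation: repeatedly fit H to 4 random
    correspondences, count points whose reprojection error is under
    `thresh` pixels, and keep the largest inlier set."""
    rng = np.random.default_rng(rng)
    pts1 = np.asarray(pts1, float)
    pts2 = np.asarray(pts2, float)
    n = len(pts1)

    def fit(p, q):      # least-squares H from >= 4 point pairs
        A, b = [], []
        for (p1, p2), (q1, q2) in zip(p, q):
            A.append([p1, p2, 1, 0, 0, 0, -p1 * q1, -p2 * q1]); b.append(q1)
            A.append([0, 0, 0, p1, p2, 1, -p1 * q2, -p2 * q2]); b.append(q2)
        h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
        return np.append(h, 1.0).reshape(3, 3)

    def inlier_mask(H):
        proj = H @ np.column_stack([pts1, np.ones(n)]).T
        proj = (proj[:2] / proj[2]).T
        return np.linalg.norm(proj - pts2, axis=1) < thresh

    best = np.zeros(n, dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(n, 4, replace=False)
        mask = inlier_mask(fit(pts1[idx], pts2[idx]))
        if mask.sum() > best.sum():
            best = mask
    # refit on the full inlier set for a more accurate final estimate
    return fit(pts1[best], pts2[best]), best
```

Because a single bad correspondence can ruin a direct least-squares fit, the random 4-point sampling is what gives RANSAC its robustness: any sample containing only inliers recovers (nearly) the true homography and collects the full inlier set.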

Results

Results from Part A

Ransac auto stitching hallway

manual stitching hallway

Failure Result

Feature matches between mountain images

For the other image pairs from Part A, feature matching and RANSAC failed to produce a good result. I think this is probably because the sky in these images is too uniform, which makes it hard to find good matches.

So I found two other sets of images to run feature matching and RANSAC on.

Feature matching in ransac

Ransac auto stitching buildings

manual stitching buildings

Feature matching in jurassic

Ransac auto stitching jurassic

manual stitching jurassic