
[ Software Division : Perspective Transforms ]

Created by [ Rowan Dempster] on Dec 28, 2019

Overview

Several perspective transforms must be applied to an image before Perception can process it, and before extracted object detections can be published with "correct" coordinates (see the Integration page for documentation of coordinate systems). In particular, road line detection algorithms rely on the bird's eye view (BEV) perspective transform of camera images.

This project is best thought of as broken down into the following components.

Acquiring source points and real dimensions
Setting destination points
Computing the BEV matrix
Performing the BEV transform
Converting to real space coordinates

This process applies to any camera, but should only be attempted once the intrinsic calibration is working for that particular camera.

Important note: A BEV transformation matrix is only valid for one specific camera configuration. If the camera is moved or rotated, the process described above must be performed again.

Acquiring Source Points

[Image: camera view of the rectangle on the ground plane, with its four corners circled in red]

Place a rectangle on the ground plane within the FOV of the camera. The example uses the existing concrete floor markings.
Optional: mark the four corners of the rectangle with stickers (circled in red in the above image). Save the image. The rectangle appears as a trapezoid in the image because its bottom edge is closer to the camera than its top edge.
Measure the real-space dimensions of the rectangle.
From the saved image, find the pixel coordinates of the four corners. If you are using a Mac, you should be able to read off the points' coordinates with your cursor. The example below shows how to set the source points.

src = np.float32([[577, 226], [1296, 226], [1661, 719], [303, 719]])
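Because the corner order matters later, it can help to normalize hand-picked points into a fixed order before using them. The following is a minimal sketch, not part of the original workflow: `order_corners` is a hypothetical helper that sorts four points into top-left, top-right, bottom-right, bottom-left order (the order used by `src` above). It assumes the top edge of the trapezoid is roughly horizontal, so that both top corners have smaller y values than both bottom corners.

```python
import numpy as np


def order_corners(points):
    """Order four (x, y) points as top-left, top-right, bottom-right, bottom-left.

    Hypothetical helper. Assumes both top corners lie above (smaller y than)
    both bottom corners, which holds for a trapezoid with a near-horizontal
    top edge.
    """
    pts = np.asarray(points, dtype=np.float32)
    if pts.shape != (4, 2):
        raise ValueError("expected exactly four (x, y) points")
    # Image y grows downward, so the two smallest y values form the top edge.
    by_y = pts[np.argsort(pts[:, 1], kind="stable")]
    top, bottom = by_y[:2], by_y[2:]
    top = top[np.argsort(top[:, 0])]                  # left to right
    bottom = bottom[np.argsort(bottom[:, 0])[::-1]]   # right to left
    return np.vstack([top, bottom])


# The same four corners as above, but in the order they might have been clicked.
clicked = [[1661, 719], [577, 226], [303, 719], [1296, 226]]
src = order_corners(clicked)
```

With the points from this page, `src` ends up in the same order as the hand-written array above.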

Setting Destination Points

Choose four destination points (pixel coordinates) that correspond to where the four corners of the rectangle would appear if the image were taken from directly above. The width-to-height proportion should be preserved; it comes from the measured real-space dimensions of the rectangle. Choose the destination points so that the resulting image covers as much of the original image as possible. You can adjust the values of these points until you get an optimal BEV picture.
Important: The order of the source points MUST match the order of the destination points. Below is an example of how to set the destination points. Note that the points form a rectangle in image space, rather than a trapezoid.

dst = np.float32([[577, 360], [1296, 360], [1296, 950], [577, 950]])
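One way to keep the width-to-height proportion correct is to derive the destination rectangle from the measured dimensions rather than typing all four points by hand. The sketch below is illustrative only: `destination_rectangle` is a hypothetical helper, and the 2.0 m by 3.0 m rectangle, the top-left pixel, and the 400 px width are made-up values, not measurements from this page.

```python
import numpy as np


def destination_rectangle(top_left, width_px, real_width_m, real_height_m):
    """Build destination points whose aspect ratio matches the real rectangle.

    Hypothetical helper. Returns points in the order top-left, top-right,
    bottom-right, bottom-left, which must match the source point order.
    """
    x0, y0 = top_left
    # Preserve the real-space proportion: height / width is the same in pixels.
    height_px = width_px * real_height_m / real_width_m
    return np.float32([
        [x0, y0],
        [x0 + width_px, y0],
        [x0 + width_px, y0 + height_px],
        [x0, y0 + height_px],
    ])


# Made-up example: a rectangle measured as 2.0 m wide and 3.0 m long.
dst_example = destination_rectangle(
    top_left=(600, 300), width_px=400, real_width_m=2.0, real_height_m=3.0
)
```

Changing `top_left` and `width_px` moves and resizes the rectangle in the output without distorting its proportions.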

Compute the BEV Matrix

OpenCV provides a utility that computes the BEV matrix from the source and destination points.

M = cv2.getPerspectiveTransform(src, dst)
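To make it clearer what this matrix is, here is a NumPy-only sketch that solves for a 3x3 perspective matrix from the same four point pairs. It is an illustration, not how the codebase does it: `solve_homography` is a hypothetical name, and it assumes the bottom-right entry of the matrix can be fixed to 1. The result is expected to agree with the OpenCV call above up to floating-point error.

```python
import numpy as np


def solve_homography(src, dst):
    """Solve for the 3x3 matrix H that maps each src point onto its dst point.

    Hypothetical helper. Each correspondence (x, y) -> (u, v) contributes two
    linear equations in the eight unknown entries of H (H[2, 2] is fixed to 1).
    """
    src = np.asarray(src, dtype=np.float64)
    dst = np.asarray(dst, dtype=np.float64)
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)


src = np.float32([[577, 226], [1296, 226], [1661, 719], [303, 719]])
dst = np.float32([[577, 360], [1296, 360], [1296, 950], [577, 950]])
M_check = solve_homography(src, dst)

# Sanity check: push the source corners through the matrix.
src_h = np.hstack([src.astype(np.float64), np.ones((4, 1))])  # homogeneous coords
mapped = (M_check @ src_h.T).T
mapped = mapped[:, :2] / mapped[:, 2:3]  # divide by the third coordinate
```

After the division, `mapped` should land on the destination points, which is a quick way to confirm that the source and destination points were listed in matching order.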

TODO: Instructions on how this would be incorporated into the codebase.

Performing the BEV Transform

OpenCV also provides the function that performs the BEV transform, given the BEV matrix.

bev = cv2.warpPerspective(raw_image, M, img_size, flags=cv2.INTER_LINEAR)

Below is the result image for this specific BEV transform. Note that the four circled corners now appear to be a rectangle instead of a trapezoid:

[Image: result of the BEV transform, with the four circled corners forming a rectangle]

Once the matrix is generated, it is reusable as long as the camera is not moved.
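The same matrix can also be applied to individual pixel coordinates, for example the position of a detection, without warping the whole image. The sketch below shows the underlying arithmetic with NumPy; `warp_points` is a hypothetical helper and `M_example` is a made-up matrix used only to show the homogeneous division. OpenCV's `cv2.perspectiveTransform` performs the same kind of mapping on arrays of points.

```python
import numpy as np


def warp_points(M, points):
    """Map (x, y) pixel coordinates through a 3x3 perspective matrix M.

    Hypothetical helper. Points are lifted to homogeneous coordinates
    (x, y, 1), multiplied by M, then divided by the resulting third component.
    """
    pts = np.asarray(points, dtype=np.float64).reshape(-1, 2)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ np.asarray(M, dtype=np.float64).T
    return mapped[:, :2] / mapped[:, 2:3]


# Made-up matrix, chosen so the arithmetic is easy to follow by hand.
M_example = np.array([
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 3.0],
    [0.0, 0.0, 2.0],
])
bev_points = warp_points(M_example, [[1, 1], [4, 0]])
# (1, 1) -> (3, 5, 2) -> (1.5, 2.5); (4, 0) -> (9, 3, 2) -> (4.5, 1.5)
```

In practice `M` would be the BEV matrix computed for the camera, and the input would be pixel coordinates in the raw (untransformed) image.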

Converting to Real Space Coordinates

It's helpful to see the rectangle from above in our transformed image. However, knowing the image-space coordinates of the rectangle's corners isn't enough. We also need the corresponding real-space coordinates: where is the rectangle relative to the camera or vehicle? This requires another coordinate transformation.

Pixel units in a BEV image need to be scaled to meters, so we need to determine the scale in meters/pixel for both the x and y axes. The scale for each axis is the real-space dimension of the rectangle divided by its pixel dimension along that axis:
Measure the real-space dimensions of the rectangle.
Measure the pixel dimensions of the rectangle in the BEV image.
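As a rough sketch of the scale computation: in the BEV image the rectangle occupies the destination points by construction, so its pixel dimensions can be read from `dst`. The real-space dimensions used here (3.0 m by 2.5 m) are placeholders, not measurements of the rectangle on this page, and `meters_per_pixel` is a hypothetical helper name.

```python
def meters_per_pixel(real_width_m, real_height_m, pixel_width, pixel_height):
    """Return (scale_x, scale_y) in meters per pixel for a BEV image."""
    return real_width_m / pixel_width, real_height_m / pixel_height


# Pixel dimensions of the rectangle in the BEV image, taken from the
# destination points used on this page.
pixel_width = 1296 - 577    # 719 px
pixel_height = 950 - 360    # 590 px

# Placeholder measurements: replace with the measured size of your rectangle.
scale_x, scale_y = meters_per_pixel(3.0, 2.5, pixel_width, pixel_height)
```

If the destination points preserve the real proportions exactly, `scale_x` and `scale_y` come out equal; otherwise each axis needs its own scale.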

We still need the position of the rectangle relative to the camera, or vehicle. Therefore, we need to compute some offset that maps a pixel in the BEV image to a real space point (with respect to some coordinate system).

Measure the rectangle corners with respect to some reference origin point (e.g. the center of the vehicle bumper is the origin for the vehicle POV).
TODO: Some vector math needs to be done here. Instructions on how this needs to be incorporated into the codebase.
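Until the TODO above is filled in, the following is one possible sketch of the offset step, under loudly stated assumptions. It assumes the real-space frame has x pointing to the right of the BEV image and y pointing up the image (away from the camera), with no rotation between the BEV image axes and the real-space axes. The actual coordinate conventions are documented on the Integration page and may differ. `bev_pixel_to_real`, the reference point, and the scales are all hypothetical.

```python
def bev_pixel_to_real(pixel, ref_pixel, ref_real, scale_x, scale_y):
    """Convert a BEV pixel to real-space meters using one known reference point.

    Hypothetical helper. Assumes real x grows to the right of the image and
    real y grows toward the top of the image, so the image y axis is flipped.
    """
    px, py = pixel
    ref_px, ref_py = ref_pixel
    real_x = ref_real[0] + (px - ref_px) * scale_x
    real_y = ref_real[1] - (py - ref_py) * scale_y  # image y grows downward
    return real_x, real_y


# Placeholder values: the bottom-left corner of the rectangle sits at BEV pixel
# (577, 950) and is assumed to have been measured at (-1.5 m, 4.0 m) from the origin.
scale_x, scale_y = 3.0 / 719, 2.5 / 590
top_right_real = bev_pixel_to_real(
    pixel=(1296, 360),
    ref_pixel=(577, 950),
    ref_real=(-1.5, 4.0),
    scale_x=scale_x,
    scale_y=scale_y,
)
```

With these placeholder numbers the top-right corner comes out 3.0 m to the right of and 2.5 m beyond the reference corner, which matches the assumed rectangle size.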

BEV Transform Example Code

import numpy as np
import cv2
import matplotlib.pyplot as plt

# Load the original image and record its size as (width, height)
raw_image = cv2.imread('BEV_Image/bev_pic.jpg')
if raw_image is None:
    raise FileNotFoundError('Could not read BEV_Image/bev_pic.jpg')
img_size = (raw_image.shape[1], raw_image.shape[0])

# Define the source and destination points (same corner order in both)
src = np.float32([[577, 226], [1296, 226], [1661, 719], [303, 719]])
dst = np.float32([[577, 360], [1296, 360], [1296, 950], [577, 950]])

# Compute the BEV matrix
M = cv2.getPerspectiveTransform(src, dst)

# Apply the BEV transform to the original image
bev = cv2.warpPerspective(raw_image, M, img_size, flags=cv2.INTER_LINEAR)
cv2.imwrite("BEV_Image/Transformed.jpg", bev)  # Save the result

# OpenCV loads images as BGR; convert to RGB so matplotlib shows correct colors
plt.imshow(cv2.cvtColor(bev, cv2.COLOR_BGR2RGB))
plt.show()

Resources

https://nikolasent.github.io/opencv/2017/05/07/Bird%27s-Eye-View-Transformation.html

Document generated by Confluence on Dec 10, 2021 04:01
