Generating binary images based on different thresholds: color transform, gradient transform
Executing a perspective transform (bird's-eye view)
I start by preparing "object points", which will be the (x, y, z) coordinates of the chessboard corners in the world. Here I am assuming the chessboard is fixed on the (x, y) plane at z=0, such that the object points are the same for each calibration image. Thus,
objp is just a replicated array of coordinates, and
objpoints will be appended with a copy of it every time I successfully detect all chessboard corners in a test image.
imgpoints will be appended with the (x, y) pixel position of each of the corners in the image plane with each successful chessboard detection.
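The object-point grid described above can be sketched with NumPy as follows. This is a minimal sketch rather than the project's exact code: the helper name `make_object_points` and the 9x6 inner-corner board size are assumptions chosen for illustration.

```python
import numpy as np

def make_object_points(nx=9, ny=6):
    """Return an (nx*ny, 3) array of chessboard corner coordinates with z = 0.

    nx and ny (inner corners per row and per column) are assumed values.
    """
    objp = np.zeros((nx * ny, 3), np.float32)
    # Fill the x and y columns with grid coordinates (0,0), (1,0), ..., (nx-1, ny-1).
    objp[:, :2] = np.mgrid[0:nx, 0:ny].T.reshape(-1, 2)
    return objp

objp = make_object_points()
objpoints = []                  # 3D points in world space, one entry per image
objpoints.append(objp.copy())   # done once per successful corner detection
```

Because the board is assumed flat and fixed, the same `objp` is reused for every calibration image; only the detected pixel positions differ from image to image.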
I then used the output
imgpoints to compute the camera calibration and distortion coefficients using the
cv2.calibrateCamera() function. I applied this distortion correction to the test image using the
cv2.undistort() function and obtained this result:
This process is for "pixelizing" the image of the road, that is, reducing it to a binary image of candidate lane pixels. The camera produces color images in which each pixel is represented by three values: R, G, and B. One might think that lanes can be found simply by selecting yellow and white pixels, but this does not work reliably. Depending on the lighting, the same color can be represented by different RGB values, so a more robust method is required.
Therefore, I used color transforms and gradient thresholds to create a binary image that captures the lane pixels regardless of the lighting conditions.
For the color transform, I first converted the image to the HLS format, which stands for Hue, Lightness, and Saturation. Then, I extracted the saturation channel:
s_channel, because the saturation value does not vary as much as the RGB values under different lighting conditions. Then, I "turned on" the pixels that fall within the threshold range I set for finding the lane pixels. The following is the result of the color transform:
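As a rough sketch of the "turning on" step, the function below thresholds a saturation channel that is assumed to be already extracted as an 8-bit array. The function name `color_threshold` and the threshold range `(170, 255)` are illustrative assumptions, not the values used in this project.

```python
import numpy as np

def color_threshold(s_channel, thresh=(170, 255)):
    """Return a binary image that is 1 where s_channel lies inside thresh.

    thresh is an assumed range; tune it for the actual footage.
    """
    color_binary = np.zeros_like(s_channel, dtype=np.uint8)
    color_binary[(s_channel >= thresh[0]) & (s_channel <= thresh[1])] = 1
    return color_binary

# Tiny 2x2 example: only the two strongly saturated pixels are kept.
s_channel = np.array([[10, 180], [255, 169]], dtype=np.uint8)
color_binary = color_threshold(s_channel)
```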
For the gradient threshold, I used the
cv2.Sobel() function. The Sobel operator is a filter that estimates how image intensity changes (the "trend") along a chosen direction, such as x or y, which makes edges stand out. More explanation of the Sobel operator can be found at "https://en.wikipedia.org/wiki/Sobel_operator".
After setting up
sobely, I made two kinds of thresholds: magnitude and directional. The magnitude of the identified trend was calculated as:
gradmag = np.sqrt(sobelx**2 + sobely**2). The direction of the trend was found by calculating the angle:
absgraddir = np.arctan2(np.absolute(sobely), np.absolute(sobelx)). The result is as follows:
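The two gradient thresholds can be sketched as below, taking the x and y Sobel responses as plain NumPy arrays. The function name `mag_dir_binaries`, the rescaling of the magnitude to 0-255, and both default threshold ranges are assumptions made for this illustration.

```python
import numpy as np

def mag_dir_binaries(sobelx, sobely, mag_thresh=(30, 100), dir_thresh=(0.7, 1.3)):
    """Return (mag_binary, dir_binary) from x and y gradient images.

    The threshold ranges are assumed values, not the project's settings.
    """
    # Gradient magnitude, rescaled to 0..255 so thresholds are image-independent.
    gradmag = np.sqrt(sobelx**2 + sobely**2)
    max_mag = gradmag.max()
    if max_mag > 0:
        scaled = (255 * gradmag / max_mag).astype(np.uint8)
    else:
        scaled = np.zeros(gradmag.shape, dtype=np.uint8)
    mag_binary = np.zeros_like(scaled)
    mag_binary[(scaled >= mag_thresh[0]) & (scaled <= mag_thresh[1])] = 1

    # Gradient direction in radians, between 0 and pi/2 because of the absolute values.
    absgraddir = np.arctan2(np.absolute(sobely), np.absolute(sobelx))
    dir_binary = np.zeros_like(scaled)
    dir_binary[(absgraddir >= dir_thresh[0]) & (absgraddir <= dir_thresh[1])] = 1
    return mag_binary, dir_binary
```

Taking absolute values before `np.arctan2` folds all directions into the first quadrant, so an edge is treated the same whether it goes from dark to light or from light to dark.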
To generate a binary output that represents every meaningful trend of pixels found in the input image, I first generated the binary output for each of the three methods: color, magnitude, and directional. Then, I created a combined binary output:
combined_binary[((mag_binary==1) & (dir_binary==1)) | (color_binary==1)]=1. The result is as follows:
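To make the combination rule concrete, here is a small worked example on four pixels. The wrapper name `combine_binaries` and the input values are made up for illustration; the indexing expression is the one quoted above.

```python
import numpy as np

def combine_binaries(color_binary, mag_binary, dir_binary):
    """Keep a pixel if the color test passes, or if both gradient tests pass."""
    combined_binary = np.zeros_like(color_binary)
    combined_binary[((mag_binary == 1) & (dir_binary == 1)) | (color_binary == 1)] = 1
    return combined_binary

# Four pixels covering the interesting cases (illustrative values):
color_binary = np.array([1, 0, 0, 0], dtype=np.uint8)  # color only
mag_binary   = np.array([0, 1, 1, 0], dtype=np.uint8)
dir_binary   = np.array([0, 1, 0, 1], dtype=np.uint8)
combined = combine_binaries(color_binary, mag_binary, dir_binary)
```

A pixel that passes only one of the two gradient tests is dropped, which suppresses edges that are strong but point in an unlikely direction for a lane line.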
Here are an input and an output under different lighting conditions and road surfaces (one asphalt, the other cement); the method works well in both:
Instead of processing the entire image to find lanes, it is much more efficient to focus on the area where the lanes are likely to be located. Then, before fitting the lane pixels with a second-order polynomial, it is better to change the perspective of the image, as if the picture had been taken from a bird's-eye view.
Here, I designated the 4 corner points of a trapezoid as the
src, and the 4 corner points of a rectangle as the
dst. Then, I ran
M = cv2.getPerspectiveTransform(src, dst) to find the transform matrix
M. Then, I ran the
cv2.warpPerspective() function to "warp" the perspective, and the following is the result:
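To show what the transform matrix encodes, the sketch below solves for a 3x3 perspective matrix from four point pairs using NumPy only. It is a stand-in written for illustration, not the OpenCV implementation and not this project's code; the function names and the `src`/`dst` coordinates are hypothetical.

```python
import numpy as np

def perspective_matrix(src, dst):
    """Solve for the 3x3 matrix M (with M[2,2] = 1) that maps each src point to dst."""
    A, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        # From u = (m0*x + m1*y + m2) / (m6*x + m7*y + 1), and likewise for v.
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y]); b.append(u)
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y]); b.append(v)
    m = np.linalg.solve(np.array(A, dtype=float), np.array(b, dtype=float))
    return np.append(m, 1.0).reshape(3, 3)

def warp_points(M, pts):
    """Apply M to an (n, 2) array of points using homogeneous coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))]) @ M.T
    return homog[:, :2] / homog[:, 2:3]

# Hypothetical trapezoid (road region) and rectangle (top-down view).
src = [(585, 460), (695, 460), (1127, 720), (203, 720)]
dst = [(320, 0), (960, 0), (960, 720), (320, 720)]
M = perspective_matrix(src, dst)
```

Mapping the trapezoid corners through `M` should land on the rectangle corners, which is a quick sanity check on any chosen `src` and `dst`.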
To find the lane pixels, I first found the starting point of each lane by locating the maximum of the histogram of the perspective-transformed image.
Then, from the starting points of the left and right lanes, I drew "windows" to trace each lane upward. The center of the window is updated whenever enough pixels are found in the window.
Lastly, I fit a 2nd-order polynomial to the pixels found for each of the left and right lanes, with
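The search described in this section can be sketched as follows. Several details are assumptions made for the sketch: the histogram is taken over the bottom half of the image, the window count, margin, and minimum pixel count are illustrative defaults, `np.polyfit` is used as the fitting routine, and all function names are hypothetical.

```python
import numpy as np

def find_lane_bases(binary_warped):
    """Return the x positions of the histogram maxima in the left and right halves."""
    bottom_half = binary_warped[binary_warped.shape[0] // 2:, :]
    histogram = np.sum(bottom_half, axis=0)
    midpoint = histogram.shape[0] // 2
    left_base = int(np.argmax(histogram[:midpoint]))
    right_base = int(np.argmax(histogram[midpoint:])) + midpoint
    return left_base, right_base

def trace_lane(binary_warped, x_base, nwindows=9, margin=100, minpix=50):
    """Slide windows upward from x_base and collect the lane pixel coordinates."""
    height = binary_warped.shape[0]
    window_height = height // nwindows
    nonzeroy, nonzerox = binary_warped.nonzero()
    x_current = x_base
    lane_inds = []
    for window in range(nwindows):
        y_low = height - (window + 1) * window_height
        y_high = height - window * window_height
        good = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                (nonzerox >= x_current - margin) &
                (nonzerox < x_current + margin)).nonzero()[0]
        lane_inds.append(good)
        # Re-center the next window only when enough pixels were found.
        if len(good) > minpix:
            x_current = int(np.mean(nonzerox[good]))
    lane_inds = np.concatenate(lane_inds)
    return nonzerox[lane_inds], nonzeroy[lane_inds]

def fit_lane(xs, ys):
    """Fit x = A*y**2 + B*y + C; x is a function of y because lanes are near-vertical."""
    return np.polyfit(ys, xs, 2)
```

Fitting x as a function of y, rather than the reverse, avoids the near-infinite slopes that a mostly vertical lane line would produce.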
Calculating the radius of curvature and the center position of the car is very important for self-driving. Based on the radius of curvature of the lanes and how far the car is from the center of the road, the necessary amount of steering can be calculated.
Before simply applying the radius of curvature equation, I converted the pixel values to real-world meters by multiplying by
ym_per_pix. These values were calculated by comparing the actual length of the broken white lane line with the number of pixels it occupies in the image. In this particular project, I used
xm_per_pix = 3.7/700 and
ym_per_pix = 30/700. After the transformation, I calculated the radius of curvature based on the poly-fitted lines.
For example, the value for the right lane was calculated with this line of code.
right_curverad = ((1 + (2*right_fit_cr[0]*y_eval*ym_per_pix + right_fit_cr[1])**2)**1.5) / np.absolute(2*right_fit_cr[0])
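For a fit of the form x = A*y**2 + B*y + C, the radius of curvature at a given y is R = (1 + (2*A*y + B)**2)**1.5 / |2*A|. The sketch below wraps the pixel-to-meter conversion and this formula in one function. The function name and argument names are hypothetical, `np.polyfit` is an assumed choice of fitting routine, and the default scale factors are the ones quoted in this section.

```python
import numpy as np

def curvature_radius_m(xs_pix, ys_pix, y_eval_pix,
                       xm_per_pix=3.7 / 700, ym_per_pix=30 / 700):
    """Radius of curvature in meters, evaluated at pixel row y_eval_pix."""
    xs_pix = np.asarray(xs_pix, dtype=float)
    ys_pix = np.asarray(ys_pix, dtype=float)
    # Refit in metric space so the coefficients carry real-world units.
    fit_cr = np.polyfit(ys_pix * ym_per_pix, xs_pix * xm_per_pix, 2)
    y_m = y_eval_pix * ym_per_pix
    return (1 + (2 * fit_cr[0] * y_m + fit_cr[1]) ** 2) ** 1.5 / np.absolute(2 * fit_cr[0])

# Synthetic lane: x = 1e-4 * y**2 + 300 in pixel coordinates.
ys = np.linspace(0, 719, 72)
xs = 1e-4 * ys**2 + 300
radius = curvature_radius_m(xs, ys, y_eval_pix=0)
```

Refitting after scaling, instead of scaling the pixel-space radius, matters because x and y use different meters-per-pixel factors.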
Calculating the center position of the car is much simpler than calculating the radius of curvature. I assumed that the camera was mounted at the exact center of the vehicle. Then, I calculated the mean of the starting points of the left and right lanes. Finally, I subtracted this mean from half of the image width (the x dimension of the image shape), and multiplied it with
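The offset computation can be sketched in a few lines. The function and argument names are hypothetical, and using `xm_per_pix` as the pixel-to-meter factor for the horizontal direction is an assumption of this sketch.

```python
def center_offset_m(left_x_start, right_x_start, image_width, xm_per_pix=3.7 / 700):
    """Signed distance in meters between the image center and the lane center.

    Assumes the camera sits on the vehicle's centerline and that x grows to the right.
    """
    lane_center = (left_x_start + right_x_start) / 2.0
    return (image_width / 2.0 - lane_center) * xm_per_pix

# Illustrative values: lanes start at x = 300 and x = 1000 in a 1280-pixel-wide image.
offset = center_offset_m(300, 1000, 1280)
```

With these inputs the lane center is at x = 650 and the image center at x = 640, so the result is negative: the camera sits slightly to the left of the lane center.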
This is an image after running through all the steps described above.