What You’ll Learn: A Quick Overview
This tutorial is written to be easily digestible. It can serve as a starting point for beginners or as reference material for others. It covers the I/O functionality of OpenCV, size and color manipulation of images, blurring, image manipulation through static and adaptive thresholds, edge detection, parametric drawing and, lastly, contours.
To install OpenCV, simply run pip install opencv-python in the terminal of your choice.
All of the associated code and several examples are available at the following GitHub repository: https://github.com/mosmar99/OpenCV-Fundamentals
1.0 | Input/Output
There are three types of standard I/O in OpenCV. The first two are images and videos stored somewhere on your computer. The third is your webcam.
import cv2

# read image
image_path = "beagle.png"
image = cv2.imread(image_path)
print('(height, width, #channels) <=>', image.shape)

# write image
cv2.imwrite("image_out.png", image)

# visualize image
cv2.imshow('Beagle Image', image)
cv2.waitKey(0)
The code snippet above imports the OpenCV library (import cv2), specifies the path of the image and reads the image using the imread function within the cv2 module. Note that all cv2 images are stored internally as NumPy arrays. Each image has an associated height, width and channel count, reported by image.shape in precisely that order. The channel count represents the basic units of color within the image, typically three (red, green, blue) for color images. Note that cv2 by default uses the BGR channel order instead of the more common RGB order. Writing an image with imwrite simply saves it to disk. The imshow command displays it, which is often useful for visually inspecting applied transformations. The cv2.waitKey(0) call keeps the window open until the user presses any key on the keyboard.
import cv2

# read video
video_path = "beagle_vid.mp4"
video = cv2.VideoCapture(video_path)

# visualize video
ret = True
while ret:
    # video.read() returns the boolean ret=True while frames remain in the video
    ret, frame = video.read()
    if ret:
        cv2.imshow('Beagle Frame', frame)
        # my beagle video is 30 frames/second: (1/30)*1000 ms/frame
        if cv2.waitKey(33) & 0xFF == ord('q'):
            break

video.release()
cv2.destroyAllWindows()
To read videos instead of images, simply use cv2.VideoCapture. Each call to video.read() returns the next frame together with the boolean ret, which becomes False once the frames run out. While there are still frames to display, each one is shown for the number of milliseconds specified in waitKey, or until the user presses the q key on the keyboard.
import cv2

# read webcam
webcam = cv2.VideoCapture(0)
if not webcam.isOpened():
    print("Error opening video")

# visualize webcam
while True:
    ret, frame = webcam.read()
    cv2.imshow('frame', frame)
    if cv2.waitKey(35) & 0xFF == ord('q'):
        break

webcam.release()
cv2.destroyAllWindows()
Instead of specifying a path to a video, the video is generated live from the incoming frames of the local webcam. I selected webcam 0; you may choose another index if you have several cameras. Note that the loop is not conditioned on the boolean ret being true, since frames are constantly incoming from the webcam. The user can similarly quit by pressing q.
2.0 | Size Manipulation
import cv2

img = cv2.imread("beagle.png")
resized_img = cv2.resize(img, (330, 180))

print(img.shape)
print(resized_img.shape)

cv2.imshow('img', img)
cv2.imshow('resized_img', resized_img)
cv2.waitKey(0)
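The image is simply resized: it looks the same but smaller, since its width and height have been adjusted. Note that cv2.resize takes the target size as (width, height), whereas shape reports (height, width, #channels).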
import cv2

img = cv2.imread("beagle.png")
print(img.shape)

# rows (y) from 50 onwards, columns (x) from 75 up to 439
cropped_img = img[50:, 75:440]

cv2.imshow('img', img)
cv2.imshow('cropped_img', cropped_img)
cv2.waitKey(0)
In this case the image has actually been cropped, i.e., certain parts of the image have been pruned. It is the same beagle image as before; I simply cropped away parts of the background. Since images are NumPy arrays, cropping is plain array slicing, with rows (y) first and columns (x) second.
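One detail worth knowing about slicing-based crops is that NumPy returns a view rather than a copy, so editing the crop also edits the original image. The following sketch, using a synthetic image as an assumed stand-in, illustrates the difference and shows how .copy() produces an independent crop.

```python
import numpy as np

# Synthetic 300x500 black image standing in for a real photo (assumption).
img = np.zeros((300, 500, 3), dtype=np.uint8)

cropped_view = img[50:, 75:440]         # shares pixel data with img
cropped_copy = img[50:, 75:440].copy()  # owns its pixel data

cropped_view[0, 0] = (0, 0, 255)  # also changes img[50, 75]
cropped_copy[1, 1] = (0, 255, 0)  # leaves img untouched

print(cropped_view.shape)      # (250, 365, 3)
print(img[50, 75], img[51, 76])
```

Taking a copy is the safer choice whenever the crop is going to be drawn on or otherwise modified.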
3.0 | Color Manipulation
import cv2

img = cv2.imread("beagle.png")

# standard cv2 color space: BGR (blue-green-red)
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
img_hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)

print(img.shape)
print(img_gray.shape)
print(img_hsv.shape)

cv2.imshow('beagle', img)
cv2.imshow('beagle_gray', img_gray)
cv2.imshow('beagle_rgb', img_rgb)
cv2.imshow('beagle_hsv', img_hsv)
cv2.waitKey(0)
One can easily convert the image from the three-channel BGR scheme (the OpenCV default) to another color scheme, be it a single channel (grayscale) or another three-channel scheme (RGB or HSV). This is accomplished with cvtColor, i.e., convert color. The available conversions can be found at https://docs.opencv.org/3.4/de/d25/imgproc_color_conversions.html.
Why are the colors of images commonly encoded in three channels? It has to do with our biology. The human eye has three types of photoreceptor cells for color (cones), which are most sensitive to red, green and blue light respectively. Trichromacy is not unique to humans; several animals can see colors that we cannot, and vice versa.
4.0 | Blurring an Image
import cv2

img = cv2.imread("old_pic.jpg")

k_size = 3
img_blur = cv2.blur(img, (k_size, k_size))
# the Gaussian blur is applied twice
img_gaussian_blur = cv2.GaussianBlur(img, (k_size, k_size), 1)
img_gaussian_blur = cv2.GaussianBlur(img_gaussian_blur, (k_size, k_size), 1)
img_median_blur = cv2.medianBlur(img, k_size)

cv2.imshow('img', img)
cv2.imshow('img_blur', img_blur)
cv2.imshow('img_gaussian_blur', img_gaussian_blur)
cv2.imshow('img_median_blur', img_median_blur)
cv2.waitKey(0)
One can blur an image through several different techniques. What all of them have in common is their use of kernels, which can be thought of as NxN arrays, where N is an odd positive integer. The kernel slides across the image and, at each position, computes a new pixel value from the old pixel values that fall within it. For the simple and Gaussian blurs the kernel consists of weights, and the weight distribution directly impacts the resulting blur. For example, a kernel with a Gaussian weight distribution prioritizes the pixels close to the kernel center. A simple blur, on the other hand, weights all pixels within the kernel equally. The median blur instead replaces each pixel with the median of its neighborhood. All these methods have their strengths and weaknesses. For a more thorough explanation, please visit https://docs.opencv.org/4.x/d4/d13/tutorial_py_filtering.html. Physical photographs tend to deteriorate over time. I managed to remove some noise in an old photo, especially in the faces and the background; see the before (to the left) and after (to the right) below.
5.0 | Thresholds
import cv2

img = cv2.imread('beagle.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

ret, thresh = cv2.threshold(img_gray, 150, 255, cv2.THRESH_BINARY)

# blur the binary image and threshold it once more
thresh = cv2.blur(thresh, (10, 10))
ret, thresh = cv2.threshold(thresh, 80, 255, cv2.THRESH_BINARY)

cv2.imshow('img', img)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
When setting a threshold, one specifies the gray value above which a pixel goes completely bright (I set it to go completely white, i.e., 255); all other pixels go completely dark. One can of course combine this with the techniques used before, for example blurring. I blurred the thresholded image and thresholded it again in order to smooth the boundaries between the black and white regions.
import cv2

img = cv2.imread("beagle.png")
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# block size 7, constant 8 subtracted from the local weighted mean
thresh = cv2.adaptiveThreshold(img_gray, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C, cv2.THRESH_BINARY, 7, 8)

cv2.imshow('img', img)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
The adaptive threshold, on the other hand, computes the threshold by itself from each pixel's neighborhood (here a 7x7 block). As a result, every section of the image gets its own threshold.
6.0 | Edge Detection
import numpy as np
import cv2

im = cv2.imread("dzeko.png")

im_edge = cv2.Canny(im, 100, 200)  # highlight edges
# structuring elements are conventionally given as uint8 arrays
im_edge_d = cv2.dilate(im_edge, np.ones((2, 2), dtype=np.uint8))
im_edge_e = cv2.erode(im_edge_d, np.ones((3, 3), dtype=np.uint8))

cv2.imshow('dzeko', im)
cv2.imshow('dzeko_edge', im_edge)
cv2.imshow('dzeko_edge_d', im_edge_d)
cv2.imshow('dzeko_edge_e', im_edge_e)
cv2.waitKey(0)
OpenCV includes functionality for detecting edges. One can subsequently dilate (make edges thicker) or erode (make edges thinner) the result as desired. Canny takes the image in question, followed by the minimum and maximum thresholds for the hysteresis. Gradients above the maximum are accepted as edges, gradients below the minimum are discarded, and those in between are kept only if they connect to a strong edge. Which edges are detected is therefore determined by this interval. For further implementation details, see https://docs.opencv.org/4.x/da/d22/tutorial_py_canny.html. The image you see above is of footballer Edin Dzeko.
7.0 | Parametric Drawing
import cv2

im = cv2.imread("whiteboard.png")
print(im.shape)

cv2.line(im, (100, 150), (200, 250), (0, 255, 0), 3)
cv2.rectangle(im, (50, 50), (100, 100), (0, 0, 255), -1)
cv2.circle(im, (250, 150), 80, (255, 0, 0), 3)
cv2.putText(im, 'Hello, Human?', (100, 300), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 255, 0), 2)

cv2.imshow('img', im)
cv2.waitKey(0)
OpenCV also enables the user to draw directly on top of images. To draw a line, one specifies the start and end coordinates, the line color and the thickness. Similarly, you can draw rectangles and circles or add some text. A thickness of -1 fills the shape, as with the rectangle above.
8.0 | Contours
With contours, one can not only highlight all object contours, but also box in objects by virtue of those contours. It is useful to know that one does not always need to apply an advanced computer vision algorithm such as YOLO simply to identify objects. Sometimes traditional methods suffice.
import cv2
print(cv2.__version__)

img = cv2.imread('birds.png')
img_gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

ret, thresh = cv2.threshold(img_gray, 100, 255, cv2.THRESH_BINARY_INV)

# findContours expects thresh to be an image of one channel -> convert img to img_gray
contours, hierarchy = cv2.findContours(thresh, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

for contour in contours:
    if cv2.contourArea(contour) > 135:
        # cv2.drawContours(img, contour, -1, (0, 255, 0), 1)
        x1, y1, w, h = cv2.boundingRect(contour)
        # opposite corner is (x1 + width, y1 + height)
        cv2.rectangle(img, (x1, y1), (x1 + w, y1 + h), (0, 255, 0), 2)

cv2.imshow('img', img)
cv2.imshow('img_gray', img_gray)
cv2.imshow('thresh', thresh)
cv2.waitKey(0)
In this case I used the contours found in the inverted binary image (the birds became white and the background black). This is because the contour function expects white objects on a black background. I then check whether the contour area is larger than a specified threshold value, so as not to highlight small holes within larger objects. Subsequently, I obtain the box coordinates through boundingRect, which returns the top-left corner of the rectangle together with the width and height of the box. We can thereafter draw rectangles around the identified objects using the rectangle function detailed in the previous section.