Dynamic Color Detection System

Real-Time Object Recognition Using HSV Color Space and OpenCV
Author

Mahmut Osmanovic

Published

February 13, 2025

1.0 | Description

The objective of the project is to detect objects of a specified color in the frames continuously delivered by a webcam. The color detector is built on OpenCV and leverages two further fundamental Python modules within computer vision, namely NumPy and PIL (Pillow). The prerequisites are low; almost any laptop with an external or integrated camera will suffice. The project uses the HSV colorspace to achieve the desired result. The color detector is computationally cheap and may find use as a tool in certain environments, such as farming. Take, for instance, various kinds of autonomous finite-state automata whose goal is to distinguish between ripe and unripe fruit for the purpose of harvesting. The project is deemed successful if the program manages to highlight an object of the specified color with a bounding box.

The program code associated with the project is available at https://github.com/mosmar99/Color-Detection.

2.0 | Code & Method

import cv2
from PIL import Image

from util import get_limits

yellow = [0, 255, 255]  # yellow in BGR colorspace
# The HSV limits depend only on the target color, so compute them once
lower_limit, upper_limit = get_limits(color=yellow)

cap = cv2.VideoCapture(0)  # device 0 (the integrated webcam in this setup)
while True:
    ret, frame = cap.read()

    # Convert the BGR frame to HSV and keep only the pixels inside the limits
    hsv_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    mask = cv2.inRange(hsv_frame, lower_limit, upper_limit)

    # Convert the mask from a NumPy array to a Pillow image to get its bounding box
    mask_ = Image.fromarray(mask)
    bounding_box = mask_.getbbox()
    if bounding_box is not None:
        x1, y1, x2, y2 = bounding_box
        cv2.rectangle(frame, (x1, y1), (x2, y2), (255, 0, 0), 3)  # blue box (BGR)
    cv2.imshow('frame', frame)

    # Quit when 'q' is pressed; & 0xFF keeps only the low byte of the key code
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

cap.release()
cv2.destroyAllWindows()

The main.py file is shown in the code snippet above. The color of interest (the color to detect) is specified in BGR, the default color format in OpenCV. A VideoCapture object is created that receives input from device 0, which in my case is my integrated laptop webcam. As long as the camera keeps delivering frames, the boolean ret stays true, which is why the while-loop is not conditioned on ret. Should the camera become unavailable, however, cap.read() reports a failed read through ret, so a more defensive version would stop the loop in that case. Each frame is accessed through the frame variable. Note that the BGR webcam input is first converted to HSV format.
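To make the role of ret concrete, here is a minimal sketch of a loop that stops once a read fails. It is not part of the project code: read_frames and FakeCapture are hypothetical names introduced for illustration. FakeCapture only mimics the (ret, frame) pair that cap.read() returns, so the sketch runs without a camera.

```python
def read_frames(cap):
    """Yield frames from a capture-like object until a read fails."""
    while True:
        ret, frame = cap.read()
        if not ret or frame is None:
            # The source stopped delivering frames (e.g. camera unplugged)
            break
        yield frame


class FakeCapture:
    """Hypothetical stand-in for cv2.VideoCapture, for demonstration only."""

    def __init__(self, frames):
        self._frames = list(frames)

    def read(self):
        # Mimic cap.read(): (True, frame) while frames remain, else (False, None)
        if self._frames:
            return True, self._frames.pop(0)
        return False, None


frames_seen = list(read_frames(FakeCapture(["frame-0", "frame-1"])))
print(frames_seen)
```

With a real camera, the same generator could presumably be driven by `for frame in read_frames(cv2.VideoCapture(0)):`, with the per-frame processing moved into the loop body.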

HSV Colorspace

The HSV colorspace specifies colors differently from BGR and is, I think, best described with a figure; inspect the one below.

The HSV colorspace uses three parameters to specify a color, namely Hue (H), Saturation (S) and Value (V). The angle around the vertical axis corresponds to the hue, the distance from the axis corresponds to the saturation, and the distance along the axis corresponds to the value (also referred to as brightness). Each input frame is converted from the BGR colorspace to the HSV colorspace using the OpenCV module. Thereafter, one has to specify the range of the HSV colorspace that one wants to detect.

Restricting the detector to a single color combination (BGR [0, 255, 255]) is too restrictive. Since colors lie on a continuum, no single value captures yellowness. Instead, consider again the figure above and imagine cutting out a slice of that “cake” that covers most of the color combinations we would classify as yellow. If the camera captures any pixels within that range, they are labeled as yellow (see GitHub for all code). That is what the mask identifies: all pixels whose colors fall within the range are painted white, and all others black.

The Pillow module then provides the coordinates of the bounding box that encloses the white pixels of the mask. The box is subsequently drawn on the actual frame using standard functions within the cv2 library (the chosen color of the box is blue). The user can quit the program at any time by pressing the key q (q for quit).
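The idea of cutting a slice out of the hue circle can be sketched in a few lines of plain Python. This is not the get_limits helper from the repository, which is not shown here. The function names, the hue tolerance of 10 and the minimum saturation and value of 100 are all assumptions chosen for illustration. The conversion uses the standard-library colorsys module and rescales to the 8-bit convention OpenCV uses (hue 0 to 179, saturation and value 0 to 255), so it only approximates what cv2.cvtColor produces.

```python
import colorsys


def bgr_to_opencv_hsv(bgr):
    """Approximate OpenCV's 8-bit HSV (H 0-179, S and V 0-255) for one BGR pixel."""
    b, g, r = (channel / 255.0 for channel in bgr)
    h, s, v = colorsys.rgb_to_hsv(r, g, b)
    return (round(h * 180) % 180, round(s * 255), round(v * 255))


def in_hue_slice(bgr_pixel, target_bgr, hue_tolerance=10, min_sat=100, min_val=100):
    """Return True if the pixel lies in the hue slice around the target color.

    The tolerance and the saturation/value floors are illustrative assumptions.
    """
    target_hue = bgr_to_opencv_hsv(target_bgr)[0]
    h, s, v = bgr_to_opencv_hsv(bgr_pixel)
    diff = abs(h - target_hue)
    hue_distance = min(diff, 180 - diff)  # hue is an angle, so it wraps around
    return hue_distance <= hue_tolerance and s >= min_sat and v >= min_val


yellow = [0, 255, 255]  # yellow in BGR
print(bgr_to_opencv_hsv(yellow))              # (30, 255, 255)
print(in_hue_slice([20, 200, 220], yellow))   # a darker yellow: inside the slice
print(in_hue_slice([255, 0, 0], yellow))      # pure blue: outside the slice
```

Applying such a test to every pixel and painting the matches white is, in effect, what a call like cv2.inRange does for the whole frame at once.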

3.0 | Result

The result is shown below. I chose to look for yellowish objects within the frames. Below is an image of me holding a yellow Post-it note, which is successfully and continuously detected and boxed in.

4.0 | Limitations

A limitation of this methodology is that some colors are simply easier to detect than others. Yellow is comparatively easy to detect since it is rarer in typical surroundings, i.e., it sticks out. Colors such as red, however, are more difficult to deal with as they are present in human skin color (to varying degrees). Hence, an unintended consequence is that not only the red object will be detected, but the human holding it will be boxed in as well. Another limitation is that the program only considers one object, because the bounding box encloses all detected pixels in the mask at once. Thus, even if I were to hold two yellow sticky notes, one in each hand, only one large bounding box would be drawn, encapsulating both of them.
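One possible way around the single-box limitation, sketched here under assumptions and not implemented in the project, is to split the mask into connected regions and compute one box per region. The function component_boxes below is a hypothetical, pure-Python illustration that works on a small mask given as a list of rows. It uses a breadth-first flood fill with 4-connectivity and returns boxes with an exclusive right and lower edge.

```python
from collections import deque


def component_boxes(mask):
    """Return one (x1, y1, x2, y2) box per 4-connected region of non-zero pixels."""
    height = len(mask)
    width = len(mask[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill starting from this unvisited white pixel
            seen[y][x] = True
            queue = deque([(x, y)])
            x1 = x2 = x
            y1 = y2 = y
            while queue:
                cx, cy = queue.popleft()
                x1, x2 = min(x1, cx), max(x2, cx)
                y1, y2 = min(y1, cy), max(y2, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            # Right and lower edges are exclusive (one past the last white pixel)
            boxes.append((x1, y1, x2 + 1, y2 + 1))
    return boxes


# Two separate white regions, e.g. one sticky note in each hand
mask = [
    [0, 255, 255, 0, 0,   0, 0],
    [0, 255, 255, 0, 0, 255, 0],
    [0,   0,   0, 0, 0, 255, 0],
]
print(component_boxes(mask))  # [(1, 0, 3, 2), (5, 1, 6, 3)]
```

A Python loop like this would be far too slow for full-size video frames. In practice, OpenCV functions such as cv2.findContours combined with cv2.boundingRect, or cv2.connectedComponentsWithStats, would be the more likely choice. Small noise regions would probably also need to be filtered out by area.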