This section provides the background to the problem we are solving, “Vision Based Drowsiness Detector for Real Driving Conditions”. The background is followed by a literature review in the sub-section titled “Previous Work”, which briefly summarizes the relevant work done by other researchers on drowsiness detection systems. The purpose and goal of this report are then briefly discussed, and the section closes with the approach that we will follow.
1.1 Background
Vehicle crashes and accidents due to drowsy driving are prevalent all over the world. Thousands of people die every year in vehicle accidents caused by drowsy driving [1, 2]. According to data from Australia, England, Finland, and other European nations, all of which have consistent crash reporting procedures, drowsy driving accounts for 10 to 30 percent of all crashes [3]. In Pakistan as well, car and truck crashes are frequently reported in the news, and one of the major reasons for these accidents is driver fatigue or sleepiness [4, 5]. To reduce such accidents and enhance the safety of the driver and the passengers, researchers across the world have developed driver drowsiness detectors.
Drowsiness detection systems can be broadly categorized by the method they depend on [6, 7]:
1. Vehicle based
These systems detect driver drowsiness by monitoring lane changes, steering wheel movement, vehicle speed, pressure on the accelerator pedal, etc.
2. Behavioral based
These detectors depend on the visible behaviour of the driver. Eye closure or blinking, yawning, and head pose changes are monitored through a camera to detect drowsiness.
3. Physiological based
These detectors rely on the correlation between drowsiness and physiological signals such as ECG (electrocardiogram) and EOG (electrooculogram). Pulse rate, heartbeat, and brain activity are used to detect drowsiness.
Physiological based drowsiness detection systems such as [8] have the limitation that the driver has to wear electrodes on the body, which can be annoying.
Vehicle based drowsiness detection systems such as [9], in turn, are not robust because they are subject to constraints related to the kind of driver and vehicle, road conditions, etc. Hence, it is best to develop drowsiness detection systems based on visual assessment of the driver's face: these systems do not require the driver to wear anything, and current computer vision techniques based on convolutional neural networks make it possible to develop highly robust systems.
1.2 Computer Vision
Understanding what we see is an easy task for us humans, but for computers the task of understanding what an image contains or means is difficult. Computer Vision is the field dedicated to making machines understand what is happening in an image, i.e., to emulate human vision and make decisions based on it.
Another common term that is often confused with Computer Vision is image processing. Image processing involves applying transformations (rotating, sharpening, smoothing, stretching) to an image to make it more readable. Computer vision tasks are accomplished by harnessing image processing methods. Broadly, there are two main categories of computer vision techniques [10].
Both of them are described below.
1.2.1 Traditional Vision
These approaches extract human-engineered features such as edges, colors, corners, and texture, and hence depend on traditional image processing techniques. Some techniques used in traditional vision are briefly discussed below.
1.2.1.1 Face Detection, Viola and Jones [11]
1.2.1.2 SIFT (Scale-Invariant Feature Transform) [12]
SIFT is an algorithm used to detect and describe local features in images.
This algorithm helps solve the problem that certain features, such as edges and corners, are scale variant, i.e., a corner in an image may not look like a corner if the image is scaled, as shown in Figure 1.2.1.1.1 below.
Figure 1.2.1.1.1: Left: A corner in a window. Right: The corner on the left is scaled and viewed again in the same window. As can be seen, the corner now appears flat in the window; hence, it is scale variant. Image source: http://opencv-python-tutroals.readthedocs.io/en
The SIFT algorithm deals with this problem by using a series of mathematical approximations to obtain a scale-invariant representation of the image. In effect, it tries to standardize all images (if the image is blown up, SIFT shrinks it; if the image is shrunk, SIFT enlarges it).
This corresponds to the idea that if some feature (say, a corner) can be detected in an image using a square window of dimension σ across the pixels, then after the image is scaled up we would need a window of larger dimension kσ to capture the same corner. We will not go into the mathematical details of how this is done, but to reiterate, SIFT standardizes the scale of an image and then extracts key features. What constitutes a key feature is beyond the scope of this report.
SURF (Speeded-Up Robust Features) is a speeded-up version of SIFT. BRIEF (Binary Robust Independent Elementary Features) is a binary descriptor proposed as a faster alternative to the SIFT and SURF descriptors.
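The window-size argument made for SIFT above can be illustrated with a small one-dimensional toy. This is not SIFT itself: it is a minimal sketch that replaces SIFT's scale-space machinery with a simple centre-surround box filter, and it models "scaling the image by k" by multiplying the width of a bright bar by k. The function names (`bar_signal`, `centre_surround_response`, `best_window`) are hypothetical and chosen only for this illustration.

```python
def bar_signal(half_width, length):
    """1-D 'image': zeros with a bright bar of width 2*half_width + 1 at the centre."""
    centre = length // 2
    return [1.0 if abs(i - centre) <= half_width else 0.0 for i in range(length)]


def mean(values):
    return sum(values) / len(values)


def centre_surround_response(signal, window):
    """Mean inside a window of half-width `window` minus the mean inside
    a window twice as wide, both centred on the middle of the signal."""
    centre = len(signal) // 2
    inner = signal[centre - window: centre + window + 1]
    outer = signal[centre - 2 * window: centre + 2 * window + 1]
    return mean(inner) - mean(outer)


def best_window(signal, max_window):
    """Window half-width that gives the strongest response at the centre.
    Assumes len(signal) > 4 * max_window so the outer window fits."""
    return max(range(1, max_window + 1),
               key=lambda w: centre_surround_response(signal, w))


# The same structure at the original scale and after scaling by k = 3.
small = bar_signal(half_width=5, length=201)
large = bar_signal(half_width=15, length=201)
print(best_window(small, 40), best_window(large, 40))
```

In this toy, the window that responds most strongly has half-width 5 for the original bar and 15 for the scaled one, so the best window grows by the same factor k as the image. A detector with a single fixed window size would respond well at only one of the two scales, which is the problem a scale-invariant representation is meant to address.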
Histogram of oriented gradients (HOG) is a feature vector representation technique widely used for object detection tasks. The image is characterized by the distribution of local intensity gradients. This is implemented by dividing the image window into small spatial regions called cells. Each cell has a local one-dimensional histogram of gradient directions accumulated over the pixels of the cell. These histograms are combined to form the image representation.
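The cell-and-histogram construction described above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not a full HOG implementation: the cell size of 4 pixels, the 9 unsigned orientation bins over 0 to 180 degrees, the central-difference gradients, and the function name `hog_cell_histograms` are all choices made for this example, and the block normalization step used in practical HOG descriptors is omitted.

```python
import math


def hog_cell_histograms(image, cell_size=4, n_bins=9):
    """Concatenate per-cell histograms of gradient direction.

    `image` is a list of rows of grey-level values. Each pixel votes into
    the orientation bin of its cell, weighted by its gradient magnitude.
    """
    height, width = len(image), len(image[0])
    cells_y, cells_x = height // cell_size, width // cell_size
    bin_width = 180.0 / n_bins  # unsigned orientation: 0 to 180 degrees
    hists = [[[0.0] * n_bins for _ in range(cells_x)] for _ in range(cells_y)]

    for y in range(cells_y * cell_size):
        for x in range(cells_x * cell_size):
            # Central differences, clamped at the image border.
            gx = image[y][min(x + 1, width - 1)] - image[y][max(x - 1, 0)]
            gy = image[min(y + 1, height - 1)][x] - image[max(y - 1, 0)][x]
            magnitude = math.hypot(gx, gy)
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            bin_index = int(angle // bin_width) % n_bins
            hists[y // cell_size][x // cell_size][bin_index] += magnitude

    # Merge the cell histograms into one feature vector.
    feature = []
    for row in hists:
        for cell_hist in row:
            feature.extend(cell_hist)
    return feature


# 8x8 image with a vertical edge: dark left half, bright right half.
edge_image = [[0, 0, 0, 0, 1, 1, 1, 1] for _ in range(8)]
feature = hog_cell_histograms(edge_image)
print(len(feature))  # 2 x 2 cells, 9 bins each
```

For the vertical edge in this example, every non-zero gradient points horizontally, so all the votes fall into the first orientation bin of each cell. An image with edges in several directions would spread its votes across the bins, and it is this distribution that the feature vector captures.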