Research on HS Optical Flow Algorithm Based on Motion Estimation Optimization ()
1. Introduction
Biologist Gibson first proposed the concept of optical flow in 1950. In 1976, Poggio and Reiehardt proposed an algorithm for computing pixel motion in images when studying an insect visual system, and the algorithm is regarded as the oldest optical flow method. In 1981, Horn and Schunck creatively combined a 2D velocity field with brightness variations to establish a basic optical flow computation model by introducing basic optical flow constraint equation and global smoothing constraint condition. In the same year, Lucas and Kanade proposed that the local optical flow field of a moving image can be calculated rapidly by the least square optimization algorithm, assuming that the brightness of pixels remains constant in a small spatial neighborhood. Since then, the optical flow algorithm has been widely studied and applied. Various improvement algorithms have emerged to improve the accuracy, robustness, and real-time property of optical flow estimation.
Optical flow refers to a motion mode. This motion mode refers to an obvious movement of an object, surface, and edge by an observer (e.g., eyes and camera) against the background from a perspective. The optical flow technologies, such as motion detection, image segmentation, time collision, motion compensation coding, and 3D stereoscopic parallax, utilize the edge or surface movement technology. The optical flow algorithm is used to evaluate the deformation between two images. Its basic assumption is voxel and pixel conservation. This algorithm assumes that no obvious change occurs in the color of an object between two sequential frames [3] [4]. From this idea, an image constraint equation can be obtained. Different optical flow algorithms solve the optical flow problems with different additional conditions.
2. Applications of Optical Flow Algorithm and Related Algorithms in Motion Detection
Moving object detection is to detect the changing area in a sequence image and extract the moving object from the background image. Object classification, tracking, behavior understanding, and other post-treatment processes generally only consider the pixel regions that correspond to the moving objects in an image. Therefore, the correct detection and segmentation of moving objects are critical for later treatment. However, the detection and segmentation of moving objects become difficult due to the dynamic changes in scene, such as weather, lighting, shadow, and background interference. On the basis of whether the camera remains stationary or not, moving detection is divided into two types, namely [5], static and moving backgrounds. Most video surveillance systems have fixed cameras. Therefore, the moving object detection algorithms in static background are widely concerned. The commonly used methods include frame difference, optical flow, and background subtraction.
2.1. Frame Difference Method
Frame difference method is a commonly used method for moving object detection and segmentation. The basic principle is to extract the moving region in an image by the threshold of pixel-based time difference between two frames or among three adjacent frames in the image sequence. The corresponding pixel values of the adjacent frame images are first subtracted to acquire the differential image. The differential image is then binarized. Under a minimal change in the environmental brightness, if the change in the corresponding pixel value is less than the predetermined threshold value, this can be considered the background pixel. If the pixel value in the image region changes largely, it can be considered to be caused by the moving objects in the image. These regions are labeled as foreground pixels. The labeled pixel region can be used to determine the location of the moving object in the image.
Using the previous frame image as the background model of the current frame has a good real-time performance because the time interval among adjacent frames is short. Its background is not accumulated [6], and the update speed is fast. The algorithm is simple, and the computation amount is small. The disadvantage of the algorithm is that it is highly sensitive to environmental noise. Selecting the threshold value is critical. If the threshold value is too low, it is insufficient to suppress the noise in the image; if the threshold value is too high, the useful change in the image is ignored. For large moving objects with the same color, holes may be generated in the objects, and the moving objects cannot be extracted completely.
2.2. Background Subtraction Method
Background subtraction method is an effective algorithm for moving object detection. The basic ideas are to approximate the pixel value of the background image by using the background parameter model and to perform a difference comparison between the current frame and the background image to detect the moving area. The pixel region with a larger difference is considered the moving region, and the pixel region with a smaller difference is considered the background region. Background subtraction must have a background image, and the background image must be updated in real time with the changes in the lighting or external environment. Therefore, the key to background subtraction is background modeling and updating.
How to establish an adaptive background model against the dynamic changes in different scenes to reduce the impacts of dynamic scene changes on the moving segmentation? Many background modeling algorithms have been proposed by researchers. They can be generalized into two categories, namely, nonregressive recursion and regressive recursion. Nonregressive background modeling algorithms are to conduct background modeling by dynamically using newly observed data stored in the period from a certain time to the current time as a specimen. Such methods include the most simple frame-to-frame difference, median filtering method and contain the linear filter for estimating the background model by Toyama et al. using buffered sample pixel and the nonparameter model by using a period of historical data to calculate the density of background pixel proposed by Elg-al. The regression algorithm does not need to maintain the buffer of background estimation frame in background estimation. The background model at a certain moment is updated on the basis of the input of each frame of image by regression. This method includes the widely used linear Kalman filtering method and the mixture of Gaussian models proposed by Stauffe and Grimson.
2.3. Optical Flow Method
The main purpose of the optical flow method is to calculate the optical flow field. Under the appropriate smoothness constraints, the moving field is estimated according to the temporal-spatial gradient of image sequence, and the moving objects and scenes are detected and segmented by analyzing the changes in the moving field. Global-based optical flow field and characteristic point optical flow field generally exist.
The most classic global optical flow field calculation methods include the Lucas-Kanade method and the Horn-Schunck (HS) method. After the global optical flow field is obtained, the optical flow segmentation is performed for the moving object by comparing the motion differences between the moving object and background. The disadvantage is that the amount of computation is large. This algorithm is the most common and popular. The algorithm is a two-frame differential optical flow estimation method that aims to calculate the movement of two frames at each pixel point in the time interval of t -t + δt. This method is based on Taylor’s series of image signals and is called differential. This differential method is the application of partial derivatives for spatial and temporal coordinates.
The characteristic point optical flow method is to find the velocity of flow at the characteristic point by characteristic matching, which is characterized by a small amount of computation, rapidity, and flexibility. However, the shape of a moving object is difficult to extract accurately from a sparse optical flow field. The optical flow method can detect moving objects and deal with background motion without knowing any information of the scene in advance. The drawback is that noise [7], multiple sources, shadows and occlusion will seriously affect the calculation results of optical flow field distribution. The optical flow method is also complex in the computation and difficult to achieve real-time processing.
3. Research on the Characteristics of Optimization Algorithm for Motion Estimation
Bruhn indicated that the differential method is the most accurate optical flow method. The implementation is also relatively simple. Therefore, this method has been widely studied and applied. However, all differential optical flow methods almost originate from the two-point hypothetical models proposed by Horn and Schunck, namely, brightness uniformity and smoothness. Many problems exist in their applications. First, the numerical solution of a differential optical flow equation has ill-posedness. Therefore, we must first solve the ill-posed problem (finding the excellent smoothness constraints). The smoothness constraints added will also lead to blurred image edges, resulting in excessive smoothing. Second, according to the differential method, the image
must be differentiable. In low-texture regions, the image gradient is very small or even zero, which results in the loss of motion information. Spatiotemporal presmoothing is sometimes necessary for the image to avoid the aliasing effect. How to find an accurate presmoothing method is also a major difficult point. Third, the HS optical flow model utilizes a quadratic penalty function and overall smoothness constraints. However, in practice, discontinuities usually exist in the form of mutation and independent multiple moving objects in an image. Therefore, the discontinuous and piecewise smoothness cannot be effectively processed. Fourth, the brightness consistency constraints assume that the brightness of a pixel remains constant in a small space distance and time interval. However, occlusion, edge, and other factors occur in an actual image, which lead to an unsteady gray level, and the sensor noise and lighting changes also interfere with the constant brightness. Therefore, occlusions, large displacements, lighting changes, and other problems cannot be effectively handled, and the noise resistance is poor. Fifth, the solution by the traditional differential method must be iterated thousands of times to achieve the desired results. Therefore, the solution is impossible to calculate in real time, which restricts its practical application to a large extent.
After nearly 30 years of research, the optical flow algorithm has greatly improved in the above issues.
3.1. Ill-Conditioned Problems and Countermeasures
Two ill-conditioned problems occur in variational optical flow computation model. First, two velocity components
cannot be solved by only one constraint equation based on constant brightness. Second, although a minimal interference occurs in the image, it will lead to a small spatial-temporal derivative, which results in a large deviation in velocity estimation. On the basis of Tikhonov’s idea, the ill-conditioned problems are changed to solvable problems by researchers through introducing and improving smooth terms. Dense optical flow fields have also been obtained by researchers, such as Nagel, Nagel, Terzopoulos, Heitz, and Boutheny; Otte and Nagel; and Black and Anandan.
3.2. Segmented Smoothing, Discontinuous, and Edge Preserving
The primitive optical flow model (HS) often causes excessive smoothing in the moving edge, discontinuous, and occlusion places due to the quadratic smoothing term. Acquiring an optical flow model that allows segmented smoothing and discontinuous retaining becomes a popular research topic. Alvarez adopted an isotropic image to drive regular terms to overcome the blurring of moving boundaries by traditional methods. Nagel and Alvarez adopted anisotropic images to drive regular terms to reduce the smoothness penetrating the edge of motion. Black and Anandan proposed a regular term driven by the optical flow to reduce the smoothness penetrating optical flow discontinuity. Schnörr and Weickert proposed an improved isotropic optical flow to drive regular terms to reduce the smoothness at the edge of moving boundary and avoid oversegmentation of strongly texture objects. Weickert proposed an anisotropic optical flow driving regular term, which can not only generate a good smoothing effect along the optical flow discontinuity but also can yield a small fluctuation while avoiding the oversegmentation of strongly texture objects. Nagel proposed a spatial-temporal gradient smoothing term to deal with excessive smoothing and discontinuous retaining problems. Weickert and Schnörr extended this method to the time domain. Mukaw adopted Taylor series to develop in consideration of the basic constraint equation of the optical flow field. However, this series is actually discontinuous. Therefore, a modifying factor q is introduced. This modifying factor is obtained through the motion and projection of the object, which solves the discontinuity problem of the basic equation of the optical flow. Bruhn achieved segmented smoothness and discontinuous retaining by integrating the advantages of local and global optical flow methods. Deriche, Chambolle et al. proposed that norm regular term L1 is more conducive than L2 norm regular term to segmented smoothness and discontinuous retaining.
3.3. Processing Outliers
Noise, occlusion, and other interferences often produce outliers in images. They seriously destroy the initial assumptions of the HS optical flow equation, thus resulting in an inaccurate estimation of optical flow. To deal with this problem, researchers have determined a “robust” function and filtering with good performances. Hampel first defined and analyzed a “robust” function. Meer explored the application of a “robust” function for processing outliers in computer vision. Black and Anandan successfully analyzed and verified that the treatment effects of outliers are obviously improved after a “robust” function replaces the HS secondary penalty function. Brox, Lempitsky et al. proposed a series of different “robust” functions successively. Bruhn proposed that noise resistance is improved by integrating local least square fitting. On this basis, Brox established an adaptive scheme based on nonlinear diffusion. Nagel dealt with the occlusion problem by adopting smooth-oriented constraints rather than imposing on the direction wherein the brightness gradient varies strongly (i.e., the edge direction). Snyder proposed a general formula against occlusion problems to enhance the processing performances of outliers.
3.4. Large Displacement Problem
In practice, the changes in gray level and velocity field are discontinuous. Therefore, the basic optical flow model (HS) can only be used to solve two consecutive frames with brightness variation less than 1 pixel. The application scope of optical flow method is greatly restricted. Researchers have achieved some results through doing some efforts. Lucas and Kanade first proposed that the coarse-to-fine strategy could be used to solve. Black and Anandan solved the situation in which the displacement is more than 1 pixel by using the coarse- to-fine pyramid method. Alvarez used scale space to process. Alvarez et al. improved the basic optical flow field in three aspects, which can be used to calculate displacements more than 10 pixels with high accuracy. Prof. Weickert’s team solved large displacement problems by introducing a multigrid method (such as Bruhn) and achieved real-time processing. Brox made a further improvement by integrating segmentation and corner correction.
3.5. Lighting Changes
The basic optical flow model based on uniform brightness cannot deal with the lighting changes in the real life. Two main ways can be adopted to deal with this problem.
1) High-order gradient constant constraint is combined to deal with lighting changes. Uras, Tistarelli dealt with aperture problems by using gradient consistency. Schnörr proposed this concept in variational optical flow method. Brox introduced gradient, Hessian, and Laplacian constants into an optical flow constraint equation to deal with the lighting changes and achieved certain results. However, this method is only effective for additive lighting change and has a minimal effect on multiplicative lighting change, which is the main component in real life.
2) Luminosity invariance of images is used to deal with lighting changes. This method is mainly based on luminosity invariance information and colorful information integrated with HSI color space raised by Golland, normalized RGB, and sphere coordinate transformation raised by Weijer
. However, this method is only effective for colorful images and cannot handle gray images and colorful images with color lighting changes.
3.6. Real-Time Computation
Traditional numerical solution methods, such as Gauss-Seidel and successive over-relaxation, can obtain ideal results after thousands of iterations. Therefore, real-time applications are difficult to achieve. Camus, Strzodka considered real-time problems in data items. Under the efforts of Prof. Weickert’s team (Bruhn), the multigrid method is introduced to the numerical solution of the optical flow equation, which greatly accelerates the solving process and achieves the real-time computation in CPU. With the successful development of graphics processing units (GPUs) in recent years, Zach et al. effectively solved the real-time problem of the optical flow computation by using GPUs.
4. Optimization of Motion Estimation with HS Algorithm
In an image plane, the movement of objects is reflected by different gray-level distributions of images in the sequence. The motion field in the space can be represented by the optical flow field in the image. The optical flow field reflects the gray-changing trend of each point in the image.
4.1. Analysis on Optical Flow Algorithm
1) The brightness at the position that corresponds to the moving object remains unchanged among adjacent frames of the image.
2) The movement of moving objects in adjacent fields is minimal.
Suppose
is
hat the brightness of pixels at t time is at point of time and the brightness is at that
point,
, and the brightness is at that point
. Conditional hypothesis (a) is available.
(1)
First order Taylor expansion at the (x, y, t) position is obtained, and
can be obtained at that time.
(2)
,
,
.
,
,
formula equation as
(3)
(4)
(5)
To calculate, u, v, we need to use the hypothetical condition (b). The iterative formula is as follows
(6)
(7)
where
refers to the weight factor that controls smoothing constraints,
and
refer to the mean velocity of pixel
adjacent area, and K refers to the times of iteration. Formulas (6) and (7) imply that the amount of computation of the HS optical flow equation depends on the iteration. When the initial value is close to the real value, the convergence of iteration is fast. Therefore, the times of iteration can be reduced.
4.2. Design of Optimization HS Optical Flow Algorithm
If the HS optical flow is calculated for all pixels of the image, then it is very complex. In this study, an HS optical flow computation method based on the region of interest (ROI) is proposed. The procedures are as follows: detect HARRIS corner, make motion estimation for corner area, and perform noise filtering for optical flow information. The moving Harris corner area is considered as the ROI. The corresponding motion vector is used as the initial motion vector iterated by using the optical flow method. Smooth optical flow information in the ROI can be obtained by this algorithm on the premise of ensuring a real-time performance. This algorithm has strong robustness in a complex environment, and it is conducive to various applications. The details are as follows:
Detection of HARRIS corner in the image
The image is divided into smooth region, edge region, and corner, as depicted in Figure 1.
The characteristic of the smooth region is that the gradient is basically unchanged. The gradient in the edge region changes largely in direction x or y. The corner largely changes in direction x, y. The algorithm for HARRIS corner is described as
Step 1: Calculate autocorrelation matrix M of all pixels in the image.
(8)
(9)
(10)
(11)
Step 2: Calculate Harris corner response.
(12)
Step 3: Look for the maximum point in custom field. If this maximum value is more than the threshold value, then it is considered a corner point.
Motion estimation
After obtaining the HARRIS corner region, motion vectors can be obtained by motion search of adjacent frames. Motion estimation only makes initial estimation on the image motion, without precise calculation. Therefore, the image can be divided into 16 × 16 macroblocks. For example, a 256 × 256 image can be divided into 256 macroblocks, as illustrated in Figure 2. The initial motion vectors of macroblocks can then be obtained by full search.
Figure 2. Schematic pixel representation.
In motion estimation, to perform matching calculation by using macroblock SAD,
(13)
where
refers to pixel value at t time.
The motion vectors can be obtained by searching the matching block in adjacent frames by
(14)
If
is the minimum value, then it is the most matching block.
refers to the SAD value of macroblock;
and
refer to the SAD values at t moment and t − 1 moment. For example, for gray macroblock k in Figure 3, the initial motion vector
can be obtained by motion estimation.
4.3. Optimized HS Optical Flow Calculation Steps
All the above motion corner regions are ROIs. The optimized HS optical flow calculation is performed in the region. The motion vectors in the ROI v(x, y) are used as the initial motion vector of Formulas (6) and (7). From Formula (5), the accurate motion information in the ROI can be obtained
.
The motion vector obtained by the motion estimation in this study is used as the initial motion vector of HS calculation. Therefore, by using this algorithm, the iteration is obviously reduced. The experiment shows that the corresponding result can be obtained through two times of iteration at most by using this algorithm.
HS optical flow has poor noise resistant capacity. Therefore, noise filtering is required for HS calculation results. The motion vector of two continuous frames of ROI image is filtered. The actual image analysis shows that the motion vectors of the same object are similar, whereas the motion vectors of other background noise are chaotic, as shown in Figure 3. For this reason, the motion vectors are processed as follows.
−1 time t time
Figure 3. The motion path of subgraph.
The motion vectors in two consecutive frames of ROI are accumulated. If the result is less than the threshold value, then noise motion vector is in this region, and the corresponding motion vector is 0, which is removed from the ROI.
(15)
(16)
obtained from the filtering can be used for intelligent monitoring and processing.
refers to the motion information after filtering,
refers to the motion information of the current frame, and
refers to a subjective threshold.
5. Research on Motion Estimation Image Processing
Video processing is the same as image processing. After reading the video, the block-matching algorithm must be taken after being processed. Although no jitter occurs during the camera shooting, a clear change exists in the brightness of the light, especially various noises, including salt-and-pepper noise to be difficultly processed. In this way, a large number of errors will be generated in the processing. The mismatch phenomenon will be greatly increased in block matching.
A raw image will be disturbed by all types of noises during its acquisition and transmission, which will degrade the quality of the image and is unfavorable to the analysis on the image. Typical noises are reflected in the picture. One is the salt-and-pepper noise with the same amplitude and in random location; the other is random noise with random amplitude distribution that exists in each point.
Many salt-and-pepper noises and the interference caused by lighting irradiation exist in the video. The smoothing process for the image is necessary to perform to suppress noise and improve the image quality. Many ways are available to smooth the image, including neighborhood averaging, median filtering, Gaussian filtering, mean filtering with gray minimum variance, and bilateral filtering.
Through analysis, the proper ways include median filtering, Gaussian filtering, neighborhood averaging, and bilateral filtering. In OpenCV, five types of smoothing algorithms are provided; they are simple fuzzy (with zoom), simple fuzzy without zoom (neighborhood averaging), median fuzzy, Gaussian fuzzy, and bilateral fuzzy. The corresponding function in OpenCV is then called. After repeated verification, the best image is obtained.
6. Algorithm Testing and Analysis
In this study, a PC with dual core and 4 G memory is used as a hardware platform in all experiments. VC6.0 is the programming environment. Images (352 × 288) acquired by using a fixed camera are tested. The test source is the surveillance video collected in the natural environment. The algorithm in this study can effectively filter the background noise and obtain partial optical flow information in the smooth region of black car in ROI. This algorithm takes 121 ms by using HS optical flow computation, and the time consumed by using this algorithm is reduced to 45 ms. The proposed algorithm effectively guarantees the real-time performance. The error rates of the two methods can be calculated by using the proximity membership grade between the estimated motion vector and the exact vector, as illustrated in Figure 4. Among them, 15% of error rate is reduced by using the improved optical flow algorithm than that by using the HS algorithm. The former is improved by nearly three times in the time consumption, as shown in Figure 5. Here, the testing times and moving velocity are increased linearly. The experimental results in different vehicle types by two methods are compared. Four common vehicle types are adopted in this study, namely, SUV, car, bus, and jeep. The comparison in the time consumption (IHS/HS) is shown in Figure 6. IHS has obvious advantages, which shows the universality of the improved optical flow algorithm.
Figure 4. The comparison of error ratio in terms of motion vector.
Figure 6. The comparison of time cost in terms of different cars.
7. Conclusion
The optimized algorithm proposed in this study reduces the computation of the HS algorithm and effectively detects the optical flow information in the smooth region. The motion vector filtering improves the robustness of the improved HS algorithm in complex environments. Therefore, the algorithm in this study is more suitable for complex environments. One drawback is that this algorithm is used to perform motion detection on macroblocks. High-resolution images will affect the computation speed. Image segmentation, motion information acquisition, computation performance, and other factors should be considered to perfect the HS optical flow algorithm.
Acknowledgements
The first author acknowledges the financial support by Science and Technology Project of Department of Science and Technology of Guizhou Province (China) (QKH LH [2014] 7205). The second and third authors acknowledge the financial support by Department of education's top talent support program of Guizhou Province (China) (KY [2016] 088). The authors are grateful to the anonymous referee for a careful checking of the details and for helpful comments that improved this paper.