Laser Ranging-Assisted Binocular Visual Sensor Tracking System

To improve the low measurement accuracy of the binocular vision sensor along the optical axis during target tracking, this paper proposes a method of auxiliary correction using a laser-ranging sensor. During measurement, the value returned by the laser-ranging sensor lags because of the limited mechanical performance of the two-dimensional turntable. In this paper, the lagged information is updated directly to compensate for the time delay. Moreover, to make full use of the advantages of both binocular vision sensors and laser-ranging sensors in target tracking, federated filtering is used to improve information utilization and measurement accuracy and to resolve the estimation correlation. The experimental results show that the real-time performance and measurement accuracy of the laser ranging-assisted binocular visual-tracking system are improved by the direct update algorithm and the federated filtering algorithm. The results are significant for engineering applications of binocular vision sensors and laser-ranging sensors in target tracking systems.


Introduction
Visual measurement has many advantages, such as high accuracy and a non-contact nature. It is widely used in industrial and military applications and in daily life [1][2][3]. According to the number of cameras used in the measurement process, visual measurement techniques can generally be divided into monocular vision, binocular vision, and multi-eye vision [4][5][6].
Monocular vision suffers from scale ambiguity. When corresponding points in monocular images are solved, the decomposition of the fundamental matrix or the homography matrix lacks a depth constraint, so the scale factor cannot be determined [7]. In contrast, stereo vision uses two or more fixed vision sensors that collect target data simultaneously. By establishing the correspondence between views, the scale information can be estimated quickly, and depth recovery, reconstruction, and pose estimation of the target can then be realized in Euclidean space. In reference [8], the three-dimensional coordinates of a moving object are obtained by two cameras from the spatial constraints of the same spatial point between different image planes. After the 3D reconstruction of each single acquisition, the absolute orientation between the point clouds of two acquisitions is solved to achieve target pose estimation and point cloud fusion. Terui et al. [9] used binocular vision to estimate the pose of a known semi-cooperative target and verified its effectiveness in a ground test. However, this method is computationally complex.
The target system consists of a model that moves via a two-dimensional moving slide-table. The space schematic of the target tracking system is shown in Figure 1; the physical experimental system is shown in Figure 2.
Sensors 2020, 20, 688
As shown in Figure 2, $O_1$ and $O_2$ are the two cameras of the binocular vision sensor. $A$ is the point of the laser-ranging sensor, which is fixed at the center of the two-axis turntable. $C$ is the point of the target. According to the spatial relation of each sensor and the target, the coordinate systems $O_1XYZ$ and $AX'Y'Z'$ are established.
The coordinate system $O_1XYZ$: camera $O_1$ is the coordinate origin. The X-axis points from $O_1$ toward the other camera $O_2$, and the Y-axis is perpendicular to $O_1O_2$. The Z-axis lies along the optical axis, with its direction determined by the right-hand rule. The positive directions of the coordinate axes are shown in Figure 2. The coordinates of target $C$ in $O_1XYZ$ are $(x_c, y_c, z_c)$.
The coordinate system $AX'Y'Z'$: the point $A$ is the coordinate origin. The directions of the X'-axis, Y'-axis, and Z'-axis are the same as those of the corresponding axes of $O_1XYZ$. The coordinates of target $C$ and camera $O_1$ in $AX'Y'Z'$ are $(x'_c, y'_c, z'_c)$ and $(x'_1, y'_1, z'_1)$, respectively.
The projection of point $C$ on the plane $X'AZ'$ is $C_0$. The angle between $AC_0$ and the Z'-axis is $\alpha$. The angle between $AC$ and the plane $X'AZ'$ is $\beta$.
The steps for obtaining the spatial location of the target are as follows:
(1) Coordinate system $O_1XYZ$ is set as the reference coordinate system. The binocular vision sensor system first acquires the spatial location of the target and then transmits the measurement data to the central control system through the visual control computer.
(2) The target position acquired by the binocular vision sensor is expressed in the coordinate system $O_1XYZ$. It must be transformed to the coordinate system $AX'Y'Z'$ to obtain the yaw and pitch angles by which the turntable must be rotated. The coordinates $(x'_c, y'_c, z'_c)$ are obtained from $(x_c, y_c, z_c)$ by the rotations Rot(x, m), Rot(y, n), and Rot(z, k) together with the position $(x'_1, y'_1, z'_1)$ of camera $O_1$. Here Rot(x, m) is the rotation matrix around the X axis, and m is the angle of rotation, a constant obtained from the initial calibration; Rot(y, n) and Rot(z, k) are defined in the same way. The yaw angle $\alpha$ and pitch angle $\beta$ then follow from the spatial geometry of Figure 2.
(3) The yaw angle $\alpha$ and pitch angle $\beta$ of the 2D turntable are transferred to the laser control computer. After the 2D turntable has been adjusted to the designated position, the laser control computer triggers the laser-ranging sensor to measure toward target $C$, and the distance $AC = l$ is returned to the central control system.
(4) The measured value $l$ of the laser-ranging sensor is used to correct the measured value of the binocular vision sensor along the optical axis (Z axis). Because the binocular vision sensor is highly accurate perpendicular to the optical axis, its coordinates on the X axis and Y axis are used directly as the measured values.
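The transformation and angle computation in step (2) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the rotation order Rot(x) Rot(y) Rot(z), the translation by the camera position, the angle convention (yaw measured from the Z' axis within the X'AZ' plane, pitch as elevation from that plane), and all names and numbers are assumptions.

```python
import math

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1, 0, 0], [0, c, -s], [0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0, s], [0, 1, 0], [-s, 0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def camera_to_turntable(p_cam, m, n, k, cam_origin):
    """Map a point from O1XYZ to AX'Y'Z'.

    Assumed form: rotate by Rot(x, m) Rot(y, n) Rot(z, k), then add the
    position of camera O1 expressed in AX'Y'Z' (hypothetical convention).
    """
    r = matmul(rot_x(m), matmul(rot_y(n), rot_z(k)))
    q = matvec(r, p_cam)
    return [q[i] + cam_origin[i] for i in range(3)]

def yaw_pitch(p):
    """Yaw (about Y', from the Z' axis) and pitch (elevation from X'AZ')."""
    x, y, z = p
    alpha = math.atan2(x, z)
    beta = math.atan2(y, math.hypot(x, z))
    return alpha, beta

# Hypothetical numbers in millimetres; calibration angles set to zero.
p = camera_to_turntable([100.0, 50.0, 2000.0], 0.0, 0.0, 0.0, [-150.0, 0.0, 0.0])
alpha, beta = yaw_pitch(p)
```

With this convention the range $l = |AC|$ satisfies $z'_c = l\cos\alpha\cos\beta$, which is the relation implied by the error transfer formula for $z_{nc}$.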
The Z' coordinate of point $C$ in the coordinate system $AX'Y'Z'$ follows from the measured distance and the turntable angles: $z'_c = l\cos\alpha\cos\beta$. By the coordinate transformation back to $O_1XYZ$, the new Z coordinate $z_{nc}$ of point $C$ after correction by the laser-ranging sensor can then be solved. Through error analysis and the error transfer formula, the error of $z_{nc}$ can be obtained:
$$\Delta z_{nc} = \cos\alpha\cos\beta\,\Delta l + l\sin\alpha\cos\beta\,\Delta\alpha + l\cos\alpha\sin\beta\,\Delta\beta$$
where $\Delta l$ is the measurement error of the laser-ranging sensor, $\Delta\alpha$ is the error of the yaw angle, and $\Delta\beta$ is the error of the pitch angle. In practice, because the position calibration error between the binocular vision sensor and the laser-ranging sensor is quite small, the primary error is concentrated in these three terms. The measurement error of the laser-ranging sensor is determined by its own performance. The angle errors depend on the rotation accuracy of the two-dimensional turntable.
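The error transfer formula can be evaluated numerically as in the sketch below. The function names and the numerical values of the range, angles, and error bounds are hypothetical and chosen only for illustration; absolute values are taken so that the three contributions add up to a worst-case bound.

```python
import math

def z_from_range(l, alpha, beta):
    """Z' coordinate implied by range l, yaw alpha and pitch beta (radians)."""
    return l * math.cos(alpha) * math.cos(beta)

def z_error_bound(l, alpha, beta, dl, dalpha, dbeta):
    """Worst-case error of the corrected Z coordinate from the three error terms."""
    return (abs(math.cos(alpha) * math.cos(beta)) * dl
            + abs(l * math.sin(alpha) * math.cos(beta)) * dalpha
            + abs(l * math.cos(alpha) * math.sin(beta)) * dbeta)

# Hypothetical operating point: 1000 mm range, small yaw and pitch angles.
bound = z_error_bound(l=1000.0, alpha=0.1, beta=0.05,
                      dl=1.0, dalpha=0.001, dbeta=0.001)
```

For small angles the range error dominates, because the two angular terms are scaled by $\sin\alpha$ and $\sin\beta$.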
The measured value $z^c_{k-1}$ is obtained by the binocular vision sensor at time $t_{k-1}$. The yaw angle $\alpha$ and pitch angle $\beta$ are then calculated by coordinate transformation and transferred to the 2D turntable controller. After the turntable has rotated to the corresponding angles, the laser-ranging sensor takes its measurement, and the corrected coordinate value $z^{nc}_{k-1}$ becomes available at time $t_k$. Because the mechanical rotation frequency of the 2D turntable is much lower than the measurement frequency of the binocular vision sensor, the binocular vision sensor has already obtained the measured value $z^c_k$ at time $t_k$. From the process of constructing the system and acquiring the target position, it follows that the measurement information of the binocular vision sensor and that of the laser-ranging sensor are correlated and that the laser-ranging sensor system has a constant time delay.
Consider a multisensor system with observation time delay, in which the state evolves according to a linear state equation and each sensor provides a linear observation of the state. Here $z^i_k \in R^{m\times 1}$, $i = 1, \dots, L$, is the m-dimensional measured vector of the ith sensor, and $L$ is the number of sensors. $H^i \in R^{m\times n}$, $i = 1, \dots, L$, is the measurement matrix of the ith sensor. $w_{k,k-1} \in R^{h\times 1}$ is the h-dimensional process noise vector. $v^i_k \in R^{m\times 1}$, $i = 1, \dots, L$, is the observation noise of the ith sensor. In the real-time sensor system, the measurements available at time $t = t_k$ are collected in $Z_k$, the measurement set of the $N$th sensor at time $t_k$.
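A common way to write such a system, given here only as a sketch, is the linear state-space form below. The state vector $x_k \in R^{n\times 1}$, the state transition matrix $\Phi_{k,k-1} \in R^{n\times n}$, and the noise input matrix $\Gamma_{k,k-1} \in R^{n\times h}$ are assumptions introduced for illustration; only $z^i_k$, $H^i$, $w_{k,k-1}$, and $v^i_k$ are named in the text.

```latex
\begin{aligned}
x_k   &= \Phi_{k,k-1}\, x_{k-1} + \Gamma_{k,k-1}\, w_{k,k-1}, \\
z_k^i &= H^i x_k + v_k^i, \qquad i = 1, \dots, L .
\end{aligned}
```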
Assume that the time-delay sensor (laser-ranging sensor) lags by $d$ steps, so that its latest measurement refers to time $t_{k-d}$. At time $t_k$, the fusion center therefore holds the real-time measurement $z^i_k$ (binocular vision sensor) and the time-delay measurement $z^j_{k-d}$ (laser-ranging sensor). The earlier measurements of the time-delay sensor must be used to update the estimate $\hat{x}_{k|k}$. Moreover, the estimated value $\hat{x}_{k|k-d}$ of the time-delay sensor system at time $t_k$ is treated as its real-time measured value $z^j_k$ at time $t_k$. The measured values of all sensors at time $t_k$ are then $z^i_k$ and $z^j_k$, whose estimates are correlated. To obtain more accurate spatial coordinates of the target, the fusion center must resolve this correlation and compute the optimal fusion estimate of the target motion state $x_k$.
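The idea of treating the predicted estimate $\hat{x}_{k|k-d}$ as a real-time measurement can be illustrated with a short sketch. It assumes a constant-velocity target model, which the text does not specify; the function name, the sampling interval, and the numbers are hypothetical.

```python
def predict_forward(position, velocity, dt, steps):
    """Propagate a constant-velocity estimate forward by `steps` intervals.

    The estimate valid at time t_{k-d} is moved to time t_k so that it can
    stand in for a real-time measurement of the delayed sensor.
    """
    for _ in range(steps):
        position += velocity * dt
    return position, velocity

# Delayed estimate: 100 mm at t_{k-d}, moving at 2 mm/s, d = 4 steps of 0.5 s.
z_pseudo, _ = predict_forward(100.0, 2.0, 0.5, 4)
```

The predicted position `z_pseudo` then plays the role of $z^j_k$ in the fusion center, at the cost of accumulating process noise over the $d$ prediction steps.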

Processing of Measurement Constant Time Lag
Based on the stochastic linear time-invariant discrete system, iterating the state equation relates the state at time $t_k$ to the state at time $t_{k-d}$. Inserting this relation into the observation equation of the time-delay sensor and collecting the resulting noise terms transforms the system observation equation into an observation of the current state. Combined with Equation (15), the time-delay subsensor system then has an optimal Kalman filter and one-step predictor, in which $\varepsilon^i_{k+1}$ is the innovation, $P^i_\varepsilon = E[\varepsilon^i_{k+1}(\varepsilon^i_{k+1})^T]$ is the innovation variance, $K^i_{k+1}$ is the filter gain, $P^i_{k+1|k+1}$ is the filtering error variance matrix, and $P^i_{k+1|k}$ is the one-step prediction error variance matrix.
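The quantities named above appear in every standard Kalman filter cycle. The sketch below shows one predict/update cycle for a two-state model (position and velocity) with a scalar position measurement. The model, the diagonal process noise `q`, the measurement noise `r`, and all names are assumptions for illustration and are not taken from the paper.

```python
def kalman_step(x, P, z, dt, q, r):
    """One Kalman cycle for x = [position, velocity], measuring position only."""
    # One-step prediction with F = [[1, dt], [0, 1]] and Q = diag(q, q).
    xp = [x[0] + dt * x[1], x[1]]
    a = P[0][0] + dt * P[1][0]
    b = P[0][1] + dt * P[1][1]
    Pp = [[a + dt * b + q, b],
          [P[1][0] + dt * P[1][1], P[1][1] + q]]
    # Innovation and innovation variance for H = [1, 0].
    eps = z - xp[0]
    P_eps = Pp[0][0] + r
    # Filter gain K = Pp H^T / P_eps.
    K = [Pp[0][0] / P_eps, Pp[1][0] / P_eps]
    # Filtered state and filtering error covariance (I - K H) Pp.
    xf = [xp[0] + K[0] * eps, xp[1] + K[1] * eps]
    Pf = [[(1 - K[0]) * Pp[0][0], (1 - K[0]) * Pp[0][1]],
          [Pp[1][0] - K[1] * Pp[0][0], Pp[1][1] - K[1] * Pp[0][1]]]
    return xf, Pf, eps, P_eps, K

# Hypothetical call: unit prior covariance, measurement 1.5 after one step.
xf, Pf, eps, P_eps, K = kalman_step([0.0, 1.0], [[1.0, 0.0], [0.0, 1.0]],
                                    z=1.5, dt=1.0, q=0.0, r=1.0)
```

Here `eps` corresponds to the innovation, `P_eps` to the innovation variance, `K` to the filter gain, `Pf` to the filtering error variance matrix, and `Pp` inside the function to the one-step prediction error variance matrix.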

Information Fusion
To resolve the estimation correlation between the two sensor systems and further improve the measurement accuracy, the federated Kalman filtering algorithm is used for subsequent processing. It consists of information distribution, time updating, measurement updating, and estimation fusion.
(1) Information distribution. The main filter performs only the time update and no measurement update. The process information of the system is shared among the subfilters and the main filter according to the principle of information distribution. According to the law of information conservation, $\sum_{i=0}^{n} \beta_i = 1$, where $\beta_i$ is the information distribution coefficient of the ith filter.
(2) Time updating. The system state and the estimation error covariance are propagated according to the system transfer matrix, independently in each subfilter and in the main filter.
(3) Measurement updating. The system state and the estimation error covariance are updated using the new measurement information. Since the main filter takes no measurements, the measurement update is performed only in the subfilters.
(4) Estimation fusion. The local estimates are fused into the global estimate $\hat{X}$.
In the above steps, information distribution is the key part of federated filtering and the feature that distinguishes it from other decentralized filtering methods. The information distribution coefficients determine the accuracy of the final fusion result.
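The fusion and information distribution steps can be sketched for scalar states as follows. The sketch uses the standard information-weighted fusion rule and the usual federated reset, in which covariances are divided by $\beta_i$; both are assumptions about the form of the omitted equations, and all names and numbers are hypothetical.

```python
def fuse(estimates, variances):
    """Information-weighted fusion of scalar local estimates."""
    info = [1.0 / p for p in variances]
    p_g = 1.0 / sum(info)
    x_g = p_g * sum(w * x for w, x in zip(info, estimates))
    return x_g, p_g

def distribute(x_g, p_g, q, betas):
    """Reset each filter from the global estimate.

    Each filter receives the global state, with error variance and process
    noise inflated by 1 / beta_i. All beta_i must be positive and sum to 1.
    """
    if abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("information distribution coefficients must sum to 1")
    return [(x_g, p_g / b, q / b) for b in betas]

# Two hypothetical local estimates of the Z coordinate with their variances.
x_g, p_g = fuse([10.0, 12.0], [4.0, 1.0])
resets = distribute(x_g, p_g, q=0.1, betas=[0.5, 0.5])
```

The fused estimate lies closer to the local estimate with the smaller variance, and the fused variance is smaller than either local variance.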
According to the definition of the estimation error covariance, $P = E[(X - \hat{X})(X - \hat{X})^T]$, $P$ describes the estimation accuracy of $X$: the smaller $P$ is, the higher the estimation accuracy of $X$.
Because the globally optimal solution is used to reset the filter values and error variance matrices in the next filtering step, the influence of the information distribution coefficients on the global estimate is discussed next. Equation (30) is inserted into Equation (23), and Equations (30) and (32) are then substituted into Equation (25). Since $P$ and $Q$ generally have the same dimension, this yields Equation (34). Taking the inverse of both sides of Equation (34) gives the one-step predictive state information matrices of the local filters and the global filter.
Taking the trace of both sides shows that $\beta_i$ is inversely proportional to the estimation error covariance. When the estimation error covariance of a subfilter is larger, its estimation quality is poorer and its accuracy lower, so its information distribution coefficient is smaller.
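One way to realize this inverse relationship, shown here only as a sketch, is to set each coefficient to the share of the total information contributed by its filter, $\beta_i = \mathrm{tr}(P_i^{-1}) / \sum_j \mathrm{tr}(P_j^{-1})$. This specific rule is an assumption consistent with the statement above, not a formula recovered from the paper; the function name and the variances are hypothetical.

```python
def distribution_coefficients(info_traces):
    """Normalize the traces of the information matrices so they sum to 1."""
    total = sum(info_traces)
    return [t / total for t in info_traces]

# Scalar example: error variances 4.0 and 1.0 give information 0.25 and 1.0.
variances = [4.0, 1.0]
betas = distribution_coefficients([1.0 / p for p in variances])
```

The filter with the larger error variance receives the smaller coefficient, and the coefficients satisfy the information conservation constraint by construction.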

System Experiment and Analysis
The target-tracking system combining the binocular vision sensor and the laser-ranging sensor is shown in Figure 1. The target is fixed on a sliding platform and performs linear motion in space. At present, only the movement of the target along the optical axis (Z axis) is studied, and the measured values in the experimental results represent only the values in the Z-axis direction.
Through error analysis of the laser-ranging sensor, the parameters of the system are substituted into the error transfer function. The measurement error of the laser-ranging sensor is ±1.5 mm, the rotation error of the two-dimensional turntable is ±0.02°, the measurement error of binocular vision in the X and Y directions is ±3 mm, and its Z-axis measurement error is ±40 mm. After several calculations, the final average measurement error obtained with the laser-ranging sensor is ±22.3 mm, which shows that the laser-ranging sensor improves the measurement accuracy of binocular vision along the optical axis.
As shown in Figure 4, the measurement value of the laser-ranging sensor is predicted directly, and the estimated value of the lagged information is used as real-time information in subsequent calculations. The simulation results from t(25) to t(40) show that the lagged information is improved by the direct update algorithm. The estimated value contains errors due to noise, but its trend is basically consistent with the original data. Over the whole curve, the target position changes slowly at the beginning, the slope gradually increases with time, and from t(38) onward it remains basically unchanged. This is because the sliding platform carrying the target accelerates at the start; after reaching the specified speed, the target moves uniformly and the slope no longer changes. Because the target moves for a long time, the subsequent deceleration phase is not drawn in Figure 4.
Then, the information fusion algorithm based on the federated Kalman filter is verified by experiments. A comparison between the measurement results of each single sensor and the fusion results is shown in Figure 5. Compared with the binocular vision sensor, which has the larger error, the curve of the fusion result is clearly smoother and more accurate.
Compared with the measured value of the laser-ranging sensor, the improvement of the fusion result cannot be seen directly, so the mean squared error (MSE) between each measurement result and the calibrated true value of the target is calculated. The MSE results in Figure 6 indicate that the target position after fusion is more accurate than that of either single sensor: the fused result has the smallest error.
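The MSE comparison can be reproduced with a few lines. The sketch below is generic; the function name and the sample values are hypothetical and do not come from the experiment.

```python
def mean_squared_error(measured, truth):
    """Mean of squared differences between measurements and true values."""
    if len(measured) != len(truth) or not truth:
        raise ValueError("sequences must be non-empty and of equal length")
    return sum((m - t) ** 2 for m, t in zip(measured, truth)) / len(truth)

# Hypothetical Z-axis samples (mm) against a calibrated reference.
truth = [1000.0, 1010.0, 1020.0]
mse_single = mean_squared_error([1004.0, 1006.0, 1024.0], truth)
mse_fused = mean_squared_error([1001.0, 1009.0, 1021.0], truth)
```

Computing the same statistic for each sensor and for the fused output gives a single number per curve, which makes differences visible that are hard to see in the time plots.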
Figure 7 shows the curve of the binocular vision sensor's information distribution coefficient. At the beginning of the measurement, the coefficient changes rapidly from 0.5 within the first 5 s. After 15 s, it stabilizes at a value of 0.37. Because the measurement error of the binocular vision sensor along the optical axis is large, its information distribution coefficient is small, while the laser-ranging sensor has a large information distribution coefficient, which is consistent with the conclusions obtained in the paper.

Conclusions
Through theoretical calculations and experimental verification, this paper improves the accuracy of binocular vision along the optical axis. First, regarding the system structure, we propose using a one-dimensional point laser-ranging sensor to correct the measurement value of the binocular vision sensor along the optical axis. Second, regarding the measurement process, we found that the system has a time delay because it is limited by the performance of the two-dimensional turntable. To improve the information utilization and real-time performance of the multisensor measurement system, an optimal information fusion algorithm is studied for a multisensor target tracking system characterized by estimation correlation and constant time delay. We propose a method that separates the problems of the complex multisensor environment: it first uses the one-time prediction of the constant-delay information as the real-time information at the current moment to solve the delay problem, and then uses the federated Kalman filter to address the estimation correlation. Finally, the experimental results verify the validity and accuracy of this method.
Although the research in this article provides a basis for the study of the experimental system in actual environments, the current experimental environment is relatively simple, and the experimental parameters and errors are under control. When the distance between the target and sensor system increases, the visual error will increase, and the laser spot will become larger, which will cause the errors of the laser-ranging sensor to increase. There are still many improvements to be studied.

Conflicts of Interest:
The authors declare no conflict of interest.