FFT-Based Scan-Matching for SLAM Applications with Low-Cost Laser Range Finders

Abstract: Simultaneous Localization and Mapping (SLAM) is an active area of robotics research. SLAM with a laser range finder (LRF) is effective for localization and navigation, but commercial robots usually have to use low-cost LRF sensors, which offer lower resolution and higher noise. Traditional scan-matching algorithms often fail when the robot moves quickly in complex environments. To enhance matching stability under large pose differences, this paper proposes a new scan-matching method based mainly on the Fast Fourier Transform (FFT), together with its application to a low-cost LRF sensor. In our method, the scan data within a bounded distance from the laser are converted to images, and FFT is applied to the images to determine the rotation angle and translation parameters. In addition, a new kind of feature based on missing data is proposed to obtain a rough estimate of the rotation angle in representative scenes such as corridors. Finally, Iterative Closest Point (ICP) is applied to determine the best match. Experimental results show that the proposed method improves scan-matching and SLAM performance for low-cost LRFs in complex environments.


Introduction
With the rapid development of artificial intelligence and pattern recognition, intelligent robots have entered many areas of industrial automation and human life. One of the most important functions of an intelligent robot is autonomous movement, which is usually supported by precise localization and reliable navigation. The common technique for achieving this is Simultaneous Localization and Mapping (SLAM). SLAM comprises two parts, localization and mapping, where localization is an important prerequisite of mapping. The SLAM problem can be solved by maintaining a robot-state (pose and map) probability-density function or its moments [1]. Depending on whether the environment is given in advance, localization can be divided into localization in a known environment and localization in an unknown environment [2]. Localization in a known environment mainly concerns positioning accuracy, whereas localization in an unknown environment requires external sensors whose measurements are processed to estimate the robot's position. Self-localization methods can also be divided into two categories: relative positioning and absolute positioning. Relative positioning determines the relative distance and direction in which the robot has moved over a short interval; a common realization is scan alignment, which can efficiently estimate the relative transformation between two scans. There are many other scan-matching methods, such as those using polar coordinates [31], the Hough transform [32], histograms [33,34], correlation cost functions [35,36], and the Normal Distributions Transform (NDT) [37]. A study of perfect match is given in [38], and its comparison to different methods in [39].
For commercial robotics, the cost of traditional LRF sensors is usually unacceptable. Several kinds of low-cost LRF sensors (priced below 600 RMB) have been developed in recent years, such as the Rplidar by Slamtec and the N1-360 by Leishen. Compared with common expensive LRF sensors, low-cost LRF sensors have much poorer performance in range, precision, and resolution. As a result, scan matching may often fail with low-cost LRF sensors, especially when the robot moves quickly in complex environments, which in turn causes mapping and localization failures. High performance and low cost are difficult to balance, yet this balance is essential for realizing commercial robot SLAM. For the commercial application of robotics, adapting SLAM algorithms to low-cost, low-performance sensors is becoming a key problem.
Most LRF-based SLAM methods need a good initial guess for scan-matching in order to narrow the search range and speed up the process. The odometry message is a popular choice [16,17] for that initial guess. However, low-cost LRF sensors have much lower detection frequency and resolution, which may make scan matching between two adjacent frames fail, especially when the odometry drifts.
If the scan data are considered as an image indicating the outline of obstacles near the robot, then the scan-matching problem can be transformed into an image registration problem. Many image registration approaches can match two images over a large displacement without any initial guess for searching. However, scan images are quite different from normal images, since they consist of sparse points collected by the lidar. As a result, typical feature-based image registration methods, such as Scale-Invariant Feature Transform (SIFT) [40], Speeded-Up Robust Features (SURF) [41], and ORiented Brief (ORB) [42], may not be suitable. Instead, the shift of the image must be found by considering all the data in the image; transforming the image to the frequency domain with the FFT is well suited to this. The idea in this paper is mainly derived from FFT-based image registration [43-46], since the scan image is simple and lies in one plane.
In our previous works [47,48], we proposed the FFT-based scan-matching algorithm and its improvement, FFT-ICP. In this paper, we elaborate on the data pretreatment of our previous work. To address the low detection range of low-cost LRF sensors, we add a pre-alignment module based on missing data features that determines a rough rotation angle before FFT processing under certain conditions. We also propose a new solution for FFT-ICP scan-matching with low-cost LRF sensors.
The rest of the paper is organized as follows: In Section 2, we propose robot modeling and data pretreatment, as well as missing data feature extracting and matching; In Section 3, the FFT-based scan-matching algorithm is introduced; In Section 4, the complete solution of the improved FFT-ICP scan-matching is proposed; In Section 5, by using a cleaning robot with only a low-cost LRF sensor in different complex indoor environments, experiments are completed with the proposed method and comparative methods. Section 6 outlines our conclusions.

Robot Modeling
In our work, a two-wheel-driven robot moving on a flat indoor plane was studied, as shown in Figure 1. The scan-matching problem is defined as finding the transformation between two robot poses based on the scan data acquired at each pose. Two coordinate systems are established, the model coordinate system ($X_M O_M Y_M$) and the scene coordinate system ($X_S O_S Y_S$). The transformation parameters are $X$, $Y$, and $\alpha$, where $X$ and $Y$ are the translation parameters between the two poses and $\alpha$ is the rotation angle. The rotation matrix $R$ and translation vector $T$ represent the transformation between the two poses:

$$R = \begin{bmatrix} \cos\alpha & -\sin\alpha \\ \sin\alpha & \cos\alpha \end{bmatrix}, \qquad T = \begin{bmatrix} X \\ Y \end{bmatrix}.$$

Scan Data Pretreatment
In our work, the scan-matching problem is transformed into an image registration problem; as a result, the scan data must be converted to an image. The raw data of one scan collected by the LRF sensor can be represented as

$$D = \{(\theta_i, d_i) \mid i = 1, 2, \ldots, n\}$$

where $(\theta_i, d_i)$ is a scan point detected at direction angle $\theta_i$ with distance $d_i$, and one scan contains $n$ scan points in total. If a scan point is invalid, its distance is $d_i = 0$. Invalid points usually occur in directions where obstacles are out of the measurement range; when using a low-cost LRF sensor with limited detection range, such invalid points often occur in scenes such as corridors and halls.
The image size has a great influence on the amount of computation. Fortunately, the effective detection range of a low-cost LRF sensor is limited. As a result, this paper chose the data within a certain distance from the robot (0 < d i ≤ d max ) as the data source for subsequent calculations.
The corresponding image position $(x_i, y_i)$ of scan point $(\theta_i, d_i)$ is

$$x_i = \frac{w}{2} + \mu\, d_i \cos\theta_i, \qquad y_i = \frac{h}{2} + \mu\, d_i \sin\theta_i$$

where $w$ and $h$ are the width and height of the image and $\mu$ is the factor converting actual distance to pixel distance. A binary image is generated with the value at each $(x_i, y_i)$ set to 1 (obstacle) and all other pixels set to 0 (free). An example scan image is shown in Figure 2.
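The pretreatment above can be sketched as follows. This is a minimal NumPy illustration; the function name, default image size, and the choice of $\mu$ (scaled so that $d_{max}$ fills the half-image) are our own assumptions, not the paper's exact parameters.

```python
import numpy as np

def scan_to_image(angles, dists, w=128, h=128, d_max=3.5):
    """Convert one LRF scan to a binary obstacle image.

    angles, dists: arrays of (theta_i, d_i) scan points (radians, meters).
    Points with d_i == 0 (invalid) or d_i > d_max are discarded.
    mu scales meters to pixels so that d_max fits inside the image.
    """
    img = np.zeros((h, w), dtype=np.uint8)
    mu = (min(w, h) / 2 - 1) / d_max                  # meters -> pixels
    valid = (dists > 0) & (dists <= d_max)
    xs = np.round(w / 2 + mu * dists[valid] * np.cos(angles[valid])).astype(int)
    ys = np.round(h / 2 + mu * dists[valid] * np.sin(angles[valid])).astype(int)
    img[np.clip(ys, 0, h - 1), np.clip(xs, 0, w - 1)] = 1   # 1 = obstacle
    return img
```

A single valid reading at $(\theta, d) = (0, 1\,\mathrm{m})$ then lights exactly one pixel to the right of the image center.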

The Missing Data Features
The raw laser data collected by LRF sensors usually contain missing data caused by out-of-range or unreliable measurements. Such missing data occur more often with a low-cost LRF sensor because of its low detection range. To the best of our knowledge, almost all previous related works ignored or discarded the missing data, assuming the laser sensor was perfect or the missing data were meaningless.

However, such missing data are meaningful. For a robot with a low-cost LRF sensor running in a normal indoor environment without glass walls, a failed detection at a certain direction angle usually indicates that the obstacle in that direction is out of the measurement range. In fact, some navigation applications set such missing readings to the maximum distance value to make the robot run more robustly in practice.
In this paper, we put forward a new approach, called the missing data feature, that makes use of readings with $d_i = 0$ or $d_i > d_{max}$. From the raw scan data, the missing data feature is extracted as

$$f_m = \{(\beta_m, \varphi_m) \mid m = 1, 2, \ldots\}$$

where each angle range of continuous missing data is represented by $(\beta_m, \varphi_m)$: $\beta_m$ is the center angle and $\varphi_m$ is the size of the angle range. In practice, only angle ranges with large $\varphi_m$ are considered, to suppress noise. The size of the feature is the total missing angle

$$f = \sum_m \varphi_m.$$

Figure 3 shows an example of a missing data feature and its appearance in the scan data. For two adjacent robot poses whose missing data features $f_{m1}$ and $f_{m2}$ are large enough, the rotation angle $\alpha$ can be estimated quickly by minimizing a matching error $e(\alpha)$ between the rotated feature of one scan and the feature of the other. If the matching error $e$ is smaller than a certain threshold, $\alpha$ can be taken as the rough rotation angle between the two poses. Note that the missing data features must be large and stable enough for this processing; fortunately, this is not uncommon indoors, especially when using a low-cost LRF sensor.

FFT-Based Scan-Matching
In this section, the scan-matching problem is transformed to an image registration problem, and it is solved by FFT.

Solution of One-Dimensional Fast Fourier Transform (1D FFT)
Consider two functions $f_1(x)$ and $f_2(x)$ with the relationship

$$f_2(x) = f_1(x - x_0)$$

where $x_0$ is the shift between the two functions. Taking the FFT of both sides and applying the shift theorem gives

$$F_2(u) = F_1(u)\, e^{-j 2\pi u x_0}.$$

The cross-power spectrum is

$$\frac{F_1(u)\, F_2^{*}(u)}{\left| F_1(u)\, F_2^{*}(u) \right|} = e^{j 2\pi u x_0}$$

where $F^{*}$ is the complex conjugate of $F$ and $|\cdot|$ is the magnitude. Taking the inverse FFT of the cross-power spectrum yields an impulse function, and the translation parameter $x_0$ can be derived from the position of the peak of the impulse function [33]. Similarly, the 2D relation can be written as

$$f_2(x, y) = f_1(x - x_0,\, y - y_0)$$

where $f_1(x, y)$, $f_2(x, y)$ are two 2D functions and $x_0$, $y_0$ are the displacements between them, with cross-power spectrum

$$\frac{F_1(u, v)\, F_2^{*}(u, v)}{\left| F_1(u, v)\, F_2^{*}(u, v) \right|} = e^{j 2\pi (u x_0 + v y_0)}.$$
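The 1D phase-correlation step can be sketched in a few lines of NumPy. The function name and the small epsilon guarding the normalization are our own choices; with the `conj(F1) * F2` convention, the inverse-FFT impulse peaks at the (circular) shift $x_0$ for which $f_2(x) = f_1(x - x_0)$.

```python
import numpy as np

def phase_correlation_1d(f1, f2):
    """Estimate the circular shift x0 such that f2(x) == f1(x - x0)."""
    F1 = np.fft.fft(f1)
    F2 = np.fft.fft(f2)
    cross = np.conj(F1) * F2
    cross /= np.abs(cross) + 1e-12          # normalized cross-power spectrum
    corr = np.fft.ifft(cross).real          # impulse at the shift position
    return int(np.argmax(corr))
```

For instance, a signal circularly shifted by 5 samples yields a peak at index 5.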

Rotation Parameters
Let $A(x, y)$ be the transformed scan image of $B(x, y)$, with rotation angle $\alpha$, translation $(x_0, y_0)$, and scale factor $\lambda$. In this paper there is no scaling relation between two scan images, so $\lambda = 1$ and

$$A(x, y) = B(x\cos\alpha + y\sin\alpha - x_0,\; -x\sin\alpha + y\cos\alpha - y_0).$$

Taking the FFT of both sides and considering magnitudes, the translation drops out:

$$\left| F_A(u, v) \right| = \left| F_B(u\cos\alpha + v\sin\alpha,\; -u\sin\alpha + v\cos\alpha) \right|.$$

Performing a polar coordinate transformation, with $M(\rho, \theta)$ denoting the magnitude spectrum at radius $\rho$ and angle $\theta$, gives

$$M_A(\rho, \theta) = M_B(\rho, \theta - \alpha)$$

which has the same form as the 1D shift relation above. Using the 1D FFT method, the rotation angle $\alpha$ can therefore be solved. In contrast to traditional image registration, there is no scaling relationship between scan images; the rotation angle can thus be obtained by polar coordinate transformation alone, without further conversion to logarithmic (log-polar) coordinates.
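A rough sketch of this rotation step under stated assumptions: the FFT magnitude spectra are resampled onto rays by nearest-neighbor polar sampling (our simplification), summed over radius into 1D angular profiles, and aligned by 1D phase correlation. Note that the magnitude spectrum of a real image is 180°-symmetric, so this recovers $\alpha$ only modulo 180°; disambiguating the two candidates is omitted here.

```python
import numpy as np

def angular_signature(img, n_theta=360):
    """1D angular profile of the centered FFT magnitude spectrum.

    Rotating the image by alpha circularly shifts this profile by alpha,
    since translation does not affect the magnitude spectrum.
    """
    F = np.abs(np.fft.fftshift(np.fft.fft2(img)))
    h, w = img.shape
    cy, cx = h / 2, w / 2
    radii = np.arange(1, int(min(cx, cy) - 1))
    thetas = np.linspace(0, 2 * np.pi, n_theta, endpoint=False)
    sig = np.zeros(n_theta)
    for k, t in enumerate(thetas):
        xs = np.round(cx + radii * np.cos(t)).astype(int)
        ys = np.round(cy + radii * np.sin(t)).astype(int)
        sig[k] = F[ys, xs].sum()            # sum magnitude along each ray
    return sig

def rotation_angle(img_a, img_b, n_theta=360):
    """Rotation (degrees, mod 180) of img_b relative to img_a."""
    s1 = angular_signature(img_a, n_theta)
    s2 = angular_signature(img_b, n_theta)
    F1, F2 = np.fft.fft(s1), np.fft.fft(s2)
    cross = np.conj(F1) * F2
    cross /= np.abs(cross) + 1e-12
    shift = int(np.argmax(np.fft.ifft(cross).real))
    return shift * 360.0 / n_theta
```

As a sanity check, an image rotated by 90° should yield an estimate of about 90° (modulo the 180° ambiguity).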

Translation Parameters
After the rotation parameter is calculated, the translation parameters can be solved from projections of the images in the horizontal and vertical directions. Let $C(x, y)$ be the grid map obtained after rotating $B(x, y)$ by the rotation angle $\alpha$. Its projections are

$$p_C(x) = \sum_{y} C(x, y), \qquad q_C(y) = \sum_{x} C(x, y)$$

and similarly for $A(x, y)$:

$$p_A(x) = \sum_{y} A(x, y), \qquad q_A(y) = \sum_{x} A(x, y).$$

Since $C$ differs from $A$ only by the translation $(x_0, y_0)$, the projections are shifted copies of one another; converting 2D to 1D in this way, the 1D FFT can be used to calculate the translation parameters.
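This projection step can be sketched as follows, with a helper that converts the circular-shift peak into a signed shift (the helper name and sign convention are our own; circular wrap-around effects at large shifts are ignored).

```python
import numpy as np

def translation_by_projection(img_a, img_c):
    """Translation (dx, dy) of img_c relative to img_a via 1D phase
    correlation of the column/row projections of the two images."""
    def shift_1d(p1, p2):
        F1, F2 = np.fft.fft(p1), np.fft.fft(p2)
        cross = np.conj(F1) * F2
        cross /= np.abs(cross) + 1e-12
        s = int(np.argmax(np.fft.ifft(cross).real))
        return s if s <= len(p1) // 2 else s - len(p1)   # signed shift
    dx = shift_1d(img_a.sum(axis=0), img_c.sum(axis=0))  # column sums -> x
    dy = shift_1d(img_a.sum(axis=1), img_c.sum(axis=1))  # row sums -> y
    return dx, dy
```

For a blob moved 3 pixels right and 2 pixels up (negative row direction), the function returns (3, -2).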
In this way, both the rotation parameter and the translation parameters are ultimately solved by 1D FFT. The FFT scan-matching process is shown in Figure 4.

The FFT-ICP Scan-Matching Framework
Although FFT-based scan-matching is robust to noise and needs no initial pose, its computational cost and accuracy are difficult to satisfy at the same time. In this section, based on our previous work in [47,48], a new FFT-ICP-based scan-matching framework is introduced, which is faster, more robust, and more appropriate for low-cost LRF sensors.
Firstly, after the scan data are converted to image data for further processing, the missing data features f_m1 and f_m2 are extracted. If the missing data features are obvious enough, for example f_m1 > Th1 and f_m2 > Th1, where Th1 is a threshold on the total missing angle of a scan, the missing data features are matched first to determine a rough rotation angle between the two scans (Section 2.3). If the matching error is small enough, the rotation angle estimated in this step serves as an effective initial guess for further processing.
Secondly, the FFT-based scan-matching method of Section 3 is applied to extract the rotation and translation parameters (R_1, T_1). If the first step has already produced a good rotation guess through missing-data feature matching, the FFT rotation-estimation step can be bypassed.
Then, the transformation (R_1, T_1) is applied to the scene scan data, and the ICP precise matching described below is carried out.
According to [26], ICP computes the transformation iteratively. In each iteration, the algorithm selects the closest points as correspondences and calculates the transformation minimizing

$$E(R, T) = \frac{1}{n} \sum_{i=1}^{n} \left\| M_i - (R S_i + T) \right\|^2$$

where $R$ is the rotation matrix, $T$ is the translation vector, $n$ is the number of corresponding points, and $M_i$ and $S_i$ are the model and scene point sets, respectively. Two factors need to be considered: the selection of the closest points, and the calculation of the transformation in each ICP iteration. The time complexity of ICP is dominated by closest-point selection, for which enhancements such as the commonly used k-d tree have been proposed. The most widely used method for calculating the transformation in each iteration is Singular Value Decomposition (SVD), briefly introduced below.
Introduce the centroids $\bar{M} = \frac{1}{n}\sum_i M_i$ and $\bar{S} = \frac{1}{n}\sum_i S_i$, and the centered points $M_i' = M_i - \bar{M}$, $S_i' = S_i - \bar{S}$. The error can then be expanded as

$$E(R, T) = \frac{1}{n}\left( \sum_{i=1}^{n} \left\| M_i' \right\|^2 + \sum_{i=1}^{n} \left\| S_i' \right\|^2 - 2 \sum_{i=1}^{n} M_i'^{\,T} R\, S_i' + n \left\| \bar{M} - R\bar{S} - T \right\|^2 \right).$$

To minimize $E$, the algorithm must maximize the third term and minimize the fourth term. The fourth term reaches its minimum for

$$T = \bar{M} - R\bar{S}.$$

The third term can be written as

$$\sum_{i=1}^{n} M_i'^{\,T} R\, S_i' = \mathrm{tr}(R H), \qquad H = \sum_{i=1}^{n} S_i'\, M_i'^{\,T}$$

and reaches its maximum for $R = V U^T$, where $U$ and $V$ are derived from the SVD $H = U D V^T$. In each iteration, the transformation $(R, T)$ is estimated, the scene points are transformed by $(R, T)$, and the matching error is recomputed, until the error is small enough or the maximum number of iterations is reached. The total transformation estimate $(R_{ICP}, T_{ICP})$ is obtained by accumulating the transformations of all iterations.
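The per-iteration SVD step above can be sketched as a single best-fit computation (the standard Kabsch form). The function name is our own, and the determinant check that guards against a reflection solution is common practice rather than something spelled out in the text.

```python
import numpy as np

def best_fit_transform(M, S):
    """Rigid (R, T) minimizing sum ||M_i - (R S_i + T)||^2 for one ICP step.

    M, S: (n, 2) arrays of corresponding model / scene points.
    """
    Mc, Sc = M.mean(axis=0), S.mean(axis=0)   # centroids
    H = (S - Sc).T @ (M - Mc)                 # H = sum_i S'_i M'_i^T
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T                            # R = V U^T
    if np.linalg.det(R) < 0:                  # guard against reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    T = Mc - R @ Sc                           # T = M_bar - R S_bar
    return R, T
```

Given noise-free correspondences generated by a known rigid motion, the function recovers that motion exactly.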
Finally, the rough estimate (R_1, T_1) and the precise estimate (R_ICP, T_ICP) are composed into the final result:

$$R = R_{ICP} R_1, \qquad T = R_{ICP} T_1 + T_{ICP}.$$

The complete FFT-ICP-based scan-matching framework is shown in Figure 5.
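Assuming, as is standard, that the ICP refinement is applied after the rough FFT transform (so a point maps as $p \mapsto R_2(R_1 p + T_1) + T_2$), the composition of the two estimates can be sketched as:

```python
import numpy as np

def compose(R1, T1, R2, T2):
    """Compose rough (R1, T1) with refinement (R2, T2) applied after it:
    p -> R2 (R1 p + T1) + T2 = (R2 R1) p + (R2 T1 + T2)."""
    return R2 @ R1, R2 @ T1 + T2
```

Applying the composed transform to a point gives the same result as applying the two transforms in sequence.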

Experimental Facilities and Settings
The experimental facilities are shown in Figure 6: on the left is the robot, on the right the RPLidar-A1. The RPLidar-A1 is a low-cost LRF sensor with a 360° field of coverage and a range of up to six meters; its key parameters are listed in Table 1. The computer used was a laptop with an Intel i3 CPU and 2 GB of memory. The experimental data were obtained with the RPLidar-A1 in different indoor environments, including an office, a corridor, and a laboratory room, as shown in Figure 7. The ground truth of the real-world experiments was collected by manual measurement of key positions.
Different experiments were undertaken to evaluate the performance of matching accuracy, matching success rate, computational speed, and moving performance. The methods for comparison included NDT [37], ICP [26], original FFT [48], original FFT-ICP1 [47], and the FFT-ICP2 of this paper.
In the experiments, for FFT processing, the maximum range of the scan data for converting to an image was d max = 3.5 m. The image size for original FFT [48] was set to 256 × 256, while the image size for FFT-ICP1 [47] and FFT-ICP2 was set to 128 × 128.

Scan Matching
For scan matching, two different poses and their corresponding scan data were taken as one set. The ground truth of the poses was given by manual measurement. A series of such data sets from different environments and conditions was used for the experiments.

Figure 8 shows an example of the scan-matching results of the different steps of our proposed method. As can be seen from Figure 8b, estimating the rotation angle through missing data performs well only under certain scenes. After FFT processing, the rough poses were estimated and were close enough to the real pose, as shown in Figure 8c. Finally, as can be seen from Figure 8d, the two scans were matched precisely.

Another example of the matching results of different methods is shown in Figure 9. For the input scans in Figure 9a, ICP converged to a wrong result in Figure 9b, while NDT reported a matching failure in Figure 9c. In Figure 9d, the result of FFT is near the truth but still has errors. FFT-ICP1 and FFT-ICP2 both give accurate results here.

As shown in Figure 10, the success rate of ICP decreases very quickly as the pose difference grows, especially when the rotation angle becomes large, because ICP needs sufficient overlap between the scan points for matching. Although NDT performs more robustly than ICP, it still cannot handle large pose differences. The FFT-based methods are more robust to large pose changes because the rotation estimation is theoretically unaffected by translation. Matching failures for the FFT-based methods mainly come from excessive changes of the overall scan shape caused by different positions.

Table 2 shows the average matching accuracy of the different methods. Comparatively speaking, our original FFT method has larger errors than the other methods, mainly because its processing image is small. The FFT-ICP1 and FFT-ICP2 methods achieve better results because of their coarse-to-fine estimation.

Execution Efficiency
Further experiments were undertaken to evaluate execution efficiency. The average run times of each step of the FFT-ICP2 method are shown in Table 3. In the rotation-estimation step, if the missing-data feature matching of Section 2.3 (which takes only 5 ms) has given a good result, the FFT rotation estimation of Section 3.2 (which takes 103 ms) can be bypassed, saving considerable time compared with FFT-ICP1. The execution efficiency was also measured for the different methods under different environments. As shown in Table 4, both NDT and our proposed FFT approaches need much more time than ICP to achieve a more robust result. Meanwhile, the computational cost of FFT grows with the number of grid cells, so the image size trades accuracy against speed; by down-sizing the image from 256 × 256 to 128 × 128, FFT-ICP1 is faster than the original FFT. For places with obvious missing data features, such as corridors, the FFT-ICP2 proposed in this paper shows a further improvement over our previous methods.

Dynamic Localization
We also completed dynamic localization experiments on the robot mobile platform. In the experiments, the robot ran along a given trajectory at different moving speeds.
As shown in Figure 11, when the speed exceeds 0.15 m/s, the accuracy of the ICP algorithm is poor and the error reaches up to 0.5 m (tracking lost). In addition, due to the vibration of the robot, the curves generated by the ICP algorithm fluctuate greatly. In contrast, the algorithm proposed in this paper remains robust as the speed increases; the localization error is less than 0.2 m at a speed of 0.2 m/s. The curves generated by the proposed algorithm are smoother, so it is also robust to vibration. Figure 12 shows the results of generating maps directly from a series of scans. In these experiments, the robot ran at a linear speed of up to 0.3 m/s and a rotation speed of 25°/s. The given robot trajectories are shown as red stars. As can be seen from Figure 12, the NDT method failed under rotation, while our proposed method gave a better result.
The experiments show that the proposed method in this paper leads to an improvement of robustness and speed, compared with our previous methods. This technique has proven to be helpful for constructing an on-line SLAM system.

Conclusions and Future Work
In this paper, a new approach to scan-matching and localization for low-cost mobile robots using a low-cost LRF sensor is presented. The missing data caused by out-of-measurement-range readings are processed as features for estimating the rotation angle in certain typical scenes. FFT-based matching is applied to determine the rough pose, and an efficient FFT-ICP-based implementation is proposed to reduce the computational cost. The experiments show that the proposed method has an advantage in computational cost over our previous methods. It is also more robust than traditional methods in long-distance matching, which suggests its potential for loop-closure detection in SLAM.
The drawback of this method is its limited detection range, because only the obstacles near the robot are considered for matching; in wide open scenarios, the current method may not work well. For the missing data, we only considered readings caused by exceeding the measurement range. However, for low-cost LRF sensors, missing data can also be caused by total reflection or absorption of the laser. Scenes with a lot of glass remain a big challenge for all laser-based localization and mapping approaches.
In the future, we will enhance the algorithm by scaling the scan image adaptively and by developing quick loop-closure modules based on the image for the SLAM framework, so that the algorithm can adapt to large scenes. We will also carry out further analysis of the missing data to make full use of the information they contain.

Conflicts of Interest:
The authors declare no conflict of interest.