A Self-Adaptive Mean Shift Tree-Segmentation Method Using UAV LiDAR Data

: Unmanned aerial vehicles using light detection and ranging (UAV LiDAR) with high spatial resolution have shown great potential in forest applications because they can capture vertical structures of forests. Individual tree segmentation is the foundation of many forest research works and applications. The tradition ﬁxed bandwidth mean shift has been applied to individual tree segmentation and proved to be robust in tree segmentation. However, the ﬁxed bandwidth-based segmentation methods are not suitable for various crown sizes, resulting in omission or commission errors. Therefore, to increase tree-segmentation accuracy, we propose a self-adaptive bandwidth estimation method to estimate the optimal kernel bandwidth automatically without any prior knowledge of crown size. First, from the global maximum point, we divide the three-dimensional (3D) space into a set of angular sectors, for each of which a canopy surface is simulated and the potential tree crown boundaries are identiﬁed to estimate average crown width as the kernel bandwidth. Afterwards, we use a mean shift with the automatically estimated kernel bandwidth to extract individual tree points. The method is iteratively implemented within a given area until all trees are segmented. The proposed method was tested on the 7 plots acquired by a Velodyne 16E LiDAR system, including 3 simple plots and 4 complex plots, and 95% and 80% of trees were correctly segmented, respectively. Comparative experiments show that our method contributes to the improvement of both segmentation accuracy and computational e ﬃ ciency.


Introduction
Conventionally, forest inventory has been implemented by field survey which is time-consuming and labor-intensive [1,2]. Moreover, because forest attribute information is affected by topography and the environment, the parameters that can be measured manually are very limited [3]. Over the poast three decades, airborne light detection and ranging (LiDAR) technology has been increasingly used to estimate forest structural attributes [4][5][6]. Airborne LiDAR data provide detailed three-dimensional (3D) information of vertical forest structures to estimate forestry parameters at both individual and stand tree levels [7]. With the technology development, compared to airborne LiDAR, unmanned aerial vehicles (UAV LiDAR) have gained a reputation in forest inventory surveying because of their increased data resolution, lower costs, flight flexibility, and simple operation [6,[8][9][10].
Forest inventory requires accurate individual tree structures to directly obtain forest structural attributes (i.e., crown width and tree height) or estimate height-related attributes by establishing regression models (such as diameter at breast height (DBH), volume, and aboveground biomass (AGB)) [11][12][13][14]. Thus, the segmentation of individual trees becomes the key step to highly accurate forestry parameter estimation. However, the existing problems of individual tree segmentation using LiDAR data are mainly caused by: (1) adjacent, overlapped tree crowns, and various tree sizes, which result in serious omission errors in multi-layered forest stands [15]; (2) irregular tree shapes (such as a branch of a tree extending outward widely and resembling a small tree), which lead to commission errors [7]; (3) suppressed or small trees located below tall trees, which are difficult detected because LiDAR systems are less likely to penetrate the dense forest canopy and have less information of suppressed trees.
To solve the aforementioned issues, many methods have been developed. Some studies make use of height variation of canopy height model (CHM) to find treetops by local maxima (LM) filtering [16] or moving variable window [17,18]. Then, tree crowns are mainly delineated by image-processing algorithms designed for edge detection and feature extraction, such as region growing [19], watershed analysis [17,20], spatial wavelet analysis [21], and template matching [22,23]. The CHM-based tree detection algorithms are fast and efficient but easy to produce omission and commission errors [24]. The tree detection accuracy is sensitive to the resolution of CHM and the noise caused by irregular tree shapes and overlapped trees. In addition, because of a variety of canopy sizes, it is difficult to select the optimal resolution and smoothing factor suitable for the whole scene [7,25,26]. To improve the detection rate, the vertical and horizontal structures of point clouds are analyzed. Liu et al. [7] integrated a multiscale CHM-based algorithm and a point-based vertical profile analysis algorithm to improve individual tree segmentation accuracy. Paris et al. [24] improved the segmentation accuracies of dominant trees by combining CHM data and horizontal profiles of point clouds and extracted subdominant trees by analyzing vertical profiles. By using a marker-controlled watershed segmentation method, Harikumar et al. [27] first segmented a point cloud into a set of segments, each of which was projected onto a novel 3D space to separate subdominant from dominant trees.
Because of the increased spatial resolution of UAV LiDAR data, some studies, to improve segmentation accuracy, segmented individual trees directly from point clouds. The existing point-based clustering methods (such as region growing [17], k-means clustering [26], normalized cut [28], and mean shift [29,30]) have been widely used for individual tree segmentation in terms of 3D data characteristics, geometric structures, and height variation of trees [31][32][33]. Compared with other clustering methods, mean shift does not require seed points or number of clusters before clustering. It proves to be robust to segment various kinds of trees [34,35]. Mean shift aims to move each data point to the densest area within a certain neighborhood by iteratively performing shift operations based on a kernel density function. The kernel function determines the weights of neighboring points for re-estimating the local maximum. Eventually, the points will converge to the local maxima of density [29]. Ferraz et al. [30] developed a fixed bandwidth mean shift method, in which a horizontal and a vertical kernel functions were designed to shift data points to a denser and higher region until it converged to the treetop. Ferraz et al. [36] further improved it by establishing a self-calibrated bandwidth model, which relates tree heights to crown widths and depths by extracting easily recognized individual trees. The method required complicated pre-processing and the established allometric model might not be suitable for all tree species [29]. Hu et al. [29] proposed an adaptive mean shift-based clustering approach, in which the key parameter-kernel size-was automatically changed to reduce over-and under-segmentations according to the estimated crown diameters of distinct trees, and nearly 30% more suppressed trees are identified compared to the fixed bandwidth-based methods. However, point-based methods are generally highly computational when having been directly applied to the whole scene [7,33]. Hamraz et al. [35], to improve computational efficiency, generated vertical profiles of point clouds in different directions within a local region, then detected crown boundary for each profile. This method greatly improved computational efficiency and required no prior knowledge of the shapes and sizes of tree crowns [35].
In summary, in mean shift methods, the kernel bandwidth is a major factor, which has a strong impact on the segmentation accuracy and clustering speed. The fixed kernel bandwidth was widely used in segmentation, but it cannot adapt to heterogeneous crown sizes, and thereby might result in omission errors or commission errors. For example, if a small kernel bandwidth is applied to large objects, commission errors will occur (that is, a single tree might be segmented into several clusters); whereas if a large kernel bandwidth is applied to small objects, omission errors will greatly appear (that is, more than two trees are clustered into one segment). Our previous research in [37] used fixed bandwidth mean shift method to segment individual trees and reduced omission errors by normalized Cut algorithm (NCut). The method improved tree segmentation accuracy and also increased computational complexity. Therefore, in this paper, to reduce omission and commission errors, we propose a self-adaptive mean shift method, where an adaptive kernel bandwidth is used according to local tree structure information for individual tree segmentation. Moreover, because mean shift methods are generally highly computational when directly applied to LiDAR data, a 3D space division strategy is used to improve tree segmentation efficiency. In addition, point density is another important factor affecting individual tree detection, we also evaluate the effects of point density on individual tree detection. The remainder of the paper is organized as follows: Section 2 details the proposed self-adaptive mean shift segmentation method. Section 3 describes the study area and the test data including UAV LiDAR data and field survey data, and then analyzes the segmentation results. Concluding remarks are given in Section 4.

Methodology
In this paper, the proposed method segments individual trees by the following steps. Ground points are separated from non-ground points, and then non-ground points are normalized for the data pre-processing. Then, the optimal kernel bandwidth for mean shift is automatically estimated without any prior knowledge of crown size to extract individual tree points. The proposed self-adaptive mean shift method is iteratively implemented within a given area until all trees are segmented.
In the pre-processing process, because ground points affect segmentation results, we remove them by a cloth simulation (CSF) algorithm [38]. Afterward, to reduce the influence of topography on the following tree segmentation, we normalize non-ground points according to the produced digital terrain model (DTM) interpolated by the filtered ground points. The details can be seen in [37].

Self-Adaptive Kernel Bandwidth Determination
The kernel bandwidth is a critical parameter of the mean shift method. In this study, it represents the region where the local maximum exists, thereby we adaptively determine it according to the related tree crown size. The kernel bandwidth is determined by the following steps.

1.
3D space division, which locates the global maximum point in a point cloud as the center point to divide the point cloud into a set of angular sectors at multiple directions.

2.
Crown surface generation, which generates a vertical profile for each sector and extracts LiDAR canopy surface points to simulate a crown surface.

3.
Kernel bandwidth determination, which, from the generated crown surface, a crown boundary is delineated to automatically determine the kernel bandwidth according to the between-tree gap and height variation.
The rest of this section details the delineation of the kernel bandwidth.

3D Space Division
We first locate the highest point in a point cloud as the center point, X c = (x c , y c , z c ), which is defined as the apex of the highest tree. With the center point, X c , we divide the 3D space into n angular sectors within a given search radius R (R must be larger than the largest trees in the scene to be processed) (see Figure 1). For each angular sector θ j Є[2πj/n, 2π(j + 1)/n] (j = 0,1 . . . n-1), we search for the points X i = (x i , y i , z i ) (i = 1, 2 . . . .n, n is the number of the points) belonging to. The set X i belonging to angular sector θ j is defined as:

Crown Surface Simulation
For each angular sector (see Figure 2a), we generate a vertical profile of LiDAR points and extract potential canopy surface points. We first calculate the distance 2 between X c and X i . Then, a vertical profile is generated in terms of coordinates z and d (see Figure 2b). In this study, only LiDAR canopy surface points are kept to simulate the crown surface. As seen in Figure 2c, in doz plane, points are divided into a set of grids with a given grid size , each of which the maximum height is kept to simulate the crown surface.

Kernel Bandwidth Determination
After the crown surface simulation, the crown boundary can be delineated according to between-tree gaps and height variations. In the canopy surface points, the local minimum points always exist between adjacent tree crowns. Starting from X c , only the first local minimum point is remained for further analysis. The distance between center point X c and the local minimum point is identified as a crown radius. To smooth extended branches and reduce false local minimum points, a Gaussian filtering is first applied to the simulated crown surface points. After estimating all crown radii at multiple directions, the average crown radius is calculated as the kernel bandwidth.
The determination of kernel bandwidth is completely based on the coordinate information of LiDAR points without any prior knowledge. Moreover, the division of the 3D space decreases the complexity of 3D forest structure and improves the computational efficiency.

Mean Shift-Based Individual Tree Segmentation
With the determined kernel bandwidth, the individual tree points are extracted by mean shift, which gains its reputations for great performance in individual tree segmentation. In this study, a horizontal and a vertical kernel functions (g s and g r ) are designed to shift data points X i to treetop X c . The offset vector is defined by: where the superscripts s and r refer to the horizontal and vertical directions, respectively. h s and h r represent the horizontal and vertical bandwidths. g s is the horizontal kernel function that follows Gaussian function: g r is specially-designed for assigning a larger weight to the highest voxel: where mask (x c r ,x i r ) represents a mask of foreground object; dist(x c r ,x i r ) is the distance between X i and the boundary of the mask. They are defined by, More details can be found in literature [31]. After detecting and extracting the tallest tree points, the method will automatically search the tallest point in the remaining points. The method was iteratively implemented within a given area until all individual tree points were segmented, and eliminate clusters containing less than 50 points to reduce interference from branches.

Datasets
The individual tree segmentation method was applied within the forest stands in Dongtai forest farm, located in Yancheng City, Jiangsu, China. A field survey was conducted within 7 square plots with two main tree species (Metasequoia in plots 1-3, and Poplar in plots 4-7), and each plot was 30 m × 30 m in size. The numbers of the reference trees for Plots 1~7 were 29, 54, 54, 18, 20, 21, and 42, respectively. Accordingly, their forest densities were 0.032, 0.060, 0.060, 0.020, 0.022, 0.023, and 0.047 trees/m 2 , respectively. Each reference tree was located by a backpack laser scanning system, and its DBH was manually measured using a diameter tape and recorded in July 2017. The full specifications of the Backpack laser-scanning system are listed in Table 1. The UAV LiDAR data were collected by a Velodyne 16E (VLP16) laser-scanner system placed on a GV1300 multi-rotor UAV platform in July 2017. Overall, the LiDAR data have an average point density of 40.57 points/m 2 . According to the complexity of forest structure, we divided 7 plots into 3 simple plots (Plots 1-3 with regular and coned shape, see Figure 3a) and 4 complex plots (Plots 4-7 with extended and irregular branches and varied tree size, see Figure 3b). The details of UAV LiDAR system can be seen in [37].

Individual Tree Segmentation
The proposed method was performed using matlab 2016a on an HP Z820 eight-core-16-thread workstation. To evaluate the performance of the proposed individual tree segmentation method, we tested it on the UAV LiDAR data. In this study, the highest point of the detected tree is located as the tree, and we compared it with the reference tree locations in 7 plots. The LiDAR-derived tree is paired with a closest reference tree. If the difference between the derived tree height and the height of the reference tree is smaller than 0.5 m and more than 80% points are extracted according to visual inspection, we consider that the tree is correctly segmented.

Parameter Analysis
There are several parameters in our self-adaptive mean shift tree segmentation. The parameter involved in the pre-processing can be found in the literature [37]. Our self-adaptive mean shift individual tree segmentation method involves five parameters: h s , h r , R, n, and . Among the five parameters, horizontal kernel bandwidth, h s influences the accuracy and efficiency of individual tree segmentation. The value of h s was determined automatically according to tree structure. Vertical kernel bandwidth, h r , controls the canopy depth to determine the presentence of treetops. According to literature [30], h r was set to 5 m. Search radius, R controls a neighboring size to include the largest tree in the study area. According to our test data, R was set to 5 m. The number of angular sector, n controls the details of a tree canopy in different directions. In each direction, the possible crown width was analyzed along the vertical profiles of point clouds. , the grid size of the distance between point set X i and center point X c , controls the details of a tree canopy surface.
Parameters n and play a significant role in the adaption of kernel bandwidths, leading to strongly impacting on tree segmentation accuracy. Therefore, we designed two groups of experiments to investigate the sensitivity of the proposed self-adaptive mean shift tree segmentation method to the selection of the parameters n and .
In the first group, we held n = 8. To guarantee at least one LiDAR point per interval in terms of point density, we set the maximum value of to 0.5 m and varied from 0.1 to 0.5 at intervals of 0.1. The test results are presented in Table 2. As shown in Table 2, the DET values of the segmented trees decrease as the grid size increases from 0.1 m to 0.5 m, correspondingly, the OM values increase from 0.13 to 0.28. Whereas, the COM values tend to be stable as parameter changes from 0.3 m to 0.5 m. The reason behind this phenomenon might be that the tree details are gradually missing as the grid size increases, leading to the increase of the OM value and finally the decrease of the overall segmentation accuracies. As such, in our study, the best tree segmentations were obtained at interval = 0.2. In the second group, we held = 0.2 m and varied n = 2, 4, 6, 8, and 12. As Table 2 shown, the'DET values increase and the omission errors decrease with the increase of divided 3D space sectors, which means the smaller the angular sector, the more accurate the crown width estimates. When n = 12, the DET value changes slightly and the commission error increases. The reason is that the more extended branches might be included because of the smaller angular sector. In this paper, the n = 8 obtained the best tree segmentation performance.

Sensitivity Analysis of Point Density on Individual Tree Segmentation
To evaluate the effects of point densities on individual tree detection, we decimated original UAV LiDAR data to lower density of 50% (20 points/m 2 ), 25% (10 points/m 2 ), and 10% (4 points/m 2 ), using BACL LiDAR Tools [39,40]. Table 3 shows the individual tree detection accuracy under different point densities. The 100% point density achieved the highest individual tree detection accuracy, 0.87, and as the point density decreased from 100% (40 points/m 2 ) to 10% (4 points/m 2 ), the accuracy decreased from 0.87 to 0.32. Although the tree segmentation accuracy of our method decreased with a decrease of point density, the UAV LiDAR data guarantee the satisfactory tree segmentation results due to its high data resolution.

Overall Performance
After parameter sensitivity analysis, we set h r = 5 m, R = 5 m, = 0.2 m, and n = 8. h s was automatically determined. Figure 4a-f shows the visualized segmentation results of our method. Table 4 lists the tree segmentation results for seven plots (three plots for Metasequoia with simple tree structure, and four plots for Poplar with relatively complex tree structure). As seen in Table 4, for the simple plots, the DET, OM, and COM values obtained 0.95, 0.05 and 0.08, respectively; while for the complex plots, the DET, OM, and COM values achieved 0.80, 0.20, and 0.10, respectively. The segmentation accuracy of the complex plots was much lower than that of the simple plots; 20% of trees were undetected in complex plots. That was caused by the irregular tree structure of Poplar tree and the overlapped branches which made the treetop not distinct. The clusters containing less than 50 points were removed to reduce interference from branches, therefore, some small trees or branches were not included; 10% trees were falsely detected. That could be explained by the extended branches of the object trees being misclassified to neighboring trees. Due to the irregular tree structure and extended branches of complex plots, some individual trees failed to be detected and some branches were misclassified, resulting in the decrease of the segmentation accuracies. It was observed that most canopy shapes tend to be round in this study, a round search radius was defined to determine the segmented trees.

Comparative Tests
To evaluate the effectiveness of the proposed individual tree segmentation method, we designed a group of experiments and compared it with other three methods, including marker-controlled watershed segmentation [17], fixed bandwidth mean shift [30], and our previous tree segmentation method, Yan's method [37]. The parameters used in those comparative methods are listed in Table 5. The visualized results of all methods are shown in Figure 5, respectively. The average segmentation results of all sample plots are listed in Table 6. The marker-controlled watershed segmentation method achieved a better tree detection performance with the DET, OM, and COM values of 0.87, 0.13, and 0.13, respectively. In Figure 5, the results of the marker-controlled watershed segmentation was performed using LiDAR360 produced by Green Valley International (https://www.lidar360.com/archives/5135. html), green dots and black circles represent tree locations and canopy radii derived from watershed segmentation, respectively, and black crosses represent reference tree locations. The method performed better in the sample plots because of the prominent treetops and the circular crown shape of Metasequoia, which made the height variation of the raster surface model-CHM distinct. However, the crown boundaries were not delineated correctly, especially in Plots 4-7 (sample plots of Poplar). This is because the watershed segmentation algorithm considers only height variation of interpolated data, thereby it is incapable of dealing with extended and irregular tree shapes. The fixed bandwidth mean shift method, which uses a single bandwidth for a whole scene, achieved a worse segmentation accuracies with the DET, OM, and COM values of 0.84, 0.16, and 0.11, respectively. Note that the fixed kernel bandwidth method is incapable of dealing with the existence of various canopy sizes and overlapped canopy, which results in obvious omission errors and further decreases the DET value. Although Yan's method was mainly based on the fixed bandwidth mean shift, the segmentation accuracy was improved by using NCut, which gained its reputation for segmenting multiple overlapped objects. Yan's method achieved the DET, OM, and COM values of 0.90, 0.10, and 0.11, respectively, whose segmentation performance was slightly better than our method (the DET, OM, and COM values of 0.87, 0.13, and 0.09, respectively). Our method lost some small branches or trees compared to Yan's method because we eliminated clusters containing less than 50 points to reduce interference from branches, e.g., Figure 5(b3,b4). However, by contrast with Yan's method, our proposed method improved the segmentation accuracy by automatically adjusting the kernel bandwidth according to trees' shape and size.    [37], and (4) Self-adaptive bandwidth mean shift.  [37] 0.90 0.10 0.11 DET is the rate of trees correctly detected; OM is the rate of undetected trees, and COM is the rate of falsely detected trees.
In computing performance, we compared the running time of the four methods within a sample plot with 248,219 points. The marker-controlled watershed segmentation method presented that the time required was less than 1 second, outperformed the other three methods. This is because, compared to the other three methods, the marker-controlled watershed segmentation method was performed on the two-dimensional data. For the mean shift-based methods, the time required was at the same level (i.e., two minutes for our self-adaptive bandwidth method and nine minutes for the fixed bandwidth method). Among these methods, Yan's method presented the time required was 20 minutes. The reason behind this is that Yan's method iteratively used NCut to segment overlapping trees, leading to an increase of time consumption. Comparatively, our self-adaptive bandwidth method improved the computing performance by using the strategy of 3D space division.
In summary, Yan's method and our proposed method outperformed the other two methods in tree segmentation accuracies, but our method is much more effective than Yan's method in computational efficiency. Although the marker-controlled watershed segmentation method required less time than our proposed method for segmenting trees from the UAV LiDAR data, the tree crowns are not delineated correctly. Comparatively, in terms of segmentation accuracies and computational complexity, our proposed method is feasible for the individual tree segmentation from the UAV LiDAR data.

Conclusions
In this paper, we proposed a self-adaptive mean shift tree-segmentation method which improved tree segmentation accuracy and computational efficiency using UAV LiDAR point clouds. The key parameter in mean shift-the horizontal kernel bandwidth-was automatically estimated by analyzing canopy vertical structures at multiple directions. The estimation of the kernel bandwidth was without any prior knowledge of crown sizes or seed points. The performance was evaluated on three simple plots and four complex plots, and achieved the overall tree segmentation accuracies of 0.95 and 0.80, respectively. Comparative tests demonstrated that our proposed method can achieve both a better tree segmentation performance and a higher computational efficiency. Therefore, the self-adaptive mean shift method shows the applicability and feasibility for tree segmentation.
In this study, our method shows its superior to the severely overlapping and irregular trees in the UAV LiDAR data. However, due to a few understory trees in this study area, in the future we will work on the algorithm or integrate terrestrial LiDAR data to further deal with that. The tree species in our LiDAR data are mainly planted forests, and natural mixed-species forests from tropical regions will be considered in future studies.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: