Intelligent High-Resolution Geological Mapping Based on SLIC-CNN

Abstract: High-resolution geological mapping is an important supporting condition for mineral and energy exploration, but it still faces many problems. At present, high-resolution geological maps are still produced by expert interpretation of survey lines, compass measurements, and field data. Field work is constrained by weather, terrain, and personnel, and the working methods need to be improved. This paper proposes a new method for high-resolution mapping using an Unmanned Aerial Vehicle (UAV) and deep learning algorithms. The method uses the UAV to collect high-resolution remote sensing images, relies on a limited amount of ground work to anchor the lithology, and then completes most of the mapping work on the high-resolution images. It moves a large amount of field work indoors and provides an automatic mapping process based on the Simple Linear Iterative Clustering-Convolutional Neural Network (SLIC-CNN) algorithm. A convolutional neural network (CNN) identifies the image content and confirms the lithologic distribution; the simple linear iterative clustering (SLIC) algorithm outlines the boundaries of the rock masses and determines their contact interfaces; and a mode (majority) rule combined with expert decisions fuses the two results into the final map. The mapping method was applied to the Taili waterfront in Xingcheng City, Liaoning Province, China. In this study, the Area Under the Curve (AUC) of the mapping method was 0.937, the Kappa test result was k = 0.8523, and a high-resolution geological map was obtained.


Introduction
Geological mapping is an important part of geological surveys. Geological maps can set the priorities for later geological work and reduce wasted resources. However, producing high-resolution geological maps is a major challenge, especially in hard-to-reach areas, and requires considerable human and material resources [1][2][3][4][5][6]. Difficult terrain and tides have limited the development of traditional geological surveys. In China, the most commonly used large-scale mapping method is still the traditional geological survey. How to automate geological surveys has long been a research hotspot. To automate geological surveys, the following two issues need to be addressed.
Data: Many researchers have obtained satisfactory results with supervised and unsupervised classification of medium- and high-resolution satellite images [7], and traditional geological interpretation work can be done at low resolution. High-resolution geological mapping should be based on the interpretation and classification of high-resolution remote sensing images. However, traditional lithology classification depends heavily on multispectral image data [8][9][10], and traditional classification methods do not perform well at large scales.
In recent years, the cost of lightweight rotor Unmanned Aerial Vehicles (UAVs) has fallen. As a convenient and highly automated image acquisition platform, the UAV can solve the problem of automatic data acquisition. However, the UAV has not yet fully played its role in geological surveys, especially in China.
Identification: Lithology identification is an important part of geological survey work. Lithology recognition is similar to image recognition research, but there are some differences. Surface rocks often do not have a fixed shape, so the identification of rocks must rely mainly on color and texture information [11,12]. As image resolution improves, the details of surface objects become more abundant and their features increasingly complex [13]. All of this makes identification difficult.
To improve the recognition and classification accuracy of surface objects, many researchers have attempted to introduce machine learning algorithms into the geosciences. A machine learning model is trained on specific data to obtain a more generalized model that can summarize information such as a rock's features and structure, and that can markedly improve the accuracy of surface-object recognition in high-resolution images [14][15][16][17]. However, methods based on content recognition cannot accurately determine the edge position of the recognized object, which hinders the application of many recognition algorithms to high-resolution image classification. AlexNet and most other neural networks flip, twist, and perturb the input data to increase recognition accuracy [18][19][20][21][22][23][24][25][26][27][28]. Although this increases recognition accuracy, it also makes the algorithm lose its perception of the location of the recognized object. In geology, determining the location of rock boundaries and tectonic phenomena is very important for research [29][30][31][32]. Machine learning models such as fully convolutional networks (FCNs) can detect the edges of an object while identifying it. However, limited by the algorithm design, an FCN deconvolves its result directly from the feature map, so the edges of complex-shaped objects are not delineated in satisfactory detail. Therefore, separating the target recognition process from the segmentation process is a more common way to improve both recognition accuracy and edge delineation [30,33,34,35,36].
However, it is very difficult for a single algorithm to determine the content boundary while also identifying the content correctly [30,33,34,36]. In response to this problem, our team proposes the Simple Linear Iterative Clustering-Convolutional Neural Network (SLIC-CNN) algorithm for high-resolution mapping work. The convolutional neural network identifies the image content and confirms the distribution of lithology. The SLIC algorithm outlines the boundaries of the rock masses and clarifies their contact interfaces. A mode (majority) rule and expert decision-making are then used to further clarify the boundaries and generate the mapping results. Our team conducted a high-resolution mapping experiment on Taili Beach in Xingcheng City, Liaoning Province, China.

Geological Setting
The Archean basement of the eastern North China Craton (NCC) includes the oldest rocks in China (up to ca. 3.8 Ga) [37]. Sediment deposition mainly occurred during the middle to late Proterozoic [38][39][40]. The study area is geographically located in the Xingcheng-Taili region of western Liaoning Province in northeastern China (Figure 1a), and tectonically in the eastern section of the northern margin of the NCC. It was mapped and investigated in detail regarding its deformation fabrics [41]. A slightly simplified geological map of the study area is shown in Figure 1b.
This area is composed mainly of three granitic suites (Figure 1b), which formed in Neoarchean, Late Triassic, and Late Jurassic times, respectively, and which are also characterized by variable deformation patterns (Figure 1b). These granitic suites are described as follows: (1) the Neoarchean granitic rocks, traditionally referred to as "Suizhong granite", represent components of the Archean trondhjemite-tonalite-granodiorite (TTG) of the NCC's basement, with a formation age of ca. 2.5 Ga; (2) the Triassic (ca. 220 Ma) granitic rocks, including the porphyritic orthogneiss, garnet-bearing granitic aplite, and biotite-syenogranite, intruded into the Neoarchean gneisses (Figure 1b) [37]; and (3) biotite adamellites with a zircon U-Pb age of ca. 150 Ma show a massive structure in the south and a gneissic structure in the north (Figure 1b). The ductile shear zone of the Taili area is dominantly composed of granitic rocks deformed under low- to middle-grade metamorphic conditions, which were documented by Liang et al. [39][40][41].
However, in the Taili area, there are still many geological problems to be solved. The work of predecessors has mainly focused on the petrology, geochemistry, and chronology of the rock mass [42]. The main rock outcrops in the Taili area are in the intertidal zone (Figure 1c). The outcrops can only be observed for about 3 h at ebb tide, so it is almost impossible to draw high-resolution geological maps of this area using traditional geological survey methods, which has left research in this area fragmented. Therefore, making high-quality, high-resolution geological maps is of profound significance for unifying geological understanding of the evolution of the NCC [41]. This study aimed to use drones and artificial intelligence algorithms to provide a rapid method of drawing geological sketches. With the help of the geological data available for this area, the accuracy of the method can also be conveniently verified [37][38][39][40][41][42].
Since 2004, geologists from the College of Earth Sciences of Jilin University have conducted field surveys and measurements of high-resolution lithology-structures in the Precambrian metamorphic crystalline basement rock series exposed in the Xingcheng area, western Liaoning. It was found that the relationships between the various lithologies in the metamorphic rock series in this area are complex, the rocks having undergone extensive tectonic deformation as well as multiple episodes of metamorphism of varying degree and magmatism. Through field and laboratory petrology, zircon U-Pb dating, and other research work, the formation sequence of the main rocks in this area was obtained: (1) quartz dioritic gneiss, 2510 Ma ± 7 Ma (Figure 2C); (2) mylonite, 216.44 Ma ± 0.5 Ma (Figure 2E); (3) pegmatite veins (Figure 2D); and (4) biotite-bearing granite (Figure 2B). The exposed rocks in this area are relatively easy to identify, and the contact edges of the rock masses are clear, which provides good material for lithological image recognition research.

Data Choice
This study used ordinary Red Green Blue (RGB) image data collected by an unmanned aerial vehicle (UAV) as the basic data for classification, for two reasons. First, ordinary RGB images are easier and cheaper to obtain, which matters for promoting the use of UAVs in geological mapping. Second, the UAV can take ultra-high-resolution RGB images. Although their spectral resolution is poor, their rich texture information can be used as an important parameter. The millimeter-scale spatial resolution gives geological workers an observation experience comparable to viewing the outcrop with the naked eye from a distance of about 0.5 m. Later work also showed that ultra-high-resolution color images, combined with ground truth data from the field, achieved satisfactory recognition accuracy in many scenarios.
We used a DJI Phantom 4 Pro UAV with a 1-inch CMOS sensor to fly an S-shaped route about 30 m above the ground. The overlap between adjacent routes was 80%, and the overlap between consecutive pictures was 90%. The ground control points were measured using the Real Time Kinematic (RTK)-based Trimble R2 Integrated GNSS System. After the images were mosaicked and corrected, the high-resolution orthoimage of the area was obtained with Pix4D. The spatial resolution of the image was about 1.3 cm.

Preprocessing of Data
Machine learning has a clear advantage over traditional algorithms in the field of pattern recognition [43][44][45], but machine learning algorithms generally require training data of uniform shape [46]. To introduce machine learning into high-resolution mapping, we used a slicing method to provide standard-sized source data. The slice size needs to be chosen in conjunction with the required accuracy of geological interpretation. For example, geological mapping at a scale of 1:5000 requires rock veins with a surface width of 0.5 m to be represented, so veins with a width of 0.5 m or less must be identifiable. In this study, a 32 × 32 pixel slice was used, with a sampling resolution of approximately 0.35 m.
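The relationship between slice size and ground coverage can be checked with simple arithmetic. The sketch below is only an illustration: it assumes square pixels at the 1.3 cm spatial resolution reported for the orthoimage, and the function name is hypothetical rather than taken from the authors' code.

```python
def slice_footprint(pixel_size_m, slice_px):
    """Return (side length in m, area in m^2) of a square slice on the ground."""
    side = pixel_size_m * slice_px
    return side, side * side


# Assumed inputs: 1.3 cm pixels (reported orthoimage resolution), 32-pixel slices.
side_m, area_m2 = slice_footprint(0.013, 32)
print(f"slice side: {side_m:.3f} m, area: {area_m2:.3f} m^2")

# Check that a single slice is no wider than the 0.5 m vein width to be resolved.
print("slice not wider than 0.5 m:", side_m <= 0.5)
```

Under these assumptions a slice is roughly 0.42 m on a side. That is somewhat coarser than the approximately 0.35 m quoted in the text, which would correspond to a slightly finer pixel size, but it is still below the 0.5 m vein width used as the example requirement.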
Geological interpretation needs to establish not only what the interpreted object is but also where it is; an interpreted object without geographic information is of little use [34]. Therefore, in addition to the geological interpretation work, we also needed to geocode the interpreted objects. Geocoding requires a unique ID that can be matched to a geographic location, which was not considered in previous image recognition algorithms. To make it easier to add other functions in the future, we separated geocoding into its own functional module, established a WebService-based ground-to-interpret translation server, and used MySQL to store the ID and corresponding geographic location of each point.
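The idea of matching a unique slice ID to a geographic location can be sketched as follows. This is a minimal illustration, not the authors' WebService or MySQL implementation: it assumes a north-up orthoimage cut into a regular row-major grid, and the function names, grid width, and origin coordinates are all hypothetical.

```python
def tile_id(row, col, n_cols):
    """Unique integer ID for the tile at (row, col) in a row-major grid."""
    return row * n_cols + col


def tile_center(tid, n_cols, origin_x, origin_y, pixel_size, tile_px=32):
    """Map-coordinate center of a tile; the origin is the image's upper-left corner."""
    row, col = divmod(tid, n_cols)
    tile_size = pixel_size * tile_px
    x = origin_x + (col + 0.5) * tile_size
    y = origin_y - (row + 0.5) * tile_size  # y decreases going down the image
    return x, y


# Hypothetical example: projected origin (500000, 4500000) m, 1.3 cm pixels,
# and a grid that is 120 tiles wide.
tid = tile_id(row=2, col=3, n_cols=120)
x, y = tile_center(tid, 120, 500000.0, 4500000.0, 0.013)
print(tid, round(x, 3), round(y, 3))
```

Storing the ID together with the computed coordinates, for example as one row per slice in a database table, keeps the recognition step independent of the geographic lookup.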
To get closer to actual working conditions, this study did not remove content unrelated to geology, such as tourists, yachts, vegetation, and buildings, but classified it separately. We hoped to increase the ability of the model to recognize "foreign objects" during training of the classification model and to improve generalization and classification accuracy in future applications. Based on the ground conditions in the study area, this study set the number of interpretation categories to 7.
The verification area was approximately 50 × 50 m, and the high-resolution image has a spatial resolution of approximately 1.3 cm. A total of 13,432 slices were cut out, each 32 × 32 pixels in size with a ground projection area of approximately 0.18 square meters. We divided the slices into two groups, namely the training group and the verification group. The number of slices of each category in each group is shown in Table 1. After grouping, the images and classification tags were formatted as uint8 multidimensional arrays and serialized using the pickle module so that they could be read easily during neural network training. In this study, the number of training samples was set to 600 according to the distribution of features in the study area. Except for "Sundries", "boardwalk", and "Granite Dykes", the number of training samples of each feature was less than the number of verification samples. It is worth noting that the training and verification samples were randomly taken from the "verification area", whose area was only 1/8 of that of the study area. This means that the ratio of the number of training samples to the number of validation samples to the number of objects used for classification was about 1:3:32.

Original Intention of SLIC-CNN
In recent years, deep learning, also known as the deep neural network, has attracted the attention of scholars from all fields [44,45], and a large number of deep learning methods have been proposed. Typical deep neural network architectures include the deep belief network (DBN) [46], the CNN, and the autoencoder [25]. Among them, the performance of the deep CNN has improved steadily since it was first designed and optimized by LeCun et al. [47] in 1998. In 2015, CNN classification accuracy surpassed that of humans on the 1000 classes of the ImageNet dataset [47], which contains 1,200,000 training images, 50,000 verification images, and 10,000 test images. The CNN is often designed for processing complex signals, as in computer vision, where it has shown its superiority over other technologies [48]. CNNs are widely used in image classification, speech recognition, traffic sign recognition, medical image analysis, and other applications [49]. This effective technique has also been applied to the classification of high-resolution and medium-resolution remote sensing images [43][44][45][46]. As more research investigates CNNs, the number of improved CNNs has increased markedly, and they are playing an increasingly important role in various fields [50][51][52]. Deep learning is also well suited to the automatic identification of land types in the field, since feature selection is not required.
Semantic segmentation is another important part of deep learning, but its accuracy does not yet match that of pattern recognition. Machine learning models such as fully convolutional networks (FCNs) can detect an object's edges while recognizing it. However, limited by the algorithm design, an FCN deconvolves its result directly from the feature map, so the edges of complex-shaped objects are not delineated in satisfactory detail.
In recent research, separating the target recognition process from the segmentation process is a more common way to improve the accuracy of recognition and edge segmentation [30,33,34,35,36]. This not only improves computing efficiency but also allows researchers to optimize recognition and segmentation separately.
Superpixel segmentation is an effective segmentation solution. It divides the image into a series of sub-regions; the pixels within each sub-region share certain features and are strongly consistent. Superpixel segmentation algorithms are mostly based on color space partitioning and do not consider the meaning of the actual classes. The SLIC superpixel segmentation algorithm [53,54] is a typical representative of this type of algorithm. SLIC measures the distance between pixels within a local image region in the International Commission on Illumination Lab (CIELAB) color space to determine which pixels should be clustered into one superpixel. The algorithm's processing speed and storage efficiency are superior to those of other superpixel segmentation algorithms, and the boundaries it produces adhere closely to the original boundaries in the image.
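The distance that SLIC uses to decide cluster membership can be illustrated with a short function. The sketch below follows the commonly cited form from the SLIC literature, in which a color distance in CIELAB is combined with a spatial distance normalized by the superpixel grid interval; the parameter names are illustrative, and this is not the authors' published code.

```python
import math


def slic_distance(lab_a, xy_a, lab_b, xy_b, grid_step, compactness):
    """Combined color/spatial distance between a pixel and a cluster center.

    lab_*: (L, a, b) color in CIELAB; xy_*: (x, y) position in pixels.
    grid_step: approximate spacing between superpixel centers, in pixels.
    compactness: weight of spatial proximity relative to color similarity.
    """
    d_color = math.dist(lab_a, lab_b)   # Euclidean distance in CIELAB
    d_space = math.dist(xy_a, xy_b)     # Euclidean distance in the image plane
    return math.sqrt(d_color ** 2 + (d_space / grid_step) ** 2 * compactness ** 2)


# Illustrative values only: color distance 5, spatial distance 10 pixels.
d = slic_distance((50, 0, 0), (0, 0), (53, 4, 0), (6, 8), grid_step=10, compactness=10)
print(round(d, 3))
```

A larger compactness value gives more weight to spatial proximity and yields more regular superpixels, whereas a smaller value lets the superpixels follow color boundaries more closely, which is the behavior relied on for outlining rock contacts.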
SLIC and other segmentation techniques have become important methods for edge division [55,56]. However, algorithms such as the iterative self-organizing data analysis technique algorithm (ISODATA), k-means, and superpixel segmentation all use color information as clustering parameters, so what is inside a clustering area does not affect the clustering results. The advantage of deep learning algorithms, by contrast, lies in determining what an object is. Detecting the spatial location of objects is especially important for automatic geological mapping. So, is there a method that meets the needs of both image recognition and spatial positioning? This is the original intention of the SLIC-CNN algorithm.

Structure of SLIC-CNN
The structure of SLIC-CNN is shown in Figure 3. Compared with CNN and traditional clustering classification methods, SLIC-CNN separates edge detection from object recognition and optimizes them separately, which can in theory improve the image processing results. The parallel design of the edge detection and object recognition processes also, in theory, improves the efficiency of the algorithm.
Figure 4 shows how our team uses UAV-related technology in traditional geological mapping work. As shown in the figure, in the data acquisition phase before geological mapping, we can use the UAV, together with existing topographic maps and satellite images, to quickly obtain outcrop photos and the topography of the target area. The main purpose of field reconnaissance is to determine the typical lithologies and their locations within the area and to provide content control for the automatic mapping algorithm. The automatic mapping algorithm based on SLIC-CNN is shown in the dashed box in the figure, and its results still need to be inspected and verified by field work. To facilitate the comparison of remote sensing images with actual conditions, we used a smartphone and the OruxMaps app to record the typical lithology of the ground as lithology samples. A total of 74 ground truth samples were collected in the study area for the calibration of the lithology (Figure 2A).
Figure 5 describes in detail the SLIC-CNN method in the high-resolution geological mapping process. In Figure 5, four polygon boxes are drawn in four colors, which represent the four main parts of the SLIC-CNN method. The blue area indicates the generation of high-resolution UAV aerial images, and the orange area represents the process of convolutional neural network training and classification using high-resolution image slices. The green area represents the process in which the SLIC superpixel algorithm performs edge division within the study area. The purple area represents the process of using the mode and expert decisions to fuse the content identified by the CNN with the edges identified by SLIC.
In detail, after a high-resolution image is obtained, it must be sliced. This experiment used ArcGIS fishnet vector data for batch mask slicing, so that each slice corresponds one-to-one with a cell of the vector mask in space, which is also convenient for geocoding the image recognition results. The CNN model, adapted from AlexNet, was built using the TensorFlow Python framework. The SLIC algorithm was also written in Python, and the relevant code has been uploaded to GitHub. Finally, ArcGIS's Python library was used to connect the SLIC and CNN classification results in geographic space, and the slices were assigned values according to the mode principle.
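The mode principle used in the fusion step can be sketched in a few lines. The text does not spell out the exact procedure, so the following is one plausible reading under stated assumptions: each slice is taken to fall in exactly one SLIC superpixel, every superpixel receives the most frequent CNN label among its slices, and the expert decision step is not modeled. The function name, IDs, and class names are illustrative, and plain dictionaries stand in for the ArcGIS layers.

```python
from collections import Counter, defaultdict


def fuse_by_mode(cnn_labels, superpixel_of_slice):
    """Give every superpixel the most frequent CNN label among its slices.

    cnn_labels: dict mapping slice ID -> class label predicted by the CNN
    superpixel_of_slice: dict mapping slice ID -> ID of the superpixel it falls in
    """
    votes = defaultdict(Counter)
    for slice_id, label in cnn_labels.items():
        votes[superpixel_of_slice[slice_id]][label] += 1
    # most_common(1) yields the modal label; ties go to the label seen first
    return {sp: counter.most_common(1)[0][0] for sp, counter in votes.items()}


# Illustrative data: five slices spread over two superpixels.
cnn_labels = {0: "gneiss", 1: "gneiss", 2: "granite", 3: "granite", 4: "granite"}
superpixel_of_slice = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B"}
fused = fuse_by_mode(cnn_labels, superpixel_of_slice)
print(fused)
```

In this reading, the single "granite" slice inside superpixel A is outvoted, so isolated misclassifications are smoothed out while the region outline still comes from SLIC.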

Deep Learning of SLIC-CNN
We built a neural network modeled on AlexNet. Its main parameters are shown in Figure 6.
The weighted layers of the network comprise four convolutional layers, which is why such a network is called a convolutional neural network (CNN), followed by three fully connected layers. The output of the last fully connected layer is sent to a 7-way Softmax layer, which produces a distribution over the 7 class labels. Softmax is a function that converts the score result into a probability, which is convenient for probability calculation and gradient descent. Our network maximizes the multinomial logistic regression objective, which is equivalent to maximizing the average, over the training samples, of the log-probability of the correct label under the predicted distribution.
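The Softmax conversion and the training objective described above can be written out directly. The following is a minimal pure-Python sketch for illustration, not the TensorFlow code used in the study; the scores are made-up numbers.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mean_log_likelihood(batch_scores, labels):
    """Average log-probability of the correct label over a batch of samples."""
    logs = [math.log(softmax(s)[y]) for s, y in zip(batch_scores, labels)]
    return sum(logs) / len(logs)


# One slice with seven illustrative class scores; the correct class is index 0.
scores = [[2.0, 0.5, 0.1, 0.0, -1.0, 0.3, 0.2]]
probs = softmax(scores[0])
print(round(sum(probs), 6), probs.index(max(probs)))
print(mean_log_likelihood(scores, [0]))
```

Training raises this average log-likelihood toward zero, which is the same as minimizing the cross-entropy loss usually reported by deep learning frameworks.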
Each kernel of a convolutional layer is connected to all the kernel maps of the previous convolutional layer. The neurons in the fully connected layers are connected to all neurons in the previous layer. Response-normalization layers follow the first and second convolutional layers. Pooling layers follow the normalization layers and the fourth convolutional layer.
The first convolutional layer uses 64 kernels with a size of 3 × 5 × 5 and a 1-pixel stride to filter input data with a size of 3 × 32 × 32. The stride is the distance between the receptive field centers of neighboring neurons in the same kernel map. The second convolutional layer takes the output of the first convolutional layer (response-normalized and pooled) as its input and filters it with 64 kernels of size 64 × 5 × 5. The third convolutional layer has 64 kernels of size 64 × 3 × 3 connected to the output (convolved, pooled) of the second convolutional layer. The fourth convolutional layer has 64 kernels of size 64 × 3 × 3. The first fully connected layer has 512 neurons and the second has 256, and each fully connected layer uses dropout to improve the generalization of the model. We also applied random preroll, mirroring, cropping, and stretching to the data before reading it in, to enhance the generalization ability of the model.
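The kernel sizes listed above fix the number of trainable parameters in each convolutional layer. The sketch below does this arithmetic, assuming one bias term per kernel, which the text does not state. The names `conv_params` and `CONV_LAYERS` are hypothetical.

```python
def conv_params(n_kernels, in_channels, k_h, k_w, bias=True):
    """Number of trainable parameters in one convolutional layer."""
    weights = n_kernels * in_channels * k_h * k_w
    return weights + (n_kernels if bias else 0)  # assumption: one bias per kernel


# (name, number of kernels, input channels, kernel height, kernel width),
# taken from the layer description in the text.
CONV_LAYERS = [
    ("conv1", 64, 3, 5, 5),
    ("conv2", 64, 64, 5, 5),
    ("conv3", 64, 64, 3, 3),
    ("conv4", 64, 64, 3, 3),
]


def conv_param_table(layers=CONV_LAYERS):
    """Map each layer name to its parameter count."""
    return {name: conv_params(n, c, h, w) for name, n, c, h, w in layers}
```

Under this assumption the first layer has 64 × 3 × 5 × 5 + 64 = 4864 parameters, and the second layer is the largest of the four because it combines 64 input channels with 5 × 5 kernels.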
In this study, the CNN was mainly used for the recognition of ground objects. The purpose was to accurately classify slices of high-resolution images into seven predefined categories. However, this slice-wise classification cannot accurately locate the edges of features (Figure 7). To solve this problem, we introduced a superpixel segmentation method into the high-resolution mapping process.

Superpixel of SLIC-CNN
The SLIC algorithm mainly uses K-means clustering to generate superpixels. The distance measure in the clustering includes not only the color distance in the color space but also the Euclidean distance between pixel coordinates. Therefore, each K-means cluster center is a five-dimensional vector, consisting of the pixel's values in the Lab color space and its XY coordinates. Since distances in XY coordinates cannot be compared directly with distances in the color space, a compactness parameter is added to balance the two [57,58].
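The five-dimensional distance can be sketched as follows. This is a minimal illustration of the commonly used SLIC distance, D = sqrt(d_c² + (d_s / S)² m²), where d_c is the color distance, d_s the spatial distance, S the grid step, and m the compactness. The function name is hypothetical, and the authors' own implementation may differ in detail.

```python
import math


def slic_distance(p, q, grid_step, compactness):
    """Combined SLIC distance between two 5-D points (L, a, b, x, y).

    grid_step: spacing S between initial cluster centers, in pixels.
    compactness: weight m; larger values favor spatial proximity and yield
    more regular, compact superpixels.
    """
    d_color = math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p[:3], q[:3])))
    d_space = math.sqrt((p[3] - q[3]) ** 2 + (p[4] - q[4]) ** 2)
    return math.sqrt(d_color ** 2 + (d_space / grid_step) ** 2 * compactness ** 2)
```

With a small compactness the color term dominates and superpixels follow color edges closely. With a large compactness the spatial term dominates and superpixels approach a regular grid.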
In addition, after the K-means clustering, a convolution process is needed to merge isolated pixels enclosed by a region into that region's class. The cluster center parameter K sets the number of segments in the result. The optimization of these parameters must take into account both the complexity of the geological conditions and the required mapping accuracy. After many experiments, our team concluded that the number of segments should first be set so that the initial distance between SLIC cluster centers is approximately equal to the step size of the CNN training slices. If the number of segments is too large, the segmentation is too fine: object segmentation and edge recognition improve, but many regions have no corresponding classification result. If the number is too small, fewer regions lack a classification result, but the edge fitting is poor (Figure 8). When the SLIC cluster center step ≈ the CNN slice step, few map spots are missing, the SLIC segment area is approximately equal to the CNN window size, and the edge matching is good. When the SLIC cluster center step >> the CNN slice step, almost no map spots are missing, but there are too few segments and the spot edges are too coarse. When the SLIC cluster center step << the CNN slice step, there are more segments and the edge fitting is better, but more map spots are missing.
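The rule of matching the SLIC step to the CNN slice step can be turned into a quick calculation. In the usual SLIC initialization, K cluster centers are placed on a regular grid with spacing S = sqrt(N / K) for an image of N pixels. The sketch below uses that relation. The function names and the image size in the usage note are hypothetical.

```python
import math


def slic_grid_step(height, width, n_segments):
    """Initial spacing S (in pixels) between SLIC cluster centers."""
    return math.sqrt(height * width / n_segments)


def segments_for_step(height, width, step):
    """Number of segments K whose initial spacing matches a given step."""
    return max(1, round(height * width / step ** 2))
```

For example, a 640 × 640 px image with a CNN slice step of 64 px suggests about 100 segments as a starting value, which can then be tuned against the edge fitting.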
With the SLIC algorithm and appropriate parameters, we can obtain satisfactory edge partitioning. Combining these edges with the content recognition results of deep learning should, in theory, yield a classification with both accurate edges and accurate content. However, some problems of matching content to extent still need to be solved.

Mode and Special Decision of SLIC-CNN
For the segmented image, how to assign to each segment a label obtained by deep learning needs careful consideration. Simply using the CNN's labels to assign SLIC segments is not enough, and particular caution is needed in areas with complex geological conditions. In geological mapping, many small segments, such as dykes and intrusions, have important geological significance. The parameters of the superpixel segmentation algorithm can be adjusted to segment these fine geological bodies as far as possible. However, the sampling rate of the deep learning algorithm cannot be increased indefinitely. Our research shows that the prediction accuracy drops sharply when the pixel size of the recognition target is reduced from 64 px to 32 px and then to 16 px. Even at 64 px, the corresponding sampling step on the ground in this study reaches 7 cm. This still falls short of the high-resolution mapping specifications promulgated by China, which require veins wider than 5 cm to be identified. Therefore, separating finely segmented segments from the background and giving them good-quality labels must be considered. In response, our team proposed a method called "mode and special decision" to solve this problem.
The idea of mode and special decision is to use different decision-making schemes to define the target content in different situations. The mode here refers to the most frequently occurring tag among the slices that fall within a segment generated by the SLIC. We superimposed the classification tag points identified by the CNN on the SLIC segments in space. The following situations then occurred. For the Figure 9A case, we named the segment after the tag that appears most often in the segment. Because the mode may not be a majority (more than 50%), we used the Top_K method to output, for each slice, the three classifications with the highest predicted probability at the softmax layer, and summed them to obtain the classification with the highest total probability. The Figure 9B situation is much simpler: we could directly assign the CNN classification result to the segment.
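The decision for a single segment can be sketched as follows. This is one possible reading of the rule described above, not the authors' code: the mode is accepted when it is a strict majority, and otherwise the top three softmax probabilities of every tag point are summed per class. The function name and the dictionary-based input format are assumptions.

```python
from collections import Counter, defaultdict


def decide_segment_label(predictions, top_k=3):
    """Choose a label for one SLIC segment.

    predictions: one dict per CNN tag point inside the segment, mapping
    class name to softmax probability. Returns None for an empty segment,
    which is left for a later nearest-neighbor fill.
    """
    if not predictions:
        return None  # no tag point falls inside the segment
    top1 = [max(p, key=p.get) for p in predictions]
    if len(top1) == 1:
        return top1[0]  # single tag point: assign its class directly
    label, count = Counter(top1).most_common(1)[0]
    if count > len(top1) / 2:
        return label  # the mode is a strict majority
    # Otherwise sum each point's top-k probabilities per class.
    totals = defaultdict(float)
    for p in predictions:
        for cls in sorted(p, key=p.get, reverse=True)[:top_k]:
            totals[cls] += p[cls]
    return max(totals, key=totals.get)
```

The fallback matters when several classes tie on first-place votes: a class that is consistently ranked second can then outweigh classes that are each ranked first only once.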
In the case of Figure 9C, we left the class of the region as null and then used the nibble algorithm to assign it. The nibble algorithm performs a Euclidean allocation and assigns to each null target area the value of its nearest neighbor.
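The nearest-neighbor fill can be illustrated on a small label grid. The sketch below is a brute-force stand-in for the nibble step, assuming labels are stored cell by cell and null cells are marked with `None`. It is not the ArcGIS tool itself, and the function name is hypothetical.

```python
def nibble_fill(grid):
    """Fill None cells with the label of the nearest labeled cell.

    grid: list of rows, each a list of labels or None. Distance is Euclidean
    between cell indices. Returns a new grid and leaves the input unchanged.
    """
    labeled = [(r, c, v) for r, row in enumerate(grid)
               for c, v in enumerate(row) if v is not None]
    filled = [row[:] for row in grid]
    if not labeled:
        return filled  # nothing to propagate
    for r, row in enumerate(grid):
        for c, v in enumerate(row):
            if v is None:
                nearest = min(labeled,
                              key=lambda t: (t[0] - r) ** 2 + (t[1] - c) ** 2)
                filled[r][c] = nearest[2]
    return filled
```

This version compares every null cell with every labeled cell, which is adequate for a small example. A distance transform would be the usual choice for a full-size raster.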
By merging the SLIC and CNN, we established a high-resolution geological mapping method based on SLIC-CNN. The method uses deep learning to classify high-resolution remote sensing slices and the SLIC to determine the classification boundaries. The two are combined by the mode and special decision principles to produce the final high-resolution mapping results (Figure 10).

In this experiment, we built a SLIC-CNN algorithm platform based on the Python version of the TensorFlow framework. The experimental hardware platform was a dual-processor Intel Xeon E5-2630 v4 system with an NVIDIA Quadro K5200 GPU.

Optimization of CNN
To find the best parameters for the CNN, we tested a total of 15 parameter combinations. For a fixed number of iterations, the total computation time is closely related to the batch size. The training accuracy line chart shows that, for a fixed number of iterations, the larger the batch size, the longer the computation takes, and the smaller the learning rate, the more slowly the model converges. At the same time, the validation accuracy line chart shows that setting the batch size too high causes large fluctuations in the validation accuracy of the model (batch 1024 and batch 2048 in Figure 11). Setting the learning rate too high (0.01) also causes the validation accuracy to fluctuate and leads to poor convergence.
ISPRS Int. J. Geo-Inf. 2020, 9, x FOR PEER REVIEW 12 of 23
Figure 11. Comparison of the computation time of the CNN model under different parameters. The horizontal axis is in seconds, and each bar on the vertical axis is labeled in the format "learning rate_batch".
From Figure 11, we can compare the average accuracy of each model, but we cannot see whether this accuracy is stable. After considering Figures 12-14, and after several experiments and multiple comparisons, we selected the model trained with a learning rate of 0.0001 and a batch size of 512 as the application model for the SLIC-CNN mapping work.

Optimization of SLIC
Among the parameters of the SLIC algorithm, n_segments (K) controls the number of segments. The sigma parameter sets the width of the Gaussian smoothing operator, the max_iter parameter limits the number of clustering iterations, and together max_iter and sigma shape the K-means clustering result. The compactness parameter balances color distance against spatial distance. Different parameter combinations give a variety of segmentations. According to the previous calculations, the number of deep learning slices in the study area is 13,432, and the number of SLIC segments should not exceed this value. To find the best segmentation parameters, we performed 165 experiments with n_segments from 350 to 10,000, sigma from 0 to 8, and compactness from 0.2 to 50, and selected the segmentation with the best edge fitting, namely n_segments = 8000, sigma = 6, and compactness = 5 (Figure 15). The figure shows that, with these parameters, the SLIC segment edges fit the edges of the ground objects well. Although some concentric ring segments are generated, we only need the SLIC to mark the effective edges to complete the task. Merging segments is not part of the SLIC algorithm.

Results of SLIC-CNN
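After obtaining the segmentation results of the SLIC and the recognition results of the AlexNet-based network, and then applying the mode and special decision, we finally obtained the high-resolution geological mapping results for the study area. From the comparison in Figure 16, we can see that the SLIC-CNN algorithm accurately identifies the exposed terrain in the area and annotates its edges in finer detail. The boundaries between mylonites and orthogneiss, between plants and beaches, and between roads and beaches are all more accurate.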
To compare our results with those of existing classification methods, we also classified and interpreted the study area using pixel-based classification with the maximum likelihood method (Table 2) and object-oriented classification with the k-Nearest Neighbor (KNN) method (Table 3). For the object-oriented classification, we performed many experiments and comparisons. The classification was best when the scale parameter was set to 90, the merge parameter to 95, and the texture kernel size to 5.
After deriving the classification results and confusion matrices of these classification methods, we visually compared the mapping accuracy of the four methods in the high-resolution mapping work (Figure 17).
The confusion matrix is an intuitive tool for displaying multi-class results. Cell (x, y) represents the proportion of objects in class x that are classified into class y. From these matrices, the overall hit rate of the pixel-based method is better than that of the object-oriented method, especially for mylonite. The CNN method shows far better hit rates than the pixel-based and object-oriented methods (Table 4). This is consistent with the advantages shown by deep learning algorithms in other image recognition fields. Except for "sundries", SLIC-CNN has a higher hit rate than the CNN method (Table 5). The Kappa result of SLIC-CNN is k = 0.8523, which is also greater than that of the AlexNet method (k = 0.8426).
To compare the results of the four methods more intuitively, we used the receiver operating characteristic (ROC) curve (Figure 18a). The ROC curve is a comprehensive indicator that reflects sensitivity and specificity as continuous variables. It reveals the relationship between the two graphically, with sensitivity as the ordinate and (1 - specificity) as the abscissa. The larger the area under the curve (AUC), the higher the diagnostic accuracy. On the ROC curve, the point closest to the upper left of the graph is the critical value with both high sensitivity and high specificity. In this case, the AUC of the SLIC-CNN algorithm was 0.937, higher than that of the AlexNet slice classification (0.92), the pixel-based method (0.845), and the object-oriented method (0.784) (Figure 18b).
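Both summary statistics used above can be computed from basic quantities. The sketch below shows Cohen's kappa from a confusion matrix of counts and the area under a ROC curve by the trapezoidal rule. The function names are hypothetical, and the code does not reproduce the values reported in this paper.

```python
def cohen_kappa(matrix):
    """Cohen's kappa from a square confusion matrix of counts.

    matrix[i][j] is the number of samples of true class i predicted as class j.
    """
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)


def auc_trapezoid(fpr, tpr):
    """Area under a ROC curve given points sorted by increasing fpr.

    fpr is (1 - specificity) and tpr is sensitivity at each threshold.
    """
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
               for i in range(1, len(fpr)))
```

Kappa corrects the overall hit rate for chance agreement, so a classifier that matches the reference only as often as its class frequencies would predict scores 0 and a perfect classifier scores 1.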
Comparing the SLIC-CNN mapping results with the available geological data (Figure 19) reveals some features of the UAV high-resolution geological mapping method. For ease of comparison and illustration, we marked the typical dykes with the same symbols in the map and underlined the typical veins on the remote sensing image with dark-red lines. Parts covered by concrete or beach due to human factors are indicated by dark-red dotted lines. SLIC-CNN mapping identifies the veins at α1 well, but the result is still not continuous enough. Comparison with the remote sensing images shows that the exposed part of the α1 dyke gradually thins, so that the algorithm could not identify it. The southern part of the α2 dyke was covered by a newly built cement platform, and the ground was blocked by tourists at β3, which made it impossible to identify the veins in these areas. In addition, the recognition of veins at α3 was very effective, and the "S"-shaped dykes were also identified at α4. The vein at β1 was too small, which resulted in poor recognition. The basic dyke at β2 was identified by the algorithm as sand deposition. In general, SLIC-CNN has a high recognition rate for the edges and content of rock masses, and its identification of veins can also serve as a partial reference.
Figure 19b is a fine tectonic lithologic geological map drawn by Jin Wei and Zheng Changqing, College of Earth Sciences, Jilin University, in 2004. This map has long been the most detailed geological map of the region. However, the edges of the mylonite body are drawn inaccurately in it (w1, w2, and w3 in Figure 19b).
The result of the automated geological mapping process (Figure 18a) is in the shapefile format, a working and interchange format promulgated by Environmental Systems Research Institute, Inc. (ESRI) for simple vector data with attributes, which is very convenient for merging, cropping, and adjusting attributes. On the basis of Figure 18a, the data obtained from the ground survey were used to refine the automated results, producing a high-resolution geological map of the Taili area (Figure 20). Compared with previous studies, this geological map has higher resolution and coverage. Moreover, thanks to the automated SLIC-CNN method, it took only half a day to draw this map, a great improvement in efficiency over the traditional geological survey method.

Discussions
The popularity of UAVs offers an opportunity to solve geological survey problems in difficult areas. Although we minimized ground geological operations in this experiment, high-resolution geological mapping that appears to be automated still depends on a certain amount of ground geological work. At this stage, the method is limited by image conditions, and it is not yet possible to accurately distinguish certain lithologies in this study area, such as pegmatites and fine-grained rocks. Moreover, Table 5 shows that, even with the SLIC-CNN method, the correct classification rate of dykes in the study area remains low. This is related to the data used in the experiment: the spatial resolution reached 3 cm, and almost all of the small dykes in the study area were plotted. It is also worth noting that UAVs and cameras cannot collect information on rock covered by sand, soil, or vegetation, which limits the application of automated geological surveys. Much research remains to be done on the application of automation in geological mapping.
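To make the notion of a per-class correct classification rate concrete, the following minimal Python sketch computes it from a confusion matrix, together with Cohen's kappa as an overall agreement measure. The function names and the counts are hypothetical and chosen for illustration only; they are not the values of Table 5.

```python
def per_class_rate(confusion):
    """Fraction of each true class that was labelled correctly.

    confusion[i][j] = number of samples of true class i predicted as class j.
    """
    rates = []
    for i, row in enumerate(confusion):
        total = sum(row)
        rates.append(row[i] / total if total else 0.0)
    return rates


def cohen_kappa(confusion):
    """Cohen's kappa: agreement between truth and prediction beyond chance."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    # Chance agreement from the row (true) and column (predicted) totals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion) for i in range(k)
    ) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical counts, NOT taken from Table 5.
# Rows are true classes, columns are predicted classes,
# in the order (sand, mylonite, dyke).
example = [
    [90, 8, 2],
    [6, 88, 6],
    [10, 12, 8],
]

rates = per_class_rate(example)
kappa = cohen_kappa(example)
print(rates, kappa)
```

In this illustrative matrix the two large classes are classified well while the small dyke class is not, so the overall agreement can look acceptable even when one class has a low correct classification rate.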
Nevertheless, the experiments showed that using UAVs and deep learning algorithms in traditional geological mapping work greatly increases work efficiency, and the resulting mapping accuracy is also satisfactory. UAV technology brings new solutions to ground geological surveys.
Automatic detection of lithological boundaries has been shown to greatly improve the efficiency of geological mapping [59]. Building on Vasuki's experiment, this study used a CNN to further improve the automation of mapping. Compared with traditional classification, SLIC-CNN detected lithology and boundaries more accurately and automatically. The SLIC-CNN technique also provides important ideas for automated geological mapping. With high-accuracy image recognition technology, we believe that future geological surveys will become more automatic and intelligent: UAVs will fly along planned routes and take ground images, and important geological phenomena in the images will be quickly identified and located. The related information will be sent back to a server and quickly mapped. Significant and uncertain areas will be flagged for geologists to investigate on the ground.
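The combination of CNN labels with SLIC superpixels by a mode (majority) rule, as described for this method, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the study: the function name `fuse_by_mode`, the nested-list inputs, and the toy labels are hypothetical, and the expert decision step is not modelled.

```python
from collections import Counter


def fuse_by_mode(segments, pixel_labels):
    """Give every pixel the most common label within its superpixel.

    segments[r][c]     -- superpixel id of pixel (r, c), e.g. from SLIC
    pixel_labels[r][c] -- class predicted for pixel (r, c), e.g. by a CNN
    """
    votes = {}
    for seg_row, lab_row in zip(segments, pixel_labels):
        for seg_id, label in zip(seg_row, lab_row):
            votes.setdefault(seg_id, Counter())[label] += 1
    # On a tie, the label encountered first within the segment wins.
    winner = {seg_id: c.most_common(1)[0][0] for seg_id, c in votes.items()}
    return [[winner[seg_id] for seg_id in seg_row] for seg_row in segments]


# Toy 2 x 4 image with two superpixels (hypothetical data).
segments = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
pixel_labels = [
    ["mylonite", "mylonite", "dyke", "sand"],
    ["mylonite", "dyke", "sand", "sand"],
]

fused = fuse_by_mode(segments, pixel_labels)
for row in fused:
    print(row)
```

Because each superpixel receives a single label, the class boundaries in the fused map follow the SLIC segment edges rather than the coarser pixel-wise predictions.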

Conclusions
In this study area, the results of the new large-scale mapping process are satisfactory. The acceleration achieved by using UAVs and deep learning algorithms is evident. For geological body discrimination, the SLIC-CNN method performed much better than the image classification methods it was compared with, such as CNN and AlexNet, and its fit to rock mass edges was further improved relative to the ordinary deep learning method. Compared with traditional geological surveys, more accurate rock mass mapping results can be obtained with the help of drone images. Although the automatic classification results of SLIC-CNN were inadequate for rock vein recognition, the mapping accuracy for the other rock masses was satisfactory. The method can also greatly reduce the manual labor of high-resolution mapping and produce a high-resolution geological sketch with high classification accuracy in less time. Based on the UAV imagery, the SLIC-CNN method, and further tectonic evidence, we obtained a high-resolution geological map and delineated the ductile shear deformation zone in the region, confirming the results of previous studies on the ductile shear zone in the Taili area. The SLIC-CNN method accelerates the interpretation of high-resolution remote sensing images, improves accuracy, and makes the classification results more geologically convincing. The application of the SLIC-CNN algorithm to geological mapping is therefore satisfactory and worth promoting. These methods and results provide useful experience for subsequent geological work and enrich the means of geological research by relieving the personnel and environmental pressures of field surveys.