Application of Crack Identification Techniques for an Aging Concrete Bridge Inspection Using an Unmanned Aerial Vehicle

Bridge inspection using unmanned aerial vehicles (UAVs) with high performance vision sensors has received considerable attention owing to its safety and reliability. As bridges age, the number of bridges that need to be inspected increases, and so do the associated maintenance costs. A UAV-based bridge inspection method with vision sensors is therefore proposed as one of the promising strategies for maintaining bridges. In this paper, a crack identification method using a commercial UAV with a high resolution vision sensor is investigated on an aging concrete bridge. First, a point cloud-based background model is generated in a preliminary flight. Then, cracks on the structural surface are detected with a deep learning algorithm, and their thickness and length are calculated. In the deep learning method, transfer learning based on regions with convolutional neural networks (R-CNN) is applied. As a result, a new network for the 384 collected crack images of 256 × 256 pixel resolution is generated from the pre-trained network. A field test was conducted to verify the proposed approach, and the experimental results show that UAV-based bridge inspection is effective at identifying and quantifying cracks on structures.


Introduction
Continuous safety monitoring and maintenance of infrastructure such as bridges are essential issues. Effects such as fatigue loading, thermal expansion and contraction, and external loads degrade the service performance of bridges over time. As bridges age, the number of bridges that need inspection increases, and this incurs substantial maintenance costs. If spending on bridge maintenance is postponed, even greater costs will be required in the near future. Thus, many countries have established bridge maintenance plans. As a crack directly reflects the condition of a structure, it is considered one of the important parameters for structural health monitoring. Conventional crack detection is carried out by human visual inspection, a method whose performance depends heavily on the experience of the inspector and which is time-consuming and limited to accessible areas [1].
In this paper, a crack identification method based on image processing of photographs taken by a commercial UAV with a high resolution camera is investigated for an aging concrete bridge. Specifically, the framework and its challenges are presented for identifying cracks on the appearance of a bridge using a UAV. The inspector can create a background model of a bridge for which no prior information exists, and can track the damage history by visualizing cracks on the inspection map. The detailed process of this study is as follows. First, the point cloud-based background model of the bridge was generated in a preliminary flight. Then, inspection images from a high resolution camera mounted on the UAV were captured and stored while scanning the structural elements. Finally, we conducted the prerequisite deep learning processing for both image classification and localization, followed by crack size estimation to quantify the cracks. To quantify the cracks in the captured images, reference markers were attached to the structures in advance. In the deep learning method, the regions with convolutional neural networks (R-CNN) method, which significantly reduces the computational cost, was used to detect cracks and find their locations. Finally, the length and thickness of each crack were calculated using various image processing techniques and validated through the field test. Figure 1 shows a schematic of crack identification using a UAV. As shown in this figure, UAV-based crack identification consists of five main phases: (1) generation of the point cloud-based background model to build a damage map for visual inspection (inspection map); (2) image acquisition using a camera mounted on the UAV; (3) crack detection using the deep learning method; (4) crack quantification using image processing; and (5) visualization of the identified cracks on the inspection map. The proposed approach collects and stores imaging data from the region of interest (ROI) on the bridge using the camera-equipped UAV.
A preliminary study on visualizing crack information on the inspection map was also conducted.

Background Model Generation for Building the Spatial Information
The first phase of crack identification is the generation of the background model using the UAV. Detecting and quantifying cracks and representing this information on a map is essential for recording and utilizing the inspection results [32]. Since most aging bridges have no information about the shape of their structures [33], we generated an inspection map for the selected bridge to store the crack information in a database. To build the map from aerial photographs, the flight altitude, photographing interval and overlap ratio should be considered. As the flight altitude is related to the spatial resolution of the image, the height of the UAV should be chosen to obtain an effective resolution for the camera's performance. The overlap ratio, which is related to the shutter interval, is recommended to exceed 60% in the longitudinal direction and 30% in the lateral direction. Because the aerial images obtained by the UAV are 2D raster images, 3D vectorization of the images is required to establish the scale of the inspection map. In this study, a 2D model based on the stitching method could have been utilized, as the same side of the bridge was inspected. However, the 2D image matching method has the drawbacks that it is difficult to create a digital orthoimage and to handle occluded regions. Therefore, the point cloud technique was applied to build a 3D point cloud model and 3D spatial information by referring to the point information extracted from the 2D images [34]. The procedure for making an inspection map based on the point cloud is shown in Figure 2 [35,36]. The UAV stored GPS and inertial navigation system (INS) information. This information (also called metadata) was first combined with the images through a geo-tagging process. Then, coincident points were extracted from the overlapping regions of the geo-tagged images, and the number of extracted coincident points was set to adjust the tolerance.
Point cloud data carrying position information were then formed by cross-referencing the extracted coincident points across the overlapping regions. Finally, 3D modeling was conducted through a mesh formation process based on the point cloud data. As a result, the bridge spatial information was generated by mapping the original image textures onto the generated 3D model [37]. The commercial software Pix4D Mapper was used for 3D model generation in this paper, and the spatial information was converted into digital design information using AutoCAD 2017.
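As a rough illustration of the overlap guidance above, the longest admissible interval between exposures can be budgeted from the flight speed and the along-track ground footprint of one frame. The sketch below is not taken from the paper; the 30 m altitude, 3 m/s speed, 15 mm focal length and 13 mm sensor dimension are illustrative assumptions only.

```python
def max_shutter_interval(target_overlap, speed_mps, altitude_m,
                         focal_mm, sensor_mm):
    """Longest time between exposures that still meets the target forward
    overlap. Along-track ground footprint F = altitude * sensor_dim / focal;
    baseline between exposures B = speed * dt; overlap = 1 - B / F,
    hence dt = (1 - overlap) * F / speed."""
    footprint_m = altitude_m * sensor_mm / focal_mm
    return (1.0 - target_overlap) * footprint_m / speed_mps

# Hypothetical mission: 60% longitudinal overlap at 30 m altitude and 3 m/s,
# with an assumed 15 mm lens and 13 mm sensor height (not from the paper).
dt = max_shutter_interval(0.60, 3.0, 30.0, 15.0, 13.0)  # roughly 3.5 s
```

Raising the required overlap or the flight speed shortens the admissible interval, which is why the overlap ratio is tied to the shutter timing in the text.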

Image Acquisition with a High Resolution Camera on the UAV
The flight of the UAV at the bridge should treat safety as the top priority. Many guidelines have been suggested to ensure safety; only the core contents are presented in this section. First, the UAV must stay below a flight altitude of 150 m to avoid collisions with aircraft. The UAV should also refrain from flying over vehicles in case of a fall. Finally, the operating speed of the UAV should be limited to 80% of its maximum speed to cope with unexpected situations such as GPS denial and gusts of wind around the bridge. It is very important to keep a constant distance between the UAV and the target element while collecting images, because the pixel size used for quantification is calculated from the exact focal distance. However, it is difficult to keep the focal distance constant in flight, and the localization of the UAV is one of the major concerns in bridge inspection because GPS is often denied around the bridge. Therefore, in this study, planar markers were used as references to calculate the pixel size of the images without knowing the location of the UAV or the distance from the target. Several markers of known size were attached to the surface of the bridge. The UAV (i.e., Inspire 2) and vision sensor (i.e., Zenmuse X5S) used in this study were manufactured by DJI Co., Ltd. (Shenzhen, China), and the two systems were connected by a gimbal, as shown in Figure 3. The gimbal could tilt the camera from −130° to +40°, pan 320° and roll 20° in either direction. The UAV was a rotary-wing type, and the resolution of the camera was 5280 × 2970 pixels at an aspect ratio of 16:9. The flight was controlled manually, and the UAV scanned the appearance of the bridge while keeping its distance at approximately 2 m from the concrete deck.
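The role of the standoff distance can be illustrated with the pinhole projection relation. The sketch below assumes a 17.3 mm sensor width for the Zenmuse X5S and a 15 mm lens; the lens focal length is not stated in the paper, so the resulting value of roughly 0.44 mm per pixel is indicative only.

```python
def pixel_footprint_mm(distance_m, focal_mm, sensor_width_mm, image_width_px):
    """Physical size on the target surface covered by one pixel, for a
    pinhole camera: footprint = (sensor_width / image_width) * (distance / focal)."""
    return (sensor_width_mm / image_width_px) * (distance_m * 1000.0 / focal_mm)

# At the ~2 m standoff used in this study, with the camera's 5280 px image
# width and the assumed 17.3 mm sensor / 15 mm lens, one pixel covers
# roughly 0.44 mm on the deck surface.
scale = pixel_footprint_mm(2.0, 15.0, 17.3, 5280)
```

Because the footprint scales linearly with distance, a drifting standoff directly biases any pixel-count measurement, which motivates the marker-based scale recovery used instead.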

Crack Detection Using Deep Learning
This section describes crack detection using the deep learning method. In this paper, crack detection was defined as performing both classification and localization on an image. Many researchers have proposed automated crack detection using image processing and pattern recognition techniques. In most studies, crack features in the captured image were extracted and matched using image processing techniques such as blob detection, edge detection and the support vector machine (SVM) algorithm. Meanwhile, a sliding window technique with an image pyramid algorithm has been proposed: a sliding window scans an image sequentially, and an image pyramid algorithm reduces the size of the input image step by step, so that objects are detected effectively across various scales and locations.
Recently, a sliding window-based CNN, one of the deep learning methods, was proposed, and its performance was validated for object classification [38][39][40]. However, the CNN has disadvantages such as a high computational cost and a long operation time. To overcome these drawbacks, the region proposal algorithm was proposed to quickly scan the meaningful regions in an image. In this paper, the R-CNN method, which combines region proposals with rich features extracted by a CNN, was applied to detect the cracks. The steps of R-CNN consist of extracting region proposals from the input image using selective search, feature extraction using the CNN after cropping, and detection after bounding-box regression [41]. In the feature extraction step, insufficient data make it difficult to train the features; therefore, the transfer learning method is commonly used in deep learning applications. The concept of transfer learning is to reuse features and parameters extracted from a large image set with many labeled samples when the labeled data for the task at hand are insufficient. This method first trains on a large image set such as ImageNet [42] or Cifar-10 [43]. Then, the new dataset, which comprised crack data in this study, is connected to the pre-trained network by fine-tuning. Figure 4 presents the deep learning architecture used to extract features for crack detection in this study. The input layer takes the 50,000 images of 32 × 32 × 3 pixel resolution in the Cifar-10 image set, a large image dataset of 10 classes. In the convolution layer, 32 filters (also called kernels) with a size of 5 × 5 were used, with a stride of 1 and a padding of 2 to prevent image downsizing. ReLU was used as the nonlinear activation function because of its fast computation and high accuracy [44]. Max pooling with a 3 × 3 kernel and a stride of 2 was used, and fine-tuning was applied to the fully-connected layer to classify the cracks.
Table 1 lists the detailed information of each layer.
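The spatial sizes flowing through the network follow directly from these hyper-parameters. Below is a minimal sketch of the size arithmetic for the first convolution and pooling pair described above; the remaining layers listed in Table 1 obey the same rule.

```python
def layer_out(n, kernel, stride=1, pad=0):
    """Spatial output size of a convolution or pooling layer:
    floor((n + 2*pad - kernel) / stride) + 1."""
    return (n + 2 * pad - kernel) // stride + 1

size = 32                                         # Cifar-10 input is 32x32x3
size = layer_out(size, kernel=5, stride=1, pad=2)  # conv: 32 filters, 5x5
# a padding of 2 with stride 1 preserves the 32x32 spatial size, as intended
size = layer_out(size, kernel=3, stride=2)         # max pooling: 3x3, stride 2
```

This confirms why the padding of 2 is described as preventing downsizing: only the strided pooling layer reduces the spatial resolution.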

Image Processing for Crack Quantification
To quantify the detected cracks, the total length and thickness were calculated using a planar marker. From the relationship between given 3D points in the world frame and their corresponding points in the captured image, a homography transformation matrix can be calculated, and the distorted image can be corrected with it. The procedure for estimating a crack's length and thickness using a planar marker is shown in Figure 5. The camera captured an image including the marker, as shown in Figure 5a; the marker was then detected using the region of interest (ROI) algorithm with the Sobel edge detection algorithm (see Figure 5b). After finding the four corners of the marker, the transformation matrix was estimated and applied to the distorted image. From the undistorted image, as shown in Figure 5c, the crack's length and thickness can be calculated, as shown in Figure 5d. To detect horizontal and vertical edges that can be classified as cracks, the detected image was rotated in the longitudinal direction, and image convolution was performed with the following mask: h = [−1 −1 −1; 1 1 1]'.
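The edge-response step can be sketched as a small hand-rolled convolution; the homography rectification itself (a four-point correspondence problem) is omitted here for brevity. The synthetic image below is a toy example, not data from the paper.

```python
def correlate2d_valid(img, mask):
    """Valid-mode 2D cross-correlation of img with mask (both nested lists)."""
    mh, mw = len(mask), len(mask[0])
    out = []
    for r in range(len(img) - mh + 1):
        row = []
        for c in range(len(img[0]) - mw + 1):
            row.append(sum(mask[i][j] * img[r + i][c + j]
                           for i in range(mh) for j in range(mw)))
        out.append(row)
    return out

# h' = transpose of [-1 -1 -1; 1 1 1]: a 3x2 mask that responds to vertical
# intensity steps, i.e. crack edges after the longitudinal rotation.
h_t = [[-1, 1], [-1, 1], [-1, 1]]

# synthetic 5x6 image: dark left half, bright right half (a vertical step edge)
img = [[0, 0, 0, 1, 1, 1] for _ in range(5)]
resp = correlate2d_valid(img, h_t)  # peaks at the column containing the step
```

The response is zero over uniform regions and maximal at the step, which is the property the mask exploits to trace crack boundaries.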

Experimental Validation
A field test was conducted to verify the proposed crack detection method based on a UAV with an imaging device on an aging concrete bridge. In this study, target elements of the bridge were selected for UAV-based visual inspection. First, inaccessible elements such as abutments, piers and pylons could be targeted. The bridge deck could also be included if the camera pointed upward via the gimbal. Only the bridge rails were targeted in this study; further work will cover other elements. In the experiment, the UAV flew for two purposes (i.e., generating the background model and acquiring images for crack detection), and different flight methods were applied to each. The flight for generating the background model considered the number of flight passes and the image overlap ratio, and the camera mounted on the UAV took pictures along the path. On the other hand, the flight for inspection considered the scanning speed, field of view (FOV) and flight dynamics, focusing on obtaining high-quality images. Figure 6 shows the experimental setup for crack detection using the UAV. Two markers were attached to the bridge to quantify crack size, because it is challenging to know exactly both the location of the UAV and its distance from the target; further study will be conducted to address these challenges without the marker. In this experiment, a square-shaped planar marker was designed and applied, so that the pixel size and rotational angle could be estimated from the marker.

Generating the Background Model
In this section, the procedure of background model generation using images captured from the UAV is described. In this paper, a closed bridge with extensive crack damage was selected.
A total of 277 images were obtained from the high resolution camera mounted on the UAV during a flight time of 11 min. When constructing spatial information using the point cloud technique, sufficient overlap between successive vertical aerial photographs must be secured; an overlap of over 70% was targeted in this study. Flight control was performed manually following the path plan. Inspection of the obtained images showed that the outline of each structural member of the bridge was clearly revealed. Furthermore, although manual flight makes it difficult to achieve uniform overlap between images, over 70% overlap was nevertheless attained.
The vision sensor mounted on the UAV captured images of the structures, and the 3D point cloud model of the bridge was then generated with the use of GPS data. Finally, the detected cracks were visualized on the digitized inspection map. In this study, 41,974 tie points between images were extracted, and orthoimagery at a ground sampling distance (GSD) of 0.10 cm/pixel was generated, as shown in Figure 7. Figure 7a represents the spatial information of the bridge, and Figure 7b shows the general drawing of the bridge, produced in order to utilize the inspection map.

Crack Detection
In this paper, the Cifar-10 dataset, which has 50,000 images of 32 × 32 RGB pixels, was used for the pre-trained network, and 384 crack images of 256 × 256 RGB pixels were used for fine-tuning. To collect the training data for deep learning, images were gathered both manually and by UAV from concrete bridges and structures other than the target bridge, and some images were collected from the web. To reduce the imbalance in crack direction and augment the dataset, some of the collected images were rotated. Then, a labeling process was performed manually on the cracks in the images. The CNN was trained on the Cifar-10 data using stochastic gradient descent with momentum (SGDM). The initial learning rate was set to 0.001 and was reduced every 8 epochs over 40 epochs of training. The ten classes of Cifar-10 were replaced with the crack and background classes through transfer learning. Then, bounding-box regression was conducted to improve the accuracy of the region proposals, after training one binary SVM per class to classify the region features.
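The stated learning-rate policy can be written as a simple step schedule. The paper gives the 0.001 initial rate and the 8-epoch drop interval but not the drop factor, so the value of 0.1 used below is an assumption (a common default for SGDM step schedules):

```python
def sgdm_learning_rate(epoch, initial_lr=0.001, drop_every=8, drop_factor=0.1):
    """Piecewise-constant step schedule: the rate is multiplied by drop_factor
    once every drop_every epochs. drop_factor=0.1 is an assumed value; the
    paper states only the initial rate and the drop interval."""
    return initial_lr * drop_factor ** (epoch // drop_every)
```

Over the 40 training epochs this yields five plateaus, each an order of magnitude lower than the last under the assumed factor.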
In this study, two regions of interest, each including both a crack and a marker, were specified and cropped. Figure 8 shows the results of crack detection using deep learning. In the figure, the watermarked rectangular boxes show the accuracy of the results. The labeled boxes are detected cracks (C) obtained by the deep learning process. The dotted boxes are also detected cracks, although they are difficult to recognize visually because of blurring. These results show that the deep learning method has better detection performance than traditional image processing techniques. However, the parameters of the selected regions should be tuned so that each box fully contains the crack. In this study, the deep learning network was trained considering the direction of cracks, and training data were augmented by image rotation to reduce the imbalance between vertical and horizontal crack data. As a result, the deep learning model was able to detect the horizontal crack, C-8. The proposed approach was therefore validated for both tasks, namely classification and localization of cracks.

Crack Quantification
In this section, the quantification of the detected and cropped cracks is described. To quantify the cracks, the exact pixel size in the image must be known. UAV-based systems can use GPS data and the focal distance to predict the pixel size, but the associated errors affect the resolution. To estimate the pixel size, planar markers sized 70 × 70 mm were attached to the structure, and the vision sensor mounted on the UAV captured the structure including the marker and cracks. In the field test, a conventional square-type marker was used. The procedure for estimating length and thickness was identical to that described in Section 2.3. The crack quantification algorithm, which was verified in a small-scale laboratory test with a relative error of 1∼2%, was used to calculate length and thickness. Figure 9 shows the results for the marked cracks using the image processing method, and Table 2 shows the estimated results. In this study, crack quantification based on image processing was applied to the bounding boxes. From the results, it was possible to quantify the cracks, but it was difficult to estimate information for some of them (i.e., C-11 and C-12): because the dark surface surrounding these cracks hampers quantification, the two cracks could hardly be quantified. Crack quantification based on image processing can be affected by the surrounding environment, such as shadows and the intensity and direction of sunlight. Some of these problems could be addressed with lighting equipment and post-processing methods.
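Once the image has been rectified, the marker-based scale recovery reduces to one division. The pixel counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
def mm_per_pixel(marker_size_mm, marker_size_px):
    """Image scale recovered from a planar marker of known physical size,
    assuming the view has been rectified to fronto-parallel (Section 2.3)."""
    return marker_size_mm / marker_size_px

def crack_size_mm(length_px, thickness_px, scale_mm_per_px):
    """Convert pixel measurements of a crack to physical units."""
    return length_px * scale_mm_per_px, thickness_px * scale_mm_per_px

# Hypothetical numbers: the 70 mm marker spans 350 px in the rectified image,
# and a crack measures 900 px long and 6 px wide.
scale = mm_per_pixel(70.0, 350)              # 0.2 mm per pixel
length_mm, thickness_mm = crack_size_mm(900, 6, scale)
```

Note that the same scale applies to every pixel only on the plane of the marker, which is why the markers are attached directly to the inspected surface.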

Bridge Inspection Results
In the previous section, we confirmed that the thickness and length of cracks can be quantified via image processing. The quantified crack information, namely the coordinates and area of each detected crack, can be manually displayed on the inspection map, as shown in Figure 10. The cracks expressed on the inspection map were realistic and can be stored and compared with later inspection results. We are investigating ways to link the results to a bridge management system (BMS) and to automate this processing in the future.

Discussion
The UAV-based crack detection method has received considerable attention because of its safety and reliability. However, this technology is still at a primitive stage, as many challenges remain to be solved for practical applications. We reviewed the whole UAV-based crack identification process, from generating the background model to crack visualization on the inspection map, and conducted the UAV flight and crack inspection on a real civil structure. The detected crack size was calculated using the installed 2D marker because the localization accuracy of the UAV still needs to be improved. For crack detection, R-CNN was used as the deep learning method, and 384 crack images were combined with the pre-trained model through transfer learning. As a result, satisfactory performance for both classification and localization was validated. Optimization of the deep learning model and a richer crack dataset could further improve the performance. In this study, the coordinate values and locations of the bounding boxes for detected cracks were obtained, and the detected boxes were automatically cropped and stored. Each cropped image could be quantified through image processing and visualized on the inspection map at the same scale. In the quantification of the detected cracks, some cracks were difficult to quantify: aging concrete bridges may have contamination on the surface, and the image processing techniques for crack quantification can be affected by the surrounding environment, such as shadows and the intensity of sunlight. It is therefore necessary to address these problems through lighting equipment and post-processing.
To identify and visualize cracks, a high performance computer composed of a CPU (i.e., i7-7700, Intel, Santa Clara, CA, USA), 32.0 GB of RAM and a GPU (i.e., GeForce GTX 1080 Ti, NVIDIA, Santa Clara, CA, USA) was used. Regarding computation time, generating the 3D point cloud background model of the bridge using Pix4D Mapper took approximately 150 min, and extracting the bridge information using AutoCAD required another 30 min. For deep learning, approximately 14 min were required to conduct crack detection, including the network training. Finally, the computation time for quantification and visualization of the cracks was about 5 s. For a large bridge, the computation time will increase as larger numbers of images are processed.

Conclusions
In this paper, an application of a UAV combined with a deep learning algorithm for crack identification was presented. The study focused on processing the collected images to identify the cracks of a concrete bridge, and the crack identification and visualization method was applied to a real civil bridge.
Because most aging bridges have no inspection map, a 3D point cloud-based background model was employed. The background model generated in the preliminary flight was used for visualizing the identified cracks on the inspection map. In this study, preliminary and inspection flight modes were considered, and guidelines for the flights were presented. To detect the cracks, a deep learning algorithm was used to effectively conduct both image classification and localization. R-CNN, rather than a sliding window-based CNN, was used as the object detector, and transfer learning with a rich dataset was used in the deep learning architecture. Fifty thousand training images from the Cifar-10 dataset were pre-trained using the CNN; the pre-trained network was then fine-tuned for crack detection using the 384 collected crack images. In object detection, various cracks could be detected by training the deep learning network with only a small number of crack images. The detected cracks were cropped and quantified by image processing, and the identified cracks were automatically visualized on the inspection map by matching their locations. In this study, a field test applying the crack identification techniques to the aging bridge was conducted. The results validated that the proposed technique can effectively inspect cracks, and the UAV-based bridge inspection system can be considered one of the promising strategies.