Asbestos-Cement Roofing Identification Using Remote Sensing and Convolutional Neural Networks (CNNs)

Abstract: Due to the pathogenic nature of asbestos, a statutory ban on asbestos-containing products has been in place in Poland since 1997. In order to protect human health and the environment, it is crucial to estimate the quantity of asbestos-cement products in use. It has been estimated that about 90% of them are roof coverings. Different methods are used to estimate the amount of asbestos-cement products, such as the use of indicators, field inventory, remote sensing data, and multi- and hyperspectral images; the latter are used for relatively small areas. Other methods are sought for the reliable estimation of the quantity of asbestos-containing products, as well as their spatial distribution. The objective of this paper is to present the use of convolutional neural networks (CNNs) for the identification of asbestos-cement roofing on aerial photographs in natural color (RGB) and color infrared (CIR) compositions. The study was conducted for the Chęciny commune. Aerial photographs with a spatial resolution of 25 cm in RGB and CIR compositions were used, and field studies were conducted to verify data and to develop a database for CNN training. Network training was carried out using the TensorFlow and R-Keras libraries in the R programming environment. The classification was carried out using a convolutional neural network consisting of two convolutional blocks, a spatial dropout layer, and two blocks of fully connected perceptrons. Asbestos-cement roofing products were classified with a producer's accuracy of 89% and an overall accuracy of 87% and 89%, depending on the image composition used. Previous attempts at the identification of asbestos-cement roofing focused primarily on hyperspectral data and multispectral imagery. The following classification algorithms were usually employed: Spectral Angle Mapper, Support Vector Machine, object classification, Spectral Feature Fitting, and decision trees.
Previous studies undertaken by other researchers showed that low spectral resolution only allowed for a rough classification of roofing materials. The use of one coherent method would allow data comparison between regions. Determining the amount of asbestos-cement products in use is important for assessing environmental exposure to asbestos fibres, determining patterns of disease, and ultimately modelling potential solutions to counteract threats.


Introduction
The term asbestos refers to a group of naturally occurring fibrous serpentine or amphibole minerals [1]. Asbestos fibres were broadly used in industrial production due to their physical and chemical properties, e.g., extraordinary tensile strength, poor heat conduction, and resistance to [...]
CNNs were designed to handle image data by utilizing special layers in the network. Convolutional layers are designed to extract information from an image. Pooling layers are used to generalize information. Local connections and shared weights between layers are used to maintain reasonable network training speed [25]. CNNs are widely used for image recognition tasks.
CNNs were previously used in remote sensing for land use or land cover classification using hyperspectral [26] and multispectral data [27], as well as for cloud detection [28], building detection [29], and image fusion [30]. Works employing CNNs usually report high accuracies that often exceed those of the previous state-of-the-art algorithms. So far, no study has attempted to use CNNs to identify roofing types on buildings.
The objective of this paper is to present the use of convolutional neural networks for the identification of asbestos-cement roofing on aerial photographs in natural color (RGB) and color infrared (CIR) compositions. The motivation behind this research is to support decision-makers who are interested in public issues concerning asbestos removal and abatement, and to provide them with an operational method of mapping asbestos-cement roofing still in use that employs widely available datasets such as aerial orthophotomaps.

Study Area
The lowest level of the administrative division of Poland is the commune, which may be classified as urban, rural, or mixed urban-rural. The study was conducted in the Chęciny urban-rural commune, located in the central part of Poland in Świętokrzyskie Province (Figure 1). The area of the surveyed commune amounts to 127.4 km² and is subdivided into 18 villages (Skiby, Miedzianka, Podpolichno, Polichno, Radkowice, Lipowica, Przymiarki, Bolmin, Korzecko, Starochęcin, Mosty, Tokarnia, Wolica, Ostrów, Siedlce, Wojkowiec, Łukowa, and Gościniec) and the town of Chęciny. It has 15,053 inhabitants, of which 10,632 live in villages and 4421 in the town. The population density is 118 persons per km² [31]. Compared with the rural areas of the 18 villages, the town of Chęciny has a higher density of buildings: 136 buildings per km² in the town versus 68 in rural areas. The average building area is 118.6 m² in the town and 114.4 m² in rural areas. In the villages, buildings are mainly located along streets. Most farms are dominated by single-family houses with additional buildings used in agricultural production.

Aerial Imagery and Asbestos Database
A total of 43 aerial photographs, each with dimensions of 8961 × 9419 pixels and a spatial resolution of 25 cm in RGB and CIR compositions, acquired from the Central Office of Geodesy and Cartography of Poland, were used in this study [32]. During preprocessing, all images were merged into two mosaics, one for each composition (Figure 2). The size of a single mosaic was 80,726 by 73,757 pixels. Localization data for the asbestos-cement roofing were derived from the Asbestos Database [10]. Results of the field inventory were recorded in the Asbestos Database in 2015 (field mapping lasted from 03/03/2015 to 06/09/2015); one building was added to the reference database in 2016 (09/08/2016). According to data gathered in the aforementioned register, there were 5460 tons of asbestos products in the Chęciny commune, of which 8.3 tons were removed by 2019. In total, 3124 buildings with asbestos-cement roofs were recorded in the Asbestos Database, which accounts for 39% of all buildings in the commune. Divided into urban and rural areas, 369 buildings were recorded in the town of Chęciny and 2755 in rural areas. The share of buildings with asbestos-cement roofs was 19% in the urban area and 43% in the rural areas.

Field Survey
In order to verify data from the Asbestos Database and to develop the database for the convolutional neural network training, field studies were carried out using orthophotomap printouts derived from aerial imagery. Buildings with a total area larger than 70 m² were subject to a physical inventory, in line with the Polish construction law, which stipulates that smaller buildings do not require a building permit. The number of analyzed buildings was 6287. During fieldwork, the following data were gathered: localization, asbestos or non-asbestos roofing, and function of the building. The data obtained were digitized using ESRI ArcGIS 10.5 [33] software to develop a spatially referenced geodatabase.

Image Signatures of Asbestos-Cement Roofing and Other
Image signatures of asbestos-cement roofing and roofs with other types of cover were prepared using the results of the field inventory and aerial imagery. A vector point layer was created containing the centroids of the analyzed building roofs (Figure 3). Each point represented one roof, be it asbestos-cement covered or not. The following attributes were assigned to each building roof:

• The type of roofing (asbestos-cement or other);
• The degree of roof pitch (sloping, flat);
• The type of asbestos-cement building material (flat or corrugated sheets).
A total of 46% of buildings in the developed image signature database had asbestos-cement roofs, and the remaining 54% had other types of roof coverings. Among the buildings with asbestos-cement roofs, 92% were covered with corrugated sheets, while the remaining 8% had roofs made of flat asbestos-cement sheets. Due to the large disproportion in sample size between the classes of flat and corrugated asbestos-cement sheets, the attempt to differentiate between types of asbestos-cement sheets was abandoned. Network training was carried out using the TensorFlow [34] and R-Keras libraries [35] in the R programming environment [36].

Convolutional Neural Networks Architecture
In contrast to commonly used feed-forward multi-layered perceptron networks, CNNs can achieve better results using fewer connections. In normal feed-forward multi-layered perceptron networks, neurons in neighboring layers are connected to each other, resulting in a large number of connections. For networks with a large number of layers, this can severely impact the network's ability to train (vanishing gradient effect [37]). This issue is resolved in CNNs by employing shared weights and local connections.
Local connections share the same weights, reducing both the number of connections in the network and the number of weights associated with them. A key procedure that allows CNNs to achieve high accuracies in image classification tasks is convolution, which takes place in convolutional layers. During convolution, the whole image is processed using a trainable kernel, or shifting window, that extracts features from the image. Each kernel has a fixed size that determines how many neighbouring pixels are used in the convolution, and a stride that determines by how many pixels the kernel window shifts across the image. Each convolution operation can use a number of kernels that create feature maps, which are later used as input for other layers in the network. Those features are then used to differentiate between the image classes present in the training dataset. In a maximum pooling layer, only the maximum value within each pooling kernel is retained. In order to limit network overfitting, additional dropout layers are employed [38,39]. In a dropout layer, a certain percentage of connections between the preceding and the following layer are removed. The spatial dropout layer works on the same principle, but instead of removing connections between layers, a percentage of the feature maps is removed. The role of the convolution and pooling layers is to extract new features from the image, which are then used as input for the multi-layered perceptron.
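The convolution and maximum pooling steps described above can be illustrated with a minimal, dependency-free Python sketch. It is not the authors' R-Keras implementation: the function names `conv2d` and `max_pool`, the single-channel toy image, the absence of padding, and the hand-picked kernel are all assumptions made for illustration, whereas in a trained CNN the kernel values are learned.

```python
def conv2d(image, kernel, stride=1):
    """Slide a square kernel over a single-channel image (no padding).

    Returns one feature map; each output value is the weighted sum of the
    pixels under the kernel window at that position.
    """
    k = len(kernel)
    feature_map = []
    for r in range(0, len(image) - k + 1, stride):
        row = []
        for c in range(0, len(image[0]) - k + 1, stride):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    acc += image[r + i][c + j] * kernel[i][j]
            row.append(acc)
        feature_map.append(row)
    return feature_map


def max_pool(feature_map, size=2):
    """Keep only the maximum value inside each non-overlapping window."""
    pooled = []
    for r in range(0, len(feature_map) - size + 1, size):
        row = []
        for c in range(0, len(feature_map[0]) - size + 1, size):
            window = [feature_map[r + i][c + j]
                      for i in range(size) for j in range(size)]
            row.append(max(window))
        pooled.append(row)
    return pooled


# Toy 6 x 6 image whose pixel values encode their position (hypothetical data).
image = [[float(r * 6 + c) for c in range(6)] for r in range(6)]
# Hand-picked 3 x 3 kernel that copies the centre pixel (illustration only).
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]

fmap = conv2d(image, kernel)    # 4 x 4 feature map
pooled = max_pool(fmap, 2)      # 2 x 2 map after pooling
```

With a 3 × 3 kernel and stride 1, the 6 × 6 toy image shrinks to a 4 × 4 feature map, and 2 × 2 pooling halves each dimension again, which shows how the two layer types progressively condense spatial information.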
Artificial neural network training requires an objective function (loss function) that defines the training target by numerically describing how far the network is from the optimal solution. The loss function is combined with a learning algorithm. Due to the complexity of the learning space, the learning algorithm has to avoid local minima that would deliver subpar results and approximate the global minimum of the learning space, which is the optimal solution for a given learning problem. The speed and direction of traversal of this space are defined by the learning rate and additional parameters specific to a given learning algorithm.
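As a concrete instance of a loss function, the sketch below computes categorical cross-entropy on softmax outputs, the combination the paper reports using for its two-class problem. The helper names and the example logits are hypothetical, and the code is a plain-Python illustration rather than the library routine used in the study.

```python
import math


def softmax(logits):
    """Convert raw network outputs into probabilities that sum to 1."""
    shift = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_cross_entropy(one_hot_target, probabilities, eps=1e-12):
    """Loss is small when the probability of the true class is close to 1."""
    return -sum(t * math.log(max(p, eps))
                for t, p in zip(one_hot_target, probabilities))


# Hypothetical logits for the classes [asbestos, other].
probs = softmax([2.0, 0.0])
loss_if_asbestos = categorical_cross_entropy([1, 0], probs)  # confident and right
loss_if_other = categorical_cross_entropy([0, 1], probs)     # confident and wrong
```

The loss grows as the network assigns less probability to the true class, which is the numerical "distance from the optimal solution" that the learning algorithm tries to reduce.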

Classification with Convolutional Neural Network
The classification was carried out on aerial imagery in RGB and CIR compositions. Reference data points from the asbestos-cement roofing database were used to create image signatures used to train CNNs and to validate classification results. Each signature is an image window centered over a vector layer point. Image data were extracted using this window during the classification process. A number of signature sizes were tested. The choice of signature size is always a tradeoff between the amount of data retained in signatures (more pixels = more data) and neural network training times. Due to hardware limitations, the size of the signature cannot exceed a certain maximum size. A signature size of 27 by 27 pixels was chosen because it delivered decent accuracy while maintaining reasonable memory consumption and training speed. Signatures were extracted for both RGB and CIR images. Additional signature augmentation procedures were employed to increase the number of available training samples: flipping over the x- and y-axes and rotating clockwise by 90, 180, and 270 degrees. Signatures representing asbestos-cement roofs and other roofs were divided into training and verification data sets in the proportion of 63.2% to 36.8%, maintaining the distribution of classes inside those data sets. The classification was carried out using a convolutional neural network (Figure 4) consisting of:
Two blocks consisting of:
• Fully connected layer (512 neurons, ReLU activation function)
• Dropout layer (50%)
• Batch normalization layer
The last layer of the network was the classification layer, using 2 neurons and a softmax activation function.
The network training lasted 200 epochs with a batch size of 64 samples. The learning rate parameter was set to 0.001. During training, the stochastic gradient descent method was used with a decay parameter equal to 1e-6 and a momentum of 0.9. Categorical cross-entropy was used as the loss function during network training. The learning rate was reduced every 10th epoch by 20% of its current value in order to stabilize the network model. The total number of image signatures used for the roof differentiation was 6287, of which 2924 represented asbestos roofs. The training data set contained 2126 signatures for non-asbestos roofs and 1848 signatures for asbestos roofs, while the test data set contained 1237 signatures for non-asbestos roofs and 1076 signatures for asbestos roofs.
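The training schedule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' R-Keras code: the function names are hypothetical, epochs are counted from 0 with the first reduction assumed to happen at epoch 10, the momentum update follows the classical heavy-ball formulation, and the additional decay parameter of 1e-6 is not modelled.

```python
def learning_rate(epoch, base_lr=0.001, drop=0.20, every=10):
    """Learning rate in effect during a 0-based epoch.

    The rate is cut by `drop` (20% of its current value) after every
    `every` epochs, i.e. multiplied by 0.8 at epochs 10, 20, 30, ...
    """
    return base_lr * (1.0 - drop) ** (epoch // every)


def sgd_momentum_step(weight, velocity, gradient, lr, momentum=0.9):
    """One stochastic gradient descent update with classical momentum."""
    velocity = momentum * velocity - lr * gradient
    return weight + velocity, velocity


# Learning rates at a few epochs of the 200-epoch run.
schedule = {epoch: learning_rate(epoch) for epoch in (0, 10, 100, 199)}

# Toy demonstration on f(w) = w**2 (gradient 2w); lr=0.1 is chosen only so
# that the toy problem converges quickly, it is not the paper's setting.
w, v = 1.0, 0.0
for _ in range(200):
    w, v = sgd_momentum_step(w, v, gradient=2.0 * w, lr=0.1)
```

Under these assumptions the rate falls from 0.001 to roughly 1.4 × 10⁻⁵ by the final epoch, so late updates are small, which is consistent with the stated aim of stabilizing the model.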
The network architecture was limited to only two convolutional blocks because an excessive number of convolutional blocks has a negative effect on accuracy and leads to extremely long network training times [29]. Kernel and stride sizes for the convolution and pooling layers were set to small values (kernel size 3 and pooling size 2) to preserve as much of the information between layers as possible. The large percentage of connections dropped in the dropout layers severely limits overfitting of the network during the training process.
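The signature extraction and augmentation steps described in this section can be sketched as follows. The code is a hedged, plain-Python illustration: the function names, the nested-list representation of a single image band, and the handling of windows that cross the image edge (returning `None`) are assumptions, as is keeping the original signature alongside its five transformed copies. At a pixel size of 25 cm, a 27 by 27 pixel window covers 6.75 m by 6.75 m on the ground.

```python
def extract_signature(band, row, col, size=27):
    """Cut a size x size window centred on pixel (row, col) of one band.

    Returns None when the window would extend past the image edge
    (an assumption; the paper does not describe edge handling).
    """
    half = size // 2
    if (row - half < 0 or col - half < 0
            or row + half >= len(band) or col + half >= len(band[0])):
        return None
    return [line[col - half:col + half + 1]
            for line in band[row - half:row + half + 1]]


def flip_x(sig):
    """Flip over the x-axis: the row order is reversed."""
    return sig[::-1]


def flip_y(sig):
    """Flip over the y-axis: the column order is reversed."""
    return [line[::-1] for line in sig]


def rotate90_cw(sig):
    """Rotate the window clockwise by 90 degrees."""
    return [list(column) for column in zip(*sig[::-1])]


def augment(sig):
    """Original window plus two flips and three clockwise rotations."""
    r90 = rotate90_cw(sig)
    r180 = rotate90_cw(r90)
    r270 = rotate90_cw(r180)
    return [sig, flip_x(sig), flip_y(sig), r90, r180, r270]


# Hypothetical 40 x 40 band whose values encode the pixel position.
band = [[r * 100 + c for c in range(40)] for r in range(40)]
signature = extract_signature(band, 20, 20)
samples = augment(signature)
```

For a multi-band RGB or CIR image, the same window would be cut from each band and the same transformation applied to all bands, so that the channels stay aligned.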

Results
To distinguish between the two classes (asbestos-covered roofs and non-asbestos-covered roofs), the classification process was carried out using a convolutional neural network. The network learning process, as well as the classification, was performed on aerial imagery in the RGB and CIR compositions. The developed models began to stabilize around the 150th epoch. The overall accuracy (OA) on the training and validation data was 93% and 86%, respectively, for the RGB image and 94% and 89% for the CIR image (Figure 5). The loss function value after 200 epochs of training was 0.16 for the RGB and 0.15 for the CIR composition. From the 150th epoch onward, the training accuracy and the value of the loss function remained unchanged. The shape of the learning curves is satisfactory. The discrepancy between the training accuracy and the validation accuracy indicates overfitting.
Classification carried out on the validation dataset, containing 2313 objects, achieved an overall accuracy of 86.7% on aerial imagery in the RGB composition and 88.9% on the orthophotomap in the CIR composition (Table 1). The user's accuracy (UA) for the class of asbestos-cement roofing materials was 83.9% and 87.5%, respectively, and the producer's accuracy (PA) amounted to 88.5% and 88.9%. The classification of roofs on the CIR image achieved higher accuracies than on the RGB composition (UA for the asbestos class reached 87.5% for CIR compared with 83.9% for RGB; PA for the other roofs class reached 88.9% for CIR and 85.2% for RGB). The largest differences between the two compositions were recorded for the underestimation error of the non-asbestos roofs class (14.8% for RGB versus 11.1% for CIR) and for the overestimation error of the asbestos roofs class (3.6 percentage points lower on the CIR composition).
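The accuracy measures reported here follow from a two-class confusion matrix, as the short sketch below shows. The function name and the counts are hypothetical and chosen only for illustration; they are not the values of Table 1.

```python
def accuracy_metrics(tp, fn, fp, tn):
    """Accuracy measures for a two-class (asbestos / other) confusion matrix.

    tp: asbestos roofs classified as asbestos
    fn: asbestos roofs classified as other    (underestimation of asbestos)
    fp: other roofs classified as asbestos    (overestimation of asbestos)
    tn: other roofs classified as other
    """
    total = tp + fn + fp + tn
    return {
        "OA": (tp + tn) / total,          # overall accuracy
        "PA_asbestos": tp / (tp + fn),    # producer's accuracy = 1 - underestimation error
        "UA_asbestos": tp / (tp + fp),    # user's accuracy = 1 - overestimation error
        "PA_other": tn / (tn + fp),
        "UA_other": tn / (tn + fn),
    }


# Hypothetical counts, for illustration only.
metrics = accuracy_metrics(tp=90, fn=10, fp=20, tn=80)
```

Because OA pools both classes, it can hide a weak class when class sizes differ strongly, which is why PA and UA are reported per class alongside it.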
The presence of objects known to the network (training data) means that the accuracy of the asbestos roofing map is naturally higher than the classification accuracy calculated on the verification data set. The trained network correctly assigned classes to 5907 of 6287 objects for the RGB composition and 5998 of 6287 objects for the CIR composition. This results in a map OA of 94% and 95% for the RGB and CIR compositions, respectively (Table 2). It has to be stressed that those numbers should not be used as performance metrics for the developed CNN. The shape of the receiver operating characteristic (ROC) curves for both compositions indicates high classification quality, and an area under the curve (AUC) above 0.90 confirms this (Figure 6). In addition, we compared the two classifiers trained on RGB and CIR signatures using McNemar's chi-squared test with continuity correction, which is well suited to comparing image classification results in remote sensing [40]. The test, carried out at the significance level alpha = 0.05, yielded a test statistic of 10.531 and a p-value of 0.001174 with 1 degree of freedom. This result confirmed a statistically significant difference in accuracy between the trained classifiers, indicating that CIR signatures are more appropriate for asbestos mapping than RGB signatures. There was no noticeable spatial concentration of over- or underestimation of classes in the results obtained using the RGB or CIR composition. This was also true for densely built-up areas such as cities.
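The McNemar comparison can be reproduced in outline with the standard library alone. The sketch below assumes the usual continuity-corrected form of the statistic and uses the closed-form tail probability of a chi-squared distribution with 1 degree of freedom; the function names and the discordant counts in the example are hypothetical, since the paper reports only the resulting statistic and p-value.

```python
import math


def mcnemar_statistic(only_a_correct, only_b_correct):
    """Continuity-corrected McNemar statistic.

    Only the discordant samples matter: those classified correctly by
    classifier A but not B, and those correct for B but not A.
    """
    b, c = only_a_correct, only_b_correct
    return (abs(b - c) - 1) ** 2 / (b + c)


def chi2_p_value_df1(statistic):
    """Upper-tail probability of a chi-squared variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2.0))


# Hypothetical discordant counts, for illustration only.
example_statistic = mcnemar_statistic(25, 10)

# p-value implied by the test statistic reported in the text.
p_from_reported_statistic = chi2_p_value_df1(10.531)
```

Feeding the reported statistic of 10.531 into `chi2_p_value_df1` gives a value of about 0.00117, in agreement with the p-value stated above.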

Discussion
The environmental exposure associated with the presence of asbestos fibers in the atmospheric air is of great importance to public health [41]. Therefore, it is important to determine the amount of asbestos-cement products remaining in use. This may lead to a reliable estimation of asbestos removal costs and provide a basis for determining the degree to which citizens are exposed to asbestos fibers in the air. Moreover, the potential risk of developing asbestos-related diseases might be assessed. Attempts are being made to determine the risk of exposure to asbestos fibers using multiple methods, such as asbestos consumption indicators in production, field tests, and the detection and mapping of asbestos-cement products using remote sensing data. All these activities should lead to the estimation of the amount of asbestos-containing roofs in a particular area of the country.
In this paper, we have shown that asbestos-containing roofs may be recognized using a convolutional neural network and aerial imagery. Asbestos roofing was classified with a PA of 89% and an OA of 87% and 89%, depending on the image composition used. It is difficult to compare the results obtained in this survey with other research papers because neither RGB and CIR aerial imagery nor the CNN classification method has been used so far in this type of research (Table 3). Attempts at the identification of asbestos roofing focus primarily on the use of hyperspectral data (MIVIS, Aisa Eagle II, APEX) and multispectral imagery (QuickBird, WorldView-2) and usually employ the following classification algorithms: Spectral Angle Mapper, Support Vector Machine, object classification, Spectral Feature Fitting, and decision trees. Classification accuracies comparable to those obtained in our study were reported for areas of about 2 to 9 km². For Cavalli et al. [44], the test area covered approximately 125 km², and the overall classification accuracy for asbestos roofing was 80%-90%. Two completely different types of objects were separated, i.e., asbestos-cement buildings and red mud dust, which resulted in the under-representation of the asbestos-cement class itself, at a little over 10%. With the spatial resolution of MIVIS imagery at 3-4 m, distinguishing buildings with asbestos-cement roofing in the imagery is subject to great uncertainty, so it is difficult to refer to the results obtained by the authors. Fiumi et al. [19] and Krówczyńska et al. [18] present very high classification accuracy but include a small number of buildings with asbestos roofing. The average roof area was quite large and amounted to about 1200 m² [42]. However, the accuracy will increase as the average roof area increases [39].
Asbestos roofing mapping over larger areas, i.e., 3263 km² [20] and 795 km² [42], was characterized by significantly lower accuracies of 43% and 65%, respectively. The underestimation of the class was 48% and 65%. Previous works were not consistent when reporting achieved accuracies. Most of them reported overall accuracy, which, without information on class abundances, can be severely inflated. Our method achieved a PA of 89% and an OA of 89%, putting it among the most accurate results obtained so far.
Frassy et al. [20] pointed out that the minimum identifiable roof area should be at least 3 by 3 pixels. This assumption sets the minimum roof area eligible for identification to over 100 m² for MIVIS, Aisa Eagle II, APEX, QuickBird, and WorldView-2 images. It was also noted by Krówczyńska et al. [18] that PA increased with the roof surface. The spatial resolution of the imagery used (4 × 4 m) was insufficient to classify roofing on buildings with an area smaller than 144 m². The spatial resolution of the RGB and CIR aerial imagery is much higher than that of the hyperspectral imagery. For the area of Poland, it ranges from 5 to 25 cm and covers the entire country. In this study, the average analyzed building area was 119 m² in the urban area and 114 m² in rural areas, yet overall accuracies of 87% (RGB) and 89% (CIR) were achieved.
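The 3 by 3 pixel rule of thumb translates directly into a minimum roof area for a given pixel size. The small helper below is a hypothetical illustration of that arithmetic, assuming square pixels and a square 3 by 3 footprint.

```python
def min_roof_area_m2(pixel_size_m, min_pixels_per_side=3):
    """Smallest identifiable roof area for a given ground pixel size."""
    side_m = pixel_size_m * min_pixels_per_side
    return side_m * side_m


area_at_4_m = min_roof_area_m2(4.0)      # imagery with 4 x 4 m pixels
area_at_25_cm = min_roof_area_m2(0.25)   # aerial imagery used in this study
```

For 4 m pixels this gives the 144 m² threshold quoted above, whereas for 25 cm pixels the same rule yields well under 1 m², far below the 70 m² lower limit applied to buildings in the field inventory.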
Previous studies [45] showed that low spectral resolution (four broad bands in the visible and near-infrared spectrum) only allowed for a rough classification of roofing materials. This study showed that, when CNNs are used, spectral resolution does not significantly affect the classification accuracy. Aerial imagery in the RGB and CIR compositions with a resolution of 25 cm was used, and the overall accuracy achieved was 87% for the RGB composition and 89% for the CIR composition.
Another point of contention might be the datasets used in previous works. Most of them use relatively low-resolution datasets compared with the aerial orthophotos used here (0.25 m spatial resolution). The use of such a dataset allows for additional validation of results by means of photointerpretation, which, considering the object of research, is very difficult on images with spatial resolution measured in meters. The use of widely available and cheap orthophotos strengthens the reliability of the method (due to the ability to double-check the results in the lab) and makes the presented method easy to implement (most countries have orthophotos already available and no additional data have to be acquired).

Conclusions
Field-based asbestos mapping is a long and arduous process that generates high costs, while not guaranteeing the highest quality. Therefore, it is reasonable to consider using remote sensing for this task since it has the potential to lower costs and increase the reliability of such mapping. The method presented in this paper allowed the identification of asbestos roofing with a PA of 89% and an OA of 89%. We used high-resolution orthophotos (0.25 m spatial resolution) and tested both RGB and CIR compositions. The area of Poland is covered by high-resolution orthophotos updated every three years. This dataset is preferred for the presented method for wide-scale asbestos roofing identification and monitoring. While the results for rural areas are most promising, it is not entirely clear whether the proposed method can be successfully employed in dense urban areas, such as big cities. Our method requires the creation of an asbestos roofing database for each commune in order to ensure high accuracy and allow proper validation. In addition, illumination differences in images due to changing light conditions during image collection will make any attempt at transferring a trained network to other communes difficult. It seems that it is possible to accurately identify asbestos roofing without the use of hyperspectral data. The accuracy achieved in this survey is comparable to, and in some cases higher than, the accuracy delivered by the use of hyperspectral data. Hyperspectral data have had some successes in the mapping of asbestos roofing. However, their price and processing requirements, compared to the data used in this work, make them uneconomical to employ on a wider scale.
Previous work concerning asbestos identification with remote sensing primarily used pixel-based methods that indicate pixels that supposedly identify asbestos-containing roofs. Such results are of limited value. One has to translate pixels identified as asbestos into usable information. This creates uncertainty where some of the pixels belonging to the same roof are identified as asbestos while others are not. That potentially lessens the credibility of such studies. The method presented in this work avoids this issue by assigning a class label to the collection of pixels (signature) describing a given roof. The results shown in this work provide an easy-to-understand final product that can be used immediately for a wide range of purposes. Moreover, by focusing only on the roofs, the problem of misclassification of pixels that are not even located on a building roof (a known problem when using pixel-based classification approaches) is eliminated. On the other hand, this feature forces potential users to prepare the database of building roof locations using other means, be it manual or automatic. Moreover, this study covers a large number of identified roofs (more than 6000), with almost 46% of them having asbestos roofing. This allowed for thorough validation of the trained network and provided a sufficient number of signatures for each class, ensuring proper training of the network.
Polish law requires citizens and private and public companies to report any use or existence of asbestos-cement products on their premises to the appropriate public administration office. Those reports are then compiled into the Asbestos Database. The percentage of communes that report asbestos roofing is high, but there are no mechanisms to check those reports for completeness or quality. Local commune governments carry out the mapping of asbestos-cement products using data collected by lower-ranking commune officials or third-party entities. Depending on how those data were collected, information fed into the Asbestos Database is of varying quality and may or may not be accurate. By comparing it with field-collected data, we were able to show that the number of buildings in the Asbestos Database for the Chęciny commune was underestimated by 36%, while 26% of all registered buildings had an incorrect roofing type reported.
Regarding the data collected in the Asbestos Database, further work should focus on reverification of the already-collected data. Ultimately, data from the Asbestos Database could be automatically updated using CNNs. The use of one coherent method on a national level would allow data comparison. In addition, this method, due to the use of aerial images available free of charge to public authorities and scientific units, would be far cheaper than field inventories. Undoubtedly, the biggest benefit would be the possibility of obtaining consistent data on the quantity of asbestos-cement roofing every three years. Determining the amount of asbestos-cement products in use is important for assessing environmental exposure to asbestos fibers, determining the patterns of disease, and ultimately modelling potential solutions to counteract threats.