Classification of Urban Road Traffic Noise based on Sound Energy and Eventfulness Indicators

Noise energetic indicators, like Lden, show good correlations with long term annoyance, but should be supplemented by other parameters describing the sound fluctuations, which are very common in urban areas and negatively impact noise annoyance. Thus, in this paper, the hourly values of continuous equivalent level LAeqh and the intermittency ratio (IR) were both considered to describe the urban road traffic noise, monitored in 90 sites in the city of Milan and covering different types of road, from motorways to local roads. The noise data have been processed by clustering methods to detect similarities and to figure out a criterion to classify the urban sites taking into account both equivalent noise levels and road traffic noise events. Two clusters were obtained and, considering the cluster membership of each site, the decimal logarithm of the day-time (06:00–22:00) traffic flow was used to associate each new road with the clusters. In particular, roads with average day-time hourly traffic flow ≥1900 vehicles/hour were associated with the cluster with high traffic flow. The described methodology could be fruitfully applied on road traffic noise data in other cities.


Introduction
According to the data provided by the European Environment Agency (EAA), "more than 41 million people are reported to be exposed above 55 dB L den due to road traffic noise inside urban areas", and nearly 90 million are estimated in Europe [1]. Road traffic noise is the most diffuse noise source in urban areas, wide spreading in space and time, despite the noise reduction actions implemented by policies and legislations at international and national level. As a consequence, the people's awareness towards the adverse health effects, both direct and indirect, produced by road traffic noise is increasing. The World Health Organization (WHO) has estimated that "at least one million healthy life years are lost every year from traffic related noise in the western part of Europe" [2]. There is evidence in the literature that "sleep disturbance and annoyance, mostly related to road traffic noise, comprise the main burden of environmental noise" [2]. L den , introduced by the European Directive 2002/49/EC on the assessment and management of environmental noise [3], is commonly applied by legislation to assess urban sound environments. However, although energetic indicators show good correlations with long term annoyance, their deficiency for evaluating perceptively urban sound environments has been pointed out in several studies [4]. In particular, they fail in evaluating fluctuating sounds, which are very common in urban

97
A script running in "R" environment, version 3.5.1 [10], was written to import each of the 24 h

107
From the input data, the script was computed for each hour:

108
• The IR value, according to the definition given in [7] and determined by the following: , , • 100 % where Leq,T,Events accounts for all sound energy contributions that exceed a given threshold K, that 110 is, clearly standing out from background noise; Leq,T,tot is the overall continuous equivalent level 111 referred to the measurement time T; and the threshold K is given by the following: where C, as stated in [7], can be assumed equal to 3

Data Processing and Analysis
A script running in "R" environment, version 3.5.1 [10], was written to import each of the 24 h time series as input in terms of text file (four columns with date, time, SPL dB(A) at 1 s intervals, and a code to indicate whether the source was road traffic or something different). In order to provide indicators for the classification, a time window of duration T = 1 hour was chosen to compute L Aeq and IR values This time period was established by the Italian legislation for road traffic noise measurement. Besides this requirement, the chosen measurement time T was considered a reasonable compromise between longer time (i.e., 24 hours, day and night periods, and so on) and shorter ones (i.e., 30 min or even shorter). For the sites where the noise monitoring lasted more days, the median value of IR and L Aeq for each hour was determined, as this parameter is less influenced than the mean by the presence of outliers.
From the input data, the script was computed for each hour: • The IR value, according to the definition given in [7] and determined by the following: where L eq,T,Events accounts for all sound energy contributions that exceed a given threshold K, that is, clearly standing out from background noise; L eq,T,tot is the overall continuous equivalent level referred to the measurement time T; and the threshold K is given by the following: where C, as stated in [7], can be assumed equal to 3 as a result of the numerical simulations for different traffic situations; • The L Aeqh referred to the corresponding day-time L Aeqd (06:00-22:00), taken as a reference level, that is, the difference δ = L Aeqh − L Aeqd ; this computation was necessary because of the different monitoring set-up at the sites, such as different distances from the road and the characteristics of the road itself (its geometry, the presence of reflecting surfaces and obstacles in sound propagation, and types of paving) [11].
As explained in Section 2.1, the microphone was placed close to the road to reduce the influence of noise from other sources and, therefore, the source-receiver distance was not always the same. This factor influences the IR values ( Figure 4 in [12]).
Thus, a matrix formed by 48 variables, (IR and δ for each hour) × 90 observations (sites) = 4320 values, was the input of the subsequent cluster analysis, an unsupervised machine-learning technique, performed to find out similarities in the time patterns. Because the data are measured by different scales, namely δ is in dB and IR in percentage, they needed to be scaled (mean = 0 and standard deviation = 1) before clustering; in addition, the Euclidean distance, one of the most used distance metrics, was considered to represent the similarity between pairs of sites. To select the optimal solution for clustering, that is, the agglomeration algorithm and number of clusters, the "clValid" R package, version 0.6-6 [13] was applied. All 10 clustering methods available in the package were considered, that is, hierarchical clustering with the Ward's method [14], partitioning around medoids (PAM) [15], k-means [16], DIvisive ANAlysis clustering (DIANA) [15], Clustering LARge Applications (CLARA) [15], AGglomerative NESting (AGNES) [15], self-organizing map (SOM) [17], self organizing tree algorithm (SOTA) [18], model-base clustering [19], and fuzzy analysis clustering [20]. The clustering performance of the methods was ranked according to seven internal clustering validation measures [21,22], namely, the connectivity [23], silhouette width [24] and Dunn index (combining measures of compactness and separation of the clusters) [25], average proportion of non-overlap (APN), average distance (AD), average distance between means (ADM), and figure of merit (FOM) [26,27]. All of the clustering algorithms were ranked based on their performance as determined simultaneously by all the validation measures [28].
There is no generally accepted rule on the ratio between the number of clustering variables and sample size (number of observations, sites in this study). Usually, the latter should be much higher than the former. In this study, the ratio between sites and variables is low (90/48 = 1.875). Thus, principal component analysis (PCA) was also performed on the input matrix in order to reduce the number of variables and account for the largest possible variance of the original variables. The method generates a new set of variables, called principal components, and each principal component is a linear combination of the original variables. All the principal components are orthogonal to each other, so there is no redundant information. Components with larger variance are the most relevant to the clustering and, therefore, removing features with low variance acts as a filter that results in a distance metric that provides a more robust clustering. The PCA output for the selected components was taken as input for a further k-means clustering, and the obtained classification was compared with that obtained by the earlier k-means applied to all 48 variables.
Furthermore, the hourly values of the 24 h pattern, corresponding to the cluster membership of the site, were compared with those observed at the site itself and the differences ε were evaluated in terms of root mean square (RMS) and probability P ε of ε being within a specified interval width.
Supervised machine-learning algorithms can be applied to classify new roads into the clusters determined by the cluster analysis, but the acoustic data must be known. Unfortunately, very often they are not available and, therefore, an alternative classifier is needed. For this purpose, the road traffic flow is a potential candidate as its value, obtained from transport models or measured, is rather available and linked to the produced noise. In the present study, the decimal logarithm of the day-time (06:00-22:00) traffic flow rate was considered as an alternative road classifier.

Determination of Clusters and Patterns of Hourly Values of δ and IR
The clustering algorithms ranking based on their performance as determined by all the validation measures showed that the best solution was the k-means agglomeration algorithm, a centroid-based clustering, and a division into two clusters. The obtained partition into two groups corresponds to the minimal discrimination among the data, and this low discrimination enables an easier association of new data with the determined clusters. Table 2 reports the distribution of the sites in each road type and cluster, compared also with the distribution obtained by the DIANA algorithm applied earlier to the hourly IR values only [8]. The two classifications are rather similar as they overlap for 93.3%, that is, 93.3% of all sites are classified in the same category by both clustering algorithms and the percentage of disagreement is 6.7% of pairs. Cluster 2, with the largest number of sites, includes the majority of "F" road types (local roads with low traffic flow), whereas cluster 1 includes the majority of sites in the remaining roads. The multidimensional scaling (MDS) applied to the data provided the bi-dimensional plot given in Figure 2, where the two clusters obtained by k-means look satisfactorily separated.
Appl. Sci. 2020, 10, x FOR PEER REVIEW 5 of 15 easier association of new data with the determined clusters. Table 2 reports the distribution of the 168 sites in each road type and cluster, compared also with the distribution obtained by the DIANA 169 algorithm applied earlier to the hourly IR values only [8]. The two classifications are rather similar as they overlap for 93.3%, that is, 93.3% of all sites are classified in the same category by both clustering sites, includes the majority of "F" road types (local roads with low traffic flow), whereas cluster 1 173 includes the majority of sites in the remaining roads.

174
The multidimensional scaling (MDS) applied to the data provided the bi-dimensional plot given  for the sites belonging to cluster 2 is lower (mainly local roads) than that for the sites belonging to 185 cluster 1. Figure 4 shows the straight comparison between the time patterns for the two clusters. The 186 largest differences between the two clusters are observed during the night-time for δ and between  Bi-dimensional plot of the two clusters obtained by k-means. Dimension 1 and 2 explain 37.8% and 13.7% of the variance, respectively. The rectangular frames highlight the two sites with the greatest distance (dissimilarity) between them in the bi-dimensional plot. Figure 3 reports, for each cluster, the 24-h pattern of the hourly median values (thick lines) ± the median absolute deviation (MAD, grey area) for both δ and IR, as well as the observed time pattern (thin lines) for each of the sites included in the cluster. The patterns show that, in the night period (22:00-06:00), the IR values are the highest for both clusters because of the presence of noise events clearly emerging above the background noise, which is lower than that observed during the day-time. The hourly IR values for cluster 2 are higher than those for cluster 1 because the traffic flow for the sites belonging to cluster 2 is lower (mainly local roads) than that for the sites belonging to cluster 1. Figure 4 shows the straight comparison between the time patterns for the two clusters. The largest differences between the two clusters are observed during the night-time for δ and between 10:00 and 16:00 for IR. Thus, these periods are the most suitable to discriminate between the two clusters.

Principal Component Analysis (PCA) Outcome
As explained in Section 2.2, PCA was performed in order to reduce the number of clustering variables. The scree plot, reported in Figure 5a for the first 12 components, shows that 10 components have eigenvalues ≥1, a value commonly used as a cutoff point at which principal components are retained. The cumulative percentage of variance explained by these 10 principal components is 86.7% (Figure 5b).  The rationale of determining the 24 h patterns of δ and IR for each cluster, shown in Figure 4, is 217 to assign these patterns to the sites according to their cluster membership. This assignment, of course,

218
introduces an uncertainty that should be evaluated ( Figure 3). For this purpose, at each i th hour, the 219 difference εi between the pattern cluster value (PCi) and the measured value (Mi) was determined: This error ε can be expressed in different ways, either as relative error ε r , 221 ε = 10 10 ( )/ /10 / , or as the root mean square of the error ε (RMSE), where n is the number of observations (sites) in each cluster. Indeed, the patterns have their own 223 uncertainties (grey areas in Figure 3), but, for sake of simplicity, only their median values are 224 considered in the following.

225
For practical application, the RMSE error in dB was considered for the δ parameter because it is 226 more representative (Figure 6a), whereas for IR, measured in %, the relative error εr was chosen and 227 is reported in Figure 6b. For all the sites belonging to the same cluster, the εi absolute values for each Thus, a matrix of 90 observations (the sites) and 10 variables, that is, the retained principal components (90/10 = 9), was the input for the k-means cluster analysis, setting the number of clusters to two. The obtained classification was completely in accordance with that already determined by k-means applied to the 48 variables. This result confirms the robustness of the partition obtained by k-means accounting for the 48 variables.

Comparison between Clustering Patterns of δ and IR Median Hourly Values and Measured Data
The rationale of determining the 24 h patterns of δ and IR for each cluster, shown in Figure 4, is to assign these patterns to the sites according to their cluster membership. This assignment, of course, introduces an uncertainty that should be evaluated (Figure 3). For this purpose, at each i th hour, the difference ε i between the pattern cluster value (PCi) and the measured value (Mi) was determined: This error ε can be expressed in different ways, either as relative error ε r , ε rδ = 10log 10 (δ Pi −δ Mi )/10 /10 δ Mi /10 , or as the root mean square of the error ε (RMSE), where n is the number of observations (sites) in each cluster. Indeed, the patterns have their own uncertainties (grey areas in Figure 3), but, for sake of simplicity, only their median values are considered in the following. For practical application, the RMSE error in dB was considered for the δ parameter because it is more representative (Figure 6a), whereas for IR, measured in %, the relative error ε r was chosen and is reported in Figure 6b. For all the sites belonging to the same cluster, the ε i absolute values for each i th hour were determined and grouped (in %) in an interval bin width from 0.5 to 3 dB with steps of 0.5 dB. The median values of these percentages for each interval width were determined for the day (06:00-22:00) and night (22:00-06:00) periods and are plotted in Figure 6c,d, together with fitting curves. No large differences between clusters are observed for the error ε δ determined for δ; the largest values of RMSE error (Figure 6a) and the lowest values of the percentage of ε δ (Figure 6c) are observed for the period 00:00-05:00 (median error ε δ > 2.0 dB), which, on the other hand, is the most suitable to discriminate between the two clusters. The relative error ε rIR for IR is greater for cluster 1 during the day-time (06:00-22:00) (Figure 6b, median error ε rIR = 0.63).

Application of the δ and IR Cluster Patterns to New Data
To classify new roads according to the already determined clusters, it is necessary to find which cluster centroid is the closest to the new acoustic data. As an alternative, supervised machine-learning algorithms can be used and, among them, the k-nearest neighbors (k-NN) is the most common. Let us consider a road not included in the dataset used for determining the clusters (original data), with known δ and IR hourly values (new data). The k-NN algorithm searches for the k-nearest original data, in Euclidean distance, closer to the new data, and the classification is decided by majority vote. If there are ties for the k th nearest vector, all candidates are included in the vote.
Unfortunately, this procedure most often cannot be applied because the acoustic data for the road are unknown. Thus, it is necessary to look for an alternative classifier, hopefully linked to a specific feature of the road and/or to the corresponding traffic flow.
In noise monitoring surveys, to save time and reduce costs, stratified samplings of urban road traffic noise based on some attributes (i.e., road categorization) are commonly used. Their features and performances have been already analyzed, for instance, in [29,30]. Traffic flow (q), that is, the number of vehicles passing a point per unit of time (24 h, 1 h, and so on), can be a potential parameter for such a linkage because it represents information (observed or estimated by transport models) usually available for urban roads and used for their categorization in noise studies [31].
In a previous paper [32], the decimal logarithm of the day-time (06:00-22:00) total traffic flow, hereinafter "x", has been proposed as a "non-acoustic" parameter to be applied to assign cluster membership. This choice was preferred to the 24-h traffic flow because of the observed large uncertainty of the night flow calculated by the traffic model [32]. Indeed, further investigations [33,34] have shown that there are significant differences between the traffic flow data provided by the model and those measured. This points to the need for more accurate data and, on this issue, there is an increasing interest to obtain real traffic flow data, for instance, through methodologies based on video processing and object detection tools [35], or on travel time data obtained by Google Application Programming Interface (API) and Big Data treatment [36]. Figure 7a shows the histograms of the data (bin width of x = 0.1) according to their cluster membership. It can be seen that, for 12 out of 26 bins (46.2%), there is an overlap between the two distributions. In order to figure out a threshold value to discriminate the cluster membership, the procedure to determine an analytical representation of the distribution functions in each cluster, as described in [37,38], was applied. The probability distributions for the two clusters P(x) can be obtained from the analytical fit of the cumulative distribution I(x) according to the following relationship: P(x) = ln(10)·f (x)·I(x) (8) where f(x) is a 3rd order polynomial, f'(x) is the derivative of f(x), and I(x) is the cumulative distributions of x. The fit functions f 1 and f 2 for P(x) obtained for the two clusters are as follows: Appl. Sci. 2020, 10, x FOR PEER REVIEW 9 of 15 uncertainty of the night flow calculated by the traffic model [32]. Indeed, further investigations [33,34] 256 have shown that there are significant differences between the traffic flow data provided by the model 257 and those measured. This points to the need for more accurate data and, on this issue, there is an 258 increasing interest to obtain real traffic flow data, for instance, through methodologies based on video 259 processing and object detection tools [35], or on travel time data obtained by Google Application
They are plotted in Figure 7(b) for the two distributions and show the "probability" that a road 271 with a given "x" belongs to clusters 1 (blue curve) and 2 (red curve), respectively. It can be seen that,   They are plotted in Figure 7b for the two distributions and show the "probability" that a road with a given "x" belongs to clusters 1 (blue curve) and 2 (red curve), respectively. It can be seen that, for x < 4.48 (intersection of the two curves), membership to cluster 2 can be roughly assumed, whereas for x ≥ 4.48, membership to cluster 1 seems to be more appropriate. The classification of sites based on this threshold value (roughly corresponding to an average hourly traffic flow of 1900 vehicles/hour) is in accordance with the original cluster memberships for 71.9% of the sites, a satisfactory value for the performance of the binary classifier "x". In particular, on the basis of this threshold, the number of sites included in cluster 1 (high traffic flow) decreases from 40 (clustering partition) to 23 (classification by "x") and, conversely, those included in cluster 2 (medium-low traffic flow) increase from 50 (clustering partition) to 67 (classification by "x"). Table 3 reports the median values of RMSE relative error ε r (Equation (7)) for δ and IR determined for the day and night periods and the two classifications based on clustering and the "non-acoustic" parameter x. The values in bold-italics point out the situations (4 out of 16) where the classification based on the binary classifier "x" is more accurate than that based on clustering. For the opposite situations, it has to be addressed that the classification based on "x" makes possible the association of new roads to the clusters also when acoustic data of them are unknown and only the day-time traffic flow is available. Table 3. Root mean square error (RMSE) median values of relative error ε r for δ and IR determined for the day and night periods and the two classifications based on clustering and the "non-acoustic" parameter x.

Discussion
Road categorization is often applied to stratified samplings of urban road traffic noise based on some road attributes. This sampling, aimed to get data variability within each stratum less than that between the strata, can be fruitfully applied not only to optimize the resources and time involved in noise monitoring [37], but also to be integrated in the noise mapping process [39]. Unfortunately, among the possible road attributes to be selected for categorization, the Italian road functional classification appears to not be appropriate because, as clearly shown in Table 2, the partitions determined by k-means cluster analysis include different road types. The k-means cluster analysis was also performed selecting site partitions into four groups to be numerically consistent with the four functional road categories. The comparison reported in Table 4 shows a large mismatch between the two approaches, namely 58.9% of the sites were differently grouped. This is most likely because of the inadequacy of functional road classification to represent the actual use and noise emission of roads.
The data partition obtained by clustering (see Table 2) is satisfactory as the two clusters have similar size (40 and 50 sites) and are fairly separated. Cluster 1 includes sites with medium-high traffic flow, whereas cluster 2 is formed by sites with low-medium traffic flow (median value of the average day-time hourly traffic flow equal to 1760 and 470 vehicles/hour, respectively). As shown in Table 2, the obtained classification taking into account δ and IR is rather similar to that considering hourly IR values only, as they overlap for 93.3%. The misclassifications are observed only for road types "E" (four sites) and "F" (two sites).
It is interesting to point out that the major differences between the two clusters for both δ and IR are observed during the night period (22:00-06:00), as clearly shown in Figure 8. Thus, in the view of reducing the number of variables to be taken into account in the clustering process, those included in the night-time might represent potential candidates (16 variables instead of 48).

321
To gain deeper insights of the differences between the two clusters, the farthest two sites in the 322 cluster bi-plot in Figure 2 were selected (highlighted by rectangular frames in Figure 2), and each of 323 the parameters reported in Table 5 and Figure 9 were examined. Table 5    To gain deeper insights of the differences between the two clusters, the farthest two sites in the cluster bi-plot in Figure 2 were selected (highlighted by rectangular frames in Figure 2), and each of the parameters reported in Table 5 and Figure 9 were examined. Table 5 reports, for each site, the hourly interval when the maximum value of the parameter was observed, whereas Figure 9 shows the density plots of the hourly values distribution. For the four parameters examined, their maximum values are observed within the period of 00:00-09:00. In particular, for L Aeqh , the density distributions have a similar shape, whereas for IR, the maximum values are observed between the interval 02:00-04:00; the local road (cluster 2) shows a slightly higher number of events than the motorway (cluster 1). For the sound exposure level (SEL), the density distributions show similar modes, but that for cluster 1 is right-skewed and that for cluster 2 is left-skewed.  Regarding the output of PCA, the correlation plot, showing the contributions of the variables in accounting for the variability in the two first principal components, is reported in Figure 10. It can be seen that dimension 1 can be interpreted as describing the eventfulness of the noise as the most contribution comes from the IR metric, whereas dimension 2 is linked to the noise energy, with L Aeqh being the most contributing variables.   To give an example of application of the above data classifications, let us consider three new roads for which the hourly values of δ and IR, dotted lines in Figure 11, as well as the non-acoustic parameter x, are known. According to the categorization based on x, because for all the three roads, the x value is less than 4.48, they should be classified into cluster 2, the centroid of which is also given in Figure 11, together with that of cluster 1. Running the k-NN algorithm with k = 1, all three roads are recognized to belong to cluster 2 as well. Thus, for these data, both classification criteria, based on clustering and the parameter x, agree in the cluster membership assignment. roads for which the hourly values of δ and IR, dotted lines in Figure 11, as well as the non-acoustic parameter x, are known. According to the categorization based on x, because for all the three roads, 350 the x value is less than 4.48, they should be classified into cluster 2, the centroid of which is also given 351 in Figure 11, together with that of cluster 1. Running the k-NN algorithm with k = 1, all three roads 352 are recognized to belong to cluster 2 as well. Thus, for these data, both classification criteria, based 353 on clustering and the parameter x, agree in the cluster membership assignment.

360
The obtained two clusters can be fruitfully applied to stratify road sampling by choosing a 361 representative road sample in each cluster and, therefore, to optimize noise monitoring resources 362 [31]. The approach of road traffic noise clustering can be integrated in the noise mapping process too, 363 as already applied in the DYNAMAP project [40,41], where a 24 h pattern representative of each 364 cluster is assigned to non-monitored roads belonging to the cluster itself.

365
The assignment of new roads to the clusters can be obtained by supervised algorithms, like k-

366
NN, when the acoustic data are known and, much more often, by a classifier linked to the road noise,

370
Of course, the results are strongly dependent on the local situation and could not be generalized 371 to other contexts. However, besides the above limitations, the described methodology could be      Figure 11. Comparison of the new data roads, δ (a) and IR (b), with the centroids of the two clusters.

Conclusions
Cluster analysis was performed on a data sample of urban road traffic noise monitored in the city of Milan, formed by the hourly values of A-weighted continuous equivalent level L Aeqh , describing the sound energy, and the intermittency ratio (IR), quantifying the noise events owing to vehicle pass-by.
The obtained two clusters can be fruitfully applied to stratify road sampling by choosing a representative road sample in each cluster and, therefore, to optimize noise monitoring resources [31]. The approach of road traffic noise clustering can be integrated in the noise mapping process too, as already applied in the DYNAMAP project [40,41], where a 24 h pattern representative of each cluster is assigned to non-monitored roads belonging to the cluster itself.
The assignment of new roads to the clusters can be obtained by supervised algorithms, like k-NN, when the acoustic data are known and, much more often, by a classifier linked to the road noise, herewith the logarithm of day-time (06:00-22:00) traffic flow. The two data partitions obtained by k-means clustering and by the threshold x = 4.48, corresponding to an average day-time hourly traffic flow of 1900 vehicles/hour, are in a satisfactory accordance (classification matching 71.9%).
Of course, the results are strongly dependent on the local situation and could not be generalized to other contexts. However, besides the above limitations, the described methodology could be fruitfully applied on road traffic noise data in other cities, allowing to compare different noise contexts. Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.