Statistical Modeling of the Early-Stage Impact of a New Traffic Policy in Milan, Italy

Most urban areas of the Po basin in the North of Italy are persistently affected by poor air quality and difficulty in disposing of airborne pollutants. In this context, the municipality of Milan started a multi-year progressive policy based on an extended limited traffic zone (Area B). Starting on 25 February 2019, the first phase partially restricted the circulation of some classes of highly polluting vehicles on the territory, in particular, Euro 0 petrol vehicles and Euro 0 to 3 diesel vehicles, excluding public transport. This is the early-stage of a long term policy that will restrict access to an increasing number of vehicles. The goal of this paper is to evaluate the early-stage impact of this policy on two specific vehicle-generated pollutants: total nitrogen oxides (NOx) and nitrogen dioxide (NO2), which are gathered by Lombardy Regional Agency for Environmental Protection (ARPA Lombardia). We use a statistical model for time series intervention analysis based on unobservable components. We use data from 2014 to 2018 for pre-policy model selection and the relatively short period up to September 2019 for early-stage policy assessment. We include weather conditions, socio-economic factors, and a counter-factual, given by the concentration of the same pollutant in other important neighbouring cities. Although the average concentrations reduced after the policy introduction, this paper argues that this could be due to other factors. Considering that the short time window may be not long enough for social adaptation to the new rules, our model does not provide statistical evidence of a positive policy effect for NOx and NO2. Instead, in one of the most central monitoring stations, a significant negative impact is found.


Introduction
Air quality monitoring is one of the major challenges that European institutions jointly with national and local administrations are facing in terms of environmental protection. In particular, the 2008 European Air Quality Directive (AQD) 2008/50/EC [1] requires EU Member States to design appropriate air quality plans for zones where the air quality does not comply with the AQD limit values. In the last few decades, European countries implemented various modeling methods to assess the effects of local and regional emission abatement policy options on air quality and human health [2]. On the one side, they include scenario approaches, in which running a chemical-physical simulation model with and without a specific emission source allows for quantifying the impact on air quality The present paper analyses the introduction of the first phase of an air quality control policy in the municipality of Milan, which started on 25 February 2019 and directly acts on traffic rules. The administration defined an extended limited traffic zone, named Area B (https://www.comune. milano.it/aree-tematiche/mobilita/area-b), where the access and circulation for the most polluting vehicles, as well as those longer than 12 meters, have some partial restrictions, enforced by a monitoring system of entrance gates controlling each license plate and imposing a fine on unauthorized vehicles. The access prohibition concerns Euro 0 petrol vehicles and a large part of Euro 0, 1, 2, and 3 diesel vehicles, with specific exemptions for public transport, itinerant traders, and residents, and it is active from Monday to Friday during business hours (from 7:30 a.m. to 7:30 p.m.), except holidays. According to the municipality of Milan, the share of cars registered in the Milan metropolitan area and involved in the restrictions in 2019 is close to 17%, while the share of freight transport vehicles is around 53% [11]. Area B is a progressive policy divided into various phases, which will concern an increasing number of vehicle classes. In terms of NOx emissions, the administration expects a reduction of 4-5% per year until 2022 and a reduction of 11% between 2023 and 2026. The policy will be fully operative within October 2030.
Area B extends the previously existing limited traffic zone, Area C, which covers just the historical city centre. The physical coverage of the two restriction zones is represented in Figure 2, which highlights the arrangement of both within the city borders. Area B covers almost the entire area of the city, excluding extreme peripheral districts.
Statistical literature on air quality grew up sharply in the last decades. Two main statistical modeling directions have been developed. One has a focus on pollutants concentration and the other on human exposure. Regarding the latter, recent advances are based on crowdsourced data, such as smartphone data modeling [12]. Regarding pollutants concentrations, increasing attention is being given to latent component models; see, as an example [13] and for the problem of misalignment. In particular, the use of the INLA-SPDE approach for misalignment between pollutant concentration and epidemiological data [14] and PCA based methods with missing data [15].
When the territory under study is large and spatial correlation is important, spatio-temporal models are appropriate. See, for example, the multivariate state space approach of Calculli et al. [16], which is capable of handling jointly PM 10 , NO 2 and weather variables, the approach of Menezes et al. [17] for modeling daily NO 2 trends in Portugal. Moreover, the land-use regression model (LUR) under a state space approach has been used for modeling air pollution in Tehran [18]. Despite this growing spatial literature, time series analysis methods have been recently developed to understand the effect of meteorology on pollutant concentration [19], which will be the main focus of this paper.
The previous Milan limited traffic zone, known as Area C, has already been treated in literature by Fassò [20], who analyzed its introduction through spatio-temporal models, by Invernizzi et al. [21], who considered its impact on black carbon, and by Percoco [22] who considered its effect on traffic. Moreover, similar problems have been studied for London "sulphur-free zone" [23] and the "low emission zone" in Munich [24]. In Fassò [20], the author considered both particulates and nitrogen oxides and observed the presence of a more pronounced permanent reduction of the latter within the restricted area, despite the data showing a strong spatial variability depending on the type of pollutant. This is consistent with the known emissions pattern of particulate matters and nitrogen oxides. The latter are mainly primary gaseous pollutants and can be directly attributed to anthropogenic sources, such as car traffic and house heating. Moreover, from the so-called INEMAR emission inventory [25], in Milan province, 68% of NOx and only 41% of PM10 are due to road traffic. Hence, in this first study of Area B, we will take into account NO x and NO 2 and postpone the analysis of PM10 and PM2.5 to further research. To adjust for confounding factors, we will consider weather conditions in Milan, the main calendar events, and the concentration levels of oxides observed in neighbouring towns, as in a pseudo-treatment-control approach.
The study aims to identify and quantify variations in pollutant levels due to the above described Area B. Hence, the present paper will try to investigate and test the following two scientific hypotheses: Hypothesis 1. The introduction of Area B achieved significant changes in pollution concentration for the city of Milan; Hypothesis 2. The variation occurred homogeneously on the territory and the stations do not show spatial variability of the effects.
The first hypothesis aims to quantify the impact of the policy on pollution levels measured by several air quality stations scattered around the city and to assess whether this evidence is significantly supported by the data. The impact is evaluated both regarding the statistical significance of the estimates, the absolute magnitudes of the coefficients, and their signs. From the policy maker perspective, the expected coefficients should be negative, indicating a reduction effect on concentrations due to the car traffic restrictions. However, given the complexity of the phenomenon, a change of opposite sign cannot be ruled out either. The second research hypothesis is dedicated to the comparison of the estimates for the considered stations: the effect can be considered homogeneous when the sign and the magnitude of the coefficients for all the stations are similar.
The paper is structured as follows. Section 2 describes the dataset and the methodologies used for the analyses. In particular, we briefly explain the composition of both weather and air quality monitoring systems in Milan, available data sources, and metadata information. Then, we present the methodologies implemented for the preliminary analysis and the state space approach to time series analysis for air quality data. Section 3 reports and discusses the empirical results of the estimated models and their implications. Section 4 concludes the paper discussing the two research hypotheses in light of what emerged from the data analysis and gives some hints for future research developments.

Materials and Methods
In this section, we present the structure of the ARPA dataset and briefly introduce the methodologies implemented for the analyses. Section 2.1 introduces the data source for the Milan case study and the spatio-temporal structure of the data and provides a brief description of the variables taken into consideration. Section 2.2 designs the preliminary analysis, which introduces a temporal treatment-control experiment to highlighting the differences in concentration levels before and after the policy intervention. Section 2.3 gives a detailed overview of the use of state space models in time series analysis for the study of air quality data, including also a specific subsection for model selection and policy intervention.

Air Quality and Weather Monitoring Network in Milan
Data on pollution and weather conditions of Lombardy are collected from the Lombardy Regional Agency for Environmental Protection (ARPA Lombardia), which makes a large open data portal fully available to users (https://www.dati.lombardia.it/). The agency manages a diffuse monitoring system distributed among the regional territory and counting on hundreds of monitoring stations collecting intra-daily information on climate and pollution through sensors.
Installed within the borders of Milan are seven weather monitoring stations and five air quality monitoring stations. Air quality stations are classified according to a taxonomy system that identifies the type and function in the network. The stations Liguria (ARPA code 539), Marche (ARPA code 501), Senato (ARPA code 548), and Verziere (ARPA code 528) are urban traffic control units: sensors installed near important roads and intersections in order to accurately quantify the pollution generated by traffic. The station Città Studi (ARPA code 705) is instead of type urban background, that is, the station is located in such a position that the level of pollution is not mainly influenced by specific sources but by the integrated contribution of all the upwind sources at the station with respect to the predominant directions of the winds on the site [8]. The seven weather stations are Marche (ARPA code 501), Lambrate (ARPA code 100), Zavattari (ARPA code 503), Brera (ARPA code 620), Feltre (ARPA code 869), Rosellini (ARPA code 1327), and Juvara (ARPA code 502). Figure 2 georeferences on the map the exact position of each station and allows for identifying the position with respect to Area B and Area C. Air quality stations are represented as blue points, while weather stations are the red points. Marche station (ARPA code 501), in the upper side of the map, is the only one to collect both weather and pollution data and is represented with a double label, the first one blue and the second red.
The spatial distribution of the stations is not uniform: air quality stations cover northern, eastern, central, and southern parts of the city, leaving the western districts uncovered; climate stations cover in detail the city centre and all the northern neighbours but are not installed in the south.

Temporal Coverage, Pollutants, and Weather Measures
The analysis presented in this paper takes into account daily measures from 1 January 2014 to 30 September 2019, generating an overall sample of 2099 daily observations. The whole, the monitoring system provides information about many urban pollutants, such as carbon dioxide, particulates, and oxides. All the pollutants are measured as µg/m 3 . As already stated in the Introduction, we focus our attention on concentrations of total nitrogen oxides (NO x ) and nitrogen dioxide (NO 2 ), which are mainly primary gaseous pollutants, hence considered as proxies of pollution emissions due to human activities, first of all car traffic.
Weather stations provide measures of local temperature ( • C), rainfall (cumulated mm), humidity (%), global radiation (W/m 2 ), wind speed (m/s), and wind direction. The wind direction is expressed in clockwise degrees from 0 • to 360 • ; for example, 90 • identifies winds going from east to west. To make results easier to interpret, we decide to aggregate the measurements on wind direction and speed by constructing a set of new variables that describe the average speed in the four quadrants of the compass rose. The Northeast quadrant (Q NE ) corresponds to degrees between 0 and 90, the Southeast quadrant (Q SE ) to degrees from 90 to 180, the Southwest quadrant (Q SW ) to degrees from 180 to 270 and the Northwest quadrant (Q NW ) to the remaining values lying between 270 and 360 degrees. These measures will be used in the modeling part to capture local weather conditions specific to the city of Milan. Instead of using the data referring to the weather station closest to each air quality station, we preferred to aggregate each of the climate variables through a daily city average valid for each pollution station. In this way, the subsequent models will be fully comparable guaranteeing the maximum possible spatial coverage.

Anthropogenic Activities
Human activities, and therefore the quality of the air we breathe, are often affected by calendar events that are recorded based on national, local, and religious holidays and weekends. Calendar effects are captured by a set of covariates, which identify the weekends and the main Italian holidays, both religious and secular. Holidays are collected in a dummy variable named Holidays, while the weekends are contained in a dummy variable named WeekEnd. Specific effects related to the behavior of people can be observed when holidays coincide with the weekend; therefore, we considered two terms of interaction between the two dummies. The interaction terms include those holidays that fall on Saturday, denoted as Saturday:Holiday, and those on which they fall on Sunday, which is Sunday:Holiday.
For a correct assessment of the effects of the traffic policy on pollutants concentrations in Milan, it is necessary to purify the estimates from any external weather or socio-economic effects overlapping with the policy and which may hence alter policy effects. This operation is accomplished by introducing a counter-factual term into the models represented by the pollution levels observed in other cities surrounding Milan. We considered seven important urban centres located in the Lombardy Po Valley area, which show socio-demographic and economic characteristics and weather conditions similar to Milan, but which cannot be directly affected by the limited traffic zone. These urban centres are Bergamo (East), Brescia (far East), Cremona (far Southeast), Lodi (Southeast), Pavia (South), Saronno (North) and Treviglio (East). As reported in Figure 3, the considered candidates cover a large territory surrounding Milan in all the directions while maintaining a sufficient distance to be considered independent in terms of traffic. Figure 4 shows the temporal evolution of yearly average and median concentrations in the period preceding and following the entry into force of the policy for each control units located in Milan. According to the figure, starting from 2015, the city of Milan recorded a generalized reduction of concentration levels especially in peripheral areas, such as Marche and Liguria. Observed mean values for 2019 present a further reduction of concentrations rather apparently anomalous and significant. The comparison between the levels of NO x and NO 2 pairs for each station shows obvious common trends between the two pollutants both considering the annual average and median values. Averages and medians follow similar temporal patterns, but focusing on nitrogen oxides sensors, it is possible to note that the medians are significantly smaller than the averages, highlighting the heavy-tailed characteristic of the distribution (positive asymmetry) and the presence of extreme values. Following these facts, an interesting question to investigate is if, and how much, the greater difference observed in 2019 can be attributed to traffic restrictions, or if it is due to a general de-carbonization trend that the city is experiencing, or to weather variations not considered yet. Before investigating the factors and causes that may have generated these sharp reductions, we perform a preliminary analysis of the concentration levels pre-and-post policy, in order to quantify the changes observed in 2019 both in Milan and in the other centres. Since air quality data present outliers and heavy-tail distributions given by extreme events, the only use of average values for central tendency estimation can provide misleading results. Therefore, we compare the central values obtained both considering the sample mean and the sample median, which is notoriously a more robust indicator if outliers occur [26,27].

Methods: Average and Median Difference before and after the Policy
The comparison is performed through the computation of two statistics based on the difference of central tendency indicators. The first statistic computes the difference between the average of the observations gathered after the policy intervention and the average of observations referring to the sub-period 2014-2018. The second statistics consists of computing the difference between median concentration levels observed in 2019 and before that year. The difference in average concentrations is denoted by d AVG , whereas the difference in median concentrations is denoted as d MED . Since both sub-periods are treated as independent of each other, from the statistical perspective, the statistics are assimilable to unpaired samples statistics. Both

Methods: Time Series Modeling Using a State Space Approach
In this section, we discuss the time series models used to identify the policy effect, the estimation algorithms, and the related inference. Firstly, we introduce a brief description of the basic structural model (BSM) using a state space approach for time series analysis and the estimation algorithm based on the Kalman filter [28,29]. Then, we present a three-step procedure used to select the most representative model in terms of predictive power and quality of fit. As a last step, we explain how the policy intervention is included in the models and how it should be interpreted.

Basic Structural Model for Air Quality Data
According to their physical characteristics, air pollution concentrations time series are often characterized by seasonality, high persistence [30,31], strong right skewness with uni-modal distribution, and scale invariance [32]. Therefore, we analyze the concentrations using the basic structural model [33,34] augmented by deterministic regressors for weather conditions and socio-economic features.
BSM is defined as a simple unobservable components model composed by local linear trend (LLT), stochastic seasonality, and irregular (white noise) component. LLT describes both the temporal evolution of the series level and its slope, while the seasonal component aims to capture cyclical behaviors given by natural and anthropogenic phenomena. We modeled the seasonal component using a trigonometric form for daily data, hence with period s = 365, and considering only a few harmonics given the very regular and almost deterministic behavior of the series. This fact avoids the risk of a model over-parametrization.
Let {y 1 , y 2 , ..., y n } be the time series of the observed pollution concentrations in logarithmic scale, the state space form of BSM without regressors is composed by the following equations: where ε t ∼ N(0, σ 2 ε ) is the measurement error and Stochastic seasonality : where k ≤ s 2 is the number of included harmonics and γ j,t is the non-stationary stochastic cycle are white-noise processes with mean zero and variance σ 2 ω . Equation (1) is called measurement equation and describes the evolution of the observed series as the sum of the underlying components, while Equations (2)-(4) are named transition equations. Equations (2) and (3) compose the LLT and describe respectively the unobservable processes of the level and the slope, whereas Equation (4) describes the trigonometric seasonality evolution. Weather, socio-economic factors, and policy intervention will be included in the models adding a set of deterministic components to the measurement Equation (1). Since the BSM with Gaussian errors belongs to the class of Gaussian linear models, the estimation step has been performed using the Kalman Filter algorithm, an iterative procedure, which allows estimating simultaneously the unobservable components and the model's parameters by maximizing the Gaussian likelihood function.
When dealing with Gaussian linear state space models, the parameters estimated using a maximum likelihood (ML) approach inherit the asymptotic properties of ML estimators [29]. The distribution of the MLE is asymptotically approximated using a Gaussian distribution, which allows deriving the usual asymptotic confidence intervals and t-tests for significance. Assuming a significance level of 5%, the estimates are statistically significant if the standardized value lies outside of the interval [−1.96,1.96], obtained using the quantiles of a Standard Normal distribution. Moreover, since the dependent variable is expressed in logarithmic scale, the coefficients have to be interpreted as relative increases or decreases in concentration levels due to a unitary increase in the explanatory variable.

Three-Step Model Selection
We now propose a three-step procedure for model selection, which considers multiple rules based on cross-validation, information criteria, and stepwise regression. To avoid estimation bias due to the policy introduction, all the steps are computed using only the observations before the introduction of Area B that is, from 1 January 2014 to 24 February 2019.
Step 1 is designed for selecting the most predictive seasonal component, defined in Equation (4), comparing different model specifications, which consider a varying number of harmonics k for the trigonometric function. Specifically, we fit 10 alternative models for each station: in each of them, the trigonometric seasonality is modeled by an increasing number of harmonics ranging from k = 1 to k = 10. The use of an increasing number of sinusoids, in our case up to 10, allows the modeling of complex seasonality with strong variations within short periods, but at the same time increases the model complexity.
Once the seasonal component has been selected, step 2 introduces in Equation (1) a counter-factual term x t able to capture weather and socio-economic factors common to the Po basin and affecting the air quality of Milan. In our approach, the counter-factual candidates are the time series introduced in Section 2.1.3 and which refer to the measurements of pollutant concentrations in seven important cities around Milan. The new measurement Equation can be written as follows: where x t is the logarithm of the counter-factual time series and θ is its coefficient, µ t follows Equations (2) and (3), and γ t follows the specification obtained by step 1. The expected sign of θ is positive: higher levels of air pollution should correspond to high values in nearby cities due to similar conditions. In step 3, we identify the best subset of calendar events and weather covariates, capturing residual variations not yet covered by the counter-factual or by the latent components. These residual variations are estimated by the smoothed observation disturbances from Equation (6) that isε t , and describe residual patterns that have not been explained by the persistence of series, the seasonality or characteristics common to nearby territories of the region.
Relevant weather and calendar covariates are selected through a backward-forward stepwise regression algorithm, which uses as a starting model the auxiliary linear regression expressed in Equation (7). The equation represents the full model which sets up the smoothed observation errorsε t as dependent variable and the weather conditions and calendar events as covariates: ε t = τ 1 Holidays + τ 2 WeekEnd + τ 3 Saturday : Holidays + τ 4 Sunday : Holidays + τ 5 Temperature + τ 6 Rain f all + τ 7 Radiation + τ 8 Humidity The stepwise regression is set up twice for each station: in one case, it selects the model according to the Akaike's Information Criterion (AIC), while, in the other, it uses the Bayesian Information Criterion (BIC). The algorithm starts estimating the full model and computes the AIC or the BIC. Iteratively, it drops out the predictors one at a time; at each step, it computes the new information criterion and considers whether the criterion is improved by adding back in a variable removed at a previous step. The procedure ends when the reintroduction of each omitted variable does not improve the information criteria.
In the first two steps, we select the seasonal component and the counter-factual term by fitting and comparing alternative models based on Equations from (1) to (4) according to their predictive power and their ability to adapt adequately to the observed data. The first principle, which tests the predictive power of the models, relies on the minimization of the cross-validated mean square forecasting error (MSFE) evaluated for up to 10-step-ahead forecast horizon, that is,ŷ t+h ∀ h = 1, 2, ..., 10, while the second compares the models in terms of estimation quality. The latter computes both corrected Akaike's Information Criteria (AICc) and BIC intending to select the model that minimizes both. To identify a unique model for all the stations located in Milan, we proceed to a global comparison, both graphical and analytical, of the two blocks of indicators, giving attention to the overall performances and not focusing only on individual outputs.
According to the cross-validation principle for time series [35,36], we split the full time series into two subsets: a training set for model estimation and a test set for evaluating the out-of-sample forecast performances. The training set includes all the measurements until 24 February 2018, while the test set contains observations relative to the sub-period 25 February 2018-24 February 2019, for a total count of 365 out-of-sample observations. The exclusion of observations after the start of the traffic restrictions makes it possible to obtain unbiased estimates of the policy effects avoiding overlapping with other unidentified factors. Before starting the iterative loop, the model to evaluate is estimated just on time using the observations included in the original training set. At the end of the estimation, the cross-validation algorithm is iteratively implemented as follows. For each iteration, the algorithm extracts the first ten observations available in the test set, generating a forecasting set, and computes three quantities: the 1-to-10 step-ahead forecasts that isŷ t+h ∀ h = 1, ..., 10, the forecast errorŝ y t+h − y t+h and the quadratic forecast errors (ŷ t+h − y t+h ) 2 . The first out-of-sample observation is discarded and the set of forecasting observations is updated right-shifting the forecast horizon by 1 unit and adding the new observation. These operations are repeated for a number of times equal to the length of the test set, in our cases 365 times. The algorithm returns the output of 365 different sequences of 1-10 step-ahead forecasts; for each step-ahead h = 1, 2, ..., 10, the MSFE is calculated as

Policy Intervention Analysis
The introduction of new rules or limitations to individual behaviors can lead to the co-existence of multiple effects with different structure, such as simultaneous immediate changes and adaptive changes that take a long time before visible effects occur. Take into consideration that this fact leads to implement intervention analysis, which includes both permanent and transitory effects. Further details and examples of ARMA-like transfer function applied to intervention analysis are available in Pelagatti ([29]).
The policy intervention is modeled through the combination of two individual effects: (1) a permanent effect, estimated by δ 1 that measures the level shift of pollutant concentrations given by the treatment and modeled as a step dummy, which is D 1t , which assumes a value equal to 1 starting from 25 February 2019; (2) a transitory effect, estimated by δ 0 and evolving according to a first-order difference dynamics of the type w t = λw t−1 + δ 0 D 2,t , where D 2,t is a impulse dummy, which assumes value equal 1 for 25 February 2019 and 0 otherwise and λ measures the persistence of the transitory effect. The sum of the two effects returns the total effect, which expresses the estimated overall reduction or increase in air pollutant levels generated by the policy. The measurement equation after the three-step model selection and augmented by the policy intervention is eventually expressed as follows: where y t is the logarithm of pollution concentrations in one of the stations in Milan, x t is the logarithm of pollution concentrations in the optimal counter-factual station, µ t is the LLT evolving according to Equations (2) and (3), γ t is the optimal seasonal component selected in step one, Z t is a matrix containing the set of optimal subset of weather and calendar covariates selected in step 3, and Φ is the associate vector of coefficients.

Software
All the statistical computations and figures have been carried out using the statistical software R [37]. For state space models estimation, the KFAS package [38] was used. Cross-validation, forecasting, and model selection codes have been developed by the authors. The graphic elaborations were obtained by using the packages ggplot2 [39] and sf [40].

Results
In this section, we present and comment on the empirical results relating both to the differences between pre-and-post policy averages and medians and to the policy intervention analysis for the Milan Area B case study. Section 3.1 shows the variations in concentration levels of NO x and NO 2 for all the stations installed in Milan and for the other seven cities around it. Section 3.2 presents the model selection results, the values of the selection criteria, and the final model specifications. Section 3.3 reports the empirical estimates of the policy effects obtained through the basic structural model augmented by the policy intervention.

Average and Median Differences
Empirical differences of concentrations levels for all the considered stations are presented in Table 1. For both nitrogen oxides and nitrogen dioxide, using the difference of mean and median respectively, it reports the estimates of the difference between the average (the median) concentrations for the year 2019 and the average (the median) concentrations of the same period for the years 2014-2018. The estimates highlight large negative differences in oxides concentrations between 2019 and the period 2014-2018, both in the metropolitan area of Milan and in almost all the surrounding towns. Particularly heavy reductions, and similar to those in Milan, were recorded in the cities of Bergamo (East) and Pavia (South). The simultaneous abatement inside and outside Milan confirms the presence of a general decreasing trend in the aggregate levels of pollutants for the Lombardy Po basin as already indicated by the previous figures.
The differences registered in Milan are relevant both in suburban districts, such as the stations Liguria (West) and Marche (North), and in the historical centre at the Senato station. For those monitoring stations, the reductions are larger than 16 µg/m 3 for NO x and 10 µg/m 3 for NO 2 . In general, the differences between the averages and between the medians are quite similar, but in many stations, the reductions for the medians are stronger than the average differences. This fact is related to the skewed and non-symmetric characteristics of the distribution involved, as shown also in Figure 4. The above considerations on average and median pollution abatement are valid for both pollutants, in fact, the stations where the greatest differences are recorded for nitrogen oxides are the same for nitrogen dioxide.
These preliminary results do not allow for identifying the causes of the reductions and to state if they depend on common causes related to the environment and climatic factors or if they have been generated by the introduction of the new policy in Milan. The next section will attempt to investigate the variations through the modeling of possible environmental and anthropogenic factors able to influence the air quality of the city.

Step 1: Detection of the Seasonal Components
Results relative to the first step of model selection are summarized in Figures 5 and 6, which show the evaluation criteria for all the stations. For each station, the plots are organized in paired-panels; the left panel represents the 10-steps-ahead MSFE as a function of the forecast horizon and the number of harmonics modeling the seasonality (scale colour); the right panel shows the AICc-BIC pairs for each model. The optimal number of harmonics to model the seasonality is identified as the one that evaluates the minimum pair of AICc and BIC and returns the lowest MSFE curve. Both the estimates for NO x and NO 2 for the city of Milan agree unanimously in the selection of the model in which the seasonality is composed by a single harmonic (k = 1); therefore, it can be rewritten as where γ 1,t is Step 1: Cross−Validation seasonal harmonics for nitrogen oxides

Step 2: Detection of the Counter-Factual Component
After selecting the seasonal component, we proceed to the selection of the counter-factual term. Estimates are summarized in Figures 7 and 8, which show the results for NO x and NO 2 . The plots are graphically organized like those related to step one, with the difference that the MSFEs and the AICc-BIC pairs are functions of one of the seven counter-factual candidates instead of the number of harmonics. The selection criteria follow the same rules used for the previous step.
The search for the optimal counter-factual term requires greater attention and detail than in the previous step as the minimizers are not unique. According to the plots, there is a restricted set of stations that are good candidates for the counter-factual role. The set includes the following cities: Treviglio, Pavia, Saronno, and Cremona. In particular, Pavia's station achieves one of the best forecast and fitting performances for almost all the stations in Area B for both NO 2 and NO x . Based on this last consideration, we select as the counter-factual term for future models the air quality monitoring station of Pavia, located South to Milan. Therefore, the final specification of the basic structural model will include as counter-factual term the logarithm of the concentrations in Pavia, x t = log(Pavia t ).

Step 3: Detection of the Weather and Calendar Factors
The last step of model selection aims to select the optimal subset of local weather and calendar regressors after having selected the optimal unobservable components, common weather, and socio-economic factors, captured by the counter-factual. For each station and pollutant, the best models are reported in Tables 2 and 3.  As expected, BIC-based models, being more parsimonious, retain fewer covariates than AIC-based models. Following this fact, we will use the BIC-selected models, but we now discuss some details about the selection process. Concerning the calendar, both criteria include in almost all cases the holidays, week-end, and Sunday holidays effects. AIC suggests adding also the interaction term between Sunday and holidays. Even if the interaction between Saturday and holidays is not always included, the final model will take into account the full set of calendar events and their interactions.
Regarding weather covariates, except for the wind speed, none of the others is included within the final models. Winds blowing from Southwest (Q SW ) are always selected and those coming from Northwest (Q NW ) are often included, and hence kept in the final model. Moreover, temperature, rainfall, solar radiation, and humidity are considered only by the AIC. This fact can be explained by the presence of the counter-factual, which captures not only common characteristics in terms of human behaviour and air quality conditions but also homogeneous climatic conditions common to all the areas considered.

Final Model Specification
Based on the results of the three-step model selection procedure, the final specification of the BSM augmented by the policy intervention can be expressed using the following model: Slope Transitory policy : where y t is the logarithm of pollution concentrations in one of the stations located in Milan and x t = log(Pavia t ) is the logarithm of pollution concentrations in Pavia.

Basic Structural Model and Policy Intervention
In this section, we show the numerical results obtained using state space modeling to estimate both permanent and transitory effects generated by the introduction of Area B controlling for local weather conditions, anthropogenic effects, and common areal trends.
The maximum likelihood estimates of the coefficients and the components' variances for the five air quality monitoring stations installed in Milan are reported in Tables 4 and 5. The results appear to be coherent both for nitrogen oxides and nitrogen dioxide. First, the models identify statistically significant and positive coefficients for the counter-factual term, highlighting its capability to capture socio-economic and climatic factors common to neighbouring areas and coherent with the expected sign. Second, weekends and holidays exert a negative effect on concentration levels probably due to the reduction in the movements and productive activities of the city in those days. Their interactions are almost everywhere not statistically significant but with a positive sign and always less than the sum of the individual effects of the weekend and holidays. This fact underlines how the holiday weekends enjoy more contained effects of emission reductions compared to generic weekends of the year. Third, as to be expected, winds blowing from the West (Q SW and Q NW ) greatly reduce the amount of pollutants all over the city with peaks over 40% to 50%.  The short-term impacts adjusted for common anthropic and weather factors are summarized in Table 6, which shows the estimated permanent and transitory effects for each station in Milan expressed in logarithmic scale, hence interpretable as relative variations in concentrations levels. None of these two coefficients identifies an improvement of the considered pollutant concentrations. Moreover, the permanent effect (δ 1 ) is always positive and in some cases moderately statistically significant. This means that, compared to the generally decreasing areal trend, Milan air quality went worst. It is worth observing that the most significant results are obtained at the Senato station, which is located in the already existing Area C, hence already subject to some car traffic restrictions.
Such a result could be justified by the presence of multiple causes. As a first justification, we are approaching the initial phase of a progressive policy and the time elapsed since its outset may be too short to assess any significant impacts on pollutant levels. This fact is consistent with the forecasts expected by the municipality of Milan about the reductions in nitrogen oxide levels; in fact, the first significant reductions should be observed starting from 2022 [11]. Furthermore, since we are dealing with limitations to human behavior and social perception of new norms, it is not always clear how agents adapt to changes. The deterioration in the air quality of the centre could be linked to new traffic congestions in that area or to a panic shock of drivers, who need time to understand the functioning of the restrictions and adapt their behavior, exactly as in situations mismanagement of individual and organizational changes [41,42]. Eventually, the recent climate changes and the extreme weather conditions that affected the Po valley, such as temperatures higher than the seasonal average and extreme atmospheric events, could increase the noise present in the data and thus mask the real repercussions of the limitations.

Conclusions
This paper analyzed the early-stage effects on air quality of the new traffic policy in Milan, the so-called Area B. The concentrations of nitrogen oxides (NO x ) and nitrogen dioxide (NO 2 ), which are mainly primary pollutants, have been considered as proxies of pollution emissions.
The first hypothesis in the introduction inquires about the presence of a significant effect on the air quality of the city. As a first point, the preliminary results show that concentrations during spring and summer 2019 are lower than during the same seasons in the previous five years, hinting for a reduction effect due to the policy. On the other side, a similar reducing trend has been observed in various neighbouring cities around Milan, which belong to a homogeneous meteorological, social, and economical cluster within the Po valley. Their similar behavior is used here as an areal common trend capturing both weather and anthropogenic components. Our approach, which adjusts for local weather conditions and the areal common trend, does not provide a further reduction effect for any station comparing to this trend. Instead, in Senato station, which is inside the historical city centre and was already covered by Area C, the estimates provide a strong, but moderately statistically significant, increase for both considered pollutants. This is coherent with the fact that the restriction introduced is very limited as it concerns just some classes of old vehicles, which are a small percentage of the entire vehicle pool, both in terms of number of cars and emissions.
Since the first research hypothesis is confirmed just to a minor extent and with an opposite sign with respect to what was expected, the second research hypothesis, concerning the homogeneity of the possible effects, assumes now only a technical scope. It is confirmed just for what concerns the positive direction of the changes, but not for their significance. In fact, among all the estimated permanent effects, only Senato station is significant at 5%. Moreover, the estimated transitory effects are always not significant at any confidence level.
The above facts hint that, compared to the common trend of the considered area, Milan air quality is improving slowly, and, in this sense, the first phase of Area B seems to have a negative effect on air quality. Due to the limited scope of this first phase and its progressiveness, it is not unexpected to find a limited or a zero effect. Nonetheless, the negative effect needs some more explanations.
Although finding the ultimate motivation for this is not the aim of this paper, a discussion follows. Firstly, the statistically significant increase found is limited in space and is located inside the previously introduced restricted Area C. It may be possible that this further restriction increased congestion of public transport buses, which are often very old vehicles, or to the aforementioned adaptation shocks. This could explain only a part of the results. In fact, this first point is also related to the other sources of nitrogen oxides. According to INEMAR [25], road traffic is about 68% of the total emissions. Hence, a transition to house heating green techniques slower in Milan comparing the other considered cities could have an influence on this result. Moreover, also the other stations experienced a comparative deterioration of air quality and the second-worst station is Città Studi, which is an urban background station, hence with limited relation to local traffic congestion. Second, the increase due to road traffic may have temporal dynamics. Since the traffic restriction is limited to business hours, there may be an increase in congestion early in the morning and in the late evening, affecting the daily average.
In conclusion, although environmental protection policies are in general a fundamental step for sustainability improvement, in some cases, they may not be sufficient or their implementation may be misleading. In our case, we considered only the early-stage of a policy, which is progressive in time. Hence, the results of this paper may be regarded as physiological, provided that they characterize only the initial part of the policy implementation and are improved soon. It follows a recommendation to the municipal government to develop the policy more strongly.
Additional research could be developed in the future. In particular, the effect on traffic congestion inside Area C could be investigated further using historical data related to the vehicle movements crossing the access points. Moreover, the use of a multivariate approach, which includes other pollutants such as PM 10 and PM 2.5 , and spatio-temporal modeling could highlight hidden effects, which are not visible considering the single stations. Eventually, the extension to hourly data could consider both the presence of intra-daily effects and explaining the spatial dynamics related to traffic.