Modeling of Future Extreme Storm Surges at the NW Mediterranean Coast (Spain)

: Storm surges are one of the main drivers for extreme ﬂooding at the coastal areas. Such events can be characterized with the maximum level in an extreme storm surge event (surge peak), as well as the duration of the event. Surge projections come from a barotropic model for the 1950–2100 period, under a severe climate change scenario (RCP 8.5) at the northeastern Spanish coast. The relationship of extreme storm surges to three large-scale climate patterns was assessed: North Atlantic Oscillation ( NAO ), East Atlantic Pattern ( EAWR ), and Scandinavian Pattern ( SC ). The statistical model was built using two different strategies. In Strategy #1, the joint probability density was characterized by a moving-average series of stationary Archimedean copula, whereas in Strategy #2, the joint probability density was characterized by a non-stationary probit copula. The parameters of the marginal distribution and the copula were deﬁned with generalized additive models. The analysis showed that the mean values of surge peak and event duration were constant and were independent of the proposed climate patterns. However, the values of NAO and SC inﬂuenced the threshold and the storminess of extreme events. According to Strategy #1, the variance of the surge peak and event duration increased with a fast shift of negative SC and a positive NAO , respectively. Alternatively, Strategy #2 showed that the variance of the surge peak increased with a positive EAWR . Both strategies coincided in that the joint dependence of the maximum surge level and the duration of extreme surges ranged from low to medium degree. Its mean value was stationary, and its variability was linked to the geographical location. Finally, Strategy #2 helped determine that this dependence increased with negative NAO .


Introduction
Human occupation of low-lying coastal regions creates large areas of concentrated human activity, where the population and their property are exposed to coastal flooding. A relevant part of such flooding is due to extreme surges, whose consequences may unleash casualties and infrastructure damage and disrupt essential services [1,2]. Particularly dangerous situations occur when extreme surge levels are combined with ocean waves and a high tide, then resulting in overtopping and breaching of sea defenses [3][4][5]. Low-lying coasts are highly vulnerable areas, in which surges not only drive flooding, but also erosion [6,7]. Extreme surge events are thus an essential driver to be considered in early warning systems and coastal defense planning [8][9][10]. A proper characterization of the extreme surge events can lead to an efficient planning of coastal areas.
The quantification of the effect of climate patterns on extreme surge levels [11,12] is becoming a growing need since large-scale climate patterns represent key modes on the forcings that produce flooding. Although it has been suggested [13] that surge levels would not change in the future or could even slightly decrease [14,15] along the Catalan coast (northwestern Mediterranean Sea), how they will be affected by different climate patterns and the nature of these interactions is still unclear. Among all the climate patterns identified in the Mediterranean Sea [16], the North Atlantic Oscillation (NAO; Hurrell and Deser [17]), the East Atlantic West Russian Pattern (EAWR; Barnston and Livezey [16]), and the Scandinavian Pattern (SC) are some of the most frequently referred to ones for this region [18][19][20][21][22][23].
In a similar way to wave heights [24], surges can be influenced by N AO [25,26]. Significant correlation between N AO and extreme sea levels has already been assessed at the North Atlantic Ocean and Mediterranean Sea [27][28][29][30]. However, not as much is known on the effects of EAWR and SC. A relevant question could be how these teleconnection patterns influence both the intensity and duration of storm surges.
The influence of climate dynamics on extremes [12,[31][32][33] suggests that extreme surge components should be characterized with non-stationary statistical models, which allow moving trends. Further, a single non-stationary statistical model can include the whole sample of extreme events, whereas stationary models require slicing the sample to address shifting trends. Therefore, non-stationary models perform better than stationary models when characterizing a scarce sample of extreme events.
Additionally, maximum surge levels and surge duration may show correlation [34]. Extreme surge intensity at the Mediterranean can be driven by synoptic conditions that may last for days [35]. Hence, the independence assumption between these variables rarely holds valid. A common method for including dependence among marginal probability distribution functions is with copulas. They have been used extensively in recent works [36][37][38][39]. However, at stationary copulas, the association parameter is kept constant. A plausible hypothesis is that such association parameters may depend on large-scale climate indices. This hypothesis implies that a statistical model for storm surges under changing climates should have non-constant parameters both at the marginals and the copula functions.
This paper aims to characterize storm surge projections at the NW Mediterranean Sea under an RCP 8.5 scenario. Such analysis comprise: (1) fitting non-stationary probability distributions of both the extreme surge levels and their duration at the Catalan coast (northwestern Mediterranean Sea, Figure 1), (2) characterizing the joint probability distribution density of these two variables, under a non-stationary assumption, and (3) establishing the relationship of the frequency of occurrence of extreme surge events, the univariate probability distribution functions, and the joint probability density to N AO, EAWR, SC, and their first two time-derivatives. The fitting sample comes from a barotropic model that spans the whole Mediterranean Sea that uses as forcings sea level pressure and wind fields under an RCP 8.5 scenario. The paper is organized into six sections. Following this Introduction, Section 2 describes the study area. Section 3 describes the theoretical background and the methodology. Section 4 lists the results. Section 5 discusses them, and Section 6 provides the main conclusion.

Study Area
The Mediterranean Sea (see Figure 1) is a semi-enclosed basin, delimited by the European, the Asian, and the African continents. It has a narrow connection to the Atlantic Ocean (Gibraltar Strait), as well as access to the Black Sea. This study considers the Catalan coast, which is located in the northwestern Mediterranean sector. The sea bottom of the northwestern Mediterranean, delimited by the coast of the Iberian Peninsula and the longitude 4 • E, has a rather mild slope [28]. The continental shelf slope at the southern area is wider than at the northern part. Hence, bottom friction has a more relevant role in the South than at the North.
The Catalan shoreline is oriented in a northeast to southwest manner. Hence, it is heavily exposed to southeastern winds. Eastern winds, which also contribute to extreme surges, are frequent during the summer, triggered by an intense high-pressure area on the British Islands [1]. Along this coast, the level of intense surges (here given by the mean of the three highest annual independent events) is slightly above 20 cm, which is much smaller than the highest values in the Mediterranean Sea (in the north Adriatic Sea, they exceed 60 cm), but is representative of typical values along most of the Mediterranean coastline [14,28].
The N AO teleconnection pattern has a dipole structure, with its negative center over Greenland and a positive center over the Atlantic, whose position shifts seasonally from Europe to North America [16]. The EAWR has its negative center above the northeastern Atlantic and presents a southeastward gradient leading to a positive anomaly band that extends from the Atlantic to the Mediterranean Sea [16]. The SC pattern has a three-lobe structure with a negative center above Scandinavia and positive centers above Western Europe (or just offshore of the Atlantic coast) and Central Asia. EAWR is characterized by a high pressure located over most of Central Europe and bounded by two low pressure centers, over West Russia and the mid-Atlantic. Here, the signs of all anomalies refer to the positive phases and the 500 hPa geopotential maps, where maxima (positive centers) and minima (negative centers) correspond to anticyclonic and cyclonic circulation, respectively [16]. All signs are reversed during the negative phase of these patterns. As an example, it can be seen that the position and speed of the Atlantic jet stream is linked to the phase of EAWR and N AO, so that positive N AO corresponds to a northward shift of the jet stream and positive phase of both patterns to its large speed [40].
Both wave climate [19,21,24] and surge levels are affected by climate patterns [27]. Teleconnection patterns have a different effect on waves in the various areas of the Mediterranean Sea, depending also on season [19,23]. Positive EAWR implies lower than average significant wave height (SWH), particularly in the eastern Mediterranean, with a particularly strong effect from January to March. In winter, a positive N AO corresponds to a situation of reduced westerly surface wind and SWH in the northern part of the basin. A positive phase of EAWR implies a large reduction of the northwesterly winds in the western Mediterranean, with a reduction of the SWH of the corresponding wave systems traveling across the basin [23]. Furthermore, a positive velocity of SC leads to an increased total energy of the extreme event [21].
The effect of the wind is not disjoint from that of the teleconnections. N AO, EAWR, and SC affect the winds and the sea level pressure [23]. The overall interaction of N AO, EAWR, and SC also has a relevant role on surges, with negative N AO leading to an increase in the number of extreme events and also in their intensity [27].

Theoretical Background and Methods
This study consists of a two-step methodology (see Figure 2): (i) a process-based model that provides data on surge; (ii) a statistical model that characterizes the surge extreme events using the data produced by the process-based model. Two strategies for modeling the dependence between the surge components were tested: Strategy #1 (Bivariate Copula Model (BCM)) was related to techniques proposed by [41], and Strategy #2 (non-stationary BCM) was a modified version of Strategy #1, but allowing non-stationarity in the copula parameter. The study was completed with (iii) a validation of the statistical model. The details of the methodology are described in the following sub-sections.

The Process-Based Model
In the process-based model, dynamical regional projections of storm surge levels (see surge points in Figure 2) were obtained from a deterministic approach, based on the underlying physics. The analysis was performed considering projections of surges at the Catalan coast under a Representative Concentration Pathway 8.5 (RCP8.5) climate change scenario. This scenario considers a CO 2 concentration in the atmosphere close to 1250 ppm in 2100, which is double that of any other scenario in the Fifth Assessment Report [42]. The modeling chain started with the Centro Euro-Mediterraneo sui Cambiamenti Climatici Climate Model (CMCC-CM) [43] Global Circulation Model (GCM). Its latitude and longitude grid sizes were 0.7484 • and 0.75 • , respectively.
N AO, EAWR, and SC indices were computed from the CMCC-CM GCM monthly-averaged sea level pressure fields. Each climate index was based on the difference of normalized sea level pressure at two specific places. For instance, N AO was estimated from such a difference between Lisbon (Portugal) and Reykjavik (Iceland). They were also scaled to have a mean value equal to zero and a variance equal to unity. In order to avoid sudden events that would hinder interpretation, the time series of climate indices were filtered with a second order low-pass Butterworth filter [44], whose low-pass period was 10 years.
The GCM provides boundary conditions for the Regional Circulation Model (RCM) called COSMO-CLM [45]. The COSMO-CLM grid, which has a resolution of 0.125 • × 0.125 • , spans the whole Mediterranean region. The next step consisted of applying a 2D horizontal barotropic model, called Hydrostatic Padua Surface Elevation (HYPSE) [46], which was fed with the COSMO-CLM atmospheric pressure and wind fields. The HYPSE and the RCM share the same domain and spatial resolution. The simulation spanned from 1950 to 2100, providing time series for surge with a frequency of 3 h. This process-based model only considered the short-term contribution to sea level variability. Hence, time series were preprocessed using a high-pass filter with a cutoff frequency of 1/30 days, in order to cancel out the long term oscillation of the sea level. For the analysis, these time series, as well as the values of the climate indices were interpolated to the hour.
Two variables were selected to describe an extreme surge event ( Figure 3): maximum surge level (S p ) and duration of the event (D). D took positive real values; consequently, it was log-transformed for statistical modeling, to avoid scale effects [47] and to be successfully back-transformed to real values.

Statistical Model
The data obtained from the process-based method could be characterized statistically ( Figure 3). Strategy #1 provided a first approximation, with the help of a concatenation of stationary joint probability structures of S p and D. Then, Strategy #2 was employed, introducing a full non-stationarity of the joint probability structure, estimation of the marginal probability structures of each variable at the same time as the joint probability structure, and the link of the whole joint probability distribution function to the selected climate indices.

Commonalities of the two strategies
In both cases, the non-stationarity of the marginal distribution functions was addressed with a Vectorial Generalized Additive Model (VGAM; Yee and Wild [48]). A VGAM allows a wide range of parametric probability density functions. The parameters related to the storminess, the threshold (h 0 ), and the frequency of extreme events followed a Poisson process (λ), whereas the surge components were modeled with a Generalized Pareto Distribution (GPD). A VGAM consisted of a linear function [49,50]: where η i(j) is the j th dependent variable and x i is the i th independent variable that generates η i . η i is the sum of smooth functions of the individual covariates β * 1(j) and f p(j) . Additive models did all the smoothing in R, avoiding the large bias introduced in defining areas in R n . The mathematical assumptions for regression models reflected the degree of confidence of this type of model. These assumptions were: (1) non-correlation, (2) normality, and (3) homoscedasticity of residuals. Assumption 1 was assessed with a autocorrelation function plot. Assumption 2 could be checked with a quantile-quantile plot against an N 0, σ 2 distribution, where the sample standard deviation was used as σ 2 . Assumption 3 could be analyzed on a scatter plot that contraposed fitted values to residuals.
It was assumed that the relationship between the threshold h 0 and any of the following selected factor would follow a Laplace distribution. In order to ensure event independence [34], the minimum time interval between extreme surge events was set to be 72h. A sensibility analysis was carried out on two possible thresholds: 90th and 95th quantiles of the modeled surge series.
The frequency λ was a mean value of a counting variable. Therefore, it was a counting variable, as well. In such a case, a Vectorial Generalized Linear Model (VGLM) could be adopted for its estimation [48]. The VGLM was a particular example of VGAM. The relationship of λ to a factor such as a climate index could be approximated by a Poisson distribution [51,52].
The definition of a GPD was as follows. Y = X − x 0 was the excess of a magnitude X over a location parameter x 0 , conditioned to X > x 0 ; the support of Y was 0 , y sup [53]. y sup was the upper bound of the support. The GPD cumulative probability distribution function was, then, where β ≥ 0 is the scale parameter and ξ ∈ R is the shape parameter. This β should not be mistaken by a covariate of the VGAM. The non-stationary GPD location parameter x 0 was estimated through quantile regression [54]. The quantile regression was a specific type of VGAM, and it estimated the 100τ% conditional quantile , and was a roughness coefficient that controlled the trade-off between quality of fit to the data and roughness of the regression function; and R was a roughness penalty [55,56]. Regarding the rest of the GPD parameters: the shape parameter ξ was assumed to remain constant.
The location and scale parameters could be related to the mean and variance of the population through the expressions: The approximation in Equation (3) remained valid if the orders of magnitude of x 0 and β were similar.
N AO, EAWR, SC, and their first two time-derivatives were eligible climate indices as covariates for the VGAM [21,52] to predict the threshold h 0 , the frequency λ, and the GPD parameters. Using time-derivatives of climate patterns presented an advantage: time-derivatives were associated with variability in the sharp time gradients of these patterns. Up to two climate indices could be used to predict h 0 , λ, and x 0 . If a climate index had clearly much more influence on these parameters, then only this single climate index was used, for the sake of simplicity. A linear interpolation function was selected to relate the covariates to the predictands. The first and second time-derivatives could be written herein as d (·) and d 2 (·), respectively. For example, the first and the second time-derivatives of N AO could be written as dN AO and d 2 N AO, respectively.
The fitting criteria were to minimize the the Akaike Information Criterion (AIC; Akaike [57]) and the Bayesian Information Criterion (BIC; Tamura et al. [58]). Several combinations of large-scale indices were tested, and those ones with the minimum AIC were chosen. This criterion meant that the selected combination explained more information about the sample than those ones with higher AIC. Additionally, in order to avoid overfitting, the AIC penalized the combination of several large-scale indices (i.e., as the number of covariates grew, the AIC increased). Note that in the figures from Figures 8-13, the discontinuous red line marks the selected large-scale index combination. The degree of influence of the climate indices was quantified in the following manner: Equation (1) can be written as: for x i2 = 0, both sides of the equation could be divided by x i2 and be represented as: where C r is the first coefficient of this polynomial expression, and it can be compared to β 1(j) /x i2 (called relative order or O rel ). In its turn, O rel was herein approximated by the ratio between the mean of the variable and the order of magnitude of the climate index, which was one.
In both strategies, the joint dependence density among several variables could be modeled by a multi-dimensional Archimedean copula. A d-dimensional Archimedean copula has the form: for a given generator function G [59][60][61]. A Gumbel generator could define the dependence in the upper tail of the probability distribution [62]. The fitting criteria was the maximum likelihood method. The dependence parameter θ could be transformed to Kendall's τ for easing interpretation [63,64].
τ specified the independence when τ = 0 and total dependence when |τ| → 1. The τ of Kendall was a different concept from theτ in the quantile regression.

Differences of the Two Strategies for Modeling Dependence
In Strategy #1, the scale parameter β of the non-stationary GPD was estimated by a VGLM. It was predicted using a series of co-variates, which, in this case, were the selected climate indices. A linear combination of up to two climate indices was established in Strategy #1 to predict β.
In this strategy as well, the joint probability density was formed by the concatenation of a series of stationary Archimedean copulas. Constant Kendall's τs were calculated for time-slices of 15 years that ranged from 1950 to 2100. Stationarity was assumed for each time-slice, which overlapped with the previous and latter time-slices by half a year. Such time-slicing for stationary probabilistic models is usual in the state-of-the-art [13]. Due to the persistence of the chosen climate indices, the hypothesis of stationarity was fulfilled within these time-slices. Once all these stationary τ were calculated for each time-slice, they constituted a time series of a non-stationary τ. Fifteen years was the minimum time-span that provided a sufficient number of extreme surge events to determine the Archimedean copula density [65,66]. Time-slices shorter than 15 years did not have enough extreme event samples, and the statistical model may lead to sampling bias.
Strategy #2 estimated the parameter β of GPD and a non-stationary joint probability density through a bivariate probit model [41]. Whereas in Strategy #1, the copula was not related to the climate indices, in Strategy #2, both the parameter β of GPD and the parameter θ of the copula were each related to the climate indices. Only one climate index was used as a covariate in each one of these cases, in order to avoid over-complicating the probit model. The parameters of the probit model have the expression: In this equation, 1 n is an n-dimensional vector made up of one, the k th element β * k is the vector (1) (not related to the parameter β of GPD), and the design matrix

Validation of the Non-Stationary Statistical Model
The projections of storm surges showed the general trend of the extreme storm surge events. Climate projections aimed to reproduce climate due to a set of constraints (for instance, a given CO 2 concentration). However, they were different from other products such as reanalysis. In these cases, as proposed in [21], a plausible option was to compare empirical Probability Distribution Functions (PDF) from measurements and fitted parametric PDFs. When two sets of PDFs shared similar probability weights at the same quantile discretization, a random sample from the fitted PDF had similar moments as the measurements (i.e., mean, variance, skewness). Therefore, the trend in their statistical characterization should be comparable to the trend in the observations.
The proximity of the locations of the tidal gauge and the model output was also relevant for the validation of the model. For instance, the model points at Badalona and Barcelona were near each other ( Figure 1). Therefore, the modeled probability distribution functions of S p and D of Badalona could be compared to the ones from the Barcelona tidal gauge (REDMAR; Sea Level Monitoring Network, Puertos del Estado).
The Kullback-Leibler divergence [67] and the Aitchison distance [68,69] could show the affinity of two time series of S p by comparing their probability distribution functions. A set of S p readings from gauges, during surge extreme events, is: and a set of model S p (written as S * p , just for this explanation) is: They can be combined to form a joint set: where vec obs corresponds to the observations and vec mod corresponds to the model. Each element of the vector was the summation of the probability distribution function in between two corresponding quantiles. If two probability distribution functions were similar, the function between two quantiles should be similar, so each pair of elements in vec obs and vec mod should be close in value. A Kullback-Leibler divergence Kullback [67] has the form: It measured the extra entropy of the probability distribution Q of the model, with respect to the probability distribution P of the observations. D KL ranged between zero and infinity. The more similar two probability distributions were, the smaller the divergence measure should be. Note that for any i, Q (i) = 0, it must imply P (i) = 0, in order to avoid indetermination. That is, the model should consider all the values that the observations showed. Furthermore, whenever P (i) = 0, the contribution of the i th term was null, as lim x→0 x log (x) = 0.
The Aitchison distance was also a useful tool to compare vec obs and vec mod . The partition of cumulative probabilities between quantiles ended up being the partition of a whole (the total cumulative probability) into the quantiles. This kind of data, called "compositional data", was more focused on the ratios between the compositions than on their absolute values [70,71]. The Aitchison distance dealt with compositional data. Its expression is: [68,72]: where x and y are the two compared vectors. d (x, y) takes values in the interval (0, ∞) ∈ R. If all the k th elements in the two vectors coincided, the distance would be zero. Note that the modules of P, Q, x, and y were particular expressions of the Kullback-Leibler divergence or the Aitchison distance and presented an order of magnitude of one. Therefore, the model resembled the observations when the measures calculated with Equations (9) and (10) were reasonably smaller than these modules.

Process-Based Model
This subsection addresses the results from HYPSE projections at a set of four specific points. These points were selected as representatives of the general behavior of the model at the study area. All four points presented a high variance in S p and D for the whole period, but they showed a constant moving average throughout time. For instance, the S p at Tossa (Figure 4) ranged from 0.14 to 0.43 m, with a mean equal to 0.21 m. The maximum modeled surges had the same order of magnitude as the tidal station measurements. Its D ranged between 25 and 530 h, with a mean equal to 68 h, in the same time period.  However, two spatial clusters can be stated: (i) a northern one from Llanca to Badalona and (ii) a southern one from Badalona to the Ebro Delta. As stated in Section 2, the Northern part of the Catalan Coast was more affected from sharp wind gradients than the South. The central part (Badalona, Figure 5) featured a transition area. At the north, wind stress from N-NW sectors (i.e., Tramontana and Mistral winds) drove surges. Then, S p at Llanca ( Figure 6) peaked above 0.25 m more frequently than at the other points. Whereas S p values above 0.3 m were uncommon at La Marquesa (15 storms in Figure 7), there were around 50 storms at Llanca and 30 storms at Tossa. Additionally, high S p events were grouped at specific decades, whilst other decades were relatively calmer, with S p below 0.25 m at Llanca. These results suggested that the variability of large-scale climate may have a role in the intensity and duration of the events. Storm duration was more homogeneous along the Catalan coast. However, events above 250 h were more frequent in the North than in the South: around 30 events were above this duration at Llanca and Tossa, whereas the number decreased to 20 events at La Marquesa. Note also that the exceptional storm events in the sample (high S p and D, as the one around 2075) affected the whole study area. They highlighted an upper bound of what may be possible. However, these rare events were highly uncertain, and they could not be representative of strong conclusions.   The results of the validation will be shown in the next subsection. The empirical PDF (from the measurements) and the fitted PDFs are compared with the method stated in Section 3.3.

Non-Stationary Statistical Models
This subsection handles the second step in the analysis, showing the results from the fitting of the non-stationary probability distributions. The climate index that presented the strongest influence on the threshold h 0 was, by a significant difference, SC (see Figure 8). Therefore, only this one climate index was selected, following the principle of parsimony. Its C r = 0.134 was comparable to the O rel of surge levels, which was 0.207.
The climate index that presented the strongest influence on the frequency parameter λ was SC. Its C r was 0.156, while λ in the Catalan coast was of the order of one to 10 storms/year. Hence, SC was related to a higher frequency λ.
The climate index combination that mostly affected the value of the location parameter x 0 of the GPD of S p was a combination of NAO and d 2 NAO (see Figure 9). Its C r = −0.001 was significantly smaller than its O rel = 0.207, exhibiting that a combination of N AO and d 2 N AO had a small effect on the mean of S p . The BIC indices presented analogous trends to the AIC indices. The climate pattern that mostly affected the value of the location parameter x 0 of D was SC (see Figure 9). Its C r = 0.038 was much smaller than its O rel = 4.187. Thus, the mean value of D was independent of the proposed climate patterns.    In Strategy #1, the climate indices that presented the strongest influence on the value of the scale parameter β of S p were a combination of dSC and d 2 SC (see Figure 10). Its C r = −0.85 was comparable to its O rel = 0.207. Thus, negative values of a combination of dSC and d 2 SC (mainly meaning an accelerating shift of negative SC) led to larger variance of S p . The combination of climate indices that presented the strongest influence on the value of the scale parameter β of D was N AO. Its C r = 0.16 was, comparable to O rel = 4.187. Therefore, positive values of N AO led to a larger variance of D. The modeled τ from the dependence structure proposed in Lin-Ye et al. [21] is shown in Figure 11. This plot presents discontinuities in time-slices that did not have enough extreme surge events to be fit by a stationary copula. These values of τ ranged between 0.1 and 0.37, thus showing low to moderate correlation.
In Strategy #2, the climate indices that showed the strongest influence on the scale parameters β of S p and D were EAWR and N AO, respectively (see Figures 12 and 13). C r for the β of S p was 0.120, while O rel = 0.207 (the same one as for S p in Strategy #1). C r was comparable to O rel , here. Then, a positive EAWR led to larger variance of S p . C r for D was 0.071, while O rel = 4.187 (the same one as for D in Strategy #1). C r was significantly smaller than O rel , here. Then, the variance of D was not affected by the proposed climate patterns.      The τ computed with the methodology described in Marra and Radice [41] is shown in Figure 14. It ranged between 0.1 and 0.26. There was the same phase at all model points, but the amplitudes depended on the point. The climate index that presented the greatest influence on the copula was N AO (see Figure 13). Its C r = −4.104 was greater than its O rel = 1.193. Therefore, a negative N AO led to greater dependence between S p and D.
The VGAM used to estimate all the parameters complied with the corresponding assumptions (see Section 3.2). Finally, the Barcelona tidal gauge was used for the validation. For the years 1993-2016, the Aitchison distance corresponding to S p was 1.14. The corresponding Kullback-Leibler divergence was 0.15. The module of S p was of the order of one, that meant that the PDFs of the measured and modeled had strong similarity. Then, the statistical characterization of projected S p resembled that of observed S p , in the years 1993-2016. Regarding the event duration, measured and modeled durations were at the same order of magnitude. On average, the measured surge was above 16 cm at around 275 h/year. The minimum storm surge modeled was around 24 h, and the most extraordinary events lasted for almost 500 h. Please note that the proposed definition was that the surge was above the modeled threshold, which ranged around 15 cm.

Discussion
HYPSE outputs represent the general behavior of storm surges at the Catalan cast. Previous research with different GCMs, scenarios, and hydrodynamic models have shown similar trends [13,15]. Modeling surges at the NW Mediterranean requires atmospheric forcings with a relative high horizontal resolution. Hence, GCMs' usual resolution is insufficient for handling the orographic effects presents in the study Area [73]. In this contribution, the COSMO-CLM model was needed for downscaling the GCM forcings. Otherwise, common extreme wind regimes such as Mistral or Tramontana could not be properly assessed. Extreme winds have a prominent role in the local surge due to wind setup [34,35].
The sensibility analysis on the threshold h 0 showed that a h 0 equal to the 95th quantile would lead to similar values of D as using the 90th quantile, whereas there would be an insufficient amount of samples to calculate the joint probability density. Therefore, the 90th quantile of S p can be considered an adequate threshold (h 0 ).
The threshold h 0 and the frequency λ are directly linked to SC. Usually, if the h 0 rises, λ should decrease, but it seems that SC first leads to a higher λ, and then, this value is not reduced sufficiently by a rising h 0 . It can be observed that N AO also has a great influence on λ (see Figure 8). It is similar to the Gulf of Lions, where a higher λ is reported for negative N AO [74].
S p and D presented significant variance, but had a constant running mean in the years 1950-2100 (see Figure 4). This is reflected by the results in Table 1, which shows low dependence on the selected climate patterns for both S p and D statistical means. Another result for D is that it can take large values of approximately 20 days. This might be caused by the fact that the selected threshold h 0 , set low to avoid excluding an excessive amount of extreme surge events, had the limitation that on rare occasions, it could be too permissive in the selection of extreme surge events. From Figure 4, it seems that the running variance of S p and D would be larger from 2030 to 2100. That is, the variance of these two variables seemed to increase with time. This could be explained by the effect of N AO, EAWR, and SC on the scale parameters β of the GPDs of these two variables. Table 1. Most influential climate indices on parameters. The variables are: surge level at the peak of the extreme surge event (S p ) and duration of the extreme surge event (D). The climate indices are: North Atlantic oscillation (N AO), East Atlantic-West Russian pattern (EAWR), and Scandinavian pattern (SC). The prefix "d-" means "first time-derivative of", and the prefix "d2-" means "second time-derivative of". C r is a coefficient from the vectorial generalized additive model and can be compared to the relative order of magnitude of the variable to the climate index. The combination of climate indices that had the most influence on the GPD and copula parameters varied from Strategies #1 to #2 (see Table 1). This might be the effect of full non-stationarity in Strategy #2. In the case of doubt, Strategy #2 should prove more reliable and flexible. Its dependence structure did not depend on time, rather on large-scale climate patterns. Hence, in the case of strong negative N AO, there would be more probability of high duration surges jointly with high sea levels at the surge peak. Furthermore, note that, in Strategy #1 (see Figures 9 and 10), the values of AIC are significantly different between the location parameter x 0 and the scale parameter β of GPD. This does not happen in Strategy #2 (see Figures 12 and 13), where the AIC value for the location parameter x 0 , the scale parameter β, and even Kendall's τ of the dependence parameter are similar in order of magnitude. This is because, in Strategy #1, a quantile regression VGAM is used to obtain the location parameter x 0 , while a VGLM is used to obtain the scale parameter β. In Strategy #2, however, one single type of model, the probit model, characterizes location parameter x 0 , the scale parameter β, and the dependence parameter at the same time. The AIC (and the BIC, which produces similar values for this methodology) is an information measure of models. Its value can easily be of different orders of magnitude for different models, while sharing a similar order of magnitude for the same model.
The ξ is always negative in the GPD at these model points, whereas the order of magnitude of the location parameter x 0 and the scale parameter β of the GPD are similar to each other. Thus, the approximations in Equations (3) and (4) are applicable. A negative shape parameter is related to probability distribution functions that have finite tails, which means that extreme values have a lesser importance in these distributions.
Instead of discussing x 0 and β, it is equivalent to talk about the statistical mean and variance, which are more easily related to physical phenomena. The mean of S p and the variance of D (in both strategies) are not affected by N AO, EAWR, nor SC (see Table 1), the same way that the scale parameter x 0 of the GPD is constant. However, there is some evidence of the effect of an external forcing, possibly another climate pattern. This is out of the scope of this paper, but could be explored in future work.
The temporal dynamics of SC seems to be key to the variance of S p , in Strategy #1. It can be deduced that periods of extraordinary positive gradients of SC would intensify the variability of S p . Moreover, a positive N AO would increase the variability of D. It should be noted that a positive velocity of SC leads to extreme wave storm total energy [21], and a negative N AO is linked to large SWH [23]. Therefore, periods with these characteristics are especially prone to damage caused by the joint action of wave storms and extreme surge events. Other external enhancers to the damage caused by S p and D are coastal erosion and reduction of the beach width. The reduction of the beach width occurs by modifying the responses of the coast to the propagation and intensification of the extreme surge events over a long time.
The correlation between wave storms and extreme surge events has not been considered, as the dependence density between wave storms and extreme surge events in the Catalan coast is not entirely clear, in this case. This correlation might depend on the directionality and fetch of the wave-storm [75,76]. The combined action of storm surges and wind waves may drive increased coastal water levels [77], even under moderate return periods. Furthermore, if adding sea level rise due to global warming, then coastal structures' vulnerability is even more exacerbated [33,78]. Hence, it may be useful to find consistent relationships between the synoptic patterns and the main hydrodynamic drivers. Such characterization may serve to create design scenarios that would address moderate waves and surges, jointly with sea level rise. These scenarios may be more frequent (and, probably, more harmful) than high return period wave storms and/or surges.
Incidentally, the relevant effect of N AO on S p coincides with existent literature, which states that the observed surge levels increase in the Catalan coast for negative N AO, because this condition causes a shift of atmospheric storm tracks towards lower latitudes [28,79]. High observed surge levels are also reported in the Gulf of Lions, for negative N AO [80].
SC does not present the same relevance in Strategy #2, where it is EAWR that has the strongest influence on the variance of S p . Positive EAWR leads to higher variance of D. Additionally, while in Strategy #1, N AO is an influential climate index on the variance of D, in Strategy #2, the variance of D does not depend on any of the proposed climate patterns. It is possible that SC lost its influence in a non-stationary framework such as Strategy #2, where a wider time window is selected.
The Gumbel copula family, which describes the joint probability density among S p and D, is symmetric and has its own limitations [81]. However, due to the complexity of the non-stationary model, a symmetric model allows for a simpler model, whereas some main results, such as the influence of climate patterns on GPD and copula parameters, can be considerably close to reality, despite this simplification.
Kendall's τ, for both strategies, is below 0.37. It ranges from low to moderate (see Figures 11 and 14) and has constant moving means over time. In Strategy #2, the effect of N AO on τ at all these model points is demonstrated and is evident, as all the time series of τ coincide closely in phase. Specifically, a negative N AO leads to higher dependence among S p and D. The coincidence in phase can also be seen in Strategy #1, where τ is not related to any climate pattern. The τ in Strategy #1 shows more frequent fluctuations than the τ in Strategy #2. Moreover, both figures show that the Tossa, Lloret, and Badalona model points denote greater variance. This is evident from the amplitude of the time series, while Llanca and Marquesa present smaller amplitudes of τ. The main reason for this might be the way that the low-sea level pressures and the extreme wind regime happen in this study area. Cyclogenesis and propagation of such storm surges are the driving processes. It seems that, in the northern part of the Catalan Coast, stronger surges and winds are propagated. The surges that occur in the southern part of the Catalan Coast come from milder wind fields and sea level pressure anomalies. Hence, they tend to have lower anomalies.

Conclusions
A hybrid approach was used to characterize extreme surge events at the Catalan coast. It consisted of a process-based model and a statistical model. The process-based model generated projected surge levels for the period 1950-2100, under the climate change scenario RCP8.5. Extreme surge events were extracted from this time series through a non-stationary threshold h 0 , which defined a λ (frequency of a Poisson's process) of extreme events and produced the variables S p (maximum surge level) and D (duration), describing them.
A statistical model was proposed to characterize the maximum surge level and duration with GPDs and their joint probability density with Gumbel-type Archimedean copulas. This statistical model also relates these variables to indices of three climate patterns and their first two time-derivatives: N AO, EAWR, and SC. Two different strategies were proposed for the statistical model. These two strategies were based on the bivariate copula model methodology developed in Lin-Ye et al. [21] (Strategy #1) and its non-stationary version Marra and Radice [41] (strategy #2), each. The strategies shared all steps except the estimation and establishment of the relationship of the scale parameter β of the GPD and Kendall's τ to the climate indices. Strategy #1 estimates β through a VGLM and characterized the joint probability density by concatenating standard stationary Archimedean copulas. Strategy #2 integrated the estimation of β along with the non-stationary Archimedean copula, within a bivariate probit model.
The results showed how SC and its dynamics strongly influenced the threshold and the event frequency. Both parameters increased with positive SC. Maximum surge level and duration had a constant moving mean from 1950 to 2100, while their running variance increased with time. The constant moving mean of the two variables could be explained by the fact that they were not sensitive to the proposed climate patterns. The increase in the running variance might be due to the effect of different combinations of climate patterns on the scale parameter β of the GPD used to fit each variable.
The location parameter x 0 of the GPD was not influenced by any of the proposed climate indices. According to Strategy #1, a combination of rapidly shifting negative SC led to an increase in the variance of the maximum surge level, whereas the variance of duration increased with positive N AO. In Strategy #2, the variance of maximum surge level increased with positive EAWR and the variance of duration was independent of the proposed climate patterns.
In both strategies, the values of Kendall's τ of the dependence parameter were below 0.4, ranging from low to medium dependence among maximum surge level and duration. The τ in Strategy #1 had a constant trend and was similar for all locations. τ in Strategy #2 presented the same phase for all model points, but provided different amplitudes for each point. In this strategy as well, Kendall's τ of the two variables increased with negative N AO.
The big picture of the influence of the proposed climate patterns was that N AO and SC had a strong influence on the threshold h 0 and the frequency λ (according to Strategy #1), the variances of the maximum surge level and duration, and (according to Strategy #2) the joint probability density. The knowledge about this relationship between N AO and SC with maximum surge level and duration could help with predicting changes in maximum surge level and duration, with previous knowledge about these two climate patterns. Special attention must be paid to Strategy #2, which was the fully non-stationary model. This could improve the design of coastal structures that have to withstand the action caused by extreme surge events, as well as protection measures for mitigating coastal risk levels.

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; nor in the decision to publish the results.

Abbreviations
The following abbreviations are used in this manuscript: