An Assessment of the GOCE High-Level Processing Facility (HPF) Released Global Geopotential Models with Regional Test Results in Turkey

The launch of dedicated satellite missions at the beginning of the 2000s led to significant improvement in the determination of Earth gravity field models. As a consequence of this progress, both the accuracies and the spatial resolutions of the global geopotential models increased. However, the spectral behaviors and the accuracies of the released models vary mainly depending on their computation strategies. These strategies are briefly explained in this article. Comprehensive quality assessment of the gravity field models by means of spectral and statistical analyses provides a comparison of the gravity field mapping accuracies of these models, as well as providing an understanding of their progress. The practical benefit of these assessments by means of choosing an optimal model with the highest accuracy and best resolution for a specific application is obvious for a broad range of geoscience applications, including geodesy and geophysics, that employ Earth gravity field parameters in their studies. From this perspective, this study aims to evaluate the GOCE High-Level Processing Facility geopotential models including recently published sixth releases using different validation methods recommended in the literature, and investigate their performances comparatively and in addition to some other models, such as GOCO05S, GOGRA04S and EGM2008. In addition to the validation statistics from various countries, the study specifically emphasizes the numerical test results in Turkey. It is concluded that the performance improves from the first generation RL01 models toward the final RL05 models, which were based on the entire mission data. This outcome was confirmed when the releases of different computation approaches were considered. The accuracies of the RL05 models were found to be similar to GOCO05S, GOGRA04S and even to RL06 versions but better than EGM2008, in their maximum expansion degrees. Regarding the results obtained from these tests using the GPS/leveling observations in Turkey, the contribution of the GOCE data to the models was significant, especially between the expansion degrees of 100 and 250. In the study, the tested geopotential models were also considered for detailed geoid modeling using the remove-compute-restore method. It was found that the best-fitting geopotential model with its optimal expansion degree (please see the definition of optimal degree in the article) improved the high-frequency regional geoid model accuracy by almost 15%.


Introduction
The low-earth orbiting (LEO) satellite missions initiated a new stage in Earth gravity field studies and led to unprecedented progress in determining global gravitational field models and related parameters. In addition to the satellite laser ranging (SLR) tracking data, as a fundamental technique

Practical Need of Global Geopotential Model Validation Results
Testing and evaluating geopotential model performances is important since determining an optimal model, with the highest accuracy and best resolution, has certain practical consequences in specific applications of the Earth science and engineering disciplines. We exemplify some of the applications in which GGM performance is crucial in the following by addressing the applied numerical tests under this study.
The determination of a high-resolution regional geoid using the remove-compute-restore (RCR) method combines low to high-frequency content of the gravity field signal employing the reference GGM, the terrestrial gravity observations and the topographic height and density data [27]. In this strategy, the best-fitting GGM to the gravity field in the region contributes to the accuracy of the high-resolution geoid by improving the long-wavelength component in the error budget [27][28][29]. The GGM contribution to the regional geoid model accuracy can be even more crucial if there is a lack of high-quality and sufficiently dense terrestrial gravity data under certain modifications of the kernel function and when a few-centimeter accuracy geoid model is desired (see e.g., [30]). Regarding this issue, we evaluated the released GGMs in high-frequency geoid computations using gravity data in Turkey. The computations were carried out with fast Fourier transform (FFT) evaluation of an unmodified Stokes integral. The conclusions depicted a centimeter-level contribution of the GGMs to the accuracy improvement of the regional geoid models.
The independent data provides an objective measure to examine the gravity field mapping performances of the GGMs in a region. Since the gravity field parameters and their gradients are directly estimated with spherical harmonic expansion equations in some applications, their accuracy checks are important. Regarding this, [31] uses the deflection of vertical (DoV) components estimated from GGMs to compare with the astrogeodetic DoVs obtained through the observations with the digital zenith camera system (DZCS); [32] exploits the GRACE-based GGMs derived gravity gradients in interpretation of geophysical or geodynamical patterns in Iran; [32] compares the upward continued airborne gravity data with the GOCE-based gravity gradients in Antarctica; and [33] issues the GGM-derived geoid heights and gravity anomalies to compare with the mean sea surface and free air anomalies derived from altimetry data in open ocean and coastal regions. In some GNSS applications, GGM-derived geoid undulations/height anomalies are considered for vertical control purposes. Before using the estimated undulations in the transformation of GNSS heights, the accuracies and the fit of the GGMs to the regional vertical datum should also be validated. For such a purpose, [34] evaluates five combined GGMs in a very dense GPS/leveling network on the Greece mainland and tested capability of parametric models for fitting the GGMs to the regional vertical datum.
The independent validations of the GGMs and test results are also important for the height system unification (HSU) and world height system (WHS) realization studies, because the precise determination of the offsets among the regional vertical datums, as well as with the WHS surface, is critical to establishing a globally unified height datum [23,35]. In this way, using a high-precision GGM as a reference in the calculations allows for higher accuracy calculation of regional height system offsets [35,36]. Although numerical assessments on the determination of the geopotential value (W o ) of regional vertical datum and height system unification are not explicitly undertaken in this article, [37] is recommended for further reading on the detailed numerical evaluations of a number of GGMs on the height system unification between Turkey and Greece.

Overview the Tested Global Geopotential Models
The tested GGMs in this study include the recent models that were calculated with different processing strategies from satellite orbits and GOCE gradiometer observations. These models were released (tagged with the release version from RL1 to RL5) as the products of the GOCE High-Level Processing Facility (HPF) of the ESA project. The first releases of the models (RL1) consist of the initial two months' GOCE observations (between November 2009 and January 2010); the RL2 models are calculated employing the eight months' observation cycle (November 2009 to July 2010); and the RL3, RL4 and RL5 releases exploit the 18 months, 26 months and the full mission observations, respectively [38]. The sixth releases of the direct and time-wise models, which were also based on the GOCE full mission dataset, were recently published and made available during the preparation of this article [21]. Therefore, the RL6 models were also employed and tested in the content of the study in order to compare their performances with the RL5 models, which have been already exploited the full mission observations. Essentially, the models were developed according to three HPF gravity field processing approaches, namely, the direct (DIR) [39], time-wise (TIM) [40] and space-wise (SPW) [41] approaches. Each approach follows a different processing strategy in calculating the releases. These strategies include mathematical models, the adopted a priori models (or no a priori model use), reference information in constraints, spherical cap regularization, filter algorithms of the observations, etc. Table 1 provides a detailed comparison of the validated GGMs including a short report on the differences in their processing strategies. References [18,42] provide a detailed description of the basics of these data processing methods. Reference [18] reviews and discusses the DIR, TIM and SPW methods, their processing philosophies and architectures in computations of the GOCE gravity field models using RL1 models. Reference [16] reviews the second, third and fourth generation GOCE-based Earth gravity field models with their computation strategies, and evaluates their performances with spectral sensitivities over different functionals.
The products used in computations include gravity gradients in gradiometer reference frame (with identifier EGG_NOM_2), common-mode accelerations (EGG_CCD_2C), attitude quaternions (EGG_IAQ_2C), and precise science orbits (including the sub-products: kinematic orbits (SST_PKI_2I), variance-covariance information of orbit positions (SST_PCV_2I), reduced-dynamic orbits (SST_PRD_2I) and quaternions of transformation from Earth-fixed to inertial reference frame (SST_PRM_2I)); see [43] for the product descriptions and standards that were applied during the processes. In addition to the observations and products, depending on the applied processing strategy, the reference gravity field models were also adopted in some of the solutions. These reference gravity field models were either exploited as a priori information or for regularization, internal assessment and validation purposes. EIGEN5C, EIGEN51C, ITG-GRACE2010S [21] were used in the DIR method for internal validation purposes, whereas the GOCE quick-look gravity field model was used in the first release of the SPW approach as a priori information. EGM2008, EIGEN5C, EIGEN6C3stat were also employed for signal and error covariance computation in SPW solutions [21]. However, as the basic difference among the three methods, in the processing strategy of TIM method no gravity field a priori information entered the solution and, hence, this method yields pure GOCE-only models. All the information given here were retrieved from the headers of the spherical harmonic coefficients' files and respective data sheets, published by [21]. The main characteristics of the models are summarized as follows: The DIR method, as the classical approach, is based on the orbit perturbation theory and combines a priori orbits and an a priori gravity field model, hl-SST observations and common-mode accelerations, while constituting the normal equations in determining the gravity field model coefficients in an iterative way: • DIR models contain GRACE observations in the lower to medium degrees of the expansions (in addition to GOCE gravity gradient data).

•
In RL3, RL4 and RL5 models, GRACE was combined with GOCE and SLR data on the basis of the normal equations (including the SLR data (LAGEOS-1 and -2)) in order to improve the gravity field solution, since the very low-degree harmonics (in particular degrees 2 and 3) cannot be estimated with GRACE and GOCE data).

•
In order to overcome the polar gap of GOCE's observations caused by the orbit inclination of the satellite (inclined at 96.7 • [18]), a spherical cap regularization (SCR) in accordance to [44] was applied using EIGEN-51C (≤d/o 240). In the later releases after RL1, the SCR was applied using GRACE and LAGEOS data. In the RL3, RL4, RL5 of DIR models, additionally, the predecessor release was used as a priori information (≤d/o 240, 260, 300, respectively, in the releases) and a Kaula regularization [44] was used beyond degrees 200 (for RL3, RL4) and 180 (for RL5).

•
In the DIR RL1 solution, the gravity tensor elements, which are measured with GOCE on-board SGG, were accumulated with the relative weights of Txx 1.0, Tyy 0.5 and Tzz 1.0. They were equally weighted in combination for the releases after RL1. In RL4, RL5 and RL6 models, the off-diagonal tensor elements Txz were also included.

•
In RL4 and RL5 models, the spectral bandwidth of the bandpass filter used to filter the SGG observations was extended by 1.7 MHz toward the lower frequency domain.

•
Reference [16] notes that contrary to the TIM approach, the DIR approach employs the GOCE gravity gradient information as restricted in a certain spectral band, which is close to the gradiometer measurement bandwidth (5-100 MHz [45]). However, the filtering procedure in the TIM approach allows the use of information on the gradient observations over the entire spectrum.
The TIM approach exclusively relies on an epoch-wise processing of GOCE SGG and hl-SST data according to the least squares principle. In the method, the SST data analysis is based on the kinematic orbit solutions with the energy integral approach (short-arc integral approach for TIM-RL4, RL5 and RL6). Hence, since TIM solutions are independent of any other gravity field information, they can be used for independent comparisons and combination with other satellite-only models (such as GRACE), terrestrial gravity data and altimetry data on the normal equations' level. However, the TIM solutions are not competitive with the GRACE models in low degrees, since they are based solely on kinematic GOCE orbits but no external (such as GRACE) information:

•
The TIM models were constrained using Kaula's rule regularization applied to (near-) zonal coefficients, improving the signal-to-noise ratio (constraints for high degree coefficients).

•
The off-diagonal tensor element Vxz was included in the models TIM RL3, RL4, RL5 and RL6. The stochastic models for the gravity gradient observations rely on optimum weighting based on variance component estimation.
The revision in 2012 of the processing procedure of the Level-1b products led to the expectation of performance improvement of the GOCE's SGG observations and consequently a quality improvement of the fourth, fifth and sixth releases of the models. Reference [47] reports that the gradiometry-only gravity field estimates show the largest improvement in recovery of low and medium degree coefficients, due to the new Level-1b products (see [16]).
In the SPW method, spatialized observations are produced from SST and SGG data using Wiener orbital filter [48] and the gravity field coefficients are derived by spherical harmonic analysis after transforming these observations into a spatial grid using least squares collocation [49]: • The SPW RL1 model includes the first GOCE quick-look model as the a priori information and the EGM2008 was used for error calibration of the estimated gravitational potential along the orbit that affects the low degrees of the solution. However, the later releases, RL2 and RL4, were not corrected using any a priori model, thus they are the GOCE-only models. EIGEN5C and EIGEN6C3stat models were respectively used in SPW RL2 and RL4 models' computations for signal covariance modeling, in addition to FES2004 for ocean tide modeling.

•
In all SPW models, the off-diagonal tensor elements Vxz were included.
The maximum d/o of spherical harmonic expansion also changes (see the 2nd column of Table 1). The maximum d/o of the models was determined by the processing teams of HPF, depending on the signal-to-noise ratio, processing issues and the strategy of using additional a priori information [18].  In the low degrees, it is not competitive with GRACE models. Kaula reg. is to improve SNR.

Synthesis of the Gravity Field Parameters Using GGMs Spherical Harmonic Coefficients
The International Center for Global Earth Models [60] database has published 175 geopotential models so far and almost 30% of these models include GOCE mission data. In order to quantify the quality of the GGMs independently, validating the models using external information is necessary. So-called validations require the level-3 (L3) products from the satellite gravity missions to be compared with terrestrial data [24]. The GPS/leveling data is typically used for independent geoid undulations (N GPS/lev. ) at the benchmarks to compare with the GGM derived functionals (N GGM ) using the following equation: where (θ, λ, r) are co-latitude, longitude and geocentric radius of the computation point, respectively; a is the major axis radius of the reference ellipsoid; GM is the product of the gravitational constant and the mass of the Earth; and γ is the mean gravity of the reference ellipsoid.   [20]. In the equation, − P m (cosθ) are the fully-normalized associated Legendre functions. As can be seen, Equation (1) does not include the zero-degree geoid height term (N o ), which is represented by the zero-degree harmonic coefficient in the spherical harmonic expansion model and means the position of the GGM geoid with respect to the geo-center of the Earth. The zero-degree term Remote Sens. 2020, 12, 586 9 of 28 is expressed as depending on the difference between the constant (GM) value of the GGM and that of the reference ellipsoid (GM GRS80 ), which is GRS-80 for Turkey (see Equation (2)): where U o corresponds to the normal gravity field of the GRS-80 reference ellipsoid with parameters of GM GRS80 = 398, 600.50 × 10 9 m 3 /s 2 ; thus, it is U o = 62, 636, 860.85 m 2 /s 2 [61]. Using r = 6378.137 km and γ = 981 Gal, the zero-degree geoid height (considering solely the first component at the right-hand side of Equation (2)) corresponds to the N o = −0.936 m that is to be added to the N GGM , derived from Equation (1). The residual zonal-coefficients (shown as ∆ − C m in Equation (1)) account for the difference between the reference radius of the GGM and the semi-major axis of the GRS-80 ellipsoid [62,63]. In the synthesis of GGMs' spherical harmonic coefficients, the zero-degree term is considered either by taking the zero-degree zonal-harmonic coefficients (as  (2), , introduces the difference between the gravity potential of the geoid (W o ) and the normal gravity potential on the surface of the normal ellipsoid (U o ). In the ideal case, without irregularities of topographic densities, the two potentials are equal (W o = U o ). As a primary parameter for the definition of the reference (zero-height) surface with respect to Earth's body (which provides an absolute definition to the vertical datum of a height system), a global estimation of W o is required [64].
In addition to the zero-degree term definition, another issue in the synthesis of the gravity field parameters from the GGMs' spherical harmonic coefficients is adopting the proper permanent tide-system to which the GGMs' functionals refer. The permanent tide-systems, commonly adopted in studies investigating Earth dynamics include mean-tide, zero-tide and tide-free [65]. In studies related to the Earth's potential field, investigations of the Earth's crust deformations or determination of the best-fitting ellipsoid to the geoid, a proper permanent tide system is adopted. According to Bruns' definition of geoid undulation [61], the geoid undulation changes from one tide system to another depending on the geometric change in the shape of the geoid (but with a constant W = W o potential for each tide-system) relative to a fixed reference ellipsoid [65]. In other words, the W o value does not change among the systems, but the shape of the geoid changes. The effect of changing the permanent tide-system is defined basically with the second-degree zonal harmonic coefficient − C 20 [65]. In this study, the gravity field parameters were synthesized using the GGM coefficients in a tide-free system. In order to have comparable geoid undulations at the GNSS/leveling benchmarks, the orthometric heights in regional vertical datum were converted from the mean-tide to tide-free system according to following the equation [66]:

Assessments of the Global Geopotential Models Using Spectral Enhancement Method (SEM)
The spectral constituents of Earth's gravity field are denoted with the spectral bands of a GGM [15]. The gravity field quantities as observed on the Earth's surface contain the full spectral signal power, while any global geopotential model is limited by its spectral resolution, which leads to omission error in the output functional. When comparing GGMs with independent observational data, one should take into account spectral incompatibility of both data sets. To overcome the spectral band limitation of the mode, the so-called spectral enhancement method (SEM) is suggested and applied by [18]. The fundamental principle of the SEM is based on bridging the spectral gap between the GOCE-based GGM functional and ground-truth data, using the relevant auxiliary data set (such as a higher resolution geopotential model like EGM2008, and the residual terrain data, to account for the ultra-high frequencies) as much as possible. Hence, the ground-truth data for validation purposes are approximated much better, and thus the influences of omission error on the comparison results are consequently minimized. In spectral enhancement, the GGM, expanded to a spherical harmonic degree 1 (e.g., 1 = 240 for DIR_RL1), is enhanced using the spectral bands over the degrees of ( 1 + 1) until 2 from a higher resolution Earth geopotential model (e.g., 2 = 2190 when using EGM2008). The very short-wavelength components of the gravity field, represented with spectral bands beyond degree 2 (e.g., 2 = 2190 in our case) until the maximum possible degree (e.g., 216,000), are completed as the residual terrain effect (RTE) of topographic data on the estimated geoid heights (see Equation (4)).
Equation (4) is a symbolic representation of the enhancement procedure in terms of geoid undulation derived using the GGM, where N o is zero-degree geoid undulation; N 1 2 + N 2 1 + 1 is the geoid undulation calculated with enhanced coefficients according to Equation (1). Here, N RTE is the residual terrain effect on geoid undulation, calculated using the digital topographic data [27]. As the summation of the components that represent each different spectral constituent of the Earth's gravity field, the full spectral content geoid undulation (N enh. ), which is comparable with ground truth data, is obtained.
In Equation (4), N RTE was calculated using the residual terrain model (RTM) mass reduction method evaluated in the Gravsoft program package, following the processing methodology outlined by [27], Section 8.4.4 Terrain Effects and High Resolution Global Geopotential Models in pp. 366-371. In computations, 3 arc-second Shuttle Radar Topography Mission (SRTM) topographic data was used for approximating the real terrain and the spherical harmonic expansion of the DTM2006.0 (up to d/o 2190) was used to model the mean elevation surface [27].

Spectral Analysis of GGMs
In the assessment of the quality of the GGM solutions, two types of errors are defined. Firstly, the noise in the observations that are used in the calculation of the GGM coefficients propagates from the measurements to the solutions, which is called the commission error. Secondly, an error is revealed because of estimating the projection of the gravity field functional (taking the anomalous potential as an example) on the finite dimensional subspace (generated by linear combinations of solid spherical harmonics up to a limited degree) instead of deriving its full content. This error is known as the omission error. The omission error has an easy formulation that is set by considering the coefficients left out from the truncated convergent series. In fact, this is the sum of the squares of all coefficients of degree higher than max . Clearly, however, this is an unknown quantity, which cannot be known a priori. However, the so-called degree variances (i.e., the sum over all orders of the squares of the coefficients up to a certain degree) can provide an idea of the GGMs' signal decay, which allows one to judge the commission error and compute the omission error according to an adopted law [27]. Accordingly, the signal and error degree variances of the GGMs provide an insight into the omission/commission error content of a GGM under consideration (see [24]). Hence, they are regarded as internal error estimates for the investigated models.
Using the fully normalized spherical harmonic coefficients, − C m , − S m from a specific degree of max and order m (m = 0, 1, . . . max ), the signal degree amplitudes (or square root of power per degree ) of functions of the disturbing potential T(ϕ, λ, R) at the Earth's surface are computed as follows [20,21,27]: Equation (5) provides the signal degree amplitude σ in terms of unitless coefficients, and the σ (N) in terms of geoid heights (meter) is shown in Equation (6): The error spectra of GGMs are investigated depending on the error degree amplitudes, calculated using the estimation errors of the Stokes' coefficients ) of a spherical expansion model: As a function of minimum and maximum degrees , the accumulated (signal, error) degree amplitudes provide the power spectrum accumulated over a spectral band (between 1 -usually 1 = 0 or 2 as minimum degree of the expansion-and 2 -usually 2 = max as the maximum degree of the harmonic expansion equation) and shows the increase in overall power with increasing degree of :

GGMs' Contribution in High-resolution Regional Geoid Modeling
For the determination of high-resolution precise regional gravity field models, remove-compute-restore (RCR) is a very well-known approach and has been used in calculation of many countries' geoid models to date [27,61]. In the RCR scheme the terrestrial gravity and topography/bathymetry data are used along with a global geopotential model to smooth the observations for data gridding, transformations, predictions and to eliminate the aliasing effects [27]. The computation steps of the RCR procedure are as follows: Equation (9) represents the 'remove' step where the long-wavelength component (∆g GGM , from the GGM according to Equation (10)) of the gravity field, as well as the short-wavelength components (∆g tc , the direct topographical effect on gravity, which consists of the subtraction of the attraction of condensed topographic masses from the attraction of all topographic masses above the geoid and is calculated as according to a selected reduction scheme-i.e., Helmert's Second Method of Condensation was used in this study (see Equations (3-97) of Section 3.9 in [61])-using topographic data), are removed from the free-air gravity anomalies (∆g FA ). Hence the calculated and reduced residual gravity anomalies (∆g res ) on the geoid surface are evaluated in Stokes' formula.
where the terms in Equation (10) are as introduced previously in Equation (1). The 'compute' step includes the computation of the residual co-geoid using Stokes' integral (N ∆g ), the long-wavelength geoid undulation (N GGM ) using the GGM according to Equation (1) and finally the indirect effect of topographic mass attraction on geoid undulation (N ind using topographic data according to Equation (13)). After computing all constituents separately, their compilation step is called 'restore' and formulated as follows: as the high-resolution geoid undulation (N).
Remote Sens. 2020, 12, 586 12 of 28 The Stokes' formula for calculating N ∆g is: where σ is the surface element, R is the mean Earth radius and S(ψ) is Stokes' function having the angular distance ψ from the computation point to the differential surface element dσ. N ind is: where ρ is the density of topographic masses; H Q and H P are the heights of the running and computation points, respectively; and s is the planar distance between P and Q. Hence, the output of the equation is N ind according to Helmert's Second Method of Condensation reduction scheme [27]. The details of the computation steps followed in this study were explained in [28] pp. 433-435 and Figure 13 given in the cited reference illustrated the adopted computational schema in here.
In the computation of a regional geoid model with hybrid (mixed) data in RCR, the GGM is responsible for the long-wavelength (signal) error content (see Equations (10) and (11)). When calculating a geoid model over a spatially limited region, while having low-accuracy and sparse terrestrial data, the quality of the GGM becomes even more important to compensate for the weakness of the terrestrial data accordingly in the RCR algorithm with specific modifications [30,67].

Internal Error Estimates of Tested GGMs
The error degree variances of the tested models were considered for internal uncertainty estimates of the GGMs. The theory and formulations for calculating error degree variances were given in Section 2.5. (see Equations (7) and (8)). Figure 1  where is the surface element, is the mean Earth radius and ( ) is Stokes' function having the angular distance ψ from the computation point to the differential surface element . is: where is the density of topographic masses; and are the heights of the running and computation points, respectively; and s is the planar distance between P and Q. Hence, the output of the equation is according to Helmert's Second Method of Condensation reduction scheme [27]. The details of the computation steps followed in this study were explained in [28] pp. 433-435 and Figure 13 given in the cited reference illustrated the adopted computational schema in here.
In the computation of a regional geoid model with hybrid (mixed) data in RCR, the GGM is responsible for the long-wavelength (signal) error content (see Equations (10) and (11)). When calculating a geoid model over a spatially limited region, while having low-accuracy and sparse terrestrial data, the quality of the GGM becomes even more important to compensate for the weakness of the terrestrial data accordingly in the RCR algorithm with specific modifications [30,67].

Internal Error Estimates of Tested GGMs
The error degree variances of the tested models were considered for internal uncertainty estimates of the GGMs. The theory and formulations for calculating error degree variances were given in Section 2.5. (see Equations (7) and (8)). Figure 1 shows the degree-wise accumulated error amplitudes of the models. The figures reveal the overall geoid undulation accuracies up to the maximum degree of each model. For d/o 200, which corresponds to the spatial resolution of 100 km, the cumulative errors are 0.8 cm for the DIR RL5 model (see Figure 1a), 2.2 cm for the TIM RL5 model (Figure 1b), 2.7 cm for the SPW RL4 model (Figure 1c), 2.0 cm for the GOCO05S model and 2.9 cm for the GOGRA04S model (Figure 1d), whereas it is 7.1 cm for the EGM2008 model, in terms of geoid undulation. When the cumulative error amplitudes are investigated, it is seen that the improvement with the final releases of the GGMs is up to 80% compared to their predecessors. It is also noteworthy that the cumulative error estimates of TIM RL5 and DIR RL5 differ by a factor of three. However, it is arguable if these cumulative error estimates depending on degree variances are realistic and, on the other hand, whether calibration is required or not. In this manner, in a similar study on GGM validations, [51] suggests applying a calibration factor of two for the DIR RL5 model. Related to concerns about the uncertainty estimates of the GGMs depending on the error degree variances approach, [68] emphasizes these issues: (1) firstly, the correlations among the coefficients are ignored while evaluating the GGMs with degree variances; (2) the effect of the polar-gap problem on the zonal and near-zonal coefficients of the GOCE-only models obviously influences the provided standard deviations of the spherical harmonic coefficients, specifically for the GOCE-only TIM and SPW models; and (3), on the other hand, dependence of the GOCE models' uncertainties and performances on the geographic location cannot be ignored in evaluations. Regarding the mentioned handicaps, beside the degree variances-based assessments, evaluating the GGMs using geographically dependent standard deviations (if possible) with the propagation of full variancecovariance matrices of the spherical harmonic coefficients is suggested by [68].
In summary, although the accumulated error degree variances provide a global mean of the GGM's performance over the entire sphere, which is significantly influenced by the large uncertainties in the polar gap, deriving concluding measures for its performance in a local region is not straightforward. Therefore, Section 3.2 provides results of local validations of tested GGMs using terrestrial data.

Overview of the Regional Accuracies of the GGMs
The results of evaluating the gravity field models using various data sets in different regions are being published regularly as the new models are released. However, the methodologies applied in these assessments vary to differing degrees. Regarding a number of significant studies from the literature, the following can be noted.
Reference [19] provides a detailed assessment of the first generation GOCE-based models using LEO satellite orbit perturbations and GPS/leveling geoid height differences. The conclusions of their study confirm that the orbital perturbations are mostly sensitive to the long-wavelengths of the gravity signal and hence provide clear results in quality assessments of the GGMs in a global manner. In contrast, the geoid comparisons using GPS/leveling data give the opportunity to assess the model performances at the medium to short-wavelengths. In the study, the regional assessments using the GPS/leveling data were carried out in Australia, Europe, North America (US and Canada), Japan and Germany. The preliminary conclusions from this study give a global geoid accuracy of about 5-6 cm with a 111 km spatial resolution (at d/o of 180). Here, it is emphasized that they obtain this level of accuracy as a result of just two-month GOCE data, and the goal of the mission (1-2 cm geoid accuracy with 100 km resolution) could be reached with the availability of more data. In terms of geoid residual RMS values, [19] reports the EGM2008 model accuracies as the most improved in Germany (3.5 cm), When the cumulative error amplitudes are investigated, it is seen that the improvement with the final releases of the GGMs is up to 80% compared to their predecessors. It is also noteworthy that the cumulative error estimates of TIM RL5 and DIR RL5 differ by a factor of three. However, it is arguable if these cumulative error estimates depending on degree variances are realistic and, on the other hand, whether calibration is required or not. In this manner, in a similar study on GGM validations, [51] suggests applying a calibration factor of two for the DIR RL5 model.
Related to concerns about the uncertainty estimates of the GGMs depending on the error degree variances approach, [68] emphasizes these issues: (1) firstly, the correlations among the coefficients are ignored while evaluating the GGMs with degree variances; (2) the effect of the polar-gap problem on the zonal and near-zonal coefficients of the GOCE-only models obviously influences the provided standard deviations of the spherical harmonic coefficients, specifically for the GOCE-only TIM and SPW models; and (3), on the other hand, dependence of the GOCE models' uncertainties and performances on the geographic location cannot be ignored in evaluations. Regarding the mentioned handicaps, beside the degree variances-based assessments, evaluating the GGMs using geographically dependent standard deviations (if possible) with the propagation of full variance-covariance matrices of the spherical harmonic coefficients is suggested by [68].
In summary, although the accumulated error degree variances provide a global mean of the GGM's performance over the entire sphere, which is significantly influenced by the large uncertainties in the polar gap, deriving concluding measures for its performance in a local region is not straightforward. Therefore, Section 3.2 provides results of local validations of tested GGMs using terrestrial data.

Overview of the Regional Accuracies of the GGMs
The results of evaluating the gravity field models using various data sets in different regions are being published regularly as the new models are released. However, the methodologies applied in these assessments vary to differing degrees. Regarding a number of significant studies from the literature, the following can be noted.
Reference [19] provides a detailed assessment of the first generation GOCE-based models using LEO satellite orbit perturbations and GPS/leveling geoid height differences. The conclusions of their study confirm that the orbital perturbations are mostly sensitive to the long-wavelengths of the gravity signal and hence provide clear results in quality assessments of the GGMs in a global manner. In contrast, the geoid comparisons using GPS/leveling data give the opportunity to assess the model performances at the medium to short-wavelengths. In the study, the regional assessments using the GPS/leveling data were carried out in Australia, Europe, North America (US and Canada), Japan and Germany. The preliminary conclusions from this study give a global geoid accuracy of about 5-6 cm with a 111 km spatial resolution (at d/o of 180). Here, it is emphasized that they obtain this level of accuracy as a result of just two-month GOCE data, and the goal of the mission (1-2 cm geoid accuracy with 100 km resolution) could be reached with the availability of more data. In terms of geoid residual RMS values, [19] reports the EGM2008 model accuracies as the most improved in Germany (3.5 cm), Japan (10 cm) and Canada (10.5 cm), whereas weaker fitting performances were found in Australia (24 cm), Europe (21.5 cm) and the US (26.5 cm) with comparatively larger residual RMS values. The non-homogeneous control datasets and vertical datum distortions were cited as possible reasons for the larger error levels in the continent-wide assessments.
Reference [21] provides the validation statistics of the released model for the maximum truncation degrees (without spectral enhancement) using similar datasets. Reference [15] evaluated the first generation GOCE-based models globally using EGM2008 and regionally using terrestrial gravity data and astrogeodetic vertical deflections. The analyses by [15] clearly show the improvement in gravity field mapping with GOCE data contribution and identify the reasons for the reported improvement specifically at certain spectral bands. Reference [69] evaluated the first and second releases of the models by means of gravity anomalies and radial gradients from GOCE gradiometer data using least squares collocation at several regions. The evaluations of the models from the first to third releases were published by [70] using the terrestrial gravity data in Norway and [71] over the Earth by means of gravity gradients from the GOCE gradiometer data. There are also other studies that validate the GOCE-based GGMs over different regions using various terrestrial data sets including [72] (in Sudan), [73] (in Brazil), [74] (in Hungary) and [68] (in Germany). Reference [75] provides the recent results of regional validations of all releases over different countries. However, the literature lacks information by means of GGM validation statistics over many areas of Earth. This may stem from the countries' data restriction policies and/or lack of quality terrestrial data for the region. Accordingly, the numerical section of this study focused on regional evaluation of the GGMs in Turkey.
In the following sections, in addition to reporting the models' formal error estimates, the performances of the GOCE-based GGMs were evaluated in terms of absolute accuracies. The GOCE-based models were compared with EGM2008 in order to clarify the GOCE observations' contribution to the GPS/leveling benchmarks in Turkey. In addition, the spectral bands of the spherical harmonic expansion models, where the improvement of tested GGM is the most significant, were determined and employed in the regional geoid modeling.

Assessment of GGMs' Accuracies in Turkey
In accuracy assessments of the HPF-released GGMs in Turkey, a number of numerical tests were carried out in the country at 36 • N-42 • N latitude and 26 • E-45 • E longitudes. The test statistics of the geoid height residuals (∆N) of the observations and the models were calculated for two different high order GPS/leveling network benchmarks, separately. The GPS/leveling networks include 30 and 81 points, respectively. The first dataset covers the whole of Turkey (Dataset I), and the latter is denser but localized in the northwest of the country (Dataset II) [76]. Carrying out the tests individually using both datasets (instead of using a unified dataset) aims to make validation results independent from the effects of possible datum differences between the datasets. A detailed description of the employed data is provided in Table 2. The distributions of the benchmarks on the topographic maps are given in Figures 2 and 3. 1 The datasets include first order GPS network points of Turkey National Fundamental GPS Network with Helmert orthometric heights (third-order leveling network) in regional vertical datum. BM: benchmark.  The numerical results of the study were interpreted from different perspectives, including: (1) verifying the role of the data amount and the calculation strategy of the model on its accuracy and performance; (2) determining an optimum d/o of the models available in the study area (whereas the term of the "optimum" possibly sounds vague and its definition is dependent on the application, the "optimum d/o" was used here to mean the degree of the spherical harmonic expansion at which the  The numerical results of the study were interpreted from different perspectives, including: (1) verifying the role of the data amount and the calculation strategy of the model on its accuracy and performance; (2) determining an optimum d/o of the models available in the study area (whereas the term of the "optimum" possibly sounds vague and its definition is dependent on the application, the "optimum d/o" was used here to mean the degree of the spherical harmonic expansion at which the highest accuracy is obtained in the tests performed with the terrestrial data in the study area); and, (3) clarifying the contribution of a best-fitting geopotential model with its optimum d/o to the development of a precise regional geoid model using RCR approach.
In the following, Figures 4a and 5a show the evaluation results of the DIR RL1, RL2, RL3, RL4 and RL5 models, which are enhanced using the high-resolution EGM2008 model and high-resolution DTM data (SRTM3), at GPS/leveling networks with 30 benchmarks throughout Turkey and 81 benchmarks in the northwest of the country, separately. The graphics show the standard deviations The numerical results of the study were interpreted from different perspectives, including: (1) verifying the role of the data amount and the calculation strategy of the model on its accuracy and performance; (2) determining an optimum d/o of the models available in the study area (whereas the term of the "optimum" possibly sounds vague and its definition is dependent on the application, the "optimum d/o" was used here to mean the degree of the spherical harmonic expansion at which the highest accuracy is obtained in the tests performed with the terrestrial data in the study area); and, (3) clarifying the contribution of a best-fitting geopotential model with its optimum d/o to the development of a precise regional geoid model using RCR approach.
In the following, Figures 4a and 5a show the evaluation results of the DIR RL1, RL2, RL3, RL4 and RL5 models, which are enhanced using the high-resolution EGM2008 model and high-resolution DTM data (SRTM3), at GPS/leveling networks with 30 benchmarks throughout Turkey and 81 benchmarks in the northwest of the country, separately. The graphics show the standard deviations of the geoid height residuals, calculated at the co-located GPS/leveling benchmarks with the GGM-derived geoid undulations, which were truncated and enhanced degree by degree from two to the maximum harmonic expansion d/o of the tested models (see Equations (1) and (4)). Looking at the statistics of the two datasets in the graphics, the improvement of the model performances from the first release to the last can be verified.
Similarly, Figures 4b and 5b show the performances of the TIM models, depending on the standard deviations of the geoid undulation residuals at co-located GPS/leveling benchmarks. Considering the results in graphics, the improvement of the TIM models as the release number increases is also clear.
The SPW models were also compared for the given two geodetic network datasets in Figures 4c and 5c. The statistics of SPW model validations also show the progress with increasing release number.
The numerical tests, which were carried out fulfilling the first objective of the current section, show improvement in the models with the increasing amount of data (by means of the observation cycle; please see Table 1). This conclusion has been drawn independently from the GGM computation strategy. Carefully looking at the graphics given in Figures 4 and 5, the direct models reveal almost similar performances to each other and with the EGM2008 model, of up to approximately 70 d/o of the spherical harmonic expansions; however, this is not the case for the TIM and SPW models. This situation can be explained as a consequence of the differences among their computation strategies (compare the strategies and the a priori information in computations in Table 1). Since EIGEN-51C combined GGM [39] was employed for the regularization of the GOCE-based models in the direct approach, they were reinforced by this background model. In the long-wavelengths, the used background model utilizes the information from GRACE and CHAMP, and the medium to short-wavelengths are provided by DNSC08GRA [77] and thus (indirectly) the terrestrial gravity (as contained by EGM2008) over the land areas. Regarding the information on the determination of the direct models, it is not possible to strictly quantify which spectral bands provide improvement, and the extent to which improvement is provided by the GOCE observations when considering solely these models.
The processing strategy for computation of the time-wise models relies exclusively on GOCE gradiometry and hl-SST observations, and only Kaula's regularization is employed to constrain the model at the short-wavelengths. Hence, the assessments of time-wise models can provide a realistic measure of the GOCE-based contribution to the GGMs without the intervention of other data sets. When the evaluation results of these models are considered, the contribution of the GOCE signal to the improvement of gravity field mapping in Turkey is clarified (see Figures 4b and 5b). The maximum improvement is seen at the spectral bands between d/o 100 and 250 as is promised by the GOCE Earth gravity field satellite mission.
The improvement in gravity field mapping with a GOCE-based contribution is shown to be significant in Figure 5. However, it is not highly significant in the test results obtained with 30 benchmarks, distributed over the country (see Figure 4). This may be because of the insufficient density and distribution of the test benchmarks (as seen in Figure 1) and relatively lower accuracy of the GPS/leveling heights at these benchmarks. Hence, it may be concluded that the first dataset, with 30 benchmarks, is not sufficient by means of data quality and distribution for carrying out a detailed assessment of the geopotential models regarding different spectral bands. However, the evaluation results using this data set were nonetheless considered as a country-wide look at the accuracy level of the tested GGMs. In this manner, the test results that were obtained through the locally distributed 81 benchmarks (Dataset II) revealed the significant improvement of the GOCE-based models in the medium-wavelength spectral bands.  Considering the entire statistics from the evaluations of direct, time-wise and space-wise models (namely, GO_CONS_GCF_2_ DIR, TIM, SPW) for the two datasets, it is concluded that the computation strategy of the GGMs does not have a significant role in the improvement of the model accuracies. In this part of the study, in addition to the GOCE-based DIR, TIM and SPW models (actually computed with the contribution of the GOCE and GRACE missions' data), the final releases of the two other combined GGMs, namely, the GOGRA04S and GOCO05S models, were also evaluated. The GOGRA04S model combined approximately four years of GOCE gradiometry data with GRACE observations from a cycle of satellite tracking in excess of 10 years. Despite the complementary advantage of the two missions in the solution, the GOGRA04S model did not reveal significant improvement in mapping the regional gravity field compared to the last releases of the GOCE models (see graphics in Figure 4d). The GOCO05S model combined four years of GOCE gradiometry data with GRACE, CHAMP and SLR observations (with the GRACE normal equations from the ITSG-GRACE2014S model). In the assessment results, the performance of the model is better than that of GOGRA04S and almost equal to the final releases of the GOCE-only models up to d/o of 155 and, in relative assessment, it performs better than the other models for degrees between 230 and 270 (Figures 4d and 5d). Considering the entire statistics from the evaluations of direct, time-wise and space-wise models (namely, GO_CONS_GCF_2_ DIR, TIM, SPW) for the two datasets, it is concluded that the computation strategy of the GGMs does not have a significant role in the improvement of the model accuracies. In this part of the study, in addition to the GOCE-based DIR, TIM and SPW models (actually computed with the contribution of the GOCE and GRACE missions' data), the final releases of the two other combined GGMs, namely, the GOGRA04S and GOCO05S models, were also evaluated. The GOGRA04S model combined approximately four years of GOCE gradiometry data with GRACE observations from a cycle of satellite tracking in excess of 10 years. Despite the complementary advantage of the two missions in the solution, the GOGRA04S model did not reveal significant improvement in mapping the regional gravity field compared to the last releases of the GOCE models (see graphics in Figure 4d). The GOCO05S model combined four years of GOCE gradiometry data with GRACE, CHAMP and SLR observations (with the GRACE normal equations from the ITSG-GRACE2014S model). In the assessment results, the performance of the model is better than that of GOGRA04S and almost equal to the final releases of the GOCE-only models up to d/o of 155 and, in relative assessment, it performs better than the other models for degrees between 230 and 270 (Figures 4d and 5d). Accordingly, the accuracies of the GGMs in terms of standard deviations of the geoid height residuals, obtained through their maximum and optimum degrees (optimum degree corresponds to spherical harmonic expansion degree of the model with the highest accuracy with a minimum standard deviation of the geoid undulation differences; see Figure 5), are shown in Figure 6a,b, separately computed for each test area, respectively. Given statistics in figures are derived using the enhanced coefficients of the GGMs over optimum (blue bars) and maximum (orange bars) expansion degrees of the models using the EGM2008 model and high-resolution DTM. The accuracies that are obtained through employing the full-expansion of the EGM2008 model are shown with a black color bar at the end of each graphic. Accordingly, the accuracies of the GGMs in terms of standard deviations of the geoid height residuals, obtained through their maximum and optimum degrees (optimum degree corresponds to spherical harmonic expansion degree of the model with the highest accuracy with a minimum standard deviation of the geoid undulation differences; see Figure 5), are shown in Figure 6a The given statistics show that the GGMs, enhanced on their optimum d/o, revealed an almost 50% better fit comparing the results obtained from the enhanced models on their maximum degrees.
Comparing with the EGM2008 model, the maximum improvement was recorded by the SPW RL5 model (a 23.8% improvement with respect to the EGM2008 model) in the test results using the second data set. Regarding the overall test results, obtained using both datasets, separately, it was seen that the GOCE-based GGMs had accuracies between 10 cm and 15 cm in Turkey. While interpreting the reasons for obtaining different fitting statistics of the GGMs in the tests with two datasets, it is worth noting the possible role of the tectonic structure in the region. Turkey is in the collision zone between three plates (African, Arabian and Eurasian) (see [78] for the active faults in Turkey) and hence continuously exposes horizontal and vertical crustal movements (see [79,80] for vertical and horizontal velocities in Turkey). Accordingly, the approximate rates in horizontal and vertical positions are reported as 2-3 cm/year in [80]. Moreover, an earthquake of Mw ≥ 6 occurs every 1-2 years, particularly around the North Anatolian Fault Zone. All these activities continuously distort the geodetic networks by time and even have a minor effect on the change of the gravity field; however, the latter is not significant for the scope of this study. Hence, the first geodetic network (Dataset I) was exposed to the Izmit earthquake of 1999 (7.6 Mw) and the northwest part of this geodetic network, including the 81 benchmarks of Dataset II, have been remeasured and adjusted since then. Although the networks were positioned at the same geodetic datum, the physical The given statistics show that the GGMs, enhanced on their optimum d/o, revealed an almost 50% better fit comparing the results obtained from the enhanced models on their maximum degrees.
Comparing with the EGM2008 model, the maximum improvement was recorded by the SPW RL5 model (a 23.8% improvement with respect to the EGM2008 model) in the test results using the second data set. Regarding the overall test results, obtained using both datasets, separately, it was seen that the GOCE-based GGMs had accuracies between 10 cm and 15 cm in Turkey. While interpreting the reasons for obtaining different fitting statistics of the GGMs in the tests with two datasets, it is worth noting the possible role of the tectonic structure in the region. Turkey is in the collision zone between three plates (African, Arabian and Eurasian) (see [78] for the active faults in Turkey) and hence continuously exposes horizontal and vertical crustal movements (see [79,80] for vertical and horizontal velocities in Turkey). Accordingly, the approximate rates in horizontal and vertical positions are reported as 2-3 cm/year in [80]. Moreover, an earthquake of M w ≥ 6 occurs every 1-2 years, particularly around the North Anatolian Fault Zone. All these activities continuously distort the geodetic networks by time and even have a minor effect on the change of the gravity field; however, the latter is not significant for the scope of this study. Hence, the first geodetic network (Dataset I) was exposed to the Izmit earthquake of 1999 (7.6 Mw) and the northwest part of this geodetic network, including the 81 benchmarks of Dataset II, have been remeasured and adjusted since then. Although the networks were positioned at the same geodetic datum, the physical establishments of the network stations were distorted by time and these distortions negatively influence their quality as control data in model evaluations. As a conclusion, the ground-truth data employed to validate the qualities and fitting performances of the GGMs should be of high quality, sufficiently dense, homogeneously distributed over the test area and as up-to-date as possible, especially in areas under the influence of dynamic Earth activities, such as Turkey.
In addition, the recently published sixth releases of the direct and time-wise models were considered in evaluations and they were compared with the RL5 models by means of standard deviations of the geoid undulations at the second dataset benchmarks. Figure 7 shows the comparisons of the models. According to obtained statistics, the RL6 models shown similar performances with the RL5 models between the 100 and 200 d/o of the spectrum. However, in degrees between 200 and 280, TIM RL6 models shown slightly better performance comparing with the RL5 and DIR RL6 models and had an improvement in standard deviation less than 1 cm.
Remote Sens. 2019, 11, x FOR PEER REVIEW 20 of 28 establishments of the network stations were distorted by time and these distortions negatively influence their quality as control data in model evaluations. As a conclusion, the ground-truth data employed to validate the qualities and fitting performances of the GGMs should be of high quality, sufficiently dense, homogeneously distributed over the test area and as up-to-date as possible, especially in areas under the influence of dynamic Earth activities, such as Turkey.
In addition, the recently published sixth releases of the direct and time-wise models were considered in evaluations and they were compared with the RL5 models by means of standard deviations of the geoid undulations at the second dataset benchmarks. Figure 7 shows the comparisons of the models. According to obtained statistics, the RL6 models shown similar performances with the RL5 models between the 100 and 200 d/o of the spectrum. However, in degrees between 200 and 280, TIM RL6 models shown slightly better performance comparing with the RL5 and DIR RL6 models and had an improvement in standard deviation less than 1 cm.

Testing the GGMs Contribution in Detailed Geoid Modeling
A high-resolution geoid model has been an essential component of the modern geodetic infrastructure in many countries, particularly in recent decades (see the geoid models' repository at [81] for some of the released regional geoid models). As a consequence of the progress in global positioning systems, many countries, such as Canada, New Zealand and the US, redefined their vertical datum to refer to a precise geoid model utilizing GNSS techniques. Hence, they also aimed to obtain further benefit from the satellite-based positioning technology for obtaining precise physical point heights in real time in many geodetic applications. Equations (11)-(13) provide the fundamental theory in the determination of a high-resolution geoid model using RCR steps. As was seen, the method combines GOCE-based GGM data with terrestrial data. Thus, the error content of the geoid undulation (Equation (11)) includes the errors from the global geopotential model, as well as the terrestrial gravity data and digital height data. Therefore, in order to clarify how the GGM accuracy affects the determination of the high-resolution geoid model, the numerical results from regional geoid modeling studies in Turkey are shared and discussed in this article. The terrestrial gravity data used in the RCR computations of this study are in five arc-minute resolution gravity anomalies in the IGSN-71 datum. Reference [82] reports that the estimated accuracy of the used gravity anomaly grid is at the level of 7-8 mGal based on cross-validations, as well as validations using point-wise gravity observations.

Testing the GGMs Contribution in Detailed Geoid Modeling
A high-resolution geoid model has been an essential component of the modern geodetic infrastructure in many countries, particularly in recent decades (see the geoid models' repository at [81] for some of the released regional geoid models). As a consequence of the progress in global positioning systems, many countries, such as Canada, New Zealand and the US, redefined their vertical datum to refer to a precise geoid model utilizing GNSS techniques. Hence, they also aimed to obtain further benefit from the satellite-based positioning technology for obtaining precise physical point heights in real time in many geodetic applications. Equations (11)-(13) provide the fundamental theory in the determination of a high-resolution geoid model using RCR steps. As was seen, the method combines GOCE-based GGM data with terrestrial data. Thus, the error content of the geoid undulation (Equation (11)) includes the errors from the global geopotential model, as well as the terrestrial gravity data and digital height data. Therefore, in order to clarify how the GGM accuracy affects the determination of the high-resolution geoid model, the numerical results from regional geoid modeling studies in Turkey are shared and discussed in this article. The terrestrial gravity data used in the RCR computations of this study are in five arc-minute resolution gravity anomalies in the IGSN-71 datum. Reference [82] reports that the estimated accuracy of the used gravity anomaly grid is at the level of 7-8 mGal based on cross-validations, as well as validations using point-wise gravity observations.
Using the terrestrial gravity data and the identical computation parameters, twelve experimental detailed geoid models (expTG-) were calculated for Turkey. Each of these models relies on a different GGM with either the optimum ( opt = 155) (called expTG-Opt models) or the maximum (called expTG-M models, having max = 330/300/280) degrees of the spherical harmonic expansions. In the tests, in addition to the best performing DIR RL5, TIM RL5, SPW RL5 and GOCO05S models, the EGM2008 coefficients up to the identical expansion degrees ( max = 155, 280, 300, 330) of the other tested models were also considered (see the test statistics of the experimental models in Table 3).
The twelve experimental models were validated at 81 GPS/leveling benchmarks of the geodetic network in northwest Turkey. Regarding the obtained results (see Table 3), all the validated experimental geoid models (relying on the reference GGM with optimum d/o of spherical harmonic expansion) provided an accuracy of 16.5 cm in terms of standard deviations of the residual geoid undulations without removing the tilt/trend at GPS/leveling benchmarks. However, employing the GGMs with the maximum degrees of their expansions, the calculated experimental geoid models revealed an accuracy of approximately 20 cm. This means that the use of optimum d/o of the geopotential model led to an improvement in the accuracy of the regional geoid model using Stokes' integral without any modification and any cap size limitation. The table also includes the statistics of the experimental geoid models after de-trending the models' surface simply using a third-order polynomial equation. The cross-validation of the de-trended models gave a so-called improved accuracy of around 9.7 cm for the models. The expTG-Opt-1 geoid model surface is shown in Figure 8. As also seen in Table 3 (statistics before fit), the detailed geoid models, which were calculated using the classical RCR algorithm with GOCE-based reference models, are not more accurate than the enhanced EGM2008 model. This is a consequence of insufficient accuracy and density of the mean terrestrial gravity anomalies that were used in the computations. The role of the terrestrial gravity data accuracy for the quality of the calculated geoid model is explained by [83,84]. In accordance with a 1-2 mGal accuracy of the terrestrial gravity data, having 1-2 km spatial resolution with homogenous distribution is required if 1-2 cm accuracy of the geoid undulations is aimed. Compared to this criterion, the accuracy and resolution of the used mean gravity anomalies (with 6 mGal accuracy and 5′ grid spacing) in computations do not satisfy the requirements for determining a centimeter accuracy regional geoid. The accuracy of the used GPS/leveling data (∼1.5-2.0 cm threedimensional position accuracy and ∼2.0 cm leveling height accuracy) in validations was sufficient for the purpose of this study. Although it is not officially declared in technical reports explaining the calculation strategy and data content of EGM2008, if the terrestrial gravity data of Turkey or the surrounding area was used in EGM2008 computations, this could explain the superior performance of this model in the validation results. It is possible to obtain slightly improved accuracy in regional geoid models by applying modifications to the Stokes' integral in order to achieve a more rigorous combination of the geopotential models with terrestrial data by also considering their variances [38]. Hence, it will be possible to compensate for the random errors that are contained in the terrestrial data with the improved quality of the geopotential models up to a certain amount. As also seen in Table 3 (statistics before fit), the detailed geoid models, which were calculated using the classical RCR algorithm with GOCE-based reference models, are not more accurate than the enhanced EGM2008 model. This is a consequence of insufficient accuracy and density of the mean terrestrial gravity anomalies that were used in the computations. The role of the terrestrial gravity data accuracy for the quality of the calculated geoid model is explained by [83,84]. In accordance with a 1-2 mGal accuracy of the terrestrial gravity data, having 1-2 km spatial resolution with homogenous distribution is required if 1-2 cm accuracy of the geoid undulations is aimed. Compared to this criterion, the accuracy and resolution of the used mean gravity anomalies (with 6 mGal accuracy and 5 grid spacing) in computations do not satisfy the requirements for determining a centimeter accuracy regional geoid. The accuracy of the used GPS/leveling data (~1.5-2.0 cm three-dimensional position accuracy and~2.0 cm leveling height accuracy) in validations was sufficient for the purpose of this study. Although it is not officially declared in technical reports explaining the calculation strategy and data content of EGM2008, if the terrestrial gravity data of Turkey or the surrounding area was used in EGM2008 computations, this could explain the superior performance of this model in the validation results.
It is possible to obtain slightly improved accuracy in regional geoid models by applying modifications to the Stokes' integral in order to achieve a more rigorous combination of the geopotential models with terrestrial data by also considering their variances [38]. Hence, it will be possible to compensate for the random errors that are contained in the terrestrial data with the improved quality of the geopotential models up to a certain amount.

Conclusions
The purpose of this study is to investigate the progress of GOCE High-Level Processing Facility (HPF) released geopotential models that include data from gravity field satellite missions, and provide assessment results of these models for Turkey. The analyses were carried out by means of both spectral and spatial approaches. Regarding the numerical results and obtained statistics through these analyses, the following conclusions were drawn: GOCE HPF-released GOCE-and GRACE-based global geopotential models were validated using the cumulative degree variances. Regarding the spherical harmonic expansion degrees (between 100 and 200 degrees of SH expansions, corresponding to the medium-wavelengths of the gravity field, where a significant contribution of the GOCE mission is expected), the cumulative formal errors (in terms of the geoid undulations) were 0.8 cm, 2.2 cm and 2.7 cm for DIR RL5, TIM RL5 and SPW RL4 models, respectively, whereas they were 2.0 cm, 2.9 cm and 7.1 cm for GOCO05S, GOGRA04S and EGM2008 models, respectively.
Considering the cumulative error amplitudes at the corresponding degrees for 100 km spatial resolution, the significant improvement toward the final releases of the geopotential models was verified (the improvement of final releases is almost up to 80% with respect to the first releases). On the other hand, the cumulative formal error estimate of the TIM RL5 model is three times worse than that of the DIR5 model. Though the difference seems to be quite significant, it is arguable if such a comparison without any calibration is objective.
The regional validations of the tested geopotential models against the GPS/leveling data were carried out with the spectral enhancement method. Hence, the higher frequencies in the content of the tested GGMs were synthesized from ultra-high degree coefficients of the EGM2008 model and the regional DTM data. As a conclusion, the spectral enhancement method revealed meaningful results and is recommended to be used for GGM validations using terrestrial data. In the results of the comparisons among the model releases (in every group of models as DIR, TIM, SPW), it is concluded that the model accuracy improves as its release increases, and that this improvement is independent of the computation strategy.
The spatial analyses of GGMs were carried out using GPS/leveling data of high-order geodetic networks in Turkey. The accuracy of the used control data, their spatial distribution and density, as well as the size of the validation area, are critical for the model validations and affect the reported performances. In this manner, the second test dataset can be said to be more suitable and hence the obtained validation statistics are more realistic in terms of the detailed analyses of the GGMs in the study area. The spatial analyses results clarified the GOCE mission's contribution to the improvement of the GGMs (with respect to the EGM2008 model), especially in the corresponding spectral bands of 100 and 250 degrees. Moreover, comparing standard deviations of the geoid undulation differences calculated with an increasing degree of the spherical harmonic expansion model, the optimum degree was determined with a minimum standard deviation value. Accordingly, the optimal value for d/o of spherical harmonic expansion is around 155 in the study area. In the test results explained in Section 3.2.1, an approximate accuracy for the tested GGMs is about 12 cm. EGM2008 model accuracy is calculated to be around 15 cm in Turkey. This is three times greater than the reported absolute accuracy of this model for well-covered areas with quality terrestrial data on Earth. This result indicates that either EGM2008 does not include the terrestrial data of Turkey or the quality of the included data is low. In the analyses, the GOGRA04S and GOCO05S models were also validated. However, since the obtained statistics did not reveal any improvement, it can be concluded that the complementary advantage of the GOCE gradiometry data with the observations from other gravity field satellite missions and SLR is not significant.
In the final part of the study, the role of the reference geopotential model accuracy, as well as its optimal expansion degree in determination of the detailed hybrid geoid model, were validated based on the RCR computation algorithm.
The obtained results revealed a 15% improvement in the accuracy of the gravimetric geoid model (considering the results without corrector surface fitting) when the GGM is employed with its optimum degree instead of the maximum degree (~16.5 cm vs. 19.5 cm for DIR RL05 as seen in Table 3). This conclusion was drawn using an unmodified Stokes kernel. On the other hand, a combination of the GGMs with terrestrial gravity data is possible using the RCR algorithm with stochastic modification of the Stokes kernel. Various approaches for the modification can be found in the literature. Hence, a more rigorous combination of the data may contribute to further improvement of the detailed geoid accuracy via compensating for the error budget, and realizing the GGMs' improvement to a certain extent would be possible. A research project, which was recently carried out in Turkey and the present study is a part of it (with The Scientific and Technological Research Council of Turkey, TUBITAK, support for contract number 114Y581), performed different methodologies related to detailed geoid modeling using hybrid data, and clarified the role of the modifications on a rigorous combination of GGMs with terrestrial data in order to improve the accuracy of a regional geoid model.
As a summary, global geopotential models have significantly progressed due to the data contribution from recent Earth gravity field satellite missions. The improved GGMs, in terms of accuracy and resolution, hence benefit many engineering and Earth science disciplines that use gravity field parameters in their analyses. The most significant impact of the improvement in GGMs' quality is on the vertical datum definition and realization, and global unification of the height systems. This study provided the methodological steps for the assessment of GGMs and quantified the improvement of the models in various regions, with cited statistics from the literature and numerical test results in Turkey.