Cluster Low-Streams Regression Method for Hyperspectral Radiative Transfer Computations: Cases of O 2 A- and CO 2 Bands

: Current atmospheric composition sensors provide a large amount of high spectral resolution data. The accurate processing of this data employs time-consuming line-by-line (LBL) radiative transfer models (RTMs). In this paper, we describe a method to accelerate hyperspectral radiative transfer models based on the clustering of the spectral radiances computed with a low-stream RTM and the regression analysis performed for the low-stream and multi-stream RTMs within each cluster. This approach, which we refer to as the Cluster Low-Streams Regression (CLSR) method, is applied for computing the radiance spectra in the O 2 A-band at 760 nm and the CO 2 band at 1610 nm for ﬁve atmospheric scenarios. The CLSR method is also compared with the principal component analysis (PCA)-based RTM, showing an improvement in terms of accuracy and computational performance over PCA-based RTMs. As low-stream models, the two-stream and the single-scattering RTMs are considered. We show that the error of this approach is modulated by the optical thickness of the atmosphere. Nevertheless, the CLSR method provides a performance enhancement of almost two orders of magnitude compared to the LBL model, while the error of the technique is below 0.1% for both bands.


Introduction
Radiative transfer models (RTM) are a key part of the remote sensing algorithms, which are used to retrieve atmospheric parameters from Earth observation data. The new atmospheric composition sensors with high spatial and spectral resolution require accurate and efficient computational algorithms. An accurate method to simulate spectral radiances in the absorption bands is the line-by-line (LBL) approach [1]. However, due to the high spectral variability of the gas absorption coefficient k in the absorption bands, the LBL-approach is very time-consuming because it requires up to several thousands of monochromatic computations per absorption band. In this regard, specifically designed hyperspectral RTMs are required. They reduce the number of monochromatic computations considerably without compromising accuracy. The basic idea behind the hyperspectral RTMs is to get rid of redundancies in hyperspectral data. This principle can be traced back to Ambartsumian [2], who noted that the transmission within a spectral interval does not depend on the LBL variation of k, but rather on the distribution of values of k within the spectral interval. This idea was used in the k-distribution method [3][4][5][6], in which the wavelengths are grouped into a smaller number of bins with similar values of k, and the LBL calculations are replaced by a smaller set of radiative transfer model. The simplest RTM is the single-scattering model, which solves the radiative transfer equation neglecting the integral term [33].
The gaseous absorptions for the O 2 A-and CO 2 bands are computed with the LBL model Py4CAtS [34], while the gas absorption cross-sections are taken from the HITRAN 2016 database [35]. The uncertainties in the spectroscopic parameters are not considered because their role is irrelevant for this study [36]. Continuum (also referred to as collision-induced absorption (CIA) [37,38]) contributions to molecular absorption are not taken into account. We note that the CIA process gives a broad and smooth contribution [39] and hence, does not impose difficulties for the regression techniques considered in this study.

Study Cases
We consider the reflected spectral radiance at the top of the atmosphere (TOA). In our simulations, the atmosphere is discretized into 35 layers with a step of 1 km between 0 and 25 km, and a step of 2.5 km between 25 km and 50 km. We assume a Lambertian surface with an albedo of 0.3. The solar zenith angle, the viewing zenith angle and the relative azimuth angle are 45 • , 35 • and 90 • , respectively.
In the O 2 A-band the spectral sampling is 0.001 nm in the spectral range 755-775 nm, while in the CO 2 band the spectral sampling is 0.0015 nm in the spectral interval 1590-1620 nm. Thus, on each band, 20,000 spectral points are considered. The Rayleigh cross-sections and depolarization ratios are computed as in [40], while the pressure and temperature profiles correspond to the US standard model atmosphere [41]. The computations are performed for the unit solar irradiance at the TOA.
We define 5 atmospheric scenarios: 'Clear sky', 'Aerosol 1', 'Aerosol 2', 'Cloud 1' and 'Cloud 2'. The 'Clear sky' scenario corresponds to an atmosphere without clouds and aerosols. In the 'Aerosol 1' and 'Aerosol 2' scenarios, the atmosphere contains the clean continental and the polluted continental aerosols taken from the OPAC database [42], respectively. The aerosol optical depths at the middle of the absorption bands are shown in Table 1. In the 'Cloud 1' and 'Cloud 2' scenarios, a continental clean cumulus cloud with a modified Gamma size distribution [42,43]: is considered. The size distribution parameters are a mod = 4.8 µm, α = 5 and γ = 2.16, the droplet size ranges between 0.02 and 50.0 µm, and the cloud optical depths are τ = 10 ('Cloud 1') and τ = 20 ('Cloud 2'). In both cases, the cloud-top height is 5 km and the cloud geometrical thickness is 1 km.
The number of streams considered for the multi-stream model is the same for all scenarios. We might need a different number of streams for different scenarios, but in order to simplify and ensure convergence of relative errors for all cases, we use 32 streams.

PCA-Based RTM
Since a PCA-based RTM can be implemented in several ways (e.g., [12,13,[15][16][17]44,45]), we describe below the main features of the PCA-based RTM used in this paper. The idea of the method is to refine the two-stream solution by a correction function estimated in the reduced space of the optical properties. Considering a discretization of the atmosphere in L layers, we define a 2L-dimensional vector x w for each wavelength {λ w } W w=1 , by where τ k and ω k are the optical thickness and the single-scattering albedo of the k-th layer, respectively. Thus, the vector x w encapsulates the wavelength variability of the optical parameters, which are the input parameters of the radiative transfer code. By applying PCA, a M-dimensional subspace is found which is spanned by a set of linear independent vectors (empirical orthogonal functions) {q k } M k=1 such that the centered (mean-removed) data x w −x lie mainly on this subspace, i.e., where y wk are the principal component (PC) scores. Following Natraj et al. [12], we introduce a correction function where I LS (x w ) is the radiance provided by the low-stream RTM, and I MS (x w ) is the radiance simulated by the multi-stream model. Setting ∆x w = ∑ M k=1 y wk q k , we consider a second-order Taylor expansion of f (x w ) aroundx in the direction ∆x w . Approximating the directional derivatives with central differences (see [13,17] for mathematical details) we obtain Please note, since M < 2L (and in practice, M 2L), the estimation of the correction function in the reduced data space is much faster than in the original space. Also note that the total number of calls of the multi-stream RTM is 2M + 1. Once f (x w ) is computed, the low-stream radiances can be converted into the multi-stream radiances by using Equation (3). Finally, the predicted multi-stream radiance I MS can be expressed as follows: In del Águila et al. [17], it was shown that a higher order expansion of f (x w ) in Equation (4) does not substantially improve the accuracy of the solution. In our simulations, an appropriate value for M is found iteratively, i.e., M is increased until the approximation error is sufficiently small.

Regression Relationship between Multi-Stream and Low-Stream Models
The LBL computations can be accelerated by finding a regression relationship between the multi-stream radiances and the optical parameters. In O'Dell [21], the correction function of the low-stream radiance as a function of the gas optical thickness τ gas , i.e., the relative error was analyzed ( Figure 1). The idea is to compute f (τ gas ) at certain values of τ gas , and then to use a regression model for computing f (τ gas ) at the remaining values of τ gas . However, for the present application, such an approach has two major drawbacks: 1. the dependence of f (τ gas ) on τ gas is non-linear and, therefore, the application of the regression model requires a binning of the τ gas values; 2. in a mathematical sense, f (τ gas ) is not a function (for a value of τ gas , there are several values of f (τ gas )) and therefore, even for a fine binning, the regression model will be not accurate.
To overcome these drawbacks, O'Dell [21] devised a regression model based on the so-called adjusted gas optical thickness, defined as the gas optical thickness from the TOA down to the layer in the atmosphere where the scattering optical depth equals some critical value. A similar idea, which we exploit in the following, is to use a regression model for the radiances computed by a low-stream RTM rather than in the τ gas -space. Figure 2 shows a comparison between the multi-and low-stream radiances for the 'Aerosol 2' scenario. Although the errors of the low-stream RTM are significant (e.g., they may reach 200%), the low-and multi-stream radiances have a similar spectral behaviour. Moreover, as shown in Figure 3, the dependence between the low-and multi-stream radiances is almost linear. In this respect, it seems reasonable to cluster the spectral points according to the radiance values computed with a low-stream RTM (rather than according to the optical properties) and to establish a regression model between the low-and multi-stream radiances within each cluster. In order to decrease the errors of the regression estimates, an additional wavelength-dependent parameter will be included in the regression model: the direct transmittance T.

Cluster Low-Streams Regression Method
The Cluster Low-Streams Regression (CLSR) method can be formulated as follows.

Consider a high-resolution spectrum
where α c , β c and γ c are the regression coefficients of the c-th cluster, andT is the corresponding direct transmittance. 6. Compute the regression coefficients α c , β c and γ c as a solution to the least square problem 7. Use the values of (α c , β c , γ c ) found in the previous step to restore the multi-stream radiances for all the spectral points according to Equation (6) (Figure 4d).
Here, the "hat" notationÎ refers to the sorted radiances, the "bar" notation I refers to the equidistant radiances entering the regression model, and the "tilde" notationĨ refers to the predicted radiances. Please note that the total number of regression points, and thus the number of calls of the multi-stream model, is nC.

Efficiency and Computational Performance Estimation
To estimate the accuracy of the acceleration techniques, we consider the residual error for the radiance at each spectral point λ i : whereĨ MS,i is the predicted radiance calculated with either the PCA-based RTM (cf. Equation (5)) or the CLSR method (cf. Equation (6)), while I cont MS is the radiance without absorption (i.e., the continuum radiance, which is used to avoid radiance values close to zero in the denominator of Equation (8), when strong gas absorption is present [15]). The mean relative error is computed as follows: Additionally, the median and interquartile range (IQR) of the residual are computed. Essentially, these parameters serve as robust metrics to evaluate the variability of the residuals in the corresponding spectral band, thereby with a lower sensitivity to single outliers [16].
To estimate the performance enhancement, we define the speedup factor as the ratio between the computational time of a multi-stream LBL calculation to that of a certain acceleration technique. To exclude the hardware-related factors from our analysis, we estimate the speedup factor for the PCA-based RTM as [16] where t MS and t LS are the computational times for a single monochromatic calculation corresponding to the multi-and low-stream RTMs, respectively, while t PCA is the computational time of the PCA.
Thus, the numerator in Equation (10) is the time required for the LBL computations. Please note the PCA-based RTM requires 2M + 1 calls to the multi-stream model. The speedup factor for the CLSR method is defined by where t LSM is the computational time needed for the least-squares method according to Equation (7). Recall that the total number of regression points and the corresponding calls of the multi-stream model is nC.

Performance of the Low-and Multi-Streams Models
In this section, we assess the performance of the low-and multi-stream RTMs before applying the acceleration techniques. The spectral radiances for three atmospheric scenarios ('Clear sky', 'Cloud 1' and 'Cloud 2') are illustrated in Figure 5. Tables 2 and 3 provide the mean relative errors of the low-stream RTMs with respect to the multi-stream RTM for the O 2 A-and CO 2 absorption bands, respectively. Table 4 shows the computational performance for a different number of discrete ordinates N do . The results show that (i) the relative errors of the single-scattering model are higher than those of the two-stream model, (ii) the relative error increases as the optical thickness of the atmosphere increases, and (iii) the speedup factors of the single-scattering model are considerably higher than those of the two-stream model.

Spectral Residuals
The residuals (cf. Equation (8)) are computed for (i) different numbers of principal components in the case of the PCA-based RTM, and (ii) different numbers of regression points per cluster in the case of the CLSR method. Figure 6 shows these residuals for the 'Cloud 2' and the 'Aerosol 2' scenarios.
From Figure 6 the following conclusions can be drawn.
1. The residuals decrease when increasing the number of PCs and regression points. 2. The interquartile range for the CLSR method is substantially reduced when switching from 1-2 to 3 regression points for the CO 2 band, and from 3 to 4 regression points for the O 2 A-band. 3. In both spectral bands, the interquartile range for the PCA-based RTM decreases gradually with the number of principal components. 4. The residuals in the O 2 A-band are systematically higher than those in the CO 2 band. This behaviour is more pronounced when the gas optical thickness is large, thus resulting in larger discrepancies for the PCA-based RTM and almost negligible discrepancies for the CLSR approach. 5. In the O 2 A-band, the median values of the residuals Equation (8) are higher than 2%. The median values remain almost constant with the number of principal components and regression points. However, the median values as well as the interquartile range for the PCA method are generally higher than those of the CLSR method. This trend remains coherent using the single-scattering RTM instead of the two-stream RTM, although the residuals are substantially higher for the PCA-based RTM.

Estimation of the Required Parameters for the Acceleration Techniques
In order to select the optimal number of PCs, we compute the explained variance ratio of the optical data, which accounts for the variance associated with a given number of PCs. The results in Figure 7 show that in all scenarios almost 99% of the optical data variance can be explained within the first four PCs. Moreover, from Figure 6a we can infer that (i) the convergence of the two-stream radiances is reached for 4-5 PCs, while (ii) the convergence of the single-scattering radiances requires 4-7 PCs (Figure 6b). To select the number of clusters and regression points for the CLSR method, we estimate the mean error (cf. Equation (9)). As an example, we illustrate in Figure 8 the mean errors for the 'Aerosol 2' scenario. The results show that 4-5 clusters and 3-5 regression points guarantee a small mean error.

Accuracy of the CLSR and PCA-Based Methods
In this section, we assess the accuracy of the PCA-based and CLSR methods. The residuals (Equation (8)) and interquartile ranges corresponding to the PCA-based and CLSR methods, as well as for the two-stream and single-scattering models, are shown in Figure 9. The results can be summarized as follows: 1. For both low-stream models and cloudy scenarios, (i) the residuals of the PCA-based method in the O 2 A-band are substantially larger than those of the CLSR method (on average, for the two-stream model in the O 2 A-band, the residuals are above 1% for the PCA-based method and below 0.01% for the CLSR method), while (ii) the PCA-based and CLSR methods yield comparable accuracies in the CO 2 band (on average, for the two-stream model in the CO 2 band, the residuals are below 0.1% for the PCA-based method and below 0.01% for the CLSR method). Please note similar results were established in previous studies, i.e., the errors of the PCA-based method generally increase with the optical depth [16]. The superiority of the CLSR method is also demonstrated by the interquartile ranges: these are much smaller for the CLSR method. 2. For both O 2 A-and CO 2 bands, (i) the residuals of the single-scattering model with the PCA-based method are higher than those corresponding to the two-stream model, while (ii) the residuals of the two-stream and single-scattering models with the CLSR method are comparable (on average, for the single-scattering model, the residuals of the CLSR method are below 0.2% in the O 2 A-band and below 0.1% in CO 2 band, while for the two-stream model the corresponding errors are below 0.01% in both spectral bands).
Thus, (i) the two-stream model with the PCA-based and CLSR methods yields accurate results, (ii) the efficiency of the PCA-based method decreases when increasing the optical thickness, and (iii) both the two-stream and the single-scattering models with the CLSR method provide reasonable accuracies.
Please note the considered atmospheric scenarios implicitly involve a set of assumptions which causes differences between RTM calculations and actual (physically realistic) atmospheric conditions. Furthermore, the discrete ordinate method itself is an approximate technique which in principle should be validated against rigorous analytical methods [46]. However, we use the same atmospheric definition inputs for comparing acceleration techniques.

Acceleration Techniques: Computational Performance
In this section, we evaluate the computational efficiency of the CLSR method and compare it with the PCA-based method. The speedup factors computed by using Equations (10) and (11) for the PCA-based and CLSR methods, respectively, are given in Table 5. Since the single-scattering model with the PCA-based method provides large errors (see Figure 9b), the table includes the speedup factor S PCA for the two-steam model with the PCA-based method, as well as for the two-stream and single-scattering models with the CLRS method, S TS CLSR and S SS CLSR , respectively. According to the optimal values of the PCs and regression points and clusters, the number of calls to the multi-stream model for the PCA-based method is 9, for the two-stream model with the CLSR method 12, and for the single-scattering model with the CLSR method 20. The computational time to perform the PCA calculation t PCA is 3 orders of magnitude higher than that of the least-squares calculation t LSM . However, since the most time-consuming part is due to the number of calls to the multi-stream model, the efficiencies of the two-stream model with the CLRS and the PCA-based method are comparable. In contrast, the computational performance of the single-scattering model with the CLRS method is much higher than that with the PCA-based method due to the neglect of multiple scattering computations. Comparing the speedup factors of this study with those of other authors (e.g., [21]), we find that our values are of the order of their speedup factors and one order of magnitude higher when considering the single-scattering model for the CLSR technique.

Computation of Convolved Spectra
In this section, we examine the efficiency of the CLSR method for computing convolved spectra. The high-resolution spectra in the O 2 A-band are convolved with the slit functions corresponding to GOME-2 and TROPOMI instruments, while the radiance spectra in the CO 2 band are convolved with the GOSAT slit function. In this paper, slit functions are modeled with a Gaussian function. The corresponding full widths at half maximum (FWHM) are listed in Table 6. The FWHM considered for the O 2 A-band are based on pre-launch calibrations [47] and for the CO 2 band on [48]. The examples of convolved spectra corresponding to the 'Cloud 1' scenario are shown in Figure 10. The computations are performed by using the PCA-based and the CLSR methods (in conjunction with the two-stream model) as well as the LBL approach. Tables 7 and 8 show the mean relative errors for the PCA-based (ε PCA ) and CLSR (ε CLSR ) methods for the O 2 A-band and the CO 2 band, respectively. In addition, the residuals for non-convolved spectra are shown for comparison. For the 'Clear sky' and aerosol scenarios the accuracies of both methods are comparable, while for cloud scenarios the CLSR method is more accurate. Please note the residuals estimated for the convolved spectra are very close to those for the non-convolved ones and the value of residuals according to Equation (8) are robust. Figure 10. Convolved spectra for the multi-stream model and the two acceleration methods for the 'Cloud 1' scenario using PCA and CLSR methods for the sensors GOME-2, TROPOMI and GOSAT. For GOME-2 and TROPOMI the O 2 A-band spectra are convolved, while for GOSAT the CO 2 spectra are convolved. Table 7. Mean relative error ε for the convolved spectra compared with the multi-stream spectra for the PCA-based and CLSR methods and for the different atmospheric scenarios considered for the O 2 A-band. In all cases, the low-stream model considered is the two-stream model. The instruments analyzed are GOME-2 and TROPOMI, which are compared with the non-convolved values.

Conclusions
In this study, we developed the Cluster Low-Streams Regression (CLSR) method for fast radiative transfer simulations of the O 2 A-and CO 2 absorption bands. The CLSR method exploits a strong close-to-linear relationship between the radiances computed with a low-stream model (which is either the two-stream or the single-scattering model) and the radiances computed with the multi-stream model. The spectral points are grouped in several clusters according to the values of the low-stream radiances. For each cluster, the regression model is established between the low-stream and multi-stream models, where the corresponding regression coefficients are found by using the least-squares method. This approach can be regarded as a variation of the low-streams interpolation method explained in [21], in which the binning is performed in the space of gas optical depths.
The CLRS method was compared with the PCA-based RTM in several atmospheric scenarios. For the 'Clear sky' scenario, the performance of both acceleration methods is comparable. For the cloud scenarios, the CLSR method shows more accurate results than the PCA-based method. However, to improve the performance of the PCA-based method, in [14][15][16] the optical depth binning technique was applied, i.e., the spectral points were grouped into bins according to the optical depth values and the PCA-based method was applied independently for each bin. Such an approach improves the accuracy of the PCA-based method, although at the cost of increasing the number of multi-stream computations. Comparing the results of [14][15][16] with the results obtained in our study, we can conclude that the PCA-based binned approaches and the CLSR method are comparable in terms of accuracy.
Our analysis has shown that, although the CLSR method requires more calls to the multi-stream model than the PCA-based model, the corresponding speedup factors are very similar, with slightly better results for the CLSR method. However, this difference is not significant. In addition, we note that the CLSR method does not require to deal with the eigenvalue problem as the PCA-based RTMs do. The CLSR method can be used in conjunction with either the two-stream or the single-scattering model providing a performance enhancement of almost two orders of magnitude while keeping the maximum error below 0.1% (note that the use of the single-scattering model instead of the two-stream model within the framework of the PCA-based method drastically reduces the accuracy).
As future goals, following [18,49] we plan to extend the CLSR method for computing the Stokes parameters and analyze the possibility of hybrid use of the CLSR method, the PCA-based method and the correlated k-distribution technique. It is also planned to analyze different atmospheric scenarios such as cirrus clouds [50] as well as the efficiency of the CLRS method for improving the accuracy of other types of approximate models. In this case, one can expect reducing the number of regression points if instead of the single scattering model, the double scattering approximation is considered [51]. For optically thick media, as an approximate model, asymptotic radiative transfer models can be used [52].