The Application of Discrete Wavelet Transform with Improved Partial Least-Squares Method for the Estimation of Soil Properties with Visible and Near-Infrared Spectral Data

: This study evaluated whether wavelet functions (Bior1.3, Bior2.4, Db4, Db8, Haar, Sym4, and Sym8) and decomposition levels (Levels 3–8) can estimate soil properties. The analysis is based on the discrete wavelet transform with partial least-squares (DWT–PLS) method, incorporated into a visible and near-infrared reﬂectance analysis. The improved DWT–PLS method (called DWT–Stepwise-PLS) enhances the accuracy of the quantitative analysis model with DWT–PLS. The cation exchange capacity (CEC) was best estimated by the DWT–PLS model using the Haar wavelet function. This model yielded the highest coefﬁcient of determination ( R v 2 = 0.787, p < 0.001), with the highest relative percentage deviation (RPD = 2.047) and lowest root mean square error (RMSE = 4.16) for the validation data set of the CEC. The RPD of the SOM predictions by DWT–PLS using the Bior1.3 wavelet function was maximized at 1.441 ( R v 2 = 0.642, RMSE = 5.96), highlighting the poor overall predictive ability of soil organic matter (SOM) by DWT–PLS. Furthermore, the best performing decomposition levels of the wavelet function were distributed in the ﬁfth, sixth, and seventh levels. For various wavelet functions and decomposition levels, the DWT–Stepwise-PLS method more accurately predicted the quantiﬁed soil properties than the DWT–PLS model. DWT–Stepwise-PLS using the Haar wavelet function remained the best choice for quantifying the CEC ( R v 2 = 0.92, p < 0.001, RMSE = 4.91, and RPD = 3.57), but the SOM was better predicted by DWT–Stepwise-PLS using the Bior2.4 wavelet function ( R v 2 = 0.8, RMSE = 5.34, and RPD = 2.24) instead of the Bior1.3 wavelet function. However, the performance of the DWT–Stepwise-PLS method tended to degrade at high and low decomposition levels of the DWT. These degradations were attributed to a lack of sufﬁcient information and noise, respectively.


Introduction
Soil quality mainly depends on the chemical and physical properties of the soil, which are estimated by the cumulative effects of natural factors involved in its formation, including climate, topography, parent material, biological activity, and time [1]. The development of precision The main objectives of this study were as follows: (a) to characterize the representativeness of different wavelet functions and different decomposition scales in the prediction model, and (b) to understand appropriateness of the improved regression analysis (Stepwise-PLS) in improving the prediction accuracy of soil properties.

Soil Sample Preparation and Laboratory Analysis
For this study, 193 soil samples of Burozem and Cinnamon soil were collected from the Experimental Station of Qingdao Agricultural University (Shandong Province, China) in 2014. The main parent materials of Cinnamon soil consist of loess and lime, and its mineral composition is primarily hydromica, montmorillonite, and kaolinite. The main parent materials of Burozem soil are non-calcareous eluvial slope deposits and earthy deposits, and its mineral composition is primarily hydromica, kaolinite, and vermiculite. For one certain soil type, the soil samples have similar parent materials, mineral composition, and texture. The collected soil samples were air-dried for 72 h and then passed through a 4.75-mm aperture square-hole sieve to remove the coarse matter and organic debris. The physical and chemical properties of the soil were measured as described by I.S.S.C.A.S. (1978) [31]. Briefly, the soil organic matter (SOM) was estimated by K 2 Cr 2 O 7 oxidation at 180 • C, and the cation exchange capacity (CEC) was estimated by displacing the exchangeable cations on the soil particle surfaces with NH4 + . Each soil sample was measured in duplicate, and the main soil properties are summarized in Table 1. The hyper-spectral reflectance data were obtained from 350 to 2500 nm by an Analytical Spectral Device spectroradiometer with a spectral resolution of 1.4 nm. The collected soil samples were measured in a dark room in the laboratory. The field-of-view was 8 • , and the illumination was provided by a 1000-W halogen lamp fixed 100 cm directly above the soil plane. To eliminate the effects of soil moisture as much as possible, the soil samples were naturally air dried before the hyper-spectral measurements. The soil samples were placed inside a circular black capsule with a diameter of 10 cm and depth of 1 cm, and were leveled to a smooth surface with the edge of a spatula. In the present laboratory experiments, the reflectance between each reflectance measurement was standardized using a white Spectralon reference panel [32]. The visible and near infrared reflectance spectra of the soil samples were converted to spectral reflectance by dividing them by the Spectralon reference panel. These first derivative transformations are known to minimize the sample variations caused by changes in the grinding and optical settings [21]. To eliminate the noise in the first derivative spectra, we applied Savitzky-Golay smoothing [33] to the original reflectance spectra curve. In addition, like most of the hyper-spectral prediction models of soil properties, this study takes the first derivative of the spectral reflectance as an independent variable. Prior to the data analysis, six soil samples yielded negative hyper-spectral data after the hyper-spectral measurements of all soil samples and were discarded from our analysis.
We classified the original soil samples prior to data analysis. Combined with some previous studies, there is a definite masking effect between soil properties. For example, Rossel's studies [34] have confirmed that the masking of other constituents by organic matter in hyper-spectral reflectance is weakened by soil organic matter contents below 10 g/kg. Therefore, the classification of all soil samples is an essential preliminary work, in order to avoid the masking effect between soil properties. The soil samples were selected on the condition that the soil properties to be predicted varied significantly (with greater coefficients of variance (CV) values), whereas other properties changed slightly (with lower CV values). In this study, we grouped the soil samples with soil organic matter contents below 10 g/kg into the A group, which was used as the dataset for estimating CEC. On the other hand, the soil samples with CEC values below 20 cmol/g were selected into the B group, which was used as a data set for estimating SOM.
The soil samples were randomly selected from groups A and B at a 7:3 ratio to get calibration and validation data sets, respectively. The A group dataset (69 soil samples) was split into 49 randomly selected samples for calibration, and the remaining 20 soil samples were reserved for validation. Similarly, the B group dataset (114 soil samples) was split into 90 randomly selected samples for calibration, and 34 samples for validation. The soil properties in the soil samples are statistically described in Table 2.

Discrete Wavelet Transform (DWT)
Wavelet analysis theory has been extensively reported in previous literature, so it is only briefly introduced here. DWT is ideally suited to spectral feature extraction, most fundamentally because it performs a multi-scale analysis of the signal [35]. DWT can be mathematically expressed as a finite length sequence and a discrete wavelet basis of the inner product, where each inner product factor is a discrete wavelet transform [13]. The DWT coefficients are expressed as follows: where W f (j, k) is the value of DWT coefficient; f (n) is a sequence of length n. The discrete wavelet basis is given by where S 0 j and S 0 j,k correspond to the discrete wavelet scale and the translation parameters, respectively, and the superscript * denotes the complex conjugate. For a binary wavelet, S 0 = 2. To classify the detailed information into approximate scale categories, the hyper-spectral signal is projected onto a wavelet function. This approach is superior to other analytical methods. Unlike the Fourier transform, the scale transformation of a discrete wavelet transform is carried out on wavelets with non-unique and irregular fundamental waves, largely different waveforms of the fundamental waves, and largely different support lengths and regularity. The signal processing of different wavelet signals in the same signal often yields large differences among the results, which inevitably affects the final processing results. Therefore, the wavelet functions must be appropriately chosen for hyper-spectral pre-processing. The basic properties of the different wavelet functions available for DWT are given in Table 3.

Stepwise-Partial Regression Analysis
To eliminate the redundancy and noise in the hyper-spectral data, this study improves the partial least-squares fitting. The error terms in the partial least-squares regression model are not normally distributed, and their exact distribution is particularly difficult to discern. Therefore, the parameter significances, which determine the choice of variables, cannot be tested by evaluating the parameter statistics. Here we improve the partial least-squares model by proposing a variable selection method in MSLR, based on the fitting error. We refer to this method as Stepwise-PLS. This article briefly introduces the idea of improvement showed in Figure 1.
In this improved algorithm, all of the original variables are used in the PLS model fitting, and the root-mean-square deviation (RMSE) is calculated as where A(i) and F(i) are the actual and forecasted values, respectively. The RMSE calculated by Equation (3) is set to E0. Next, one of the original variables are removed each time, until all variables have passed through, and the remaining variables are reinserted into the PLS model. The RMSE is calculated in each instance, and is set to E. The minimum value of RMSE is computed (E = min (RMSE)) and compared with E0. If E < E0, the variable corresponding to the minimum error is removed, presuming that its removal will improve the model precision. In this improved algorithm, all of the original variables are used in the PLS model fitting, and the root-mean-square deviation (RMSE) is calculated as where ( ) and ( ) are the actual and forecasted values, respectively. The RMSE calculated by Equation (3) is set to E0. Next, one of the original variables are removed each time, until all variables have passed through, and the remaining variables are reinserted into the PLS model. The RMSE is calculated in each instance, and is set to E. The minimum value of RMSE is computed (E = min (RMSE)) and compared with E0. If E < E0, the variable corresponding to the minimum error is removed, presuming that its removal will improve the model precision. Conversely, if E > E0, then all of the unfavorable variables are rejected, and all remaining variables are deemed suitable for the PLS model. These series of algorithmic flows are coded and performed with Statistics toolbox in MATLAB Release 12b (Matrix Laboratory, Math-Works, Natick, United States).

Regression Analysis Based on Discrete Wavelet Transform with Stepwise-Partial Least-Squares
To reduce the effects of the edge noise, the hyper-spectral measurements were acquired between 390 and 2437 nm. The first step computes the approximation and detailed coefficients of the wavelet decomposition, which are used in the Stepwise-PLS regression analysis. Each soil reflectance spectrum was subjected to eight levels of DWT. Combining the detailed and approximate signals gives more complete coverage of the spectral information than the approximate or detailed signals alone. The approximate coefficients are severely distorted when the decomposition level is too high; conversely, when the decomposition level is too low, the information in the approximate coefficients is too verbose. After several attempts, the approximate signal was optimized at four levels. Hence, after obtaining the detailed coefficients at levels 3, 4, 5, 6, 7, and 8, the original reflectance spectroscopy was replaced with the approximation coefficients at level 4. The spectral signal was decomposed into different wavelet functions at level 6, and a single branch was reconstructed from

Regression Analysis Based on Discrete Wavelet Transform with Stepwise-Partial Least-Squares
To reduce the effects of the edge noise, the hyper-spectral measurements were acquired between 390 and 2437 nm. The first step computes the approximation and detailed coefficients of the wavelet decomposition, which are used in the Stepwise-PLS regression analysis. Each soil reflectance spectrum was subjected to eight levels of DWT. Combining the detailed and approximate signals gives more complete coverage of the spectral information than the approximate or detailed signals alone. The approximate coefficients are severely distorted when the decomposition level is too high; conversely, when the decomposition level is too low, the information in the approximate coefficients is too verbose. After several attempts, the approximate signal was optimized at four levels. Hence, after obtaining the detailed coefficients at levels 3, 4, 5, 6, 7, and 8, the original reflectance spectroscopy was replaced with the approximation coefficients at level 4. The spectral signal was decomposed into different wavelet functions at level 6, and a single branch was reconstructed from the approximated DWT coefficients at level 4, as well as the detailed DWT coefficients at level 6 ( Figure 2). The approximate coefficients were nearly independent of the level, but the signals of the detailed coefficient reconstruction were influenced by the different wavelet functions. Therefore, one must try different wavelet functions in the prediction model and find the best wavelet function, as well as its appropriate decomposition level. Meanwhile, to compare the application effects of the stepwise and standard PLS methods, we regressed the wavelet coefficients of the various wavelet functions and their decomposition levels using the two PLS methods. When performed by PLS and Stepwise-PLS, these approaches were named DWT-PLS and DWT-Stepwise-PLS, respectively.
With reference to previous studies, the predictive ability of the DWT-PLS and DWT-Stepwise-PLS method was assessed by four indicators: the coefficient of determination R 2 , the p value, root-mean-square deviation (RMSE), and the relative percentage deviation (RPD). R 2 represented the calibration measures of the models, and R v 2 , RMSE, and RPD represented the validation measures of the models. Along with the p value, R 2 also evaluates the degree of correlation between the predicted and true values. As the RMSE is a parameter in the Stepwise-PLS model, it was excluded as an evaluation indicator of the model's calibration results, but was added as an effective evaluation indicator of the model's validation results. The RMSE and RPD define the accuracy of the error between the predicted and true values. Mathematically, RPD defines the ratio of the sample standard deviation (StD) to the RMSE (StD/RMSE). According to related research [36], models with RPD values below 1.0 are very poor predictive performers, and their use is not recommended. Models with RPD values between 1.0 and 1.4 are also poor performers, and can distinguish only high and low values. When the RPD values is between 1.4 and 1.8, the model is a fair performer, and is useful for assessment and correlation. Models with RPD values between 1.8 and 2.0 are sufficiently accurate for quantitative predictions, and those with RPD values between 2.0 and 2.5 are high-quality quantitative predictors. Any model with an RPD values above 2.5 makes excellent predictions.
stepwise and standard PLS methods, we regressed the wavelet coefficients of the various wavelet functions and their decomposition levels using the two PLS methods. When performed by PLS and Stepwise-PLS, these approaches were named DWT-PLS and DWT-Stepwise-PLS, respectively. With reference to previous studies, the predictive ability of the DWT-PLS and DWT-Stepwise-PLS method was assessed by four indicators: the coefficient of determination R 2 , the p value, root-mean-square deviation (RMSE), and the relative percentage deviation (RPD). R 2 represented the calibration measures of the models, and Rv 2 , RMSE, and RPD represented the validation measures of the models. Along with the p value, R 2 also evaluates the degree of correlation between the predicted and true values. As the RMSE is a parameter in the Stepwise-PLS model, it was excluded as an evaluation indicator of the model's calibration results, but was added as an effective evaluation indicator of the model's validation results. The RMSE and RPD define the accuracy of the error between the predicted and true values. Mathematically, RPD defines the ratio of the sample standard deviation (StD) to the RMSE (StD/RMSE). According to related research [36], models with RPD values below 1.0 are very poor predictive performers, and their use is not recommended. Models with RPD values between 1.0 and 1.4 are also poor performers, and can distinguish only high and low values. When the RPD values is between 1.4 and 1.8, the model is a fair performer, and is useful for assessment and correlation. Models with RPD values between 1.8 and 2.0 are sufficiently accurate for quantitative predictions, and those with RPD values between 2.0 and 2.5 are high-quality quantitative predictors. Any model with an RPD values above 2.5 makes excellent predictions.   Table 1 lists  Prior to spectral analysis, we must determine the intrinsic relationships between the soil density, pH, SOM, and CEC of the soils. The correlation coefficients between pairs of soil density, pH, SOM, and CEC values are shown in Table 4. There are no significant correlations between the soil properties, implying that these four indexes are individual and representative of the study area.

Reflectance Spectral Feature of Soil Properties
Panels (a) and (b) of Figure 3 show the average spectral reflectance of the soil samples and their first derivatives, respectively. The first derivatives of the soil reflectance contain more absorption bands, and hence much more SOM and CEC information than the raw average spectral reflectance. The absorption bands in the first derivatives appear near 566 nm, 1418 nm, 1918 nm, and 2205 nm. Figure 4 plots the Pearson's correlation coefficients between the first derivative of the reflection spectrum and the soil properties as functions of wavelength. As indicated in Figure 4, SOM has more absorption bands than CEC, with characteristic spectral peaks at 1388 nm, 1888 nm, 2182 nm, and 2303 nm for positive correlations, and with characteristic spectral peaks at 732 nm, 1257 nm, 1469 nm, and 2022 nm for negative correlations. Many of these absorption bands are immediately attributable to organic matter (such as absorptions by hydroxyl (O-H) stretching vibrations near 1469 nm and 1888 nm, and aliphatic C-H stretching near 2182 nm [15]). The absorption spectral bands of CEC appear at 710 nm, 1274 nm, 1442 nm, and 1966 nm (all negative correlations).
Obviously, the negatively-correlated absorption bands of the SOM and CEC are very similar, indicating that the characteristic reflection bands of the SOM and CEC probably overlap (masking effect). This masking effect of spectral features is common, which complicates the spectral analysis. Therefore, the classification of soil samples is necessary before analyzing the quantitative prediction model (Section 2.1).
According to the Table 2, the CVs of all soil properties (except CEC) are below 40%, indicating that the pH, soil density, and SOM exert limited impacts on the CEC in the spectral analysis. The StDs and CVs of the pH and soil density are very small, meaning that the variations in these properties (pH and soil density) are insufficient for effective modeling. Moreover, because of their small CV, their spectral features are easily covered by the SOM and CEC features ( Figure 4). Hence, in the following discussion, the pH and soil density are treated only as simple comparison items, while the SOM and CEC (which are important indicators of soil fertility) are the main variables in the spectral analysis. Remote Sens. 2018, 10, x FOR PEER REVIEW 9 of 21    Obviously, the negatively-correlated absorption bands of the SOM and CEC are very similar, indicating that the characteristic reflection bands of the SOM and CEC probably overlap (masking effect). This masking effect of spectral features is common, which complicates the spectral analysis. Therefore, the classification of soil samples is necessary before analyzing the quantitative prediction model (Section 2.1).
According to the Table 2, the CVs of all soil properties (except CEC) are below 40%, indicating

Calibration of the Discrete Wavelet Transform with Partial Least-Squares Model
The calibration results of DWT-PLS are summarized in Table 5. The SOM and CEC prediction models established by DWT-PLS perform well on the calibration dataset in this study. Moreover, the DWT-PLS models perform better at levels 5, 6, 7, and 8 of the DWT than at levels 3 and 4, with a larger R 2 ( Table 5), meaning that higher decomposition levels improve the performance of the DWT-PLS model, which quantitatively predicts the SOM and CEC values. In the calibration set of the quantitative CEC prediction, the R 2 values yielded by the DWT-PLS models using the Haar, Sym8, and Db4 wavelet functions were 0.97, 0.97, and 0.92, respectively (all > 0.90).
In the calibration set of the quantitative SOM prediction, the R 2 values (again > 0.90) yielded by DWT-PLS models using the Bior2.4, Haar, Sym4, and Sym8 wavelet functions were 0.95, 0.91, 0.91, and 0.91, respectively. Table 5 also lists the number of variables used in the DWT-PLS models of the CEC and SOM predictions. Because the SOM spectrum has more absorption bands than the CEC spectrum (Figure 4), the coefficients of the DWT-PLS prediction model are generally more variable for SOM than for CEC.  (1.17) for CEC prediction in the Vis-NIR by PLS method. Moreover, the maximum RPD of DWT-PLS, regardless of wavelet function, is distributed around levels 5, 6, and 7, and remains relatively stable over those levels (Figure 5c). These results show the obvious advantages of levels 5, 6, and 7 over the other levels. According to previous studies [7], lower-level DWT coefficients introduce high-frequency noise that affects the model predictions; conversely, higher-level DWT coefficients filter much of the available spectral information, also decreasing the predictive ability of the model. As shown in Figures 5a and 6a, the validation results of CEC are better overall than those of SOM. The RPD of the SOM predictions by DWT-PLS was maximized at 1.441 (R v 2 = 0.637, RMSE = 5.73), highlighting the poor overall predictive ability of SOM by DWT-PLS. Vohland et al. [38] also observed a similar R 2 value, RMSE value, and RPD value (0.60, 0.41, and 1.58 respectively) for soil organic C prediction in the Vis-NIR by the PLSR method. The SOM predictions also lack the regularity of the CEC results with regard to the optimal decomposition-level distributions and the optimal wavelet functions in the DWT-PLS model. The SOM prediction model has more the variables of the DWT-PLS prediction model than the CEC (Table 5). This results are similar to the study results of Dunn et al. [39], who predicted CEC (R 2 = 0.9, RPD = 3.3) with higher levels of accuracy than soil organic C (R 2 = 0.66, RPD = 1.7) in the Vis-NIR by the PLS method. Although more variable factors capture more information from the soil reflectance spectra, increasing the number of variable factors in the DWT-PLS model unfortunately introduced more unreasonable variables into the regression equations, increasing the noise and errors.
Remote Sens. 2018, 10, x FOR PEER REVIEW 12 of 21 of Dunn et al. [39], who predicted CEC (R 2 = 0.9, RPD = 3.3) with higher levels of accuracy than soil organic C (R 2 = 0.66, RPD = 1.7) in the Vis-NIR by the PLS method. Although more variable factors capture more information from the soil reflectance spectra, increasing the number of variable factors in the DWT-PLS model unfortunately introduced more unreasonable variables into the regression equations, increasing the noise and errors.

Calibration and Validation Results of the Discrete Wavelet Transform-Stepwise-Partial Least Squares Model
The DWT-Stepwise-PLS models for predicting CEC and SOM were calibrated on the same datasets used for calibrating their DWT-PLS counterparts, with no additional processing. The DWT-Stepwise-PLS model yielded a better calibration result than the DWT-PLS model; the R 2 values of both models were very close to 1 ( Table 6). Note that when comparing the DWT-PLS models, the numbers (N) of the variables are not significantly decreased in the DWT-Stepwise-PLS models with the same wavelet function and decomposition level (Tables 5 and 6). This result implies that the DWT-Stepwise-PLS method selects more effective variables than DWT-PLS.
The validation results of the DWT-Stepwise-PLS models for CEC and SOM are shown in Figures  7 and 8, respectively. The numbers in the grid are the Rv 2 values of the DWT-Stepwise-PLS for validation datasets with various wavelet functions and decomposition levels (Figures 7a and 8a).  The DWT-Stepwise-PLS models for predicting CEC and SOM were calibrated on the same datasets used for calibrating their DWT-PLS counterparts, with no additional processing. The DWT-Stepwise-PLS model yielded a better calibration result than the DWT-PLS model; the R 2 values of both models were very close to 1 ( Table 6). Note that when comparing the DWT-PLS models, the numbers (N) of the variables are not significantly decreased in the DWT-Stepwise-PLS models with the same wavelet function and decomposition level (Tables 5 and 6). This result implies that the DWT-Stepwise-PLS method selects more effective variables than DWT-PLS.
The  [40], who observed close R 2 values and RMSE values (0.82 and 5.5, respectively) for soil organic C prediction in the NIR by the PLSR method. Rossel et al. [10] also observed similar R 2 values (R 2 = 0.60) and RMSE values (RMSE = 0.18) in the visible reflectance and near-infrared reflectance for soil organic C prediction by the PLSR method.

Characteristics of Wavelet Functions Representativeness
The RPD values of the DWT-Stepwise-PLS at levels 3-8 are shown in Figures 7b and 8b. The average RPD of the CEC predicted by DWT-Stepwise-PLS was higher for the Haar and Sym8 wavelet functions (2.36 and 1.96, respectively; see Table 7) than for the other wavelet functions. Moreover, according to the CEC validation results, the effect of DWT-Stepwise-PLS using the Haar and Sym8 wavelet functions largely depends on the decomposition level of the DWT; the former appears to be more effective at higher decomposition levels, whereas the latter tends to be relatively stable at each level (Figure 7b).

Characteristics of Wavelet Functions Representativeness
The RPD values of the DWT-Stepwise-PLS at levels 3-8 are shown in Figures 7b and 8b. The average RPD of the CEC predicted by DWT-Stepwise-PLS was higher for the Haar and Sym8 wavelet functions (2.36 and 1.96, respectively; see Table 7) than for the other wavelet functions. Moreover, according to the CEC validation results, the effect of DWT-Stepwise-PLS using the Haar and Sym8 wavelet functions largely depends on the decomposition level of the DWT; the former appears to be more effective at higher decomposition levels, whereas the latter tends to be relatively stable at each level (Figure 7b).
The average RPDs of the SOM predictions by DWT-Stepwise-PLS using Bior2.4 and Sym4 were 1.63 and 1.73, respectively (Table 7), showing slight advantages of Bior2.4 and Sym4 over the other wavelet functions. Similarly, most of the DWT-Stepwise-PLS models yielded higher R v 2 values than the DWT-PLS model (Figure 8). The DWT-Stepwise-PLS not only yielded a higher RPD with Sym4 than with Bior1.3, but was overall more stable with Sym4 than with the other wavelet functions.