Determination of Soil Salt Content Using a Probability Neural Network Model Based on Particle Swarm Optimization in Areas Affected and Non-Affected by Human Activities

Abstract: Traditional partial least squares regression (PLSR) and artificial neural networks (ANN) have been widely applied to estimate salt content from spectral reflectance in many different saline environments around the world. However, these methods entail a great amount of calculation, and their accuracy is low. To overcome these problems, a probability neural network (PNN) model based on particle swarm optimization was used in this study to build soil salt content models. Furthermore, there is a clear correlation between the level of human activities and the degree of salinization of an environment, and this paper is the first to discuss this matter. Here, the performance of the PNN model in estimating soil salt content from reflectance data was investigated in areas non-affected (Area A) and affected (Area B) by human activities. The study area is located in Xinjiang, China. Different mathematical procedures, five wave band intervals, and two types of signal input sources were used for cross analysis. The coefficient of determination (R²), root mean square error (RMSE), and ratio of performance to deviation (RPD) index values were compared to verify the reliability of the model. Particle swarm optimization was used to adjust the optimal smoothing parameters of the PNN model and to avoid the long training processes required by the traditional ANN. The results show that the optimal wave band interval of the PNN is between 1000 nm and 1350 nm in Area A and between 400 nm and 700 nm in Area B. The reciprocal (1/R) transformation after Savitzky-Golay (SG) smoothing of the signal source is optimal for both areas. The RPD for both is greater than 30, which shows that the PNN model is applicable to areas with and without human activities and that the prediction results are very good. The optimal wave band intervals for PNN modeling therefore differed between areas affected and non-affected by human activities: the optimal interval for the area affected by human activities falls in the visible portion of the spectrum, whereas the optimal interval for the area without human activities falls in the short-wave near-infrared portion of the spectrum.


Introduction
Soil is an important natural resource that provides food and fiber for the survival of organisms on Earth and supports the sustainable development of ecosystems [1][2][3][4]. Soil provides a physical basis for human survival, activity, development, and agricultural production, and it serves as an important carrier. However, natural primary salinization and secondary salinization caused by unreasonable human activities are the most common causes of soil degradation in arid and semi-arid areas, and this is an important global ecological problem. Soil hardening, a decrease in fertility, and pH changes caused by soil salinization [5][6][7][8][9] can seriously restrict agricultural production, upset economic sustainability and development, and cause ecological imbalance. Because soil salinization represents a severe threat to human survival and development, it has attracted worldwide attention. Several statistical investigations have shown that 9.55 × 10⁸ hm² of soil is under threat of salinization across the globe [10,11]. This amounts to approximately 20% of the global total of arable land, spread over more than 100 different countries. The area affected by secondary salinization is about 0.77 × 10⁸ hm² and is growing at a rapid rate [10,11]. Traditional laboratory chemical analysis requires the excavation and collection of soil samples as well as transportation to a laboratory for analysis, which damages the soil. Moreover, this method is time consuming, labor intensive, and has a high testing cost [12][13][14][15]. In contrast, visible and near-infrared hyperspectral technology is fast, very efficient, non-polluting, economical, and harmless. Hyperspectral remote sensing can quickly gather soil surface spectral reflectance data and produce a continuous, complete, detailed, and accurate ground spectrum record [16][17][18][19][20]. It can even reflect the physical and chemical characteristics of soil, making the quantitative determination of soil salt content, as well as other key soil indicators, possible. This can satisfy farmers' needs for more accurate information as well as scientific evaluation.
Human activities have a serious effect on the physical properties of the soil. The type of activity, level, and its duration will determine the degree of impact on soil characteristics and heterogeneity. In recent years, domestic and international scholars have been paying considerable attention to human activities relating to soil, and this has become an important subject of study [21][22][23]. However, there have been very few studies on the relationship between different levels of human activities and salinity. For example, Ding et al. [24] measured the spectral reflectance of salinized soil and vegetation in the Weigan-Kuqa River Delta Oasis on the northern edge of the Xinjiang Tarim Basin. They conducted a spectrum transformation of the hyperspectral data obtained from soil and vegetation samples and determined the most sensitive wave band, indicative of the salinization level. Viscarra et al. [25] used different wavelength spectra and a partial least squares regression (PLSR) to estimate multiple soil attributes. Their results showed that the near-infrared spectrum provided fairly precise measurements of exchangeable potassium, but the accuracy was far lower than that achieved for soil organic carbon and pH. Debaene et al. [26] used visible and near-infrared spectra to predict attributes, such as available potassium and phosphorus, Mg, and soil organic carbon (SOC) in soil. However, their results for available potassium and phosphorus were less than ideal. Ben-Dor et al. [27] used a DAIS-7915 sensor to conduct a quantitative inversion and obtain soil moisture data. The soil moisture parameters were used for an indirect inverse soil salt content determination. Qu et al. [28] utilized raw reflectance data and the PLSR method to build a soil salt content quantitative inversion model for the Inner Mongolia Hetao irrigation area. Liu et al. 
[29] utilized a visible and near-infrared spectrum method in a feasibility study for the assessment of soil potassium content in the yellow soil of Qian County, Shaanxi. They built a soil potassium content calculation model using laboratory spectral reflectance, total soil potassium, available potassium content, multiple linear regression (MLR), and PLSR. Liu et al. [30] utilized visible/short wave near infrared spectroscopy (Vis/SW-NIRS) to determine the available nitrogen and potassium in soil. In their study, they used the standard normal variate (SNV) method, the multiplicative scatter correction (MSC) method, and Savitzky-Golay smoothing, and combined them with a first-order derivative algorithm for spectrum pre-processing. The simulation showed that the least squares-support vector machine (LS-SVM) model was more precise.
The construction of a good soil salt content hyperspectral quantitative inversion model is very important for soil salinization hyperspectral monitoring. Farifteh et al. [31] employed PLSR and artificial neural network (ANN) methods to analyze the correlation between different sized soil spectra and soil salt content in the Netherlands, Hungary, and Australia. Weng et al. [32] employed PLSR to build a soil salt content model for the Yellow River Delta. Sidike et al. [33] employed multiple regression and PLSR to build a soil salt content prediction model for Pingluo County in the Ningxia Hui Autonomous Region. Zhang et al. [34] employed a PLSR model for the quantitative inversion of salt content in northeast soda saline soil and then used an inverse logarithmic transform, after spectral signal smoothing, to build the model. The ratio of performance to deviation (RPD) for this model was 1.61, and the coefficient of determination was 0.6677. All these studies used spectral reflectance as the input signal and used the model to determine the output relationship. Linear regression is frequently used for model comparison, and the root mean square error (RMSE), coefficient of determination (R 2 ), and RPD are used to verify predictive ability. However, these models only use linear fitting in the spectral system for expression, which does not provide a very good prediction. While the input can have multiple variables, the corresponding output can have only one variable. If multiple output variables are required, multiple models must be used simultaneously.
To overcome the poor prediction of linear PLSR models and to avoid the heavy calculation load of non-linear ANN models, a probability neural network (PNN) model was proposed in this paper. Particle swarm optimization was used to adjust the optimal smoothing parameter of the PNN model and to avoid the long training process required by a traditional neural network. In order to show the effects of human activities, the optimized wave band regions with and without human activities are also discussed in this study.

Overview of the Study Area
The test area is located in the Xinjiang Tianshan Northern Foothills at the southern edge of the Dzungarian Basin (87°44′-88°46′E, 43°29′-45°45′N). The soil categories include grey desert soil, cracked soil, and sandy soil. The research area was divided by a large channel, and two areas were selected (A and B), as shown in Figure 1. Because Area A is some distance from human habitation, it had not been developed and was considered to be free of human activities in this study. Area B is located near the Xinjiang 102 Production and Construction Army Corp and had been subject to human activity. Most of the land in Area B has been developed, some soils have been leveled, and some soils have been cultivated into tree seedlings; however, no fertilizer had been applied. The human activities in this area are very intensive. Thus, it was designated as having been subject to human activities.

Soil Sample Collection
In Area A, five sampling lines, spaced 600 to 800 m apart, were established in a north to south direction. In Area B, six sampling lines, spaced 800 to 1000 m apart, were set out. Five representative sampling points were selected along each sampling line. The interval between sampling points was 300 to 500 m. Area A had 25 sampling points, and Area B had 30 sampling points. Global positioning system (GPS) was used to determine the exact position of each of the 55 points. A soil sample was collected at a depth between 0 and 10 cm at each of the sampling points and placed in a sequentially numbered bag. The samples were air-dried in the laboratory, and their impurities were removed. After screening through a 1-mm sieve, the samples were sent to the Xinjiang Institute of Ecology and Geography to be tested for salt content.

Obtaining Spectrum Data
In this study, a portable FieldSpec ® 3 Hi-RES Spectrometer (ASD, USA) was used to collect soil spectrum data. The instrument has an effective spectral testing range between 350 and 2500 nm. The sampling interval is 1.4 nm at 350 to 1000 nm and 2 nm at 1000 to 2500 nm. The repeatability interval is 1 nm. Because weather conditions can affect the readings, spectral measurements were conducted from 9-23 May 2017, between 11:00 and 15:00 h, on clear cloudless days with no wind. Before each spectrum measurement, a whiteboard calibration was conducted to eliminate any dark current effects. The sensor head of the spectrometer was held 15 cm from the sampling point surface and directly above it when the reading was taken. To prevent surface cracks or surrounding vegetation from affecting the readings, each spectrum testing point chosen was situated away from any interfering objects. Spectrum readings were made at five different places, within a meter of each sampling point, and each reading was repeated 10 times. A total of 50 spectrum curves were obtained for each sampling point. The mean of these curves was used as the spectrum value for the sampling point.

Probability Neural Network
PNN was proposed by Specht in 1991 for non-linear and high-dimensional mapping applications [35,36]. The PNN is designed to be similar to a multiple regression model: the wave bands and reflectance values measured at the sampling points serve as the input variables, and the salt content is the output variable. PNN is formed from the input, pattern, summation, and output layers, as shown in Figure 2. In this study, particle swarm optimization was used to adjust the node parameters of the model, increase the calculation speed of the PNN model, and avoid the huge volume of calculations required by the traditional ANN model. PNN can provide large quantities of pattern nodes in the pattern layer and can approximate a non-linear prediction function. Thus, the input vector X and output vector Y in the input layer and output layer can be expressed as an input-output matching pattern, as shown in Equation (1):

$$X(k) = [x_1(k),\ x_2(k)] = \left[\frac{\lambda(k)}{\lambda_{max}},\ \frac{R(k)}{R_{max}}\right], \qquad Y(k) = [y(k)] = \left[\frac{SC(k)}{SC_{max}}\right] \qquad (1)$$

where K is the quantity of data collected (k = 1, 2, ..., K), λ(k)/λ_max is the normalized wavelength, R(k)/R_max is the normalized reflectance, and SC(k)/SC_max is the normalized salt content.

In this experiment, the pattern nodes can continue to grow while nodes are added or deleted in the pattern layer. This requires that the pattern nodes build a model in a short period of time for the high dimensional regression function and estimator. The input and output are connected through weights from the input nodes (x_1, x_2) to the pattern nodes (H_1, H_2, ..., H_k, ..., H_K) and from the pattern nodes to the summation nodes (S_1, ∑H_k). The pattern node parameters require adjustment for the optimal calculation method. The PNN calculation steps are as follows:

Step 1. The K input-output matching training patterns determine the multilayer network structure. There are two input nodes, K pattern nodes, the summation nodes, and one output node.

Step 2. For the 2 × K input training pattern X(k) = [x_1(k), x_2(k)], the weights w_kj (j = 1, 2; k = 1, 2, 3, ..., K) between the input layer and the pattern layer are shown in Equation (2):

$$w_{kj} = x_j(k), \qquad j = 1, 2 \qquad (2)$$

where [w_kj]^T is the K × 2 matrix in this study, and x_1(k) and x_2(k) are K × 1 column vectors.

Step 3. For the K × 1 output training pattern (salt content), the weights w_ks (s = 1, 2) between the pattern layer and the summation layer are shown in Equation (3):

$$w_{k1} = y(k), \qquad w_{k2} = 1 \qquad (3)$$

where W_2 = [w_ks]^T is the K × 2 matrix, and y(k) is a K × 1 column vector. The weight from every pattern node to the summation node ∑H_k is set to 1.

Step 4. Calculate the outputs of the pattern nodes H_k, k = 1, 2, ..., K, using the Gaussian function as in Equation (4):

$$H_k = \exp\left(-\frac{\sum_{j=1}^{2}\left(x_j - w_{kj}\right)^2}{2\sigma^2}\right) \qquad (4)$$

where σ is the smoothing parameter. In this study, we used particle swarm optimization (PSO) [37] to search for the optimal parameter.

Step 5. Obtain y_1 of the output layer, as shown in Equation (5):

$$y_1 = \frac{\sum_{k=1}^{K} w_{k1} H_k}{\sum_{k=1}^{K} H_k} \qquad (5)$$

The final output is given by Equation (6). After determining the output value, the RMSE can be used to show whether the prediction result of this model is accurate. The smaller the value, the more accurate the prediction, as shown in Equation (7):

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\hat{Y}_i - Y_i\right)^2} \qquad (7)$$

where Ŷ_i − Y_i is the error between the prediction value and the actual value, and n is the number of samples.
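
Steps 1 to 5 can be illustrated with a minimal sketch. It assumes the common kernel-regression form of the PNN, in which the stored training inputs act as the input-to-pattern weights, the stored training outputs act as the pattern-to-summation weights, and the output is the ratio of the two summation nodes. The function name `pnn_predict`, the three training patterns, and the two σ values are hypothetical and are not taken from the study.

```python
import math

def pnn_predict(x, train_x, train_y, sigma):
    """Kernel-weighted average of the stored training outputs.

    x       -- query input (x1, x2), e.g. normalized wavelength and reflectance
    train_x -- K stored training inputs (assumed to act as the weights w_kj)
    train_y -- K stored normalized outputs (assumed to act as the weights w_k1)
    sigma   -- smoothing parameter of the Gaussian pattern nodes
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Squared distance between the query and every stored pattern.
    d2 = [sum((xj - wj) ** 2 for xj, wj in zip(x, w)) for w in train_x]
    # Subtracting the smallest distance rescales every H_k by the same factor,
    # so the ratio below is unchanged but the denominator cannot underflow to 0.
    d2_min = min(d2)
    h = [math.exp(-(d - d2_min) / (2.0 * sigma ** 2)) for d in d2]
    s1 = sum(y * hk for y, hk in zip(train_y, h))  # summation node S_1
    s2 = sum(h)                                    # summation node sum of H_k
    return s1 / s2

# Hypothetical normalized training patterns (not data from the study).
train_x = [(0.2, 0.3), (0.5, 0.6), (0.8, 0.4)]
train_y = [0.1, 0.4, 0.9]

near = pnn_predict((0.5, 0.6), train_x, train_y, sigma=0.01)    # small sigma
flat = pnn_predict((0.5, 0.6), train_x, train_y, sigma=1000.0)  # large sigma
print(near, flat)
```

With a small σ the prediction follows the nearest stored pattern, whereas a very large σ pushes it towards the mean of all stored outputs, which is why the choice of σ matters.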

Particle Swarm Optimization
In particle swarm optimization (PSO) [38], the swarm can contain up to 30 particles (10-30 particles is the usual empirical range). Dividing the swarm into individual particles allows many candidate solutions to be explored at once, with the particles approaching the target from all sides. In each search step, the best-performing particle in the swarm is selected as the current optimal solution of the mathematical problem, and a global search method is used to move towards the target. PSO adjusts the smoothing parameter σ iteratively: a correction ∆σ is calculated for each particle using the adjustment parameters c_1 and c_2, and this correction is then applied to σ. Here, g = 1, 2, 3, ..., G is the index of the particle in the swarm, σ_best is the optimal solution reached by the swarm at iteration p − 1, and σ_best^g is the optimal solution found by the g-th particle. The parameters rand_1 and rand_2 are random values between 0 and 1; p is the iteration number, and p_max is the maximum iteration number. The parameters c_1 and c_2 vary with the iteration number through the constants a_1, a_2, b_1, and b_2, so that c_1 (the individual coefficient) decreases from 2.5 to 0.5 and c_2 (the group coefficient) increases from 0.5 to 2.5. ∆σ_g^(p+1) represents the distance moved by each particle, that is, the time-varying adjustment. During the initial phase of the search, the particles are allowed to move significantly towards the target, which prevents the search from being confined to a small range. Once the target is near, the distance is reduced to search for the global optimal solution.
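
The search described above can be sketched as a one-dimensional PSO over σ. This is an illustration under stated assumptions rather than the exact update used in the study: the linear schedule for c_1 and c_2, the inertia weight, the clamping of σ to the search bounds, the function name `pso_minimize`, and the toy objective are all assumptions introduced here for a stable, runnable example.

```python
import random

def pso_minimize(objective, lo, hi, n_particles=20, p_max=60, inertia=0.7, seed=0):
    """Minimize objective(sigma) on [lo, hi] with a time-varying PSO."""
    rng = random.Random(seed)
    sigma = [rng.uniform(lo, hi) for _ in range(n_particles)]
    delta = [0.0] * n_particles                # per-particle correction
    pbest = sigma[:]                           # best position of each particle
    pbest_val = [objective(s) for s in sigma]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g], pbest_val[g]  # best position of the swarm

    for p in range(p_max):
        frac = p / (p_max - 1) if p_max > 1 else 1.0
        c1 = 2.5 - 2.0 * frac  # individual coefficient: 2.5 -> 0.5 (assumed linear)
        c2 = 0.5 + 2.0 * frac  # group coefficient: 0.5 -> 2.5 (assumed linear)
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # The inertia term is an assumption added for numerical stability.
            delta[i] = (inertia * delta[i]
                        + c1 * r1 * (pbest[i] - sigma[i])
                        + c2 * r2 * (gbest - sigma[i]))
            # Clamping to the search bounds is likewise an assumption.
            sigma[i] = min(max(sigma[i] + delta[i], lo), hi)
            val = objective(sigma[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = sigma[i], val
                if val < gbest_val:
                    gbest, gbest_val = sigma[i], val
    return gbest, gbest_val

# Toy objective with its minimum at sigma = 0.3 (hypothetical, for illustration).
best_sigma, best_err = pso_minimize(lambda s: (s - 0.3) ** 2, lo=0.001, hi=1.0)
print(best_sigma, best_err)
```

In practice the objective would be the prediction error of the PNN for a given σ. The large early c_1 favors exploration by each particle, and the large late c_2 pulls the swarm towards the best solution found so far.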

Model Verification
After the model was built, we used the prediction root mean square error (RMSE), coefficient of determination (R²), and ratio of performance to deviation (RPD) to verify the predictive ability of the model [39,40]. The smaller the value of RMSE, the more precise the prediction of the model. For a fitted model, R² normally lies between 0 and 1; the higher the model stability and fitting level, the closer R² will be to 1. When RPD ≤ 1.4, the model is unreliable, whereas the prediction reliability is acceptable when 1.4 < RPD < 2. An RPD ≥ 2 indicates a model with a superior prediction ability. The RPD evaluation indicator is shown in Equation (11):

$$RPD = \frac{SD}{RMSE} \qquad (11)$$

where SD is the standard deviation of the measured values.
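
The three indicators can be computed as in the following minimal sketch. It assumes that R² is defined as one minus the ratio of the residual to the total sum of squares and that SD is the sample standard deviation (n − 1 denominator) of the measured values; the function names and the five data points are hypothetical.

```python
import math
import statistics

def rmse(pred, actual):
    """Root mean square error between predictions and measurements."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(pred, actual)) / len(actual))

def r_squared(pred, actual):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(pred, actual))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def rpd(pred, actual):
    """Ratio of performance to deviation: SD of the measurements over RMSE."""
    return statistics.stdev(actual) / rmse(pred, actual)

# Hypothetical measured and predicted values (not data from the study).
actual = [1.0, 2.0, 3.0, 4.0, 5.0]
pred = [1.1, 1.9, 3.2, 3.8, 5.0]

print(rmse(pred, actual), r_squared(pred, actual), rpd(pred, actual))
```

For these illustrative values, the RPD is well above the threshold of 2 used to indicate superior prediction ability.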

Spectrum Data Pre-Processing
The component and structural information of the tested sample reside in the visible light, near-infrared and ultraviolet spectrum data. However, the instrument background, sample size, and other factors can affect the data and cause a baseline shift, random noise, light scattering, and other system activities. These activities can mask information and have a significant effect on the building of models and the determination of unknown sample components or properties. Such noisy spectra require pre-processing to reduce or eliminate the effects of various non-target factors before quantitative analysis can be done, and this is crucial for building a good stable model. Preprocessing can even have a decisive effect on the final result. Commonly used spectrum pre-processing methods include removing the interfering wave bands, the use of a smoothing algorithm, and mathematical transformation.

Removing Interfering Wave Bands
The spectra collected by the ASD device range from 350 nm to 2500 nm, with a total of 2150 bands, for uniform soils of the same kind. The smaller the spectrum curve fluctuation, the better the quality of the spectrum. The 350-399 nm ultraviolet and the 2401-2500 nm short-wave infrared bands have a poor reflectance signal to noise ratio, and the data from these segments are unreliable. Data from sampling point 15 in Area A, as examples of this low quality, are shown in Figure 3. The moisture absorption segments at 1400 nm, 1900 nm, and 2200 nm also show a significant curve fluctuation because of soil moisture. Moisture present in soil can have three forms, combination water, hygroscopic water, and pore water, which all fall within the same spectrum absorption wave band. Studies show that the absorption band near 1400 nm is the main hydroxyl-OH spectrum band, and the band near 1900 nm is the main inter-layer H 2 O spectrum band [41][42][43][44][45]. The 2200 nm range includes hydroxyl stretching and Al-OH/Mg-OH bending vibrations, which have a greater effect on salt content spectrum inversion, as shown in Figure 4a. For these reasons, the reflectance from 350-399 nm, 2401-2500 nm, 1355-1410 nm, and 1820-1942 nm were removed, as shown in Figure 4b. The remaining 1823 wave bands from the soil sample reflectance spectrum data were used to build the inversion model.
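
The band removal can be sketched as a simple wavelength mask. The sketch assumes a 1 nm wavelength grid and treats the endpoints of each removed interval as inclusive; the exact number of remaining bands depends on these conventions. The names `REMOVED_INTERVALS` and `keep_band` are hypothetical.

```python
# Removed intervals in nm, taken from the text; inclusive endpoints are assumed.
REMOVED_INTERVALS = [(350, 399), (1355, 1410), (1820, 1942), (2401, 2500)]

def keep_band(wavelength_nm):
    """Return True if the wavelength lies outside every removed interval."""
    return not any(lo <= wavelength_nm <= hi for lo, hi in REMOVED_INTERVALS)

# Assumed 1 nm grid over the instrument range of 350-2500 nm.
wavelengths = list(range(350, 2501))
kept = [wl for wl in wavelengths if keep_band(wl)]

print(kept[0], kept[-1])
```

The same mask can be applied to each reflectance curve so that only the retained bands enter the inversion model.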

Savitzky-Golay Convolution Smoothing
Savitzky-Golay (SG) convolution smoothing, also called polynomial smoothing, is widely used for spectrum filtering. The method combines least squares fitting with a moving window. First, a window that contains an odd number of points (2w + 1) is selected. The spectrum points within the window are then fitted with a polynomial by least squares, and the fitted value replaces the point at the center of the window. The calculation equation is as shown below:

$$x_{k,smooth} = \frac{1}{H}\sum_{i=-w}^{+w} x_{k+i}\, h_i \qquad (12)$$

where h_i is the smoothing coefficient and can be obtained through polynomial fitting. H is the normalization factor, for which the calculation method is

$$H = \sum_{i=-w}^{+w} h_i$$

Multiple-Spectrum Mathematical Transformation
Before a model of earth surface parameters based on spectral reflectance is built, a non-linear mathematical transformation is often applied to the original spectral reflectance (R). Commonly used non-linear mathematical transformations include the square root (√R), reciprocal (1/R), inverse logarithmic (log(1/R)), logarithmic (log(R)), and logarithmic reciprocal (1/log(R)). There are two reasons for this. (1) To linearize the non-linear relationship between spectral reflectance and earth surface parameters, so that simple linear regression analysis can be conducted to obtain an approximate non-linear result. By building different forms of an estimation model, an optimal model can be selected to increase the precision of identification. (2) To use non-linear transformation to increase spectrum difference so that the earth surface parameters will have a discernible effect on the spectrum. The original spectral reflectance R from sampling point 15 in Area A, along with the five types of spectral transformation curve, is shown in Figure 5.
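
The five transformations can be written compactly as follows. The sketch assumes base-10 logarithms and reflectance values strictly between 0 and 1, so that every expression is defined; the function name `transform_reflectance` and the dictionary keys are hypothetical.

```python
import math

def transform_reflectance(r):
    """Return the five non-linear transformations of a reflectance value r.

    Assumes 0 < r < 1 so that the logarithms and reciprocals are defined.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("reflectance must lie strictly between 0 and 1")
    log_r = math.log10(r)  # base-10 logarithm is an assumption
    return {
        "sqrt(R)": math.sqrt(r),
        "1/R": 1.0 / r,
        "log(1/R)": -log_r,
        "log(R)": log_r,
        "1/log(R)": 1.0 / log_r,
    }

# Example for a single reflectance value of 0.25 (hypothetical).
t = transform_reflectance(0.25)
print(t)
```

Applying the function to every band of a spectrum yields the transformed curves that are compared as model inputs.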

Five Wave Band Intervals
As described in Section 4.1, 1823 wave bands were used to build the soil salt content inversion model after the noise and moisture absorption wave bands had been removed. Of these 1823 bands, some had a rather weak correlation with salt content. The inclusion of these bands would have interfered with the building of the model as well as its prediction and precision. To find the optimal wave band interval for building the inversion model, we avoided those affected by moisture and extracted five wave band intervals, as shown in Figure 6. The bands include one visible light band (400-700 nm) and four near-infrared bands (700-1000 nm, 1000-1350 nm, 1455-1805 nm, and 2000-2400 nm). The lengths of the five wave band intervals were 300 nm, 300 nm, 350 nm, 350 nm, and 400 nm. Referencing Zhao [46], we changed the reflectance R to √R, 1/R, log(1/R), log(R), and 1/log(R), and introduced them separately into the probability neural network model to determine the best band interval.

Optimal Smoothing Parameter
The only parameter that required adjustment in the PNN was the smoothing parameter σ. The accuracy of the PNN output is affected by this parameter, and its adjustment shapes the K classifiers in the pattern layer. In this study, we used a Gaussian function to conduct a similarity comparison between the input pattern and each training pattern. The classifier of each Gaussian function attempts to give its target value maximum similarity. Therefore, an appropriate σ parameter should be chosen. The target function is the squared error between the preset target vector T(k) and the actual output of the PNN, and it guides the adjustment of parameter σ. ε_k is the tolerable error value that allows the actual output value of the PNN to be near the target value. This conforms to the trend of the target function/squared error function value becoming smaller. Before the simulation experiment, the optimal smoothing parameter of the neural network model must be found, here σ ≈ 0.0001 (see Figure 7), so that the model can be used for the simulation comparison.

Optimal Smoothing Parameter
The only parameter that required adjustment in PNN was the smoothing parameter. The accuracy of the PNN network output is affected by the smoothing parameters, and adjustment can produce K number of classifiers in the pattern layer. In this study, we used a Gaussian function to conduct a similarity comparison between the input pattern and training pattern. The classifier of each Gaussian function attempts to give its target value maximum similarity. Therefore, an appropriate  parameter should be chosen. Assume that the target function/squared error function is: where T(k) is the preset target vector, and its physical purpose is the expression and adjustment of parameter  . k  is the tolerable error value that allows the actual output value of the PNN to be near the target value. This conforms to the trend of the target function/squared error function value becoming smaller. Before the simulation experiment, the neural network model must find the optimal smoothing parameter 0.0001   (see Figure 7) so that the model can enable a simulation comparison.

Evaluation Indicators for Area A
As previously mentioned, the study area was divided into Areas A and B. The spectral reflectance at the five wave band intervals were used as the inputs. The output was the salt content.
The precision evaluation index values of RMSE, 2 R , and RPD for the three models were calculated.
Area A is used as the first discussion subject. The summary is shown in Tables 1-3.

Evaluation Indicators for Area A
As previously mentioned, the study area was divided into Areas A and B. The spectral reflectance in the five wave band intervals was used as the input, and the salt content was the output. The precision evaluation index values of RMSE, R², and RPD for the three models were calculated. Area A is discussed first, and the results are summarized in Tables 1-3. With the SG-smoothed spectrum and the 1/R transformation as the PNN input, the smallest RMSE (0.140863, Table 1), the largest R² (0.999374, Table 2), and the largest RPD (39.99251, Table 3) were all found in the 1000-1350 nm range. In Area A, which has no human activities, the three evaluation indicators of the PNN model (RMSE, R², and RPD) therefore pointed to the same band interval, 1000-1350 nm, the same signal source (the SG-smoothed spectrum), and the same mathematical transformation (1/R). The R² value was very near 1, and the RPD value was far greater than 2. This showed that the PNN model had an excellent prediction ability.
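For readers who want to reproduce the three indicators, the following is a minimal sketch using common definitions, which are assumptions here because the paper's exact formulas are not shown in this section: RMSE as the root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares, and RPD as the standard deviation of the measured values divided by the RMSE. The population standard deviation is used; some authors use the sample form instead. The function names and the toy values are hypothetical.

```python
import math

def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def r_squared(measured, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

def rpd(measured, predicted):
    """Ratio of performance to deviation: SD of measured values over RMSE."""
    n = len(measured)
    mean_m = sum(measured) / n
    sd = math.sqrt(sum((m - mean_m) ** 2 for m in measured) / n)  # population SD
    return sd / rmse(measured, predicted)

# Toy salt content values in g/kg (hypothetical, not from Tables 1-3)
measured = [2.0, 4.0, 6.0, 8.0]
predicted = [2.1, 3.9, 6.2, 7.8]
print(rmse(measured, predicted), r_squared(measured, predicted), rpd(measured, predicted))
```

Under these particular definitions, RPD equals 1/sqrt(1 - R²). For the reported R² of 0.999374 this gives about 39.97, close to the reported RPD of 39.99251, which suggests that the two indicators carry largely the same information here.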

Area A Prediction Results
The prediction results and linear fitting for Area A are shown in Figure 8. The prediction result of the PNN model in the 1000-1350 nm band, where SG spectrum smoothing and 1/R were used to introduce the segment into the neural network model, is shown in Figure 8a. Blue represents the predicted salt content, and red represents the measured salt content. Yellow is the error between the predicted and measured values. The linear fitting result for the PNN model is shown in Figure 8b. The PNN model had slight prediction errors at observation points 2, 11, 13, 14, and 15, as shown in Figure 8a. The mean error was 0.0084012 g/kg. Overall, however, the prediction ability was very good. For linear fitting, the coefficient of determination R² of the PNN model reached 0.99937, which is very close to 1, as shown in Figure 8b. This shows that the stability and fit of the PNN model were very high.

Area B Evaluation Indicators
As in Area A, the spectral reflectance in the five wave band intervals was used as the input and the salt content as the output in calculating the precision evaluation index values of RMSE, R², and RPD for the PNN model (Tables 4-6).
The smallest RMSE value, found within the range of 400-700 nm, was 0.281027, as shown in Table 4, and this was obtained by introducing the spectrum into the PNN model using SG spectrum smoothing and the 1/R method. The largest R² value, also within the range of 400-700 nm, was 0.999208, as shown in Table 5, and the largest RPD value, again within the range of 400-700 nm, was 35.55161, as shown in Table 6; both were obtained with the same SG spectrum smoothing and 1/R method. From Tables 4-6, it can be seen that the three PNN model evaluation indicators (RMSE, R², RPD) were all located on the same band interval, 400-700 nm, in Area B. The same signal source (SG spectrum smoothing) and the same mathematical transformation (1/R) were used, and the three index values provided very good results. The R² value was very close to 1, and the RPD value was much greater than 2. This indicated that the prediction ability of the PNN model was very good. Although the RPD value of Area B (35.55161) was lower than that of Area A (39.99251), both were greater than 30. This indicated that the PNN model was suitable for use in undisturbed as well as disturbed areas, and it was capable of giving outstanding results. Area B had been subject to human activities, and the soil had been disturbed, but the soil in Area A had remained in its natural undisturbed state. Thus, the PNN prediction ability was lower for Area B.
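The preprocessing chain named in both evaluation sections (SG smoothing, then the 1/R transformation, then restriction to one wave band interval) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: the 5-point quadratic Savitzky-Golay window is an assumed choice because the window length and polynomial order are not given here, and the function names and the toy spectrum are hypothetical.

```python
# Standard 5-point quadratic Savitzky-Golay smoothing coefficients.
# The window length and polynomial order are assumptions for this sketch.
SG5 = (-3.0, 12.0, 17.0, 12.0, -3.0)
SG5_NORM = 35.0

def sg_smooth(values):
    """Smooth a series with a 5-point SG filter; the two points at each
    edge are copied through unsmoothed."""
    out = list(values)
    for i in range(2, len(values) - 2):
        out[i] = sum(c * values[i + k - 2] for k, c in enumerate(SG5)) / SG5_NORM
    return out

def reciprocal(values):
    """1/R transformation; reflectance values must be non-zero."""
    return [1.0 / v for v in values]

def select_band(wavelengths, values, lo_nm, hi_nm):
    """Keep only the values whose wavelength lies in [lo_nm, hi_nm]."""
    return [v for w, v in zip(wavelengths, values) if lo_nm <= w <= hi_nm]

def preprocess(wavelengths, reflectance, lo_nm, hi_nm):
    """SG smoothing, then 1/R, then band selection, in that order."""
    smoothed = sg_smooth(reflectance)
    return select_band(wavelengths, reciprocal(smoothed), lo_nm, hi_nm)

# Toy spectrum sampled every 50 nm (hypothetical values)
wavelengths = list(range(350, 1000, 50))
reflectance = [0.20 + 0.0002 * (w - 350) for w in wavelengths]
features = preprocess(wavelengths, reflectance, 400, 700)
print(len(features), features[:3])
```

Smoothing is applied before the reciprocal because noise in low-reflectance regions would otherwise be amplified by the 1/R step. The selected band (for example 400-700 nm for Area B or 1000-1350 nm for Area A) then forms the input vector of the model.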

Area B Prediction Results
Area B prediction results and linear fit are shown in Figure 9. The prediction result for the 400-700 nm range in the PNN model, which was introduced into the neural network using SG spectrum smoothing and the 1/R method, is shown in Figure 9a. The salt content prediction values are blue, and the actual measured salt content values are red. Yellow indicates the error between the prediction values and actual measured values. The linear fitting result for the PNN is shown in Figure 9b.
More than half of the observation points exhibited a slight error in the PNN model, and the mean error was 0.015297 g/kg, as shown in Figure 9a. The errors were small, and the overall prediction ability of the model was very good. The coefficient of determination R² of the PNN model reached 0.99921, as shown in the linear fitting in Figure 9b. This is also very close to 1, indicating very high stability and fit in Area B.

Discussion
The PNN model predicted better in Area A than in Area B. The main reason for this was that Area B had been disturbed by plowing, as shown in Figure 10. The soil surface had undergone different degrees of damage, and the shifts in soil moisture and salt content were more complex. These factors interfered with the features of the soil spectrum. Area A had remained more or less in its natural state, as shown in Figure 11, and the prediction ability of the PNN model was better there. The results also showed that the optimal band interval in the PNN model was 1000-1350 nm, in the near-infrared short-wave band, for Area A and 400-700 nm, in the visible light band, for Area B. These results are consistent with the band intervals reported in previous studies [47][48][49][50], in which spectroscopy was used to predict salt content and the optimal bands were also found in the visible light and near-infrared short-wave regions.
We conducted a field investigation of the study areas in 2016. The study areas are located near the Xinjiang 102 Construction Army Corp. It was found that there is a huge canal between Area A and Area B, which is 24 m wide at the highest point, 6 m wide at the lowest point, 5 m deep, and 15.3 km long. We divided the two different regions using the canal as a boundary.
From an analysis of different soil properties caused by human activities, it can be seen that Area A is far from human settlement, and is rarely affected by human activities due to the area's isolation caused by the huge canal. This area basically has maintained its original ecological environment. However, Area B is very close to the location of 102 Construction Army Corp, and some houses in human settlements can be seen in Figure 10b, indicating that the human activities in this area are very intensive. Therefore, the soil properties caused by human activities in the two areas are significantly different.
From an analysis of the soil cover environment, it can be seen that there are some haloxylon, red willow, and salt claw on the soil surface of Area A. Because each soil spectral test covers only a small area, we chose to perform spectral measurements at locations where the soil surface is covered by less vegetation. However, most of the land in Area B has been developed, and some soils have been leveled, as shown in Figure 10a,b. At the same time, some soils in Area B have been planted with tree seedlings, as shown in Figure 10c. In Figure 10c in particular, white salt can be seen on the surface of the soil, indicating that the salinization of this area is very extensive. Therefore, the soil cover environment in the two areas differs significantly.
Moreover, in the process of land use and development in arid and semi-arid regions, the oasis ecological environment is under different degrees of human activity. The differences in the stress of human activities on the oasis ecological environment are mainly manifested in two aspects. The first is the difference in land use patterns and intensity. Second, there is a big difference in the duration of land use. Therefore, during the development and utilization of land, regular and irregular irrigation and the construction of ditches and networks will lead to a spatial imbalance of regional water. At the same time, the consolidation and development of land will directly change the local soil structure and type, causing changes in salt transport channels and transfer rates, which in turn exacerbate the complexity of salt spatial variability.
Finally, human activities refer to people controlling the development direction of the soil by changing a certain soil-forming factor or changing the contrast between various factors. For example, eliminating the original natural vegetation and replacing it with artificially cultivated crops or artificial vegetation can directly and indirectly affect the direction and intensity of the biological cycle of the soil.
Irrigation and drainage can change the hydrothermal conditions of natural soils, thus changing the migration and transformation processes of substances in the soil. Agricultural practices, such as farming and fertilization, can directly affect soil development and change the material composition and morphology of the soil.

Conclusions
The results of this study show that, after signal pre-processing and band selection, feeding the 1/R-transformed spectra into the PSO-optimized PNN model produced the best results. The predictive ability indicator RPD of the PNN model with PSO optimization was greater than 30 in both Areas A and B, a clear indication of excellent predictive ability. The optimal wave band intervals for PNN modeling differed between areas affected and non-affected by human activities: the optimal interval for the area with human activities falls in the visible light band, whereas the optimal interval for the area without human activities falls outside the visible light band. The near-infrared short-wave band in the range of 1000-1350 nm can be used to analyze undisturbed areas, and the visible light band in the range of 400-700 nm is more suitable for analyzing areas that have been disturbed by human activities in Xinjiang, China. The R² of the PNN model for Area A was 0.999374, and the RPD was 39.99251. The R² of the PNN model for Area B was 0.999208, and the RPD was 35.55161. The results of this study could provide a reference point for airborne or satellite remote sensing, large-scale soil salinization monitoring, salinization data extraction, and remote sensing thematic mapping. The results can also be used as a reference point for the monitoring of other soil surface parameters, such as organic matter, pH value, and nitrogen, phosphorus, potassium, cation, and anion content. Lastly, the results of this study provide a new perspective on the hyperspectral monitoring of soil salinization in areas with different levels of human activities.
Author Contributions: C.F. and A.T. designed the research. C.F. performed all of the modeling. H.X. and A.T. performed the experiments. C.F., S.G., X.Y., H.X. and A.T. participated in the data analyses. C.F. and A.T. drafted and revised the manuscript.