Eco-Friendly Estimation of Heavy Metal Contents in Grapevine Foliage Using In-Field Hyperspectral Data and Multivariate Analysis

Heavy metal monitoring in food-producing ecosystems can play an important role in human health safety. Because heavy metals interfere with plants’ physiochemical characteristics, which in turn influence the optical properties of leaves, their contents can be estimated by in-field spectroscopy. In this study, the predictive power of spectroscopic data is examined. Five treatments of heavy metal stress (Cu, Zn, Pb, Cr, and Cd) were applied to grapevine seedlings, and hyperspectral data (350–2500 nm) and heavy metal contents were collected through in-field and laboratory experiments. The partial least squares (PLS) method was used as a feature selection technique, and multiple linear regression (MLR) and support vector machine (SVM) regression were applied for modelling purposes. Based on the PLS results, the wavelengths in the vicinity of 2431, 809, 489, and 616 nm; 2032, 883, 665, 564, 688, and 437 nm; 1865, 728, 692, 683, and 356 nm; 863, 2044, 415, 652, 713, and 1036 nm; and 1373, 631, 744, and 438 nm were found most sensitive for the estimation of Cu, Zn, Pb, Cr, and Cd contents in the grapevine leaves, respectively. Therefore, visible and red-edge regions were found most suitable for estimating heavy metal contents in the present study. Heavy metals played a significant role in reshaping the spectral pattern of stressed grapevine compared to healthy samples. In the best structures of the SVM regression models, the concentrations of Cu, Zn, Pb, Cr, and Cd were estimated with R² values of 0.56, 0.85, 0.71, 0.80, and 0.86 in the testing set, respectively. The results confirm the efficiency of in-field spectroscopy in estimating heavy metal contents in grapevine foliage.

Heavy metal contamination in food-producing ecosystems is considered a major environmental problem because of its potential hazard to humans and other organisms and the need to protect the safety of food chains [16,17]. Among human foods, grapes and their secondary products (wine, jam, juice, jelly, vinegar, grape seed oil, and raisins) play an important role. Therefore, the safety of vineyards in terms of heavy metals is a key factor in the grape production and wine industries [17,18]. In viticulture areas, the excessive and prolonged usage of fertilizers and pesticides releases heavy metals (e.g., Cu, Zn, Cd, Pb, Cr, Ni, Hg, and As), which has been considered in many studies [16-20]. According to Milićević et al. [18] and Sun et al. [17], significant correlations occur between heavy metal concentrations in soil, grapevine parts (leaf, skin, pulp, and seed), and wine. Alagić et al. [21] also concluded that the grapevine has some highly effective strategies for tolerating heavy metal stress, which makes it an excellent plant species for phytostabilization purposes. Therefore, grapevine foliage monitoring can potentially indicate the heavy metal concentrations in other parts of the plant, and grapevine foliage is also acknowledged to be a bio-indicator of heavy metals in the surrounding environment.
Heavy metal stress can produce changes in plant morphological and biochemical characteristics [15]. These changes are reflected in the leaf spectral response, which is mainly affected by plant structural and morphological characteristics (i.e., the leaf's intracellular and extracellular structure) and by biochemical parameters such as nitrogen, pigment, and water contents [22][23][24][25][26][27].
Usually, heavy metal concentrations are detected in plant samples by acid digestion-solvent extraction followed by hydride generation atomic absorption spectrometry [28,29]. This tedious approach is expensive and destructive. Alternatively, by modeling the relationships between the heavy metal concentrations and foliar spectral characteristics, these concentrations can be efficiently estimated without using any chemical solvents. Therefore, by analyzing leaf spectral data, it becomes possible to investigate the biochemical and morphological changes caused by heavy metal stress [15,30]. It should be noted that in-field spectroscopy is one of the most attractive fields in remote sensing studies and can record spectral data that are specific to an object, much like a fingerprint [31,32]. Hyperspectral sensors can be used in the in-field spectroscopy process and provide a framework for spectral reflectance acquisition in hundreds of narrow and contiguous bands/wavelengths [24,26]. Accordingly, it is expected that a plant exposed to heavy metal stress will show subtle differences in its spectral curve compared to a healthy plant. These differences mainly occur in the visible and near-infrared regions of the electromagnetic spectrum [33].
Several studies have used crop spectral characteristics obtained through in-field spectroscopy, together with multivariate statistical analysis, to predict heavy metal content in plant samples. For instance, Font et al. [28] and Font et al. [29] applied visible and near-infrared spectroscopy and the modified partial least squares (PLS) method to forecast metal content in prostrate amaranth and rice, with determination coefficients of 0.63 and 0.65, respectively. In another study, Rosso et al. [34] examined the spectral and physiological responses of Salicornia virginica to heavy metal (Cd and V) stress in laboratory conditions. The potential of in-field spectroscopy to detect heavy metal contents was also investigated by Ni et al. [35], Gu et al. [36], Liu et al. [37], Liu et al. [38], and Li et al. [39] in the case of dominant plants in the Poyang lake wetlands, Brassica rapa chinensis, rice, Phragmites australis, and vegetables, respectively.
It is worth noting that in-field spectroscopy delivers a large amount of spectral data, whereby each of the wavelengths may be associated with one of the plant parameters [40]. Therefore, identifying optimal wavelengths to monitor any parameter (e.g., heavy metal concentrations) is an important step in applying these data [41]. In this regard, multivariate statistical techniques such as the PLS method [14,40,42,43], multiple linear regression (MLR) [41,44,45], and support vector machines (SVM) [12,40,46] can help with feature selection, data reduction, and modelling the existing relationships between hyperspectral data and plant characteristics. Many studies have also taken advantage of spectral indices to minimize atmospheric and background disturbances and illustrate plant characteristics [3,15,30,45,47]. These indices are mathematical spectral transformations of two or more bands designed to enhance the spectral response of vegetation properties [12,40,46]. Hence, spectral indices calculated from foliar reflectance data may reveal the biochemical and physiological properties of leaves and may therefore be useful for monitoring plant characteristics [46]. Despite the proven performance of in-field spectroscopy in estimating heavy metal contents in plants, to the best of our knowledge, no such study has been conducted on grapevine leaves.
Altogether, this study was designed with the following goals: i) developing hyperspectral libraries of healthy and heavy metal-stressed grapevine leaves (Vitis vinifera cv. Askari, as a common grapevine variety in Iran) by using full range in-situ spectroscopy (350-2500 nm), ii) evaluating the potential of in-field spectroscopy for estimating heavy metals (Cu, Zn, Pb, Cr, and Cd) concentrations in grapevine foliage, iii) investigating two types of hyperspectral data (wavelengths vs. spectral indices) and identifying the most appropriate features to estimate each studied metal in grapevine foliage, and iv) comparing the performance of SVM and MLR algorithms in modeling the relationships between the foliar spectral response and heavy metal concentrations.

Pollutant Exposure Experiments
Since experience in the evaluation of in-field spectroscopy when estimating heavy metal contents in grapevine leaves is lacking, we chose to conduct this research in a laboratory-controlled environment. Therefore, treatments for heavy metal stress were applied to grapevine seedlings. For this purpose, five treatments at varying levels of Cu, Zn, Pb, Cr, and Cd were considered, and in each treatment, four repetitions were carried out (a total of 84 grapevine seedlings were examined).
It should be noted that the objective of this experiment was not to determine the sensitivity of grapevine to pollutants. We only intended to add heavy metal contents to the grapevine to compare its spectral differences with healthy leaf samples. The common grapevine variety in the study area is Vitis vinifera cv. Askari; all seedlings belonged to this variety to eliminate the effect of variety change on spectral characteristics [43,48]. Experiments were conducted outdoors in full sun between March and September 2018. Each grapevine seedling was placed in an individual plastic pot (length and width 25 cm × 10 cm) and was randomly assigned to one of the studied treatments. The seedlings were two years old, and their height at the beginning of the experiment was between 20 and 30 cm. All seedlings were kept in the same conditions in terms of soil, pot size, sunlight exposure, watering, temperature, and humidity. In Figure 1, a schematic of the applied treatments is displayed. The first treatment served as a control to monitor the potential effects of soil, water, and air on the transfer of heavy metals to grapevine seedlings. In the second treatment, the seedlings were stressed with the maximum allowed level (MAL) of Cu, Zn, Pb, Cr, and Cd in irrigation water. The contamination levels were increased in the third, fourth, and fifth treatments to two, three, and four times the metal MALs in irrigation water, respectively. A stress program was applied to treatments 2-5 by dissolving the metal salts (nitrate form) in irrigation water. These metal salts have a high solubility, which results in the absorption of the metals by plant organs [34]. According to the Iranian Water Quality Standard (IWQS), the MALs for Cu, Zn, Pb, Cr, and Cd in irrigation water are 200, 2000, 100, and 10 µg/L, respectively. Seedlings were examined for a period of seven months, and they were stressed once during each month (a total of seven stresses were applied).
At the end of the stress period and before the beginning of the fall season (September 2018), a spectrophotometric analysis of the grapevine seedling leaves was carried out.

Spectra Acquisition
At least five leaves of each seedling pot were collected for spectroscopy measurements (a total of 420 spectra samples were taken), and afterwards, individual reflectance spectra were measured by pot. In this study, the grapevine foliar spectral reflectance was measured using the ASD FieldSpec 3 spectroradiometer in the full range (350-2500 nm). This instrument comprises three separate spectrometers (first: 350-975 nm, second: 976-1770 nm, and third: 1771-2500 nm). The ASD spectral resolutions in the ranges of 350-1000 nm and 1000-2500 nm are 3 and 10 nm, with sampling intervals of 1.4 and 2 nm, respectively. In accordance with Kumar et al. [49], the electromagnetic spectrum in the range of 350 to 2500 nm can be classified into four regions, namely visible (VIS), red-edge (RDE), near-infrared (NIR), and mid-infrared (MIR), with ranges of 350-700, 680-750, 700-1300, and 1300-2500 nm, respectively. We performed the spectroscopy experiment in a fully dark room in order to reduce the effect of wind, water vapor, temperature, and other environmental disturbances [12]. In this study, each spectral sample was recorded in 100 automatic replicates. Then, we applied ViewSpect version 6.0 to convert the spectral curves into text files and analyze them with statistical software.
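The region boundaries quoted above can be turned into a small lookup that is convenient when interpreting selected wavelengths. The following is only an illustrative sketch: the function and variable names are hypothetical, and the boundaries are the ones cited from Kumar et al. [49] in this section. Because the VIS/RDE and RDE/NIR ranges overlap, a wavelength may fall into two regions.

```python
# Region boundaries (nm) as quoted from Kumar et al. [49] in the text.
# VIS/RDE and RDE/NIR overlap, so one wavelength can belong to two regions.
REGIONS = {
    "VIS": (350, 700),
    "RDE": (680, 750),
    "NIR": (700, 1300),
    "MIR": (1300, 2500),
}


def regions_of(wavelength_nm):
    """Return the names of all regions whose range contains the wavelength."""
    return [
        name
        for name, (low, high) in REGIONS.items()
        if low <= wavelength_nm <= high
    ]


# Wavelengths reported in this study as optimal for Cu.
for wavelength in (2431, 809, 489, 616):
    print(wavelength, regions_of(wavelength))
```

With these boundaries, 489 and 616 nm fall in the VIS region, 809 nm in the NIR region, and 2431 nm in the MIR region.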
For each sample, the reflectance spectrum was recorded at 2151 wavelengths (350-2500 nm), which gave a large amount of data, not all of which may be useful for the study purpose. Therefore, in this study, 32 spectral indices were calculated to evaluate their ability to estimate heavy metal contents. The spectral indices used in this study were calculated based on the method indicated by Mirzaei et al. [12], although no specific spectral indices exist to detect heavy metal contamination [1]. Table 1 shows the indices that have demonstrated sensitivity in previous studies to small differences in plant characteristics [12,46,50].
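As an illustration of how a two-band spectral index is computed from a reflectance spectrum, the sketch below implements a generic normalized difference. It is not the authors' implementation: the band positions (800 and 670 nm, a common NDVI-style choice) and the reflectance values are assumptions made for the example, since the exact band definitions of the 32 indices are given in Table 1 and in Mirzaei et al. [12].

```python
def normalized_difference(reflectance, band_a_nm, band_b_nm):
    """Compute (R_a - R_b) / (R_a + R_b) for two bands of one spectrum.

    `reflectance` maps wavelength in nm to reflectance in the range 0..1.
    """
    r_a = reflectance[band_a_nm]
    r_b = reflectance[band_b_nm]
    if r_a + r_b == 0:
        raise ValueError("both bands have zero reflectance")
    return (r_a - r_b) / (r_a + r_b)


# Hypothetical leaf spectrum: the values are invented for illustration only.
spectrum = {670: 0.05, 800: 0.45}

# NDVI-style index using assumed band centres of 800 nm (NIR) and 670 nm (red).
ndvi = normalized_difference(spectrum, 800, 670)
print(round(ndvi, 3))
```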

Heavy Metal Laboratory Analysis
After the foliar reflectance spectra were obtained, the leaves of each pot were placed in polyethylene bags and transferred separately to the laboratory. The leaf samples were dried for 24 hours in an oven at 45 °C to achieve a constant weight [16]. The samples were powdered with a stainless-steel mill and stored for further analysis. We then digested one gram of each grapevine sample with HNO3 + HClO4 (3:1 v/v) for about 4 hours at a low temperature (about 40 °C) [51]. All digested samples were then filtered and diluted to 25 ml. Finally, a Graphite-Furnace Atomic Absorption Spectrophotometer (GF-AAS, Model: Analytik Jena, Germany) was used to analyze all samples in triplicate. The heavy metal concentrations were expressed in mg/kg dry weight (DW). The device detection limits for Zn, Cu, Pb, Cr, and Cd were 0.008, 0.025, 0.01, 0.04, and 0.009 mg/kg, respectively. Based on the analysis, the relative standard deviation was less than 9%. To evaluate the accuracy of the analytical techniques, a spike-and-recovery analysis was performed: previously analyzed samples were spiked with varying amounts of standard metal solutions and homogenized. The recovery ranged from 90% to 108% of the spiked amount [52].

Feature Selection/Partial Least Squares (PLS)
In summary, the dependent variables were the contents of Cd, Cr, Cu, Pb, and Zn in grapevine leaves, while the independent variables were wavelengths (count: 2151) and spectral indices (count: 32). However, a large number of independent variables can reduce the performance of the relationship modelling between spectral data and metal contents. To mitigate this, we needed a feature selection process to identify optimal features (wavelengths and spectral indices) to forecast the concentration of each metal individually. Also, before applying statistical operations, it is recommended to scale each variable linearly to the same standard range, especially in machine learning methods [40,53]. The values of wavelengths, spectral indices, and heavy metal concentrations were therefore scaled to the range between 0 and 1, as follows:

$$ x_{\text{scaled}} = \frac{x - \min x}{\max x - \min x} $$

where $x$ is the original data, and $\min x$ and $\max x$ are the minimum and maximum of each variable, respectively. Given the high-dimensional spectral dataset, multivariate statistical analysis is an appropriate solution for identifying optimal features to estimate each metal. PLS is a robust and well-known statistical analysis for hyperspectral data that has shown acceptable performance in many studies [12,40]. This method generates new components based on least squares regression instead of using the existing inputs directly. Unlike principal components analysis (PCA), PLS considers the response variables in the data reduction process [54]. Other advantages of the PLS method are that it fits a regression model between input and output variables, handles highly collinear spectral data, and has a high processing speed. The PLS-developed components are capable of explaining community variance by a simpler structural mechanism. Accordingly, the importance of each input variable is indicated by its factor load in each component [12].
We therefore selected optimal independent variables (wavelengths or spectral indices) based on the maximum factor load in each developed PLS component. These variables were considered to be the most representative of the related components. Based on the PLS results, the optimal wavelengths and indices were identified and introduced to the next step (modelling). Wold et al. [55] has provided more information about the assumptions and applications of the PLS.
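The two mechanical steps described in this section, min-max scaling and picking the variable with the largest factor load in each component, can be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, the loading values are invented, and taking the largest absolute loading is an assumption about how "maximum factor load" was applied. The PLS fit that would produce the loadings is not shown.

```python
def min_max_scale(values):
    """Scale a sequence linearly to [0, 1]: (x - min) / (max - min)."""
    low, high = min(values), max(values)
    if high == low:
        raise ValueError("a constant variable cannot be min-max scaled")
    return [(x - low) / (high - low) for x in values]


def top_feature_per_component(loadings, feature_names):
    """For each component, return the feature with the largest |loading|.

    `loadings[k][j]` is the loading of feature j on component k.
    """
    selected = []
    for component in loadings:
        best = max(range(len(component)), key=lambda j: abs(component[j]))
        selected.append(feature_names[best])
    return selected


# Hypothetical loadings of 3 wavelengths (nm) on 2 PLS components.
names = [489, 616, 809]
loadings = [
    [0.10, -0.70, 0.20],
    [0.55, 0.05, -0.30],
]

print(min_max_scale([2.0, 4.0, 6.0]))              # [0.0, 0.5, 1.0]
print(top_feature_per_component(loadings, names))  # [616, 489]
```

Selecting one variable per component means the number of retained components directly sets the number of model inputs.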

Modelling the Relationship Between Spectral Data and Heavy Metal Contents
After the identification of the optimal wavelengths and relevant indices by the PLS, two types of modelling algorithms (SVM and MLR) were applied to estimate heavy metal concentrations based on hyperspectral data. To assess the estimation performance of each model, two goodness-of-fit indicators, specifically the coefficient of determination (R²) and the root mean squared error (RMSE), were used [40]. All data obtained in this study were randomly separated into two sections: 70% as training data and 30% as testing data. As such, the performance of each developed model was individually reported for training and testing sets.
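The evaluation protocol described above can be sketched with the standard library alone. The code below is an illustration under stated assumptions: the function names are hypothetical, the random seed is arbitrary, the sample count in the usage example is a placeholder, and R² is computed as 1 - SS_res/SS_tot, which may differ from the definition used by the statistical software in the study (for example, a squared correlation coefficient).

```python
import math
import random


def train_test_split(n_samples, test_fraction=0.3, seed=0):
    """Randomly split sample indices into (train, test) index lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_test = round(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]


def r_squared(observed, predicted):
    """Coefficient of determination, defined as 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rmse(observed, predicted):
    """Root mean squared error, in the units of the observations."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Placeholder sample count; a 70/30 split of 84 samples gives 59 and 25.
train_idx, test_idx = train_test_split(84)
print(len(train_idx), len(test_idx))
```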

Support Vector Machine (SVM)
SVM is a nonparametric learning algorithm for regression and classification goals and for hyperspectral data mining [56][57][58]. In the SVM procedure, the n-dimensional input vectors are mapped into a high-dimensional feature space, and consequently, the optimal separating hyperplanes are developed [59]. Here, the SVM regression algorithm was used in multiple scenarios and designs to gain the best performance for modelling the relationship between the in-field hyperspectral data and the measured heavy metal concentration in grapevine leaves. To this end, the input vectors were linked to the outputs with a kernel function [12]. Regression SVM-type 1 was applied with different kernel functions, i.e., radial basis function (RBF), polynomial, and sigmoid. In order to achieve an optimal training constant, V-fold cross validation was used, and the kernel function parameters (coefficient, gamma, and degree) were altered to give a high-performance score [60]. More details about the assumptions and structure of SVM are provided by Stitson et al. [59] and Cristianini and Shawe-Taylor [61].
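To make the role of the kernel parameters (coefficient, gamma, and degree) concrete, the sketch below writes out the three kernel functions in their common parameterization, as used for example in LIBSVM. This is an assumption: the exact forms implemented in the software used for this study are not stated in the text, and the function names and input vectors are hypothetical. The SVM training itself is not shown.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma):
    """Radial basis function kernel: exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


def polynomial_kernel(x, y, gamma, coef0, degree):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree."""
    return (gamma * dot(x, y) + coef0) ** degree


def sigmoid_kernel(x, y, gamma, coef0):
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0)."""
    return math.tanh(gamma * dot(x, y) + coef0)


# Two hypothetical scaled feature vectors (e.g., reflectance at 4 wavelengths).
u = [0.2, 0.5, 0.1, 0.9]
v = [0.3, 0.4, 0.1, 0.7]
print(rbf_kernel(u, v, gamma=0.5))
```

Larger gamma values make the RBF kernel more local, so similarity decays faster with distance between two spectra.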

Multiple Linear Regressions (MLR)
MLR is a parametric regression algorithm that models the relationship between two or more independent variables and a response variable with a linear fit. It has the capacity to select appropriate input data. In this study, the forward selection method of MLR was applied, in which independent variables are added one at a time to increase the R² value [40]. The Durbin-Watson statistic was applied to test for autocorrelation in the residuals of the regression analysis. Durbin-Watson values close to 2 (1.5-2.5) indicate that no autocorrelation is detected in the samples. Additionally, in order to detect multicollinearity in the regression analysis, the variance inflation factor (VIF) was considered (VIFs exceeding 10 are signs of serious multicollinearity) [62,63]. The general form of the MLR equation is as follows:

$$ \mathrm{HMC} = a_0 + a_1 X_1 + a_2 X_2 + \dots + a_n X_n $$

where HMC is the heavy metal concentration in grapevine leaves, $a_i$ (i = 0,1,…,n) are the parameters generally estimated by least squares, and $X_i$ (i = 1,2,...,n) are the independent variables (i.e., wavelengths and spectral indices).
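The two diagnostics named above follow standard textbook definitions, which can be sketched as below. The function names and the example residuals are hypothetical, and the thresholds are the ones quoted in this section (1.5-2.5 for Durbin-Watson, 10 for VIF). The VIF helper takes the R² obtained by regressing one predictor on the remaining predictors; that auxiliary regression is not shown.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences of the
    residuals divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2
        for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def variance_inflation_factor(r2_j):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j
    on all other predictors."""
    if not 0.0 <= r2_j < 1.0:
        raise ValueError("R^2 must lie in [0, 1)")
    return 1.0 / (1.0 - r2_j)


def passes_checks(dw, vifs):
    """Apply the thresholds quoted in the text."""
    return 1.5 <= dw <= 2.5 and all(v <= 10 for v in vifs)


# Hypothetical residuals that alternate in sign (negative autocorrelation).
dw = durbin_watson([1.0, -1.0, 1.0, -1.0])
print(dw)  # 3.0, outside the 1.5-2.5 band
```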

Reflectance Spectra of Healthy and Stressed Leaves
The average reflectance spectra of healthy grapevine leaves and of leaves stressed by heavy metals are shown in Figure 2. In the VIS region, the light absorption rate of the stressed grapevine was drastically decreased. This is because the spectral characteristics of plants in this region are largely governed by pigments [64,65]. Accordingly, this suggests that heavy metal stress reduced pigment contents. Different spectral characteristics between healthy and stressed leaves can also be observed in the RDE, NIR, and MIR regions (Figure 2). As Vogelmann [66], Slaton et al. [23], and Strever [67] stated, plant pigments do not absorb light in the NIR and MIR regions; therefore, the plant leaf reflectance increases significantly in these regions. Additionally, the spectral characteristics of plant leaves in the NIR and MIR regions are governed by structure/morphology and water contents, respectively [54]. According to Figure 2, in the NIR and MIR regions, a lower reflectance was observed in healthy grapevine leaves than in the stressed grapevine. Although other driving variables such as structural parameters and water contents were not measured in this study, it can be concluded that the stress caused by heavy metals had a significant effect on the leaf optical properties.

The highest absolute correlation coefficient occurred in the range of 350 to 400 nm for Cr, Pb, and Zn. Cd showed the best correlation with the wavelengths in the VIS region (400-680 nm), while its correlation dropped sharply in the RDE region (680-750 nm) (Figure 3). This suggests that the RDE region is one of the best options for introducing optimal wavelengths to estimate Cd concentrations in grapevine leaves. Also, the other heavy metals caused subtle fluctuations in the RDE region, and their correlation coefficients tended to be positive. This finding indicates the potential of this region to forecast metal contents in the grapevine leaves.
Similar correlation coefficients were observed for Cu, Zn, Cr, and Pb in the NIR region (750-1300 nm), but Cd had a varied correlation curve in this range. In the MIR region (1300-2500 nm), the heavy metal correlation coefficients were closer together (Figure 3). With the exception of Pb, the metals were negatively correlated with most wavelengths of this region.
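A correlation curve of the kind discussed for Figure 3 can be built by computing one correlation coefficient per wavelength between the reflectance values and the measured metal content across samples. The sketch below assumes the Pearson coefficient, which the text does not state explicitly; the function names and all data values are hypothetical and chosen only to show one positive and one negative correlation.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlogram(spectra, metal_content):
    """Correlate reflectance at each wavelength with the metal content.

    `spectra` maps wavelength (nm) to a list of reflectance values, one per
    sample, in the same sample order as `metal_content`.
    """
    return {wl: pearson_r(values, metal_content) for wl, values in spectra.items()}


# Invented data for four samples: reflectance rises with metal content at
# 550 nm and falls at 1650 nm.
metal = [1.0, 2.0, 3.0, 4.0]
spectra = {
    550: [0.10, 0.12, 0.14, 0.16],
    1650: [0.40, 0.35, 0.30, 0.25],
}
curve = correlogram(spectra, metal)
print({wl: round(r, 3) for wl, r in curve.items()})
```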

Correlation Coefficient
In a related study, Zhuang [41] obtained a comparable correlation graph between spectral response (400-2500 nm) and heavy metal contents (Cu, Zn, Pb, Cd, As, and Fe). However, a comparison of Figure 3 with the results of Zhuang [41] shows that the correlation patterns between the heavy metal contents and the spectral response are not alike. The structural and biochemical differences between the studied species (grapevine and rice) and the level of spectroscopy (leaf or canopy level) can be considered the most important drivers of these differences.

Optimal Feature Selection
Determining the optimal wavelengths to monitor the desired plant parameters within the vast hyperspectral bands is one of the most critical operations in spectroscopy [43,46,56]. Commonly, a small number of wavelengths/spectral indices with maximum performance for the study purpose are selected, while the loss of information should be minimal [46,68]. Thus, we chose the PLS method because of its high adaptability to hyperspectral data to recognize optimal predictive variables (wavelengths and spectral indices) for estimating heavy metals in grapevine leaves [42]. Identifying the appropriate number of components is one of the most important factors in applying the PLS results because the number of components directly determines the number of model input variables. Accordingly, a cross-validation algorithm was applied to optimize the number of PLS components [43], and then the optimum variable for each of the components was identified. Figure 4 shows the number of optimal components and the wavelength factor loads of the metals studied. This figure shows that the numbers of developed components were 4, 6, 5, 6, and 4 for Cu, Zn, Pb, Cr, and Cd, respectively. Therefore, based on the introduced components, the wavelengths and spectral indices which had the highest correlation with the components were identified. They can subsequently be used as optimal spectral wavelengths and indices in the relevant modelling process, especially for estimating metal concentrations in the grape leaves [43].
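Choosing the number of components by cross-validation amounts to partitioning the samples into folds, scoring each candidate component count on the held-out folds, and keeping the count with the lowest error. The sketch below shows only the fold partitioning and the final choice. It is an illustration under assumptions: the function names are hypothetical, the folds are contiguous rather than shuffled, the error values are invented, and the PLS fitting that would produce those errors is not shown.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, validation) index lists for k contiguous folds."""
    fold_sizes = [
        n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)
    ]
    start = 0
    for size in fold_sizes:
        validation = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, validation
        start += size


def best_component_count(cv_error_by_count):
    """Return the component count with the lowest cross-validated error.

    Ties are resolved in favour of fewer components.
    """
    return min(sorted(cv_error_by_count), key=lambda c: cv_error_by_count[c])


# Invented cross-validated errors for 1 to 5 components.
cv_errors = {1: 0.90, 2: 0.60, 3: 0.40, 4: 0.35, 5: 0.36}
print(best_component_count(cv_errors))  # 4

folds = list(k_fold_indices(10, 3))
print([len(validation) for _, validation in folds])  # [4, 3, 3]
```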
As shown in Figure 4, the wavelengths in the vicinity of 2431, 809, 489, and 616 nm can be recognized as optimal for estimating Cu content in grapevine leaves. In the same way, the wavelengths in the vicinity of 2032, 883, 665, 564, 688, and 437 nm; 1865, 728, 692, 683, and 356 nm; 863, 2044, 415, 652, 713, and 1036 nm; and 1373, 631, 744, and 438 nm were identified as optimal for estimating Zn, Pb, Cr, and Cd contents, respectively. Zhuang [41] also reported that VIS and RDE delivered the most optimal wavelengths for estimating heavy metal contents. Moreover, according to the results, the RDE was one of the most influential regions in introducing optimal wavelengths for estimating the contents of Zn, Pb, Cr, and Cd. Consistent with this finding, Gu et al. [36] noted that the RDE region is sensitive to variations in metal contents (especially Cd). They suggested 782 nm as an optimal wavelength for estimating Cd concentration in Brassica rapa leaves.

In the same way, the optimal spectral indices for estimating the contents of Cu, Zn, Pb, Cr, and Cd were determined based on the interpretation of the PLS results. Table 2 presents a summary of the PLS results used to determine the optimal indices for estimating the heavy metal concentrations. The Structure Insensitive Pigment Index (SIPI) (proposed by Penuelas et al. [69]), which represents the ratio of carotenoids to chlorophyll, was the most frequently selected index among the studied indices, being optimal for the estimation of Pb, Cr, and Cd concentrations. Furthermore, the Disease Water Stress Index (DWSI) and Moisture Stress Index (MSI), which are sensitive to water levels in vegetation (water stress), were identified as optimal indices for estimating Zn-Pb and Cu-Cd, respectively. It is worth remarking that the Normalized Difference Vegetation Index (NDVI) was not chosen as an optimal index to predict the studied metal contents. On the other hand, according to Zhuang [41], the NDVI band ratios were extremely useful in monitoring the contents of metals in the paddy canopy.
Therefore, it can be argued that, in addition to the structural and biochemical differences between grapevine and paddy species, the differences in studied spectral indices are another reason for differences in the optimal spectral indices.

Modelling and Accuracy Assessment
After determining the optimal spectral wavelengths and indices, two regression approaches (i.e., MLR and SVM) were applied to model the relationships between spectral data and heavy metal concentrations. Table 3 illustrates the best-developed models and validation results using the SVM algorithm. Based on this table, the RBF function was selected as the optimal kernel function in 60% of the developed models, followed by the linear function (30%). These two functions were therefore considered the optimal functions for relevant modelling in the studied grapevine leaves.

Table 4 shows the modelling results using the MLR method. The Durbin-Watson coefficient ranged from 1.5 to 2.5 in 60% of the presented models, indicating a lack of autocorrelation between the error terms of these regression models [40]. However, in the models presented for Pb-Cd (based on wavelengths) and Cr-Cd (based on spectral indices), the Durbin-Watson coefficient was less than 1.5, so these models lacked one of the most important conditions for using regression modelling. VIF was also considered for checking multicollinearity between the predictor variables in the regression models. According to Table 4, there was serious multicollinearity (some predictor VIFs exceeded the critical threshold of 10) in the wavelength-based Pb model and the spectral index-based Zn model. Therefore, these models violate a key assumption of multiple linear regression, which makes them invalid.

Table 4. The results of modelling and validation of the best multiple linear regression (MLR) models based on optimal wavelengths and spectral indices for estimating heavy metal concentrations in grapevine leaves in training and testing sets.

Figure 5 illustrates the distribution of the observed vs. predicted concentration of Cu in the test set. In some cases, the predicted values were significantly lower than the observed values, which led to a sharp decrease in their accuracy.
With the optimal wavelengths, the SVM and MLR approaches predicted the test samples with accuracies of 54% and 56%, respectively. As a general finding, using wavelengths gave a more acceptable performance than using spectral indices for estimating Cu concentration in the grapevine leaves. In relation to the modelling approaches, it should be noted that, although MLR yielded a slightly higher R² than SVM (in the test set), the SVM RMSE (25.06 mg/kg) was lower than the MLR RMSE (25.65 mg/kg); therefore, the SVM's performance seems more acceptable (see also Tables 3 and 4).

Modelling of Zn Concentration
The SVM and MLR approaches based on wavelengths were able to predict the Zn contents with accuracies of 42%-47% and based on spectral indices with accuracies of 70%-85% in the testing set, respectively (Tables 3 and 4). As shown in Figure 6, the predicted values overestimated the observed values in most cases of wavelength-based models. However, a more uniform distribution was found between the observed and predicted values in spectral indices-based models. Therefore, spectral indices-based models tend to be preferred for predicting Zn contents in the grapevine leaves.

Modelling of Pb Concentration
The MLR models based on wavelengths and spectral indices yielded a low performance in testing sets with accuracies of 13% and 15%, respectively (Table 4 and Figure 7). Conversely, the SVM model performed more reasonably and predicted Pb contents based on wavelengths and spectral indices in the testing set with accuracies of 71% and 67% (RMSE: 22.49 and 24.51 mg/kg), respectively (Table 3 and Figure 7). It can thus be deduced that SVM is better at estimating Pb contents in the grapevine leaves as opposed to MLR. It should also be noted that the wavelength-SVM model had a more acceptable performance as compared to spectral indices. The obtained results therefore suggest that the wavelength-SVM model is an optimal scenario for estimating Pb contents in grapevine leaves.

Figure 8 shows the distribution pattern of the observed vs. predicted values of Cr in the test set. As shown, the predicted values were overestimated in most cases of MLR, but the predictions of SVM were closer to the observed contents. Overall, the usage of the spectral indices-SVM model was an optimal scenario for estimating Cr contents in the studied grapevine leaves (see also Tables 3 and 4).

Modelling of Cd Concentrations
The wavelengths-based and spectral indices-based MLR models can estimate Cd contents with accuracies of 64% and 67% in the testing set, respectively (Table 4 and Figure 9). On the other hand, the accuracies of SVM were 77% and 86% in the testing set, respectively. Thus, the SVM outperformed the MLR method at estimating Cd concentrations in the grapevine leaves. It must be admitted, however, that the majority of observed values were around zero, leading to biased estimations. This is also reflected in the RMSE values, which were higher than the other studied metals. Overall, the best model presented by the SVM approach (based on spectral indices) had an RMSE value of 102.85 mg/kg dry weight in the testing set (see also Table 3).

Summarizing Heavy Metal Modelling
Grapevine leaves are a suitable option for the study of the absorption and accumulation of heavy metals [21]. Therefore, the monitoring of heavy metal concentration can help ensure food security as well as reduce health and ecological risks [16]. In this study, a stress-stroke method was employed to ensure the appearance of heavy metals in grapevine foliage. This method was also used in similar studies [34,38]. It is important to note that increasing heavy metal accumulation in plant foliage leads to an increase in the number of reactive oxygen species [70]. Reactive oxygen species are produced in the course of electron transfer activities, mainly in chloroplasts and mitochondria. They also play an important role in consequences such as plant growth retardation, chlorophyll content reduction, inhibition of enzymatic activity, damage to biological molecules (such as lipids, proteins, and nucleic acids, especially DNA), cell membrane peroxidation, and damage to important cellular organelles such as chloroplasts and mitochondria [71,72]. Heavy metal stress, like other abiotic stresses, leads to changes in the synthesis pathways of secondary plant metabolites and increases or decreases these compounds [73,74]. It was also observed that heavy metal stress leads to changes in the cuticle position of the leaves and the openings of the leaves' stomata [73]. Because heavy metals cause physico-chemical changes in the plant, the spectral pattern of the plant can change, which differentiates the spectral pattern of stressed leaves from that of healthy leaves. These differences can be detected by field-based spectrometry.
According to our results, the SVM and MLR prediction methods performed similarly in estimating Cu contents, but for Zn, Pb, Cr, and Cd, the SVM models outperformed the MLR models (Tables 3 and 4). Therefore, the SVM regression method tends to be preferred. Although MLR was the most-used model in related studies due to its clarity and structural simplicity [30,41,45], the results of this study recommend SVM for future investigations. The superiority of SVM over MLR can be attributed mainly to the nature of the relationships between the independent and dependent variables. SVM regression was able to estimate heavy metals more accurately due to its high flexibility in training, as it can use both linear and nonlinear functions in the kernel equation [75]. Similarly, a comparison between MLR and artificial neural network (ANN) methods was performed to estimate heavy metals in rice leaves [38]; the results also showed the superior performance of ANN over MLR.
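The role of the kernel can be illustrated with a short sketch. It assumes the standard form of a trained SVM regression function, f(x) = sum_i c_i k(s_i, x) + b, and shows one linear and one nonlinear (radial basis function) kernel. The support vectors, coefficients, bias, and gamma value below are hypothetical; the kernels and hyperparameters actually used in this study are not stated in this section.

```python
import math

def linear_kernel(x, z):
    """Dot product: with this kernel the fitted function is linear in x."""
    return sum(a * b for a, b in zip(x, z))

def rbf_kernel(x, z, gamma=1.0):
    """Radial basis function kernel: depends on the distance between x and z."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, kernel):
    """Evaluate f(x) = sum_i c_i * k(s_i, x) + b for an already-trained model."""
    return sum(c * kernel(s, x) for s, c in zip(support_vectors, dual_coefs)) + bias

# Hypothetical trained model with two support vectors in a two-feature space.
support_vectors = [[1.0, 0.0], [0.0, 1.0]]
dual_coefs = [2.0, -1.0]
bias = 0.5
sample = [3.0, 4.0]

linear_estimate = svr_predict(sample, support_vectors, dual_coefs, bias, linear_kernel)
rbf_estimate = svr_predict(sample, support_vectors, dual_coefs, bias,
                           lambda s, x: rbf_kernel(s, x, gamma=0.1))
print(linear_estimate, rbf_estimate)
```

Swapping the kernel changes the shape of the function that can be fitted without changing the rest of the model. This is the flexibility referred to above, whereas MLR is restricted to a linear combination of the predictors.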
The testing-set results and the optimal spectral indices and wavelengths for estimating heavy metal contents were compared with those of various other studies (Table 5). Based on the R2 of the testing set, the performance order of the presented models was Cd > Zn > Cr > Pb > Cu (Table 5), with predictive accuracies for Cd, Zn, and Cu of 86, 85, and 56%, respectively. Li (2011) listed a prediction order accuracy of heavy metals in vegetation as Cr > Pb > Cu > Zn. Furthermore, Zhuang [41] ranked the prediction accuracy of heavy metals in rice as Cu > Pb > Zn, which is different from the findings of the present study (Table 5). The prediction accuracy for Pb in this study is close to the findings of Li [44] and Zhuang [41]. The accuracy of Cr content prediction is also comparable to the results of Li et al. [39]. According to Li [44], Zhuang [41], and Ping et al. [30], Cu prediction accuracies were 60, 76, and 69%, respectively, all higher than the present study's result (56%) (Table 5). However, the present study was able to estimate Zn contents with a higher accuracy than Li [44], Zhuang [41], and Kooistra et al. [45], and Cd contents with a higher accuracy than Ping et al. [30] and Liu et al. [37].
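The ranking and the Cu comparison above follow directly from the reported numbers, as the following short check shows. The R2 values are those reported in this study for the best SVM models in the testing set, and the literature Cu accuracies are the ones quoted in the text. The variable names are illustrative only.

```python
# Testing-set R2 of the best SVM model for each metal, as reported in this study.
test_r2 = {"Cu": 0.56, "Zn": 0.85, "Pb": 0.71, "Cr": 0.80, "Cd": 0.86}

# Order the metals from the most to the least accurately predicted.
ranking = sorted(test_r2, key=test_r2.get, reverse=True)
print(" > ".join(ranking))

# Cu accuracies quoted from related studies, expressed as fractions.
literature_cu = {"Li [44]": 0.60, "Zhuang [41]": 0.76, "Ping et al. [30]": 0.69}

# Gap between each literature value and the Cu result of the present study.
cu_gap = {study: round(r2 - test_r2["Cu"], 2) for study, r2 in literature_cu.items()}
print(cu_gap)
```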
As a final remark, in many studies, the red-edge (RDE) and visible (VIS) regions were reported to be sensitive to the stress caused by heavy metals [36,38,41]. A comparison of the optimal spectral indices and wavelengths selected for heavy metal prediction in the present study and in other related studies shows discrepancies (Table 5). The number of spectral samples, spectroscopy acquisition level, spectral range, calculated spectral indices, as well as the statistical analyses used for data reduction and relationship modelling can all play a role in explaining these differences. Finally, it should also be pointed out that each heavy metal has a specific effect, leading to distinct responses depending on the plant species (including leaf colour changes, chlorosis, necrosis, dwarfism, gigantism, leaf and root spreading, etc.), which can also help to explain these discrepancies [76].

Conclusion
In this study, we examined the suitability of in-field hyperspectral data (wavelengths from 350 to 2500 nm and 32 spectral indices) in the estimation of heavy metal contents (Cu, Zn, Pb, Cr, and Cd) in vine leaves. Our most important findings are listed as follows:
i) The grapevine's foliar spectral signatures (reflectance characteristics) altered when applying heavy metal stress due to their effects on the biochemical components and the leaves' structure. Considerable changes are observed in the VIS, RDE, NIR, and MIR regions of the electromagnetic spectrum.
ii) Significant correlations are found between the heavy metal contents and the grapevine's foliar spectral response, especially in VIS and RDE regions.
iii) From the reflectance data, 32 spectral indices were formulated using two or more bands. In PLS analysis, it was found that the Simple Ratio (SR), Cellulose Absorption Index (CAI), RATIO9752, and DWSI; R680, Water Index (WI), Lic1, MSI, and Photochemical Reflectance Index (PRI)2; Vogelman Index (VOG), MSI, SIPI, and R550; mNDVI705, Greenness Index (GI), RATIO975, and SIPI; and SIPI and DWSI are more responsive to heavy metal contents compared with the other indices. They are considered to be optimal indices to estimate Cu, Zn, Pb, Cr, and Cd concentrations, respectively.
iv) Also based on the PLS results, the wavelengths in the vicinity of 2431, 809, 489, and 616 nm; 2032, 883, 665, 564, 688, and 437 nm; 1865, 728, 692, 683, and 356 nm; 863, 2044, 415, 652, 713, and 1036 nm; and 1373, 631, 744, and 438 nm are optimal for estimating Cu, Zn, Pb, Cr, and Cd contents in the grapevine leaves, respectively. Accordingly, VIS and RDE emerged as the most sensitive regions for monitoring heavy metal contents in grapevine leaves.
v) In most cases, the SVM regression models yielded more accurate performances when estimating heavy metal contents as opposed to the MLR models. For the best SVM structures, the concentrations of Cu, Zn, Pb, Cr, and Cd are estimated with R2 values of 0.56, 0.85, 0.71, 0.80, and 0.86 in the testing set, respectively.
vi) As a general finding, spectral indices yielded more acceptable performance as opposed to wavelengths in forecasting heavy metal contents in the grapevine leaves.
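The two kinds of predictors summarized in the findings above, reflectance at selected wavelengths and band-ratio indices, can be sketched as follows. This is an illustration under stated assumptions: the index formulas are commonly used formulations (SR as a near-infrared to red ratio, PRI as (R531 - R570) / (R531 + R570), WI as R900 / R970), and the band positions used in this study may differ. The spectrum is a hypothetical linear ramp, not measured data, and all function names are illustrative.

```python
def nearest_band(wavelengths, target_nm):
    """Index of the measured wavelength closest to target_nm."""
    return min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))

def reflectance_at(wavelengths, reflectance, target_nm):
    """Reflectance at the band nearest to target_nm."""
    return reflectance[nearest_band(wavelengths, target_nm)]

def simple_ratio(wavelengths, reflectance, nir_nm=800, red_nm=680):
    """Simple Ratio; the 800 and 680 nm band positions are assumptions."""
    return (reflectance_at(wavelengths, reflectance, nir_nm)
            / reflectance_at(wavelengths, reflectance, red_nm))

def photochemical_reflectance_index(wavelengths, reflectance):
    """PRI in its common form (R531 - R570) / (R531 + R570)."""
    r531 = reflectance_at(wavelengths, reflectance, 531)
    r570 = reflectance_at(wavelengths, reflectance, 570)
    return (r531 - r570) / (r531 + r570)

def water_index(wavelengths, reflectance):
    """Water Index in its common form R900 / R970."""
    return (reflectance_at(wavelengths, reflectance, 900)
            / reflectance_at(wavelengths, reflectance, 970))

# Hypothetical spectrum over the 350-2500 nm range at 1 nm steps (not real data).
wavelengths = list(range(350, 2501))
reflectance = [0.05 + 0.4 * (w - 350) / 2150 for w in wavelengths]

# Wavelength-based predictors: the bands reported as optimal for Cu.
cu_wavelengths = [2431, 809, 489, 616]
cu_features = [reflectance_at(wavelengths, reflectance, w) for w in cu_wavelengths]

# Index-based predictors computed from the same spectrum.
index_features = {
    "SR": simple_ratio(wavelengths, reflectance),
    "PRI": photochemical_reflectance_index(wavelengths, reflectance),
    "WI": water_index(wavelengths, reflectance),
}
print(cu_features, index_features)
```

Either feature set could then be passed to a regression model such as MLR or SVM. The nearest-band lookup reflects the fact that the optimal wavelengths are reported as lying "in the vicinity of" the listed values.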
Altogether, combining spectral indices with SVM regression is suggested as the most appropriate method for predicting heavy metal contents in grapevine leaves. This conclusion also supports the use of in-field spectroscopy data and multivariate statistical analysis for the rapid and eco-friendly monitoring of heavy metals in food-producing ecosystems. This study further revealed that the spectral responses of grapevine foliage and of other agricultural and horticultural species to heavy metal stress need to be better understood. Similar studies are required to investigate heavy metal spectral signatures in other plant species. The ultimate goal of this line of research is to integrate field data with spectral data from overpassing aerial and satellite sensors in order to upscale and automate the monitoring strategy to the field scale.