UPLC-QTOF/MS-Based Nontargeted Metabolomic Analysis of Mountain- and Garden-Cultivated Ginseng of Different Ages in Northeast China

To systematically compare the chemical components of ginseng of different ages, and in particular to compare younger and older mountain-cultivated ginseng (MCG), 4-, 5- and 6-year-old cultivated ginseng (CG) and 12- and 20-year-old MCG were chosen as the analytical samples in the present study. A method combining UPLC-QTOF-MSE, the UNIFI platform and multivariate statistical analysis was developed to profile the CGs and MCGs. By screening analysis based on UNIFI, 126 chemical components of various structural types were characterized or tentatively identified from the CG and MCG samples for the first time. The results showed that all the CG and MCG samples had similar chemical compositions, but that the contents of the markers differed significantly. Metabolomic analysis based on multivariate statistical analysis showed that the CG4–6 years, MCG12 years and MCG20 years samples were clearly divided into three groups, and a total of 17 potential age-dependent markers enabling differentiation among the three groups were discovered. For differentiation from the other two kinds of samples, four robust markers (α-linolenic acid, 9-octadecenoic acid, linoleic acid and panaxydol) were discovered for CG4–6 years, five robust markers (ginsenoside Re1, -Re2, -Rs1, malonylginsenoside Rb2 and an isomer of malonylginsenoside Rb1) for MCG20 years, and two robust markers (24-hydroxyoleanolic acid and palmitoleic acid) for MCG12 years. The proposed approach could be applied to directly distinguish MCG root ages, which is an important criterion for evaluating the quality of MCG. The results will provide data for further study of the chemical constituents of MCG.


Introduction
Ginseng, the king of herbs in the Orient, has always received a lot of attention, not only as a therapeutic medicinal herb, but also as a health supplement. According to their growing environments and cultivation methods, two kinds of ginseng are distinguished in the Chinese Pharmacopoeia: cultivated ginseng (CG) and mountain-cultivated ginseng (MCG). CG is cultivated artificially in gardens, while MCG is grown for at least 10 years [1,2]. MCG, also called "Lin-Xia-Shan-Shen", can be regarded as a replacement for wild ginseng. MCG is of better quality than CG and offers higher production than wild ginseng [3]. However, adulteration or falsification of the cultivation age of MCG has always been a serious problem in the commercial MCG market. It is well known that the chemical components and biological activities of ginseng with different cultivation ages are

UPLC/QTOF-MSE
Chromatographic separation and mass spectrometric detection were conducted on a Waters Acquity UPLC system coupled with a Xevo G2-S QTOF mass spectrometer equipped with an electrospray ionization (ESI) source. Separation was performed on a Waters ACQUITY UPLC BEH C18 column (100 mm × 2.1 mm, 1.7 µm) at 40 °C. The mobile phase consisted of eluent A (0.1% formic acid in water) and eluent B (0.1% formic acid in acetonitrile) at a flow rate of 0.4 mL/min with the following gradient program: 0~2 min, 10% (B); 2~26 min, 10%~100% (B); 26~28 min, 100% (B); 28~28.1 min, 100%~10% (B); 28.1~30 min, 10% (B). Mixtures of 10/90 and 90/10 water/acetonitrile were used as the strong wash and the weak wash solvent, respectively. The optimized conditions were as follows: source temperature 120 °C, desolvation temperature 300 °C, capillary voltage 2.6 kV (ESI+) or 2.2 kV (ESI−), cone voltage 40 V, desolvation gas flow 800.0 L/h and cone gas flow 50 L/h. In MSE mode, the collision energy was set at 6 V for the low-energy function and at 20~40 V for the high-energy function. The mass spectrometer was calibrated with sodium formate in the range of 200-1500 Da. The lockmass compound was leucine-enkephalin (external reference to the ion at m/z 556.2771 in positive mode and m/z 554.2615 in negative mode). Data were collected in continuum mode with the MassLynx™ V4.1 workstation.
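The gradient program above can be read as a table of time/%B breakpoints. The following is a minimal sketch, assuming linear ramps between consecutive breakpoints; the names `GRADIENT` and `percent_b` are illustrative and are not part of any instrument software.

```python
# Gradient breakpoints (time in min, %B) transcribed from the program above.
GRADIENT = [
    (0.0, 10.0),
    (2.0, 10.0),
    (26.0, 100.0),
    (28.0, 100.0),
    (28.1, 10.0),
    (30.0, 10.0),
]


def percent_b(t_min):
    """Return %B at time t_min, assuming linear interpolation between breakpoints."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time is outside the 0-30 min run")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)
    raise ValueError("no gradient segment found")  # not reachable for valid input


if __name__ == "__main__":
    for t in (1.0, 14.0, 27.0, 29.0):
        print(f"{t:5.1f} min -> {percent_b(t):5.1f} %B")
```

Under this assumption the midpoint of the 2~26 min ramp (14 min) corresponds to 55% B, which gives a rough idea of the eluent composition at a given retention time.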

Chemical Information Database for the Components of CG and MCG
In addition to the Waters Traditional Medicine Library in the UNIFI software, a systematic literature investigation of the chemical constituents of the target herbs was conducted. A self-built database of compounds isolated from CG and MCG, such as saponins, flavonoids, volatile oils and amino acids, was established by searching online databases such as China Journals of Full-Text Database (CNKI), PubMed, Medicine, Web of Science and ChemSpider. The name, molecular formula and structure of each component from CG and MCG were recorded in the database.

The Screening Analysis Based on UNIFI Platform
UNIFI 1.7.0 software (Waters, Manchester, UK) was used to perform the screening analysis based on structural characteristics and MS fragmentation behaviors, especially characteristic fragments. The main parameters were set as follows: for peak detection, a high-energy peak intensity over 200 counts and a low-energy peak intensity over 1000 counts; a mass error of up to ±10 ppm for identified compounds; a retention time tolerance of ±0.1 min; positive adducts +H and +Na and negative adducts −H and +HCOOH; and leucine-enkephalin as the reference compound (556.2766 for the positive ion, 554.2620 for the negative ion). The MS raw data were processed using the streamlined workflow of the UNIFI software to quickly identify the chemical components that met the match criteria against the in-house Traditional Medicine Library and the self-built database [20,21].
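The two numerical match criteria above (mass error within ±10 ppm, retention time within ±0.1 min) can be illustrated with a short sketch. This is not the UNIFI matching algorithm; the functions `ppm_error` and `is_match` and the observed values in the example are hypothetical, and only the two tolerances and the reference m/z come from the text.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Mass error of an observed m/z relative to the theoretical m/z, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def is_match(observed_mz, theoretical_mz, observed_rt=None, reference_rt=None,
             ppm_tol=10.0, rt_tol=0.1):
    """Check the +/-10 ppm mass criterion and, when a reference retention
    time is available, the +/-0.1 min retention time criterion."""
    if abs(ppm_error(observed_mz, theoretical_mz)) > ppm_tol:
        return False
    if observed_rt is not None and reference_rt is not None:
        return abs(observed_rt - reference_rt) <= rt_tol
    return True


if __name__ == "__main__":
    reference_mz = 556.2766  # leucine-enkephalin positive ion, as listed above
    for observed in (556.2790, 556.2830):  # hypothetical observed values
        print(observed, round(ppm_error(observed, reference_mz), 2),
              is_match(observed, reference_mz))
```

A compound without a reference standard can only be checked on mass (and fragments), which is why the retention time arguments are optional in this sketch.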

The Metabolomics Analysis Based on Multivariate Statistical Analysis
To differentiate MCG and CG, MarkerLynx XS V4.1 software (Waters, Milford, MA, USA) was used to process the raw data by deconvolution, alignment and data reduction, and to perform the multivariate statistical analysis [20,21]. The following steps were performed: acquiring data, creating a MarkerLynx processing method, processing the acquired data and viewing the results in the Extended Statistics (XS) Viewer. The main parameters used to process the raw data were as follows: retention time range 5-28 min, mass range 200-1400 Da, mass tolerance 5 mDa, intensity threshold 2000 counts, mass window 0.05 Da and retention time window 0.20 min. In the resulting data list, each RT-m/z pair identifies an ion, and the pairs are ordered by elution time. Ions with the same RT and m/z values in different batches of samples were regarded as the same compound. Multivariate statistical analysis was then performed to find the potential biomarkers that contributed significantly to the differences among the groups. Principal component analysis (PCA) was first used to show the maximum variation and pattern recognition, giving an overview and classification, and orthogonal projections to latent structures discriminant analysis (OPLS-DA) was then performed to obtain the maximum separation between two groups. S-plots were then used to visualize the OPLS-DA predictive component loadings and facilitate model interpretation. Variable importance for the projection (VIP) was also used to help screen the differential components, and metabolites with a VIP value above 1.0 were considered potential markers. Additionally, a permutation test was performed to provide reference distributions of the R2/Q2 values, which indicate statistical significance. Simca 15.0 software (Umetrics, Malmö, Sweden) was used to display the analysis results.
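The idea that ions with the same RT and m/z in different samples are treated as one compound can be sketched as a simple greedy alignment. This is only an illustration and not the MarkerLynx algorithm: it assumes that the 0.05 Da mass window and the 0.20 min retention time window act as maximum absolute differences, and the function name, sample names and peak values are hypothetical.

```python
def align_features(samples, mz_window=0.05, rt_window=0.20,
                   rt_range=(5.0, 28.0), mz_range=(200.0, 1400.0),
                   min_intensity=2000.0):
    """Greedily merge (rt, mz, intensity) peaks from several samples into
    shared RT-m/z features, after applying the range and intensity filters."""
    features = []
    for sample, peaks in samples.items():
        for rt, mz, intensity in peaks:
            if not rt_range[0] <= rt <= rt_range[1]:
                continue
            if not mz_range[0] <= mz <= mz_range[1]:
                continue
            if intensity < min_intensity:
                continue
            for feat in features:
                if (abs(feat["mz"] - mz) <= mz_window
                        and abs(feat["rt"] - rt) <= rt_window):
                    # Same RT-m/z pair: record the intensity for this sample.
                    feat["intensity"][sample] = (
                        feat["intensity"].get(sample, 0.0) + intensity)
                    break
            else:
                features.append({"rt": rt, "mz": mz,
                                 "intensity": {sample: intensity}})
    # List the features in order of elution time, as in the marker table.
    features.sort(key=lambda f: (f["rt"], f["mz"]))
    return features


if __name__ == "__main__":
    # Hypothetical peak lists for two samples.
    samples = {
        "CG_1": [(10.00, 945.54, 5000.0), (15.00, 799.48, 3000.0),
                 (3.00, 300.10, 9000.0)],
        "MCG_1": [(10.05, 945.56, 6000.0), (15.02, 799.49, 1500.0)],
    }
    for feat in align_features(samples):
        print(feat)
```

The resulting table (one row per RT-m/z feature, one column per sample) is the kind of matrix that PCA and OPLS-DA then operate on.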

Identification of Components from MCG and CG Based on UNIFI Platform
As a result of our analysis, a total of 126 compounds, including triterpenoids (the main ingredients), flavonoids, organic acids and organic acid esters, alcohols and phenols, aldehydes and ketones, and amino acids, were characterized or tentatively identified from the MCG and CG in both ESI+ and ESI− modes. Of these, 85 compounds were identified in ESI+ mode and 41 compounds in ESI− mode. Base peak intensity (BPI) chromatograms are shown in Figure 1, the identification information is listed in Table 2, and the chemical structures are shown in Figure 2.
[Table footnote: ∆/∆∆, #/## and */** mark a content that was significantly higher in one group of a pairwise comparison (#/##: CG4–6 years vs. MCG20 years; */**: MCG12 years vs. MCG20 years); one symbol, p < 0.05; two symbols, p < 0.001 for ∆∆ and ##.]
As a result, one of them was identified as ginsenoside Rd on the basis of its identical retention time, and the other was tentatively identified as gypenoside XVII because it matched the characteristic MS fragmentation pattern of gypenoside XVII reported in the literature [31].

Biomarker Discovery for Distinguishing MCG and CG
The MSE data of the CG and MCG samples were statistically analyzed via PCA and OPLS-DA. As seen in the PCA 2D plots (Figure 3), there was no obvious difference among the 4-, 5- and 6-year-old CG samples, but the MCG20 years, MCG12 years and CG4–6 years groups were clearly separated, indicating that these three groups could be differentiated. To distinguish MCG from CG, and MCG20 years from MCG12 years, OPLS-DA plots, permutation tests, S-plots and VIP values were obtained to understand which variables were responsible for the separation (Figures 4-6). Variables with VIP > 1 and p < 0.05 (t-test) were considered potential biomarkers. The robust known biomarkers enabling differentiation between CG and MCG were marked in the S-plots. To evaluate the biomarkers systematically, hierarchical clustering heatmaps were generated from them (Figure 7); these visualize the different levels of the potential biomarkers in the ginseng groups, with higher contents represented by red squares and lower contents by green squares.
Between the CG4–6 years and MCG12 years groups, the contents of 24-hydroxyoleanolic acid, ginsenoside F3 and palmitoleic acid were significantly higher in the MCG12 years samples, whereas the contents of α-linolenic acid, 9-octadecenoic acid, linoleic acid and panaxydol were significantly higher in all the CG samples.
Between the CG4–6 years and MCG20 years groups, the contents of ginsenoside Re1, -Re2, -Rs1, malonylginsenoside Rb2, -Rf, the isomer of malonylginsenoside Rb1 and quinquenoside R1 were higher in the MCG20 years samples. In contrast, the contents of ginsenoside Ro, the isomer of ginsenoside Ro, 12,13,15-trihydroxy-9-octadecenoic acid, linoleic acid, 9-octadecenoic acid, α-linolenic acid and panaxydol were higher in the CG samples.

Between the MCG12 years and MCG20 years groups, the contents of palmitoleic acid and 24-hydroxyoleanolic acid were significantly higher in the MCG12 years samples, while the contents of ginsenoside Re1, -Rs1, malonylginsenoside Rb2, -Re2 and the isomer of malonylginsenoside Rb1 were higher in the MCG20 years samples.
Overall, on the one hand, the contents of α-linolenic acid, linoleic acid, 9-octadecenoic acid and panaxydol in the CG samples were significantly higher than those in all the MCG samples. On the other hand, the contents of ginsenoside Re1, -Re2, -Rs1, malonylginsenoside Rb2 and the isomer of malonylginsenoside Rb1 in the MCG20 years samples were higher than those in both the MCG12 years samples and all the CG samples, whereas for these compounds there was no significant difference between the MCG12 years and CG4–6 years samples. A summary of the variable identities, VIP values and p values is shown in Table 3.
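The screening rule used in this section (VIP > 1 and p < 0.05) amounts to a two-condition filter over the variable table. The sketch below assumes that VIP values and t-test p values have already been exported from the statistics software; the function `select_markers`, the feature names and all numbers are placeholders and are not values from Table 3.

```python
def select_markers(rows, vip_cutoff=1.0, p_cutoff=0.05):
    """Return the names of variables with VIP above the cutoff and a
    t-test p value below the cutoff."""
    return [row["name"] for row in rows
            if row["vip"] > vip_cutoff and row["p"] < p_cutoff]


if __name__ == "__main__":
    # Placeholder rows; not data from the study.
    rows = [
        {"name": "feature_A", "vip": 2.3, "p": 0.001},
        {"name": "feature_B", "vip": 0.8, "p": 0.001},  # fails the VIP cutoff
        {"name": "feature_C", "vip": 1.5, "p": 0.200},  # fails the p cutoff
    ]
    print(select_markers(rows))
```

Requiring both conditions keeps only variables that are influential in the OPLS-DA model and also differ between the two groups in a univariate test.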

Discussion
Although MCG and CG both belong to Panax ginseng, their chemical ingredients and pharmacological activities differ because of their markedly different growth environments [3,67]. MCG has been regarded as a replacement for wild ginseng. Recently, a UPLC-QTOF-MS/MS-based approach was developed to distinguish MCG (grown for 15 years) from CG (grown for 4-7 years) [6]. In that work, 40 ginsenosides in both MCG and CG were unambiguously identified or tentatively assigned, and the potential chemical markers identifying different ginseng products were characterised [6]. Additionally, a study of 6-18-year-old mountain-cultivated ginseng leaf (MGL) samples showed that the MGL were clearly divided into three main groups according to age bracket (6~10, 11~13 and 14~18 years) [7]. Although that study examined the leaves of MCG, it suggests indirectly that MCG roots of different cultivation ages also differ. To further systematically compare the chemical similarities and differences between ginseng of different ages, and especially between younger and older MCG, 4-, 5- and 6-year-old CG and 12- and 20-year-old MCG were chosen as the analytical samples in the present study.
Firstly, using the intelligent and automated workflows of the UNIFI platform, screening analysis of the metabolites in ginseng of different cultivation ages was performed rapidly. As a result, a total of 126 compounds were characterized from the CG4–6 years, MCG12 years and MCG20 years samples. Among them, ginsenosides were the main ingredients. CG and MCG had similar chemical compositions, but the components were present at different contents in the CG and MCG samples. In other words, the secondary metabolites in CG and MCG showed structural diversity and different content patterns. As far as we know, this is the first time that a comprehensive screening analysis of MCG12 years and MCG20 years samples has been performed using UPLC-QTOF-MSE combined with the UNIFI platform. It provides scientific data for clarifying the chemical composition of MCG.
Secondly, LC-MS-based metabolomic profiling combined with multivariate statistical analysis was used to profile the CG, MCG12 years and MCG20 years samples. A total of 17 potential age-dependent markers enabling differentiation among the CG and MCG samples were discovered. (1) Four robust markers, α-linolenic acid, 9-octadecenoic acid, linoleic acid and panaxydol, were the characteristic components of the CG samples that distinguished them from both the MCG12 years and MCG20 years samples. The results showed that the CG samples contained more non-ginsenosides. Linoleic acid and α-linolenic acid, the main products of the acetate-malonate pathway, are two essential fatty acids necessary for health. Linoleic acid is used in the biosynthesis of arachidonic acid and thus of some prostaglandins, leukotrienes and thromboxane [68,69]. Panaxydol, one of the C17 polyacetylenic compounds, originates from acetyl-CoA/malonyl-CoA via fatty acids with crepenynate as the intermediate [70]. It is considered a potential antitumor agent because of its significant anticancer activity [71]. (2) In the CG samples, three other characteristic components, ginsenoside Ro, the isomer of ginsenoside Ro and 12,13,15-trihydroxy-9-octadecenoic acid, could be used to differentiate them from the MCG20 years samples. From this, we could conclude that pentacyclic triterpenoids decreased significantly in the older MCG samples. (3) Five robust biomarkers, ginsenoside Re1, -Re2, -Rs1, malonylginsenoside Rb2 and the isomer of malonylginsenoside Rb1, were found to differentiate the MCG20 years samples from the CG and MCG12 years samples. These five compounds might be used for rapid identification of MCG20 years samples. A proposed biosynthetic pathway of ginsenosides is as follows: squalene is converted to 2,3-oxidosqualene by squalene epoxidase; dammaranes can then be synthesized by dammarenediol synthase, and oleananes by β-amyrin synthase [72]. Ginsenosides have been found to have both antimicrobial and antifungal properties, and the molecules are naturally bitter-tasting, discouraging insects and other animals from consuming the plant, so ginsenosides likely serve as a plant defense mechanism [73,74]. (4) In the MCG20 years samples, two further markers, ginsenoside Rf and quinquenoside R1, distinguished them from all the CG samples. (5) In the MCG12 years samples, 24-hydroxyoleanolic acid and palmitoleic acid were the two robust markers distinguishing them from both the CG and MCG20 years samples. These two compounds might be used for rapid identification of MCG12 years samples. Palmitoleic acid is biosynthesized from palmitic acid by the action of stearoyl-CoA desaturase-1, a key enzyme in fatty acid metabolism [75]. (6) Ginsenoside F3 was another marker for the MCG12 years samples that differentiated them from the CG samples.
However, some issues remain unresolved. For example, as shown in the BPI chromatograms, although 126 compounds were identified, there are still some unidentified components. Further research should be carried out based on the formulae of these unknown compounds.

Conclusions
By combining UPLC-Q/TOF-MSE and the UNIFI platform, 126 chemical components of various structural types, such as triterpenoids, flavonoids, organic acids and organic acid esters, were characterized or tentatively identified from the CG4–6 years, MCG12 years and MCG20 years samples for the first time. All the CG and MCG samples had similar chemical compositions, but there were significant differences in the contents of the components. Further nontargeted metabolomic analysis combined with multivariate statistical analysis showed that the CG4–6 years, MCG12 years and MCG20 years samples were clearly divided into three groups. A total of 17 potential age-dependent markers enabling differentiation among the CG and MCG samples were discovered. Among these markers, four robust markers, α-linolenic acid, 9-octadecenoic acid, linoleic acid and panaxydol, could serve as the characteristic components for differentiating CG from all the MCG samples. Five robust markers, ginsenoside Re1, -Re2, -Rs1, malonylginsenoside Rb2 and the isomer of malonylginsenoside Rb1, were found to differentiate the MCG20 years samples from all other samples, while 24-hydroxyoleanolic acid and palmitoleic acid were the robust markers distinguishing the MCG12 years samples from all the CG samples and the MCG20 years samples. The proposed approach could be applied to directly distinguish MCG root ages, which is an important criterion for evaluating the quality of MCG. The results will help address the lack of data on the chemical constituents of MCG and provide a reference for quantitative determination in the quality control criteria for MCG.