Patterns of Variation and Chemosystematic Significance of Phenolic Compounds in the Genus Cyclopia (Fabaceae, Podalyrieae)

As a contribution towards a better understanding of phenolic variation in the genus Cyclopia (honeybush tea), a collection of 82 samples from 15 of the 23 known species was analysed using ultra-performance liquid chromatography–high-resolution mass spectrometry (UPLC-HRMS) in electrospray ionization (ESI) negative mode. Mangiferin and isomangiferin were the main compounds detected in most samples, with the exception of C. bowieana and C. buxifolia, in which neither compound was detected. These xanthones were found to be absent from the seeds and also showed consistent differences between species and provenances. Results for contemporary samples agreed closely with those based on analysis of a collection of ca. 30-year-old samples. The use of multivariate tools allowed for graphical visualization of the patterns of variation as well as the levels of the main phenolic compounds. Exclusion of mangiferin and citric acid from the data was found to give better visual separation between species. The use of UPLC-HRMS generated a large dataset that allowed for comparisons between species, provenances and plant parts (leaves, pods, flowers and seeds). Phenetic analyses resulted in groupings of samples that were partly congruent with species but not with morphological groupings within the genus. Although different provenances of the same species were sometimes found to be very variable, Principal Component Analysis (PCA) indicated that a combination of compounds has some (albeit limited) potential as a diagnostic character at species level. Seventy-four phenolic compounds are presented, many of which were identified for the first time in Cyclopia species, with nine of these being responsible for the separation between samples in the PCAs.


Introduction
Cyclopia Vent. is a fynbos-endemic genus of legumes (family Fabaceae, tribe Podalyrieae) comprising 23 known species. Several species have a long history of traditional use as herbal teas [1], but it is only recently that commercial crop and product development has been initiated [2,3], focused mainly on C. genistoides (L.) R.Br., C. intermedia E.Mey. and C. subternata Vogel. These three species are [...]

[...] The molecular formula of the aglycone fragment at m/z 283 corresponds to the isoflavone olmelin (biochanin A) [22], produced by neutral loss of a hexose moiety, whilst the ion at m/z 268 results from the further loss of the methyl group (−15 Da). Thus, this peak was tentatively annotated as olmelin [...]

Levels of Main Compounds
Figure 2 consists of a heatmap of the main compounds detected in the samples, showing the higher-concentration compounds in lighter shades and the lower-concentration compounds in darker shades; note the many light blocks for citric acid and mangiferin. Since calibration standards are not available for the majority of the compounds detected, the peak areas for these compounds were converted to concentration values in mg/kg by interpolation from the mangiferin calibration curve and are provided in the Supplementary data, Table S1. Mangiferin levels in the plant extracts were found to be above the linear range of the mass spectrometer, and the concentrations in this table should therefore be seen as relative. The mangiferin levels of 30 of the samples were determined more accurately using UV detection at 280 nm. Concentrations of between 0.41 and 3.8 g/100 g were recorded in the samples where the compound was present (results not shown).
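The conversion of peak areas to mangiferin-equivalent concentrations described above can be sketched as follows. This is a minimal illustration that assumes a straight-line calibration fitted by least squares; the calibration points, the function names and the range check are hypothetical and are not taken from the study.

```python
# Hypothetical calibration points: (concentration in mg/kg, peak area).
# These numbers are illustrative only and are not taken from the study.
CALIBRATION = [(1.0, 120.0), (5.0, 610.0), (10.0, 1190.0), (50.0, 6050.0)]


def fit_line(points):
    """Ordinary least-squares fit of area = slope * conc + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def area_to_concentration(area, slope, intercept, max_calibrated_area):
    """Return (mangiferin-equivalent concentration, within_calibrated_range)."""
    conc = (area - intercept) / slope
    # Areas above the highest standard are extrapolated, so flag them.
    return conc, area <= max_calibrated_area


slope, intercept = fit_line(CALIBRATION)
max_area = max(area for _, area in CALIBRATION)
conc, in_range = area_to_concentration(3000.0, slope, intercept, max_area)
print(f"{conc:.1f} mg/kg mangiferin equivalents, within range: {in_range}")
```

Flagging areas above the highest standard mirrors the caution in the text: values outside the linear range are best read as relative rather than absolute.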
The phenolic metabolites of the Cyclopia species that have been commercialized (C. subternata, C. genistoides, C. intermedia) have been well studied and the results published extensively [16–20,24]. In addition, some mention is also made of C. sessiliflora and C. maculata, which are also commercially processed, albeit on a smaller scale [18,25]. Walters et al. [21] investigated the phenolic composition of the non-utilised species C. pubescens and detected the xanthones mangiferin and isomangiferin as some of the main compounds. The same authors also detected flavanones, a flavone and benzophenones. Methylated flavonoids, including the isoflavones formononetin and afrormosin, reported by [20] in C. subternata, were not reported by other investigators. The reported presence [14] of (iso)sakuranetin and hesperetin, which elute rather late in the chromatogram, was confirmed in this study (Table 1). It is possible that such compounds may not elute off the C18 column of a modern reverse-phase chromatographic system, since the work of earlier investigators was performed on normal-phase systems. This scenario was investigated by extracting one sample with solvents of different polarity (methanol, dichloromethane, dimethylsulfoxide, ethanol, water and combinations of these). The analysis was then repeated using the current method as well as on a much shorter column using a stronger gradient. The results showed a lower extraction efficiency for early-eluting polar molecules and a higher extraction efficiency for late-eluting non-polar molecules when using stronger solvents. For example, 20% more luteolin and 33% less mangiferin were extracted using methanol/dichloromethane compared to 50% methanol. No other methoxylated flavonoids were detected using this solvent system; only some hydroxylated long-chain fatty acids were detected (Figure 3).
* Standard was used to confirm retention time and spectra; base peaks in the MSE fragmentation data are in bold.
Molecules 2019, 24, 2352
Figure 4 contains the structures of selected compounds presented in Table 1. The PCA cluster map of all the samples is presented in Figure 5. Two Cyclopia species that do not produce mangiferin (C. buxifolia, BX and C. bowieana, BW) are seen as outliers on the right-hand side. The clustering was driven by mangiferin, and the rest of the species were not visually well separated in the cluster map. In addition, the samples from plant parts other than the leaves (twigs, stems and flowers) also influenced the separation. Figure 6 is the cluster map of only the leaf samples with the mangiferin data excluded. The grouping of the species in clusters improved somewhat, with e.g. C. genistoides now clustering on its own. In Figure 7, only the leaf extracts of the three commercial species C. intermedia (IN), C. genistoides (GE) and C. subternata (SU) were investigated, with citric acid and mangiferin excluded. This showed a separation of C. genistoides (green, clusters 2, 5 and 6) from C. subternata (orange, cluster 1) and C. intermedia (red, clusters 3 and 4), with some extracts forming additional clusters that appear to be based on geography/provenance/population.

Old Samples Versus Contemporary Samples
No significant differences between older and newer samples were detected, which confirms the stability of these phenolic compounds in plants if stored as dry material.

Differences Between Plant Parts (Twigs, Leaves, Pods, Flowers And Seeds)
This study has shown that the same compounds occur at varying concentrations in different plant parts, with the exception of the seeds, which contain certain unique compounds but lack others, especially the flavonoid glycosides (Figure 8). A comparison of the main classes of compounds between plant parts is presented in Figure 9. There are only quantitative differences between twigs, leaves and pods in Cyclopia aurescens Kies, but the seeds are markedly different, with a dominance of chalcones and flavanones. The major seed flavonoids in Cyclopia were reported by De Nysschen et al. [15] as butin, 3'-hydroxydaidzein, butein and vicenin-2, but these compounds have not been detected in more recent studies. In our study, butein/butin and derivatives were detected in seeds at much higher levels than in the leaves, pods or stems. We have recorded a significant peak for 3'-hydroxydaidzein (one of the main compounds detected in seeds by De Nysschen [14,15]) in one of the seed samples (AU5S, Cyclopia aurescens Kies). This peak corresponds to 3'-hydroxydaidzein (m/z 269.0451, C15H9O5; fragment ions: 269.0453 (base peak), 133.0294; retention time 20.9 min, eluting just before the butein peak in Figure 8). Vicenin-2 is also more prominent in the seed samples, but co-elutes with isomangiferin in the extracts from twigs, leaves, pods and flowers.
Figure 9. Composition of classes of compounds (as a sum of the concentrations in mg/kg) in various plant parts of Cyclopia aurescens (AU1-5, all from Klein Swartberg, refer to Table 2) (L = leaves, T = twigs, P = pods, S = seeds). Leaves, twigs and pods are chemically diverse and have a similar combination of compounds, whilst the seeds contain mainly chalcones and flavanones.
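The mass accuracy of the 3'-hydroxydaidzein assignment above can be checked with a short calculation. The sketch below computes the monoisotopic m/z of the C15H9O5 anion from standard isotope masses and compares it with the reported value of 269.0451; the helper names are illustrative and the parser handles only simple formulas without brackets.

```python
import re

# Monoisotopic masses (u) of the most abundant isotopes, plus the electron mass.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "O": 15.99491462}
ELECTRON = 0.00054858


def anion_mz(ion_formula):
    """Monoisotopic m/z of a singly charged anion with the given ion formula."""
    mass = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", ion_formula):
        mass += MONOISOTOPIC[element] * (int(count) if count else 1)
    return mass + ELECTRON  # one extra electron carries the negative charge


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


theoretical = anion_mz("C15H9O5")
error = ppm_error(269.0451, theoretical)
print(f"theoretical m/z {theoretical:.4f}, error {error:.1f} ppm")
```

Under these assumptions the reported m/z deviates from the calculated value by less than 2 ppm, which is consistent with the proposed ion formula.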

Diagnostic Value of Phenolic Compounds
The results suggest that phenolic compounds do have diagnostic value in distinguishing between some of the species, especially when combinations of compounds are used. Figure 10 shows the average composition of compounds for the species studied. Cyclopia buxifolia and C. bowieana are apparently unique in their inability to produce xanthones and benzophenones; this chemical difference presumably makes them unsuitable for tea production. The other species have similar combinations of compounds, but the relatively high levels of xanthones in C. genistoides should be noted. The seemingly random quantitative combinations of main compounds in leaf samples of all the species are shown in Figure 11, which compares the concentrations of the individual flavanones. There is no visually clear pattern in Figure 11, and the underlying processes (phenotypic or genetic) deserve more detailed study.
A somewhat clearer picture emerges when multiple samples from different provenances are analysed, as shown in Figure 12, which represents the flavanones of the commercial species C. genistoides, C. intermedia and C. subternata. Note that different plants collected from the same population often have very similar chemical profiles, while different populations tend to be somewhat different. From this result it is clear that a large part of the chemical variation in the three commercial species can be ascribed to provenance. Chemical differences at population level are often genetically determined, and it will be interesting to compare cultivated plants with plants from the original populations where the seeds were collected. A similar pattern emerges when the phenolic compounds from the loading plots that caused the separation of clusters in Figure 7 are considered (Figure 13). Note that the unique combinations of compounds that are uniform within a provenance are often discontinuous between all or most of the species. The chemical identities and the diagnostic value of the nine compounds shown in Figure 13 should be a priority for future studies. This would require isolating and purifying the compounds, followed by confirmation and structural elucidation using Nuclear Magnetic Resonance (NMR) spectroscopy.
When mangiferin and citric acid were removed from the data set, distinct clusters were obtained. Cluster analysis often grouped extracts from the same species together, but many groupings were not congruent with species delimitations, i.e. clustering was based on provenance rather than species (see the dendrogram, Figure S2 in the Supplementary data). The dendrogram also did not group together species that are presumed to be related on the basis of morphological characters. Cyclopia genistoides differs from C. subternata and the majority of provenances of C. intermedia in its higher concentrations of mangiferin. Cyclopia intermedia is a widely distributed species with some morphological differences between populations, and it seems that some outlier values may obscure what is otherwise a promising diagnostic difference. Stepanova et al. [24] found leaf anatomical characters to distinguish between C. genistoides, C. intermedia and C. subternata, but chemical analyses are clearly a more practical approach for quality control purposes. Particular provenances are usually selected for crop development, so that commercial tea samples are likely to be chemically more uniform than wild-harvested material collected from unknown populations. Developers often try to standardise the chemical composition of herbal products in order to minimize batch-to-batch variation. In this context, the numerous chemical compounds and their diversity in Cyclopia species described here are likely to provide a practical and reproducible approach to identify the source species of the material, to detect possible contaminants and to assess the quality of the product.
Average levels (mg/kg relative to mangiferin) of nine classes of phenolic compounds in leaf samples of 15 species of Cyclopia. The A at the end of the species codes means that it is an average value for all the leaf samples of that species analysed (see Table 2). C. aurescens (AU), C. bolusii (BO), C. bowieana (BW), C. burtonii (BU), C. buxifolia (BX), C. capensis (CA), C. falcata (FA), C. genistoides (GE), C. glabra (GL), C. intermedia (IN), C. maculata (MA), C. meyeriana (ME), C. plicata (PL), C. pubescens (PU) and C. subternata (SU).
Figure 12. Composition of the flavanones (mg/kg relative to mangiferin) in the leaf samples from the three main commercial sources of honeybush tea: Cyclopia genistoides (GE, nine samples), C. intermedia (IN, 16 samples) and C. subternata (SU, nine samples). For sample codes see Table 2. Numbering is according to the collection point and from West to East in each species.

Figure 13. Composition of phenolic compounds relative to the total from the loading plots that caused the separation of clusters (see Figure 7). For sample codes see Table 2.

Conclusions
The analyses of Cyclopia species using UPLC-HRMS with simultaneous collection of low collision energy MS data, ramped collision energy MS data and UV data resulted in large, complex datasets, which revealed considerable complexity in the phenolic compounds observed. MSE fragmentation data are presented for 74 phenolic compounds, including at least three benzophenones, two dihydrochalcones, three chalcones, three xanthones, 17 flavanones, three flavones, two isoflavones, three acetophenones and eight phenolic acids (cinnamic acid derivatives). Some unknown compounds have been tentatively identified, including piceol-hexose-pentoside isomers, piceol-hexose-rhamnoside, butein-hexosides and olmelin-O-hexoside.
The study also revealed that the methods of extraction and of UPLC-HRMS analysis influence the results, and that both polar and non-polar (methylated) compounds may be overlooked in routine analyses. Plant parts (twigs, leaves, flowers and pods) show only quantitative differences in the main constituents, but seeds often contain much lower concentrations of xanthones and higher concentrations of chalcones and other flavonoids. As suggested in the literature, phenolic compounds have limited chemosystematic value at species level, but a combination of chemical characters can be used to distinguish between some of the species. The study provides deeper insights into the chemical complexity of Cyclopia species and the potential role that UPLC-HRMS analyses can play, not only in quality control but also in helping to select superior chemotypes for crop and product development.

Materials and Methods
Methods and equipment were the same as used by Stander et al. [25], but the gradient was extended to 37 minutes to accommodate more non-polar compounds, including the isoflavones and methoxylated flavonoids described in previous papers [14,17].

Samples and Sampling
The samples came from a collection of what are now historical materials that formed part of a comprehensive revision of the genus Cyclopia by Schutte [7], who also identified the materials (Table 2). De Nysschen [14] used part of this collection for a study of the main phenolic compounds in the genus, and reported the presence of mangiferin as the main constituent for the first time. The material was carefully stored at low humidity in a dark storeroom. We have previously shown [25] that the main phenolic compounds of commercial rooibos tea are remarkably stable, producing almost identical phenolic profiles after more than 80 years of storage.

Extraction
Depending on available material, ca. 300 to 500 mg of dry plant material was soaked overnight in 50% methanol in water containing 1% formic acid (2 mL) in 15 mL polypropylene centrifuge tubes. The volume of solvent was adjusted according to the available sample amount, to 7.5 mL per gram of sample. The samples were extracted in an ultrasonic bath (0.5 Hz, Integral systems, RSA) for 60 min at room temperature, centrifuged for 5 minutes (Hermle Z160m, 3000 × g) and transferred to glass vials.

UPLC-HRMS Analysis
UPLC-HRMS analysis was performed using a Waters Synapt G2 quadrupole time-of-flight (QTOF) mass spectrometer (MS) connected to a Waters Acquity ultra-performance liquid chromatograph (UPLC) (Waters, Milford, MA, USA) with a photodiode array detector. A Waters HSS T3 column (2.1 × 150 mm, 1.7 µm) was used, with 0.1% formic acid in water in line A and 0.1% formic acid in acetonitrile in line B. A flow rate of 0.25 mL/min was used, and the gradient started with 100% solvent A for 1 minute, followed by a linear gradient to 28% B in 21 minutes and another linear gradient to 60% B in 8 minutes. The column was washed for 1 minute at 100% B and then re-equilibrated.
Data were acquired in MSE mode, whereby a low collision energy scan is followed by a high collision energy scan to obtain both molecular ion ([M − H]−) and fragment data at the same time. During the high collision energy scan, the collision energy was ramped from 20 to 60 V. Electrospray ionisation was used in the negative mode, with a scan range of m/z 120 to 1500. The desolvation temperature was set at 275 °C and nitrogen was used as desolvation gas at 650 L/h. The capillary voltage was 25 V. The instrument was calibrated with sodium formate, and leucine enkephalin was used as lock mass for accurate mass determinations.
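The elution gradient described in this section can be written as a small lookup that returns the mobile-phase composition at a given time, which helps when relating retention times to solvent strength. The sketch assumes that "in 21 minutes" and "in 8 minutes" are segment durations rather than end times (so the breakpoints fall at 1, 22 and 30 min); the wash and re-equilibration steps are left out, and the names are illustrative.

```python
# Gradient breakpoints as (time in min, % solvent B), assuming the stated
# "in 21 minutes" and "in 8 minutes" are segment durations, not end times.
GRADIENT = [(0.0, 0.0), (1.0, 0.0), (22.0, 28.0), (30.0, 60.0)]


def percent_b(t_min):
    """Linearly interpolate % B at time t_min within the analytical gradient."""
    if not GRADIENT[0][0] <= t_min <= GRADIENT[-1][0]:
        raise ValueError("time outside the analytical gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


# Approximate solvent composition at a retention time of 20.9 min.
print(f"{percent_b(20.9):.1f}% B")
```

Note that this is the programmed composition at the pump; the composition actually reaching the column lags by the system dwell volume, which is not given in the text.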

Data Processing and Clustering
The Markerlynx application manager of MassLynx™ version 4.1 software (Waters Corporation, Boston) was used to align the raw mass spectrometry data and convert it to retention time-mass pairs with a signal intensity for each peak. Selected mass peaks from the mass spectra were normalised to compensate for the variance in concentration and ensure equal representation in the dataset, thereby facilitating comparative analysis. Normalisation involved scaling each sample vector to unit length using the L2 (least squares) norm, independently of the other samples. Multivariate analysis was performed as described in [25].
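The per-sample L2 normalisation described above can be illustrated in a few lines. This is a minimal sketch in plain Python, not the software used in the study; the intensity values are hypothetical.

```python
import math


def l2_normalise(sample):
    """Scale one sample's intensity vector to unit Euclidean (L2) length."""
    norm = math.sqrt(sum(v * v for v in sample))
    if norm == 0.0:
        return list(sample)  # an all-zero sample is left unchanged
    return [v / norm for v in sample]


# Hypothetical peak intensities: two samples (rows), three peaks (columns).
# The second extract is ten times more concentrated than the first.
samples = [[3.0, 4.0, 0.0], [30.0, 40.0, 0.0]]
normalised = [l2_normalise(row) for row in samples]
print(normalised)
```

Because each row is scaled independently, two extracts with the same relative composition but different overall concentration end up with the same normalised profile.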
Principal component analysis (PCA) was performed on the dataset. The number of PCA components was selected so that the cumulative variance explained exceeded 95.45%, the data coverage of two standard deviations. In traditional methods, the PCA components are visualised in pairs while the loadings plot for all PCA components is displayed simultaneously. However, all the selected PCA components need to be considered collectively for meaningful discrimination of the dataset. To achieve this, unsupervised hierarchical clustering analysis was then performed on the selected PCA components. An implementation of the Mean Shift clustering algorithm was chosen as it holds no intrinsic hypothesis about the number of clusters, nor the shape thereof. This is in contrast to the classic K-means clustering approach, where the number of clusters is predetermined. Mean Shift is a non-parametric, centroid-based algorithm using a radial basis function (RBF) kernel, in which each point in the feature space serves as an initial centroid position. It iteratively updates each centroid to be the mean of all the points within a given region, thereby discovering dense regions in the feature space, until convergence is achieved. The centroids remaining after convergence are the cluster centres, and the data points associated with the same centroid are members of the same cluster.
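The two steps described above, choosing the number of components from the cumulative explained variance and then clustering the scores with Mean Shift, can be sketched as follows. This is a plain-Python illustration and not the implementation used in the study: the explained-variance ratios, the score coordinates, the bandwidth and the rule for merging converged centroids are all assumptions made for the example.

```python
import math


def n_components_for(explained_ratio, threshold=0.9545):
    """Smallest number of leading components whose cumulative ratio exceeds threshold."""
    total = 0.0
    for k, ratio in enumerate(explained_ratio, start=1):
        total += ratio
        if total > threshold:
            return k
    return len(explained_ratio)


def mean_shift(points, bandwidth, max_iter=200, tol=1e-6):
    """Gaussian (RBF) kernel mean shift; returns (cluster_centres, labels)."""
    modes = []
    for start in points:  # every data point is an initial centroid
        centre = list(start)
        for _ in range(max_iter):
            weights = [math.exp(-math.dist(centre, p) ** 2 / (2 * bandwidth ** 2))
                       for p in points]
            total = sum(weights)
            new = [sum(w * p[d] for w, p in zip(weights, points)) / total
                   for d in range(len(centre))]
            shift = math.dist(new, centre)
            centre = new
            if shift < tol:
                break
        modes.append(centre)
    # Merge centroids that converged to (nearly) the same dense region.
    centres, labels = [], []
    for mode in modes:
        for idx, existing in enumerate(centres):
            if math.dist(mode, existing) < bandwidth / 2:
                labels.append(idx)
                break
        else:
            centres.append(mode)
            labels.append(len(centres) - 1)
    return centres, labels


# Hypothetical explained-variance ratios and two-component PCA scores.
k = n_components_for([0.6, 0.3, 0.06, 0.04])
scores = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
centres, labels = mean_shift(scores, bandwidth=1.0)
print(k, labels)
```

The number of clusters is never passed in; it falls out of how many distinct dense regions the centroids converge to, which is the property the text contrasts with K-means. The bandwidth still has to be chosen, and it controls how finely the feature space is partitioned.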
Next, the loading factors for each PCA component were analysed to gain an understanding of which metabolites contributed most to the variation within the dataset. The loadings plots of the Markerlynx data, as well as a manual peak-picking process, were used to identify the main compounds in the samples. The Targetlynx application manager was then used to create a smaller subset of 74 compounds that was processed in the same way, yielding similar results. The Targetlynx dataset is reported, as it contains data with tentatively identified compounds.
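One simple way to read the loadings, as described above, is to rank the features by the absolute size of their loading on a given component. The sketch below assumes a loadings matrix with one row per component and one column per feature; the feature names and values are hypothetical.

```python
def top_loadings(loadings, feature_names, component, n=3):
    """Return the n features with the largest absolute loading on one component."""
    row = loadings[component]
    ranked = sorted(zip(feature_names, row),
                    key=lambda item: abs(item[1]), reverse=True)
    return ranked[:n]


# Hypothetical loadings: rows are components, columns are features.
features = ["peak_a", "peak_b", "peak_c", "peak_d"]
loadings = [[0.10, -0.85, 0.40, 0.05],
            [0.70, 0.10, -0.20, 0.65]]
top = top_loadings(loadings, features, component=0, n=2)
print(top)
```

The absolute value is used because a strongly negative loading contributes as much to the separation along a component as a strongly positive one.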