Chemical Reactivity Properties, pKa Values, AGEs Inhibitor Abilities and Bioactivity Scores of the Mirabamides A–H Peptides of Marine Origin Studied by Means of Conceptual DFT

The MN12SX density functional, in connection with the Def2TZVP basis set, was assessed, together with the SMD solvation model (Solvation Model based on the Density), for calculation of the molecular properties and structure of a group of peptides of marine origin named Mirabamides A–H. All the chemical reactivity descriptors for the systems were calculated via Conceptual Density Functional Theory (CDFT). The active sites suitable for nucleophilic, electrophilic, and radical attacks were chosen by linking them with the Fukui function indices, nucleophilic and electrophilic Parr functions, and condensed Dual Descriptor Δf(r), respectively. Additionally, the pKa values for the different peptides are predicted with great accuracy as well as the ability of the studied molecule in acting as an efficient inhibitor of the formation of Advanced Glycation Endproducts (AGEs), which constitutes a useful knowledge for the development of drugs for fighting Diabetes, Alzheimer and Parkinson diseases. Finally, the bioactivity scores for the Mirabamides A–H are predicted through different methodologies.


Introduction
The sea is an inexhaustible source of natural resources that give rise to molecules that can serve as a guide for the development of new medicines. For this reason, numerous investigations have been carried out in recent years dedicated to the search for new natural products that can be obtained from the knowledge of marine species [1].
Among the chemical species that can be obtained from natural products of marine origin, the peptides that are molecules of intermediate size between amino acids and proteins stand out. The therapeutic application of these peptides, called for this reason therapeutic peptides, is currently one of the most active fields of research due to the great possibilities they represent as aids for the treatment of numerous diseases [2].
For the consideration of therapeutic peptides from the point of view of medicine, it is necessary to know their molecular properties and their bioactivity. It is our belief that the bioactivity of these peptides is intimately related to their chemical reactivity from a molecular perspective [3,4]. For this reason, we consider it essential to study the chemical reactivity of natural products that have the

Settings and Computational Methods
This study obtained the molecular structure of the Mirabamides A-H peptides from PubChem (https://pubchem.ncbi.nlm.nih.gov), a website that serves as the public repository for information pertaining to chemical substances, along with their associated biological activities. The pre-optimization of the resultant system involved selecting the most stable conformers. The selection was done using random sampling that involved molecular mechanics techniques and the inclusion of the various torsional angles via the general MMFF94 force field [39][40][41][42][43] involving the Marvin View 17.15 program (ChemAxon, Budapest, Hungary), which constitutes an advanced chemical viewer suited to multiple and single chemical queries, structures, and reactions (https://www.chemaxon.com). After that, the chemistry of the structures was checked and the 3D structures of the stereoisomers were generated using the same MarvinView 17.15 program. The chirality at the stereogenic centers was verified in accordance to the Cahn-Ingold-Prelog priority rules. The resulting geometries were further refined, as was previously explained, and the lowest energy conformation for each molecule was chosen to calculate the electronic energy and the HOMO and LUMO orbitals at the DFT functional level, as mentioned in the next paragraph.
Consistent with our previous work [16][17][18][19][20][21][22], the computational studies were performed with the Gaussian 09 [44] series of programs (Gaussian Inc., Wallingford, CT, USA) that implement density functional methods. The basis set Def2SVP was used in this work for geometry optimization and frequency determination, while the Def2TZVP basis set was used for calculating electronic properties [45,46]. All calculations were performed in the presence of water as the solvent under the Solvation Model Density (SMD) parameterization of the Integral Equation Formalism-Polarized Continuum Model (IEF-PCM) [47].
To calculate the molecular structure and properties of the studied systems, we have chosen the MN12SX density functional which is known to consistently provide satisfactory results for several structural and thermodynamic properties [48].
The SMILES (Simplified Molecular Input Line Entry Specification) notations of the studied compounds were fed in the online Molinspiration software from Molinspiration Cheminformatics (Slovensky Grob, Slovak Republic) (www.molinspiration.com) for the calculation of the molecular properties (Log P, Total polar surface area, number of hydrogen bond donors and acceptors, molecular weight, number of atoms, number of rotatable bonds, etc.) and for the prediction of the bioactivity score for different drug targets (GPCR ligands, Kinase inhibitors, Ion channel modulators, Enzymes and Nuclear receptors). The bioactivity scores were compared with those obtained through the use of other software, such as MolSoft from Molsoft L.L.C. (San Diego, CA, USA) (http://molsoft.com/mprop/) and ChemDoodle Version 9.02 from iChemLabs L.L.C. (Richmond, VA,USA) (www.chemdoodle.com).

Results and Discussion
The molecular structures of the optimized conformers of the Mirabamides A-H obtained as mentioned in the Settings and Computational Methods section, and whose graphical sketches are shown in Figure 1, were reoptimized in the gas phase by considering the DFTBA (Density Functional Tight-Binding Approximation) model available in Gaussian 09 and then optimized again using the MN12SX density functional mentioned in the previous section together with the Def2SVP basis set and the SMD solvent model, using water as the solvent. After verifying that each of the structures corresponded to the minimum energy conformations through a frequency calculation analysis, the electronic properties were determined by using the same model chemistry, but with the Def2TZVP basis set instead of that used for the geometry optimization.
These peptides present a unique structural arrangement that comprises a macrocyclic region closed through an ester bond between the C-terminus and a β-hydroxyl group, and terminated with a polyketide moiety or a more simple branched aliphatic acid [49]. As mentioned in the previous referenced study, two unknown building blocks were described after their isolation: The glycosylated amino acid β-methoxytyrosine 4'-O-α-L-rhamnopyranoside and the 4-chloro-L-homoproline. Thus, these are highly modified glycosylated depsipeptides whose amino acid sequences are hard to explain. However, the sequences for Mirabamides A, C and D starting from the C terminal can be described as: ClHPr-βOMeTyr-NMeThr-Ala-Gly-3-OMeAla-3OHLeu-3,4-DiMeGln-Dab-Thr-Gly-Dhtda. The differences between these are due to the presence or not of the Cl at the terminal proline (substituted by an H) or the rhamnose substitued by an H in Mirabamide C. In the case of Mirabamide B, the sequence is the same as that for Mirabamide A, with the difference of the presence of the Aba instead of the Dab (2,3-diaminobutanoic) residue. For the Mirabamides E-H, the sequence is very similar to that of Mirabamide A, with the only difference being the replacement of threonine with its dehydration product 2-amino-2-butenoic acid (Aba), the replacement of an OH by an H in the polyketide terminal chain and a different conformation for the rhamnose moiety. The analysis of the results obtained in the study aimed at verifying that the KID procedure was fulfilled. On performing it previously, several descriptors associated with the results that the HOMO and LUMO calculations obtained are related with results obtained using the vertical I and A following the ∆SCF procedure. A link exists between the three main descriptors and the simplest conformity to the Koopmans' theorem by linking H with −I, L with −A, and their behavior in describing the HOMO-LUMO gap as Notably, the J A descriptor consists of an approximation that remains valid only when the HOMO that a radical anion has (the SOMO) shares some similarity with the LUMO of the neutral system. Consequently, we decided to design another descriptor, ∆SL, to guide in verifying the accuracy of the approximation [16][17][18][19][20][21][22]. The results of this analysis are presented in Table 1. As can be seen from Table 1, the results for the descriptors show values that are consistent with our previous findings for the case of the melanoidins [16][17][18][19][20][21][22], that is, the MN12SX density functional is capable of giving HOMO and LUMO energies that allow to verify the agreement with the approximate Koopmans' theorem. This is not only true because the J HL values are almost zero, but due to the fact that the ∆SL descriptor, which relates to the difference between the LUMO of the neutral and the HOMO of the anion, is also close to zero. Indeed, these values cannot be exactly equal to zero, but the small differences mean that errors in the prediction of the global reactivity descriptors will be negligible. Moreover, it can be seen from Table 1 that the MN12SX density functional predicts negative values for the LUMO energies, which will represent positive values of the electron affinity A.
The KID procedure has its foundations on the behavior of the four descriptors J I , J A , J HL and ∆SL: The closer they are to zero, the better agreement of a density functional in giving accurate Conceptual DFT descriptors calculated only from the HOMO and LUMO. This allows to avoid the calculation of the energies of the cation and anion species which, being open systems, are more difficult to converge than the parent neutral molecule, which is inconvenient when studying large systems like those considered in this study.
By taking into account the KID procedure presented in our previous works together with the finite difference approximation, the global reactivity descriptors can be expressed as [5,6,[50][51][52]: Global Electrophilicity Net Electrophilicity where H and L are the energies of the HOMO and LUMO, respectively. According to our previous discussion, the results for the global reactivity descriptors based on the values of the HOMO and LUMO energies calculated with the MN12SX density functional are presented in Table 2. As expected from the molecular structure of this species, its electrodonating ability is more important that its electroaccepting character. However, when comparing the values of the global descriptors for each of the peptides, it can be seen that there are not significant differences between them.
The Electrophilic Fukui functions f − (r) and Nucleophilic Fukui functions f + (r) for the Mirabamides A-H molecules are shown in Figures 2 and 3, respectively.  As it has been stated by Martínez-Araya in a recent work [58], while the Fukui function is a nice descriptor to understand the local reactivity of the molecules, it can be demonstrated that the Dual Descriptor in its condensed form ∆ f k will perform better for the prediction of the preferred sites for the electrophilic and nucleophilic attacks. For this reason, we have decided to present the results for the Condensed Dual Descriptor ∆ f k as calculated from either Mulliken Population Analysis (M) or Natural Population Analysis (N) in comparison with the Nucleophilic Parr Function P + k and Electrophilic Parr Function P − k proposed by Domingo et al. [59,60], considering atomic spin densities coming from the mentioned Mulliken Population Analysis (M) or from Hirshfeld Population Analysis (H).
The results for the calculation of these local reactivity descriptors for the Mirabamides A to H are presented in Tables 3-10, respectively. It must be noted that we are presenting only the results for those atomic sites where the ∆ f k (which is itself multiplied by 100) are greater than one. Also, the H atoms are not shown.           As can be seen from the results in Table 3, C104 will be the site for the nucleophilic attack while C105 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide A. By considering the results in Table 4, C89 and C91 will be the sites for the nucleophilic attack while C106 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide B. From Table 5, C54 will be the site for the nucleophilic attack while C97 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide C. For the case of the Mirabamide D in Table 6, C56 will be the site for the nucleophilic attack while C106 will be the preferred site for electrophilic attack. From the results in Table 7, C47 will be the site for the nucleophilic attack while C106 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide E. In the case of Mirabamide F, we can see from Table 8 that C43 will be the site for the nucleophilic attack while C105 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide F. Considering the results in Table 9, C84 and C86 will be the sites for the nucleophilic attack while C96 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide G. Lastly, from the results in Table 10, C83 will be the site for the nucleophilic attack while C95 will be the preferred site for electrophilic attack during a chemical reaction involving Mirabamide H.
According to the results from Tables 3-10, the local reactivity descriptors calculated from the different formulations are able to recognize the nucleophilic and electrophilic sites for chemical reactivity with great accuracy. Moreover, there is an impressive agreement between the results coming from the Condensed Dual Descriptor ∆ f k and the Nucleophilic and Electrophilic Parr Functions P + k and P − k , which means that their use in this and future works related to the study of therapeutic peptides will have an increased chance of success.

Calculation of the pK a of the Peptides
We have recently presented a study of the computational prediction of the pK a s of small peptides through Conceptual DFT descriptors [12]. In that work, we concluded that the relationship pK a = 16.3088 − 0.8268η could be a valuable starting point for the prediction of the pK a of larger peptides of interest for the development of AGE inhibitors.
At biological pH, the Mirabamides A-H are neutral molecules, and we considered them in that state for our pK a calculations. Following the methodology or our previous work, we have considered the optimized molecular structure of each of the conformers and we have applied the mentioned relationship to the calculation of the pK a s of the Mirabamide A-H molecules, with the η values presented in Table 2 being the results as follows in Table 11: From these results it can be seen that our methodology is able to distinguish between the pK a value of every peptide in spite of the small differences between them. These values can be seen in the context of our previous study [12], and we believe that they could be of interest when designing pharmaceutical drugs starting from these peptides and enabling an explanation of the mechanisms of action and the drug delivery procedures.

Quantification of the AGEs Inhibitor Ability
The Maillard reaction between a reducing carbonyl and the amino group of a peptide or protein leads to the formation of a Schiff base which, through a series of steps, renders different molecules known as Advanced Glycation Endproducts or AGEs. It is believed that the presence of these AGEs is one of the main reasons for the developing of some diseases, such as Diabetes, Alzheimer and Parkinson [62].
Among several strategies that have been considered for the prevention of the formation of AGEs, it is worth mentioning the use of compounds presenting amino groups in their structure capable of interacting with the reducing carbonyl of carbohydrates and being competitive with the amino acids, peptides and proteins present in our body. Many compounds have been devised as drugs to achieve this goal and, to name a few, we can include Pyridoxamine, Aminoguanidine, Carnosine, Metformin, Pioglitazone and Tenilsetam [63,64].
It can be proposed that peptides having amino and amido groups could be thought as potential therapeutic drugs for preventing the formation of AGEs, because they could react in the Maillard reaction with reducing carbohydrates before the peptides and proteins of our body. Although this a merely speculative proposal, we believe that it is worth exploring this possibility by following a methodology we presented earlier. In a previous work, we have studied the ability of a group of proposed molecules to act as inhibitors of the formation of AGEs by quantifying their behavior in terms of Conceptual DFT reactivity descriptors [13]. It was concluded that the key factor in the study of the chemical reactivity of the potential AGEs' inhibitors was on their nucleophilic character, and although there are several definitions of nucleophilicity [65], our results suggested that the inverse of the net electrophilicity ∆ω ± could be a good definition for the nucleophilicity N. On the basis of the mentioned analysis, we were able to find some qualitative trends for the studied molecular systems.
In this work, we will extend this correlation to the Mirabamide A-H peptides in order to see if they can be considered as precursors of therapeutic drugs for the inhibition of the formation of AGEs, besides their known activity as HIV-Inhibitory peptides. As the model chemistry employed in both works is the same, the comparison is straightforward: This qualitative trend is representative of the known pharmacological properties of the studied AGEs inhibitors [63,64] and it can be seen that the Mirabamide A-H possess AGEs Inhibitor Abilities similar to that of Pyridoxamine, with some of them (Mirabamide G, Mirabamide C and Mirabamide D) performing better, while the others present lower values of the AGEs Inhibitor Ability.

Bioactivity Scores
When considering a given molecular system as a potential therapeutic drug, it is customary to check if the considered species follows the Lipinski Rule of Five, which is used to predict whether a compound has or not has a drug-like character [66]. The molecular properties related to the drug-like character were calculated with the aid of the MolSoft and Molinspiration software and are presented in Table 12, where miLogP represents the octanol/water partition coefficient, TPSA is the molecular polar surface area, natoms is the number of atom of the molecule, nON and nOHNH are the number of hydrogen bond acceptors and hydrogen bond donors respectively, nviol is the number of violations of the Lipinski Rule of Five, nrotb is the number of rotatable bonds, volume is the molecular volume, and MW is the molecular weight of the studied system. However, what the Lipinski Rule of Five really measures is the oral bioavailability of a potential drug, because this is desired property for a molecule having drug-like character. Indeed, this criteria cannot be applied to peptides, even when they are small, as we can see from Table 12, due to the inherent molecular weight and number of hydrogen bonds.
In a more recent work, Martin [67] has developed what she called "A Bioavailability Score" (ABS) for avoiding these problems. The rule for the ABS established that the Bioavailability Score for neutral organic molecules must be 0.55 if they pass the Lipinski Rule of Five and 0.170 if they fail.
The ABS value for all the Mirabamides A-H considered in this work have been calculated by using the ChemDoodle software, and the results were equal to 0.170 for all of them. A different approach was then followed by considering similarity searches in the chemical space of compounds with structures that can be compared to those that are being studied and with known pharmacological properties.
As has been mentioned in the Settings and Computational Methods section, this task can be accomplished using the online Molinspiration software for the prediction of the bioactivity score for different drug targets (GPCR (G protein-coupled receptor) ligands, kinase inhibitors, ion channel modulators, enzymes and nuclear receptors). The results are named Bioactivity Scores and the values for the Mirabamides A-H are presented in Table 13. These bioactivity scores for organic molecules can be interpreted as active (when the bioactivity score is >0), moderately active (when the bioactivity score lies between −5.0 and 0.0) and inactive (when the bioactivity score < −5.0). All the Mirabamides A-H were found to be moderately bioactive towards all the enzymes considered for the study.
As the IC 50 values for the HIV-Inhibitory action of the peptides are available from References [10,11], we considered that it was worth verifying if a linear relationship exists between the IC 50 values and the global reactivity descriptors presented in Table 2. However, it was not possible to find such a relationship. This can be explained by considering that half of the results came from different experimental studies.

Conclusions
In this paper we have presented the results of a study of the chemical reactivity of a group of peptides with therapeutic potential, Mirabamides A-H, based on the Conceptual DFT as a tool to explain the molecular interactions.
It must be remarked that the tools used did not generate a huge error despite the relative low computational they have, which is itself a good advantage.
The knowledge of the values of the global and local descriptors of the molecular reactivity of the Mirabamides A-H peptides studied could be useful in the development of new drugs based on these compounds.
In a similar manner, the pK a values for each of the therapeutic peptides have been predicted by resorting to values of the chemical hardness following a previously proposed methodology, and the information that resulted would be helpful in understanding not only the chemical reactivity, but other important properties, such as the water solubility.
A point of special interest has been the quantification of the ability of each peptide to act as an inhibitor in the formation of AGEs, and this could be of importance for the design of medicines for fighting diseases like Diabetes, Alzheimer's or Parkinson's.
Finally, the molecular properties related to bioavailability have been predicted using different methodologies already described in the literature, and the descriptors used for the quantification of the bioactivity allowed the characterization of the studied peptides as moderately bioactive towards several ligands and enzymes.