Insecticidal Activity Evaluation of Phenylazo and Dihydropyrrole-Fused Neonicotinoids Against Cowpea Aphids Using the MLR Approach

Abstract: This paper presents a Quantitative Structure-Activity Relationship (QSAR) study of a series of 24 dihydropyrrole-fused and phenylazo neonicotinoid derivatives, with insecticidal activity tested against cowpea aphids (Aphis craccivora). The conformational search ability of the OMEGA software was employed to model neonicotinoid conformer ensembles, using molecular mechanics calculations based on the 94s variant of the Merck Molecular Force Field (MMFF94s). The minimum energy conformers were used to calculate structural descriptors, which were then related to the insecticidal activity (pLC50 values) using the multiple linear regression (MLR) approach. The genetic algorithm was used for variable selection, and several criteria were applied for internal and external model validation. A robust model (r² = 0.880, r²adj = 0.855, q²LOO = 0.827, s = 0.2098, F = 34.295) with predictive power (concordance correlation coefficient CCCext = 0.945, r²m = 0.824) was obtained using the QSARINS software. The developed model can be confidently used for the prediction of the insecticidal activity of new chemicals, saving a substantial amount of time and money.


Introduction
Neonicotinoids are considered to be one of the most important and relevant classes of insecticides in current use [1,2]. They are synthetic insecticides that act on the insect nicotinic acetylcholine receptor (nAChR) and have been increasingly used to control various insects during recent decades, especially since imidacloprid was introduced to the market [3]. However, the success of neonicotinoids is being challenged by the rapid development of resistance [2] and by severe bee toxicity [4][5][6]. Neonicotinoid insecticides are considered the most effective chemical class for the control of sucking insect pests (aphids, whiteflies, leaf- and planthoppers, thrips), micro lepidoptera, and a number of coleopteran pest species [7]. Their advantage over other insecticides is plant systemicity: after application to the soil or the seed, these compounds are absorbed through the plant roots and distributed within the plant, and therefore give consistent and long-lasting control of sucking insects.
The coplanar segment between guanidine or amidine and pharmacophore in the neonicotinoids could create an electronic conjugation to facilitate the partial negative charge flow toward the tip atom and increase the binding affinity to the insect target [8]. Photostabilized compounds selective for insects relative to mammals have photolabile nithiazine with a nitromethylene moiety and no cationic substituent [9].
Quantitative Structure-Activity Relationship (QSAR) is the most commonly used method to understand how chemical structure features correlate with the toxicity of natural and/or synthetic chemicals such as insecticides. This method offers the possibility of searching for new insecticides with enhanced activity against insects and pests. The urgent need for new insecticides is driven by the emergence of insecticide-resistant pests. In this regard, several computational approaches have been applied to study the insecticidal activity of neonicotinoids [10][11][12][13][14][15].
In this study, a QSAR model for 24 dihydropyrrole-fused and phenylazo neonicotinoid derivatives is derived from a data set of chemical structures and insecticidal activities tested against cowpea aphids (Aphis craccivora), using the multiple linear regression (MLR) approach.
Molecular mechanics calculations, using the 94s variant of the Merck Molecular Force Field (MMFF94s), were used to model the neonicotinoid structures. Statistical analysis using several criteria was employed to find a robust and predictive MLR model. The best derived MLR model could be confidently used to predict the insecticidal activity of newly designed insecticides.

Dataset and Theoretical Molecular Descriptors Calculation
A dataset of 24 phenylazo and dihydropyrrole-fused neonicotinoid derivatives (Table 1) with known insecticidal activity (LC50, in mmol/L) against cowpea aphids (Aphis craccivora) [16,17] was analyzed. pLC50 values were used as the dependent variable. The neonicotinoid structures were pre-optimized using the MMFF94s molecular mechanics force field included in the OMEGA software (OMEGA v.2.5.1.4, OpenEye Scientific Software, Santa Fe, NM) [18,19]. Conformer ensembles were generated with a maximum of 400 conformers per compound and a root-mean-square deviation (RMSD) threshold of 0.5 Å.
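The conversion from LC50 to the dependent variable can be sketched as follows. This is a minimal illustration, assuming the usual definition pLC50 = -log10(LC50) with LC50 first converted from mmol/L to mol/L; the text does not state which concentration unit was used inside the logarithm, so the conversion factor is an assumption, and the function name `plc50` and the example values are hypothetical (they are not taken from Table 1).

```python
import math


def plc50(lc50_mmol_per_l: float) -> float:
    """Return pLC50 = -log10(LC50), with LC50 converted from mmol/L to mol/L.

    The mmol/L -> mol/L conversion is an assumption; drop the factor 1e-3
    if the logarithm is meant to be taken directly on the mmol/L value.
    """
    if lc50_mmol_per_l <= 0:
        raise ValueError("LC50 must be positive")
    lc50_mol_per_l = lc50_mmol_per_l * 1e-3
    return -math.log10(lc50_mol_per_l)


# Hypothetical LC50 values in mmol/L, for illustration only.
for lc50 in (0.01, 0.1, 1.0):
    print(f"LC50 = {lc50} mmol/L -> pLC50 = {plc50(lc50):.2f}")
```

A lower LC50 (a more potent compound) gives a higher pLC50, which is why models of this kind are interpreted in terms of descriptors that raise pLC50.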

The Multiple Linear Regression Method
The MLR approach [20] was employed to relate the pLC50 values to the calculated structural descriptors, using the QSARINS v. 2.2 program [21,22]. Variable selection was performed with the genetic algorithm, using the leave-one-out cross-validation correlation coefficient as the constrained function to be optimized, a mutation rate of 20%, a population size of 10, and 500 iterations.
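The quantity optimized during variable selection, the leave-one-out cross-validated coefficient, can be illustrated with a small self-contained sketch. It is not the QSARINS implementation and does not include the genetic algorithm itself; it only shows how an ordinary least-squares MLR fit and its q²LOO could be computed for one candidate descriptor subset. All function names and the toy data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mlr(X, y):
    """Return [intercept, b1, ..., bp] from the normal equations (A^T A) b = A^T y."""
    A = [[1.0] + list(row) for row in X]
    p = len(A[0])
    ata = [[sum(r[i] * r[j] for r in A) for j in range(p)] for i in range(p)]
    aty = [sum(r[i] * yi for r, yi in zip(A, y)) for i in range(p)]
    return solve(ata, aty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def r2_fit(X, y):
    """Coefficient of determination of the model fitted on all rows."""
    coef = fit_mlr(X, y)
    mean_y = sum(y) / len(y)
    rss = sum((yi - predict(coef, r)) ** 2 for r, yi in zip(X, y))
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - rss / tss


def q2_loo(X, y):
    """Leave-one-out q2 = 1 - PRESS / TSS, refitting the model n times."""
    press = 0.0
    for i in range(len(y)):
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    mean_y = sum(y) / len(y)
    tss = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / tss


# Toy data: two hypothetical descriptors and a noisy linear response.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 6.0], [6.0, 5.0]]
noise = [0.1, -0.1, 0.05, -0.05, 0.1, -0.1]
y = [1.0 + 2.0 * a - b + e for (a, b), e in zip(X, noise)]

print("r2 =", round(r2_fit(X, y), 4), " q2_LOO =", round(q2_loo(X, y), 4))
```

In a genetic-algorithm search, a function like `q2_loo` would serve as the fitness of each candidate descriptor subset; for a least-squares model q²LOO can never exceed the fitted r², so a large gap between the two is a warning sign.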

Model Validation
The dataset was divided randomly into a training set and a test set, the latter containing 25% of the total number of compounds. The following compounds were included in the test set: 3, 11, 13, 14, 17, and 23 (Table 1).
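A random 75/25 split of this kind can be sketched as below. The seed and the function name `split_dataset` are arbitrary choices made for illustration; the sketch will not reproduce the specific test set listed above, which depends on the random draw made in the original work.

```python
import random


def split_dataset(ids, test_fraction=0.25, seed=0):
    """Randomly split compound identifiers into training and test lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    test = sorted(shuffled[:n_test])
    train = sorted(shuffled[n_test:])
    return train, test


# 24 compounds numbered 1..24, as in the dataset described above.
train_ids, test_ids = split_dataset(range(1, 25))
print(len(train_ids), len(test_ids))  # 18 6
```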
Several criteria were used to test the predictive power of the model: the concordance correlation coefficient (CCC) [26], which should exceed a threshold value of 0.85 [27], and the predictive parameter r²m, with a lowest threshold value of 0.5 [28].
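As an illustration of the first criterion, the sketch below computes the concordance correlation coefficient in its standard form, CCC = 2·Sxy / (Sxx + Syy + n·(mean_x − mean_y)²), where the sums are taken over deviations from the means. The function name and the example values are hypothetical; the r²m parameter is not sketched here.

```python
def ccc(observed, predicted):
    """Concordance correlation coefficient between observed and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized lists with at least 2 values")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    return 2.0 * s_op / (s_oo + s_pp + n * (mean_o - mean_p) ** 2)


# Hypothetical pLC50 values: a perfect prediction and a systematically shifted one.
obs = [1.0, 2.0, 3.0, 4.0]
print(ccc(obs, obs))                         # perfect agreement
print(ccc(obs, [v + 1.0 for v in obs]))      # same trend, constant offset
```

Unlike a plain correlation coefficient, CCC penalizes systematic offsets: the shifted predictions above correlate perfectly with the observations but agree with them only partially.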
Model overfitting was checked using the Y-randomization test [29] and by comparing the root-mean-square error (RMSE) and the mean absolute error (MAE) of the training and validation sets [30].
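Both overfitting checks can be sketched in a few lines. To keep the example short, the Y-randomization part uses a single hypothetical descriptor, for which the r² of a least-squares fit equals the squared Pearson correlation; the original work applied the test to the multi-descriptor MLR models. The number of trials, the seed, and the toy data are assumptions made for illustration.

```python
import math
import random


def rmse(observed, predicted):
    """Root-mean-square error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def r2_single(x, y):
    """r2 of a one-descriptor least-squares fit (squared Pearson correlation)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_trials=200, seed=0):
    """Return (r2 of the real model, mean r2 over models with shuffled responses)."""
    rng = random.Random(seed)
    scrambled = []
    for _ in range(n_trials):
        y_shuffled = list(y)
        rng.shuffle(y_shuffled)  # breaks the descriptor-response relationship
        scrambled.append(r2_single(x, y_shuffled))
    return r2_single(x, y), sum(scrambled) / len(scrambled)


# Toy data: one hypothetical descriptor and a nearly linear response.
x = [float(i) for i in range(1, 13)]
y = [2.0 * v + 1.0 + (0.3 if i % 2 else -0.3) for i, v in enumerate(x)]

real_r2, mean_scrambled_r2 = y_randomization(x, y)
print("real r2:", round(real_r2, 3), " mean scrambled r2:", round(mean_scrambled_r2, 3))
```

A model that is not a chance correlation should give a real r² well above the scrambled values, and its RMSE and MAE on the training and validation sets should be of similar size.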
The Multi-Criteria Decision Making (MCDM) validation criterion [32] was used to summarize the performance of the MLR models. A desirability function was associated with every validation criterion, and the MCDM score took values between 0 (the worst) and 1 (the best).
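The idea behind such a score can be sketched as follows. The sketch assumes linear desirability functions and aggregates them by a geometric mean; both choices, the bounds used for each criterion, and the function names are assumptions made for illustration and may differ from what QSARINS does, so the printed value is not the 'MCDM all' score of the original work.

```python
import math


def desirability(value, worst, best):
    """Map a criterion value linearly onto [0, 1]; 0 at `worst`, 1 at `best`.

    Works for criteria to be maximized (best > worst) and for error-type
    criteria to be minimized (best < worst).
    """
    d = (value - worst) / (best - worst)
    return min(1.0, max(0.0, d))


def mcdm_score(desirabilities):
    """Aggregate desirabilities by their geometric mean (assumed aggregation)."""
    if any(d == 0.0 for d in desirabilities):
        return 0.0  # a single unacceptable criterion zeroes the whole score
    return math.prod(desirabilities) ** (1.0 / len(desirabilities))


# Reported statistics of the final model, each mapped onto assumed bounds 0..1.
criteria = {"r2": 0.880, "q2_LOO": 0.827, "CCC_ext": 0.945}
d_values = [desirability(v, worst=0.0, best=1.0) for v in criteria.values()]
print(round(mcdm_score(d_values), 3))
```

The geometric mean is a natural choice here because it rewards models that perform well on all criteria at once rather than excelling on one and failing on another.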

Results and Discussion
The autoscaling method was employed for normalizing the data. The variables contained in the MLR models were selected using the genetic algorithm. The statistical (fitting and predictivity) results are included in Tables 2-4. The 'MCDM all' scores, based on the fitting, cross-validated, and external criteria, were considered for choosing the best MLR models.
To assess the reliability of the best model, MLR1, the experimental versus predicted pLC50 values and the Y-scramble plots are presented in Figures 1 and 2, respectively. In the Y-scrambling test performed for the MLR models, significantly low scrambled r² values were obtained.
The descriptors selected in the best model, MLR1, are not intercorrelated, as shown by the correlation matrix in Table 5. MLR1 contains three descriptors: one Galvez topological charge index (JGI2, the topological charge index of order 2) and two GETAWAY descriptors (HATSv, the leverage-weighted total index weighted by atomic van der Waals volumes, and R3m, the R autocorrelation of lag 3 weighted by atomic masses). Higher values of the JGI2 and HATSv descriptors are favorable for high insecticidal activity, whereas lower values of R3m raise the insecticidal activity.
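Two of the steps mentioned in this section, autoscaling of the descriptors and the check for intercorrelation, can be sketched as below. Autoscaling is taken here in its usual sense (subtract the column mean, divide by the column standard deviation); the use of the sample standard deviation (n − 1) is an assumption, and the descriptor values are hypothetical rather than taken from the dataset.

```python
import statistics


def autoscale(column):
    """Scale a descriptor column to zero mean and unit (sample) standard deviation."""
    mean = statistics.fmean(column)
    sd = statistics.stdev(column)  # sample standard deviation, n - 1 (assumed)
    if sd == 0:
        raise ValueError("constant descriptor cannot be autoscaled")
    return [(v - mean) / sd for v in column]


def pearson(x, y):
    """Pearson correlation, computed from the autoscaled columns."""
    zx, zy = autoscale(x), autoscale(y)
    return sum(a * b for a, b in zip(zx, zy)) / (len(x) - 1)


# Hypothetical values of two descriptors for six compounds.
descriptor_a = [0.021, 0.034, 0.028, 0.040, 0.025, 0.031]
descriptor_b = [3.1, 2.7, 3.6, 2.9, 3.3, 3.0]

print([round(v, 3) for v in autoscale(descriptor_a)])
print("pairwise r =", round(pearson(descriptor_a, descriptor_b), 3))
```

Computing `pearson` for every pair of selected descriptors gives a correlation matrix of the kind reported in Table 5; pairwise values close to +1 or -1 would indicate redundant descriptors.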
New neonicotinoid structures with insecticidal activity against the cowpea aphids can be designed based on the MLR models presented in this study.

Conclusions
Quantitative structure-insecticidal activity relationships were developed using the multiple linear regression approach for neonicotinoids with dihydropyrrole-fused and phenylazo moieties, active against cowpea aphids (Aphis craccivora). Insecticide structures were modeled using the MMFF94s force field. Descriptors of the minimum energy conformers were related to the pLC50 values using the multiple linear regression approach. Good correlations and predictive models were obtained. The GETAWAY and Galvez topological charge index descriptors included in the best MLR model can be used to predict new insecticides active against cowpea aphids, saving experimental time and money.