Discovery of Potent Inhibitors for the Large Neutral Amino Acid Transporter 1 (LAT1) by Structure-Based Methods

The large neutral amino acid transporter 1 (LAT1) is a promising anticancer target that is required for the cellular uptake of essential amino acids that serve as building blocks for cancer growth and proliferation. Here, we report a structure-based approach to identify chemically diverse and potent inhibitors of LAT1. First, a homology model of LAT1 that is based on the atomic structures of the prokaryotic homologs was constructed. Molecular docking of nitrogen mustards (NMs) with a wide range of affinity allowed for deriving a common binding mode that could explain the structure−activity relationship pattern in NMs. Subsequently, validated binding hypotheses were subjected to molecular dynamics simulation, which allowed for extracting a set of dynamic pharmacophores. Finally, a library of ~1.1 million molecules was virtually screened against these pharmacophores, followed by docking. Biological testing of the 30 top-ranked hits revealed 13 actives, with the best compound showing an IC50 value in the sub-μM range.


Homology modeling of LAT1
The homology model of human LAT1 was constructed against two templates: (i) the crystal structure of the outward-occluded conformation of the arginine/agmatine transporter AdiC from E. coli (PDB ID: 3L1L) [1] and (ii) the crystal structure of the inward-open conformation of ApcT from M. jannaschii (PDB ID: 3GIA) [2]. The sequence identity and sequence similarity of LAT1 with AdiC are 20% and 40%, respectively, and those of LAT1 with ApcT are 23% and 41%. Amino acid residues 1–50 and 480–507 of LAT1 were not considered in the model building because these residues are predicted to form long intracellular N- and C-terminal domains [3]. In our final alignment, short insertions of one and two amino acids were observed in TM3 and TM11 of LAT1 (Figure S1). Gaps with deletions of four and one amino acids were found in TM9 and TM10. Long insertions and deletions were observed in extracellular loop 3 (EL3) between TM5 and TM6, implying ambiguity in the prediction of this loop. Additionally, amino acid differences were observed in the TMs of LAT1 among human, mouse, rabbit, and dog (Table S1). However, the residues enclosing the binding site of LAT1 were identical in all species.
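To make the identity and similarity figures above concrete, the following minimal sketch shows one common way such percentages can be computed from a pairwise alignment. It is an illustration under stated assumptions, not the procedure used in this work: the function name `identity_similarity`, the residue grouping in `SIMILAR_GROUPS`, the choice of denominator (columns without gaps), and the toy sequences are all hypothetical.

```python
# Hypothetical grouping of "similar" residues; the scoring scheme actually
# used for the reported values is not stated in the text.
SIMILAR_GROUPS = ["ILVM", "FWY", "KRH", "DE", "NQ", "ST", "AG"]


def identity_similarity(seq_a, seq_b, groups=SIMILAR_GROUPS):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must come from the same alignment (equal length, '-' for gaps).
    Percentages are taken over columns where neither sequence has a gap,
    which is only one of several conventions in use.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    aligned = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        aligned += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in g and b in g for g in groups):
            similar += 1
    if aligned == 0:
        raise ValueError("no aligned (gap-free) columns")
    return 100.0 * identical / aligned, 100.0 * similar / aligned


# Toy sequences for illustration only (not LAT1, AdiC, or ApcT).
ident, simil = identity_similarity("MKT-LIV", "MRTALLV")
```

Because the result depends on the denominator and on how similarity is defined, values computed this way are comparable only when the same convention is applied to both template alignments.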

Model evaluation
The final model of LAT1 was evaluated using PROCHECK [4] and QMEAN [5]. The Ramachandran analysis showed that 88.4% of all residues were in the most favored regions, 9.5% in additionally allowed, 1.6% in generously allowed, and 0.5% in disallowed regions (Table S2, Figure S3). Most of the residues located in the generously allowed and disallowed regions were found on the outer surface and in the intra- and extracellular loops of the model. Only two residues in the disallowed regions, G65 and G256, were within 5 Å of the binding site (Figure S4).
Both residues were optimized via energy-based refinement using the variable dielectric surface generalized Born solvation model [6]. According to the QMEAN analysis, the model showed decent quality in all regions, including the binding site (Table S2, Figure S5).

Based on the binding energy calculations (Figure S21), the estimated sensitivity of LAT1 to the ligands may be expressed in the order 11 > 9 > 10 > 12 > 8, which is qualitatively consistent with the in vivo data of the NMs. Nevertheless, 8 was poorly predicted by MM-PBSA, although it is equipotent to 9 and 9 times more potent than 10. This deviation between the predicted and experimental values may be ascribed to the shortcomings of MM-PBSA relative to more precise methods of ΔG calculation, such as thermodynamic integration (TI) and free energy perturbation (FEP). The van der Waals (ΔGvdw), electrostatic (ΔGelect), and non-polar solvation (ΔGnon-polar) energies contributed negatively, while the polar solvation energy (ΔGpolar) contributed positively to the total binding free energy of the ligands.
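For reference, the four energy terms named above can be related to the total binding free energy as sketched below. This is the usual MM-PBSA form written with the symbols used in the text. Whether an entropic term was included is not stated here, so it is omitted as an assumption.

```latex
\Delta G_{\mathrm{bind}} \approx
  \underbrace{\Delta G_{\mathrm{vdw}} + \Delta G_{\mathrm{elect}}}_{\text{molecular mechanics terms}}
  + \underbrace{\Delta G_{\mathrm{polar}} + \Delta G_{\mathrm{non\text{-}polar}}}_{\text{solvation terms}}
```

With this sign convention, a more negative total corresponds to more favorable predicted binding, so a large positive polar solvation term can offset favorable van der Waals and electrostatic contributions.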

Network visualization of the docking poses
The r 2 between Gvdw and Gbind is 0.82, and r 2 between Gelect and Gbind is 0.64. In terms of negative contribution, Gvdw gives more significant contribution than Gelect for all ligands except 8 suggesting significant hydrophobic interactions of the side chain. The lack of extended side chain in 8 may explain the low Gvdw as compared to the NMs, and thus a smaller Gbind.
Moreover, in 8 and 12, the contribution from the electrostatic and van der Waals energy was compensated mainly by the high polar solvation free energy resulting in reduced Gbind.
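The r² values quoted above relate one energy component to the total binding energy across the ligand series. A minimal sketch of such a calculation is given below, assuming r² denotes the squared Pearson correlation coefficient (the text does not specify this). The function name `r_squared` is hypothetical, and the numbers are toy values chosen for illustration, not the computed energies of the ligands.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations from the means
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return (sxy * sxy) / (sxx * syy)


# Toy values only; replace with per-ligand component and total energies.
component = [0.0, 1.0, 2.0, 3.0]
total = [0.0, 1.0, 1.0, 3.0]
r2 = r_squared(component, total)
```

With only five ligands in the series, an r² obtained this way is sensitive to any single data point, so it is best read as a qualitative indicator.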
Overall, Gelect and Gvdw seems to be dominant forces contributing to the stability of complexes 812. To identify the critical molecular determinants involved in the binding, perresidue energy contribution was computed. The binding of the ligands was mostly influenced favorably by residues I139, I140, I147, V148, F252, W257, V339 and W405 via van der Waals interactions, while residues T62, I63, G65, S66, G67, F252, and S338 contributed via electrostatic interactions (Figure S22).  Table S1. The amino acid residue differences in the TMs of LAT1 of mouse, rabbit, and dog with respect to the human sequence. The corresponding substitutions are indicated in red.

Figure S1. LAT1-AdiC alignment as visualized using Jalview [10]. The residues are colored according to their type using the ClustalX color scheme. The TMs are indicated as brown boxes. The TMs of AdiC were defined using the PPM server [11]. The residues of LAT1 involved in direct interactions with the docking poses of 8–12 are highlighted with a black asterisk.

Table S2. Assessment of the LAT1 model built on the AdiC structure (PDB ID: 3L1L). ERRAT (overall quality factor): 93.57.

Table S7. Dose-response analysis of compounds 28, 42 and 36.

Figure S63. ¹H-NMR spectrum of compound 36.