Prevalence and Factors Related to Natural Resistance-Associated Substitutions to Direct-Acting Antivirals in Patients with Genotype 1 Hepatitis C Virus Infection

This study aimed to assess the prevalence of natural resistance-associated substitutions (RASs) to NS3, NS5A and NS5B inhibitors in 86 genotype 1 Hepatitis C Virus (HCV)-infected patients from Buenos Aires, Argentina, and to determine their effect on therapy outcome. Additionally, virological, clinical and host genetic factors were explored as predictors of the presence of baseline RASs. NS3 RASs (39.2%) were more prevalent than NS5A RASs (25%) and NS5B RASs (8.9%). In the three regions, the frequencies of RASs were significantly higher in HCV-1b than in HCV-1a. The prevalence of Y93H, L159F and Q80K were 1.3%, 6.3% and 2.5%, respectively. IFNL3 CC genotype was identified as an independent predictor of the presence of baseline RASs in NS5A and NS3 genes (p = 0.0005 and p = 0.01, respectively). Sustained virologic response was achieved by 93.3% of the patients after receiving direct-acting antivirals (DAAs), although 48.7% of them showed baseline RASs related to the DAA-regimen. Notably, the prevalence of clinically relevant RASs in the three genes was lower than that observed around the world. The baseline presence of RASs in both subtypes did not appear to affect therapy outcome. These results support the need to evaluate resistance patterns in each particular country since RASs´ prevalence significantly vary worldwide.


Introduction
Hepatitis C Virus (HCV) infection affects more than 70 million people worldwide [1,2]. Argentina is considered a low-endemicity country, with a prevalence that is estimated to be around 1% [3]; although higher rates have been described in different small rural communities [4].
The treatment of chronic HCV infection has considerably improved in recent years with the widespread development of "direct-acting antiviral" (DAA) drugs. These drugs, which target specific viral proteins of the HCV lifecycle, can be combined in highly effective, well-tolerated, interferon-free infected with HCV genotype 1, by automated Sanger sequencing and Ion Torrent NGS, and to determine their effect on therapy outcome. Additionally, virological, clinical and host genetic factors were explored as predictors of the presence of baseline RASs.

Study Population
This study was approved a priori by the Ethics Committee on Research from the Hospital Italiano of Buenos Aires and conducted in accordance with good clinical practice guidelines and the principles of the Declaration of Helsinki.
From 2012 to 2014, consecutive DAA-naïve patients with genotype 1 chronic hepatitis C were invited to participate in the study, which took place at the Hepatology Unit of the Hospital Italiano of Buenos Aires. Serum and whole blood samples were collected from each patient, after obtaining written informed consent.
Clinical data, such as gender, age and previous failure to PegIFN/RBV treatment, were recorded. To evaluate the impact of baseline RASs on treatment outcome, SVR rates were documented in those patients who underwent DAA prescription after recruitment and sample collection. Fibrosis grade was staged either by biopsy or Transient Elastography by Fibroscan®(Echosens, Paris, France). Plasma HCV RNA load was measured using Cobas®TaqMan®(Roche, Pleasanton, CA, USA), with a detection limit of 15 IU/mL. HIV co-infection was diagnosed by ELISA (Dade Behring; Enzygnost anti HIV-1/2 plus, Marburg GmbH, Germany) and confirmed by Western-blot (New Lab Blot-1, Bio-Rad, Marnes-la-Coquette, France).

RT-PCR and Automated Sanger Sequencing
NS3, NS5A and NS5B genomic regions were partially amplified by previously described RT-Nested PCR protocols specific for subtype 1a and 1b [23][24][25], covering positions involved in drug resistance. PCR products were bi-directionally sequenced using the Big-Dye Termination chemistry system (Applied Biosystems, Foster City, CA, USA).

NGS Sequencing
Amplicons were fragmented into 200 base pair (bp) pieces using Bioruptor®Sonication System (Diagenode, Denville, NJ, USA). Fragments were nick repaired, adaptor ligated, and barcoded using the Ion Neb Next Library kit (Life Technologies, Foster, CA, USA). Then, fragments were purified by Agencourt®AMPure®XP magnetic beads (Beckman Coulter, Brea, CA, USA), the fragmentation pattern was analyzed by semi-fluidic electrophoresis (Agilent®Bioanalyzer®, Santa Clara, CA, USA) and libraries were diluted to obtain equimolar amounts. Emulsion PCR was carried out using the Ion OneTouchTM 2.0 system (Life Technologies, Carlsbad, CA, USA) and the Ion PGMTM Template Hi-Q™ OT2 200 kit (Thermo Fisher Scientific, Waltham, MA, USA), followed by an enrichment step. Sequencing was performed on an Ion Torrent Personal Genome Machine (PGM) sequencer (Thermo Fisher Scientific, Waltham, MA, USA) using the Ion PGM™ Hi-Q™ Sequencing Kit and the Ion 316 chip (Thermo Fisher Scientific, Waltham, MA, USA).
All NGS reads were trimmed and aligned with the previously mentioned reference sequences using the Smith-Waterman algorithm, included in the Torrent Suite™ software (Thermo Fisher Scientific, Waltham, MA, USA). The plugins Variant Caller and Coverage Analyzer, both available in this software, were used for analysis of sequence variants and coverage, respectively. Low stringency parameters were used to identify variants, as previously reported [17]. Sequence data were visually confirmed with the Integrative Genomics Viewer and any sequence, alignment, or variant call error artifacts were discarded.

HCV Quasispecies Analysis
Sequencing errors induced by PCR and ultra-deep sequencing would complicate the analysis of viral populations and result in inflated estimates of quasispecies heterogeneity. To avoid such misinterpretation, all necessary precautions in sample preparation, RNA extraction, and amplification were taken into account [29], and PCR bias was avoided as previously reported [30]. In addition, ShoRAH (v.0.8) software, which applied a probabilistic Bayesian approach to minimize the effects of errors, was used [31].
The quasispecies complexity of NS3, NS5A and NS5B was determined by calculating: Nucleotide mutation frequency (Pn) and normalized Shannon entropy (Sn). Pn was calculated as the total number of polymorphic sites divided by the total number of nucleotides sequenced; considering that the variability of the quasispecies increased as Pn increased. The measure of the Shannon entropy, defined in terms of the probabilities of different sequences that can appear at a given time point was calculated using the formula S = −Σi(pi ln pi), where pi is the frequency of each variant in the viral quasispecies. The normalized entropy, Sn, was calculated as Sn = S/ln N, where N is the total number of sequences analyzed in each sample. Sn theoretically varies from 0 (no diversity) to 1 (maximum diversity) [32].

Evaluated Variables and Statistical Analyses
For descriptive statistics, mean and standard deviation (SD) or absolute number and percentages were used. The statistical analysis of data was performed by means of contingency tables using Fisher exact probability test for categorical variables and Mann-Whitney U test for continuous variables. Univariate and multivariate logistic analyses were performed to identify variables that were independently associated with the presence of RASs in each genomic region. All significant parameters in the univariate analysis were included in the multivariate analysis. The virological variables that were explored to be associated with the presence of RASs were HCV subtypes, viral load and quasispecies complexity. Age, gender, METAVIR score and previous failure to PegIFN/RBV treatment were the clinical variables analyzed to be associated with RASs. Host SNPs in IFNL3/IFNL4 genes were also explored. Statistical analyses were performed with the Statistical Package for the Social Sciences software (v.18.0) (SPSS, Chicago, IL, USA) and significant differences were considered only for p < 0.05.

Patients' Characteristics
A total of 86 consecutive DAA-naïve patients chronically infected with HCV genotype 1 (33 subtype 1a and 53 subtype 1b) were included in the study. The clinical, virological and host genetic characteristics of the patients are shown in Table 1.

Genotype
Reference NS5A Position

RASs Frequency
C/H/I/N/R/S/T -Among the 33 subtype 1a strains, the primary resistance mutation M28V, related to resistance to LDV and OMV, was observed in one (3%) sample. (Table 2).
No combinations of multiple RAS were observed in subtype 1a, whereas in subtype 1b, one patient (2%) carried two mutations being the genetic profile R30Q+P58R.
Among the 49 subtype 1b sequences, five (10.2%) harbored the L159F variant, associated with a lower response to SOF. Although the major mutation S282T, which confers high-level resistance to SOF, was not observed, RASs S282G (4.1%) was detected (Table 3).
No combinations of multiple RASs were observed in subtype 1a-or 1b-infected patients.
Among the 29 subtype 1a strains, the clinically relevant RASs Q80K, associated with resistance to SMV, GZR and PTV, was detected only in one (3.4%) sample ascribed to subtype 1a clade 1. Substitution T54S, associated with resistance to ASV and SMV, was found in one (2%) sequence; as well as RAS V36L, which confers resistance to SMV, GZR and PTV, and I170V associated with resistance to SMV (Table 4). Table 4. Frequency of RASs in the NS3 protease by automated Sanger sequencing data.

Genotype
Reference NS3 Position
No combinations of multiple RASs were observed in subtype 1a-or 1b-infected patients.

Comparison of NGS and Direct Sequencing Results
The NS5A, NS5B and NS3 genes were successfully sequenced by NGS in 76 (90.5%), 72 (91.1%) and 72 (91.1%) of the samples characterized by automated Sanger sequencing, respectively. A mean of 12,259,169, 10,968,036 and 19,235,648 bp in the NS5A, NS5B and NS3 genes were mapped onto the reference sequences, and an overall average coverage depth of 889x, 859x and 987x was achieved for each region, respectively. An average of 60,676, 59,412 and 90,748 reads of the NS5A, NS5B and NS3 regions were aligned and compared with reference sequences for each subtype, respectively. Deep sequencing yielded a median of 11,650, 10,870 and 12,340 sequences per sample in the NS5A, NS5B and NS3 genomic regions, respectively.

Assessment of HCV Quasispecies Diversity
The viral population parameters of complexity of the NS5A, NS5B and NS3 genes sequenced by NGS, are summarized in Table 6. No significant associations were observed when these parameters were compared among the three analyzed HCV genomic regions. However, the nucleotide mutation frequencies for subtype 1a in NS3 and NS5A regions were significantly lower than those for subtype 1b (p = 0.002). Nucleotide mutation frequencies in the NS5B region as well as entropy values for all analyzed genomic regions were similar between subtype 1a and 1b (Table 6).

Factors Related to the Presence of Baseline RASs
In univariate analyses, several factors were significantly associated with the presence of baseline RASs in the NS5A, NS5B and NS3 regions (Table 7).
In a multivariate logistic regression model, HCV subtype 1b, IFNL3 CC genotype and values above the median of Shannon entropy and nucleotide mutation frequency were identified as independent predictors of the presence of baseline RASs in the NS5A gene. Among these, IFNL3 CC genotype was the factor with the highest predictive value (p = 0.0005). In the NS3 gene, this host genetic factor was also identified as the only independent predictor of the presence of RASs at baseline (p = 0.01). No factor was independently associated with the presence of RASs in the NS5B genomic region after the multivariate analysis (Table 7).

Effect of Baseline RASs on Treatment Outcome
Sixty of the 86 (69.8%) recruited patients ( 2%) exhibited at baseline a RAS (or a combination of RASs) related to the DAA-regimen. Treatment failure was observed in four (6.7%) cirrhotic and treatment-experienced patients (one subtype 1a and three subtype 1b) who relapsed after receiving triple therapy. One of the subtype 1b patients had a low-level baseline RAS (S122G) to protease inhibitors.

Discussion
The World Health Organization (WHO) has declared that HCV should be eliminated as a public health threat by 2030 [2]. Although DAAs seem a promising tool for viral elimination, HCV treatment fails in approximately 2.5% to 5% of patients and this may often occur in the presence of RASs [1]. Moreover, it has been reported that the frequencies of RASs differ between geno-and subtypes, geographical region and method of sequencing [33]. In addition, several viral, host and treatment factors can influence the emergence of RASs [18]. Therefore, in order to reach the worldwide elimination goals, RAS prevalence should be evaluated in individual countries or regions.
In this study, the combined results from both sequencing methods revealed that the frequency of natural RASs varied according to the analyzed viral region, as RASs were more frequently observed in the NS3 when compared to their frequency in the NS5A (39.2% vs. 25%, respectively; p = 0.06) and in the NS5B genes (39.2% vs. 8.9%; p < 0.0001).
This disparity in the prevalence of RASs between the three genomic regions is a possible reflection of the fact that the substitution rates in the different proteins may be constrained by the function that each one of them fulfills in the biology of the HCV. Since the genetic variability is not evenly distributed across the viral genome [34], different evolutionary processes shape the different genetic barriers for the development of resistance of each DAA family, which are similar between NS5A and NS3 protease, but higher for NS5B polymerase [34].
In the NS3 and NS5A regions, the frequencies of natural RAS were higher than those observed in previous studies [10,35], but similar to those reported in Argentina for NS3 [25] and in Brazil [17,36,37] when taking into consideration the sequencing method and cut-off value used. Notably, when the analysis was limited to amino acid substitutions associated with a clinically relevant level of resistance (defined as "likely resistant"), their prevalence was lower in both genomic regions when compared to other studies [10,22,35,38]. In this regard, in the NS5A protein, combining the results from both sequencing methods, mutations M28V and L31M were only detected in two subtype 1a and two subtype 1b-infected patients, respectively. Moreover, RAS Y93H, which confers high-level resistance to DCV, OMV, LDV and ELB [39], was observed in one subtype 1a-infected patient at a low frequency (3%) only after performing NGS. In the NS3 region, the clinically important RAS Q80K was only detected in one patient belonging to subtype 1a clade 1. This finding is in agreement with amino acid signature analyses from United States and Europe sequences in which Q80K polymorphism was only found in subtype 1a clade 1 and not clade 2 [40,41].
The low prevalence of the clinically relevant RASs in the NS3 and NS5A regions in this study contrasts with results from Asia, Europe and USA [35,38,[42][43][44], but it is similar to Argentinean and Brazilian reports [17,25,36,37]. The reasons for this discrepancy remain unclear as there is still little information about the genetic variability of the viral genes of HCV isolates from Latin America. However, this finding could be associated to geographical factors, as previously suggested [45,46].
In the NS5B region, the prevalence of RASs was lower than those reported by other studies [35,47]. None of the patients carried the S282T mutation, which is the signature substitution associated with resistance to SOF in vitro for all HCV genotypes [38]. This finding is in agreement with previous studies which reported that S282T is rarely (<1%) selected in patients treated with a SOF-based regimen and is not detected in treatment-naïve genotype 1 patients [42].
In this study, natural substitutions were detected at baseline in both subtypes, with a higher prevalence of RASs among subtype 1b samples in all three HCV genomic regions. This finding is in agreement with studies carried out in Italy [42], Argentina [25] and Brazil [17,35,45,48]; whereas a higher prevalence of RASs in subtype 1a sequences was observed in Europe and USA [10,22,49,50] and in two reports from Uruguay in which the number of analyzed subtype 1b sequences was relatively low [51,52].
The higher presence of Q80K among subtype 1a samples reported in Europe and USA, as well as the low prevalence of this subtype and this mutation among the samples of this study [4,25] could be possibly related to the different prevalence of RASs in each region. These geographic discrepancies highlight the genetic diversity present in both the natural RASs found in Latin American countries and the evolutionary relationships between Argentinean and other worldwide isolates, probably linked to the Italian immigration occurred in the 19th century.
Regarding the differences observed between subtypes, the nucleotide mutation frequencies in the NS3 and NS5A regions revealed that the genetic diversity was significantly higher among subtype 1b-infected patients compared to those infected with subtype 1a. Subtype 1b patients were, on average, older than those infected with subtype 1a (Table 1) and may have been infected-and possibly transmitting HCV-for longer periods of time [53], leading to a higher diversity in the viral genome. Finally, the higher prevalence of multiple RASs in subtype 1b genomes could enhance their replication capacity, as was previously reported [54,55]. Taking this into account, HCV subtype 1b infection may have a more unfavorable treatment outcome in Argentina than the expected in other geographical regions, particularly with some DAA regimens. However, the baseline presence of RASs in both HCV subtypes did not appear to affect therapy outcome on a limited sample of patients treated with DAAs in this study.
The need to routinely test for RASs at baseline has been recommended for Q80K in subtype 1a prior to using SMV in interferon-containing regimens, and in patients with cirrhosis that are going to be treated with SOF and SMV. For NS5A inhibitors, it is recommended to test for RASs in subtype 1a patients when using GZR + EBR [12]. The results presented herein suggest that routine testing for RASs at baseline would not be necessary for subtype 1a patients in this population, as their frequencies are low. However, the development of a resistance monitoring system is required as this scenario may change in the near future.
It has been reported that not only the presence of a particular RAS but also the RAS dominancy (>15% to 25%) in the patient's quasispecies has an impact on treatment outcome [56,57], although a recent study suggests that minority RASs present at much lower than 15% may also be relevant for treatment planning [58]. In this regard, this is the first report in Argentina that screened HCV drug resistance using NGS. This technology revealed RASs-not detected by automated sequencing-in 10.5% and 11.1% of the NS5A and NS3 samples. Among them, only two RASs were detected at a frequency above 15% among the patient's viral quasispecies: M28V (with a frequency of 21%) in NS5A and Q80L (with a frequency of 21%) in NS3. This finding supports the fact that deep sequencing has not been proven clinically relevant for RASs testing [59]. Indeed, NGS methodology is costly, too laborious, and technically challenging for many laboratories [60] and, thus, NGS could not yet be ready for routine clinical practice in Argentina.
Although Sanger automated sequencing is generally limited to the detection of variants with greater than 20% of prevalence [16], the two RASs mentioned above could not be detected by automated sequencing. This lack of correlation between both sequencing methods was probably due to the fact that both samples exhibited low viral loads (3.2 log 10 copies/mL), as it was previously reported [61]. In fact, it has been estimated that the efficiency of RNA extraction and cDNA synthesis is 1% to 10% [62]; thus, in low viral load samples, the small number of molecules present in the cDNA could be better examined by the higher analytical sensitivity of NGS [63].
Regarding host factors, it has been reported that the IFNL4 TT genotype, which has a more efficient antiviral response to HCV infection, is strongly associated with RAS Y93H in the subtype 1b NS5A protein, resulting in selection for viral adaptations [19,21]. In this study, univariate and multivariate analyses revealed that the IFNL3 CC genotype was identified as an independent predictor of the presence of RASs at baseline in NS5A and NS3 genomic regions. Therefore, differences in the frequency of NS5A and NS3 RASs could reflect disparities in the selection pressure against HCV due to genetic variations within the IFNL3/4 locus.
Finally, this study has certain limitations that need to be considered. First, NS3, NS5A and NS5B genes could not be amplified in some patients due to the low volume of stored serum and/or RNA samples. Second, the number of analyzed sequences was not so large. However, other studies included a similar-or even smaller-number of patients [46,51,52]. Finally, this study recruited only genotype 1 patients from Buenos Aires and the metropolitan region, and thus excluded patients infected with other genotypes or from distant areas of Buenos Aires, where the ancestry composition of the population and the prevalence of host genetic factors differ [64]. Therefore, these findings should not be generalized or transferable to the whole country.

Conclusions
Decisions regarding which combinations of DAAs to use remain challenging [65,66]. This study depicts a current scenario on DAA resistance in Buenos Aires, Argentina. In summary, the frequencies of natural NS5A and NS3 substitutions were similar than those previously reported in this geographic region. In the NS5B region, the prevalence of RASs was lower than those previously reported. Notably, the prevalence of clinically relevant RASs in the three analyzed genes was lower than those observed around the world. Although tested on a limited sample of patients in this study, the baseline presence of RASs in both HCV subtypes did not appear to affect therapy outcome. Despite the fact that NGS allowed a more complete analysis of RASs, its usefulness may be limited in this population. These results support the need to define the resistance testing in each country since the RASs have very different worldwide prevalence.
Supplementary Materials: The following are available online at http://www.mdpi.com/1999-4915/11/1/3/s1, Figure S1: Maximum-likelihood phylogenetic tree (GTR+Γ+I model for nucleotide substitutions) including 48 references sequences (in black) and the 84 HCV-NS5A samples of this study (in red). The numbers at each node correspond to bootstrap values obtained with 100 replicates (values lower than 70 are not shown); Figure S2: Maximum-likelihood phylogenetic tree (GTR+Γ+I model for nucleotide substitutions) including 48 references sequences (in black) and the 79 HCV-NS5B samples of this study (in red). The numbers at each node correspond to bootstrap values obtained with 100 replicates (values lower than 70 are not shown); Figure S3: Maximum-likelihood phylogenetic tree (GTR+Γ+I model for nucleotide substitutions) including 48 references sequences (in black) and the 79 HCV-NS3 samples of this study (in red). The numbers at each node correspond to bootstrap values obtained with 100 replicates (values lower than 70 are not shown).