Genome Wide Association Study of Karnal Bunt Resistance in a Wheat Germplasm Collection from Afghanistan

Karnal bunt disease of wheat, caused by the fungus Neovossia indica, is one of the most important challenges to the grain industry as it affects the grain quality and also restricts the international movement of infected grain. It is a seed-, soil- and airborne disease with limited effect of chemical control. Currently, this disease is contained through the deployment of host resistance but further improvement is limited as only a few genotypes have been found to carry partial resistance. To identify genomic regions responsible for resistance in a set of 339 wheat accessions, genome-wide association study (GWAS) was undertaken using the DArTSeq® technology, in which 18 genomic regions for Karnal bunt resistance were identified, explaining 5–20% of the phenotypic variation. The identified quantitative trait loci (QTL) on chromosome 2BL showed consistently significant effects across all four experiments, whereas another QTL on 5BL was significant in three experiments. Additional QTLs were mapped on chromosomes 1DL, 2DL, 4AL, 5AS, 6BL, 6BS, 7BS and 7DL that have not been mapped previously, and on chromosomes 4B, 5AL, 5BL and 6BS, which have been reported in previous studies. Germplasm with less than 1% Karnal bunt infection have been identified and can be used for resistance breeding. The SNP markers linked to the genomic regions conferring resistance to Karnal bunt could be used to improve Karnal bunt resistance through marker-assisted selection.


Introduction
Wheat (Triticum aestivum L.) is one of the most important staple cereals grown worldwide contributing about 21% of the energy intake of the world population [1]. Karnal bunt disease of wheat caused by the fungus Neovossia indica (Mitra) Mundkur (syn. Tilletia indica), is an important disease related to international trade. The disease is characterized by the replacement of a part of the seed with a black powdery mass of spores. This disease affects not only the grain weight but also the grain quality of wheat due to the production of trimethylamine with an unpleasant rotten fishy smell [2]. Karnal bunt was first reported in 1931 from Karnal town in the Indian state of Haryana [3]. Since then, the disease has spread to different states of India, as well as Nepal [4], Iran [5], Iraq [6], Afghanistan [7], Mexico [8], limited areas of the USA [9], South Africa [10] and Brazil [11]. Climate Only few Karnal bunt resistant sources have been identified and used in the breeding programs, making it necessary to explore new sources of resistance and enrich the breeding pool to broaden the genetic base of the resistance. Most of the known resistance gene(s)/QTL have been identified based on bi-parental mapping populations; thus, the usefulness of the linked markers in materials other than the populations studied remains unknown. Genome wide association study (GWAS) offers unique opportunities to use a diverse set of germplasm that have accumulated a much larger number of crossing over events since their last common progenitor, as opposed to only one or a few meiotic recombinations in a bi-parental mapping populations derived from crosses of two parents only [39]. Good resolution of the identified QTL is often achievable and wide sampling of molecular variation is possible in GWAS. Its ability to resolve marker/trait associations depends upon the extent of linkage disequilibrium (LD) present in the association panel. The genetic relationship among entries in association panel influences LD, which can be easily inferred from genome-wide genotypic datasets [40]. The markers identified to be linked with QTL have the potential to be used across breeding material for identification and introgression [41]. The sequence information, the annotated, whole-genome sequence of wheat and other cereals offer new opportunities to study the exact nature of allelic variation, explore the underlying genetic basis and putative gene(s) associated with Karnal bunt resistance. In wheat, GWAS has been used to study disease resistance like rusts and powdery mildew, drought tolerance and end use quality traits, whereas only few studies were undertaken for Karnal bunt resistance. In this study, a diverse collection of germplasm from Afghanistan including 339 lines has been subjected to GWAS analysis, in order to identify QTL for Karnal bunt resistance.

Karnal Bunt Resistance in the Germplasm Collection
The percent Karnal bunt infection score in the evaluated germplasm collection ranged from 0% to 60%. In all four experiments, Karnal bunt infection was skewed towards lower infection but continuous variation was observed (Figure 1). Disease severity in the four experiments was highly correlated, with phenotypic correlations ranging from 0.61 to 0.70. unique opportunities to use a diverse set of germplasm that have accumulated a much larger number of crossing over events since their last common progenitor, as opposed to only one or a few meiotic recombinations in a bi-parental mapping populations derived from crosses of two parents only [39]. Good resolution of the identified QTL is often achievable and wide sampling of molecular variation is possible in GWAS. Its ability to resolve marker/trait associations depends upon the extent of linkage disequilibrium (LD) present in the association panel. The genetic relationship among entries in association panel influences LD, which can be easily inferred from genome-wide genotypic datasets [40]. The markers identified to be linked with QTL have the potential to be used across breeding material for identification and introgression [41]. The sequence information, the annotated, whole-genome sequence of wheat and other cereals offer new opportunities to study the exact nature of allelic variation, explore the underlying genetic basis and putative gene(s) associated with Karnal bunt resistance. In wheat, GWAS has been used to study disease resistance like rusts and powdery mildew, drought tolerance and end use quality traits, whereas only few studies were undertaken for Karnal bunt resistance. In this study, a diverse collection of germplasm from Afghanistan including 339 lines has been subjected to GWAS analysis, in order to identify QTL for Karnal bunt resistance.

Karnal Bunt Resistance in the Germplasm Collection
The percent Karnal bunt infection score in the evaluated germplasm collection ranged from 0% to 60%. In all four experiments, Karnal bunt infection was skewed towards lower infection but continuous variation was observed ( Figure 1). Disease severity in the four experiments was highly correlated, with phenotypic correlations ranging from 0.61 to 0.70.
Based on the average of all four experiments, 52 genotypes exhibited a severity less than 5%, 17 genotypes exhibited more than 40% and rest of the lines recorded between 5-40% ( Figure 1). It is noteworthy that 11 lines showed severities less than 1% and could be used as Karnal bunt resistance sources ( Table 2). Analysis of variance indicated significant variation among genotypes, as well as for genotype-by-experiment interaction ( Table 3). The heritability estimate for Karnal bunt was observed to be 0.80 based on all four experiments.  Based on the average of all four experiments, 52 genotypes exhibited a severity less than 5%, 17 genotypes exhibited more than 40% and rest of the lines recorded between 5-40% ( Figure 1). It is noteworthy that 11 lines showed severities less than 1% and could be used as Karnal bunt resistance sources ( Table 2). Analysis of variance indicated significant variation among genotypes, as well as for genotype-by-experiment interaction ( Table 3). The heritability estimate for Karnal bunt was observed to be 0.80 based on all four experiments.

Population Structure
Principal component analysis based on the genotypic data was able to differentiate the germplasm collection into two main groups ( Figure 2). Genotypes of International Maize and Wheat Improvement Center (CIMMYT) origin and those derived from CIMMYT lines, as well as some Afghanistan materials, were clustered in Group I, whereas the rest of Afghanistan materials were found in Group II. The lines released in India were also clustered into Group I, in agreement with their CIMMYT based pedigrees.

11
GLM and MLM models were tested for best fit based on the observed P-values and the expected   (Table 4). The

19
identified SNPs explained the phenotypic variation ranging from 5% to 20% for Karnal bunt severity.

20
Four MTAs were reported on chromosome 2B, followed by three on chromosome 5B, two each on

Markers Significantly Associated with Karnal Bunt
GLM and MLM models were tested for best fit based on the observed p-values and the expected p-values for a trait (Q-Q plots). The mixed linear model (MLM) involving population structure (Q) and kinship data (K) fitted better than GLM and therefore was used for further association analysis. The association analysis identified a total of 18 marker-trait associations (MTAs) for Karnal bunt. The SNPs exhibiting MTAs were assigned to 12 wheat chromosomes regions, namely, 1D, 2B, 2D, 4A, 4B, 5A, 5B, 6A, 6B, 7B and 7D (Figure 3). Out of the 18 MTAs, only the one on chromosome 2BL was common among all four experiments, the one on chromosome 5BL was common among three experiments, and the remaining 16 were only common between two experiments (Table 4). The identified SNPs explained the phenotypic variation ranging from 5% to 20% for Karnal bunt severity. Four MTAs were reported on chromosome 2B, followed by three on chromosome 5B, two each on chromosome 5A and 6B and only one on chromosomes 1D, 3B, 4A, 4B, 6A, 7B and 7D. The SNP markers showing significant MTAs were tested for linkage disequilibrium (LD) among each other, and the results showed that two SNPs on 5BL (1079540 and 1083023) were in LD, indicating that they may represent the same QTL. The QTL on chromosome 2BL (6048838) explained the highest phenotypic variation from 11 to 20% across the four experiments, and lines having the CC genotype of the marker 6048838 showed an average Karnal bunt infection score of 16.5%, whereas those having TT genotype showed an average infection of 25%. QTL represented by markers 1079540 and 1083023 on 5BL explained a phenotypic variation of 6 to 9% across three experiments, and at 1079540, the CC genotypes showed 17% less Karnal bunt infection as compared to lines having the AA genotype (Figure 4). Similarly, for other significant SNPs, the resistance alleles showed their association with Karnal bunt reduction as well.

Putative Genes Associated with MTAs for Karnal Bunt
In the present study, significant and stable SNPs have been reported on chromosomes 1D, 2B, 3B, 4A, 5A, 5B, 6A, 6B, 7B and 7D. The SNP markers 6048838 and 1228074 on chromosome 2BL encompass an interval of 38Mb and are reported to contain several resistance genes coding for Kinase-like proteins, nucleotide-binding site-leucine rich repeat (NBS-LRR) class of proteins, a serine threonine protein kinase that are mostly associated with disease resistance. The genes TraesCS2B02G496800 and TraesCS2B01G535000, which code for disease resistance protein (Toll-interleukin receptor-NBS-LRR class) and 70 kDa heat shock protein are adjacent to the markers 6048838 and 1228074, respectively. The SNPs on chromosome 5BL (1079540 and 1083023) encompass 14 Kb region and the putative candidate genes in the region were TraesCS5B01G506000 and TraesCS5B01G506100 which code for F-box family protein and F-box domain containing protein, respectively. Another QTL on chromosome 4AL harbors gene TraesCS4A01G379200 coding for F-box family protein. Gene TraesCS1D01G407200 coding for Leucine-rich repeat receptor-like protein kinase family was found in the vicinity of the QTL on 1DL. Similarly, two SNPs (1002872 and 1107259) on chromosome 2BS encompass genes TraesCS2B02G197000 and TraesCS2B01G197500, which code for Kinase family protein and GRF zinc finger family protein.
SNPs on chromosome 6AS and 6BS were found to be close to genes coding for NBS-LRR-like resistance proteins. The remaining SNPs on chromosomes 2D, 5AS, 5AL, 5BS, 6BL, 7BS and 7DL were also found to be associated with Serine/threonine-protein kinase, Protein Kinase family protein, Kinase family protein, Receptor-like kinase, C2H2-like zinc finger protein, Glycosyltransferase and Transcription factor gene families. The candidates genes located nearby/ in proximity to the significant SNP markers are presented in Figure 5

Discussion
Karnal bunt in wheat is of utmost importance with respect to the strict quarantine laws in countries across globe. Host resistance to this disease is of quantitative nature and only few resistance sources are identified and deployed in breeding programs across China, India, Mexico and Brazil [24,31,33,42] ( Table 1). The current study made use of a germplasm collection comprising of Afghan landraces and breeding lines mostly of CIMMYT origin disseminated to the country to identify new Karnal bunt resistance sources that can be integrated into the breeding programs to diversify resistance to this disease. The disease development is highly affected by environmental conditions; thereby, it is a challenge to properly screen the genotypes for Karnal bunt resistance [43]. In the present study, correlation and heritability of the %Karnal bunt infection score across experiments was high despite significant genotype-by-experiment interaction, indicating screening and scoring was appropriate for Karnal bunt, which is in agreement with other studies [33,37,38]. Here, we have reported 11 lines exhibiting less than 1% Karnal bunt infection score, and based on their diverse pedigree and different QTL composition, they represent diverse resistance donors for Karnal bunt and thus could be used in resistance breeding ( Table 2). Out of these eleven genotypes, one genotype belonged to Afghanistan wheat collection (#6) and the other 10 are of CIMMYT origin based on pedigree information.
Two genetic loci associated with SNPs identified on chromosomes 2BL and 5BL were the most repeatable, explaining a variation of approximately 14% and 7%, respectively. The 2BL QTL identified in this study is at a distance of 18 Mb from the previously reported QKb.cim-2BL [37] and has a TIR-NBS-LRR class protein-coding gene in its vicinity, which is well-recognized for its role in conferring disease resistance [48]. Another QTL on chromosome 5BL was found to be about 37 Mb away from an earlier reported QTL (Qkb.ksu-5BL.1) on chromosome 5BL [33,34] and the genes in the vicinity of this QTL code for F-box family protein and F-box domain containing proteins.
The QTL on chromosome 4B was found to be in the same position as the ones identified in previous studies [31,[33][34][35]. The loci on chromosome 5AL and 6BS reported in the present study are in close proximity of the previously reported QTL on the respective chromosomes [33,36,37]. The genomic regions on chromosomes 1DL, 2BS, 2DL, 4AL, 5AS, 6BL, 6BS, 7BS and 7DL are found to be potentially new and could be useful for enhancing Karnal bunt resistance in the breeding germplasm. The critical p value (0.001) adopted in this study was based on a balance between false positive and false negative. The p value would be 3.8 × 10 −6 if the Bonferroni correction were used, which is too low and would result in the absence of any significant QTL (see Figure 3). As demonstrated in the manuscript, we have identified several QTLs that have been reported previously, which implies that our analysis is robust, and also our critical p value is appropriate.
The results of our and earlier studies indicate that high diversity for Karnal bunt resistance exists in wheat germplasm. Although most of the identified MTAs showed low phenotypic effects, a few loci with large affects was still identified, such as the one on 2BL, which could be transferred together with minor QTL into individual backgrounds through MAS for developing Karnal bunt resistance cultivars.

Experimental Plant Material and Field Trials
The plant material consisted of 339 wheat genotypes representing a collection from Afghanistan, which includes landraces, elite lines, released varieties and advanced breeding lines. The experimental material was planted at the Norman E. Borlaug Experimental Station (CENEB) of CIMMYT, located in Cd. Obregon, Sonora (27. raised beds. The Karnal bunt susceptible check WL711 was seeded in a range of dates to monitor disease pressure over the whole experiment period. The field trials were managed as per recommended local practices. The material was planted on two different dates, with the first being in mid-November and the second two weeks later in each year. The heading date and disease development is affected by environmental conditions therefore each planting date was considered as separate experiment ( Figure S1). In all there were four experiments, namely, E1-2017 (2017-seeding date 1), E2-2017 (2017-seeding date 2), E3-2018 (2018-seeding date 1) and E4-2018 (2018-seeding date 2) used for screening germplasm collection for percent Karnal bunt infection.

Inoculum Preparation
To prepare inoculum, Karnal bunt infected wheat kernels were mixed with Tween 20 solution in a glass tube, which was shaken and then filtered with 60 µm mesh and allowed to stand for 24 h. The collected teliospores were placed in 0.6% sodium hypochlorite for 2 min and centrifuged at 3000 rpm for a few seconds. The supernatant was discarded and the teliospores were rinsed with distilled water, then the solution was shortly centrifuged at 3000 rpm. The rinse and centrifuge steps were repeated one more time. The teliospores were transferred to 2% water agar under sterile condition and incubated at 18-22 • C until germination was detected. Pieces of water agar with teliospores germinating on it were put inversely onto Petri dishes with potato-dextrose-agar (PDA) in order to stimulate the production of secondary sporidia. Nine days later, the Petri dishes were flooded with sterile water and scraped with a sterile spatula, and the suspension was transferred to other Petri dishes with PDA to increase the inoculum. Once the Petri dishes were covered with fungal colonies, the agar was cut into pieces and put inversely on sterile glass Petri dishes, into which distilled water was added and secondary sporidia were daily harvested. Inoculum concentration was adjusted to 10,000 sporidia/mL using a haemocytometer [26].

Karnal Bunt Inoculations and Disease Scoring
Artificial inoculation was performed on two separate occasions for each variety, with specific dates for each variety determined by the availability of sufficient numbers of tillers that were in the boot stage. The inoculations occurred between January and March, and were performed by injecting a sporidial suspension with a hypodermic syringe into the boot just as awns emerged [49]. Appropriate humidity was maintained in the field via intermittent misting with overhead sprinklers during the inoculation period (five times a day, with 20 min of misting each time). At maturity, five inoculated heads were harvested and threshed separately, and the number of infected and uninfected seeds per ear was counted. The data was used to evaluate disease severity, calculated as the percentage of infected grains in each ear. The average of the Karnal bunt infection over five spikes was used in subsequent analysis.

Statistical Analysis
Data analysis was performed using the R statistical software [50]. Analysis of variance was done for estimating the genotypic differences among dates of sowing within a year as well as across the years for the Karnal bunt infection. The Pearson's correlation coefficient was estimated between different experiments. Variance components were used for estimating the broad-sense heritability H 2 =σ 2 g /(σ 2 g +σ 2 g*E / E + σ 2 e / E ), where σ 2 g , σ 2 g*E and σ 2 e represent the genotype, genotype × experiment interaction and residual variance, respectively and E was the number of experiments, respectively.

Genotypic Data and GWAS
Genomic DNA was extracted from young leaves following the previously described CTAB method [51]. The 339 genotypes were genotyped with the DArTseq ® technology at the Genetic Analysis Service for Agriculture (SAGA) at CIMMYT, Mexico. This genotyping platform is based on a combination of complexity reduction methods developed for array-based DArT and sequencing of resulting representations on next-generation sequencing platforms, as described in detail in [52]. To avoid spurious marker trait associations, markers with missing data points >20% and minor allele frequency <0.05 were excluded from analysis. Markers that were monomorphic were also removed from the data set and the remaining array comprised 13,098 SNP markers spanning all the chromosomes. Significant marker-trait associations were identified using the mixed linear model approach (MLM) in TASSEL package, which includes a Q matrix for fixed effects and a kinship matrix for random effects. The population structure (Q) and relatedness (K) among genotypes was taken care by the MLM approach to rule out false associations. Marker-trait associations were considered significant if the p value of a trait-associated marker is ≤0.001, and were significant in at least two experiments. R 2 was used to evaluate the magnitude of marker trait effects. The marker-trait associations were cross-referenced against all reported QTL in the literature and the GrainGenes database (https://wheat.pw.usda.gov/GG3/). Significant SNPs were used to locate putative candidate genes associated with Karnal bunt resistance, where the SNP sequences were used for BLASTN against the reference genome sequence IWGSC (non-profit organization registered in the United States) RefSeq v1.0 to look for nearby genes associated with plant disease resistance.

Conclusions
Karnal bunt resistance in common wheat is a complex trait and different genes have been reported controlling the resistance in different accessions. In the present study, eleven lines were identified to be resistant as evident from average Karnal bunt infection scores of <1% after evaluation over four experiments. The lines identified here indicated that variation for Karnal bunt exists and can be used to enrich the Karnal bunt resistance base of breeding germplasm. Markers linked to QTL on chromosomes 2BL and 5BL identified in this study are useful for their utilization in breeding, and further studies on the two most significant QTL could be helpful in the identification of diagnostic markers to be used in marker assisted breeding. The genomic regions spanning the Karnal bunt QTLs identified in our study harbored many genes coding for pathogenesis related proteins such as Leucine-rich repeat receptor-like protein kinase, Kinase family protein, Serine/threonine-protein kinase and Kinase family protein, which have been identified in earlier studies to be conferring resistance. The physical map position of these QTLs linked to these defense related genes would provide opportunities for gene cloning and molecular breeding in wheat.