The Mean Staple Length of Wool Fibre Is Associated with Variation in the Ovine Keratin-Associated Protein 21-2 Gene

Wool and hair fibres consist of a variety of proteins, including the keratin-associated proteins (KAPs). In this study, a putative ovine homologue of the human KAP21-2 gene (KRTAP21-2) was identified. It was located on chromosome 1 as a 201-bp open reading frame (ORF) in the ovine genome assembly from a Texel sheep (v.4 NC_019458.2: nt 122932727 to 122932927). A polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) analysis of this ORF, and subsequent DNA sequencing, identified five sequences (named A-E). The putative amino acid sequences that would be produced shared some identity with each other and with other KAPs, but they were most similar to ovine KAP21-1, and phylogenetically related to human KAP21-2. The location of the ovine KRTAP21-2 sequence was consistent with the location of human KRTAP21-2, and this suggests the five sequences represent different variant forms of ovine KRTAP21-2. Variation in this gene was investigated in 389 Merino (sire) × Southdown-cross (ewe) lambs. These were derived from four independent sire-lines. The sequence variation was found to be associated with variation in five wool traits: mean staple length (MSL), mean fibre diameter (MFD), fibre diameter standard deviation (FDSD), prickle factor (PF), and greasy fleece weight (GFW). The most persistent effect of KRTAP21-2 variation was on MSL, with the MSL of sheep of genotype AC being 12.5% greater than that of sheep of genotype CE. A similar effect was observed in the individual variant presence/absence models. This suggests that KRTAP21-2 should be further investigated as a possible gene-marker for improving MSL.


Introduction
Wool is a complex structure composed of numerous proteins. Of these, the large and diverse family of keratin-associated proteins (KAPs) fulfil an important structural role, as they are components of the matrix in which the keratin intermediate filaments (KIFs) are embedded. They are believed to play an important role in defining the physical and mechanical properties of wool fibres [1]. The KAPs usually possess either a high level of cysteine, or both glycine and tyrosine, and historically they have been classified into three groups based on their amino acid content: the ultra-high sulphur (UHS) KAPs, the high sulphur (HS) KAPs, and the high glycine-tyrosine (HGT) KAPs [2].
Of these KAPs, HGT-KAPs are predominantly found in the wool fibre orthocortex, and they have been shown to be the first KAPs expressed after the synthesis of the KIFs. The proportion of the wool fibre that is HGT-KAP varies between wools, ranging from up to 12% by weight in the wool of Merino sheep to less than 1% by weight in wool from Lincoln sheep [3]. The small amount of HGT-KAP in sheep with the felting lustre mutation, and the wide range of HGT-KAP content in wool from different sheep breeds [4], suggest these proteins have a novel role in determining wool fibre characteristics.
To confirm this, we analysed the sheep genome assembly sequence (NC_019458.2) in proximity to ovine KRTAP21-1 [9] and, approximately 4.6 kb downstream of KRTAP21-1, identified an open reading frame (ORF) (NC_019458.2: 122932727 to 122932927) that could potentially produce a putative HGT-KAP (unpublished data). In this study, we describe the characterization of this ORF, report sequence variation in it that was detected by polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) analysis and DNA sequencing, and reveal associations between this gene variation and variation in some important wool fibre traits.

Materials and Methods
The collection of drops of blood from the ears of sheep, as was undertaken in this study, is covered by Section 7.5 Animal Identification of the Animal Welfare (Sheep and Beef Cattle) Code of Welfare 2010. This code of welfare is issued under the New Zealand Animal Welfare Act 1999 (New Zealand Government).

Sheep Blood and Wool Samples
Three hundred and eighty-nine Merino × Southdown-cross lambs from four sire-lines farmed at Ashley Dene, Canterbury, New Zealand (NZ), and NZ Romney lambs (n = 75) from five separate farms, were used to ascertain if there was DNA sequence variation in the putative KRTAP21-2. The 389 Merino × Southdown-cross lambs were also used in the association analysis. A blood sample from each of the lambs was collected onto an individual blotting paper (FTA™ cards, Whatman BioScience, Middlesex, UK) and genomic DNA from the leucocytes in the blood was prepared using the method described by Zhou et al. [11].
The Merino × Southdown-cross lambs were all ear-tagged with an identification number within 12 hours of parturition and their birth date, birth weight, birth rank (i.e. whether they were a single, twin or triplet), gender, and pedigree (sire and ewe identity) were recorded. The ewes and lambs remained as a single mob until weaning, at which time the lambs were separated from the ewes and the lambs were separated into two groups based on their gender. At approximately one year of age the lambs were shorn, and their greasy fleece weight (GFW) was measured. Mid-side wool samples were collected and sent for wool testing. This was undertaken by the New Zealand Wool Testing Authority Ltd (Ahuriri, Napier, NZ) using International Wool Textile Organisation (IWTO) endorsed methods (www.iwto.org). The traits measured at testing included mean staple length (MSL), mean fibre curvature (MFC), and mean staple strength (MSS); plus the fibre diameter related traits of mean fibre diameter (MFD), fibre diameter standard deviation (FDSD), coefficient of variation of fibre diameter (CVFD), and prickle factor (PF). Wool yield (Yield) was assessed and this was used to calculate the clean fleece weight (CFW).

PCR Primers and Amplification
Sequences bordering the newly identified ORF (NC_019458.2: 122932727 to 122932927) were analysed to assist in the design of PCR primers that would span the entire putative HGT-KAP reading frame. The chosen primer sequences were: 5'-ACACACTTCAGAACCATCGC-3' and 5'-TGGTTTCAGACGTAAATGGTG-3', and they were made by Integrated DNA Technologies (Coralville, IA, USA). The PCR amplifications were performed in 15-µL reactions. Each reaction contained the genomic DNA on an individual 1.2-mm punch of a single FTA paper, 0.5 U of Taq DNA polymerase (Qiagen, Hilden, Germany), 1× reaction buffer supplied with the enzyme, 0.25 µM of each primer, 150 µM of each dNTP (Bioline, London, UK), and 2.5 mM of Mg2+. Amplification was carried out in S1000 cyclers (Bio-Rad, Hercules, CA, USA), with a thermal profile that consisted of two minutes at 94 °C, followed by 35 cycles of 30 s at 94 °C, 30 s at 62 °C and 30 s at 72 °C, with a final incubation of five minutes at 72 °C.
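As a quick sanity check on the primer pair, the short sketch below computes the length, GC content and a rough melting temperature for each primer. The primer sequences are those given above; the helper names are illustrative, and the Wallace rule used here is only a coarse approximation intended for short oligonucleotides, not the method used to choose the 62 °C annealing temperature.

```python
# Primer sequences are taken from the text; helper names are illustrative.
FORWARD = "ACACACTTCAGAACCATCGC"
REVERSE = "TGGTTTCAGACGTAAATGGTG"


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in a primer."""
    seq = primer.upper()
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = primer.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), round(gc_percent(primer), 1), wallace_tm(primer))
```

Under this rule both primers give an estimate of 60 °C, which is close to the annealing temperature used in the thermal profile.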

Screening for Variation in KRTAP21-2
The PCR amplicons were screened for DNA sequence variation using SSCP analysis. For each amplicon, a 0.7-µL aliquot was added to 7 µL of loading dye containing 98% formamide, 10 mM EDTA, 0.025% bromophenol blue, and 0.025% xylene-cyanol. These were then denatured for five minutes at 95 °C. The tubes containing the amplicons and dye were then cooled in wet ice and immediately loaded into separate lanes on 16 cm × 18 cm, 14% acrylamide:bisacrylamide (37.5:1) (Bio-Rad) gels. These gels contained 3% v/v glycerol. The SSCP run entailed electrophoresis at 350 volts in Protean II xi cells (Bio-Rad) with a 0.5× TBE running buffer. The buffer temperature was maintained at 19 °C for the full 24 h of electrophoresis. After electrophoresis, the SSCP gels were silver-stained [12].

Sequencing of PCR-SSCP Variants and Sequence Analysis
If the amplicons produced PCR-SSCP banding patterns that suggested the sheep was homozygous in the amplified region, then they were sequenced in both directions at the Lincoln University (New Zealand) DNA sequencing facility. If the PCR-SSCP banding patterns were more complex and suggested the sheep was heterozygous for the amplified region, then a band corresponding to a single amplicon/variant was excised as a gel slice. This was macerated, then used as a template for re-amplification with the original primers, and the resulting amplicon was sequenced. This approach is described by Gong et al. [13]. The translation of DNA sequences to putative amino acid sequences and other sequence alignments were undertaken using DNAMAN (version 5.2.10, Lynnon BioSoft, Vaudreuil, QC, Canada).

Statistical Analyses
The data were analysed statistically using Minitab version 16 (Minitab Inc., State College, PA, USA). In the analyses, General Linear Models (GLMs) were employed to assess the effect of the presence or absence of the different KRTAP21-2 variants on variation in the measured wool traits. The approach involved starting with single-variant models, where only a single variant's presence/absence was factored into the models. Any variant that was associated with variation in a wool trait in these models at a low threshold for rejection of the null hypothesis (p < 0.2), and which could thus potentially be affecting the trait, was then factored into a second series of multi-variant models. In effect, the presence/absence of any given variant was therefore corrected for the effect of any other variants in the sheep's genotype that might be affecting the trait.
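The presence/absence coding and the p < 0.2 screening step described above can be sketched as follows. This is only an illustration of the data preparation, not the Minitab analysis itself; the function names are hypothetical, and the p-values for B and C in the usage example are made-up placeholders (those for A and E are the GFW values reported in the results).

```python
VARIANTS = ("A", "B", "C", "D", "E")


def presence_absence(genotype: str) -> dict:
    """Map a two-letter genotype such as 'AC' to 1/0 indicators per variant."""
    alleles = set(genotype.upper())
    if len(genotype) != 2 or alleles - set(VARIANTS):
        raise ValueError(f"unexpected genotype: {genotype!r}")
    return {variant: int(variant in alleles) for variant in VARIANTS}


def select_for_multivariant(single_variant_p: dict, threshold: float = 0.2) -> list:
    """Variants whose single-variant p-value is below the screening threshold."""
    return sorted(v for v, p in single_variant_p.items() if p < threshold)


print(presence_absence("AC"))
# B and C p-values below are hypothetical placeholders for illustration only.
print(select_for_multivariant({"A": 0.030, "B": 0.50, "C": 0.60, "E": 0.173}))
```

A homozygous genotype such as BB is coded the same way as a heterozygote carrying B, because the models consider only whether a variant is present, not its copy number.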
Next, GLMs were used to compare the wool traits between lambs that had different KRTAP21-2 genotypes. This was only done using genotypes with a frequency greater than 5%, so as to limit the potential introduction of bias from small groups of sheep with less common genotypes. When the GLMs indicated significant differences among the genotypes, multiple pairwise comparisons of the sheep with the different genotypes were made, with a Bonferroni correction being applied to reduce the chances of obtaining false-positive results during the multiple comparisons.
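The genotype filter and the multiple-comparison correction can be illustrated with a minimal sketch. It assumes the simplest form of the Bonferroni adjustment (each p-value multiplied by the number of comparisons and capped at 1); the exact procedure applied by the statistical package is not described in the text. The function names are illustrative, and the seven genotypes listed are those that passed the 5% filter in the results.

```python
from collections import Counter
from itertools import combinations


def common_genotypes(genotypes, min_freq=0.05):
    """Genotypes whose frequency in the sample is strictly greater than min_freq."""
    counts = Counter(genotypes)
    total = sum(counts.values())
    return sorted(g for g, n in counts.items() if n / total > min_freq)


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: p * number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Pairwise comparisons among the seven common genotypes reported in the results.
pairs = list(combinations(["AB", "AC", "BB", "BC", "BE", "CC", "CE"], 2))
print(len(pairs))
```

With seven genotypes there are 21 pairwise comparisons, so under this form of the correction an unadjusted p-value would need to be below roughly 0.0024 to remain below 0.05 after adjustment.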
In the GLMs, the sire of the lambs was found to affect all the wool traits. Accordingly, sire was included as a random explanatory factor in all the models. The gender of the lambs was found to have an effect on GFW, CFW, MSL, MFD, FDSD, MSS, MFC and PF, and hence gender was included as a fixed explanatory factor in the models for these traits. The birth rank of the lambs had an effect on MSL, and hence it was included as a fixed explanatory factor in the model for MSL.

Variation in SHEEP-KRTAP21-2
There were five PCR-SSCP banding patterns detected for the amplicon containing the ORF, with either one or a combination of two banding patterns being observed for each sheep (Figure 3). DNA sequencing of the amplicons revealed five nucleotide sequences (named A, B, C, D and E), and these sequences were deposited into GenBank with accession numbers MF143975-MF143979. Four nucleotide sequence differences were identified when comparing the five sequences (Table 1). All of these were located in the ORF, and if the ORF was functional then one of them would be a non-synonymous substitution that would result in an amino acid change (p.Val56Ile) in the putative KAP21-2 protein.
Table 1. Sequence variation identified in ovine KRTAP21-2.

Genes 2019, 10, x FOR PEER REVIEW 4 of 10

Identification of KRTAP21-2 in the Sheep Genome
A 201-bp ORF was identified on OAR1 (NC_019458.2) between nucleotides 122932727 and 122932927. Eighteen other KAP genes have also been identified in this vicinity and in order from the centromere to telomere these were KRTAP24-1, … (Figure 1). The PCR primers were designed to amplify the ORF and upon sequencing the amplicons were confirmed to be 294 bp in size. The ORF would potentially produce a HGT-KAP that was most similar at the level of amino acid sequence to ovine KAP21-1 (Figure 2), and phylogenetically related to human KAP21-2. This suggests that this ORF represents the ovine KAP21-2 gene and hence it was named 'SHEEP-KRTAP21-2', using the updated KAP/KRTAP nomenclature [14].
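The ORF length follows directly from the inclusive assembly coordinates, as the short check below shows. The function names are illustrative, and the residue count assumes the ORF ends with a single stop codon, which is not stated explicitly in the text.

```python
def orf_length(start: int, end: int) -> int:
    """Length in bp of an interval given by inclusive 1-based coordinates."""
    return end - start + 1


def residues_encoded(orf_bp: int, has_stop: bool = True) -> int:
    """Number of amino acids encoded, assuming an in-frame ORF."""
    if orf_bp % 3:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if has_stop else codons


length = orf_length(122932727, 122932927)
print(length, residues_encoded(length), 294 - length)
```

This gives 201 bp and 66 encoded residues, and implies that the 294-bp amplicon includes 93 bp of primer and flanking sequence around the ORF.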

Comparison of the Variant and Genotype Frequencies in NZ Romney and Merino × Southdown-Cross Sheep
The frequencies of the KRTAP21-2 variants in the NZ Romney sheep were A: 6.85%, B: 19.86%, C: 53.42%, and E: 19.86%, while those in the Merino × Southdown-cross sheep were A: 21.36%, B: 30.56%, C: 35.81%, D: 2.3%, and E: 9.97%. Variant C was the most common variant in the NZ Romney sheep, while in the Merino × Southdown-cross sheep both C and B were common. Variant A was common in the Merino × Southdown-cross sheep, but was rarer in the NZ Romney sheep. The frequency of E was approximately twice as high in the NZ Romney sheep as in the Merino × Southdown-cross sheep.
Twelve genotypes were distinguished in the Merino × Southdown-cross sheep, and they were as follows: AA, AB, AC, AE, BB, BC, BD, BE, CC, CD, CE and EE, while BD and CD were not found in the Romney sheep. Of the genotypes present in the Merino × Southdown-cross sheep, only seven (AB, AC, BB, BC, BE, CC and CE) occurred at a frequency over 5%, and accordingly only sheep of these genotypes were used in the genotype models.
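Variant frequencies of this kind are obtained by counting alleles across genotypes, with each sheep contributing two. The sketch below shows the calculation on a small, entirely hypothetical list of genotypes; the individual genotype counts from this study are not given in the text, so the output does not reproduce the reported frequencies.

```python
from collections import Counter


def variant_frequencies(genotypes):
    """Percentage frequency of each variant, counting two alleles per sheep."""
    alleles = Counter()
    for genotype in genotypes:
        alleles.update(genotype)  # 'AC' adds one A and one C; 'CC' adds two C
    total = sum(alleles.values())
    return {variant: 100.0 * n / total for variant, n in sorted(alleles.items())}


# Hypothetical genotypes for five sheep (illustration only).
example = ["AC", "CC", "BC", "CE", "AB"]
print(variant_frequencies(example))
```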

Effect of Variation in SHEEP-KRTAP21-2 on Wool Traits
Of the five KRTAP21-2 variants detected in Merino × Southdown-cross lambs, D occurred at a very low frequency, and so its association with wool traits was not tested. In the 'single-variant' models, the presence of E was associated with a decrease in MSL (p = 0.003), and the effect of E on MSL persisted when A was introduced into the models (p = 0.013). Variant E also showed a trend towards association with lower GFW (p = 0.173), and increased MFD (p = 0.102), FDSD (p = 0.068), MSS (p = 0.068), and PF (p = 0.055) (Table 2). The presence of A was associated with an increase in GFW (p = 0.030), but this association was lost when E was introduced into the model, although a trend for association was still evident (p = 0.078).


Effect of Common Genotypes on Wool Traits
With the seven common KRTAP21-2 genotypes (AB, AC, BB, BC, BE, CC and CE), an effect of genotype was observed for MSL (p = 0.026), FDSD (p = 0.019) and PF (p = 0.041), and a trend towards association was observed for GFW (p = 0.054) and MFD (p = 0.060) (Table 3). Sheep of genotype AC had a predicted mean MSL that was 12.5% greater than that of sheep of genotype CE. Genotype CE was associated with a higher predicted mean FDSD than CC and BC. In terms of PF, sheep of genotype CE had a predicted mean PF of 4.72%, this being more than 2.3 times higher than that of sheep with BC and CC.

Discussion
The putative SHEEP-KRTAP21-2 was located on chromosome 1 and was clustered with other KAP genes that have been reported previously. It displayed lower similarity to the previously described ovine HGT-KAP genes, but was closer in identity to a human KRTAP21-2 sequence. This gene was also located between KRTAP20-2 and KRTAP21-1, and this location is consistent with the chromosomal location of human KRTAP21-2 [10].
If transcribed and translated, the amino acid sequence that would be produced from the KRTAP21-2 sequence would contain 66 residues and nearly half of these would be glycine or tyrosine. This is consistent with the defining characteristic of the HGT-KAPs. However, this polypeptide would also contain a high level (16.67 mol%) of cysteine, which is more than has been reported for any other HGT-KAP family, although it is comparable to that reported for ovine HS-KAPs. This higher than expected level of cysteine has also been described for the other SHEEP-KAP21 family member, KAP21-1 [9]. This suggests that while KAP21-2 can be classified as a HGT-KAP, it could also be called a HS-KAP, and that functionally it may, by way of disulphide bridges formed from its cysteine residues, form cross-links with other HS-KAPs and the keratin intermediate filaments in the wool fibre. Additionally, it would suggest the current broad classification of KAPs into three groups (HS-, UHS- and HGT-KAPs) may need to be revised.
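The composition figures can be checked with a simple mol% calculation. A cysteine content of 16.67 mol% in a 66-residue polypeptide corresponds arithmetically to 11 cysteines (11/66); that count is inferred here rather than stated in the text. The sequence below is a toy placeholder with an assumed composition, not the KAP21-2 sequence, and the function name is illustrative.

```python
def mol_percent(protein: str, residues: str) -> float:
    """Percentage of positions in `protein` occupied by any of `residues`."""
    seq = protein.upper()
    return 100.0 * sum(seq.count(r) for r in residues) / len(seq)


# Toy 66-residue composition (hypothetical): 11 Cys, 20 Gly, 12 Tyr, 23 Ser.
toy = "C" * 11 + "G" * 20 + "Y" * 12 + "S" * 23

print(len(toy))
print(round(mol_percent(toy, "C"), 2))   # cysteine content
print(round(mol_percent(toy, "GY"), 2))  # combined glycine and tyrosine content
```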
Four nucleotide sequence variations were found in ovine KRTAP21-2 and these produced five unique variants. Among the five variants, C was very common in Romney sheep and D was not detected in them. In the Merino × Southdown-cross sheep both B and C were common. The differences in the frequency of the variants in these sheep breeds may suggest that this gene has been under different selection pressures and/or plays an important role in wool fibre development. Whether it reflects differences in wool fibre traits between the breeds cannot be ascertained.
While associations were detected between variation in KRTAP21-2 and MSL, MFD, FDSD, PF and GFW, the most enduring association with KRTAP21-2 variation appeared to be with variation in MSL. For this fibre trait, there was a sizeable difference in the marginal means for the common genotypes, a difference that reinforced the conclusions drawn from the variant presence/absence models.
Sheep with E had a lower MSL, and a trend for association with higher FDSD, MFD and PF. This is consistent with the correlations that have been reported between MSL and these other traits, with a moderate or close to moderate negative phenotypic correlation being reported between MSL and FDSD (−0.35), MFD (−0.27) and PF (−0.24) [15]. Wool of higher MSL and lower FDSD, MFD and PF would be more desirable in the market, which means reducing the frequency of E in sheep flocks would be in line with market demand.
Sheep with A had a higher GFW than those without A. Greasy fleece weight and CFW are reported to be strongly correlated at the phenotypic level (0.916) [15], yet variant A was not associated with variation in CFW. This suggests that A has an independent effect on GFW. The results from this study also suggest that nucleotide sequence variations in genes can have a functional effect, even if they are synonymous. Given that the only difference between variants A and B is a single synonymous nucleotide change, the difference in wool traits observed between genotypes AB and BB may be due to this variation. Although synonymous mutations do not change amino acid sequences, they may affect gene function in other ways. There is some evidence that they may change mRNA folding, translation and stability, protein folding, and miRNA-based regulation of expression [16]. This would be difficult to confirm without precise analysis of gene expression levels, and in the context of the small size and complexity of the wool follicle, that would not be an easy task.
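The distinction between synonymous and non-synonymous changes can be made concrete with a small classifier based on the standard genetic code. The codons in the usage example are hypothetical: the text reports a p.Val56Ile change and a synonymous difference between A and B, but not the actual codons involved, so GTC to ATC and GGC to GGT are used purely as illustrations.

```python
# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify(ref_codon: str, alt_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    ref = CODON_TABLE[ref_codon.upper()]
    alt = CODON_TABLE[alt_codon.upper()]
    if ref == alt:
        return "synonymous"
    if "*" in (ref, alt):
        return "stop gained or lost"
    return "non-synonymous"


print(classify("GTC", "ATC"))  # hypothetical Val -> Ile change
print(classify("GGC", "GGT"))  # hypothetical Gly -> Gly change
```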
The possibility exists that the effects observed for KRTAP21-2 may be due to its linkage to other KRTAPs on the same chromosome. Regardless of the potential for linkage, the extent of the genetic variation and its associations with MSL suggest that ovine KRTAP21-2 may have potential as a gene-marker for wool production.