Transcriptome Analysis of Long Non-Coding RNA in the Bovine Mammary Gland Following Dietary Supplementation with Linseed Oil and Safflower Oil

This study aimed to characterize the long non-coding RNA (lncRNA) expression in the bovine mammary gland and to infer their functions in dietary response to 5% linseed oil (LSO) or 5% safflower oil (SFO). Twelve cows (six per treatment) in mid lactation were fed a control diet for 28 days followed by a treatment period (control diet supplemented with 5% LSO or 5% SFO) of 28 days. Mammary gland biopsies were collected from each animal on day-14 (D-14, control period), D+7 (early treatment period) and D+28 (late treatment period) and were subjected to RNA-Sequencing and subsequent bioinformatics analyses. Functional enrichment of lncRNA was performed via potential cis regulated target genes located within 50 kb flanking regions of lncRNAs and having expression correlation of >0.7 with mRNAs. A total of 4955 lncRNAs (325 known and 4630 novel) were identified which potentially cis targeted 59 and 494 genes in LSO and SFO treatments, respectively. Enrichments of cis target genes of lncRNAs indicated potential roles of lncRNAs in immune function, nucleic acid metabolism and cell membrane organization processes as well as involvement in Notch, cAMP and TGF-β signaling pathways. Thirty-two and 21 lncRNAs were differentially expressed (DE) in LSO and SFO treatments, respectively. Six genes (KCNF1, STARD13, BCL6, NXPE2, HHIPL2 and MMD) were identified as potential cis target genes of six DE lncRNAs. In conclusion, this study has identified lncRNAs with potential roles in mammary gland functions and potential candidate genes and pathways via which lncRNAs might function in response to LSO and SFA.


Introduction
Advances in high throughput RNA sequencing technologies and computational prediction techniques have enabled the discovery of an abundant class of non-coding RNA (ncRNA) species with emerging roles in gene regulation. Among these, long non-coding RNA (lncRNA) generally considered as RNA molecules >200 nucleotides (nts) are known to participate in a diverse set of biological processes including genomic imprinting, X chromosome inactivation, cell differentiation and development, cancer metastasis, immunity, disease and ageing [1][2][3][4][5][6][7][8][9]. LncRNA mediate these processes through diverse mechanisms including acting as scaffolds, decoys or signals, regulation of gene expression in cis or trans and antisense interference or by epigenetic regulation, organization of protein complexes, cell-cell signaling, allosteric regulation of proteins as well as genome targeting [7,[10][11][12]. To date, a large number of lncRNA genes, enabled by continued developments in high-throughput sequencing methodologies, have been identified in the genomes of human (n = 96,308), mouse (n = 87,774), cow (n = 22,227), rat (n = 22,217), gorilla (n = 15,095), other animals and model organisms (http://www.bioinfo.org/noncode/analysis.php, accessed on 03 April 2018). Although the function of majority of lncRNAs are unknown, the mode of action of a few like X inactive specific transcript (XIST, functions in X chromosome inactivation, chromatin modification etc.) [7,13,14], HOX transcript antisense RNA (HOTAIR, functions in positional identity, regulate gene expression in trans and is associated with a variety of cancers) [15][16][17] and metastasis associated lung adenocarcinoma transcript 1 (MALAT1, functions in nuclear structure organization and is associated with a variety of cancers etc.) [18] are well characterized. In bovine, only a few studies have examined the occurrence of lncRNAs in muscle [19], skin [20], expressed sequence tag (EST) data [21,22], across 18 tissues [23] and in the mammary gland [24]. Although it has been predicted that bovine ncRNAs including lncRNAs are abundant, primarily intergenic and associated with regulatory genes [22], little is known about the functions of lncRNAs in the bovine genome and the lncRNA atlas of the different cell types and tissues remain to be explored. A recent study has suggested that some lncRNAs play a role in translation control of target mRNA (messenger RNA) during development of bovine early embryos [4] as well as development processes in calf gut at the early part of life [25].
Numerous studies in humans and mice have shown evidence of a role for lncRNA in mammary development and disease [26]. Pregnancy-induced non coding RNA (PINC) is the first lncRNA shown to be differentially expressed in the mammary gland of a pregnancy simulated rat model [27]. Further work showed that the expression of PINC is temporally and spatially controlled in response to developmental stimuli in vivo and loss-of-function analysis suggest roles in cell survival and regulation of cell-cycle progression in the mammary gland [28]. Zfas1 also known as ZNFX1 antisense RNA 1 is a lncRNA localized in the ducts and alveoli of the mammary gland whose expression is differentially regulated during different stages of pregnancy, lactation and involution [29]. Furthermore, knockdown of Zfas1 in a mammary epithelial cell line (HC11 cells) promoted increased cellular proliferation and differentiation and thus is a key player in the regulation of mammary alveolar development and epithelial cell differentiation [29]. Unlike lncRNA, more efforts have been directed at characterizing microRNA (miRNA, another class of ncRNA) expression and potential regulatory roles in the bovine mammary gland [30][31][32][33][34][35][36][37]. However, the occurrence and roles of lncRNAs in the bovine mammary gland is largely unknown and remain to be explored. Recently, Tong et al. [24] identified 184 lncRNAs (intergenic) in the bovine mammary gland including 36 lincRNAs co-located with 172 milk related quantitative trait loci (QTL) and one lncRNA co-located within a mastitis QTL region. Moreover, lncRNAs have been shown to play roles in dietary response in different species including human [38][39][40], mouse [41], pig [42] and calf [43]. LncRNA roles in dietary responses might be through various processes such as metabolic control [40], glucose homeostasis [40] or hypoxia-mediated metastasis [44]. Recently, Weikard et al. [43] identified 270 differentially expressed lncRNAs in the jejunum mucosa of calves fed two different milk diets and suggested that the lncRNAs might function by modulating biological processes related to energy metabolism pathways and cellular signaling processes influencing the intestinal epithelial cell barrier function.
It is well documented that bovine milk fat contains isomers (e.g., conjugated linoleic acid (CLA)) that positively influence human health [45,46]. Furthermore, bovine milk fat can be modified to increase the contents of its beneficial components [46]. Particularly, many studies have shown that unsaturated fatty acids enriched dietary supplementation with plant oils (e.g., linseed oil, corn oil, canola oil, safflower oil) and fish oil significantly increased the concentrations of milk beneficial fatty acids such as CLA [47][48][49][50][51]. Previously, we identified numerous differentially expressed genes and miRNAs in mammary gland tissues of cows following dietary supplementation with unsaturated fatty acids enriched diets [32,52,53]. The functions of lncRNAs in this dietary response are not known. In order to shed more light on lncRNA occurrence in the bovine genome, we characterized the lncRNA expression in the bovine mammary gland and examined its expression pattern in response to diets In order to shed more light on lncRNA occurrence in the bovine genome, we characterized the lncRNA expression in the bovine mammary gland and examined its expression pattern in response to diets rich in unsaturated fatty acids. Moreover, we also performed lncRNA function enrichment via their potential cis regulated target genes.

Expressed LncRNAs in the Bovine Mammary Gland
One hundred base pairs paired-end RNA sequencing of 36 libraries generated a total of 1.2 billion reads. About 87.2% of reads mapped to unique/multiple positions on the bovine genome UMD3.1 built. Of these, 96.5% mapped to unique positions and were further processed while reads that mapped to multiple positions (3.2%) and discordant alignments (0.38%) were discarded. A total of 27,967 potential transcripts were identified. Since lncRNA expression is generally low as compared to mRNA, only lncRNAs with DESeq2 normalized counts 5 and present in at least 10% of our libraries were considered as truly expressed and also used in DE analysis. Consequently, 72.29% (20,218) of potential lncRNA transcripts failed this screening step and were not further considered.
A total of 4955 lncRNA genes (7749 lncRNA transcripts) equivalent to 325 known and 4630 novel lncRNA genes were identified (Supplementary file 1). Using FPKM (fragments per kilo base of transcript per million mapped reads) normalization, 13 novel and 15 known lncRNAs were highly expressed (0.55 to 11.56 FPKM values for novel or 0.21 to 11.93 FPKM values for known lncRNAs) in the bovine mammary gland ( Figure 1A).  FPKM values ranged from 0.55 to 11.56 or 0.21 to 11.93 for novel and known highly expressed lncRNAs, respectively. (B) Intuitive map of lncRNA distribution across bovine chromosomes (outermost circle, different colors). The inner circle (blue lines) represents novel lncRNAs and the innermost circle (red lines) represents known lncRNAs. The height of the line is proportional to the expression level (FPKM) and only those with FPKM > 0.02 are shown.
Expression level is a feature that distinguishes lncRNAs from mRNAs. Using FPKM normalization, we showed that the mean expression level of mRNA transcripts from the same data was 3.6 as compared to 0.30 for lncRNA ( Figure 2A).

Characteristics of Expressed LncRNAs
LncRNAs are generally regarded as RNA molecules >200 nts. The length distribution of identified lncRNA transcripts ranged from 200 to over 10,000 nts ( Figure 2B, Supplementary file 3a). The majority (45.11%) were between 200 and 999 nts followed by 1000 to 2499 nts (37.54%) while 17.34% were ≥2500 nts long. One known lncRNA was however <200 nts long. Compared with mRNA transcripts from the same data [53], transcript length of majority of mRNA was between 500 and 7000 nts ( Figure 2B).
The genomic location of a lncRNA is important as it may give clues to its functions. Thus, identified lncRNAs were classified according to their genomic location and expression direction into 11 classes (Supplementary file 3c). As expected, 62.38% of lncRNA transcripts were intergenic and located at >1 kb (kilo base pairs) away from the nearest gene. This was followed by an appreciable number (23.86%) of transcripts located in such a way that one or more of their exons overlapped with the exons of protein coding genes. LncRNA transcripts located within a one kb region upstream of protein coding genes and transcribed in the same or opposite direction constituted 8.74%. In comparison, fewer lncRNAs (1.54%) were located within one kb downstream of protein coding genes.

Characteristics of Expressed LncRNAs
LncRNAs are generally regarded as RNA molecules >200 nts. The length distribution of identified lncRNA transcripts ranged from 200 to over 10,000 nts ( Figure 2B, Supplementary file 3a). The majority (45.11%) were between 200 and 999 nts followed by 1000 to 2499 nts (37.54%) while 17.34% were ≥2500 nts long. One known lncRNA was however <200 nts long. Compared with mRNA transcripts from the same data [53], transcript length of majority of mRNA was between 500 and 7000 nts ( Figure 2B).
The genomic location of a lncRNA is important as it may give clues to its functions. Thus, identified lncRNAs were classified according to their genomic location and expression direction into 11 classes (Supplementary file 3c). As expected, 62.38% of lncRNA transcripts were intergenic and located at >1 kb (kilo base pairs) away from the nearest gene. This was followed by an appreciable number (23.86%) of transcripts located in such a way that one or more of their exons overlapped with the exons of protein coding genes. LncRNA transcripts located within a one kb region upstream of protein coding genes and transcribed in the same or opposite direction constituted 8.74%. In comparison, fewer lncRNAs (1.54%) were located within one kb downstream of protein coding genes. LncRNAs located in the introns of genes were very few (2.17%), as well as lncRNAs with intron containing mRNAs (1.32%).

Function Enrichment via Potential cis Target Genes of lncRNAs
Correlation analysis of lncRNA and mRNA expression identified 59 and 494 potential cis target genes (mRNAs) for lncRNAs in LSO and SFO treatments, respectively (Supplementary file 4). Among them, 38 genes were common to both treatments. A total of 67 (49 biological process gene ontology (GO) terms, 9 cellular components GO terms and 9 molecular functions GO terms) and 15 (12 biological process GO terms, 2 cellular components GO terms and 1 molecular functions GO term) were enriched for cis target genes of lncRNAs in SFO and LSO treatments, respectively (Tables 1 and 2 and Supplementary file 5). The most enriched GO terms were GO:1904375 (regulation of protein localization to cell periphery, p = 3.6 × 10 −4 ) for LSO and GO:0048294(negative regulation of isotype switching to IgE isotypes, p = 2.6 × 10 −3 ) for SFO. Moreover, 2 and 11 KEGG pathways were also enriched for LSO and SFO cis target genes at uncorrected p-value < 0.05, respectively and SNARE interactions in vesicular transport pathway was common to both treatments ( Figure 3). The SNARE interaction in vesicular transport pathway was also the most significantly enriched pathway for both LSO and SFO cis target genes ( Figure 3).

Reversed Transcribed PCR (RT-PCR) Verification of the Detection of lncRNA and Real Time Quantitative PCR (qPCR) Verification of the Expression Level of lncRNA
Using RT-PCR, we verified the presence of four lncRNAs (XLOC_003855, XLOC_053295 (NONBTAT026075.2), XLOC_014422 and XLOC_049508) in three different samples (Supplementary file 6). RT-PCR products were of expected sizes (Supplementary file 6), thus confirming RNA-Seq results of lncRNA detection. Moreover, we verified the expression levels of two lncRNAs (XLOC_049508 and XLOC_040628) by real time qPCR ( Figure 5). XLOC_049508 and XLOC_040628 were both expressed at >4 fold change, compared to >2 fold change by RNA-seq, thus confirming RNA-seq results.

Reversed Transcribed PCR (RT-PCR) Verification of the Detection of lncRNA and Real Time Quantitative PCR (qPCR) Verification of the Expression Level of lncRNA
Using RT-PCR, we verified the presence of four lncRNAs (XLOC_003855, XLOC_053295 (NONBTAT026075.2), XLOC_014422 and XLOC_049508) in three different samples (Supplementary file 6). RT-PCR products were of expected sizes (Supplementary file 6), thus confirming RNA-Seq results of lncRNA detection. Moreover, we verified the expression levels of two lncRNAs (XLOC_049508 and XLOC_040628) by real time qPCR ( Figure 5). XLOC_049508 and XLOC_040628 were both expressed at >4 fold change, compared to >2 fold change by RNA-seq, thus confirming RNA-seq results.

Discussion
Previously, we showed a reduction in milk fat yield of 30.38% and 32.42% in response to 5% LSO and 5% SFO, respectively, accompanied by increased concentrations of some monounsaturated and polyunsaturated fatty acids in milk, differential regulation of genes with roles in lipid synthesis/metabolism [53], differential miRNA expression [32] and co-expression network of miRNAs [52]. In the present study, we have characterized the lncRNA repertoire of the bovine mammary gland in response to LSO and SFO.
A total of 325 known and 4630 novel lncRNAs were identified in this study. Identified lncRNAs were generally less expressed and of smaller sizes compared to mRNA transcripts. Studies on the annotation of human lncRNAs have reported lower expression, smaller size and fewer exons for lncRNAs as compared to mRNAs [54,55] thus supporting our observations. The transcript number per lncRNA gene as compared to mRNA in this study followed the same pattern reported earlier for human [55]. Majority of identified lncRNA transcripts in this study are located in the intergenic regions of protein coding genes (Table S3e). This observation is consistent with previous studies that have reported that lncRNAs are principally located in the intergenic region of genes while a lesser percentage overlap protein coding genes [22,54,55]. Qu and Adelson [22] noted that 67.4% of intergenic bovine ncRNAs had a neighbor gene within 20 kb, with significant number within 5 kb flanking regions of genes. Studies have suggested/demonstrated that lncRNAs may act in cis or trans to regulate the activities of neighboring genes [56][57][58][59][60][61]. It has been shown that functional clustering of neighbor genes within 5 kb of intergenic ncRNAs resulted in over-representation of regulatory genes [22]. The expression of intergenic lncRNAs was reported to be highly correlated with the expression of neighboring genes within 10 kb [54]. It should be noted that co-expression of lncRNA and mRNA could be due to a true cis effect of the lncRNA on the mRNA or due to nearby transcriptional activity of surrounding open chromatin [54,62]. Some of the highly expressed lncRNAs identified in this study (13 novel and 15 known) have been detected in bovine tissues, skin and EST data from many developmental stages [20][21][22]. Given that lncRNAs are generally less expressed, the relative high expression levels of the 28 lncRNAs suggest potential roles in the bovine mammary gland. However, validation of their functional significance in the bovine mammary gland merits further investigations. Since it is known that lncRNAs may regulate in cis or trans the expression of protein coding genes [56][57][58][59][60][61] and since the functions of most bovine lncRNAs are still unknown, we predicted the potential functions of detected lncRNAs via correlated cis located mRNAs in the transcriptome data from the same animals. Various GO terms for the potential cis target genes of lncRNAs were enriched in different processes (Tables 1  and 2) which might reflect diverse functions of lncRNAs in the bovine mammary gland. The most enriched GO term for LSO (GO:1904375-regulation of protein localization to cell periphery) does not appear to have a direct functional link with mammary lipid synthesis but it might be important for tissue functioning by modulating the frequency, rate or extent of protein localization to cell

Discussion
Previously, we showed a reduction in milk fat yield of 30.38% and 32.42% in response to 5% LSO and 5% SFO, respectively, accompanied by increased concentrations of some monounsaturated and polyunsaturated fatty acids in milk, differential regulation of genes with roles in lipid synthesis/metabolism [53], differential miRNA expression [32] and co-expression network of miRNAs [52]. In the present study, we have characterized the lncRNA repertoire of the bovine mammary gland in response to LSO and SFO.
A total of 325 known and 4630 novel lncRNAs were identified in this study. Identified lncRNAs were generally less expressed and of smaller sizes compared to mRNA transcripts. Studies on the annotation of human lncRNAs have reported lower expression, smaller size and fewer exons for lncRNAs as compared to mRNAs [54,55] thus supporting our observations. The transcript number per lncRNA gene as compared to mRNA in this study followed the same pattern reported earlier for human [55]. Majority of identified lncRNA transcripts in this study are located in the intergenic regions of protein coding genes (Table S3e). This observation is consistent with previous studies that have reported that lncRNAs are principally located in the intergenic region of genes while a lesser percentage overlap protein coding genes [22,54,55]. Qu and Adelson [22] noted that 67.4% of intergenic bovine ncRNAs had a neighbor gene within 20 kb, with significant number within 5 kb flanking regions of genes. Studies have suggested/demonstrated that lncRNAs may act in cis or trans to regulate the activities of neighboring genes [56][57][58][59][60][61]. It has been shown that functional clustering of neighbor genes within 5 kb of intergenic ncRNAs resulted in over-representation of regulatory genes [22]. The expression of intergenic lncRNAs was reported to be highly correlated with the expression of neighboring genes within 10 kb [54]. It should be noted that co-expression of lncRNA and mRNA could be due to a true cis effect of the lncRNA on the mRNA or due to nearby transcriptional activity of surrounding open chromatin [54,62]. Some of the highly expressed lncRNAs identified in this study (13 novel and 15 known) have been detected in bovine tissues, skin and EST data from many developmental stages [20][21][22]. Given that lncRNAs are generally less expressed, the relative high expression levels of the 28 lncRNAs suggest potential roles in the bovine mammary gland. However, validation of their functional significance in the bovine mammary gland merits further investigations. Since it is known that lncRNAs may regulate in cis or trans the expression of protein coding genes [56][57][58][59][60][61] and since the functions of most bovine lncRNAs are still unknown, we predicted the potential functions of detected lncRNAs via correlated cis located mRNAs in the transcriptome data from the same animals. Various GO terms for the potential cis target genes of lncRNAs were enriched in different processes (Tables 1 and 2) which might reflect diverse functions of lncRNAs in the bovine mammary gland. The most enriched GO term for LSO (GO:1904375-regulation of protein localization to cell periphery) does not appear to have a direct functional link with mammary lipid synthesis but it might be important for tissue functioning by modulating the frequency, rate or extent of protein localization to cell periphery. In the SFO treatment, the most enriched term (GO:0048294-negative regulation of isotype switching to IgE isotypes) as well as other enriched GO terms (GO:0002829, GO:0045623 and GO:0045829) showed involvement in immune regulation. The functions of lncRNAs in immunity are well documented [63,64]. Recently, enrichment results by Tong et al. [24] suggest that lncRNAs might play roles in the regulation of immune genes and potentially contribute to disease resistance, such as mastitis in cows. Yang et al. [65] reported involvement of lncRNA H19 in TGF-β1-induced epithelial to mesenchymal transition in bovine epithelial cells and suggested its potential role in immunity and bovine mastitis. In another experiment, Ma et al. [66] reported many lncRNAs that were DE during bovine viral diarrhea virus infection with potential roles in immune functions. As expected, lncRNA target genes were significantly enriched for biological process GO terms involved in regulation of RNA processing (GO:0006396 and GO:1902679) as well as DNA recombination (GO:0045910, GO:0000018). In fact, to perform their functions, lncRNAs might bind to their target genes [67], therefore it is not surprising that the nucleic acids regulation GO terms were enriched. A notable KEGG pathway enriched for LSO treatment was Notch signaling pathway. Notch signaling pathway is important in mammary gland development [68,69]. Previously, we reported that Notch signaling pathway was enriched for target genes of miRNAs in the regulation of milk yield and component traits [70]. It is not clear which specific lncRNAs could be regulating this pathway or how they are involved in the regulation of mammary gland functions. However, the lncRNA HOTAIR has been reported to target the Notch signaling pathway in cervical cancer cells [71]. The SNARE interaction in vesicular transport pathway was significantly enriched for cis target genes of lncRNAs in both LSO and SFO treatments. This pathway is important for mediating intracellular destination of transport vesicles [72] as well as membrane fusion [73,74] but it is not clear what role it plays in the regulation of mammary gland functions. Other notable pathways enriched for SFO lncRNA cis target genes were cAMP and TGF-β signaling pathways. cAMP was recently identified as an enriched pathway for lncRNA target genes in the bovine mammary gland [24] while TGF-β signaling pathway, known to have important immune functions, was reported as an important pathway for lactation persistency [75] as well as an enriched pathway for target genes of DE miRNAs during a lactation curve [76].
Differential gene expression results showed that nutrients rich in unsaturated fatty acids had an effect on lncRNA expression. A comparison of DE lncRNAs between LSO and SFO treatments indicated that more lncRNAs were DE by LSO as compared to SFO and in particular, no lncRNA was DE after one week of SFO supplementation (D-14 vs. D+7). This is similar to our previous observation on mRNA transcriptome of the same data that showed a greater impact of LSO over SFO on gene expression [53]. Also, the mRNA transcriptome data indicated involvement of LSO and SFO DE genes in similar (molecular transport, small molecule biochemistry, lipid metabolism) and different (LSO: cell death and survival, protein synthesis, cellular growth and proliferation and amino acid metabolism; SFO: energy production, cellular movement, cell cycle and carbohydrate metabolism) functions and pathways, which could be due to the different degree of unsaturation of the main fatty acids in LSO and SFO [53]. LSO is rich in α-linolenic acid (3 double bonds in their structure) while SFO is rich in linoleic acid (2 double bonds), which resulted in different intermediates of biohydrogenation in the rumen thus affecting differently the pathways of lipid metabolism and other functions. It is known that the profile of ruminal biohydrogenation intermediates are influenced by the type of diet [77,78] and that pathways related to lipid metabolism have been significantly changed due to diet supplementation [79]. Thus, the observed differential expression of lncRNAs might reflect the change in their functions in response to the type of diet supplement (LSO or SFO). To the best of our knowledge, there are no studies related to lncRNAs expression/function in response to lipid supplements in the mammary gland, so further studies are needed in this area. Moreover, some DE lncRNAs in this study have been previously characterized in bovine [20,22]. These results and our observation suggest regulatory roles of lncRNA in many biological processes including mammary gland functions. Moreover, we also identified six potential cis target genes (KCNF1, STARD13, BCL6, NXPE2, HHIPL2 and MMD) for DE lncRNAs (Table 5). These genes are involved in lipid metabolism (STARD13), molecular transport (KCNF1), immune processes/disease (MMD and BCL6) and in epigenetic processes (STARD13). STARD13 encodes for a member of StAR-related lipid transfer (START) proteins which play important roles in the regulation of intracellular lipid metabolism [80]. MiRNA-125b was shown to induce metastasis in MCF-7 and MDA-MB-231 breast cancer cells through targeting of STARD13 [81]. The monocyte to macrophage differentiation associated (MMD) gene showed the highest level of correlation (p-value = 6.54 × 10 −9 ) with a lncRNA (XLOC_020830) in SFO treatment. Roles for MMD in the positive regulation of ERK and Akt activation and TNF-α and nitric oxide production in macrophages have been demonstrated [82].
It should be noted that, transcripts of the main proteins (CSN1S1, CSN1S2, CSN2, CSN3, LGB and LALBA and GLYCAM1) in milk constituted 79.45% of the read counts in mammary tissue transcriptome [53] which could impede detection of lowly expressed transcripts. Therefore, a higher sequence read count per sample or depletion of the transcripts of these main proteins might be required to better characterize a class of lowly expressed genes like lncRNAs in mammary tissue. As with many differential gene expression studies, the number of DE genes detected relies on the choice of methodologies (data filtering, read count normalization and comparison between different groups) and selection of methods for correction of multiple testing and threshold for declaration of significant p-values. In this study, we chose the Benjamini and Hochberg [83] moderate conservative method for multiple testing which is widely used in the field to avoid losing important DE genes as observed with more conservative methods like Bonferroni correction. It is well documented that the choice of database for enrichment analyses and the methods to test enriched terms also influence results obtained [84,85]. In this study, a hypergeometric test was applied for testing of GO term enrichment using ClueGO [86] platform. This approach has been widely used in the literature and also in our previous study [70]. The potential functions of identified lncRNAs were predicted through inference of the correlation of lncRNA and mRNA expression. However, it is important to note that an observed correlation does not necessarily mean causal relationship. The cis target genes predicted based on expression correlation needs to be experimentally functionally verified to confirm their functions.

Experimental Animals and Tissue Sampling
Animal care, management and use procedures were according to the national codes of practice for the care and handling of farm animals (http://www.nfacc.ca/codes-of-practice) and approved by the Animal Care and Ethics Committee of Agriculture and Agri-Food Canada (CPA #402, 04 April, 2012).
The experiment was conducted at the dairy barn of the Sherbrooke Research and Development Centre of Agriculture and Agri-Food Canada. Procedures for animal management and sampling have been reported in our companion papers on the same animals [32,52,53]. In summary, twelve high producing (35 ± 10 kg milk/day) Canadian Holstein cows in mid-lactation (150 ± 50 days in milk) were separated based on parity and days in milk and randomly allocated to one of two treatments: (1) linseed oil treatment (LSO) six cows fed a control diet composed of a total mixed ration of corn and grass silages (50:50) and concentrates supplemented with 5% LSO (on dry matter (DM) bases) and (2) safflower oil treatment (SFO) six cows fed the control diet supplemented with 5% SFO (DM) for 28 days. The treatment period (D+1 to D+28) was preceded by a control period (D-28 to D-1) of 28 days during which time all the animals were on the control diet. The composition of experimental diets is listed in Supplementary file 7. Mammary gland biopsies were harvested from all the animals at three different times during the experimental periods: D-14 (control period), D+7 (7th day after onset of treatment, early treatment period) and D+28 (28th day of treatment, late treatment period) according to an established protocol [87]. Milk samples were collected weekly for the measurement of fat content and fatty acid profiles and the results have been reported [53].

RNA Isolation and Sequencing
Total RNA was isolated from 50 mg/biopsy sample with miRNeasy Kit (Qiagen Inc., Toronto, ON, Canada) according to manufacturer's protocol. Purified RNA was DNase digested with Turbo DNase Kit (Ambion Inc., Foster City, CA, USA) to eliminate DNA contamination. RNA concentration was measured with Nanodrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, DE, USA). The RNA 6000 Nano Labchip Kit (Agilent Technologies, Santa Clara, CA, USA) was used to assess the quality of RNA on an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). The RNA integrity number of all samples was high and ranged from 7.99 to 9.5. Thirty

RNA-Sequence Read Alignment and Identification of lncRNA
RNA-Seq reads from each sample (total of 36) were trimmed using trimmomatic software v0.32 to keep reads longer than 32 bp with a minimum phred score of 30 and to remove adaptor sequences. Reads were then aligned to the bovine genome (UMD3.1) [88] with Tophat (v2.0.11) [89] using default parameters. Uniquely mapped and properly paired reads were assembled with Cufflinks (v2.1.1) [90] and using Ensembl bovine gene annotation release 79. Assembled transcripts from all samples were merged into one using Cuffmerge (Cufflinks v2.1.1) to generate a unique set of all transcripts. Transcripts with a length <200 nt were removed and remaining transcripts compared with Ensembl bovine gene annotation (release 79) to remove transcripts overlapping with known protein coding and noncoding genes (mRNA, tRNA, rRNA, snRNA, snoRNA, miRNA) using Cuffcompare. mRNA transcripts were retained as a separate data set for use in comparing lncRNA expression pattern. Transcripts with class code "i" (an exon falling into an intron of reference transcript), "o" (generic exonic overlap with reference transcripts), "u" (intergenic transcript) and "x" (exonic overlap with reference transcript on the opposite strand) were retained. Retained transcripts were evaluated for their coding potentials using Coding-Non-Coding Index (CNCI) program [91]. CNCI is effective for distinguishing protein-coding and non-coding nucleotide sequences by profiling adjoining nucleotide triplets. Those transcripts assigned with a negative CNCI score were classified as candidate non-coding transcripts. The coding potential of candidate non-coding transcripts was further assessed with Coding Potential Assessment Tool (CPAT) [92]. CPAT was trained with available bovine known protein-coding transcripts from Ensembl bioMart and bovine non-coding sequences (NONCODE2016) [93] to build a logistic regression model. The resulting CPAT coding probability score for the transcripts ranges between 0 and 1 with a higher score indicating a higher coding potential. We chose a cut-off value of 0.4 for determining protein coding probability.
The remaining transcripts were then blasted against the Swiss-prot database to remove those with a hit (e value < 1 × 10 −5 ) using usearch [94]. Retained transcripts were compared with known bovine lncRNA annotation from NONCODE2016 database [93,95]. Those transcripts with class codes of "=" (complete match with reference transcript), "c" (contained in reference transcript) and "j" (novel isoform of reference transcript) were classified as known bovine lncRNA whereas the rest were classified as novel lncRNA. The identified lncRNA were further classified into 11 classes with the reference of Ensembl bovine protein coding gene annotation.

Gene Ontology and Pathways Enrichment for lncRNA cis Target Genes
Since lncRNAs can cis regulate mRNAs [56][57][58][59][60][61], we performed enrichments for lncRNA cis regulatory functions by using mRNA transcriptome data obtained from the same animals [53]. For each lncRNA, Pearson correlation of its expression value with that of each mRNA was calculated. The closest coding genes within 50 kb upstream and downstream of lncRNAs were mined using BEDTools v2.25.0 program [96]. The genes were considered potential cis target genes of lncRNAs if in addition to their location (within a 50 kb window upstream or downstream of lncRNAs) they had a Pearson correlation of >0.7 with lncRNAs.
These cis target genes were submitted to the ClueGo plugin [86] in Cytoscape [97] for GO term and KEGG pathways enrichment analysis. Enriched pathways and GO terms were tested using a hypergeometric test which estimates enrichment by evaluating the overlap between genes in a given gene set (input gene list) followed by annotating genes to a GO term or pathway. The null hypothesis was 'the annotated GO term or pathway was irrelevant to the input list'. The p-value measures the significance of enrichment derived from the tail probability of observing numbers of DE genes annotated to the GO term or pathway. Enriched GO terms were declared significant at Benjamini and Hochberg [83] adjusted p-value ≤ 0.05 while a lower threshold at uncorrected p-value < 0.05 were considered significant for KEGG pathways enrichment.

LncRNA Expression and Differential Gene Expression Analysis
The expression of identified lncRNAs (known and novel) was quantified in each sample using HTSeq-count (version 0.6.1p1) with default settings (-s reverse). The raw read counts of retained transcripts of all libraries were then imported into DESeq2 [98] to identify differentially expressed lncRNAs. DESeq2 calculates a size factor for each sample to correct for library size and RNA composition bias. Those lncRNAs with DESeq2 normalized counts ≥5 in at least 10% of our libraries were considered as truly expressed. Significantly differentially expressed lncRNAs were defined as having a Benjamini and Hochberg adjusted p-value < 0.1. The expression level of each lncRNA was determined as FPKM. To further illustrate the functions of lncRNA in the nutrient effects on mammary gland, the same procedure for enrichments using Clue GO was applied for cis target genes of lncRNAs DE by treatments.

Reversed Transcribed (RT)-PCR
Reversed transcribed-PCR was performed to verify the presence of lncRNAs detected by RNA sequencing. Primers for four randomly selected lncRNAs (XLOC_003855, XLOC_053295 (NONBTAT026075.2), XLOC_014422 and XLOC_049508) were designed using Integrated DNA Technologies Assay tool (https://www.idtdna.com/scitools/Applications/RealTimePCR/). The gene-specific primers used for detecting lncRNAs are shown in Supplementary file 6. Reverse transcription was performed with SuperScript™ II Reverse Transcriptase (Life Technologies Inc., Carlsbad, CA, USA), using 500 ng of the same total RNA used in RNA sequencing. cDNA templates were amplified in three different samples by PCR using Crimson Taq DNA polymerase (New England BioLabs, Whitby, ON, Canada). All PCR reactions were performed using the Veriti 96 well thermal cycler (Applied Biosystems, Foster City, CA, USA). An initial PCR gradient was done to determine the best annealing temperature for each primer pair. Thermal cycling condition was composed of an initial denaturation at 95 • C for 4 min followed by 45 cycles of 30 s denaturation at 95 • C, 1 min annealing at 52 • C and elongation at 72 • C for 30 s. The final extension step was done at 72 • C for 5 min. The PCR products (~300-600 bp) were run on 1.5% agarose gel and visualized with Fusion FX (Birch House, Brambleside, Uckfield, UK). A 100bp ladder was run alongside the samples.

Real-Time qPCR Verification of lncRNA Expression
Validation of the RNA-seq expression levels of two randomly selected DE lncRNAs was done using real-time quantitative PCR. Reverse transcription was performed with the SuperScript™ III Reverse Transcriptase (Life Technologies), using aliquots (1 µg) of the same total RNA used in RNA-seq. The cDNA samples were diluted to 20 ng/µL. Transcript-specific primers were designed using Integrated DNA Technologies RealTime qPCR Assay tool (https://www.idtdna.com/scitools/ Applications/RealTimePCR/) (Supplementary file 8). Real-time PCR reaction mix was composed of 5 µL Power SYBR ® Green PCR Master Mix (Life Technologies Inc., Burlington, ON, Canada), 3 µL cDNA, 300 nM of each forward and reverse primers and 0.1 U AmpErase ® Uracil N-Glycosylase (UNG, Life Technologies, Carlsbad, CA, USA). QPCR reactions were performed using the StepOnePlus™ Real-Time PCR System (Life Technologies). The thermal cycling conditions were composed of a step for UNG treatment at 25 • C for 5 min followed by an initial denaturation/activation step at 95 • C for 10 min, 45 cycles at 95 • C for 30 s, 60 • C for 60 s. The experiments were carried out in triplicate for each data point. The relative quantification of gene expression was determined using the 2 −∆∆Ct method [99]. The fold change in gene expression was obtained following normalization to two reference genes, RPS15 and GAPDH. The stability of these reference genes have been previously tested [53].

Conclusions
A total of 4955 lncRNAs (325 known and 4630 novel) were identified including 32 (11 up-regulated and 21 down-regulated) and 21 (4 up-regulated and 17 down-regulated) lncRNAs differentially expressed in LSO and SFO treatments, respectively. The impact of LSO on lncRNA expression was early and also more potent as compared to SFO. GO and pathway analyses of lncRNA cis target genes suggest regulatory roles for lncRNAs in mammary gland functions, immune functions and metabolism/regulation of nucleic acid processes in the mammary gland. Furthermore, lncRNAs DE by LSO or SFO suggest potential regulatory roles in mammary lipid metabolism and synthesis of lipid/fatty acid. The identified lncRNAs will further enrich the catalogue of bovine lncRNAs and will contribute in the understanding of mammary gland functions and biology.

Conflicts of Interest:
The authors declare no conflict of interest.
Availability of supporting data: The sequence data has been submitted to Gene Expression Omnibus database (BioProject ID: PRJNA301777) and is available through this link: http://www.ncbi.nlm.nih.gov/bioproject/ 301777.