Identification of key genes and pathways associated with feed efficiency of native chickens based on transcriptome data via bioinformatics analysis

Background Improving feed efficiency is one of the important breeding targets for poultry industry. The aim of current study was to investigate the breast muscle transcriptome data of native chickens divergent for feed efficiency. Residual feed intake (RFI) value was calculated for 1008 closely related chickens. The 5 most efficient (LRFI) and 5 least efficient (HRFI) birds were selected for further analysis. Transcriptomic data were generated from breast muscle collected post-slaughter. Results The differently expressed genes (DEGs) analysis showed that 24 and 325 known genes were significantly up- and down-regulated in LRFI birds. An enrichment analysis of DEGs showed that the genes and pathways related to inflammatory response and immune response were up-regulated in HRFI chickens. Moreover, Gene Set Enrichment Analysis (GSEA) was also employed, which indicated that LRFI chickens increased expression of genes related to mitochondrial function. Furthermore, protein network interaction and function analyses revealed ND2, ND4, CYTB, RAC2, VCAM1, CTSS and TLR4 were key genes for feed efficiency. And the ‘phagosome’, ‘cell adhesion molecules (CAMs)’, ‘citrate cycle (TCA cycle)’ and ‘oxidative phosphorylation’ were key pathways contributing to the difference in feed efficiency. Conclusions In summary, a series of key genes and pathways were identified via bioinformatics analysis. These key genes may influence feed efficiency through deep involvement in ROS production and inflammatory response. Our results suggested that LRFI chickens may synthesize ATP more efficiently and control reactive oxygen species (ROS) production more strictly by enhancing the mitochondrial function in skeletal muscle compared with HRFI chickens. These findings provide some clues for understanding the molecular mechanism of feed efficiency in birds and will be a useful reference data for native chicken breeding.


Background
Feed cost, account for 60-70% of the total cost of the modern poultry industry, has become one of the most important factors restricting the development of the poultry industry. A strategy to improve feed efficiency is a high priority for the poultry industry to reduce feed costs and nitrogen excretion [1]. Residual feed intake (RFI) has become a sensitive and accurate indicator of feed efficiency. RFI, first proposed by Koch et al. [2], is defined as the feed consumption above or below what is predicted for growth and maintenance. Herein, birds with low level RFI means consume less feed than predicted and are identified as efficient birds. For the last decades, RFI is being used to measure feed efficiency traits, which has successfully applied to the artificial selection of feed efficiency in mammal [3] and poultry [4]. Besides, RFI is a trait independent of other production traits, and the heritability of RFI is between 0.23 and 0.49 in chickens, these characteristics of RFI make it can be easily incorporated into the multi-trait selection indexes of commercial breeding companies [5]. From primary breeders to commercial growers, the selection of feed efficiency needs to be specifically considered by all industry practitioners to maximize returns. However, in fact, RFI is indeed rare in commercial production systems, mainly because of the complexity of RFI measurement [6]. In order to further expand the application prospect of RFI, it is urgent to study and explore the biological basis of chicken RFI.
RFI is a complex quantitative trait that is known to be associated with many biological factors. High throughput sequencing technology including RNA sequencing (RNA-seq) has become a mature and efficient tool for deeper understanding the underlying molecular mechanism of complex trait such as RFI [7]. An earlier duodenal transcriptomic analysis in chickens showed that the difference in RFI may be related to digestibility, metabolism and biosynthesis processes as well as energy homeostasis [8]. Moreover, A previous high throughput sequencing analysis indicates that mitochondrial energy metabolism in skeletal muscle plays a key role in the regulation of feed efficiency [9,10]. Skeletal muscle plays a particularly important role in the utilization and storage of carbohydrates and lipids from feed [11]. In chickens, the breast muscle is the main skeletal muscle. Many studies have reported that feed efficiency is associated with mitochondria function, breast muscle growth, and some physiological changes of breast muscle tissue in chickens [10,12,13].
Statistically, RNA-seq has been widely used for indeep analysis and a better understanding of the molecular basis of feed efficiency in chickens. To further interpret RNA-seq data, functional enrichment analysis is extensively used to derive biological insight.
Traditionally, differentially expressed genes (DEGs) from transcriptome data were first identified, and then the biological interpretation of DEGs was assisted by computational functional analysis based on accumulated biological knowledge. Finally, the biological function analysis of DEGs is based on a list of preselected 'interesting' genes [14]. However, traditional practices in transcriptomic data analysis can only account for DEGs selected by arbitrary cutoffs, and this method may also be limiting insight by prioritizing highly differentially expressed and 'interesting' genes over those genes that undergo moderate fold-changes [15]. Gene Set Enrichment Analysis (GSEA) is a computational method used to determine whether a particular gene expression pattern is significantly different between two groups of samples [16]. GSEA is reviewed as a cutoff-free strategy, which ranks all expressed genes according to the strength of expression difference. Compared with biological function analysis of DEGs, GSEA method avoids choosing arbitrary cutoffs and can accumulate subtle expression changes in the same group of gene set for studying functional enrichment between two biological groups [17]. In the current study, transcriptome data were analyzed with DEGs function analysis and GSEA method in order to obtain comprehensive biological insight of differences between RFI groups.
Wannan Yellow chicken was selected as experiment material. It is a famous native chicken breed and popular in the southeast of China for its delicious meat and unique flavor. There is considerable variation in feed efficiency between commercial broilers and native chickens. In addition to extrinsic factors such as diet, microbiota, and housing environment, it can be speculated that there are some internal molecular mechanism enabling the differential allocation of resources for various physiological processes [18]. The transcriptome data from commercial broilers may not be appropriate as a reference for native chicken breeding. To date, however, a large number of sequencing analyses have been performed on commercial broilers [12,19], but only a few reports have focused on native chickens [20]. Therefore, the objective of this study was to identify a series of key genes and pathways affecting feed efficiency through analysis of the breast muscle transcriptome between native chickens divergent with extreme RFI. Our findings will contribute to a better understanding of the underlying molecular mechanism of feed efficiency and provide important reference information for native chicken breeding.

Performance and feed efficiency
The difference in feed intake, growth performance, and feed efficiency traits are showed in Table 1. The average daily feed intake (ADFI) of HRFI birds was significantly higher than that of LRFI birds (P < 0.05), and the HRFI group consumed 8.8% more feed than the LRFI group. As expected, the FCR and RFI of LRFI group were significantly lower than those of HRFI group (P < 0.05). the LRFI birds had the RFI value of − 2.29 ± 0.16 compared with 1.94 ± 0.09 for the HRFI birds during 42 days (day 56-98) of the experiment. In addition, there was no significant difference in the initial body weight (BW56) and final body weight (BW98) between RFI groups (P > 0.05). Moreover, metabolic body weight (MBW 0.75 ) and average daily body weight gain (ADG) also showed no significant difference between the two groups (P > 0.05).

Gene expression profile
All breast muscle samples (n = 5 per RFI group) were collected for RNA-seq. The number of raw reads, high quality raw reads, trimmed reads, and mapped reads for each sample are presented in (Additional file 1: Table  S1). After filter, the overall Q30 percentage of high quality clean data was above 95%. An average of 68.1 million trimmed reads per sample were mapped to the reference with a mean of 83.05% mapping efficiency. To analyze the transcriptional variations occurring between the HRFI and LRFI groups, differential gene expression analysis was used in the current study. Among all the genes annotated in the chicken genome, after multiple tests and corrections, a total of 354 gens were identified as being DEGs (Fig. 1). 5 DEGs were detected within the unannotated parts of the chicken genome, which could be considered as novel genes. Of the 349 known DEGs, 24 DEGs were up-regulated in the LRFI groups while 325 were down-regulated compared with the HRFI  Table S2).

Functional enrichment of DEGs
A functional enrichment analysis was performed to reveal the potential functional categories of DEGs. Analysis of Gene Ontology (GO) enrichment for the DEGs indicated that 212 biological processes terms were significantly enriched, which were mainly associated with 'immune system processes' and 'response to stimulus'. Moreover, the significantly enriched GO terms also including 17 cellular component terms and 12 molecular function terms, which involved in 'plasma membrane part' and 'carbohydrate derivative binding'. All enriched GO terms of DEGs are provided in (Additional file 3: Table S3).
Enrichment of the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways was performed to further reveal the biological functions of DEGs [21]. Interestingly, genes of 'oxidative phosphorylation' were upregulated in LRFI group (Fig. 2), while genes of other enriched pathways were down-regulated in LRFI group (Table 3). Other enriched pathways of interest including 'cytokine-cytokine receptor interaction' and 'Jak-STAT signaling pathway', which were involved in muscle myogenesis and regulation of immune response. The remaining significant enriched signaling pathways, such as 'phagosome', 'cell adhesion molecules (CAMs)', and 'toll-like receptor signaling', were mainly involved in inflammation, immune response, and innate immune response.

Identification of hub genes and pathways through protein-protein interaction (PPI) network analysis of DEG
The PPI network analysis was employed to further analyze and reveal the interaction information of DEGs. The PPI network of DEGs, including 245 nodes and 942 edge, was constructed in the STRING database and visualized using Cytoscape software (Fig. 3). The cutoff criterion was set as degree > 10. Base on the STRING database, the top 10 genes of DEGs evaluated in the PPI network using four centrality methods (Table 4). Moreover, we observed the intersections of these four    TNFRSF18, FASLG, XCR1, EDA2R, IL18R1, CSF2RA, TNFSF13B, CCL1, CCR2, IL4R, TNFRSF8, IL18, TNFRSF4,  IL17RA, IL22RA2, IL1RAP, TNFRSF25,  algorithms and generated a Venn plot (Fig. 4) to identify significant hub genes using an online website (http://bio informatics.psb.ugent.be/webtools/Venn/). Finally, the four hub genes, including RAC2 (Ras-related C3 botulinum toxin substrate 2), VCAM1 (Vascular cell adhesion molecule 1), CTSS (Cathepsin S), and TLR4 (Toll like receptor 4), were identified. Among these genes, RAC2 showed the highest node degree, which was 50.
The hub genes derived using these four algorithms may represent key candidate genes with important biological regulatory functions. The three significant modules, including module 1 (MCODE score = 22.33), module 2 (MCODE score = 11), and module 3 (MCODE score = 5.67), were constructed from the PPI network of the DEGs by MCODE (Fig. 5). And then, genes of each module were performed by biological functional enrichment analysis, respectively Protein-protein interaction (PPI) network for products of differentially expressed genes (DEGs). A total of 245 nodes and 942 interaction associations were identified. The red node represents the up-regulated gene, while the green node represents the down-regulated gene. The nodes with the highest degree scores were shaped as diamond and highlight the blue border paint. Node size indicated the fold change of each gene  (Table 5) Table S4).

GSEA
We further investigated the difference of gene expression levels between HRFI and LRFI groups by GSEA. GSEA was performed using a GO-based list, including 9996 gene sets, and a KEGG-based list, including 186 gene sets. Moreover, the results of GSEA analysis are presented in Additional file 5: Table S5. Totally, 20 gene sets, including 14 GO-based gene sets and 6 KEGGbased gene sets, were identified as significantly enriched (Table 6) (FDR < 0.05). Positive and negative NES indicate higher and lower expression in LRFI, respectively. From the GO-based list, interestingly, all higher expression gene sets in LRFI group were mainly related to mitochondrial function, such as 'mitochondrial membrane part' (Fig. 6a) and 'electron transport chain' (Fig. 6b). On the other hand, the lower expression gene sets in LRFI group were involved in inflammatory response, response to stimulus, molecular transport, and metabolic process, such as 'negative regulation of cytokine-mediated signaling pathway' (Fig. 6c) and 'negative regulation of response to cytokine stimulus' (Fig. 6d). From the KEGG-based list, the higher expression gene sets in LRFI group were 'citrate cycle (TCA cycle)' and 'cardiac muscle contraction'. And the higher expression gene sets in HRFI group were 'intestinal immune network for IgA production', 'N-Glycan biosynthesis', 'apoptosis', and 'glycosaminoglycan biosynthesischondroitin sulfate/dermatan sulfate'.

Validation of RNA-seq results
To validate RNA-seq expression profiles, six genes were selected randomly from all differential expression genes. These genes are PEPD (peptidase D), SERBP1 (SER-PINE1 mRNA binding protein 1), TAP2 (transporter 2, ATP-binding cassette, sub-family B), LECT2 (leukocyte cell derived chemotaxin 2), SEC23B (Sec23 homolog B, coat complex II component), and KLHL18 (kelch like family member 18). The samples of qPCR were same as samples utilized for RNA-seq. The qPCR analysis confirmed that the selected genes were differently expressed between the RFI groups, indicating that RNA-seq results were accurate and reproducible (Fig. 7).

Discussion
In this study, the breast muscle transcriptome data were obtained from two groups of native chickens with extreme opposing RFI using high-throughput RNA-seq technology. The gene expression profile was deconstructed and understood by an integrated bioinformatics analysis. Firstly, the DEGs were identified from transcriptome data and analyzed by functional annotation. Secondly, an in-depth analysis of DEGs was performed by the integration of PPI network and module analysis. Meanwhile, the hub genes were identified through the analysis of key nodes in the PPI network. Finally, all expressed genes were ranked according to the strength of expression difference, and then a GSEA method was employed for functional enrichment between RFI groups. All bioinformatics analyses investigated the differences, associations, and enrichment of expressed genes from the above three different perspectives in order to further gain a comprehensive biological insight into the feed efficiency of native chickens.

Functional annotation and biological interpretation of DEGs
Typical differential expression analysis of transcriptome data produces a list of hundreds of DEGs, requiring further analysis to construct a high-level overview of changes between the different compared groups [22]. In this study, a total of 349 known DEGs (24 up-regulated and 325 down-regulated) were identified from sequencing data. Ontology annotation of DEGs revealed several biological events related to immune system process, response to stimulus and T cell activation. 'immune system process' was the most significantly enriched GO term in the LRFI birds relative to the HRFI birds. All genes of this term were down-regulated in LRFI group. Moreover, a range of GO terms related to immune   response, including 'regulation of immune system process', 'regulation of immune response', and 'activation of immune response', were found significantly enriched in LRFI groups. It was widely considered that immune response may increase maintenance requirements, and are prioritized overgrowth in terms of nutrient allocation [23]. Nutrients shifted away from growth toward the immune-related processes may reduce feed efficiency in animals during the immune response [24]. Moreover, a range of GO terms involved with the response to stimulus, such as 'regulation of response to stimulus', 'response to stimulus', and 'positive regulation of immune response', were found enriched in the LRFI group compared to the HRFI group in breast muscle. Similar to the genes of GO terms related to immune response, almost all genes of these GO terms were down-regulated in LRFI group. A previous study indicated that LRFI heifers respond differently to hepatic proinflammatory stimulus and may expend less energy toward combating systemic inflammation and redirecting nutrients toward growth and protein accretion [25]. Our finding indicated that genes related to immune response and response to stimulus may be important factors contributing to the difference in feed efficiency. In agreement, it was well documented that pigs with high feed efficiency showed lower susceptibility to oxidative stress during production compared to pigs with low feed efficiency, resulting in lower inflammatory responses and lower growth impairment [26]. Moreover, a previous study in pigs indicated that genetic selection for low RFI (high feed efficiency) resulted in less stress responsiveness [27]. We further analyzed the KEGG pathways of DEGs, including 8 significantly enriched pathways. These enriched pathways focus on immune response, response to cytokine, energy metabolism, and inflammatory response. Among these, the top 3 pathways, including 'phagosome', 'cell adhesion molecules (CAMs)' and 'intestinal immune network for IgA production', are associated with immune response and inflammatory response. 2 pathways, including 'cytokine-cytokine receptor interaction' and 'Jak-STAT signaling pathway', were involved in the response to cytokine. Cytokines are a group of proteins that are soluble in water and secreted by various cells primarily in response to stimulus and responsible for transmitting messages between cells. It was well documented that the over-production of proinflammatory cytokines may lead to damage to intestinal integrity and epithelial function and subsequently reduced feed efficiency [28]. These results indicated that pathways related to immune response and inflammatory response are associated with feed efficiency. Consistent with previous studies, it is well established that LRFI pigs have an up regulated basal colonic inflammatory state and a heightened response to a lipopolysaccharide (LPS) challenge compared with the HRFI pigs [29]. A similar finding suggested that compared with low feed efficiency pigs, the high feed efficiency pigs could induce a quicker and more effective hepatic response to inflammatory stimuli [24]. Interestingly, we found only genes (ND1, ND2, ND3, ND4, ND4L, ND5, CYTB, COX1, COX2, COX3, and ATP6) of 'oxidative phosphorylation' were up-regulated in LRFI group, while genes of other enriched pathways were down-regulated in LRFI group. Based on the figure of oxidative phosphorylation pathway (Fig. 2), the upregulated genes of this pathway appear in major mitochondrial complexes, including complex I, complex III, complex IV, as well as ATP synthase (complex V). These mitochondrial complexes form the electron transport chain [30], which coupled the oxidative phosphorylation to produce energy in mitochondria [31]. Among these, ND1, ND2, ND3, ND4, ND4L, and ND5 are the core subunits of the mitochondrial membrane respiratory chain NADH dehydrogenase (complex I), which is the largest mitochondrial complex and has the entry site of the NADH electron transfer chain [32]. Notably, ND2, ND4, and CYTB were top 10 up-regulated DEGs in the breast muscle of LRFI group compared to HRFI group ( Table 2). ND2 plays a key role in controlling the production of the mitochondrial reactive oxygen species (ROS), which can contribute to oxidative damage to mitochondrial structure and functions. It was reported that the missense substitution in the ND2 was significantly associated with the production of ROS in Tibet chicken [33]. ND4 protein is a hydrophobic inner membrane subunit of mitochondrial complex I and is thought to be involved in the proton translocation function of complex I [34]. Previous research indicated that the expression of ND4 and COX 2 was lower in the low feed efficiency broilers compared with high feed efficiency broilers [35]. CYTB is one of the 11 subunits of mitochondrial complex III, and is the key to maintain the function of complex III [36]. Mutations in CYTB might result in the functional failure of complex III, which could have a negative impact on complex I function [37]. Moreover, A previous study suggested that the presence of mtDNA polymorphisms, including ND4, CYTB, and COX3, affecting respiratory chain complexes I, III and IV, and were associated with altered ROS level [38]. The aforementioned results confirmed that LRFI chickens may have tighter control over ROS production compared with HRFI chickens through enhancing the expression of genes related to mitochondrial function. Coincidentally, it was established that low feed efficiency broilers produced higher amounts of ROS compared with high feed efficiency broilers [39]. Moreover, a previous study suggested that using poor hygiene conditions to activate mature fat cells isolated from different RFI pigs could lead to higher ROS production in HRFI pigs [26]. Hence, it can be inferred that ND2, ND4, and CYTB were key candidate genes affecting feed efficiency in native chickens.

Integration of PPI network and module analysis
The alignment and mapping of PPI networks provide opportunities to further investigate the intrinsic relationship between DEGs through conserved pathways and protein complexes [40]. Analyzing PPI network is an important prerequisite for understanding the molecular basis for complex traits. In our study, the PPI network was constructed with DEGs, and then the top centrality hub genes were obtained using four centrality methods. Finally, we identified 4 hub genes, including RAC2, VCAM1, CTSS, and TLR4. RAC2 was identified as one of the hub genes with the highest degree of connectivity. RAC2 is a key signal transduction factor in inflammatory cells and plays a key role in the activation of the various NADPH oxidases (NOXes) family members, which play important role in the production of ROS through response to receptor agonists such as growth factors or inflammatory cytokines [41]. Moreover, a previous study suggested that RAC2 deficiency inhibits the action of pro-inflammatory cytokines and chemokines [42]. In chickens, RAC2 is involved in the production of ROS in phagosomes of chicken heterophils to kill pathogens [43]. VCAM1 encode vascular cell adhesion molecule-1 and mainly expressed in endothelial cells during inflammation [44]. Dysfunctional endothelial cells express adhesion molecules and release VCAM1, thereby causing vascular inflammation [45], and this event appears to be mediated by increased ROS production [46]. CTSS, encoding for cathepsin S protein, is implicated in body weight regulation and the development of obesity [47]. It was reported that CTSS expression and cathepsin S in adipose tissue were induced by pro-inflammatory factors, such as TNF-α and IL-β [48]. TLR4 is a member of toll-like receptors (TLRs) family, which recognize mainly microbial membrane components [49]. TLR4 is also the only known member of the TLR family that engages all four toll-interleukin receptor (TIR) domains-containing adaptor proteins to participate in signaling inflammatory response [50]. A previous study indicated the elevated TLR4 expression in skeletal muscle expression may lead to augmented inflammation and chronic disease risk observed with increased adiposity [51].
It is worth signaling that, the four hub genes were mainly expressed in inflammatory cells. Under normal circumstances, skeletal muscle is responsible for most insulin-stimulated glucose processing throughout the body. Existing evidence suggested that skeletal muscle myocytes can secrete large amounts of cytokines and other molecules and may become inflamed in obesity [52]. Moreover, skeletal muscle myocytes can express and secrete numerous cytokines such as IL-6, IL-15, and other molecules such as irisin and myonectin, whereas most adipokines are pro-inflammatory, regulated by obesity [53]. Furthermore, a previous study indicated that immune cells can also cause myocyte inflammation by secreting pro-inflammatory molecules for adverse regulatory effects on myocyte metabolism [54]. In summary, the four hub genes obtained from the PPI network were up-regulated in skeletal muscle of HRFI chickens and deeply involved in the production of ROS and inflammatory response. In this study, the four hub genes up-regulated in HRFI chickens, which indicated the HRFI chickens increased ROS production and inflammatory response. In agreement, a number of studies have suggested that low feed efficiency pigs showed higher inflammatory responses, growth impairment, and ROS production [26,29]. Similarly, in the above DEGs enrichment analysis, our results indicated that the birds in the HRFI group up-regulated inflammation-related pathways, such as 'phagosome', 'cell adhesion molecules (CAMs)', and 'cytokine-cytokine receptor interaction', and down-regulated genes related to mitochondrial function.
In this study, the HRFI group consumed 8.8% more feed than the LRFI group. The overconsumption of food of HRFI chickens may lead to metabolic disorders and overload of the electron transport chain, which increased the production of ROS and resulting in cellular oxidative stress [55]. A previous study indicated that the generation of ROS level lead to numerous downstream effects, including triggering inflammatory cascades and increasing production of ROS [56]. In chickens, the blunted inflammatory response may reduce feed demand and stimulate faster muscle growth [57]. Hence, according to the aforementioned results, it could be hypothesized that overconsumption of food may increase the risk of overload of electron transport chain, which in turn leads to cellular oxidative stress and inflammatory response, resulting in increased feed demand and reduced feed efficiency in HRFI chickens.
To further analyzed the PPI network, we constructed three significant modules (Fig. 5). In the current study, the genes of module 1 were up-regulated in HRFI group and enriched in 'phagosome' and 'cell adhesion molecules (CAMs)' pathway. The seed node of module 1 is IL16, which is a polypeptide pro-inflammatory cytokine that plays a key role in most immune and inflammatory responses [58]. This result further confirmed the above surmise that HRFI chickens increased inflammatory response. Genes of module 2 were up-regulated in LRFI chickens and enriched in 'oxidative phosphorylation' and 'metabolic pathways'. The seed node of module 2 is ATP6, which plays a crucial role in the proton channel of ATP synthase (complex V). A previous study indicated that the mutation of ATP6 gene may make Tibetan chickens easier to convert energy and metabolize than Chinese native chickens [59]. This finding is consistent with the above results that LRFI chickens enhanced expression of genes related to mitochondrial function. Genes of module 3 are involved actin cytoskeleton, which implicated in the regulation of cell motility [60]. Thus, it can be speculated that the 'phagosome' and 'cell adhesion molecules (CAMs)', and 'oxidative phosphorylation' were key pathways affecting feed efficiency in native chickens.

Gene set enrichment analysis
In the current study, we used GSEA method to convert the RNA-seq count data into biological interpretations. In this way, we do not rely on any arbitrarily predefined threshold to select 'interesting' genes or pathways for function analysis. And GSEA can accurately and reliably detect gene sets with biological meaningful [17]. In this study, based on the GO-based list, all higher expressed gene sets in LRFI group were mainly related to mitochondrial function. Among these, the 'mitochondrial membrane part' and 'electron transport chain' were significantly enriched gene sets (Fig. 6). These results indicated that LRFI chickens increased mitochondrial function, especially in function of electron transport chain. The results were consistent with the former function analysis of DEGs that LRFI chickens enhanced the expression of genes related to mitochondrial complexes, which form the electron transport chain (Fig. 2). 'Negative regulation of cytokine-mediated signaling pathway' and 'negative regulation of response to cytokine stimulus' were higher expressed in HRFI chickens. This result indicated that the aforementioned two pathways related to cytokine, including 'cytokine-cytokine receptor interaction' and 'Jak-STAT signaling pathway', should deserve more attention in further research.
Base on the KEGG-based list, we found that 'citrate cycle (TCA cycle)' was the most significantly enriched gene set, with higher expression in LRFI group compared with HRFI group. It was well known that the TCA cycle is the major common pathway for oxidation of carbohydrates, lipids, and some amino acids, and finally results in the production of large amounts of adenosine triphosphate (ATP) via oxidative phosphorylation [61]. This result indicated that LRFI chickens increased the expression of genes of the 'citrate cycle (TCA cycle)' pathway in skeletal muscle. It was well documented that mitochondria are involved in ATP synthesis through the TCA cycle and oxidative phosphorylation [62]. Based on the above analysis, we speculated that compared with the HRFI chickens, LRFI chickens may synthesize ATP more effectively by enhancing TCA cycle and oxidative phosphorylation in skeletal muscle. In agreement, a recent study suggested that high feed efficiency broilers enhanced expression of the energy production in breast muscle [63]. Moreover, a recent study in pigs suggested that compared with HRFI pigs, LRFI pigs might be more efficient in energy utilization during skeletal muscle growth [64]. Furthermore, a previous study indicated that high feed efficiency pigs can use nutrients more effectively to promote growth than low feed efficiency pigs [24]. Collectively, our results of GSEA indicated that LRFI chickens had higher expression of genes related to mitochondrial function compared with HRFI chickens, and the 'citrate cycle (TCA cycle)' may be a key pathway to influence the feed efficiency of native chickens.

Conclusions
In summary, we performed RNA-seq analysis on breast muscle derived from native chickens with extreme opposing RFI. Enrichment and interaction analysis of DEGs and GSEA method were employed for further analysis to construct a high-level overview of changes between the different RFI groups. Our results indicated that ND2, ND4, CYTB, RAC2, VCAM1, CTSS and TLR4 were key genes affecting feed efficiency of native chickens, and they may influence feed efficiency through deep involvement in ROS production and inflammatory response. Function analysis of DEGs and GSEA analysis suggested that genes related to immune response, mitochondrial function, response to stimulus, and inflammatory response are associated with feed efficiency. Moreover, the 'phagosome', 'cell adhesion molecules (CAMs)', 'citrate cycle (TCA cycle)' and 'oxidative phosphorylation' were key pathways contributing to the difference in feed efficiency. Among these, Genes and pathways related to inflammatory response and immune response were up-regulated in HRFI chickens, while genes and pathways related to mitochondrial function were up-regulated in LRFI chickens. Our study indicated that HRFI chickens may face more oxidative stress and the consequent increased inflammatory response, while LRFI chickens may synthesize ATP more efficiently and control ROS production more strictly by enhancing the mitochondrial function in skeletal muscle. The interaction between inflammatory response and mitochondrial function in skeletal muscle needs further investigation to understanding the underlying molecular mechanisms affecting the feed efficiency of native chickens.

Birds and RFI calculation
A pedigreed chicken population was established from a random breed population, 200 males mated with 1000 females obtain 4500 chickens in one hatch. All birds used in the current study were provided by Qingyang Pingyun Poultry Conservation and Breeding, Co. Ltd. After hatch, a total of 2500 male Wannan Yellow chicken were selected and raised as experimental populations. At 56 day of age, a total of 1008 chickens with similar body weight were selected and transferred to individual cage, each cage measuring 45 cm × 35 cm × 50 cm. All chickens had access to water ad libitum. All chickens were fed the same diet throughout the experiment period, which provided by Qingyang Pingyun Poultry Conservation and Breeding, Co. Ltd.
The feed intake and ADFI were measured at 56-98 d of age. The BW56 and BW98 were recorded to calculate the MBW 0.75 , BWG, and ADG. FCR was calculated by FI and BWG, RFI is calculated as difference between the actual and expected FI using the model as follows [20]: where ADFI, ADG, and MBW 0.75 are the average daily feed intake, average daily body weight gain, and metabolic body weight, respectively. Additionally, b 0 is the regression intercept, b 1 is the partial regression coefficient of ADFI on ADG, and b 2 is the partial regression coefficient of ADFI on metabolic weight. The RFI values were calculated using the regression procedure of SAS (version 9.4, SAS Inst. Inc., Cary, NC). After excluding outlier data (total 1.5%), all chickens were ranked according to the RFI value. 30 highest RFI (HRFI) chickens and 30 lowest RFI (LRFI) chickens were selected as HRFI and LRFI group. All animal performance data showed in the table are expressed as least square means ± standard error of the mean (SEM). Student's t-test was used to analyze the feed efficiency difference between HRFI and LRFI groups. The probability value was P < 0.05, indicating statistical significance.

RNA extraction and sequencing
At the age of 98 days, 5 birds were randomly selected from HRFI group and LRFI group, respectively. All birds were manually killed by cervical dislocation. The pectoralis major was immediately collected and stored in liquid nitrogen and subsequently transferred to the laboratory and stored at − 80°C for further use (RNA sequencing). Total RNA was extracted from the pectoralis major (100 mg) using TRIzol reagent (Invitrogen, Carlsbad, CA, USA) based on the manufacturer's instructions. RNA quality was determined by measuring the absorbance at 260, 280 and 230 nm using NanoDrop 2000 (Thermo Fisher Scientific). The reference 260/280 ratio and 260/230 ratio for the RNA sample were 1.8 to 2.0 and 1.8 to 2.2, respectively. The integrity number was tested by Agilent 2100 Bioanalyzer (Agilent, Santa Clara, CA, USA, 2009). Only RNA integrity number equal to or higher than 7.0 was RNA used for the next analysis.
After total RNA was extracted and checked, all samples were sent to Genedenovo Biotechnology Co., Ltd. (Guangzhou, China) for cDNA library construction. All samples were sequenced using the Illumina HiSeq 4000 platform (Illumina, San Diego, California, USA).

RNA-seq data analysis
Before read alignment, the quality control of raw sequence reads was performed using the FastQC program (version 11.5, http://www.bioinformatics.babraham.ac. uk/projects/fastqc/) and nucleotide calls with a quality score of 30 or higher were considered high quality clean reads. Adapters and low-quality reads were trimmed using the Cutadapt (1.14) such that the average base quality was greater than 20.
After trimming, the processed reads were then aligned to the chicken reference genome GRCg6a (GCA_ 000002315.5) using the alignment program Tophat2 (version 2.1.1, http://ccb.jhu.edu/software/tophat/index. shtml). The reference genome and annotated file were obtained from the Ensembl database (http://asia. ensembl.org/Gallus_gallus/Info/Index). After aligned with the reference genome, unmapped reads were then re-aligned with Bowtie2, the enriched unmapped reads were split into smaller segments which were then used to find potential splice sites. Then, a reference annotation-based transcript assembly for each sample was performed using the Cufflinks (version 2,2,1). The fragments per kilobase of exon per million reads (FPKM) value was used to quantify the gene expression levels. In addition, all assembled transcripts of all samples were merged to improve the overall quality of assembly by merging new and mapped transcripts into a single assembly and deleting artificial structures.

Identification of differently expressed genes (DEGs) and function annotation analysis
DEGs were identified using Cuffdiff (version 2.2.1), here, only identified transcripts with a fold change > 2 or < 0.5, and a false discovery rate (FDR) < 0.05 were used for subsequent analysis.
To identify the biological function related to the DEGs, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways and Gene Ontology (GO) terms (CC, Cellular Component, MF, Molecular Function, BP, Biological Process) were investigated using the Database for Annotation, Visualization and Integrated Discovery (DAVID) (version 6.8, https://david.ncifcrf.gov/) [65]. The GO terms and KEGG pathways with Benjamini-Hochberg (B-H) P value < 0.5 were considered to be statistically significant enrichment.

Protein-protein interaction (PPI) network construction and modules selection
The Search Tool for the Retrieval of Interacting Genes (STRING) database was used to obtain PPI data. Mapping DEGs to STRING to evaluate the interactive relationship, with a confidence score > 0.9 defined as significant. PPI network of DEGs was visualized by Cytoscape (http://cytoscape.org/), which is an open source software for visualizing complex networks and integrating them with any type of attribute data. The Cyto-Hubba application in Cytoscape was performed to analyze the hub genes through four centrality methods, including Degree, EPC, EcCentricity, and MNC [66]. The Molecular Complex Detection (MCODE) [67] application in Cytoscape was used to screen the modules of the PPI network. The criteria setting of MCODE is: degree cutoff = 2, node score cutoff = 0.2, k-core = 2, maximum depth = 100. Moreover, the function and pathway enrichment analysis was performed for genes in the modules.

Gene set enrichment analysis (GSEA)
All expressed genes, regardless of whether they were differentially expressed in either case, were used for GSEA analysis. Gene set analysis was analyzed by GSEA software (http://software.broadinstitute.org/gsea/index.jsp) based on C5, CC, C5.BP, C5.MP, and C2.CP:KEGG gene set collections (MSigDB v7.0, broad institute, Cambridge, MA, USA) [68]. GSEA first ranked all expressed genes according to the significance of differential gene expression between the HRFI and LRFI groups. The enrichment score for each gene set is then calculated using the entire ranked list, which reflects how the genes for each set are distributed in the ranked list. Normalized enriched score (NES) was determined for each gene set. The significant enrichment of gene set was selected based on the absolute values of NES > 1, nominal P-value of NES ≤ 0.05, and false discovery rate (FDR) ≤ 0.05 [69].
Validation of RNA-seq through quantitative real-time PCR (qPCR) Following cDNA synthesis from 1 μg of total RNA and in presence of random primers (Promega, Mannheim, Germany), these primers were designed using Primer 5.0 software and synthesized by the Nanjing Tsingke biological technology Co. Ltd. (Nanjing, China). The primer sequences are provided in (Additional file 6: Table S6). First-strand complementary DNA (cDNA) was synthesized using one-step gDNA Removal and cDNA Synthesis SuperMix (TransGen Biotech Co., Ltd., Beijing, China) according to the manufacturer's instructions. qPCR was carried out on a 7500 Real-Time PCR apparatus (Applied Biosystems, Warrington, UK) using the SYBR Green Master Mix (Biomiga, San Diego, CA, USA). The efficiency of the quantitative PCR reaction was verified by creating a standard curve from fivefold serial dilutions of cDNA. PCR reactions were carried out in a final volume of 20.0 μL, which contained 1.0 μL of 1000 ng cDNA, 1.0 μL of 10 μM forward and reverse primer mix, 10.0 μL 2 × SYBR green Master Mix, 8.0 μL RNase-free ddH 2 O. Samples were run in triplicate. The quantitative PCR program was at 95.0°C for 5 min, 40 cycles of 95.0°C for 15 s and 60.0°C for 1.0 min, followed by a melting curve program was 1 cycle of 95.0°C for 15 s, 60.0°C for 1.0 min, 95.0°C for 15 s, 60.0°C for 15 s. The qPCR results were detected using a dissociation curve analysis and gel electrophoresis. CT-method was utilized to quantify the changes in the gene expression, whereas GAPDH served as a housekeeping for normalization. Relative gene expression was calculated using 2 -ΔΔCT method according to a previous study [70].