- Research article
- Open Access
Bioinformatic analysis and functional predictions of selected regeneration-associated transcripts expressed by zebrafish microglia
BMC Genomics volume 21, Article number: 870 (2020)
Unlike mammals, zebrafish have a remarkable capacity to regenerate a variety of tissues, including central nervous system tissue. The function of macrophages in tissue regeneration is of great interest, as macrophages respond and participate in the landscape of events that occur following tissue injury in all vertebrate species examined. Understanding macrophage populations in regenerating tissue (such as in zebrafish) may inform strategies that aim to regenerate tissue in humans. We recently published an RNA-seq experiment that identified genes enriched in microglia/macrophages in regenerating zebrafish retinas. Interestingly, a small number of transcripts differentially expressed by retinal microglia/macrophages during retinal regeneration did not have predicted orthologs in human or mouse. We reasoned that at least some of these genes could be functionally important for tissue regeneration, but most of these genes have not been studied experimentally and their functions are largely unknown. To reveal their possible functions, we performed a variety of bioinformatic analyses aimed at identifying the presence of functional protein domains as well as orthologous relationships to other species.
Our analyses identified putative functional domains in predicted proteins for a number of selected genes. For example, we confidently predict kinase function for one gene, cytokine/chemokine function for another, and carbohydrate enzymatic function for a third. Predicted orthologs were identified for some, but not all, genes in species with described regenerative capacity, and functional domains were consistent with identified orthologs. Comparison to other published gene expression datasets suggests that at least some of these genes could be important in regenerative responses in zebrafish and not necessarily in response to microbial infection.
This work reveals previously undescribed putative function of several genes implicated in regulating tissue regeneration. This will inform future work to experimentally determine the function of these genes in vivo, and how these genes may be involved in microglia/macrophage roles in tissue regeneration.
Tissue regeneration allows restoration of the function of damaged tissues and organs. Mammals have the ability to regenerate a limited number of tissues and organs, such as skin [1, 2], skeletal muscle [3, 4] and liver [5, 6]. Unfortunately, mammals cannot regenerate neurons lost to injury or disease of the central nervous system (CNS) [7,8,9,10,11,12]. In contrast, zebrafish (Danio rerio) have the ability to regenerate numerous different tissues, including tissue in the CNS [10, 12,13,14,15,16,17,18,19]. For example, zebrafish can regenerate damaged retinal neurons, which restores visual function. In all species examined, macrophage populations appear to be crucial to tissue regeneration [21,22,23,24,25,26,27,28,29,30], though in the mammalian CNS they appear to instead engage in pathological functions [31,32,33,34,35].
In vertebrates, the retina lies at the back of the eye and is a stereotypically organized part of the CNS that is composed of neural and glial cell types that are laminated into 3 distinct nuclear layers. Evidence strongly indicates that Müller glia are the source of regenerated retinal neurons in zebrafish [12, 36,37,38,39,40,41,42]. In both zebrafish and mammals, resident microglia respond to retinal injury and degeneration. This may lead to immune-Müller glia crosstalk that may shape the Müller glia reaction to retinal injury [43,44,45]. The zebrafish is a relatively new, and powerful, vertebrate model in microglial biology [10, 30, 46,47,48,49,50,51]. In particular, microglia and macrophage functions in the regeneration of CNS tissue, such as in the zebrafish retina, are just beginning to be explored.
Our recent work has used the zebrafish to understand microglia and macrophage responses to acute, widespread retinal lesion [30, 51]. In particular, our transcriptome analysis has provided a rich dataset to facilitate an understanding of gene expression in microglia/macrophages in a context of successful CNS regeneration. In order to translate our transcriptome findings in zebrafish to mammals, we examined predicted orthology of differentially expressed genes (DEGs) enriched in zebrafish microglia/macrophages during retinal regeneration. We found that nearly all of the genes examined had predicted orthologs in mouse and human. However, several of these genes did not. Further, the putative function of these genes is largely unknown. As these “non-orthologous” genes comprise a portion of the microglia/macrophage regeneration-associated transcriptome, a better understanding of their predicted gene products will facilitate a greater understanding of the similarities and differences in fish and mammalian response to retinal injury. We reason that these genes could be functionally important in determining the outcome of tissue regeneration in zebrafish, and so functional predictions for these genes are necessary to inform future experimental work. This knowledge will also help us better understand evolutionary relationships between mammalian and teleost immunity.
For twelve selected genes without clear human or mouse orthologues, we performed a variety of bioinformatic analyses aimed to identify functional protein domains. These analyses included identification of protein domains and Gene Ontology (GO) analysis, sequence similarity comparisons, and predicted protein structure. In addition, we used synteny analysis which failed to find evidence of orthologous genes in human and mouse genomes. However, sequence similarity comparisons to find similar genes in other vertebrate species with well described regenerative capacity (Axolotl, Xenopus, Salamander) indicated possible orthologs for several of the genes of interest. We also examined several other published gene expression datasets to determine if these genes showed informative expression patterns in other contexts of tissue regeneration, or if these genes might also be differentially expressed in macrophages responding to microbial infection. The work presented here is informative for several zebrafish genes of previously unknown function, providing a foundation for future experimental work to test gene function in vivo. In addition, only one of these twelve genes was previously described to be differentially expressed in macrophages responding to microbial infection, suggesting that these genes indeed have importance to tissue regeneration and not only macrophage responses in general. These results have provided further insight into the transcriptome of zebrafish macrophages in the context of tissue regeneration.
Selection of genes expressed in zebrafish microglia/macrophages for further bioinformatics analyses
We previously described a set of 970 genes enriched in mpeg1+ cells (representing microglia and macrophage populations) compared to other retinal cell types in regenerating zebrafish retinas. Of these genes, 409 comprised a list that we considered to be “regeneration-associated” transcripts. These 409 transcripts were considered “regeneration-associated” because they were enriched in microglia/macrophages isolated from regenerating retinal tissue, but were not found to be enriched in resting/steady-state zebrafish brain microglia in another published study [30, 52]. Each gene in this list of 409 “regeneration-associated” transcripts was examined for predicted orthology in mouse and human using the DRSC integrative ortholog prediction tool. Most genes returned predicted orthologues in mouse and/or human (Supplemental File 1). However, twelve (12) of these genes did not show predicted orthology to human or mouse genes with this analysis and were therefore selected for further bioinformatic analysis (Table 1, denoted P1-P12 throughout the manuscript). We reasoned that these twelve transcripts could be part of a transcriptional program executed in microglia/macrophages during CNS regeneration, and therefore could be important in understanding similarities and differences in mammalian vs. zebrafish outcomes following tissue damage.
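The selection logic described above amounts to a set difference followed by a filter on ortholog predictions. The following is a minimal sketch of that logic, not the pipeline actually used: the gene identifiers are hypothetical placeholders, and the `orthologs` mapping stands in for whatever an ortholog prediction tool reports for each gene.

```python
def select_candidates(enriched_in_regeneration, enriched_in_resting, orthologs):
    """Return regeneration-associated genes lacking mouse/human orthologs.

    enriched_in_regeneration: genes enriched in mpeg1+ cells in regenerating retina
    enriched_in_resting: genes enriched in resting/steady-state brain microglia
    orthologs: dict mapping gene -> list of predicted mouse/human orthologs
    """
    # "Regeneration-associated": enriched during regeneration but not at rest.
    regeneration_associated = set(enriched_in_regeneration) - set(enriched_in_resting)
    # Keep only genes for which no mouse or human ortholog was predicted.
    return sorted(g for g in regeneration_associated if not orthologs.get(g))


# Hypothetical toy identifiers, for illustration only.
regen = {"geneA", "geneB", "geneC", "geneD"}
resting = {"geneB"}
orthologs = {"geneA": ["HUMAN_A"], "geneC": [], "geneD": ["Mouse_d"]}

candidates = select_candidates(regen, resting, orthologs)
print(candidates)
```

In this toy example only `geneC` is both regeneration-associated and without a predicted ortholog, which is the kind of gene that would be carried forward as one of P1-P12.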
Summary of results from bioinformatic analyses
A number of bioinformatic analyses were performed for the twelve genes of interest shown in Table 1 (methods summarized in Materials and Methods), and are summarized in Fig. 1. The species included in the results from these analyses are shown in Supplemental Figure 1. Protein domains and GO terms were found for nine genes and largely included terms involving the immune system (Table 2). Orthologs found by sequence similarity arise from several species, mainly vertebrates (Supplemental Figure 1, Table 3); several are associated with the immune system or soluble signaling (Table 3) and the best-matched proteins are most frequently from species of fish, with occasional hits in mouse or human (Table 4). Overall, the results found for the sequence similarity and best-matched ortholog approach are consistent with the results found with the protein domain and gene ontology (GO) term approach (Tables 2, 3, 4). The three-dimensional structure of a protein, or lack thereof, is known to determine protein function. Of the genes studied here, two (P4 and P12 (pho)) are predicted to have greater than 50% disordered amino acids, and thus are likely to code for unstructured proteins (Supplemental Figure 2). We predicted three-dimensional (3D) structure using homology modeling (Table 5, Figs. 2, 3, 4, 5 and 6). The results are consistent with sequence similarity and protein domain/GO results for several genes of interest. In addition, structural similarity was informative for genes that did not return results with previous analyses (e.g. P2, P7, and P12). Synteny analysis compared to human and mouse genomes returned results for only one gene (P4, with a hit in the human genome, Supplemental Figure 3), though based on sequence comparison this gene did not align with the candidate gene in the identified human chromosomal region.
Comparison to other vertebrate species with described capacity for tissue regeneration (Ambystoma mexicanum, Xenopus laevis, Xenopus tropicalis and Cynops pyrrhogaster) returned putative orthologs of several of these genes (Table 6 and Supplemental Table 1) indicating that they may have conserved function across these species. More detailed descriptions of findings regarding P1-P12 are provided next.
The gene coding for P1 (si:dkey-181f22.4) is located on zebrafish chromosome 7 and is predicted to have exon/intron structure coding for a predicted 513 amino acid protein (Table 1). Protein domain and gene ontology (GO) term returned predicted “protein kinase domain” and “Caspase Activation and Recruitment (CARD) domain” (Table 2). The CARD domain is known to function in innate immunity, particularly in inflammation and the regulation of apoptotic process (Table 2, [66,67,68,69]). Amino acid sequence similarity analysis returned several kinases associated with immune function, and suggested that this gene may code for a receptor tyrosine kinase (Table 3). The best-matched ortholog analysis returned “Receptor-interacting serine/threonine-protein kinase 2 isoform 1” in both human and mouse (Table 4). Of note, human RIPK2 has been described to contain a C-terminal CARD domain [70,71,72]. In comparison to other selected species (Table 6), P1 returned receptor tyrosine kinase-like orphan receptor 2 (Axolotl), Threonine-protein kinase 2-like isoform X1 (Xenopus), and insulin-like growth factor receptor as well as receptor tyrosine kinase-like orphan receptor 2 (Salamander). Structure prediction (Table 5, Fig. 2) strongly indicated a kinase domain/function for P1.
The results strongly indicate that P1 has a kinase domain that may be activated by interactions with other proteins via the CARD domain, and this function may be acting in concert with receptor activity. Interestingly, the CARD domain of human RIPK2 facilitates interaction with NOD-like receptors [73, 74]. Collectively, these results indicate that zebrafish P1 may have orthologous function to human RIPK2. However, the amino acid substrate of phosphorylation (tyrosine vs. serine/threonine) by zebrafish P1 is not yet clear, as both classes of kinases were indicated in the hits.
P2 (si:ch73-112 l6.1)
The gene for P2 (si:ch73-112 l6.1) is located on zebrafish chromosome 21 and codes for a predicted 1025 amino acid protein (Table 1). Protein stability analysis (Supplemental Figure 2) indicates P2 is a structured protein, but with a large disordered domain. Such disordered regions often indicate a protein-protein binding interface. However, collective analyses were largely uninformative for P2. For example, no protein domains or GO terms were returned (Table 2). A putative ortholog with unknown function from Branchiostoma floridae was returned based on amino acid sequence similarity (Table 3), and three uncharacterized zebrafish genes were returned as best-matched orthologs (Table 4).
The gene for P3 (zgc:174863) is located on zebrafish chromosome 6 and codes for a predicted 290 amino acid protein (Table 1). Protein domain and GO terms indicate an immunoglobulin-like domain, which is present in proteins involved in cell adhesion (Table 2). Consistent with this, sequence similarity analysis revealed 5 proteins from 4 species, several of which contain immunoglobulin folds (Table 3). Protein structure analysis (Table 5, Fig. 3) further indicated that the predicted protein contains immunoglobulin-like domains, as it was reasonably modeled by the T cell receptor beta chain in regions containing immunoglobulin folds (Fig. 3). Collectively, these results suggest that P3 could be a cell membrane receptor possibly involved in cell adhesion. In support of this, comparison to Xenopus tropicalis returned a predicted ortholog with putative cell adhesion function (Table 6). In addition, several hits for P3 were found by amino acid similarity in Xenopus tropicalis, Apis mellifera, Gadus morhua, and Latimeria chalumnae (Table 3), and based on phylogenetic relationships of these species (Supplemental Figure 1), it seems possible that the function of the gene coding for P3 was evolutionarily conserved in these species.
P4 (si:dkey-56 m19.5)
The gene coding for P4 (si:dkey-56 m19.5) is located on zebrafish chromosome 7 and codes for a predicted 526 amino acid protein (Table 1). As noted above, P4 is predicted to be a disordered protein (Supplemental Figure 2). Many intrinsically disordered proteins evolve rapidly [75,76,77,78], and therefore, predicting a function for P4 is difficult based on amino acid sequence. Accordingly, analyses based on sequence similarity were overall minimally informative. An associated protein domain (Ribonuclease E/G) was returned for P4 (Table 2) and a possible ortholog (Brain abundant, membrane attached signal protein 1, BASP1) with unknown function in Oryzias latipes was a hit based on amino acid sequence similarity (Table 3). P4 returned four best-matched orthologs from other species, but these genes had widely varying predicted functions (Table 4). Protein structure analysis was uninformative for P4 (Table 5).
Synteny analysis indicated that the gene coding for P4 lies in a region syntenic with human chromosome 16 (Supplemental Figure 3). The gene for P4 is flanked by several neighboring genes that have apparent orthologs in human, and based on the orientations and locations of the neighboring genes in the two species, the gene for P4 lies in a relative location similar to human TERB1. However, using NCBI BLASTP to compare sequences of zebrafish P4 and human TERB1 (with any scoring matrix) found no significant similarity between the two, thereby failing to provide evidence of orthology of these genes. We therefore consider that the gene coding for P4 could have been gained in zebrafish or lost in humans. Interestingly, several possible orthologs in various species of fish were returned for P4 (Table 4).
Protein domain and GO term analysis returned MGC-24 and Mucin15 domains (Table 2) for P5 (si:ch211-105j21.9). Amino acid sequence similarity returned three hits from three different species for genes with unknown and varying functions (Table 3), but best-matched orthologs (Table 4), as well as protein structure analysis, were uninformative. Although a hit was found in Xenopus laevis (Table 6), the protein has unknown function.
P6 (si:ch73-248e21.7) did not return any hits for GO terms, but a putative complement regulatory protein from Xenopus tropicalis was identified as a hit by sequence similarity analysis (Table 3). Best-matched orthologs were found in four Sinocyclocheilus species of fish, two of which were Mucin 5AC_like proteins and two of which were cell wall-like proteins (Table 4). However, other analyses proved uninformative.
Analyses for P7 were largely uninformative. Although some analyses returned hits to unknown, uncharacterized, or hypothetical proteins in six different fish species (Table 3, Table 4), their meaning was not interpretable.
Protein domain/GO term results suggest that P8 contains an immunoglobulin-like domain. This was further indicated by the amino acid sequence similarity results (Table 3), protein structure results (Table 5), and the putative “CD48 antigen” orthologue identified in Xenopus tropicalis (Table 6).
The gene coding for P9 was previously annotated as urp1, suggesting that putative urotensin function is already recognized. Consistent with this, protein domain/GO term and amino acid sequence similarity analyses returned results for P9 indicating urotensin function (Table 2 and Table 3), which is involved in regulation of vasculature diameter. Specifically, Urotensin II is a secreted mediator known to function in vasoconstriction (Table 2, [79,80,81]). However, similar structures were not identified in our analyses (Table 5).
The gene for P10 (xcl32a.1) is located on zebrafish chromosome 2 and is predicted to encode a protein of only 126 amino acids (Table 1). The protein domains/GO term search returned chemokine interleukin-8-like, which functions in immune response (Table 2). Other analyses also indicated that P10 is likely a cytokine/chemokine (Table 3, Table 4, Table 5, Table 6). The predicted amino acid length of P10 is consistent with short amino acid chains seen in cytokines/chemokines. Consistent with this function, regions of P10 were well modeled by regions of the chemokine Lymphotactin’s interleukin-8-like domain (Fig. 4).
Collectively, results for P11 indicate that it could be an enzyme involved in carbohydrate metabolism (Table 2, Table 3, Table 4, Table 5, and Table 6). P11 could be well modeled by human intestinal maltase-glucoamylase (Table 5, Fig. 5), as well as sucrase-isomaltase and lysosomal alpha-glucosidase (Table 5). However, the predicted functional domains found previously (P-type trefoil, galactose mutarotase, and glycoside hydrolase domains, Table 2) were not covered in the homology model of maltase-glucoamylase. The P-type trefoil domain found for P11 (Table 2) is found in several secreted proteins associated with mucins [82,83,84], many of which are involved in the response to gastrointestinal mucosal injury and inflammation, though the function of such a secreted protein in the CNS during tissue regeneration is not clear; perhaps it could be involved in extracellular matrix degradation.
The gene encoding P12 (pho) is located on zebrafish chromosome 5 and encodes a large predicted protein of 2798 amino acids (Table 1). Interestingly, P12 (pho) has been previously described to be required for the regeneration of zebrafish neuromasts, which are sensory patches located along the zebrafish body, but its function has not been studied otherwise. The coiled coil domain found in the protein domain/GO term analysis (Table 2) was described previously. In addition, we find that P12 is predicted to have more than 50% of its amino acids disordered, and is therefore likely an unstructured protein (Supplemental Figure 2). That P12 is a disordered protein is likely the reason that other analyses did not prove informative (Table 3, Table 4, Table 5, Table 6). Many studies have shown that disordered proteins evolve more rapidly than structured proteins [75,76,77,78] and that the disordered region of the protein drives this rapid evolution. In addition, large proteins with coiled-coil domains appear to have functions in cell structure. In spite of the predicted disordered structure, the previously cited study found evidence for an ATPase and transmembrane domain; however, our analyses did not reveal these features. Given that P12 is reported to be required for neuromast regeneration in zebrafish, we considered that a syntenic relationship might be identified in genomes of other species known to have robust regenerative abilities. However, our synteny analyses did not return predicted syntenic regions compared to Ambystoma mexicanum, Xenopus laevis, Xenopus tropicalis, or Cynops pyrrhogaster (not shown).
Comparison to other published RNA-seq datasets
We were interested in determining to what extent transcripts mapping to selected genes might be shared in other zebrafish tissues/cells, such as regenerating heart tissue, resting microglia, and microglia responding to acute damage. We focused this comparison on P1, P9, and P12 because P1 had particularly informative analyses above (indicating kinase function), and P9 and P12 might have novel functions in regeneration. Interestingly, transcripts for both P1 and P9 were increased in regenerating heart tissue samples compared to uninjured tissue (Fig. 6a). Transcripts mapping to P1 appeared slightly more abundant in resting microglia compared to other brain cells, but levels did not change significantly in microglia responding to acute damage (Fig. 6b). Since P1 was enriched in microglia in our study, which sampled microglia/macrophages during retinal regeneration, it is possible that expression and function of this putative kinase (P1) are upregulated during tissue regeneration. Transcripts for the P9 gene were present in microglia in the zebrafish brain, both in the resting state and in response to acute brain damage (Fig. 6b), though they did not appear to change significantly in such conditions. Thus, it is possible that P9 is a mediator produced by microglia/macrophages that acts on the local vasculature to control blood pressure locally, and perhaps this function is upregulated during tissue regeneration.
Examining expression levels of P12 did not demonstrate any apparent upregulation of P12 in regenerating heart compared to the very low transcript levels in uninjured heart tissue (Fig. 6a). However, P12 expression was observed in resting microglia from zebrafish brain, and the expression of P12 appeared to be reduced in context of microglial acute damage response  (Fig. 6b). This expression pattern, in combination with our dataset indicating expression by microglia/macrophages during retinal regeneration, suggests that P12 (pho) may have function in restoring and/or maintaining a “resting” microglial/macrophage state. However, such a hypothesis will require experimental testing.
We next examined a published RNA-seq dataset representing zebrafish macrophages responding to M. marinum infection, to determine if the genes of interest were also differentially expressed in zebrafish macrophages responding to microbial infection. Interestingly, although transcripts were detected in the Rouget et al. study for ten out of twelve of the genes, only one of these (P6, si:ch73-248e21.7, which may have complement regulatory function based on the results described above) was found to be differentially expressed in macrophages from infected fish compared to uninfected fish based on the authors’ cut-off criteria of Log2FC >= 1, p-adj < 0.05 (Table 7). This supports the idea that these genes could comprise part of a unique transcriptome that is expressed in microglia/macrophages during tissue regeneration compared to that in response to microbial infection.
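A cut-off of this kind can be expressed as a simple two-condition filter. The sketch below is illustrative only: the gene names and values are hypothetical, and it assumes that the fold-change threshold is applied to the absolute value of Log2FC (so that both up- and down-regulated genes pass), which the text above does not specify.

```python
def is_differentially_expressed(log2fc, padj, min_abs_log2fc=1.0, max_padj=0.05):
    """Apply a fold-change and adjusted p-value cut-off to one gene."""
    # Assumption: the threshold is applied to |log2FC|, so strong
    # down-regulation counts as differential expression as well.
    return abs(log2fc) >= min_abs_log2fc and padj < max_padj


# Hypothetical (log2FC, adjusted p-value) pairs, not taken from any dataset.
results = {
    "gene_up": (2.3, 0.001),
    "gene_small_change": (0.4, 0.001),
    "gene_not_significant": (1.8, 0.20),
    "gene_down": (-1.0, 0.049),
}

degs = [g for g, (fc, p) in results.items() if is_differentially_expressed(fc, p)]
print(degs)
```

A gene must satisfy both conditions to be called differentially expressed; a large fold change with a non-significant adjusted p-value, or a significant but small change, is excluded.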
In this study, we analyzed twelve zebrafish genes with unknown function. These genes were selected from our previous transcriptome analysis of zebrafish microglia/macrophages isolated from regenerating retinal tissue . We used bioinformatic analyses to analyze the twelve selected transcripts to suggest putative functions. These analyses included protein domain and gene ontology (GO) terms, amino acid similarity, predicted protein structure, and synteny comparisons. For some selected genes, we examined expression level in other published studies of gene expression in zebrafish [52, 64], and examined other published data sets involving macrophages responding to microbial infection  to determine if these genes might be regulated in different activation contexts.
Results for many of the genes analyzed indicate putative functions related to the immune system. Several of these functions may not be well described in fish compared to mammalian organisms. The predicted genes/predicted proteins yielding the most informative results include P1 (results strongly indicate receptor associated kinase activity), P9 (previously annotated as urp1, which results indicate urotensin-like activity), P10 (which may have chemokine activity), and P11 (which could be an enzyme involved in carbohydrate metabolism). Although only an immunoglobulin-like fold domain was revealed for P3 and P8, and a possible mucin domain for P5, these results provide at least some new insight into the structure of the predicted proteins as these domains have not been previously noted for these genes. On the other hand, our analyses did not reveal significant functional information about P2, P4, P6, P7, and P12. Given that P12 (pho) is predicted to be a disordered protein, our analyses do not allow us to make predictions about the function of this particular protein, though it remains of interest due to its previously indicated role in neuromast regeneration . It will be interesting to determine, experimentally, if phoenix (pho), or any of the other genes analyzed in this work, are required for retinal regeneration.
The lack of syntenic relationships between zebrafish and mouse/human for the majority of the genes analyzed is notable, suggesting that possibly these genes were not evolutionarily retained across these species or alternatively, that these genes may have appeared in certain species . For the one zebrafish gene that did have syntenic relationship identified, sequence alignment did not indicate an evolutionary relationship to the candidate gene in the syntenic region. Orthologs were identified for some, but not all, of these zebrafish genes of interest in species which are also known to regenerate damaged tissue (Axolotl, Xenopus and Salamander, Table 6 and Supplemental Table 1). We therefore consider that, in future work, it is important to determine if the genetic program used by microglia/macrophages during zebrafish CNS regeneration is unique on a species level. Whether such a unique genetic program is required for successful regeneration also remains to be determined.
To begin to probe this question, we examined other published RNA-seq datasets for expression patterns of the genes examined in this work. For selected genes (P1, P9, and P12), we examined transcript abundance in samples from zebrafish regenerating heart tissue and zebrafish brain microglia. Both P1 and P9 showed upregulation in regenerating zebrafish heart, while P12 transcripts were apparently reduced in microglia responding to acute damage compared to resting microglia. When we examined the transcriptome of zebrafish macrophages responding to infection by the microbe M. marinum, only one of the twelve genes discussed in our work here was found to be differentially expressed in this context. It is worth considering that the samples sequenced in our study differ from those in these other studies with regard to the developmental age/stage of the animal, location in the body, sample preparation, and sequencing protocols, as well as other factors. However, these comparisons might still suggest that these genes are regulated in a tissue regeneration context rather than in response to microbial infection. Thus, it is possible that at least some of these genes comprise part of a general transcriptional program active in zebrafish microglia/macrophages responding to tissue damage and/or infection. However, further experimental studies involving at least some of these genes (e.g. P1, which bioinformatic predictions suggest could be a kinase, and P12 (pho)) are likely to increase our understanding of mechanisms involved in successful tissue regeneration. Indeed, harnessing such regenerative capacity in mammals must be better informed by a more thorough functional understanding of the genetic program, executed by organisms such as zebrafish, that underlies successful regeneration. Such work will also lead to a better evolutionary understanding of the vertebrate innate immune system.
In this study, we have predicted putative functions for several zebrafish genes with previously unknown function. Transcripts mapping to these genes were enriched in microglia/macrophages during retinal regeneration, suggesting they could have functional importance in tissue regeneration. We identified putative orthologs of several of these genes, mainly based on functional domains, which provide insight into possible protein function. In addition, comparison to other RNAseq datasets suggests that most of these genes could be part of a transcriptional program expressed by microglia/macrophages during tissue regeneration. Our findings provide a foundation for future experimental work to determine the function of these genes in vivo.
RNAseq dataset and predicted orthology
The 3’mRNA Quant-seq experiment and differential gene expression (DEG) analysis are described in Mitchell et al., 2019. This dataset is available on the Gene Expression Omnibus (GEO120467). To identify putative mouse and human orthologs of the 986 transcripts found to be enriched in mpeg1+ cells compared to other cell types, the DRSC integrative ortholog prediction tool (DIOPT, v 7.0, www.flyrnai.org) was employed based on the zebrafish ENSEMBL ID.
Protein domains and gene ontology (GO) terms
The protein domains and the gene ontology (GO) terms (Biological Process and Molecular Function) were determined from the universal protein knowledgebase (UniProt) and the integrative protein signature database (InterPro). The gene ID from Ensembl (https://www.ensembl.org/) was used to extract the predicted protein sequence of the gene from the National Center for Biotechnology Information database (NCBI, https://www.ncbi.nlm.nih.gov/). The gene’s amino acid sequence was used to extract protein domains and gene ontology (GO) terms in UniProt and InterPro.
Two approaches were used to find orthologs for each protein based on sequence similarity, EggNOG and SmartBLAST, because these two approaches use different protein databases. The bioinformatics web-server EggNOG 4.5.1 compares the input protein sequence to the sequences available in several databases and displays the list of orthologs of the protein and the species where those orthologs are found. The “default” settings of the web-server SmartBLAST (https://blast.ncbi.nlm.nih.gov/smartblast/) were used to identify the species of origin of orthologs (and paralogues within zebrafish) which were best-matched by our genes using the non-redundant protein sequence database.
To look for orthologs in species with described capacity for regeneration (Ambystoma mexicanum, Xenopus laevis, Xenopus tropicalis, Cynops pyrrhogaster), the protein sequences of zebrafish genes were compared to the NCBI database (http://blast.ncbi.nlm.nih.gov) using BLASTP with the BLOSUM45 scoring matrix and Gap Costs “Existence: 10 Extension: 3”. In addition, we used tBLASTn to identify putative unannotated orthologs in these species, and these results are reported in Supplemental Table 1.
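The BLASTP settings described above were applied through the NCBI web interface. For readers who prefer a local search, the same scoring parameters could be passed to the BLAST+ command-line program `blastp`, roughly as in the following sketch. The file names, the database name and the tabular output format are hypothetical placeholders and are not taken from this study; the function only assembles the argument list and does not run the search.

```python
from typing import List


def build_blastp_command(query_fasta: str, database: str, out_path: str) -> List[str]:
    """Assemble a BLAST+ blastp argument list with the scoring settings named in the text."""
    return [
        "blastp",
        "-query", query_fasta,   # hypothetical FASTA file of zebrafish protein sequences
        "-db", database,         # hypothetical local protein database
        "-matrix", "BLOSUM45",   # scoring matrix stated in the Methods
        "-gapopen", "10",        # gap cost "Existence: 10"
        "-gapextend", "3",       # gap cost "Extension: 3"
        "-outfmt", "6",          # tabular output; an assumption, not stated in the text
        "-out", out_path,
    ]


# Placeholder names for illustration only.
cmd = build_blastp_command("zebrafish_proteins.fasta", "amphibian_proteins", "hits.tsv")
print(" ".join(cmd))
```

The resulting list can be handed to `subprocess.run` on a machine where BLAST+ is installed and a protein database has been built.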
We inferred protein disorder using default settings (5% false positive rate) of the server PrDOS (http://prdos.hgc.jp/cgi-bin/top.cgi), which predicts natively disordered regions of a protein chain from its amino acid sequence. PrDOS returns a disorder probability for each residue. Proteins with more than 30–50% predicted disordered residues are considered disordered proteins.
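The classification rule above can be illustrated with a minimal sketch that takes per-residue disorder probabilities and reports the fraction of residues predicted as disordered. The probability threshold of 0.5 and the example values are assumptions made for illustration; the 30% minimum fraction corresponds to the lower bound of the 30–50% range given in the text.

```python
from typing import Sequence


def disordered_fraction(probabilities: Sequence[float], threshold: float = 0.5) -> float:
    """Return the fraction of residues whose disorder probability exceeds the threshold."""
    if not probabilities:
        raise ValueError("at least one residue probability is required")
    n_disordered = sum(1 for p in probabilities if p > threshold)
    return n_disordered / len(probabilities)


def is_disordered_protein(
    probabilities: Sequence[float],
    threshold: float = 0.5,      # assumed per-residue cut-off
    min_fraction: float = 0.3,   # lower bound of the 30-50% range in the text
) -> bool:
    """Call a protein disordered when enough of its residues are predicted disordered."""
    return disordered_fraction(probabilities, threshold) > min_fraction


# Made-up probabilities for a 10-residue toy sequence.
example = [0.9, 0.8, 0.7, 0.2, 0.1, 0.1, 0.6, 0.3, 0.2, 0.1]
print(disordered_fraction(example))
print(is_disordered_protein(example))
```

Raising `min_fraction` to 0.5 applies the stricter end of the range, so a protein can be classified differently depending on which bound is chosen.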
We used the bioinformatics web-server SWISS-MODEL to identify templates or homologs for our list of unknown proteins based on the predicted 3D structure of the proteins of interest (with Global Model Quality Estimation, or GMQE, > 0.3 as cut-off). Homology modeling, or comparative protein modeling, uses an ortholog’s (template’s) experimentally-determined 3D-structure to estimate a model for the target sequence.
Synteny comparisons were performed using www.ensembl.org, because this database uses the most updated genome build for zebrafish (GRCz11). The ENSEMBL ID was used to identify the gene of interest and the chromosomal region containing the gene was selected. In the Comparative Genomics menu option, synteny was selected to compare the chromosomal region of the zebrafish gene to human (GRCh38.p13) and mouse (GRCm38.p6) genomes. Only one gene of interest was found to lie in a syntenic region (P4, Supplemental Figure 3). The amino acid sequence of the zebrafish gene was compared, using BLASTP (http://blast.ncbi.nlm.nih.gov) and the National Center for Biotechnology Information (NCBI) database, to the candidate annotated gene found inside the syntenic region to look for similarity and orthologs; alignments were compared using each scoring matrix available in the program.
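As an illustration of the GMQE cut-off, the sketch below filters a list of candidate templates and ranks the survivors by score. The record layout (dictionaries with `template` and `gmqe` keys) and the example values are hypothetical and do not reflect the SWISS-MODEL output format or results from this study.

```python
from typing import Dict, List, Union

Template = Dict[str, Union[str, float]]


def filter_templates(templates: List[Template], gmqe_cutoff: float = 0.3) -> List[Template]:
    """Keep templates with GMQE strictly above the cut-off, best score first."""
    kept = [t for t in templates if float(t["gmqe"]) > gmqe_cutoff]
    return sorted(kept, key=lambda t: float(t["gmqe"]), reverse=True)


# Made-up candidates; names and scores are placeholders.
candidates = [
    {"template": "template_A", "gmqe": 0.12},
    {"template": "template_B", "gmqe": 0.45},
    {"template": "template_C", "gmqe": 0.30},
    {"template": "template_D", "gmqe": 0.61},
]

for hit in filter_templates(candidates):
    print(hit["template"], hit["gmqe"])
```

Because the text specifies GMQE > 0.3, a template scoring exactly 0.30 is excluded in this sketch.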
Expression level in other RNA-seq datasets
We determined the expression level of selected zebrafish genes of interest in other published datasets of zebrafish heart regeneration and zebrafish brain microglia using the Zf Regeneration Database (www.zfregeneration.org). The gene’s symbol or ENSEMBL ID was used to plot the normalized expression level of transcripts of interest.
To probe the RNA-seq dataset from Rougeot et al., we searched for the ENSEMBL ID of each gene of interest in the raw datasets (GSE78954 and GSE68920) to determine if transcript counts were detected. To determine if the gene was considered to be differentially expressed in macrophages responding to infection, we examined the authors’ reported results of differential expression analysis comparing transcripts from sorted uninfected vs. M. marinum infected macrophages from zebrafish larvae (Rougeot et al., 2019).
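The lookup of ENSEMBL IDs in a raw count table can be sketched as follows. The sketch assumes a tab-separated table with the gene ID in the first column and one integer count column per sample; the actual layout of the GEO files may differ, and the counts shown are invented for illustration.

```python
import csv
import io
from typing import Dict, Iterable, TextIO


def detected_counts(table: TextIO, gene_ids: Iterable[str]) -> Dict[str, int]:
    """Sum counts across samples for each requested gene ID (0 if the ID is absent)."""
    wanted = set(gene_ids)
    totals = {gene_id: 0 for gene_id in wanted}
    reader = csv.reader(table, delimiter="\t")
    next(reader, None)  # skip the header row
    for row in reader:
        if row and row[0] in wanted:
            totals[row[0]] += sum(int(value) for value in row[1:])
    return totals


# Invented counts; only the ENSEMBL IDs come from the article.
demo_table = io.StringIO(
    "gene_id\tsample_1\tsample_2\n"
    "ENSDARG00000105643\t5\t7\n"
    "ENSDARG00000000001\t3\t4\n"
    "ENSDARG00000093493\t0\t0\n"
)
totals = detected_counts(demo_table, ["ENSDARG00000105643", "ENSDARG00000093493"])
for gene_id, total in sorted(totals.items()):
    print(gene_id, "detected" if total > 0 else "not detected", total)
```

A total of zero marks a gene as not detected, whether it is listed with zero counts or missing from the table altogether.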
Availability of data and materials
The original RNAseq dataset (Mitchell et al., 2019) is available on the Gene Expression Omnibus (GEO120467).
Amphibian RCA protein 2
Brain abundant, membrane attached signal protein 1
Basic local alignment search tool
BLK proto-oncogene, Src family tyrosine kinase
Blocks Substitution Matrix
Calcitonin related polypeptide beta
Caspase activation and recruitment domain
Cluster of Differentiation 48
Coding DNA Sequence
CEA cell adhesion molecule 6
Central nervous system
Colony stimulating factor 1 receptor
Differentially expressed genes
XCL2 = X-C motif chemokine ligand 2
LOC102359650 = programmed cell death 1 ligand 1-like
SI:DKEY-181F22.4 = receptor-interacting serine/threonine-protein kinase 2-like
LOC102236357 = hepatocyte cell adhesion molecule-like
Glucosidase, alpha; neutral AB
Genome Reference Consortium Human Build 38
Genome Reference Consortium Mouse Reference 38
Genome Reference Consortium Danio rerio Build 11
Heparan sulfate 2-O-sulfotransferase 1
Multi-glycosylated core protein 24
Macrophage stimulating 1 receptor (c-met-related tyrosine kinase)
MOS proto-oncogene, serine/threonine kinase
Macrophage-expressed gene 1 protein
si:dkey-181f22.4 = ENSDARG00000105643
si:ch73-112l6.1 = ENSDARG00000093126
zgc:174863 = ENSDARG00000099476
si:dkey-56m19.5 = ENSDARG00000068432
si:ch211-105j21.9 = ENSDARG00000097845
si:ch73-248e21.7 = ENSDARG00000096331
si:ch211-191j22.3 = ENSDARG00000095459
si:ch73-256j6.2 = ENSDARG00000071653
urp1 = ENSDARG00000093493
xcl32a.1 = ENSDARG00000093906
si:ch211-287n14.3 = ENSDARG00000093650
pho = ENSDARG00000035133
Platelet-derived growth factor receptor, beta polypeptide
Protein tyrosine phosphatase, receptor type A
RNA-seq, RNAseq = RNA sequencing
Martin P. Wound healing--aiming for perfect skin regeneration. Science. 1997;276:75–81.
Hirsch T, Rothoeft T, Teig N, Bauer JW, Pellegrini G, De Rosa L, et al. Regeneration of the entire human epidermis using transgenic stem cells. Nature. 2017;551:327–32.
Carlson BM. The regeneration of skeletal muscle — a review. American Journal of Anatomy. 1973;137:119–49.
Tedesco FS, Dellavalle A, Diaz-Manera J, Messina G, Cossu G. Repairing skeletal muscle: regenerative potential of skeletal muscle stem cells. J Clin Invest. 2010;120:11–9.
Michalopoulos GK, DeFrances MC. Liver regeneration. Science. 1997;276:60–6.
Alwayn IPJ, Verbesey JE, Kim S, Roy R, Arsenault DA, Greene AK, et al. A critical role for matrix metalloproteinases in liver regeneration. J Surg Res. 2008;145:192–8.
Brösamle C, Schwab ME. Axonal regeneration in the mammalian CNS. Semin Neurosci. 1996;8:107–13.
Nicholls JG, Frs AWB, Eugenin J, Geiser R, Lepre M, et al. Why does the central nervous system not regenerate after injury? Surv Ophthalmol. 1999;43:S136–41.
Bradke F, Fawcett JW, Spira ME. Assembly of a new growth cone after axotomy: the precursor to axon regeneration. Nat Rev Neurosci. 2012;13:183–93.
Lyons DA, Talbot WS. Glial cell development and function in Zebrafish. Cold Spring Harb Perspect Biol. 2015;7:a020586.
Zhou B, Yu P, Lin M-Y, Sun T, Chen Y, Sheng Z-H. Facilitation of axon regeneration by enhancing mitochondrial transport and rescuing energy deficits. J Cell Biol. 2016;214:103–19.
Wan J, Goldman D. Retina regeneration in zebrafish. Curr Opin Genet Dev. 2016;40:41–7.
Mensinger AF, Powers MK. Visual function in regenerating teleost retina following cytotoxic lesioning. Vis Neurosci. 1999;16:241–51.
Hitchcock P, Ochocinska M, Sieh A, Otteson D. Persistent and injury-induced neurogenesis in the vertebrate retina. Prog Retin Eye Res. 2004;23:183–94.
Montgomery JE, Parsons MJ, Hyde DR. A novel model of retinal ablation demonstrates that the extent of rod cell death regulates the origin of the regenerated zebrafish rod photoreceptors. J Comp Neurol. 2010;518:800–14.
Chablais F, Veit J, Rainer G, Jaźwińska A. The zebrafish heart regenerates after cryoinjury-induced myocardial infarction. BMC Dev Biol. 2011;11:21.
Kizil C, Kaslin J, Kroehne V, Brand M. Adult neurogenesis and brain regeneration in zebrafish. Developmental Neurobiology. 2012;72:429–61.
Gemberling M, Bailey TJ, Hyde DR, Poss KD. The zebrafish as a model for complex tissue regeneration. Trends Genet. 2013;29:611–20.
Wan J, Zhao X-F, Vojtek A, Goldman D. Retinal injury, growth factors, and cytokines converge on β-catenin and pStat3 signaling to stimulate retina regeneration. Cell Rep. 2014;9:285–97.
McGinn TE, Mitchell DM, Meighan PC, Partington N, Leoni DC, Jenkins CE, et al. Restoration of dendritic complexity, functional connectivity, and diversity of regenerated retinal bipolar neurons in adult Zebrafish. J Neurosci. 2018;38:120–36.
Caplan AI. Adult mesenchymal stem cells for tissue engineering versus regenerative medicine. J Cell Physiol. 2007;213:341–7.
Segawa M, Fukada S, Yamamoto Y, Yahagi H, Kanematsu M, Sato M, et al. Suppression of macrophage functions impairs skeletal muscle regeneration with severe fibrosis. Exp Cell Res. 2008;314:3232–44.
Lu H, Huang D, Saederup N, Charo IF, Ransohoff RM, Zhou L. Macrophages recruited via CCR2 produce insulin-like growth factor-1 to repair acute skeletal muscle injury. FASEB J. 2010;25:358–69.
Godwin JW, Pinto AR, Rosenthal NA. Chasing the recipe for a pro-regenerative immune system. Semin Cell Dev Biol. 2017;61:71–9.
Petrie TA, Strand NS, Tsung-Yang C, Rabinowitz JS, Moon RT. Macrophages modulate adult zebrafish tail fin regeneration. Development. 2014;141:2581–91.
Wynn TA, Vannella KM. Macrophages in tissue repair, regeneration, and fibrosis. Immunity. 2016;44:450–62.
Simkin J, Gawriluk TR, Gensel JC, Seifert AW. Macrophages are necessary for epimorphic regeneration in African spiny mice. eLife. 2017;6:e24623.
Simkin J, Sammarco MC, Marrero L, Dawson LA, Yan M, Tucker C, et al. Macrophages are required to coordinate mouse digit tip regeneration. Development. 2017;144:3907–16.
Dort J, Fabre P, Molina T, Dumont NA. Macrophages are key regulators of stem cells during skeletal muscle regeneration and diseases. Stem Cells Int. 2019. https://doi.org/10.1155/2019/4761427.
Mitchell DM, Sun C, Hunter SS, New DD, Stenkamp DL. Regeneration associated transcriptional signature of retinal microglia and macrophages. Sci Rep. 2019;9:1–17.
Kreutzberg GW. Microglia: a sensor for pathological events in the CNS. Trends Neurosci. 1996;19:312–8.
Stoll G, Jander S. The role of microglia and macrophages in the pathophysiology of the CNS. Prog Neurobiol. 1999;58:233–47.
Baumann N, Pham-Dinh D. Biology of Oligodendrocyte and myelin in the mammalian central nervous system. Physiol Rev. 2001;81:871–927.
Nakajima K, Kohsaka S. Microglia: activation and their significance in the central nervous system. J Biochem. 2001;130:169–75.
Katsumoto A, Lu H, Miranda AS, Ransohoff RM. Ontogeny and functions of central nervous system macrophages. J Immunol. 2014;193:2615–21.
Bernardos RL, Barthel LK, Meyers JR, Raymond PA. Late-stage neuronal progenitors in the retina are radial Müller glia that function as retinal stem cells. J Neurosci. 2007;27:7028–40.
Fausett BV, Goldman D. A role for α1 tubulin-expressing Müller glia in regeneration of the injured Zebrafish retina. J Neurosci. 2006;26:6303–13.
Bringmann A, Pannicke T, Grosche J, Francke M, Wiedemann P, Skatchkov SN, et al. Müller cells in the healthy and diseased retina. Prog Retin Eye Res. 2006;25:397–424.
Fimbel SM, Montgomery JE, Burket CT, Hyde DR. Regeneration of inner retinal neurons after Intravitreal injection of Ouabain in Zebrafish. J Neurosci. 2007;27:1712–24.
Ramachandran R, Fausett BV, Goldman D. Ascl1a regulates Müller glia dedifferentiation and retinal regeneration through a Lin-28-dependent, let-7 microRNA signalling pathway. Nat Cell Biol. 2010;12:1101–7.
Nagashima M, Barthel LK, Raymond PA. A self-renewing division of zebrafish Müller glial cells generates neuronal progenitors that require N-cadherin to regenerate retinal neurons. Development. 2013;140:4510–21.
Hamon A, Roger JE, Yang X-J, Perron M. Müller glial cell-dependent regeneration of the neural retina: An overview across vertebrate model systems. Developmental Dynamics. 2018:727–38.
Wang M, Ma W, Zhao L, Fariss RN, Wong WT. Adaptive Müller cell responses to microglial activation mediate neuroprotection and coordinate inflammation in the retina. J Neuroinflammation. 2011;8:173.
Fischer AJ, Zelinka C, Gallina D, Scott MA, Todd L. Reactive microglia and macrophage facilitate the formation of Müller glia-derived retinal progenitors. Glia. 2014;62:1608–28.
Conedera FM, Pousa AMQ, Mercader N, Tschopp M, Enzmann V. Retinal microglia signaling affects Müller cell behavior in the zebrafish following laser injury induction. Glia. 2019;67:1150–66.
Huang T, Cui J, Li L, Hitchcock PF, Li Y. The role of microglia in the neurogenesis of zebrafish retina. Biochem Biophys Res Commun. 2012;421:214–20.
Sieger D, Peri F. Animal models for studying microglia: the first, the popular, and the new. Glia. 2013;61:3–9.
Casano AM, Peri F. Microglia: multitasking specialists of the brain. Dev Cell. 2015;32:469–77.
Casano AM, Albert M, Peri F. Developmental apoptosis mediates entry and positioning of microglia in the Zebrafish brain. Cell Rep. 2016;16:897–906.
Hamilton L, Astell KR, Velikova G, Sieger D. A Zebrafish live imaging model reveals differential responses of microglia toward Glioblastoma cells in vivo. Zebrafish. 2016;13:523–34.
Mitchell DM, Lovel AG, Stenkamp DL. Dynamic changes in microglial and macrophage characteristics during degeneration and regeneration of the zebrafish retina. J Neuroinflammation. 2018;15:163.
Oosterhof N, Holtman IR, Kuil LE, van der Linde HC, Boddeke EWGM, Eggen BJL, et al. Identification of a conserved and acute neurodegeneration-specific microglial transcriptome in the zebrafish. Glia. 2017;65:138–49.
Ruzicka L, Bradford YM, Frazer K, Howe DG, Paddock H, Ramachandran S, et al. ZFIN, The zebrafish model organism database: Updates and new directions. Genesis. 2015;53:498–509.
Hubbard T, Barker D, Birney E, Cameron G, Chen Y, Clark L, et al. The Ensembl genome database project. Nucleic Acids Res. 2002;30:38–41.
Huerta-Cepas J, Szklarczyk D, Forslund K, Cook H, Heller D, Walter MC, et al. eggNOG 4.5: a hierarchical orthology framework with improved functional annotations for eukaryotic, prokaryotic and viral sequences. Nucleic Acids Res. 2016;44:D286–93.
van der Lee R, Buljan M, Lang B, Weatheritt RJ, Daughdrill GW, Dunker AK, et al. Classification of intrinsically disordered regions and proteins. Chem Rev. 2014;114:6589–631.
Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, et al. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018;46:W296–303.
Biasini M, Bienert S, Waterhouse A, Arnold K, Studer G, Schmidt T, et al. SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information. Nucleic Acids Res. 2014;42:W252–8.
Rose PW, Bi C, Bluhm WF, Christie CH, Dimitropoulos D, Dutta S, et al. The RCSB protein data Bank: new resources for research and education. Nucleic Acids Res. 2013;41:D475–82.
Hrdinka M, Schlicher L, Dai B, Pinkas DM, Bufton JC, Picaud S, et al. Small molecule inhibitors reveal an indispensable scaffolding role of RIPK2 in NOD2 signaling. EMBO J. 2018;37:e99372.
Pang SS, Berry R, Chen Z, Kjer-Nielsen L, Perugini MA, King GF, et al. The structural basis for autonomous dimerization of the pre-T-cell antigen receptor. Nature. 2010;467:844–8.
Kuloǧlu ES, McCaslin DR, Kitabwalla M, Pauza CD, Markley JL, Volkman BF. Monomeric solution structure of the prototypical ‘C’ chemokine Lymphotactin. Biochemistry. 2001;40:12486–96.
Ren L, Qin X, Cao X, Wang L, Bai F, Bai G, et al. Structural insight into substrate specificity of human intestinal maltase-glucoamylase. Protein Cell. 2011;2:827–36.
Goldman JA, Kuzu G, Lee N, Karasik J, Gemberling M, Foglia MJ, et al. Resolving Heart Regeneration by Replacement Histone Profiling. Developmental Cell. 2017;40:392–404 e5.
Nieto-Arellano R, Sánchez-Iranzo H. zfRegeneration: a database for gene expression profiling during regeneration. Bioinformatics. 2019;35:703–5.
Bouchier-Hayes L, Martin SJ. CARD games in apoptosis and immunity. EMBO Rep. 2002;3:616–21.
Ludwig-Galezowska AH, Flanagan L, Rehm M. Apoptosis repressor with caspase recruitment domain, a multifunctional modulator of cell death. J Cell Mol Med. 2011;15:1044–53.
Martinon F, Tschopp J. Inflammatory Caspases: linking an intracellular innate immune system to autoinflammatory diseases. Cell. 2004;117:561–74.
Hruz P, Eckmann L. Caspase recruitment domain-containing sensors and adaptors in intestinal innate immunity. Curr Opin Gastroenterol. 2008;24:108–14.
Inohara N, del Peso L, Koseki T, Chen S, Núñez G. RICK, a novel protein kinase containing a Caspase recruitment domain, interacts with CLARP and regulates CD95-mediated apoptosis. J Biol Chem. 1998;273:12296–300.
McCarthy JV, Ni J, Dixit VM. RIP2 is a novel NF-κB-activating and cell death-inducing kinase. J Biol Chem. 1998;273:16968–75.
Thome M, Hofmann K, Burns K, Martinon F, Bodmer J-L, Mattmann C, et al. Identification of CARDIAK, a RIP-like kinase that associates with caspase-1. Curr Biol. 1998;8:885–9.
Manon F, Favier A, Núñez G, Simorre J-P, Cusack S. Solution structure of NOD1 CARD and mutational analysis of its interaction with the CARD of downstream kinase RICK. J Mol Biol. 2007;365:160–74.
Gong Q, Long Z, Zhong FL, Teo DET, Jin Y, Yin Z, et al. Structural basis of RIP2 activation and signaling. Nat Commun. 2018;9:1–13.
Brown CJ, Takayama S, Campen AM, Vise P, Marshall TW, Oldfield CJ, et al. Evolutionary rate heterogeneity in proteins with long disordered regions. J Mol Evol. 2002;55:104–10.
Afanasyeva A, Bockwoldt M, Cooney CR, Heiland I, Gossmann TI. Human long intrinsically disordered protein regions are frequent targets of positive selection. Genome Res. 2018;28:975–82.
Fahmi M, Ito M. Evolutionary approach of intrinsically disordered CIP/KIP proteins. Sci Rep. 2019;9:1–10.
Forcelloni S, Giansanti A. Evolutionary forces on different flavors of intrinsic disorder in the human proteome. bioRxiv. 2019:653063.
Russell FD. Urotensin II in cardiovascular regulation. Vasc Health Risk Manag. 2008;4:775–85.
Debiec R, Christofidou P, Denniff M, Bloomer LD, Bogdanski P, Wojnar L, et al. Urotensin-II system in genetic control of blood pressure and renal function. PLoS One. 2013;8. https://doi.org/10.1371/journal.pone.0083137.
Ames RS, Sarau HM, Chambers JK, Willette RN, Aiyar NV, Romanic AM, et al. Human urotensin-II is a potent vasoconstrictor and agonist for the orphan receptor GPR14. Nature. 1999;401:282–6.
Poulsom R, Wright NA. Trefoil peptides: a newly recognized family of epithelial mucin-associated molecules. Am J Physiology-Gastrointestinal and Liver Physiology. 1993;265:G205–13.
Wong WM, Poulsom R, Wright NA. Trefoil peptides. Gut. 1999;44:890–5.
Longman RJ, Douthwaite J, Sylvester PA, Poulsom R, Corfield AP, Thomas MG, et al. Coordinated localisation of mucins and trefoil peptides in the ulcer associated cell lineage and the gastrointestinal mucosa. Gut. 2000;47:792–800.
Aihara E, Engevik KA, Montrose MH. Trefoil factor peptides and gastrointestinal function. Annu Rev Physiol. 2017;79:357–80.
Behra M, Bradsher J, Sougrat R, Gallardo V, Allende ML, Burgess SM. Phoenix is required for Mechanosensory hair cell regeneration in the Zebrafish lateral line. PLoS Genet. 2009;5. https://doi.org/10.1371/journal.pgen.1000455.
Rougeot J, Torraca V, Zakrzewska A, Kanwal Z, Jansen HJ, Sommer F, et al. RNAseq profiling of leukocyte populations in Zebrafish larvae reveals a cxcl11 chemokine gene as a marker of macrophage polarization during mycobacterial infection. Front Immunol. 2019;10. https://doi.org/10.3389/fimmu.2019.00832.
Singh U, Syrkin WE. How new genes are born. eLife. 2020;9:e55136.
Apweiler R, Bairoch A, Wu CH, Barker WC, Boeckmann B, Ferro S, et al. UniProt: the Universal Protein knowledgebase. Nucleic Acids Res. 2004;32(suppl_1):D115–9.
Hunter S, Apweiler R, Attwood TK, Bairoch A, Bateman A, Binns D, et al. InterPro: the integrative protein signature database. Nucleic Acids Res. 2009;37(suppl_1):D211–5.
Pruitt KD, Tatusova T, Maglott DR. NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007;35(suppl_1):D61–5.
Ishida T, Kinoshita K. PrDOS: prediction of disordered protein regions from amino acid sequence. Nucleic Acids Res. 2007;35(suppl_2):W460–4.
Altschul SF, Madden TL, Schäffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 1997;25:3389–402.
We thank Dr. Celeste Brown (University of Idaho) for guidance on the bioinformatics analysis. We thank Dr. Celeste Brown, Dr. Deborah Stenkamp, and Dr. JT Van Leuven (University of Idaho) for review of previous drafts of the manuscript.
Ousseini Issaka Salia was supported by the Institute for Modeling Collaboration and Innovation (IMCI) at the University of Idaho (NIH P20GM104420). Bioinformatic analyses were also supported through a Modeling Access Grant through IMCI (NIH P20GM104420). The original RNAseq experiment and differential gene expression analysis were funded by a Technology Access Grant from Idaho INBRE (NIGMS P20GM103408).
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Issaka Salia, O., Mitchell, D.M. Bioinformatic analysis and functional predictions of selected regeneration-associated transcripts expressed by zebrafish microglia. BMC Genomics 21, 870 (2020). https://doi.org/10.1186/s12864-020-07273-8
- Bioinformatic analysis
- Functional predictions