Digital gene expression approach over multiple RNA-Seq data sets to detect neoblast transcriptional changes in Schmidtea mediterranea

Background: The freshwater planarian Schmidtea mediterranea is recognised as a valuable model for research into adult stem cells and regeneration. With the advent of high-throughput sequencing technologies, it has become feasible to undertake detailed transcriptional analysis of its unique stem cell population, the neoblasts. Nonetheless, a reliable reference for this type of study is still lacking.
Results: Taking advantage of digital gene expression (DGE) sequencing technology, we compare all the available transcriptomes for S. mediterranea and improve their annotation. These results are accessible via the web for the community of researchers. Using the quantitative nature of DGE, we describe the transcriptional profile of neoblasts and present 42 new neoblast genes, including several cancer-related genes and transcription factors. Furthermore, we describe in detail the Smed-meis-like gene and the three Nuclear Factor Y subunits Smed-nf-YA, Smed-nf-YB-2 and Smed-nf-YC.
Conclusions: DGE is a valuable tool for gene discovery, quantification and annotation. The application of DGE in S. mediterranea confirms the planarian stem cells, or neoblasts, as a complex population of pluripotent and multipotent cells regulated by a mixture of transcription factors and cancer-related genes.
Electronic supplementary material: The online version of this article (doi:10.1186/s12864-015-1533-1) contains supplementary material, which is available to authorized users.


New neoblast genes
• Smed-atf6A The Cyclic AMP-dependent Transcription Factor ATF-6 alpha forms a dimer that interacts with the Nuclear Transcription Factor Y (NF-Y) trimer through direct binding to subunit C (NF-YC) (reviewed later in this study). It could also be involved in activation of transcription by the Serum Response Factor [1]. Interestingly, Smed-srf [2] is also overexpressed in X1 (Table 2).
• Smed-ccar1 Cell division cycle and apoptosis regulator protein 1 (CCAR1) is a perinuclear phospho-protein that associates with components of the Mediator complex and also functions as a p53 coactivator [3], which has a homolog in S. mediterranea [4]. Apart from playing an important role in transcriptional regulation, it modulates apoptosis signaling by CD437, a retinoid that causes cell cycle arrest and apoptosis in a number of cancer cells, and plays a role in cell cycle progression and cell proliferation. Loss of c-Myc sensitizes cells to apoptosis by CCAR1, whereas expression of c-Myc inhibits CCAR1-dependent apoptosis [5]. It has been proposed to act as a tumor growth suppressor in a variety of cancers, from breast cancer [6] to B-cell lymphoma [7] or medulloblastoma [8]. In addition, expression of CCAR1 has been detected in the neural crest of vertebrate embryos [9] and osteoblasts [10].
• Smed-dnaJA3 Molecular chaperones are a diverse family of proteins that protect other proteins during synthesis and from cellular stress. Mitochondrial chaperone DNAJ, also known as Hsp40 (heat shock protein 40 kDa), is expressed in a wide variety of organisms from bacteria to humans, where it modulates apoptotic signal transduction or effector structures within the mitochondrial matrix and can modulate IFN-gamma-mediated transcriptional activity [11]. Isoform 2 inhibits apoptosis whereas Isoform 1 increases apoptosis triggered by Tumor Necrosis Factor (TNF) and could act as a tumor suppressor [12,13,14].
• Smed-ergic3 The Endoplasmic Reticulum-Golgi Intermediate Compartment protein 3 probably mediates transport between endoplasmic reticulum and Golgi. Recently, it has been identified as upregulated in lung cancer [15].
• Smed-got2 The glutamic-oxaloacetic transaminase 2 (GOT2) is an aspartate aminotransferase used by pancreatic cancer cells to convert Gln-derived glutamate (Glu) into α-ketoglutarate (αKG) in the mitochondria to fuel the tricarboxylic acid cycle [16].
• Smed-med7 and Smed-med27 MED7 and MED27 are components of the Mediator complex, which serves as a scaffold for the assembly of general transcription factors associated also with RNA polymerase II transcription. They act as intermediaries, transducing regulatory signals from upstream transcriptional activator proteins to the basal transcription machinery at the core promoter. In zebrafish, MED27 is necessary for the development of dopaminergic amacrine cells in the retina and may also negatively regulate the development of rod photoreceptor cells [28].
• Smed-mlx The Max-like protein X (Mlx) interacts with the Max network of transcription factors, which regulates cell growth, the transition from proliferation to differentiation, and apoptosis. Myc and Mad proteins associate with the bHLHZip protein Max to bind specific DNA sequences and regulate the expression of genes important for cell cycle progression. It has been suggested that Mlx may also associate with a subset of the Mad family of transcriptional repressors to antagonize the growth-promoting action of the Myc proto-oncogenes, resembling Max in some cell types or cellular stages and controlling the progression through the S phase and differentiation [29,30]. Hence, Mlx might be important in the sequence of events leading to cell commitment.
• Smed-ncapD2 NcapD2 is a regulatory subunit of the condensin complex that is required for conversion of interphase chromatin into mitotic condensed chromosomes [31]. In interphase cells, the majority of the condensin complex is found in the cytoplasm until mitosis, when most of the complex is associated with chromatin. At the onset of prophase, the regulatory subunits of the complex are phosphorylated by CDK1, leading to condensin's association with chromosome arms and to chromosome condensation [32].
• Smed-nme1 and Smed-set
• Smed-rack1 RACK1 is a highly conserved intracellular adaptor protein originally identified as the receptor for activated protein kinase C (PKC). It is involved in the recruitment, assembly and regulation of a variety of signaling molecules in many cellular processes: negative regulation of cell growth, positive regulation of cell migration and apoptosis, and regulation of the cell cycle [57,58]. RACK1 is upregulated in several cancers [59], and plays an important role in angiogenesis and cancer growth [60]. In colon cancer it inhibits cell growth [61] whereas in breast carcinoma it promotes migration and metastasis by interacting with RhoA and activating the RhoA/Rho kinase pathway [62]. It also induces tumorigenicity in lung cancer through activation of the sonic-hedgehog signaling pathway [63]. In Xenopus neural development, it is required for neural tube closure [64].
• Smed-rbbp Up to six different retinoblastoma binding proteins were overexpressed in neoblasts: four Rbbp4, one Rbbp5 and one Rbbp6 homolog. Transcriptional repression by retinoblastoma is crucial for the proper control of cell growth and it has been reported to regulate stem cell proliferation in freshwater planarians [65]. Three RBBP4s have already been annotated in S. mediterranea: Smed-rbbp4-1 has been shown to be expressed in proliferative cells [66], whereas Smed-rbbp4-2 was not further analyzed or proposed as a neoblast gene [2], and Smed-rbbp4-3 was only described at the phenotype level associated with regeneration [67]. RBBP4 is a component of several complexes that regulate chromatin and promote transcriptional repression [68]. Those include the following: Chromatin Assembly Factor 1 (CAF-1), which mediates chromatin assembly in DNA replication, is required for efficient progression through the S phase and may participate in heterochromatin maintenance in proliferating cells [69]; Polycomb Repressive Complex 2 (PRC2), which inhibits homeotic genes during development [70]; and Nucleosome Remodeling and Histone Deacetylase (NuRD), which blocks embryonic stem cell-specific genes [71]. In addition, RBBP4 is associated with cervical cancer (probably through its regulation of the tumor suppressors p53 and retinoblastoma), thyroid cancer (as a target of Nuclear Factor NF-κB), and apoptosis [72,73]. The functions of RBBP5 and RBBP6 are less well known. RBBP5 is implicated in acute leukemias as part of the Mixed Lineage Leukemia protein 1 (MLL1) core, which is also essential in embryonic development, where it is predominantly associated with the expression of Hox genes [74,75]. RBBP6 may function as a negative regulator of p53, leading to both apoptosis and cell growth, via MDM2 ubiquitination [76].
• Smed-rrM2B The M2-B subunit of Ribonucleoside-diphosphate reductase together with the M1 subunit forms an active ribonucleotide reductase (RNR) complex which is expressed in both resting and proliferating cells in response to DNA damage. It also supplies deoxyribonucleotides for DNA repair in a p53-dependent manner in cells arrested at G1 or G2 [77,78,79], playing a pivotal role in cell survival, cancer [80,81,82] and mitochondrial-associated diseases [83,84,85,86].
• Smed-serinc Serinc are carrier proteins that incorporate serine into membranes and facilitate the synthesis of lipids derived from this amino acid. They form a unique family of five members with eleven transmembrane domains showing no homology to other proteins but highly conserved in eukaryotes [87].
• Smed-srrt Arsenite is a carcinogenic compound that can act as a comutagen by inhibiting DNA repair. Serrate RNA effector molecule was first described as Arsenite-resistance protein 2 (ARS2), since it modulates arsenic sensitivity [88]. ARS2 controls the multipotent progenitor state of postnatal and adult neural stem cells (NSCs), inducing their self-renewal by direct binding to the promoter of the pluripotency factor Sox2 and positively regulating its transcription. It may also play a similar role in embryonic stem cells, since ARS2 is also essential for early mammalian development [89,90].
• Smed-thoc2 As part of the THO Complex, it is required for transcription, processing and nuclear export of spliced mRNA associated with the TREX complex [91]. This complex has been identified as a key element in development, regulating pluripotency and self-renewal of embryonic stem cells, cell differentiation and somatic cell reprogramming as a messenger ribonucleoprotein (mRNP) biogenesis factor [92,93]. It may also be linked to tumorigenesis [94].
• Smed-tif1A Transcription Intermediary Factor 1 is a transcriptional coactivator with a central role in the regulation of cell proliferation and apoptosis by promoting ubiquitination and proteasomal degradation of p53 [95]. In mice, it functions as a modulator of early embryonic gene expression during the first wave of transcription activation in the zygote [96] and plays a role in the control of retinoic acid-dependent proliferation of hepatocytes, acting as a liver-specific tumor suppressor [97]. In humans, it is known to be upregulated in breast and prostate cancers [98,99].
• Smed-traf-4 and Smed-traf-5 TNF Receptor Associated Factors (TRAFs) are a family of scaffold proteins that function as signal transducers of Toll/Interleukin-1 (Toll/IL-1) receptors leading to the activation of the NF-κB transcription factor and the c-Jun N-Terminal Protein Kinases (JNK) among others. Thus, they are involved in the control of inflammation, apoptosis and cell survival. TRAF2 and TRAF3 could act as tumor suppressors, whereas TRAF1 appears to be oncogenic in B cells. Overexpression of TRAF4 and TRAF6 has been reported in breast and lung carcinomas, and in osteosarcoma [100].
• Smed-tsg101 Tumor Susceptibility Gene 101 (TSG101) takes part in the MDM2-p53 regulatory feedback loop, stabilizing MDM2 and thus promoting p53 degradation through the ubiquitin ligase activity of MDM2. Hence, loss of TSG101 would result in up-regulation of p53. Stabilization of MDM2 by TSG101 seems to be achieved by inhibiting ubiquitination [101]. This is consistent with the participation of TSG101 in the ubiquitin system as a component of the Endosomal Sorting Complex Required for Transport I (ESCRT-I), which plays a critical function in endosomal sorting and trafficking of ubiquitinated proteins. Despite its name, the involvement of this gene in cancer remains to be clearly demonstrated. Overexpression of TSG101 has been reported in most cancer types but, although it was initially discovered as a negative regulator of tumorigenesis in a screen for potential tumor suppressors in immortalized fibroblasts, subsequent evidence points to TSG101 as a positive modulator of cancer progression. In any case, TSG101 has also been found to be essential for many processes in the cell, including transcriptional regulation, cell cycle control, and growth and proliferation, and it is clearly required for normal cell function in embryonic and adult tissues [102].
• Smed-tssc1 Tumor Suppressing Subtransferable Candidate 1 (TSSC1) protein was first discovered as one of several tumor-suppressing subtransferable fragments located in the imprinted gene domain of 11p15.5, an important tumor-suppressor gene region [103]. In a recent study it was reported that TSSC1 inhibits breast cancer cell invasion leading to bone metastasis [104].
• Smed-tusc3 Formally a magnesium transporter with a putative function in protein N-glycosylation, the Tumor Suppressor Candidate 3 gene is closely related to embryonic development and cancer. Morpholino knockdown of TUSC3 protein expression in zebrafish embryos results in early developmental arrest [105]. It has been identified as a tumor suppressor in prostate [106] and ovarian cancer [107], although it has been found expressed in most non-lymphoid cells and tissues examined. In addition, point mutations or deletions in the TUSC3 gene have been identified in individuals with nonsyndromic autosomal recessive intellectual disability (ARID) [108].