Planarians can regenerate entire animals from a small fragment of the body. The regenerating fragment is able to create new tissues and remodel existing tissues to form a complete animal. Thus different fragments with very different starting components eventually converge on the same solution. In this study, we performed an extensive RNA-seq time-course on regenerating head and tail fragments to observe the differences and similarities of the transcriptional landscape between head and tail fragments during regeneration.
We have consolidated existing transcriptomic data for S. mediterranea to generate a high confidence set of transcripts for use in genome wide expression studies. We performed a RNA-seq time-course on regenerating head and tail fragments from 0 hours to 3 days. We found that the transcriptome profiles of head and tail regeneration were very different at the start of regeneration; however, an unexpected convergence of transcriptional profiles occurred at 48 hours when head and tail fragments are still morphologically distinct. By comparing differentially expressed transcripts at various time-points, we revealed that this divergence/convergence pattern is caused by a shared regulatory program that runs early in heads and later in tails.
Additionally, we also performed RNA-seq on smed-prep(RNAi) tail fragments which ultimately fail to regenerate anterior structures. We find the gene regulation program in response to smed-prep(RNAi) to display the opposite regulatory trend compared to the previously mentioned share regulatory program during regeneration. Using annotation data and comparative approaches, we also identified a set of approximately 4,800 triclad specific transcripts that were enriched amongst the genes displaying differential expression during the regeneration time-course.
The regeneration transcriptome of head and tail regeneration provides us with a rich resource for investigating the global expression changes that occurs during regeneration. We show that very different regenerative scenarios utilize a shared core regenerative program. Furthermore, our consolidated transcriptome and annotations allowed us to identity triclad specific transcripts that are enriched within this core regulatory program. Our data support the hypothesis that both conserved aspects of animal developmental programs and recent evolutionarily innovations work in concert to control regeneration.
Understanding how we might replace damaged and diseased tissue through the use of stem cell based therapies is an important goal for biomedical science. Despite natural occurrences that occur across the Animal Kingdom and work in a growing number of systems, regeneration is still poorly understood. Model systems are starting to shed light on the molecular processes that orchestrate various regenerative phenomena [1, 2]. Among the various systems, planarians have a distinct advantage of being able to regenerate entire animals from small starting fragments [3–8]. To what extent this is due to novel mechanisms unique to planarians is unknown. The simple anatomy and highly accessible adult stem cell system of planarian flatworms make it a high value model system from which we can hope to form valuable paradigms for how regeneration is controlled.
Regenerative potential in these animals is dependent on a population of cycling pluripotent adult stem cells present throughout the parenchyma, except in the area in front of the photoreceptors and the region of the pharynx [3–7]. On wounding or amputation these adult stem cells undergo two characteristic peaks of cell division that produce stem cell progeny and can subsequently differentiate to replace missing or damaged tissue . The amenability of planarians, and in particular the planarian Schmidtea mediterranea to functional genomic approaches in combination with advances in sequencing technology has produced several gene expression and transcriptomic studies on head regeneration , wounding , and neoblast dynamics [11–14].
Planarians are able to reconstitute the full adult body plan from very different starting scenarios. For example tail stump pieces will regenerate a new head, head pieces a new tail and trunk fragments both a head and a tail. In addition to all these potentially different regenerative scenarios, the planarian must also rescale and remodel the morphology of pre-existing tissues and organs to fit the size of the animals and to ensure sufficient functional integration . Thus from very different beginnings all fragments converge to the same end point. This means that any starting fragment contains the information required to reconstitute the whole adult body. While we already know a little about some of the key events in this process, for example the signalling pathways that are required to ensure the correct polarity along the different axes of regenerating pieces, we still lack an understanding of how these events fit together globally [7, 8, 16–19].
In this study we have amalgamated existing transcriptomes [10, 13, 14, 20, 21] to provide an improved resource as a service to the research community. This exercise includes re-annotation of transcripts, removal of likely chimeric transcripts and characterisation of transcripts that code for proteins novel to the phylum Platyhelminthes and/or the intensely studied Triclad group of planarians.
Using this new meta-transcriptome assembly we looked to investigate the potential for genome wide expression analysis for understanding differences and similarities in the regulatory program that underpins different regenerative scenarios. We investigated head and tail regeneration of the planarian Schmidtea mediterranea from 0 to 72 hours after amputation. Using this wealth of data, we were able to describe patterns of gene expression levels during regeneration across the whole transcriptome and perform comparisons between regeneration time-points and scenarios. This allowed us to describe the transcriptional changes that reflect key regulatory transitions during the regenerative process.
We found that head regeneration (head fragment regenerating tail) and tail regeneration (tail fragment regenerating head) transcriptomes initially reflect the differences in cellular content at the beginning of regeneration. They then diverge further over the first 12 hours of regeneration. However, we observed an unexpected convergence of expression profiles by 48 hours of regeneration between these two contrasting scenarios. This divergent/convergent pattern was underpinned by a core battery of more than 5,000 genes that are regulated in the same manner during between 6-12 hours of head regeneration and 36-48 hours of tail regeneration.
Both to internally validate our data at the genome wide level and to further define those genes that can be associated specifically with anterior regeneration we performed RNA-seq in the background of the well characterised Smed-prep(RNAi) phenotype, which specifically results in the loss of anterior structures . This allowed the identification of putative direct and indirect targets of Smed-prep which were enriched among those genes we found to be involved in the shared regulatory transition.
From blast annotations against selected species, we were able to define lists of genes that are potentially unique to S. mediterranea, unique within the tricladida , and unique to the phylum Platyhelminthes. We found that differentially expressed transcripts during regeneration were enriched for triclad specific transcripts. These transcripts are potentially involved in novel mechanisms underpinning potent regenerative capacity, and suggest that as has been recently suggested in urodeles, some important aspects of regeneration may be lineage specific [24, 25]. Our data provide new insight into the regulatory logic of regeneration, and act as a reference point against which to advance our understanding of the regulatory control of regeneration.
Results and discussion
Consolidation of the available transcriptome data
There are currently five independently assembled S mediterran ea transcriptomes with sufficient read depth and coverage to have aspirations to providing whole transcriptome coverage [10, 13, 14, 20, 21]. It is unclear in the literature to what extent, if at all, later assemblies have used data from earlier sources. We decided against a reassembly from the raw sequencing data that went into the 5 independent assemblies due to the varying error profiles of the raw data from different library preparation and sequencing chemistry. In addition, not all data was readily available and/or described fully in the repositories. We gathered the available transcriptome data from the 5 relevant publications (we refer to transcriptome datasets by the group leader’s last name) along with the available EST datasets and performed a consolidation of the transcripts. We chose to include only 5 of the 6 available transcriptomes because the dataset provided by Abril et al. contained a high number of contigs (~192,000) suggesting a highly fragmented assembly . We did not want to introduce more variability into the consolidation.
The consolidation process seeks to retain a high confidence set of transcripts, resolve transcript fusion events, and retain transcripts with the longest open reading frames. We first clustered transcripts from all 6 datasets by sequence similarity using CAP3 assembler . Each assembled contig can be represented as a cluster with contributing transcripts from one of the 6 data sources. We kept only clusters that had transcripts from at least 2 different sources to ensure a high confidence set of transcripts resulting in 23,802 clusters. We then removed potential fusion transcripts from each cluster by analysing the position of top blast hits along the cluster length to complete proteome sets. Removal of potential fusion transcripts split 441 clusters into 1,014 clusters. To retain the sequence with the longest ORF for each cluster, we took the transcript or CAP3 contig with the longest ORF in each cluster.
To make sure we are including known S. mediterrane a transcripts that might not be in the 5 transcriptomes due to low expression, we blasted the consolidated transcripts to 179 known S. mediterranea mRNA sequences. 16 transcripts were not found in the consolidated transcriptome including genes with very low expression levels or very localised expression patterns (e.g. Smed-wnt-1, Smed-noggin-like 4, Smed-noggin-like 6) and many neuropeptide pro-hormones with restricted expression patterns [16, 28, 29]. These data are indicative that genes with expression restricted to small populations of cells may well have escaped current sequencing efforts.
A detailed description of the process is available as a Additional files 1, 2, 3, 4, 5, 6, 7, 8 containing all relevant scripts and details. The consolidated transcriptome contains 23,545 transcripts of which, 2,859 are unmodified transcripts taken from one of the transcriptome sources and the remaining 20,686 transcripts are assembled CAP3 contigs. A comparison of average transcript length and N50 length to the 5 transcriptomes shows a substantial length increase with respect to both transcript length and open reading frame (ORF) length (Figure 1A and B). We also assessed transcriptome coverage by blasting each transcriptome to the Core Eukaryotic Gene Mapping Approach (CEGMA) database  which contains a core set of genes found in a wide range of eukaryotic organisms. The consolidated transcriptome had hits to 449 CEGMA genes out of a possible 458 and a significantly larger ortholog hit ratio of 0.93 compared to the 5 transcriptomes (Table 1).
Overall our analyses show that a simple consolidation of the extant published data provides an improved high confidence S. mediterranea transcriptome with respect to representation, total length and coding potential. This will be of significance for future genome wide expression analyses exploiting this important model system.
A gene expression time-course of anterior and posterior regeneration
Planarian regeneration is able to confidently restore whole individuals, with all organs scaled to the correct size from any starting piece . Thus from very different beginnings, the same end result is obtained. In the first instance we wished to understand how this process is reflected in whole transcriptome gene expression changes. One would expect early expression profiles to be very different depending on the cell and tissue contents, for example brain and neural tissues in the head versus gut tissues in the tail. The expression profiles should eventually converge as the missing tissues are regenerated.
Our goal was to describe these trajectories from different starting scenarios to obtain an overview of which genes are differentially expressed during the first 72 hours of regeneration. We hypothesised that differentially expressed genes would represent the unique regulatory solutions each regenerative scenario uses to arrive at the reconstitution of a whole animal. To this end we performed transcriptome sequencing on regenerating head and tail fragments at 0, 6, 12, 24, 36, 48 and 72 hours after amputation.
In total, 514,384,160 reads were mapped to the transcriptome across replicate samples at each regeneration time-point in each of the two scenarios. On average, the correlation between replicate samples was 0.99. While having 2 replicates for each time-point/fragment is not ideal for modelling the variance encountered in RNA-seq, our sample preparation of including multiple individuals (20 fragments in each library) does offer some variance stabilization through biological averaging. Having more replicates would add more resolution to our individual transcript expression profiles, but for the purpose of observing global trends in expression, we resorted to statistical optimizations of filtering our dataset more stringently and setting a higher adjusted p-value threshold.
We filtered our raw count data for transcripts that had less than 20 reads mapping in all libraries, leaving us with 15,423 transcripts for differential expression analysis. We performed differential expression analysis with EdgeR  on pairs of consecutive time-points (0-6 hours, 6-12 hours, 12-24 hours…etc) and also on pairs of fragments at the same time-point (0 hour head vs 0 hour tail, 6 hour head vs 6 hour tail…etc). The significance threshold for differential expression was set at p-value of 0.01 (Figure 2A-D).
The largest number of differentially expressed transcripts during head and tail regeneration are 3,228 transcripts down-regulated from 6-12 hours in heads and 5,646 transcripts down-regulated from 36-48 hours in tails (Figure 2A and B). We will refer to lists of differentially expressed transcripts by their abbreviated fragment type and time-period. For example, transcripts down-regulated from 6 to 12 hours in head fragments will be abbreviated as H6-12-down. We also observed that between fragments there is considerable increase in differentially up-regulated transcripts in both head and tail fragments between 12 and 36 hours. (Figure 2C). This increase hints at a divergence of expression profiles between head and tail regeneration starting at 12 hours and ending after 36 hours. This divergence may be representative of a differential program utilized by head and tail fragments. At higher fold-changes, there are also more transcripts up-regulated in tail fragments compared to head fragments between 12 and 36 hours suggesting that tail fragments undergoes a more drastic expression regulation during the divergence than head fragments. A possible reason for this may be that tail fragments need to regenerate a brain which contains a rich population of genes and isoforms.
Overall our data set describes the transcriptome of head and tail fragments during the first 72 hours of regeneration. This time-course data reflects the processes of regenerating new tissues in addition to remodelling existing tissue since whole fragments were used instead of just the regenerating blastema. This dataset presents a valuable resource for data mining the transcriptional behaviour of planarians genes during regeneration.
Head and tail fragment enriched transcripts implicates genes involved in early anterior and posterior regeneration
As a simple validation of our expression data, we looked for head and tail enriched transcripts by comparing head and tail fragments at 0 hours, immediately after amputation. We were able to find many known anterior markers (arrestin, opsin, tyrosinase, eyes absent, wnt2-1, otxA, six1, prep) and posterior markers (wnt11-1, wnt11-2, AbdBa, hox-D, axin) enriched in head and tail piecess respectively (Tables 2 and 3). We also find many membrane voltage gated channels and neurotransmitter receptors involved in nervous system function, metalloproteinases, and several homeobox genes (NK-1, cut-like, lim, orthopedia, vsx-1) in head enriched transcripts. We will refer to head and tail enriched transcripts at the beginning of regeneration as F-head and F-tail (Table 4).
We expected head and tail enriched transcripts to be representative of the existing anterior/posterior tissues. F-head and F-tail should be consistently up-regulated in their respective fragments until remodelling of the existing tissues occurs. We looked at the expression profile of these head and tail enriched transcripts during regeneration and found F-head transcripts to be consistently up-regulated in heads compared to tail fragments at the same time-points suggesting remodelling does not occur within the first 72 hours of head regeneration. In contrast, for tail enriched transcripts, there is a down-regulation in tail fragments at 48 hours (Figure 3) suggesting remodelling is taking place at this time.
There is also a slight up-regulation of head enriched transcripts during early tail regeneration that might reflect the development of the early brain structure shown previously to occur in anterior facing wounds [39, 47]. We found 277 transcripts out of a total 1,193 head enriched transcripts are up-regulated in tail fragments at 24 hours compared to 0 hours. Among these are several S. mediterranea transcripts involved in anterior development (smed-prep, smed-ndk, smed-six) and transcripts known to be expressed anteriorly (smed-wnt2-1, smed-sfrp) (Table 5). This group of 277 genes could be important in the early regeneration of anterior stuctures. Interestingly there are also several metalloproteinases (MMP) which have been implicated in neural tissue remodelling and cell migration in this list. However, early brain structures have previously been shown to develop in the blastema, not existing tissues and we also do not observe remodelling until 48 hours suggesting the function of MMPs at the stage is perhaps confined only to the blastema tissues or a small portion of the existing tissue.
A Shared regulatory program between head and tail regeneration
We performed a hierarchical clustering of expression profiles using correlation distance on all libraries to observe when tail and head regeneration diverges (Figure 4). The clustering generated two separate groups of expression profiles (red, yellow) from 12 hours in heads and 6 hours in tails indicative of an early divergence in expression profiles. Surprisingly, we observed a subsequent convergence of expression profiles at 48 hours indicated by the blue cluster, grouping heads and tail fragments at 48 and 72 hours.
This convergence suggest both head and tail fragments reach a similar regenerative state at the transcriptome level as early as 48 hours. At this time point tails will have just formed cephalic ganglia tissue and started to form the beginnings of the photoreceptors [34, 47]. Nonetheless, we found this convergence surprising, as head and tail fragments are still morphologically distinct at this early stage. Together this suggested to us that much of the similarity of expression levels might represent underlying gene regulatory and cellular behaviours rather than the formation of equivalent tissues.
To investigate the convergence further we looked for shared batteries of genes between differentially regulated genes at each time point (Figure 5). The most significant overlapping regulatory programs with a hypergeometric p-value of effectively 0 were observed for 2 differentially expressed lists: H6-12-down and T36-48-down (Figure 6A), H6-12-up and T36-48-up (Figure 6C). We conclude that the H6-12 regulatory transition is primarily responsible for the early divergence of expression profiles between head and tail fragments and the subsequent convergence at 48 hours is caused by tails going through a very similar regulatory transition during T36-48 (Figure 6B and D). We will refer to the shared regulatory program between H6-12 and T36-48 as O and we will refer the shared up/down-regulation program as O-up and O-down.
O-up and O-down both represent approximately one third of transcripts assessed for differential expression. We found 69 potential transcription factors out of a total 467 potential transcription factors in our data set were present in O-down and 29 in O-up. Known S. mediterranea transcript factors found in O-down include smed-prep, smed-dlx, smed-gata, smed-prox1, and smed-six. In O-up, we found smed-junli, smed-tcf15, smed-e2f-like.
There are more transcripts up/down-regulated in T36-48 compared to H6-12. 91% of the transcripts in H6-12-down overlap with 52% of transcripts in T36-48-down and 81.7% of H6-12-up transcripts overlap with 49% of T36-48-up transcripts meaning there are 2,696 and 2,677 transcripts that are exclusively regulated (not shared with H6-12) in T36-48-down and T36-48-up. While these transcripts were not found to be differentially expressed at H6-12, they still conform to the convergence of expression profile at 48 hours (Figure 7). Instead of being sharply regulated during H6-12, these transcripts gradually up/down-regulate during head regeneration to eventually match tail expression level at 48 hours.
Together our analysis reveals a previously unknown shared regulatory program between two very different regenerative scenarios. This program includes a large proportion of the genome and runs at different times and scenarios. This gives us for the first time insight into how whole body regeneration is regulated and suggests that different regenerative scenarios may use a core regenerative program that is activated after scenario specific events have occurred. Future work investigating an even wider set of regenerative scenarios will test this model, but the use of shared program between the opposite scenarios investigates here is strongly suggestive this is the case. This program is likely to represent key shared events, such as elaboration of axial fates, replacement of major tissues such as the gut , excretory system and nervous system and re-establishment of the stem cell and stem cell progeny populations , and remodelling of existing tissues to their correct proportions.
Smed-prep(RNAi) disrupts the expression of transcripts found in the two major regulatory events during regeneration
Smed-prep is a TALE class homeodomain gene that has been found to be required for anterior fate and patterning during regeneration . Upon RNAi knock-down of smed-prep, regenerating tail fragments fail to develop a discernible anterior compartment. In order to provide a genome wide validation of our time-course dataset and to investigate possible down-stream genes regulated directly or indirectly by smed-prep, we performed RNA-seq on tail fragments of smed-prep(RNAi) animals 24 hours after amputation along with GFP dsRNA injected controls at the same time-point. We will refer to this 24 hour tail fragment comparison between gfp and smed-prep(RNAi) animals as P. Transcripts down and up regulated in response to smed-prep(RNAi) will be referred to as P-down and P-up.
We generated two lists of differentially expressed transcripts for P-down and P-up at a fold-change of 2 or more. There are 1,236 transcripts in P-up and 591 in P-down. The larger number of transcripts up-regulated in response to smed-prep(RNAi) versus down-regulation suggest a direct or indirect transcriptional repressive role of smed-prep.
As a validation of our smed-prep(RNAi) data, we looked at whether F-head transcripts are affected by smed-prep(RNAi). We found 51 F-head transcripts in P-down and 47 in P-up. While there are some voltage gated channels and neurotransmitter transcripts in these lists, no known S. mediterranea transcripts were found and only two transcription factors were found in both P-down and F-head (an achaete-scute homolog and zinc finger protein). This result was to be expected as F-head transcripts are not up-regulated in tail fragments during the regeneration time-course. There was also no significant enrichment of the F-head transcripts that were up-regulated in tail regeneration at 24 hours in either P-down or P-up. This suggests that smed-prep is not involved in early brain development, in agreement with previous work that has investigated early brain regeneration .
We looked at the expression profiles of P during regeneration and found that 330 P-down transcripts are in T36-48-up and 878 P-up transcripts are in T36-48-down (Figure 8) with significant enrichment (5.5e-41 and 3.4e-185). While P-up overlapped significantly with O-down, P-down overlapped with transcripts up-regulated in T36-48 exclusively (and not in H6-12-up) (Figure 8). The P-up data is showing that smed-prep is possibly playing a direct or indirect repressive role which facilitates O-down. Since the phenotype of smed-prep(RNAi) is loss of the anterior structures, we can reasonably assume that some of P-down is involved in anterior development. The observation that P-down seem to only relate to T36-48-up exclusively suggest T36-48 is when smed-prep is up-regulating these anterior developmental transcripts; whereas it was not necessary to do so in head fragments since anterior structures already exists.
The expression profile of P-up and P-down during regeneration allowed us to observe the differential role of smed-prep in both head and tail fragments. In head fragments, smed-prep plays a repressive role during the early major regulatory transition in heads. In tail fragments, in addition to also playing the same repressive role during the later major regulatory transition, it activates the anterior structure regeneration program.
Triclad specific transcripts are enriched in differentially expressed transcripts during regeneration
We annotated the consolidated transcriptome by blasting against 12 proteomes: Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Homo sapiens, Mus musculus, Schistosoma mansoni, Clonorchis sinensis, Strongylocentrotus purpuratus, Nematostella vectensis Lottia gigantea, Helobdella robusta, Capitella teleta and 3 transcriptomes: Girardia trigrina, Procotyla fluviatilis, Denrocoelum lacteum. At an e-value threshold of 1e-5 or less, 20,603 transcripts had at least one hit and 19,478 transcripts at e-value threshold of 1e-15 or less.
We categorized the transcripts based on species hits to generate lists of transcripts that were potentially S. mediterranea specific, triclad specific, and platyhelminth specific (Figure 9). We performed this analysis on increasing e-value strictness and found 2,932 potentially S. mediterranea specific transcripts, 4,825 triclad specific transcripts, and 4,949 platyhelminthes specific transcripts at the highest strictness level (Table 6).
In addition to blast annotations, we also performed protein domain predictions on the transcriptome using models from the PFAM database  resulting in 13,217 transcripts with domain annotations. Using this domain information, we were able to look at the composition of domains within the strict platyhelminth and triclad specific lists of transcripts.
Within the platyhelminthes specific transcripts, 289 transcripts had pfam annotations. The low number of domain annotations probably reflects the heavy bias of known protein domains towards nematode, insect and vertebrate systems. We found that the most abundantly represented domain in platyhelminth specific list was the 7 transmembrane receptor domain (7tm) of the rhodopsin family (PF00001) found in 13 transcripts. This agrees with a previous study which catalogued the repertoire of G protein-coupled receptors (GPCR) in S. mediterranea and found a large expansion of platyhelminth specific GPCR of the rhodopsin subfamily .
Within the triclad specific transcripts, 262 transcripts had pfam annotations with the ubiquitin domain (PF00240) being most represented in 12 transcripts. The ubiquitin protease system (UPS) is the main cellular proteolytic mechanism that uses ubiquitin to tag and target proteins for degradation. Several studies have implicated UPS in drosophila development  and regeneration in various systems [56–58] making UPS a potential target for further research.
We also looked at the expression profile of triclad specific transcripts during regeneration and found there was significant enrichment in differentially expressed transcripts (p-value < 0.01 and fold-change > 2) during both head and tail regeneration (Figure 10, Tables 7 and 8).
Together our data identify a large set of potentially novel and/or rapidly evolving genes that are clearly differentially expressed during regeneration. Previous studies in planarians and other regenerative models have highlighted a role of conserved genes, known from studies of development, as key regulators of regeneration. More recent genome wide studies using transcriptomics have revealed that lineage specific genes may also be important and require study [24, 25]. Our data also support that this may be the case during planarian regeneration. In particular we uncover an enrichment of novel genes during a regulatory transition that is shared between different regenerative scenarios. Our data suggest that the potent regenerative capacity in planarians may be partly due to novel mechanisms conserved within the highly regenerative triclad clade and pave the way for functional study of these genes.
In this study we have generated a consolidated transcriptome from 5 independently assembled transcriptomes and available ESTs. This consolidated dataset represents a high confidence set of transcripts providing a valuable resource for future expression studies. Our regeneration transcriptome consisting of regenerating head and tail fragments from 0 to 3 days reveals a shared regulatory program consisting of over 5,000 transcripts active at temporally shifted time-periods between regenerating head (6-12 hours) and in tail fragments (36-48 hours). Additional RNA-seq experiments on smed-prep(RNAi) animals versus control tail fragments allowed us to find transcripts that are regulated differentially in response to smed-prep. We observed that these smed-prep response transcripts are enriched during the shared regulatory program during regeneration suggesting an involvement in brain regeneration. We also performed BLAST alignment to 15 species across eukaryotes to identify novel or divergent genes and found lists of S. mediterranea, triclad, and platyhelminth specific transcripts. Triclad specific transcripts are found to be enriched in differentially expressed transcripts throughout regeneration suggesting novel mechanisms may contribute to the animal’s potent regenerative capacity.
A clonal line of the asexual strain of Schmidtea mediterranea, AAANOTBIOL01, was used for all experiments. Animals were reared at 20°C in tap water filtered through activated charcoal and buffered with 0.5 ml/L 1 M NaHCO3. Planarians were fed veal liver and starved for at least one week prior to experiments or amputation. All worms used were 7–8 mm in length. The animals used in these experiments do not require approval from the ethical committee.
Smed-prep(RNAi) for RNAseq
RNAi for Smed-prep was perfomed by two rounds of injection as previously described .
Preparation of RNA form a regenerating timecourse
Regenerating head and tail fragments from 20 worms at each timepoint were collected and snap frozen at 6, 12, 24, 36, 48, and 72 hours of anterior and posterior regeneration in two replicate samples for total RNA preparation (Trizol). RNA was also prepared from Smed-prep(RNAi) regenerating tails at 24 hours. Total RNA was also prepared and pooled from a regenerative time course of a sexual strain of G. tigrina.
Library preparation and sequencing
RNA from regenerative stages was enriched for mRNA enriched using the Poly A Purist Kit (Ambion, Cat. No. AM1919) followed by further depletion of ribosomal RNA using the Ribominus Eukaryotic kit (Invitrogen, Cat. No. A10837-08). Solid whole transcriptome libraries were made as outlined in the Solid Whole transcriptome kit protocol (Applied Biosystems, Cat. No. 4425680). The Quant-it HS dsDNA assay kit (Invitrogen, Cat. No. Q32851) was used to measure the concentration of libraries. Sequencing was performed on a SOLiD 3 ABi sequencer according to the manufacturer’s instructions to generate 50 bp reads in colour space. All data is available at the NCBI short read archive under study number SRP002478.
sequencing and assembly was performed from G. tigrina total RNA as previously described .
Detailed information on the transcriptome consolidation process is included in the Additional files 1, 2, 3, 4, 5, 6, 7, 8.
BLASTX alignment was performed against Caenorhabditis elegans, Drosophila melanogaster, Danio rerio, Homo sapiens, Mus musculus, Schistosoma mansoni, Clonorchis sinensis proteomes and TBLASTX was performed against Girardia tigrina transcriptome. A constant database size of total base pairs of all sequences was used to ensure comparable e-values. PFAM annotations was performed with HMMScan using the PFAM-A database. The gathering cut-off threshold was used for HMMScan. Transcription factors were identified using manually curated PFAM domain profiles from DBD: Transcription factor prediction database.
Read mapping and differential expression analysis
Reads were mapped with the ABI LifeScope software’s single fragment mapping module. Uniquely mapped reads were counted for each transcript on both strands with HTSeq-count  using a mapping quality filter of 30. Outliers were filtered out by removing transcripts with tag counts that were more than 1% of the total library in at least 3 sequenced libraries. Transcripts with less than 20 reads in all libraries were also removed. Normalized counts and differential analysis was performed with EdgeR . The generalized linear model function of edgeR was used across all the libraries.
Read mapping and tag counting for smed-prep (RNAi) RNAseq samples was performed the same way as the regeneration time-course. Outliers were also removed based on more than 1% of total reads in at least 2 libraries. Transcripts with less than 50 reads were removed from all libraries. Since there were no replicates for these samples, we did not use edgeR to calculate differential expression as EdgeR requires replicates to effectively model the dispersion among libraries of the same conditions. We instead relied on a high filter of 50 reads for lowly expressed transcripts and a fold-change threshold of at least 2 for assessing differential expression.
Availability of supporting data
The raw sequencing data was deposited into short read archive with these two study accession numbers: PRJEB4680 (G. tigrina raw Roche 454 data), PRJEB4686 (S. mediterranea regeneration time-course ABI SOLiD data). The transcriptome data for both S. mediterranea and G. tigrina was uploaded to FigShare at this address and DOI: Aboobaker Lab Schmidtea mediterranea transcriptome dataset. Damian Kao. figshare. http://dx.doi.org/10.6084/m9.figshare.801077. Additionally, this data is incorporated into PlanMine (http://planmine.mpi-cbg.de/).
Brockes JP, Kumar A: Comparative aspects of animal regeneration. Annu Rev Cell Dev Biol. 2008, 24: 525-549. 10.1146/annurev.cellbio.24.110707.175336.
Onal P, Grün D, Adamidi C, Rybak A, Solana J, Mastrobuoni G, Wang Y, Rahn H-P, Chen W, Kempa S, Ziebold U, Rajewsky N: Gene expression of pluripotency determinants is conserved between mammalian and planarian stem cells. EMBO J. 2012, 31: 2755-2769. 10.1038/emboj.2012.110.
Solana J, Kao D, Mihaylova Y, Jaber-Hijazi F, Malla S, Wilson R, Aboobaker A: Defining the molecular profile of planarian pluripotent stem cells using a combinatorial RNAseq, RNA interference and irradiation approach. Genome Biol. 2012, 13: R19-10.1186/gb-2012-13-3-r19.
Labbé RM, Irimia M, Currie KW, Lin A, Zhu SJ, Brown DDR, Ross EJ, Voisin V, Bader GD, Blencowe BJ, Pearson BJ: A comparative transcriptomic analysis reveals conserved features of stem cell pluripotency in planarians and mammals. STEM CELLS. 2012, 30: 1734-1745. 10.1002/stem.1144.
Blythe MJ, Kao D, Malla S, Rowsell J, Wilson R, Evans D, Jowett J, Hall A, Lemay V, Lam S, Aboobaker AA: A dual platform approach to transcript discovery for the planarian schmidtea mediterranea to establish RNAseq for stem cell and regeneration biology. PLoS ONE. 2010, 5: e15617-10.1371/journal.pone.0015617.
Adamidi C, Wang Y, Gruen D, Mastrobuoni G, You X, Tolle D, Dodt M, Mackowiak SD, Gogol-Doering A, Oenal P, Rybak A, Ross E, Sánchez Alvarado A, Kempa S, Dieterich C, Rajewsky N, Chen W: De novo assembly and validation of planaria transcriptome by massive parallel sequencing and shotgun proteomics. Genome Res. 2011, 21: 1193-1200. 10.1101/gr.113779.110.
Riutort M, Álvarez-Presas M, Lázaro E, Solà E, Paps J: Evolutionary history of the Tricladida and the Platyhelminthes: an up-to-date phylogenetic and systematic account. Int J Dev Biol. 2012, 56: 5-17. 10.1387/ijdb.113441mr.
Looso M, Preussner J, Sousounis K, Bruckskotten M, Michel CS, Lignelli E, Reinhardt R, Höffner S, Krüger M, Tsonis PA, Borchardt T, Braun T: A de novo assembly of the newt transcriptome combined with proteomic validation identifies new protein families expressed during tissue regeneration. Genome Biol. 2013, 14: R16-10.1186/gb-2013-14-2-r16.
Molina MD, Neto A, Maeso I, Gómez-Skarmeta JL, Saló E, Cebrià F: Noggin and noggin-like genes control dorsoventral axis regeneration in planarians. Curr Biol. 2011, 21: 300-305. 10.1016/j.cub.2011.01.016.
Collins JJ, Hou X, Romanova EV, Lambrus BG, Miller CM, Saberi A, Sweedler JV, Newmark PA: Genome-wide analyses reveal a role for peptide hormones in planarian germline development. PLoS Biol. 2010, 8 (10): e1000509-10.1371/journal.pbio.1000509. doi:10.1371/journal.pbio.1000509
Martín-Durán JM, Monjo F, Romero R: Morphological and molecular development of the eyes during embryogenesis of the freshwater planarian Schmidtea polychroa. Dev Genes Evol. 2012, 222: 45-54. 10.1007/s00427-012-0389-5.
Iglesias M, Almuedo-Castillo M, Aboobaker AA, Saló E: Early planarian brain regeneration is independent of blastema polarity mediated by the Wnt/β-catenin pathway. Dev Biol. 2011, 358: 68-78. 10.1016/j.ydbio.2011.07.013.
Gurley KA, Elliott SA, Simakov O, Schmidt HA, Holstein TW, Alvarado AS: Expression of secreted Wnt pathway components reveals unexpected complexity of the planarian amputation response. Dev Biol. 2010, 347: 24-39. 10.1016/j.ydbio.2010.08.007.
Raška O, Kostrouchová V, Behenský F, Yilma P, Saudek V, Kostrouch Z, Kostrouchová M: SMED-TLX-1 (NR2E1) is critical for tissue and body plan maintenance in Schmidtea mediterranea in fasting/feeding cycles. Folia Biol (Praha). 2011, 57: 223-231.
Martín-Durán JM, Amaya E, Romero R: Germ layer specification and axial patterning in the embryonic development of the freshwater planarian Schmidtea polychroa. Dev Biol. 2010, 340: 145-158. 10.1016/j.ydbio.2010.01.018.
Cebrià F, Kobayashi C, Umesono Y, Nakazawa M, Mineta K, Ikeo K, Gojobori T, Itoh M, Taira M, Alvarado AS, Agata K: FGFR-related gene nou-darake restricts brain tissues to the head region of planarians. Nature. 2002, 419: 620-624. 10.1038/nature01042.
Eisenhoffer GT, Kang H, Alvarado AS: Molecular analysis of stem cells and their descendants during cell turnover and regeneration in the planarian schmidtea mediterranea. Cell Stem Cell. 2008, 3: 327-339. 10.1016/j.stem.2008.07.002.
Liu S-Y, Selck C, Friedrich B, Lutz R, Vila-Farré M, Dahl A, Brandl H, Lakshmanaperumal N, Henry I, Rink JC: Reactivating head regrowth in a regeneration-deficient planarian species. Nature. 2013, 500: 81-84. 10.1038/nature12414.
Zamanian M, Kimber MJ, McVeigh P, Carlson SA, Maule AG, Day TA: The repertoire of G protein-coupled receptors in the human parasite Schistosoma mansoni and the model organism Schmidtea mediterranea. BMC Genomics. 2011, 12: 596-10.1186/1471-2164-12-596.
Lier S, Paululat A: The proteasome regulatory particle subunit Rpn6 is required for Drosophila development and interacts physically with signalosome subunit Alien/CSN2. Gene. 2002, 298: 109-119. 10.1016/S0378-1119(02)00930-7.
Patruno M, Thorndyke MC, Candia Carnevali MD, Bonasoro F, Beesley P: Changes in ubiquitin conjugates and Hsp72 levels during arm regeneration in echinoderms. Mar Biotechnol New York N. 2001, 3: 4-15. 10.1007/s101260000018.
El-Khodor BF, Kholodilov NG, Yarygina O, Burke RE: The expression of mRNAs for the proteasome complex is developmentally regulated in the rat mesencephalon. Brain Res Dev Brain Res. 2001, 129: 47-56. 10.1016/S0165-3806(01)00181-X.
We thank current and previous members of the Aboobaker lab and Deep Seq Sequencing unit at Nottingham University for technical assistance and discussions. This work was funded by grants from the MRC and BBSRC to AAA.
Authors and Affiliations
School of Life Sciences, University of Nottingham, University Park, Nottingham, NG7 2RD, UK
Fundación CNIC Carlos III- Centro Nacional de Investigaciones Cardiovasculares, Melchor Fernández Almagro, 3, Madrid Código Postal, 28029, Spain
Department of Zoology, University of Oxford, The Tinbergen Building, South Parks Road, Oxford, OX1 3PS, UK
The authors declare that they have no competing interests.
Conceived and designed the project AAA, Provided essential tools DK, Wrote the manuscript DK, AAA, performed experiments AAA, DF, analysed data DK, AAA. All authors read and approved the final manuscript.
Additional file 1:ZIP file containing a .pdf document of detailed methods including how consolidation was performed and various analysis. Also includes relevant python scripts for each step of the methods. (ZIP 198 KB)
This article is published under license to BioMed Central Ltd. This is an Open Access article is distributed under the terms of the Creative Commons Attribution License (
), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Kao, D., Felix, D. & Aboobaker, A. The planarian regeneration transcriptome reveals a shared but temporally shifted regulatory program between opposing head and tail scenarios.
BMC Genomics14, 797 (2013). https://doi.org/10.1186/1471-2164-14-797