- Research article
- Open access
Twinkle twinkle brittle star: the draft genome of Ophioderma brevispinum (Echinodermata: Ophiuroidea) as a resource for regeneration research
BMC Genomics volume 23, Article number: 574 (2022)
Abstract
Background
Echinoderms are established models in experimental and developmental biology; however, genomic resources are still lacking for many species. Here, we present the draft genome of Ophioderma brevispinum, an emerging model organism in the field of regenerative biology. This new genomic resource provides a reference for experimental studies of regenerative mechanisms.
Results
We report a de novo nuclear genome assembly for the brittle star O. brevispinum and an annotation facilitated by the transcriptome assembly. The final assembly is 2.68 Gb in length and contains 146,703 predicted protein-coding gene models. We also report a mitochondrial genome for this species, which is 15,831 bp in length and contains 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes. In addition, 29 genes of the Notch signaling pathway are identified to illustrate the practical utility of the assembly for studies of regeneration.
Conclusions
The sequenced and annotated genome of O. brevispinum presented here provides the first such resource for an ophiuroid model species. Considering the remarkable regenerative capacity of this species, this genome will be an essential resource in future research efforts on molecular mechanisms regulating regeneration.
Background
Echinoderms are a phylum of marine invertebrates, which together with hemichordates constitute the group Ambulacraria. In turn, the Ambulacraria form a sister clade to the phylum Chordata within the monophyletic superphylum Deuterostomia. Echinoderms have attracted much attention from scholars from various disciplines of biology (e.g., ecology, evolution, and developmental biology [1]). One of the fascinating aspects of the biology of many echinoderm species is the ability to regenerate all the tissues of large body parts and organ systems completely [2]. Thus far, attempts to understand the molecular underpinnings of echinoderm regeneration have been largely driven by transcriptome and gene expression studies [3–7]. Only recently have functional perturbation studies started to emerge [3, 7–9], but progress has been relatively slow, in part due to the scarcity of genomic resources. Although gene expression approaches are valuable in identifying the sets of genes that are involved in the regeneration program, they do not by themselves lead to a comprehensive mechanistic understanding of regeneration. A mechanistic understanding of regeneration will be achieved through the reconstruction of the cause-and-effect relationships between regulatory and effector genes in functional genomics studies. The availability of genomic data is one of the essential prerequisites for these studies, and it will allow for the identification of cis-regulatory modules, the reconstruction of chromatin accessibility maps, and the design of genome editing experiments to probe the function of candidate genes.
The efforts in echinoderm genomics were pioneered by the sequencing of the genome of the purple sea urchin, Strongylocentrotus purpuratus (Stimpson, 1857) [10]. Since then, the availability and affordability of the new generation high-throughput sequencing technologies have expanded genome sequencing and annotation projects to other species representing all five classes within the phylum. Table 1 lists the 21 echinoderm genomes that are currently available in public databases. These genomes, sequenced and annotated to a varying degree of completeness, have been submitted to the National Center for Biotechnology Information (NCBI) or other databases (e.g., [11–17]). These sequencing efforts, however, have rarely been driven by regeneration research. Only seven of the currently sequenced species have ever been studied in terms of their regenerative capacities (i.e., there is at least one published study on the topic). Only three of the sequenced species, the sea cucumbers Holothuria glaberrima and Apostichopus japonicus and the sea star Patiria miniata, are established model organisms in echinoderm regenerative biology (i.e., there has been continuous effort to characterize cellular and molecular events underlying regeneration). Two brittle star genomes are available at present, Ophionereis fasciata and Ophiothrix spiculata, but neither of those species has been used in studies of regenerative biology.
The primary aim of this contribution is to provide genomic data and tools for a highly regenerative brittle star (ophiuroid echinoderm) Ophioderma brevispinum [18], an emerging model organism in the field of regenerative biology [3] (Fig. 1). This species is capable of autotomizing and quickly regenerating its arms, the segmented body appendages. Arm regeneration is a classic example of “epimorphic” regeneration, which involves extensive unidirectional terminal growth through rapid generation of new tissues distal to the plane of the injury [3] (Fig. 1). As arms are often exposed to predators in natural habitats, brittle stars are known for their ability to sustain arm loss followed by remarkably high rates of arm regeneration. It has been estimated that regenerated tissues may account for as much as half of the total body weight in a brittle star individual [19]. It has also been shown that regeneration is temperature-dependent in at least some Ophioderma species [20].
The rationale for choosing this particular brittle star species for our research is as practical as it is fundamental. O. brevispinum is common in the Western Atlantic in shallow water and can be easily collected in numbers sufficient for molecular biology studies. During experiments, these animals are easily maintained in indoor aquaria and as research stock with minimal maintenance for extended periods (e.g., months). In addition, live individuals of O. brevispinum are available to all interested researchers, as they can be ordered from several commercial suppliers at a moderate price.
Our previous work on O. brevispinum demonstrated a critical role of the Notch signaling pathway (Fig. 2) in ophiuroid arm regeneration [3]. This pathway is a key node of a complex hypernetwork of interconnected signaling pathways that mediate cell-cell interactions in various developmental contexts [21]. Thus, a specific goal of this paper is to describe the genomic composition of the key components of the Notch pathway [21]. A report of these genes in O. brevispinum is provided to serve as a fundamental toolkit for future studies on regeneration. This work will facilitate further research to unravel the functions of signaling pathways in regeneration and identify their target genes through functional genomics approaches.
Here, we report a de novo genome assembly and annotation of Ophioderma brevispinum. This is the third ophiuroid for which genome sequencing has been applied following low coverage sequencing of Ophiothrix spiculata [14] (NCBI’s BioProject accession number PRJNA182997) and Ophionereis fasciata [17]. Our Ophioderma brevispinum genome and transcriptome [3] allow us to describe regulatory gene families to further explore the molecular bases of echinoderm regeneration. This use case is an example of one of the several applications for these genomic data that researchers will find.
Results
Sequencing
Three different genomic DNA library preparation and sequencing strategies were employed. First, PCR-free library preparation (“short” libraries) followed by sequencing on an Illumina HiSeq 2500 machine yielded 2 × 163,310,307 paired-end reads of 250 nt with an overall GC content of 37% (NCBI’s SRA accession number SRP238266). Second, mate-paired libraries (“long” libraries) with approximately 3 Kbp insert size were sequenced as above and resulted in 2 × 103,843,354 paired-end reads of 250 nt with an overall GC content of 41%. Third, PacBio long-read sequencing generated approximately 23 million reads with a total yield of 159 billion bp (51.3× coverage) (Fig. 3). A summary of sequence read statistics is provided in Additional file 1: File S1.
In addition, in order to facilitate the annotation efforts, we took advantage of the earlier RNA-Seq study [3] that generated 17,318,775 MiSeq and 832,245,006 HiSeq quality filtered and adapter trimmed reads used in a de novo transcriptome assembly.
Nuclear DNA assembly statistics
The draft assembly of the O. brevispinum genome comprises 88,538 scaffolds with a total length of 2,684,874,461 bp (approximately 2.68 Gb) (Table 2). This value is close to the haploid genome size independently determined by a densitometry assay (2.89 Gb) and from paired-end sequence data using a k-mer statistical approach (2.11 Gb). Scaffolds range from 2,035 bp to 612,917 bp, with a mean scaffold size of 30,325 bp. The N50 scaffold length and L50 scaffold count are 48,505 bp and 15,677, respectively. The scaffold nucleotide content is 30.77%, 19.22%, 19.18%, and 30.83% for A, C, G, and T, respectively.
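For readers less familiar with the contiguity metrics reported above, the following minimal Python sketch shows how N50 and L50 are conventionally derived from a list of scaffold lengths. It is an illustration only, not the code used in this study (the assembly statistics were produced with the tools named in the Methods); the function name `n50_l50` and the toy lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for an iterable of scaffold lengths.

    N50 is the length of the scaffold at which the running total, taken
    from longest to shortest, first reaches half of the assembly length.
    L50 is the number of scaffolds needed to reach that point.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count
    raise ValueError("no scaffold lengths supplied")


# Hypothetical toy lengths, not taken from the assembly.
toy_lengths = [100, 80, 60, 40, 20]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50} bp, L50 = {l50} scaffolds")
```

In the toy case the total length is 300 bp, so the two longest scaffolds (100 + 80 bp) are the first to cover half of it.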
An independent de novo assembly of repetitive DNA elements with REPdenovo resulted in 92,505 individual sequences with a total length of 134,023,983 bp. The average length and N50 of the repetitive DNA segments assembled this way were 1,448.83 bp and 2,838 bp (14,228 sequences), respectively. The sequences assembled with REPdenovo were used to aid in our draft genome assembly’s repeat identification and masking. A total of 1,410,344,530 bp (52.53%) of the draft assembly were classified as repetitive DNA and masked (see the summary of results in Table 3). Most DNA repeats (42.91% of the repetitive DNA sequence length) were classified as interspersed elements. However, a significant number of repeats (49.91% of the repetitive DNA) were marked as unclassified. The most common (5.27% of the repetitive DNA) transposable element (TE) in the classification of repeats corresponds to long interspersed nuclear elements (LINEs). Short interspersed nuclear elements (SINEs), repetitive DNA elements, and long terminal repeats (LTRs) amounted to 0.11%, 0.73%, and 1.12% of the total sequence length, respectively.
In the GTF files from BRAKER and exonerate (Additional file 2: File S2), we found 361,060 exons in 146,703 genes, each corresponding to a different transcript in a total of 53,436 nuclear DNA scaffolds. According to position information, at least 3,394 of those 146,703 genes could represent gene isoforms. These putative isoforms are found in 1,311 scaffolds. The degree of fragmentation of this draft genome and limitations in resolving gene isoforms may partially account for this overestimation of the number of protein coding genes in Ophioderma brevispinum. These caveats should be addressed in future efforts to increase the completeness of this genomic resource.
The completeness of the draft genome assembly at the scaffold level was also evaluated with BUSCO [22, 23], a commonly used tool to assess the representation of marker genes in newly generated genomic and transcriptomic datasets. The results are summarized in Table 4. BUSCO analysis of this brittle star transcriptome has been previously reported elsewhere (see [3]).
Notch signaling pathway
To demonstrate the utility of the genome, we performed a case study in which we assessed the genomic representation of the main components of the Notch signaling pathway. This pathway is highly conserved across all multicellular animals and is known to coordinate a multitude of diverse cellular events, including proliferation, differentiation, cell fate specification, and cell death [21, 24–29]. In the context of echinoderm regeneration, we have recently demonstrated that the proper function of the Notch pathway is crucial for arm regeneration in O. brevispinum [3]. Here, we searched the draft genome for 29 genes involved in the pathway (Fig. 2, Table 5).
All genes were retrieved by BLAST [30] search. For all selected genes except Mesp2, Presenilin 1, and NACK, we also recovered the same putative coding regions from exonerate alignments. Furthermore, in 18 genes, we identified the expected conserved protein domains. Taken together, the newly assembled draft genome of O. brevispinum allowed us to retrieve the sequences of the Notch pathway components, which will subsequently be used to design functional genomic studies that probe the mechanistic role of the pathway in brittle star regeneration.
Mitogenome assembly
The O. brevispinum mitochondrial genome (mitogenome) is 15,831 bp long and has a GC content of 32.4% (Fig. 4). These values are similar to those of the previously published [31] reference mitogenome of another brittle star species Ophiarachnella gorgonia (NCBI accession number NC_046053), which has a length of 15,948 bp and a GC content of 36.7%. Likewise, mitochondrial genome features of O. brevispinum showed the same gene order reported for O. gorgonia, and their putative control regions are of similar length (488 and 474 bp, respectively).
There are also differences between these two brittle star mitogenomes that are worth noting. For example, the size difference between the mitogenomes of O. brevispinum and O. gorgonia is mostly due to deletions in non-coding intergenic regions. However, deletions in tRNA, rRNA, and protein-coding genes are also observed. Furthermore, unlike in O. gorgonia, the ND4 coding sequence in O. brevispinum is complete and does not add 3’ adenine residues to the mRNA.
Discussion
Here, we present a draft genome assembly for the highly regenerative brittle star species O. brevispinum. Due to its availability, ease of maintenance, and amenability to experimental manipulations, O. brevispinum has become an emerging model organism in echinoderm regenerative biology [3]. We previously performed transcriptome-wide gene expression studies in this species and identified a range of candidate regeneration-associated genes for further experiments. However, without a fully sequenced genome, including non-coding and regulatory regions, it was not previously feasible to delve into the molecular mechanisms of regeneration with functional genomics tools for purposes such as reconstructing gene regulatory networks that underlie regenerative events. This draft genome of O. brevispinum provides the first such resource for ongoing and future molecular studies.
The new genome has immediately allowed analysis of protein-coding genes. To demonstrate the utility of the genome, we aimed to retrieve 29 select components of the Notch signaling pathway, including the ligands, receptors, transcription factors, regulators, and target genes. All 29 genes of interest were identified in the assembly. The identity and predicted function of the proteins can be inferred by the presence of the conserved domains.
One of the limitations of our new draft genome assembly is its fragmented state. Ideally, the ultimate goal of any genome sequencing and annotation project is to reconstruct continuous chromosome-size sequences with the fully preserved order of the genes and non-coding sequence elements. Like many first-effort sequencing projects, our assembly will require subsequent efforts to reach that level. Even in its current state, however, these data provide a valuable resource for ongoing and future studies. This research will not be limited to regenerative biology, but can also benefit other areas, such as evolution of the echinoderm body plan, animal phylogeny, and history of gene families, to name a few.
Conclusion
Here we presented the first draft nuclear genome and a complete mitochondrial genome of the brittle star Ophioderma brevispinum (Say, 1825) (Echinodermata: Ophiuroidea: Ophiacanthida: Ophiodermatidae), a rising model for regenerative studies (e.g., [3, 32–34]). The mitochondrial genome of this brittle star has 15,831 bp (with a mean depth of 1,658.7 and GC content of 32.4%) with 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. The draft nuclear DNA assembly has 88,538 scaffolds summing up to 2.7 Gbp, corresponding to ∼93% of the expected haploid genome size independently determined by a densitometry assay. Despite the high degree of fragmentation of the assembly, which is partially caused by a high frequency of repetitive DNA elements (∼52.5% of the assembly), we demonstrated the usefulness of these data for biological investigation by identifying 29 key genes of the Notch signaling pathway, which is essential to tissue regeneration (e.g., [3, 26, 29, 35, 36]). We predict that the resources we are making available in this publication will be fundamental towards assembling the entire genome of O. brevispinum at the chromosomal level and establishing this brittle star species as a model for studies of regeneration and other fields.
Methods
Supporting genomic resources
Comparative genomic analysis relied on our original data and other genomic resources that are publicly available, including the complete genome of the purple sea urchin Strongylocentrotus purpuratus [10, 11] and other genomes available during the preparation of the manuscript (Table 1). These genomes represent 19 species and 17 genera of the classes Asteroidea (orders Forcipulatida and Valvatida), Crinoidea (order Comatulida), Echinoidea (orders Camarodonta and Cidaroida), Holothuroidea (orders Holothuriida and Synallactida), and Ophiuroidea (order Amphilepidida).
Additional interrogation and exploration of echinoderm genomic data leveraged resources made available through Echinobase (www.echinobase.org) [37]. Recent reviews of genomic resources for the study of echinoderm development and evolution are available elsewhere [1, 38].
Computational resources
All analysis steps were performed using computer clusters (Red Hat Enterprise Linux 7.5 with 64 CPUs and 512 GB to 1.5 TB of memory) as well as high-memory machines (Red Hat Enterprise Linux 7.5 with 16 CPUs and 512 GB to 4 TB of memory) at the University of North Carolina at Charlotte.
Animal collection
Adult individuals of the brittle star O. brevispinum were obtained from the Marine Biological Laboratory (Woods Hole, MA, USA). Specimens (catalog no. 1970) were received on April 13, 2016. Immediately after delivery, the package was opened and left overnight to slowly allow the seawater to warm up to room temperature. The animals were then kept in aquaria with aerated artificial seawater.
RNA-Seq
The complete RNA-Seq analysis (from RNA sampling and isolation through sequencing, de novo transcriptome assembly, and gene expression analysis) was described in [3]. Results correspond to BioProject number PRJNA596798 and SRA accession number SRP238266, and were deposited to NCBI’s Gene Expression Omnibus (GSE142391, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE142391). BUSCO scores for the transcriptome assembly were presented in [3].
DNA isolation and evaluation
A total of 100 mg of tissue was collected from the arms of a single non-regenerating adult individual through the natural autotomy response. The animal remained alive after tissue collection and regenerated its arms. The collected tissue samples were washed in filter-sterilized (0.2 μm) seawater, cut into small pieces with a sterile blade, and put into the lysis buffer. The high molecular weight nuclear genome DNA was then extracted using the Qiagen MagAttract HMW kit according to the manufacturer’s instructions with the following modifications that were found to increase the yield and the molecular weight of the resulting DNA: 1) the vortexing speed with magnetic beads was reduced from 1,400 rpm to 1,200 rpm; and 2) after the first vortexing step with the magnetic beads the samples were allowed to sit for 5 min at room temperature before being placed into the magnetic rack. The concentration of the extracted DNA was assessed using the Qubit dsDNA Broad Range Kit (ThermoFisher). The total amount of DNA was determined at ∼50 μg. The integrity of the genomic DNA sample was verified by agarose (0.6%) gel electrophoresis (at 2 V/cm, 4 hours).
DNA library preparation and sequencing
All DNA sequencing was performed at the David H. Murdock Research Institute (DHMRI; Kannapolis, NC, USA). Two different technologies were employed to obtain short and long sequence reads from the high molecular weight genomic DNA (HMW gDNA) sample extracted as described above (Fig. 3).
For sequencing on the Illumina HiSeq 2500 platform, subsamples of ≥8 μg of HMW gDNA were used to produce short sequence reads. Two complementary strategies were used to generate “short” and “long” insert libraries, respectively. First, three PCR-free paired-end read libraries (“short” insert size) with a ∼450 bp fragment size were constructed using the TruSeq DNA PCR-free library preparation kit (catalog no. FC-121-3001; Illumina, USA). Second, three mate paired-end read libraries (“long” insert size) were generated using the Illumina Nextera Mate Pair Sample Preparation Kit with the insert size of ∼3 Kbp. The short and long libraries were combined into their respective pools and sequenced in the Rapid Mode to produce 2 × 251 bp reads.
To generate long sequence reads (∼10 Kbp), we used the Pacific Biosciences Single Molecule Real-Time (SMRT) platform [39]. Based on the estimated genome size of ∼2.9 Gbp (see “Genome size estimate”, below), we aimed for ∼50× coverage in an effort to improve the assembly by reducing and closing gaps. Over 10 μg of HMW gDNA with an average fragment length of ≥60 Kbp was used to produce four SMRTbell libraries. The libraries were generated using the SMRTbell Template Prep Kit 1.0 following the PacBio “>20 Kbp Template Preparation Using BluePippin Size-Selection System (15-20 Kbp) for Sequel Systems” procedures and checklist (catalog no. 100-286-000-07; Pacific Biosciences, USA). Libraries were sequenced in a combined total of 35 SMRTcells.
DNA assembly and descriptive statistics
We performed several quality preprocessing steps on the raw DNA sequence reads before the assembly. The overall quality of sequence data before and after each step was determined by FastQC. See the supplementary information for additional details on the DNA assembly, including the commands used to execute the computational analyses described below (Additional file 3: File S3).
The Illumina reads from the short insert library were processed with Trimmomatic v0.38 [40] to remove the adapters and low-quality bases at the ends of each read. Trimmomatic also scanned each read with a 4-base sliding window, cutting when the average quality per base dropped below 15. Reads shorter than 36 bp were discarded. The long insert library Illumina reads were processed with NxTrim v0.3.0-alpha [41] using default parameters to separate reads into four different categories according to the adapter position: mate pairs, unknown (which are mostly mate pairs), paired end, and single end sequence reads.
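The sliding-window rule described above (a 4-base window, a mean quality cutoff of 15, and a 36 bp minimum length) can be sketched in a few lines of Python. This is a simplified illustration of the idea and not Trimmomatic's implementation, which handles the window boundary and adapter clipping in more detail; the function names are hypothetical, and the commands actually used are in Additional file 3: File S3.

```python
def sliding_window_trim(qualities, window=4, min_mean_quality=15):
    """Return how many leading bases of a read to keep.

    Scans from the 5' end and cuts at the start of the first window whose
    mean Phred quality is below the threshold (a simplification).
    """
    last_start = max(len(qualities) - window + 1, 1)
    for start in range(last_start):
        chunk = qualities[start:start + window]
        if chunk and sum(chunk) / len(chunk) < min_mean_quality:
            return start
    return len(qualities)


def keep_read(sequence, qualities, min_length=36):
    """Trim a read; return the trimmed sequence, or None if it is too short."""
    keep = sliding_window_trim(qualities)
    trimmed = sequence[:keep]
    return trimmed if len(trimmed) >= min_length else None


# Hypothetical read: 40 high-quality bases followed by 10 low-quality bases.
read = "A" * 50
quals = [30] * 40 + [2] * 10
print(keep_read(read, quals))
```

A window containing two low-quality bases still averages 16 here, so the cut falls where the window first contains three of them.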
All the sequence files produced by Trimmomatic and NxTrim were then evaluated with the HTQC toolkit v0.90.8 [42] to produce quality stats per file (using ht-stat) and perform the final read trimming and filtering (with ht-trim and ht-filter, respectively). Finally, FastUniq v1.1 [43] was used to remove duplicates introduced by PCR amplification from paired short reads.
All cleaned Illumina reads were used as input for the de novo assembly with ABySS v2.11 [44] using k-mers ranging from 23 to 61 (with a step of 2). The individual assemblies generated at each k-mer value were ranked using several metrics, including the number of sequences, total assembly length, L50, and N50 [45]. We then polished the resulting best assembly in Pilon v1.2.3 [46] to improve base calling and detect sequence variation.
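As a rough illustration of the k-mer sweep, the sketch below enumerates the k values (23 to 61 in steps of 2) and orders a set of candidate assemblies by their summary statistics. The text does not state how the metrics were combined, so the ranking rule used here (higher N50 first, then lower L50, then fewer sequences) is an assumption made for the example, and the statistics are hypothetical placeholders.

```python
# k-mer sizes from 23 to 61 inclusive, in steps of 2.
K_VALUES = list(range(23, 62, 2))


def rank_assemblies(stats):
    """Order assemblies from best to worst.

    `stats` maps k -> dict with keys 'n_seqs', 'n50', and 'l50'.
    Ranking rule (an assumption, not the published procedure):
    higher N50 first, then lower L50, then fewer sequences.
    """
    return sorted(
        stats,
        key=lambda k: (-stats[k]["n50"], stats[k]["l50"], stats[k]["n_seqs"]),
    )


# Hypothetical statistics for three k values, for illustration only.
example_stats = {
    23: {"n_seqs": 900, "n50": 1200, "l50": 300},
    41: {"n_seqs": 500, "n50": 2500, "l50": 120},
    61: {"n_seqs": 700, "n50": 2500, "l50": 150},
}
print(len(K_VALUES), rank_assemblies(example_stats))
```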
The long PacBio sequence reads were assembled with MaSuRCA v3.2.7 [47]. The contigs generated this way were polished using Arrow v2.3.3 [48] and merged with the ABySS assembly using quickmerge v3be7287 [49].
Assembly statistics were calculated using the “assembly-stats” [50] (developed at the Wellcome Sanger Institute) and “assemblathon_stats.pl” [51] (developed at the UC Davis Bioinformatics Core) tools. The completeness of protein-coding gene representation in the scaffolds of the draft genome assembly was assessed with BUSCO v4.0.6 [22] run in the “genome mode” against the evolutionarily conserved metazoan gene set (metazoa_odb10, creation date: 2021-02-17, number of species: 65, number of BUSCOs: 954) and the conserved eukaryota gene set (eukaryota_odb10, creation date: 2020-09-10, number of species: 70, number of BUSCOs: 255). Since O. brevispinum is not listed among the species available for Augustus training, we tested three other species: Homo sapiens, Drosophila melanogaster, and S. purpuratus.
Genome size estimate
The Animal Genome Size database has only one entry for species of Ophioderma, O. panamensis [52]. That entry indicates that the expected C-value for O. panamensis is ∼3.3 pg (∼3.23 Gbp) based on a bulk fluorometric assay [53]. The expected C-value variation in echinoderms is provided in Additional file 4: File S4. The genome size of O. brevispinum has never been determined before. We therefore estimated it using two complementary approaches: a Feulgen densitometry (FD) assay [54–56] and an estimate from the sequence data.
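The correspondence between the C-value in picograms and the genome size in base pairs quoted above follows from the widely used conversion of 1 pg of DNA to approximately 0.978 × 10^9 bp. The short Python sketch below reproduces that arithmetic; the conversion factor is a standard value supplied for this example rather than one stated in the text, and the function names are hypothetical.

```python
# Standard conversion (an assumption of this sketch): 1 pg of DNA
# corresponds to roughly 0.978 x 10^9 base pairs.
BP_PER_PG = 0.978e9


def pg_to_gbp(picograms):
    """Convert a DNA mass in picograms to gigabase pairs."""
    return picograms * BP_PER_PG / 1e9


def gbp_to_pg(gigabase_pairs):
    """Convert a genome size in gigabase pairs to picograms."""
    return gigabase_pairs * 1e9 / BP_PER_PG


# C-value reported for O. panamensis in the text: about 3.3 pg.
print(f"{pg_to_gbp(3.3):.2f} Gbp")
```

Under this factor, 3.3 pg corresponds to about 3.23 Gbp, matching the value given for O. panamensis.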
For the FD assay, soft uncalcified tissues (stomach wall and podia) from a single individual were finely minced with a razor blade, fixed in methanol:acetic acid (3:1) for 10 min and squashed in a drop of 45% acetic acid on a gelatin-coated slide. The samples were then air-dried and post-fixed in methanol:formalin:acetic acid (85:10:15) for 24 hours. After rinsing in tap and distilled water, the samples were hydrolyzed in 5N HCl and stained for 2 hours in Schiff reagent. After brief washes in a 0.5% sodium metabisulfite solution and then in water, the slides were dehydrated in an ethanol series, air dried and mounted in the immersion oil. Microscopic images were then taken at a consistent light intensity in the green monochromatic channel. The optical density of the stained nuclei was measured in the Fiji/ImageJ software [57]. To convert the optical density relative units to the absolute values of DNA mass per nucleus, the following control samples with known DNA content were processed and quantified along with the O. brevispinum specimens: chicken erythrocyte nuclei, trout erythrocyte nuclei, triploid trout nuclei, and human (male) cheek epithelial cell nuclei.
In addition to the FD assay, we also estimated the haploid genome size of O. brevispinum from the paired-end sequence data in Jellyfish v2.2.4 [58] using a k-mer-based statistical approach. The histograms produced by Jellyfish were used to estimate the haploid genome size in GenomeScope [59].
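The intuition behind a k-mer-based size estimate can be shown with a much simpler calculation than the model fitted by GenomeScope: divide the total number of non-error k-mers by the depth of the main histogram peak. The sketch below assumes the two-column "multiplicity count" text format written by `jellyfish histo`; the error cutoff, function names, and toy histogram are assumptions made for illustration, and this naive estimate does not account for heterozygosity or repeats as GenomeScope does.

```python
def parse_histogram(text):
    """Parse 'multiplicity count' lines into a dict {multiplicity: count}."""
    hist = {}
    for line in text.strip().splitlines():
        multiplicity, count = line.split()
        hist[int(multiplicity)] = int(count)
    return hist


def estimate_genome_size(hist, min_multiplicity=5):
    """Estimate haploid genome size as total k-mers divided by peak depth.

    K-mers seen fewer than `min_multiplicity` times are treated as
    sequencing errors and ignored (the cutoff is an assumption).
    """
    solid = {m: c for m, c in hist.items() if m >= min_multiplicity}
    peak_depth = max(solid, key=solid.get)
    total_kmers = sum(m * c for m, c in solid.items())
    return total_kmers / peak_depth


# Hypothetical toy histogram, not data from this study.
toy_histogram = """
1 1000
2 200
9 10
10 50
11 10
"""
print(estimate_genome_size(parse_histogram(toy_histogram)))
```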
Detailed protocols for genome size estimation based on FD and on k-mer-based statistics are provided in Additional file 5: File S5. The estimated genome size was evaluated considering the variation in haploid genome size among echinoderms (see Additional file 6: File S6).
Assembly and classification of repetitive DNA elements
Prior to this study, reported haploid genome sizes in echinoderms varied over 8-fold, ranging from 0.53 Gbp in the sea star Dermasterias imbricata to 4.3 Gbp in the sea cucumber Thyonella gemmata [52]. The largest haploid genome reported in the subphylum Asterozoa belongs to Ophioderma panamensis (order Ophiurida), with 3.3 Gbp [52].
In addition to the whole-genome assembly described above, we also performed a stand-alone de novo assembly of repetitive DNA elements in the genome of O. brevispinum with REPdenovo v0.0 [60] following the protocol described in the supplementary information (Additional file 7: File S7).
In short, we used REPdenovo to assemble repeats directly from the cleaned paired-end and single-end short sequence reads that resulted from the quality control steps described above, using different k-mer sizes ranging from 25 to 50 with a step of 2.
The contigs assembled with REPdenovo were used as input to RepeatModeler v1.0.11 [61] to build a library of repetitive genomic elements in the genome of O. brevispinum. This resulting brittle star repeat library was then combined with repeat libraries from the 2018 version of Repbase [62–65] and RepeatMasker v4.0.8 [66]. Only unique entries were kept to generate a final custom repeat library. This custom repeat library was then used to screen the draft genome of O. brevispinum with RepeatMasker to identify interspersed repeats and low complexity DNA sequences. Finally, the RepeatMasker output was manually curated and written into a General Feature Format version 3 (GFF3) file [67]. The details of the repeat library preparation and repeat masking are provided in Additional file 8: File S8.
Gene prediction and annotation
The gene prediction and annotation workflow is summarised in Fig. 5. We also provide template scripts listing the parameters used to execute each program listed below in Additional file 9: File S9.
Full gene structure annotations were generated with BRAKER v2.1.2 [68, 69], which integrates GeneMark-ET/EP+ v4.38 [70] and AUGUSTUS v.3.3.2 [71, 72] and allows for fully automated training from RNA-Seq or protein homology information. We also conducted an independent run with AUGUSTUS on selected scaffolds.
The BRAKER annotation pipeline used the genome of the purple sea urchin S. purpuratus (assembly Spur_5.0) as a reference [10, 73] and also the de novo assembled transcriptome of O. brevispinum [3].
The predicted gene models were aligned with BLAST v.2.9.0+ [30] against the following databases (each downloaded on October 25, 2019): the UniProt Archive (UniParc; https://www.uniprot.org/help/uniparc), the NCBI’s non-redundant nucleotide database (“nt”; https://ftp.ncbi.nlm.nih.gov/), and the complete EchinoDB database of protein coding genes (https://echinodb.uncc.edu/).
In addition, the genomic scaffolds were also aligned to the transcriptome of O. brevispinum and the cDNA sequences from S. purpuratus (assembly Spur_5.0) using exonerate v2.4.0ls [74], GMAP v2021.03.08 [75], and BLAT v36x2 [76].
The programs listed above generated different annotation tables. These tables were formatted as GTF files that are listed in the supplementary information. The main GTF files are provided in Additional file 2: File S2.
Annotation of genes associated with the Notch signaling pathway
As a case study to demonstrate the practical utility of our draft genome assembly, we annotated selected core components and modifiers of the Notch signaling pathway (Fig. 2) using reference amino acid sequences from the UniProt and Echinobase (www.echinobase.org) [37] databases. These reference sequences were aligned to target exons from the BRAKER annotation using TBLASTN, which searches a translated nucleotide database (here, built from the scaffolds) with a protein query, with E-value, bit score, and percentage identity cutoff thresholds of 1.0E-5, 30.0, and 23%, respectively. In parallel, we also aligned the amino acid query sequences to all assembled scaffolds using exonerate to test whether its exon predictions match the BLAST results.
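The three cutoffs above can be applied to tabular BLAST output with a short filter such as the following Python sketch. It assumes the default 12-column tabular format (`-outfmt 6`), in which columns 3, 11, and 12 hold percentage identity, E-value, and bit score; whether this format was used here, and whether the thresholds were applied inclusively, are assumptions of the example. The query and scaffold names are hypothetical.

```python
import csv
import io

E_VALUE_MAX = 1.0e-5   # E-value cutoff from the text
BIT_SCORE_MIN = 30.0   # bit score cutoff from the text
IDENTITY_MIN = 23.0    # percentage identity cutoff from the text


def filter_hits(tabular_text):
    """Return (query, subject, pident, evalue, bitscore) for passing hits.

    Expects BLAST tabular output with the default 12 columns; treating the
    cutoffs as inclusive is an assumption of this sketch.
    """
    hits = []
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        pident, evalue, bitscore = float(row[2]), float(row[10]), float(row[11])
        if (evalue <= E_VALUE_MAX and bitscore >= BIT_SCORE_MIN
                and pident >= IDENTITY_MIN):
            hits.append((row[0], row[1], pident, evalue, bitscore))
    return hits


# Hypothetical hits: the first passes all cutoffs, the second fails them.
example = "\n".join([
    "\t".join(["Notch_query", "scaffold_1", "45.0", "200", "110", "0",
               "1", "200", "900", "301", "1e-40", "150"]),
    "\t".join(["Notch_query", "scaffold_2", "21.0", "60", "47", "0",
               "1", "60", "10", "189", "0.5", "25"]),
])
print(filter_hits(example))
```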
We used NCBI’s Conserved Domain Search (www.ncbi.nlm.nih.gov/Structure/cdd) to identify conserved protein domains in the brittle star Notch pathway genes returned by BLAST and/or exonerate. The conserved domains were searched against the CDD v3.19 database, with an E-value threshold of 0.01 and composition-based statistics adjustment. We stored the best 500 hits for each gene sequence and then manually inspected the output for the presence or absence of diagnostic domains.
The complete list of genes related to the Notch signaling pathway that we searched is provided in the Results section. Just as for the annotation tables for the genomic scaffolds, we also formatted the results of the annotation of Notch-related genes in GTF files that we provide as supplementary information (Additional file 2: File S2).
Mitochondrial genome
The mitochondrial genome (mitogenome) of O. brevispinum was contained in a single scaffold generated during the whole-genome assembly. It was identified via sequence alignments using BLAST v.2.9.0+ [30] and a reference sequence from an ophiuroid of the same family, Ophiarachnella gorgonia (Müller & Troschel, 1842) (Echinodermata: Ophiuroidea: Ophiacanthida: Ophiodermatidae; NCBI accession number NC_046053.1).
The putative circular sequence was extracted from the selected scaffold using AWA (available from https://gitlab.com/MachadoDJ/awa; accessed on July 22, 2021) [77]. Next, we remapped filtered short paired-end reads back to AWA’s putative mitogenome using Bowtie2 to review base calling. Finally, we predicted genes with the MITOS WebServer (version 2; available from http://mitos2.bioinf.uni-leipzig.de/index.py) [78] and confirmed the annotation of tRNAs with an independent analysis in tRNAscan-SE 2.0 [79, 80].
Availability of data and materials
The data sets supporting the conclusions of this article are included within the article and its additional files. Supplementary information accompanying this paper is available at Zenodo, DOI: 10.5281/zenodo.6618000. Data corresponding to our draft genome assembly of O. brevispinum can be found in NCBI’s databases under BioProject number PRJNA779014, BioSample number SAMN23008116, and GenBank accession number JAMKCH000000000.1. The mitochondrial genome sequence and annotations have been submitted to NCBI under the same BioProject number (PRJNA779014).
Abbreviations
- BAM: binary alignment map
- BLAST: Basic Local Alignment Search Tool
- bp: base pair
- dsDNA: double-stranded DNA
- EBI: European Bioinformatics Institute
- EMBL: European Molecular Biology Laboratory
- EMBOSS: European Molecular Biology Open Software Suite
- ENCODE: Encyclopedia of DNA Elements
- FASTA: pronounced “fast A”; stands for “Fast-All”
- FASTP: pronounced “fast P”; stands for “Fast-Protein”
- FD: Feulgen densitometry
- FDR: false discovery rate
- FPKM: fragments per kilobase of exon per million fragments mapped
- GB: gigabyte (approx. 1024 MB)
- Gb: same as Gbp
- Gbp: giga base pairs (1,000,000,000 bp)
- gDNA: genomic DNA
- GFF3: general feature format or gene-finding format, version 3
- GO: gene ontology
- HTS: high-throughput sequencing
- HMW: high molecular weight
- InDel: insertion or deletion
- Kb: same as Kbp
- Kbp: kilo base pairs (1,000 bp)
- L50: if we sort sequences by size and sum their sizes in succession from the longest sequence, the L50 is the number of sequences needed to reach 50% of the total size
- MB: megabyte; a unit of information equal to 2^20 bytes or, loosely, one million bytes
- Mb: same as Mbp
- Mbp: mega base pairs (1,000,000 bp)
- mRNA: messenger RNA
- MSA: multiple sequence alignment
- mtDNA: mitochondrial DNA
- nt: nucleotide
- nucDNA: nuclear DNA
- N50: if we sort sequences by size and sum their sizes in succession from the longest sequence, the N50 is the length of the last sequence added to reach 50% of the total size
- ORFs: open reading frames
- PacBio: Pacific Biosciences
- PE: paired-end (sequence of both ends of a fragment)
- rRNA: ribosomal RNA
- SMRT: Single Molecule Real-Time
- SAM: sequence alignment/map
- SE: standard error
- SR: single-read (sequencing from only one end)
- ssDNA: single-stranded DNA
- tRNA: transfer RNA
- VCF: variant call format
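The N50 and L50 definitions in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the assembly statistics in this study. It assumes the common convention of sorting sequences from longest to shortest; the function name and the example lengths are hypothetical.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths."""
    if not lengths:
        raise ValueError("at least one sequence length is required")
    ordered = sorted(lengths, reverse=True)  # longest sequence first
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the last sequence added (N50);
            # 'count' is how many sequences were needed (L50).
            return length, count


# Made-up scaffold lengths: total 300, so the 50% mark is 150.
print(n50_l50([80, 70, 50, 40, 30, 20, 10]))  # (70, 2)
```

Here the two longest sequences (80 + 70) reach half of the total size, so the N50 is 70 and the L50 is 2.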
References
Cary GA, Cameron RA, Hinman VF. Genomic resources for the study of echinoderm development and evolution. In: Methods Cell Biol, vol 151. New York: Elsevier: 2019. p. 65–88. https://doi.org/10.1016/bs.mcb.2018.11.019.
Carnevali MC. Regeneration in Echinoderms: repair, regrowth, cloning. Invert Surviv J. 2006; 3(1):64–76.
Mashanov V, Akiona J, Khoury M, Ferrier J, Reid R, Machado DJ, Zueva O, Janies D. Active Notch signaling is required for arm regeneration in a brittle star. PLoS ONE. 2020; 15(5):0232981. https://doi.org/10.1371/journal.pone.0232981.
Mashanov VS, Zueva OR, García-Arrarás JE. Transcriptomic changes during regeneration of the central nervous system in an echinoderm. BMC Genomics. 2014; 15(1):1–21.
Quispe-Parra DJ, Medina-Feliciano JG, Cruz-González S, Ortiz-Zuazaga H, García-Arrarás JE. Transcriptomic analysis of early stages of intestinal regeneration in Holothuria glaberrima. Sci Rep. 2021; 11(1):1–14.
Purushothaman S, Saxena S, Meghah V, Swamy CVB, Ortega-Martinez O, Dupont S, Idris M. Transcriptomic and proteomic analyses of Amphiura filiformis arm tissue-undergoing regeneration. J Proteomics. 2015; 112:113–24.
Czarkwiani A, Dylus DV, Carballo L, Oliveri P. Fgf signalling plays similar roles in development and regeneration of the skeleton in the brittle star Amphiura filiformis. Development. 2021; 148(10):180760. https://doi.org/10.1242/dev.180760.
Mashanov VS, Zueva OR, García-Arrarás JE. Myc regulates programmed cell death and radial glia dedifferentiation after neural injury in an echinoderm. BMC Dev Biol. 2015; 15(1):1–9.
Alicea-Delgado M, García-Arrarás JE. Wnt/β-catenin signaling pathway regulates cell proliferation but not muscle dedifferentiation nor apoptosis during sea cucumber intestinal regeneration. Dev Biol. 2021; 480:105–13.
Sodergren E, Weinstock GM, Davidson EH, Cameron RA, Gibbs RA, Angerer RC, Angerer LM, Arnone MI, Burgess DR, Burke RD, et al. The genome of the sea urchin Strongylocentrotus purpuratus. Science. 2006; 314(5801):941–52. https://doi.org/10.1126/science.1133609.
Cameron RA, Kudtarkar P, Gordon SM, Worley KC, Gibbs RA. Do echinoderm genomes measure up? Mar Genom. 2015; 22:1–9. https://doi.org/10.1016/j.margen.2015.02.004.
Kinjo S, Kiyomoto M, Yamamoto T, Ikeo K, Yaguchi S. Hpbase: A genome database of a sea urchin, Hemicentrotus pulcherrimus. Dev Growth Differ. 2018; 60(3):174–82. https://doi.org/10.1111/dgd.12429.
Sergiev PV, Artemov AA, Prokhortchouk EB, Dontsova OA, Berezkin GV. Genomes of Strongylocentrotus franciscanus and Lytechinus variegatus: are there any genomic explanations for the two order of magnitude difference in the lifespan of sea urchins? Aging. 2016; 8(2):260. https://doi.org/10.18632/aging.100889.
Kudtarkar P, Cameron RA. Echinobase: an expanding resource for echinoderm genomic information. Database. 2017;2017. https://doi.org/10.1093/database/bax074.
Zhang X, Sun L, Yuan J, Sun Y, Gao Y, Zhang L, Li S, Dai H, Hamel J-F, Liu C, et al. The sea cucumber genome provides insights into morphological evolution and visceral regeneration. PLoS Biol. 2017; 15(10):2003790. https://doi.org/10.1371/journal.pbio.2003790.
Hall MR, Kocot KM, Baughman KW, Fernandez-Valverde SL, Gauthier ME, Hatleberg WL, Krishnan A, McDougall C, Motti CA, Shoguchi E, et al. The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest. Nature. 2017; 544(7649):231–4. https://doi.org/10.1038/nature22033.
Long KA, Nossa CW, Sewell MA, Putnam NH, Ryan JF. Low coverage sequencing of three echinoderm genomes: the brittle star Ophionereis fasciata, the sea star Patiriella regularis, and the sea cucumber Australostichopus mollis. GigaScience. 2016; 5(1):13742–016. https://doi.org/10.1186/s13742-016-0125-6.
Say T. On the species of the Linnean genus Asterias inhabiting the coast of the United States. P Acad Nat Sci Phila. 1825; 5(1):151–4.
Bowmer T, Keegan B. Field survey of the occurrence and significance of regeneration in Amphiura filiformis (Echinodermata: Ophiuroidea) from Galway Bay, west coast of Ireland. Mar Biol. 1983; 74(1):65–71.
Weber AA-T, Dupont S, Chenuil A. Thermotolerance and regeneration in the brittle star species complex Ophioderma longicauda: A preliminary study comparing lineages and mediterranean basins. Comptes Rendus Biologies. 2013; 336(11-12):572–81. https://doi.org/10.1016/j.crvi.2013.10.004.
Hurlbut GD, Kankel MW, Lake RJ, Artavanis-Tsakonas S. Crossing paths with notch in the hyper-network. Curr Opin Cell Biol. 2007; 19(2):166–75.
Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015; 31(19):3210–2. https://doi.org/10.1093/bioinformatics/btv351.
Waterhouse RM, Seppey M, Simão FA, Manni M, Ioannidis P, Klioutchnikov G, Kriventseva EV, Zdobnov EM. Busco applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2018; 35(3):543–8. https://doi.org/10.1093/molbev/msx319.
Walton KD, Croce JC, Glenn TD, Wu S-Y, McClay DR. Genomics and expression profiles of the hedgehog and notch signaling pathways in sea urchin development. Dev Biol. 2006; 300(1):153–64. https://doi.org/10.1016/j.ydbio.2006.08.064.
Marlow H, Roettinger E, Boekhout M, Martindale MQ. Functional roles of notch signaling in the cnidarian Nematostella vectensis. Dev Biol. 2012; 362(2):295–308. https://doi.org/10.1016/j.ydbio.2011.11.012.
Layden MJ, Martindale MQ. Non-canonical notch signaling represents an ancestral mechanism to regulate neural differentiation. EvoDevo. 2014; 5(1):1–14. https://doi.org/10.1186/2041-9139-5-30.
Erkenbrack EM. Notch-mediated lateral inhibition is an evolutionarily conserved mechanism patterning the ectoderm in echinoids. Dev Genes Evol. 2018; 228(1):1–11. https://doi.org/10.1007/s00427-017-0599-y.
Favarolo MB, López SL. Notch signaling in the division of germ layers in bilaterian embryos. Mech Develop. 2018; 154:122–44. https://doi.org/10.1016/j.mod.2018.06.005.
Lloyd-Lewis B, Mourikis P, Fre S. Notch signalling: sensor and instructor of the microenvironment to coordinate cell fate and organ morphogenesis. Curr Opin Cell Biol. 2019; 61:16–23. https://doi.org/10.1016/j.ceb.2019.06.003.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990; 215(3):403–10. https://doi.org/10.1016/S0022-2836(05)80360-2.
Lee T, Bae YJ, Shin S. Mitochondrial gene rearrangement and phylogenetic relationships in the amphilepidida and ophiacanthida (echinodermata, ophiuroidea). Mar Biol Res. 2019; 15(1):26–35. https://doi.org/10.1080/17451000.2019.1601226.
Zueva O, Khoury M, Heinzeller T, Mashanova D, Mashanov V. The complex simplicity of the brittle star nervous system. Front Zool. 2018; 15(1):1–26. https://doi.org/10.1186/s12983-017-0247-4.
Clark EG, Fezzaa K, Burke J, Racicot R, Shaw J, Westacott S, Briggs D. A farewell to arms: using x-ray synchrotron imaging to investigate autotomy in brittle stars. Zoomorphology. 2019; 138(3):419–24. https://doi.org/10.1007/s00435-019-00451-7.
Mashanov V, Zueva O. Radial glia in echinoderms. Dev Neurobiol. 2019; 79(5):396–405. https://doi.org/10.1002/dneu.22659.
Ehebauer M, Hayward P, Martinez-Arias A. Notch signaling pathway. Sci STKE. 2006; 2006(364):7. https://doi.org/10.1126/stke.3642006cm7.
Cormier S, Le Bras S, Souilhol C, Vandormael-Pournin S, Durand B, Babinet C, Baldacci P, Cohen-Tannoudji M. The murine ortholog of notchless, a direct regulator of the notch pathway in Drosophila melanogaster, is essential for survival of inner cell mass cells. Mol Cell Biol. 2006; 26(9):3541–9. https://doi.org/10.1128/MCB.26.9.3541-3549.2006.
Cary GA, Cameron RA, Hinman VF. EchinoBase: tools for echinoderm genome analyses. In: Kollmar M, editor. Eukaryotic Genomic Databases (Methods Mol Biol), vol 1757. New York: Springer: 2018. p. 349–69. https://doi.org/10.1007/978-1-4939-7737-6_12.
Kondo M, Akasaka K. Current status of echinoderm genome analysis - what do we know? Curr Genomics. 2012; 13(2):134–43. https://doi.org/10.2174/138920212799860643.
Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009; 323(5910):133–8. https://doi.org/10.1126/science.1162986.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014; 30(15):2114–20. https://doi.org/10.1093/bioinformatics/btu170.
O’Connell J, Schulz-Trieglaff O, Carlson E, Hims MM, Gormley NA, Cox AJ. NxTrim: optimized trimming of Illumina mate pair reads. Bioinformatics. 2015; 31(12):2035–7. https://doi.org/10.1093/bioinformatics/btv057.
Yang X, Liu D, Liu F, Wu J, Zou J, Xiao X, Zhao F, Zhu B. HTQC: a fast quality control toolkit for Illumina sequencing data. BMC Bioinformatics. 2013; 14(1):1–4. https://doi.org/10.1186/1471-2105-14-33.
Xu H, Luo X, Qian J, Pang X, Song J, Qian G, Chen J, Chen S. FastUniq: a fast de novo duplicates removal tool for paired short reads. PloS ONE. 2012; 7(12):52249. https://doi.org/10.1371/journal.pone.0052249.
Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009; 19(6):1117–23. https://doi.org/10.1101/gr.089532.108.
Thrash A, Hoffmann F, Perkins A. Toward a more holistic method of genome assembly assessment. BMC Bioinformatics. 2020; 21(4):1–8. https://doi.org/10.1186/s12859-020-3382-4.
Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PloS ONE. 2014; 9(11):112963. https://doi.org/10.1371/journal.pone.0112963.
Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013; 29(21):2669–77. https://doi.org/10.1093/bioinformatics/btt476.
Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JC, Wides R, et al. The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002; 298(5591):129–49. https://doi.org/10.1126/science.1076181.
Chakraborty M, Baldwin-Brown JG, Long AD, Emerson J. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 2016; 44(19):147. https://doi.org/10.1093/nar/gkw654.
Hunt M. assembly-stats. 2014. https://github.com/sanger-pathogens/assembly-stats. Accessed 18 Feb 2019.
UC Davis Bioinformatics Core. assemblathon2-analysis. 2012. https://github.com/ucdavis-bioinformatics/assemblathon2-analysis. Accessed 24 Mar 2019.
Gregory TR. Animal genome size database. 2020. http://www.genomesize.com. Accessed 28 Oct 2020.
Hinegardner R. Cellular DNA content of the Echinodermata. Comp Biochem Physiol. 1974; 49B:219–26.
Hardie DC, Gregory TR, Hebert PD. From pixels to picograms: a beginners’ guide to genome quantification by Feulgen image analysis densitometry. J Histochem Cytochem. 2002; 50(6):735–49. https://doi.org/10.1177/002215540205000601.
Rasch EM, Lee CE, Wyngaard GA. DNA–Feulgen cytophotometric determination of genome size for the freshwater-invading copepod Eurytemora affinis. Genome. 2004; 47(3):559–64. https://doi.org/10.1139/g04-014.
Donnenberg VS, Landreneau RJ, Pfeifer ME, Donnenberg AD. Flow cytometric determination of stem/progenitor content in epithelial tissues: an example from nonsmall lung cancer and normal lung. Cytometry Part A. 2013; 83(1):141–9. https://doi.org/10.1002/cyto.a.22156.
Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, et al. Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012; 9(7):676–82.
Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011; 27(6):764–70. https://doi.org/10.1093/bioinformatics/btr011.
Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H, Gurtowski J, Schatz MC. Genomescope: fast reference-free genome profiling from short reads. Bioinformatics. 2017; 33(14):2202–4. https://doi.org/10.1093/bioinformatics/btx153.
Chu C, Nielsen R, Wu Y. REPdenovo: inferring de novo repeat motifs from short sequence reads. PloS ONE. 2016; 11(3):0150719. https://doi.org/10.1371/journal.pone.0150719.
Smit AFA, Hubley R. RepeatModeler Open-1.0. 2008. http://www.repeatmasker.org. Accessed 10 Nov 2020.
Jurka J. Repeats in genomic dna: mining and meaning. Curr Opin Struc Biol. 1998; 8(3):333–7. https://doi.org/10.1016/S0959-440X(98)80067-5.
Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000; 16(9):418–20. https://doi.org/10.1016/S0168-9525(00)02093-X.
Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005; 110(1-4):462–7. https://doi.org/10.1159/000084979.
Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA. 2015; 6(1):11. https://doi.org/10.1186/s13100-015-0041-9.
Smit AFA, Hubley R, Green P. RepeatMasker Open-4.0. 2013. http://www.repeatmasker.org. Accessed 10 Nov 2020.
Reese MG, Moore B, Batchelor C, Salas F, Cunningham F, Marth GT, Stein L, Flicek P, Yandell M, Eilbeck K. A standard variation file format for human genome sequences. Genome Biol. 2010; 11(8):88. https://doi.org/10.1186/gb-2010-11-8-r88.
Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. Braker1: unsupervised rna-seq-based genome annotation with genemark-et and augustus. Bioinformatics. 2016; 32(5):767–9. https://doi.org/10.1093/bioinformatics/btv661.
Brŭna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. Braker2: Automatic eukaryotic genome annotation with genemark-ep+ and augustus supported by a protein database. NAR Genom Bioinforma. 2021; 3(1):108. https://doi.org/10.1093/nargab/lqaa108.
Lomsadze A, Burns PD, Borodovsky M. Integration of mapped rna-seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res. 2014; 42(15):119. https://doi.org/10.1093/nar/gku557.
Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cdna alignments to improve de novo gene finding. Bioinformatics. 2008; 24(5):637–44. https://doi.org/10.1093/bioinformatics/btn013.
Stanke M, Schöffmann O, Morgenstern B, Waack S. Gene prediction in eukaryotes with a generalized hidden markov model that uses hints from external sources. BMC Bioinformatics. 2006; 7(1):1–11. https://doi.org/10.1186/1471-2105-7-62.
Ensembl Metazoa Home. Strongylocentrotus purpuratus (Spur 01) (Spur_5.0). 2020. https://metazoa.ensembl.org/Strongylocentrotus_purpuratus. Accessed 12 Aug 2020.
Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005; 6(1):1–11. https://doi.org/10.1186/1471-2105-6-31.
Wu TD, Watanabe CK. Gmap: a genomic mapping and alignment program for mrna and est sequences. Bioinformatics. 2005; 21(9):1859–75. https://doi.org/10.1093/bioinformatics/bti310.
Kent WJ. Blat—the blast-like alignment tool. Genome Res. 2002; 12(4):656–64. https://doi.org/10.1101/gr.229202.
Jacob Machado D, Janies D, Brouwer C, Grant T. A new strategy to infer circularity applied to four new complete frog mitogenomes. Ecol Evol. 2018; 8(8):4011–8. https://doi.org/10.1002/ece3.3918.
Donath A, Jühling F, Al-Arab M, Bernhart SH, Reinhardt F, Stadler PF, Middendorf M, Bernt M. Improved annotation of protein-coding genes boundaries in metazoan mitochondrial genomes. Nucleic Acids Res. 2019; 47(20):10543–52. https://doi.org/10.1093/nar/gkz833.
Lowe TM, Eddy SR. trnascan-se: a program for improved detection of transfer rna genes in genomic sequence. Nucleic Acids Res. 1997; 25(5):955–64. https://doi.org/10.1093/nar/25.5.955.
Chan PP, Lowe TM. trnascan-se: searching for trna genes in genomic sequences. In: Gene Prediction. Springer: 2019. p. 1–14. https://doi.org/10.1007/978-1-4939-9173-0_1.
Hennebert E, Leroy B, Wattiez R, Ladurner P. An integrated transcriptomic and proteomic analysis of sea star epidermal secretions identifies proteins involved in defense and adhesion. J Proteomics. 2015; 128:83–91. https://doi.org/10.1016/j.jprot.2015.07.002.
Ruiz-Ramos DV, Schiebelhut LM, Hoff KJ, Wares JP, Dawson MN. An initial comparative genomic autopsy of wasting disease in sea stars. Mol Ecol. 2020; 29(6):1087–102. https://doi.org/10.1111/mec.15386.
Baughman KW, McDougall C, Cummins SF, Hall M, Degnan BM, Satoh N, Shoguchi E. Genomic organization of hox and parahox clusters in the echinoderm, Acanthaster planci. Genesis. 2014; 52(12):952–8. https://doi.org/10.1002/dvg.22840.
Yasuda N, Hamaguchi M, Sasaki M, Nagai S, Saba M, Nadaoka K. Complete mitochondrial genome sequences for crown-of-thorns starfish Acanthaster planci and Acanthaster brevispinus. BMC Genomics. 2006; 7(1):1–10. https://doi.org/10.1186/1471-2164-7-17.
Jung G, Lee Y-H. Complete mitochondrial genome of chilean sea urchin: Loxechinus albus (camarodonta, parechinidae). Mitochondrial DNA. 2015; 26(6):883–4. https://doi.org/10.3109/19401736.2013.809449.
Warner JF, Lord JW, Schreiter SA, Nesbit KT, Hamdoun A, Lyons DC. Chromosomal-level genome assembly of the painted sea urchin Lytechinus pictus: A genetically enabled model system for cell biology and embryonic development. Genome Biol Evol. 2021; 13(4):061. https://doi.org/10.1093/gbe/evab061.
Davidson PL, Guo H, Wang L, Berrio A, Zhang H, Chang Y, Soborowski AL, McClay DR, Fan G, Wray GA. Chromosomal-level genome assembly of the sea urchin lytechinus variegatus substantially improves functional genomic analyses. Genome Biol Evol. 2020; 12(7):1080–6. https://doi.org/10.1093/gbe/evaa101.
Bronstein O, Kroh A. The first mitochondrial genome of the model echinoid Lytechinus variegatus and insights into odontophoran phylogenetics. Genomics. 2019; 111(4):710–8. https://doi.org/10.1016/j.ygeno.2018.04.008.
Morrison AMS, Goldstone JV, Lamb DC, Kubota A, Lemaire B, Stegeman JJ. Identification, modeling and ligand affinity of early deuterostome cyp51s, and functional characterization of recombinant zebrafish sterol 14 α-demethylase. BBA-Gen Subj. 2014; 1840(6):1825–36. https://doi.org/10.1016/j.bbagen.2013.12.009.
Tu Q, Cameron RA, Worley KC, Gibbs RA, Davidson EH. Gene structure in the sea urchin Strongylocentrotus purpuratus based on transcriptome analysis. Genome Res. 2012; 22(10):2079–87. https://doi.org/10.1101/gr.139170.112.
Stevens ME, Dhillon J, Miller CA, Messier-Solek C, Majeske AJ, Zuelke D, Rast JP, Smith LC. Sptie1/2 is expressed in coelomocytes, axial organ and embryos of the sea urchin Strongylocentrotus purpuratus, and is an orthologue of vertebrate tie1 and tie2. Dev Comp Immunol. 2010; 34(8):884–95. https://doi.org/10.1016/j.dci.2010.03.010.
Tartari M, Gissi C, Lo Sardo V, Zuccato C, Picardi E, Pesole G, Cattaneo E. Phylogenetic comparison of huntingtin homologues reveals the appearance of a primitive polyq in sea urchin. Mol Biol Evol. 2008; 25(2):330–8. https://doi.org/10.1093/molbev/msm258.
Tu Q, Brown CT, Davidson EH, Oliveri P. Sea urchin forkhead gene family: phylogeny and embryonic expression. Dev Biol. 2006; 300(1):49–62. https://doi.org/10.1016/j.ydbio.2006.09.031.
Neill AT, Moy GW, Vacquier VD. Polycystin-2 associates with the polycystin-1 homolog, surej3, and localizes to the acrosomal region of sea urchin spermatozoa. Mol Reprod Dev. 2004; 67(4):472–7. https://doi.org/10.1002/mrd.20033.
Multerer KA, Smith LC. Two cdnas from the purple sea urchin, Strongylocentrotus purpuratus, encoding mosaic proteins with domains found in factor h, factor i, and complement components c6 and c7. Immunogenetics. 2004; 56(2):89–106. https://doi.org/10.1007/s00251-004-0665-2.
Kamei N, Glabe CG. The species-specific egg receptor for sea urchin sperm adhesion is ebr1, a novel adamts protein. Gene Dev. 2003; 17(20):2502–7. https://doi.org/10.1101/gad.1133003.
Tombes RM, Faison MO, Turbeville J. Organization and evolution of multifunctional ca2+/cam-dependent protein kinase genes. Gene. 2003; 322:17–31. https://doi.org/10.1016/j.gene.2003.08.023.
Sirotkin V, Seipel S, Krendel M, Bonder EM. Characterization of sea urchin unconventional myosins and analysis of their patterns of expression during early embryogenesis. Mol Reprod Dev. 2000; 57(2):111–26.
Pancer Z. Dynamic expression of multiple scavenger receptor cysteine-rich genes in coelomocytes of the purple sea urchin. P Natl Acad Sci. 2000; 97(24):13156–61. https://doi.org/10.1073/pnas.230096397.
Pancer Z, Rast JP, Davidson EH. Origins of immunity: transcription factors and homologues of effector genes of the vertebrate immune system expressed in sea urchin coelomocytes. Immunogenetics. 1999; 49(9):773–86. https://doi.org/10.1007/s002510050551.
LaFleur Jr GJ, Horiuchi Y, Wessel GM. Sea urchin ovoperoxidase: oocyte-specific member of a heme-dependent peroxidase superfamily that functions in the block to polyspermy. Mechanisms Devel. 1998; 70(1-2):77–89. https://doi.org/10.1016/s0925-4773(97)00178-0.
Hartman JJ, Mahr J, McNally K, Okawa K, Iwamatsu A, Thomas S, Cheesman S, Heuser J, Vale RD, McNally FJ. Katanin, a microtubule-severing protein, is a novel aaa atpase that targets to the centrosome using a wd40-containing subunit. Cell. 1998; 93(2):277–87. https://doi.org/10.1016/s0092-8674(00)81578-0.
Marsden M, Burke RD. The βl integrin subunit is necessary for gastrulation in sea urchin embryos. Dev Biol. 1998; 203(1):134–48. https://doi.org/10.1006/dbio.1998.9033.
Marsden M, Burke R. Cloning and characterization of novel β integrin subunits from a sea urchin. Dev Biol. 1997; 181(2):234–45. https://doi.org/10.1006/dbio.1996.8451.
Valverde JR, Marco R, Garesse R. A conserved heptamer motif for ribosomal rna transcription termination in animal mitochondria. P Natl Acad Sci. 1994; 91(12):5368–71. https://doi.org/10.1073/pnas.91.12.5368.
Qureshi SA, Jacobs HT. Two distinct, sequence-specific dna-binding proteins interact independently with the major replication pause region of sea urchin mtdna. Nucleic Acids Res. 1993; 21(12):2801–8. https://doi.org/10.1093/nar/21.12.2801.
Kitagawa M. Notch signalling in the nucleus: roles of mastermind-like (maml) transcriptional coactivators. J Biochem. 2016; 159(3):287–94. https://doi.org/10.1093/jb/mvv123.
Jin K, Zhou W, Han X, Wang Z, Li B, Jeffries S, Tao W, Robbins DJ, Capobianco AJ. Acetylation of mastermind-like 1 by p300 drives the recruitment of nack to initiate notch-dependent transcription. Cancer Res. 2017; 77(16):4228–37. https://doi.org/10.1158/0008-5472.CAN-16-3156.
Wallberg AE, Pedersen K, Lendahl U, Roeder RG. p300 and pcaf act cooperatively to mediate transcriptional activation from chromatin templates by notch intracellular domains in vitro. Mol Cell Biol. 2002; 22(22):7812–9. https://doi.org/10.1128/MCB.22.22.7812-7819.2002.
Weaver KL, Alves-Guerra M-C, Jin K, Wang Z, Han X, Ranganathan P, Zhu X, DaSilva T, Liu W, Ratti F, et al. Nack is an integral component of the notch transcriptional activation complex and is critical for development and tumorigenesis. Cancer Res. 2014; 74(17):4741–51. https://doi.org/10.1158/0008-5472.CAN-14-1547.
Kim GS, Park H-S, Lee YC. Opthis identifies the molecular basis of the direct interaction between csl and smrt corepressor. Mol Cells. 2018; 41(9):842. https://doi.org/10.14348/molcells.2018.0196.
Gazave E, Lapébie P, Richards GS, Brunet F, Ereskovsky AV, Degnan BM, Borchiellini C, Vervoort M, Renard E. Origin and evolution of the notch signalling pathway: an overview from eukaryotic genomes. BMC Evol Biol. 2009; 9(1):1–27. https://doi.org/10.1186/1471-2148-9-249.
Shi S, Stanley P. Protein o-fucosyltransferase 1 is an essential component of notch signaling pathways. P Natl Acad Sci. 2003; 100(9):5234–9. https://doi.org/10.1073/pnas.0831126100.
Taylor P, Takeuchi H, Sheppard D, Chillakuri C, Lea SM, Haltiwanger RS, Handford PA. Fringe-mediated extension of o-linked fucose in the ligand-binding region of notch1 increases binding to mammalian notch ligands. P Natl Acad Sci. 2014; 111(20):7290–5. https://doi.org/10.1073/pnas.1319683111.
Van Tetering G, Vooijs M. Proteolytic cleavage of notch: “hit and run”. Curr Mol Med. 2011; 11(4):255–69. https://doi.org/10.2174/156652411795677972.
Koutelou E, Sato S, Tomomori-Sato C, Florens L, Swanson SK, Washburn MP, Kokkinaki M, Conaway RC, Conaway JW, Moschonas NK. Neuralized-like 1 (neurl1) targeted to the plasma membrane by N-myristoylation regulates the notch ligand jagged1. J Biol Chem. 2008; 283(7):3846–53. https://doi.org/10.1074/jbc.M706974200.
Koo B-K, Yoon K-J, Yoo K-W, Lim H-S, Song R, So J-H, Kim C-H, Kong Y-Y. Mind bomb-2 is an e3 ligase for notch ligand. J Biol Chem. 2005; 280(23):22335–42. https://doi.org/10.1074/jbc.M501631200.
Kopan R, Ilagan MXG. The canonical notch signaling pathway: unfolding the activation mechanism. Cell. 2009; 137(2):216–33. https://doi.org/10.1016/j.cell.2009.03.045.
Teider N, Scott DK, Neiss A, Weeraratne SD, Amani VM, Wang Y, Marquez VE, Cho Y-J, Pomeroy SL. Neuralized1 causes apoptosis and downregulates notch target genes in medulloblastoma. Neuro Oncol. 2010; 12(12):1244–56. https://doi.org/10.1093/neuonc/noq091.
Iso T, Kedes L, Hamamori Y. Hes and herp families: multiple effectors of the notch signaling pathway. J Cell Physiol. 2003; 194(3):237–55. https://doi.org/10.1002/jcp.10208.
Young PW. Lnx1/lnx2 proteins: Functions in neuronal signalling and beyond. Neuronal Signal. 2018; 2(2):20170191. https://doi.org/10.1042/NS20170191.
Zhou Y, Atkins JB, Rompani SB, Bancescu DL, Petersen PH, Tang H, Zou K, Stewart SB, Zhong W. The mammalian golgi regulates numb signaling in asymmetric cell division by releasing acbd3 during mitosis. Cell. 2007; 129(1):163–78. https://doi.org/10.1016/j.cell.2007.02.037.
Sakata T, Sakaguchi H, Tsuda L, Higashitani A, Aigaki T, Matsuno K, Hayashi S. Drosophila nedd4 regulates endocytosis of notch and suppresses its ligand-independent activation. Curr Biol. 2004; 14(24):2228–36. https://doi.org/10.1016/j.cub.2004.12.028.
Acknowledgements
No comments.
Funding
Research reported in this publication was supported by the National Institute for General Medical Sciences of the National Institutes of Health under award number 1R15GM128066-01. We also acknowledge funding from the University of North Florida. We acknowledge funding and logistical support from several entities of the University of North Carolina at Charlotte including: The Bioinformatics Services Division, the Department of Bioinformatics and Genomics, the Bioinformatics Research Center, University Research Computing, the College of Computing and Informatics. We are grateful for funding from the Belk Family. The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Author information
Contributions
VM and DJM share first authorship. VM, RR, CB, and DAJ: initial conceptualization and funding acquisition. VM, DAJ, RR, DJM, and JK: writing (original draft, review, and editing). RR and DJM: methodology, formal analysis, investigation, resources, software, validation, data curation, and visualization. DAJ, VM, CB, DJM, and JK: project administration. The authors read and approved the final manuscript.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Supplementary information accompanies this paper at Zenodo, DOI: https://doi.org/10.5281/zenodo.6618000.
Additional file 1
File S1. Summary of sequence read statistics (number of sequences, number of base pairs, maximum read length, average read length, sum, estimated genome size, and estimated overall sequence depth).
Additional file 2
File S2. Annotation files in GTF format:
- notch-related.gtf: Independent annotation of genes that belong to the Notch signaling pathway.
- repeatMasker.gtf: Repeat annotation based on similarity (not included in the gene statistics).
- braker.gtf: Main gene annotation file produced with the BRAKER pipeline.
- exonerate_complete.gtf: All hits from exonerate alignments (may include suboptimal hits because it stores the best hit per transcript query, not the best hit per target location).
- exonerate_filtered.gtf: Filtered results from exonerate with the best hits per target location.
Additional file 3
File S3. Bioinformatics protocols for quality control of raw sequence reads, subsequent genome assembly, and the methodology and main results for BUSCO v4.0.6 analyses.
Additional file 4
File S4. Expected C-value variation in echinoderms. The data in this table are referenced in our manuscript and are based mainly on information available from the “Animal genome size database” [52].
Additional file 5
File S5. Protocols for Feulgen image analysis densitometry and sequence-based genome size estimation.
Additional file 6
File S6. Expected variation of haploid genome sizes in echinoderms.
Additional file 7
File S7. Protocol for de novo DNA repeat assembly from shotgun sequence reads.
Additional file 8
File S8. Protocol for DNA repeat identification.
Additional file 9
File S9. Template scripts for gene prediction and annotation.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Mashanov, V., Machado, D.J., Reid, R. et al. Twinkle twinkle brittle star: the draft genome of Ophioderma brevispinum (Echinodermata: Ophiuroidea) as a resource for regeneration research. BMC Genomics 23, 574 (2022). https://doi.org/10.1186/s12864-022-08750-y
DOI: https://doi.org/10.1186/s12864-022-08750-y