  • Research article
  • Open access

Twinkle twinkle brittle star: the draft genome of Ophioderma brevispinum (Echinodermata: Ophiuroidea) as a resource for regeneration research

Abstract

Background

Echinoderms are established models in experimental and developmental biology; however, genomic resources are still lacking for many species. Here, we present the draft genome of Ophioderma brevispinum, an emerging model organism in the field of regenerative biology. This new genomic resource provides a reference for experimental studies of regenerative mechanisms.

Results

We report a de novo nuclear genome assembly for the brittle star O. brevispinum and an annotation facilitated by the transcriptome assembly. The final assembly is 2.68 Gb in length and contains 146,703 predicted protein-coding gene models. We also report a mitochondrial genome for this species, which is 15,831 bp in length and contains 13 protein-coding genes, 22 tRNA genes, and 2 rRNA genes. In addition, 29 genes of the Notch signaling pathway are identified to illustrate the practical utility of the assembly for studies of regeneration.

Conclusions

The sequenced and annotated genome of O. brevispinum presented here provides the first such resource for an ophiuroid model species. Considering the remarkable regenerative capacity of this species, this genome will be an essential resource in future research efforts on molecular mechanisms regulating regeneration.

Background

Echinoderms are a phylum of marine invertebrates, which together with hemichordates constitute the group Ambulacraria. In turn, the Ambulacraria form a sister clade to the phylum Chordata within the monophyletic superphylum Deuterostomia. Echinoderms have attracted much attention from scholars from various disciplines of biology (e.g., ecology, evolution, and developmental biology [1]). One of the fascinating aspects of the biology of many echinoderm species is the ability to regenerate all the tissues of large body parts and organ systems completely [2]. Thus far, attempts to understand the molecular underpinnings of echinoderm regeneration have been largely driven by transcriptome and gene expression studies [3–7]. Only recently have functional perturbation studies started to emerge [3, 7–9], but the progress has been relatively slow, in part due to the scarcity of genomic resources. Although the gene expression approaches are valuable in identifying the sets of genes that are involved in the regeneration program, gene expression studies do not by themselves lead to a comprehensive mechanistic understanding of regeneration. A mechanistic understanding of regeneration will be achieved through the reconstruction of the cause-and-effect relationships between regulatory and effector genes in functional genomics studies. The availability of genomic data is one of the essential prerequisites for these studies and will allow for the identification of cis-regulatory modules, reconstruction of chromatin accessibility maps, and the design of genome editing experiments to probe for the function of candidate genes.

The efforts in echinoderm genomics were pioneered by the sequencing of the genome of the purple sea urchin, Strongylocentrotus purpuratus (Stimpson, 1857) [10]. Since then, the availability and affordability of the new generation high-throughput sequencing technologies have expanded genome sequencing and annotation projects to other species representing all five classes within the phylum. Table 1 lists the 21 echinoderm genomes that are currently available in public databases. These genomes, sequenced and annotated to a varying degree of completeness, have been submitted to the National Center for Biotechnology Information (NCBI) or other databases (e.g., [11–17]). These sequencing efforts, however, have rarely been driven by regeneration research. Only seven of the currently sequenced species have ever been studied in terms of their regenerative capacities (i.e., there is at least one published study on the topic). Only three of the sequenced species, the sea cucumbers Holothuria glaberrima and Apostichopus japonicus and the sea star Patiria miniata, are established model organisms in echinoderm regenerative biology (i.e., there has been continuous effort to characterize cellular and molecular events underlying regeneration). Two brittle star genomes are available at present, Ophionereis fasciata and Ophiothrix spiculata, but neither of those species has been used in studies of regenerative biology.

Table 1 Genome assembly statistics for echinoderm genomes available at the NCBI’s Assembly database (https://www.ncbi.nlm.nih.gov/assembly) on Jun. 17, 2021. Ditto marks (") indicate values identical to the cell above. Asterisks (*) indicate GenBank assembly accessions that have corresponding RefSeq assemblies

The primary aim of this contribution is to provide genomic data and tools for a highly regenerative brittle star (ophiuroid echinoderm) Ophioderma brevispinum [18], an emerging model organism in the field of regenerative biology [3] (Fig. 1). This species is capable of autotomizing and quickly regenerating its arms, the segmented body appendages. Arm regeneration is a classic example of “epimorphic” regeneration, which involves extensive unidirectional terminal growth through rapid generation of new tissues distal to the plane of the injury [3] (Fig. 1). As arms are often exposed to predators in natural habitats, brittle stars are known for their ability to sustain arm loss followed by remarkably high rates of arm regeneration. It has been estimated that regenerated tissues may account for as much as half of the total body weight in a brittle star individual [19]. It has also been shown that regeneration is temperature-dependent in at least some Ophioderma species [20].

Fig. 1

Ophioderma brevispinum, an emerging model organism in echinoderm regenerative biology. A An uninjured adult individual of O. brevispinum. B–H Regenerating arm at different time points post-injury. The regenerating distal end of the arm is to the left.

The rationale for choosing this particular brittle star species for our research is as practical as it is fundamental. O. brevispinum is common in the Western Atlantic in shallow water and can be easily collected in numbers sufficient for molecular biology studies. During experiments, these animals are easily maintained in indoor aquaria and as research stock with minimal maintenance for extended periods (e.g., months). In addition, live individuals of O. brevispinum are available to all interested researchers, as they can be ordered from several commercial suppliers at a moderate price.

Our previous work on O. brevispinum demonstrated a critical role of the Notch signaling pathway (Fig. 2) in ophiuroid arm regeneration [3]. This pathway is a key node of a complex hypernetwork of interconnected signaling pathways that mediate cell-cell interactions in various developmental contexts [21]. Thus, a specific goal of this paper is to describe the genomic composition of the key components of the Notch pathway [21]. A report of these genes in O. brevispinum is provided to serve as a fundamental toolkit for future studies on regeneration. This work will facilitate further research to unravel the functions of signaling pathways in regeneration and identify their target genes through functional genomics approaches.

Fig. 2

Simplified diagram of the Notch signaling pathway. The pathway is mediated by juxtacrine signaling that requires direct physical contact between the signaling and receptor cells. The Delta/Serrate (Jagged) ligands and Notch receptors are transmembrane proteins embedded into the plasma membrane of the signaling and receptor cells, respectively. Ligand-receptor interaction triggers conformational changes in the Notch protein that allow for proteolytic cleavage of the receptor by the ADAM metalloprotease and the multiprotein γ-secretase complex. The latter includes the catalytic component presenilin, as well as regulatory/stabilizing subunits nicastrin, Aph-1, and Pen (presenilin enhancer)-2. This proteolytic cleavage releases the Notch intracellular domain, which translocates into the nucleus and activates the transcription factor RBP-J by inducing the release of co-repressors (e.g., NCOR, CIR, MINT, and HDAC) and recruitment of co-activators, such as Mastermind (MAM), p300, and NACK. The activated transcription factor complex initiates transcription of the direct targets of the pathway, including Hes and Hey. Even though the pathway itself is conceptually simple, it is subjected to a multitude of regulatory inputs at multiple levels, including receptor post-translational maturation and stability/availability of the key pathway components in both the signaling and receptor cells. One of the properties of the Notch pathway is the ability to sustain itself through a series of feed-forward loops, thus resulting in an all-or-nothing response. For example, NACK, which is a transcriptional co-activator in the pathway, is itself positively regulated by Notch. The genes shown in the diagram were searched for and identified in the O. brevispinum draft genome (see Table 5). Three different searches for Notch-related genes in the draft genome were performed, designated by green boxes (BLAST, exonerate, and conserved domain search, respectively); filled boxes indicate positive identification.

Here, we report a de novo genome assembly and annotation of Ophioderma brevispinum. This is the third ophiuroid for which genome sequencing has been applied following low coverage sequencing of Ophiothrix spiculata [14] (NCBI’s BioProject accession number PRJNA182997) and Ophionereis fasciata [17]. Our Ophioderma brevispinum genome and transcriptome [3] allow us to describe regulatory gene families to further explore the molecular bases of echinoderm regeneration. This use case is an example of one of the several applications for these genomic data that researchers will find.

Results

Sequencing

Three different genomic DNA library preparation and sequencing strategies were employed. First, PCR-free library preparation (“short” libraries) followed by sequencing on an Illumina HiSeq 2500 machine yielded 2 × 163,310,307 paired-end reads of 250 nt with an overall GC content of 37% (NCBI’s SRA accession number SRP238266). Second, mate-paired libraries (“long” libraries) with approximately 3 Kbp insert size were sequenced as above and resulted in 2 × 103,843,354 paired-end reads of 250 nt with an overall GC content of 41%. Third, PacBio long-read sequencing generated approximately 23 million reads with a total yield of 159 billion bp (51.3× coverage) (Fig. 3). A summary of sequence read statistics is provided in Additional file 1: File S1.

Fig. 3

Schematic workflow of the O. brevispinum de novo DNA and RNA assembly. The main software tools used at each step of the workflow are shown in parentheses. Grey boxes indicate four main components of the workflow. DNA: library preparation, quality control, high-throughput sequencing, and de novo assembly of gDNA. RNA: transcriptome assembly described in a previous publication. Genome size estimation: two different strategies used to estimate the haploid genome size. Repeat Masking: identification and categorization of repetitive DNA in the gDNA assembly. We used FastQC at different workflow steps to track the effect of quality control procedures on the sequence reads (see dashed arrowhead lines).

In addition, in order to facilitate the annotation efforts, we took advantage of the earlier RNA-Seq study [3] that generated 17,318,775 MiSeq and 832,245,006 HiSeq quality filtered and adapter trimmed reads used in a de novo transcriptome assembly.

Nuclear DNA assembly statistics

The draft assembly of the O. brevispinum genome comprises 88,538 scaffolds with a total length of 2,684,874,461 bp (approximately 2.68 Gb) (Table 2). This value falls between the two independent estimates of the haploid genome size: 2.89 Gb from a densitometry assay and 2.11 Gb from paired-end sequence data using a k-mer statistical approach. Scaffolds range from 2,035 bp to 612,917 bp, with a mean scaffold size of 30,325 bp. The scaffold N50 (length) and L50 (number of scaffolds) are 48,505 bp and 15,677, respectively. The scaffold nucleotide content is 30.77%, 19.22%, 19.18%, and 30.83% for A, C, G, and T, respectively.
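For readers less familiar with these contiguity metrics, the following is a minimal sketch of how N50 and L50 are conventionally derived from a list of scaffold lengths. It is not the code used in this study (the Methods name assembly-stats and assemblathon_stats.pl for that purpose); the function name n50_l50 and the toy lengths are hypothetical. The last line only re-derives the mean scaffold size from the two totals reported above.

```python
from typing import List, Tuple


def n50_l50(lengths: List[int]) -> Tuple[int, int]:
    """Return (N50, L50) for a list of scaffold lengths.

    N50 is the length of the shortest scaffold in the smallest set of
    longest scaffolds that together cover at least half of the assembly;
    L50 is the number of scaffolds in that set.
    """
    if not lengths:
        raise ValueError("no scaffold lengths given")
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, count
    raise AssertionError("unreachable: the full set always covers half")


# Toy scaffold lengths (hypothetical, not taken from the assembly).
toy = [100, 80, 60, 40, 20]
print(n50_l50(toy))  # (80, 2): 100 + 80 = 180 >= 150

# Mean scaffold size from the totals reported in the text.
print(round(2_684_874_461 / 88_538))  # 30325
```

Because N50 is a length and L50 is a count, the two values are not directly comparable; a longer N50 together with a smaller L50 indicates a more contiguous assembly.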

Table 2 Summary metrics of the O. brevispinum genome assembly

An independent de novo assembly of repetitive DNA elements with REPdenovo resulted in 92,505 individual sequences with a total length of 134,023,983 bp. The average length and N50 of the repetitive DNA segments assembled this way were 1,448.83 bp and 2,838 bp (14,228 sequences), respectively. The sequences assembled with REPdenovo were used to aid in our draft genome assembly’s repeat identification and masking. A total of 1,410,344,530 bp (52.53%) of the draft assembly were classified as repetitive DNA and masked (see the summary of results in Table 3). Most DNA repeats (42.91% of the repetitive DNA sequence length) were classified as interspersed elements. However, a significant number of repeats (49.91% of the repetitive DNA) were marked as unclassified. Among the classified transposable elements (TEs), long interspersed nuclear elements (LINEs) were the most common (5.27% of the repetitive DNA). Short interspersed nuclear elements (SINEs), repetitive DNA elements, and long terminal repeats (LTRs) amounted to 0.11%, 0.73%, and 1.12% of the total sequence length, respectively.

Table 3 Summary of results from RepeatMasker v4.0.8, run with rmblastn v2.6.0+. This table corresponds to the classification of 1,410,344,530 bp (GC content of 38.40%) of repetitive DNA in the draft genome assembly of Ophioderma brevispinum. (*) Most repeats fragmented by insertions or deletions have been counted as one element

In the GTF files from BRAKER and exonerate (Additional file 2: File S2), we found 361,060 exons in 146,703 genes, each corresponding to a different transcript, in a total of 53,436 nuclear DNA scaffolds. According to position information, at least 3,394 of those 146,703 genes could represent gene isoforms. These putative isoforms are found in 1,311 scaffolds. The number of predicted protein-coding genes in Ophioderma brevispinum is therefore likely an overestimate, which may be partially explained by the degree of fragmentation of this draft genome and by limitations in resolving gene isoforms. These caveats should be addressed in future efforts to increase the completeness of this genomic resource.
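As an illustration of how counts of this kind can be tallied, the sketch below counts exon records, distinct genes, and scaffolds carrying at least one exon in GTF-formatted text. It is not the procedure used in this study. It assumes that each exon line carries a gene_id "..." attribute in the ninth column, as in standard GTF; the function name and the toy records are hypothetical.

```python
import io
import re
from typing import Iterable, Set, Tuple

# Assumed attribute layout: gene_id "g1"; transcript_id "g1.t1";
GENE_ID = re.compile(r'gene_id "([^"]+)"')


def count_genes_exons(lines: Iterable[str]) -> Tuple[int, int, int]:
    """Return (distinct genes, exon records, scaffolds with exons)."""
    genes: Set[str] = set()
    scaffolds: Set[str] = set()
    exons = 0
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "exon":
            continue  # only exon features are counted
        match = GENE_ID.search(fields[8])
        if match is None:
            continue  # exon without a gene_id attribute
        exons += 1
        genes.add(match.group(1))
        scaffolds.add(fields[0])
    return len(genes), exons, len(scaffolds)


# Toy records (hypothetical): two genes, three exons, two scaffolds.
toy_records = [
    ("scaffold_1", "toy", "exon", "10", "90", ".", "+", ".", 'gene_id "g1"; transcript_id "g1.t1";'),
    ("scaffold_1", "toy", "exon", "150", "300", ".", "+", ".", 'gene_id "g1"; transcript_id "g1.t1";'),
    ("scaffold_2", "toy", "CDS", "5", "60", ".", "-", "0", 'gene_id "g2"; transcript_id "g2.t1";'),
    ("scaffold_2", "toy", "exon", "5", "60", ".", "-", ".", 'gene_id "g2"; transcript_id "g2.t1";'),
]
toy_gtf = io.StringIO("\n".join("\t".join(record) for record in toy_records))
print(count_genes_exons(toy_gtf))  # (2, 3, 2)
```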

The completeness of the draft genome assembly at the scaffold level was also evaluated with BUSCO [22, 23], a commonly used tool to assess the representation of marker genes in newly generated genomic and transcriptomic datasets. The results are summarized in Table 4. BUSCO analysis of this brittle star transcriptome has been previously reported elsewhere (see [3]).

Table 4 Summary of BUSCO v4.0.6 results for the scaffolds of the draft genome assembly. The database column names each odb10 BUSCO database used. The species column indicates the Augustus training parameter. Ditto marks (”) indicate values identical to the cell above. The names “Human”, “Fly” and “Spur” correspond to Homo sapiens, Drosophila melanogaster, and Strongylocentrotus purpuratus, respectively

Notch signaling pathway

To demonstrate the utility of the genome, we performed a case study in which we assessed the genomic representation of the main components of the Notch signaling pathway. This pathway is highly conserved across all multicellular animals and is known to coordinate a multitude of diverse cellular events, including proliferation, differentiation, cell fate specification, and cell death [21, 24–29]. In the context of echinoderm regeneration, we have recently demonstrated that the proper function of the Notch pathway is crucial for arm regeneration in O. brevispinum [3]. Here, we searched the draft genome for 29 genes involved in the pathway (Fig. 2, Table 5).

Table 5 Select components of the Notch signaling pathway identified in the draft genome of O. brevispinum using reference sequences from UniProt and Echinobase. For each gene, we list its name, the known function in the pathway, and whether or not the gene was recovered from the draft genome with independent BLAST and exonerate alignments. In addition, we also indicate if we could identify conserved protein domains in the predicted protein sequences

All genes were retrieved by BLAST [30] search. For all selected genes except Mesp2, Presenilin 1, and NACK, we also recovered the same putative coding regions from exonerate alignments. Furthermore, in 18 genes, we identified the expected conserved protein domains. Taken together, the newly assembled draft genome of O. brevispinum allowed us to retrieve the sequences of the Notch pathway components that will be subsequently used to design functional genomic studies to further probe into the mechanistic role of the pathway in brittle star regeneration.

Mitogenome assembly

The O. brevispinum mitochondrial genome (mitogenome) is 15,831 bp long and has a GC content of 32.4% (Fig. 4). These values are similar to those of the previously published [31] reference mitogenome of another brittle star species Ophiarachnella gorgonia (NCBI accession number NC_046053), which has a length of 15,948 bp and a GC content of 36.7%. Likewise, mitochondrial genome features of O. brevispinum showed the same gene order reported for O. gorgonia, and their putative control regions are of similar length (488 and 474 bp, respectively).
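GC content, as quoted above, is simply the share of G and C among the called bases of a sequence. The following is a minimal sketch of that calculation; the function name and the toy sequence are hypothetical, and excluding ambiguous bases such as N from the denominator is an assumption of this sketch rather than a documented choice of the study.

```python
def gc_content(sequence: str) -> float:
    """Percent of G and C among unambiguous bases (A, C, G, T)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence has no unambiguous bases")
    return 100.0 * gc / (gc + at)


# Toy sequence (hypothetical): 1 G + 1 C out of 8 unambiguous bases.
print(gc_content("ATGCATNNAT"))  # 25.0
```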

Fig. 4

The complete mitochondrial genome of Ophioderma brevispinum. Arrows indicate the main genomic features and their orientation. The blue lines indicate variation in GC content. The green lines indicate variation in AT content

There are also differences between these two brittle star mitogenomes that are worth noting. For example, the size difference between the mitogenomes of O. brevispinum and O. gorgonia is mostly due to deletions in non-coding intergenic regions. However, deletions in tRNA, rRNA, and protein-coding genes are also observed. Furthermore, unlike in O. gorgonia, the ND4 coding sequence in O. brevispinum is complete and does not rely on the addition of 3’ adenine residues to the mRNA.

Discussion

Here, we present a draft genome assembly for the highly regenerative brittle star species O. brevispinum. Due to its availability, ease of maintenance, and amenability to experimental manipulations, O. brevispinum has become an emerging model organism in echinoderm regenerative biology [3]. We previously performed transcriptome-wide gene expression studies in this species and identified a range of candidate regeneration-associated genes for further experiments. However, without a fully sequenced genome, including non-coding and regulatory regions, it was not previously feasible to delve into the molecular mechanisms of regeneration with functional genomics tools for purposes such as reconstructing gene regulatory networks that underlie regenerative events. This draft genome of O. brevispinum provides the first such resource for ongoing and future molecular studies.

The new genome has immediately allowed analysis of protein-coding genes. To demonstrate the utility of the genome, we aimed to retrieve 29 select components of the Notch signaling pathway, including the ligands, receptors, transcription factors, regulators, and target genes. All 29 genes of interest were identified in the assembly. The identity and predicted function of the proteins can be inferred by the presence of the conserved domains.

One of the limitations of our new draft genome assembly is its fragmented state. The ultimate goal of any genome sequencing and annotation project is to reconstruct continuous chromosome-size sequences with the fully preserved order of the genes and non-coding sequence elements. Like many first-effort sequencing projects, our assembly will require subsequent efforts to reach that level. Even in its current state, however, the assembly provides a valuable resource for ongoing and future studies. This research will not be limited to regenerative biology, but can also benefit other areas, such as evolution of the echinoderm body plan, animal phylogeny, and history of gene families, to name a few.

Conclusion

Here we presented the first draft nuclear genome and a complete mitochondrial genome of the brittle star Ophioderma brevispinum (Say, 1825) (Echinodermata: Ophiuroidea: Ophiacanthida: Ophiodermatidae), a rising model for regenerative studies (e.g., [3, 32–34]). The mitochondrial genome of this brittle star has 15,831 bp (with a mean depth of 1,658.7 and GC content of 32.4%) with 13 protein-coding genes, 22 tRNAs, and 2 rRNAs. The draft nuclear DNA assembly has 88,538 scaffolds summing up to 2.7 Gbp, corresponding to 93% of the expected haploid genome size independently determined by a densitometry assay. Despite the high degree of fragmentation of the assembly, which is partially caused by a high frequency of repetitive DNA elements (52.5% of the assembly), we demonstrated the usefulness of these data for biological investigation by identifying 29 key genes of the Notch signaling pathway, which is essential to tissue regeneration (e.g., [3, 26, 29, 35, 36]). We predict that the resources we are making available in this publication will be fundamental towards assembling the entire genome of O. brevispinum at the chromosomal level and establishing this brittle star species as a model for studies of regeneration and other fields.

Methods

Supporting genomic resources

Comparative genomic analysis relied on our original data and other genomic resources that are publicly available, including the complete genome of the purple sea urchin Strongylocentrotus purpuratus [10, 11] and other genomes available during the preparation of the manuscript (Table 1). These genomes represent 19 species and 17 genera of the classes Asteroidea (orders Forcipulatida and Valvatida), Crinoidea (order Comatulida), Echinoidea (orders Camarodonta and Cidaroida), Holothuroidea (orders Holothuriida and Synallactida), and Ophiuroidea (order Amphilepidida).

Additional interrogation and exploration of echinoderm genomic data leveraged resources made available through Echinobase (www.echinobase.org) [37]. Recent reviews of genomic resources for the study of echinoderm development and evolution are available elsewhere [1, 38].

Computational resources

All analysis steps were performed using computer clusters (Red Hat Enterprise Linux 7.5 with 64 CPUs and 512 GB to 1.5 TB of memory) as well as high-memory machines (Red Hat Enterprise Linux 7.5 with 16 CPUs and 512 GB to 4 TB of memory) at the University of North Carolina at Charlotte.

Animal collection

Adult individuals of the brittle star O. brevispinum were obtained from the Marine Biological Laboratory (Woods Hole, MA, USA). Specimens (catalog no. 1970) were received on April 13, 2016. Immediately after delivery, the package was opened and left overnight to slowly allow the seawater to warm up to room temperature. The animals were then kept in aquaria with aerated artificial seawater.

RNA-Seq

Complete RNA-Seq analysis (from RNA sampling and isolation through sequencing, de novo transcriptome assembly, and gene expression analysis) was described in [3]. Results correspond to BioProject number PRJNA596798 and SRA accession number SRP238266, and were deposited to NCBI’s Gene Expression Omnibus (GSE142391, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE142391). BUSCO scores for the transcriptome assembly were presented in [3].

DNA isolation and evaluation

A total of 100 mg of tissue was collected from the arms of a single non-regenerating adult individual through the natural autotomy response. The animal remained alive after tissue collection and regenerated its arms. The collected tissue samples were washed in filter-sterilized (0.2 μm) seawater, cut into small pieces with a sterile blade, and put into the lysis buffer. The high molecular weight nuclear genome DNA was then extracted using the Qiagen MagAttract HMW kit according to the manufacturer’s instructions with the following modifications that were found to increase the yield and the molecular weight of the resulting DNA: 1) the vortexing speed with magnetic beads was reduced from 1,400 rpm to 1,200 rpm; and 2) after the first vortexing step with the magnetic beads, the samples were allowed to sit for 5 min at room temperature before being placed into the magnetic rack. The concentration of the extracted DNA was assessed using the Qubit dsDNA Broad Range Kit (ThermoFisher). The total amount of DNA was determined to be 50 μg. The integrity of the genomic DNA sample was verified by agarose (0.6%) gel electrophoresis (at 2 V/cm, 4 hours).

DNA library preparation and sequencing

All DNA sequencing was performed at the David H. Murdock Research Institute (DHMRI; Kannapolis NC, USA). Two different technologies were employed to obtain short and long sequence reads from the high molecular weight genomic DNA (HMW gDNA) sample extracted as described above (Fig. 3).

For sequencing on the Illumina HiSeq 2500 platform, subsamples of ≥8 μg of HMW gDNA were used to produce short sequence reads. Two complementary strategies were used to generate “short” and “long” insert libraries, respectively. First, three PCR-free paired-end read libraries (“short” insert size) with a 450 bp fragment size were constructed using the TruSeq DNA PCR-free library preparation kit (catalog no. FC-121-3001; Illumina, USA). Second, three mate paired-end read libraries (“long” insert size) were generated using the Illumina Nextera Mate Pair Sample Preparation Kit with the insert size of 3 Kbp. The short and long libraries were combined into their respective pools and sequenced in the Rapid Mode to produce 2 × 251 bp reads.

To generate long sequence reads (10 Kbp), we used the Pacific Biosciences Single Molecule Real-Time (SMRT) platform [39]. Based on the estimated genome size of 2.9 Gbp (see “Genome size estimate”, below), we aimed to generate 50× coverage in an effort to improve the assembly by reducing and closing gaps. Over 10 μg of HMW gDNA with an average fragment length of ≥60 Kbp was used to produce four SMRTbell libraries. The libraries were generated using the SMRTbell Template Prep Kit 1.0 following the PacBio “>20 Kbp Template Preparation Using BluePippin Size-Selection System (15-20 Kbp) for Sequel Systems” procedures and checklist (catalog no. 100-286-000-07; Pacific Biosciences, USA). Libraries were sequenced in a combined total of 35 SMRTcells.

DNA assembly and descriptive statistics

We performed several quality preprocessing steps on the raw DNA sequence reads before the assembly. The overall quality of sequence data before and after each step was determined by FastQC. See the supplementary information for additional details on the DNA assembly, including the commands used to execute the computational analyses described below (Additional file 3: File S3).

The Illumina reads from the short insert library were processed with Trimmomatic v0.38 [40] to remove the adapters and low-quality bases at the ends of each read. Trimmomatic also scanned each read with a 4-base sliding window, cutting when the average quality per base dropped below 15. Reads shorter than 36 bp were discarded. The long insert library Illumina reads were processed with NxTrim v0.3.0-alpha [41] using default parameters to separate reads into four different categories according to the adapter position: mate pairs, unknown (which are mostly mate pairs), paired end, and single end sequence reads.
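The sliding-window step can be pictured with the following simplified sketch. It illustrates the idea only and is not Trimmomatic's actual implementation, which handles the boundary of the failing window in more detail. The function names and toy quality values are hypothetical; the defaults mirror the parameters stated above (4-base window, mean quality 15, minimum length 36).

```python
from typing import List


def sliding_window_trim(quals: List[int], window: int = 4, min_avg: int = 15) -> int:
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window whose
    mean Phred quality is below min_avg. Reads shorter than the window are
    kept whole (a simplification of this sketch).
    """
    for start in range(0, len(quals) - window + 1):
        # Compare sums to avoid floating-point division.
        if sum(quals[start:start + window]) < min_avg * window:
            return start
    return len(quals)


def passes_min_length(kept: int, min_len: int = 36) -> bool:
    """Reads shorter than min_len after trimming are discarded."""
    return kept >= min_len


# Toy reads (hypothetical Phred scores).
good_read = [30] * 40 + [10] * 6
bad_read = [30] * 20 + [2] * 10
for quals in (good_read, bad_read):
    kept = sliding_window_trim(quals)
    print(kept, passes_min_length(kept))
# 40 True
# 19 False
```

In the first toy read, the window starting at base 39 averages exactly 15 and therefore passes; the cut happens one position later, where the mean drops to 10.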

All the sequence files produced by Trimmomatic and NxTrim were then evaluated with the HTQC toolkit v0.90.8 [42] to produce quality stats per file (using ht-stat) and perform the final read trimming and filtering (with ht-trim and ht-filter, respectively). Finally, FastUniq v1.1 [43] was used to remove duplicates introduced by PCR amplification from paired short reads.

All cleaned Illumina reads were used as input for the de novo assembly with ABySS v2.11 [44] using k-mers ranging from 23 to 61 (with a step of 2). The individual assemblies generated at each k-mer value were ranked using several metrics, including the number of sequences, total assembly length, L50, and N50 [45]. We then polished the resulting best assembly in Pilon v1.2.3 [46] to improve base calling and detect sequence variation.

The long PacBio sequence reads were assembled with MaSuRCA v3.2.7 [47]. The contigs generated this way were polished using Arrow v2.3.3 [48] and merged with the ABySS assembly using quickmerge v3be7287 [49].

Assembly statistics were calculated using the “assembly-stats” [50] (developed at the Wellcome Sanger Institute) and “assemblathon_stats.pl” [51] (developed at the UC Davis Bioinformatics Core) tools. The completeness of protein-coding gene representation in the scaffolds of the draft genome assembly was assessed with BUSCO v4.0.6 [22] run in the “genome mode” against the evolutionarily conserved metazoan gene set (metazoa_odb10, creation date: 2021-02-17, number of species: 65, number of BUSCOs: 954) and the conserved eukaryota gene set (eukaryota_odb10, creation date: 2020-09-10, number of species: 70, number of BUSCOs: 255). Since O. brevispinum is not listed among the species available for Augustus training, we tested three other species: Homo sapiens, Drosophila melanogaster, and S. purpuratus.

Genome size estimate

The Animal Genome Size database has only one entry for species of Ophioderma, O. panamensis [52]. That entry indicates that the expected C-value for O. panamensis is 3.3 pg (3.23 Gbp) based on bulk fluorometric assay [53]. The expected C-value variation in echinoderms is provided in Additional file 4: File S4. The genome size of O. brevispinum has never been determined before. We therefore estimated it using two complementary approaches: a Feulgen densitometry (FD) assay [54–56] and a calculation from the sequence data.
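The correspondence between a C-value in picograms and a genome size in base pairs follows from the widely used conversion of 1 pg of DNA to approximately 0.978 × 10^9 bp. The sketch below applies that factor. The conversion constant is a standard value supplied here rather than one stated in this article, and the function names are hypothetical.

```python
# Assumption: 1 pg of double-stranded DNA corresponds to about 0.978e9 bp.
BP_PER_PG = 0.978e9


def pg_to_gbp(c_value_pg: float) -> float:
    """Convert a haploid C-value in picograms to gigabase pairs."""
    return c_value_pg * BP_PER_PG / 1e9


def gbp_to_pg(size_gbp: float) -> float:
    """Convert a genome size in gigabase pairs to picograms."""
    return size_gbp * 1e9 / BP_PER_PG


# The O. panamensis entry: 3.3 pg corresponds to about 3.23 Gbp.
print(round(pg_to_gbp(3.3), 2))  # 3.23
```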

For the FD assay, soft uncalcified tissues (stomach wall and podia) from a single individual were finely minced with a razor blade, fixed in methanol:acetic acid (3:1) for 10 min and squashed in a drop of 45% acetic acid on a gelatin-coated slide. The samples were then air-dried and post-fixed in methanol:formalin:acetic acid (85:10:15) for 24 hours. After rinsing in tap and distilled water, the samples were hydrolyzed in 5N HCl and stained for 2 hours in Schiff reagent. After brief washes in a 0.5% sodium metabisulfite solution and then in water, the slides were dehydrated in an ethanol series, air dried and mounted in the immersion oil. Microscopic images were then taken at a consistent light intensity in the green monochromatic channel. The optical density of the stained nuclei was measured in the Fiji/ImageJ software [57]. To convert the optical density relative units to the absolute values of DNA mass per nucleus, the following control samples with known DNA content were processed and quantified along with the O. brevispinum specimens: chicken erythrocyte nuclei, trout erythrocyte nuclei, triploid trout nuclei, and human (male) cheek epithelial cell nuclei.

In addition to the FD assay, we also estimated the haploid genome size of O. brevispinum from the paired-end sequence data in Jellyfish v2.2.4 [58] using a k-mer-based statistical approach. The histograms produced by Jellyfish were used to estimate the haploid genome size in GenomeScope [59].
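The intuition behind k-mer-based size estimation can be sketched as follows: count how often each k-mer occurs in the reads, build a histogram of those depths, and divide the total number of k-mer occurrences by the depth at the main peak. This is a deliberately simplified estimator, not the model fitted by GenomeScope, and unlike Jellyfish it does not collapse reverse-complement k-mers. The function names, the min_depth cutoff for discarding likely error k-mers, and the toy histogram are all hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def kmer_histogram(reads: Iterable[str], k: int) -> Dict[int, int]:
    """Map depth -> number of distinct k-mers observed at that depth."""
    counts: Counter = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return dict(Counter(counts.values()))


def estimate_genome_size(hist: Dict[int, int], min_depth: int = 2) -> float:
    """Total k-mer occurrences divided by the modal depth.

    Depths below min_depth are ignored as probable sequencing errors.
    """
    usable = {depth: n for depth, n in hist.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    peak_depth = max(usable, key=lambda depth: usable[depth])
    total_occurrences = sum(depth * n for depth, n in usable.items())
    return total_occurrences / peak_depth


# Toy histogram (hypothetical): error k-mers at depth 1, main peak at depth 10.
toy_hist = {1: 500, 9: 10, 10: 80, 11: 10}
print(estimate_genome_size(toy_hist))  # 100.0

print(kmer_histogram(["AAAA"], 2))  # {3: 1}: the 2-mer "AA" occurs three times
```

Heterozygosity and repeats add further peaks to real histograms, which is why a model-based tool is used in practice instead of a single-peak division.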

Detailed protocols for genome size estimation based on FD and on k-mer-based statistics are provided in Additional file 5: File S5. The estimated genome size was evaluated considering the variation in haploid genome size among echinoderms (see Additional file 6: File S6).

Assembly and classification of repetitive DNA elements

Prior to this study, reported haploid genome sizes in echinoderms varied more than 8-fold, from 0.53 Gbp in the sea star Dermasterias imbricata to 4.3 Gbp in the sea cucumber Thyonella gemmata [52]. The largest haploid genome reported in the subphylum Asterozoa belongs to Ophioderma panamensis (order Ophiurida), at 3.3 Gbp [52].

In addition to the whole-genome assembly described above, we also performed a stand-alone de novo assembly of repetitive DNA elements in the genome of O. brevispinum with REPdenovo v0.0 [60] following the protocol described in the supplementary information (Additional file 7: File S7).

In short, we used REPdenovo to assemble repeats directly from the cleaned paired-end and single-end short sequence reads that resulted from the quality control steps described above, using different k-mer sizes ranging from 25 to 50 with a step of 2.

The contigs assembled with REPdenovo were used as input to RepeatModeler v1.0.11 [61] to build a library of repetitive elements in the genome of O. brevispinum. The resulting brittle star repeat library was then combined with the repeat libraries from the 2018 version of Repbase [62–65] and from RepeatMasker v4.0.8 [66]. Only unique entries were kept to generate a final custom repeat library. This custom repeat library was then used to screen the draft genome of O. brevispinum with RepeatMasker to identify interspersed repeats and low complexity DNA sequences. Finally, the RepeatMasker output was manually curated and written into a General Feature Format version 3 (GFF3) file [67]. The details of the repeat library preparation and repeat masking are provided in Additional file 8: File S8.
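Combining several repeat libraries while keeping only unique entries can be sketched as below. The text does not say how uniqueness was defined; this sketch assumes that two entries are duplicates when their sequences are identical (ignoring case), and the function names are illustrative.

```python
def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def merge_unique(*libraries):
    """Concatenate libraries, keeping the first record for each distinct sequence."""
    seen = set()
    merged = []
    for library in libraries:
        for header, sequence in library:
            key = sequence.upper()  # assumption: case-insensitive sequence identity
            if key in seen:
                continue
            seen.add(key)
            merged.append((header, sequence))
    return merged
```

Because the first occurrence wins, the order in which libraries are passed decides which header survives for a duplicated sequence.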

Gene prediction and annotation

The gene prediction and annotation workflow is summarised in Fig. 5. We also provide template scripts listing the parameters used to execute each program listed below in Additional file 9: File S9.

Fig. 5

Schematic workflow of the procedures used for gene prediction and annotation of the O. brevispinum draft genome. Main steps (indicated by the grey boxes) are named according to the leading software used at each stage (BRAKER; BLAST; and exonerate, GMAP, and BLAT)

Full gene structure annotations were generated with BRAKER v2.1.2 [68, 69], which integrates GeneMark-ET/EP+ v4.38 [70] and AUGUSTUS v.3.3.2 [71, 72] and allows for fully automated training from RNA-Seq or protein homology information. We also conducted an independent run with AUGUSTUS on selected scaffolds.

The BRAKER annotation pipeline used the genome of the purple sea urchin S. purpuratus (assembly Spur_5.0) as a reference [10, 73] and also the de novo assembled transcriptome of O. brevispinum [3].

The predicted gene models were aligned with BLAST v.2.9.0+ [30] against the following databases (each downloaded on October 25, 2019): the UniProt Archive (UniParc; https://www.uniprot.org/help/uniparc), the NCBI’s non-redundant nucleotide database (“nt”; https://ftp.ncbi.nlm.nih.gov/), and the complete EchinoDB database of protein coding genes (https://echinodb.uncc.edu/).

In addition, the genomic scaffolds were also aligned to the transcriptome of O. brevispinum and the cDNA sequences from S. purpuratus (assembly Spur_5.0) using exonerate v2.4.0ls [74], GMAP v2021.03.08 [75], and BLAT v36x2 [76].

The programs listed above generated different annotation tables. These tables were formatted as GTF files that are listed in the supplementary information. The main GTF files are provided in Additional file 2: File S2.
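For readers unfamiliar with the format, a GTF record consists of nine tab-separated fields (sequence name, source, feature, start, end, score, strand, frame, and attributes). A minimal sketch of a helper that formats one record follows; the function name and all example values are hypothetical and do not come from the supplementary files.

```python
def gtf_line(seqname, source, feature, start, end, strand,
             gene_id, transcript_id, score=".", frame="."):
    """Format one GTF record (coordinates are 1-based and inclusive)."""
    if start < 1 or end < start:
        raise ValueError("GTF coordinates must satisfy 1 <= start <= end")
    if strand not in {"+", "-", "."}:
        raise ValueError("strand must be '+', '-', or '.'")
    attributes = f'gene_id "{gene_id}"; transcript_id "{transcript_id}";'
    fields = [seqname, source, feature, str(start), str(end),
              score, strand, frame, attributes]
    return "\t".join(fields)


if __name__ == "__main__":
    # Hypothetical exon record, for illustration only.
    print(gtf_line("scaffold_1", "BRAKER", "exon", 101, 250, "+", "g1", "g1.t1"))
```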

Annotation of genes associated with the Notch signaling pathway

As a case study to demonstrate the practical utility of our draft genome assembly, we annotated selected core components and modifiers of the Notch signaling pathway (Fig. 2) using reference amino acid sequences from the UniProt and Echinobase (www.echinobase.org) [37] databases. These query sequences were aligned to the target exons from the BRAKER annotation with TBLASTN, which searches a translated nucleotide database (here, built from the scaffolds) with a protein query. Hits were retained using E-value, bit score, and percentage identity cutoff thresholds of 1.0E-5, 30.0, and 23%, respectively. In parallel, we aligned the amino acid query sequences to all assembled scaffolds using exonerate to test whether its exon predictions matched the BLAST results.
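The three cutoffs above can be applied to BLAST results with a short filter such as the following sketch. It assumes the hits were written in the standard 12-column tabular format (`-outfmt 6`), which the text does not specify, and it treats the thresholds as inclusive; both are assumptions.

```python
import csv
import io

# Default column order of BLAST tabular output (-outfmt 6).
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def filter_hits(tabular_text, max_evalue=1e-5, min_bitscore=30.0, min_pident=23.0):
    """Return hits (as dicts) that pass the E-value, bit score, and identity cutoffs."""
    passing = []
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    for row in reader:
        if len(row) < len(FIELDS) or row[0].startswith("#"):
            continue  # skip blank, truncated, or comment lines
        hit = dict(zip(FIELDS, row))
        if (float(hit["evalue"]) <= max_evalue
                and float(hit["bitscore"]) >= min_bitscore
                and float(hit["pident"]) >= min_pident):
            passing.append(hit)
    return passing
```

A hit has to satisfy all three criteria at once, so a long, low-identity alignment with a good E-value is still discarded if its identity falls below 23%.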

We used NCBI’s Conserved Domain Search (www.ncbi.nlm.nih.gov/Structure/cdd) to identify conserved protein domains in the brittle star Notch pathway genes returned by BLAST and/or exonerate. The sequences were searched against the CDD v3.19 database, with an E-value threshold of 0.01 and composition-based statistics adjustment. We stored the best 500 hits for each gene sequence and then manually inspected the output for the presence or absence of diagnostic domains.

The complete list of genes related to the Notch signaling pathway that we searched for is provided in the Results section. As with the annotation tables for the genomic scaffolds, we formatted the results of the annotation of Notch-related genes as GTF files, which we provide as supplementary information (Additional file 2: File S2).

Mitochondrial genome

The mitochondrial genome (mitogenome) of O. brevispinum was contained in a single scaffold generated during the whole-genome assembly. It was identified via sequence alignments using BLAST v.2.9.0+ [30] and a reference sequence from an ophiuroid of the same family (NCBI’s accession number NC_046053.1), Ophiarachnella gorgonia (Müller & Troschel, 1842) (Echinodermata: Ophiuroidea: Ophiacanthida: Ophiodermatidae).

The putative circular sequence was extracted from the selected scaffold using AWA (available from https://gitlab.com/MachadoDJ/awa; accessed on July 22, 2021) [77]. Next, we remapped the filtered short paired-end reads back to AWA’s putative mitogenome using Bowtie2 to review base calling. Finally, we used the MITOS WebServer (version 2; available from http://mitos2.bioinf.uni-leipzig.de/index.py) [78] to predict genes and ran an independent analysis with tRNAscan-SE 2.0 [79, 80] to confirm the annotation of tRNAs.
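One generic way to recognise a circular molecule in a linear scaffold is to look for a terminal overlap, that is, a stretch at the end of the scaffold that repeats its beginning. The sketch below illustrates only this general idea; it is not a description of AWA's procedure [77], and the `min_len` threshold is a hypothetical parameter.

```python
def terminal_overlap(sequence, min_len=20):
    """Length of the longest prefix that is also a suffix (0 if shorter than min_len)."""
    sequence = sequence.upper()
    longest_allowed = len(sequence) // 2
    for size in range(longest_allowed, min_len - 1, -1):
        if sequence[:size] == sequence[-size:]:
            return size
    return 0


def trim_to_circle(sequence, min_len=20):
    """Remove the repeated terminal copy so that each base appears once."""
    size = terminal_overlap(sequence, min_len)
    if size == 0:
        return sequence  # no evidence of circularity under this criterion
    return sequence[:len(sequence) - size]
```

An exact-match test like this is sensitive to sequencing errors in the overlapping ends, which is one reason the remapping step described above is useful for reviewing base calls.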

Availability of data and materials

The data sets supporting the conclusions of this article are included within the article and its additional files. Supplementary information accompanying this paper is available at Zenodo, DOI: 10.5281/zenodo.6618000. Data corresponding to our draft genome assembly of O. brevispinum can be found in NCBI’s databases under BioProject number PRJNA779014, BioSample number SAMN23008116, and GenBank accession number JAMKCH000000000.1. The mitochondrial genome sequence and annotations have been submitted to NCBI under the same BioProject number (PRJNA779014).

Abbreviations

BAM:

binary alignment map

BLAST:

Basic Local Alignment Search Tool

bp:

base pair

dsDNA:

Double stranded DNA

EBI:

European Bioinformatics Institute

EMBL:

European Molecular Biology Laboratory

EMBOSS:

European Molecular Biology Open Software Suite

ENCODE:

Encyclopedia of DNA Elements

FASTA:

it is pronounced “fast A” and stands for “Fast-All”

FASTP:

it is pronounced “fast P” and stands for “Fast-Protein”

FD:

Feulgen densitometry

FDR:

false discovery rate

FPKM:

fragments per kilobase of exon per million fragments mapped

GB:

gigabyte (approx. 1024 MB)

Gb:

same as Gbp

Gbp:

giga base pairs (1,000,000,000 bp)

gDNA:

genomic DNA

GFF3:

general feature format or gene-finding format, version 3

GO:

gene ontology

HTS:

high-throughput sequencing

HMW:

high molecular weight

InDel:

insertion or deletion

Kb:

same as Kbp

Kbp:

kilo base pairs (1,000 bp)

L50:

If we sort sequences by size and sum their sizes in succession from the longest sequence, the L50 will be the number of sequences needed to achieve 50% of the total size

MB:

a unit of information equal to 2^20 bytes or, loosely, one million bytes

Mb:

same as Mbp

Mbp:

mega base pairs (1,000,000 bp)

mRNA:

messenger RNA

MSA:

multiple sequence alignment

mtDNA:

mitochondrial DNA

nt:

nucleotide

nucDNA:

nuclear DNA

N50:

If we sort sequences by size and sum their sizes in succession from the longest sequence, the N50 will be the size of the last sequence added to achieve 50% of the total size

ORFs:

open reading frames

PacBio:

Pacific Biosciences

PE:

paired-end (sequence of both ends of a fragment)

rRNA:

ribosomal RNA

SMRT:

Single Molecule Real-Time

SAM:

sequence alignment/map

SE:

standard errors

SR:

single-read (sequencing from only one end)

ssDNA:

single stranded DNA

tRNA:

transfer RNA

VCF:

variant call file

References

  1. Cary GA, Cameron RA, Hinman VF. Genomic resources for the study of echinoderm development and evolution. In: Methods Cell Biol, vol 151. New York: Elsevier: 2019. p. 65–88. https://doi.org/10.1016/bs.mcb.2018.11.019.


  2. Carnevali MC. Regeneration in Echinoderms: repair, regrowth, cloning. Invert Surviv J. 2006; 3(1):64–76.


  3. Mashanov V, Akiona J, Khoury M, Ferrier J, Reid R, Machado DJ, Zueva O, Janies D. Active Notch signaling is required for arm regeneration in a brittle star. PLoS ONE. 2020; 15(5):0232981. https://doi.org/10.1371/journal.pone.0232981.


  4. Mashanov VS, Zueva OR, García-Arrarás JE. Transcriptomic changes during regeneration of the central nervous system in an echinoderm. BMC Genomics. 2014; 15(1):1–21.


  5. Quispe-Parra DJ, Medina-Feliciano JG, Cruz-González S, Ortiz-Zuazaga H, García-Arrarás JE. Transcriptomic analysis of early stages of intestinal regeneration in Holothuria glaberrima. Sci Rep. 2021; 11(1):1–14.


  6. Purushothaman S, Saxena S, Meghah V, Swamy CVB, Ortega-Martinez O, Dupont S, Idris M. Transcriptomic and proteomic analyses of Amphiura filiformis arm tissue-undergoing regeneration. J Proteomics. 2015; 112:113–24.


  7. Czarkwiani A, Dylus DV, Carballo L, Oliveri P. Fgf signalling plays similar roles in development and regeneration of the skeleton in the brittle star Amphiura filiformis. Development. 2021; 148(10):180760. https://doi.org/10.1242/dev.180760.


  8. Mashanov VS, Zueva OR, García-Arrarás JE. Myc regulates programmed cell death and radial glia dedifferentiation after neural injury in an echinoderm. BMC Dev Biol. 2015; 15(1):1–9.


  9. Alicea-Delgado M, García-Arrarás JE. Wnt/β-catenin signaling pathway regulates cell proliferation but not muscle dedifferentiation nor apoptosis during sea cucumber intestinal regeneration. Dev Biol. 2021; 480:105–13.


  10. Sodergren E, Weinstock GM, Davidson EH, Cameron RA, Gibbs RA, Angerer RC, Angerer LM, Arnone MI, Burgess DR, Burke RD, et al. The genome of the sea urchin Strongylocentrotus purpuratus. Science. 2006; 314(5801):941–52. https://doi.org/10.1126/science.1133609.


  11. Cameron RA, Kudtarkar P, Gordon SM, Worley KC, Gibbs RA. Do echinoderm genomes measure up? Mar Genom. 2015; 22:1–9. https://doi.org/10.1016/j.margen.2015.02.004.


  12. Kinjo S, Kiyomoto M, Yamamoto T, Ikeo K, Yaguchi S. Hpbase: A genome database of a sea urchin, hemicentrotus pulcherrimus. Dev Growth Differ. 2018; 60(3):174–82. https://doi.org/10.1111/dgd.12429.


  13. Sergiev PV, Artemov AA, Prokhortchouk EB, Dontsova OA, Berezkin GV. Genomes of Strongylocentrotus franciscanus and Lytechinus variegatus: are there any genomic explanations for the two order of magnitude difference in the lifespan of sea urchins? Aging. 2016; 8(2):260. https://doi.org/10.18632/aging.100889.


  14. Kudtarkar P, Cameron RA. Echinobase: an expanding resource for echinoderm genomic information. Database. 2017;2017. https://doi.org/10.1093/database/bax074.

  15. Zhang X, Sun L, Yuan J, Sun Y, Gao Y, Zhang L, Li S, Dai H, Hamel J-F, Liu C, et al. The sea cucumber genome provides insights into morphological evolution and visceral regeneration. PLoS Biol. 2017; 15(10):2003790. https://doi.org/10.1371/journal.pbio.2003790.


  16. Hall MR, Kocot KM, Baughman KW, Fernandez-Valverde SL, Gauthier ME, Hatleberg WL, Krishnan A, McDougall C, Motti CA, Shoguchi E, et al. The crown-of-thorns starfish genome as a guide for biocontrol of this coral reef pest. Nature. 2017; 544(7649):231–4. https://doi.org/10.1038/nature22033.


  17. Long KA, Nossa CW, Sewell MA, Putnam NH, Ryan JF. Low coverage sequencing of three echinoderm genomes: the brittle star Ophionereis fasciata, the sea star Patiriella regularis, and the sea cucumber Australostichopus mollis. GigaScience. 2016; 5(1):13742–016. https://doi.org/10.1186/s13742-016-0125-6.


  18. Say T. On the species of the Linnean genus Asterias inhabiting the coast of the United States. P Acad Nat Sci Phila. 1825; 5(1):151–4.


  19. Bowmer T, Keegan B. Field survey of the occurrence and significance of regeneration in Amphiura filiformis (echinodermata: Ophiuroidea) from galway bay, west coast of ireland. Mar Biol. 1983; 74(1):65–71.


  20. Weber AA-T, Dupont S, Chenuil A. Thermotolerance and regeneration in the brittle star species complex Ophioderma longicauda: A preliminary study comparing lineages and mediterranean basins. Comptes Rendus Biologies. 2013; 336(11-12):572–81. https://doi.org/10.1016/j.crvi.2013.10.004.


  21. Hurlbut GD, Kankel MW, Lake RJ, Artavanis-Tsakonas S. Crossing paths with notch in the hyper-network. Curr Opin Cell Biol. 2007; 19(2):166–75.


  22. Simão FA, Waterhouse RM, Ioannidis P, Kriventseva EV, Zdobnov EM. BUSCO: assessing genome assembly and annotation completeness with single-copy orthologs. Bioinformatics. 2015; 31(19):3210–2. https://doi.org/10.1093/bioinformatics/btv351.


  23. Waterhouse RM, Seppey M, Simão FA, Manni M, Ioannidis P, Klioutchnikov G, Kriventseva EV, Zdobnov EM. Busco applications from quality assessments to gene prediction and phylogenomics. Mol Biol Evol. 2018; 35(3):543–8. https://doi.org/10.1093/molbev/msx319.


  24. Walton KD, Croce JC, Glenn TD, Wu S-Y, McClay DR. Genomics and expression profiles of the hedgehog and notch signaling pathways in sea urchin development. Dev Biol. 2006; 300(1):153–64. https://doi.org/10.1016/j.ydbio.2006.08.064.


  25. Marlow H, Roettinger E, Boekhout M, Martindale MQ. Functional roles of notch signaling in the cnidarian Nematostella vectensis. Dev Biol. 2012; 362(2):295–308. https://doi.org/10.1016/j.ydbio.2011.11.012.


  26. Layden MJ, Martindale MQ. Non-canonical notch signaling represents an ancestral mechanism to regulate neural differentiation. EvoDevo. 2014; 5(1):1–14. https://doi.org/10.1186/2041-9139-5-30.


  27. Erkenbrack EM. Notch-mediated lateral inhibition is an evolutionarily conserved mechanism patterning the ectoderm in echinoids. Dev Genes Evol. 2018; 228(1):1–11. https://doi.org/10.1007/s00427-017-0599-y.


  28. Favarolo MB, López SL. Notch signaling in the division of germ layers in bilaterian embryos. Mech Develop. 2018; 154:122–44. https://doi.org/10.1016/j.mod.2018.06.005.


  29. Lloyd-Lewis B, Mourikis P, Fre S. Notch signalling: sensor and instructor of the microenvironment to coordinate cell fate and organ morphogenesis. Curr Opin Cell Biol. 2019; 61:16–23. https://doi.org/10.1016/j.ceb.2019.06.003.


  30. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990; 215(3):403–10. https://doi.org/10.1016/S0022-2836(05)80360-2.


  31. Lee T, Bae YJ, Shin S. Mitochondrial gene rearrangement and phylogenetic relationships in the amphilepidida and ophiacanthida (echinodermata, ophiuroidea). Mar Biol Res. 2019; 15(1):26–35. https://doi.org/10.1080/17451000.2019.1601226.


  32. Zueva O, Khoury M, Heinzeller T, Mashanova D, Mashanov V. The complex simplicity of the brittle star nervous system. Front Zool. 2018; 15(1):1–26. https://doi.org/10.1186/s12983-017-0247-4.


  33. Clark EG, Fezzaa K, Burke J, Racicot R, Shaw J, Westacott S, Briggs D. A farewell to arms: using x-ray synchrotron imaging to investigate autotomy in brittle stars. Zoomorphology. 2019; 138(3):419–24. https://doi.org/10.1007/s00435-019-00451-7.


  34. Mashanov V, Zueva O. Radial glia in echinoderms. Dev Neurobiol. 2019; 79(5):396–405. https://doi.org/10.1002/dneu.22659.


  35. Ehebauer M, Hayward P, Martinez-Arias A. Notch signaling pathway. Sci STKE. 2006; 2006(364):7. https://doi.org/10.1126/stke.3642006cm7.


  36. Cormier S, Le Bras S, Souilhol C, Vandormael-Pournin S, Durand B, Babinet C, Baldacci P, Cohen-Tannoudji M. The murine ortholog of notchless, a direct regulator of the notch pathway in Drosophila melanogaster, is essential for survival of inner cell mass cells. Mol Cell Biol. 2006; 26(9):3541–9. https://doi.org/10.1128/MCB.26.9.3541-3549.2006.


  37. Cary GA, Cameron RA, Hinman VF. EchinoBase: tools for echinoderm genome analyses. In: Kollmar M, editor. Eukaryotic Genomic Databases (Methods Mol Biol), vol 1757. New York: Springer: 2018. p. 349–69. https://doi.org/10.1007/978-1-4939-7737-6_12.


  38. Kondo M, Akasaka K. Current status of echinoderm genome analysis - what do we know?. Curr Genomics. 2012; 13(2):134–43. https://doi.org/10.2174/138920212799860643.


  39. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009; 323(5910):133–8. https://doi.org/10.1126/science.1162986.


  40. Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014; 30(15):2114–20. https://doi.org/10.1093/bioinformatics/btu170.


  41. O’Connell J, Schulz-Trieglaff O, Carlson E, Hims MM, Gormley NA, Cox AJ. NxTrim: optimized trimming of Illumina mate pair reads. Bioinformatics. 2015; 31(12):2035–7. https://doi.org/10.1093/bioinformatics/btv057.


  42. Yang X, Liu D, Liu F, Wu J, Zou J, Xiao X, Zhao F, Zhu B. HTQC: a fast quality control toolkit for Illumina sequencing data. BMC Bioinformatics. 2013; 14(1):1–4. https://doi.org/10.1186/1471-2105-14-33.


  43. Xu H, Luo X, Qian J, Pang X, Song J, Qian G, Chen J, Chen S. FastUniq: a fast de novo duplicates removal tool for paired short reads. PloS ONE. 2012; 7(12):52249. https://doi.org/10.1371/journal.pone.0052249.


  44. Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009; 19(6):1117–23. https://doi.org/10.1101/gr.089532.108.


  45. Thrash A, Hoffmann F, Perkins A. Toward a more holistic method of genome assembly assessment. BMC Bioinformatics. 2020; 21(4):1–8. https://doi.org/10.1186/s12859-020-3382-4.


  46. Walker BJ, Abeel T, Shea T, Priest M, Abouelliel A, Sakthikumar S, Cuomo CA, Zeng Q, Wortman J, Young SK, et al. Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PloS ONE. 2014; 9(11):112963. https://doi.org/10.1371/journal.pone.0112963.


  47. Zimin AV, Marçais G, Puiu D, Roberts M, Salzberg SL, Yorke JA. The MaSuRCA genome assembler. Bioinformatics. 2013; 29(21):2669–77. https://doi.org/10.1093/bioinformatics/btt476.


  48. Holt RA, Subramanian GM, Halpern A, Sutton GG, Charlab R, Nusskern DR, Wincker P, Clark AG, Ribeiro JC, Wides R, et al. The genome sequence of the malaria mosquito Anopheles gambiae. Science. 2002; 298(5591):129–49. https://doi.org/10.1126/science.1076181.


  49. Chakraborty M, Baldwin-Brown JG, Long AD, Emerson J. Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res. 2016; 44(19):147. https://doi.org/10.1093/nar/gkw654.


  50. Hunt M. assembly-stats. 2014. https://github.com/sanger-pathogens/assembly-stats. Accessed 18 Feb 2019.

  51. UC Davis Bioinformatics Core. assemblathon2-analysis. 2012. https://github.com/ucdavis-bioinformatics/assemblathon2-analysis. Accessed 24 Mar 2019.

  52. Gregory TR. Animal genome size database. 2020. http://www.genomesize.com. Accessed 28 Oct 2020.

  53. Hinegardner R. Cellular DNA content of the Echinodermata. Comp Biochem Physiol. 1974; 49B:219–26.


  54. Hardie DC, Gregory TR, Hebert PD. From pixels to picograms: a beginners’ guide to genome quantification by Feulgen image analysis densitometry. J Histochem Cytochem. 2002; 50(6):735–49. https://doi.org/10.1177/002215540205000601.


  55. Rasch EM, Lee CE, Wyngaard GA. DNA–Feulgen cytophotometric determination of genome size for the freshwater-invading copepod Eurytemora affinis. Genome. 2004; 47(3):559–64. https://doi.org/10.1139/g04-014.


  56. Donnenberg VS, Landreneau RJ, Pfeifer ME, Donnenberg AD. Flow cytometric determination of stem/progenitor content in epithelial tissues: an example from nonsmall lung cancer and normal lung. Cytometry Part A. 2013; 83(1):141–9. https://doi.org/10.1002/cyto.a.22156.


  57. Schindelin J, Arganda-Carreras I, Frise E, Kaynig V, Longair M, Pietzsch T, Preibisch S, Rueden C, Saalfeld S, Schmid B, et al. Fiji: an open-source platform for biological-image analysis. Nat Methods. 2012; 9(7):676–82.


  58. Marçais G, Kingsford C. A fast, lock-free approach for efficient parallel counting of occurrences of k-mers. Bioinformatics. 2011; 27(6):764–70. https://doi.org/10.1093/bioinformatics/btr011.


  59. Vurture GW, Sedlazeck FJ, Nattestad M, Underwood CJ, Fang H, Gurtowski J, Schatz MC. Genomescope: fast reference-free genome profiling from short reads. Bioinformatics. 2017; 33(14):2202–4. https://doi.org/10.1093/bioinformatics/btx153.


  60. Chu C, Nielsen R, Wu Y. REPdenovo: inferring de novo repeat motifs from short sequence reads. PloS ONE. 2016; 11(3):0150719. https://doi.org/10.1371/journal.pone.0150719.


  61. Smit AFA, Hubley R. RepeatModeler Open-1.0. 2008. http://www.repeatmasker.org. Accessed 10 Nov 2020.

  62. Jurka J. Repeats in genomic dna: mining and meaning. Curr Opin Struc Biol. 1998; 8(3):333–7. https://doi.org/10.1016/S0959-440X(98)80067-5.


  63. Jurka J. Repbase update: a database and an electronic journal of repetitive elements. Trends Genet. 2000; 16(9):418–20. https://doi.org/10.1016/S0168-9525(00)02093-X.


  64. Jurka J, Kapitonov VV, Pavlicek A, Klonowski P, Kohany O, Walichiewicz J. Repbase Update, a database of eukaryotic repetitive elements. Cytogenet Genome Res. 2005; 110(1-4):462–7. https://doi.org/10.1159/000084979.


  65. Bao W, Kojima KK, Kohany O. Repbase Update, a database of repetitive elements in eukaryotic genomes. Mobile DNA. 2015; 6(1):11. https://doi.org/10.1186/s13100-015-0041-9.


  66. Smit AFA, Hubley R, Green P. RepeatMasker Open-4.0. 2013. http://www.repeatmasker.org. Accessed 10 Nov 2020.

  67. Reese MG, Moore B, Batchelor C, Salas F, Cunningham F, Marth GT, Stein L, Flicek P, Yandell M, Eilbeck K. A standard variation file format for human genome sequences. Genome Biol. 2010; 11(8):88. https://doi.org/10.1186/gb-2010-11-8-r88.


  68. Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. Braker1: unsupervised rna-seq-based genome annotation with genemark-et and augustus. Bioinformatics. 2016; 32(5):767–9. https://doi.org/10.1093/bioinformatics/btv661.


  69. Brůna T, Hoff KJ, Lomsadze A, Stanke M, Borodovsky M. Braker2: Automatic eukaryotic genome annotation with genemark-ep+ and augustus supported by a protein database. NAR Genom Bioinforma. 2021; 3(1):108. https://doi.org/10.1093/nargab/lqaa108.


  70. Lomsadze A, Burns PD, Borodovsky M. Integration of mapped rna-seq reads into automatic training of eukaryotic gene finding algorithm. Nucleic Acids Res. 2014; 42(15):119. https://doi.org/10.1093/nar/gku557.


  71. Stanke M, Diekhans M, Baertsch R, Haussler D. Using native and syntenically mapped cdna alignments to improve de novo gene finding. Bioinformatics. 2008; 24(5):637–44. https://doi.org/10.1093/bioinformatics/btn013.


  72. Stanke M, Schöffmann O, Morgenstern B, Waack S. Gene prediction in eukaryotes with a generalized hidden markov model that uses hints from external sources. BMC Bioinformatics. 2006; 7(1):1–11. https://doi.org/10.1186/1471-2105-7-62.


  73. Ensembl Metazoa Home. Strongylocentrotus purpuratus (Spur 01) (Spur_5.0). 2020. https://metazoa.ensembl.org/Strongylocentrotus_purpuratus. Accessed 12 Aug 2020.

  74. Slater GSC, Birney E. Automated generation of heuristics for biological sequence comparison. BMC Bioinformatics. 2005; 6(1):1–11. https://doi.org/10.1186/1471-2105-6-31.


  75. Wu TD, Watanabe CK. Gmap: a genomic mapping and alignment program for mrna and est sequences. Bioinformatics. 2005; 21(9):1859–75. https://doi.org/10.1093/bioinformatics/bti310.


  76. Kent WJ. Blat—the blast-like alignment tool. Genome Res. 2002; 12(4):656–64. https://doi.org/10.1101/gr.229202.


  77. Jacob Machado D, Janies D, Brouwer C, Grant T. A new strategy to infer circularity applied to four new complete frog mitogenomes. Ecol Evol. 2018; 8(8):4011–8. https://doi.org/10.1002/ece3.3918.


  78. Donath A, Jühling F, Al-Arab M, Bernhart SH, Reinhardt F, Stadler PF, Middendorf M, Bernt M. Improved annotation of protein-coding genes boundaries in metazoan mitochondrial genomes. Nucleic Acids Res. 2019; 47(20):10543–52. https://doi.org/10.1093/nar/gkz833.


  79. Lowe TM, Eddy SR. trnascan-se: a program for improved detection of transfer rna genes in genomic sequence. Nucleic Acids Res. 1997; 25(5):955–64. https://doi.org/10.1093/nar/25.5.955.


  80. Chan PP, Lowe TM. trnascan-se: searching for trna genes in genomic sequences. In: Gene Prediction. Springer: 2019. p. 1–14. https://doi.org/10.1007/978-1-4939-9173-0_1.

  81. Hennebert E, Leroy B, Wattiez R, Ladurner P. An integrated transcriptomic and proteomic analysis of sea star epidermal secretions identifies proteins involved in defense and adhesion. J Proteomics. 2015; 128:83–91. https://doi.org/10.1016/j.jprot.2015.07.002.


  82. Ruiz-Ramos DV, Schiebelhut LM, Hoff KJ, Wares JP, Dawson MN. An initial comparative genomic autopsy of wasting disease in sea stars. Mol Ecol. 2020; 29(6):1087–102. https://doi.org/10.1111/mec.15386.


  83. Baughman KW, McDougall C, Cummins SF, Hall M, Degnan BM, Satoh N, Shoguchi E. Genomic organization of hox and parahox clusters in the echinoderm, Acanthaster planci. Genesis. 2014; 52(12):952–8. https://doi.org/10.1002/dvg.22840.


  84. Yasuda N, Hamaguchi M, Sasaki M, Nagai S, Saba M, Nadaoka K. Complete mitochondrial genome sequences for crown-of-thorns starfish Acanthaster planci and Acanthaster brevispinus. BMC Genomics. 2006; 7(1):1–10. https://doi.org/10.1186/1471-2164-7-17.


  85. Jung G, Lee Y-H. Complete mitochondrial genome of chilean sea urchin: Loxechinus albus (camarodonta, parechinidae). Mitochondrial DNA. 2015; 26(6):883–4. https://doi.org/10.3109/19401736.2013.809449.


  86. Warner JF, Lord JW, Schreiter SA, Nesbit KT, Hamdoun A, Lyons DC. Chromosomal-level genome assembly of the painted sea urchin Lytechinus pictus: A genetically enabled model system for cell biology and embryonic development. Genome Biol Evol. 2021; 13(4):061. https://doi.org/10.1093/gbe/evab061.


  87. Davidson PL, Guo H, Wang L, Berrio A, Zhang H, Chang Y, Soborowski AL, McClay DR, Fan G, Wray GA. Chromosomal-level genome assembly of the sea urchin lytechinus variegatus substantially improves functional genomic analyses. Genome Biol Evol. 2020; 12(7):1080–6. https://doi.org/10.1093/gbe/evaa101.


  88. Bronstein O, Kroh A. The first mitochondrial genome of the model echinoid Lytechinus variegatus and insights into odontophoran phylogenetics. Genomics. 2019; 111(4):710–8. https://doi.org/10.1016/j.ygeno.2018.04.008.


  89. Morrison AMS, Goldstone JV, Lamb DC, Kubota A, Lemaire B, Stegeman JJ. Identification, modeling and ligand affinity of early deuterostome cyp51s, and functional characterization of recombinant zebrafish sterol 14 α-demethylase. BBA-Gen Subj. 2014; 1840(6):1825–36. https://doi.org/10.1016/j.bbagen.2013.12.009.


  90. Tu Q, Cameron RA, Worley KC, Gibbs RA, Davidson EH. Gene structure in the sea urchin Strongylocentrotus purpuratus based on transcriptome analysis. Genome Res. 2012; 22(10):2079–87. https://doi.org/10.1101/gr.139170.112.


  91. Stevens ME, Dhillon J, Miller CA, Messier-Solek C, Majeske AJ, Zuelke D, Rast JP, Smith LC. Sptie1/2 is expressed in coelomocytes, axial organ and embryos of the sea urchin Strongylocentrotus purpuratus, and is an orthologue of vertebrate tie1 and tie2. Dev Comp Immunol. 2010; 34(8):884–95. https://doi.org/10.1016/j.dci.2010.03.010.


  92. Tartari M, Gissi C, Lo Sardo V, Zuccato C, Picardi E, Pesole G, Cattaneo E. Phylogenetic comparison of huntingtin homologues reveals the appearance of a primitive polyq in sea urchin. Mol Biol Evol. 2008; 25(2):330–8. https://doi.org/10.1093/molbev/msm258.


  93. Tu Q, Brown CT, Davidson EH, Oliveri P. Sea urchin forkhead gene family: phylogeny and embryonic expression. Dev Biol. 2006; 300(1):49–62. https://doi.org/10.1016/j.ydbio.2006.09.031.

  94. Neill AT, Moy GW, Vacquier VD. Polycystin-2 associates with the polycystin-1 homolog, suREJ3, and localizes to the acrosomal region of sea urchin spermatozoa. Mol Reprod Dev. 2004; 67(4):472–7. https://doi.org/10.1002/mrd.20033.

  95. Multerer KA, Smith LC. Two cDNAs from the purple sea urchin, Strongylocentrotus purpuratus, encoding mosaic proteins with domains found in factor H, factor I, and complement components C6 and C7. Immunogenetics. 2004; 56(2):89–106. https://doi.org/10.1007/s00251-004-0665-2.

  96. Kamei N, Glabe CG. The species-specific egg receptor for sea urchin sperm adhesion is EBR1, a novel ADAMTS protein. Gene Dev. 2003; 17(20):2502–7. https://doi.org/10.1101/gad.1133003.

  97. Tombes RM, Faison MO, Turbeville J. Organization and evolution of multifunctional Ca2+/CaM-dependent protein kinase genes. Gene. 2003; 322:17–31. https://doi.org/10.1016/j.gene.2003.08.023.

  98. Sirotkin V, Seipel S, Krendel M, Bonder EM. Characterization of sea urchin unconventional myosins and analysis of their patterns of expression during early embryogenesis. Mol Reprod Dev. 2000; 57(2):111–26.

  99. Pancer Z. Dynamic expression of multiple scavenger receptor cysteine-rich genes in coelomocytes of the purple sea urchin. P Natl Acad Sci. 2000; 97(24):13156–61. https://doi.org/10.1073/pnas.230096397.

  100. Pancer Z, Rast JP, Davidson EH. Origins of immunity: transcription factors and homologues of effector genes of the vertebrate immune system expressed in sea urchin coelomocytes. Immunogenetics. 1999; 49(9):773–86. https://doi.org/10.1007/s002510050551.

  101. LaFleur Jr GJ, Horiuchi Y, Wessel GM. Sea urchin ovoperoxidase: oocyte-specific member of a heme-dependent peroxidase superfamily that functions in the block to polyspermy. Mechanisms Devel. 1998; 70(1-2):77–89. https://doi.org/10.1016/s0925-4773(97)00178-0.

  102. Hartman JJ, Mahr J, McNally K, Okawa K, Iwamatsu A, Thomas S, Cheesman S, Heuser J, Vale RD, McNally FJ. Katanin, a microtubule-severing protein, is a novel AAA ATPase that targets to the centrosome using a WD40-containing subunit. Cell. 1998; 93(2):277–87. https://doi.org/10.1016/s0092-8674(00)81578-0.

  103. Marsden M, Burke RD. The βL integrin subunit is necessary for gastrulation in sea urchin embryos. Dev Biol. 1998; 203(1):134–48. https://doi.org/10.1006/dbio.1998.9033.

  104. Marsden M, Burke R. Cloning and characterization of novel β integrin subunits from a sea urchin. Dev Biol. 1997; 181(2):234–45. https://doi.org/10.1006/dbio.1996.8451.

  105. Valverde JR, Marco R, Garesse R. A conserved heptamer motif for ribosomal RNA transcription termination in animal mitochondria. P Natl Acad Sci. 1994; 91(12):5368–71. https://doi.org/10.1073/pnas.91.12.5368.

  106. Qureshi SA, Jacobs HT. Two distinct, sequence-specific DNA-binding proteins interact independently with the major replication pause region of sea urchin mtDNA. Nucleic Acids Res. 1993; 21(12):2801–8. https://doi.org/10.1093/nar/21.12.2801.

  107. Kitagawa M. Notch signalling in the nucleus: roles of Mastermind-like (MAML) transcriptional coactivators. J Biochem. 2016; 159(3):287–94. https://doi.org/10.1093/jb/mvv123.

  108. Jin K, Zhou W, Han X, Wang Z, Li B, Jeffries S, Tao W, Robbins DJ, Capobianco AJ. Acetylation of Mastermind-like 1 by p300 drives the recruitment of NACK to initiate Notch-dependent transcription. Cancer Res. 2017; 77(16):4228–37. https://doi.org/10.1158/0008-5472.CAN-16-3156.

  109. Wallberg AE, Pedersen K, Lendahl U, Roeder RG. p300 and PCAF act cooperatively to mediate transcriptional activation from chromatin templates by Notch intracellular domains in vitro. Mol Cell Biol. 2002; 22(22):7812–9. https://doi.org/10.1128/MCB.22.22.7812-7819.2002.

  110. Weaver KL, Alves-Guerra M-C, Jin K, Wang Z, Han X, Ranganathan P, Zhu X, DaSilva T, Liu W, Ratti F, et al. NACK is an integral component of the Notch transcriptional activation complex and is critical for development and tumorigenesis. Cancer Res. 2014; 74(17):4741–51. https://doi.org/10.1158/0008-5472.CAN-14-1547.

  111. Kim GS, Park H-S, Lee YC. Opthis identifies the molecular basis of the direct interaction between CSL and SMRT corepressor. Mol Cells. 2018; 41(9):842. https://doi.org/10.14348/molcells.2018.0196.

  112. Gazave E, Lapébie P, Richards GS, Brunet F, Ereskovsky AV, Degnan BM, Borchiellini C, Vervoort M, Renard E. Origin and evolution of the Notch signalling pathway: an overview from eukaryotic genomes. BMC Evol Biol. 2009; 9(1):1–27. https://doi.org/10.1186/1471-2148-9-249.

  113. Shi S, Stanley P. Protein O-fucosyltransferase 1 is an essential component of Notch signaling pathways. P Natl Acad Sci. 2003; 100(9):5234–9. https://doi.org/10.1073/pnas.0831126100.

  114. Taylor P, Takeuchi H, Sheppard D, Chillakuri C, Lea SM, Haltiwanger RS, Handford PA. Fringe-mediated extension of O-linked fucose in the ligand-binding region of Notch1 increases binding to mammalian Notch ligands. P Natl Acad Sci. 2014; 111(20):7290–5. https://doi.org/10.1073/pnas.1319683111.

  115. Van Tetering G, Vooijs M. Proteolytic cleavage of Notch: “hit and run”. Curr Mol Med. 2011; 11(4):255–69. https://doi.org/10.2174/156652411795677972.

  116. Koutelou E, Sato S, Tomomori-Sato C, Florens L, Swanson SK, Washburn MP, Kokkinaki M, Conaway RC, Conaway JW, Moschonas NK. Neuralized-like 1 (Neurl1) targeted to the plasma membrane by N-myristoylation regulates the Notch ligand Jagged1. J Biol Chem. 2008; 283(7):3846–53. https://doi.org/10.1074/jbc.M706974200.

  117. Koo B-K, Yoon K-J, Yoo K-W, Lim H-S, Song R, So J-H, Kim C-H, Kong Y-Y. Mind bomb-2 is an E3 ligase for Notch ligand. J Biol Chem. 2005; 280(23):22335–42. https://doi.org/10.1074/jbc.M501631200.

  118. Kopan R, Ilagan MXG. The canonical Notch signaling pathway: unfolding the activation mechanism. Cell. 2009; 137(2):216–33. https://doi.org/10.1016/j.cell.2009.03.045.

  119. Teider N, Scott DK, Neiss A, Weeraratne SD, Amani VM, Wang Y, Marquez VE, Cho Y-J, Pomeroy SL. Neuralized1 causes apoptosis and downregulates Notch target genes in medulloblastoma. Neuro Oncol. 2010; 12(12):1244–56. https://doi.org/10.1093/neuonc/noq091.

  120. Iso T, Kedes L, Hamamori Y. HES and HERP families: multiple effectors of the Notch signaling pathway. J Cell Physiol. 2003; 194(3):237–55. https://doi.org/10.1002/jcp.10208.

  121. Young PW. LNX1/LNX2 proteins: Functions in neuronal signalling and beyond. Neuronal Signal. 2018; 2(2):20170191. https://doi.org/10.1042/NS20170191.

  122. Zhou Y, Atkins JB, Rompani SB, Bancescu DL, Petersen PH, Tang H, Zou K, Stewart SB, Zhong W. The mammalian Golgi regulates Numb signaling in asymmetric cell division by releasing ACBD3 during mitosis. Cell. 2007; 129(1):163–78. https://doi.org/10.1016/j.cell.2007.02.037.

  123. Sakata T, Sakaguchi H, Tsuda L, Higashitani A, Aigaki T, Matsuno K, Hayashi S. Drosophila Nedd4 regulates endocytosis of Notch and suppresses its ligand-independent activation. Curr Biol. 2004; 14(24):2228–36. https://doi.org/10.1016/j.cub.2004.12.028.

Acknowledgements

No comments.

Funding

Research reported in this publication was supported by the National Institute for General Medical Sciences of the National Institutes of Health under award number 1R15GM128066-01. We also acknowledge funding from the University of North Florida. We acknowledge funding and logistical support from several entities of the University of North Carolina at Charlotte, including the Bioinformatics Services Division, the Department of Bioinformatics and Genomics, the Bioinformatics Research Center, University Research Computing, and the College of Computing and Informatics. We are grateful for funding from the Belk Family. The funding bodies played no role in the design of the study; the collection, analysis, and interpretation of data; or the writing of the manuscript.

Author information

Contributions

VM and DJM share first authorship. VM, RR, CB, and DAJ: initial conceptualization and funding acquisition. VM, DAJ, RR, DJM, and JK: writing (original draft, review, and editing). RR and DJM: methodology, formal analysis, investigation, resources, software, validation, data curation, and visualization. DJ, VM, CB, DJM and JK: project administration. The authors read and approved the final manuscript.

Corresponding author

Correspondence to Vladimir Mashanov.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Supplementary information accompanies this paper at Zenodo, DOI: https://doi.org/10.5281/zenodo.6618000.

Additional file 1

File S1. Summary of sequence read statistics (number of sequences, number of base pairs, maximum read length, average read length, sum, estimated genome size, and estimated overall sequence depth).

Additional file 2

  • ‘notch-related.gtf’: Independent annotation of genes that belong to the Notch signaling pathway.

  • ‘repeatMasker.gtf’: Repeat annotation based on similarity (not included in the gene statistics).

  • ‘braker.gtf’: Main gene annotation file produced with the BRAKER pipeline.

  • ‘exonerate_complete.gtf’: All hits from exonerate alignments (may include suboptimal hits because it stores the best hit per transcript query, not the best hit per target location).

  • ‘exonerate_filtered.gtf’: Filtered results from exonerate with the best hits per target location.

Additional file 3

File S3. Bioinformatics protocols for quality control of raw sequence reads, subsequent genome assembly, and the methodology and main results for BUSCO v4.0.6 analyses.

Additional file 4

File S4. Expected C-value variation in echinoderms. Data in this table are referenced in our manuscript and are based mainly on information available from the “Animal genome size database” [52].

Additional file 5

File S5. Protocols for Feulgen image analysis densitometry and sequence-based genome size estimation.

Additional file 6

File S6. Expected variation of haploid genome sizes in echinoderms.

Additional file 7

File S7. Protocol for de novo DNA repeat assembly from shotgun sequence reads.

Additional file 8

File S8. Protocol for DNA repeat identification.

Additional file 9

File S9. Template scripts for gene prediction and annotation.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Mashanov, V., Machado, D.J., Reid, R. et al. Twinkle twinkle brittle star: the draft genome of Ophioderma brevispinum (Echinodermata: Ophiuroidea) as a resource for regeneration research. BMC Genomics 23, 574 (2022). https://doi.org/10.1186/s12864-022-08750-y

  • DOI: https://doi.org/10.1186/s12864-022-08750-y

Keywords