A crustacean annotated transcriptome (CAT) database
BMC Genomics volume 21, Article number: 32 (2020)
Decapods are an order of crustaceans which includes shrimps, crabs, lobsters and crayfish. They occur worldwide and are of great scientific interest as well as being of ecological and economic importance in fisheries and aquaculture. However, our knowledge of their biology mainly comes from the group which is most closely related to crustaceans – insects. Here we produce a de novo transcriptome database, the crustacean annotated transcriptome (CAT) database, spanning multiple tissues and life stages of seven crustaceans.
A total of 71 transcriptome assemblies from six decapod species and a stomatopod species, including the coral shrimp Stenopus hispidus, the cherry shrimp Neocaridina davidi, the redclaw crayfish Cherax quadricarinatus, the spiny lobster Panulirus ornatus, the red king crab Paralithodes camtschaticus, the coconut crab Birgus latro, and the zebra mantis shrimp Lysiosquillina maculata, were generated. Differential gene expression analyses within species were generated as a reference and included in a graphical user interface database at http://cat.sls.cuhk.edu.hk/. Users can carry out gene name searches and also access gene sequences based on a sequence query using the BLAST search function.
The data generated and deposited in this database offer a valuable resource for the further study of these crustaceans, as well as being of use in aquaculture development.
The Arthropoda is a phylum containing the largest number (nearly 85%) of described living species in the world. For various historical reasons, most of our knowledge of their biology comes from insects, particularly fruit flies Drosophila. Crustacea (including shrimps, lobsters, crayfish, crabs) forms a large subphylum of arthropods now proven to be the closest relatives of Insecta. In the past decade, a substantial number of insect genomes have been sequenced across the different groups (e.g. beetle, wasp, bee, aphid, butterfly, and moth), especially in the course of the on-going 5000 insect genome project (i5k Consortium). By contrast, the genomic resources of crustaceans are relatively scarce, and are limited to a few species (e.g. [1,2,3,4,5,6]). Carcinology, or the study of crustaceans, benefits both basic science and the aquaculture industry, presently the fastest growing animal food-producing sector worldwide. Here, we generated a user-friendly database, the crustacean annotated transcriptome (CAT) database, which enables users to search for the annotated gene name as well as gene sequences based on sequence query. This database contains newly generated crustacean transcriptomic data from different developmental stages and the tissues of seven crustacean species, including a stomatopod mantis shrimp, two decapod shrimps, a crayfish, a lobster, and two anomuran crabs (Fig. 1).
Construction and content
Specimens of the seven crustacean species were acquired either from fish markets and aquarium shops in Hong Kong or from overseas sources (see details below). The animals were then maintained in the laboratory before being dissected, as described below:
Coral shrimps (Decapoda: Stenopodidea: Stenopodidae: Stenopus hispidus) were sourced from an aquarium shop and maintained for over 2 weeks as mating pairs in separate 10-L seawater tanks at an ambient indoor temperature (20–26 °C) with diurnal lighting and environmental enrichments of moss and wood, and were fed with aquarist shrimp feed. Tissue samples were collected from a single adult female at the intermolt stage, while “whole body” samples were obtained from 50 to 100 early (no eye spot) and late (with eye spot) stage eggs obtained from two females separately.
Cherry shrimp (Decapoda: Caridea: Atyidae: Neocaridina davidi) were purchased from an aquarium shop in Hong Kong. Again, they were kept in 10-L freshwater tanks at an ambient indoor temperature with diurnal lighting, and fed with aquarist shrimp feed. Tissue samples were collected from a single female adult at the intermolt stage, while “whole-body” samples were obtained from a 15-day-old juvenile, as well as from ~ 20 early (no eye spot) and late (with eye spot) stage eggs (~ 6 eggs per replicate) from two females separately.
Red claw crayfish (Decapoda: Astacidea: Parastacidae: Cherax quadricarinatus) at different life history stages were sourced from a breeder in Queensland, Australia. The juvenile (~ 7–10 cm in length) and adult (15–18 cm in length) crayfish were acclimated for over 2 weeks in 100-L freshwater tanks at an ambient indoor temperature with diurnal lighting and enriched with hiding nets, and were fed aquarist shrimp feed. Tissue samples were collected from a single adult female at the intermolt stage, from a single juvenile, from 4 newborn larvae (less than 10 days old, 2 individuals per replicate) and from 6 early (orange) and 6 late (brown) stage eggs (3 eggs per replicate).
Spiny lobsters (Decapoda: Achelata: Palinuridae: Panulirus ornatus) were purchased from a fish market in Hong Kong, and acclimated for 2 weeks in 500-L tanks in an outdoor enclosure at 25–30 °C and fed with live clams. Tissue samples were collected from a single adult female at the intermolt stage.
Adult male coconut crabs (Decapoda: Anomura: Coenobitidae: Birgus latro) were purchased and imported from a fish market in Okinawa, Japan. The crabs were fed a diet of coconut meat and boiled root vegetables while acclimating for 2 weeks in a controlled environment in a large outdoor enclosure at 25–30 °C. The enclosure was enriched with damp straw, reptile lights on diurnal control and a pool of running fresh water, and a humidifier maintained a relative humidity of 70–80%. Tissue samples were collected from a single individual.
Adult male king crabs (Decapoda: Anomura: Lithodidae: Paralithodes camtschaticus) were imported from Alaska and fed with live clams while acclimating for 2 weeks in 100-L seawater tanks kept at 4 °C in a dark room. Tissue samples were collected from a single individual.
Zebra mantis shrimp (Stomatopoda: Lysiosquillidae: Lysiosquillina maculata) were purchased from a fish market in Hong Kong and acclimated for 2 weeks in 100-L seawater tanks at ambient indoor temperature with diurnal lighting and 20 cm of sand, and were fed with live fish. Tissue samples were collected from a single adult female at the intermolt stage.
Tissue samples of gill, eyestalk, ovary (female only), hepatopancreas, and muscle were obtained from adults of all target species and from juvenile crayfish. Gill tissues were dissected, pooled and homogenised. Eyestalk tissue was dissected, avoiding the pigmented retina and discarding the exoskeleton. Ovary tissues were collected from mature females. Hepatopancreas tissues were taken from tubules distant from the midgut caeca to avoid heavy bacterial contamination. Muscle was isolated from the abdomen of all the shrimp and crayfish species (including the stomatopod) and from the large chela of crabs. Duplicate biological samples were collected. Tissue samples from adults and the “whole body” samples of juvenile animals, larvae, and eggs were frozen in liquid nitrogen and then stored at − 80 °C before total RNA extraction.
RNA extraction and sequencing
Total RNA was isolated using the mirVana microRNA Isolation Kit (Thermo Fisher Scientific). RNA concentration and quality were assessed with a NanoDrop Fluorospectrometer (Thermo Scientific). At least 5 μg of total RNA from each sample was depleted of ribosomal RNA using Ribo-Zero rRNA removal kits (Epicentre). Transcriptome libraries were prepared using the TruSeq Stranded RNA Library Prep Kit v2 (Illumina) by Theragen Bio Institute in Korea, followed by 150 bp paired-end sequencing on an Illumina HiSeq 4000 platform to obtain at least 51 million clean reads (after filtering and trimming).
Transcriptome assembly and annotation
Raw sequencing reads from 71 transcriptomes were quality-trimmed with Trimmomatic (v0.33, with parameters “ILLUMINACLIP:TruSeq3-PE.fa:2:30:10 SLIDINGWINDOW:4:5 LEADING:5 TRAILING:5 MINLEN:25”), followed by de novo transcriptome assembly using Trinity (v2.4.0, [8, 9]) with the options “--SS_lib_type RF --normalize_reads” and otherwise default parameters. All biological duplicates were combined for the de novo assembly. Transcript abundance was estimated using the script “align_and_estimate_abundance.pl” of the Trinity software with “--est_method RSEM --aln_method bowtie” (v1.1.2). Coding regions within transcripts were annotated using TransDecoder (v5.0.2), and functional annotation and analyses were carried out using Trinotate (v3.1.1). A summary of the assembled transcriptomes is shown in Table 1.
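The trimming, assembly, and abundance-estimation steps described above can be sketched as command lines. The Python snippet below only builds the argument lists and does not run anything. The trimming steps and the Trinity and RSEM options are those quoted in the text. Everything else is an assumption made for illustration: the file names, the jar location, the output directory names, the memory and CPU settings, and the extra `--trinity_mode` and `--prep_reference` flags. It is a minimal sketch and not the authors' actual scripts.

```python
# Sketch only: builds command lines, does not execute them.
# File names, jar path, memory and CPU values are hypothetical.

TRIM_STEPS = [
    "ILLUMINACLIP:TruSeq3-PE.fa:2:30:10",
    "SLIDINGWINDOW:4:5",
    "LEADING:5",
    "TRAILING:5",
    "MINLEN:25",
]


def trimmomatic_cmd(r1, r2, prefix, jar="trimmomatic-0.33.jar"):
    """Paired-end Trimmomatic call with the trimming steps quoted in the text."""
    return [
        "java", "-jar", jar, "PE", r1, r2,
        f"{prefix}_1P.fq.gz", f"{prefix}_1U.fq.gz",  # forward paired / unpaired
        f"{prefix}_2P.fq.gz", f"{prefix}_2U.fq.gz",  # reverse paired / unpaired
        *TRIM_STEPS,
    ]


def trinity_cmd(left, right, out_dir, max_memory="50G", cpu=8):
    """Trinity assembly over all replicates combined (comma-separated inputs)."""
    return [
        "Trinity", "--seqType", "fq",
        "--left", ",".join(left), "--right", ",".join(right),
        "--SS_lib_type", "RF", "--normalize_reads",
        "--max_memory", max_memory, "--CPU", str(cpu),
        "--output", out_dir,
    ]


def abundance_cmd(transcripts, left, right, out_dir):
    """Per-sample abundance estimation with RSEM and bowtie via the Trinity script."""
    return [
        "align_and_estimate_abundance.pl",
        "--transcripts", transcripts, "--seqType", "fq",
        "--left", left, "--right", right,
        "--est_method", "RSEM", "--aln_method", "bowtie",
        "--trinity_mode", "--prep_reference",  # assumed, not stated in the text
        "--output_dir", out_dir,
    ]


if __name__ == "__main__":
    print(" ".join(trimmomatic_cmd("gill_rep1_1.fq.gz", "gill_rep1_2.fq.gz", "gill_rep1")))
    print(" ".join(trinity_cmd(["gill_rep1_1P.fq.gz", "gill_rep2_1P.fq.gz"],
                               ["gill_rep1_2P.fq.gz", "gill_rep2_2P.fq.gz"],
                               "trinity_out")))
    print(" ".join(abundance_cmd("trinity_out/Trinity.fasta",
                                 "gill_rep1_1P.fq.gz", "gill_rep1_2P.fq.gz",
                                 "rsem_gill_rep1")))
```

Passing the replicate files to Trinity as comma-separated lists reflects the statement that biological duplicates were combined for assembly, while abundance is estimated one sample at a time against the shared assembly.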
Utility and discussion
The Crustacean Annotated Transcriptome (CAT) database is available at http://cat.sls.cuhk.edu.hk/. It was built with CodeIgniter (https://www.codeigniter.com/), a lightweight PHP web framework. The website provides researchers with tools for transcriptome visualization, gene search, and BLAST searches.
Gene expression data of the various samples in each species can be visualised through the Degust (https://github.com/Victorian-Bioinformatics-Consortium/degust) toolset. It allows the comparison of gene expression between different tissues of the same species. Users can browse differentially expressed genes (DEGs) between samples within the same species, perform their own DEG analysis, or analyse expression profiles using the inbuilt server.
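As a rough illustration of what a between-tissue comparison reports, the Python sketch below computes a log2 fold change between two expression values. The pseudocount and the example values are hypothetical, and this is not the statistical model that Degust applies; it only shows the quantity that a DEG table is typically sorted by.

```python
import math


def log2_fold_change(expr_a, expr_b, pseudocount=1.0):
    """log2 ratio of expression in sample B over sample A.

    The pseudocount (an assumption of this sketch) avoids division by zero
    for transcripts that are not detected in one of the samples.
    """
    if expr_a < 0 or expr_b < 0:
        raise ValueError("expression values must be non-negative")
    return math.log2((expr_b + pseudocount) / (expr_a + pseudocount))


if __name__ == "__main__":
    # Hypothetical abundance values for one transcript in two tissues.
    muscle, gill = 3.0, 15.0
    print(log2_fold_change(muscle, gill))
```

A positive value means higher expression in the second sample; a proper DEG analysis additionally models replicate variance before calling a gene differentially expressed.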
Gene sequence search
The database contains 462,877 gene annotation records (coral shrimp: 57,240; cherry shrimp: 92,956; red claw crayfish: 99,100; spiny lobster: 28,805; coconut crab: 72,729; red king crab: 73,144; zebra mantis shrimp: 38,903). Users can search for a gene in a given species by querying the “gene id” or “gene name” and selecting the species in the gene search section. After a request is submitted, the results are displayed in a table, with the number of results shown at the top. The table lists general information for all matched genes, including the gene id, gene name and species. Clicking on the “gene id” or “gene name” opens a detailed information page for the gene. This page shows the nucleic acid sequence derived from the de novo assembly, the protein sequence deduced from the assembled transcript, and the expression of the gene in each sample.
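The per-species figures quoted above can be checked against the stated total with a few lines of Python. The counts are taken directly from the text; the dictionary layout is only an illustrative choice.

```python
# Per-species annotation counts as quoted in the text.
ANNOTATION_COUNTS = {
    "coral shrimp": 57240,
    "cherry shrimp": 92956,
    "red claw crayfish": 99100,
    "spiny lobster": 28805,
    "coconut crab": 72729,
    "red king crab": 73144,
    "zebra mantis shrimp": 38903,
}

total = sum(ANNOTATION_COUNTS.values())

if __name__ == "__main__":
    print(f"total annotation records: {total:,}")
    # Share of the database contributed by each species, largest first.
    for species, count in sorted(ANNOTATION_COUNTS.items(), key=lambda kv: -kv[1]):
        print(f"{species:>20}: {count:>7,} ({count / total:.1%})")
```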
Users can enter or upload query sequence(s) in FASTA format, then select the corresponding species database and the BLAST type to run the search. Hits are listed in the result table. Users can browse detailed information on the hit genes by clicking on the hit IDs.
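Before submitting a query, it can help to confirm that the input is well-formed FASTA and whether it looks like nucleotide or protein sequence, since that determines which BLAST type is appropriate. The Python sketch below is a minimal local check under stated assumptions: the function names and the 90% threshold are hypothetical, and it does not reflect how the CAT website validates its input.

```python
NUCLEOTIDE_CODES = set("ACGTUN")


def parse_fasta(text):
    """Parse FASTA text into {identifier: sequence}; raise ValueError if malformed."""
    records = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            fields = line[1:].split()
            if not fields:
                raise ValueError("FASTA header has no identifier")
            current = fields[0]  # identifier is the first word of the header
            if current in records:
                raise ValueError(f"duplicate identifier: {current}")
            records[current] = []
        elif current is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            records[current].append(line.upper())
    return {name: "".join(chunks) for name, chunks in records.items()}


def looks_like_nucleotide(seq, threshold=0.9):
    """Heuristic: True if at least `threshold` of the residues are A, C, G, T, U or N."""
    if not seq:
        return False
    hits = sum(base in NUCLEOTIDE_CODES for base in seq)
    return hits / len(seq) >= threshold


if __name__ == "__main__":
    query = ">q1 example\nACGTACGT\nacgtnn\n>q2\nMKVLAAGIW\n"
    for name, seq in parse_fasta(query).items():
        kind = "nucleotide" if looks_like_nucleotide(seq) else "protein"
        print(name, len(seq), kind)
```

The heuristic is deliberately simple: a short peptide rich in A, C, G and T residues would be misclassified, so the final choice of BLAST type should remain with the user.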
Carcinology benefits both basic science and the aquaculture industry. We have here generated a platform (CAT) hosting 71 new transcriptomes from six species of decapod crustaceans and a stomatopod. CAT is designed to facilitate research on this important branch of life, and will continue to be updated to host more crustacean genomic resources in the future.
Availability of data and materials
The transcriptome data were deposited in NCBI under BioProject PRJNA562428.
BLAST: Basic Local Alignment Search Tool
CAT: Crustacean annotated transcriptome database
DEGs: Differentially expressed genes
rRNA: Ribosomal ribonucleic acid
Colbourne JK, et al. The ecoresponsive genome of Daphnia pulex. Science. 2011;331(6017):555–61.
Kenny NJ, et al. Genome sequence and experimental tractability of a new decapod shrimp model, Neocaridina denticulata. Marine Drugs. 2014;12(3):1419–37.
Song L, et al. Draft genome of the Chinese mitten crab, Eriocheir sinensis. Gigascience. 2016;5:5.
Kao D, et al. The genome of the crustacean Parhyale hawaiensis, a model for animal development, regeneration, immunity and lignocellulose digestion. Elife. 2016;16:5.
Gutekunst J, et al. Clonal genome evolution and rapid invasive spread of the marbled crayfish. Nat Ecol Evol. 2018;2(3):567–73.
Zhang X, et al. Penaeid shrimp genome provides insights into benthic adaptation and frequent molting. Nat Commun. 2019;10(1):356.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20.
Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, Adiconis X, Fan L, Raychowdhury R, Zeng Q, Chen Z, Mauceli E, Hacohen N, Gnirke A, Rhind N, di Palma F, Birren BW, Nusbaum C, Lindblad-Toh K, Friedman N, Regev A. Trinity: reconstructing a full-length transcriptome without a genome from RNA-Seq data. Nat Biotechnol. 2011;29(7):644.
Haas BJ, Papanicolaou A, Yassour M, Grabherr M, Blood PD, Bowden J, Couger MB, Eccles D, Li B, Lieber M, Macmanes MD, Ott M, Orvis J, Pochet N, Strozzi F, Weeks N, Westerman R, William T, Dewey CN, Henschel R, LeDuc RD, Friedman N, Regev A. De novo transcript sequence reconstruction from RNA-seq using the trinity platform for reference generation and analysis. Nat Protoc. 2013;8(8):1494.
Langmead B. Aligning short sequencing reads with bowtie. Curr Protoc Bioinformatics. 2010;32(1):11–7.
Haas, B., Papanicolaou, A. (2012). Transdecoder. http://transdecoder.github.io/.
Haas, B. J. (2015). Trinotate: transcriptome functional annotation and analysis. https://github.com/Trinotate/Trinotate.github.io/wiki
Powell D. Degust: visualize, explore and appreciate RNA-seq differential gene-expression data. In: COMBINE RNA-seq workshop; 2015.
The authors would like to thank David Wilmshurst, former academic editor at CUHK, for editing the English of the manuscript. The authors would also like to thank Clive Jones of James Cook University, Australia for his kind assistance in procuring the crayfish samples; Henry So of The Chinese University of Hong Kong for providing the cherry shrimp picture; and Tin-Yam Chan of National Taiwan Ocean University and Peter Ng of National University of Singapore for providing the other pictures in Fig. 1.
The work presented in this paper was supported by a grant from the Collaborative Research Fund (project no. C4042-14G), Research Grants Council, Hong Kong SAR, China. The funding bodies played no role in the design of the study and collection, analysis, and interpretation of data and in writing the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Cite this article
Nong, W., Chai, Z.Y.H., Jiang, X. et al. A crustacean annotated transcriptome (CAT) database. BMC Genomics 21, 32 (2020). https://doi.org/10.1186/s12864-019-6433-3