BioBarcode: a general DNA barcoding database and server platform for Asian biodiversity resources
© Lim et al; licensee BioMed Central Ltd. 2009
Published: 3 December 2009
DNA barcoding provides a rapid, accurate, and standardized method for species-level identification using short DNA sequences. Such a standardized identification method is useful for mapping all the species on Earth, particularly when DNA sequencing technology is cheaply available. Many Asian nations have rich biodiversity resources that need to be mapped and registered in databases.
We have built BioBarcode, a general-purpose DNA barcode database and server system, using open source software: MySQL RDBMS 5.0, BLAST2, and the Apache httpd server. An exemplary BioBarcode database holds around 11,300 specimen entries (including GenBank data) and registers biological species to map their genetic relationships. The BioBarcode database includes a chromatogram viewer, which improves performance in DNA sequence analyses.
Asia has a very high degree of biodiversity, and the BioBarcode database server system aims to provide an efficient bioinformatics protocol that can be freely used by Asian researchers and research organizations interested in DNA barcoding. BioBarcode promotes the rapid acquisition of species DNA sequence data that meet global standards by providing specialized services. It also provides tools for the standardization, deposition, management, and analysis of DNA barcode data that will make barcoding cheaper and faster for the biodiversity community. The system can be downloaded upon request, and an exemplary server has been constructed as the basis for an Asian biodiversity system (http://www.asianbarcode.org).
DNA barcoding is the standardized minimal approach to facilitate biodiversity studies that include species identification and discovery. It helps researchers to understand evolutionary and genetic relationships by assembling molecular, morphological, and distributional data [1]. Species-level identification through DNA barcoding is usually accomplished by retrieving a short DNA sequence from a standard part of the genome (e.g., a 650-base fragment of the 5' end of the mitochondrial cytochrome c oxidase I (COI) gene for animal species) from the specimen under investigation [2]. The barcode sequence from each unknown specimen is then compared with a library of reference barcode sequences derived from individuals of known identity [3].
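As a rough illustration of the comparison step described above, the following Python sketch scores a query barcode against a small reference library by percent identity and reports the closest reference. It is a toy stand-in for a real alignment search and not the method BioBarcode uses; the function names, species labels, and 12-base sequences are hypothetical, and the sequences are assumed to be pre-aligned to the same length.

```python
from __future__ import annotations


def percent_identity(a: str, b: str) -> float:
    """Percent of matching positions between two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a.upper(), b.upper()) if x == y)
    return 100.0 * matches / len(a)


def best_match(query: str, library: dict[str, str]) -> tuple[str, float]:
    """Return (species, identity) for the reference barcode closest to the query."""
    return max(
        ((name, percent_identity(query, ref)) for name, ref in library.items()),
        key=lambda hit: hit[1],
    )


# Hypothetical 12-base fragments; real COI barcodes are about 650 bases long.
REFERENCE_LIBRARY = {
    "Species A": "ATGGCATTCCTA",
    "Species B": "ATGGCGTTTCTG",
}

if __name__ == "__main__":
    species, identity = best_match("ATGGCATTCCTG", REFERENCE_LIBRARY)
    print(f"closest reference: {species} ({identity:.1f}% identity)")
```

In practice the query and references are not pre-aligned, which is why barcode servers rely on alignment tools such as BLAST rather than a position-by-position comparison.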
The Consortium for the Barcode of Life (CBOL), which was launched in May 2004 and now includes more than 170 member organizations from 50 countries, is promoting DNA barcoding sensu stricto as the global standard for biological identification [4]. In contrast with the limited supply of taxonomic expertise, the need to assign specimens to known species arises every day and everywhere. Using molecular biomarkers, nonspecialists can assign specimens to known species, even specimens that can confound specialists (e.g., eggs, larvae, incomplete adults). Barcoding can therefore free taxonomists from the routine identification task of documenting new species. Next-generation DNA sequencing systems [5, 6] will enable the rapid production of barcodes, thus eventually promoting the assignment of unknown individuals to classified species.
DNA barcoding sensu lato has reached out actively to research areas beyond taxonomy, such as forensic science [7, 8], the biotechnology and food industries, and animal diet [9, 10]. Ecologists, environmental scientists, agricultural inspectors, public health officials, and other potential users who need to identify specimens are exploring barcoding as a new approach to applied problems [11]. Taxon identification with diagnostic single-nucleotide polymorphisms (SNPs) and biodiversity assessment from environmental samples (e.g., soil and water) can also be considered DNA barcoding sensu lato [12].
The DNA barcoding pilot projects cover several large groups of animals such as birds [13], fish [14], cowries [15], spiders [16], amphibians [17], and several arrays of Lepidoptera [18–20]. In addition, DNA barcoding systems are now being established for other groups of organisms, including plants [21], macroalgae [22], fungi [23], protists [24], and bacteria [25]. The barcoding projects of the Korea Barcode of Life (KBOL), launched in April 2007, are currently collecting barcode data for vertebrates [26], invertebrates, land tracheophytes, and lower plants.
We built a DNA barcoding database and web server system, BioBarcode, which was developed as part of the KBOL project, to provide a reusable barcode construction system for more specific projects. In other words, BioBarcode is a bioinformatics template or platform rather than a specific DNA barcode server. It is intended for biologists who have specific species information and want to build a DNA barcode database and server. It supports the compilation, storage, analysis, and publication of high-quality DNA barcode records. For many experimental biologists, building a local DNA barcoding system is expensive and time consuming. BioBarcode will therefore be useful for providing the tools needed to launch successful barcoding projects in the Asian biodiversity research community, including software for data management and analysis, data standards, and a data repository. To establish a data standard, we have adopted the guidelines from CBOL and from GenBank at the National Center for Biotechnology Information (NCBI) that must be satisfied for records to gain formal barcode status. Furthermore, BioBarcode can be used to promote international collaboration on building an Asian biodiversity system that aims to be the Asian biodiversity database server. Here, we introduce an exemplary web system using BioBarcode.
System architecture and scheme
Tables used in the BioBarcode database
- Postal address and ZIP code (geographical information)
- Entry (general information of specimens)
- Attached chromatogram file
- Attached image file
- Registered member management
- Bulletin board system for news
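The table list above gives only what each table holds, not its schema. A minimal sketch of how such a layout might look is shown below. It uses Python's built-in sqlite3 module as a stand-in for MySQL, and every table name, column name, and foreign key is an assumption made for illustration rather than the actual BioBarcode schema.

```python
from __future__ import annotations

import sqlite3

# All table and column names below are hypothetical; the source lists only
# what each table holds, not its schema. SQLite stands in for MySQL here.
SCHEMA = """
CREATE TABLE address (
    id INTEGER PRIMARY KEY,
    postal_address TEXT NOT NULL,
    zip_code TEXT
);
CREATE TABLE member (
    id INTEGER PRIMARY KEY,
    login TEXT NOT NULL UNIQUE
);
CREATE TABLE entry (
    id INTEGER PRIMARY KEY,
    barcode_id TEXT NOT NULL UNIQUE,
    scientific_name TEXT NOT NULL,
    address_id INTEGER REFERENCES address(id),
    member_id INTEGER REFERENCES member(id)
);
CREATE TABLE chromatogram (
    id INTEGER PRIMARY KEY,
    entry_id INTEGER NOT NULL REFERENCES entry(id),
    file_name TEXT NOT NULL
);
CREATE TABLE image (
    id INTEGER PRIMARY KEY,
    entry_id INTEGER NOT NULL REFERENCES entry(id),
    file_name TEXT NOT NULL
);
CREATE TABLE news (
    id INTEGER PRIMARY KEY,
    title TEXT NOT NULL,
    body TEXT
);
"""


def create_database(path: str = ":memory:") -> sqlite3.Connection:
    """Create the illustrative tables in a new SQLite database."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def table_names(conn: sqlite3.Connection) -> list[str]:
    """List the user tables in alphabetical order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]
```

Keeping chromatogram and image files in their own tables, keyed by entry, lets one specimen entry carry several trace files and photographs.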
Data uploads and repository
Anyone can create one or more projects by registering as a BioBarcode user through a short online form (http://www.asianbarcode.org/register.php). Whereas data upload to the Barcode of Life Data Systems (BOLD, http://www.barcodinglife.org) is carried out in two parts, specimen and sequence, data are uploaded to the BioBarcode system in three parts: specimen, taxonomy, and sequence.
Data collection and validation
BioBarcode provides users with a pathway for the direct submission of their data, which can include information related to specimens, sequences, trace files, and images. If submitted information needs updating, an edit function is directly accessible from the sequence and specimen pages. Any registered user who creates a project acquires 'project manager' status for it. Projects are subject to optional security measures, and all data records remain private to a single researcher or to a group of collaborators until they opt for public release.
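The access rule described above (records stay private to a researcher or a group of collaborators until public release) can be sketched as follows. This is a minimal model under stated assumptions, not BioBarcode's implementation: the class, field, and method names are hypothetical, and restricting the release step to the project manager is an assumption the source does not confirm.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Project:
    """Hypothetical model of a project and its visibility state."""

    title: str
    manager: str
    collaborators: set[str] = field(default_factory=set)
    public: bool = False  # records are private until someone opts for release

    def can_view(self, user: Optional[str]) -> bool:
        """Public projects are open to all; private ones only to the group."""
        if self.public:
            return True
        return user is not None and (
            user == self.manager or user in self.collaborators
        )

    def release(self, user: str) -> None:
        """Make the project public (assumed here to be a manager-only action)."""
        if user != self.manager:
            raise PermissionError("only the project manager can release data")
        self.public = True
```

For example, a project created by `kim` with collaborator `lee` is hidden from anonymous visitors until `kim` calls `release`.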
Results and discussion
Data access and searches
A data search can be performed by entering a DNA sequence or a keyword. Sequence-based retrieval runs against a database of publicly available mitochondrial COI gene data registered in GenBank and in the Korean barcode projects. The input sequence is searched with the MEGABLAST program. The search results are shown in order of sequence similarity, sequence length, and gap opening. Users can filter the results by E-value (default is 10E-5). The same search page includes a keyword-based retrieval section. Users can choose an open or a closed project type, and the barcode ID, sample ID, collector, or scientific name can then be used as a search parameter. Because the service was historically provided in Korea, a Korean name can also be used as a search parameter. The results from both methods link directly to the entry containing the general information of the specimen.
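The filtering and ordering of hits described above can be sketched in Python. The sketch assumes the search output is in the standard 12-column BLAST tabular format (query ID, subject ID, percent identity, alignment length, mismatches, gap openings, query start and end, subject start and end, E-value, bit score). The sample rows, identifiers, and function names are hypothetical, and the sort directions (higher identity first, then longer alignments, then fewer gap openings) are an assumed reading of the ordering stated in the text. The E-value cutoff is passed explicitly rather than hard-coded.

```python
from __future__ import annotations

from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    identity: float   # percent identity
    length: int       # alignment length
    gap_opens: int    # number of gap openings
    evalue: float


def parse_tabular(text: str) -> list[Hit]:
    """Parse 12-column tab-separated BLAST output, skipping blanks and comments."""
    hits = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        f = line.split("\t")
        hits.append(Hit(f[1], float(f[2]), int(f[3]), int(f[5]), float(f[10])))
    return hits


def rank_hits(hits: list[Hit], max_evalue: float) -> list[Hit]:
    """Drop hits above the E-value cutoff, then sort by identity, length, gaps."""
    kept = [h for h in hits if h.evalue <= max_evalue]
    return sorted(kept, key=lambda h: (-h.identity, -h.length, h.gap_opens))


# Hypothetical sample rows for illustration only.
SAMPLE = (
    "q1\tref_B\t99.50\t640\t3\t1\t1\t640\t1\t640\t0.0\t1150\n"
    "q1\tref_C\t85.00\t40\t6\t0\t1\t40\t1\t40\t0.5\t30\n"
    "q1\tref_A\t99.50\t650\t3\t0\t1\t650\t1\t650\t0.0\t1200\n"
)

if __name__ == "__main__":
    for hit in rank_hits(parse_tabular(SAMPLE), max_evalue=1e-5):
        print(hit.subject_id, hit.identity, hit.length, hit.gap_opens, hit.evalue)
```

Sorting on a tuple key keeps the three criteria in one place, so changing their priority only requires reordering the tuple.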
Through the 'Lineage' page of the main menu section (1 of Figure 6), users can view taxonomy information based on input data that requires authorization. Personal information and registered project lists are available when logged in (2 of Figure 6). The 'Category' registered by the administrator is linked with the 'Project' (3 of Figure 6). Statistics on registered species, specimens, sequences, projects, and trace files can be viewed on the 'Statistics' page (4 of Figure 6). The 'Project' page (6 of Figure 6) consists of six sections: project title, general overview (e.g., number of species, specimens, and sequencing reactions, and the project start and end dates), geographical distribution, statistics, member list, and entries (e.g., barcode ID, species, and sequence with trace data).
Mobile interface programs, such as Yahoo's Blueprint interface (http://mobile.yahoo.com/developers), have recently become available. We plan to implement a mobile application interface so that researchers can easily and rapidly deposit and monitor data in real time for sampling locations, collections, and observations. Another major issue is support for the various Asian languages in the BioBarcode DB/Server construction system. We will include a language pack in the next version of BioBarcode.
We have presented BioBarcode, a DNA barcode server and database construction system, together with an exemplary server. It aims to provide a platform for biological researchers who want to establish their own DNA barcode database and web server system that is compatible with international standards and meets the criteria of the International Nucleotide Sequence Database Collaboration (INSDC, which includes GenBank, the European Molecular Biology Laboratory, and the DNA Data Bank of Japan). BioBarcode is targeted at Asian researchers who have many local biodiversity resources. It is intended to be easy to run and maintain with inexpensive open source software.
Other papers from the meeting have been published as part of BMC Bioinformatics Volume 10 Supplement 15, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Bioinformatics, available online at http://www.biomedcentral.com/1471-2105/10?issue=S15.
The BioBarcode system was developed with funding from the Korean Research Institute of Bioscience and Biotechnology (KRIBB) Research Initiative Program, the Eco-Technopia 21 Project from the Ministry of Environment of Korea (052-071-051), and the Ministry of Education, Science and Technology of Korea (20090080140, 20090080150 and M10869030001-08N6903-00110).
This article has been published as part of BMC Genomics Volume 10 Supplement 3, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/10?issue=S3.
- Savolainen V, Cowan RS, Vogler AP, Roderick GK, Lane R: Towards writing the encyclopedia of life: an introduction to DNA barcoding. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences. 2005, 360: 1805-1811. 10.1098/rstb.2005.1730.
- Hebert PDN, Cywinska A, Ball SL, deWaard JR: Biological identifications through DNA barcodes. Proc Biol Sci. 2003, 270: 313-321. 10.1098/rspb.2002.2218.
- Hajibabaei M, Singer GAC, Hebert PDN, Hickey DA: DNA barcoding: how it complements taxonomy, molecular phylogenetics and population genetics. Trends in Genetics. 2007, 23 (4): 167-172. 10.1016/j.tig.2007.02.001.
- Miller SE: Proposed standards for BARCODE records in INSDC (BRIs). In request document for continuation of support by the Alfred P. Sloan Foundation submitted by the Smithsonian Institution on behalf of Consortium for the barcode of Life: 22 January 2006. Edited by: Robert H. 2005, 36-38.
- Hudson ME: Sequencing breakthroughs for genomic ecology and evolutionary biology. Mol Ecol Res. 2008, 8: 3-17. 10.1111/j.1471-8286.2007.02019.x.
- Schuster SC: Next-generation sequencing transforms today's biology. Nat Methods. 2008, 5: 16-18. 10.1038/nmeth1156.
- Teletchea F, et al: Molecular identification of vertebrate species by oligonucleotide microarray in food and forensic samples. J Appl Ecol. 2008, 45: 967-975. 10.1111/j.1365-2664.2007.01415.x.
- Birstein VJ, et al: Polyphyly of mtDNA lineages in the Russian sturgeon, Acipenser gueldenstaedtii: forensic and evolutionary implications. Conserv Genet. 2000, 1: 81-88. 10.1023/A:1010141906100.
- Marrero P, et al: Diet of the endemic Madeira Laurel Pigeon Columba trocaz in agricultural and forest areas: implications for conservation. Bird Conserv Int. 2004, 14: 165-172. 10.1017/S0959270904000218.
- Cristobal-Azkarate J, Arroyo-Rodriguez V: Diet and activity pattern of howler monkeys (Alouatta palliata) in Los Tuxtlas, Mexico: effects of habitat fragmentation and implications for conservation. Am J Primatol. 2007, 69: 1013-1029. 10.1002/ajp.20420.
- Valentini A, Pompanon F, Taberlet P: DNA barcoding for ecologists. Trends in Ecology and Evolution. 2008, 24 (2): 110-117. 10.1016/j.tree.2008.09.011.
- Kristensen R, Berdal KG, Holst-Jensen A: Simultaneous detection and identification of trichothecene- and moniliformin-producing Fusarium species based on multiplex SNP analysis. J Appl Microbiol. 2006, 102 (4): 1071-1081.
- Hebert PDN, et al: Identification of birds through DNA barcodes. PLoS Biol. 2004, 2: e312. 10.1371/journal.pbio.0020312.
- Ward RD, et al: DNA barcoding Australia's fish species. Philos Trans R Soc Lond B Biol Sci. 2005, 360: 1847-1857. 10.1098/rstb.2005.1716.
- Meyer CP, Paulay G: DNA barcoding: error rates based on comprehensive sampling. PLoS Biol. 2005, 3: e422. 10.1371/journal.pbio.0030422.
- Barrett RDH, Hebert PDN: Identifying spiders through DNA barcodes. Can J Zool. 2005, 83: 481-491. 10.1139/z05-024.
- Vences M, Thomas M, Meijden Van der A, Chiari Y, Vieites DR: Comparative performance of the 16S rRNA gene in DNA barcoding of amphibians. Frontiers in Zoology. 2005, 2: 5. 10.1186/1742-9994-2-5.
- Hebert PDN, et al: Ten species in one: DNA barcoding reveals cryptic species in the neotropical skipper butterfly Astraptes fulgerator. Proc Natl Acad Sci USA. 2004, 101: 14812-14817. 10.1073/pnas.0406166101.
- Hajibabaei M, et al: DNA barcodes distinguish species of tropical Lepidoptera. Proc Natl Acad Sci USA. 2006, 103: 968-971. 10.1073/pnas.0510466103.
- Janzen DH, et al: Wedding biodiversity inventory of a large and complex Lepidoptera fauna with DNA barcoding. Philos Trans R Soc Lond B Biol Sci. 2005, 360: 1835-1845. 10.1098/rstb.2005.1715.
- Kress WJ, et al: Use of DNA barcodes to identify flowering plants. Proc Natl Acad Sci USA. 2005, 102: 8369-8374. 10.1073/pnas.0503123102.
- Saunders GW: Applying DNA barcoding to red macroalgae: a preliminary appraisal holds promise for future applications. Philos Trans R Soc Lond B Biol Sci. 2005, 360: 1879-1888. 10.1098/rstb.2005.1719.
- Summerbell RC, et al: Microcoding: the second step in DNA barcoding. Philos Trans R Soc Lond B Biol Sci. 2005, 360: 1897-1903. 10.1098/rstb.2005.1721.
- Scicluna SM, et al: DNA barcoding of blastocystis. Protist. 2006, 157: 77-85. 10.1016/j.protis.2005.12.001.
- Sogin ML, et al: Microbial diversity in the deep sea and the underexplored 'rare biosphere'. Proc Natl Acad Sci USA. 2006, 103: 12115-12120. 10.1073/pnas.0605127103.
- Yoo HS, Eah JY, Kim JS, Kim YJ, Min MS, Paek WK, Lee H, Kim CB: DNA Barcoding Korean Birds. Mol Cells. 2006, 22 (3): 323-327.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.