Automatic DNA Diagnosis for 1D Gel Electrophoresis Images using Bio-image Processing Technique
© Intarapanich et al. 2015
Published: 9 December 2015
DNA gel electrophoresis is a molecular biology technique for separating DNA fragments of different sizes. Applications of DNA gel electrophoresis include DNA fingerprinting (genetic diagnosis), size estimation of DNA, and DNA separation for Southern blotting. Accurate interpretation of DNA banding patterns from electrophoretic images can be laborious and error-prone when a large number of bands are interrogated manually. Although many bio-imaging techniques have been proposed, none of them can fully automate the typing of DNA owing to the complexities of the migration patterns typically obtained.
We developed an image-processing tool that automatically calls genotypes from DNA gel electrophoresis images. The image processing workflow comprises three main steps: 1) lane segmentation, 2) extraction of DNA bands and 3) band genotyping classification. The tool was originally intended to facilitate large-scale genotyping analysis of sugarcane cultivars. We tested the proposed tool on 10 gel images (433 cultivars) obtained from polyacrylamide gel electrophoresis (PAGE) of PCR amplicons for detecting intron length polymorphisms (ILP) on one locus of the sugarcanes. These gel images demonstrated many challenges in automated lane/band segmentation in image processing including lane distortion, band deformity, high degree of noise in the background, and bands that are very close together (doublets). Using the proposed bio-imaging workflow, lanes and DNA bands contained within are properly segmented, even for adjacent bands with aberrant migration that cannot be separated by conventional techniques. The software, called GELect, automatically performs genotype calling on each lane by comparing with an all-banding reference, which was created by clustering the existing bands into the non-redundant set of reference bands. The automated genotype calling results were verified by independent manual typing by molecular biologists.
This work presents GELect, an automated genotyping tool for DNA gel electrophoresis images, which was written in Java and made available through the ImageJ framework. With a novel automated image processing workflow, the tool can accurately segment lanes from a gel matrix and extract distorted and even doublet bands that are difficult to identify with existing image processing tools. Consequently, genotyping from DNA gel electrophoresis can be performed automatically, allowing users to efficiently conduct large-scale DNA fingerprinting via DNA gel electrophoresis. The software is freely available from http://www.biotec.or.th/gi/tools/gelect.
DNA gel electrophoresis (GE) is a method for separating DNA molecules by size. This technology has a wide range of applications, including size estimation of DNA molecules, analysis of PCR amplicons or genotyping, and separation of genomic DNA before Southern analysis. To perform genetic diagnosis, target DNA sequences are amplified by polymerase chain reaction (PCR). The resulting PCR products (amplicons) are loaded into wells located at the top of the gel matrix, which define the lanes along which DNA molecules migrate through the gel medium. At the end of electrophoresis, DNA molecules of different sizes appear as bands in each lane. These bands can be visualized by DNA stains such as ethidium bromide (agarose gel) or silver nitrate (polyacrylamide gel). A densitometer is commonly used to capture the band images from the gel slab. Manual interpretation of banding patterns can be very laborious and inaccurate. Performing large-scale DNA fingerprinting or genotyping thus requires an automated workflow for analysis.
Many image processing techniques have been proposed to address the two main steps in GE analysis, namely lane and band detection. The accuracy of these steps is often compromised by technical variation inherent to GE. This variation includes distortion, i.e. lane or band curvature, which affects automatic lane segmentation, and sub-optimal gel image exposure, which affects band detection performance. Caridade et al. presented a technique to extract DNA bands by converting an input image to gray scale and using the column histogram method to detect lanes. To detect DNA bands, they proposed a heuristic to match a given band to a reference band. The band quantification accuracy of this technique varies considerably among GE images. Bajla et al. proposed a technique to deal with image distortion by letting users adjust a Gaussian deconvolution parameter so that band positions can be easily detected. Kaabouch et al. attempted to improve the band detection process by first enhancing the quality of a gel image using their proposed automatic thresholding technique. Lee et al. presented another automated gel electrophoresis analysis system that uses an enhanced fuzzy c-means algorithm and a Gaussian function for lane segmentation. In their workflow, the bands were identified by tracing the segmented lanes, and detection accuracy was enhanced by a procedure that eliminates repetitive bands. The Dynamic Time Warping (DTW) method was introduced by Skutkova et al. to increase band detection sensitivity by cross-adjusting positions of the same bands from different lanes. A recent report by Tseng and Lee claimed that none of the previously presented techniques can fully automate the band detection process. They offered new heuristics that can adjust for geometric distortion of lanes (slanted lanes) and increase the sensitivity of band identification by taking the first derivative of the band gray-level. Doublet bands (two bands that are very close together in a lane) can be extracted with high accuracy by this method.
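The first-derivative idea mentioned above can be illustrated on a one-dimensional lane intensity profile. The following is only a minimal sketch, not the heuristic of Tseng and Lee or the GELect implementation: it assumes bright bands on a dark background, and the function name `detect_bands` and the `min_height` parameter are hypothetical.

```python
def detect_bands(profile, min_height=10):
    """Return indices of band centres in a 1D lane intensity profile.

    Assumption (illustrative only): a band centre is a point where the
    first difference of the profile changes from positive to non-positive
    and the intensity there is at least `min_height`.
    """
    # Discrete first derivative: derivative[i] = profile[i + 1] - profile[i]
    derivative = [profile[i + 1] - profile[i] for i in range(len(profile) - 1)]
    bands = []
    for i in range(1, len(derivative)):
        rising_before = derivative[i - 1] > 0
        not_rising_after = derivative[i] <= 0
        if rising_before and not_rising_after and profile[i] >= min_height:
            bands.append(i)
    return bands


# A doublet: two peaks separated by a shallow dip that never reaches background.
doublet_profile = [0, 0, 5, 20, 40, 22, 18, 25, 45, 20, 5, 0, 0]
doublet_bands = detect_bands(doublet_profile)
```

Because the sign change of the derivative is used rather than a global intensity threshold, the two peaks of the doublet are reported separately even though the profile between them stays well above background. Real gel profiles would normally need smoothing first, which this sketch omits.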
Although most research efforts claimed to have an automated band extraction system, none of them offers practical software that can be used to carry out the underlying task. Tseng and Lee established the theoretical platform of image processing techniques that could be implemented as an automated tool. Several commercial software tools such as GelQuant, QuantiScan, Gel-Pro Analyzer and GelCompar [11–14] offer a partial image processing solution with limited features. The review article by Heras et al. surveys DNA fingerprinting tools, including Gel Plugin ImageJ, GelAnalyzer, GelClust, GelQuant.NET, Image, LaneRuler and PyElph. Several of these free tools, however, either have limited functionality (GelQuant.NET has no lane detection module) or can no longer be used owing to outdated software dependencies (the Image software by Sanger, and LaneRuler). Moreover, the lane analysis available in Gel Plugin ImageJ does not include automatic lane detection. The most recently published tool, GelJ, is a comprehensive tool incorporating many of the DNA fingerprinting features available in other tools.
The performance of these image processing tools depends largely on the ability to detect lanes correctly. Most tools assume that lanes are parallel lines. However, uneven heating or buffer degradation during electrophoresis can often create migration artifacts that lead to lanes that are not straight. The most recent algorithm, described by Tseng and Lee, addresses this issue by correcting for geometric distortion: a box with slanted sides is created automatically over the lane. This method can correct for minor lane aberrations. However, we found that this method often fails when lanes are highly curved. We propose a novel image processing tool for gel electrophoresis, called GELect, that can automatically perform the analysis of large-scale DNA fingerprinting. In particular, a novel lane segmentation algorithm is incorporated for accurately assigning bands to lanes, even when the lanes are highly curved. Moreover, GELect also offers a genotyping feature that groups identical banding patterns together. We used images obtained from DNA fingerprinting of sugarcane DNA samples to test GELect. To demonstrate its performance relative to existing tools, we compare GELect with free software, namely PyElph, GelJ, GelClust and GelAnalyzer, in terms of the ability to detect and correct for curved lanes. GELect was implemented in Java and packaged as an ImageJ library so that the tool can be easily used as well as further improved by other developers.
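The straight-lane assumption discussed above is commonly realized as a column projection (the "column histogram" idea). The sketch below is an assumed, simplified illustration and not the code of any tool named in this article; it treats the image as a list of rows with bright bands on a dark background, and `detect_lanes` and `threshold` are hypothetical names.

```python
def detect_lanes(image, threshold):
    """Find lanes as runs of consecutive columns whose summed intensity
    exceeds `threshold`.

    Assumption (illustrative only): lanes are vertical and straight, so
    every band of a lane falls into the same range of columns.
    Returns a list of (first_column, last_column) tuples.
    """
    n_cols = len(image[0])
    # Column projection: total intensity of each column over all rows.
    column_sums = [sum(row[c] for row in image) for c in range(n_cols)]

    lanes = []
    start = None
    for c, total in enumerate(column_sums):
        if total > threshold and start is None:
            start = c                      # a lane begins
        elif total <= threshold and start is not None:
            lanes.append((start, c - 1))   # the lane ended at the previous column
            start = None
    if start is not None:                  # lane touching the right image edge
        lanes.append((start, n_cols - 1))
    return lanes


# Two straight lanes: columns 1-2 and columns 5-6.
straight_gel = [
    [0, 9, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 8, 8, 0],
    [0, 9, 9, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 8, 8, 0],
]
straight_lanes = detect_lanes(straight_gel, threshold=5)
```

When a lane bends, its bands spread over more columns, so the projection of one lane widens and can merge with that of its neighbour, which is why this kind of approach struggles on curved lanes.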
Results and Discussion
We tested the performance of the proposed system in two aspects, lane segmentation and band extraction performance. Ten PAGE images with 433 samples (lanes) were tested on both aspects. We examined how well the proposed system is able to separate distorted lanes. After performing lane separation, each lane was further analyzed to detect DNA bands.
Comparison of different gel analysis tools. Features compared: curved lane detection and band smiling effect correction.
To demonstrate the need for curved lane detection, we also compared GELect with PyElph, GelJ, GelClust, and GelAnalyzer in terms of their ability to segment curved lanes (Table 1 and Additional File 1). GelJ allows users to manually draw polygons to select the lanes. However, we did not test this function as we were only interested in comparing the automatic features of each algorithm. Of these tools, only GELect can automatically detect curved lanes. The other tools assume that a lane can only be bounded by two parallel lines.
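One generic way to relax the parallel-line assumption is to follow a lane strip by strip instead of projecting the whole image at once. The sketch below illustrates that general idea only; it is not the GELect lane segmentation algorithm, whose details are not given in this section. The function `trace_lane` and its parameters `start_col`, `half_width` and `strip_height` are hypothetical.

```python
def trace_lane(image, start_col, half_width=2, strip_height=2):
    """Follow a possibly curved lane from top to bottom.

    Assumption (illustrative only): the image is a list of rows with
    bright bands on a dark background. The image is cut into horizontal
    strips; in each strip the lane centre is re-estimated as the
    intensity-weighted mean column inside a window around the previous
    centre. Returns one centre column per strip.
    """
    n_cols = len(image[0])
    centre = float(start_col)
    centres = []
    for top in range(0, len(image), strip_height):
        strip = image[top:top + strip_height]
        lo = max(0, int(round(centre)) - half_width)
        hi = min(n_cols - 1, int(round(centre)) + half_width)
        columns = range(lo, hi + 1)
        weights = [sum(row[c] for row in strip) for c in columns]
        total = sum(weights)
        if total > 0:
            # Weighted centroid; if the strip is empty, keep the old centre.
            centre = sum(w * c for w, c in zip(weights, columns)) / total
        centres.append(centre)
    return centres


# A lane drifting to the right: column 2, then 3, then 4.
curved_gel = [
    [0, 0, 10, 0, 0, 0, 0, 0],
    [0, 0, 10, 0, 0, 0, 0, 0],
    [0, 0, 0, 10, 0, 0, 0, 0],
    [0, 0, 0, 10, 0, 0, 0, 0],
    [0, 0, 0, 0, 10, 0, 0, 0],
    [0, 0, 0, 0, 10, 0, 0, 0],
]
lane_path = trace_lane(curved_gel, start_col=2)
```

The returned path gives a per-strip lane centre, so bands can be assigned to a lane by their distance to the path at their own row rather than to a single fixed column range.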
The GELect tool is a convenient program for DNA diagnosis from 1D gel electrophoresis images. The tool can efficiently segment lanes from gel electrophoresis images with curved lanes as well as poor image exposure. GELect can construct a band model by performing band registration against a reference band. Genotyping from DNA gel electrophoresis can therefore be done through the band classification technique.
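Band registration against a set of reference bands, followed by grouping of identical banding patterns, can be sketched as follows. This is a simplified illustration under stated assumptions and not the GELect implementation: band positions are taken as migration distances in pixels, the reference set is built with a simple gap-based 1D clustering chosen here for brevity, and all function names and the tolerance `tol` are hypothetical.

```python
from collections import defaultdict


def build_reference(lanes, tol):
    """Cluster all band positions into a non-redundant reference set.

    Assumption (illustrative only): sorted positions closer than `tol`
    to the previous position belong to the same reference band; each
    reference band is the mean of its cluster.
    """
    positions = sorted(p for bands in lanes.values() for p in bands)
    clusters = []
    for p in positions:
        if clusters and p - clusters[-1][-1] <= tol:
            clusters[-1].append(p)
        else:
            clusters.append([p])
    return [sum(c) / len(c) for c in clusters]


def call_genotype(bands, reference, tol):
    """Presence (1) or absence (0) of each reference band in one lane."""
    return tuple(int(any(abs(b - r) <= tol for b in bands)) for r in reference)


def group_by_genotype(lanes, tol=3):
    """Return the reference bands and the lanes grouped by banding pattern."""
    reference = build_reference(lanes, tol)
    groups = defaultdict(list)
    for name, bands in lanes.items():
        groups[call_genotype(bands, reference, tol)].append(name)
    return reference, dict(groups)


# Hypothetical band positions (pixels) for three lanes.
example_lanes = {"A": [100, 150], "B": [101, 200], "C": [99, 151]}
reference_bands, genotype_groups = group_by_genotype(example_lanes, tol=3)
```

In this toy input, lanes A and C differ by only a pixel at each band and therefore receive the same presence/absence pattern, while lane B forms its own group. The tolerance controls how much migration variation is absorbed into one reference band.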
Materials and methods
Genotyping of sugarcane cultivars
The lane segmentation results. Columns: dimension in pixels, number of lanes, accuracy in %. Dimensions of the ten gel images in pixels: 1884 × 524, 1955 × 524, 1871 × 524, 1911 × 546, 1810 × 718, 1810 × 718, 473 × 288, 234 × 500, 276 × 574, 276 × 399.
Overview of image processing workflow
Extraction of DNA bands
Automatic band genotyping
Availability of supporting data
The software instructions and the electrophoretic gel images used in this paper are available for download from our website, http://www4a.biotec.or.th/GI/tools/gelect.
We acknowledge the support from the Giga Impact Initiative project funded by the Cluster Program Management Office (CPMO), National Science and Technology Development Agency (NSTDA). PJS acknowledges support from the Thailand Research Fund (TRF) code: RSA5780007. SDT acknowledges the partial funding support from TRF code: RSA5880061. Finally, we would like to thank Mitr Phol Research for supplying the sugarcane DNA samples used in this work.
Publication charges of this work were funded by the National Science and Technology Development Agency (NSTDA).
This article has been published as part of BMC Genomics Volume 16 Supplement 12, 2015: Joint 26th Genome Informatics Workshop and 14th International Conference on Bioinformatics: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S12.
- Heidcamp WH: Electrophoresis-Introduction, Biology Department, Gustavus Adolphus College. [http://homepages.gac.edu/~cellab/chpts/chpt4/intro4.html]
- Day INM, Humphries SE: Electrophoresis for Genotyping: Microtiter Array Diagonal Gel Electrophoresis on Horizontal Polyacrylamide Gels, Hydrolink, or Agarose. Analytical Biochemistry. 1994, 222: 389-395.
- Brown TA: Southern Blotting and Related DNA Detection Techniques. Encyclopedia of Life Sciences. 2001, 1-6.
- Rio DC, Ares M, Hannon GJ, Nilsen TW: Purification of RNA using TRIzol (TRI reagent). 2010, Cold Spring Harbor Protocols.
- Caridade CMR, Marcal ARS, Mendonca T, Pessoa AM, Pereira S: An Automatic Method to Identify and Extract Information of DNA Bands in Gel Electrophoresis Images. 31st Annual International Conference of the IEEE EMBS, Minneapolis, Minnesota, USA. 2009, 1024-1027.
- Bajla I, Hollander I, Burg K: Improvement of Electrophoretic Gel Image Analysis. Measurement Science Review. 2001, 1: 5-10.
- Kaabouch N, Schultz RR, Balakrishnan L: An Analysis System for DNA Gel Electrophoresis Images Based on Automatic Thresholding and Enhancement. IEEE EIT Proceeding. 2007, 26-31.
- Lee J, Huang C, Wang N, Lu C: Automatic DNA Sequencing for Electrophoresis Gels Using Image Processing Algorithm. J Biomedical Science and Engineering. 2011, 4: 523-528.
- Skutkova H, Vitek M, Krizkova S, Kizek R, Provaznik I: Preprocessing and Classification of Electrophoresis Gel Images Using Dynamics Time Warping. International Journal of Electrochemical Science. 2013, 8: 1609-1622.
- Tseng D, Lee Y: Automatic band detection on pulse-field gel electrophoresis images. Pattern Anal Applic. 2015, 18: 145-155.
- Image Quantitation and Protein, RNA & DNA Gel Quantitation. [http://biochemlabsolutions.com/GelQuantNET.html]
- QuantiScan. [http://www.biosoft.com/w/quantiscan.htm]
- Gel-Pro Analyzer. [http://www.mediacy.com/index.aspx?page=GelProOverview]
- Gel-Compar. [http://www.applied-maths.com/gelcompar-ii]
- Heras J, Domínguez C, Mata E, Pascual V, Lozano C, Torres C, Zarazaga M: A survey of tools for analysing DNA fingerprints. Briefings in Bioinformatics. 2015, bbv016.
- ImageJ team: Gel Quantification Analysis for ImageJ. 2014, (20 August 2015, date last accessed), [http://imagejdocu.tudor.lu/doku.php?id=video:analysis:gel_quantification_analysis]
- GelAnalyzer. [http://www.gelanalyzer.com]
- Khakabimamaghani S, Najafi A, Ranjbar R, Raam M: GelClust: a software tool for gel electrophoresis images analysis and dendrogram generation. Comput Methods Programs Biomed. 2013, 111: 512-8.
- BiochemLab Solutions: GelQuant.NET. [http://biochemlabsolutions.com/GelQuantNET.html]
- Image: The Fingerprint Image Analysis System. [https://www.sanger.ac.uk/resources/software/image/]
- Wong RTF, Filbotte S, Corbett R, Saeedi P, Jones SJM, Marra MA: LaneRuler: Automated Lane Tracking for DNA Electrophoresis Gel Images. IEEE Transaction on Automation Science and Engineering. 2010, 7: 706-8.
- Pavel A, Vasile C: PyElph-a software tool for gel images analysis and phylogenetics. BMC Bioinformatics. 2012, 13: 9.
- Heras J, Domínguez C, Mata E, Pascual V, Lozano C, Torres C, Zarazaga M: GelJ-a tool for analyzing DNA fingerprint gel images. BMC Bioinformatics. 2015, 16: 270.
- Mitr Phol Research Group: [http://www.mitrphol.com/index.php/en/business_unit/index.html]
- Phytosome database. [http://www.coptis.com]
- Ranka S: Clustering Part 4. Computer and Information Science and Engineering, University of Florida, GainesVille, [http://www.cise.ufl.edu/class/cis4930sp09dm/notes/dm5part4.pdf]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.