ProSeeK: A web server for MLPA probe design
© Pantano et al; licensee BioMed Central Ltd. 2008
Received: 04 June 2008
Accepted: 28 November 2008
Published: 28 November 2008
The technological evolution of platforms for detecting genome-wide copy number imbalances has allowed the discovery of an unexpected amount of human sequence that is variable in copy number among individuals. This type of human variation can make an important contribution to human diversity and disease susceptibility. Multiplex Ligation-dependent Probe Amplification (MLPA) is a targeted method to assess copy number differences for up to 40 genomic loci in a single experiment. Although specific MLPA assays can be ordered from MRC-Holland (the proprietor of the MLPA technology), custom designs are also developed in many laboratories worldwide. In our experience, an important drawback of custom MLPA assays is the time spent designing the specific oligonucleotides that are used as probes. Because a single assay includes a large number of probes, a number of restrictions need to be met in order to maximize specificity and to increase the likelihood of success.
We have developed a web tool to facilitate and optimise custom probe design for MLPA experiments. The algorithm requires only the target sequence in FASTA format and a set of parameters, provided by the user according to each specific MLPA assay, to identify the best probes within the given region.
To our knowledge, this is the first available tool for optimizing custom probe design for MLPA assays. The ease of use and speed of the algorithm dramatically reduce the turnaround time of probe design. ProSeeK will become a useful tool for all laboratories that currently use MLPA for CNV studies in their research projects.
The technological evolution of platforms for assessing genome-wide copy number imbalances has allowed the discovery of an unexpected amount of human sequence involved in duplications and deletions (termed copy number variants or CNVs). In terms of sequence coverage, this is the most important type of human variation identified so far and can make an important contribution to human diversity and disease susceptibility (see  for review). So far, derived from the study of several hundreds of individual genomes, ~19% of the euchromatic portion of the human genome has been reported as variable (mainly in copy number). Several studies have shown the relationship between CNVs and disease phenotypes [4, 5].
Although specific MLPA assays can be ordered from MRC-Holland (the proprietor of the MLPA technology), custom designs are also developed in many laboratories worldwide. In our experience, an important drawback of custom MLPA assays is the time spent designing the specific oligonucleotides. Because a single assay includes a large number of probes, a number of restrictions need to be met in order to maximize specificity and to increase the likelihood of success. Given the tedious stepwise procedure that is otherwise followed, the goal of ProSeeK is to automate the process of probe design and to obtain the best candidate probes for a given region.
ProSeeK is presented as an easy-to-use, point-and-click web interface. It is implemented as CGI (Common Gateway Interface) Perl scripts and made accessible to the user through PHP on top of an Apache server with MySQL database support. It is accessible over the Internet (at http://davinci.crg.es/estivill_lab/mlpa) with IE 5.0 and Netscape 7 or higher, from any platform. Because it relies on universally available web browsers, the system avoids portability problems, and no client-side software installation is required.
Input to server
ProSeeK requires the DNA sequence of the target region in which the MLPA probes will be designed. Several parameters can be used to restrict the probe design: (1) maximum GC content, (2) maximum melting temperature (Tm) of the hybridizing sequence, (3) BLAT e-value (the minimal length that BLAT will detect as a match), (4) length of the hybridizing sequences (HS), (5) stuffer sequence, (6) sequence of the universal primers that flank the HS, and (7) desired probe length (see Additional file 1).
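The role of the GC content, Tm, HS length and probe length parameters can be illustrated with a minimal sketch. It is not the ProSeeK implementation (which is written in Perl and is not reproduced in this article); all function and field names are hypothetical, the Tm estimate uses a generic GC-count approximation chosen here for simplicity, and the stuffer calculation assumes the usual MLPA layout in which the amplified product consists of the two universal primer sequences, the two hybridizing sequences and the stuffer.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    start: int         # 0-based offset in the target sequence
    sequence: str      # candidate hybridizing sequence (HS)
    gc_percent: float  # GC content in percent
    tm_celsius: float  # rough Tm estimate in degrees Celsius


def read_fasta(text: str) -> str:
    """Return the first FASTA record's sequence as one uppercase string."""
    seq_lines = []
    seen_header = False
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header or seq_lines:
                break  # stop at the second record
            seen_header = True
            continue
        seq_lines.append(line)
    return "".join(seq_lines).upper()


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in seq."""
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def estimate_tm(seq: str) -> float:
    """Rough GC-count Tm approximation (an assumption, not ProSeeK's formula)."""
    gc = seq.count("G") + seq.count("C")
    return 64.9 + 41.0 * (gc - 16.4) / len(seq)


def scan_candidates(target: str, hs_length: int,
                    max_gc: float, max_tm: float) -> list:
    """Slide a window of hs_length over target and keep windows under both limits."""
    hits = []
    for start in range(len(target) - hs_length + 1):
        window = target[start:start + hs_length]
        if set(window) - set("ACGT"):
            continue  # skip windows containing ambiguous bases such as N
        gc = gc_percent(window)
        tm = estimate_tm(window)
        if gc <= max_gc and tm <= max_tm:
            hits.append(Candidate(start, window, gc, tm))
    return hits


def stuffer_length_needed(desired_length: int, left_hs: str, right_hs: str,
                          forward_primer: str, reverse_primer: str) -> int:
    """Bases of stuffer required so the full probe reaches desired_length."""
    fixed = (len(forward_primer) + len(left_hs)
             + len(right_hs) + len(reverse_primer))
    if fixed > desired_length:
        raise ValueError("primers and hybridizing sequences exceed desired length")
    return desired_length - fixed
```

In this sketch, a call such as `scan_candidates(read_fasta(fasta_text), hs_length=25, max_gc=60.0, max_tm=75.0)` would return every 25-base window that satisfies both limits; the thresholds shown are placeholders, not ProSeeK defaults. Specificity filtering against the genome (the BLAT step) is deliberately left out.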
Output from server
After computing all available possibilities, ProSeeK produces a table in HTML format that presents the optimal probes to the user together with their characteristics: position within the user-entered sequence, genome mapping, GC content, melting temperature, probe sequence, nucleotide length, self-folding capacity (i.e. DNA secondary structure prediction using the DINAMelt Server), and links to the UCSC Genome Browser and to the Database of Genomic Variants. Projects are kept on the server for one month, so users can retrieve their results at any time during that period by returning to the website and identifying themselves on the initial web page (see Additional file 1).
A number of high-throughput technologies have become available to address the genome-wide detection of structural variations in humans. An important drawback of these new methods is that a large number of false positive results typically arise from the analysis, so it is mandatory to validate observations made with these technologies using alternative and more reliable approaches. Owing to its simplicity, robustness and relatively low price, MLPA is often used as a targeted method to assess copy-number differences. One important inconvenience is the time required to design the probe mixes that target the desired regions, since many restrictions must be met to obtain a sensitive, specific and reproducible experiment. To overcome this, we developed ProSeeK, which produces the optimal probes for the regions of interest. ProSeeK is, to our knowledge, the first algorithm for the design of MLPA probes; it saves time and improves the accuracy of MLPA assays.
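As an illustration of the kind of HTML report described above, the following sketch renders a list of probe records as a table ordered by position. The record fields and the function name are hypothetical and cover only a subset of the reported characteristics; the actual ProSeeK output, including its genome-mapping and external links, is not reproduced here.

```python
from html import escape

# Hypothetical column names; the real ProSeeK table reports more fields.
COLUMNS = ["position", "gc_percent", "tm_celsius", "length", "sequence"]


def render_probe_table(probes: list) -> str:
    """Return an HTML table with one row per probe, sorted by position."""
    header = "".join(f"<th>{escape(name)}</th>" for name in COLUMNS)
    rows = []
    for probe in sorted(probes, key=lambda p: p["position"]):
        cells = "".join(
            f"<td>{escape(str(probe[name]))}</td>" for name in COLUMNS
        )
        rows.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{header}</tr>{''.join(rows)}</table>"


# Example records with made-up values, for illustration only.
example_probes = [
    {"position": 120, "gc_percent": 48.0, "tm_celsius": 71.5,
     "length": 25, "sequence": "ACGTACGTACGTACGTACGTACGTA"},
    {"position": 15, "gc_percent": 52.0, "tm_celsius": 73.1,
     "length": 25, "sequence": "GGCATCGATCGATCGGATCCATGCA"},
]
report_html = render_probe_table(example_probes)
```

Escaping every cell with `html.escape` keeps user-supplied text, such as sequence names, from being interpreted as markup.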
Availability and requirements
This work has been supported by the Spanish Ministry of Science and Innovation under the NOVADIS project (SAF2008-00357), and by the European Commission under the Aneuploidy (037627) and ENGAGE (201413) projects.
- Carter NP: Methods and strategies for analyzing copy number variation using DNA microarrays. Nat Genet. 2007, 39: S16-S21. 10.1038/ng2028.
- Feuk L, Carson A, Scherer S: Structural variation in the human genome. Nat Rev Genet. 2006, 7: 85-97.
- Scherer S, Lee C, Birney E, Altshuler D, Eichler E, Carter N, Hurles M, Feuk L: Challenges and standards in integrating surveys of structural variation. Nat Genet. 2007, 39 (7 Suppl): S7-S15. 10.1038/ng2093.
- Feuk L, Marshall C, Wintle R, Scherer S: Structural variants: changing the landscape of chromosomes and design of diseases studies. Hum Mol Genet. 2006, 15: R57-R66. 10.1093/hmg/ddl057.
- McCarroll S, Altshuler D: Copy-number variation and association studies of human disease. Nat Genet. 2007, 39: S37-S42. 10.1038/ng2080.
- Schouten J, McElgunn C, Waaijer R, Zwijnenburg D, Diepvens F, Pals G: Relative quantification of 40 nucleic acid sequences by multiplex ligation-dependent probe amplification. Nucleic Acids Res. 2002, 30: e57. 10.1093/nar/gnf056.
- Kent W: BLAT – The BLAST-Like Alignment Tool. Genome Res. 2002, 12 (4): 656-664.
- Markham NR, Zuker M: DINAMelt web server for nucleic acid melting prediction. Nucleic Acids Res. 2005, 33: W577-W581. 10.1093/nar/gki591.
- Kent W, Sugnet C, Furey T, Roskin K, Pringle T, Zahler A, Haussler D: The human genome browser at UCSC. Genome Res. 2002, 12: 996-1006.
- Iafrate A, Feuk L, Rivera M, Listewnik M, Donahoe P, Qi Y, Scherer S, Lee C: Detection of large-scale variation in the human genome. Nat Genet. 2004, 36: 949-51. 10.1038/ng1416.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.