A simple and efficient method for isolating polymorphic microsatellites from cDNA

Background: Microsatellites in cDNA are useful as molecular markers because they represent transcribed genes and can be used as anchor markers for linkage and comparative mapping, as well as for studying genome evolution. Microsatellites in cDNA can be detected in existing ESTs by data mining. However, for most fish species, no ESTs are available or the number of ESTs is limited, although fishes represent half of the vertebrates on earth. We developed a simple and efficient method for isolating microsatellites from cDNA in fish.
Results: The method comprises normalization of 150 ng cDNA using 0.5 U duplex-specific nuclease (DSN) at 65°C for 30 min, enrichment of microsatellites using biotinylated oligonucleotides and a magnetic field, and directional cloning of cDNA into a vector. We tested this method by enriching CA- and GA-microsatellites from cDNA of Asian seabass, and demonstrated that enrichment of microsatellites from normalized cDNA could increase the efficiency of microsatellite isolation over 30 times compared with direct sequencing of clones from cDNA libraries. One hundred and thirty-nine (36.2%) out of 384 clones from normalized cDNA contained microsatellites. Unique microsatellite sequences accounted for 23.6% (91/384) of sequenced clones. Sixty microsatellites isolated from cDNA were characterized, and 41 were polymorphic. The average allele number of the 41 microsatellites was 4.85 ± 0.54, while the expected heterozygosity was 0.56 ± 0.03. All the isolated microsatellites were inherited in a Mendelian pattern.
Conclusion: Normalization of cDNA substantially increased the efficiency of enrichment of microsatellites from cDNA. The described method for isolating microsatellites from cDNA has the potential to be applied to a wide range of fish species. The microsatellites isolated from cDNA could be useful for linkage and comparative mapping, as well as for studying genome evolution.


Background
Microsatellites are short segments of DNA in which a specific motif of 1-6 bases is repeated [1,2]. Due to their high polymorphism, codominant inheritance, ease of scoring and dense distribution throughout eukaryotic genomes, microsatellites are now generally considered to be the most powerful genetic markers for genetic mapping and evolutionary studies [3]. One perceived difficulty with microsatellites is the long lead time needed to identify and characterize them in new taxonomic groups. This problem has been alleviated by the development of novel protocols for enriching repeat DNA from genomic DNA [4]. However, most microsatellites are type II markers, for which no known function has been established. Type I markers are associated with genes of known function and are more useful for comparative gene mapping to study genome evolution [5] and for identifying markers associated with important quantitative traits [6]. Although SNPs in genes have been identified in some fish species [7,8], type I markers are still relatively rare in fish. Detection of polymorphic microsatellites located within transcribed genes makes it possible to convert type II markers to type I markers [9]. Previous studies demonstrated that some microsatellites within genes were associated with economically important traits [6,10] and could be used as markers for marker-assisted selection. Currently, microsatellites in transcribed genes have been identified in model organisms [11] and in economically important animal [8,9,12-14] and plant species [15,16], either by data mining in ESTs using bioinformatics tools or by direct sequencing of ESTs. However, for most of the 31,000 fish species existing on earth [17], it is difficult to obtain microsatellites in cDNA through data mining, because no ESTs are available or the number of ESTs is limited in these species.
Although a method for enriching microsatellites from genomic DNA has been adapted to identify microsatellites from cDNA in catfish [18], the efficiency of microsatellite isolation from cDNA is still lower than that from genomic DNA [19], owing to the redundancy of cDNA. In this paper, we report a simple and efficient method for isolating microsatellites from transcribed genes. The method comprises cDNA normalization, microsatellite enrichment and directional cloning of the microsatellite-enriched cDNA.

Results and discussion
In a previous study [10], we sequenced 4800 ESTs from six normalized cDNA libraries of Asian seabass (Lates calcarifer). From the 4800 ESTs, a total of 70 unique sequences containing microsatellites (repeat length: dinucleotide > 7, trinucleotide > 6, tetranucleotide > 5) were identified from 130 clones. Among the 70 microsatellites, 42 were CA-repeats, 23 GA-repeats, two GGA-repeats and three other types of repeats. These data indicate that unique microsatellite sequences accounted for 1.45% (70/4800) of cDNA clones in Asian seabass. CA- and GA-microsatellites were the most abundant in cDNA of Asian seabass. However, they represented only 0.83% (40/4800) and 0.48% (23/4800) of cDNA clones from normalized cDNA libraries. Hence, straightforward random sequencing of clones from normalized cDNA libraries is not an efficient way to discover microsatellites.
In this study, we first tried to enrich CA- and GA-microsatellites from unnormalized cDNA of Asian seabass using biotinylated (CA)10 and (GA)10 oligonucleotides, since these two types of microsatellites are the most abundant in cDNA of Asian seabass. Two cDNA libraries were constructed, one enriched for CA-microsatellites and another for GA-microsatellites. From each library, 192 randomly picked clones were sequenced in both directions. Among the 192 clones from the cDNA library enriched for CA-repeats, 80 clones contained microsatellites. Of these 80 clones, only 11 were singletons, and the remaining 69 fell into 8 clusters. A total of 19 (9.9%) unique microsatellites were obtained from the 192 sequenced clones (Table 1). Similarly, among the 192 sequenced clones from the cDNA library enriched for GA-repeats, 40 clones contained microsatellites. Eight were singletons, and 32 fell into 6 clusters. The cDNA sequence of the parvalbumin beta-1 gene, which contains one CT-microsatellite [10], appeared 8 times in the 192 clones. A total of 14 (7.3%) unique microsatellites were obtained from the cDNA library enriched for GA-microsatellites. Compared with random sequencing of clones from normalized cDNA libraries without microsatellite enrichment, the efficiency of microsatellite isolation from unnormalized cDNA libraries enriched for microsatellites was raised more than 10 times (for CA-microsatellites: 9.9% vs. 0.83%; for GA-microsatellites: 7.3% vs. 0.48%). A similar efficiency of microsatellite isolation from cDNA was reported in catfish [18]. However, the high redundancy of cDNA sequences from unnormalized cDNA libraries reduced the efficiency of microsatellite isolation from cDNA.
To increase the efficiency of microsatellite enrichment, we tried to reduce the redundancy of cDNA by normalizing it with duplex-specific nuclease (DSN) [20] before enrichment of CA- and GA-microsatellites (Figure 1). After cDNA normalization, redundant cDNA was removed (Figure 2). Two normalized cDNA libraries were created, one enriched for CA-microsatellites and another for GA-repeats. From each library, 192 clones were sequenced from both ends. Eighty-eight (45.8%) and 51 (26.5%) of the 192 clones from the normalized cDNA libraries enriched for CA- and GA-repeats, respectively, contained microsatellites (Table 1). The redundancy of clones was substantially reduced. Of the 88 microsatellite-containing clones from the cDNA library enriched for CA-microsatellites, 41 were singletons and the remaining 47 fell into 10 contigs. A total of 51 (26.5%) unique microsatellites were obtained from the 192 sequenced clones. Of the 51 microsatellite-containing clones from the cDNA library enriched for GA-repeats, 35 were singletons and 16 fell into 5 clusters. A total of 40 (20.8%) unique microsatellites were obtained from the 192 sequenced clones (Table 1). Compared with microsatellite enrichment from unnormalized cDNA, the efficiency increased about threefold when normalized cDNA was used (for CA enrichment: 26.5% vs. 9.9%; for GA enrichment: 20.8% vs. 7.3%). Therefore, decreasing the prevalence of clones representing abundant transcripts by normalizing cDNA before microsatellite enrichment is essential for microsatellite isolation from cDNA. Normalization of cDNA using DSN was very simple and highly efficient in comparison with other cDNA normalization methods [21]. The whole procedure of microsatellite enrichment, starting from normalization of cDNA, took only 5 days. Applying this method to isolate microsatellites from cDNA of grass carp brain gave similar results (data not shown). Therefore, the method is robust and reproducible.
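The efficiency comparisons in the two preceding paragraphs can be rechecked from the clone counts quoted in the text. The following is a minimal sketch rather than part of the original analysis; the dictionary keys and function names are illustrative assumptions, and the counts are copied from the text (unique microsatellites per sequenced clones).

```python
# Clone counts quoted in the text: (unique microsatellites, clones sequenced).
# The key names are illustrative labels, not terms used by the authors.
COUNTS = {
    "random_CA": (40, 4800),        # random sequencing of normalized libraries
    "random_GA": (23, 4800),
    "unnormalized_CA": (19, 192),   # enrichment from unnormalized cDNA
    "unnormalized_GA": (14, 192),
    "normalized_CA": (51, 192),     # enrichment from DSN-normalized cDNA
    "normalized_GA": (40, 192),
}


def efficiency(key):
    """Fraction of sequenced clones that yielded a unique microsatellite."""
    unique, clones = COUNTS[key]
    return unique / clones


def fold_change(key, baseline):
    """Ratio of two isolation efficiencies."""
    return efficiency(key) / efficiency(baseline)


if __name__ == "__main__":
    for motif in ("CA", "GA"):
        rand, unnorm, norm = (f"{s}_{motif}" for s in ("random", "unnormalized", "normalized"))
        print(motif,
              f"unnormalized vs random: {fold_change(unnorm, rand):.1f}x,",
              f"normalized vs random: {fold_change(norm, rand):.1f}x,",
              f"normalized vs unnormalized: {fold_change(norm, unnorm):.1f}x")
```

With these counts, the normalized-versus-unnormalized ratio is just under threefold for both motifs, and the normalized-versus-random ratio for CA-repeats is just over 30, in line with the figures stated in the text.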
Sixty of the 91 microsatellites isolated from the libraries enriched for CA- and GA-microsatellites had flanking regions long enough for primer design, and were characterized in a panel of 24 individuals previously used to characterize microsatellites isolated from genomic DNA [22]. Forty-one were polymorphic, with 2 to 20 alleles per locus and an average allele number of 4.85 ± 0.54 (Table 2), whereas the average expected and observed heterozygosities were 0.56 ± 0.03 and 0.47 ± 0.04, respectively. The average allele number of microsatellites isolated from cDNA was slightly lower than that of microsatellites isolated from genomic DNA libraries and characterized with the same DNA panel [22], which might be due to the relatively lower number of repeats in microsatellites identified from cDNA. Examination of genotyping errors using MicroChecker revealed no evidence of large-allele dropout or stutter-band scoring at any of the 41 loci. All 41 polymorphic microsatellites showed a Mendelian pattern of inheritance. Twenty-nine of the 41 microsatellites were in HWE (Table 2). Departure from HWE at 12 loci may be caused by the presence of null alleles. However, examination of genotypes using MicroChecker indicated that null alleles were unlikely to be present (P > 0.05). Therefore, microsatellites isolated from cDNA using the described method could be useful for linkage mapping, comparative mapping and studies on genome evolution.
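For readers unfamiliar with the per-locus statistics reported above, the sketch below shows how allele number, expected heterozygosity and observed heterozygosity are commonly computed from diploid genotypes. It is an illustration under stated assumptions: the text does not say which estimator of expected heterozygosity was used, so both the plain form (1 minus the sum of squared allele frequencies) and Nei's small-sample correction are returned, and the function name and genotype data are hypothetical.

```python
from collections import Counter


def locus_summary(genotypes):
    """Summarize one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Allele labels (for example fragment sizes in bp) are hypothetical here.
    """
    n = len(genotypes)
    # Count every allele copy across all individuals (2n copies in total).
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity under Hardy-Weinberg proportions.
    he = 1.0 - sum(p * p for p in freqs)
    # Small-sample correction (Nei's unbiased estimator), an assumption here.
    he_unbiased = he * (2 * n) / (2 * n - 1)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    return {"alleles": len(counts), "He": he, "He_unbiased": he_unbiased, "Ho": ho}


if __name__ == "__main__":
    # Hypothetical genotypes for four individuals at one locus.
    example = [(100, 100), (100, 104), (104, 104), (100, 104)]
    print(locus_summary(example))
```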

Conclusion
We have developed a very simple and highly efficient method for identifying microsatellites from cDNA. Microsatellites isolated from cDNA showed polymorphism and a Mendelian pattern of inheritance. Therefore, the method will be ideal for isolation of microsatellites from cDNA of fish species where there are no EST sequences available or the number of ESTs is limited.

Identification of microsatellites from existing ESTs of Asian seabass (Lates calcarifer)
In a previous study, we sequenced 4800 ESTs from six normalized cDNA libraries of Asian seabass [10]. Microsatellites in these ESTs were identified using SciRoKo 3.1 [23] with default search parameters. SciRoKo also provides statistical analysis of the microsatellites found.
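As a rough illustration of what such a search involves, the sketch below scans a sequence for perfect di-, tri- and tetranucleotide repeats using the length thresholds quoted in the Results (dinucleotide > 7, trinucleotide > 6, tetranucleotide > 5). It is not SciRoKo's algorithm and does not reproduce its default parameters; the regular-expression approach, the function names and the example sequence are assumptions made for illustration only.

```python
import re

# Minimum repeat counts implied by the thresholds quoted in the text
# (dinucleotide > 7, trinucleotide > 6, tetranucleotide > 5).
MIN_REPEATS = {2: 8, 3: 7, 4: 6}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'CACA' or 'AA')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_microsatellites(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, min_rep in MIN_REPEATS.items():
        # A k-base motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence carrying a (CA)9 repeat.
    print(find_microsatellites("TTG" + "CA" * 9 + "GGT"))
```

Dedicated tools also handle imperfect repeats and mismatch penalties, which this sketch deliberately ignores.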

Isolating microsatellites from cDNA
Synthesis of first-strand and second-strand cDNA
Total RNA was isolated from the brain of a 3-month-old Asian seabass using Trizol (Invitrogen) according to the manufacturer's protocol. Residual DNA in the RNA was removed by treatment with DNAse (NEB

Figure 1. Schematic presentation of the method for microsatellite enrichment from normalized cDNA. Details of each step can be found in the section "Methods".

Normalization of smart cDNA
Normalization of cDNA was conducted using DSN (duplex-specific nuclease) (Evrogen) [20] according to the manufacturer's recommendation. Briefly, the amplified second-strand cDNA was cleaned using glassmilk (Gen101) and diluted to 50 ng/μl. Three microliters of cDNA with 1 μl 4 × hybridization buffer [200 mM Hepes-HCl (pH 8.0), 2 M NaCl] was denatured at 95°C for 5 min, and then incubated at 68°C for 4 h for renaturation. After the incubation, the following reagents, preheated to 68°C, were added to the hybridization reaction: 3 μl water, 1 μl 5 × DSN buffer [500 mM Tris-HCl (pH 8.0), 50 mM MgCl2 and 10 mM DTT] and 0.5 μl (1 U/μl) DSN (Evrogen). The 10 μl reaction was incubated at 65°C for 30 min on a PTC-100 PCR machine, followed by heating at 95°C for 8 min to inactivate the DSN. The normalized cDNA was diluted 4 times with water, and amplified with the Smart PCR primer for 20 cycles as described in the above section.
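The input amounts quoted for the normalization step follow directly from the volumes and concentrations given above. The short sketch below reproduces that arithmetic; the helper names are illustrative assumptions, and only values stated in the protocol are used.

```python
def amount(volume_ul, concentration_per_ul):
    """Total amount delivered by a volume at a given per-microliter concentration."""
    return volume_ul * concentration_per_ul


def diluted(stock_concentration, stock_volume_ul, final_volume_ul):
    """Concentration of a stock component after dilution into a final volume."""
    return stock_concentration * stock_volume_ul / final_volume_ul


# cDNA input: 3 ul at 50 ng/ul.
cdna_ng = amount(3, 50)
# DSN input: 0.5 ul at 1 U/ul.
dsn_units = amount(0.5, 1.0)
# Hybridization mix: 3 ul cDNA plus 1 ul of 4x hybridization buffer.
hyb_volume_ul = 3 + 1
hepes_mM = diluted(200, 1, hyb_volume_ul)  # 200 mM Hepes-HCl in the 4x stock
nacl_M = diluted(2, 1, hyb_volume_ul)      # 2 M NaCl in the 4x stock

if __name__ == "__main__":
    print(f"cDNA: {cdna_ng} ng, DSN: {dsn_units} U")
    print(f"Hybridization: {hepes_mM} mM Hepes-HCl, {nacl_M} M NaCl")
```

The results, 150 ng of cDNA and 0.5 U of DSN, match the amounts given in the abstract, and the hybridization buffer ends up at 1 × in the 4 μl mix.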

Incorporating Sal I and Not I linkers to the ends of cDNA
To produce directional microsatellite-enriched cDNA libraries, linkers carrying a Sal I and a Not I restriction site were added to the 5' and 3' ends of the normalized cDNA, respectively, using the following 50 μl PCR reaction: 1 μl (20 times diluted) smart cDNA or normalized smart cDNA, 1 × advantage 2 buffer (BD Bioscience), 200 μM dNTPs, 1 μl

Enrichment of microsatellites
Microsatellites in cDNA were enriched using biotinylated oligonucleotides and streptavidin-coated magnetic beads. Briefly, 1 μg cDNA in 6 × SSC was denatured at 98°C for 5 min, followed by hybridization with 1 μl 10 pmol/μl biotinylated (CA)10 or (GA)10 in 65 μl 6 × SSC at 55°C for 25 min. The DNA hybridization products (65 μl) were captured at room temperature with 35 μl (ca. 350 μg) streptavidin-coated beads (Pierce) suspended in 6 × SSC; before capture, the beads had been washed twice in 1 × TE (pH 8.0) and twice in 6 × SSC. Beads carrying microsatellite-enriched cDNA were washed twice in 2 × SSC containing 0.1% SDS and twice in 1 × SSC at room temperature, followed by a final wash in 1 × SSC at 55°C for 5 min. The captured cDNA was eluted with 30 μl water and PCR-amplified in a 25 μl reaction consisting of 3 μl eluted cDNA, 200 nM SalI primer, 200 nM NotI-T25 primer, 200 μM dNTPs, 1 × PCR buffer, and two units of polymerase mix (BD Bioscience). The PCR was carried out on a PTC-100 PCR machine using the following program: 30 cycles of 95°C for 8 s, 65°C for 20 s and 72°C for 3 min. PCR products were cleaned and concentrated using glassmilk (Gen 101).

Sequencing of clones
White colonies were picked and arrayed into 96-well plates containing 40 μl LB liquid medium with 100 μg/ml ampicillin in each well. The 96-well plates were cultured at 37°C for 16-18 hours without shaking. Inserts of each colony were PCR-amplified using two microliters of cell culture in LB as template, low concentration (50 nM) of

[Figure: Agarose gel electrophoresis (1%) of smart cDNA, normalized cDNA and cDNA enriched with microsatellites.]

[24]. Mendelian inheritance patterns of all microsatellites were examined in one of three pedigrees, each including one parental pair and 24 offspring, using the chi-square test. Hardy-Weinberg equilibrium (HWE) and linkage disequilibrium were examined using GDA [25].
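The chi-square test of Mendelian inheritance mentioned above can be sketched as a goodness-of-fit test of offspring genotype counts against the ratio expected from the parental genotypes. The example below is an illustration under stated assumptions: the function name, the offspring counts and the choice of a 0.05 significance level are hypothetical, and the critical values are the standard chi-square values for 1 to 3 degrees of freedom.

```python
# Standard chi-square critical values at the 0.05 level, keyed by degrees of freedom.
CHI2_CRIT_05 = {1: 3.841, 2: 5.991, 3: 7.815}


def chi_square_segregation(observed, ratio):
    """Goodness-of-fit test of genotype counts against an expected ratio.

    observed: offspring counts per genotype class, e.g. [13, 11].
    ratio: expected Mendelian ratio for the same classes, e.g. [1, 1] or [1, 2, 1].
    Returns (statistic, degrees_of_freedom, consistent_at_0.05).
    """
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df, stat <= CHI2_CRIT_05[df]


if __name__ == "__main__":
    # Hypothetical counts for 24 offspring from a heterozygote x homozygote cross (1:1).
    print(chi_square_segregation([13, 11], [1, 1]))
    # Hypothetical counts from a heterozygote x heterozygote cross (1:2:1).
    print(chi_square_segregation([5, 12, 7], [1, 2, 1]))
```

With only 24 offspring per pedigree, expected counts in some genotype classes can be small, so an exact test may be preferable in practice when a cross produces many classes.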