The common carp BAC library, constructed from the genomic DNA of a female individual and containing 92,160 BAC clones with an average insert size of 141 kb, was used to generate BAC-end sequences.
BAC Culture and End Sequencing
BAC clones were inoculated from 384-well stock plates into deep 96-well culture blocks containing 1.2 ml of 2× YT medium with 12.5 μg/ml chloramphenicol, using a 96-pin replicator (V&P Scientific, Inc., San Diego, CA). The culture blocks were sealed with an air-permeable seal (Excel Scientific, Wrightwood, CA) and shaken at 300 rpm at 37°C for 20 hours. The bacteria were then collected by centrifugation at 2000 × g for 10 min in a Beckman Avanti J-26 XP centrifuge. After all liquid had been carefully removed from the culture blocks, BAC DNA was extracted from the bacterial pellets using an alkaline lysis protocol with a modified lysate clarification step: fritted filter plates (NUNC, Roskilde, Denmark) were used for lysate filtration, which significantly increased the quality of the BAC DNA for BAC-end sequencing. BAC DNA was precipitated with isopropanol and washed twice with 70% ethanol. It was then eluted in 40 μl of Milli-Q water, collected in 96-well plates, and stored at -20°C until use.
Sanger sequencing reactions were set up in 96-well semi-skirted plates with the following reagents: 2 μl 5× Sequencing Buffer, 2 μl sequencing primer (3 pmol/μl), 1 μl BigDye v3.1 Dye Terminator (Life Technologies, Foster City, CA), and 5 μl BAC DNA. The reactions were run in ABI 9700 Thermal Cyclers (Life Technologies) under the following conditions: initial denaturation at 95°C for 5 min, then 99 cycles of 95°C for 30 sec, 55°C for 10 sec, and 60°C for 4 min. The T7 and PIBRP primers were used for the sequencing reactions (T7 primer: TAATACGACTCACTATAGGG; PIBRP primer: CTCGTATGTTGTGTGGAATTGTGAGC). The sequencing products were then precipitated with pre-chilled 100% ethanol, washed with 70% ethanol, and analyzed on an ABI 3730 XL (Life Technologies).
Clone Tracking and Quality Control
To rule out plate orientation errors, eight clones from each 384-well plate (positions A1, A2, B1, B2, C1, C2, D1, and D2) were re-sequenced. These quality-control sequences were then searched against all collected BAC-end sequences (BES) with BLAST. A re-sequenced read whose hit was the BES from the same well position confirmed the correct plate orientation.
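The orientation check can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes the best BLAST hit for each re-sequenced well has already been reduced to the well position of the matching BES, and the function names are hypothetical. The 180-degree rotation helper illustrates one possible failure mode and is an added assumption, not something described in the text.

```python
# The eight control wells re-sequenced from each 384-well plate.
QC_WELLS = ["A1", "A2", "B1", "B2", "C1", "C2", "D1", "D2"]


def rotate_180(well, rows=16, cols=24):
    """Well that a position maps to if a 384-well plate is loaded rotated by 180 degrees.

    Hypothetical helper, used only to illustrate a mis-oriented plate.
    """
    row = ord(well[0]) - ord("A")
    col = int(well[1:]) - 1
    return f"{chr(ord('A') + rows - 1 - row)}{cols - col}"


def plate_orientation_ok(top_hits):
    """Return True if every control well hits the BES from the same well.

    top_hits maps each re-sequenced well to the well of its best BES hit.
    """
    return all(top_hits.get(well) == well for well in QC_WELLS)
```

A plate for which `plate_orientation_ok` returns False would be flagged for manual inspection before its BES are used.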
Phred [30, 31] was used for base calling of the BAC-end sequences, with a quality score of Q20 as the cutoff. Seqclean, from the DFCI Gene Indices Software Tools, was used for vector trimming against the UniVec database with default parameter values. The trimmed BES were searched against themselves with BLASTN, and any BES that shared >95% identity with another BES over its full length was excluded from subsequent analyses.
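The redundancy criterion (>95% identity with the alignment covering the full length of the query) can be sketched as below. This is an illustrative sketch under stated assumptions: the all-against-all BLASTN results are assumed to be available as tuples of query ID, subject ID, percent identity, query start, and query end, and `redundant_bes` is a hypothetical name. The text does not state whether one representative of each redundant group was retained, so the sketch only flags candidates.

```python
def redundant_bes(hits, lengths, min_identity=95.0):
    """Flag BES that are covered over their full length by a hit to another BES.

    hits:    iterable of (query_id, subject_id, percent_identity, q_start, q_end)
    lengths: dict mapping each BES ID to its trimmed length
    """
    flagged = set()
    for query, subject, pident, q_start, q_end in hits:
        if query == subject:
            continue  # a self-hit says nothing about redundancy
        covered = abs(q_end - q_start) + 1
        if pident > min_identity and covered == lengths[query]:
            flagged.add(query)
    return flagged
```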
To detect known repeats in carp BES, we screened and masked the BES using RepeatMasker against the vertebrate repeat library with default parameter values. Next, BES with homology to proteins encoded by diverse families of transposable elements (TE) were identified using TransposonPSI, a program that performs tBLASTn searches with a set of position-specific scoring matrices (PSSMs) specific to different transposable element families.
Two de novo repeat-finding packages, PILER-DF and RepeatScout, were used to identify repeat sequences within carp BES, each producing its own repeat library. The repeat sequences in one library were compared with those in the other using BLASTN. When two repeats aligned with identity ≥ 95% and coverage ≥ 95% of full length, the shorter sequence was discarded. The remaining distinct repeat sequences formed a non-redundant de novo repeat library of common carp. The BES that were neither masked by the known vertebrate repeat library nor similar to TE were then searched against the de novo repeat library with RepeatMasker.
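The merging rule for the two de novo libraries can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the BLASTN alignments between libraries are assumed to be reduced to tuples of two repeat IDs, percent identity, and aligned length, and coverage is assumed to be measured against the shorter of the two sequences, which the text does not specify. The function name and the repeat IDs in any usage are hypothetical.

```python
def merge_repeat_libraries(lengths, alignments, min_identity=95.0, min_coverage=0.95):
    """Return the IDs kept in the non-redundant library.

    lengths:    dict of repeat ID -> sequence length, for both libraries combined
    alignments: iterable of (id_a, id_b, percent_identity, aligned_length)
    """
    dropped = set()
    for id_a, id_b, pident, aligned_length in alignments:
        shorter, _longer = sorted((id_a, id_b), key=lambda rid: lengths[rid])
        # Assumption: coverage is computed relative to the shorter repeat.
        coverage = aligned_length / lengths[shorter]
        if pident >= min_identity and coverage >= min_coverage:
            dropped.add(shorter)
    return sorted(set(lengths) - dropped)
```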
Identification of Microsatellites
Microsatellites were identified in the non-redundant BES using the Perl script Msatfinder, which was specifically designed to identify and characterize microsatellites. Only microsatellites with motifs of 2-6 nucleotides and at least 5 repeat units were collected.
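The selection criteria (motifs of 2-6 nucleotides, at least 5 repeat units) can be illustrated with a short regular-expression search. This is a stand-in sketch and does not reproduce Msatfinder's algorithm or output format; the function name is hypothetical, and only perfect (uninterrupted) repeats are detected.

```python
import re

# A 2-6 nt motif followed by at least 4 further copies (5 or more units in total).
# The lazy quantifier makes the shortest repeating motif win, so (AT)n is
# reported as "AT" rather than "ATAT".
MICROSATELLITE = re.compile(r"([ACGT]{2,6}?)\1{4,}")


def find_microsatellites(seq):
    """Return a list of (start, motif, repeat_units) for perfect microsatellites."""
    results = []
    for match in MICROSATELLITE.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a 2-6 nt motif
        results.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return results
```

For example, a sequence containing seven CA units followed later by five AAT units yields one entry for each tract, with 0-based start coordinates.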
BLASTX searches of the repeat-masked BES were conducted against the Non-Redundant Protein database. An e-value cutoff of 10⁻⁵ was used as the significance threshold for similarity. The top BLASTX result for each BES query was collected.
To compare the common carp and zebrafish genomes and to anchor common carp BACs to the zebrafish genome, we assumed that the zebrafish genome assembly is correct. Carp BES that had been masked for repeats and transposons were searched against zebrafish genome assembly 8 (Zv8) using BLASTN with an e-value cutoff of 10⁻⁵. The top hit of each BES was analyzed further.
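Selecting the top result per query under the e-value threshold can be sketched as below. This is a minimal illustration that assumes the search results have been parsed into tuples of query ID, subject ID, e-value, and bit score; ranking hits by bit score is an assumption, since the text does not say how the top result was defined, and `top_hits` is a hypothetical name.

```python
def top_hits(rows, max_evalue=1e-5):
    """Keep the best-scoring hit per query among hits passing the e-value cutoff.

    rows: iterable of (query_id, subject_id, evalue, bitscore)
    Returns a dict of query_id -> (subject_id, evalue, bitscore).
    """
    best = {}
    for query, subject, evalue, bitscore in rows:
        if evalue > max_evalue:
            continue  # not significant at the chosen threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best
```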
Conserved microsyntenies were defined as aligned regions in which the two ends of a carp BAC clone mapped ≤ 300 kb apart on the same zebrafish chromosome and in the same orientation. Conserved microsyntenies were then divided into five categories based on the transcriptional signals in the zebrafish genome regions homologous to the carp BES. Zebrafish RefSeq genes, used as transcriptional signals, were downloaded from the UCSC database and divided into protein-coding and non-coding genes according to their annotation.
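The definition of a conserved microsynteny can be expressed as a simple predicate. This is a sketch under stated assumptions: each BAC end is represented by the chromosome and coordinate of its top BLASTN hit, and because the text does not spell out how orientation was evaluated, the orientation test is passed in as a precomputed boolean. The function name and the coordinates in any usage are hypothetical.

```python
MAX_SPAN = 300_000  # 300 kb, the maximum distance between paired ends


def is_conserved_microsynteny(hit_a, hit_b, orientation_consistent, max_span=MAX_SPAN):
    """Apply the three criteria to the paired ends of one BAC clone.

    hit_a, hit_b: (chromosome, position) of each end's top hit on the zebrafish assembly
    orientation_consistent: result of the orientation check, supplied by the caller
    """
    chrom_a, pos_a = hit_a
    chrom_b, pos_b = hit_b
    return (
        chrom_a == chrom_b
        and abs(pos_a - pos_b) <= max_span
        and orientation_consistent
    )
```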