The genus Gossypium contains many species of great economic and scientific importance. Cotton produces the world’s most important natural textile fiber and is also a significant oilseed crop. The cotton fiber is an outstanding model in which to study plant cell elongation and cell wall and cellulose biosynthesis. Genetic improvement of fiber production and processing will help ensure that this natural, renewable product remains competitive with petroleum-derived synthetic fibers. Moreover, modifying cottonseed for food and feed could profoundly enhance the nutrition and livelihoods of millions of people in food-challenged economies. Although cotton genome sequencing has been undertaken by a scientific consortium, cotton genomics has not kept pace with genome sequencing in other angiosperms such as Arabidopsis thaliana, poplar, rice, and grapevine.
The genus Gossypium includes approximately 50 species, 45 diploid (2n = 2x = 26) and 5 tetraploid (2n = 4x = 52). Diploid cotton species contain eight genome types, denoted A-G and K. Interestingly, the A genome diploids and tetraploid species produce spinnable fiber and are cultivated on a limited scale, whereas the D genome species do not. Among the A, D, and AD genomes, genome size varies approximately 3-fold, from 885 Mb in the D genome to 2,500 Mb in the tetraploid [7, 9]. The cotton genome is not only much larger than those of Arabidopsis thaliana, poplar, grapevine, and rice, but has also experienced a higher frequency of polyploidization events than any of these species [10, 11], although the grapevine genome appears to be an ancient hexaploid. Much of the size variation in cotton genomes can be attributed to the accumulation of transposable elements, although some lineages show evidence of specific mechanisms to remove repetitive DNA [12, 13]. Repetitive elements comprise approximately 50% of the D genome. Because of this large size, recurrent polyploidy, and high repeat content, progress in cotton genome sequencing has lagged behind that in other flowering plants.
Genomic resources for cotton, such as bacterial artificial chromosomes (BACs), expressed sequence tags (ESTs), genomic sequences, genetic linkage maps, and physical maps, provide landmarks for sequence analysis and assembly. Since the first genetic map of cotton was published in 1994, several high-density genetic maps composed of more than 2,000 loci have been released [15–18]. These high-density maps were constructed with multiple types of DNA markers, including restriction fragment-length polymorphisms (RFLPs), amplified fragment-length polymorphisms (AFLPs), sequence-related amplified polymorphisms (SRAPs), single nucleotide polymorphisms (SNPs), and simple sequence repeats (SSRs) [16–18]. Genome-wide integration of genetic and physical maps is a prerequisite for large-scale genome sequencing, which can in turn provide initial insights into the structure, function, and evolution of plant genomes [19–21]. As part of the development of genomic resources in cotton, BAC libraries have been constructed for several cotton species [22–25]. A physical map of homoeologous chromosomes 12 and 26 in upland cotton and a draft physical map of a D-genome cotton species (Gossypium raimondii) have been reported.
At present, a large number of cotton sequences are publicly available via the GenBank database (http://www.ncbi.nlm.nih.gov/). Of these, 435,354 are expressed sequence tags (ESTs), including 297,214 ESTs from G. hirsutum, 63,577 from G. raimondii, 41,781 from G. arboreum, 32,535 from G. barbadense, and 247 from G. herbaceum. Furthermore, genome sequence information produced by several high-throughput DNA sequencing platforms, such as the Roche/454 FLX and the Illumina Genome Analyzer, has been released for several cotton species. A pilot study by the U.S. Department of Energy Joint Genome Institute (http://www.jgi.doe.gov/) to generate a whole-genome scaffold sequence for G. raimondii was recently completed. However, draft genome sequences lack sufficient contiguity in many genomic regions to allow for cross-species comparison of genome organization and structure [27, 28]. An independent genetic map often facilitates the correct ordering of DNA segments on chromosomes and can thus clarify the changes in genome organization revealed by multiple-species comparisons [29, 30]. As a result, structural, functional, and evolutionary studies in Gossypium will be greatly accelerated and a whole-genome sequence will ultimately be realized.
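As a quick consistency check on the EST figures quoted above, the per-species counts can be tallied and expressed as shares of the total. The following is only a minimal sketch: the counts are copied from the text, while the names `est_counts` and `est_summary` are illustrative and not part of any database interface.

```python
# EST counts per species, copied from the paragraph above.
est_counts = {
    "G. hirsutum": 297_214,
    "G. raimondii": 63_577,
    "G. arboreum": 41_781,
    "G. barbadense": 32_535,
    "G. herbaceum": 247,
}


def est_summary(counts):
    """Return the overall total and each species' percentage share of it."""
    total = sum(counts.values())
    shares = {species: 100 * n / total for species, n in counts.items()}
    return total, shares


total, shares = est_summary(est_counts)
print(f"total ESTs: {total:,}")
# List species from the largest to the smallest share.
for species, pct in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{species:<15}{pct:6.2f}%")
```

The per-species counts sum to the stated total of 435,354, with G. hirsutum contributing roughly two thirds of all ESTs.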
In this paper, we report an update to a high-density interspecific genetic map of cultivated allotetraploid cotton, based on earlier work in our laboratory [16, 31–34]. Using this linkage map, we carried out a genome-wide sequence analysis by integrating the high-density genetic map with publicly available Gossypium DNA sequences. This study will serve as a valuable genomic resource for tetraploid cotton genome sequencing and assembly, and for further comparative genomic analyses in Gossypium.