An important finding from the HSY and X physical maps is the rapid expansion of the HSY over the past 2–3 million years, after genetic recombination became suppressed in the sex determining region. The HSY physical map contains about 8.5 Mb of Yh-specific sequence and has the distinctive cytological feature of four Yh-specific knobs. In contrast, the X counterpart of the HSY is only 4.5 Mb in size, excluding the 900 kb region of Knob 1, which is shared with the HSY. The 89% expansion of physical size in HSY indicates rapid expansion of Y-specific sequences in the early stages of sex chromosome evolution. Given sufficient time, the X and Y may evolve into chromosomes that are morphologically distinct from each other and from autosomes. Thus the Y chromosome would become significantly larger than the X, as is the case with the young sex chromosomes of Silene latifolia that evolved about 10–20 million years ago. Similarly, the medaka fish MSY exhibited 68.1% DNA sequence expansion, while the stickleback Y-specific BAC sequence showed 37% expansion, each compared with its X sequence. Expansion of the papaya Y chromosome is likely to continue until most of its gene content has been lost, at which point the Y would enter the contraction phase of sex chromosome evolution.
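The 89% figure follows directly from the two map sizes quoted above. As a minimal sketch (the function name `percent_expansion` is ours, introduced only for illustration), the expansion is the size difference expressed as a percentage of the X counterpart:

```python
def percent_expansion(y_size_mb: float, x_size_mb: float) -> float:
    """Size increase of the Y-linked region relative to its X counterpart, in percent."""
    return (y_size_mb - x_size_mb) / x_size_mb * 100.0


# Sizes taken from the text: 8.5 Mb for HSY, 4.5 Mb for the X counterpart
# (the 900 kb Knob 1 region shared with the HSY is excluded from the X figure).
hsy_mb = 8.5
x_mb = 4.5

expansion = percent_expansion(hsy_mb, x_mb)
print(f"HSY expansion relative to X: {expansion:.0f}%")  # prints 89%
```

The same calculation is presumably what underlies the 68.1% and 37% values cited for medaka and stickleback, although the sizes used in those studies are not given here.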
The four Yh-specific heterochromatic knobs might account for a large portion of the HSY expansion, mostly through retrotransposon insertions, as evidenced by the 85% repetitive sequences in HSY versus 60% in its X counterpart. The HSY is much larger than our original estimate of 4–5 Mb, which was based on the 57% of sex co-segregating sequence characterized amplified region (SCAR) markers that were embedded in the then 2.5 Mb physical map of HSY. The AFLP-derived SCAR markers were developed from markers that were polymorphic between the female and hermaphrodite parents. These polymorphic and sex co-segregating markers might be enriched in Yh-specific sequences, such as the four knobs, causing the underestimate of the physical size of HSY.
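The text does not spell out how the original 4–5 Mb estimate was obtained. One plausible reading, which is our assumption and not a stated method, is a simple proportional extrapolation: if 57% of the markers fall within a 2.5 Mb map and markers are spread evenly, the whole region would be about 2.5 / 0.57 Mb. The sketch below (with the hypothetical helper `extrapolated_size_mb`) shows that this lands inside the quoted range, and why marker enrichment in the knobs would bias it downward:

```python
def extrapolated_size_mb(mapped_mb: float, fraction_of_markers_mapped: float) -> float:
    """Extrapolate total region size, ASSUMING markers are uniformly distributed.

    This uniformity assumption is ours; it is exactly the assumption that
    fails if markers are enriched in Yh-specific sequences such as the knobs.
    """
    return mapped_mb / fraction_of_markers_mapped


# Values from the text: 57% of SCAR markers in the then 2.5 Mb physical map.
original_estimate = extrapolated_size_mb(2.5, 0.57)
print(f"Extrapolated HSY size: {original_estimate:.1f} Mb")  # about 4.4 Mb

# Compare with the 8.5 Mb now mapped.
fold_underestimate = 8.5 / original_estimate
print(f"Actual size is about {fold_underestimate:.1f}x the extrapolation")
```

If markers cluster in a subset of the region, the mapped fraction of markers overstates the mapped fraction of sequence, so the extrapolated size comes out too small.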
We could not identify a region in the papaya X chromosome that corresponds to Knob 2 in HSY. Moreover, the distance between Knob 1 and the corresponding region of Knob 3 on the X chromosome was much shorter than the distance between Knobs 2 and 3 on HSY. Knob 2, represented by SH99O03 on HSY, was highly HSY-specific, indicating that novel sequences accumulated after recombination was suppressed, through retrotransposon insertions, translocations, and perhaps other chromosomal rearrangements.
The mapping of border BACs that are not distinguishable between X and Yh allowed us to develop additional markers from the draft genome and from sequencing of selected BACs to genetically define the borders of HSY. Four F2 populations with a total of 1522 individuals were analyzed for SSR markers near the borders to identify recombinants. Only seven of the 77 SSR markers developed were polymorphic in the four pairs of parental papaya cultivars (Table 1). At border A, all five polymorphic markers were found in only one of the four F2 populations (AU9 x SunUp). Our finding of more polymorphism at border A in the AU9 x SunUp family than in the others is supported by the report that SunUp and AU9 are more distantly related than are the other cultivars. A total of 87 recombinants were found at border B from initial screening using the SSR marker P3K2608. A significant reduction of recombination was observed in all populations near the border B region between P3K2608 (in BAC SH68N07, outside of border B, not on the physical maps of HSY and X) and BorderB-2001a (in BAC SH86B15), a distance of 1.6 Mb. Our finding of very little recombination in all populations reinforces the numerous reports that sex determination regions are characterized by suppression of recombination. The HSY region was genetically defined to lie between two SSR markers, spctg177-12a in Border A BAC SH52F06 and 58C24–25b in Border B BAC SH58C24.
Despite numerous attempts, even when utilizing an additional male AU9 BAC library, the gap between border A and HSY remained unfilled. Chromosome walking on HSY kept jumping to land on the X chromosome. This 900 kb region is where Knob 1 is located, the only knob structure shared between the X and Yh chromosomes. It is likely that the walk jumped from HSY to X because DNA sequence homology between the X and Yh is higher at Knob 1 than in other parts of the chromosome, owing to the paucity of Yh-specific sequences near Knob 1. For these reasons, the region of Knob 1 could lie beyond the HSY, even though we were not able to detect any recombination in this region. Suppression of recombination in this region could be the result of heterochromatic sequence in Knob 1. This gap on the HSY was filled on the physical map of the X counterpart.
For a genomic region like the HSY that is rich in repetitive sequences, it is unusual that the 8.5 Mb physical map of HSY could be assembled into a single contig. This success might have been facilitated by the large average insert size of 174 kb of the second batch of the hermaphrodite SunUp BAC library. This possibility is supported by the observation that the BACs mapped on the HSY and X physical maps frequently contain large inserts (often above 200 kb) (Figure 1).
Although there is no gap in the border regions of the physical map of the X counterpart, there is one within the map. We made substantial efforts to close this gap using two additional BAC libraries, one from male AU9 and one from female SunUp. We also used the draft genome of female SunUp, but the gap remained. When comparing the physical maps and knob locations of HSY and X, it appears that the gap within the X physical map may correspond to Knob 4 of HSY (Figure 1). Based on cytological FISH results, HSY BAC SH52H15 of Knob 4 mapped to the most terminal position compared to the other four knobs at metaphase I. The authors of that report suggested that the centromere of the Yh chromosome is located between Knobs 4 and 5. The gap we were unable to fill on the physical map of the X counterpart could be the centromere of the X chromosome. Interestingly, this gap was filled on the physical map of HSY, so it is possible that we have fine mapped its centromere. If so, this is a surprise, as most centromeres are highly repetitive and heterochromatic and therefore very difficult to map or sequence, as has been repeatedly reported by genome sequencing projects of multiple organisms [27, 28]. To date, the centromere of rice chromosome 8 (Cen8) is the only reported sequenced centromere in plants. Rice Cen8 contains active genes and is at an early stage of centromere evolution, evolving from a neocentromere to a mature centromere. If the centromere of the Yh chromosome is indeed between Knobs 4 and 5, we may have been able to fine map it because it is atypical, perhaps a recently evolved centromere.
Complete sequencing of the papaya HSY and its X counterpart is the first step towards identification of the sex determination gene(s), which could lead to engineering a true-breeding hermaphrodite variety without the Yh chromosome. In addition, the papaya sex chromosome sequences would be the first in plants, and the papaya Y would be the third Y chromosome to be sequenced, after those of human and chimpanzee [30, 31]. Physical mapping of a region without recombination is a challenging task, and our task was further confounded by a potentially embedded centromere and highly repetitive, heterochromatic sequences. The trioecious nature of papaya and the genomic resources already developed for it expedited our sequencing efforts. Constructing the HSY map and the corresponding X map was possible only because we had three BAC libraries representing the three sex types of papaya, a collection of BES from the hermaphrodite BAC library, a genome-wide physical map, and an annotated draft genome [9, 17, 19, 21, 26]. FISH validation proved crucial to ensure that we were walking on the correct molecule.