A first reference genetic map for Citrus
The reviews of citrus genetic mapping performed by Ruiz and Asins
, Chen et al.
 and Roose
 underlined that most of the earlier citrus genetic maps were based on intergeneric hybrids between Citrus and Poncirus. This was due to the importance of Poncirus trifoliata for rootstock breeding. Most of these studies suffered from relatively low numbers of analyzed hybrids and from the dominant nature of the markers (RAPD, AFLP) without sequence data on the mapped fragments. Several of the more recent maps were generated using co-dominant markers, particularly SSRs
[17–19]. However, the number of mapped markers was insufficient to establish the nine linkage groups corresponding to the nine chromosomes present in haploid citrus. Some recent studies also focused on the genetic mapping of Citrus varieties
[17, 20, 21, 33]. The map of Gulsen et al.
 was the first C. clementina map, while Bernet et al.
 mapped Chandler pummelo and Fortune mandarin, a C. clementina × C. tangerina hybrid. None of these maps encompassed enough markers with published sequences to serve as a reference citrus map that could be combined with whole genome sequence data.
The current reference Clementine map, established from Clementine male and female segregation, includes 961 co-dominant markers (677 SNPs, 258 SSRs and 26 Indels) spread among nine LGs. The map spans 1084.1 cM, with an average marker spacing of 1.13 cM. This marker density is substantially higher than that reported for previous citrus maps in which nine LGs were obtained. Omura et al.
 established a genetic map spanning 801 cM with 120 CAPS markers. Sankar and Moore
 published an 874 cM map including 310 markers (mostly ISSR and RAPD). Carlos de Oliveira et al.
 established an 845 cM map with 227 AFLP markers and, more recently, using 215 markers (mostly SRAP), Gulsen et al.
 produced an 858 cM map.
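The density comparison above can be checked with a short calculation. This is a minimal sketch that assumes average spacing is simply map length divided by marker count, a definition that reproduces the 1.13 cM reported for the current map; the figures are those quoted in the text, and the dictionary labels are only illustrative.

```python
# Map length (cM) and number of markers, as quoted in the text above.
maps = {
    "Current Clementine map": (1084.1, 961),
    "Omura et al.": (801.0, 120),
    "Sankar and Moore": (874.0, 310),
    "Carlos de Oliveira et al.": (845.0, 227),
    "Gulsen et al.": (858.0, 215),
}


def average_spacing(length_cm, n_markers):
    """Mean spacing as map length / marker count (an assumed definition)."""
    return length_cm / n_markers


spacings = {name: average_spacing(*values) for name, values in maps.items()}

# Print the maps from densest to sparsest.
for name, spacing in sorted(spacings.items(), key=lambda item: item[1]):
    print(f"{name}: {spacing:.2f} cM per marker")
```

Under this definition, the current map has the smallest spacing of the five maps listed.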
The marker density in the current reference Clementine map varied along the genome. The density was particularly low in some regions of LG7 and LG8, with three gaps over 10 cM between markers in each of these LGs. The SNP markers are the most numerous markers on the Clementine map and were randomly selected. Therefore, these low marker density areas probably reveal highly homozygous regions of the Clementine genome. WGS data for the diploid Clementine will be very useful for developing targeted markers within these "no marker" regions. At the opposite extreme, high density areas were observed in some LGs. As described by Lindner et al.
 and Van Os et al.
, some of these high marker density regions may be associated with centromeric locations with large physical distances, possibly corresponding to low genetic distances. Another hypothesis is that some areas with high marker density correspond to portions of the genome in interspecific heterozygosity. Indeed, Clementine is considered to be a hybrid between Mediterranean mandarin and sweet orange
[8, 9, 16]. As sweet orange is thought to have originated as a result of interspecific hybridization between C. maxima and C. reticulata gene pools
[6, 7, 9], some parts of the Clementine genome may represent interspecific heterozygosity (C. maxima/C. reticulata). Garcia-Lor et al.
 showed that the SNP/kb frequency was approximately six times higher between C. reticulata and C. maxima than it was within C. reticulata. Thus, randomly selected markers should be six times more frequent (by physical distance unit) in those parts of the Clementine genome involved in interspecific heterozygosity. Despite the heterogeneity of marker dispersion, the distance to the nearest mapped marker is less than 5 cM in most locations of the Clementine genome. Moreover, previously published diversity studies done with the mapped SSRs (5, 23–26, 28), InDels (30) and SNPs (8) provided accurate information on their transferability and polymorphism, at the individual locus level, within and between the principal varietal groups. Therefore, this marker framework will be very useful for marker-trait association studies based on linkage disequilibrium, such as QTL analysis, bulk segregant analysis, or even genetic association studies in the mandarin group, where strong diversity was observed for the mapped SNP markers
. This map is being used to facilitate the chromosome assembly of the reference whole genome citrus sequence based on a haploid Clementine genotype.
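Low-density regions such as the gaps over 10 cM mentioned for LG7 and LG8 can be located by scanning the intervals between adjacent markers. The following is a minimal sketch, not the procedure used in the study; the function name `find_gaps` and the marker positions are hypothetical and serve only as an illustration.

```python
def find_gaps(positions_cm, threshold_cm=10.0):
    """Return (left, right, width) for adjacent-marker intervals wider than the threshold."""
    ordered = sorted(positions_cm)
    gaps = []
    for left, right in zip(ordered, ordered[1:]):
        width = right - left
        if width > threshold_cm:
            gaps.append((left, right, width))
    return gaps


# Hypothetical marker positions (cM) on one linkage group; not data from the study.
example_lg = [0.0, 1.2, 2.5, 14.0, 15.1, 16.0, 30.5, 31.0]

gaps = find_gaps(example_lg)
for left, right, width in gaps:
    print(f"gap of {width:.1f} cM between {left} and {right} cM")
```

The same scan, run with a smaller threshold, could be used to check how much of a map lies within a given distance of the nearest marker.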
Linear marker order is highly conserved between species, but genetic distances are variable between sexes and species
The citrus genetic maps based on dominant and mainly cross-specific markers (such as RAPD, AFLP and ISSR) do not permit genetic map comparisons. Multi-allelic codominant markers, such as SSRs, are more powerful for such applications
. Chen et al.
 and Bernet et al.
 successfully used SSRs for citrus map comparison at the interspecific and intergeneric levels.
In the present study, the main genotyping effort concerned SNPs. Eight hundred and thirty-six SNP markers were genotyped in the three populations. Most of these markers were mined from Nules Clementine BAC end sequences
[8, 27] and, as a result, were heterozygous for Clementine. The development of the GoldenGate SNP markers from the Clementine sequence without information on the interspecific variability in flanking areas resulted in numerous homozygous null alleles in pummelo as described by Ollitrault et al.
 and in trifoliate orange. Heterozygous null alleles for 72 markers were found in sweet orange, expanding the number of markers mapped in this species. The selected SNP markers were not efficient for pummelo or trifoliate orange mapping due to the very low number of heterozygous loci in these species. Moreover, the biallelic nature of SNP markers limited the establishment of two anchored maps (male and female) from a single cross. Therefore, comparison between Clementine and pummelo was still primarily limited to common multiallelic SSRs (109 between Clementine and Chandler pummelo and 52 between Clementine and Pink Pummelo). With sweet orange and Clementine maps being developed from different populations, the 418 common heterozygous SNPs allowed more substantial anchorage of the two maps.
The conservation of synteny was complete between the species, with no discrepancy in marker localization on the different linkage groups between the maps. Furthermore, the linear order of markers also appeared to be highly conserved between C. clementina, C. sinensis and C. maxima. This is in agreement with the conclusions of Bernet et al.
 following their comparative study of partial maps between three species (C. aurantium, C. maxima and P. trifoliata) and Fortune mandarin, a Clementine-derived mandarin hybrid. In the present study, small localized inversions of marker orders were observed between maps, particularly in dense marker areas. Bernet et al.
 concluded that similar results, for local ordering changes in the integrated maps, resulted from the inclusion of markers with missing data, and possibly from different levels of distorted segregation between populations. It is also possible that small genotyping errors concerning the markers located in these dense regions disturb the mapping order
[40, 41]. The fine mapping of such regions will require larger populations than the ones genotyped in this study. For this reason, these local inversions are not detailed in the results of this study since artifactual origins were quite probable. Chen et al.
 also concluded that colinearity at the intergeneric level was highly conserved between genetic maps of C. sinensis and P. trifoliata. However, they also observed some inversions between shared loci that might reveal chromosomal rearrangement events, such as translocations or inversions. Considering the data of this study and the two previous comparative mapping studies, marker colinearity appears highly conserved at the intrageneric level (Clementine, mandarin, pummelo, sweet orange and sour orange), but also between Citrus and Poncirus. This global conservation of citrus genome organization will allow reasonable inferences of most citrus genome sequences via mapping NGS re-sequencing data to the haploid Clementine reference genome sequence.
Variations in LG sizes were observed between the current male Clementine and female Clementine maps. These variations were confirmed when the new maps were exclusively built using the markers shared between the three populations used for the implementation of the Clementine and sweet orange maps. Several LGs were longer in the male Clementine map than in the female one. This was observed in LGs with significant and extensive segregation distortions in the male haplotype populations compared with the female populations, and this was also observed in LG2, where very similar patterns of low skewed loci were observed. From simulated data, Hackett and Broadfoot
 found that segregation distortion (due to gametic selection) alone had very little effect on marker order or map length. As discussed below, the observed distortion in Clementine probably results from gametic rather than zygotic selection. Therefore, it is probable that the longer LGs observed within the male Clementine map do not result from biased estimations due to segregation distortion, but instead reflect differential recombination rates. Such heterochiasmy between sexes is frequent in plants and animals
[42–47]. Depending on the species, recombination may be higher in male or in female gametes
. Despite the fact that heterochiasmy was documented early in the last century
, there is still no consensus as to which of the several proposed hypotheses may explain its occurrence
. The various models were reviewed by Lenormand and Duteil
. Based on a large survey in animals and plants, these authors concluded that sexual heterochiasmy is not influenced by the presence of heteromorphic sex chromosomes; rather, it should result from a male–female difference in gametic selection. However, the citrus observations in this study do not fit their global model, which holds, like that of Trivers
, that stronger gametic selection in one sex reduces recombination in that sex to preserve the favorable gene combinations that confer reproductive success. Indeed, we found (see discussion on segregation distortion below) much more significant segregation distortion, and therefore probable gametic selection, for Clementine male gametes than for female gametes. The citrus data are more in agreement with models suggesting that the sex experiencing the more intense selection, or otherwise having the higher variance in reproductive success, should show more recombination (as reported by Burt et al.).
Important differences in LG lengths were also observed between Clementine (male and female) and sweet orange for LG1, LG2, LG3, LG5, LG6 and LG8. The LGs for sweet orange were systematically shorter. The literature on plants and animals shows that the impact of structural heterozygosity on recombination frequency is variable. Different situations have been discussed by Parker et al.
. It is well established that sequence divergence at the interspecific level has an inhibitory effect on sexual recombination
[49–52]. Chetelat et al.
 observed a strong reduction in the recombination rate in a mapping population of an interspecific F1 tomato hybrid of Lycopersicon esculentum × Solanum lycopersicoides. The authors concluded that the high DNA sequence divergence between L. esculentum and S. lycopersicoides is a better explanation of reduced recombination than structural reorganization. Previously (and also in tomato), Liharska et al.
 showed that the amount of recombination in a defined genetic interval decreased as the proportion of foreign chromatin (introgressed from close relatives of L. esculentum) increased. The authors also mentioned that, as the donor of the foreign chromatin became more distantly related, the level of observed recombination was lower. As the Clementine is a mandarin × sweet orange hybrid, and sweet orange arose from mandarin and pummelo gene pools (with a higher proportion of C. reticulata;
[7, 9]), it is highly probable that sweet orange contains more genome regions of interspecific heterozygosity (C. reticulata/C. maxima) than the Clementine. Therefore, it can be hypothesized that the smaller LG sizes, and the associated lower recombination rates observed in sweet orange compared with Clementine, are associated with the relative interspecific patterns along the genome of these two species. The area of LG9 that displays substantially greater marker density in Clementine and sweet orange suggests limited recombination within a large genome portion. Two sets of markers in this area were common to the Clementine map and the two pummelo maps (MEST308, CIBE6092 and MEST065 for Pink pummelo and mCrCIR07F11, JI-AAG03, MEST308 and CIBE6092 for Chandler pummelo). Interestingly, in the pummelo maps, these markers cover 26.5 cM and 30 cM, respectively, compared with an area concentrated within 2 cM in the Clementine map. It appears that both Clementine and sweet orange are strongly affected by a similar recombination limitation in LG9, for which they display equivalent map sizes. Haplotype analysis of sweet orange and diploid Clementine shows that the Clementine haplotype transmitted by sweet orange was inherited primarily from one of the sweet orange haplotypes, and only a small telomeric fragment was likely to be transmitted from the other sweet orange haplotype. Further genome analysis along with cytogenetic and mapping studies will be necessary to explain the different recombination patterns observed between species.
Extensive segregation distortions are observed in specific linkage group areas particularly when Clementine is used as the male parent
Distortions from expected Mendelian allelic segregations were observed for all mapped parents of the segregating progenies. The highest rate was recorded for male Clementine with 56% skewed loci (p < 0.05). This percentage is more than four times higher than that of female Clementine (13%), which was equal to the estimate for female ‘Chandler’ pummelo. Male ‘Pink’ pummelo displayed a slightly higher level of distortion than female ‘Chandler’ pummelo (16%), while sweet orange (mainly from female data) displayed an intermediate level (27%). Distorted loci were also observed in most of the previous citrus mapping studies
[17, 20, 54–57]. Bernet et al.
 also reported a higher percentage of skewed loci in the male parents compared to the female parents in a reciprocal cross between ‘Chandler’ pummelo and ‘Fortune’ mandarin. Since most segregation distortions affect the allele frequencies without disturbing the genotypic frequency equilibrium (non-significant F value, the Wright fixation index; data not shown), it is probable that gametic selection was the main factor causing skewed segregation. Bernet et al.
 reached the same conclusion from supporting biological data on parental fertility. Upon cross pollination with compatible parents, the proportion of fertilized ovules is much greater than the proportion of successful male gametes. Therefore, it appears logical that gametic selection is likely to be much more pronounced in male gametes than in female ones. This can result from several mechanisms such as gamete abortion, pollen competition, or the citrus gametophytic incompatibility system
. The pattern of χ² conformity test values, as well as the excess of mandarin alleles along the linkage groups, suggests that the presence of a small number of loci under relatively strong selection pressure on each chromosome is more likely than selection at multiple loci. Similar patterns were observed in tomato
. Identical areas of skewed loci were observed between Clementine and sweet orange in several linkage groups (LG1, LG3, LG5 and LG9). Modern sweet orange varieties arose from an interspecific hybrid prototype that has undergone vegetative propagation, or propagation from seeds containing nucellar embryos, over a period of several thousand years. Besides favorable mutations and stable epigenetic variations that have been selected by man and the environment, it is probable that without the filter of sexual reproduction, the sweet orange genome accumulated unfavorable mutations in a heterozygous state. Some of these unfavorable mutations were likely transmitted to Clementine, as attested by the high proportion of weak progeny obtained from Clementine × sweet orange hybridization (our unpublished data), which should affect both sweet orange and Clementine segregations. Interestingly, the gametic selections have the same orientation for male and female Clementine in the genomic regions where sweet orange segregations are also skewed (LG1, end of LG5, and LG9). In other genome regions, male and female Clementine segregation distortions appeared disconnected. A very strong selection is observed in the middle of LG3 for the male Clementine, without significant skewing in the female. The male and female distortions appeared totally opposite at the end of LG4 and in the first part of LG5. The gametophytic incompatibility system described in citrus
 could be a factor for male gametic selection. However, this may lead to a complete exclusion of one allele for the concerned locus and therefore, a very high distortion for the linked marker locus. This pattern was not observed in the present study. The gametophytic incompatibility system was also excluded as an explanation for the segregation distortion observed in the reciprocal crosses between ‘Fortune’ mandarin and ‘Chandler’ pummelo
. Some of the most extreme allelic ratios (70/30) for the male Clementine occurred in areas without significant distortion (or even with opposite selection) in the female. Such differences between male and female selection may partly explain the inconsistent results observed for trait segregation in the reciprocal crosses. Thus, it is difficult to infer genetic control from observed trait segregations without concomitant marker segregation analysis. This is particularly true if major genes controlling the studied trait are heterozygous in the male parent. QTL analysis may also be affected, as described by Xu.
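The conformity test behind the skewed-locus counts can be illustrated with a 1:1 expected allelic segregation. This is a minimal sketch rather than the software used in the study. The 70/30 ratio comes from the text, but the progeny size of 100 is an assumption made only for the example, and the function names are hypothetical. The p-value uses the closed form for one degree of freedom, so only the standard library is needed.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square statistic (1 df) for two allele classes expected in a 1:1 ratio."""
    expected = (n_a + n_b) / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def p_value_1df(chi2):
    """Upper-tail p-value of a chi-square distribution with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Assumed progeny of 100 individuals showing the 70/30 allelic ratio cited above.
chi2 = chi_square_1to1(70, 30)
p = p_value_1df(chi2)
print(f"chi-square = {chi2:.1f}, p = {p:.2e}, skewed at p < 0.05: {p < 0.05}")
```

With this assumed sample size, a 70/30 ratio lies far beyond the p < 0.05 threshold used to call a locus skewed. A smaller progeny with the same ratio would give a weaker test.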
Haplotype structure of the diploid Clementine and the haploid Clementine used for the implementation of the citrus whole genome reference sequence
Clementine is thought to have been selected as a chance seedling from a ‘Mediterranean’ mandarin by Father Clement just over one century ago in Algeria. The mandarin female parentage was confirmed by mitochondrial genome analysis
. The ‘Granito’ sour orange was initially considered to be the male parent
. However, molecular studies demonstrated that the Clementine was more likely a mandarin × sweet orange hybrid
[8, 9, 16]. The marker phase analysis performed from the Clementine and sweet orange mapping data confirmed this hypothesis, and allowed the identification of the haplotype structures of the mandarin and sweet orange gametes that produced the Clementine. Nine recombination break points between the two sweet orange haplotypes (one each in LG1, LG7 and LG9, and two each in LG3, LG4 and LG5) were identified for the sweet orange gamete that produced the Clementine.
The implementation of a reference citrus whole genome sequence has been the primary focus of the ICGC for the last 5 years. Polymorphism in a whole genome sequence complicates the assembly process. Assembly contiguity and completeness are significantly lower than would have been expected in the absence of heterozygosity
. Commercial citrus varieties are characterized by high heterozygosity levels
[6, 7]. The comparison of blind versus "known-haplotype" assemblies of shotgun sequences obtained from a set of BAC clones from the heterozygous sweet orange
 led the ICGC to establish the reference sequence of the citrus genome from a homozygous genotype. A haploid plant derived from the Clementine was selected due to its immediate availability and preexisting molecular resources
[26, 27, 62–64]. The selected haploid was obtained by induced gynogenesis after in situ pollination with irradiated pollen
. The haploid Clementine was genotyped using the markers mapped in diploid Clementine and sweet orange. This permitted the constitution of the haploid genome to be determined according to the mandarin and sweet orange haplotypes constitutive of the diploid Clementine. Eight recombination break points were identified between the two Clementine haplotypes (one each in LG1, LG7 and LG8, two in LG5 and three in LG3). LG2, LG4, LG6 and LG9 appear to have been entirely inherited from the ‘Mediterranean’ mandarin haplotype without recombination. Overall, a very large fraction of the genome of the haploid Clementine used for WGS was inherited from the ‘Mediterranean’ mandarin.
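Recombination break points of this kind can be counted once each mapped marker has been assigned to one of the two parental haplotypes. The sketch below shows the idea on invented data: the phase calls, the linkage group labels and the function name `count_breakpoints` are all hypothetical and do not reproduce the genotypes of the study.

```python
def count_breakpoints(phase_calls):
    """Count haplotype switches between consecutive markers, skipping missing calls."""
    informative = [call for call in phase_calls if call is not None]
    return sum(1 for a, b in zip(informative, informative[1:]) if a != b)


# Hypothetical phase calls ordered by map position.
# "M" = mandarin-derived haplotype, "S" = sweet orange-derived haplotype,
# None = missing genotype call.
example = {
    "LG_a": ["M", "M", "M", None, "M", "M"],
    "LG_b": ["M", "M", "S", "S", None, "S", "M", "M"],
}

breakpoints = {lg: count_breakpoints(calls) for lg, calls in example.items()}
for lg, n in breakpoints.items():
    print(f"{lg}: {n} break point(s)")
```

A linkage group with no switches, like the hypothetical `LG_a`, corresponds to the situation described for LG2, LG4, LG6 and LG9, which were inherited from a single haplotype without recombination.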