In this study we analyzed the detailed genotypic structure of near-isogenic materials specifically produced for the introgression of three heterotic QTL in maize that we detected and then further characterized [4, 25, 26]. The NIL pairs analyzed here represent unique material for the study of hybrid vigor in maize, since the analysis of mendelized heterotic QTL might shed light on some relevant, and possibly general, genetic and molecular mechanisms underlying this phenomenon. Given this premise, it is essential to acquire a detailed knowledge of the genetic structure of such materials, possibly on a whole-genome scale. In fact, accurate determination of the actual isogenicity of NILs at a genome-wide level must not be neglected, since conclusions on the effects of mendelized QTL in NIL materials largely rely upon it. To this end, we genotyped the original inbred lines (B73-SSA and H99) and the recombinant NILs obtained from their cross with the Illumina MaizeSNP50 BeadChip. This platform allows the scoring of predetermined genotype variants at more than 50,000 SNPs selected from a large maize diversity panel and was recently tested on several US inbred lines. The Illumina MaizeSNP50 BeadChip system was chosen because, in the context of a bi-parental genetic system based on common US inbred lines such as the one under analysis, it provided the best combination of potential information content, technical reliability, and resolution needed to ascertain the overall genetic structure and the level of isogenicity of NILs. It has been proposed that heterosis might be associated with large structural variations in the genome leading to complex patterns of gene complementation through the combination of the dispensable genomes within the extremely diverse maize germplasm.
The platform chosen for the present analysis, unlike others based on CGH or re-sequencing techniques, is not suited to the study of such variations, which were beyond the scope of the present work, even though they may well be relevant for a closer investigation of the molecular nature of the introgressed QTL.
First of all, we compared the genotypic structure of B73-SSA, the inbred line from which all NILs were derived, with that of the reference B73 accession, for which the two replicate SNP scorings produced identical results, confirming the technical reproducibility of genotype calls. The few differences detected between B73-SSA and B73 were in line with the level of inconsistency already observed with duplicate samples from different seed sources. Overall, B73-SSA was less heterozygous than its reference counterpart, which might reflect an actual higher homozygosity of this accession. However, a sampling effect cannot be ruled out, since the DNA of B73-SSA was obtained from a small pool of 5–6 seedlings. Quality filtering criteria produced 42,771 good quality SNPs vs. the 49,585 SNPs retained in a previous study including 274 maize samples where, as in the present study, pedigree consistency was checked on parent/offspring triplets including the F1 hybrid and its parental lines. In our case, this procedure had the additional purpose, due to the lack of replicate samples, of reducing the chances of spurious polymorphism detection and consequent inaccurate genotype calls in NIL samples. It must be considered that the quality of the genotyping data was assessed by Ganal and coworkers with respect to an average failure rate based on a large sample set, rather than on a single comparison as in the present study. Despite this difference, when the 8,628 SNPs with a failure rate > 5% are removed from the good quality SNPs reported by Ganal and coworkers, the resulting number of SNPs (49,585 − 8,628 = 40,957) is strikingly similar to the number of SNPs that never failed in any sample in the present study (40,852), confirming the reliability of the genotyping platform.
The application of stringent quality criteria inevitably reduced the number of SNPs available for the analysis but, on the other hand, provided a more reliable genotype data set. An indication of this came from the fact that the last filtering step (i.e., the removal of inconsistent SNPs present in any NIL sample) caused the exclusion of only 5 additional SNPs. SNPs heterozygous in either of the parental lines were filtered out considering both their small number, and thus their marginal effect on the overall picture, and the fact that their inheritance by descent to the offspring could not be used to unambiguously determine the genotypic structure of NILs at the respective loci. No inferences on the presence of null alleles were made from failed SNP calls, in order to avoid both an undesirable increase in the genotyping error rate and the use of dominant-type data, which do not allow heterozygosity to be scored. The maintenance of the distribution pattern of mapped SNPs after quality filtering further indicated that the information content of the chip, although inevitably reduced, was not biased overall by the filtering process.
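The filtering logic described above can be illustrated with a minimal sketch. It is not the pipeline actually used in this study: the sample names (`P1`, `P2`, `F1`), the two-letter genotype encoding, and the `--` symbol for a failed call are all assumptions made for illustration. A SNP is kept only if no sample failed, both parents are homozygous, the F1 call is consistent with its parents, and no other sample carries an allele absent from both parents.

```python
from typing import Dict, List

FAILED = "--"  # hypothetical symbol for a failed genotype call


def is_homozygous(call: str) -> bool:
    """True for a successful diploid call with two identical alleles."""
    return call != FAILED and len(call) == 2 and call[0] == call[1]


def f1_consistent(p1: str, p2: str, f1: str) -> bool:
    """The F1 of two homozygous parents must carry one allele from each."""
    return sorted(f1) == sorted(p1[0] + p2[0])


def filter_snps(calls: Dict[str, Dict[str, str]],
                parents=("P1", "P2"), f1: str = "F1") -> List[str]:
    """Return the SNP identifiers that pass all quality filters.

    `calls` maps SNP id -> {sample name -> genotype call}.
    """
    kept = []
    for snp, by_sample in calls.items():
        # 1. Discard SNPs with a failed call in any sample.
        if any(c == FAILED for c in by_sample.values()):
            continue
        p1, p2 = by_sample[parents[0]], by_sample[parents[1]]
        # 2. Discard SNPs heterozygous in either parental line.
        if not (is_homozygous(p1) and is_homozygous(p2)):
            continue
        # 3. Pedigree check on the parent/F1 triplet.
        if not f1_consistent(p1, p2, by_sample[f1]):
            continue
        # 4. Discard SNPs where any sample shows a non-parental allele.
        parental_alleles = {p1[0], p2[0]}
        if any(not set(c) <= parental_alleles for c in by_sample.values()):
            continue
        kept.append(snp)
    return kept
```

Note that, in line with the text, a failed call is simply treated as missing data that disqualifies the SNP and is never reinterpreted as a null allele.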
The number of polymorphic SNPs detected between the B73-SSA and H99 inbred lines by the Illumina MaizeSNP50 chip was adequate for the purpose of the present study, which was to describe the detailed genetic structure and genotype inheritance patterns in bi-parental materials derived from them. However, no absolute conclusions about the polymorphism level detected here between these two lines could be drawn, nor could it be compared with levels previously detected among other lines. In fact, despite being designed on a large maize germplasm panel, many of the SNP markers present on the Illumina MaizeSNP50 chip were selected from data available for inbred lines B73 and Mo17. An anomalously high number of polymorphic SNPs (ca. 52%) with respect to previous knowledge about genetic diversity in maize was observed between these two lines when analyzed with this SNP platform in the original study on a large maize diversity panel. This suggested the presence of an ascertainment bias associated with the design of this SNP chip, which has more recently been confirmed by a diversity analysis extended to a panel of 77 elite European inbred lines. In the context of germplasm organization, inbreds are commonly assigned to heterotic pools according to estimates of their genetic similarity. Surprisingly, however, inbred line H99, analyzed in the present work for the first time with this platform, was more similar to B73 than to Mo17 (35% vs. 41% polymorphic SNPs, respectively, calculated on the same set of 42,771 good quality SNPs; Pea et al., unpublished data), despite the fact that H99 and Mo17 both belong to the Lancaster Sure Crop heterotic group.
Our analysis showed that the distribution of SNPs in the genome is in general not uniform. Considering in particular the chromosomes bearing the introgressed QTL, chromosomes 3 and 4 showed a high polymorphic vs. monomorphic ratio, in both cases due to a lower than expected number of non-informative monomorphic SNPs. Chromosome 4 also has a significantly low number of SNPs compared with other chromosomes. However, SNP density varies considerably within each chromosome, showing a marked tendency towards an over-representation of polymorphic SNPs in telomeric regions, which has already been observed for the IBM and LHRF populations. This might reflect the bias of the SNP discovery process towards unique, and thus genic, sequences, which tend to be more abundant in telomeric regions. This biased distribution of SNPs also affected the density of SNPs within the different introgressed QTL regions. In fact, the number of total SNPs per Mbp is 20.12 for QTL 3.05 and 17.34 for QTL 10.03, and more than twice as high (42.42) for the QTL 4.10 region, which maps at the telomere of the long arm of chromosome 4. This relative difference is even larger when considering polymorphic SNPs only, which are 6.57 and 7.15 per Mbp for QTL 3.05 and QTL 10.03, respectively, against 20.69 per Mbp in the QTL 4.10 region.
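The relative differences in SNP density among the three QTL regions can be made explicit with a short calculation. The sketch below uses only the densities quoted above; the dictionary layout and function names are illustrative assumptions.

```python
# SNP densities (SNPs per Mbp) as reported in the text for each QTL region.
DENSITY = {
    "QTL 3.05": {"total": 20.12, "polymorphic": 6.57},
    "QTL 10.03": {"total": 17.34, "polymorphic": 7.15},
    "QTL 4.10": {"total": 42.42, "polymorphic": 20.69},
}


def fold_difference(region: str, reference: str, kind: str) -> float:
    """Ratio of the SNP density of `region` to that of `reference`."""
    return DENSITY[region][kind] / DENSITY[reference][kind]


def polymorphic_fraction(region: str) -> float:
    """Share of the SNPs in a region that are polymorphic."""
    return DENSITY[region]["polymorphic"] / DENSITY[region]["total"]


for ref in ("QTL 3.05", "QTL 10.03"):
    for kind in ("total", "polymorphic"):
        ratio = fold_difference("QTL 4.10", ref, kind)
        print(f"QTL 4.10 vs {ref} ({kind}): {ratio:.2f}-fold")
```

With these values, the QTL 4.10 region carries roughly 2.1 to 2.4 times as many total SNPs per Mbp as the other two regions, and roughly 2.9 to 3.1 times as many polymorphic SNPs per Mbp, which is the widening gap described above.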
The distances between adjacent polymorphic SNPs between B73 and H99 present on the chip, being for the vast majority shorter than 1 Mbp, allowed us to draw detailed maps of the genetic structure of this unique NIL material, to our knowledge the only available introgression material for heterotic QTL in maize. First of all, we obtained an accurate definition of the allelic structure at QTL regions, where the successful introgression of coherent chromosome blocks of the expected genotypes within the flanking markers used for marker-assisted selection (MAS) was confirmed in all contrasting NILs. Therefore, polymorphic SNPs mapping within the QTL introgression regions represent high-density markers that can be used for the fine mapping of the underlying heterotic QTL through the scoring of QTL-specific segregating populations derived from the cross of contrasting lines within each of the NIL pairs. Recombination events and regions of non-isogenicity were also pinpointed genome-wide for all NIL pairs to an unprecedented level of detail, also establishing an invaluable asset towards the characterization and the isolation of the introgressed QTL. In particular, the homogeneous introgression of the same QTL in NIL pairs having different, and known, recombinant structures will allow us to undertake a fine mapping approach of QTL 3.05 and QTL 4.10 in distinct, yet comparable, near-isogenic segregating populations. Such an approach might allow, in turn, the detection of epistatic effects and the isolation of disturbing factors, thus increasing the chances of both characterizing and fine mapping these QTL. In the case of NIL pairs 3.05_R8 and 3.05_R40, the target QTL region is interrupted by two and one isogenic regions, respectively. 
However, experimental evidence shows that the QTL effect is still present in both NIL pairs [25, 26], suggesting that in both cases the QTL might map within the remaining non-isogenic sub-regions, which in turn might in themselves represent a refinement of the mapping position of QTL 3.05.
In all NILs, SNPs were found organized along chromosomes in blocks of different length bearing concordant genotypes, consistent with the presence of coherent bi-parental chromosomal recombination blocks, as expected given the adopted introgression design, and further supporting the reliability of the adopted SNP genotyping platform. The assessment of the allelic inheritance patterns at the genome-wide level allowed us to identify unforeseen non-isogenic regions present outside the target QTL regions in contrasting NILs. These regions generally consisted of clusters of adjacent SNPs having coherent contrasting genotypes, rather than of isolated discordant SNPs. Non-isogenic regions were found immediately flanking both sides of all target QTL, largely due to the effects of linkage drag associated with the specific markers used for MAS. These effects appear to be specific to the markers used for introgression, since non-isogenic regions of comparable size were observed flanking the same QTL independently introgressed in different RIL backgrounds (i.e., QTL 3.05 in RILs 8 and 40 and QTL 4.10 in RILs 40 and 55). The hypothesis that the effects of linkage drag might be correlated with variable recombination rates along chromosomes is supported by data produced using the same genotyping platform in two maize recombinant populations. The large linkage drag observed on the centromeric side of QTL 3.05 in both NIL pairs corresponds to a chromosomal region of low recombination rate as compared to the region on the telomeric side of the QTL. The latter region is in fact characterized by a much more limited linkage drag in both NIL pairs introgressing QTL 3.05. The higher recombination rate associated with the telomere of the long arm of chromosome 4 might instead account for the more limited extent of linkage drag associated with QTL 4.10.
Finally, a region of exceptionally low recombination rate immediately surrounded by areas of high recombination rate roughly corresponds to the introgression region of QTL 10.03. The former characteristic (low recombination) would explain the consistency of genotypes observed for the contrasting NILs along the whole length of this extended introgression region, whereas the latter (high recombination) might account for the very limited linkage drag observed for this region.
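The distinction drawn above between clusters of adjacent discordant SNPs and isolated discordant SNPs can be sketched in code. This is a hypothetical illustration and not the procedure used in this study: the input format, the `min_snps` threshold, and the `Block` type are assumptions. The sketch scans position-sorted polymorphic SNPs and reports runs of consecutive SNPs at which the two contrasting NILs of a pair carry different genotypes.

```python
from itertools import groupby
from typing import List, NamedTuple, Tuple


class Block(NamedTuple):
    chrom: str
    start: int   # position of the first discordant SNP in the run
    end: int     # position of the last discordant SNP in the run
    n_snps: int  # number of consecutive discordant SNPs


def non_isogenic_blocks(snps: List[Tuple[str, int, str, str]],
                        min_snps: int = 2) -> List[Block]:
    """Group consecutive discordant SNPs into candidate non-isogenic blocks.

    `snps` holds (chromosome, position, call in NIL A, call in NIL B)
    for polymorphic SNPs. Runs shorter than `min_snps` (an assumed
    threshold) are treated as isolated discordant calls and skipped.
    """
    blocks = []
    ordered = sorted(snps, key=lambda s: (s[0], s[1]))
    for chrom, chrom_snps in groupby(ordered, key=lambda s: s[0]):
        # Comparing allele sets makes "AG" and "GA" count as the same call.
        runs = groupby(chrom_snps, key=lambda s: set(s[2]) != set(s[3]))
        for discordant, run in runs:
            members = list(run)
            if discordant and len(members) >= min_snps:
                blocks.append(Block(chrom, members[0][1],
                                    members[-1][1], len(members)))
    return blocks
```

Block boundaries obtained this way are only as precise as the spacing of the flanking polymorphic SNPs, so a real analysis would also report the nearest concordant SNP on each side.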
The non-isogenic regions detected in other parts of the genome are randomly distributed and, from the comparison between NIL pairs introgressing the same QTL in distinct backgrounds, do not appear to be related to the QTL introgression procedure. Rather, they seem to largely reflect the presence of different residual heterozygous regions peculiar to the single BC1-S1 individuals from which the progenitors of the contrasting NILs in each pair were selected, although other random effects that possibly occurred throughout the inbreeding and selection scheme adopted for the QTL introgression cannot be excluded. The remarkable genotypic similarity observed between NIL pairs having a common ancestor (i.e., 3.05_R40 and 4.10_R40) clearly shows that the sizes and distribution of recombination blocks reflect the genetic structure of the specific RIL genotype originating each NIL pair. However, despite the fact that NILs were produced through the same breeding scheme, different levels of non-isogenicity were observed in different pairs. This can be ascribed to the actual residual heterozygosity of the single progenitor F4:5 sister plants employed for the independent introgression procedures that led to the production of each NIL pair. The average proportion of non-shared alleles over all the polymorphic SNPs (excluding the QTL introgression regions, which were influenced by MAS) should roughly coincide with the fraction of the genome, also excluding the QTL target region, expected to be heterozygous when crossing contrasting NILs within a pair. This was in fact what we observed for NIL pair 4.10_R55, where the proportion of non-shared alleles between the contrasting NILs was 6.0% against an observed residual heterozygosity of 5.1% in their hybrid (NIL4.10_R55-BH).
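The proportion of non-shared alleles discussed above can be computed along the following lines. This is a minimal sketch under stated assumptions: the tuple layout of the SNP records and the representation of QTL regions as (chromosome, start, end) intervals are hypothetical, and the input is assumed to contain only SNPs polymorphic between the parental lines.

```python
from typing import List, Tuple

Snp = Tuple[str, int, str, str]   # (chromosome, position, NIL A call, NIL B call)
Region = Tuple[str, int, int]     # (chromosome, start, end), inclusive


def non_shared_fraction(snps: List[Snp], qtl_regions: List[Region]) -> float:
    """Fraction of background polymorphic SNPs that differ between two NILs.

    SNPs falling inside any QTL introgression region are excluded, since
    genotypes there were determined by marker-assisted selection.
    """
    def in_qtl(chrom: str, pos: int) -> bool:
        return any(c == chrom and start <= pos <= end
                   for c, start, end in qtl_regions)

    background = [s for s in snps if not in_qtl(s[0], s[1])]
    if not background:
        raise ValueError("no SNPs left outside the QTL regions")
    # Compare allele sets so that "AG" and "GA" count as the same genotype.
    discordant = sum(1 for _, _, a, b in background if set(a) != set(b))
    return discordant / len(background)
```

When both NILs are homozygous at a SNP, a discordant call implies a heterozygous genotype in their hybrid, which is why this fraction is expected to approximate the residual heterozygosity of the cross outside the target region.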