Mapping QTL affecting resistance to Marek's disease in an F6 advanced intercross population of commercial layer chickens

Background Marek's disease (MD) is a T-cell lymphoma of chickens caused by the Marek's disease virus (MDV), an oncogenic avian herpesvirus. MD is a major cause of economic loss to the poultry industry and the most serious and persistent infectious disease concern. A full-sib intercross population, consisting of five independent families was generated by crossing and repeated intercrossing of two partially inbred commercial White Leghorn layer lines known to differ in genetic resistance to MD. At the F6 generation, a total of 1615 chicks were produced (98 to 248 per family) and phenotyped for MD resistance measured as survival time in days after challenge with a very virulent plus (vv+) strain of MDV. Results QTL affecting MD resistance were identified by selective DNA pooling using a panel of 15 SNPs and 217 microsatellite markers. Since MHC blood type (BT) is known to affect MD resistance, a total of 18 independent pool pairs were constructed according to family × BT combination, with some combinations represented twice for technical reasons. Twenty-one QTL regions (QTLR) affecting post-challenge survival time were identified, distributed among 11 chromosomes (GGA1, 2, 3, 4, 5, 8, 9, 15, 18, 26 and Z), with about two-thirds of the MD resistance alleles derived from the more MD resistant parental line. Eight of the QTLR associated with MD resistance, were previously identified in a backcross (BC) mapping study with the same parental lines. Of these, 7 originated from the more resistant line, and one from the less resistant line. Conclusion There was considerable evidence suggesting that MD resistance alleles tend to be recessive. The width of the QTLR for these QTL appeared to be reduced about two-fold in the F6 as compared to that found in the previous BC study. These results provide a firm basis for high-resolution linkage disequilibrium mapping and positional cloning of the resistance genes.


Background
Marek's disease (MD) is a highly contagious disease that affects chickens worldwide and is caused by the Marek's disease virus (MDV), an oncogenic avian herpesvirus. MDV replicates in the lymphoid tissues [1] and is commonly manifested as an acute disease with lymphomas in multiple visceral organs [2]. MD has a large impact on production, making it the most costly viral disease in the chicken industry [3]. It has also been proposed as a natural model for lymphomas that over-express Hodgkin's disease antigen (CD30) [4], and for understanding how vaccines control lymphomas [5] and how pathogenic viruses continue to evolve [6].
The current family selection methods used for improving resistance are based on challenge exposure, and hence are very expensive in terms of selection space, and the financial and ethical cost of routinely challenging large numbers of birds. It is anticipated that identification of QTL for MD resistance will eventually allow marker-assisted selection on an individual bird level, without need for routine challenge. This will greatly enhance the accuracy of selection, and reduce costs by orders of magnitude. QTL mapping, in conjunction with positional comparative cloning, gene expression [7,8], and protein-protein interaction studies [9,10] could also provide a platform for identification of the genes underlying the QTL. The identification of these genes will provide insight on disease pathways and resistance mechanisms, leading to more effective vaccines or other control strategies, and even more effective selection schemes based on gene-assisted selection [11].
Simulation and theoretical studies [12][13][14] show that large scale BC or F2 experiments can locate QTL of moderate allele substitution effect (0.2 or 0.3 standardized units) only to within a confidence interval of 10 to 20 cM or more. Consequently, Darvasi and Soller [15] proposed the use of Advanced Intercross Lines (AIL) for fine mapping of QTL. An AIL is initiated by a cross between two inbred lines, and continued by random intercrossing through successive generations. Recombination events between loci accumulate with the advance of the intercross generations. The AIL approach has been applied successfully in many mouse studies (e.g., [16]). A full-sib intercross line (FSIL) [17] is a variant of the AIL that is suitable for analysis of outcrossing populations. In a FSIL, a single male and female from the same or different populations, are crossed to produce a large full-sibship. The full-sibs are mated at random, and the population continued by random intercrossing to advanced generations. To a large extent, map expansion and QTL resolution in an FSIL parallel that in an AIL.
Previous studies in commercial and non-commercial White Leghorn populations have identified genes, QTL and genomic regions associated with resistance to MD (briefly summarized in [18]; see also [19]). Heifetz et al. [18], using selective DNA pooling, identified 15 QTL affecting MD resistance in a large reciprocal backcross (BC) between two partially inbred commercial White Leghorn layer chicken lines that differed in resistance to MD. Five of these QTL were previously reported by McElroy et al. [20] in a smaller study of an independent hatch of one of these BC populations, and four were previously reported by Yonash et al. [21] in a study of an F2 population generated by a cross between two inbred White Leghorn layer lines that differed markedly in resistance to MD.
In the present study, a large F6 FSIL was generated as a continuation of the same F1 population from which the BC populations of McElroy et al. [20] and Heifetz et al. [18] were derived. The objective of the study was to confirm previously mapped QTL, improve QTL map resolution, and uncover additional QTL segregating in this population. Table 1 shows characteristics of resistant and susceptible pools according to family and MHC blood type (BT).  1 The number of individuals in the susceptible pools was low (10 to 12 birds), hence an additional susceptible pool was created, which contained the next most susceptible 20% of birds.   [18]. At 56.7%, the average mortality rate was much higher in the F6 families than in the pure line chicks but close to the target mortality for the challenge test (50%).

Densitometric genotyping
All pools were densitometrically genotyped for microsatellite or SNP markers as described in methods, and differences in allele frequency between resistant and susceptible pools (D-values) were calculated. In order to validate the markers that had high D-values, 35 microsatellite markers were retested across the resistant and susceptible pools of all family × BT combinations. The correlation between first and second genotyping was calculated for each marker separately across all pools. Except for one outlier with a correlation of 0.52, the individual marker correlations ranged from 0.70 to 1.0, with an average of 0.93. Including the outlier, the overall correlation between the two genotypings was 0.87. Since this correlation represents a double path (from first genotyping to actual pool content, and from actual pool content to second genotyping), this observed correlation of 0.87 is equivalent to a correlation of 0.93 between a single genotyping and actual pool content. This high correlation confirms the reliability and accuracy of the pool genotyping results as representing the content of the pools. Nevertheless, since both genotypings were done on the same pools, the high correlation does not relate to the degree to which the content of the pools represents the genotypes of the individuals making up the pool. The number of significant markers for each test, and the statistical power of the test are also presented in Table 2. Power was generally low, indicating that many additional markers may have been in linkage to QTL but did not  These markers were significant for CS, but were not supported by IA. They are included in the list of QTLR, because they corresponded to QTLR previously mapped by [18]. PFP thresholds: Z, 0.028; CS, 0.048; *, Non-significant but part of a linked series of significant markers defining a QTLR; Italics, markers not included in final count of QTLR because significant only for a single test and marker; §, CS not supported by IA; Type, tests for which marker is significant; r at the end of a marker name represents a repeat test of the same marker; Spaces between rows delineate the individual QTLR. reach statistical significance. This is to be expected. As a result of the map expansion in the F6, markers "move away" from their QTL. Hence, the significance of the test drops, reducing power, even though the marker is still in linkage with the QTL. Power for CS was greater than for Z. This is probably due to the fact that CS will often be significant when Z is significant, and in addition, CS will also be significant when interactions are important, while Z will not be significant in these instances. Power for IA is also presented, but does not carry much meaning, as the proportion of tests estimated as representing linkage is very low for this test. Table 3 shows P-values for Z, CS and IA for 60 markers for which Z or CS, or both were significant at the PFP = 0.20 threshold. In addition, 14 markers are listed that were part of a sequence of significant markers defining a QTLR, although they did not reach significance for Z or CS. Of the 60 markers with significant tests, 10 were not included among the QTLR, because they were significant for a single test only, or were not part of a sequence of two or more markers each with at least one significant test. Of the remaining 50 markers, 36 were significant for CS with strong support from IA. Correspondence between CS and IA is particularly striking in the instances where both differed widely from Z. Eight markers were significant for CS, but were not supported by IA, and hence were initially not considered as significant for purposes of QTLR definition. However five markers (on chromosomes 9 and 15) that were excluded initially as not meeting criteria, were included among the final list QTLR, because they corresponded to QTLR previously mapped by [18]. This returned four markers that were significant for CS, but not supported by IA. In all, 52 markers were considered as defining QTLR. Of these, 27 were significant for CS only, while 25 were significant for Z, of which 11 were significant for Z only and 14 were significant for Z and CS. Joint significance of Z and CS basically means that CS is supporting Z, since large uni-directional D values will generate significance for both tests. Since power for CS is less than for Z, it is expected that some markers will be significant for Z but not for CS. Thus, about half of the significant markers represented significant main effects, while the other half represented significant interaction effects without significant main effects.

QTLR
Based on Table 3, QTLR and non-Q regions, as defined in Methods, are listed in Table 4, which shows the total number of markers in each region (counting also non-significant markers incorporated within a QTLR) and the location of the markers defining the start and end points of each region. For QTLR, the statistical tests for which the region is significant, the estimated allele substitution effect on survival time, and overlap with regions of significance in Heifetz et al. [18] and Yonash et al. [21] are also shown. A total of 21 QTLR were identified. Of these, 8 were significant for CS only, while 13 were significant for Z either alone (6) or for Z as well as CS (7). For six of the regions, QTL location and confidence interval, as determined by IA, are also shown in the table.
Eight of the QTLR corresponded to QTLR that were previously identified by Heifetz et al. [18] in the reciprocal BC populations generated from these lines. As noted above, however, two of these (QTLR 9-II and 15-II) were of borderline significance in the F6 and were included among the F6 QTLR because of support from the Heifetz et al. [18] study. In addition, three of the QTLR corresponded to significant QTL reported by Yonash et al. [21], of which one was also reported in the Heifetz et al. [18] study. Thus 11 out of 22 QTLR identified in the F6 were supported by other studies of these lines or of other Leghorn layer lines.
The total absolute allele substitution effect summed across all 21 QTLR was 95.84 days, with a mean absolute allele substitution effect of 4.56 days. This compares well with the mean absolute allele substitution effect of 5.53 days for the QTLR identified in the Heifetz et al [18] study when based on combined data of the two BCs. Of the 21 QTLR, 14 presented positive effects, indicating that the resistant allele originated from Line 2 (the more resistant line), and 7 presented negative effects, indicating that the resistant allele originated from Line 1 (the less resistant line). This corresponds to the results of the BC analysis [18], in which case also resistance alleles originating from Line 1 were uncovered. The total summed effect for survival time, across all 21 loci including algebraic sign, was 27.20 days in favor of Line 2. Although not directly comparable, this is in accord with the observed difference of 21.0% in proportion of survivors in favor of Line 2, found in the present study.
When some of the same QTL are uncovered in independent genome scans, this supports the validity of these QTL. However, since independent scans will have some QTL in common and some that differ, it is always possible that at least some of the shared QTL represent chance Type I errors that happened to occur in the same chromosomal region. The likelihood of this decreases when the common QTL share specific qualities, other than their quantitative effect. For example, in a previous study of QTL affecting milk yield and protein percent in Brown Swiss dairy cattle [22], it was found that QTL identified in the Brown Swiss breed shared specificity (for milk yield or for protein percent, or both) with those identified in the Holstein breed. This was taken as strong support for the reality of these QTL. Similarly, in the present study, the identified QTL have specificity in terms of direction of effect (whether the allele derived from the more resistant line had a positive of negative effect on resistance) and magnitude of effect. It is striking, therefore, that of the 8 QTLR  that were found in the present study and also in the Heifetz et al. [18] study, the same 7 had positive effects in both studies, and the same one had a negative effect in both studies. The likelihood of either of these events is low, although within the realm of chance. Furthermore, the correlation between estimated signed allele effects of the common QTLR in the F6 and BC was 0.67 (P < 0.03); the correlation increased to 0.84 (P < 0.001) if an exceptional value (for QTLR Z-III) is excluded. Thus, the nearly significant correspondence of algebraic sign for the common QTLR of the two studies, and the significant correlation of their estimated effects strongly supports their underlying reality. The preponderance of positive effects among the QTLR that were common to the two studies, suggest that these are The QTLR that contributed to the superior resistance of Line 2.

Discussion
MHC type B15/B15 was clearly more resistant than B2/B2 and B2/B15, but the resistance appeared to be fully recessive, as the heterozygote was virtually identical to the B2/ B2 homozygote. Similar results were obtained in the reciprocal backcross study of these founder lines [18]. This was somewhat unexpected, as a number of studies comparing the influence of various MHC haplotypes on MD in different strains of chickens found that the B2 haplotype is frequently associated with greater resistance [23,24]. These data are, however consistent with others that suggest that some genes may interact to complement the MHC haplotype influence [25].
The QTL results of this study accord well with those of the previous reciprocal backcross study of these lines [18]. In the BC case, a total of 15 QTLR were identified. Of these, 3 QTLR were significant for CS only, 12 were significant for ANOVA or Z, of which 8 were also significant for CS. Three-fifths of the QTLR identified in the previous study were also identified in the present study, and the effects of the common QTLR were identical in sign, and highly correlated in magnitude.
Considering the three main MD resistance QTL-mapping studies, the present F6 and the BC [18] experiments between them identified all four of the significant Yonash et al. [21] QTL; 8 QTL were common to the F6 and BC [18] experiments; while 13 and 7 were uniquely identified by the F6 and BC [18] [26]. Thus the observed difference in power of the two experiments is best attributed to sampling variation.
Overall, Line 2 contributed about twice as many resistance alleles as Line 1, as would be expected from the relative resistance of the two lines. Curiously, in the present study, the pure line controls of Line 1 and Line 2 displayed much higher resistance (28.6 and 7.6% mortality to end of test, respectively), compared to the F6 (52.7% mortality to end of test). Thus, interactions of the background genome appear to have major effects on the expression of resistance [27]. The fact that the cross of the two lines was more susceptible than each of the individual lines is consistent with other indications that the QTL alleles that confer resistance in these lines are recessive, so that the cross shows negative heterosis for resistance. This stands in some contrast to the results of Stone [28], who conducted a number of crosses between ADOL resistant and susceptible lines and concluded that MD resistance was dominant, and of Yonash et al. [21], who found that at a majority of their identified QTL, the resistant alleles were dominant. The results are, however, consistent with the Heifetz et al. [18] study, in that QTL mapping in the reciprocal backcross populations developed from these founder lines identified different QTL, as would be expected if QTL for resistance are recessive. The hypothesis that resistance alleles are recessive also predicts that the F6 should identify QTL that came to expression in both of the reciprocal BC populations. In this regard, eight of the QTL identified in the BC populations corresponded to QTL identified in the F6. Of these, one was specific to BC1, three were specific to BC2, and four were found in both BC's. Seven of the QTL mapped in the BC, however, did not have corresponding QTLR in the F6. The simplest explanation for these negative results may be the incomplete power of the two experiments, which appreciably reduces the likelihood that the same QTL will be found in the two experiments.
Three of the QTLR identified in the F6 were also identified in the Yonash et al., [21] study. This is particularly noteworthy, since the challenge strain in the Yonash et al. [21] study was JM/102W, which is a virulent (v) MDV strain, but less pathogenic than the vv+ MDV strain 648A used in the present study. Thus, this lends some support to the widely held assumption that QTL conferring resistance to one MDV strain will confer resistance to another as well, at least in White Leghorns where genetic diversity is limited.
Considering QTL map locations in the BC as compared to the F6, there is a clear tendency for distinctly narrower QTLR in the F6 than in the BC; average QTLR extent in the F6 was just about half that in the BC, as anticipated from the observed map expansion in the F6. Additional genotyping at higher marker density across the QTLR is clearly warranted to obtain full benefit of the F6 map expansion.
Counting all resistance QTL uncovered in the two BC populations by Heifetz et al. [18] and in the F6 in the present study, gives a total of 28 QTL; eight common to both series of crosses, 13 uncovered only in the F6, and 7 uncovered only in the BC populations. Of the 8 common QTLR, 5 were significant by Z or by Z and CS, indicating important main effects. These results should provide a strong platform for comparative positional cloning, after confirmation of the associations by individual genotyping of the pools. Comparative functional genomics based on the complete chicken genome sequence could be used to identify candidate genes in the identified chromosomal regions. As a preliminary exercise, we have searched Build 2 of the chicken genome across 5 cM centered at each of three regions which had narrow widths in the present study, and corresponded to regions of significance in [18] or [21], namely: QTLR 2-II (supported by [21]), centered at ADL0270 at about 9.7 Mb; QTLR 8-II (supported by [18] and [21]), centered at HYL08003 at about 18.5 Mb; and QTLR 9-II (supported by [18] antigen is a receptor involved in cell adhesion and signalling, that is present on the surface of most activated leukocytes. MDV is thought to infect and transform only activated CD4+ T cells. Consequently, cell adhesion might assist the virus transmission from one infected cell to another, as MDV is highly cell associated. PIGK is a subunit of the GPI transamidase complex that catalyzes the attachment of GPI (glycosylphosphatidulinositol) to proteins. GPI is a membrane anchor for cell surface proteins.
As such it provides for rapid protein release in response to a stimulus, since the protein bound by the GPI anchor can be immediately released without a requirement for RNA or protein synthesis. In this context it is relevant that SCA2 (stem cell antigen 2) has a GPI anchor, and is one of the most strongly documented MD resistance genes [9]. For QTLR 9-II, using LEI0197 (at 13.0 Mb) as the reference point, we were unable to find any attractive candidate genes. This may in part be because the biology of MD resistance is not well defined. A further contributing factor is the fact that much of the chicken gene annotation is inferred by electronic annotation and not by experimental evidence. Consequently, one can speculate about almost any gene being involved in viral replication and spread, or cellular transformation, especially as the biomedical literature is slanted heavily towards cancer.
These and other candidate genes can be screened further by examining their mRNA expression pattern under MDV challenge, using existing data banks. However, they could also readily be examined for linkage disequilibrium with MD resistance by constructing appropriate resource populations using the existing data and sample banks accumulated at Hy-Line through the routine MD challenge and testing component of their regular commercial breeding program. Such association tests could also be implemented by selective DNA pooling, perhaps using the new fractioned pool designs [29] to increase power and accuracy.

Conclusion
If the resistance alleles are recessive and at low to moderate allele frequencies, this would explain the slow response to selection for increased resistance, and enhances the need for mapping in order to increase the effectiveness of selection on these QTL within the pure lines. Because most chickens used for commercial egg or meat production are crosses between 2 to 4 lines, it will be important that the recessive resistance alleles are present in all pure lines, such that the cross will be homozygous for the resistance QTL. For lines that lack the resistance alleles, which would need to be determined as a first step, it might be most effective to introgress resistance alleles from one line to another. Achieving this, without losing the heterosis for production traits that characterizes the cross of these commercial lines, could only be achieved through marker or preferably gene assisted introgression focused on the limited genome regions carrying the resistance alleles.

Resource population Stocks
The experiment was carried out using facilities and two commercial White Leghorn lines (henceforth, Line 1 and Line 2) of Hy-Line International (henceforth, Hy-Line). Both lines were partially inbred and had been subjected to selection for resistance to MD, and both were relatively resistant when compared to field strains of poultry. However, under the same challenge protocol as the present study, Line 2 was distinctly more resistant than Line 1. Further details of these lines are in Heifetz et al. [18].

Experimental populations
In order to provide replication and some indication of QTL segregation within the two lines, the F6 FSIL was produced in five independent replicates, termed "FSIL-families", as follows ( Figure 1): Five Line 1 males were each pair-mated with a single Line 2 female, to produce an F1 generation consisting of five large and independent fullsib F1 families. Each of these F1 families served as the founder of one of the replicate FSIL-families. Within each of these five F1 founder families, 7-10 males were randomly mated to two females each, to produce seven F2 subfamilies within each of the five FSIL-families. Each of the F2 subfamilies within each of the five FSIL-families was then continued by crossing full-brother and sister within the subfamily for another two consecutive generations, creating at the F4 generation five replicate FSIL-families, each consisting of 7 partially inbred subfamilies. This was done to maintain genetic diversity within each family, while still accumulating recombination events. At the F4 generation, males from each of the seven subfamilies within each FSIL-family were crossed to females of other subfamilies within their own FSIL-family (but not with their own full-sisters, or with females of the other FSIL-families). In this way, the five separate F5 FSIL-families were reconstituted. In the F5 generation, a total of about 30 males and 120 females per FSIL-family were chosen and mated at random to produce the five independent F6 replicate FSIL-families, which constituted the mapping population. The F6 birds were produced in two hatches, with a total of 862 and 753 female chicks for Hatches 1 and 2, respectively. The total number of female F6 birds within each of the five FSIL-families with full data (survival time, MD diagnosis) ranged from 98 to 248.

MD challenge test
Day-old female F6 chicks from the two hatches were vaccinated with 500 plaque forming units (pfu) of bivalent HVT/SB-1 vaccine (Merial Select, Gainesville, GA, 30503) and then housed in cages. A total of 200 pure line chicks from each of the two parental lines were included as controls for each hatch of the F6 (800 chicks total). At 7 days of age, the chicks were inoculated subcutaneously with 500 pfu of the very virulent plus (vv+) strain (648A) of MDV [30]. Age at mortality was recorded on all chicks, as an indicator of MD resistance, until 116 and 123 days of age for Hatches 1 and 2 respectively, at which time remaining birds were terminated. Under normal circumstances, mortality in these lines is virtually zero (0.5%) from 2 to 18 weeks of age. Under challenge, MD mortality does not begin until four weeks of age at the earliest. After this age, however, virtually all mortality in the challenged birds is due to MD. Therefore, any chick that died after 21 days was considered to have died from MD, without positive diagnosis by necroscopy. At the end of the experiment at 18 weeks (126 days, just prior to entry into lay), the birds were visually examined and any birds that were blind, lame or completely paralyzed were identified as having MD. However, birds with mild symptoms or birds that had recovered, were not recognized as such, and hence were classed as survivors. Thus, the resistant pools, which were made up of survivors, may also have included some proportion of susceptible birds with MD tumours. Target mortality for the challenge test averages about 50%, but in any particular hatch this varies considerably, from 30 to 70%. The challenge test and diagnosis procedure has been shown to be repeatable and to result in data with substantial heritability for sire progeny averages based on 30 daughters (pers. comm., Neil O'Sullivan, Hy-Line Int.).

DNA extraction and pool construction
In order to reduce the number of genotypings while limiting loss of power, the selective DNA pooling method [31,32] was used. This method has proven very accurate in our laboratory [33]. Ten and six days post infection for Hatches 1 and 2, respectively, blood was collected from the jugular vein in syringes containing EDTA with 22 gauge needles. DNA was prepared using salt and ethanol precipitation. The OD 260/280 ratios were determined and each sample was diluted to approximately 50 ng/μl DNA concentration. DNA content was retested and further diluted to 25 ng/μl. Pools of DNA were made by combining equal volumes of the 25 ng/μl samples from each bird in the pool.
Mortality rate was very similar for the two hatches (56.2 and 57.2%, respectively) and was also consistent across hatches for the five families considered separately (data not shown). For each F6 family, therefore, chicks from the two hatches were combined when forming pools. There were three blood type (BT) groups in each F6 family: B2/ B2, B2/B15, and B15/B15. To account for the well known effects of MHC blood type on resistance to MD [34][35][36], birds were pooled within B blood groups. For selective DNA pooling, progeny within each family × BT combina-tion were ranked by age at mortality and the top and bottom 20% from each combination were chosen for the pools (Table 1). This gave 30 pools in total (5 families × 3 BT × 2 tails (high and low)). However, the Family 4 × B15/ B15 combination had only 5 birds in total, which reduced the total number of pools to 28. The remaining pools ranged in size from 9 to 53 individuals. For most of the susceptible pools, the cut off survival age to be included in the pool varied from 62 to 79 days. However, two pools (Family 1 × B15/B15, and Family 2 × B2/B2) included birds surviving up to 115 and 92 days, respectively. In family × BT combinations with small numbers, the number of individuals in the susceptible pools was low (10 to 12 birds), hence an additional susceptible pool (Type II pool) was created, which contained the next most susceptible 20% of birds (cut off points demarcating the two pools can readily be inferred from the range column of Table 1). There were four susceptible pools of this type. With respect to the resistant pools, in most family × BT combinations the proportion of surviving (i.e., resistant) birds exceeded 20%, since overall survival to end of test was 56%. In this case, birds for the resistant pools were chosen so as to ensure that each hatch was equally represented. In all, then, there were a total of 32 pools: 14 resistant pools and 18 susceptible pools (14 type I and 4 type II).

Genotypic information Markers
Microsatellite markers were chosen that had alleles differentiating the two lines. Genotyping of pools was per-formed in two rounds. The first round included 182 markers, chosen to provide representative genome coverage. Based on the first round of genotyping, 35 markers that gave a suggestive result were genotyped again on the same pools. These results were treated in the final analysis as though they were new markers at the same location. In addition, in 10 regions that showed results of special interest in the first round, 50 new markers were added, among them 35 microsatellites and 15 SNPs. Thus, in total the pools were genotyped for 232 markers: 15 SNPs and 217 microsatellites. However, since 35 markers were genotyped twice, and the replicates were treated as new markers, all told there were 267 single-marker statistical tests. The chromosomal regions spanned by the genotyped markers summed to a total of 2477 cM. Adding 10 cM upstream and downstream of the most distal and proximal markers for each of the 15 scanned chromosomes would add another 300 cM, giving a total of about 2800 cM, or about 70% of the chicken genome. The average distance between markers on the same chromosome was ~17 cM, with a maximum interval of 107 cM on chromosomes 3.

Marker position
The interval analysis method used for QTL mapping requires recombination rates between markers. To obtain these rates, markers were positioned on a linkage map based on the consensus 2000 map [37]. When there was a discrepancy between the consensus map and the published Build 1 genome sequence [38], the order of the markers was taken according to the sequence and markers were positioned as described in Heifetz et al. [18].
Genetic distances between markers in cM, as given in the public marker maps, are based on data obtained in F2 or BC populations, and hence represent effects of a single round of recombination. For AIL, infinitesimal map distances expand according to the expression d t = 1/2 td, where d t is map distance in the F(t) generation, and d is the map distance in the F2 generation [15]. According to this expression, for an F6 population, the overall expansion factor is three-fold. For interval analysis, map distances calculated in this manner were transformed to recombination rate (r) values using the Haldane map function. The three-fold map expansion applies to AIL with random mating. Here, the lines were propagated by brother-sister matings to generate the F3 and F4 generations, which created some level of inbreeding and may have reduced the actual expansion of the map. Indeed, studies under way indicate that the realized expansion was two-, rather than three-fold. Therefore, the interval analysis was done using marker maps calculated with an expansion factor of 2, 3 and 4 to check the sensitivity of the results to this parameter.
The experiment design Figure 1 The experiment design.

Genotyping methods
Genotyping for the B group was done by standard serotyping using B2 and B15 specific reagents. For the microsatellite markers, allele frequency differences between high and low pools were estimated as described by Heifetz et al. [18]. For eight of the SNP markers, relative allele frequencies were determined by pyrosequencing. In brief, mini-sequencing primers were designed by the Pyrosequencing Assay Design Software (version 1.0) (Biotage, Uppsala, Sweden). The forward, biotinylated reverse, and sequencing primers were synthesized by Operon (Huntsville, AL). Fragments were amplified in a total volume of 40 μl by PCR as described by Liu and Cheng [39]. The biotin-ssDNA was isolated by binding PCR amplicons to streptavidin sepharose (Amersham Biosciences, Uppsala, Sweden) followed by denaturation with 0.2 M NaOH. The mini-sequencing primer was annealed to the ssDNA, and the sequencing reactions were initiated using a nucleotide dispensation order based on the sequence. The PSQ96MA analysis program (Biotage, Uppsala, Sweden) calculated a percentage for each base at the SNP using the surrounding peaks as reference values. Controls included absence of template and known homozygous and heterozygous individuals. Using the same procedures but calling genotypes instead of allele frequencies, the remaining seven SNP markers were genotyped individually on all individuals that constituted the pools. For these markers the frequency in each pool was the actual frequency of the alleles in the pool and was treated in the analyses as a pool frequency estimate.

Frequency estimates and D-values
Following Lipkin et al. [32], the frequencies of alleles at each microsatellite marker estimated from pools were corrected for differential amplification and for the shadow bands that are inherent to microsatellite markers. The basic datum that was used to identify markers that are associated with QTL for MD was the difference (D ijk ) in allele frequency of the i th marker between the resistant and susceptible pools of the j th BT × k th family combination: D ijk = dF ijk1 -dF ijk2 , where dF ijk1 and dF ijk2 are the densitometric estimates of the frequency of the marker allele derived from Line 2 in the resistant and susceptible pools, respectively.
A total of 18 D ijk were calculated for each marker. These included 14 D ijk based on the resistant pools and the Type I susceptible pools; and an additional four D ijk based on the four Type II susceptible pools and their corresponding resistant pools (which are the same as for the corresponding Type I pools). Although the four Type II D ijk are not independent of their corresponding Type I values (since they are from the same test population and were contrasted to the same resistant pool), they were considered as being independent in the analyses which follow.  [18] and will not be repeated here. Each of these tests explored a somewhat different aspect of the data. In particular, the Z-test evaluates the main effect of a marker allele on D ijk across all BT × family (BF) combinations. Thus, the Z-test is sensitive to main effects and provides an estimate of the direction of the effect of specific alleles, but is insensitive to marker-BT-family interaction effects. The Chi-square test analyzes D ijk within each BF combination, allowing for different directions of effects and is, therefore, less sensitive to main effects than the Ztest but more sensitive to interaction effects. The Z-and Chi-square tests are both based on analysis of single markers. To take into account the additional information present in adjacent markers, D ijk for all markers on a chromosome were analyzed jointly using a likelihood-based interval mapping (IA) method [40], as implemented by Heifetz et al. [18] for a backcross but with additional adjustments for autosomal and Z chromosomes in an F6 population. By analyzing each BF combination as a separate family, the IA shares with Chi-square its sensitivity to interaction effects but is also less powerful than the Z-test to detect main effects. These three tests are based on the D ijk divided by their standard errors (SE). Two additional tests: a three-way ANOVA (marker × blood type × family) and a nonparametric sign test were also used. These, although based on the same D ijk , each use a different basis to test significance, in this way providing an additional control to the statistical calculations. Both ANOVA and the Sign Test share with the Z-test its sensitivity to main effects and insensitivity to interaction effects. The Sign Test was used primarily as a check on the Z-test and ANOVA, and proved effective in indicating technical errors in the analysis, but will not be presented or discussed in detail. Mean Square error in the ANOVA was estimated from the pooled interaction effects, rather than from replicated markers. Hence, considering that significant interaction effects were present in this population, the MS error term would have been subject to upward bias, which increases P-values for this test, decreasing power relative to the Z-test.

Accounting for multiple tests
To take into account the multiple test situation while retaining power, a 20% "proportion of false positive (PFP)" threshold was used to determine the critical comparisonwise error rate (CWER) or P-value for declaring marker-QTL linkage [41], where PFP for the i th test is calculated as: PFP i = (P i t N )/R i P i is the P-value of the i th test, when the N tests are ranked by their P-values from lowest to highest, R i is the rank number of the i th test, and t N is the number of tests for which the null hypothesis is true.
Estimation of t N was by the Nettleton et al. [42] algorithm for ANOVA, Z and CS (see also [18]).
Given an estimate of t N , the number of tests representing falsified null hypotheses, f N (i.e., tests representing true marker-QTL linkage), can then be estimated as f N = N -t N , and effective power, as o N /f N where o N is the number of tests that are significant according to the designated significance level.
For the IA test, the simplified algorithm of [43] was used to estimate t N , which works well when t N is almost equal to N. The PFP calculation was done using all IA tests that were conducted on a chromosome at 1 cM intervals on the expanded map, as in the range of CWER values > 0.001 there was a fairly smooth and monotonic relationship between rank number and PFP (see also figure 2 of [44]). On this basis, PFP calculation was done using all IA tests that were conducted on a chromosome at 1 cM intervals on the expanded map.

Effect of added markers and map expansion on PFP threshold levels
As noted above, genotyping was performed in two rounds. Markers used in the first round were chosen to provide good genome coverage, but without relation to previously identified genes or QTLR affecting MD resistance. However, the 85 markers used in the second round were specifically targeted to "suggestive" regions uncovered in the first round. These were either repeat genotypings of the suggestive markers themselves (35 markers) or additional markers targeted to the suggestive regions (50 markers). Thus, the actual proportion of positives can be expected to be greater among the second round markers than among the first round markers. Combining the two sets of markers in a single PFP analysis, as was done in the present study, will increase the actual proportion of positives among all markers and, hence, can be expected to render the PFP thresholds somewhat less stringent relative to those appropriate to the first set of markers, but more stringent relative to those appropriate to the second set of markers. This consideration will be more important for the single marker tests (ANOVA, Z, CS) than for IA, since IA deals with chromosomal regions and the number of markers in a significant region is not material.
With respect to the effect of map expansion on PFP thresholds, for Z, ANOVA, and CS the effect of map expansion should be to increase average distance between markers and QTL, and hence increase P-values for all markers except those that are very close to the QTL. Thus, there should be a general increase in P-values. This will result in more stringent PFP thresholds for the F6 as compared to the same markers in the F2 or BC, and a consequent reduction in power as compared to the BC. As will be demonstrated in results, loss of power was especially severe for ANOVA and IA. For ANOVA, loss in power due to map expansion added to the loss of power due to use of the interaction MS as error term. For IA, in addition to the reduced P-values all along the genome, there will also be a massive increase in the number of non-significant test points (after taking three-fold map expansion into account, the IA analysis included 12,753 individual points). Since the number of QTL is fixed, the ratio of QTL to test-points drops precipitously, with the vast majority of test points representing true null hypotheses. This will necessarily increase stringency of the threshold for a given PFP, since total number of test points enters PFP calculation in the numerator. For this reason, we did not relate to the actual PFP thresholds for ANOVA or IA, but to the absolute P-values, and considered ANOVA and IA as supporting significant Z-test and CS tests, respectively. In some cases CS gave a significant P-value for a marker, while IA gave P ≥ 0.50 for the same marker. In all such instances, this was due to the fact that CS P-values for markers closely flanking the significant marker were high and not consistent with the low P-value of the significant marker. Thus, IA was more correctly reflecting the overall significance of the region. In these cases, the IA results were taken as definitive, and the single significant CS markers were attributed to technical error and therefore not taken as indicating a linked QTL. When both IA and CS indicated significance, the IA also provided a point estimate of QTL location according to the cM of highest significance, and a confidence interval of QTL location according to the region for which P < 0.05.

Defining QTL containing regions (QTLR) and regions not containing QTL (non-Q regions)
Examination of the full set of marker test results showed that significant markers most often appeared in short stretches of two to five closely linked markers (in some cases incorporating one or more non-significant markers). These stretches of significant markers were flanked by stretches of non-significant markers. Each such stretch of one or more significant markers was taken to define a single QTL containing chromosomal region (QTLR). The flanking non-significant regions were taken to define regions that did not contain QTL (non-Q regions). The minimum extent of a QTLR is thus defined by the locations of the start and end markers of the QTLR; the maximum extent of a QTLR is defined as running from the midpoint between the distal marker of the QTLR and the proximal marker of the flanking upstream non-Q region to the midpoint between the proximal marker of the QTLR and the distal marker of the flanking downstream non-Q region. For example, QTLR 1-IV extended from 245 to 248 cM. It was flanked upstream by non-Q region 1-V having proximal marker at 259 cM, and flanked downstream by non-Q region 1-III having distal marker at 243 cM. Thus, the minimum extent of this QTLR was 3 cM, running from 245 to 248 cM; and the maximum extent of the QTLR was 9.5 cM, running from 244 cM to 253.5 cM.

Estimating allele substitution effects
Allele substitution effects for a given QTLR were calculated as described in Appendix I, from the average differences, D, in allele frequency of the alleles derived from Line 2 in the resistant and susceptible pools. When a QTLR was represented by multiple markers in the same region, D was set equal to the weighted average D ijk -value across the resistant and susceptible pools of the 18 BF jk combinations within each of the M i markers defining the QTLR, weighted by the number of individuals in each pool.

Searching for candidate genes in selected QTLR
The marker of interest for a specific QTL was identified in chicken genome assembly Build 2. Based on the cM per Kb from [45], the entire region was visualized on the USCS browser [46]. Using the USCS browser all putative genes in the region were identified using a variety of tools to visualize the features and annotations (e.g., chicken RefSeq, non-chicken RefSeq, chicken EST and mRNA). The identified genes were manually examined for a potential role in MD pathogenesis.

Authors' contributions
EMH carried out all statistical analyses and participated actively in their interpretation and in writing all drafts of the manuscript. JEF participated in planning the study, carried out blood collections, DNA isolation, microsatellite genotyping and densitometric allele frequency analyses for selective DNA pooling and reviewed all drafts of the manuscript. NPO participated in planning the study and was responsible for implementing the crosses, directed the challenge tests, and provided overall coordination of the phenotyping aspects of the study. JAA was responsible for initiating and setting up this project, and providing initial overall direction. He participated actively in writing the final draft. HC added new SNP markers in selected regions of interest, carried out SNP genotyping, reviewed later drafts of the manuscript and carried out the candidate gene search. JW was involved in evaluating alternate mating designs for the later generations of the AIL, provided programs for the interval analysis and helped adjust them to the F6-AIL design. MS conceived and participated in planning the study and the later rounds of statistical analyses, and shared in their interpretation and in writing the advanced drafts of the manuscript. JCMD was involved in designing the later generations of the FSIL, secured funding for the statistical analysis of the data generated by the project, participated in planning and interpreting the statistical analysis and participated in writing the first draft and reviewing the later drafts of the manuscript.
All authors read and approved the final manuscript.

Estimating the effects of individual markers on survival time
Let P R and P S be the proportion of total population selected to construct the resistant and susceptible pools; let α P be the observed allele substitution effect of the Line 2 marker allele relative to the Line 1 marker allele taken over both of the selected tails of the population; and let α T be the actual substitution effect in the population as a whole. Then, substituting in the Darvasi and Soller [31] expression, gives, where, i PX = X P /P X , (P X = P R or P S ) is the selection intensity of the pool, X P is the ordinate of the standard normal distribution at the point Z P which cuts off proportion P of the distribution.
In the present study, P S = 0.22 was calculated from Table  1 as the total number of birds taken to the susceptible pools across all family and blood type combinations, including the Type II susceptible pools; P R = 0.44 was calculated from Table 1 as the weighted mean proportion selected across all family and blood type combinations with weighting according to the number of individuals in the pool.
With pool data, α P is calculated as where, taking into account that an F6 population is equivalent to an F2 population with respect to expected genotype frequencies, and following Darvasi and Soller [31], G 2 is the mean of individuals homozygous for the Line 2 allele taken over all resistant and susceptible pools, G 1 is the mean of individuals homozygous for the Line 1 allele taken over all resistant and susceptible pools.
Letting, F M2 and F M2S be the frequency of the Line 2 allele (M2) in the resistant and susceptible pools, respectively, and F M1R and F M1S be the same for the Line 1 allele, then following Darvasi and Soller [31], In the present study, since all individuals taken to the resistant pools survived until the end of the test, T R was taken as the mean test cut-off age across the two hatches, namely, 119.5 days; T S was taken as weighted mean survival time of the individuals in the susceptible pools, calculated from Table 1 across all family and blood type combinations = 63.02 days, with weighting according to the number of individuals in the pool. Thus, T R -T S = 56.5 days, and α P = 2D(56.5) days.
As noted in Heifetz et al. [18], when applied to survival data which have a right skewed distribution, the Darvasi and Soller [31] correction factor, which is based on the assumption of a normal distribution appears to provide estimates of QTL effect that are about 10% greater than the actual effects.