Fig. 3 | BMC Genomics

From: Association mapping by aerial drone reveals 213 genetic associations for Sorghum bicolor biomass traits under drought

Fig. 3

Overview of HDP-GWAS peak definition pipeline. To define preliminary GWAS peaks, all SNPs identified as significant in at least one of 460 individual GWAS were consolidated into a single file. Local pairwise LD (r²) was calculated by first estimating the relationship between pairwise LD and SNP pair distance using a Gaussian kernel smoother (σ = 500), after which, for every SNP in a particular linkage block, the SNP position was looked up in the pairwise LD table and all linked SNPs were identified (a, 1–2). A SNP was considered linked if r² ≥ 0.2 on all chromosomes except chromosomes 6 and 9, where a SNP was considered linked if r² ≥ 0.3 (a, 2). Max distance (Max dist) was then defined as the largest bp distance between linked SNPs (a, 4). This process was repeated for all linkage blocks. Once max dist was defined for each significant SNP, the upper boundary of each preliminary GWAS peak was defined as SNP position + max dist, and the lower boundary as SNP position − max dist (a, 5–7). All SNPs falling between the boundaries were then considered to be within the same GWAS peak (a, 8). This process was repeated for all peaks. If more than one peak contained the same SNPs, those peaks were merged into a single peak (a, 9). After preliminary peaks were defined in this manner, they were refined by drawing ‘zoomed’ Manhattan plots around each peak, i.e., including SNPs +/− 50 Kb from the preliminary peak boundaries (b, 10). Each zoomed Manhattan plot was then assessed visually to determine whether the peak was, in fact, a single peak, or whether the pattern of linkage indicated that it should be split into two or more peaks (b, 11). If it was determined that a preliminary peak should be split into two or more peaks, the diagnosis was confirmed by drawing a second zoomed Manhattan plot including SNPs +/− 2 Mb around the peak boundaries (b, 12).
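The preliminary peak definition and merging steps (a, 1–9) can be illustrated with a minimal Python sketch. It is not the authors' code. It assumes the pairwise LD values are already available as (position, position, r2) tuples per chromosome, that max dist for a SNP is the largest distance from that SNP to any SNP linked to it, and that chromosome names are plain strings such as "6". The function names `ld_threshold`, `max_dist`, and `preliminary_peaks` are hypothetical.

```python
def ld_threshold(chrom):
    # From the caption: r2 >= 0.3 on chromosomes 6 and 9, r2 >= 0.2 elsewhere.
    return 0.3 if chrom in {"6", "9"} else 0.2


def max_dist(chrom, pos, ld_pairs):
    """Largest bp distance from `pos` to any SNP linked to it.

    ld_pairs: iterable of (pos_a, pos_b, r2) for this chromosome (assumed layout).
    Returns 0 when no pair involving `pos` reaches the threshold.
    """
    thr = ld_threshold(chrom)
    dists = [abs(a - b) for a, b, r2 in ld_pairs if r2 >= thr and pos in (a, b)]
    return max(dists, default=0)


def preliminary_peaks(sig_snps, ld_table):
    """Build windows of SNP position +/- max dist and merge windows sharing SNPs.

    sig_snps: {chrom: [position, ...]} of significant SNPs.
    ld_table: {chrom: [(pos_a, pos_b, r2), ...]}.
    Returns a list of (chrom, lower, upper, [significant SNP positions]).
    """
    peaks = []
    for chrom, positions in sig_snps.items():
        positions = sorted(set(positions))
        windows = []
        for pos in positions:
            d = max_dist(chrom, pos, ld_table.get(chrom, []))
            lower, upper = pos - d, pos + d
            # Significant SNPs between the boundaries belong to this peak.
            members = {p for p in positions if lower <= p <= upper}
            windows.append((lower, upper, members))

        # Sweep windows from left to right, merging any that share a SNP.
        windows.sort(key=lambda w: w[0])
        merged = []
        for lower, upper, members in windows:
            if merged and merged[-1][2] & members:
                prev_lower, prev_upper, prev_members = merged[-1]
                merged[-1] = (min(prev_lower, lower),
                              max(prev_upper, upper),
                              prev_members | members)
            else:
                merged.append((lower, upper, members))
        peaks.extend((chrom, lo, hi, sorted(m)) for lo, hi, m in merged)
    return peaks


# Toy input: made-up positions and r2 values, for illustration only.
sig = {"1": [1000, 1500, 9000], "6": [200, 400]}
ld = {
    "1": [(1000, 1600, 0.5), (1500, 1600, 0.25), (9000, 9100, 0.1)],
    "6": [(200, 400, 0.25)],
}
result = preliminary_peaks(sig, ld)
```

In this toy input, the two nearby SNPs on chromosome 1 end up in one merged peak, while the pair on chromosome 6 stays separate because r2 = 0.25 falls below the stricter 0.3 threshold used there.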
After peaks were refined in this way, each individual zoomed Manhattan plot was visually assessed and rated 1, 2, or 3 based on the evidence suggesting the peak was not an artifact, where a rating of 1 indicated a peak with no evidence to suggest it was not an artifact, and a rating of 3 indicated a peak with very strong evidence that it was not an artifact. All other peaks were rated 2 (c, 14–15). Any GWAS peaks with only ‘1’ ratings were removed from the final set of significant GWAS peaks (c, 16). The final step of the pipeline is results analysis, i.e., identifying the combinations of trait, treatment, time point, and location that resulted in each significant GWAS peak (d, 17).
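The rating-based filter (c, 14–16) is simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes each peak carries a list of ratings, one per zoomed Manhattan plot in which it appeared, and the function name `filter_rated_peaks` and the peak identifiers are hypothetical.

```python
def filter_rated_peaks(ratings_by_peak):
    """Drop peaks whose zoomed Manhattan plots were all rated 1.

    ratings_by_peak: {peak_id: [rating, ...]} with ratings in {1, 2, 3}
    (assumed layout; ratings themselves come from visual assessment).
    """
    kept = {}
    for peak_id, ratings in ratings_by_peak.items():
        if not ratings or any(r not in (1, 2, 3) for r in ratings):
            raise ValueError(f"invalid ratings for {peak_id}: {ratings}")
        # Keep the peak if at least one plot was rated 2 or 3.
        if any(r > 1 for r in ratings):
            kept[peak_id] = ratings
    return kept


# Made-up ratings, for illustration only.
example = {"peak_A": [1, 1], "peak_B": [1, 3], "peak_C": [2]}
final_peaks = filter_rated_peaks(example)
```

Here `peak_A` is removed because every one of its plots was rated 1, while `peak_B` is kept because one of its plots was rated 3.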