Server-side configuration
An instance of GCViT can be set up in a Docker container or installed as a Go and Node.js application. Setup consists of editing the service configuration file, preparing data sets and connecting them to GCViT, and configuring the user interface (UI). Instructions for deploying an instance of GCViT are provided in the GitHub repository (https://github.com/LegumeFederation/gcvit).
The GCViT display
Binning and scaling
To handle the display of dense data, the chromosomes are divided into bins and a count is shown for each bin. The default bin size is 500 kb, but this can be changed in the server-side configuration file and interactively by the user. Bin sizes should be set according to SNP density and the degree of scaling: very high-density genotype data may suit smaller bin sizes, but a large genome will require larger bins, because data cannot be displayed at a scale of less than one pixel.
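The binning step can be illustrated with a minimal Python sketch. It is not GCViT's own code (the server is written in Go); the function names `bin_counts` and `smallest_drawable_bin`, the 1-based coordinates, and the one-bin-per-pixel rule of thumb are assumptions made for illustration.

```python
def bin_counts(positions, chrom_length, bin_size=500_000):
    """Count SNP positions (assumed 1-based, in bp) in fixed-width bins."""
    n_bins = -(-chrom_length // bin_size)  # integer ceiling division
    counts = [0] * n_bins
    for pos in positions:
        if 1 <= pos <= chrom_length:  # ignore out-of-range coordinates
            counts[(pos - 1) // bin_size] += 1
    return counts


def smallest_drawable_bin(longest_chrom_length, pixels_available):
    """Smallest bin size (bp) at which one bin still spans one pixel.

    Hypothetical helper: assumes the longest chromosome is drawn across
    `pixels_available` pixels and that a bin needs at least one pixel.
    """
    return -(-longest_chrom_length // pixels_available)


print(bin_counts([10, 499_999, 500_001, 1_200_000], chrom_length=1_600_000))
print(smallest_drawable_bin(60_000_000, 800))
```

Under these assumptions, a 60 Mb chromosome drawn across 800 pixels could not usefully resolve bins smaller than 75 kb, which is why larger genomes call for larger bins.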
Display types
There are three data displays: histogram, heatmap, and haplotype. The histogram view shows SNP counts in each bin, where the length of each bar is proportional to the count relative to the minimum and maximum values across the entire genome. The heatmap view shows SNP counts within each bin using color ranges that are proportional to the minimum and maximum values across the genome. The haplotype view shows SNP presence/absence within each bin if the count in the bin matches or exceeds a given threshold.
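The three displays differ only in how a bin count is turned into a glyph. The following Python sketch shows one way this could be computed; it is an illustration rather than the GCViT implementation, and the names `scale_counts` and `haplotype_calls` are hypothetical. The list `counts` is assumed to hold the counts for every bin in the genome, so that the minimum and maximum are genome-wide.

```python
def scale_counts(counts, lo=None, hi=None):
    """Map each bin count to a fraction in [0, 1].

    `lo` and `hi` default to the genome-wide minimum and maximum; they can
    be set explicitly, in which case values outside the range are clamped.
    The fraction could drive bar length (histogram) or a position on a
    color ramp (heatmap).
    """
    lo = min(counts) if lo is None else lo
    hi = max(counts) if hi is None else hi
    if hi == lo:  # flat data: avoid division by zero
        return [0.0 for _ in counts]
    return [min(max((c - lo) / (hi - lo), 0.0), 1.0) for c in counts]


def haplotype_calls(counts, threshold):
    """Presence/absence per bin: present if the count meets the threshold."""
    return [c >= threshold for c in counts]


genome_counts = [0, 5, 10, 20]
print(scale_counts(genome_counts))
print(scale_counts(genome_counts, hi=10))
print(haplotype_calls(genome_counts, threshold=5))
```

Setting `hi` below the true maximum, as in the second call, saturates the densest bins and makes differences among sparse bins easier to see.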
User Interface
The user interface (UI) controls are grouped into sections: “Configure View,” where the data set and genotypes are selected; “Display,” where the image and its interactive controls are displayed; and “View Controls,” which contains controls for turning on and off portions of the image. Detailed instructions for the UI are provided in GCViT itself, through the Help button.
Selecting the reference genotype
The first step is to select a data set and reference genotype. Data set availability is established in the configuration, along with file paths and data set name. In addition, availability of a particular data set may be controlled via simple authentication. Comparisons can be made only within a single data set.
Selecting the comparison genotypes
After selecting the data set and reference genotype, one or more comparison genotypes can be selected and each assigned a distinct color. A full color palette is provided to help distinguish the selected genotypes.
SNP comparisons
Comparisons can be displayed on the left or right side of the chromosome backbones. For each comparison, the user selects a display type (histogram, heatmap, or haplotype), and the type of comparison (alleles are different from the reference, same as the reference, or the total SNP count). Depending on the display type, the user has the option of setting specific minimum or maximum values rather than leaving GCViT to calculate them across the genome (histogram and heatmap), or of setting a threshold value (haplotype).
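The three comparison types can be sketched as a filter applied before binning. This Python example is a simplified illustration, not GCViT's code: the function name `select_sites`, the mode labels, the representation of calls as a mapping from position to genotype string, and the choice to skip sites missing in the comparison genotype are all assumptions.

```python
def select_sites(ref_calls, cmp_calls, mode):
    """Return the positions that contribute to a comparison.

    mode is "diff" (call differs from the reference), "same" (call matches
    the reference), or "total" (every site called in both genotypes).
    """
    if mode not in ("diff", "same", "total"):
        raise ValueError(f"unknown comparison mode: {mode}")
    kept = []
    for pos, ref in ref_calls.items():
        other = cmp_calls.get(pos)
        if other is None:  # assumption: missing calls are not counted
            continue
        if mode == "total":
            kept.append(pos)
        elif mode == "diff" and other != ref:
            kept.append(pos)
        elif mode == "same" and other == ref:
            kept.append(pos)
    return kept


reference = {100: "A/A", 200: "C/C", 300: "G/G", 400: "T/T"}
comparison = {100: "A/A", 200: "T/T", 300: "G/G"}
print(select_sites(reference, comparison, "diff"))
print(select_sites(reference, comparison, "same"))
print(select_sites(reference, comparison, "total"))
```

The returned positions would then be counted per bin and drawn with the chosen display type.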
Optional settings
In the Configure View Options section, the image can be given a title, the bin size can be changed, the ruler placement can be modified, and the ruler interval (frequency of tick marks and how often coordinate counts are displayed) can be changed.
Control buttons
There are three main buttons: Display, Download, and Help. The Display button generates the image. The image may be larger than the viewport, in which case it can be moved by clicking and dragging the image. The Download button gives the option of downloading the results in SVG or PNG format. The two options differ: the SVG download contains the whole image (which may be larger than what is displayed on the screen), while the PNG download contains only what is currently visible in the viewport. The GFF file that was created and used to draw the visualization can also be downloaded. The Help button provides information about GCViT and instructions for using the interface.
Pop up box
Clicking on a glyph in the image will pop up a box that identifies the bin number, chromosome coordinates, the value for each accession, and the total value for the bin. The pop up box can be customized by modifying the CViTjs pop up template. Examples of potential customizations include link-outs to other resources, such as the Germplasm Repository Information Network (GRIN) accession page, or to a genome browser. In our example on SoyBase there are link-outs to the SoyBase GBrowse instance, for exploration of genic features in the bin, and to the Legume Information System “Context Viewer,” which enables examination of synteny among similar genomic regions.
Key
Above the image, a key is displayed with the currently displayed genotypes and their respective colors. This key will update only after the Display button has been pressed to update the view.
Image controls
On the left side of the image is a toolbox that provides zoom controls and a set of drawing options that permit drawing free-hand lines or rectangles, an eraser, and a color palette. The image can be moved within the viewport by clicking and dragging with the mouse (Fig. 1). Note that the bin size does not change when zooming in or out.
View control
At the bottom of the page, the ‘View Control’ section permits the user to toggle off and on individual chromosomes and other display elements (Fig. 1).
Data sets
Online instances of GCViT at the time of writing include soybean (https://soybase.org/gcvit/), common bean (https://gcvit.phaseolus.legumeinfo.org), chickpea (https://gcvit.cicer.legumeinfo.org), and peanut (https://peanutbase.org/germplasm/gcvit/). Data sets available for soybean include: the whole U.S. germplasm collection genotyped with the SoySNP50K array [16], resequencing of 481 soybean accessions [17], resequencing of 102 Canadian accessions [18], the soybean Nested Association Mapping (SoyNAM) parents and progeny [19, 20], 222 Korean accessions genotyped using the Axiom® SoyaSNP array [21], 4234 Korean accessions using the Axiom® SoyaSNP array [22], GmHapMap data consisting of 1007 resequenced accessions [1], and genotyping of 374 U.S. and Brazilian accessions [23].
The chickpea data set contains genotype information from 279 chickpea accessions [24]. For common bean, diversity data are available for two diverse collections of Phaseolus vulgaris: the Mesoamerican Diversity Panel (MDP) and the Andean Diversity Panel (ADP) [25]. The peanut data set contains the U.S. Peanut Mini Core Collection genotyped using the 58K Affymetrix SNP array, Axiom Arachis [26].