Fig. 1 | BMC Genomics

Fig. 1

From: Novel metrics for quantifying bacterial genome composition skews

Overview of the method, using B. burgdorferi as an example. a: In the absence of annotated origins of replication, the minimal value of the cumulative G-C graph is used to determine the likely origin of replication (ori) and hence the predicted directions of replication (black arrows) leading to terminator sites (graph maxima). Genes transcribed in those directions (blue arrows) are considered to be on the leading strand, while genes transcribed in the opposite directions (orange arrows) are on the lagging strand. b: Treatment of draft genome assemblies. Each contig is analyzed separately to determine likely directions of transcription, from minimal to maximal values of cumulative G-C. Putative origins of replication, terminator sites and gene orientations are determined as above. c: For each gene on the leading strand (blue) or lagging strand (orange), TA and GC skews are computed relative to the leading strand. Circle area is proportional to gene length. The vectors point from the origin (zero skews) to the length-weighted average of skews for genes on the leading strand and for genes on the lagging strand. d: Definition of the characteristic skews (leadGC, leadTA, lagGC and lagTA) and of the angle θ between the two vectors. e: The three metrics computed from the characteristic skews and the angle θ. The multiple arrows leading to the third metric (residual skew) indicate that this metric integrates information from many genomes.
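
The steps in panels a, c and d can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names are hypothetical, the skew formula (X - Y)/(X + Y) is the conventional definition and is assumed here rather than taken from the caption, the sequence is treated as linear, and each gene sequence is assumed to be already read along the leading strand. The ordering of the vector components as (TA, GC) is also an assumption.

```python
import math
from itertools import accumulate


def cumulative_gc(seq):
    """Running sum along the sequence: +1 for G, -1 for C, 0 otherwise."""
    steps = (1 if b == "G" else -1 if b == "C" else 0 for b in seq.upper())
    return list(accumulate(steps))


def predict_ori_ter(seq):
    """Panel a: putative ori at the curve minimum, terminus at the maximum.

    Returns 0-based positions; ties resolve to the first occurrence.
    """
    curve = cumulative_gc(seq)
    positions = range(len(curve))
    ori = min(positions, key=curve.__getitem__)
    ter = max(positions, key=curve.__getitem__)
    return ori, ter


def skew(seq, x, y):
    """Conventional skew (X - Y) / (X + Y); assumed, not quoted from the caption."""
    nx, ny = seq.upper().count(x), seq.upper().count(y)
    return (nx - ny) / (nx + ny) if nx + ny else 0.0


def weighted_mean_skews(genes):
    """Panel c: length-weighted mean (TA skew, GC skew) over a set of genes.

    `genes` holds sequences already oriented along the leading strand.
    """
    total = sum(len(g) for g in genes)
    ta = sum(len(g) * skew(g, "T", "A") for g in genes) / total
    gc = sum(len(g) * skew(g, "G", "C") for g in genes) / total
    return ta, gc


def angle_between(u, v):
    """Panel d: angle theta (radians) between two 2D skew vectors."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return abs(math.atan2(cross, dot))


if __name__ == "__main__":
    toy = "CCCCGGGGGGCC"  # toy sequence, not real genome data
    print(predict_ori_ter(toy))
    lead = weighted_mean_skews(["GGGCTTTA", "GCTA"])
    lag = weighted_mean_skews(["GCCCTAAA"])
    print(lead, lag, angle_between(lead, lag))
```

For a draft assembly (panel b), the same `predict_ori_ter` call would be applied to each contig separately. A circular chromosome would additionally need the coordinates to be handled modulo the genome length, which this sketch omits.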
