Fig. 1 | BMC Genomics

Fig. 1

From: seq-seq-pan: building a computational pan-genome data structure on whole genome alignment

Visualization of the alignment workflow for an example with three genomes. Input genomes (g1-3) are depicted as green, yellow and blue blocks. All sub-sequences are part of locally collinear blocks (LCBs) in the final result and are therefore marked within the whole genomes and numbered according to their order of appearance in the respective genome. The first two genomes are aligned, and the result is reported as separate blocks of aligned sub-sequences. Blocks I and II indicate a rearrangement of sub-sequence 3 of g1 relative to g2, and parts of g1 are not present in g2. Consensus sequences are built individually for each LCB in the alignment and concatenated with stretches of ’N’ as delimiters to form a consensus genome (depicted in red with delimiters in gray). This consensus genome is then aligned with g3, as presented in detail in steps a-e.
a The consensus genome is aligned with the third genome (g3, blue), yielding six blocks. Blocks I and III represent a rearrangement of sub-sequence 6 of g1. Block II shows a large deletion in g3 compared to the consensus genome. Blocks IV-VI are single-sequence blocks.
b Blocks resulting from the alignment with the consensus genome are broken up into smaller blocks at delimiter positions (Block II in a becomes Blocks II-VI in b). The small single-sequence block containing sub-sequence 5 of the consensus genome (Block IV in a) is merged with its neighboring sub-sequence 4 of the consensus genome, introducing gaps into sub-sequence 3 of g3 (see Block IV in b).
c Remaining single-sequence blocks of both genomes (depicted in lighter red and blue) are concatenated with stretches of ’N’ as delimiters (c.a). The sequences are aligned (c.b) and the resulting blocks are resolved at delimiter positions (c.c). Small single-sequence blocks would also be merged with neighboring blocks (not shown).
d Aligned and single-sequence blocks from step c are joined with the initially aligned blocks, and all blocks are sorted by their position in the consensus genome.
e The full alignment is traced back using the newly formed blocks and the alignment of the first two genomes.
f A consensus genome is built from the full alignment, and additional genomes are aligned by consecutively repeating steps a-f.
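
The delimiter idea used in the consensus genome and in step b can be illustrated with a minimal Python sketch. It is not the seq-seq-pan implementation: the function names, the delimiter length, and the representation of blocks as half-open coordinate intervals are all assumptions made for illustration. The sketch records where each LCB consensus lands in the concatenated genome, then uses those coordinates to cut an aligned interval that spans delimiters into pieces that each lie within a single original block.

```python
# Hypothetical delimiter length; the caption does not state the value used by the tool.
DELIMITER_LENGTH = 10


def build_consensus_genome(block_consensus, delimiter_length=DELIMITER_LENGTH):
    """Concatenate per-LCB consensus sequences, separated by runs of 'N'.

    Returns the consensus genome and, for each block, the half-open
    interval (start, end) it occupies in that genome.
    """
    delimiter = "N" * delimiter_length
    parts = []
    intervals = []
    position = 0
    for index, sequence in enumerate(block_consensus):
        if index > 0:
            # Delimiters go between blocks only, not before the first one.
            parts.append(delimiter)
            position += delimiter_length
        intervals.append((position, position + len(sequence)))
        parts.append(sequence)
        position += len(sequence)
    return "".join(parts), intervals


def split_at_delimiters(start, end, intervals):
    """Cut the aligned interval [start, end) into per-block pieces.

    Each piece is (block_index, piece_start, piece_end) in consensus-genome
    coordinates; positions that fall inside delimiters are dropped.
    """
    pieces = []
    for block_index, (block_start, block_end) in enumerate(intervals):
        piece_start = max(start, block_start)
        piece_end = min(end, block_end)
        if piece_start < piece_end:
            pieces.append((block_index, piece_start, piece_end))
    return pieces


if __name__ == "__main__":
    genome, intervals = build_consensus_genome(["ACGT", "GG", "TTA"], delimiter_length=3)
    print(genome)
    print(split_at_delimiters(2, 14, intervals))
```

Tracking block coordinates explicitly, rather than searching the consensus genome for runs of 'N', is a design choice of this sketch: it avoids confusing delimiters with ambiguous bases that may already occur inside a sequence.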
