Fig. 4 | BMC Genomics

Fig. 4

From: A completeness-independent method for pre-selection of closely related genomes for species delineation in prokaryotes


Outline of the FRAGTE approach. A, fragmenting phase. An incomplete genome is first concatenated (a). The concatenated genome is then divided by a sliding l-kb window with 0.5 l-kb overlap (b), and 256 z-scores are calculated for each fragment (c). For each fragment, PCCDs are calculated against all non-overlapping intragenomic fragments (d) and summed to give an accumulated PCCD. The fragment with the maximal accumulated PCCD is chosen as the representative fragment of the genome (e), and its z-scores are taken as the z-scores for the representative fragment (ZRF). In addition, the four fragments with the largest accumulated PCCDs are used to calculate the z-scores for the long fragment (ZLF) (f). Finally, the mean and standard deviation (SD) of all PCCDs of the representative fragment are calculated, and the genome-specific cutoff (GSC) is computed as the mean intragenomic PCCD minus two SDs, subject to two restrictions (g). The fragmenting phase thus yields the z-scores for the representative fragment (ZRF) and for the fourfold longer fragment (ZLF), as well as a GSC. B, determining phase. A PCCD (P1) based on the ZRFs is calculated. If P1 > LSC, the pair is selected. To improve specificity, the GSC is then applied: the GSC for a pair (GSCp) is the smaller of the GSC for the query (GSCq) and the GSC for the reference (GSCr). If P1 > GSCp, the pair is finally sieved. Otherwise, a second PCCD (P2) based on the ZLFs is calculated, and if P2 > GSCp, the pair is sieved
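
The control flow in the caption can be illustrated with a minimal Python sketch. It is not the FRAGTE implementation: all function and parameter names are hypothetical, the z-score and PCCD computations are not shown, the two restrictions on the GSC are omitted because the caption does not state them, the sample standard deviation is an assumption, and a pair with P1 not above LSC is assumed to be rejected.

```python
import statistics


def sliding_windows(length, window):
    """Return (start, end) coordinates of windows with 50% overlap.

    Mirrors step (b): a window of size `window` moved by half its size.
    Assumes `window` is even; a trailing partial window is dropped.
    """
    step = window // 2
    return [(s, s + window) for s in range(0, length - window + 1, step)]


def genome_specific_cutoff(pccds):
    """Mean intragenomic PCCD minus two SDs, as in step (g).

    Assumption: sample SD is used, and the two restrictions mentioned
    in the caption are NOT applied here.
    """
    return statistics.mean(pccds) - 2 * statistics.stdev(pccds)


def pair_is_sieved(p1, gsc_query, gsc_reference, lsc, compute_p2):
    """Decision logic of the determining phase (panel B).

    `compute_p2` is a hypothetical callable returning the ZLF-based PCCD;
    it is only called when the ZRF-based PCCD (p1) fails the pair cutoff.
    """
    if p1 <= lsc:
        # Assumption: pairs that do not exceed LSC are discarded.
        return False
    gsc_pair = min(gsc_query, gsc_reference)  # GSCp = min(GSCq, GSCr)
    if p1 > gsc_pair:
        return True
    return compute_p2() > gsc_pair
```

Passing `compute_p2` as a callable keeps the second, ZLF-based comparison lazy, matching the caption's "Otherwise" branch in which P2 is computed only when P1 does not exceed GSCp.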
