Fig. 1

From: DNAscent v2: detecting replication forks in nanopore sequencing data with deep learning

Schematic of the DNAscent v2 workflow. (a) A typical ONT sequencing workflow is shown, in which a library is sequenced, basecalled, and aligned to a reference genome. The raw nanopore signal (in FAST5 format), the reference genome (in FASTA format), and the alignment of reads to the reference genome (in BAM format) produced during this workflow are the required inputs for the DNAscent detect subprogram. DNAscent detect uses a residual neural network to assign the probability of BrdU at each thymidine position in each read. These probabilities are written to a single file, which is the only input for the DNAscent forkSense subprogram. DNAscent forkSense uses an autoencoder neural network to interpret the pattern of BrdU incorporation on each read, inferring fork direction; replication origin, fork, and termination calls are written to BED files. As an optional third step, DNAscent includes a utility that converts the output of DNAscent detect and DNAscent forkSense into bedgraphs that can be visualised in a genome browser. (b) Architecture of the residual neural network used by DNAscent detect, loosely based on [16]. For each read, DNAscent detect performs a hidden Markov signal alignment to create an input tensor for the neural network. The final softmax layer normalises the output of the network to the probability that BrdU is present at each thymidine position in the read. Further details, training information, and the number of parameters in each layer are described in Section S2 of Additional file 1. (c) Architecture of the autoencoder neural network used by DNAscent forkSense. For each read, the output of DNAscent detect (the probability of BrdU at each thymidine position along the read) forms the input tensor, and the network outputs, for each thymidine position on the read, the probability that a leftward-moving fork and the probability that a rightward-moving fork passed through that position during the BrdU pulse. Further details, training information, and the number of parameters in each layer are described in Section S3 of Additional file 1. Abbreviations: batch normalisation (BN), convolution (Conv)
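The pipeline in panel (a) can be driven from a script. The sketch below, in Python, shows one plausible sequence of calls; the subprogram names (detect, forkSense) come from the caption, but the indexing step, the flag names, and the file names are assumptions about the DNAscent command-line interface, so check the DNAscent documentation for the exact invocation.

```python
# Hedged sketch of the panel (a) workflow. Flags and the index step are
# assumptions about the DNAscent CLI; file names are placeholders.
import subprocess

fast5_dir = "fast5/"           # raw nanopore signal (FAST5)
reference = "reference.fasta"  # reference genome (FASTA)
alignment = "alignment.bam"    # reads aligned to the reference (BAM)

# Assumed indexing step linking read IDs to their FAST5 signal files
subprocess.run(["DNAscent", "index", "-f", fast5_dir,
                "-o", "index.dnascent"], check=True)

# DNAscent detect: per-thymidine BrdU probabilities, written to a single file
subprocess.run(["DNAscent", "detect", "-b", alignment, "-r", reference,
                "-i", "index.dnascent", "-o", "output.detect"], check=True)

# DNAscent forkSense: fork direction plus origin/fork/termination calls (BED)
subprocess.run(["DNAscent", "forkSense", "-d", "output.detect",
                "-o", "output.forkSense"], check=True)
```

Panels (b) and (c) describe the two networks only at the architectural level. The following minimal Keras sketch illustrates the general shape of each: a residual 1D-convolutional classifier with a per-position softmax for detect, and a convolutional encoder-decoder with two per-position sigmoid outputs for forkSense. All layer counts, filter widths, and input feature dimensions here are illustrative placeholders, not the published hyperparameters (those are in Sections S2 and S3 of Additional file 1).

```python
# Illustrative sketches of the panel (b) and (c) networks; hyperparameters
# are placeholders, NOT the published DNAscent v2 values.
import tensorflow as tf
from tensorflow.keras import layers, Model

def residual_block(x, filters, kernel_size=3):
    """Conv -> BN -> ReLU -> Conv -> BN with an identity shortcut (cf. panel b)."""
    shortcut = x
    y = layers.Conv1D(filters, kernel_size, padding="same")(x)
    y = layers.BatchNormalization()(y)
    y = layers.Activation("relu")(y)
    y = layers.Conv1D(filters, kernel_size, padding="same")(y)
    y = layers.BatchNormalization()(y)
    if shortcut.shape[-1] != filters:
        # Project the shortcut when the channel count changes
        shortcut = layers.Conv1D(filters, 1, padding="same")(shortcut)
    y = layers.Add()([y, shortcut])
    return layers.Activation("relu")(y)

def detect_model(n_features=8, n_blocks=4, filters=64):
    """Panel (b): residual network over the HMM-aligned signal tensor.
    The softmax head gives P(BrdU) vs P(thymidine) at each position."""
    inp = layers.Input(shape=(None, n_features))
    x = layers.Conv1D(filters, 3, padding="same")(inp)
    for _ in range(n_blocks):
        x = residual_block(x, filters)
    out = layers.Conv1D(2, 1, padding="same", activation="softmax")(x)
    return Model(inp, out)

def forksense_model(filters=32):
    """Panel (c): autoencoder-style network mapping per-thymidine BrdU
    probabilities to two per-position outputs: the probability that a
    leftward- or rightward-moving fork passed during the BrdU pulse."""
    inp = layers.Input(shape=(None, 1))
    # Encoder: downsample the BrdU probability track
    x = layers.Conv1D(filters, 5, strides=2, padding="same", activation="relu")(inp)
    x = layers.Conv1D(filters * 2, 5, strides=2, padding="same", activation="relu")(x)
    # Decoder: upsample back to per-position resolution
    x = layers.Conv1DTranspose(filters, 5, strides=2, padding="same", activation="relu")(x)
    x = layers.Conv1DTranspose(filters, 5, strides=2, padding="same", activation="relu")(x)
    out = layers.Conv1D(2, 1, padding="same", activation="sigmoid")(x)
    return Model(inp, out)
```

Calling, for example, `detect_model().summary()` prints the illustrative layer stack; the encoder-decoder bottleneck in `forksense_model` reflects the autoencoder shape named in the caption, not the exact published topology.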
