CodingQuarry flow diagram. Examples are shown of correct annotations of coding sequences (A) and of a typical CodingQuarry input: assembled transcripts aligned to the genome (B). The stages used within CodingQuarry to predict coding sequences are shown in C-G. First, coding sequences are predicted from transcript sequences (with introns removed) using a GHMM (C). Possible prediction errors after this step are coloured red, and notes show how these are identified (D). These error-prone predicted genes are discarded (E), and regions are selected for prediction from genome sequence (F). The final prediction output by CodingQuarry (G) merges the retained predictions from transcript sequences (E) with the predictions from selected areas of the genome sequence (F). Sections of the example genome sequence and annotations are labelled i-x in each part of the diagram (A-G) and marked with vertical dotted lines. These labels facilitate in-text references to the diagram in the Implementation section of this manuscript, and each label (i-x) corresponds to the same genome section throughout A-G.