Identification and extension of previous sequences. A. All 18 A. leptorhynchus protein sequences deposited in UniProt, plotted individually to compare the previously determined sequence length with the percentage of that sequence aligned to an assembly sequence. UniProt sequences annotated as fragments are denoted with diamonds. The left and right halves of each marker denote the completeness of the N- and C-termini, respectively, of the best-aligned sequence from our final assembly (green = our terminus aligned with the previously complete terminus; red = our sequence is missing terminal sequence; orange = our sequence extends the previous sequence but is likely still incomplete; blue = our terminus extends the sequence to a likely terminal residue). Extended sequences are labeled. The inset shows sequences clustered at lengths of 172–182 amino acids, all of which had 100% coverage. B. HoxA13b (marked with an asterisk in A) was extended to completion based on identification of a stop codon, as well as a start codon that aligned to the start of sequences from D. rerio and I. punctatus (extended termini are highlighted in colored boxes). Alignment was performed with ClustalW2: * (asterisk) = fully conserved residue; : (colon) = conservation between groups of strongly similar properties (>0.5, Gonnet PAM 250 matrix); . (period) = conservation between groups of weakly similar properties (≤0.5).