Full-length clone selection (top) and TC categories (bottom). ESTs derived from different clones were clustered and assembled. The CAP3 contig was compared to protein databases using BLASTX and FASTY, and the hits were grouped into four classes. Class 1 hits had to match the whole protein sequence, start with an ATG in the TC and an M in the protein, and end at a stop codon. Class 2 hits had to match the whole protein sequence and start with an ATG in the TC and an M in the protein. Class 3 hits had to match the full protein sequence, without further restrictions. Class 4 hits had to cover the protein over almost its full length: the match was allowed to start at most 10 amino acids after the start of the protein and to end at most 10 amino acids before its end. Predicted 5' TCs (P5P) had to contain enough sequence to fill in the missing 5' end of the protein sequence. Clone selection: clones A and B were discarded because they lack an IMAGE ID. Clone 54321 does not span the 5' end of the protein match. Clone 21345 was selected as the most 5' clone fulfilling the requirements.
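The class rules and the clone-selection step described in this caption can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline that produced the figure: the `Hit` and `Clone` records, their field names, and the functions `classify_hit` and `select_clone` are hypothetical, alignment coordinates are assumed to be 1-based, and "spanning the 5' end of the protein match" is read as the clone starting at or before the first matched TC position.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_END_GAP = 10  # class 4 tolerance in amino acids, taken from the caption


@dataclass
class Hit:
    """Hypothetical summary of one BLASTX/FASTY hit of a TC against a protein."""
    protein_len: int    # length of the matched protein in amino acids
    prot_start: int     # first aligned protein residue (1-based, assumed)
    prot_end: int       # last aligned protein residue (1-based, assumed)
    atg_at_met: bool    # TC starts the match with ATG, protein with M
    ends_at_stop: bool  # match ends at a stop codon in the TC


@dataclass
class Clone:
    """Hypothetical record for one clone contributing an EST to the TC."""
    name: str
    image_id: Optional[str]  # None when the clone has no IMAGE ID
    tc_start: int            # first TC position covered by the clone (assumed)


def classify_hit(hit: Hit) -> Optional[int]:
    """Return class 1-4 following the caption, or None if no class applies."""
    full = hit.prot_start == 1 and hit.prot_end == hit.protein_len
    if full and hit.atg_at_met and hit.ends_at_stop:
        return 1
    if full and hit.atg_at_met:
        return 2
    if full:
        return 3
    # Class 4: up to MAX_END_GAP residues may be missing at either end.
    near_start = hit.prot_start - 1 <= MAX_END_GAP
    near_end = hit.protein_len - hit.prot_end <= MAX_END_GAP
    if near_start and near_end:
        return 4
    return None


def select_clone(clones: Iterable[Clone], match_start_in_tc: int) -> Optional[Clone]:
    """Pick the most 5' clone that has an IMAGE ID and spans the match start."""
    candidates = [
        c for c in clones
        if c.image_id is not None and c.tc_start <= match_start_in_tc
    ]
    # "Most 5'" is interpreted here as the smallest start coordinate on the TC.
    return min(candidates, key=lambda c: c.tc_start, default=None)
```

The classes are tested from most to least restrictive, so a hit that satisfies class 1 is never reported as class 2 or 3. The P5P category is left out because the caption does not say how "enough sequence" was measured.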