From: Rapid transcriptome sequencing of an invasive pest, the brown marmorated stink bug Halyomorpha halys
Attribute | Weight | Variable | Reason |
---|---|---|---|
A) Based on the putative transcript sequence | | | |
1. What proportion of the database protein is covered in the first Uniref100 hit? | 10 | Proportion covered | Reward the putative transcript based on the proportion of the database protein that is covered in the best BLASTX hit |
2. What proportion of the putative transcript is covered by the first Uniref100 hit? | 8 | Proportion covered | Reward the putative transcript based on the proportion of the query putative transcript that is covered in the best BLASTX hit |
3. What is the length covered on the database protein in the first Uniref100 hit? | 7 | Database hit length/longest database hit length | Reward based on the absolute database protein length covered in the best BLASTX hit, compared to the longest hit length in the component |
4. What is the length covered on the putative transcript in the first Uniref100 hit? | 5 | Putative transcript hit length/longest putative transcript hit length | Reward based on the absolute query putative transcript length covered in the best BLASTX hit, compared to the longest hit length in the component |
5. Is the strand of the Uniref100 match the expected one (based on SSLR)? | 4 | Match strand × (−SSLR/max \|SSLR\|) | Reward matches on the plus strand if SSLR < 0 or on the minus strand if SSLR > 0. Conversely, penalize matches on the plus strand if SSLR > 0 or on the minus strand if SSLR < 0 |
6. What proportion of the database protein is covered in the first NR hit? | 9 | Proportion covered | Same as the corresponding metric for Uniref100 |
7. What proportion of the putative transcript is covered by the best NR hit? | 7 | Proportion covered | Same as the corresponding metric for Uniref100 |
8. What is the relative length covered on the database protein in the first NR hit? | 6 | Database hit length/longest database hit length | Same as the corresponding metric for Uniref100 |
9. What is the relative length covered on the Trinity putative transcript in the first NR hit? | 4 | Putative transcript hit length/longest putative transcript hit length | Same as the corresponding metric for Uniref100 |
10. Is the strand of the NR match the expected one (based on SSLR)? | 3 | Match strand × (−SSLR/max \|SSLR\|) | Same as the corresponding metric for Uniref100 |
11. Is the SSLR negative (i.e. the expected sign)? | 7 | −SSLR/max \|SSLR\| | Reward putative transcripts with the normal, negative SSLR |
12. How long is the putative transcript compared to the longest in the component? | 7 | Putative transcript length/longest putative transcript length | Reward longer putative transcripts |
B) Based on the ORFs | | | |
1. Is the best match for each ORF the same? | 10 | (1 − number of best matches)/number of best matches | Penalize putative transcripts whose ORFs have different best hits |
2. Are there ORFs in both strands, with both having an NR hit? | 10 | −Number of ORFs in strand "A"/number of ORFs in strand "B" | Maximum penalty if both ORFs have an NR hit |
3. Are there ORFs in both strands, with only one having an NR hit? | 8 | −Number of ORFs in strand "A"/number of ORFs in strand "B" | Intermediate penalty if only one of the ORFs has an NR hit |
4. Are there ORFs in both strands, with neither having an NR hit? | 3 | −Number of ORFs in strand "A"/number of ORFs in strand "B" | Small penalty if neither ORF has an NR hit |
5. How many ORFs are called? | 8 | (1 − number of ORFs)/number of ORFs | Penalize putative transcripts having more than one ORF |
6. Are the ORFs found only in the expected strand (SSLR)? | 8 | ORF strand × (−SSLR/max \|SSLR\|) | Reward putative transcripts having ORFs called only in the expected strand |
C) Sequencing coverage dips | | | |
1. How many sequencing coverage dips? | 10 | −Number of dips/max number of dips in the component | Penalize putative transcripts with sequencing coverage dips |
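
The table gives a weight and a variable for each attribute but does not state how they are combined into a single score. The following is a minimal sketch, assuming each attribute contributes weight × variable and that the contributions are simply summed; that combination rule is an assumption, not something the table confirms. The names `Attribute`, `sslr_term`, `orf_count_term`, `dip_term` and `weighted_score` are hypothetical, and the input values are made up for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Attribute:
    name: str      # short label for the table row
    weight: float  # "Weight" column
    value: float   # "Variable" column, already evaluated


def sslr_term(sslr: float, max_abs_sslr: float) -> float:
    """-SSLR / max|SSLR|: positive when SSLR has the expected negative sign."""
    if max_abs_sslr == 0:
        return 0.0  # assumed fallback when no SSLR signal exists
    return -sslr / max_abs_sslr


def orf_count_term(n_orfs: int) -> float:
    """(1 - n) / n: zero for one ORF, increasingly negative for more."""
    if n_orfs <= 0:
        return 0.0  # assumed fallback when no ORF is called
    return (1 - n_orfs) / n_orfs


def dip_term(n_dips: int, max_dips_in_component: int) -> float:
    """-dips / max dips in the component: zero when there are no dips."""
    if max_dips_in_component == 0:
        return 0.0  # assumed fallback when the component has no dips
    return -n_dips / max_dips_in_component


def weighted_score(attributes: list[Attribute]) -> float:
    """Assumed combination rule: sum of weight * variable."""
    return sum(a.weight * a.value for a in attributes)


# Illustrative (invented) values for one putative transcript,
# using four rows of the table with their listed weights.
example = [
    Attribute("A1 Uniref100 protein coverage", 10, 0.9),
    Attribute("A11 SSLR sign", 7, sslr_term(-2.0, 4.0)),
    Attribute("B5 number of ORFs", 8, orf_count_term(2)),
    Attribute("C1 coverage dips", 10, dip_term(1, 2)),
]
example_score = weighted_score(example)
print(example_score)
```

Under this assumed rule, rewards (high coverage, expected SSLR sign) enter as positive terms and penalties (extra ORFs, coverage dips) as negative terms, so putative transcripts within a component could be ranked by the resulting total.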