Sequence similarity of unigenes against the nr database. (A) Effect of query sequence length on the percentage of unigenes with significant matches. The E-value cut-off was set at 1.0E-5. Longer assembled sequences were more likely to have matches in the NCBI nr database. (B) E-value distribution of the top BLAST hit for each unigene (E-value ≤1.0E-5). (C) Similarity distribution of the best BLAST hit for each unigene. (D) Species distribution of the BLAST results, shown as a percentage of the total homologous sequences (E-value ≤1.0E-5). All plant proteins in the NCBI nr database were used for the homology search, and the best hit for each sequence was used for analysis.
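
The summaries in panels (A) and (D) can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline used for the figure: it assumes the BLAST hits have already been parsed into tuples of (query ID, subject species, percent identity, E-value), and the function names, the tuple layout, and the length bins are all hypothetical.

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1.0e-5  # cut-off stated in the caption


def best_hits(rows, cutoff=E_VALUE_CUTOFF):
    """Return the lowest-E-value hit per query among hits with E-value <= cutoff.

    rows: iterable of (query_id, species, percent_identity, e_value) tuples.
    This tuple layout is an assumption made for illustration.
    """
    best = {}
    for query, species, identity, evalue in rows:
        if evalue > cutoff:
            continue  # not a significant match
        if query not in best or evalue < best[query][2]:
            best[query] = (species, identity, evalue)
    return best


def hit_rate_by_length(lengths, best, bin_edges):
    """Percentage of unigenes with a significant hit per length bin [lo, hi).

    lengths: dict mapping query_id -> sequence length (all unigenes,
    including those without any hit). Empty bins are omitted.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for query, length in lengths.items():
        for lo, hi in zip(bin_edges, bin_edges[1:]):
            if lo <= length < hi:
                totals[(lo, hi)] += 1
                hits[(lo, hi)] += query in best  # True counts as 1
                break
    return {b: 100.0 * hits[b] / totals[b] for b in totals}


def species_percentages(best):
    """Share of best hits per species, as a percentage of all queries with a hit."""
    counts = defaultdict(int)
    for species, _identity, _evalue in best.values():
        counts[species] += 1
    n = len(best)
    return {s: 100.0 * c / n for s, c in counts.items()}
```

Calling `best_hits` first and passing its result to the other two functions mirrors the caption: only the best hit per unigene is analysed, and only hits at or below the E-value cut-off count as matches.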