Table 1 Protein sets used for analysis of orthologous genes

From: Combined de novo and genome guided assembly and annotation of the Pinus patula juvenile shoot transcriptome

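| Source | Species | Total sequences | Clustered sequences (a) |
| --- | --- | --- | --- |
| Protein sets from PLAZA v2.5 | Arabidopsis thaliana | 27 403 | 26 465 |
| Protein sets from PLAZA v2.5 | Glycine max | 46 324 | 36 364 |
| Protein sets from PLAZA v2.5 | Oryza sativa | 41 363 | 39 541 |
| Protein sets from PLAZA v2.5 | Physcomitrella patens | 28 090 | 26 072 |
| Protein sets from PLAZA v2.5 | Populus trichocarpa | 40 141 | 35 668 |
| Protein sets from PLAZA v2.5 | Ricinus communis | 31 009 | 30 330 |
| Protein sets from PLAZA v2.5 | Selaginella moellendorffii | 18 384 | 16 876 |
| Protein sets from PLAZA v2.5 | Theobroma cacao | 28 858 | 28 294 |
| Protein sets from PLAZA v2.5 | Vitis vinifera | 26 238 | 24 635 |
| Protein sets from PLAZA v2.5 | Zea mays | 39 172 | 34 664 |
| External protein sets | Amborella trichopoda | 25 347 | 24 643 |
| External protein sets | Picea abies | 22 070 | 20 869 |
| External protein sets | Picea sitchensis | 10 521 | 8 770 |
| External protein sets | Pinus patula | 52 112 | 41 956 |
| External protein sets | Pinus taeda | 50 172 | 47 225 |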

(a) Proteins were clustered at 90% identity and only the longest sequence was retained for each cluster.
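
The retention rule in the footnote (keep only the longest sequence in each cluster) can be illustrated with a minimal Python sketch. It assumes the 90% identity clustering has already been done by some external tool, which the table does not name; the function name `longest_per_cluster`, the input dictionaries, and the toy sequences are hypothetical and are not taken from the paper.

```python
def longest_per_cluster(assignments, sequences):
    """Return the longest sequence from each cluster.

    assignments: dict of sequence ID -> cluster ID (hypothetical input,
                 assumed to come from a prior 90% identity clustering step).
    sequences:   dict of sequence ID -> protein sequence string.
    """
    best = {}  # cluster ID -> ID of the longest sequence seen so far
    for seq_id, cluster_id in assignments.items():
        current = best.get(cluster_id)
        # Ties keep the first sequence encountered (an arbitrary choice).
        if current is None or len(sequences[seq_id]) > len(sequences[current]):
            best[cluster_id] = seq_id
    return {seq_id: sequences[seq_id] for seq_id in best.values()}


# Toy data for illustration only, not from the paper.
toy_sequences = {
    "p1": "MKTAYIAKQR",
    "p2": "MKTAYIAKQRQISF",
    "p3": "MSLLTEVETY",
}
toy_assignments = {"p1": "c0", "p2": "c0", "p3": "c1"}

representatives = longest_per_cluster(toy_assignments, toy_sequences)
```

Here `p1` and `p2` share a cluster, so only the longer `p2` is kept alongside `p3`. Applied to a full protein set, the number of retained representatives corresponds to the "Clustered sequences" column.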