
Table 2 Correspondence to mammalian genes and estimated efficiencies of cloning of start codons of EST assemblies

From: Large-scale sequencing based on full-length-enriched cDNA libraries in pigs: contribution to annotation of the pig genome draft sequence

 

| Species | Unique Gene ID (without HomoloGene ID) | Unique HomoloGene ID | Assemblies matched to protein sequences (total) | Matched: contigs | Matched: singlets | Assemblies estimated to include start codons (total) | Start codons: contigs | Start codons: singlets |
|---|---|---|---|---|---|---|---|---|
| Human | 13,691 (754) | 12,911 | 64,011 | 12,056 | 51,955 | 47,229 | 9,635 | 37,594 |
| Mouse | 12,955 (730) | 12,137 | 63,444 | 12,028 | 51,416 | 45,539 | 9,588 | 35,951 |
| Cattle | 13,445 (1,935) | 11,341 | 63,718 | 12,035 | 51,683 | 47,118 | 9,634 | 37,484 |
| Dog | 12,293 (763) | 11,410 | 62,815 | 11,871 | 50,944 | 37,193 | 8,090 | 29,103 |
| Pig | 14,275 | | 63,169 | 11,917 | 51,252 | 46,063 | 9,396 | 36,667 |
  1. Numbers of genes that had unique NCBI Gene IDs and corresponded to contigs and singlets generated by assembly of expressed sequence tags (ESTs) are indicated. Also shown are the numbers that had unique Gene IDs in the NCBI HomoloGene database (a database of orthologs among species) and corresponded to the contigs and singlets generated. Numbers in parentheses indicate numbers of gene IDs that had no corresponding HomoloGene IDs. HomoloGene IDs in pigs are not indicated, because there is no HomoloGene ID database for pig genes.
  2. EST assemblies were estimated to contain start codons if the length upstream of the matches (BLAST score >50) in the assemblies was greater than that between the start base of the coding sequence and the matched region of the corresponding gene. Numbers of assemblies (contigs and singlets) corresponding to protein sequences in humans, mice, cattle, dogs, and pigs are also shown.
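
The start-codon criterion in footnote 2 can be sketched in code. This is only an illustrative reading of the footnote, not the authors' pipeline: the `Match` record, its field names, and the use of 1-based nucleotide coordinates on both the assembly and the reference gene are assumptions made for the example.

```python
from dataclasses import dataclass

# Threshold quoted in footnote 2 (matches with BLAST score > 50).
MIN_BLAST_SCORE = 50


@dataclass
class Match:
    """Hypothetical record for one match between an EST assembly and a gene.

    All positions are assumed to be 1-based nucleotide coordinates.
    """
    score: float
    assembly_match_start: int  # where the match begins in the assembly
    gene_match_start: int      # where the match begins in the reference gene
    cds_start: int             # first base of the gene's coding sequence


def likely_contains_start_codon(m: Match) -> bool:
    """Apply the footnote's rule to a single match."""
    if m.score <= MIN_BLAST_SCORE:
        return False
    # Bases in the assembly that lie upstream of the matched region.
    upstream_in_assembly = m.assembly_match_start - 1
    # Bases between the start of the coding sequence and the matched region.
    cds_start_to_match = m.gene_match_start - m.cds_start
    # The footnote says "greater than", so the comparison is strict.
    return upstream_in_assembly > cds_start_to_match


# Assembly has 120 bases upstream of the match; the gene has only 50 bases
# between its coding start and the matched region, so the rule is satisfied.
long_upstream = Match(score=200, assembly_match_start=121,
                      gene_match_start=151, cds_start=101)
# Only 30 upstream bases in the assembly: too short to reach the coding start.
short_upstream = Match(score=200, assembly_match_start=31,
                       gene_match_start=151, cds_start=101)
```

Under these assumptions, `likely_contains_start_codon(long_upstream)` is true and `likely_contains_start_codon(short_upstream)` is false. If match coordinates on the protein side are reported in amino acids, they would need converting to nucleotides before the comparison.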