Table 1 Datasets used to test DiscoverY

From: DiscoverY: a classifier for identifying Y chromosome sequences in male assemblies

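Column groups: Dataset (Species, Sex, Technology, Sample); Reads (Read length, Read number, Autosome coverage depth); Assembly (Contig number, Contig N50, Total length, Y length).

| Species | Sex | Technology | Sample | Read length | Read number | Autosome coverage depth | Contig number | Contig N50 | Total length (Gb) | Y length¹ (Mb) |
|---|---|---|---|---|---|---|---|---|---|---|
| human | M | Illumina | NA24385 | 250 bp | 883 mil | 60x | 65,436 | 169 kb | 2.84 | 18 |
| human | M | 10X | NA24385 | 151 bp | 477 mil | 20x | 359,515 | 24 kb | 7.71 | 24 |
| human | M | PacBio | NA24385 | 10 kb² | 13 mil | 30x | 12,523 | 4.5 Mb | 2.99 | 18 |
| human | F | Illumina | NA12878 | 148 bp | 5.5 bil | 300x | N/A | N/A | N/A | N/A |
| human | F | mixed | hg38 - Y | N/A | N/A | N/A | 23 | 156.0 Mb | 3.03 | N/A |
| human | F | 10X | NA12878 | N/A | N/A | N/A | 21,562 | 16.2 Mb | 2.85 | N/A |
| human | F | PacBio | NA12878 | N/A | N/A | N/A | 18,903 | 26.8 Mb | 3.17 | N/A |
| gorilla | M | Illumina | see Methods | 150 bp | 406 mil | 20x | N/A | N/A | N/A | N/A |
| gorilla | M | N/A | gorGor5 + gorY | N/A | N/A | N/A | 16,329 | 9.6 Mb | 3.10 | 25 |
| gorilla | F | Illumina | Gg6 | 150 bp | 141 mil | 7x | N/A | N/A | N/A | N/A |
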
  1. Length of contigs aligning to the reference Y chromosome. For the gorilla male, this is known directly from the construction rather than through alignment.
  2. Read N50 is shown instead of read length, since the length varies.