Dataset | File pair and remarks | File sizes (in MB) | δ |
---|---|---|---|
E61/fa | Homo_sapiens.GRCh37.61.dna_rm.chromosome.HSCHR6_MHC_SSTO.fa | 166.04 | 0.015 |
 | Homo_sapiens.GRCh37.61.dna_rm.chromosome.HSCHR6_MHC_MANN.fa | 166.06 |  |
 | These are two alternative haplotype "patch" files for the same chromosome locus. The dataset contains 11 other examples of similar file pairs with δ < 0.06 (when unpacked). All are related to the alternative haplotypes for the MHC locus. The next most similar pair of files has δ > 0.8. |  |  |
GPL570/cel | GSM405175.CEL | 12.93 | 8e-6 |
 | GSM341406.CEL | 12.93 |  |
 | The second file differs from the first by a single Affymetrix probe measurement. According to the GEO metadata, the two files are simply different packagings of the same experimental data by two researchers. The GPL570 dataset contains 9 other examples of similar file pairs with δ < 0.002. The next most similar pair of files has δ > 0.3. |  |  |
GPL570/cel.gz | GSM405175.CEL.gz | 4.31 | 6e-4 |
 | GSM341406.CEL.gz | 4.31 |  |
 | A gzip-compressed version of the pair above. The same remarks apply. The most similar pair of actually different data files has δ > 0.9. |  |  |
BioC2.7/BSGenome/u | BSgenome.Athaliana.TAIR.01222004/extdata/chr1.rda | 29.04 | 2e-4 |
 | BSgenome.Athaliana.TAIR.04232008/extdata/chr1.rda | 29.04 |  |
 | Consecutive versions of the A. thaliana reference genome. The next most similar file pair in this dataset has δ > 0.5. Note that the compressed versions of the same files have δ > 0.9. |  |  |
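
To give a feel for how a near-zero δ can arise for files that differ in only a few places, the following is a minimal sketch of a chunk-based dissimilarity between two byte strings. It is an illustration only: the table does not define δ, so the fixed-size chunking, the SHA-1 chunk fingerprints, the Jaccard-style formula, and the names `chunk_hashes` and `dissimilarity` are all assumptions, not the measure used to produce the values above.

```python
import hashlib


def chunk_hashes(data: bytes, chunk_size: int = 4096) -> set:
    """Return the set of SHA-1 digests of consecutive fixed-size chunks.

    Fixed-size chunking is an assumption made for this sketch.
    """
    return {
        hashlib.sha1(data[i:i + chunk_size]).digest()
        for i in range(0, len(data), chunk_size)
    }


def dissimilarity(a: bytes, b: bytes, chunk_size: int = 4096) -> float:
    """Hypothetical stand-in for a file dissimilarity score.

    Computes 1 - |A ∩ B| / |A ∪ B| over the two sets of chunk digests:
    0.0 when both inputs yield the same chunk set, 1.0 when no chunk is shared.
    """
    hashes_a = chunk_hashes(a, chunk_size)
    hashes_b = chunk_hashes(b, chunk_size)
    union = hashes_a | hashes_b
    if not union:  # both inputs are empty
        return 0.0
    return 1.0 - len(hashes_a & hashes_b) / len(union)


if __name__ == "__main__":
    # 1000 distinct 4096-byte chunks, then a copy with one byte flipped.
    base = b"".join(i.to_bytes(4, "big") * 1024 for i in range(1000))
    changed = bytearray(base)
    changed[5] ^= 0xFF
    print(dissimilarity(base, bytes(changed)))
```

In this sketch a single modified byte changes exactly one chunk, so the score stays close to zero for large inputs. Fixed-size chunks are sensitive to insertions and deletions, which shift every later chunk boundary; a content-defined chunking scheme would be more robust in that case.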