Table 2 Arrays and attributes in REVEAL: SingleCell

From: Rapid single cell evaluation of human disease and disorder targets using REVEAL: SingleCell™

| Array | Dimensions | Attribute data types | Attributes |
| --- | --- | --- | --- |
| RNAQUANTIFICATION | sample_id¹, measurementset_id¹, cell_id¹, feature_id¹ | value: float | Raw count, normalized count |
| SAMPLE | sample_id² | name: string; description: string; project_id: int64¹; public: bool | Project ID, Sample ID, Subject ID, DOID, UBERONID, Enrichment, Library type, Organism NCBI Taxonomy ID, Assay type |
| MEASUREMENTSET (describes how the data was collected and processed) | measurementset_id², sample_id¹ | experimentset_id: int64¹; entity: string; name: string; description: string; featureset_id: int64¹ | |
| CELL | cell_id², sample_id¹ | name: string; description: string; individual_id: int64¹ | CL ID, CL ontology |
| FEATURE (Genes). Features can also be proteins, other biomolecules, and/or hierarchical names. | featureset_id¹, gene_symbol_id¹, feature_id² | name: string; gene_symbol: string; chromosome: string; start: string; end: string; feature_type: string; source: string | Feature ID, Featureset ID, ENSG ID, Hugo gene symbol |
| FEATURE SET | featureset_id² | | GRCh version, Reference model, Feature-set ID |
| PROJECT FEATURE (describes the project, or datasource like HCA) | project_id² | name: string; description: string; project_id: int64¹ | Project name, Project ID |

Legend: shows the schema. Data of interest can be accessed and filtered by their dimensions and attributes. The superscript 1 indicates primary dimensions for selection, and the superscript 2 indicates secondary dimensions for selection. The general categories for attributes include, but are not limited to:

▪ scRNAseq expression values, both normalized and raw counts
▪ categorical and continuous tags, which can contain metadata on any entity from the pipeline used to generate the tags:
  - projects, e.g. data generation source (public, institutional-internal)
  - samples, e.g. UBERONID; DOID; organ (lung, rectum, ileum)
  - cells, e.g. CL ID; cell type (CD8+, enterocytes); percent.mt (percent mitochondria)
  - features, e.g. strand (+, −); biotype (protein-coding, frameshift)
▪ Assay type (10x or Dropseq, …)

Note that the tags UBERONID, DOID, and CL ID hold controlled vocabulary from publicly curated ontology resources such as Ontobee. These tags enable hierarchical searches, e.g. a search for all cells matching CL ID CL:0000584 (enterocyte) and its children.
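
The hierarchical search described in the legend can be illustrated with a minimal Python sketch. It is not the REVEAL: SingleCell API: the in-memory `CELLS` list, the `CHILDREN` map, and the helper names `term_and_descendants` and `select_cells` are all assumptions made for illustration. Only the identifier CL:0000584 (enterocyte) comes from the legend; the child term IDs are made-up placeholders, not real Cell Ontology entries. The sketch expands a term to itself plus all of its descendants, then filters cells by that set and, optionally, by the `sample_id` dimension.

```python
from collections import deque

# Toy ontology: parent term -> direct children.
# Only CL:0000584 (enterocyte) is taken from the legend; every other ID
# below is a hypothetical placeholder, not a real Cell Ontology term.
CHILDREN = {
    "CL:0000584": ["CL:EXAMPLE_A", "CL:EXAMPLE_B"],
    "CL:EXAMPLE_A": ["CL:EXAMPLE_C"],
}

# Toy CELL records carrying the cell_id and sample_id dimensions plus a
# CL ID tag. The values are invented for this example.
CELLS = [
    {"cell_id": 1, "sample_id": 10, "cl_id": "CL:0000584"},
    {"cell_id": 2, "sample_id": 10, "cl_id": "CL:EXAMPLE_C"},
    {"cell_id": 3, "sample_id": 11, "cl_id": "CL:EXAMPLE_B"},
    {"cell_id": 4, "sample_id": 11, "cl_id": "CL:OTHER"},
]


def term_and_descendants(root, children):
    """Return the root term plus every term reachable below it."""
    seen = {root}
    queue = deque([root])
    while queue:
        term = queue.popleft()
        for child in children.get(term, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


def select_cells(cells, root, children, sample_id=None):
    """Keep cells tagged with the root term or any of its descendants.

    If sample_id is given, also restrict the result to that sample.
    """
    wanted = term_and_descendants(root, children)
    return [
        cell
        for cell in cells
        if cell["cl_id"] in wanted
        and (sample_id is None or cell["sample_id"] == sample_id)
    ]


matches = select_cells(CELLS, "CL:0000584", CHILDREN)
print([cell["cell_id"] for cell in matches])  # prints [1, 2, 3]
```

Passing `sample_id=11` narrows the same query to one sample, which mirrors combining a dimension filter with a tag filter. A real deployment would presumably resolve descendants from the ontology itself rather than from a hand-written map.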