Determination of dosage compensation of the mammalian X chromosome by RNA-seq is dependent on analytical approach
- Nathaniel K Jue†1,
- Michael B Murphy†1,
- Seth D Kasowitz1,
- Sohaib M Qureshi1,
- Craig J Obergfell1,
- Sahar Elsisi1,
- Robert J Foley1,
- Rachel J O’Neill1 and
- Michael J O’Neill1
© Jue et al.; licensee BioMed Central Ltd. 2013
Received: 20 September 2012
Accepted: 23 February 2013
Published: 6 March 2013
An enduring question surrounding sex chromosome evolution is whether effective hemizygosity in the heterogametic sex leads inevitably to dosage compensation of sex-linked genes, and whether this compensation has been observed in a variety of organisms. Incongruence in the conclusions reached in some recent reports has been attributed to different high-throughput approaches to transcriptome analysis. However, recent reports each utilizing RNA-seq to gauge X-linked gene expression relative to autosomal gene expression also arrived at diametrically opposed conclusions regarding X chromosome dosage compensation in mammals.
Here we analyze RNA-seq data from X-monosomic female human and mouse tissues, which are uncomplicated by genes that escape X-inactivation, as well as published RNA-seq data to describe relative X expression (RXE). We find that the determination of RXE is highly dependent upon a variety of computational, statistical and biological assumptions underlying RNA-seq analysis. Parameters implemented in short-read mapping programs, choice of reference genome annotation, expression data distribution, tissue source for RNA and RNA-seq library construction method have profound effects on comparing expression levels across chromosomes.
Our analysis shows that the high number of paralogous gene families on the mammalian X chromosome relative to autosomes contributes to the ambiguity in RXE calculations. RNA-seq analysis that takes into account that single- and multi-copy genes are compensated differently supports the conclusion that, in many somatic tissues, the mammalian X is up-regulated compared to the autosomes.
Keywords: RNA-seq, X chromosome, Dosage compensation
Chromosome-based sex determination systems are most often characterized by heterotypic sex chromosomes, with one sex carrying at least one degenerate homolog [1–3]. Heterokaryotypy may result from differential gene loss or gain as the sex chromosome complement evolves from an ancestral homologous pair. Depending on the extent of the loss or gain, and the dosage sensitivity of genes on the incipient sex chromosomes, natural selection may favor the evolution of compensating mechanisms to balance expression between the sexes and between the sex chromosomes and autosomes. This can be accomplished either by up-regulating expression of sex-linked genes in the heterogametic sex or by down-regulating expression in the homogametic sex in relation to the autosomes. In Drosophila and Sciara, genes on the single X chromosome in males are transcriptionally up-regulated, while in the nematode worm, Caenorhabditis elegans, the two X chromosomes in hermaphrodites are down-regulated to equal that of the XO males . In contrast, for organisms displaying female heterogamety, such as birds, evidence of sex chromosome dosage compensation is lacking [7–10]. The differences in compensating mechanisms, or lack thereof, will likely reflect the relative content of haplosufficient vs. haploinsufficient genes on the sex chromosomes, but will also reflect early events of sex chromosome evolution, outcomes of sexual selection and sexual conflict, and the life history of the organism .
In eutherian mammals and marsupials, sex chromosome dosage compensation is achieved by global inactivation of one of the two X chromosomes in females. X chromosome inactivation (XCI) in eutherians is initiated by the expression of the XIST non-coding RNA just prior to implantation of the embryo, leading to heterochromatinization of one of either parental X chromosome in the fetus . X-inactivation in marsupials also involves heterochromatinization of one X, governed by a non-coding RNA, RSX, with XIST-like properties, but the paternal X is exclusively chosen for inactivation [13, 14].
Halving the apparent dosage of X-linked genes in female mammals via XCI presents an evolutionary conundrum: if sex chromosomes evolve from an ancestral autosomal pair, it is the heterogametic sex that would be impelled to compensate for the complete loss or degradation of the evolving Y. In other words, since female mammals never receive a Y chromosome, it is difficult to see how loss of gene dosage from the evolving Y would have any influence on regulation of X genes in females. The simplest compensating step in response to attritional gene loss from the incipient Y would be cis-regulatory change or cis-gene duplication, i.e. genetic mutation, of genes on the X. In Drosophila, a male-specific epigenetic mechanism of dosage compensation spares the homogametic female a potentially detrimental up-regulation of X-linked genes. If, however, compensation is achieved by genetic mutation, selection would favor epigenetic down regulation in females. Ohno recognized this and hypothesized that down-regulation of X-linked genes might evolve in response to regulatory changes to the X that are transmitted from father to daughter . This would appear to be the scenario played out in C. elegans and mammals. Regardless of the eventual dosage compensation mechanism settled upon, the first step in compensating for gradual haploinsufficient gene loss on the Y must be an increase in transcription of surviving genes on the X in males.
Ohno’s hypothesis appeared to be borne out in three recent reports [16–18], which each showed by microarray-based transcriptome analysis that the single active X chromosome in both males and females in several eutherian species was expressed at or near a 1:1 ratio to the averaged expression of the diploid autosomal complement, termed the “X:A ratio”. However, this work was called into question by He and colleagues  who, through analysis of high throughput transcriptome sequence (RNA-seq) data from various tissues from human and mouse, concluded that the X:A ratio of gene expression was closer to 0.5, indicative of a lack of X-linked gene up-regulation. Xiong et al. report that the former studies were compromised by apparent compression of expression differences; a factor they argue is inherent to microarray expression analysis. Recently, Disteche and colleagues published a report re-analyzing RNA-seq data from  as well as new RNA-seq data from human cells and tissues and arrive at the conclusion that the mammalian X chromosome is upregulated in relation to autosomes . Additionally, three other studies [21–23] following on the heels of  also report up-regulation of the mammalian X. However, in reply to these reports, He and colleagues maintain their conclusion that Ohno’s hypothesis is “invalid” .
The widely divergent conclusions of these studies, i.e. compensation vs. no compensation, highlight the dramatic differences in biological conclusions that can be drawn from different analytical approaches applied to similar or even identical next-generation sequence datasets. As the recent controversy over RNA editing illustrates [25–28], even though the computational tools available for next-generation sequencing analysis may be vetted in the literature, parameters that can profoundly affect outputs are often applied haphazardly. In this report we consider several issues that may contribute to variation in calculating the whole chromosome expression values that form the basis of conclusions drawn regarding the relative expression of X-linked genes to that of autosomal genes. We compared the global transcriptional output of the X chromosome with that of the autosomes using our own RNA-seq datasets and those utilized in  and  that are publicly available [29–31]. Our analysis also includes RNA-seq data we have generated from X monosomic mouse and human tissues. Since X monosomy obviates X-inactivation, results from X monosomic samples are not confounded by the effects of X-linked genes that escape inactivation. We report that assumptions made in dataset trimming and several factors integral to the implementation of RNA-seq quantitative analysis have a large effect on the global calculation of the relative expression of the X chromosome to that of autosomes.
Data distribution and treatment of outliers
Unless otherwise stated, gene expression levels are represented as FPKM (fragments per kilobase of exon per million fragments mapped). In their recent study reporting mammalian X chromosome to autosome expression ratios (X:A) equal to ~0.5, He and colleagues utilized RNA-seq datasets that were then truncated by removing substantial proportions of both highly and lowly expressed loci in order to exclude the effect of FPKM values at or near 0 while arbitrarily preserving a median value for statistical testing. In contrast, Disteche and colleagues contend that compensation of the mammalian X can only be discerned once the skewed content toward reproductive genes on the X is taken into account. Nevertheless, they too only detect compensation in most human tissues once genes with FPKM ≤ 1 are excluded. Likewise, each of the other reports addressing the Ohno controversy [21–23] disregards genes with FPKM ≤ 1, or as in  RPKM < 3. However, FPKM determination is not absolute and can vary significantly based on sequencing depth, sequencing platform, RNA source and other factors. Moreover, since functional genes expressed at any level may be subject to selection for dosage compensation, exclusion of data based on the level of expression may skew final analysis.
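As a minimal sketch of the FPKM definition given above, the following computes FPKM from a fragment count, an exon length and a library size. The function name and the example numbers are hypothetical and are not taken from the datasets analyzed here; the sketch only illustrates why a fixed FPKM cutoff interacts with sequencing depth.

```python
def fpkm(fragments: int, exon_length_bp: int, total_mapped_fragments: int) -> float:
    """Fragments per kilobase of exon per million fragments mapped."""
    if exon_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and library size must be positive")
    # fragments / (length in kb) / (library size in millions)
    return fragments * 1e9 / (exon_length_bp * total_mapped_fragments)


# Hypothetical gene with 2 kb of exon in a small and a large library.
shallow = fpkm(5, 2000, 10_000_000)     # 5 supporting fragments
deep = fpkm(125, 2000, 250_000_000)     # 125 supporting fragments
print(shallow, deep)
```

Both calls give the same FPKM (0.25), yet the estimate in the small library rests on only five fragments. A filter such as FPKM ≤ 1 would discard this gene in both libraries regardless of how well it is supported.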
Mapping parameters in measuring chromosome-wide gene expression
RXE based on mapping parameters (table; samples: Xm, human lymphoblast; Xp, human lymphoblast; Xm, mouse brain; Xp, mouse brain; XX, mouse brain; XY, mouse brain)
Ostensibly, unique mapping parameters are employed to create FPKM values while avoiding potential confounding effects of including genes that are erroneously counted as “expressed” due to cross-mapping of short reads to multiple loci. However, paralogous gene families having even short segments of high similarity will be completely excluded by such methods. Since gene duplication is one potential means of achieving dosage compensation upon loss of a homolog, we examined the relative X chromosome content for highly similar paralogous gene families (> 70% sequence similarity) compared to the autosomes. For human we found ~2 fold enrichment for paralogous gene families on the non-recombining portion of the X chromosome compared to autosomes, and ~1.5 fold enrichment for mouse (Additional file 2: Table S2). Such enrichment means that when only unique mapping parameters are considered, the X chromosome would be more likely to have reads excluded as compared to autosomes, skewing the estimates of RXE downward. To more accurately account for paralogous transcripts/genes, we implemented a mapping approach, termed “non-unique”, that aligns each read only to the best fit position in the genome but does not exclude reads that map to multiple positions.
The report of Xiong and colleagues ignored alternative splicing in mapping program implementation, as reads spanning splice junctions were discarded. Mapping our datasets using TopHat, which considers both paralogous transcripts and splice site junctions (referred to hereafter as “non-unique, spliced” mapping), shifted RXE levels to a range of −0.96 to 0.21 (Table 1, Additional file 3: Table S3, Additional file 4: Table S4) [33, 34]. Consideration of splicing pushed estimates of RXE either up or down, depending on the library examined; however, for many tissues the estimates of RXE increased. Also, three of the 10 datasets showed a twofold or greater up-regulation of the X chromosome.
RXE across paralogs (table; recovered column headings: Human lymphoblast 45, Xm; Human lymphoblast 45, Xp; cell values listed in source order)
2.57 (41, 30) †; 0.18 (31, 28); 0.51 (13, 18); 0.79 (14, 17); 0.24 (38, 31); −0.64 (76, 43) *; 0 (69, 40); 0.20 (56, 31); −0.55 (61, 63); −0.06 (77, 44); 0.49 (106, 65) †; 0.03 (93, 60); 0.14 (66, 43); −0.15 (68, 46); 0.19 (104, 70); −0.18 (415, 546) *; −0.5 (378, 520) *; −0.32 (269, 393) *; −0.38 (272, 412) *; −0.73 (400, 559) *
RXE based on annotation (table; samples: Human lymphoblast 45, Xm; Human lymphoblast 45, Xp; annotations: RefSeq protein coding; UCSC (h19) known genes)
Dosage compensation in tissues
Library preparation and depth of coverage
Small RNA and riboprotein enrichment based on library preparation (Illumina or SOLiD) (table; libraries: Xp lymphoblast (SOLiD); Human lymphoblast (Illumina); XY mouse brain (SOLiD); Mouse brain (Illumina))
To examine the effect of library size on RXE estimation, we included 9 more human lymphoblastoid cell line RNA-seq libraries from a recent study, for a total of 14 human RNA-seq libraries analyzed. We calculated the library size and the associated RXE value for each library (Additional file 7: Figure S2). As anticipated, the 10 human lymphoblastoid cell lines, which were relatively small libraries (<50 million reads), clustered together with RXE values ranging between −0.4 and −0.1. Smaller libraries also appeared to be more variable in their estimates of RXE. For the 45, Xm and 45, Xp lymphoblastoid cell line libraries, which were much larger (>250 million reads), the RXE values approached and surpassed 0. Overall, RXE values look to asymptote to ~1 as library size increases. Given this result and our observations about the divergent behavior of our smaller mouse libraries, it appears that low-coverage libraries lack the power to properly assess expression of lowly expressed genes and paralogs and, in turn, alter the final RXE values.
Functional components of dosage compensation
Given the variable conclusions reached in several investigations concerning sex chromosome dosage compensation in different organisms [38–42], how confident can we be that any particular report has accurately measured expression levels clustered at the chromosome-to-chromosome level? Recently, data from a previously reported non-dosage compensated Z chromosome in the silkworm  were re-analyzed with consideration for statistical biases, leading to the conclusion that the Z is dosage compensated and rejecting the premise that ZW sex determination necessitates deviation from dosage compensation. Our analysis of RXE in human and mouse identified similarly serious issues that are not only important to the specific question of dosage compensation but address broader issues concerning the implementation of analytical tools for next-generation sequencing data. While our analysis focused on chromosome level comparisons, the issues we address will likely impinge on conclusions drawn from many types of global or clustered analysis of short read sequences. Our examination of the effects of library construction/sequencing methods, mapping protocols, sequence annotations and statistical treatment of data on estimates of RXE may also prove to be incomplete as RNA-seq data analysis continues to mature. The pitfalls we illustrate for RNA-seq are similarly presented for repetitive elements in short read genome assemblies by .
Three key questions considered in mapping short read sequence to a reference genome have a profound effect on downstream quantitative analysis of RNA-seq datasets: 1) are reads that align to more than one location in the reference reported in the mapped dataset; 2) if so, how many of those alignments are reported; and 3) if reporting of multiply-aligning short reads is limited, what rules govern the location to which a reported short read is assigned? Unique mapping parameters, implemented in a mapping program such as Bowtie, typically elide any reads that align to more than one location; hence genes that contain even short segments of high similarity to other genes will be excluded from further analysis. Depending on the limits to reporting of multiply-aligned reads, “non-unique” parameters may either swamp quantitative analysis with inclusion of high copy-number repeat transcripts or lead to inappropriate inclusion of non-expressed paralogs. Default parameters in programs such as TopHat and Cufflinks report multiply-aligned reads that may dramatically influence conclusions drawn in clustered analyses. Our analysis shows that the X chromosome is enriched for paralogous gene families relative to the autosomes. Since gene duplication is a straightforward method for achieving dosage compensation of a haploinsufficient gene, implementation of short-read sequence analysis tools that are inclusive of limited multiply-aligned sequences is essential to generating the most biologically realistic RXE levels.
Another consideration that appears to have a significant effect on the calculation of RXE is the use of a mapping tool that includes splice junction fragments. Consideration of splice junction fragments removed biases created by enrichment for small RNAs or riboproteins that were introduced during library preparation for either SOLiD or Illumina platforms. Differences in the consideration of splice junction fragments may also underlie the large discrepancy in RXE values produced from using different gene annotations. While all tissues across all annotations exhibited higher levels of RXE than those described in , we found considerable variation in RXE estimates when comparing values between all five annotations. This is of particular concern considering that some of these comparisons should be very similar. For instance, Ensembl uses Gencode annotations in the formulation of Ensembl genes and Ensembl Transcripts annotations. The fact that choice of annotation for mapping assignment is an unexpectedly important facet of RNA-seq analysis has also been reported by others .
Previous RNA-seq studies of X:autosome expression applied arbitrary cutoffs when filtering data, removing a proportion of genes that are highly or lowly expressed [19–24]. In each of these reports the approach to data trimming can be seen to improve the fit of the calculated X:autosome expression ratio to the authors’ desired conclusion. Trimming according to expression level clearly introduces bias because compensated genes may be disproportionately represented within different expression level classes. In other words, excluding genes with low, or high, FPKM values may result in exclusion of a significant cohort of compensated X-linked genes. Most FPKM estimation programs such as Cufflinks have some type of threshold criteria for determining whether or not a FPKM value will be called for that locus. Ascertainment bias from arbitrary cutoffs will be particularly acute for smaller libraries. Many RNA-seq studies, including the X dosage reports discussed herein, cite  for the designation: 1 transcript per cell is equivalent to FPKM=3. It should be noted that that equivalency only holds for the specific approach (i.e. RNA source, library preparation, mapping parameters) used in . This is particularly the case when using Cufflinks, in which FPKM estimates, without some sort of standard reference, are meaningful only in the relative sense. Recent studies indicate biologically relevant transcripts are represented at much greater depth [46, 47] and need to be accounted for in mapping and transcript assembly.
We found library size and type are very important in interpreting global expression analysis. The decision about which method of library construction to use can have a profound influence on characterization of expression profiles . In our study we included results from both Illumina RNA-seq and SOLiD RNA-seq, revealing differences between the two platforms largely due to the method of rRNA exclusion in library construction. Our comparisons of non-coding versus protein-coding annotations show that methods that exclude non-coding elements present lower estimates of RXE. Sequencing depth also plays a role in accurately modeling global or clustered gene expression. Library size and RXE are positively correlated in our analyses. Recent studies have indicated that a lack of sequencing depth is typically associated with the inability to detect lowly expressed genes [46, 49–51].
The evolution of dosage compensation of sex-linked genes will be driven by the fitness cost of under-expression in the heterogametic sex weighed against the cost of over-expression in the homogametic sex on a gene-by-gene basis. It is expected that only some genes will necessitate compensation once they become hemizygous. Therefore, gene function and the relative representation of certain functional groups on the sex chromosomes become important considerations in the detection of dosage compensation at the chromosome level. Although only weakly supported, our RXE calculations with respect to gene ontology classification largely agree with the predictions of the gene balance hypothesis, in which regulatory genes tend to be compensated while structural genes tend not to be.
High-throughput gene expression profiling forms the experimental basis of several recent reports that show either no evidence for dosage compensation such as in birds [8, 42] and lepidoptera , or that show some amount of dosage compensation such as in platypus , stickleback , and flour beetle . Even with the greater sensitivity afforded by next-generation sequencing and RNA-seq analysis, the choice of analytical tools and decisions implicit in their implementation, particularly with respect to inclusiveness of data, will have a profound effect on the conclusions drawn in any clustered analysis. More importantly, as others have argued, compensation may be more local than global [11, 36]. In the absence of an overriding chromosome-wide epigenetic mechanism, detection of dosage compensation for a sex chromosome will clearly depend mostly on the relative number of dosage sensitive genes to dosage insensitive genes that reside on it.
Our analysis of RNA-seq data, in consideration of several mitigating factors, indicates that gene expression from the X chromosome in mammals is up-regulated in many somatic tissues. While not every tissue-specific RNA-seq dataset has an RXE ≥ 0, no tissue in our analysis exhibits RXE as low as the values reported in . Some of these differences in RXE can be attributed to tissue specific activity of X-linked genes; however, we find RXE values falling within the range of variability of other chromosome-to-chromosome expression ratios. In addition, we identified serious issues not only important to addressing dosage compensation but to the larger concern of accurately implementing analytical tools for next generation sequencing. Our study shows how choices made along the entire pipeline of next-gen sequence analysis can profoundly influence the final conclusions to questions asked by many biologists.
In order to generate a global estimate for the relative expression of the X chromosome to the autosomes, we implemented an analytical framework for RNA-seq that took into consideration the methods and various assumptions associated with each methodological step: library construction; sequencing run; mapping reads from sequencing runs to a reference genome; assigning those mapped reads to an annotated region of interest; and calculating an expression value for that region of interest largely based on the mapping of those reads.
We implemented three different mapping protocols in our study to address three specific base assumptions about how mapping should be done; they are referred to as follows: “unique”, “non-unique”, and “non-unique, spliced”. Mapping runs were conducted using the Bowtie v0.12.7 algorithm and program. “Unique” means that a read is only included in the mapping results file if it maps to only one unique location in the reference. In terms of Bowtie parameters, this means that parameters k and m were set to 1. If the read maps to more than one region of a reference, then it is discarded from downstream expression estimates. A “non-unique” approach allows those reads that map to multiple locations in the reference to be included in downstream analyses. To isolate the effect of simply including multiply-mapped reads, our “non-unique” mapping allows multiply-mapped reads to be reported, but only once. For this approach, Bowtie parameter k was set to 1, while m had no limit. Additionally, all subsequent mapping matches for a read were ranked using the “best” and “strata” algorithms within Bowtie, which rank the matches for a specific read using the number of mismatches within the seed and across the entire read as well as the Phred scores at those mismatches. Our “non-unique” analysis only reports the “best” ranked match for a mapped read. Lastly, a “non-unique, spliced” mapping approach is most commonly recognized and implemented in the TopHat v1.3.1 program, which includes the consideration of splice junctions for discontinuous mapping of reads. All default parameters were used for these runs; however, no novel transcripts were predicted as Gencode v4 and mm9 UCSC gene models were used to define all splice junctions for human and mouse, respectively. This approach allows for non-unique mapping as well and uses a similar method of assigning alignment scores, but reports up to 20 randomly selected sequences if alignment scores are identical (default setting).
To detail whether the distribution of genetic entities such as paralogous genes among chromosomes might bias a specific mapping strategy, paralogs were identified using BioMart and differences in read mapping for those paralogs with >70% sequence similarity were examined for both unique and non-unique mapping runs. Our 70% minimum cutoff for paralogs was empirically determined by the ability of the BioMart paralog search algorithm to identify X-linked multigene families (e.g. Xlr) of which we had prior knowledge.
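The practical difference between the “unique” and “non-unique” protocols can be illustrated with a toy sketch. This is not Bowtie itself: the read names, locus names and the `select_alignment` helper are hypothetical, and ranking here uses mismatch count alone, whereas the ranking described above also considers Phred scores at the mismatched positions.

```python
from typing import Dict, List, Optional, Tuple

Alignment = Tuple[str, int]  # (locus name, number of mismatches)


def select_alignment(alignments: List[Alignment], mode: str) -> Optional[Alignment]:
    """Pick the alignment to report for one read, or None if the read is dropped."""
    if not alignments:
        return None
    if mode == "unique":
        # Report the read only if it aligns to exactly one location.
        return alignments[0] if len(alignments) == 1 else None
    if mode == "non-unique":
        # Report one alignment per read: the one with the fewest mismatches
        # (ties resolve to the first listed; a simplification).
        return min(alignments, key=lambda a: a[1])
    raise ValueError(f"unknown mode: {mode}")


def count_reads(reads: Dict[str, List[Alignment]], mode: str) -> Dict[str, int]:
    """Count reported reads per locus under the chosen protocol."""
    counts: Dict[str, int] = {}
    for alignments in reads.values():
        chosen = select_alignment(alignments, mode)
        if chosen is not None:
            counts[chosen[0]] = counts.get(chosen[0], 0) + 1
    return counts


# Hypothetical reads: one from a single-copy gene, two from a paralogous pair.
reads = {
    "read1": [("single_copy_gene", 0)],
    "read2": [("paralog_A", 1), ("paralog_B", 0)],
    "read3": [("paralog_A", 0), ("paralog_B", 2)],
}
print(count_reads(reads, "unique"))
print(count_reads(reads, "non-unique"))
```

Under “unique” both paralogs receive no reads at all, while under “non-unique” each receives the read that fits it best. This mirrors how a paralog-rich chromosome can lose apparent expression when multiply-mapping reads are discarded.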
To describe the role that reference annotation had on RXE, we examined different approaches to assigning reads for expression calculations: (1) mapped reads only, disregarding a priori regions of interest (such as exonic regions); (2) RefSeq exon annotations for genes to determine which reads mapping to specific regions would be retained in our estimates of expression; (3) RefSeq exon annotations for protein coding genes only; (4) Ensembl exon annotations for genes; (5) Ensembl exon annotations for transcripts; and (6) Gencode exon annotations. For approach (1), we estimated relative expression by weighting the number of reads that mapped to any specific chromosome by the number of genes found on that respective chromosome (patterns of chromosomal gene enrichment were described using BioMart) and then dividing the weighted number for the X chromosome by the corresponding weighted number for each of the other chromosomes. By averaging values of this relationship across all chromosomes, we calculated a value for RXE for each library that we examined (data were log2-transformed to maintain consistency for reasons described below). Alternatively, for approaches (2)-(6), we implemented the program Cufflinks v1.0.3 to estimate fragments per kilobase of exon per million fragments (FPKM), an index typically used in RNA-seq analyses, using different annotations with the same mapping results files. Default parameters were used for all Cufflinks FPKM calculations except for limiting FPKM calculations to the sites determined by the aforementioned associated annotations (without allowing additional transcript prediction). The contributions of multi-mapped reads to FPKM values are equally distributed across all valid mapping sites (i.e. if a single read maps to 10 sites, then each site is awarded 1/10th of that read in its total read count). Software-based bias corrections (both Fragment and Multi-map) were implemented but neither had any significant effect on results.
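The equal-split rule for multi-mapped reads described above can be sketched as follows. This is an illustration of the rule only, not of Cufflinks internals; the function name and the read and gene names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List


def distribute_reads(read_to_sites: Dict[str, List[str]]) -> Dict[str, float]:
    """Split each read equally among all of its valid mapping sites."""
    counts: Dict[str, float] = defaultdict(float)
    for sites in read_to_sites.values():
        if not sites:
            continue  # unmapped read contributes nothing
        share = 1.0 / len(sites)
        for site in sites:
            counts[site] += share
    return dict(counts)


# Hypothetical example: r1 maps uniquely, r2 maps to two paralogs.
counts = distribute_reads({"r1": ["geneA"], "r2": ["geneA", "geneB"]})
print(counts)
```

Here `geneA` accumulates 1.5 reads and `geneB` 0.5, so a paralog keeps a fractional count instead of being dropped from the expression estimate.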
Data manipulation and selection
We implemented three treatments of raw results to increase impartial statistical rigor, the amount of data used in the analysis, and overall robustness of analysis: (1) we log-base two transformed all FPKM values; (2) we removed any outliers that were more than 1.5 times the interquartile range (the mid-50 percentile distance) above the 75th percentile or below the 25th percentile; and (3) we used mean values instead of median values. A log2-transformation of data changes the scale of analyses and allows for more appropriate assessment of lowly-expressed loci (particularly, FPKM values <1) and highly-expressed loci (reducing effects of large values on moment estimation) by allowing the distribution of data to closely resemble a “normal” distribution model and, thus, better describe the central tendencies of that distribution. Taking an impartial approach to outlier identification minimizes differences among tissues with very specific patterns of gene expression and removes data points that may overly influence mean estimation of a general pattern in relative X and autosomal gene expression while maintaining statistical rigor.
Given that we used a log2-transformation, instead of the traditional X:A ratio (described as X expression divided by autosome average expression), we used an index of log2(X expression) – log2(A expression) to describe patterns in relative X expression, or RXE. Here, a value of zero means equal expression of X and A, a value of 1 means twice as much expression of X as A, and a value of −1 means half as much expression on the X as compared to A. Therefore, it follows that values near zero indicate dosage compensation, while values near −1 indicate no dosage compensation occurring.
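The three treatments above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name is hypothetical, quartiles are computed with the "inclusive" method of Python's `statistics.quantiles` (the quartile convention used in the study is not specified), and FPKM values of zero are simply skipped because their logarithm is undefined, which may differ from how zeros were handled in the actual analysis.

```python
import math
import statistics
from typing import Iterable


def trimmed_log2_mean(fpkm_values: Iterable[float]) -> float:
    """Mean of log2(FPKM) after removing outliers beyond 1.5 x IQR."""
    # (1) log2-transform; non-positive values are skipped (assumption).
    logs = [math.log2(v) for v in fpkm_values if v > 0]
    # (2) drop values more than 1.5 x IQR outside the 25th/75th percentiles.
    q1, _, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    kept = [x for x in logs if low <= x <= high]
    # (3) summarize with the mean, not the median.
    return statistics.mean(kept)


# Hypothetical FPKM values with one extreme locus.
print(trimmed_log2_mean([1, 2, 4, 8, 16, 2**20]))
```

In this example the extreme value is removed by the outlier rule, and the mean of the remaining log2 values (0 through 4) is returned.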
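The RXE index defined above is a difference of logarithms, so it converts directly to and from the traditional X:A ratio. The sketch below uses hypothetical function names and example expression values chosen only to reproduce the three reference points named in the text.

```python
import math


def rxe(x_expression: float, a_expression: float) -> float:
    """Relative X expression: log2(X expression) - log2(A expression)."""
    return math.log2(x_expression) - math.log2(a_expression)


def xa_ratio(rxe_value: float) -> float:
    """Convert an RXE value back to the traditional X:A ratio."""
    return 2.0 ** rxe_value


print(rxe(10.0, 10.0))  # equal expression of X and A
print(rxe(20.0, 10.0))  # X expressed at twice the autosomal level
print(rxe(5.0, 10.0))   # X expressed at half the autosomal level
print(xa_ratio(-1.0))   # the X:A ratio corresponding to RXE = -1
```

Up to floating-point rounding, the first three calls give 0, 1 and −1, and an RXE of −1 corresponds to an X:A ratio of 0.5, the value reported by studies arguing against up-regulation of the X.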
Final relative X-chromosome expression estimation
Using information gathered from the above treatments of data, we decided on using a “non-unique, spliced” approach to mapping that uses the conservative gene-identifying RefSeq annotation and log2-transformation of FPKM values with a traditional approach to outlier removal to estimate RXE. In addition to estimating RXE, we also calculated the relative expression of each chromosome to all other chromosomes (excluding the Y and mitochondria) in order to see if the X truly deviates from the expression patterns of other chromosomes (i.e. is at half the expression level of other chromosomes).
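One way to picture the chromosome-versus-all-others comparison is sketched below. It assumes each chromosome is summarized by a mean log2 FPKM value and compares that value with the mean of the remaining chromosomes; the exact form of the comparison used in the study may differ, and the function name and numbers are hypothetical.

```python
from statistics import mean
from typing import Dict


def relative_expression(chrom_means: Dict[str, float]) -> Dict[str, float]:
    """Log2 expression of each chromosome relative to the mean of all others."""
    result: Dict[str, float] = {}
    for chrom, value in chrom_means.items():
        others = [v for c, v in chrom_means.items() if c != chrom]
        result[chrom] = value - mean(others)
    return result


# Hypothetical mean log2(FPKM) per chromosome (Y and mitochondria excluded).
example = {"chr1": 3.0, "chr2": 3.5, "chrX": 3.0}
print(relative_expression(example))
```

If the X were uncompensated, its value would sit near −1 while the autosomes scattered around 0. In this toy input the X falls within the spread of the autosomes.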
Functional component dosage compensation
All FPKM values for 5 human tissue-types (brain, liver, lymphocyte, lymphocyte Xm, and lymphocyte Xp) were filtered based on the GO term of interest by mining the on-line AmiGO database for gene names associated with each Biological Process term of interest. Molecular Function group comparisons were done in a similar fashion; however, the terms of interest were identified based on results by Kondrashov and Koonin  that found some specific terms to be over-represented in haplo-insufficient genes.
GSE16921; GSE12946; SRA001030; SRA047980
RXE: Relative X expression
XCI: X chromosome inactivation
FPKM: Fragments per kilobase of exon per million fragments mapped
Xm: Maternal X chromosome
Xp: Paternal X chromosome
This work was supported by grants from: NINDS, 1R01NS057607 (MJO); NSF, IOS-0920088 (MJO and RJO); and NSF, MRI-R2, DBI-0959365 (MJO and RJO).
- Muller HJ: A factor for the fourth chromosome of Drosophila. Science. 1914, 39 (1016): 906.
- Charlesworth B: Model for evolution of Y chromosomes and dosage compensation. Proc Natl Acad Sci U S A. 1978, 75 (11): 5618-5622. 10.1073/pnas.75.11.5618.
- Graves JA: Sex chromosome specialization and degeneration in mammals. Cell. 2006, 124 (5): 901-914. 10.1016/j.cell.2006.02.024.
- Larsson J, Meller VH: Dosage compensation, the origin and the afterlife of sex chromosomes. Chromosome Res. 2006, 14 (4): 417-431. 10.1007/s10577-006-1064-3.
- da Cunha PR, Granadino B, Perondini AL, Sanchez L: Dosage compensation in sciarids is achieved by hypertranscription of the single X chromosome in males. Genetics. 1994, 138 (3): 787-790.
- Meyer BJ: Targeting X chromosomes for repression. Curr Opin Genet Dev. 2011, 20 (2): 179-189.
- Itoh Y, Melamed E, Yang X, Kampf K, Wang S, Yehya N, Van Nas A, Replogle K, Band M, Clayton D: Dosage compensation is less effective in birds than in mammals. J Biol. 2007, 6 (1): 2. 10.1186/jbiol53.
- Ellegren H, Hultin-Rosenberg L, Brunstrom B, Dencker L, Kultima K, Scholz B: Faced with inequality: chicken do not have a general dosage compensation of sex-linked genes. BMC Biol. 2007, 5: 40. 10.1186/1741-7007-5-40.
- Mank JE, Ellegren H: All dosage compensation is local: gene-by-gene regulation of sex-biased expression on the chicken Z chromosome. Heredity. 2009, 102 (3): 312-320. 10.1038/hdy.2008.116.
- Wolf J, Bryk J: General lack of global dosage compensation in ZZ/ZW systems? Broadening the perspective with RNA-seq. BMC Genomics. 2011, 12 (1): 91. 10.1186/1471-2164-12-91.
- Mank JE, Hosken DJ, Wedell N: Some inconvenient truths about sex chromosome dosage compensation and the role of sexual conflict. Evolution. 2011, 65 (8): 2133-2144. 10.1111/j.1558-5646.2011.01316.x.
- Lee JT: Lessons from X-chromosome inactivation: long ncRNA as guides and tethers to the epigenome. Genes Dev. 2009, 23 (16): 1831-1842. 10.1101/gad.1811209.
- Escamilla-Del-Arenal M, da Rocha ST, Heard E: Evolutionary diversity and developmental regulation of X-chromosome inactivation. Hum Genet. 2011, 130 (2): 307-327. 10.1007/s00439-011-1029-2.
- Grant J, Mahadevaiah SK, Khil P, Sangrithi MN, Royo H, Duckworth J, McCarrey JR, VandeBerg JL, Renfree MB, Taylor W: Rsx is a metatherian RNA with Xist-like properties in X-chromosome inactivation. Nature. 2012, 487 (7406): 254-258. 10.1038/nature11171.
- Ohno S: Sex chromosomes and sex-linked genes. 1967, Berlin, New York [etc.]: Springer-Verlag.
- Nguyen DK, Disteche CM: Dosage compensation of the active X chromosome in mammals. Nat Genet. 2006, 38 (1): 47-53. 10.1038/ng1705.
- Gupta V, Parisi M, Sturgill D, Nuttall R, Doctolero M, Dudko OK, Malley JD, Eastman PS, Oliver B: Global analysis of X-chromosome dosage compensation. J Biol. 2006, 5 (1): 3. 10.1186/jbiol30.
- Lin H, Gupta V, VerMilyea MD, Falciani F, Lee JT, O'Neill LP, Turner BM: Dosage compensation in the mouse balances up-regulation and silencing of X-linked genes. PLoS Biol. 2007, 5 (12): 2809-2820.
- Xiong YY, Chen XS, Chen ZD, Wang XZ, Shi SH, Wang XQ, Zhang JZ, He XL: RNA sequencing shows no dosage compensation of the active X-chromosome. Nat Genet. 2010, 42 (12): 1043-U1029. 10.1038/ng.711.
- Deng X, Hiatt JB, Nguyen DK, Ercan S, Sturgill D, Hillier LW, Schlesinger F, Davis CA, Reinke VJ, Gingeras TR: Evidence for compensatory upregulation of expressed X-linked genes in mammals, Caenorhabditis elegans and Drosophila melanogaster. Nat Genet. 2011, 43 (12): 1179-1185. 10.1038/ng.948.
- Kharchenko PV, Xi R, Park PJ: Evidence for dosage compensation between the X chromosome and autosomes in mammals. Nat Genet. 2011, 43 (12): 1167-1169; author reply 1171-1172. 10.1038/ng.991.
- Lin H, Halsall JA, Antczak P, O'Neill LP, Falciani F, Turner BM: Relative overexpression of X-linked genes in mouse embryonic stem cells is consistent with Ohno's hypothesis. Nat Genet. 2011, 43 (12): 1169-1170; author reply 1171-1172. 10.1038/ng.992.
- Yildirim E, Sadreyev RI, Pinter SF, Lee JT: X-chromosome hyperactivation in mammals via nonlinear relationships between chromatin states and transcription. Nat Struct Mol Biol. 2012, 19 (1): 56-61.
- He X, Chen X, Xiong Y, Chen Z, Wang X, Shi S, Wang X, Zhang J: He et al. Reply. Nat Genet. 2011, 43 (12): 1171-1172. 10.1038/ng.1010.
- Li M, Wang IX, Li Y, Bruzel A, Richards AL, Toung JM, Cheung VG: Widespread RNA and DNA sequence differences in the human transcriptome. Science. 2011, 333 (6038): 53-58. 10.1126/science.1207018.
- Kleinman CL, Majewski J: Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science. 2012, 335 (6074): 1302; author reply 1302.
- Lin W, Piskol R, Tan MH, Li JB: Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science. 2012, 335 (6074): 1302; author reply 1302.
- Pickrell JK, Gilad Y, Pritchard JK: Comment on "Widespread RNA and DNA sequence differences in the human transcriptome". Science. 2012, 335 (6074): 1302; author reply 1302.
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods. 2008, 5 (7): 621-628. 10.1038/nmeth.1226.
- Wang ET, Sandberg R, Luo SJ, Khrebtukova I, Zhang L, Mayr C, Kingsmore SF, Schroth GP, Burge CB: Alternative isoform regulation in human tissue transcriptomes. Nature. 2008, 456 (7221): 470-476. 10.1038/nature07509.
- Cheung VG, Nayak RR, Wang IX, Elwyn S, Cousins SM, Morley M, Spielman RS: Polymorphic cis- and trans-regulation of human gene expression. PLoS Biol. 2010, 8 (9): e1000480. 10.1371/journal.pbio.1000480.
- Tarazona S, Garcia-Alcalde F, Dopazo J, Ferrer A, Conesa A: Differential expression in RNA-seq: a matter of depth. Genome Res. 2011, 21 (12): 2213-2223. 10.1101/gr.124321.111.
- Trapnell C, Pachter L, Salzberg SL: TopHat: discovering splice junctions with RNA-Seq. Bioinformatics. 2009, 25 (9): 1105-1111. 10.1093/bioinformatics/btp120.
- Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L: Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010, 28 (5): 511-U174. 10.1038/nbt.1621.
- Kondrashov FA, Koonin EV: A common framework for understanding the origin of genetic dominance and evolutionary fates of gene duplications. Trends Genet. 2004, 20 (7): 287-290. 10.1016/j.tig.2004.05.001.
- Birchler JA, Veitia RA: The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol. 2010, 186 (1): 54-62. 10.1111/j.1469-8137.2009.03087.x.
- Pessia E, Makino T, Bailly-Bechet M, McLysaght A, Marais GA: Mammalian X chromosome inactivation evolved as a dosage-compensation mechanism for dosage-sensitive genes on the X chromosome. Proc Natl Acad Sci U S A. 2012, 109 (14): 5346-5351. 10.1073/pnas.1116763109.
- Deakin JE, Hore TA, Koina E, Graves JAM: The status of dosage compensation in the multiple X chromosomes of the platypus. PLoS Genet. 2008, 4 (7): 13.
- Zha XF, Xia QY, Duan J, Wang CY, He NJ, Xiang ZH: Dosage analysis of Z chromosome genes using microarray in silkworm, Bombyx mori. Insect Biochem Mol Biol. 2009, 39 (5–6): 315-321.
- Leder EH, Cano JM, Leinonen T, O'Hara RB, Nikinmaa M, Primmer CR, Merila J: Female-biased expression on the X chromosome as a key step in sex chromosome evolution in threespine sticklebacks. Mol Biol Evol. 2010, 27 (7): 1495-1503. 10.1093/molbev/msq031.
- Prince EG, Kirkland D, Demuth JP: Hyperexpression of the X chromosome in both sexes results in extensive female bias of X-linked genes in the flour beetle. Genome Biol Evol. 2010, 2: 336-346. 10.1093/gbe/evq024.
- Itoh Y, Replogle K, Kim YH, Wade J, Clayton DF, Arnold AP: Sex bias and dosage compensation in the zebra finch versus chicken genomes: general and specialized patterns among birds. Genome Res. 2010, 20 (4): 512-518. 10.1101/gr.102343.109.
- Walters JR, Hardcastle TJ: Getting a full dose? Reconsidering sex chromosome dosage compensation in the silkworm, Bombyx mori. Genome Biol Evol. 2011, 3: 491-504. 10.1093/gbe/evr036.
- Treangen TJ, Salzberg SL: Repetitive DNA and next-generation sequencing: computational challenges and solutions. Nat Rev Genet. 2012, 13 (1): 36-46.
- Roberts A, Pimentel H, Trapnell C, Pachter L: Identification of novel transcripts in annotated genomes using RNA-Seq. Bioinformatics. 2011, 27 (17): 2325-2329. 10.1093/bioinformatics/btr355.
- Tarazona S, García-Alcalde F, Dopazo J, Ferrer A, Conesa A: Differential expression in RNA-seq: A matter of depth. Genome Res. 2011, 21 (12): 2213-2223. 10.1101/gr.124321.111.
- Hebenstreit D, Fang M, Gu M, Charoensawan V, van Oudenaarden A, Teichmann SA: RNA sequencing reveals two major classes of gene expression levels in metazoan cells. Mol Syst Biol. 2011, 7: 497.
- Cui P, Lin QA, Ding F, Xin CQ, Gong W, Zhang LF, Geng JN, Zhang B, Yu XM, Yang J: A comparison between ribo-minus RNA-sequencing and polyA-selected RNA-sequencing. Genomics. 2010, 96 (5): 259-265. 10.1016/j.ygeno.2010.07.010.
- Cloonan N, Forrest ARR, Kolle G, Gardiner BBA, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G: Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Methods. 2008, 5 (7): 613-619. 10.1038/nmeth.1223.
- Blencowe BJ, Ahmad S, Lee LJ: Current-generation high-throughput sequencing: deepening insights into mammalian transcriptomes. Genes Dev. 2009, 23 (12): 1379-1386. 10.1101/gad.1788009.
- Toung JM, Morley M, Li MY, Cheung VG: RNA-sequence analysis of human B-cells. Genome Res. 2011, 21 (6): 991-998. 10.1101/gr.116335.110.
- Keane TM, Goodstadt L, Danecek P, White MA, Wong K, Yalcin B, Heger A, Agam A, Slater G, Goodson M: Mouse genomic variation and its effect on phenotypes and gene regulation. Nature. 2011, 477 (7364): 289-294. 10.1038/nature10413.
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10 (3): R25. 10.1186/gb-2009-10-3-r25.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.