Several methods now exist to quantitatively measure gene expression within a biological sample, allowing us to compare expression between cells from different tissues, and between cells from the same tissue under different conditions. More recent technologies for this purpose include microarrays and RNA-seq. However, one of the most popular remains RT-qPCR, due to its accessibility, relatively low cost, small requisite amount of starting material and high precision. Although it has a lower throughput than some other methods, technological advances in recent years have led to improvements. Through microfluidics and other technologies it is now possible to run hundreds, even thousands, of RT-qPCR reactions in parallel with the same starting sample [2, 3], with a high enough precision that it is frequently used to validate findings made through higher-throughput technologies (details of available technologies are provided in ). Its usage remains ubiquitous.
Such RT-qPCR technologies quantify gene expression by attempting to amplify a target DNA sequence, representing a gene or other biological molecule, in a query sample (the target is DNA because the RNA in the original tissue is reverse transcribed to make cDNA). The sample is placed in a well with a primer specific for the DNA sequence to be measured, necessary for amplification to begin. In the case of the high-throughput RT-qPCR technologies, the sample is delivered to a number of wells in parallel, each containing a separate primer. Then a number of amplification cycles are performed for each well. A predefined threshold is set within the exponential amplification phase, when doubling of the product can be detected above background fluorescence, and the number of cycles it takes to get to this threshold is used to estimate the amount of cDNA sequence present, and thus the amount of RNA that was present in the initial tissue.
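The threshold-crossing idea described above can be illustrated with a minimal Python sketch. It is not how any particular instrument or package computes Cq: the function name `estimate_cq`, the use of linear interpolation between cycles, and the toy fluorescence values are all assumptions made for illustration, and real software typically fits the curve on a log scale.

```python
from typing import Optional, Sequence


def estimate_cq(fluorescence: Sequence[float], threshold: float) -> Optional[float]:
    """Return the fractional cycle at which the signal first reaches `threshold`.

    `fluorescence[i]` is taken to be the background-corrected signal after
    cycle i + 1. Returns None if the threshold is never reached, which
    corresponds to an undetermined Cq value.
    """
    for i, current in enumerate(fluorescence):
        if current >= threshold:
            if i == 0:
                return 1.0  # already above threshold after the first cycle
            previous = fluorescence[i - 1]
            # Linear interpolation between cycle i and cycle i + 1
            # (an illustrative simplification).
            return i + (threshold - previous) / (current - previous)
    return None


# Hypothetical, idealised curves in which the product doubles every cycle.
low_input = [1, 2, 4, 8, 16]
high_input = [2, 4, 8, 16, 32]  # twice the starting template

print(estimate_cq(low_input, 6))   # 3.5
print(estimate_cq(high_input, 6))  # 2.5
```

In this idealised case, doubling the starting material shifts the Cq value by exactly one cycle, which is why differences in Cq behave like differences in log2 abundance.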
These values are known as quantification cycle (Cq) values (also known as threshold-cycle (Ct) values, but herein referred to as Cq, in line with the standardised nomenclature suggested in ). By comparing the Cq values between two samples (for example treated and untreated tissue), one can compare the amount of DNA sequence in one sample relative to the other. It is strongly recommended to normalise these raw values to account for systematic variation between samples, related to differing starting amounts of material, tissue-specific differences in transcription efficiency, and a number of other factors. This is typically achieved through the use of reference genes (endogenous control or housekeeping genes, also referred to simply as housekeepers). These are stably expressed genes that should not change in expression in response to a change in the cell’s environment, or between different cell types.
Assuming the reference gene exhibits stable expression across different samples, and assuming it does not show a change in expression between sample-types (i.e. between cells under different conditions or between different cell types), the subtraction of the Cq value of the reference gene from that of the target gene should account for the systematic variation between samples, and allow the expression of genes in different samples to be compared directly. Furthermore, it is generally recommended to combine multiple reference genes in order to reduce error, assuming their combination also shows stable expression.
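As a rough illustration of this subtraction, the following Python sketch normalises one sample against the mean Cq of several reference genes. The function name `delta_cq`, the gene labels and the Cq values are hypothetical, and averaging the reference Cq values arithmetically is an assumption of this sketch (it corresponds to a geometric mean on the linear expression scale).

```python
from statistics import mean
from typing import Dict, List


def delta_cq(sample_cq: Dict[str, float], reference_genes: List[str]) -> Dict[str, float]:
    """Subtract the mean reference-gene Cq from every target gene in one sample."""
    # Cq is on a log2-like scale, so an arithmetic mean of Cq values acts as
    # a geometric mean of the underlying expression levels.
    normalisation_factor = mean(sample_cq[gene] for gene in reference_genes)
    return {
        gene: cq - normalisation_factor
        for gene, cq in sample_cq.items()
        if gene not in reference_genes
    }


# Hypothetical Cq values for a single sample.
sample = {"GeneA": 25.0, "Ref1": 18.0, "Ref2": 20.0}
print(delta_cq(sample, ["Ref1", "Ref2"]))  # {'GeneA': 6.0}
```

A lower normalised value indicates higher expression relative to the reference genes, because fewer cycles were needed to reach the threshold.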
However, it is often the case that reference genes do change in expression between sample-types, or show high stochastic variation under certain conditions [8–11]. The choice of a reference gene that shows variation between sample-types will clearly bias estimation of the expression of other genes within the samples, since subtraction of said reference gene’s expression value from that of a gene will lead to over- or underestimation of the true expression of that gene. Similarly, a reference gene that shows a high intrinsic variation in expression under the conditions of the experiment will lead to inflated stochastic error when estimating the true abundance of the other genes within a sample [8, 12].
Several statistical methods have now been proposed to deal with the problem of reference gene selection. These methods will either select the optimal reference gene for an experiment, or a number of reference genes, whose expression values should be combined in order to generate a normalisation factor (NF), which can be used as the calibrator. The work of Vandesompele et al. starts with a number of potential reference genes and attempts to find the best set of reference genes from this initial list (with a minimum of two, since the two most stable genes cannot be ranked). It does this by looking for the most stably expressed reference genes across all samples within an experiment, without taking into account the labelling of different sample-types. Andersen et al. proposed a model-based approach that takes into account the overall variability of a reference gene within an experiment, and also between different sample-types. More details on these methods (amongst others) can be found in a recent paper by Chervoneva et al., which also introduces a new method for reference gene selection, accounting for correlation between different reference genes. A summary of available software is provided in a chapter of a recently published, comprehensive book on RT-qPCR.
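To make the idea of ranking candidates by stability more concrete, the sketch below computes a pairwise-variation stability measure in the spirit of the approach of Vandesompele et al.: for each gene, the average standard deviation of its Cq difference with every other candidate across samples. This is an illustration under stated assumptions, not the NormqPCR implementation: the function name `stability_m`, the gene labels and the data are hypothetical, amplification efficiency is assumed to be 2 so that Cq differences act as log2 expression ratios, and the stepwise elimination of the least stable gene is omitted.

```python
from itertools import combinations
from statistics import mean, stdev
from typing import Dict, List


def stability_m(cq: Dict[str, List[float]]) -> Dict[str, float]:
    """Average pairwise variation of each candidate gene (lower is more stable).

    `cq[gene]` holds that gene's Cq value in each sample, in the same sample
    order for every gene. At least two genes and two samples are required.
    """
    genes = list(cq)
    pair_sd = {}
    for a, b in combinations(genes, 2):
        # Cq differences stand in for log2 expression ratios (efficiency 2 assumed).
        diffs = [x - y for x, y in zip(cq[a], cq[b])]
        pair_sd[(a, b)] = pair_sd[(b, a)] = stdev(diffs)
    return {
        gene: mean(pair_sd[(gene, other)] for other in genes if other != gene)
        for gene in genes
    }


# Hypothetical Cq values for three candidate reference genes in three samples.
candidates = {
    "Ref1": [20.0, 21.0, 22.0],
    "Ref2": [18.0, 19.0, 20.0],  # tracks Ref1 exactly
    "Ref3": [25.0, 24.0, 27.0],  # varies independently of the others
}
for gene, m in sorted(stability_m(candidates).items(), key=lambda item: item[1]):
    print(gene, round(m, 3))
```

When only two genes remain, each one's value is simply the standard deviation of their single pairwise difference, so the two values are identical; this is one way to see why the two most stable genes cannot be ranked against each other.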
The raw Cq value of a target gene minus that of the best reference gene is known as the ΔCq value. To calculate relative fold change between different conditions, the ΔCq value of a gene of interest in one sample type can be subtracted from its value in another sample type, in order to calculate the ΔΔCq value, and thus $2^{-\Delta\Delta Cq}$.
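A small worked example may help here. The Python sketch below computes the ΔΔCq value and the corresponding fold change from raw Cq values; the function name `fold_change`, the argument names and the numbers are hypothetical, and the base of 2 assumes perfect doubling of product at every cycle.

```python
def fold_change(
    target_case: float,
    reference_case: float,
    target_control: float,
    reference_control: float,
) -> float:
    """Relative fold change of a target gene (case versus control) from raw Cq values."""
    delta_cq_case = target_case - reference_case
    delta_cq_control = target_control - reference_control
    delta_delta_cq = delta_cq_case - delta_cq_control
    # Base 2 assumes the product doubles exactly once per cycle.
    return 2.0 ** (-delta_delta_cq)


# Hypothetical values: the target crosses the threshold two cycles earlier in
# the treated sample, while the reference gene is unchanged.
print(fold_change(target_case=24.0, reference_case=20.0,
                  target_control=26.0, reference_control=20.0))  # 4.0
```

Here the ΔCq values are 4 (treated) and 6 (control), so ΔΔCq is −2 and the gene is estimated to be four times more abundant in the treated sample.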
Another way the reference genes can be used to normalise the Cq results is through the adaptation of the method of Pfaffl et al., where the efficiency of the reference gene is estimated and taken into account when normalising the other genes of interest.
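For orientation, the efficiency-corrected ratio as it is commonly stated for the Pfaffl method can be sketched as follows. The adaptation referred to above may differ in detail; the function name `pfaffl_ratio`, the argument names and the example values are hypothetical, and efficiencies are expressed here as per-cycle amplification factors (2.0 meaning perfect doubling).

```python
def pfaffl_ratio(
    e_target: float,
    e_reference: float,
    target_control: float,
    target_sample: float,
    reference_control: float,
    reference_sample: float,
) -> float:
    """Efficiency-corrected expression ratio of a target gene (sample versus control)."""
    # Each gene's Cq shift is scaled by its own amplification factor.
    target_term = e_target ** (target_control - target_sample)
    reference_term = e_reference ** (reference_control - reference_sample)
    return target_term / reference_term


# With both efficiencies equal to 2, the ratio reduces to 2 ** (-ΔΔCq).
print(pfaffl_ratio(2.0, 2.0, 26.0, 24.0, 20.0, 20.0))  # 4.0

# A less efficient target assay (hypothetical factor 1.9) gives a smaller ratio.
print(pfaffl_ratio(1.9, 2.0, 26.0, 24.0, 20.0, 20.0))
```

The comparison shows why estimated efficiencies matter: the same Cq shift of two cycles corresponds to a smaller change in abundance when amplification is less than perfectly efficient.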
Recently, other normalisation methods have been proposed that adapt methods originally developed for microarrays and other high-throughput genomic technologies [19–21].
Here we present two packages, ReadqPCR and NormqPCR, written in the freely available statistical computing software R (http://www.r-project.org/) and available as part of the Bioconductor project (http://www.bioconductor.org/). They allow the user to read RT-qPCR data into R, deal with undetermined Cq values, find a suitable reference gene or genes for a given experiment using a method for optimal reference gene selection, and normalise the data via the ΔCq and $2^{-\Delta\Delta Cq}$ normalisation methods. The user can also use a number of existing Bioconductor packages and functions to perform quality control on their data, and can check the adequacy of reference genes visually. We demonstrate the basic functionality of the packages here and provide an example workflow, involving the different packages alongside several other well-known and widely used CRAN and Bioconductor packages, applied to a generic RT-qPCR experiment. We then present an experiment where ReadqPCR and NormqPCR have been used to analyse an RT-qPCR dataset, and take the user through the different steps that were undertaken in the analysis of the data.