Materials
Briefly, cervical avulsed human DRG tissues (n = 10) were removed in the setting of a dorsal rhizotomy of lower cervical (and, in one case, upper thoracic) nerve roots in patients with persistent pain syndromes after accidental avulsion injuries. Details on obtaining, processing and measuring the expression of total RNA in the human DRG and control tissue samples are available at http://itsa.ucsf.edu/~yxiao/Avulsion. The control human DRG sample was pooled from total RNA (n = 8; prepared by Clontech) obtained from post-mortem tissues. This pooled sample was then run in triplicate (n = 3). Total RNA was extracted from the tissue by Trizol (LifeTechnologies) extraction and polyA+ RNA was recovered with Oligotex mRNA Spin Columns (Qiagen). Biotinylated cRNA targets were prepared and hybridized to Affymetrix HU6800 oligonucleotide arrays according to the manufacturer's protocol (Affymetrix, Santa Clara, California).
Data analysis methods
Normalization
All data from patient and control samples were preprocessed (background subtraction, differential intensities) using Affymetrix GeneChip software, and average differential intensity (ADI) was used for analysis. The average of all ADI values on each chip was calculated, and the mean of these chip averages was used as the target intensity. The ADI values of each chip were then scaled so that the chip average equaled the target intensity. Since negative, zero, and very small positive values (<20) of ADI most likely represent noise rather than actual gene expression, all such values were truncated at 20, resulting in 5,638 probe sets for analysis.
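The normalization can be illustrated with a minimal sketch. It assumes that "normalized to be equal to the target intensity" means multiplicative scaling of each chip, that truncation means setting values below 20 to 20, and that truncation follows scaling; the function name `normalize_chips` and the ADI values are hypothetical.

```python
from statistics import mean

FLOOR = 20.0  # ADI values below this are treated as noise (threshold from the text)


def normalize_chips(chips, floor=FLOOR):
    """Scale each chip to a common mean ADI, then floor small values.

    `chips` is a list of lists: one inner list of ADI values per chip.
    Multiplicative scaling and scale-then-floor order are assumptions.
    """
    chip_means = [mean(chip) for chip in chips]
    target = mean(chip_means)  # target intensity: mean of the chip averages
    normalized = []
    for chip, chip_mean in zip(chips, chip_means):
        scale = target / chip_mean  # assumes the chip average is non-zero
        normalized.append([max(value * scale, floor) for value in chip])
    return normalized, target


# Made-up ADI values for two chips and three probe sets.
chips = [[100.0, 300.0, -40.0], [60.0, 180.0, 0.0]]
normalized, target = normalize_chips(chips)
```

Note that flooring after scaling shifts each chip average slightly away from the target; whether the original analysis corrected for this is not stated.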
Graphical display
A Quantile-Quantile plot (Q-Q plot) was generated by plotting the quantiles of the t statistics of all genes against the quantiles of a standard normal distribution. In addition to examining the distribution of the t statistics, we used this plot to identify points that deviate markedly from the bulk of the observations, since such points may represent genes whose mean expression differs between the patient and control groups. Four scatter plots were generated to study the features of large t statistics: t statistics vs. average intensities, t statistics vs. t denominators, t statistics vs. t numerators, and t numerators vs. t denominators.
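The coordinates of such a Q-Q plot can be computed as in the following sketch. The plotting positions (i - 0.5)/n are one common convention and are an assumption here, as are the function name `qq_points` and the example values.

```python
from statistics import NormalDist


def qq_points(t_stats):
    """Return (theoretical, observed) quantile pairs for a normal Q-Q plot."""
    observed = sorted(t_stats)
    n = len(observed)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Plotting positions (i - 0.5) / n are an assumed convention.
    theoretical = [std_normal.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, observed))


# Made-up t statistics; the last value lies far from the bulk.
points = qq_points([0.3, -1.1, 0.8, -0.2, 6.5])
```

Points that fall well off the line through the bulk of the pairs, such as the largest value in this example, are the ones the plot is meant to expose.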
Significance Analysis of Microarrays (SAM)
The SAM procedure described by Tusher et al. [3] based on the t-statistic (under the assumption of equal variances between the two groups) and the Welch statistic (under the assumption of unequal variances between the two groups) was modified as follows:
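1. The relative difference in expression d(i) for the ith of k = 5638 probe sets was defined as

$$d(i) = \frac{\bar{x}_P(i) - \bar{x}_C(i)}{s(i) + s_0}; \quad i = 1, 2, \ldots, k$$

where $\bar{x}_P(i)$ and $\bar{x}_C(i)$ are means of expression levels of probe set i in the patient and control groups, respectively.
For t statistics,

$$s(i) = \sqrt{\left(\frac{1}{n_1} + \frac{1}{n_2}\right) \frac{\sum_{m}\left[x_m(i) - \bar{x}_P(i)\right]^2 + \sum_{n}\left[x_n(i) - \bar{x}_C(i)\right]^2}{n_1 + n_2 - 2}}$$

and for Welch statistics,

$$s(i) = \sqrt{\frac{s_P^2(i)}{n_1} + \frac{s_C^2(i)}{n_2}}$$

where $s_P^2(i)$ and $s_C^2(i)$ are the variances of expression levels for probe set i in the patient and control groups, the sums over m and n run over the patient and control samples, and $n_1 = 10$ and $n_2 = 3$ are the numbers of patient and control samples. The determination of $s_0$ is discussed in the Results section. For all d(i) values, corresponding p values were calculated with d(i) referenced to an appropriate t distribution.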
2. All p values were ordered: $p_1 \le p_2 \le \ldots \le p_k$.
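3. An exhaustive set of 285 permutations (13 choose 3 = 286 configurations, minus the original one) was conducted. For each permutation b, p values were calculated and ordered as $p_1^{b} \le p_2^{b} \le \ldots \le p_k^{b}$.
4. Expected p values for all probe sets were calculated as $\bar{p}_i = \frac{1}{B}\sum_{b=1}^{B} p_i^{b}$, where i = 1, 2, ..., k, $p_i^{b}$ is the ith smallest p value in permutation b, and B = 285 is the number of permutations.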
5. For a fixed threshold Δ, a probe set was called "significant" if its p value satisfied the criterion $\bar{p}_i - p_i > \Delta$, where $\bar{p}_i$ is the expected p value from step 4. The total number of "significant" probe sets was counted. The largest $p_i$ among the "significant" probe sets was defined as the cutoff value.
6. For each permutation, all probe sets whose p values were smaller than the cutoff value were found and were called "falsely significant". The total number of these probe sets was counted. The estimated number of "falsely significant" probe sets was defined as the average of the number of "falsely significant" probe sets from all permutations. The false discovery rate (FDR) was computed as the ratio of the estimated number of "falsely significant" probe sets to the total number of "significant" probe sets.
7. FDR was computed with different values of threshold Δ.
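The steps above can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the conversion of d(i) to a p value via a t distribution is left out (p values are taken as given), and the step 5 criterion is assumed to be "expected p value minus observed p value exceeds Δ".

```python
from math import sqrt
from statistics import mean, variance


def relative_difference(patient, control, s0=0.0, welch=False):
    """Step 1: d(i) for one probe set, difference in means over (s(i) + s0)."""
    n1, n2 = len(patient), len(control)
    if welch:
        # Unequal variances: s(i)^2 = var_P / n1 + var_C / n2
        s = sqrt(variance(patient) / n1 + variance(control) / n2)
    else:
        # Equal variances: pooled sum of squares over n1 + n2 - 2
        ss = variance(patient) * (n1 - 1) + variance(control) * (n2 - 1)
        s = sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))
    return (mean(patient) - mean(control)) / (s + s0)


def sam_fdr(observed_p, permuted_p, delta):
    """Steps 2-6: return (n_significant, estimated_false, fdr).

    `observed_p` holds k p values; `permuted_p` holds one list of k
    p values per permutation.
    """
    obs = sorted(observed_p)                   # step 2
    perms = [sorted(p) for p in permuted_p]    # step 3
    k = len(obs)
    expected = [mean(perm[i] for perm in perms) for i in range(k)]  # step 4
    # Step 5 (assumed criterion): observed p falls below expected p by > delta.
    significant = [obs[i] for i in range(k) if expected[i] - obs[i] > delta]
    if not significant:
        return 0, 0.0, 0.0
    cutoff = max(significant)
    # Step 6: count permuted p values strictly smaller than the cutoff.
    false_counts = [sum(1 for p in perm if p < cutoff) for perm in perms]
    estimated_false = mean(false_counts)
    return len(significant), estimated_false, estimated_false / len(significant)


# Made-up values: one probe set, then four probe sets with two permutations.
d = relative_difference([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
n_sig, est_false, fdr = sam_fdr(
    [0.001, 0.002, 0.40, 0.75],
    [[0.20, 0.45, 0.60, 0.90], [0.30, 0.35, 0.80, 0.70]],
    delta=0.2,
)
```

Step 7 then amounts to calling `sam_fdr` over a grid of Δ values and recording the resulting FDR for each.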
Westfall and Young step-down adjusted p values
Both t-statistics and Welch statistics were used to compare gene expression between the patient and control groups. Our data were organized in a matrix in which each row represents the expression of one gene for all subjects and each column represents the expression of all genes for one subject. The permutation algorithm developed by Westfall and Young to obtain step-down adjusted p values was then applied as follows. Note that the procedures were modified when using Welch statistics. Instead of using the Welch statistic to compute adjusted p values, we used unadjusted p values derived from Welch statistics and the appropriate degrees of freedom. An exhaustive set of 285 permutations were conducted.
1. Compute the t statistic for each gene in the original dataset.
2. Order them: $|t_{r_1}| \ge |t_{r_2}| \ge |t_{r_3}| \ge \ldots \ge |t_{r_k}|$.
3. Permute the 13 columns of the data matrix. The first 10 columns now represent the pseudo-patient group and the other 3 columns represent the pseudo-control group.
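4. Compute t statistics for all probe sets for the permuted dataset: $t_{1,b}, t_{2,b}, \ldots, t_{k,b}$.
5. Compute $u_{k,b} = |t_{r_k,b}|$ and $u_{j,b} = \max\left(u_{j+1,b}, |t_{r_j,b}|\right)$, 1 ≤ j ≤ k - 1, where $r_j$ is such that $|t_{r_1}| \ge |t_{r_2}| \ge |t_{r_3}| \ge \ldots \ge |t_{r_k}|$ for the original dataset.
6. Repeat steps 1–5 N (N = 286) times and calculate the adjusted p values:

$$\tilde{p}_{r_j} = \frac{1}{N}\sum_{b=1}^{N} I\left(u_{j,b} \ge |t_{r_j}|\right)$$

where I(•) is the indicator function, equal to 1 if the condition in parentheses is true and 0 if false. The monotonicity was enforced as $\tilde{p}_{r_j} \leftarrow \max\left(\tilde{p}_{r_{j-1}}, \tilde{p}_{r_j}\right)$, for 2 ≤ j ≤ k.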
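The step-down procedure can be sketched as below. This is a minimal illustration with hypothetical function names, assuming the pooled-variance t statistic and enumerating every assignment of columns to the control group, including the original one (286 assignments for 13 columns with 3 controls). The Welch-based modification described above is not shown.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def t_stat(group1, group2):
    """Two-sample t statistic with pooled (equal) variance."""
    n1, n2 = len(group1), len(group2)
    ss = variance(group1) * (n1 - 1) + variance(group2) * (n2 - 1)
    s = sqrt((1 / n1 + 1 / n2) * ss / (n1 + n2 - 2))  # assumes s > 0
    return (mean(group1) - mean(group2)) / s


def maxt_adjusted_p(data, n_control):
    """Step-down adjusted p values over every relabelling of the columns.

    `data` has one row per probe set and one column per subject; the last
    `n_control` columns are the controls in the original labelling.
    """
    n_cols, k = len(data[0]), len(data)

    def abs_stats(control_cols):
        control_set = set(control_cols)
        out = []
        for row in data:
            ctrl = [row[c] for c in control_cols]
            pat = [row[c] for c in range(n_cols) if c not in control_set]
            out.append(abs(t_stat(pat, ctrl)))
        return out

    original = abs_stats(tuple(range(n_cols - n_control, n_cols)))
    order = sorted(range(k), key=lambda i: -original[i])  # r_1, ..., r_k

    counts = [0] * k
    n_perm = 0
    for control_cols in combinations(range(n_cols), n_control):
        permuted = abs_stats(control_cols)
        u = 0.0
        # Successive maxima, built from the least significant probe set up.
        for j in range(k - 1, -1, -1):
            u = max(u, permuted[order[j]])
            if u >= original[order[j]]:
                counts[j] += 1
        n_perm += 1

    adjusted = [c / n_perm for c in counts]
    for j in range(1, k):  # enforce monotonicity
        adjusted[j] = max(adjusted[j], adjusted[j - 1])
    result = [0.0] * k
    for j, i in enumerate(order):  # map rank order back to row order
        result[i] = adjusted[j]
    return result


# Made-up data: 2 probe sets, 3 pseudo-patients followed by 2 pseudo-controls.
data = [[10.0, 11.0, 12.0, 1.0, 2.0], [5.0, 3.0, 4.0, 4.5, 3.5]]
adjusted_p = maxt_adjusted_p(data, n_control=2)
```

Because the original labelling is among the enumerated assignments, the smallest attainable adjusted p value is one over the number of assignments, which for 13 columns and 3 controls is 1/286.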
Supplementary material
Supplementary material is located at [2]. This web page includes our data analysis results and also further tissue and patient information.