The distribution and evolution of Arabidopsis thaliana cis natural antisense transcripts

BMC Genomics

Table 4 Numbers of different cis-NATs with RNA structures*

	P > 0.5	P > 0.5 **	P > 0.8	P > 0.9
TAIR data set (whole NOLP sequence)	12/55 (22%)	6/55 (11%)	7/55 (13%)	5/55 (9%)
TAIR data set (CNS overlap)	2/55 (4%) ¶	2/55 (4%)	1/55 (2%)	1/55 (2%)
Matsui2008	71/1365 (5%) ¶	--	46/1365 (3%)	32/1365 (2%)
Okamoto2010	22/485 (5%) ¶	--	13/485 (3%)	10/485 (2%)

* Fractions are expressed relative to the totals that have significant sequence conservation. For the TAIR data, the first row gives RNAz calculations across the whole NOLP sequence. The second TAIR row and the non-TAIR rows give cases that overlap CNSs (conserved non-coding sequences) identified by Haudry et al. [21].
** After removing cases whose conservation is not significant following Holm-Bonferroni correction.
¶ The total numbers of conserved cases are significantly enriched compared with randomly sampled near-gene DNA (P = 0.046 for the TAIR set, P < 0.00001 for the other two sets; normal statistics). To assess this, for each of the three data sets listed, 500 samples of near-gene DNA were generated with the same distribution of sizes and positions relative to neighbouring genes as the actual set (as described in the Methods).
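
The enrichment test in footnote ¶ can be illustrated with a minimal sketch. It assumes that "normal statistics" means a z-score of the observed count against the mean and standard deviation of the counts from the 500 random samples, with a one-sided upper-tail P-value; the source does not specify these details. The function name `enrichment_p_value` and its arguments are hypothetical and are not taken from the paper's Methods.

```python
import math
import statistics


def enrichment_p_value(observed, null_counts):
    """One-sided normal-approximation test for enrichment.

    observed    -- count of conserved cases in the real data set
    null_counts -- counts from the randomly sampled near-gene DNA sets
                   (assumption: one count per random sample)
    Returns (z, p), where p is the upper-tail probability P(Z >= z).
    """
    mu = statistics.fmean(null_counts)
    sigma = statistics.stdev(null_counts)  # sample standard deviation
    if sigma == 0:
        raise ValueError("null counts have zero variance; z-score undefined")
    z = (observed - mu) / sigma
    # Upper tail of the standard normal: 0.5 * erfc(z / sqrt(2))
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p
```

With 500 random samples, an empirical P-value cannot resolve values below about 1/500, which may be why a normal approximation is convenient for reporting values such as P < 0.00001.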

ISSN: 1471-2164