Evolutionary implications of inversions that have caused intra-strand parity in DNA

Abstract

Background

Chargaff's rule of DNA base composition, stating that DNA comprises equal amounts of adenine and thymine (%A = %T) and of guanine and cytosine (%C = %G), is well known because it was fundamental to the conception of the Watson-Crick model of DNA structure. His second parity rule stating that the base proportions of double-stranded DNA are also reflected in single-stranded DNA (%A = %T, %C = %G) is more obscure, likely because its biological basis and significance are still unresolved. Within each strand, the symmetry of single nucleotide composition extends even further, being demonstrated in the balance of di-, tri-, and multi-nucleotides with their respective complementary oligonucleotides.

Results

Here, we propose that inversions are sufficient to account for the symmetry within each single-stranded DNA. Human mitochondrial DNA does not demonstrate such intra-strand parity, and we consider how its different functional drivers may relate to our theory. This concept is supported by the recent observation that inversions occur frequently.

Conclusion

Along with chromosomal duplications, inversions must have been shaping the architecture of genomes since the origin of life.

Background

The most famous of Chargaff's rules is that in DNA, the proportion of A equals that of T, and C that of G [1]. This nucleotide balance is governed by complementary base-pairing rules fundamental to the structure of the double helix [2]. Astonishingly, the nucleotides retain almost the same balance in either of the two single strands of DNA [3], and this phenomenon is sometimes named Chargaff's second parity rule [4–10]. Table 1 provides an illustration, with analysis of large contiguous segments from each human chromosome.

Table 1 Mononucleotide content in contiguous single-stranded DNA scaffolds from each human chromosome *

When there is no bias in mutation and selection between complementary strands, base substitution may explain the parity phenomenon [11, 12]. In practice, however, strand bias has been demonstrated in the form of mutational skews between the two strands, which cause deviation from parity [13, 15]. Bacterial origins of replication have been successfully identified from the distribution of such skews [16, 17]. The strand bias of mutations, which can be associated with the direction of transcription, is also found in mammalian genomes [18, 19]. In spite of these anomalies, any violation of the second parity phenomenon is generally small in magnitude [8, 20].

Although different explanations for this parity phenomenon have been put forth, such as intra-strand base pairing [6], a simpler explanation for the rule may be DNA duplication and inversion [4, 8, 10]. If double-stranded DNA of any composition undergoes duplication followed by an inversion of the duplicated region, then each strand of the resulting DNA molecule would precisely satisfy Chargaff's second parity rule, so that %A = %T and %C = %G (Fig. 1A).
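The duplication-plus-inversion argument can be checked on a toy sequence. The following is a minimal sketch, not the authors' code: the 12-nt sequence and the function names are hypothetical, and the duplication is assumed to be tandem with the second copy inverted.

```python
from collections import Counter

# Complement table for upper-case DNA letters.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def duplicate_and_invert(seq: str) -> str:
    """Top strand after a tandem duplication whose second copy is inverted."""
    return seq + reverse_complement(seq)


# Hypothetical, deliberately skewed top strand: 6 A, 4 C, 1 G, 1 T.
original = "AAAAAACCCCGT"
result = duplicate_and_invert(original)
counts = Counter(result)

print(result)                    # top strand after duplication and inversion
print(counts["A"], counts["T"])  # equal counts of A and T
print(counts["C"], counts["G"])  # equal counts of C and G
```

Because the resulting strand equals its own reverse complement, the equality holds for any input sequence, not only for this example.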

Figure 1

Inversions as an explanation for intra-strand parity. A, Duplication followed by inversion. If a double-stranded DNA, shown in gray, undergoes duplication and inversion, then the resulting molecule precisely demonstrates the strand parity (both within and between strands). B, A mathematical explanation of intra-strand parity. The $n$th inversion is illustrated by a box with crossed bars, and $r_n$ is the relative length of the inversion within a total fragment of length 1. Ultimately both $A_n$ and $T_n$ converge to the average of their initial frequencies. See Methods for details. Although a linear double-stranded DNA is shown, this could also be circular. C, A small number of inversions can cause DNA to follow the intra-strand parity. A 40-bp double-stranded DNA fragment in the human mtDNA (position 1875–1914 in accession number NC_001807) is shown, along with the outcome of a single artificial inversion, which has homogenized the contents of the two strands.

Not only single nucleotides but also oligonucleotides up to 30 nucleotides (nt) in length can demonstrate the parity phenomenon within strands [5, 7, 8]. In other words, the frequency of a particular oligonucleotide is approximately equal to that of its reverse complementary sequence in the same strand. Since DNA strands are complementary, the frequency of a particular oligonucleotide in one strand approximates that in the opposite strand. Hence, this double-stranded DNA characteristic can also be called "symmetry of complementary DNA strands" [5, 8]. Chargaff's second parity rule ordinarily considers only mononucleotides, which have been extensively studied. However, since a single nucleotide could be deemed a one-nt oligonucleotide, it is plausible that addressing the symmetry of oligonucleotides (high-order strand symmetry) is a more general way of assessing biological meaning. Hereafter, we designate this comprehensive symmetry as "intra-strand parity" and attempt to explain it based on the mechanism of chromosomal inversion. Single nucleotide mutations may be considered to explain mononucleotide parity within strands [11, 12] but have not been effective to explain the extended parity of oligonucleotides [8].
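One way to quantify high-order symmetry is to compare, on a single strand, the count of each k-mer with the count of its reverse complement. The sketch below is an illustration under stated assumptions: the `parity_deviation` score (0 for perfect parity, 1 for maximal skew) is a hypothetical summary statistic introduced here, not a measure taken from the paper.

```python
from collections import Counter

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers on the given strand only."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def parity_deviation(seq: str, k: int) -> float:
    """Hypothetical score: summed |count(w) - count(rc(w))| over all observed
    k-mers and their reverse complements, divided by twice the window count."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence is shorter than k")
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    return diff / (2 * total)


print(parity_deviation("AAAA", 1))          # only A, no T on this strand
print(parity_deviation("AAAACCGGTTTT", 2))  # strand equal to its own reverse complement
```

Palindromic k-mers such as CG are their own reverse complements and therefore never contribute to the score.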

Results

We propose that inversion events (with or without underlying duplications) might be a sufficient mechanism to explain the phenomenon. To test this, we consider a double-stranded DNA molecule without intra-strand parity but which is long enough to undergo various (stochastic) inversions (Fig. 1B). $A_n$ and $T_n$ are defined as the frequencies of any particular oligonucleotide sequence and of its reverse complementary sequence, respectively, in the same strand after $n$ inversions ($n > 0$). $A_0$ ($0 < A_0 < 1$) is the initial frequency of any particular oligonucleotide sequence (which can also be a mononucleotide) in the upper strand. $T_0$ ($0 < T_0 < 1$) is the initial frequency of its reverse complementary sequence in the same strand. If we define $r_n$ ($0 < r_n \ll 1$) as the relative length of the $n$th inversion (Fig. 1B), we obtain the following two equations.

$$A_n = A_{n-1} - r_n (A_{n-1} - T_{n-1}) \tag{1}$$

$$T_n = T_{n-1} - r_n (T_{n-1} - A_{n-1}) \tag{2}$$

Equations (1) and (2) mean that each inversion shifts the two frequencies toward each other: $A_n$ moves toward $T_{n-1}$, and $T_n$ moves toward $A_{n-1}$. When the whole sequence is long enough, $r_n$ is close to 0. Nevertheless, whatever the size of the inverted region examined, any oligonucleotide sequence will eventually be homogenized between two strands. In other words, $A_n$ and $T_n$ ultimately converge to be equal to each other, regardless of $r_n$, as long as $r_n$ is stochastic (see mathematical derivation in Methods).

$$\lim_{n \to \infty} A_n = \lim_{n \to \infty} T_n = \frac{A_0 + T_0}{2} \tag{3}$$
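The recurrences (1) and (2) can be iterated numerically to see the convergence stated in (3). This is a minimal sketch under assumed values: the starting frequencies (0.30 and 0.10), the number of inversions (2000), and the uniform range for the relative lengths (0.001 to 0.05) are all hypothetical choices made for illustration.

```python
import random


def iterate_inversions(a0, t0, r_values):
    """Apply recurrences (1) and (2) once per relative inversion length.

    Returns the list of (A_n, T_n) pairs, starting with (A_0, T_0).
    """
    a, t = a0, t0
    history = [(a, t)]
    for r in r_values:
        # Simultaneous update: both right-hand sides use the previous values.
        a, t = a - r * (a - t), t - r * (t - a)
        history.append((a, t))
    return history


# Hypothetical inputs: skewed starting frequencies and random inversion sizes.
rng = random.Random(0)
r_values = [rng.uniform(0.001, 0.05) for _ in range(2000)]
history = iterate_inversions(0.30, 0.10, r_values)

a_final, t_final = history[-1]
print(a_final, t_final)  # both approach (0.30 + 0.10) / 2
```

The sum of the two frequencies stays constant at every step, so the common limit can only be the average of the starting values.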

Equation (3) is a mathematical explanation of intra-strand parity based on our hypothesis that inversions are sufficient to cause any DNA segment to conform to parity. In this way, the vast majority of naturally occurring DNA molecules (chromosomes) will evolve to intra-strand parity via many inversions. Those few that deviate, such as mitochondrial DNA (mtDNA) [8, 9, 17], will have special properties (see below). We presume that any DNA can be made to evolve to intra-strand parity through a process of inversions, and that deviations from parity have been rare in evolution. Inversions must have been occurring as genomes of ancestral organisms were growing in complexity with the acquisition or creation of new genes.

The insertion of repetitive sequences was proposed to be a possible source underlying parity [8, 10]. However, removing apparent repeats from the human and other genomes prior to analysis (see Methods) did not alter the symmetry characteristics of the remaining sequences. (An example of a 28.6-Mb contig from human chromosome 21 is shown in Table 2). Therefore, it is unlikely that insertion of such sequences accounts for the intra-strand parity, either in humans or organisms that have fewer repetitive sequences in their genomes.

Table 2 Dinucleotide frequencies in a human genomic contig without repetitive sequences *

We employ radar charts to allow simple visual perception of the high-order symmetry and asymmetry of exemplary DNAs (Fig. 2). Mitochondria are thought to have been derived from bacteria [21]. Mammalian mtDNA (Fig. 2C) is an exception that does not demonstrate intra-strand parity [8, 9, 17], whereas mtDNAs from plants and lower eukaryotes do. Mammalian mtDNA may have gradually deviated from its ancestral form [9]. Its small circular size, its unique replication mechanism [22], and its extra-nuclear localization could introduce different selective pressures against tolerance of inversions and thus deviation from the more general observation of intra-strand parity.

Figure 2

Intra-strand parity visually represented by radar charts. Frequencies of trinucleotides in various DNA sequences are shown here. Each trinucleotide is sorted alphabetically from bottom to top (left side). The corresponding complementary trinucleotides are arranged across to the right. A, Radar chart representing a fully sequenced contig (NT_010966, 33,548,238 bp) of human chromosome 18. This contig is continuous and does not include any annotated gaps or ambiguous nucleotides. The symmetrical chart shows the equal frequencies of specific oligonucleotides and their reverse complementary oligonucleotides. The high frequencies of poly-A and poly-T, which might be, in part, traces of retrotranspositions of poly-A+ mRNA, and the deficiencies of trinucleotides that contain the CpG dinucleotide make the stalk and four grooves, respectively, of the "maple leaf" shape. (The shapes vary slightly based on the genome sequence analyzed, but the general symmetry is maintained). B, The genomic sequence of the p53 (TP53) locus (U94788, 20,303 bp). The symmetry is roughly retained in sequences as short as 20 kb in length. The protein-coding sequences occupy 5.8% of this locus. This chart also suggests that transcriptional asymmetry is small in magnitude. C, Human mtDNA. The asymmetry illustrates that this DNA does not show intra-strand parity. D, Human mtDNA after inversion in silico. It becomes symmetrical, demonstrating that inversions can change a sequence to create the parity. In this case, each $r_n$ is approximately 1/16.6. This also demonstrates that only $1/(2r_{\mathrm{ave}})$ inversions (eight inversions in this case) are enough to make a sequence conform to parity. E, The difference between the frequencies of GGG and CCC ([GGG] - [CCC]) in human mtDNA approaches 0 under random in silico inversions. In this analysis, for simplicity, the size of each inversion was fixed to 100 bp. In human mtDNA, GGG and CCC have the largest difference of frequencies among all trinucleotides (see Fig. 2C).

The mammalian mtDNA offers a natural source of sequence sufficiently deviating from parity to allow us to further test our mathematical explanation. We produced in silico semi-random inversions in human mtDNA. As few as eight 1-kb regularly-distributed inversions (see Methods) would be sufficient to homogenize the two strands of the 16.6-kb mtDNA and create intra-strand parity (Fig. 2D). We also depict a hypothetical inversion in the mtDNA to show the potential for rapid homogenization (Fig. 1C).
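The in silico experiment can be sketched as follows. This is an illustration, not the authors' Perl code: a synthetic 16,000-nt sequence with an assumed skewed composition (weights 31:31:13:25 for A:C:G:T) stands in for human mtDNA, and GC skew, (G - C)/(G + C), is used as a hypothetical single-number summary of strand asymmetry. The eight 1-kb intervals follow the regular spacing given in Methods.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert(seq: str, start: int, end: int) -> str:
    """Invert the 1-based, inclusive interval [start, end] of the top strand."""
    i, j = start - 1, end
    return seq[:i] + reverse_complement(seq[i:j]) + seq[j:]


def gc_skew(seq: str) -> float:
    """(G - C) / (G + C) on the given strand; 0 means mononucleotide parity."""
    g, c = seq.count("G"), seq.count("C")
    return (g - c) / (g + c)


# Synthetic stand-in for a strand-asymmetric genome (assumed composition).
rng = random.Random(1)
skewed = "".join(rng.choices("ACGT", weights=[31, 31, 13, 25], k=16000))

# Eight regularly spaced 1-kb inversions: 1001-2000, 3001-4000, ..., 15001-16000.
segments = [(start, start + 999) for start in range(1001, 16000, 2000)]

inverted = skewed
for start, end in segments:
    inverted = invert(inverted, start, end)

skew_before = gc_skew(skewed)
skew_after = gc_skew(inverted)
print(skew_before, skew_after)  # strongly negative before, near zero after
```

Because the eight non-overlapping intervals cover half of the synthetic sequence, half of the bases end up complemented, which is what balances the two strands here.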

Although the lack of intra-strand parity in mammalian mtDNA could be ascribed to its small length, other loci of comparable length (e.g. the TP53 gene, Fig. 2B) do adhere to parity. Unlike other mtDNAs, those of mammals have no intergenic segments and have only one regulatory region per strand. Moreover, unlike among nuclear genomes, the order and direction of genes – as well as biased gene density between the two strands – are strictly conserved among mammalian species [23]. Therefore, it seems that the configuration is already fixed, and that inversions are not tolerated in mammalian mtDNA.

Discussion

The ubiquity of inversions suggests that they had some advantage in natural selection. Duplications are thought to play an important role in creating genetic variety [24]; however, some duplications are deleterious for organisms, due to sudden increases of gene dosage. To avoid being negatively selected, one of the duplicated copies could undergo mutation such as deletion. Inversions or interchromosomal rearrangements could render the duplicated gene nonfunctional due to its release from interaction with its promoter or other regulatory elements. This may be one reason why many inverted and interchromosomal segmental duplications are found in the human genome [25, 26]. An approximately symmetrical gene distribution between the two strands may have been brought about by these rearrangements [27].

In some cases, a rearranged genome might confer positive selection. Although we can find syntenic regions among vertebrates, chromosomal organizations can be quite different among species. This suggests an advantage for evolution or speciation. Recently, the importance of gene order and gene position in the three-dimensional nucleus has been suggested [28]. It is likely that genomes continually undergo rearrangement toward optimal positions for each gene and each gene cluster. Our group showed an unexpectedly large number of inversions (from 23 bp to 62 Mb in size) between human and chimpanzee genomes [29], species which diverged only six million years ago. Although most may be selectively neutral, some were likely selected for and contributed to speciation. Many more inversions may also have occurred and may have been negatively selected. Inversions can also give rise to new transcripts, some of which will be selected for and become new genes. We identified hybrid transcripts of the AZGP1 and GJE1 genes on human chromosome 7 (manuscript in preparation) and are intrigued that the orthologues of these genes in non-primate mammals reside in a head-to-head manner. It is likely that the common ancestor of primates underwent inversion of the AZGP1 gene to produce the hybrid transcripts, creating an opportunity for primate diversity.

Conclusion

In summary, we propose that the relatively frequent occurrence and accumulation of inversions in genomes may be a major contributor to the phenomenon of intra-strand parity. Whereas single base substitutions might explain Chargaff's second parity rule at the level of mononucleotides, they can explain neither the high-order intra-strand parity nor the exceptional deviation of mammalian mtDNAs. In contrast, inversion events are not limited by size and can involve millions of bases of sequence. Other mechanisms may have contributed to some extent; nevertheless, they are not necessary to account for intra-strand parity if inversions are considered.

Inversions are one process contributing to genome evolution that allows for rearrangement toward optimal position, order, and orientation of genes and regulatory elements, and for escape from deleterious effects caused, for example, by some duplications. Although we acknowledge the possibility of preferential sites, inversions are treated as occurring randomly in our mathematical explanation. Many of these are expected to be deleterious and would presumably be selected against, but others should be neutral or positively selected and could therefore become fixed in the genome [30]. Quantitative estimation of inversion using genomic sequences of extant organisms is unfortunately meaningless, as it cannot account for those events lost to natural selection. Further, inversions must have contributed to the basic character of DNA sequences since the origin of life. There are now substantial data supporting the frequency of inversions within genomes of a variety of organisms, including plants, insects and primates [29–33], and these observable events are but the tip of the iceberg. Chromosomal rearrangements such as inversions reduce the rate of meiotic recombination between homologous chromosomes, with subsequent reproductive isolation [34]. Moreover, in these regions, mutations tend to be positively selected to give rise to speciation [35]. Ohno's seminal work [24] and that of others have emphasized the importance of duplications in evolution. Our suppositions further these ideas, in particular suggesting how inversions and duplications can complement each other to yield the properties of extant genomes.

Methods

Calculation of frequencies of oligonucleotides

The genomic sequences (human contigs, the TP53 gene, and the mtDNA sequence) were downloaded from NCBI (Build 36). Calculation of frequencies of oligonucleotides (including mononucleotides) was performed using Perl scripts, which are available upon request. The "plus" strand, which is stored in the database, was analyzed. We generated sequence free of repetitive elements using RepeatMasker, with which 46.4% of the 28,617,429 nucleotides were masked. The coordinates of the eight 1-kb regularly-scattered in silico inversions were 1001–2000, 3001–4000, 5001–6000, 7001–8000, 9001–10000, 11001–12000, 13001–14000, and 15001–16000 in NC_001807.

Mathematical derivation

For the frequency $A_n$ ($n > 0$) of a particular oligonucleotide, the $n$th inversion leaves $(1 - r_n) A_{n-1}$ unchanged, removes $r_n A_{n-1}$, and adds $r_n T_{n-1}$, if we suppose that the contents are evenly distributed across the whole sequence. In this way, the two recurrence formulas (1) and (2) are derived (see text). The following equations are obtained by adding equations (1) and (2).

$$A_n + T_n = A_{n-1} + T_{n-1} \tag{4}$$

$$A_n + T_n = A_0 + T_0 \tag{5}$$

These mean that inversions do not change the sum of the two frequencies. Using (5), other forms of (1) and (2) are derived.

$$A_n = (1 - 2r_n) A_{n-1} + r_n (A_0 + T_0) \tag{6}$$

$$T_n = (1 - 2r_n) T_{n-1} + r_n (A_0 + T_0) \tag{7}$$

When we subtract $(A_0 + T_0)/2$ from both sides of (6) and define $B_n$ as in (8), (9) is derived.

$$B_n = A_n - \frac{A_0 + T_0}{2} \tag{8}$$

$$\begin{aligned}
B_n &= (1 - 2r_n) B_{n-1} \\
&= (1 - 2r_n)(1 - 2r_{n-1}) \cdots (1 - 2r_2)(1 - 2r_1) B_0 \\
&= B_0 \prod_{k=1}^{n} (1 - 2r_k) \\
&= \frac{A_0 - T_0}{2} \prod_{k=1}^{n} (1 - 2r_k)
\end{aligned} \tag{9}$$

Using $-1 \ll 1 - 2r_k < 1$ (since $0 < r_k \ll 1$),

$$\lim_{n \to \infty} B_n = \frac{A_0 - T_0}{2} \lim_{n \to \infty} \prod_{k=1}^{n} (1 - 2r_k) = 0.$$

Therefore,

$$\lim_{n \to \infty} A_n = \lim_{n \to \infty} B_n + \frac{A_0 + T_0}{2} = \frac{A_0 + T_0}{2}.$$

Similarly,

$$\lim_{n \to \infty} T_n = \frac{A_0 + T_0}{2}.$$
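As a worked special case, the rate of convergence can be made explicit when every inversion has the same relative length. The constant $r$, the target fraction $\varepsilon$, and the numerical values below are assumptions chosen for illustration (100-bp inversions in a 16.6-kb molecule); the calculation relies only on recurrences (1) and (2), including their assumption of evenly distributed contents.

```latex
% Subtracting (2) from (1) gives a recurrence for the difference:
\[
A_n - T_n = (1 - 2 r_n)\,(A_{n-1} - T_{n-1}).
\]
% Hypothetical simplification: every inversion has the same relative length r.
\[
A_n - T_n = (1 - 2r)^n\,(A_0 - T_0).
\]
% Number of inversions needed to shrink the difference to a fraction \varepsilon
% (0 < \varepsilon < 1, 0 < r < 1/2; the inequality flips because \ln(1 - 2r) < 0):
\[
(1 - 2r)^n \le \varepsilon
\quad\Longleftrightarrow\quad
n \ge \frac{\ln \varepsilon}{\ln(1 - 2r)} .
\]
% Example: r = 100/16600 \approx 0.0060 and \varepsilon = 1/2 give
\[
n \ge \frac{\ln 0.5}{\ln(1 - 0.0120)} \approx \frac{-0.693}{-0.0121} \approx 57 .
\]
```

Under these assumptions, the expected difference between the two frequencies halves roughly every 57 inversions, so the decay is geometric in the number of inversions.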

References

  1. Chargaff E: Structure and function of nucleic acids as cell constituents. Fed Proc. 1951, 10: 654-659.

  2. Watson JD, Crick FH: Molecular structure of nucleic acids: a structure for deoxyribose nucleic acid. Nature. 1953, 171: 737-738. 10.1038/171737a0.

  3. Rudner R, Karkas JD, Chargaff E: Separation of B. subtilis DNA into complementary strands. 3. Direct analysis. Proc Natl Acad Sci USA. 1968, 60: 921-922. 10.1073/pnas.60.3.921.

  4. Fickett JW, Torney DC, Wolf DR: Base compositional structure of genomes. Genomics. 1992, 13: 1056-1064. 10.1016/0888-7543(92)90019-O.

  5. Prabhu VV: Symmetry observations in long nucleotide sequences. Nucleic Acids Res. 1993, 21: 2797-2800. 10.1093/nar/21.12.2797.

  6. Forsdyke DR, Mortimer JR: Chargaff's legacy. Gene. 2000, 261: 127-137. 10.1016/S0378-1119(00)00472-8.

  7. Qi D, Cuticchia AJ: Compositional symmetries in complete genomes. Bioinformatics. 2001, 17: 557-559. 10.1093/bioinformatics/17.6.557.

  8. Baisnée PF, Hampson S, Baldi P: Why are complementary DNA strands symmetric?. Bioinformatics. 2002, 18: 1021-1033. 10.1093/bioinformatics/18.8.1021.

  9. Mitchell D, Bridge R: A test of Chargaff's second rule. Biochem Biophys Res Commun. 2006, 340: 90-94.

  10. Albrecht-Buehler G: Asymptotically increasing compliance of genomes with Chargaff's second parity rules through inversions and inverted transpositions. Proc Natl Acad Sci USA. 2006, 103: 17828-17833. 10.1073/pnas.0605553103.

  11. Sueoka N: Intrastrand parity rules of DNA base composition and usage biases of synonymous codons. J Mol Evol. 1995, 40: 318-325. 10.1007/BF00163236.

  12. Lobry JR: Properties of a general model of DNA evolution under no-strand-bias conditions. J Mol Evol. 1995, 40: 326-330. 10.1007/BF00163237.

  13. McLean MJ, Wolfe KH, Devine KM: Base composition skews, replication orientation, and gene orientation in 12 prokaryote genomes. J Mol Evol. 1998, 47: 691-696. 10.1007/PL00006428.

  14. Bell SJ, Forsdyke DR: Deviations from Chargaff's second parity rule correlate with direction of transcription. J Theor Biol. 1999, 197: 63-76. 10.1006/jtbi.1998.0858.

  15. Daubin V, Perriere G: G+C3 structuring along the genome: a common feature in prokaryotes. Mol Biol Evol. 2003, 20: 471-483. 10.1093/molbev/msg022.

  16. Nikolaou C, Almirantis Y: A study on the correlation of nucleotide skews and the positioning of the origin of replication: different modes of replication in bacterial species. Nucleic Acids Res. 2005, 33: 6816-6822. 10.1093/nar/gki988.

  17. Nikolaou C, Almirantis Y: Deviations from Chargaff's second parity rule in organellar DNA: insights into the evolution of organellar genomes. Gene. 2006, 381: 34-41. 10.1016/j.gene.2006.06.010.

  18. Green P, Ewing B, Miller W, Thomas PJ, NISC Comparative Sequencing Program, Green ED: Transcription-associated mutational asymmetry in mammalian evolution. Nat Genet. 2003, 33: 514-517. 10.1038/ng1103.

  19. Louie E, Ott J, Majewski J: Nucleotide frequency variation across human genes. Genome Res. 2003, 13: 2594-2601. 10.1101/gr.1317703.

  20. Prescott DM, Dizick SJ: A unique pattern of intrastrand anomalies in base composition of the DNA in hypotrichs. Nucleic Acids Res. 2000, 28: 4679-4688. 10.1093/nar/28.23.4679.

  21. Filée J, Forterre P: Viral proteins functioning in organelles: a cryptic origin?. Trends Microbiol. 2005, 13: 510-513. 10.1016/j.tim.2005.08.012.

  22. Clayton DA: Replication of animal mitochondrial DNA. Cell. 1982, 28: 693-705. 10.1016/0092-8674(82)90049-6.

  23. Pääbo S, Thomas WK, Whitfield KM, Kumazawa Y, Wilson AC: Rearrangements of mitochondrial transfer RNA genes in marsupials. J Mol Evol. 1991, 33: 426-430. 10.1007/BF02103134.

  24. Ohno S: Evolution by Gene and Genome Duplication. 1970, Springer, Berlin

  25. Bailey JA, Gu Z, Clark RA, Reinert K, Samonte RV, Schwartz S, Adams MD, Myers EW, Li PW, Eichler EE: Recent segmental duplications in the human genome. Science. 2002, 297: 1003-1007. 10.1126/science.1072047.

  26. Cheung J, Estivill X, Khaja R, MacDonald JR, Lau K, Tsui LC, Scherer SW: Genome-wide detection of segmental duplications and potential assembly errors in the human genome sequence. Genome Biol. 2003, 4: R25. 10.1186/gb-2003-4-4-r25.

  27. Dunham I, Shimizu N, Roe BA, Chissoe S, Hunt AR, Collins JE, Bruskiewich R, Beare DM, Clamp M, Smink LJ, Ainscough R, Almeida JP, Babbage A, Bagguley C, Bailey J, Barlow K, Bates KN, Beasley O, Bird CP, Blakey S, Bridgeman AM, Buck D, Burgess J, Burrill WD, O'Brien KP, et al: The DNA sequence of human chromosome 22. Nature. 1999, 402: 489-495. 10.1038/990031.

  28. Kosak ST, Groudine M: Gene order and dynamic domains. Science. 2004, 306: 644-647. 10.1126/science.1103864.

  29. Feuk L, MacDonald JR, Tang T, Carson AR, Li M, Rao G, Khaja R, Scherer SW: Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies. PLoS Genet. 2005, 1: e56. 10.1371/journal.pgen.0010056.

  30. Hoffmann AA, Sgrò CM, Weeks AR: Chromosomal inversion polymorphisms and adaptation. Trends Ecol Evol. 2004, 19 (9): 482-488. 10.1016/j.tree.2004.06.013.

  31. Blanc G, Barakat A, Guyot R, Cooke R, Delseny M: Extensive duplication and reshuffling in the Arabidopsis genome. Plant Cell. 2000, 12: 1093-1101. 10.1105/tpc.12.7.1093.

  32. Coluzzi M, Sabatini A, della Torre A, Di Deco MA, Petrarca V: A polytene chromosome analysis of the Anopheles gambiae species complex. Science. 2002, 298: 1415-1418. 10.1126/science.1077769.

  33. Tuzun E, Sharp AJ, Bailey JA, Kaul R, Morrison VA, Pertz LM, Haugen E, Hayden H, Albertson D, Pinkel D, Olson MV, Eichler EE: Fine-scale structural variation of the human genome. Nat Genet. 2005, 37: 727-732. 10.1038/ng1562.

  34. Rieseberg LH: Chromosomal rearrangements and speciation. Trends Ecol Evol. 2001, 16 (7): 351-358. 10.1016/S0169-5347(01)02187-5.

  35. Navarro A, Barton NH: Chromosomal speciation and molecular divergence – accelerated evolution in rearranged chromosomes. Science. 2003, 300: 321-324. 10.1126/science.1080600.

Acknowledgements

The authors thank J. Buchanan, O. Akiyama, S. Horike, C. R. Marshall, A. Navarro, P. Pevzner, R. F. Wintle and J. Zhang for discussions and critical reading of the manuscript. We acknowledge the Centre for Computational Biology and The Centre for Applied Genomics for computational assistance. The work is supported by Genome Canada/Ontario Genomics Institute, the McLaughlin Centre for Molecular Medicine, and The Hospital for Sick Children Foundation. S.W.S. is an Investigator of the Canadian Institutes for Health Research and International Scholar of the Howard Hughes Medical Institute.

Author information

Corresponding author

Correspondence to Stephen W Scherer.

Additional information

Authors' contributions

KO conceived the study, performed the computational analyses, mathematical derivation, and drafted the manuscript. JW participated in the coordination of the study and performed the computational analyses. SWS participated in the design and coordination of the study and helped draft the manuscript. All authors read and approved the final manuscript.

Rights and permissions

Open Access This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Okamura, K., Wei, J. & Scherer, S.W. Evolutionary implications of inversions that have caused intra-strand parity in DNA. BMC Genomics 8, 160 (2007). https://doi.org/10.1186/1471-2164-8-160

  • DOI: https://doi.org/10.1186/1471-2164-8-160
