Molecular sex: The importance of base composition rather than homology when nucleic acids hybridize
Donald R. Forsdyke
Journal of Theoretical Biology (2007) 249, 325-330 http://dx.doi.org/10.1016/j.jtbi.2007.07.023
Copyright Elsevier Ltd.
2. Hybridization of synthetic RNAs
3. Stem-loop intermediates
4. Structures must be synchronized
6. Crick’s unpairing hypothesis
7. Why pairing?
End Note (Nov. 2008)
On learning that nucleic acid
hybridization had been achieved in a test tube, Huxley hailed the
discovery of “molecular
description was apt, since sex involves recombination, which requires
hybridization that, in turn, depends on a successful homology search.
Conversely, when the homology search fails, recombination fails. In
yeast this failure has been attributed to “simple
But sequence divergence does not impair nucleic acid hybridization
simply. Most natural single-stranded nucleic acids are predisposed to
adopt higher order structures containing stem-loops. Tomizawa showed
that the rate-limiting step in the hybridization of single-stranded
sequences is an initial “kissing”
exploration between complementary loops, which must first be
appropriately extruded and aligned. Successful duplex formation requires
successful synchronization of matching higher ordered structures, which
depends, not so much on the degree of similarity between their base
sequences as on the closeness of their base compositions (GC%).
In these terms we can understand how the anti-recombinational effect of GC%
differences supports the duplication both of genes within a genome and
of genomes within a genus (speciation).
Keywords: GC%; Recombination; Sequence divergence; Speciation; Stem-loops
of the origin of biological species are broadly categorized as genic and
chromosomal (Coyne and Orr, 2004). While in individual cases either
category may have applied, at issue is which is most likely to have
applied in the general case (Kliman et al., 2001; Forsdyke, 2004a).
Genetic analyses in yeast have shown that genic incompatibilities are
unlikely to have initiated divergence into new species. Furthermore, it
is inferred that incompatibilities which impair the meiotic pairing of
chromosomes are not due to segmental DNA rearrangements. By exclusion,
– namely, differences in individual DNA bases – is left (Liti et
al., 2006; Greig, 2007). A similar conclusion is suggested by studies in
fruit fly by Naveira and Maside (1998) who invoke “foreign
irrespective of its protein-encoding potential. Base differences would
impair the homology search that precedes meiotic recombination between
chromosomes, so impeding gametogenesis and rendering hybrids sterile.
Thus the parents of a hybrid would be reproductively isolated from each
other, a condition that could facilitate divergence into new species.
The yeast results were found “surprising”
since, although complicated, genic hypotheses were considered to be “widely
2007). Yet, the alternative – a chromosomal hypothesis of sequence
divergence due to base differences – was viewed as “simple”
(Liti et al., 2006; Greig, 2007). This may be reflective of
a dichotomy between genetical and biochemical evolutionists. At the
extremes, the former tend to think in terms of phenotypes and
mathematical models, whereas the latter tend to think in terms of
genotypes and DNA chemistry. Both groups recognize that base differences
usually suffice to prevent the hybridization that can lead to
recombination. But biochemical evolutionists have long dwelled on the
fact that species differ in base composition (Sueoka, 1961; Wada et al.,
1991; Bernardi, 2005; Forsdyke, 2006). Either directly (Bellgard et al.,
2001) or indirectly (Forsdyke, 2007), such differences can sometimes be
detected early in the speciation process – consistent with a cause and
Of the three fundamental parameters involving the sum of two of four bases – GC%, AG% and GT% (which reciprocate, respectively, with AT%, CT% and AC%) – values for GC% (i.e. G + C expressed as a percentage of the four bases) vary most widely both between and within species (Schultes et al., 1997), and have proved to be critical. I here review evidence on the decisive role played by differences in GC%, rather than in base sequence per se, in the preservation of sequences by protecting them from recombination with sequences from which they have begun to diverge. A genome which has diverged from others in its species may no longer be a reliable template for error-correction. As such it must be excluded from recombination with other members of the species, but when so excluded the deviant genome then becomes a candidate for an incipient speciation event (Forsdyke, 2001).
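The three two-base parameters named above, and their reciprocals, can be computed directly from a sequence. The following is a minimal sketch; the function name `base_composition` and the ten-base example sequence are illustrative assumptions and do not come from the cited studies.

```python
from collections import Counter


def base_composition(seq: str) -> dict:
    """Return GC%, AG% and GT% (and their reciprocals) for a DNA or RNA sequence.

    Illustrative helper: U is counted as T so RNA and DNA are treated alike,
    and any character other than A, C, G, T or U is ignored.
    """
    counts = Counter(seq.upper().replace("U", "T"))
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G, T or U")

    def pct(b1: str, b2: str) -> float:
        return 100.0 * (counts[b1] + counts[b2]) / total

    result = {"GC%": pct("G", "C"), "AG%": pct("A", "G"), "GT%": pct("G", "T")}
    # Each parameter reciprocates with the sum of the other two bases.
    result["AT%"] = 100.0 - result["GC%"]
    result["CT%"] = 100.0 - result["AG%"]
    result["AC%"] = 100.0 - result["GT%"]
    return result


example = base_composition("GGCCATATGC")  # hypothetical sequence: 3 G, 3 C, 2 A, 2 T
print(example)
```

For this hypothetical sequence GC% is 60 and AT% is 40, while AG% and GT% are both 50.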
2. Hybridization of synthetic RNAs
In the 1950s it became possible to synthesize artificial single-stranded RNA sequences such as UUUUUUUUUUUU – poly(rU), and AAAAAAAAAAAA – poly(rA) (Grunberg-Manago et al., 1955). The single strands when mixed together (i.e. poly(rU) + poly(rA)) formed a double-stranded hybrid (Rich and Davies, 1956; Warner, 1957), which had a helical structure similar to that of double-stranded DNA (Watson and Crick, 1953). Omitting the helix, this can be represented as:
What was going on in the privacy of the test-tube when millions of flexible, snake-like, poly(rU) molecules were mixed with millions of flexible, snake-like, poly(rA) molecules? Following the Watson-Crick base pairing rules, molecules of poly(rU) react only weakly with each other (since U pairs weakly with U). Furthermore, there is little inclination for the molecules to fold back on themselves to permit internal pairing of U with U. The same applies for poly(rA). So there was nothing left but for As to pair with Us. Since the molecules had little internal secondary structure (no folding back on themselves), it was easy for a writhing chain of Us to find a writhing chain of As. Millions of relatively rigid, duplex molecules resulted. Their formation could be monitored either spectrophotometrically or by observing an increase in viscosity.
Things got more complicated when more complex sequences were tried. Take, for example, the twelve base sequence UUUUUUUUAAAA, which should mix with the twelve base sequence AAAAAAAAUUUU, to give:
In this case, before they can be mixed, each molecule will have rapidly and spontaneously folded back on itself:
Each of the molecules in  has a stem (bases 1-4 and 9-12) where there is base-pairing, and a loop (bases 5-8) where there is no base pairing. This situation better corresponds to that of cellular RNAs (Meyer and Miklós, 2005; Shabalina et al., 2006; Forsdyke, 2006). By virtue of their complex structures, “sense” natural RNAs would need a little coaxing – perhaps heating a little – to get them to form duplexes as in  with the corresponding “antisense” RNAs. The details of this process were elucidated by Tomizawa (1984). In  the loops are facing each other, so that the bases in the left loop can reversibly pair with the bases in the right loop. Tomizawa referred to this – the critical rate-limiting step in hybridization – as "kissing" (Eguchi et al., 1991).
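The stem and loop assignment described for the two twelve-base sequences can be checked with a short sketch. It assumes strict Watson-Crick pairing only (A with U, G with C) and a stem length supplied by the caller; the function name `fold_hairpin` is hypothetical and the code is a toy check, not a structure-prediction method.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def fold_hairpin(seq: str, stem_len: int):
    """Split seq into (5' stem arm, loop, 3' stem arm) if the arms pair, else None.

    Base i of the 5' arm must pair with base len(seq) - 1 - i, because the
    two arms of a folded-back strand run antiparallel.
    """
    if stem_len <= 0 or 2 * stem_len >= len(seq):
        return None
    for i in range(stem_len):
        if (seq[i], seq[-1 - i]) not in PAIRS:
            return None
    return seq[:stem_len], seq[stem_len:-stem_len], seq[-stem_len:]


left = fold_hairpin("UUUUUUUUAAAA", 4)   # stem bases 1-4 and 9-12, loop bases 5-8
right = fold_hairpin("AAAAAAAAUUUU", 4)
homopolymer = fold_hairpin("UUUUUUUUUUUU", 4)  # poly(rU): no stem can form
print(left, right, homopolymer)
```

Under these assumptions each mixed sequence yields a four-base stem around a four-base loop, whereas the homopolymer returns `None`, matching the observation that poly(rU) has little inclination to fold back on itself.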
In this case the kissing can rapidly escalate since Us pair with As. But if the left loop had one of its Us substituted by the base C (in one of the positions 5-8) escalation would be less likely, and would become increasingly unlikely as more of the four positions were substituted with C. Under the Watson-Crick rules, As do not pair with Cs. Tomizawa used the word "kissing" to imply an exploratory interaction. The chemical energetics are such that, if the kissing can be sustained, the stems of the two parental molecules will disrupt to allow formation of a complete duplex as in . A more elaborate example is shown in Figure 1. Here the pairing first occurs between As and Us and between Gs and Cs on the loops at the left. The loops attempt to form a mini-double helix (not shown). This is reversible, so that if adequate complementary pairing bases are not found, the kissing loops (middle) separate. The long duplex at the right (actually a double helix) is more chemically stable than the structures at the left, so if the conditions are right the reaction will primarily be in the direction of the arrows (Eguchi et al., 1991).
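The effect of substituting C for U in one loop can be illustrated by counting the Watson-Crick pairs available when two loops meet. This is a minimal sketch: the pair count is only a crude stand-in for the energetics discussed above, and the function name `kissing_pairs` and the substituted loop sequences are illustrative assumptions.

```python
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def kissing_pairs(loop_a: str, loop_b: str) -> int:
    """Count Watson-Crick pairs when two equal-length loops meet antiparallel."""
    if len(loop_a) != len(loop_b):
        raise ValueError("loops must have the same length")
    # The first base of loop_a faces the last base of loop_b.
    return sum(WATSON_CRICK.get(x) == y for x, y in zip(loop_a, reversed(loop_b)))


right_loop = "AAAA"
for left_loop in ["UUUU", "UCUU", "UCCU", "CCCU", "CCCC"]:
    # Each successive left loop carries one more U-to-C substitution.
    print(left_loop, kissing_pairs(left_loop, right_loop))
```

In this toy count each additional C removes one potential A-U pair, so the number of pairs available to sustain the kiss falls from four to zero.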
Fig. 1. Tomizawa model for pairing between two RNA molecules that have in common a segment of close sequence similarity. “Kissing” interactions between loops (at left) are rate-limiting for formation of a hybrid duplex (at right).
4. Structures must be synchronized
Both reacting molecules at the left in Figure 1 are in the same tube in the same salt solution and at the same temperature. For consummation of their initial pairing a little warming might be needed, but the heat would be delivered to the tube in a uniform manner, so that both reacting molecules would be affected equally. We can think of them as being synchronized so that the structures match – one does not remain in entirely single-stranded mode while the other adopts a stem-loop configuration. Both form stem-loops to an equal extent and at the same time, apart from minor idiosyncratic fluctuations.
Given their common environment, what small difference in the partners at the left would be most effective in preventing their union? One might imagine that a change in one of the base-pairs in the loops – perhaps a C opposite one of the As – would impede the kissing. Alternatively, there might be a mismatch in the duplex at the right. In other words, one tends to think in terms of the sequence similarity between the two partners being less than perfect. However, imperfect similarity per se does not necessarily prevent hybridization. Far more critical are the base pairs that give the secondary structure of the RNA its stability. A difference in pairing, such as the substitution of an AU pair for a GC pair almost anywhere in the stem of one partner, would tend to desynchronize their configurations so that the bases in the loops, even if precisely matched, would not meet each other. Indeed, numerous studies reveal the exquisite sensitivity of the structures formed by single-stranded nucleic acids to changes in only one base pair (Orita et al., 1989; Shen et al., 1999; Dong et al., 2001; Woodside et al., 2006). Chen et al. (1990) showed for RNA that it is not so much similarity (i.e. equivalence in base order) as base composition – and specifically GC% – that critically determines secondary structure. Base composition makes a much greater contribution to structural energetics than base order. Of various base compositional parameters, the product of the frequencies of G and C is the best predictor of structure.
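The compositional parameter singled out at the end of this paragraph, the product of the frequencies of G and C, depends only on base composition and not on base order. The sketch below computes that product alone; it does not reproduce the computational procedure of Chen et al. (1990), and the function name `gc_product` and the three example sequences are hypothetical.

```python
def gc_product(seq: str) -> float:
    """Product of the fractional frequencies of G and C in seq."""
    s = seq.upper()
    if not s:
        raise ValueError("empty sequence")
    return (s.count("G") / len(s)) * (s.count("C") / len(s))


original = "GGCCAUAU"   # hypothetical sequence: 2 G, 2 C, 2 A, 2 U
shuffled = "GAUCGAUC"   # same bases in a different order
at_rich = "GACCAUAU"    # one G replaced by A

for name, seq in [("original", original), ("shuffled", shuffled), ("at_rich", at_rich)]:
    print(name, gc_product(seq))
```

Reordering the bases leaves the product unchanged, whereas a single G-to-A substitution halves it in this example. This is the sense in which a small compositional change can matter more than a change in base order.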
This takes us back to Huxley’s remark (Rich, 2006). It is easy to think that a human baby itself (or the adult that baby develops into) is the final product of a parental copulation nine months earlier. In fact, that act of copulation began something that (as far as an ending can be discerned in a process that is essentially cyclical) ends in the gonads of the adult their baby grows to become. The chances are that, right now, within your gonads that ending is being enacted by DNA copies derived from parental DNA molecules that have been cooperating from the moment of your conception, both within your main body ("soma") and within your gonads ("germ-line"). Most of the time your two parental DNAs (genomes) work together but separately. They multiply as the cells containing them multiply (by mitosis). However, in the gonad, when new gametes are made there is a different type of cell division (meiosis). This meiotic division is characterized by the union (or "conjugation" as the early cytologists called it) of your parental chromosomes. In essence, it completes the act of conjugation your parents initiated decades earlier, and involves the formation of hybrid duplexes with one DNA strand being of paternal origin and the other of maternal origin.
6. Crick’s unpairing hypothesis
Subjects of much debate are how and why this meiotic union occurs. Crick saw the problem as one of determining how parental duplex DNA molecules, both double-stranded in accordance with the Watson-Crick model, would be able to recognize each other. For simplicity, he thought the recognition should follow the base-pairing rules that he and Watson had discerned. But in helical duplex DNA the bases were inward-looking. How could inward-looking bases in one duplex look outwards to recognize homologous bases in another DNA duplex? Thus, came the “unpairing hypothesis” (Crick, 1971). In both duplexes the two strands would locally unpair so as to present outward-looking, single-stranded, sequences of bases (Krueger et al., 2006). In this way, a segment of bases in a paternal duplex would be able to pair with a similar segment of bases in a maternal duplex. If the pairing sequences were absolutely identical (i.e. there was homology) the duplex would be considered a “homoduplex” (like the original parental duplexes). If the pairing sequences differed by as little as one base-pair, then the duplex would be considered a “heteroduplex” (Holliday, 1990; Allers and Lichten, 2001).
Crick saw the unpaired strands as remaining single-stranded prior to forming a homoduplex or heteroduplex. But energetic considerations dictate that, as they “unzip” from each other, the unpaired single strands should quickly fold back on themselves to form stem-loop structures (Murchie et al., 1992; Woodside et al., 2006). This would be supported by the crowded intracellular environment where entropic contributions to the critical base-stacking interactions would be increased (Yakovchuk et al., 2006). Thus, the initiation of pairing between the parental DNA strands should involve the same “kissing” process as envisaged by Tomizawa for RNA molecules (Sobell, 1972; Wagner and Radman, 1975; Doyle, 1978; Kleckner and Weiner, 1993; Hawley and Arbel, 1993). Of key importance for this would be that the unpairing and folding displayed a sufficient degree of synchrony so complementary DNA loops would have the opportunity to meet. As in the case of RNA folding (Chen et al., 1990), it was found for DNA that base composition, rather than actual sequence similarity (base order), was critical in determining the degree of secondary structure (Forsdyke, 1998). These results, derived from thermodynamic calculations of nearest-neighbour energies (Mathews, 2006), were supported by optical force clamp studies of isolated molecules (Woodside et al., 2006). Thus, a very small difference in the base composition between the paternal and maternal DNAs should suffice to prevent the initiation of hybridization. This is summarized in Figure 2. Here, on the left, paternal (P) and maternal (M) DNAs have the same base composition ("X"). As conditions change, the two strands unpair in synchrony and loops are positioned so that kissing can occur. Formation of a paranemic joint (no immediate strand breakage) can lead to recombination (Wong et al., 1998). On the right, the base compositions differ slightly (X and X+1) and the unpairing is unsynchronized.
Fig. 2. The exquisite sensitivity of stem-loop extrusion from duplex DNA to differences in base composition can prevent the initiation of pairing between homologous DNA sequences. At the left, paternal (P) and maternal (M) duplexes have the same GC% value. As negative supercoiling progressively increases, the strands of each duplex synchronously open to allow formation of equivalent stem-loop secondary structures so that “kissing” interactions between loops can progress to pairing. At the right, paternal and maternal duplexes differ slightly in GC%. The maternal duplex of higher GC% opens less readily as negative supercoiling increases, so strand opening is not synchronous, “kissing” interactions fail, and there is no progress to pairing. In this model chromosome pairing occurs before the strand breakage that accompanies recombination (not shown). Even if strand breakage were to occur first (as required by some models), unless inhibited by single-stranded DNA-binding proteins the free single strands so exposed would, in the crowded intracellular environment, rapidly adopt stem-loop configurations. So the homology search could still involve kissing interactions between the tips of loops.
7. Why pairing?
Why does meiotic pairing occur? Why should it matter that the pairing partners be synchronized so that a hybrid duplex is formed between paternal and maternal chromosomes? Furthermore, what are the consequences of failure to form such a duplex? The case has been made that, apart from assisting the equal partitioning of chromosomes among gametes, a hybrid duplex is an essential intermediate in the recombination of segments of paternal and maternal genomes that results, not only in increased genetic diversity among gametes (by intra-chromosomal exchange of segments) but, more importantly, decreased genetic diversity due to correction of mutations (gene conversion; Bernstein and Bernstein, 1991). This appears as a compelling reason for sexual, as contrasted with asexual, reproduction. In the course of this correction, differences between the parental genomes would decrease, so that information for any adaptations that might have depended on those differences would be less likely to be forwarded to their grandchildren. In other words, while heterozygosity would not be entirely eliminated, there would be some blending – an ironing out of differences (Forsdyke, 2006).
The opposite would be expected when duplex formation fails. Sequences of DNA then
become recombinationally isolated, a condition favourable to the emergence both
of new genes from the duplication of pre-existing genes within a genome, and of
new species from the duplication of pre-existing species within a genus. It is
here that the importance of base composition differences (rather than
non-homology per se) in preventing the initiation of recombination, is apparent.
This would seem to provide a rationale for the well known differences in base
composition between genes within a genome, and between species within a genus
(Wada et al., 1991; Bernardi, 2005). Thus, the recombinational isolation brought
about by differences in GC% – the “accent” of DNA – can be seen as a
major protector of novel DNA sequences against recombinational repair processes
(Forsdyke, 2004b). When in a
protein-encoding region, the differences would primarily affect third
(synonymous) codon positions, so there could be reproductive isolation without
necessarily affecting protein function (i.e. without in the first instance
affecting phenotype; Forsdyke, 2007).
A recurring criticism of this viewpoint (raised by a reviewer of this manuscript)
has been that closely related species may have very similar GC% values. However,
barriers to reproduction (the first being a postulated difference in GC%) tend
to replace each other consecutively. Having diverged, GC% values can converge
when a second barrier appears (Schultes et al., 1997). The following metaphor
may help. Your dog may be tethered by a leash. But if you build a high fence,
the leash is no longer necessary. If the initial barrier (the leash) is damaged
or lost, it may not be noticed. The second barrier (fence) should suffice. On
the other hand, the leash then becomes available for some other function. Thus,
following establishment of a second barrier, a first barrier may degenerate or
change in a random way, or may find other employment. If a Sherlock Holmes then
tried to discern whether there had been an earlier barrier than the fence, and
what form it had taken, there might be a problem (Forsdyke, 2001).
The Watson-Crick model (1953) revealed how genetic attributes could be evenly distributed among child cells – a triumph for the “reductionist” approach through which knowledge of the whole is derived through minute examination of its parts. As shown here, it seems possible that it will be through a better understanding of DNA chemistry that conflicting views of speciation will be resolved (Kliman et al., 2001; Forsdyke, 2004a).
Allers, T., Lichten, M., 2001. Intermediates of yeast
meiotic recombination contain heteroduplex DNA. Mol. Cell
Bellgard, M., Schibeci, D., Trifonov, E., Gojobori, T.
J., 2001. Early detection of G + C differences in bacterial species
inferred from the comparative analysis of the two completely sequenced Helicobacter
pylori strains. J.
Mol. Evol. 53,
Bernardi, G., 2005. Natural Selection and Genome
Bernstein, C., Bernstein, H.,
1991. Aging, Sex and DNA Repair. Academic Press,
Chen, J-H., Le, S-Y., Shapiro, B., Currey, K. M., Maizel, J. V., 1990. A computational procedure for assessing the significance of RNA secondary structure. CABIOS 6, 7-18.
Coyne, J. A., Orr, H. A., 2004. Speciation. Sinauer,
Crick, F., 1971. General model for
the chromosomes of higher organisms. Nature
Dong, F., Allawi, H. T.,
Doyle, G. G., 1978. A general
theory of chromosome pairing based on the palindromic DNA model of Sobell with
modifications and amplification. J. Theor. Biol. 70, 171-184.
Eguchi, Y., Itoh, T., Tomizawa,
J., 1991. Antisense RNA. Ann. Rev. Biochem.
Forsdyke, D. R., 1998. An
alternative way of thinking about stem-loops in DNA. A case study of the G0S2
Theor. Biol. 192, 489-504.
Forsdyke, D. R., 2001. The Origin of Species,
University Press, Montreal.
Forsdyke, D. R., 2004a. Chromosomal speciation: a reply. J. Theor. Biol. 230, 189-196.
Forsdyke, D. R. 2004b. Regions of relative GC% uniformity are
recombinational isolators. J. Biol. Sys. 12,
Forsdyke, D. R., 2007. Positive Darwinian selection. Does the comparative
method rule? J. Biol. Sys. 15, 95-108.
Greig, D., 2007. A screen for recessive speciation genes expressed in the
gametes of F1 hybrid yeast. PLOS Genetics 3, 281-286.
Grunberg-Manago, M., Ortiz, P. J., Ochoa, S., 1955. Enzymatic synthesis of nucleic acid-like polynucleotides. Science 122, 907-910.
Hawley, R. S., Arbel, T., 1993.
Yeast genetics and the fall of the classical view of meiosis. Cell 72,
Holliday, R., 1990. The history of heteroduplex DNA. BioEssays
Kleckner, N., Weiner, B. M., 1993.
Potential advantages of unstable interactions for pairing of chromosomes in
meiotic, somatic and premeiotic cells. Cold Spring Harb. Symp. Quant.
Kliman, R. M., Rogers, B. T., Noor, M. A. F., 2001. Differences in (G+C)
content between species: a commentary on Forsdyke’s “chromosomal
viewpoint” of speciation. J.
Theor. Biol. 209, 131-140.
Krueger, A., Protozanova, E., Frank-Kamenetskii, M. D., 2006.
Sequence-dependent base-pair opening in DNA double helix. Biophys.
Liti, G., Barton, D. B. H., Louis, E. J., 2006. Sequence diversity,
reproductive isolation and species concepts in Saccharomyces. Genetics
Mathews, D. H., 2006. Revolutions in RNA secondary structure prediction.
J. Mol. Biol. 359, 526-532.
Meyer, I. M., Miklós,
Murchie, A. I. H., Bowater, R., Aboul-Ela, F., Lilley, D. M. J., 1992. Helix opening transitions in supercoiled DNA. Biochim. Biophys. Acta 1131, 1-15.
Naveira, H. F., Maside, X. R., 1998. The genetics of hybrid male
sterility in Drosophila. In: Endless Forms: Species and
Speciation. (Howard, D. J., Berlocher, S. H., eds.), pp.
330-338. Oxford University
Orita, M., Iwahana, H.,
Rich, A., 2006. Discovery of the hybrid helix and the first DNA-RNA hybridization. J. Biol. Chem. 281, 7693-7696.
Rich, A., Davies, D. R., 1956. A new two-stranded helical structure: polyadenylic acid and polyuridylic acid. J. Amer. Chem. Soc. 78, 3548-3549.
Schultes, E., Hraber, P. T., La Bean, T. H., 1997. Global similarities in nucleotide base composition among disparate functional classes of single-stranded RNA imply adaptive evolutionary convergence. RNA 3, 792-806.
Shabalina, S. A., Ogurtsov, A. Y.,
Spiridonov, N. A., 2006. A
periodic pattern of mRNA secondary structure created by the genetic code.
Nucleic Acids Res. 34, 2428-2437.
Shen, L. X., Basilion, J. P.,
Stanton, V. P., 1999. Single nucleotide polymorphisms can cause different
structural folds of mRNA. Proc. Natl. Acad. Sci. USA 96, 7871-6.
Sobell, H. M., 1972. Molecular
mechanism for genetic recombination. Proc. Natl. Acad. Sci. USA
Sueoka, N., 1961. Compositional correlation between deoxyribonucleic acid and protein. Cold Spring Harb. Symp. Quant. Biol. 26, 35-43.
Tomizawa, J., 1984. Control of
ColE I plasmid replication: the process of binding of RNA I to the primer
Wada, A., Suyama, A., Hanai, R., 1991. Phenomenological
theory of GC/AT pressure on DNA base composition. J.
Mol. Evol. 32, 374-378.
Wagner, R. E., Radman, M., 1975. A
mechanism for initiation of genetic recombination. Proc.
Natl. Acad. Sci. USA 72, 3619-3622.
Warner, R. C., 1957. Interaction
of polyadenylic and polyuridylic acids. Fed. Proc. 16, 266-267.
Watson, J. D., Crick, F. H. C.,
1953. A structure for deoxyribose nucleic acid. Nature 171,
Wong, B. C., Chiu, S-K., Chow, S. A., 1998. The role of negative superhelicity and length of homology in the formation of paranemic joints promoted by RecA protein. J. Biol. Chem. 273, 12120-12127.
Woodside, M. T., Behnke-Parks, W. M., Larizadeh, K., Travers, K., Herschlag, D., Block, S. M., 2006. Nanomechanical measurements of the sequence-dependent folding landscapes of single nucleic acid hairpins. Proc. Natl. Acad. Sci. USA 103, 6190-6195.
Yakovchuk, P., Protozanova, E., Frank-Kamenetskii, M. D., 2006. Base-stacking and base-pairing contributions into thermal stability of the DNA double helix. Nucleic Acids Res. 34, 564-574.
End Note (Nov. 2008)
First "Paranemic" Pathway for Recombination.
Concerning the initiation of recombination, citing Sobell 1972 but not
Crick 1971, Wilson in 1979 pointed out that "cut
first" models were more drastic in that they implied a
commitment to recombination, whereas a "pair
first" model between intact homologous duplexes would
be more easily reversed.
Wilson, J. H. (1979) Nick-free formation of reciprocal heteroduplexes: a simple solution to the topological problem. Proceedings of the National Academy of Sciences, USA 76, 3641-45.
End Note (May 2014)
Heterozygosity correlates non-specifically with
degree of hybrid sterility.
Surprise was also expressed by Moehring (2011) who noted this
"unexpected" correlation and also that "the directionality in sterility
[Haldane's rule] is likely due to the different amounts of heterospecific
genome present in the two backcrosses." And, in line with their colleagues
Naveira and Maside (1998), Moran and Fontdevila (2014) found a high degree
of "polygenicity" with "exchangeability" (i.e. non-specificity) between
members of the putative "polygenes" that contributed quantitatively to
establish a given degree of hybrid sterility. All this accords with
"simple sequence divergence."
Moehring AJ (2011) Heterozygosity and its unexpected correlations with hybrid sterility. Evolution 65, 2621-30.
Moran T, Fontdevila A (2014) Genome-wide dissection of hybrid sterility in Drosophila confirms a polygenic threshold architecture. Journal of Heredity 105, 381-396.
This page was established in July 2007 and was last edited 21 Mar 2017 by Donald Forsdyke