Trends in Immunology (2002) 23, 575-579
With copyright permission from Elsevier Science Ltd. This version has small differences from the published version, two extra figures, and end-notes.
An elementary immune system
Polymorphism creates unpredictability
Junk DNA is transcribed
Repetitive elements transcribed in infected cells
Double-stranded RNA as an alarm signal
Purine-loading to avoid self-recognition
Intracellular protein “immune receptors”
Conclusion
End Note (2003)
End Note (2006)
End Note (April 2007)
End Note (March 2008)
End Note (Jan 2009)
End Note (April 2009)
End Note (Jan 2010)
Instead of being greeted as supporting the growing corpus of immunological theory, recent advances in the bioinformatic analysis of genomes have often surprised the discoverers and failed to attract the attention of immunologists.
The view that multicellular immune systems are adaptations of already highly evolved unicellular immune systems that are capable of self/not-self discrimination can assist our comprehension of phenomena such as “junk” DNA, genetic polymorphism and the ubiquity of repetitive elements.
The “hidden transcriptome,” revealed by run-on transcription of genes or repetitive elements, contains a diverse repertoire of RNA “immune receptors,” with the potential to form double-stranded RNA with viral RNA “antigens,” thus triggering intracellular alarms.
Unicellular organisms are likely to have evolved some 800 million years before multicellular organisms. Brücke dubbed single cells “elementary organisms”, implying that many multicellular level functions might have prototypic equivalents at the unicellular level. We here explore implications of the postulate that immune systems of multicellular organisms arose as extensions of immune systems preexisting at the unicellular level [2,3].
Fig. 1. The cell as an elementary immune organism. The left circle (a) represents a multicellular organism with Y-shaped antibodies of various specificities. The right circle (b) represents a unicellular organism with a repertoire of “antibody-like” protein and “antibody-like” RNA molecules (stem-loop structures). These are referred to as “immune receptors”, implying that parts of these molecules can interact with intracellular antigens.
An elementary immune system
In a clonal unicellular population where asexual reproduction predominates, self-destruction (i.e. apoptosis) is the simplest mechanism to prevent spread of a pathogen and to promote survival of a “selfish gene”. However, even such a primitive defence needs to be coupled to specific and adaptable sensors. We propose that such a sensory system is provided by a multiplicity of structurally distinct macromolecules, of which we emphasize here proteins and RNAs (Fig. 1). Many of these will have distinct properties (e.g. catalytic, structural, transporting, templating, etc.).
On the other hand, there is a high probability that, in the crowded cytosol, one or more of these molecules will be able to bind an invading virus with sufficient affinity to tag it as “not-self,” thus initiating an innate immune response. Such an “immunological repertoire” could develop either over evolutionary or, as in the case of antibodies, over somatic time. Whatever the mechanism and timing of the diversification process, there is a need to eliminate receptors with an affinity for “self” antigens.
Unfortunately, given the high replication and mutation rates of viruses relative to those of their hosts, it would be highly probable that viruses would pre-adapt to avoid interaction with hostile host macromolecules. What a virus had “learned” (by mutation and selective proliferation) in one host, it would exploit on the next host. New information on host genome polymorphism suggests that this difficulty may now not be so formidable as it once appeared (see below).
In an elementary unicellular immune system viruses that, through mutation, acquired the ability to inactivate host apoptotic mechanisms, would preferentially survive. In the ensuing arms race, an intracellular “inflammatory” host response would have evolved to limit viral activities. However, in multicellular organisms apoptosis of the primarily infected cell might limit the opportunity to alert other target cells and cells of the immune system (e.g. for MHC-peptide presentation). Sophistications developed at the multicellular level are considered below.
Polymorphism creates unpredictability
Fig. 2. Specific and general functions of a protein as reflected in its structure. Dedicated functions are associated with conserved, internal, hydrophobic, globular domains. Potential immune receptor functions are associated with variable, external, hydrophilic, non-globular domains.
On average, the haploid maternal and paternal contributions to your diploid genome are likely to differ from each other at least once every 0.5-2.0 kilobases, and general intraspecies differences may arise at least once every 185 bases. Such polymorphism should decrease the extent to which a pathogen from one host can anticipate the genomic characteristics of its next host. When the polymorphism affects proteins, it probably affects sequences of relatively low complexity that correspond to hydrophilic non-globular domains at the protein surface. Thus, these domains, usually not critical for the specialized function of the protein, are available for interaction with complementary molecular patterns of intracellular pathogens (“not-self”; Fig. 2). These same domains should also have the potential to react with “self” proteins, sometimes to an extent sufficient to trigger adverse responses in the host (intracellular “autoimmune” pathology). Organisms with mutations avoiding this would have been favoured over evolutionary time [8-10].
Junk DNA is transcribed
Had we not known of the existence of an antibody repertoire, the discovery of sets of V-genes would have been greeted with surprise. However, our surprise at learning that 98% of our DNA is non-genic has been somewhat blunted by a facile explanation: “junk” [11,12].
To be functional it is likely that non-genic DNA would have to be transcribed. Recent investigations of the transcriptional activities of the β-globin region of human chromosome 11, and of entire chromosomes 21 and 22, reveal a “hidden transcriptome,” corresponding to a large number of low copy number cytoplasmic RNAs. It is estimated that there is “an order of magnitude” more transcriptionally active DNA than can be accounted for by conventional genes. Can this be dismissed as mere cytoplasmic “junk,” an unavoidable consequence of the existence of genomic “junk”?
To understand its role, if any, in the economy of the organism, we need to know, by analogy with known transcriptional processes, whether there are specific promoters, whether there are dedicated RNA polymerases, whether transcription occurs randomly or under specific conditions, and whether transcripts are diverse and include appreciable non-repetitive DNA.
Repetitive elements transcribed in infected cells
Fig. 3. Location of Alu elements is likely to permit downstream transcription of variable genomic segments. Alu and other repetitive elements are shown in part of the 100 kilobase segment of human chromosome one containing the two exon gene, G0S2.
Horizontal arrows indicate transcription directions of G0S2 (grey boxes), of Alu elements (red boxes demarcated by vertical dashed lines) and of other repetitive elements (cyan boxes). The abbreviated names of repetitive elements are printed vertically.
Purine-loading (excess of purines/kb over pyrimidines/kb; grey balls) and CpG frequency (dinucleotides/kb; green continuous line) were evaluated for 400 base windows moving in steps of 25 bases. When purine frequency equals pyrimidine frequency, purine-loading is zero.
Values for CpG frequency (plotted on the same scale) are zero or positive. The CpG peak (“CpG island”) associated with G0S2 indicates a gene expressed in the germ line. (Note that, if the sequences of a virus and its host are known, then it should be possible to locate host segments complementary to virus segments and, from displays such as this, determine the feasibility of their transcription.)
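The two quantities plotted in Fig. 3 can be sketched in a few lines of Python. This is a minimal illustration, assuming a DNA sequence supplied as a plain string of A, C, G and T; the function name `window_stats` and its parameters are hypothetical and are not taken from the original analysis. Purine-loading is computed as purines per kilobase minus pyrimidines per kilobase, and CpG frequency as CG dinucleotides per kilobase, for 400 base windows moving in steps of 25 bases, as described in the caption.

```python
from typing import List, Tuple

PURINES = set("AG")
PYRIMIDINES = set("CT")


def window_stats(seq: str, window: int = 400, step: int = 25) -> List[Tuple[int, float, float]]:
    """Return (window_start, purine_loading_per_kb, cpg_per_kb) for each window.

    Purine-loading is (purines - pyrimidines) scaled to a kilobase, so it is
    zero when purine frequency equals pyrimidine frequency and negative for
    pyrimidine-loaded windows. CpG frequency is never negative.
    """
    seq = seq.upper()
    scale = 1000.0 / window  # converts per-window counts to per-kilobase values
    results = []
    for start in range(0, len(seq) - window + 1, step):
        w = seq[start:start + window]
        purines = sum(1 for base in w if base in PURINES)
        pyrimidines = sum(1 for base in w if base in PYRIMIDINES)
        cpg = w.count("CG")  # "CG" cannot overlap itself, so count() is exact
        results.append((start, (purines - pyrimidines) * scale, cpg * scale))
    return results
```

Applied to the top strand of a genomic segment, positive values mark regions whose rightward transcripts would be purine-loaded, and negative values mark regions whose leftward transcripts would be purine-loaded.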
Much non-genic DNA consists of repetitive elements, the most prominent of which in humans are the 1,090,000 Alu elements. Both conventional genes and repetitive elements can provide promoters for the transcription of non-genic DNA. Some gene transcripts have been found to be longer than expected due to a failure of transcriptional termination (“run-on” transcription; [17,18]). Some classes of repetitive element contain promoters from which transcription can initiate and extend beyond the bounds of an element into neighbouring genomic regions [19-22].
Are such extended transcripts generated randomly in time? In the case of Alu elements, transcription (by RNA polymerase III) has been observed to increase at times of cell stress (e.g. viral infection, heat shock). Indeed, viral infection can trigger the heat shock response with the induction of heat shock proteins (for Refs. see [23]). Thus, it is possible that Alu transcription reflects an adaptive response to virus infection (for Refs. see [24]).
Is the location of Alu elements likely to permit downstream transcription of variable genomic segments? Figure 3 shows a segment of human chromosome one containing the G0/G1 switch gene 2 (G0S2), which is upregulated in activated lymphocytes. The gene demonstrates the general phenomenon of “purine loading” (more purines than pyrimidines), which is characteristic of most RNAs of most organisms. Thus, when transcription is to the right of the promoter, exons are purine-loaded, and when transcription is to the left of the promoter, exons are pyrimidine-loaded (i.e. negatively purine-loaded). In both circumstances the RNAs end up being purine-loaded for reasons discussed below.
With the transcription direction of G0S2 being to the right (indicated by the horizontal arrow in Fig. 3a), the gene and the corresponding mRNA are purine-loaded. This purine-loading extends for about a kilobase downstream of G0S2 into a region with no repetitive elements. Thus, if there were conditions such that transcription did not terminate, then the extended transcript would itself be purine-loaded and contain non-repetitive DNA.
Also shown in Figure 3 are various repetitive elements with assigned potential transcription directions. Although within a class of repetitive element there is some variability, by definition the repetitive elements themselves tend to diminish genome variability. However, the regions downstream of Alu elements are often devoid of other repetitive elements. For example, the pyrimidine-loaded leftward-transcribing Alu element downstream of G0S2 has a clear downstream region that retains the pyrimidine-loading of the original transcript. On the other hand, several kilobases upstream of G0S2 are two leftward-transcribing Alu elements, one of which transcribes into a region that is purine-loaded and contains repetitive elements of the L2 family. These results are illustrative of the general features of this genomic region. A parallel study of the region of a much smaller human chromosome containing the FOSB/G0S3 gene (chromosome 19) revealed a much tighter packing of repetitive elements (data not shown).
Fig. 4. Two RNA molecules (blue and red) meeting, "kissing", and forming dsRNA. [For space reasons this figure was omitted from the final paper.]
Double-stranded RNA as an alarm signal
Although protein molecules can recognize specific nucleic acids (and the converse), it is convenient here to consider proteins recognizing proteins and RNAs recognizing RNAs. In the cytosol RNA molecules adopt characteristic stem-loop configurations (Fig. 1b), and RNA-RNA interactions can initiate by way of a “kissing” homology-search between bases at the tips of loops. If sequence complementarity is found (e.g. G pairing with C, and A pairing with U) then two RNA species can pair, partially or completely, to generate a length of double-stranded RNA (dsRNA) that in some circumstances can play a regulatory role [Fig. 4; see ref. 28].
If a virus introduced its own RNA into a cell, would there be sufficient variability among host RNA species for a host “immune receptor” RNA to form a segment of dsRNA with the “not-self” RNA of the virus? Calculations made elsewhere show this to be feasible, especially if the entire genome were available for transcription. Would the dsRNA be able to initiate an adaptive intracellular “inflammatory” response? How would the host cell prevent generation of “self” dsRNAs?
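The feasibility question can be illustrated with a back-of-envelope estimate. The sketch below is not the calculation cited above; it is a simplified model that assumes random sequences with equal base frequencies, perfect Watson-Crick pairing only, and a hypothetical duplex length `k` of 21 bases (chosen because more than twenty base pairs are said later in the text to be needed to trigger a response). The function names and the example lengths are illustrative assumptions.

```python
import math


def expected_duplexes(viral_len: int, host_len: int, k: int = 21) -> float:
    """Expected number of perfectly complementary k-base stretches.

    Every k-base window of the viral RNA is compared with every k-base window
    of the transcribed host sequence; under the random-sequence assumption
    each comparison succeeds with probability (1/4)**k.
    """
    viral_sites = max(viral_len - k + 1, 0)
    host_sites = max(host_len - k + 1, 0)
    return viral_sites * host_sites * 0.25 ** k


def prob_at_least_one(viral_len: int, host_len: int, k: int = 21) -> float:
    """Poisson approximation to the chance of at least one such stretch."""
    return 1.0 - math.exp(-expected_duplexes(viral_len, host_len, k))


if __name__ == "__main__":
    # Illustrative lengths only: a 10,000-base viral RNA against increasing
    # amounts of transcribed host sequence, up to a genome-sized 3e9 bases.
    for host_len in (10**6, 10**8, 3 * 10**9):
        print(host_len, prob_at_least_one(10_000, host_len))
```

Under these assumptions the probability rises steeply with the amount of host sequence transcribed, which is the qualitative point made in the text: the more of the genome that is available for transcription, the more likely it is that some host RNA can pair with a viral RNA.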
Formation of dsRNA has long been recognized as an early cellular response to viral entry. Protein synthesis can be inhibited non-specifically by very low concentrations of dsRNA. This involves activation of dsRNA-dependent protein kinase (PKR), which inhibits a protein involved in the initiation of protein synthesis. Evasive viral strategies would include the acceptance of mutations to avoid formation of dsRNA (see below), and inhibition of cell components required for the formation of, or the response to, dsRNA [31,32].
Virus-infected cells produce interferons, which can be considered part of the inflammatory response. The interferons induce a general anti-viral state, spreading together with various chemokines from the cell of origin to other cells. Their production is powerfully stimulated by dsRNA. There is now growing evidence that, both in animals and plants, another more sequence-specific “inflammatory” response to dsRNA arises as part of an intracellular mechanism for self/not-self discrimination. Just as in the antibody response there is amplification of the production of specific antibody, so, courtesy of enzymes such as RNA-dependent RNA polymerase and “dicer,” there is amplification of the production of specific “immune receptor” RNA (for Refs. see accompanying paper of Martinez et al. [36]).
Fig. 5. Run-on transcription reveals the "hidden transcriptome." (For space reasons this figure was omitted from the original paper.)
Although it is currently believed that host cells detect dsRNA of virus origin, given the functioning of dsRNA as an alarm signal (Fig. 5), viruses should have evolved to avoid the formation of dsRNA replicative intermediates. Indeed, viruses with dsRNA genomes have adaptations that would appear to conceal their genomes from host cell surveillance mechanisms. More than twenty base pairs are needed to activate PKR in vitro, or to silence specific genes.
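The length threshold can be made concrete with a small search for the longest stretch over which two RNAs could pair. This is a minimal sketch, assuming RNA sequences written 5' to 3' with the letters A, C, G and U, perfect Watson-Crick pairing only (G-U wobble pairs and secondary structure are ignored), and a hypothetical threshold parameter of 20; the function names are illustrative. A stretch of one RNA pairs antiparallel with a stretch of another when it equals the reverse complement of that stretch, so the problem reduces to finding the longest common substring.

```python
_RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.upper().translate(_RNA_COMPLEMENT)[::-1]


def longest_duplex(rna_a: str, rna_b: str) -> int:
    """Length of the longest perfectly paired antiparallel stretch.

    Computed as the longest common substring of rna_a and the reverse
    complement of rna_b, using a rolling dynamic-programming row.
    """
    a = rna_a.upper()
    b_rc = reverse_complement(rna_b)
    best = 0
    prev = [0] * (len(b_rc) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b_rc) + 1)
        for j in range(1, len(b_rc) + 1):
            if a[i - 1] == b_rc[j - 1]:
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best = cur[j]
        prev = cur
    return best


def could_trigger_alarm(rna_a: str, rna_b: str, threshold: int = 20) -> bool:
    """True if the two RNAs could form a duplex longer than the threshold."""
    return longest_duplex(rna_a, rna_b) > threshold
```

One consequence of the pairing rule is visible directly: two RNAs composed only of purines give a result of zero, since purines do not pair with purines.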
Among the RNA species of a cell there might be two whose members, by chance, happened to have enough base complementarity for formation of a mutual duplex of a length sufficient to trigger alarms. Thus, there would need to have been an evolutionary selection pressure favouring mutations in host RNAs that decrease the possibility of their interaction with other "self" RNAs in the same cell. In many cases mutations to a purine would assist this, since purines do not pair with purines. Indeed, interaction with “self” RNAs seems to have been avoided by "purine-loading" the loop regions of these RNAs, thus avoiding the initial loop-loop "kissing" reactions which precede more complete formation of dsRNA. The above-mentioned excess of purines, observed both at RNA and at DNA levels (in mRNA-synonymous DNA strands), is found in a wide variety of organisms and their viruses [26,40].
Exploratory "kissing" interactions between hybridizing nucleic acids involve transient base stacking interactions with the exclusion of structured water. Such reactions have a strong entropy-driven component, and so might increase as temperature increases (i.e. fever). Accordingly, purine-loading should be high in thermophiles, as is indeed found [41; R. Lambros, J. Mortimer and D. Forsdyke, unpublished work].
Furthermore, proteins with a tendency to become involved in autoimmune reactions have acquired runs of charged amino acids with no known function at the protein level [42,43]. Charged amino acids correspond to codons rich in purines, which should countermand formation of dsRNA. Thus, the presence of runs of charged amino acids may be a consequence of the need to purine-load RNA, and not vice-versa.
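The link between charged amino acids and purine-rich codons can be checked against the standard genetic code. The sketch below is an illustration only: it weights every codon of an amino acid equally, which is an assumption that ignores real codon usage, and the names `CODON_TABLE` and `mean_purine_fraction` are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, listed with the first base varying slowest and the
# third base fastest, each in the order U, C, A, G ("*" marks a stop codon).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def purine_fraction(codon: str) -> float:
    """Fraction of the three codon positions occupied by A or G."""
    return sum(base in "AG" for base in codon) / 3


def mean_purine_fraction(amino_acids) -> float:
    """Mean purine fraction over all codons of the given one-letter amino acids."""
    codons = [c for c, aa in CODON_TABLE.items() if aa in amino_acids]
    return sum(purine_fraction(c) for c in codons) / len(codons)


if __name__ == "__main__":
    for aa in "KEDR":  # lysine, glutamate, aspartate, arginine
        print(aa, round(mean_purine_fraction(aa), 3))
    print("all codons", round(mean_purine_fraction(set(AMINO)), 3))
```

Lysine and glutamate are encoded entirely by purine-only codons, and the codons of the four charged amino acids taken together are richer in purines than the average over all 64 codons, which is consistent with the argument that runs of charged amino acids could follow from purine-loading.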
A general increase in transcription in cells exposed to “stress” (simulating virus invasion) would dictate a period of preincubation without stress before testing for specific transcription. This has indeed been found as a requirement for studies with freshly explanted human lymphocytes.
Intracellular protein “immune receptors”
Amino acids in proteins do not pair on a one-to-one basis, like bases in nucleic acids. Nevertheless, similar considerations might apply in the case of protein molecules (Figs. 1b, 2). These would form heteroaggregates (aggregates of self-proteins with pathogen proteins), and “not-self” homoaggregates (aggregates of individual pathogen protein species) by mechanisms discussed elsewhere [4,8-10,45,46]. Recent observations of diseases associated with protein aggregation suggest an interconnection between protein “self” and RNA “self” homoaggregates, which may both be required for disease.
Conclusion
While the existence of an intracellular immune system remains unproven, a growing number of disparate observations appear comprehensible from this perspective. Non-genic “junk” DNA can be viewed in much the same way as we view the diverse genes encoding the variable regions of immunoglobulin antibodies. Just as B-cells capable of synthesizing a unique anti-self antibody would be eliminated during somatic time to prevent self-reactivity, so junk DNA would have been screened over evolutionary time (by positive selection of individuals in which favourable mutations had been collected together by recombination) to decrease the probability of two complementary “self” transcripts interacting to form dsRNA segments of more than 20 bases. High polymorphism of non-genic DNA would make it difficult for viruses to anticipate the “immune receptor” RNA repertoire of future hosts. Since viruses can be enriched for either purines or pyrimidines, the repertoire should include both purine-rich and pyrimidine-rich segments (Fig. 3). The initiating event is one of self/not-self discrimination, be it between two RNA species or between two protein species, and be it extracellular or intracellular.
Acknowledgements
We thank Jim Gerlach for assistance with computer configuration, and Jerzy Jurka and coworkers for access to Repbase. The Canadian Bioinformatics Resource (Halifax) provided access to the GCG program suite. Andrew Reynolds kindly provided the Brücke text. Queen’s University hosts DRF’s web pages where full texts of several of the references may be found.
1 Brücke, E. (1861) Die Elementarorganismen. Sitzungsber.
2 Forsdyke, D.R. (1991) Early evolution of MHC polymorphism. J. Theor. Biol. 150, 451-456
3 Forsdyke, D.R. (1992) Two signal model of self/not-self discrimination: J. Theor. Biol.
4 Forsdyke, D.R. (1995) Entropy-driven protein self-aggregation as the basis for
5 Lewis, S.M. (1994) The mechanism of V(D)J joining. Adv. Immunol. 56, 27-150
6 Stephens, J.C. et al. (2001) Haplotype variation and linkage disequilibrium in 313 human genes.
7 Bustamante, C.D. et al. (2000) Solvent accessibility and purifying selection within proteins of Escherichia coli and Salmonella enterica. Mol. Biol. Evol. 17, 301-308
8 Forsdyke, D.R. (2001a) Adaptive value of polymorphism in intracellular self/not-self discrimination. J. Theor. Biol. 210, 425-434
9 Forsdyke, D.R. (2001b) The Origin of Species, Revisited. McGill-Queen’s University Press
10 Forsdyke, D.R. (2001c) Functional constraint and molecular evolution. In Encyclopedia of Life Sciences, vol. 7, pp. 396-403, Nature Publishing Group, London
11 Ohno, S. (1972) So much junk DNA in our genome. Brookhaven Symp. Biol.
12 Pennisi, E. (2002) Charting a genome’s hills and valleys.
13 Mattick, J.S. (2001) Non-coding RNAs: the architects of eukaryotic complexity. EMBO Reps. 2, 986-991
14 Plant, K.E. et al. (2001) Intergenic transcription in the human β-globin gene cluster. Mol. Cell. Biol. 21, 6507-6514
15 Kapranov, P. et al. (2002) Large-scale transcriptional activity in chromosomes 21
16 Jurka, J. et al. (1996) CENSOR: a program for identification and elimination of repetitive elements from DNA sequences. Chem. 20, 119-122
17 Heximer, S.P. et al. (1998) Expression and processing of G0/G1 Switch Gene 24 (G0S24/TIS11/TTP/NUP475) RNA in cultured human blood mononuclear cells. Cell Biol. 17, 249-263
18 Iseli, C. et al. (2002) Long range heterogeneity at the 3’ ends of human
19 J.L. and Colozzo, M.T. (1982) Synthesis in vitro of an exceptionally long transcript promoted by an AluI
20 Feuchter, A.E. et al. (1992) Strategy for detecting cellular transcripts promoted by human endogenous long terminal
21 Ferrigno, O. et al. (2001) Transposable B2 SINE elements can provide mobile RNA polymerase II promoters.
22 Nigumann, P. et al. (2002) Many human genes are transcribed from the antisense promoter of L1 retroposon.
23 Forsdyke, D.R. (1985) Heat shock proteins defend against intracellular J. Theor. Biol.
24 Kim, C. et al. (2001) Genome-wide chromatin remodelling modulates the Alu heat shock response.
25 Russell, L. and Forsdyke, D.R. (1991) A human putative lymphocyte G0/G1 switch gene containing a CpG-rich island encodes a small basic protein with the potential to be phosphorylated. Biol. 10, 581-591
26 Forsdyke, D.R. and Mortimer, J.R. (2000) Chargaff’s legacy.
27 Heximer, S.P. et al. (1996) Sequence analysis and expression in cultured lymphocytes of the human FOSB gene (G0S3). Cell Biol. 12, 1025-1038
28 Eguchi, Y. et al. (1991) Antisense RNA. Annu.
29 Cristillo, A.D. et al. (2001) Double-stranded RNA as a not-self alarm signal. Theor. Biol. 208, 475-491
30 Ehrenfeld, E. and Hunt, T. (1971) Double-stranded poliovirus RNA inhibits initiation of protein synthesis by reticulocyte lysates. Natl. Acad. Sci. USA
31 Elia, A. et al. (1996) Regulation of the double-stranded RNA-dependent protein kinase PKR by RNAs encoded by a repeated sequence of the Epstein-Barr Nucleic Acids Res.
32 Mittelsten Scheid, O. (1999) New tool for Swiss army knife.
33 Suzuki, K. et al. (1999) Activation of target-tissue immune-recognition molecules by double-strand polynucleotides. Natl. Acad. Sci. USA
34 Marcus, P. (1983) Interferon induction by viruses: one molecule of dsRNA as the threshold for induction.
35 Plasterk, R.H.A. (2002) RNA silencing: the genome’s immune system.
36 Martinez, M.A. et al. (2002) RNA interference of HIV replication. Trends Immunol. 23, 559-561
37 Bamford, D.H. (2002) Those magnificent molecular machines: logistics in dsRNA
38 Tian, B. et al. (2000) Expanded CUG repeat RNAs form hairpins that activate
39 Elbashir, S.M. et al. (2001) RNA interference is mediated by 21- and 22-nucleotide RNAs. Devel. 15, 188-200
40 Saul, A. and Battistutta, D. (1988) Codon usage in Plasmodium Mol. Biochem. Parasitol.
41 Lao, P.J. and Forsdyke, D.R. (2000) Thermophilic bacteria strictly obey Szybalski's transcription direction rule and politely purine-load RNAs with both adenine and guanine.
42 Brendel, V. et al. (1991) Very long charge runs in systemic lupus erythematosus-associated Proc. Natl. Acad. Sci. USA
43 Dohlman, J.G. et al. (1993) Long charge-rich alpha-helices in systemic autoantigens. Biophys. Res. Comm.
44 Suzuki, T. et al. (2000) Control selection for RNA quantitation.
45 Forsdyke, D.R. (1999) Heat shock proteins as mediators of "danger" signals: implications of the slow evolutionary fine-tuning of sequences for the antigenicity of cancer cells. Cell Stress Chaperones 4, 205-210
46 Double-stranded RNA and/or heat-shock as initiators of chaperone mode switches in diseases associated with protein aggregation.
47 Peel, A.L. et al. (2001) Double-stranded RNA-dependent protein kinase, PKR, binds preferentially to Huntington’s disease (HD) transcripts and is activated in HD tissue.
End Note (27 Dec 2003)
The above view of the role of "junk DNA" predicted that large sets of low abundance "non-coding transcripts" would be a feature of many eukaryotic genomes and that, in view of the postulated role in intracellular aspects of immunological defenses, they would not be evolutionarily conserved. This was greatly supported by the discovery of multiple "non-coding" transcripts in cDNA libraries prepared from humans and mice. In a paper entitled "Complete Sequencing and Characterization of 21243 full-length human cDNAs", Toshio Ota and coworkers noted:
Ota et al. (2004) Nature Genetics 36, 40-45.
End Note (2006)
The correspondence of Alu elements with min-CpG islands, as seen with the CpG-island-containing G0S2 gene in Figure 3, is supported by new work of Brohede and Rand (2006 Human Genetics 119, 457-458). This suggests that:
End Note (April 2007)
Bacteria appear to have a defence system analogous to that outlined above (and expanded on in my text Evolutionary Bioinformatics, 2006, pp. 270-2). Bacteria have "Clustered Regularly Interspaced Short Palindromic Repeats" (CRISPR) between which are variable spacer sequences that resemble sequences from the viruses that infect bacteria (bacteriophages). It appears that in the course of a "primary" infection these spacers acquire a sequence from the pathogen. When there is a "secondary" infection, this "memory" can be called upon in the bacterium and its progeny, which transcribe it in an orientation such as to generate interfering RNAs which hybridize with the corresponding nucleic acid sequences of the infecting virus, so inactivating the virus.
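The spacer-based "memory" described above can be sketched as a simple sequence lookup. This is a toy illustration, assuming DNA strings over A, C, G and T, a single known repeat sequence, and exact matching; real CRISPR arrays and their processing are more involved, and the function names are hypothetical. Spacers are taken as the segments lying between consecutive repeats, and a spacer counts as a match if it, or its reverse complement, occurs in the phage sequence, so that a transcript in the appropriate orientation could hybridize with the phage nucleic acid.

```python
from typing import List

_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def extract_spacers(array: str, repeat: str) -> List[str]:
    """Return the variable segments lying between consecutive repeats."""
    parts = array.upper().split(repeat.upper())
    # parts[0] precedes the first repeat and parts[-1] follows the last one,
    # so only the interior segments are spacers.
    return [p for p in parts[1:-1] if p]


def spacers_matching_phage(array: str, repeat: str, phage: str) -> List[str]:
    """Spacers found on either strand of the phage sequence."""
    phage = phage.upper()
    return [
        spacer
        for spacer in extract_spacers(array, repeat)
        if spacer in phage or reverse_complement(spacer) in phage
    ]
```

An empty result for a given phage would correspond to a host with no "memory" of that pathogen, and a phage that mutates its target sequence would drop out of the result in the same way.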
Makarova et al. (2006) A putative RNA-interference-based immune system in prokaryotes. Biology Direct 1, 7.
Barrangou et al. (2007) CRISPR provides acquired resistance against viruses in prokaryotes. Science 315, 1709-1712.
End Note (March 2008)
In the heat-shock response transcription is generally repressed, but Alu transcription (by RNA polymerase III) increases. Mariner and colleagues now show that human Alu sequences (and similar sequences in mice) participate in the general repression of genes by binding to RNA polymerase II (the major polymerase for mRNA synthesis).
Transcription, as indicated by [3H]-uridine labelling, is usually high in freshly cultured peripheral blood mononuclear cells. Thus, a response to the lectin Concanavalin-A is best observed after leaving the cells to "rest" for a day. We note above: "A general increase in transcription in cells exposed to “stress” ... would dictate a period of preincubation without stress before testing for specific transcription. This has indeed been found as a requirement for studies with freshly explanted human lymphocytes." Indeed, Baechler et al. report that "hundreds of genes are sensitive to ex vivo handling of blood." By the criterion of decline during rest phase, there appear to be increases in mRNAs corresponding to G0S2 (Fig. 3 above), FosB (G0S3), Fos (G0S7), RGS2 (G0S8), TIS11 (G0S24) and EGR1 (G0S30). But, in keeping with the observations of Mariner et al., many mRNAs, such as G0S19 (CCL3) and RGS1, decrease (Heximer et al. 1997).
Baechler, E. C. et al., (2004) Expression levels of many genes in human peripheral blood cells are highly sensitive to ex vivo incubation. Genes & Immunity 5, 347-353.
Heximer, S. P., Cristillo, A. D. & Forsdyke, D. R. (1997) Comparison of mRNA expression of two regulators of G-protein signaling, RGS1/BL34/IR20 and RGS2/G0S8, in cultured human blood mononuclear cells. DNA Cell Biology 16, 589-598.
Mariner, P. D. et al. (2008) Human Alu RNA is a modular transacting repressor of mRNA transcription during heat shock. Molecular Cell 29, 499-509.
End Note (Jan 2009)
Intriguingly, the phage sequences targeted by the transcripts of the CRISPR repeats (see above) may have, for a particular class of CRISPR, a common purine-rich flanking sequence. By mutating the target sequence a phage can evade the CRISPR host defence system. This is as one might expect, since it is likely that homologous base pairing between host transcript and the phage target sequence is required. However, evasion can also be brought about by mutating the non-targeted purine-rich flanking sequence (Deveau et al. 2008). This is consistent with a requirement for a "kissing" interaction between nucleic acid secondary structures (see Fig. 4 above) prior to hybridization (Forsdyke 2007). A flanking mutation could change the secondary structure and hence hybridization would not occur.
Deveau, H. et al. (2008) Journal of Bacteriology 190, 1401-1412. Diversity, activity, and evolution of CRISPR loci in Streptococcus thermophilus.
Forsdyke, D. R. (2007) Journal of Theoretical Biology 249, 325-330. Molecular sex: the importance of base composition rather than homology when nucleic acids hybridize.
End Note (April 2009)
Faulkner et al. (2009) have shown that many human and mouse transcripts initiate within repetitive elements (LINES, SINES and other retrotransposons). But "retroposon transcripts appear to be less expressed on average than protein-coding mRNAs." They conclude that: "The ultimate function, perhaps after further processing, of transcripts associated with novel retrotransposon promoters deserves future study."
Faulkner, G. J. et al. (2009) Nature Genetics 41, 563-571. The regulated retrotransposon transcriptome of mammalian cells.
End Note (October 2009)
Flegel (2009) proposed a CRISPR-like mechanism involving the conferral of specific resistance to viral pathogens by "immunospecific RNA (imRNA)" in crustaceans and insects. Some problems with this were considered by Forsdyke in the comment section of the paper.
Flegel T. W. (2009) Biology Direct 4, 32. Hypothesis for hereditable, antiviral immunity in crustaceans and insects.
End Note (Jan 2010)
Since we are all heterozygotes for many alleles, mass transcription as part of the "stress" response to foreign intracellular invasion (Fig. 5) could generate proteins encoded by our two parental genomes that, apart from bringing about in utero "phenocopy" effects, might also interact. The discovery of biallelic "promiscuous" thymic expression of certain self antigens during T cell education (under control of the AIRE transcription factor; Kyewski & Derbinski 2004) suggests a central mechanism by which such chance interactions (normal self recognizing normal self) could avoid future triggering of T cells in the periphery.
This hypothesis supposes that the aim of AIRE-induced transcription is not the display of self peptides from all AIRE-dependent transcribed genes, but merely those corresponding to genes whose products interact in a common cytosol, so generating hetero-aggregates for proteasome processing. Failure to eliminate T cells responding to peptides from such aggregates (negative selection) would be sufficient to account for the autoimmune disease seen in AIRE mutants. Under normal circumstances, having eliminated T cells responding to pMHCs corresponding to such interacting promiscuously expressed self-proteins (negative selection), subsequent "promiscuous" expression in the periphery would be part of the response to foreign invasion (Fig. 5). Then the intracellular "antibody" repertoire seeks out protein (or RNA) corresponding to the intruder (normal self recognizing non-self; Gardner & Anderson 2009), and generates a novel coaggregate, peptides from which (as pMHC) activate T cells (positive selection).
Why does each thymic medullary epithelial cell (MEC), under the influence of the AIRE transcription complex, transcribe only a small proportion of the total number of AIRE-transcribable genes? Thus, if the entire set be represented as A-Z, one MEC might transcribe A-C and another D-F, the choice being seemingly random. Thus, if A interacts with Q, this will be evident only in a cell that, by chance, transcribes, say AQW. The 5 day sojourn of T cells in the medulla should suffice for the deletion of those T cells of high affinity/avidity for any MHC-presented A and Q peptides.
At the time of this writing, the conventional wisdom is that all thymic AIRE transcripts are translated into proteins which are then, by some mechanism, displayed as MHC peptide complexes without overloading the mechanism. So peptides from A and Q and W - all three - would be displayed as pMHC on an AQW transcribing MEC, apparently without the necessity for prior protein aggregation. This would somehow suffice to eliminate high affinity/avidity T cells. But this negative selection is likely to require higher pMHC concentrations at the cell surface than positive selection. How are such high concentrations to be achieved?
By restricting a MEC to displaying only part of the A-Z spectrum, the cell is more likely to be able to achieve the required high, specific, pMHC concentration (since there is less competition from other pMHCs). By restricting the display only to members of a part of the A-Z spectrum (e.g. AQW), some members of which can interact (e.g. A + Q), the cell is even more likely to be able to achieve the high pMHC concentration that is needed for negative selection. Subsequently, peripheral promiscuous expression could generate a wide range of proteins (A-Z) in the hope that at least one would generate a novel coaggregate with a pathogen protein. This would lead to pMHC display and peripheral positive selection of T cells as part of the normal immune response.
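The sampling argument above can be put in rough numbers. The following is a minimal sketch, not a model from the source: it assumes, purely for illustration, that there are 26 AIRE-transcribable genes (A-Z), that each MEC transcribes a uniformly random subset of k of them, and that cells choose independently. The function names and the values of k and the cell count are hypothetical.

```python
from math import comb


def prob_pair_coexpressed(n_genes: int, k: int) -> float:
    """Chance that one cell transcribing k of n_genes (chosen uniformly at
    random) includes two specific genes, e.g. A and Q."""
    if k < 2:
        return 0.0
    # Fix the two genes of interest, then choose the other k-2 freely.
    return comb(n_genes - 2, k - 2) / comb(n_genes, k)


def prob_at_least_one_cell(n_genes: int, k: int, n_cells: int) -> float:
    """Chance that at least one of n_cells independent cells co-expresses
    the two specific genes."""
    p = prob_pair_coexpressed(n_genes, k)
    return 1.0 - (1.0 - p) ** n_cells


if __name__ == "__main__":
    # Assumed values: 26 genes, 3 transcribed per cell, 1000 cells encountered.
    print(prob_pair_coexpressed(26, 3))
    print(prob_at_least_one_cell(26, 3, 1000))
```

Under these assumptions any single MEC rarely co-expresses a given interacting pair, but a T cell that encounters many MECs during its medullary sojourn is very likely to meet at least one that does.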
Gardner JM, Anderson MS (2009) Nature Immunol 10. The sickness unto Deaf.
End Note (October 2011)
Just as there is diversification of the
antibody repertoire to confront potentially harmful extracellular agents,
so, in an emergency ("stress"), it might be predicted that there would be
short-term diversification of the "RNA antibody" repertoire to confront
potentially harmful intracellular agents (e.g. the nucleic acid thereof).
It would be more important, in the short term, to protect the cell, than
to optimally perform an RNA's usual function. Thus, we should not be
surprised (although we are!) that Carmi et al.
(2011) report widespread ultra-editing of RNAs that is likely to be
mediated by ADARs (adenosine deaminases acting on double-stranded RNAs).
Interestingly, most ultra-editing is seen with RNAs extracted from a
liver, part of which had been subjected to the stress of partial
hepatectomy. Referring to Samuel (2011), the authors suggest: "The
extreme number of ultra-edited RNAs from a regenerating liver library may
also indicate induction of ADAR1 due to stress, possibly a viral
infection." As an additional benefit, since most mRNAs have much secondary
structure (probably because the corresponding gene requires such
structure), ADARs that convert the purine A (adenine) to the purine I
(inosine) would also decrease RNA secondary structure, making it easier to
hybridize when the sequence of the RNA was complementary to the nucleic
acid of a potentially pathogenic agent. Carmi et al. also note: "Ultra-edited
RNAs exhibit the known sequence motif of ADARs and tend to localize in
sense strand Alu elements," and that "ultra-editing occurs primarily in
Carmi S, Borukhov I, Levanon EY (2011) PLOS Genetics 7, e1002317. Identification of widespread ultra-edited human RNAs.
Samuel CE (2011) Virology 411, 180-193. Adenosine deaminases acting on RNA (ADARs) are both antiviral and proviral.
End Note (December 2012)
Leonova et al. (2012) report "massive transcription" of repetitive elements when the
tumor suppressor p53 is inactivated and cultured mouse cells are treated with a
demethylating agent likely to remove the (generally inhibitory) 5-methyl
group from DNA cytosine residues. Noting that under "stress" conditions
such repetitive elements are normally transcribed, so increasing the
probability of dsRNA formation, they suggest that the inhibitory effects
of p53 and DNA methylation are overcome under such conditions. The
interferon response follows.
Leonova KI et al. (2012) Proc. Natl. Acad. Sci. USA (early edition). p53 cooperates with DNA methylation and a suicidal interferon response to maintain epigenetic silencing of repeats and noncoding RNAs.
End Note (Mar 2013)
Zabolotneva et al. (2010) have also suggested an RNA-based "intracellular 'immune system'," where alarms begin ringing when an RNA of viral origin forms dsRNA with an 'RNA antibody' of host origin. Like us, they suspect that it may partly explain the variable quantities of 'junk DNA' found in genomes. Furthermore, they propose, and present bioinformatic analyses supporting, the idea that "Casual [random] combinations of nucleotides in ... the genome might create new DNA motifs that theoretically, after being transcribed, could be used by the host organism as tool for recognition and targeting of intracellular pathogen transcripts. Novel transcribed [host] DNA motifs that would target the host genes [i.e. 'self'] would be eliminated from the genome, whereas those that complementarily match the pathogen RNAs would be positively selected. Neutral motifs [yet to find a pathogen target but not interacting with 'self'] could be 'stored' in the genomes as ordinary non-coding DNA."
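The three fates proposed in the quotation (eliminated, positively selected, stored) can be illustrated with a toy classifier. This is a minimal sketch under strong simplifying assumptions and is not the bioinformatic method of Zabolotneva et al.: "targeting" is reduced to an exact reverse-complement match over the whole motif, and all sequences and function names are hypothetical.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Return the sequence that would pair, antiparallel, with rna."""
    return rna.translate(COMPLEMENT)[::-1]


def classify_motif(motif: str, host_transcripts, pathogen_transcripts) -> str:
    """Assign a transcribed host motif to one of the three proposed fates."""
    target = reverse_complement(motif)
    # A motif able to pair with a host ('self') transcript is selected against.
    if any(target in t for t in host_transcripts):
        return "eliminated"
    # A motif able to pair with a pathogen transcript is selected for.
    if any(target in t for t in pathogen_transcripts):
        return "positively selected"
    # Otherwise the motif is neutral and persists as non-coding DNA.
    return "stored"


if __name__ == "__main__":
    host = ["GGGCCAUAA"]       # made-up host transcript
    pathogen = ["UUACGGAUU"]   # made-up pathogen transcript
    for motif in ("AUGGC", "AUCCG", "AAAAA"):
        print(motif, classify_motif(motif, host, pathogen))
```

Checking 'self' before the pathogen reflects the priority implied by the quotation: a motif that targets host genes is removed whether or not it also matches a pathogen.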
Zabolotneva A, Tkachev V, Filatov F, Buzdin A (2010) Biology Direct 5, 62. How many antiviral small interfering RNAs may be encoded by the mammalian genome?
End Note (July 2016)
Enard et al. (2016) "conservatively estimate" that "viruses have driven close to 30% of
all adaptive amino acid changes in the part of the human proteome
conserved within mammals." Such "virus interacting proteins" vastly exceed
the known proteins that regularly engage in immune responses to viruses
(e.g. protein kinase R). This is consistent with our above suggestion of
intracellular protein "immune receptors." Thus, over evolutionary time a
protein that primarily evolved for a distinct function, but also happened
to cross-react with some virus component, would in addition be selected by
virtue of the latter function.
Enard D, Cai L, Gwennap C, Petrov DA (2016) eLife 5:e12469. Viruses are a dominant driver of protein adaptation in mammals.
Several studies suggest that ADAR hyperediting is primarily aimed to prevent formation of self-dsRNAs with strong secondary structure, thus assisting discrimination from not-self RNAs that might have more G-C pairs (i.e. more stable structures; 1-3). It is suggested above that more open (i.e. weaker) self-RNA structures will expand the repertoire of "antibody RNAs" with the potential to recognize and target (form dsRNA with) intracellular pathogen transcripts.
1. Mannion et al. (2014) The RNA-editing enzyme ADAR1 controls innate immune responses to RNA. Cell Reports 9: 1482-94.
2. Liddicoat et al. (2015) RNA editing by ADAR1 prevents MDA5 sensing of endogenous dsRNA as nonself. Science 349: 1115-20.
3. Savva et al. (2016) Reprogramming, circular reasoning and self versus non-self: one-stop shopping with RNA editing. Frontiers in Genetics 7, article 100.
Placed here in 2002 and last edited on 07 Aug 2016 by Donald Forsdyke