Topic: The Canonical Genetic Code, Resources & arguments
Wesley R. Elsberry

Posts: 4966
Joined: May 2002

Posted: May 23 2002,08:35

A popular antievolution argument often deployed by "intelligent design" advocates concerns the non-universality of the genetic code. Those who have made it include Jonathan Wells, Paul Nelson, Stephen Meyer, Michael Behe, and Cornelius G. Hunter.

I'm opening this thread for discussion of the canonical genetic code and its variants, and also for examination of the claims of ID advocates concerning the canonical code.

Kenneth Miller of Brown University has a couple of essays on this topic: here and here.

"You can't teach an old dogma new tricks." - Dorothy Parker

Wesley R. Elsberry

Posts: 4966
Joined: May 2002

Posted: May 23 2002,08:55

The canonical genetic code

It is tough to represent tabular information in a proportional font.  So I will present a list representing the canonical genetic code.  There are three items separated by commas in each line.  The first is a representation of a codon, a nucleotide triplet.  Each base is represented by a letter: "a" for adenine, "g" for guanine, "c" for cytosine, and "u" for uracil.  The second item is either an amino acid or a "stop", where each amino acid is represented by a three-letter abbreviation.  The third item is a single-letter code for an amino acid.


Here's a page with a table showing the canonical code and the full expansion of the abbreviations.
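The list format described above (codon, three-letter abbreviation, single-letter code) can be regenerated with a short script. This is a minimal sketch, not the original list: the compact 64-character string, the "ucag" ordering, and the use of "*" as the single-letter marker for "stop" are conventions assumed here for illustration.

```python
# Rebuild the canonical genetic code as "codon, three-letter, one-letter" lines.
# Assumption: codons are ordered by first, second, third base over "ucag",
# and "*" stands in as the one-letter marker for a stop codon.
from itertools import product

BASES = "ucag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "stop",
}

# Map each of the 64 codons to its one-letter amino acid (or "*").
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def code_lines():
    """Return one 'codon, three-letter, one-letter' string per codon."""
    return [f"{codon}, {THREE_LETTER[aa]}, {aa}" for codon, aa in CODE.items()]


print("\n".join(code_lines()))
```

The first line printed is "uuu, Phe, F"; the three stop codons appear as "uaa, stop, *", "uag, stop, *" and "uga, stop, *".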


Wesley R. Elsberry

Posts: 4966
Joined: May 2002

Posted: May 23 2002,09:48

How many codes could there be?

The simple answer is "lots".  The canonical genetic code has 64 entries coding for 21 different things (20 amino acids plus a "stop" signal).  It is called "degenerate", which is a fancy way of saying that there are more codons than there are things coded for, which results in some redundancy in the canonical code.  If you look closely at how codons are matched with amino acids, you will likely notice that in many cases a change in the third base of the codon results in no change in the coded-for amino acid.  This results in a typical clustering of codons, so that a change of one base has roughly a one-in-four chance (138 of the 576 possible single-base changes, about 24%) of causing no change at all in what is coded for.  In other words, the canonical genetic code is not as "brittle" as it could be.  I hope to explore that more thoroughly later.
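The share of single-base changes that leave the coded-for item unchanged can be counted directly. The sketch below is illustrative only: it assumes every one of the 9 possible single-base changes to each of the 64 codons is equally weighted, and it counts a stop-to-stop change as "no change".

```python
# Count how many single-base changes to a codon leave its meaning unchanged.
# Assumption: all 64 * 9 = 576 single-base changes are weighted equally.
from itertools import product

BASES = "ucag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def silent_changes(code):
    """Return (silent, total) over every single-base change of every codon."""
    silent = total = 0
    for codon, aa in code.items():
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a change
                mutant = codon[:pos] + base + codon[pos + 1:]
                total += 1
                silent += code[mutant] == aa
    return silent, total


silent, total = silent_changes(CODE)
print(f"{silent} of {total} single-base changes are silent ({silent / total:.1%})")
```

Nearly all of the silent changes fall at the third codon position, which is the clustering described above.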

But back to the question of interest.  How many "genetic codes" could there be?  Let me be clear here.  The phrase "genetic code" is sometimes sloppily used to refer to the specific sequence of bases observed in the genome of an organism.  That's not the way I am using it here.  The "genetic code" is used here to mean the way in which triplets of nucleotide bases are mapped to corresponding amino acids for the purpose of protein synthesis.  Figuring out how many different ways such a code can be instantiated can be approached through combinatorial "counting rules".

The first "counting rule" of interest is the factorial function.  Given some positive integer number n of items, the factorial is defined as the product of every positive integer greater than or equal to one and less than or equal to n.  The number of different ways 64 symbols can be represented as a sequence is factorial(64) (or 64!), or about 1.268869e89.  Since a degenerate genetic code doesn't have 64 different symbols, but rather 64 positions for symbols, this represents an upper bound on the number of possible genetic codes using triplet codons.
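As a quick check of the 64! figure, here is a minimal sketch using Python's exact integer arithmetic (the variable names are mine):

```python
# Upper bound: the number of orderings of 64 distinct symbols is 64!.
import math

orderings = math.factorial(64)   # exact integer
print(f"64! = {orderings:.6e}")  # 64! = 1.268869e+89
print(f"digits: {len(str(orderings))}")
```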

So what counting rule gives us what we want?  The answer is the "partition rule".  This tells us the number of ways that k different symbols can be arranged to fill n spaces when we know how many of each of the k symbols there are.  The rule is

n! / (m_1! * m_2! * ... * m_k!)

The sum of m_1 through m_k = n
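To see the partition rule at work on something small enough to enumerate, the sketch below compares the formula with a brute-force count of distinct arrangements. The toy string "AABBC" and the function names are my own illustration, not part of the argument.

```python
# Partition rule checked against brute-force enumeration on a toy example.
import math
from itertools import permutations


def arrangements(counts):
    """n! / (m_1! * m_2! * ... * m_k!), where n is the sum of the counts."""
    denominator = math.prod(math.factorial(m) for m in counts)
    return math.factorial(sum(counts)) // denominator


def brute_force(symbols):
    """Count distinct orderings by listing them all (small inputs only)."""
    return len(set(permutations(symbols)))


toy = "AABBC"  # n = 5 spaces, with m = 2, 2, 1
counts = [toy.count(s) for s in sorted(set(toy))]
print(arrangements(counts), brute_force(toy))  # 30 30
```

Enumeration is hopeless at n = 64, which is why the closed-form rule is needed for the real code.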

For 21 symbols, the worst case situation would be if most of the code specified a single amino acid.  This occurs if one symbol is repeated 44 times and the remaining symbols have 1 instance each.  In this case, application of the partition rule tells us that there are 4.8e34 possible codes of that sort.

The best case situation is where all the symbols are as nearly evenly represented as possible.  This is the case when one symbol has 4 instances and the remaining 20 symbols each have 3 instances.  In this case, there are about 1.4e72 possible different codes of that sort.

If we take the distribution of symbols in the canonical code, we have 64! / (4!6!2!2!2!2!2!4!2!3!6!2!1!2!4!3!6!4!1!2!4!), or about 2.3e69 possible different codes of that sort.

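It is interesting that the actual canonical genetic code has a distribution that would permit almost as many variants as the very best case situation: about 2.3e69 against 1.4e72, a gap of under three orders of magnitude, while the worst case sits some 37 orders of magnitude below the best.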
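The three figures above can be reproduced with exact integer arithmetic. In this sketch the canonical distribution is tallied from a compact 64-character encoding of the code ("*" for stop), which is an assumed representation; the labels are mine.

```python
# Apply the partition rule to the worst case, best case, and canonical code.
import math
from collections import Counter

AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def arrangements(counts):
    """n! / (m_1! * ... * m_k!) for a list of symbol counts."""
    denominator = math.prod(math.factorial(m) for m in counts)
    return math.factorial(sum(counts)) // denominator


cases = {
    "worst (44 + 20 x 1)": [44] + [1] * 20,
    "best (4 + 20 x 3)": [4] + [3] * 20,
    "canonical": sorted(Counter(AMINO).values()),
}

for label, counts in cases.items():
    assert sum(counts) == 64 and len(counts) == 21
    print(f"{label}: {float(arrangements(counts)):.1e}")
```

This prints 4.8e+34, 1.4e+72 and 2.3e+69 respectively, matching the values quoted in the post.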

Next up will be considering what the numbers mean for evolutionary biology.



Posts: 97
Joined: May 2002

Posted: May 23 2002,16:32

A while back I had a discussion about the DI's wacko response to the PBS evolution series here.  Scroll down a few messages.  There are some links to some good recent papers in that thread.  The references are:

Trends in Biochemical Sciences 2001, 26:591-596

Proc Natl Acad Sci U S A 1995 Mar 92:2441-5

J. Biol. Chem., Vol. 276, Issue 10, 6881-6884, March 9, 2001

Proc. Natl. Acad. Sci. USA, Vol. 97, Issue 15, 8392-8396, July 18, 2000

Annu. Rev. Biochem. 2000. 69:617-650.



Posts: 36
Joined: May 2002

Posted: May 25 2002,12:35

Quote (theyeti @ May 23 2002,16:32)
A while back I had a discussion about the DI's wacko response to the PBS evolution series here.  Scroll down a few messages.  There are some links to some good recent papers in that thread.  The references are:

Trends in Biochemical Sciences 2001, 26:591-596

Proc Natl Acad Sci U S A 1995 Mar 92:2441-5

J. Biol. Chem., Vol. 276, Issue 10, 6881-6884, March 9, 2001

Annu. Rev. Biochem. 2000. 69:617-650.


Let me add some links to the references

Proc. Natl. Acad. Sci. USA, Vol. 97, Issue 15, 8392-8396, July 18, 2000 Interpreting the universal phylogenetic tree, Carl R. Woese

This article has been cited by others; here are a few quotes:


Archaeal Phylogeny Based on Ribosomal Proteins
Oriane Matte-Tailliez , Céline Brochier , Patrick Forterre and Hervé Philippe

Until recently, phylogenetic analyses of Archaea have mainly been based on ribosomal RNA (rRNA) sequence comparisons, leading to the distinction of the two major archaeal phyla: the Euryarchaeota and the Crenarchaeota. Here, thanks to the recent sequencing of several archaeal genomes, we have constructed a phylogeny based on the fusion of the sequences of the 53 ribosomal proteins present in most of the archaeal species. This phylogeny was remarkably congruent with the rRNA phylogeny, suggesting that both reflected the actual phylogeny of the domain Archaea even if some nodes remained unresolved. In both cases, the branches leading to hyperthermophilic species were short, suggesting that the evolutionary rate of their genes has been slowed down by structural constraints related to environmental adaptation. In addition, to estimate the impact of lateral gene transfer (LGT) on our tree reconstruction, we used a new method that revealed that 8 genes out of the 53 ribosomal proteins used in our study were likely affected by LGT. This strongly suggested that a core of 45 nontransferred ribosomal protein genes existed in Archaea that can be tentatively used to infer the phylogeny of this domain. Interestingly, the tree obtained using only the eight ribosomal proteins likely affected by LGT was not very different from the consensus tree, indicating that LGT mainly brought random phylogenetic noise. The major difference involves organisms living in similar environments, suggesting that LGTs are mainly directed by the physical proximity of the organisms rather than by their phylogenetic proximity

Proc. Natl. Acad. Sci. USA, Vol. 98, Issue 3, 805-808, January 30, 2001 The universal nature of biochemistry  Norman R. Pace

Next reference

Lluís Ribas de Pouplana, Paul Schimmel, Aminoacyl-tRNA synthetases: potential markers of genetic code development, Trends in Biochemical Sciences 26 (10) (2001) pp. 591-596.

Operational RNA code for amino acids in relation to genetic code in evolution.  Ribas de Pouplana, L ... Schimmel P
J Biol Chem 2001 Mar 9;276(10):6881-4.

Some links to research

AMINOACYL-TRNA SYNTHESIS  Annu. Rev. Biochem. 2000 , Vol. 69: 617-650.


Aminoacyl-tRNAs are substrates for translation and are pivotal in determining how the genetic code is interpreted as amino acids. The function of aminoacyl-tRNA synthesis is to precisely match amino acids with tRNAs containing the corresponding anticodon. This is primarily achieved by the direct attachment of an amino acid to the corresponding tRNA by an aminoacyl-tRNA synthetase, although intrinsic proofreading and extrinsic editing are also essential in several cases. Recent studies of aminoacyl-tRNA synthesis, mainly prompted by the advent of whole genome sequencing and the availability of a vast body of structural data, have led to an expanded and more detailed picture of how aminoacyl-tRNAs are synthesized. This article reviews current knowledge of the biochemical, structural, and evolutionary facets of aminoacyl-tRNA synthesis.


Posts: 319
Joined: May 2002

Posted: May 26 2002,00:55

Ooh, another great opportunity for collaboration.  This is a huge topic with a lot of literature, so unless one happens to be a biochemist who did their PhD on the topic it is hard for one person to scrape together the diverse information & references that are necessary to explain the problems with Paul Nelson's pseudoargument to the public.  I think Ken Miller's replies had some difficulty in this regard (and I don't even recall the statement "universal genetic code" being in the actual Evolution series -- is this just me missing it or is it actually there?)

The most complete presentation of Nelson v. common descent that I can recall was a longish talk that I recall listening to online -- but I can't find it at the moment.  Is there a good online essay where Paul Nelson actually lays out the argument from "the genetic code isn't quite universal" to the conclusion "common descent is false [to some unspecified degree]"?

Anyhow, Nelson's argument went like this:

- the code was thought to be universal, and this was crowning evidence for common descent: the code couldn't change, because intermediate stages would be fatal, so it must have come from a common ancestor

[actually, it was just one of many pieces of evidence, but whatever]

- but it's not quite universal, therefore either:

(a) the code can change after all
(b) common descent is false

- Nelson doesn't like (a), citing a (single) paper that criticizes another (single) scientist's proposal about how a codon assignment could change.

- therefore evolutionists are dogmatically clinging to an auxiliary hypothesis that is shielding their main theory from rigorous testing.

I'm sure I'm oversimplifying, I heard the talk last year, but that's basically it.

However, I recall doing some digging on these arguments for an ARN post or two, I will see if I can find them...

Hmm, as usual the ARN UBB search engine is proving useless.  Well, here's some general points regarding "deviant/noncanonical genetic codes":

(1) Deviant genetic codes are most common in critters/organelles with small or otherwise weird genomes, e.g. ciliates (which Nelson specifically mentions IIRC):


Pubmed link

The molecular basis of nuclear genetic code change in ciliates

Quote: "Most changes in the genetic code involve termination: this may be because stop codons are rare, occurring only once per gene, and so changes in termination are likely to be less deleterious than change in sense codons. This would be particularly true for those species of ciliates whose genes reside on gene-sized chromosomes and/or have short 3' untranslated regions. In addition, termination is a competition for stop-codon-containing ribosomal A sites between release factors and tRNAs. Consequently, relatively small changes either in the tRNAs or in eRF1 may shift this balance toward partial or complete readthrough in some cases. For instance, Bacillus subtilis uses in-frame UGA codons extensively to encode tryptophan; however, this readthrough is inefficient, and UGA is also used as a stop codon [33, 34]. The abundance of stop codon reassignments relative to amino acid codon reassignment, however, could also be an observer bias. In-frame stop codons are much easier to detect in protein coding sequences than amino acid replacements, especially if the latter have similar properties."

(2) Some organisms, extant today, have ambiguous codon assignments (i.e. one codon codes for both an amino acid and 'stop' at the same time), proving that this is not necessarily a fatal situation, contra Paul Nelson.

[I've seen this stated in an article somewheres, if anyone else finds examples they might post them.  They pretty clearly refute the "transitional stages impossible" contention.]

(3) Deciding whether or not the code is optimal, how optimal it is, and how much of it is a potential "frozen accident" is by no means the simple question Nelson seems to assume.

The paper below argues for optimality in at least one sense, but note the back-and-forth, and how what constitutes "optimal" may differ for different organisms at different times (which may in turn result in the evolution of code deviants).


Pubmed link -- free online BTW

Mol Biol Evol 2000 Apr;17(4):511-8

Early fixation of an optimal genetic code.

Freeland SJ, Knight RD, Landweber LF, Hurst LD.

Department of Ecology, Princeton University, University of Bath, Bath, England.

The evolutionary forces that produced the canonical genetic code before the last universal ancestor remain obscure. One hypothesis is that the arrangement of amino acid/codon assignments results from selection to minimize the effects of errors (e.g., mistranslation and mutation) on resulting proteins. If amino acid similarity is measured as polarity, the canonical code does indeed outperform most theoretical alternatives. However, this finding does not hold for other amino acid properties, ignores plausible restrictions on possible code structure, and does not address the naturally occurring nonstandard genetic codes. Finally, other analyses have shown that significantly better code structures are possible. Here, we show that if theoretically possible code structures are limited to reflect plausible biological constraints, and amino acid similarity is quantified using empirical data of substitution frequencies, the canonical code is at or very close to a global optimum for error minimization across plausible parameter space. This result is robust to variation in the methods and assumptions of the analysis. Although significantly better codes do exist under some assumptions, they are extremely rare and thus consistent with reports of an adaptive code: previous analyses which suggest otherwise derive from a misleading metric. However, all extant, naturally occurring, secondarily derived, nonstandard genetic codes do appear less adaptive. The arrangement of amino acid assignments to the codons of the standard genetic code appears to be a direct product of natural selection for a system that minimizes the phenotypic impact of genetic error. Potential criticisms of previous analyses appear to be without substance. That known variants of the standard genetic code appear less adaptive suggests that different evolutionary factors predominated before and after fixation of the canonical code. 
While the evidence for an adaptive code is clear, the process by which the code achieved this optimization requires further attention.
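The kind of comparison the abstract describes (scoring the canonical code against alternative codes for how much damage single-base errors do) can be sketched in a few lines. This is a simplified stand-in under stated assumptions, not the authors' method: it uses Kyte-Doolittle hydropathy as the similarity measure rather than their substitution-frequency data, weights all single-base changes equally, and generates alternatives by permuting which amino acid each synonymous block encodes while leaving the stop codons in place. The function names are hypothetical.

```python
# Toy error-minimization comparison: canonical code vs. shuffled codes.
# Assumptions: Kyte-Doolittle hydropathy as the amino acid property, equal
# weighting of all single-base changes, block structure held fixed.
import random
from itertools import product

BASES = "ucag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5,
    "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9,
    "M": 1.9, "F": 2.8, "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9,
    "Y": -1.3, "V": 4.2,
}


def error_cost(code):
    """Mean squared hydropathy change over single-base changes between sense codons."""
    total, n = 0.0, 0
    for codon, aa in code.items():
        if aa == "*":
            continue
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue
                other = code[codon[:pos] + base + codon[pos + 1:]]
                if other == "*":
                    continue  # changes to stop are ignored in this sketch
                total += (HYDROPATHY[aa] - HYDROPATHY[other]) ** 2
                n += 1
    return total / n


def shuffled_code(code, rng):
    """Permute the amino acids among the synonymous blocks; stops stay put."""
    aas = sorted(set(code.values()) - {"*"})
    perm = aas[:]
    rng.shuffle(perm)
    relabel = dict(zip(aas, perm))
    relabel["*"] = "*"
    return {codon: relabel[aa] for codon, aa in code.items()}


rng = random.Random(0)
canonical = error_cost(CODE)
random_costs = [error_cost(shuffled_code(CODE, rng)) for _ in range(1000)]
better = sum(cost < canonical for cost in random_costs)
print(f"canonical cost {canonical:.2f}; "
      f"{better} of {len(random_costs)} shuffled codes score lower")
```

The result depends heavily on the property chosen and on which alternative codes are allowed, which is exactly the back-and-forth the abstract describes.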

...and also note the rather unambiguous first sentence of the introduction of this article:


All known nonstandard genetic codes appear to be secondarily derived minor modifications of the canonical code (Osawa 1995).

Here is their conclusion FYI:


The Mechanism of Adaptive Code Evolution

This leads to the question of the evolutionary mechanisms responsible for an adaptive canonical code. The many models of precanonical code evolution, reviewed extensively elsewhere (Knight, Freeland, and Landweber 1999), permit two major possibilities: that an adaptive code was selected from a large pool of variants, or that an adaptive code arose de novo by code expansion (or simplification) within adaptive, error-minimizing constraints. Individual codon reassignments, necessary for adaptive code shuffling, are certainly possible, but the question remains unresolved, and two lines of evidence increasingly favor the latter explanation.

First, the notion of code expansion from a simpler primordial form, although still lacking in detail, is now associated with a diverse body of empirical and phylogenetic evidence (Knight, Freeland, and Landweber 1999). It seems unlikely that clear patterns of biosynthetic relatedness would be found in a code which had undergone extensive codon assignment shuffling. Additionally, while adaptive code structure is unlikely to be an artifact of a stereochemically determined code, empirical evidence suggests that stereochemistry is not without a role. For example, RNA molecules artificially selected to bind Arginine contain disproportionately many CGN/AGR codons (Knight and Landweber 1998). If all or most amino acids show stereochemical affinities for their corresponding codons, this would suggest that natural selection worked in concert with stereochemical interactions and biosynthetic expansion to produce the canonical code de novo, "choosing" the current 20 amino acids as those that satisfied criteria for both stereochemical affinity and error minimization. This interpretation would thus offer a novel insight into the selection of the proteinaceous amino acids from the near-infinite possibilities of both prebiotic syntheses and biosynthetic modification.


We have presented comprehensive evidence that the standard genetic code is a product of natural selection to minimize the phenotypic impact of genetic error; the arrangement of codon assignments meets, to an extraordinary degree, the predictions of the adaptive hypothesis and cannot be explained as an artifact of stereochemistry, biosynthetically mediated code expansion, or analytical methodology. However, the process by which an adaptive code evolved at present remains unclear, and yet its resolution may be of key importance to our understanding of the amino acid components universal to life.

This is the Osawa reference which looks to be key:

Osawa, S. 1995. The evolution of the genetic code. Oxford University Press, Oxford, England.

Anyhow, as usual when one begins to investigate the actual biology of an ID argument, one finds that the IDists are taking a thoroughly myopic view instead of looking at the broad range of evidence that is necessary.

Thanks, nic


Posts: 97
Joined: May 2002

Posted: May 28 2002,09:54


The most complete presentation of Nelson v. common descent that I can recall was a longish talk that I recall listening to online -- but I can't find it at the moment.  Is there a good online essay where Paul Nelson actually lays out the argument from "the genetic code isn't quite universal" to the conclusion "common descent is false [to some unspecified degree]"?

Here are a few links from various places.

PBS Charged with "False Claim" on "Universal Genetic Code"  From ARN.  The title says it all...

A "Dying Theory" Fails Again  Miller's response.

Reply To Kenneth Miller On The Genetic Code.  The DI's wacky response to Miller.

DI Fails Again to "Crack" the Code.  Miller's second response.

That's it for the DI vs. Miller spat AFAIK.  A couple of others:

Is Common Descent an Axiom of Biology?  This one's by Nelson.  Scroll down a bit and you'll see his genetic code arguments.

Should We Stop Criticizing the Doctrine of Universal Common Ancestry?  By our dear friend Jonathan Wells.  He uses the genetic code argument, along with a number of other really bad ones.  I don't think that Wells does much more than assert his position, but this is a good article to keep handy next time someone says that Wells accepts common ancestry, though he's ambiguous as usual.


Posts: 319
Joined: May 2002

Posted: May 30 2002,00:51

Here is a whole double issue of JME with a large group of articles devoted to evolution-of-genetic-code issues:

Volume 53 - Number 4/5, 2001


Lluís Ribas de Pouplana, James R. Brown, Paul Schimmel
Structure-Based Phylogeny of Class IIa tRNA Synthetases in Relation to an Unusual Biochemistry

David H. Ardell, Guy Sella
On the Evolution of Redundancy in Genetic Codes

Yoshikazu Nakamura
Molecular Mimicry Between Protein and tRNA

Shigehiko Kanaya, Yuko Yamada, Makoto Kinouchi, Yoshihiro Kudo, Toshimichi Ikemura
Codon Usage and tRNA Genes in Eukaryotes: Correlation of Codon Usage Diversity with Translation Efficiency and with CG-Dinucleotide Usage as Assessed by Multivariate Analysis

Robin D. Knight, Laura F. Landweber, Michael Yarus
How Mitochondria Redefine the Code

Shin-ichi Yokobori, Tsutomu Suzuki, Kimitsuna Watanabe
Genetic Code Variations in Mitochondria: tRNA as a Major Determinant of Genetic Code Plasticity

Funny that Paul Nelson's views were not included, eh?

I just received a good private message on this topic from a new poster & I encouraged him to post it in the general discussion, I'll then post a link to it from this thread.



Posts: 319
Joined: May 2002

Posted: May 30 2002,01:14

One more article.

Here's the short version of the case as I now understand it.

- In the beginning, scientists thought the genetic code was universal (maybe; this is the standard line, but whether all relevant experts actually assumed this initially seems uncertain to me, and I've not seen any analysis of the topic).

- In the 1980s it was documented that this was not the case.

- In the late 1980s Osawa proposed the "codon disappearance" theory for the evolution of code changes, described in the Schultz & Yarus (1996) article referenced below thus:


Codon reassignment to new amino acids within large, complex, relatively modern genomes (Osawa et al. 1992) poses interesting mechanistic problems. Osawa and Jukes have proposed (1989), and reaffirm in recent publications (1992; Osawa and Jukes 1995), that during codon reassignment every example of a codon in an entire genome must mutate or otherwise disappear as a result of mutational change in genomic GC content. Subsequent to its total disappearance, a codon can be captured by, e.g., an anticodon mutation in a dispensable tRNA, thereby reappearing with a new identity. We will call this the "codon disappearance" theory, after its characteristic intermediate state.

- In the mid-1990s another theory was proposed, apparently right in Schultz & Yarus' 1996 article:


We find the absolute disappearance of hundreds, thousands, or tens of thousands of examples of a codon by mutation pressure alone, in diverse independent cases, an improbable evolutionary scenario. Total disappearance should be an extremely slow occurrence, because mutation pressure and genetic drift in large populations are among the weakest evolutionary forces, producing only very slow changes in genomic composition. Furthermore, back mutation increases in effectiveness as the goal is approached because of the accumulation of codons related to the disappearing codon by single mutation. Finally, complete disappearance of codons in eukaryotes would be hindered by coherent areas of varied GC content along chromosomes (Sharp and Lloyd 1993; Ikemura and Wada 1991). Because codon choice follows GC content, such areas can provide sheltered enclaves for particular codons (Santos and Tuite 1995).

Though total disappearance is difficult to prove, mutation pressure certainly causes codon frequencies to change. Evolution to very low frequencies and inefficient translational function is well supported (e.g., Kano et al. 1993). But we argue that mutation and drift in codon frequency over entire genomes are vulnerable to being overtaken by faster evolutionary processes such as selection. Thus the question: Are there plausible faster processes, perhaps selection-driven processes, for codon reassignment?

Schultz and Yarus characterized a nonanticodon tRNA site (1994a,b) where particular nucleotide sequences allow a tRNA to read an unusual near-cognate codon. More generally, several sites are known where single mutations in nonanticodon nucleotides (reviewed in Yarus and Smith 1995) enhance tRNA ability to read (at least) two codons, (at least) one of which is forbidden by normal base-pairing and wobble rules. Schultz and Yarus suggested (1994c) that such equivocal adapters could catalyze codon reassignment for one of the codons being ambiguously read. (For clarity in what follows, a codon read in more than one way is said to be "ambiguous"; a tRNA which reads normal codons as well as codons not normally assigned is said to be "equivocal.") In the particular case in which reassignment is initiated by a mutation that impairs normal translation of a codon, reassignment via an equivocal adapter tRNA might evolve quickly by selection for improved translation of the newly ambiguous codon. Transitional coding ambiguity could finally be resolved by, for example, loss or mutation of the original tRNA, and anticodon mutation to equivocal complementarity in the new (equivocal) tRNA, so that the amino acid of the previously equivocal tRNA is reassigned. We will call this the "ambiguous intermediate" theory.

Here is the reference, and some of Schultz & Yarus' (1996) lines of evidence for the ambiguous intermediate theory:


JME link
J Mol Evol 1996 May;42(5):597-601

On malleability in the genetic code.

Schultz DW, Yarus M.

To explain now-numerous cases of codon reassignment (departure from the "universal" code), we suggest a pathway in which the transformed codon is temporarily ambiguous. All the unusual tRNA activities required have been demonstrated. In addition, the repetitive use of certain reassignments, the phylogenetic distribution of reassignments, and the properties of present-day reassigned tRNAs are each consistent with evolution of the code via an ambiguous translational intermediate.


Firstly: at the heart of our proposal lies the supposition that codons are read ambiguously by two tRNAs (or a tRNA and an RF, in the case of terminators), specifying insertion of more than one amino acid (or an amino acid as well as stop). In contrast, the assumption that codons vanish before reassignment, which is characteristic of codon disappearance theory, is mandated by the assertion that codons cannot have two meanings.

In strict form, this axiom of nonambiguity contradicts chemical principle. An infinite free energy difference between reaction pathways is required to select one reactant and reject another absolutely. The strict absence of ambiguity is also contradicted by experiment. Cumulative missense translation in normal E. coli has been estimated at 4 × 10^-4 per codon (Ellis and Gallant 1982). Total miscoding per peptide chain is the much larger sum over the hundreds of codons in the protein. Therefore an appreciable basal ambiguity (yielding ~ 10% of the average 250-amino-acid protein with a variant sequence) is evident, and tolerated, in wild-type cells.

Further, cells are unharmed even when this substantial basal ambiguity is increased dramatically. We have constructed strains containing equivocal E. coli tRNAs that demonstrate suppressor efficiencies of 50 to nearly 100%, making a stop codon ambiguous (Schultz and Yarus 1994a,b). Ribosomal ambiguity mutations (RAM) increase misreading of stop codons up to 100-fold in cells that remain viable (Strigini and Brickman 1973; Andersson et al. 1982). Most specifically, the general error frequency can be increased 13-fold (using 5 µg/ml streptomycin) and cells continue to grow exponentially at a rate close to controls. After more than 400 generations in streptomycin, there is no detectable decrease in cellular viability (Gallant and Palmer 1979). Thus ambiguity at a variety of codons (to >=1 error in the average 250-amino-acid protein) is well tolerated, or has no apparent phenotype. The limited ambiguity we posit as the initiating event in codon reassignment, occurring at one (or a few) codon(s) and perhaps initially quantitatively small, seems quite plausible in this context.

Nor is coding ambiguity limited to prokaryotes. Eukaryotes have basal levels of coding ambiguity which are probably similar to prokaryotes (Gallant and Palmer 1979). Normal yeast glutamine tRNAs are known to read equivocally at the first codon position (Weiss and Friedberg 1986; Edelman and Culbertson 1991). Similar ambiguities can be exploited for an organism's own purposes, as when animal and plant viruses purposefully use ambiguous stop codons to adjust the level of stop readthrough to an essential gene product. This misreading by a wild-type tRNA is known to approach 5% at stop codons within a special mRNA context (Skuzeski et al. 1991; Feng et al. 1990). Thus, during codon reassignment there seems to be no reason that all codons must invariably be read without ambiguity.

[note here that ambiguity is not exactly vanishingly rare and therefore the assumption that intermediates would be nonviable is false]
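The per-protein figures quoted above follow from the per-codon error rate by simple arithmetic. A minimal sketch, assuming misreading events at different codons are independent and equally likely (a simplification of mine, not something the paper states):

```python
# From a per-codon misreading rate to per-protein error figures.
# Assumption: errors at each codon are independent with the same probability.
PER_CODON_ERROR = 4e-4   # estimate for normal E. coli quoted above
PROTEIN_LENGTH = 250     # "average" protein length used in the quote


def fraction_with_error(p, n):
    """Probability that at least one of n codons is misread."""
    return 1 - (1 - p) ** n


def expected_errors(p, n):
    """Expected number of misread codons per protein."""
    return p * n


basal = fraction_with_error(PER_CODON_ERROR, PROTEIN_LENGTH)
raised = expected_errors(13 * PER_CODON_ERROR, PROTEIN_LENGTH)
print(f"basal: {basal:.1%} of proteins carry at least one error")
print(f"13-fold raised rate: {raised:.1f} expected errors per protein")
```

This gives about 9.5% at the basal rate, consistent with the "~ 10%" in the quote, and 1.3 expected errors per protein at the 13-fold raised rate, consistent with ">=1 error in the average 250-amino-acid protein".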

Secondly: There is no definite direction to reassignment in codon disappearance theory; dispensable RNA genes may capture unassigned codon by, e.g., random single base changes in their anticodons (Osawa and Jukes, 1989). However, we first argue that known reassignments (Table 1) are very nonrandom. We then argue the nonrandomness supports ambiguous intermediate theory because it is explicable by types of equivocal reading already demonstrated in tRNAs.


However, 14 of 14 single-nucleotide reassignments in Table 1 parallel the activities of known equivocal tRNAs. That is, all 14 changes might be mediated by tRNAs reading a single base equivocally, using G-U (anticodon-codon) wobble at the first position, or C-A or G-A mispairing at the third codon position. Equivocal C-A third-position mispairing has long been known from study of tRNA opal (UGA) suppressors (Hirsh 1971). We recently constructed two new tRNAs that demonstrate in vivo the required equivocal G-U and C-A readings (Schultz and Yarus 1994b), thereby potentially accounting for ten assignments (Table 1). This congruence, in fact, first drew our attention to the possibility that tRNAs might mediate codon reassignment. The remaining exceptional wobble, transitional G-A pairing at the third position, has also been detected in the equivocal tRNA repertoire in vitro, using cytoplasmic and chloroplast tRNA Cys (Nicotiana) as UGA suppressors (Urban & Beier, 1995). The remaining 15th case requires a more unusual first/second-position double miscoding. However, the Candida albicans tRNA translating the reassigned CUG codon has been independently shown to be capable of a similar doubly equivocal coding (Santos et al. 1993; see below). Thus 15 of 15 known reassignments can be matched with known tRNA capabilities.

Thirdly: Phylogenetic distribution of reassignment is consistent with ambiguous intermediates. Tourancheau et al. (1995) have made the initially surprising observation that UAA/UAG in ciliates have been reassigned to glutamine at least three times independently (on the basis of the rRNA tree), instead of depending on a common ancestral reassignment. This striking phylogenetic cluster of identical but independent reassignments has no apparent explanation in the codon disappearance scheme. However, such a cluster is easily explained within the ambiguous intermediate mechanism by a tendency to equivocal reading of these codons inherited from an ancestor. Such an ambiguity might be conserved within a group of species if used for an important regulatory event like stop codon readthrough. These authors also found no correlation between GC content of the ciliates and reassignment, which might have been expected if evolutionary change in GC content drives the process.

Fourthly: Molecular fossil and functional evidence of translational ambiguity accompanies known cases of reassignment. We have previously pointed out that sequenced tRNAs that have captured new codons, such as the UAA and UAG reading tRNAs from the ciliate Tetrahymena thermophila (Hanyu et al. 1986), contain unusual nucleotide sequences that we have identified as enhancers of equivocal coding in E. coli (Schultz and Yarus 1994c). Thus the structure of these three related isoaccepting tRNA(Gln) sequences suggests the existence of an ancestor that coded equivocally.


In summary: We acknowledge the significance of codon reassignment, and do not argue against change in GC content as a significant evolutionary event (e.g., Sueoka, 1993). But we do argue that codon reassignment is unlikely to be carried out entirely by the slow processes of mutation pressure and drift. Additionally, the axiom of nonambiguity fundamental to codon disappearance theory is not justified. The evident nonrandomness of known reassignments, the clustering of similar changes in phylogeny, and the properties of reassigned tRNAs, where known, are strikingly consistent with ambiguously translating intermediates. These phenomena are unexpected or contradictory to codon disappearance theory, acting in isolation.

In this connection, there is no logical incompatibility between mutational change in GC content and ambiguous intermediates. Schultz and Yarus (1994c) have noted that these may occur together. In fact, a codon which has become rare might also be expected to evolve a rare cognate tRNA. Such a rare tRNA would be more vulnerable than usual to competition during translation, including competition from an equivocal adaptor translating its codon. Thus not only might mutation pressure be overtaken by faster selection, but the initial effects of mutation pressure might facilitate the overtaking mechanism. Quantitative modeling of this process might prove rewarding.

Finally: if ambiguous intermediate theory gives a good account of modern coding changes, it thereby becomes a preferred route by which a limited ancestral code could have been transformed to the present "universal" genetic code. In fact, coding transitions via ambiguous intermediates would likely be easier during the formation of the code than today.

Other aspects of ambiguous intermediate and codon disappearance schemes can be compared in the previous note by Osawa and Jukes (1995), and in Schultz and Yarus (1994c), to which the interested reader is directed for references and details which do not appear here.
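
To make the "single-nucleotide reassignment" point in the excerpt concrete, here is a minimal Python sketch. It is not from the paper, and Table 1 is not reproduced in this thread, so the list of reassignments below is my own illustrative input: UGA read as tryptophan (a well-known variant), UAA/UAG read as glutamine in ciliates (mentioned above), and CUG in Candida, which I am assuming is read as serine. The function name `single_base_neighbors` is made up for this example. For each reassigned codon it looks for canonical codons of the new amino acid that differ at exactly one position, which is the situation where a tRNA reading one base equivocally could bridge the change.

```python
from itertools import product

# Canonical code in the usual U, C, A, G ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_base_neighbors(codon, new_aa):
    """Canonical codons for new_aa that differ from codon at exactly one base.

    Returns a list of (canonical_codon, position) pairs, position being 1-3.
    """
    hits = []
    for other, aa in CANONICAL.items():
        if aa != new_aa:
            continue
        diffs = [i for i in range(3) if codon[i] != other[i]]
        if len(diffs) == 1:
            hits.append((other, diffs[0] + 1))
    return hits


# Illustrative reassignments only (assumed inputs, not the paper's Table 1).
REASSIGNMENTS = [("UGA", "W"), ("UAA", "Q"), ("UAG", "Q"), ("CUG", "S")]

for codon, aa in REASSIGNMENTS:
    print(codon, "->", aa, single_base_neighbors(codon, aa))
```

The first three cases each have a one-base neighbor (third position for UGA, first position for UAA and UAG), which lines up with the third-position mispairing and first-position wobble described in the excerpt. CUG read as serine has no one-base neighbor, consistent with the excerpt's remark that the Candida tRNA involves a doubly equivocal reading.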

Edited by niiicholas on May 30 2002,01:18


Posts: 319
Joined: May 2002

(Permalink) Posted: May 30 2002,01:49   

Reviewing this 1993 article by Paul Nelson and Jonathan Wells:


Is Common Descent an Axiom of Biology?

[Editorial note:  The following discussion paper was written for the conference, “The Darwinian Paradigm: Problems and Prospects,” held June 22-25, 1993, at the Pajaro Dunes beach community on Monterey Bay, near Watsonville, California.  The conference was organized by Phillip Johnson.  Attendees included Michael Behe, Walter Bradley, John Angus Campbell, William Dembski, Dean Kenyon, Stephen Meyer, Paul Nelson, David Raup, Siegfried Scherer, Jonathan Wells, and Kurt Wise.]


To:                  Pajaro Dunes Conference Participants
From:              Paul Nelson and Jonathan Wells
Date:               15 June 1993
Re:                  Discussion paper for Topic Area I (homology, etc.)

...and skipping to the genetic code section, we find that Nelson & Wells are indeed assuming that the "functional invariance" thesis was dropped, without evidence, to protect common descent:


The Universal Genetic Code Argument for Common Descent

Lest it be thought that this pattern of reasoning – namely, sacrificing the auxiliary theory to save common descent – is an isolated example, we offer another, perhaps more striking case.

Most of us are familiar with the universal genetic code argument for common descent.  The argument first appeared in the mid to late 1960s, after the structure of the code was elucidated.  It is now widespread.[33]

[they give several quotations to this effect]

...and then, they argue that "functional invariance" is highly probable and therefore scientists are unjustifiably dropping the "functional invariance theory" to protect common descent:


Postulating that such fundamental variations occurred is, however, very far from knowing how they occurred. "Direct replacements of one amino acid by another throughout proteins," argue Osawa et al., "would be disruptive in intact organisms and even in mitochondria."[45]  That is, we should not think that the body of molecular knowledge motivating functional invariance can be jettisoned at will. (Yes, if common descent is true, and variant codes exist, functional invariance has to go to the wall. Yet functional invariance still seems to be true, or at least highly probable.)  Rather, taking common descent as given, we are now faced with another novel research problem: "How could non-disruptive code changes occur?"[46]

I find this article fascinating because it exemplifies one particularly devious tactic of the ID movement. Rather than taking the obvious, but difficult, route of simply arguing that common descent is true or false to some specific degree, based on this and that specific evidence, they try to convert the entire argument into one about the intellectual credibility of the biologists. The thesis the IDists are really trying to advance is therefore something like "mainstream biologists are biased and would believe in evolution no matter what the evidence."  As in Nelson & Wells' conclusion:


Suppose Darwin had it right, namely, that "all the organic beings which have ever lived on this earth have descended from some one primordial form."[52]  The existence of this "one primordial form," the common ancestor, establishes a theoretical domain that logically subsumes all biological and paleontological phenomena.  That is, even if life had multiple origins, we will be unable, having assumed the truth of common descent, to provide any evidence for that possibility: all observed organisms, whether recent or extinct, will necessarily lie within what might be called the "common ancestor horizon."

If this seems counter-intuitive, try the following thought experiment. Assume the truth of common descent, and then attempt to construct an empirical argument against it. No imaginable evidence one might bring to bear, however striking – e.g., organisms for which no transitional stages seem possible, multiple genetic codes – will be able to overturn the theory. If there really was a common ancestor, then all discontinuities between organisms are only apparent, the artifacts of an incomplete history. An ideally fine-grained history would reveal the begetting relations by which all organisms have descended from the common ancestor.

If the axiom thesis is correct, then the theory of common descent will indeed be refractory to the evidential challenges thrown up by biological experience. One can see the point in Mayr's recent claim that common descent

has been gloriously confirmed by all researches since 1859. Everything we have learned about the physiology and chemistry of organisms supports Darwin's daring speculation that "all the organic beings which have ever lived on this earth have descended from some one primordial form..."[53]

One wonders what we could have learned about organisms, since 1859, that would not have confirmed common descent.

We offer the axiom thesis, not because we are persuaded of its truth, but to provide a starting point or focus for discussion. How, really, do the patterns of living things count for, or against, the notions of primary continuity (common ancestry) or primary discontinuity (polyphyly)? If common descent cannot be dislodged by the "evidence," then how should we go about evaluating it?

I propose a (new??) term for this style of argument: Argumentum ad Innuendo.

Thanks, nic

Edited by niiicholas on May 30 2002,01:51


Posts: 97
Joined: May 2002

(Permalink) Posted: May 30 2002,12:59   

If this seems counter-intuitive, try the following thought experiment. Assume the truth of common descent, and then attempt to construct an empirical argument against it. No imaginable evidence one might bring to bear, however striking – e.g., organisms for which no transitional stages seem possible, multiple genetic codes – will be able to overturn the theory.

That's got to be the stupidest argument I've ever seen.  If common descent is true, then there will be no empirical evidence against it.  What they're basically saying is "true theories can't be shown to be false empirically".  Why don't they just say that differing genetic codes are empirical evidence against common descent and be done with it?  How does it make sense to construct a thought experiment where we try to hold two contradictory notions at once, i.e., common descent is both true and shown to be false by the evidence?

Anyway, Nelson & Wells' contention that biologists dropped the "no viable intermediates" claim in order to protect common descent is demonstrably false.  It was dropped because it was shown to be wrong, empirically.  Not only does the example of ambiguous codes demonstrate this, but also the ability of researchers to alter the codes of living organisms.  Ironically, the DI article that responds to Miller alludes to this:

Experiments to change the identity of transfer RNA (tRNA)--another possible mechanism by which genetic codes might reassign codon “meanings”--have shown that the intermediate steps must be bridged by intelligent (directed) manipulation. In one such experiment, for instance, Margaret Saks, John Abelson, and colleagues at Caltech changed an E. coli arginine tRNA to specify a different amino acid, threonine. They accomplished this, however, only by supplying the bacterial cells (via a plasmid) with another copy of the wild-type threonine tRNA gene. This intelligently-directed intervention bridged the critical transition stage during which the arginine tRNA was being modified by mutations to specify threonine. [6]

Notice that they're trying to do with this experiment what they do with animal and plant breeding.  When mutation and selection are shown to be sufficient to cause substantial morphological change, they dismiss it outright because it was really just "intelligent design" even though it has nothing to do with ID as they conceive it.  And here, the ability of the code to change is dismissed because it was caused by "intelligent design", as if plasmid transfers never happen in the wild.

Anyway, here are the refs for the papers cited:

6. Margaret E. Saks, Jeffrey R. Sampson, and John Abelson, “Evolution of a Transfer RNA Gene Through a Point Mutation in the Anticodon,” Science 279 (13 March 1998):1665-1670.

7. Jennifer Normanly, Richard C. Ogden, Suzanna J. Horvath & John Abelson, “Changing the identity of a transfer RNA,” Nature 321 (15 May 986):213-219.

I haven't read them as I don't have online access to either journal (though I might just get off my butt and walk the 100 yards to the library.)  I am interested in seeing how these papers compare to the DI quote-mining spin.


P.S.  Just noticed that the DI has a typo in their reference for the Normanly et al. paper.  It looks like it's from 1986, in which case they're using a reference that's much too old given that there are lots of recent ones.


Posts: 319
Joined: May 2002

(Permalink) Posted: May 30 2002,20:33   


Just a little background in case we've got any lurkers who haven't taken biochemistry lately...

In the canonical genetic code that everyone learns in textbooks there are 20 amino acids -- however, chemically many more amino acids are possible.  As I understand it there are many cases where organisms will produce an amino acid chain using the canonical code, and then post-translationally modify some of the amino acids, effectively resulting in the usage of more than 20 amino acids by the organism, although technically the normal genetic code is still used.

However, there are some cases where the canonical code has been modified to include a noncanonical amino acid *during* translation.  A few weeks ago a new example of this was published, and in the AE general discussion a new poster Ed has alerted us to how this example fits in with the 'stop codon alteration' pattern that is so common in genetic code changes.

Here is the link to Ed's post "Stop codon thievery"

I'll quote Ed's post for the sake of thoroughness:

A very recent example of a "stop" codon being sometimes coopted for another use is the subject of two papers and a "perspective" (1-3) in the 24 May 2002 issue of Science. These all are reporting on the "new" amino acid "pyrrolysine", which is coded for by the (usually) stop codon UAG in a certain methanogenic archaeon's mRNA. To quote from (1):


The way in which pyrrolysine is encoded bears striking parallels to the encoding of the 21st amino acid, selenocysteine. Selenocysteine is found in Archaea, eubacteria and animals, including mammals. Both nonstandard amino acids are encoded by the RNA nucleotide triplets (codons) that signify a command to stop translation of mRNA into protein (UGA is the "stop codon" encoding selenocysteine). The notion that at least 22 amino acids are directly encoded by the nucleotide sequence of mRNA reflects the greater richness of the genetic code than is apparent from the standard textbook account.

Originally, the coding problem was defined in terms of how the 20 common amino acids could be specified by four RNA nucleotides. As the triplet nature of the genetic code began to unfold in the early 1960s, it might have been tempting to speculate that some of the 64 possible codons encoded the many rare amino acids found in proteins. However, it became clear that 20 is the correct number of amino acids, and that the great majority of nonstandard amino acids are created by chemical modifications of standard amino acids after translation. In 1986 came the surprise discovery that the nonstandard amino acid selenocysteine is directly specified by the genetic code and is not created by posttranslational modification. Selenocysteine is now joined by pyrrolysine, and together these two amino acids demonstrate that the genetic code can be expanded by redefining the meaning of a stop codon.   {references omitted}

Reference (1) goes into some depth, with references, as to how the stop signal is subverted in the case of selenocysteine, the only other non-canonical amino acid known to be specified by the code and not built by modification after translation. In the selenocysteine case, only a minority of the UGA codons are used to code the amino acid: most are still stop codons. Signals elsewhere in the mRNA determine which. It is still unknown just how the UAG coding pyrrolysine works, however.

(1) Atkins JF, Gesteland R. Science 2002 May 24;296(5572):1409-10
(2) G. Srinivasan et al., Science 296, 1459 (2002).
(3) B. Hao et al., Science 296, 1462 (2002).

Thanks Ed, keep it up!
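
For lurkers who think better in code, here is a rough Python sketch of what "redefining the meaning of a stop codon" amounts to. Everything in it is an illustration rather than anything from the Science papers: the `translate` function, the `reassigned` argument, and the example message are all made up. The one-letter symbols U (selenocysteine) and O (pyrrolysine) are the standard abbreviations. One big simplification: the sketch applies a reassignment to every occurrence of the codon, whereas, as Ed notes, only a minority of UGA codons actually encode selenocysteine and signals elsewhere in the mRNA decide which.

```python
from itertools import product

# Canonical code in the usual U, C, A, G ordering; "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CANONICAL = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna, reassigned=None):
    """Translate an mRNA string codon by codon until the first stop.

    reassigned is an optional dict of codon -> one-letter amino acid that
    overrides the canonical meaning everywhere (a deliberate simplification).
    """
    code = dict(CANONICAL)
    code.update(reassigned or {})
    mrna = mrna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        aa = code[mrna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Hypothetical message: AUG GCU UAG GGA UGA AAA UAA
message = "AUGGCUUAGGGAUGAAAAUAA"

print(translate(message))                            # canonical reading
print(translate(message, {"UAG": "O"}))              # UAG read as pyrrolysine
print(translate(message, {"UAG": "O", "UGA": "U"}))  # plus UGA as selenocysteine
print(translate(message, {"UAA": "Q", "UAG": "Q"}))  # ciliate-style glutamine
```

Under the canonical table the message stops at the first UAG. Each reassignment lets translation run on to the next codon that is still read as a stop, which is why stop codon changes alter the length of a protein rather than swapping one amino acid for another throughout it.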


Edited by niiicholas on May 30 2002,20:34


Posts: 319
Joined: May 2002

(Permalink) Posted: Mar. 15 2003,02:05   

Some ISCID threads on this:

Topic: Is the DNA code universality strong evidence for evolution?;f=6;t=000316

Topic: Common descent;f=6;t=000056

Glenn Branch

Posts: 19
Joined: May 2002

(Permalink) Posted: Mar. 25 2003,22:08   

Finn Pond and Jean Pond published a lengthy article in Reports of the National Center for Science Education 2002: 22 (5) on evolutionary explanations for the divergences from the universal genetic code. It is only somewhat technical. E-mail NCSE to request a copy.


Posts: 97
Joined: May 2002

(Permalink) Posted: Sep. 29 2003,15:16   

Interesting recent article:

Proc Natl Acad Sci U S A. 2003 Sep 17

A noncognate aminoacyl-tRNA synthetase that may resolve a missing link in protein evolution.

Skouloubris S, Ribas De Pouplana L, De Reuse H, Hendrickson TL.

Efforts to delineate the advent of many enzymes essential to protein translation are often limited by the fact that the modern genetic code evolved before divergence of the tree of life. Glutaminyl-tRNA synthetase (GlnRS) is one noteworthy exception to the universality of the translation apparatus. In eukaryotes and some bacteria, this enzyme is essential for the biosynthesis of Gln-tRNA(Gln), an obligate intermediate in translation. GlnRS is absent, however, in archaea, and most bacteria, organelles, and chloroplasts. Phylogenetic analyses predict that GlnRS arose from glutamyl-tRNA synthetase (GluRS), via gene duplication with subsequent evolution of specificity. A pertinent question to ask is whether, in the advent of GlnRS, a transient GluRS-like intermediate could have been retained in an extant organism. Here, we report the discovery of an essential GluRS-like enzyme (GluRS2), which coexists with another GluRS (GluRS1) in Helicobacter pylori. We show that GluRS2's primary role is to generate Glu-tRNA(Gln), not Glu-tRNA(Glu). Thus, GluRS2 appears to be a transient GluRS-like ancestor of GlnRS and can be defined as a GluGlnRS.


  14 replies since May 23 2002,08:35