- Genes – the language of God 0: Preface
- Genes – the language of God 1: Genes as Language
- Genes – the language of God 2: Other popular gene myths and metaphors
- Genes – the language of God 3: Why genes aren’t information
- Genes – the language of God 4: Why genes aren’t a language
- Genes – the language of God 5: God and genes
- Genes – the language of God 6: Theological implications
Genes are more commonly regarded as information than as a language, and in fact the informational metaphor underpins the language metaphor. In this post I will consider how genes came to be called information (that is, how the Dawkins view of genes as computer messages came to the fore), and what it can and cannot mean.
In The Blind Watchmaker (1986), Richard Dawkins compared DNA to computer programs (instructions for building organisms):
It is raining DNA outside. … [downy seeds from willow trees] The cotton wool is mostly made of cellulose, and it dwarfs the tiny capsule that contains the DNA, the genetic information. The DNA content must be a small proportion of the total, so why did I say that it was raining DNA rather than cellulose? The answer is that it is the DNA that matters… whose coded characters spell out specific instructions for building willow trees… It is raining instructions out there, it’s raining programs; it’s raining tree-growing, fluff spreading, algorithms. That is not a metaphor, it is the plain truth. It couldn’t be any plainer if it were raining floppy disks. [Chapter 5, p 111]
Floppy disks have since been superseded by USB thumb drives, but Dawkins' point is clear enough: DNA is information, not just a molecule, and he does not mean this as a metaphor.
However, many have tried to make this “plain truth” work, and failed. There are many reasons for this, but first let us look into the history of the idea that DNA is information.
As I noted in the first post of this series, the notion that inheritance is about information long precedes the discovery of DNA, let alone its structure and role in inheritance. But the idea that DNA is information goes back to the two discoverers of the structure of DNA, Francis Crick and James Watson. At first, back in 1953, the structure did not reveal how DNA made proteins; it took some time to figure this out. In 1958, Crick published what came to be known as the “Central Dogma” of genetics:
[From Sandwalk’s excellent essay on the Central Dogma.] On the left Crick diagrammed all the possible ways sequence information could be passed on between DNA, RNA and proteins. DNA could copy itself, or pass sequential information to RNA molecules or to proteins, or do all three; and the same was true for the other two types of molecules. In fact, Crick said, information is only passed on according to the right-hand diagram. Later, we discovered that some RNA sequences can be reverse transcribed into DNA, especially through the medium of what are now called retroviruses. Crick gave the following definition of the Central Dogma:
… once (sequential) information has passed into protein it cannot get out again.
It is very important to note that the “information” here is the linear sequence of the base pairs matching up to a linear sequence, first of RNA (mRNA), and then later of the proteins (through adaptor molecules of tRNA). Nothing beyond this is implied by the Central Dogma, and we can usefully call this “Crick information”, as Griffiths and Stotz do in their book. The passing of sequential or Crick information is thus a kind of templating from a sequence in the DNA to the [often edited] sequence in the RNA to the finished protein. Crick did not posit information as “instructions”. You lose nothing if you drop the word “information” in favour of “structure”, and I will argue there are good reasons for doing so.
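To make the templating point concrete, here is a minimal sketch of Crick information as nothing more than position-by-position matching. It is my own toy illustration, not anything from Crick or from Griffiths and Stotz: the function names `transcribe` and `translate` are hypothetical, the codon table holds only four entries of the standard genetic code, and strand direction and RNA editing are ignored.

```python
# Toy illustration only: base pairing from a DNA template strand to RNA.
# Strand directionality (5'/3') is deliberately ignored here.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

# A four-entry subset of the standard genetic code, enough for the example.
CODON_TABLE = {"AUG": "Met", "UUU": "Phe", "GGC": "Gly", "UAA": "STOP"}


def transcribe(template_strand: str) -> str:
    """Match each DNA base to its RNA partner, one position at a time."""
    return "".join(COMPLEMENT[base] for base in template_strand)


def translate(mrna: str) -> list[str]:
    """Match each three-base codon to an amino acid until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "STOP":
            break
        peptide.append(residue)
    return peptide


mrna = transcribe("TACAAACCGATT")
print(mrna)             # AUGUUUGGCUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly']
```

Everything the sketch does is a lookup from one sequence position to another. There is no step at which anything is "instructed" to do something, which is why "structure" would serve as well as "information" here.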
When Crick was writing, information was all the rage. In 1948, the so-called Communications Theory of Information, made mathematical by Claude Shannon at Bell Labs, was published, and many scientists thought this was a fruitful way to approach scientific problems. Inheritance seemed like a transmission of information, and so it was natural that Shannon’s scheme would be brought to bear. However, it was ultimately rather fruitless.
Another information idea, coincidentally published the same year by Norbert Wiener, is called Cybernetics. Here the information is about control of one thing by another, through signals. Cybernetic ideas about genes have been more fruitful, but in the end they turn out to be just analogies that are not terribly deep (in my opinion).
The code aspect of genes: what it is and isn’t
Code language is widely used when talking about how DNA gives rise to proteins. Terms like editing, reading, transcribing, and expressing are all used in the technical literature. DNA is “expressed” and “edited”; a gene is regarded as an “open reading frame”; DNA is “copied” or “replicated”. Such terms point up the leading property of DNA: it is long lasting and its structure can be duplicated, not unlike a document. For this reason, some scientists refer to genetics as a “codical domain”.
But what is happening physically is that DNA molecules are split into two strands by helicases, and then either transcribed by polymerases to make RNA, or copied to make new DNA. The DNA and the RNA are just as physical as the proteins they produce. As Wiener noted in his book:
Information is information, not matter or energy. No materialism which does not admit this can survive at the present day. [p132]
Following Wiener here, DNA is a physical structure, and it is not “information” in the sense used by communications or computation theories. That sort of information is an abstract entity, a property of mathematics, not physics. Genes are not that kind of information. A mathematical model of genetics, especially population genetics, which describes how genes change in populations, contains information about genes, but that’s a different kind of information too; it isn’t what those who say genes are information mean by it.
So the Crick information model – that genes are templates for the structure of RNA and through them of proteins – seems to be the only meaningful sense in which one can say genes are information.
Other types of information in genes
There are some other senses in which genes are supposed to have an informational aspect. They are the program sense, and the game theory sense.
Program/recipe: genetic control versus genetic involvement
The program or recipe metaphor has been used by many evolutionary biologists, including Ernst Mayr and Richard Dawkins. It is used in the quote from Dawkins above: genes are instructions. There can be no doubt that genes are involved, either directly or indirectly (say, by building molecules that have functions), in the development of living things. They are “first among equals”. But how can they be “instructions”?
Recall the mnemonic
G & E -> O
from the last post. In order for genes to be instructions, there would need to be a “computer” to “run” the instructions (or in the case of a recipe metaphor, a cook and kitchen to make the recipe). What could do this for genes? It would need to be not only the cellular machinery that expresses genes – ribosomes and so forth – but also the organism itself, which turns on and turns off genes, and the environment that provides the source material. So the mnemonic would have to become
G & O[<t] & E -> O[t]
or, the genes G, together with the state of the organism before now O[<t], together with the environment E, give the organism now O[t]. While this is true enough, the metaphor no longer seems to hold up. Why not just say that genes and organisms and the environment give the later organism? Then there is no temptation to talk about some abstract program, and ascribe to genomes powers they do not have.
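The expanded mnemonic can be sketched as a simple update rule. This is a toy of my own, under loud assumptions: the `Organism` class, the `step` function, the single gene called "growth" and the size threshold of 10 are all hypothetical, chosen only to show that the genes are one input among three.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Organism:
    """Hypothetical organism state: just a size and the genes switched on."""
    size: float
    active_genes: frozenset


def step(genes: frozenset, before: Organism, nutrients: float) -> Organism:
    """Toy version of G & O[<t] & E -> O[t]."""
    # The earlier organism decides which genes are switched on:
    # in this toy, "growth" is silenced once the organism reaches size 10.
    active = frozenset(g for g in genes if g != "growth" or before.size < 10)
    # Growth needs an active gene AND material supplied by the environment.
    gain = nutrients if "growth" in active else 0.0
    return Organism(size=before.size + gain, active_genes=active)


genes = frozenset({"growth"})
seedling = Organism(size=1.0, active_genes=frozenset())
mature = Organism(size=10.0, active_genes=frozenset())

print(step(genes, seedling, nutrients=2.0).size)  # 3.0
print(step(genes, seedling, nutrients=0.0).size)  # 1.0
print(step(genes, mature, nutrients=2.0).size)    # 10.0
```

The same genes give three different outcomes, depending on the prior state and the environment. If anything in the sketch is the "program", it is the whole `step` function, not the `genes` argument passed into it.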
Incidentally, while the Human Genome Project delivered a draft of the human genome in 2000 (it was declared complete in 2003 and has been revised a bit since), we have yet to discover what sorts of effects most of the expressed genes actually have, and it will probably be another century before we finish that. And of course most of the genome is unused junk.
Game theory: genes as bookkeepers
There is one final metaphor, possibly more than a metaphor, that we should look at. It is yet another view that is found in Richard Dawkins’ work: genes are strategies in a game. Here the metaphor is backed up by extensive mathematics: a field known as “game theory”, developed to deal with Cold War threats and counter threats, turns out to be very useful for modelling how gene frequencies change in certain conditions (when the fitness of genes and their propensity to work together or against each other within a single population are known).
This was the basic underlying metaphor of The Selfish Gene: genes have interests, and behave (evolutionarily) like self-interested players of a game known as The Prisoner’s Dilemma. The details are not important here.
Game theory treats genes as “players” or agents. But genes have no strategies themselves; it is just that the mathematics of games can transfer to genetics. It often happens that mathematics developed for one field gets used in other fields. It doesn’t mean that the properties of that first field (where game players are rational and selfish) apply to the new field, only that the maths applies.
In fact, the game theory view has been called by Stephen J. Gould a “bookkeeping” view of evolution; you track the “wins” and “losses” of a given gene in a mathematical scorecard. In other words, selfish genes exist only in how you record the outcomes of the evolving population. It’s useful, but it doesn’t mean genes actually are strategies, nor that they have them.
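The bookkeeping point can be seen in a minimal sketch of discrete replicator dynamics, one standard way of applying game theory to evolving populations. The function name `replicator_step` and the payoff numbers (a conventional Prisoner's Dilemma table of 3, 0, 5 and 1) are my own illustrative assumptions, not figures from Dawkins or Gould.

```python
def replicator_step(freqs, payoff):
    """One generation of discrete replicator dynamics.

    freqs[i]     : current frequency of variant i in the population
    payoff[i][j] : payoff to variant i when it meets variant j
    """
    n = len(freqs)
    # Average payoff of each variant against the current population.
    fitness = [sum(payoff[i][j] * freqs[j] for j in range(n)) for i in range(n)]
    # Population mean fitness, used to renormalise the frequencies.
    mean = sum(f * w for f, w in zip(freqs, fitness))
    return [f * w / mean for f, w in zip(freqs, fitness)]


# Hypothetical Prisoner's Dilemma payoffs: row 0 "cooperates", row 1 "defects".
payoff = [[3, 0],
          [5, 1]]

freqs = [0.5, 0.5]
for generation in range(3):
    freqs = replicator_step(freqs, payoff)
    print(generation + 1, [round(f, 3) for f in freqs])
```

The "defector" row gains ground every generation, yet nothing in the code chooses, plans or has interests. The variants are rows in a table and the outcome is a column of numbers, which is all the scorecard view requires.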
Next I will discuss why genes are not a language.
Molecular Biology (Stanford Encyclopedia of Philosophy)
Biological information (Stanford Encyclopedia of Philosophy)