
Notes on novelty 7: Surprise!

Last updated on 21 Jun 2018

Notes on Novelty series:
1. Introduction
2. Historical considerations – before and after evolution
3: The meaning of evolutionary novelty
4: Examples – the beetle’s horns and the turtle’s shell
5: Evolutionary radiations and individuation
6: Levels of description
7: Surprise!
8: Conclusion – Post evo-devo

It is now time to return to the basic argument of this series. You will recall that it went like this:

  1. Novelty is specified at some level of description based on there being a nonhomologous structure or function
  2. There is always some level of description at which there is a homology for the underlying developmental or hereditary mechanisms that develop the novel trait
  3. Novelty is therefore a function of the level or scale of description
  4. What makes it novel is therefore our lack of knowledge of the right scale or level of description of traits; when we learn the right level (say, protein or cytological), the novelty seems less novel or not novel at all.
  5. Novelty is therefore a matter of the surprisal value of the trait relative to homologies at that scale

I have made out premises one to four. Now it is time to argue for the conclusion.

There is a concept in information theory known as the surprisal value of a received message: basically it is that the information content of a message is the degree to which the sequence is surprising, which is to say, the inverse of the expectation that the sequence would be received by chance:


$$I = \log \frac{1}{p} = -\log p$$

where the probability p satisfies 0 ≤ p ≤ 1. This is also called “self-information”. Note that the key term here is expectation. The point of surprisal is that it relies upon our prior expectations of probabilities. To the extent that we have an estimate of those probability distributions, the received message is surprising and informative.
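As a minimal sketch of the surprisal idea, the following Python function computes the self-information of an outcome in bits. The function name `surprisal` and the choice of base 2 are assumptions made for illustration; the sketch also rejects p = 0, where the surprisal is unbounded.

```python
import math


def surprisal(p: float) -> float:
    """Self-information, in bits, of an outcome we expected with probability p."""
    if not 0.0 < p <= 1.0:
        # p = 0 would give an unbounded surprisal, so this sketch excludes it.
        raise ValueError("p must be in the interval (0, 1]")
    return math.log2(1.0 / p)


# The less we expected the outcome, the more informative it is.
for p in (1.0, 0.5, 0.125):
    print(f"p = {p}: {surprisal(p)} bits")
```

A fully expected outcome (p = 1) carries no information, and each halving of the expected probability adds one bit.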

An analogous case is in play here. We describe (that is, set up a sequence of symbols, either in verbal or mathematical form) some traits or characters in biological cases and estimate their likelihood under “ordinary” evolution or drift. We then are surprised (informed) when the traits concerned do not fall out as likely. This, we think, calls for a special explanation and so we start looking at “non-Darwinian” causes.

Let us consider what “Darwinian” causes might be. Darwin adduced several causes of evolution: natural selection, sexual selection, artificial selection, use and disuse, and correlation. We can lump all the selective processes together under the rubric “selection”, as all that divides them is the intent of the selective agency, or a lack of it. Use and disuse is what is often misleadingly called Darwin’s “Lamarckism”, but I have argued before that it was not. He merely thought that traits that were useful would be more strongly inherited and those that were not would be weakly inherited. He did not think this was a case of acquired characters as such. As a source of novelty, use and disuse is otiose. In any case we can dismiss this, pace epigenetics, as a case of evolutionary novelty on the basis of modern knowledge. Correlation is now known under the rubric of allometry. Some traits (like large antlers) can be due simply to regulation and moderation of growth – if body size is upregulated, antlers will grow disproportionately due to differential growth rates of parts. That leaves selection as a Darwinian mechanism for novelty. Indeed, it leaves selection as the sole mechanism for novelty.

Objections to selective accounts of novelty often rely on the argument that selection only determines the subsequent fate of a novel trait (i.e., whether it spreads to fixation or equilibrium, or is eliminated), not its origin. There is a sense in which this is true, but we should not be so restrictive about what selection includes. Darwin and Wallace always held that selection involves prior variation, which since the Mendelian revolution we refer to as mutation. In any sensible interpretation, selection involves random mutation (random, that is, to the selective value of the traits in that population in that environment, or any future population and environment linearly derived from it).

Another class of objections to selective accounts of novelty is that selection must occur gradually, with fitness improvement at every step. Some of these novelties are thought not to have such intermediate improvements in fitness gradients. In the case of our two exemplars, the turtle shell and the beetles’ horns, this is not the case; we can see fitness improvements (the turtles being ecological, and the beetles being sexual) at every stage. We can leave this case to one side here, then.

So taking selection to be blind variation and selective retention, in Donald Campbell’s felicitous phrasing (Campbell 1965), the mechanism that produces novelty in a Darwinian fashion is some random variation to an existing trait, and selection on it when it has fitness differences. Obvious enough. However, it needs a bit of filling out. Fortunately, Thornhill and Ussery have done this already (2000). The pathways or routes to novel traits posited by the most general Darwinian view, and I think properly assigned to Darwin himself, are: serial direct evolution, parallel direct evolution, elimination of functional redundancy, and adoption from a different function. I will paraphrase and extend these (which were originally written to deal with the rather uninteresting claims of intelligent design).

Serial direct evolution occurs when some trait is modified along the lineage (the ancestor-descendant sequence), so that the trait at the end is distinct from the trait at the beginning. The sequence has to occur, seriatim, at what I am calling the same grain of description. Changing the grain (upwardly or downwardly) would no longer count as a series. I would call this sequential evolution of novelty.

Parallel direct evolution occurs when more than one functionally connected component is modified simultaneously to form a complex structure. The example Thornhill and Ussery give is a table where all the legs need to be extended in concert for the table to function (and the biological example here is the complex vertebrate eye). This is clearly compositional in nature.

How parallel and serial evolution can actually differ, as opposed to being merely conceptually different, is unclear. Any trait in a living system, because it is a system, must involve more than one part. Suppose a serial evolutionary process modifies a rib. That will change all attachments and integuments connected to the rib. It will change the shape of the body, and hence the locomotory subsystems. Internal organs and functions will shift, subtly or greatly. Organisms are not evolved by adding a single new part, and given that organisms develop as a system, any change in any part will affect many others, which will have to adjust evolutionarily over time.

So I think that we can bring these two forms of Darwinian evolutionary routes into one: some part or parts are modified lineally and in sequence. This is not necessarily gradualism, though. Changes at one grain of resolution and description can be abrupt, so long as the descriptions at some grain allow us to identify homology over time. [A change without homological relations would indeed be non-Darwinian.]

The elimination of functional redundancy occurs when a part that had a now-redundant or unnecessary function loses that function, enabling it to form the seed for another function. This leads to adoption from another function. Functional descriptions, however, are relative to the choice of functions that matter to the observer; it is inevitable that a part of almost any function of any complexity will also play a functional role in several or many other functional processes. So while the bones of the jaw that became those of the middle ear (Thornhill and Ussery’s example) lost their functional role as “jaw bones” (as stress-supporting biomechanical struts for mastication), they undoubtedly retained their functional role as skull-supporting, facial-muscle-supporting bones while they evolved. They “lost” one function (or better, lowered it gradually, more or less) and gained another (gradually, more or less) but retained many others, and these, too, evolved. Our choice of what to describe here remains the key issue. Again, also, I think we can collapse these two.

Darwinian selection occurs on variation, so the real question is how variation occurs. As far as I can tell, there are four ways this can happen:

  1. Deletion of a part.
  2. Duplication of a part.
  3. Rearrangement of a part.
  4. Insertion of a part.

Consider genes in the traditional four letter code. A letter can be deleted, giving a “novel” sequence (especially if the deletion is in the start or stop region of an open reading frame), or a sequence or letter can be duplicated to the same effect. When duplication occurs, one copy can be retained under the old function, while the new copy evolves through the other three ordinary Darwinian processes. A sequence can be inverted or chopped up and distributed through other sequences, forming all kinds of novelties of products downstream. Or a sequence can have an atomistic part (a “mer”) at that grain of description (here: the nucleotide “letters”) inserted.
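The four kinds of variation can be sketched as operations on a string of nucleotide letters. This is only an illustration at one grain of description: the function names and the example sequence are hypothetical, and “rearrangement” is modelled here as a plain reversal of a segment rather than the reverse-complement inversion that occurs in real DNA.

```python
def delete(seq: str, start: int, length: int = 1) -> str:
    """Deletion: remove `length` letters beginning at `start`."""
    return seq[:start] + seq[start + length:]


def duplicate(seq: str, start: int, length: int = 1) -> str:
    """Duplication: repeat a segment immediately after itself."""
    segment = seq[start:start + length]
    return seq[:start + length] + segment + seq[start + length:]


def invert(seq: str, start: int, length: int) -> str:
    """Rearrangement: reverse a segment in place (a simplification)."""
    segment = seq[start:start + length]
    return seq[:start] + segment[::-1] + seq[start + length:]


def insert(seq: str, position: int, part: str) -> str:
    """Insertion: add new letters before `position`."""
    return seq[:position] + part + seq[position:]


ancestor = "ATGGCC"  # hypothetical six-letter sequence
print(delete(ancestor, 3))        # ATGCC
print(duplicate(ancestor, 0, 3))  # ATGATGGCC
print(invert(ancestor, 1, 3))     # AGGTCC
print(insert(ancestor, 3, "T"))   # ATGTGCC
```

Each function is deterministic at the level of letters, yet nothing in it refers to the fitness of the resulting sequence.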

Once these novel variants occur, and they may occur through deterministic processes at the grain of description but remain “random” relative to the selective pressures that obtain at the trait or organismic level, selection takes over and explains why so many of the population, or all of them, have the trait concerned.

In no way is an explanation other than a Darwinian selective account required to explain novelty unless it could be shown that a part arose without any kind of precursors whatsoever; in other words, that there is no prior sequence of parts at any grain. When arguments are put that Darwinian selection is insufficient to account for a novel trait, usually on the basis of some kind of systems theory, it is because the descriptions do not permit a change of grain. And this is a matter of surprisal only when grain is fixed. Suppose, to return to our information theoretic example, you had a full description of all parts of the sender, and what is fed into it, along with a full description of the channel and the receiver. What might be surprising then would be some noisy interference, not that you had received the same or very similar sequence as was sent. What matters is what your grain of description is – an engineer’s or an operator’s.
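The engineer and operator contrast can be made concrete with a toy calculation. The sketch below rests on assumptions that are mine and not part of the argument: messages are bit strings, the operator treats every received bit as a fair coin flip, and the engineer knows the sent message and models the channel as flipping each bit independently with a known probability `flip_prob`. All names are hypothetical.

```python
import math


def operator_surprisal(received: str) -> float:
    """Operator's grain: no model of the sender, so each bit costs one bit."""
    return float(len(received))


def engineer_surprisal(sent: str, received: str, flip_prob: float) -> float:
    """Engineer's grain: knows what was sent and how noisy the channel is."""
    if len(sent) != len(received):
        raise ValueError("sent and received must have the same length")
    if not 0.0 < flip_prob < 1.0:
        raise ValueError("flip_prob must be strictly between 0 and 1")
    total = 0.0
    for s, r in zip(sent, received):
        # A faithful bit was expected; a flipped bit is the rare, surprising event.
        p = flip_prob if s != r else 1.0 - flip_prob
        total += -math.log2(p)
    return total


sent = "10110010"
clean = "10110010"  # received exactly as sent
noisy = "10110011"  # last bit flipped in transit

print(operator_surprisal(clean))              # 8.0
print(engineer_surprisal(sent, clean, 0.01))  # well under one bit
print(engineer_surprisal(sent, noisy, 0.01))  # dominated by the single flip
```

On these assumptions the same received sequence is maximally informative to the operator and almost wholly unsurprising to the engineer, for whom only the flipped bit carries much information.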

It may be that there are indeed properties of systems that are surprising (at least, until we have worked them out) and which have, at some grain of description, explanatory weight in evolutionary contexts; but I suspect and hold that, like selection itself, these will always be promissory notes for a full and physical description to come, just like a mathematical equation is not explanatory until we can fill out the denotation and interpretation of the variables.


Campbell, Donald T. 1965. Variation and selective retention in socio-cultural evolution. In Social Change in Developing Areas: A Reinterpretation of Evolutionary Theory, edited by H. R. Barringer, G. I. Blanksten and R. W. Mack. Cambridge, Massachusetts: Schenkman Publishing Company.


Thornhill, Richard T., and David W. Ussery. 2000. A Classification of Possible Routes of Darwinian Evolution. Journal of Theoretical Biology 203:111-116.


  1. Jeremy Bowman

    I’m jumping ship here! It is scientism to treat epistemic expectation, information, etc. in numerical terms. It trades on confusions, such as the confusion of epistemic probability (how much something ought to be believed) and relative frequency (what proportion of a class belong to a sub-class). Semantic information (i.e. “potential knowledge”) and statistical co-variation (which is all mathematical information “theory” deals with) are entirely different concepts.

    • Three things:

      1. I did say the sense of surprisal I am using is analogous;

      2. This is science we are talking about, right?

      3. To call this scientism is just to use a term as a bad name; not argument. Scientism might be right, if this were actually a case of it.

  2. “1. I did say the sense of surprisal I am using is analogous;”

    But we had a better grasp of what epistemic “expectation”, “surprise”, etc. were before the statistical analogy that was supposed to refine them. The analogy is positively misleading if it gives the impression that all that epistemic stuff doesn’t depend on what each individual already believes – which differs from one individual to the next, and even from one moment to the next. That epistemic stuff is “subjective”, but statistics, Shannon-Weaver “information”, “surprisal” values, etc. are “objective”.

    “2. This is science we are talking about, right?”

    Well, yes and no. Sometimes we talk about it, and sometimes we actually do it. If we are merely talking about it, we should be careful not to get carried away and think we are doing it. Philosophers of science are usually engaged in the standard philosophical (i.e. “parasitic”) business of “trying to understand what is going on in a discipline”, but occasionally they actually do a bit of speculative science by contributing to it themselves. Philosophical contributions to science are very important, and your blog posts here are a fantastic example. By reading them, we understand what’s going on with ‘novelty’ much better. Biologists really need this philosophical input, and those who don’t value it are losing out. I for one value it hugely. Thanks!

    BUT (big ‘but’) this input is coming from a different, er, “level” from that of biology itself. The input involves looking at ourselves, at our own understanding, at what baffled us before about our understanding of ‘novelty’… The input involves seeing how we might escape our own bafflement by seeing how we have various ways of describing (and thinking about) things. This is philosophy, not science, although it contributes an important idea to science from outside science proper.

    “3. To call this scientism is just to use a term as a bad name; not argument.”

    Yeah, I accept that — I’m bad that way!

    “Scientism might be right, if this were actually a case of it.”

    Love the attitude, but wonder if you see my problem with scientism?

    Don’t get me wrong. Your posts on this topic have been brilliant and insightful; I have only been disagreeing here and there because I mostly agree with you.

  3. Jim Thomerson

    how does your “grain of description” relate to “level of organization”?

    • I am replacing all mentions of levels in biology; they are artifacts of our psychology and practice. “Grain of description” is about access to observable data at scales that depend upon what instruments you have available and know how to use (I can’t, for example, get any kind of information from a scanning electron microscope – I don’t even know how to turn one on). It makes no presumptions about what the “right” “level” is. Levels talk, like ranking talk, tends to project our cognitive and psychological dispositions onto the world.

  4. David Duffy

    Novelty is therefore a matter of the surprisal value of the trait relative to homologies at that scale

    A trivial characteristic for novelty would be that the trait is non-metric: we would not be surprised by an existing organ becoming larger or smaller: eg eyelessness in cave fish.

    For a novelty to be “evolutionary” it has to be at the scale at which the variation would be expected to affect survival and fitness. So, mutation at the level of the single gene is not surprising, and consequent presence or absence of a phenotype is also not surprising: canalization and buffering can mean that no phenotype is observed, even though function of one gene product has been significantly altered. Such occult variation might be unmasked by mutation at another gene, introduction of a modifier gene from another population (sexual or horizontal), or a change in environment.

    Similarly, neutral variation occurs commonly, and there are novelties of form that have little or no effect on fitness. By chance (drift), some of these can become common or fixed in a population, and in some cases then lead to speciation, and affect evolution via that mechanism. But the evolutionary novelties most cited are “game changers” such as flight, bipedalism, the vertebrate body plan, or the wearing of armour, where there is a quite direct relationship between the phenotypic changes and a long-term successful strategy, and a big radiation.

    • Sometimes metric changes are indeed thought to be novelty. Consider the debates over bat wings – all that has happened, broadly speaking, is the extension of phalanges and the patagium, yet this is evolutionary novelty permitting an evolutionary radiation.

      What counts as a game changer depends on what is specified as a game…

      • David Duffy

        Sometimes metric changes are indeed thought to be novelty: a very nice example.
        What counts as a game changer: you have given one identifier: novelties that permit an adaptive radiation. I notice there are a few papers looking at statistical testing of this, e.g.

        Bond & Opell (1998) Testing adaptive radiation and key innovation hypotheses in spiders. Evolution 52:403-14.
