
Phylogeny, induction, and the straight rule of homology

Last updated on 22 Jun 2018

Continuing my “natural classification” series, which I am writing with Dr Malte Ebach of UNSW.

After having experienced the circulation of the blood in human creatures, we make no doubt that it takes place in Titius and Maevius. But from its circulation in frogs and fishes, it is only a presumption, though a strong one, from analogy, that it takes place in men and other animals. The analogical reasoning is much weaker, when we infer the circulation of the sap in vegetables from our experience that the blood circulates in animals; and those, who hastily followed that imperfect analogy, are found, by more accurate experiments, to have been mistaken. [Philo, in David Hume’s Dialogues Concerning Natural Religion, Part II]

Phylogenetic classification is a form of induction. It enables us to infer the properties of an as-yet unobserved member of a clade with a very high degree of likelihood, as was pointed out by Gary Nelson in the 1970s. [1] For inductive inferences to be successful, we have to guard against the grue problem outlined by Nelson Goodman. [2]


While this is very familiar to philosophers, it is less well known to biologists, so a short summary is in order. The grue problem is based on a kind of “broken” predicate or property: ordinarily we might infer from the fact that all prior emeralds have been observed to be green that future emeralds will also be green – this would be a case of inductive inference. But in the absence of prior certainty about what the rules are, without a “straight rule”, [3] we cannot rule out the existence of another property, which Goodman calls “grue”, in which emeralds are green if observed before some time t and blue thereafter. Hence, every observation of a green emerald before t equally strengthens the inference that after t emeralds will be seen to be blue. It must be noted that this is not a claim that emeralds will change color. It is about what we can infer of unseen members of a class. A gruelike predicate is unprojectible, which is to say that it cannot be projected to unobserved entities. What inductive inference promises is projectibility, so that we can say things that are very likely to be true about unobserved members of the class.
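For readers who find a worked example easier than prose, the two predicates can be sketched in a few lines of Python. This is only an illustration: the cutoff `T`, the `Observation` class, and the function names are all invented here, and nothing hangs on the particular values.

```python
from dataclasses import dataclass

T = 100  # hypothetical cutoff time t; any value would do


@dataclass(frozen=True)
class Observation:
    time: int    # when the emerald was first observed
    colour: str  # the colour it was seen to have


def is_green(obs: Observation) -> bool:
    return obs.colour == "green"


def is_grue(obs: Observation) -> bool:
    # Goodman's predicate: green if observed before T, blue thereafter.
    if obs.time < T:
        return obs.colour == "green"
    return obs.colour == "blue"


# Every emerald observed so far (all before T) was green.
sample = [Observation(time=t, colour="green") for t in range(T)]

# Both hypotheses fit the evidence equally well...
all_green = all(is_green(o) for o in sample)
all_grue = all(is_grue(o) for o in sample)

# ...yet they disagree about an emerald first observed after T.
future_green = Observation(time=T + 1, colour="green")
future_blue = Observation(time=T + 1, colour="blue")
```

The sample satisfies both predicates, so the observations alone cannot tell us which one to project. Only the first unseen emerald separates them: `future_green` counts as green but not grue, and `future_blue` as grue but not green.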


This is more than a mere thought experiment. The infamous “black swan” example so beloved of logicians is a simple case. Swans were all observed to be white, until black swans were discovered in Western Australia in the late 17th century. Had swans been defined by a white plumage (which was common at the time), then swan plumage would have been a grue property. More generally, consider such classes as Mammalia. Mammals are defined as tetrapodal (four-limbed) animals which give live birth, lactate, and have hair. But whales have secondarily lost their hind legs and hair, not everything that lactates is a mammal*, and monotremes lay eggs. More recently it has become clear that lineages of species are not straight but gruesome. What defines a group of organisms may change in a daughter species. Consider sexuality as a trait of a group of lizards. When one species becomes parthenogenetic (secondarily asexual) we encounter a grue property for real. This applies to any single property of a group, potentially. Evolution leads to grue problems.

And yet, biology is not deeply troubled by grue problems, even though it is precisely the science that should be. While the colour of the swan’s plumage turned out not to be projectible, the new black swan was not placed in a new order or class. It was recognised to be a swan nevertheless, and placed into the existing genus, hitherto a monotypic genus. Although philosophers, who anyway tended then as now to rely upon folk taxonomic categories for their examples, were shocked in the manner of Captain Renault, biologists simply shrugged, reported the new species, and added it to the existing taxonomy. The reason is quite obvious by now: the swan was not defined, but classified upon the overall affinities it exhibited, and the fact that one homolog differed in character state from the rest was not crucial, any more than if it had a differently shaped beak from the rest of the genus.

The issue here is with what Peter Godfrey-Smith calls the “dependence relations”:

We should not make a projection from a sample if there seem to be the wrong kind of dependence relations between properties of the sampled objects. [4]

It is our claim here that homological affinity does act to provide the “right” kind of dependence relations between properties of taxa. A single failure of a homolog to project properties is insufficient to make the taxon unnatural (that is, in philosophical terms, an unprojectible class), since the class (the taxon) is formed from the overall suite of homological relations (which my coauthor and I are calling the affinity, following early 19th century taxonomic use). Affinity acts to set up taxonomic kinds, and these act as a “straight rule”, as they do tend to converge upon projectible properties.
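The contrast between defining a taxon by one character and classifying it by its overall suite of characters can be illustrated with a toy sketch. Everything in it is an assumption made for illustration: the character states are invented, and counting matching states is only a crude stand-in for homological affinity, which is not the same thing as mere similarity.

```python
# Hypothetical character states, invented for illustration only.
SWAN_PROFILE = {
    "plumage": "white",
    "neck": "long",
    "feet": "webbed",
    "bill": "flat",
    "habitat": "aquatic",
}

# The black swan differs from the profile in a single character state.
black_swan = dict(SWAN_PROFILE, plumage="black")

# A white heron shares the supposedly defining state but little else.
white_heron = {
    "plumage": "white",
    "neck": "long",
    "feet": "unwebbed",
    "bill": "pointed",
    "habitat": "wading",
}


def defined_by(specimen, character, state):
    """Single-character test: membership hangs on one defining state."""
    return specimen[character] == state


def affinity(specimen, profile):
    """Fraction of characters whose states match the profile."""
    matches = sum(specimen[c] == s for c, s in profile.items())
    return matches / len(profile)
```

On these made-up data, a definition by white plumage excludes the black swan and admits the heron, whereas the overall score puts the black swan at 0.8 and the heron at 0.4. One differing character state barely moves the grouping when the group rests on the whole suite.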

In taking inference from homology to be a kind of “straight rule”, the question is why it works. If the universe were such that properties correlated by chance, it would not work, but in the cases of the special/palaetiological sciences, properties correlate due to a shared productive cause. If the universe lacked appreciable structure of this kind, then no search method would deliver knowledge (consider Wolpert and Macready’s “No Free Lunch” theorem [5]). Assuming that properties can be correlated, the epistemic question is how to identify those that are and to distinguish them from those that aren’t, which in biological systematics is the distinction between homology and homoplasy. If there’s knowledge to be had, then one way to acquire it is to iteratively refine one’s classifications in an attempt to maximize the homological relations on which they are based.

Such inferences are, of course, quite defeasible. It should not be thought we are supposing that natural classifications are in any way certain, or that any given homolog will exemplify the same states in each taxon or object classified. Of course it will not. Rather, this is probabilistic [6] inference, in the sense that there is some likelihood or confidence that the projection will succeed for each property, and a high confidence that it will succeed for most properties. [This is akin to selection on a smooth landscape versus on a rugged landscape; selection can act on traits in a highly correlated “smooth” landscape (where adjacent coordinates are not too different in value from each other), but it fails on an uncorrelated, or “rugged” one. [7] The progress of science has been compared to an adaptive walk, [8] and similar considerations apply to inference in science as apply to selective searching of the adaptive landscape; both are special cases of a search procedure of the kind Wolpert and Macready discuss.]
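The landscape analogy can be made concrete with a toy local search. The sketch below is an assumption-laden illustration, not a model of selection: the two one-dimensional landscapes, the peak at position 60, and the multiplier 37 are arbitrary choices, and the “rugged” landscape is a deterministic scramble standing in for an uncorrelated one.

```python
def hill_climb(landscape, start):
    """Greedy local search: move to the better neighbour until none is better."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n < len(landscape)]
        best = max(neighbours, key=lambda n: landscape[n])
        if landscape[best] <= landscape[x]:
            return x  # no neighbour improves on the current position
        x = best


# Smooth landscape: adjacent positions have similar values, single peak at 60.
smooth = [-(x - 60) ** 2 for x in range(101)]

# Rugged landscape: adjacent positions have very different values.
# (x * 37) % 101 scrambles the values 0..100 deterministically.
rugged = [(x * 37) % 101 for x in range(101)]

smooth_end = hill_climb(smooth, start=0)
rugged_end = hill_climb(rugged, start=0)
```

Starting from position 0, the climber on the smooth landscape walks all the way to the peak at 60. On the scrambled landscape it halts at position 2, with value 74, well short of the maximum of 100. Local information guides the search only where neighbouring values are correlated.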

Systematics in biology, and classification in science generally, resolves many of the practical issues of gruesome induction by, as Godfrey-Smith says, ensuring that the right class is sampled by finding the right dependence relations through a process of iterative refinement. These are what we are generally calling homologies.

* If you count what a pigeon does as lactation, which, technically, it isn’t.


1 Nelson 1978.

2 Goodman 1954. See Godfrey-Smith 2003 for a discussion of the classical problem.

3 Hans Reichenbach proposed a “straight rule” for induction in his The theory of probability (Reichenbach 1949), in which induction was justified when increasing observations converged upon an asymptote. See also Salmon 1991. Here we are using it in a more general sense, as a way of ensuring that gruelike properties are eliminated.

4 Godfrey-Smith 2003: 579.

5 The No Free Lunch Theorem states that no single algorithm outperforms chance when amortized over all possible search spaces or functions (Wolpert and Macready 1997, Wolpert and Macready 1995).

6 Here we mean something like a likelihood probability. This may be Bayesian or some other statistical confidence; it doesn’t materially affect the argument which philosophical stance towards probability one adopts here, and we leave it to the reader to convert the argument to their favorite method or position on the matter.

7 Kauffman 1993, Gavrilets 2004.

8 Hull 1988, Wilkins 2008.


Gavrilets, Sergey. 2004. Fitness landscapes and the origin of species, Monographs in population biology; v. 41. Princeton, N.J.; Oxford, England: Princeton University Press.

Godfrey-Smith, Peter. 2003. Goodman’s Problem and Scientific Methodology. The Journal of Philosophy 100 (11):573-590.

Goodman, Nelson. 1954. Fact, fiction and forecast. London: University of London, The Athlone Press.

Hull, David L. 1988. Science as a process: an evolutionary account of the social and conceptual development of science. Chicago: University of Chicago Press.

Kauffman, Stuart A. 1993. The origins of order: self-organization and selection in evolution. New York: Oxford University Press.

Nelson, G. 1978. Classification and Prediction: A Reply to Kitts. Systematic Zoology 27:216-218.

Reichenbach, Hans. 1949. The theory of probability, an inquiry into the logical and mathematical foundations of the calculus of probability. 2nd ed. Berkeley: University of California Press.

Salmon, Wesley C. 1991. Hans Reichenbach’s vindication of induction. Erkenntnis 35:99-122.

Wolpert, David H., and William G. Macready. 1995. No free lunch theorems for search. Santa Fe, NM: Santa Fe Institute.

Wolpert, David H., and William G. Macready. 1997. No free lunch theorems for search. IEEE Transactions on Evolutionary Computation 1 (1):67-82.


  1. As an induction skeptic, I take a somewhat different view. I have posted a detailed comment at “On induction.”

    • John Harshman

      The classification system may not be handed to us by God (or nature), but the true relationships that we are trying to represent in our classification are. We may have difficulty in recognizing them, but we suppose that they do exist. Does that make a difference?

      • John S. Wilkins

        To what?

        Our classifications are attempts to group things according to real relations (we’ll discuss this in the book). But we do not know the real relations to begin with, because, as you say, God didn’t hand us the natural classification (contrary to Linnaeus’ self-serving comments); so we have to do two things simultaneously – gather things into natural kinds, and make inferences from them.

        I do assume there is a real structure to the world. Some philosophers may disagree…

      • John Harshman

        Actually, I was responding to Neil Rickert’s post. And he at least does appear to disagree.

  2. John Harshman

    Since we’re interested in the influence of philosophy of science on science, could you briefly sketch how this work will affect my conduct of systematics?

    • John S. Wilkins

      Deeply. Very, very, deeply.

      More seriously, John, this is a philosophy work, for philosophically inclined readers. I suspect it may affect what some people think (mostly in that they will strongly disagree with what we write, and it will reaffirm their view that holophyly and monophyly are different or homology is similarity, etc.) but its primary function is to raise with philosophers some philosophical issues that taxonomists and systematists have been discussing for a while that seem not to have made their way into philosophical debates over natural kinds, etc.

      However, we are going to do case studies well outside of biology as well, so maybe it will get a few people making some cross-comparisons with biological systematics in those disciplines too.

  3. Sam C

    You say:

    And yet, biology is not deeply troubled by grue problems, even though it is precisely the science that should be.

    Engaging cynic mode…
    Lovely exemplar of philosophy of science in action. Philosopher opines that scientists have a problem. Scientists shrug and get on with doing science. Philosopher gets grant and/or writes paper. Philosopher announces solution to problem (or that, on reflection, it never was a problem), says science may continue with his blessing. Scientists shrug and get on with doing science. The sun rises on another day.
    … end cynic mode.

    This discussion of the issue seems to assume that the appropriate inferential model is logical induction. But a more useful model of this sort of inference is Bayesian beliefs, particularly Bayesian Belief Networks. We suspect green rather than “grue” because “grue” is wildly implausible, so a prior belief of close to zero.

    Black swans are less of an issue, because we know birds (generally) come in different colours, including black, and that colours can vary widely within groups of birds; a uni-colored group is unusual. A shiny golden swan would have been less plausible than a sedate black, a swan with four wings much more implausible, an orange swan with six wings, a prehensile tail, making nests in trees out of purple pipe cleaners, subsisting on a diet of cold porridge with apricots which courted its mate by performing the Dance of the Sugar Plum Fairy while whistling “Waltzing Matilda” precisely one fifth flat would have been even more incredible.

    It is a given in biology and other sciences that we don’t know everything and have not seen everything. So categories must be provisional. Mammals don’t lay eggs, as enny fule kno. Except in Ozland. Did that cause taxonomy or biology problems? Biologists knew that reptiles aren’t furry, mammals don’t lay eggs, so how to deal with something that breaks the rules? Revise the rules and be happy that the world has become a richer and more interesting place. That’s how belief networks work. It’s not like pure mathematics, where theorem must be built on theorem, and the whole tower can come down if theorems are invalid.

    No philosopher of science should consider his/her education complete without a good understanding of both “classical” statistical and Bayesian inference. Mischaracterising the scientific method as a simplistic form of logical induction helps no-one. Black swans are not a problem to science. Never have been, never will be.

    Put more succinctly, a problem in philosophy of science is usually a problem in philosophy, not a problem in science.

  4. bob koepp

    Sam C – Could you clarify what it means for a predicate (i.e., ‘grue’) to be “wildly implausible”? Since there are many, many demonstrably grue objects in this world, the question is why this grueness isn’t “relevant” to scientific investigations.

  5. Allen Hazen

    So, you don’t like pigeon milk? How about… There is a species of fish (I think freshwater and South American) in which one or the other (or both?) parent(s) secrete some sort of nutritious stuff from their sides which the young eat. Sorry, can’t remember details. Certainly analogous (though not homologous) to mammalian lactation!

    Fish are weird. Do you know Michael Slote’s “Theory of Important Criteria” (Journal of Philosophy vol 63 (1966) pp. 211-224)? It’s a good, just pre-Kripke, attempt to outline a theory of the meaning of natural kind terms (with one goal being to explain why it was ALWAYS wrong to think that whales are fish). I think it’s a good piece of philosophy, but the prime example is “fish”… and for every candidate “important criterion” of piscitude, there is an actual species lacking it!

    • John S. Wilkins

      I find the taste too much like chicken…

      Again and again in the philosophy of logic and analytic discussions of natural kinds I find it odd that the things they refer to are folk kinds, like “tigers”, “fish”, “swans” and the like. They need to do this to support Millian essentialism of kinds, and once you start looking more closely, you find that exceptions are the rule (including exceptions to that exceptional rule) in the natural sciences outside physics and chemistry. Actual taxa are not definable, nearly all the time. To define a taxon you must abstract away all the variation and polythetic data. That is why naturalists (biologists, geologists, psychologists, etc.) cannot do this and why the better of them do not try.

      I believe, and you can advise me on this, that basically the essentialist debate in phil-lang is akin more to Aristotelian debates over the meaning of predicates (words) than about the nature of anything like actual taxa. As I have said before, Whewell got it right, and Mill changed the subject, WRT natural kinds and classifications.

  6. Allen Hazen

    Well, most of the philosophy of language stuff about essences and “natural kinds” is hopeless as an account of biological taxa: my feeling is that the “species are individuals” people were right, and that what counts for biological kinds is reference to a particular, spatio-temporally located, lineage of organisms, and not anything like essential defining properties. (Cladism rules, o.k.?) For other kinds of kinds, I’m not so sure. (David Lewis, who was in graduate school with Slote, refers to “Theory of important criteria” in the context of the ??? sociological ??? kind “convention”: I think it can be argued — appealing to Lewis’s book as evidence — that conventions ARE a kind, suitable for the construction of genuinely explanatory theories about them.)

    Making it all very ironic that biological taxa are among the standard EXAMPLES used in discussing natural kinds! … Having a certain “Bauplan” is, I guess, more like a KIND in the philosophy of language sense than belonging to a certain taxon is. It’s sort of a historical accident (though evolutionary biology wouldn’t have gotten started without it!) that Bauplans and Taxa correlate fairly well in practice. And I think it has caused a lot of confusion in the history of philosophy.

    Sorry, that was pretty vague and wafty.

    • John S. Wilkins

      Well it’s vague and wafty in an agreeable manner.

      Sure, conventions are kinds. They are functional kinds, however, based upon our dispositions and preferences, not upon the non-cognitive and non-social world; at best they are sociological kinds. What I am dealing with here are natural classifications that do not depend for their kindhood upon us, although of course any representation of them and any discovery of them involve us.

  7. John Vreeland

    Your examples of Goodman’s “paradox” finally made it clear to me. When he described it himself I assumed he was talking about the very same emerald, which seemed to exist in a superposition of possible states (blue appearance/green appearance) until it was viewed. It had me thinking philosophers were quite batty.

  8. John Harshman

    Bit o’biology: whales are fish, of course, same as all tetrapods. If “fish” is supposed to be a natural kind, that is.

    And it seems to me that biological kinds (=clades) do have definitions, just not in terms of constant, visible membership criteria. The definition is of course phylogenetic: history, not present condition. Present condition offers our main means of discovering history, but as you say it’s a polythetic means; still, that’s diagnosis, not definition. Perhaps you have a restricted definition of “definition” that requires the use of visible characteristics only?

    • John S. Wilkins

      Extrinsic properties like “distance from Alpha Centauri” can be used to group anything. Phylogenetic properties are a kind of extrinsic property, and while Paul Griffiths has argued this leads to a kind of historical essence for groups, as you do here, it is unclear to me this is a solution. There appear to be no necessary intrinsic essences (contra Michael Devitt’s arguments) for taxa, although some may, contingently, have them. But the problem is that historical essence/definition (the two are coterminous in my view, but not in everyone’s) is something that we get post hoc from the intrinsic traits or characters, and not one that gives them. So in philosophical terms, this is a very different view of essence or definition than we have previously and traditionally had.

      • John Harshman

        I was unaware it required an essence to make a natural group, and I’m not sure of your definition of “essence”. If it’s identical to “definition”, then historical groups have essences, since they have definitions. One could indeed argue that history gives characters — isn’t that after all what evolution is? Phylogenetic definitions may be recent (was it Gauthier, De Queiroz, or someone else who first made them explicit?), but a desire to make the defining feature of a group its phylogeny, and nothing more, is not. Darwin might plausibly be claimed to have felt that way, for example.

        You seem to be saying that natural groups must be defined by intrinsic characters. Why? Is that some traditional criterion?

      • John S. Wilkins

        It is generally accepted in philosophy (i.e., not in biology) that a natural kind (i.e., not group) requires essential properties. There have been challenges, mostly along the lines of homeostatic property clusters, or historical essences, but it is widely agreed that a natural kind requires some shared properties.

        It is this that we are challenging in philosophy (i.e., not in biology).

        The existence of properties that (and this phrase is crucial) “all and only” members of a group have is not, I think, required under any phylogenetic conception (well, maybe under Willman and Meier’s notion, and those who think species must be monophyletic and have constant characters). It is this philosophers’ definition of a natural kind that we have in our sights. If you think it is the case that there necessarily are characters that all and only members of all clades have, I would be interested to hear that.

      • bob koepp

        I share John Harshman’s sense that the “problem” here is the assumption that essences, or defining characters of natural kinds are “intrinsic” properties. To be sure, there is a long philosophical tradition that has maintained this position, but the arguments behind it have never been compelling. Even outside biology there are candidate natural kinds that can’t be defined in terms of intrinsic properties; e.g., planets and, more generally, satellites.

      • John Harshman

        As you know quite well, there isn’t a single character that has to unite a clade, other than shared ancestry. Any other feature can be lost or transformed, and so ferrets are fish, barnacles are crustaceans, cobras are tetrapods, etc. No essences, and just a moderate degree of predictive value. The value lies wholly in the representation of ancestry. But why represent ancestry? My personal answer is that it’s just way cool, quite aside from any use in summarizing information, however imperfectly. Knowing that you are a highly specialized, land-adapted, sarcopterygian fish gives rise to one of those insights about your place in the world that some would characterize as a spiritual experience. Same for the realization that birds are dinosaurs, whales are mammals, and we’re all opisthokonts together.

        Anyway, I think that in biological taxa you have a fine counterexample to the idea that natural kinds must have essences. So go to it.

  9. While I share John Harshman’s enthusiasm for the intrinsic value of the knowledge of phylogenetic history I’d add that there can be considerable practical/predictive utility in a “moderate degree of predictive value.” In the medical research world I work in, inferences of homology are pretty important to how the functional side of biology is studied.

    As for the more philosophical side of things I may be misunderstanding the vocabulary, so forgive me if I’m way off, but John Wilkins seems to be suggesting that there aren’t intrinsic essences that describe phylogenetic taxa, and yet it seems to me that that’s at least part of what phylogeneticists are trying to figure out: What are the synapomorphies (i.e., the diagnostic features/intrinsic essences) that define a historically related group?

    In other words as phylogeneticists we assume there are some intrinsic properties (or combinations of properties that then make up a property) that uniquely define the group, they’re just not necessarily obvious. (I’m not completely clear on how a group is different from a kind)

    On the other hand I can see the point that our definition of what is shared can be based on somewhat statistical (and thus potentially arbitrary) classification schemes. For example at the molecular level sequences are usually classified as homologous (i.e., the same essence) if they have a believable level of similarity relative to everything else, and I’m not sure if that could qualify as an intrinsic essence or not.

    • John Harshman

      Synapomorphies aren’t intrinsic essences because they can’t define a group. The synapomorphies of a group can be lost or transformed beyond recognition without affecting group membership. (The most popular example being that snakes are still members of Tetrapoda.) This is true even at the molecular level; neutrally evolving sequences, especially, can be transformed to the extent that their homology is undetectable, but they remain homologous even so.

      So, synapomorphies aren’t essences, are not properties of the taxon. They’re properties of at least some members of the taxon that, in combination, allow us to determine the true defining feature, phylogenetic relationship.

      • Thanks John Harshman. I understand that synapomorphies are not defined as obligate characteristics of every member of the group so they aren’t in themselves intrinsic essences.

        I guess I’m looking at this the other way around. If we were to find some characters that universally and exclusively define the members of a group, and that obviously reflected the shared evolutionary history of that group then we would have some sort of shared “essence.” I would still be surprised if we couldn’t identify, after the fact, characters shared by “all and only members” of any clade, though especially for very old branches there may be only a few or they might be hard to identify. Is it because we leave open the possibility that we look at a new species and find that character is no longer possessed by “all and only” we couldn’t consider them intrinsic essences?

      • John Harshman

        If I understand philosophers correctly (which may be argued) intrinsic essences are, among other things, necessary characteristics for group membership. Since taxa have no necessary characteristics, they can’t have essences. There’s nothing preventing a species that lacks every supposed essential character from being a member of the taxon in question, other than inertia. Even universal characters, if there are any (and sure, some groups have them, though I’m doubtful that we could say this for “any clade”), are matters of historical accident, not necessity.

        Or, shorter answer: I agree with your last sentence.

        • John S. Wilkins

          You understand philosophers correctly. Well, some of them, anyway.

  10. ellen clarke

    @ John wilkins 13th jan
    just to play devil’s advocate here – isn’t the idea behind saying natural kinds have to have intrinsic essences to do with projectability.
    1. only properties that are possessed necessarily by an object can ground inferences about unobservables. if they are only possessed contingently then it seems that they might change, and then we cant be sure that we will find them or their causal consequences in the future.
    2. only intrinsic properties can be possessed necessarily, because relational properties depend on other independent variables, and so might change.

    So if the only property that the members of a taxon possess necessarily is shared ancestry, then the only thing we can say for sure about unobserved organisms is that they will share ancestry with the other members of their taxa. which isnt very exciting, no?

    • John Harshman

      If you’re looking for certainty, you won’t want to study biology, because just about every rule has exceptions (notice I didn’t say “every rule”). And I do find shared ancestry very exciting.

      • bob koepp

        Darwin showed us how a huge number of facts about the geographical and temporal distribution of flora and fauna can be “ordered” on the assumption of shared ancestry. Exciting? You bet!

  11. ellen clarke

    Sure bob. so the challenge is to explain how that is possible. the assumption of a shared essence can be seen as an attempt to explain how it is possible that a huge number of facts about the geographical and temporal distribution of flora and fauna can be “ordered” on the assumption of shared ancestry. If the assumption fails, what can we put in its stead?

    • John Harshman

      I don’t understand how “shared essence” contributes anything to the game that “shared ancestry” doesn’t do all by itself. I’m also confused about which assumption (essence or ancestry) is failing in your terminal question. Can you explain?
