Two kinds of natural classification, and hybrid classifications

It is fairly clear to anyone reading the last century’s discussions about classification that there are, with respect to natural classification, two main approaches. These are, roughly, classification based on shared causal properties and classification based on shared phenomenal properties. In the debates between the “pheneticists”, who used computer-based techniques to measure the “similarity” of “operational taxonomic units”, and the “evolutionary systematists” such as Mayr and Ashlock, the dispute was not over whether things should be grouped according to their apparent similarities, for both sides agreed that they should, but over which similarities to measure. Mayr held that the phenetic techniques’ assumption of “theory-free classification” was unwarranted. Theory, he said, should guide our classifications.

At the same time, a movement to classify by shared properties called homologies arose out of the German systematics literature. Many of the early writers in this movement, such as Colin Patterson, thought that homologies were theory-free. Mayr attacked these systematists, too. He thought that while we had to make our classifications rest on genealogical foundations, we also needed to “recognise” that groups like crocodilians were in fact more closely allied to reptiles than to birds. This is the inverse of the cladistic classification, in which crocodilians are grouped with birds because they share a more recent common ancestor with birds than with the rest of what we call “reptiles”. So instead of a cladogram, which merely shows the topology of evolutionary relationships, Mayr wanted a “phyletic diagram” (figure 1), which also showed the “degree of difference” between branches of the evolutionary tree.

Figure 1: An “orthodox” phyletic diagram as proposed by Ernst Mayr (1965: 82)

Mayr misunderstood cladistic classification to the extent that he even denied it was classification, asserting instead that it was merely “cladification”, or a simple system for ordering data. Unfortunately, this was an outcome of special pleading, since it is widely agreed that classification is an ordering of data. I think that Mayr, and many others since, were a little confused, in that they took their own practice to exhaust what the act of classification was. Moreover, by making “degree of difference” a key element of classification, he and his followers conflated the two kinds of classification. One kind, the causal process definition, gave the overall topology of the evolutionary tree (note that the topology is independent of the abscissa in Mayr’s diagram; you can rotate any of the branches and change the angles and retain the topology). The other kind, the apparent or phenomenal similarity kind, relates things independently of the causal process (a taxon can have evolved a lot or very little, and yet branched off in the same tree shape).
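The point that topology survives rotation of the branches can be made concrete with a small sketch. It assumes a hypothetical encoding in which a rooted tree is a nested tuple of taxon names, and it compares trees only by the sets of taxa under each internal node. The names `leaves`, `clades` and `same_topology` are illustrative and not drawn from any library, and “lizards” stands in here for the non-crocodilian reptiles.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a tree is a taxon name or a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def leaves(tree: Tree) -> FrozenSet[str]:
    """Collect every taxon name found under this node."""
    if isinstance(tree, str):
        return frozenset([tree])
    result: FrozenSet[str] = frozenset()
    for child in tree:
        result |= leaves(child)
    return result


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return one leaf set per internal node; child order is discarded."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def same_topology(a: Tree, b: Tree) -> bool:
    """Two rooted trees share a topology when their clade sets are equal."""
    return clades(a) == clades(b)


# Cladistic grouping: crocodilians are sister to birds.
original = ("lizards", ("crocodilians", "birds"))
# The same tree with both branch points rotated.
rotated = (("birds", "crocodilians"), "lizards")
# A different grouping: crocodilians placed with lizards instead.
regrouped = (("lizards", "crocodilians"), "birds")

print(same_topology(original, rotated))    # rotation changes nothing
print(same_topology(original, regrouped))  # regrouping changes the tree
```

Nothing in this encoding records branch lengths or angles, which is the sense in which “degree of difference” is extra information layered on top of the topology rather than part of it.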

These “hybrid” classifications impede rather than enable inference, in ways that I will discuss later. The reason is that one kind of classification relies solely upon our ability to properly represent facts about the mind-independent world, while the other relies solely upon our subjective cognitive dispositions. If you mistake, as it is all too easy to do, your own subjective judgements for judgements about the world, you are asking for inferential problems, and such problems ran rampant, and still do, in biological classification. Hybrid classifications occur in other sciences too, especially the paletiological (historical, or “special”) sciences, although they are not absent from the general sciences of physics and chemistry either.

The two kinds of classification employ two different measures. The causal process model employs shared properties that derive from the causal process itself. In historical sciences, that causal property set is conserved over time by a commonality of the causal process itself. For example, in earth sciences, one might identify a mineral based on its chemical composition and structure, or on the process by which these are generated. The phenomenal model relies on properties that appear shared to an observer, and these are subjective (or at best intersubjective). Sometimes the two can coincide, and this will be the case when what is salient to the observer is caused by shared causal processes. But given that what is salient to an observer is a fact about the observer, and not about the world, there is no guarantee that salience is an interesting property. Inferences based on salience alone are classic examples of the ontological fallacy, the belief that because we have words or ideas for something, there is a something there. Thus, hybrid classifications are likely to mislead.
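A toy illustration of how the two measures can disagree follows. Everything in it is an assumption made for the sake of the example: the taxa A, B and C, the five binary characters, and the two scoring functions are hypothetical, and neither function is a real phenetic or cladistic method. Counting all matching states stands in for overall apparent similarity; counting only shared derived states stands in for similarity inherited through a common causal history.

```python
from itertools import combinations

# Hypothetical character matrix: 0 = ancestral state, 1 = derived state.
# Taxon C has changed a great deal, taxon A hardly at all.
characters = {
    "A": (1, 0, 0, 0, 0),
    "B": (1, 1, 0, 0, 0),
    "C": (1, 1, 1, 1, 1),
}


def matching_states(x, y):
    """Overall similarity: count characters with the same state, of any kind."""
    return sum(a == b for a, b in zip(x, y))


def shared_derived(x, y):
    """Count only characters in which both taxa show the derived state."""
    return sum(a == 1 and b == 1 for a, b in zip(x, y))


def closest_pair(score):
    """Return the pair of taxa with the highest score under the given measure."""
    pairs = combinations(sorted(characters), 2)
    return max(pairs, key=lambda p: score(characters[p[0]], characters[p[1]]))


print(closest_pair(matching_states))  # ('A', 'B')
print(closest_pair(shared_derived))   # ('B', 'C')
```

In this made-up matrix, overall similarity pairs A with B because both retain many ancestral states, while the derived-state count pairs B with C. A diagram that mixes the two scores would answer neither question cleanly.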

In the next post I will name the grounds for these two kinds of classification, and give some guidelines on inferences made on them.

Late note: A new paper has just been published in Annual Review of Ecology, Evolution, and Systematics which argues that to be able even to identify adaptive radiations of the kind of which Mayr was fond, one first needs to have a phylogeny. I have asked the author for a copy and I’ll comment on it when I have read it.

6 thoughts on “Two kinds of natural classification, and hybrid classifications”

  1. Totally OT:

    Just when I’d got to find out where you’d put things… it all moves!

    Like the new layout John. Don’t drop the link to OB’s!

  2. There is, however, a relevant “natural” observer: the ecological environment that provides selective pressure on the evolving population. As long as we can keep our subjective observations roughly aligned with this observer, our classifications can be useful.

    For purposes of studying evolution, we need a classification system in which lizards and alligators are more closely aligned to each other than to birds, despite the cladistic situation. Then we can create speculative scenarios WRT the sequence of evolutionary pressures that produced birds, derived from a “lizard shaped” ancestor.

    This might make classification more difficult, but IMO any system of analysis should be evaluated more on its cost/benefit ratio than its simple cost (difficulty). The benefits of a “hybrid” classification system that matched all the evolutionary processes rather than simply genetic ancestry would be enormous.

  3. You discover two poles of a spectrum with some hybrids in between. Is this scheme based on causal or phenomenal distinctions between schools of classification? Are you using subjective or objective criteria? Classify not lest ye be classified.

    More seriously, I strongly dislike the implication that people who classify based on genetic sequences are dealing with causes, whereas people who measure features of phenotypes are dealing with mere phenomena. In fact, you have it backward in some sense. Sequence-based methods best find the true historical tree when the sequence divergence is causally silent – selectively non-neutral sequence changes mess up the “clock”. And, in the same vein, it is generally agreed that shark, tuna, and dolphin have similar shapes for the same cause – their shared membership in the guild of large aquatic predators. Yet we exclude this kind of cause from our analysis because we have decided that we are really not interested in either causes or effects – we are only interested in getting the “true tree”.

    1. I’ll get to that, eventually. But allow me to say that the existence of sorites and vague groups does not create an insoluble problem for objectivity in classification.

      But as to your second point, I would not privilege molecular data over phenotypic data; largely because molecular sequences are a kind of morphology. We do, however, exclude homoplasies because of causes, so causes count. But there are different causes.

      Let me know when we get that true tree…

      1. Let me know when we get that true tree…

        Sure, it should happen soon after we learn all the natural kinds, but probably before we nail down the real ontology of the universe.

        But more seriously, we seek the true (historical) tree because we consider that tree to be (one component of) the true cause. Of course, we know that constructing the true tree is an unattainable ideal, yet it is still the ideal we strive for. We shouldn’t replace that ideal with the attainable, but relatively pointless, ideal of “objectivity”.

        1. Nor am I supposing that we can, but at best we have trees that represent the structure of the data we have, and so it is crucial that the data is relevant data, and not data about our dispositions. Any cognitive act of ours is necessarily going to be human, and therefore subjective to some degree – the ideal of science is to reduce that subjectivity and not to ensconce it in our scientific political structures – but that doesn’t mean we give our classifications up to subjectivity. And the best representation of things is one that matches causal processes.

          Incidentally, the tree is not a cause. It is an outcome of causes. It is the pattern left by the causes, many of which are as yet unknown to us (which is why we investigate that data structure).
