
Ancestors [was: Re: And while on the theory of phylogenetic reconstruction...]



    Ok, since no-one seems to want to go near this, I'll give it a try...

Randy Simpson wrote:

> What was said was
> something like because cladists treat all taxa as terminals, it is
> necessarily going to require one more assumption than constructing a
> phylogeny operating according to (or allowing, at least) ancestor-descendant
> relationships. I guess because you have to hypothesize a common ancestor,
> cladistics is less parsimonious.

    Ancestors have been a sticking point between traditional paleontologists
and phylogenetic systematists for a long time. To over-generalize, the
traditional practice went something like: If you found two morphologically
similar species, A and B, in the same region (with no formal limit on what a
"region" might be, often continent-scale), and A was in strata older than B,
then A was often presumed to be the ancestor to B, unless it was deemed too
"specialized." Often, all reasonable possibilities for ancestors were
evaluated, and the most "generalized" one accepted. Only once this supply of
generalized taxa was exhausted would it be suggested that the ancestry of a
species might be unknown. To clarify: although interpretations certainly
varied, this often did not imply DIRECT ancestry, but simply that B was one
descendant of species A, not necessarily the immediate descendant. Anyway,
traditionalist phylogenies are often chock-full of ancestor-descendant
relationships. When ancestors could not be traced, they often left the
infamous "dotted line" extending towards, but not contacting, another
lineage.

    For later comparison, note that the traditional practice amounts to
setting ancestry as a null hypothesis: if A comes before B, and there is no
reason to reject A as an ancestor of B, we should consider it to be so.

    Despite a lot of rhetoric, phylogenetic systematists subscribe to
remarkably similar ideas, although they are formalized so as to reflect a
different concept of the problem of phylogeny and the relevant evidence.
Ideally, this should remove subjective judgments of specialization
(although, in practice, it may not; see the popularity of the term "basal,"
a politically correct form of "primitive"). The approach taken by many
phylogenetic systematists (including people who would call themselves
"cladists") is to set the null hypothesis as A is NOT ancestral to B. This
can be justified probabilistically; we know that there are frequent,
prominent gaps in the fossil record, gaps that span a period of time greater
than that required for major distributional shifts in faunas. We also know
that the fauna of large portions of any particular region may not be
recorded in the fossil record (fauna from environments far from major
depositional centers, e.g., highland faunas). Would you consider Pleistocene
_Homotherium_ (a "sabertoothed cat") to be a reasonable ancestor of
Holocene _Felis concolor_ (the mountain lion), simply because both are found
in California? Even excluding major dispersal events, in any case there are
probably other species from the same or adjacent regions and the same time
as species A; we may know very little about them, we may not know them at
all, but all other things being equal we should consider them to be nearly
equally likely to be the ancestor of B. This means that (again, all else
being equal) the odds of A being a lineal ancestor of B are probably less
than 50%. Hence, the null should be that A is not ancestral to B.
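
    To put rough numbers on the odds argument, here is a minimal sketch in
Python. It assumes, purely for illustration, that A and every rival candidate
are exactly equally likely to be the ancestor of B, and that the true ancestor
is somewhere among them. The function name and the candidate counts are made
up for the example.

```python
from fractions import Fraction

def ancestry_probability(n_other_candidates: int) -> Fraction:
    """Chance that A is the ancestor of B when A and each of the other
    candidate species are treated as equally likely (an assumption)."""
    if n_other_candidates < 0:
        raise ValueError("number of other candidates cannot be negative")
    return Fraction(1, 1 + n_other_candidates)

# 0 rivals -> 1, 1 rival -> 1/2, 2 rivals -> 1/3, 5 rivals -> 1/6
for k in (0, 1, 2, 5):
    print(k, ancestry_probability(k))
```

    With a single equally likely rival the chance is exactly one half; with
two or more it drops below 50%, and unknown or unsampled species only add to
the count of rivals.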

    Practically, there is no known, unequivocal, positive form of evidence
for ancestry (although possibilities are being explored). How WOULD you
"prove" that one species is ancestral to another? Like traditionalists,
phylogeneticists allow for falsification: the usual standard is that if a
species has no autapomorphies (that is, it is on a zero-length branch), it
may possibly be ancestral to its "sister-group" on the tree. This is
remarkably similar to the traditionalist approach; the difference lies in
how the data are treated. For a traditionalist, very often "generalized" and
"specialized" are global qualities; for a (responsible) phylogeneticist,
they have meaning only locally on a tree.
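
    The "no autapomorphies" standard can be sketched as a crude screen over a
character matrix. This is only an illustration under loud assumptions: the
characters are binary, already polarized (0 = primitive, 1 = derived), free of
missing data, and "autapomorphy" is approximated as a derived state seen in
one taxon only. A real analysis would read branch lengths off the optimized
tree, since these qualities are local to a tree. The function names and the
toy matrix are hypothetical.

```python
def autapomorphies(matrix):
    """Map each taxon to the character indices where it alone shows state 1."""
    n_chars = len(next(iter(matrix.values())))
    result = {}
    for taxon, states in matrix.items():
        unique = []
        for i in range(n_chars):
            others = (matrix[o][i] for o in matrix if o != taxon)
            if states[i] == 1 and all(s == 0 for s in others):
                unique.append(i)
        result[taxon] = unique
    return result

def ancestor_candidates(matrix):
    """Taxa with no unique derived states, i.e. not yet rejected as ancestors."""
    return [t for t, chars in autapomorphies(matrix).items() if not chars]

# Toy data: A shares its derived states with B; B and C each have one of their own.
example = {
    "A": [1, 1, 0, 0],
    "B": [1, 1, 1, 0],
    "C": [0, 0, 0, 1],
}
print(ancestor_candidates(example))  # ['A']
```

    Note that passing the screen is not evidence FOR ancestry; it only means
the taxon has not been falsified as an ancestor.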

    Now, there is a technical aspect of this as well, and it is somewhat
similar to the situation of hybrid speciation and phylogenetics: nearly all
current methods only allow for reconstruction of bifurcating trees, i.e.,
rooted trees that have 2n-2 branches for n taxa (2n-1 if a stem branch below
the root is counted). Although this has been represented
as a hypothesis about the pattern of evolution (i.e., that all speciation is
bifurcating), and some phylogeneticists dogmatically hold to this position,
many others see it as a computational technicality, not necessarily an
assumption about evolution. Similarly, at least in theory, the treatment of
all taxa as terminals is a technical necessity.
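
    For anyone who wants to check the branch arithmetic, here is a small
Python sketch. The count depends on convention: a fully bifurcating rooted
tree has 2n-2 branches, or 2n-1 if you also count a stem branch below the
root, and an unrooted one has 2n-3. The function name and its flags are my
own, not from any package.

```python
def bifurcating_branch_count(n_taxa: int, rooted: bool = True, stem: bool = False) -> int:
    """Number of branches in a fully bifurcating tree with n_taxa terminals."""
    if n_taxa < 2:
        raise ValueError("need at least two terminal taxa")
    if stem and not rooted:
        raise ValueError("a stem branch only makes sense on a rooted tree")
    branches = 2 * n_taxa - 2 if rooted else 2 * n_taxa - 3
    # Optionally count the branch leading down from the root.
    return branches + (1 if stem else 0)

print(bifurcating_branch_count(5))               # 8
print(bifurcating_branch_count(5, stem=True))    # 9
print(bifurcating_branch_count(5, rooted=False)) # 7
```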

    Although not everyone agrees, some of us take the approach that a tree
is a representation, but not necessarily a literal portrayal of evolution.
The pattern cladists, and many traditionalists, have spent a great deal of
time emphasizing the difference between a *cladogram* and a *phylogeny*.
Probably, in most cases, the two are interchangeable. However, not all
trees, even if they are "correct" for the data, can be taken as the best
representation of the phylogeny. As many have noted, simultaneous
speciation, ancestry and descent, hybridization, and other phenomena that
are considered very likely to actually occur in nature cannot be represented
directly through an analysis (yet). This is, again, a technical limitation,
not a statement of policy. When I look at a tree, I don't see a literal
representation of phylogeny; I see a set of hypothesized relationships,
which could be viewed as a phylogenetic hypothesis given certain assumptions.
Do anything else at your own risk.

    Back to the number of branches on a tree: in probabilistic methods, it
is not possible to compare two trees with different numbers of branches
(e.g., one containing ancestors and one that does not), because they include
different numbers of parameters (there is a "zero-length branch test," but I
am not familiar with it). I would argue that comparing trees with different
numbers of branches is possible in a non-parametric analysis (i.e.,
parsimony), but again the nature of the data and the analysis make things
difficult. In the case of hybrids, given no evidence about groupings of
characters inherited as a set, it would be impossible to determine whether
homoplasy in the data is the result of hybridization or represents true
homoplasy. You could thus keep adding hybrid branches to resolve homoplasy
(at least one method of detecting hybridization, NeighborNet, has this
problem, albeit intentionally). In the case of ancestry and descent, I am
pretty sure it is not possible for a tree with a taxon at a node (i.e.,
ancestral) to be any more parsimonious than one with a taxon on a
zero-length branch. This is the problem noted above; there is no positive
form of evidence for ancestry.
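
    The parsimony point can be checked on a toy case. The sketch below is a
brute-force step counter in Python, not any published program: it assumes
binary characters, tries every 0/1 assignment at the unsampled nodes, and
fixes a node's states whenever a taxon is placed there as an ancestor. The
tree layouts, node names, and matrices are invented for the example.

```python
from itertools import product

def parsimony_length(edges, observed):
    """Fewest 0/1 state changes needed on a tree, summed over characters.

    edges    -- list of (parent, child) node names
    observed -- node name -> list of 0/1 states; covers the terminals and
                any taxon placed directly at a node (a putative ancestor)
    Nodes without observed states are tried in every 0/1 combination,
    so this brute force is only meant for tiny trees.
    """
    nodes = {name for edge in edges for name in edge}
    free = sorted(nodes - set(observed))
    n_chars = len(next(iter(observed.values())))
    total = 0
    for i in range(n_chars):
        best = None
        for combo in product((0, 1), repeat=len(free)):
            state = {name: states[i] for name, states in observed.items()}
            state.update(zip(free, combo))
            changes = sum(state[p] != state[c] for p, c in edges)
            if best is None or changes < best:
                best = changes
        total += best
    return total

# A as a terminal, sister to B, with C outside them.
a_terminal = [("root", "x"), ("root", "C"), ("x", "A"), ("x", "B")]
# A placed at the node itself, i.e. as the direct ancestor of B.
a_at_node = [("root", "A"), ("root", "C"), ("A", "B")]

no_autapomorphy = {"A": [1, 1, 0], "B": [1, 1, 1], "C": [0, 0, 0]}
one_autapomorphy = {"A": [1, 1, 0, 1], "B": [1, 1, 1, 0], "C": [0, 0, 0, 0]}

print(parsimony_length(a_terminal, no_autapomorphy),
      parsimony_length(a_at_node, no_autapomorphy))    # 3 3
print(parsimony_length(a_terminal, one_autapomorphy),
      parsimony_length(a_at_node, one_autapomorphy))   # 4 5
```

    When A has no autapomorphies the two trees tie; give A one unique
derived state and putting it at the node costs an extra step (a reversal on
the way to B). In this toy case, at least, the ancestor tree never comes out
shorter.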

    So, all that said, is "cladistics less parsimonious" because it doesn't
directly hypothesize ancestry? If you accept that the tree is not
necessarily a literal picture of evolution, this entire objection goes away.
As far as hypothesizing common ancestors, I can't agree: we begin with the
assumption that any two taxa have a common ancestry, so a particular number
of common ancestors are already hypothesized. Theoretically, I don't believe
you are actually hypothesizing an additional COMMON ancestor, because you
have actually converted a terminal to an ancestor. You may be hypothesizing
the presence of additional, unrecovered species, depending on how you
interpret the tree. This is giving me a headache... I give up trying to
figure it out. Anyway, taking the claim at face value, we *could* be "very
parsimonious" and hypothesize ONE ancestor for all the terminals. Would that
be more parsimonious than hypothesizing n-2 common ancestors for n taxa?
Obviously not, because the tree would be a hard polytomy with tremendous
homoplasy. In a manner analogous to the non-comparability of trees with
different numbers of branches in a parametric analysis, the translation
between a measure of parsimony for numbers of character transformations and
number of nodes would not be straightforward, i.e., how many character
changes = loss of a node?

    Now, if you do have a taxon on a zero-length branch, you DO arguably
have a straightforward comparison based on parsimony, because the number of
steps is constant between a tree with that branch and a tree without that
branch. So, is it more parsimonious to include a branch with no support, or
to collapse that terminal down to the node? I will actually part company with
the majority of phylogeneticists and suggest that it might be preferable to
accept the latter as the null. Statistically, this can be justified as
follows: if the null hypothesis is A is NOT ancestral to B, then there is no
powerful test of ancestry, because it is impossible to reject the null.
However, if the null is that A IS ancestral to B, then rejecting the null
becomes possible. The characters in which closely related
species (including, presumably, ancestors and descendants) differ are often
relatively plastic within and among those taxa. Further, because
many of the more useful characters may be continuous or overlapping, they
might not actually make it into the analysis. We might reasonably expect a
higher-than-normal amount of homoplasy within such groups, leading to
erroneous inference of branch lengths (through wrongly inferred
autapomorphies). Therefore, the rate of Type I error, wrongly rejecting the
null, is probably fairly high, and the test would have a pretty low level of
significance. Alternatively, if the null is NOT ancestry, then the test has no
power, but is supremely significant, because we will never wrongly reject
the null. Therefore, I would argue that accepting a null of ancestry is a
conservative approach, and a more sensitive test for ancestry, statistically
speaking.

    It would help to remember the history of the Clade Wars. Many
phylogeneticists started out reacting to traditionalist approaches
(including, specifically, "forcing" taxa into position as ancestors), and
they certainly may have over-reacted. These "cold warriors" are still
around, and still very adamant about making a stand against traditionalist
ideas. Traditionalists did proceed from some ill-advised assumptions, and
they have been quick to point out the idiosyncratic methods advocated by
some phylogeneticists as evidence that the latter do likewise. Overall,
however, the old-timers had some very progressive ideas (e.g., read
Langston's Fossil Crocodilians from Colombia), and paleontology was THE
hotbed of phylogenetic thought under their reign. As usual, the best
approach probably lies somewhere in the middle. Despite several (IMHO)
ill-advised false starts, there will probably eventually be a place for
stratigraphic data in phylogenetics (maybe not as "stratocladistics"). As we
move on into exploring phylogenies that cannot be adequately represented on
bifurcating trees, issues like ancestry will probably come back to light,
especially in the paleontological world. When all this is done, the
difference between the two approaches will probably come down to the same
points that have always favored the "phylogenetic" approach: explicitness,
testability, and reproducibility.

Hope this long, aimless rant might have clarified something!

Wagner