Bayesian analysis for morphological data!!!
At last someone explains how to do that!
John J. Wiens, James W. Fetzner, Jr., Christopher L. Parkinson & Tod W.
Reeder: Hylid Frog Phylogeny and Sampling Strategies for Speciose Clades,
Systematic Biology 54(5), 778 -- 807 (October 2005)
From page 781:
"Morphological data [in addition to several genes] were coded as binary
and multistate characters and were analyzed using parsimony and Bayesian
methods. Multistate characters involving quantitative variation along a
single axis (length or extent of ossification of a structure, number of a
meristic character) were ordered. Given that the states of these
characters were delimited based on the assumption that similarity in trait
values is informative, we believe it is only logical to use this
assumption in ordering the states. The alternative is to assume that
similarity in quantitative trait values is not informative, in which case
many taxa would have to be given a unique state for these characters
(because most taxa will not be identical), the states would be unordered,
and these characters would therefore be largely uninformative."
Meristic characters appear to be counts of something (like sacral
vertebrae). Not ordering these would mean that a change from, say, 1
sacral vertebra to 10 costs a single step (and the change back costs
another single step).
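In parsimony terms, the difference between ordered and unordered coding can be sketched as a step-cost function. This is only an illustration: the function name is made up, and the paper gives no code.

```python
def step_cost(state_a, state_b, ordered):
    """Parsimony steps needed to change from state_a to state_b.

    Hypothetical helper for illustration only.
    Ordered: every intermediate state must be passed through.
    Unordered: any change between different states costs one step.
    """
    if ordered:
        return abs(state_a - state_b)
    return 0 if state_a == state_b else 1


# 1 sacral vertebra -> 10 sacral vertebrae
print(step_cost(1, 10, ordered=True))   # 9 steps
print(step_cost(1, 10, ordered=False))  # 1 step
```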
p. 781f:
"Bayesian analyses were performed using MrBayes version 3.0b4 [...].
Analyses of the morphological data used two replicate searches of 10.0 x
10^6 generations each, sampling every 1,000 generations, with four chains
and default priors (i.e., equal state frequencies; uniform shape
parameter; all topologies equally likely a priori; branch lengths
unconstrained:exponential). [...] The phylogeny was estimated from the
majority-rule consensus of post-burn-in trees pooled from the two
replicates. [...]
Bayesian analysis of the morphological data was performed using the
maximum likelihood model for discrete morphological character data (Markov
_k_ or Mk) developed by Lewis (2001). The data were modeled under the
assumption that only characters that varied among taxa were included
(i.e., coding = variable; see Lewis (2001)). Analyses were performed both
including and excluding a parameter for variation in rates of change among
characters (using the gamma distribution; Yang, 1993, 1994). We then
compared the fit of these models to our data using the Bayes factor
(following Nylander et al., 2004). The Bayes factor (B10) represents the
ratio of the model likelihoods of the two models under consideration.
Values of [...] [2 ln B10] were calculated (i.e., two times the difference
between the harmonic means of the log-likelihoods (post burn-in) of the
two models) and values > 10 were considered to be very strong evidence
favoring one model over the other (Kass and Raftery, 1995). The harmonic
mean of the log-likelihoods was calculated using the _sump_ command in
MrBayes, based on the pooled likelihood scores of the post-burn-in trees
from the two replicate searches for each model. These analyses strongly
favored the Mk + Gamma model (Mk-v of Lewis (2001), lnL = -3,723.62) over
the Mk model (lnL = -3,850.67), with a Bayes factor of 254.10. Only
results from the former analysis are presented."
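The arithmetic behind the quoted Bayes factor can be checked directly. A minimal sketch, assuming the two lnL values in the quote are the harmonic means of the log-likelihoods as reported by _sump_; the function name is hypothetical:

```python
def two_ln_bayes_factor(lnl_model1, lnl_model0):
    """2 ln B10: twice the difference between the harmonic-mean
    log-likelihoods of model 1 and model 0."""
    return 2.0 * (lnl_model1 - lnl_model0)


LNL_MK_GAMMA = -3723.62  # Mk + Gamma, from the quote
LNL_MK = -3850.67        # plain Mk, from the quote

value = two_ln_bayes_factor(LNL_MK_GAMMA, LNL_MK)
print(f"{value:.2f}")  # 254.10, matching the paper
print(value > 10)      # True: "very strong evidence" by the quoted threshold
```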
Er, yeah. Erm. I don't quite get all of this, but it looks like a very
promising approach for the future, and maybe for the present. Note that
the Gamma parameter does not require you to specify in advance which
characters evolve faster than others.
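Why no such input is needed can be sketched as follows: under the Mk model the probability of a change along a branch depends only on the number of states and the branch length, and with gamma rate variation that probability is averaged over possible rates for every character alike. This is a rough sketch, not what MrBayes does internally (as far as I know it uses a small number of discrete rate categories); here the averaging is done by plain Monte Carlo sampling, and all function and parameter names are assumptions.

```python
import math
import random


def mk_prob(same_state, k, t):
    """Mk probability of ending in the same (or one particular different)
    state after a branch of length t, with k states and t measured in
    expected changes per character."""
    decay = math.exp(-k * t / (k - 1))
    if same_state:
        return 1.0 / k + (k - 1.0) / k * decay
    return 1.0 / k - decay / k


def gamma_averaged_prob(same_state, k, t, shape, draws=20000, seed=1):
    """Average the Mk probability over gamma-distributed rate multipliers
    with mean 1 (Monte Carlo stand-in for the usual discrete categories)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        rate = rng.gammavariate(shape, 1.0 / shape)  # shape, scale; mean 1
        total += mk_prob(same_state, k, rate * t)
    return total / draws


# Binary character, branch length 0.5, strong rate heterogeneity (shape 0.5)
print(mk_prob(False, 2, 0.5))
print(gamma_averaged_prob(False, 2, 0.5, shape=0.5))
```

Nothing in the second function says which character is fast or slow; only the single shape parameter is estimated from the data.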
p. 786:
"_Results_
_Morphological Data_
Parsimony and Bayesian analyses gave similar results for most analyses
in this study, and differences generally involved branches only weakly
supported by one or both methods. Given that we expect model-based methods
to provide phylogenetic estimates that are as accurate or more accurate
than those from parsimony (e.g., all data sets show demonstrably poor fit
to the simple model of character change assumed by equally weighted
parsimony), and in order to conserve space and paper, we present and
describe trees from the Bayesian analyses only (for all types of data).
However, we indicate congruent support from parsimony bootstrapping on all
trees, and describe many parsimony results in the text."
As an aside, the (fully resolved!!!) morphological tree is, to the
authors' own surprise, thoroughly weird. The traditional classification
(morphology-based) is much more similar to the molecular tree and to the
combined tree (which the authors prefer). I guess the reason is the
unfavorable ratio of taxa (79) to morphological + life history +
chromosomal characters (144). Compare the theropod analysis in The
Dinosauria -- 75 taxa, 638 characters (and a few impressive plesiomorphies
nevertheless). Way to go.
Besides, the count of 144 characters is IMHO artificially inflated.
Consider characters 35 and 36: "Tympanic annulus: (0) absent, (1)
present" and "Tympanic annulus: (0) separate from crista parotica, (1)
fused to crista parotica". Why not fuse these into a single unordered
multistate character, "Tympanic annulus: (0) absent, (1) separate from
crista parotica, (2) fused to it"? There are several more such pairs,
like 137 and 138 or 25 and 26 (...which might also be correlated with
24). The authors themselves identify probable correlation in only two
cases: among characters that are probably adaptations for climbing, and
in the convergence of a specialized tadpole type (coded as several
characters) between an ingroup and an outgroup taxon.
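The recoding suggested for characters 35 and 36 can be sketched like this. How the authors scored character 36 in taxa that lack the annulus is not stated here, so the sketch assumes it is inapplicable (None); the function name is made up.

```python
def merge_tympanic_annulus(present, fused):
    """Combine two binary characters into one unordered multistate one.

    present: character 35, 0 = absent, 1 = present
    fused:   character 36, 0 = separate from crista parotica, 1 = fused,
             None = inapplicable or unknown (an assumption of this sketch)
    Returns 0 (absent), 1 (separate), 2 (fused), or "?" if unknown.
    """
    if present == 0:
        return 0
    if fused is None:
        return "?"
    return 1 if fused == 0 else 2


print(merge_tympanic_annulus(0, None))  # 0: absent
print(merge_tympanic_annulus(1, 0))     # 1: present, separate
print(merge_tympanic_annulus(1, 1))     # 2: present, fused
```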
I won't bore you with the outcome, except to mention that most
leptodactylids aren't leptodactylids. Next I'll read:
Paul O. Lewis: A Likelihood Approach to Estimating Phylogeny from Discrete
Morphological Character Data, Systematic Biology 50(6), November 2001
Oh, and the main message of Wiens et al.: Don't be afraid of missing data.
:-)