Re: Morpho v molecular (was Re: Tinamous: living dinosaurs)
- To: dinosaur@usc.edu
- Subject: Re: Morpho v molecular (was Re: Tinamous: living dinosaurs)
- From: evelyn sobielski <koreke77@yahoo.de>
- Date: Thu, 7 Jul 2011 18:31:17 +0100 (BST)
- In-reply-to: <CA+nnY_HdA2c=Kh4jOdbSRXQ1Ae1RQMJfP3YSrOjMW4P0NOOmEQ@mail.gmail.com>
- Reply-to: koreke77@yahoo.de
- Sender: owner-DINOSAUR@usc.edu
> the phylogenetic signal. Across multigene datasets, structure is
> additive but random noise will be averaged out.
Not as easily as assumed, because the noise *is subtracted* from the signal.
The deeper you go in time, the less signal you have and the more noise (random
or not) you have. The signal/noise ratio is more interesting than the absolute
amount of signal for easy calculations of branch support ("easy" meaning those
that only consider the best-supported pairing and not "almost-as-good"
alternatives). IF you compare the support for ALL POSSIBLE sisters of one
branch, then indeed the signal should conspicuously add up. But if you don't,
adding noise simply increases support for alternative
(not most likely / not most parsimonious) sisters, though you won't know which
ones, while decreasing support for the most likely/most parsimonious one.
So you'll probably end up with the same topology, but even with *random* noise
it will be less well supported if the SNR decreases.
(For complete averaging-out, you'd need an infinite amount of data in any
case. But for practical purposes, it's enough if any noise-generated "second
bests" according to one dataset A are prevented by the other datasets from
becoming any better.)
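To make the SNR argument concrete, here is a toy sketch in Python. It is only
an illustration under strong assumptions, not how ML or MP support is actually
computed: every "signal" character is assumed to favour the true sister, and
"noise" characters are assumed to be spread evenly over all candidate sisters.
The function name and the numbers are hypothetical.

```python
def expected_support(signal: float, noise: float, n_sisters: int) -> dict:
    """Expected share of characters favouring each candidate sister.

    Toy assumption: all signal characters favour the true sister, and
    noise characters are spread evenly over all n_sisters candidates.
    """
    total = signal + noise
    per_sister_noise = noise / n_sisters
    best = (signal + per_sister_noise) / total
    other = per_sister_noise / total
    return {"best": best, "each_alternative": other, "margin": best - other}


# Same amount of signal, increasing amounts of random noise.
for noise in (0, 100, 400):
    print(noise, expected_support(signal=100, noise=noise, n_sisters=5))
```

In this toy model the true sister always stays in front, so the topology does
not change, but its share of the characters and its margin over each
alternative shrink as the noise grows.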
Bremer support and similar methods (those that show how much better the
best-supported sister pairing is versus others) may be the only easily-computed
workaround: if you add datasets and see ML/MP support decreasing but Bremer
support increasing, that pattern is likely due to deep-time noise.
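As a minimal sketch of the idea behind Bremer support for a sister pairing:
given the number of steps required by the shortest tree found for each
candidate sister, the value is the gap between the best pairing and the
runner-up. This is a simplification; a real analysis gets those tree lengths
from constrained tree searches, which are assumed to have been run already.
The function name and the step counts are hypothetical.

```python
def bremer_support(steps_by_sister: dict[str, int]) -> tuple[str, int]:
    """Return the best-supported sister and its lead over the runner-up.

    steps_by_sister maps each candidate sister to the length (in steps)
    of the shortest tree found that contains that pairing.
    """
    if len(steps_by_sister) < 2:
        raise ValueError("need at least two candidate sisters to compare")
    # Fewest steps first: the most parsimonious pairing leads the list.
    ranked = sorted(steps_by_sister.items(), key=lambda item: item[1])
    (best_name, best_steps), (_, runner_up_steps) = ranked[0], ranked[1]
    return best_name, runner_up_steps - best_steps


# Hypothetical tree lengths for three candidate sisters of one lineage.
lengths = {"sister_A": 100, "sister_B": 104, "sister_C": 110}
print(bremer_support(lengths))
```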
And from my observations, it looks like adding taxa is, from some point
onwards, more helpful than adding characters, at least in molecular studies,
not least because it allows ancestral states to be inferred more robustly.
Particularly pesky nodes ought to be tested more thoroughly. It would help,
for example, to know the ML/MP value *distribution* for a lineage and for all
its possible sisters. Is one possible sister conspicuously
better-supported than the others? Bremer support can show this. But if the
Bremer support value is not so high, what then? Can we narrow it down to 2
alternatives? If there is sufficient signal, there should be very few pairings
with moderate-to-high ML/MP support, and a long tail of pairings with support
values close to zero.
Is there a measure for "number of alternative sisters with no more than 10%
increase in required steps presuming the next-lowest node doesn't change"?
("10%" is arbitrary; it could be 5% or 15% or whatnot, depending on what proves
to be a good cut-off value.)
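Whether an established measure of this kind exists is left open here, but the
quantity itself is easy to sketch. The following Python assumes the steps
required for each candidate sister are already known (with the next-lowest node
held fixed) and simply counts the alternatives that fall within the chosen
tolerance of the best one. The function name, the default of 10% and the
example numbers are all hypothetical.

```python
def near_optimal_sisters(steps_by_sister: dict[str, int],
                         tolerance: float = 0.10) -> list[str]:
    """List alternative sisters needing at most `tolerance` more steps.

    steps_by_sister maps each candidate sister to the steps required
    when it is paired with the lineage of interest.
    """
    best_name = min(steps_by_sister, key=steps_by_sister.get)
    cutoff = steps_by_sister[best_name] * (1 + tolerance)
    return sorted(
        name
        for name, steps in steps_by_sister.items()
        if name != best_name and steps <= cutoff
    )


# Hypothetical step counts for five candidate sisters.
steps = {"A": 100, "B": 104, "C": 109, "D": 125, "E": 140}
alternatives = near_optimal_sisters(steps, tolerance=0.10)
print(len(alternatives), alternatives)
```

A count of zero would mean one sister is conspicuously better than all others;
a large count would flag a node where the data barely discriminate between
placements.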
Regards,
Eike