Auto speed dating online
For the problem of inferring rate and time parameters on a fixed tree, we perform simulations, compare hill-climbing with MCMC on a plant gene dataset, and carry out a dating analysis on an animal mtDNA dataset, showing that our methodology enables efficient, highly accurate analysis of very large trees.
The methodology is easily adapted to take data from fossil records into account and it can be used together with a broad range of rate and substitution models.
The number of supporters of likelihood-based estimation has steadily increased, and it is now widely considered the most accurate approach.
Our contribution leaves the field open for fast and accurate dating analysis of nucleotide sequence data.
Modeling branch substitution rates and divergence times separately allows us to include birth-death priors on the times without assuming a molecular clock.
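To make the separation of rates and times concrete, here is a minimal sketch in which each branch length is the product of a per-branch substitution rate and the time span between the two nodes it connects. The toy tree, the node ages, the rate values and the function name `branch_lengths` are all hypothetical illustrations, not taken from the paper.

```python
# Hypothetical toy tree, stored as child -> parent.
parent = {"A": "X", "B": "X", "X": "R", "C": "R"}

# Hypothetical node ages (time before present); leaves are at time 0.
node_time = {"A": 0.0, "B": 0.0, "C": 0.0, "X": 1.0, "R": 2.5}

# Hypothetical substitution rate on the branch above each non-root node.
rate = {"A": 0.10, "B": 0.20, "X": 0.05, "C": 0.10}


def branch_lengths(parent, node_time, rate):
    """Return expected substitutions per site for every branch.

    Each branch length is rate * duration, so rates and times remain
    separate parameters that can each carry their own prior.
    """
    lengths = {}
    for child, par in parent.items():
        duration = node_time[par] - node_time[child]
        if duration <= 0:
            raise ValueError(f"parent {par} must be older than child {child}")
        lengths[child] = rate[child] * duration
    return lengths


lengths = branch_lengths(parent, node_time, rate)
```

Because the two sister branches above A and B span the same time but carry different rates, they get different lengths, which is exactly what a strict molecular clock would forbid.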
From the results of our example analyses, we conclude that our methodology generally avoids getting trapped early in local optima.
For the cases where this can nevertheless be a problem, for instance when we infer the tree topology in addition to the parameters, we show that the problem can be evaded by using a simulated-annealing-like (SAL) method in which we favour tree swaps early in the inference and shift our focus towards rate and time parameter changes later on.
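One way to picture such a schedule is a move selector whose probability of proposing a tree swap decays as the search proceeds. The sketch below is only an illustration of that idea: the geometric decay, the start and end probabilities and the names `topology_move_probability` and `choose_move` are assumptions, since the text does not specify the actual SAL schedule.

```python
import random


def topology_move_probability(iteration, total_iterations,
                              p_start=0.8, p_end=0.05):
    """Probability of proposing a tree swap at a given iteration.

    Assumed schedule: geometric interpolation from p_start down to p_end.
    """
    frac = iteration / max(1, total_iterations - 1)
    return p_start * (p_end / p_start) ** frac


def choose_move(iteration, total_iterations, rng):
    """Pick a move type: tree swaps early, rate/time updates later."""
    p_swap = topology_move_probability(iteration, total_iterations)
    return "tree_swap" if rng.random() < p_swap else "rate_time_update"


rng = random.Random(1)
total = 10_000
moves = [choose_move(i, total, rng) for i in range(total)]
early_swaps = moves[:1000].count("tree_swap")
late_swaps = moves[-1000:].count("tree_swap")
```

In this toy run `early_swaps` should be much larger than `late_swaps`, mirroring the shift from topology changes to rate and time changes.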
Models with birth-death priors on tree branching and auto-correlated or iid substitution rates among lineages have been proposed, enabling simultaneous inference of substitution rates and divergence times.
This problem has, however, mainly been analysed in the Markov chain Monte Carlo (MCMC) framework, an approach requiring computation times of hours or days when applied to large phylogenies.
Although maximum likelihood (ML) phylogenetic inference is generally quicker than stochastic inference with MCMC, the time complexity of ML algorithms has also been prohibitive for the analysis of large phylogenies.
We demonstrate that a hill-climbing maximum a posteriori (MAP) adaptation of the MCMC scheme results in a considerable gain in computational efficiency.
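The general idea of turning an MCMC scheme into a hill-climbing MAP search can be sketched as follows: keep the proposal mechanism, but accept a proposed state only when it increases the posterior. This is a generic illustration under stated assumptions, not the authors' implementation; the quadratic `toy_log_posterior`, the Gaussian `toy_propose` and the function name `hill_climb_map` are hypothetical stand-ins for a real phylogenetic posterior and its proposal moves.

```python
import random


def hill_climb_map(log_posterior, propose, start, n_steps, rng):
    """Greedy MAP search that reuses MCMC-style proposals.

    Unlike Metropolis-Hastings, a candidate is accepted only when it
    strictly improves the log posterior.
    """
    current = start
    current_lp = log_posterior(current)
    for _ in range(n_steps):
        candidate = propose(current, rng)
        candidate_lp = log_posterior(candidate)
        if candidate_lp > current_lp:
            current, current_lp = candidate, candidate_lp
    return current, current_lp


def toy_log_posterior(params):
    """Hypothetical stand-in posterior with its mode at (0.3, 2.0)."""
    rate, time = params
    return -((rate - 0.3) ** 2) - ((time - 2.0) ** 2)


def toy_propose(params, rng):
    """Perturb one randomly chosen parameter with Gaussian noise."""
    index = rng.randrange(len(params))
    updated = list(params)
    updated[index] += rng.gauss(0.0, 0.1)
    return tuple(updated)


rng = random.Random(0)
start = (1.0, 1.0)
best, best_lp = hill_climb_map(toy_log_posterior, toy_propose,
                               start, 5000, rng)
```

Since no downhill moves are accepted, the search returns a single point estimate rather than posterior samples, which is where the saving over a full MCMC run comes from.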
We also demonstrate that a novel dynamic programming (DP) algorithm for branch length factorization, useful in both the hill-climbing and the MCMC setting, further reduces computation time.