Maximum likelihood (ML) is a widely used criterion for selecting optimal evolutionary trees. However, the nature of the likelihood surface for trees is still not sufficiently understood, especially with regard to the frequency of multiple optima. Here, we initiate an analytic study for identifying sequences that generate multiple optima. We concentrate on the problem of optimizing edge weights for a given tree or trees (as opposed to searching through the space of all trees). We report a new approach to computing ML directly, which we have used to find large families of sequences that have multiple optima, including sequences with a continuum of optimal points. Such data sets are best supported by different (two or more) phylogenies that vary significantly in their timings of evolutionary events. Some standard biological processes can lead to data with multiple optima, and consequently the field needs further investigation. Our results imply that hill-climbing techniques as currently implemented in various software packages cannot guarantee that one will find the global ML point, even if it is unique.
Publication title
Molecular Biology and Evolution
Volume
17
Issue
10
Pagination
1529-1541
ISSN
0737-4038
Department/School
School of Natural Sciences
Publisher
Oxford University Press
Place of publication
Great Clarendon St, Oxford, England, OX2 6DP
Rights statement
The definitive publisher-authenticated version is available online at: www.oxfordjournals.org