Phylogeny of Mixture Models: Maximum Likelihood, Ambiguity, and Linear Tests
Speaker
Daniel Stefankovic, University of Rochester
http://www.cs.rochester.edu/~stefanko/
Description
It is well known that phylogenetic trees can vary between genes. Even within regions that share the same tree topology, mutation rates often vary. This motivates the study of phylogenetic reconstruction in heterogeneous settings. We study the (im)possibility of reconstructing the underlying phylogeny when data is generated from a mixture of trees (same topology, different branch lengths). We first show the pitfalls of popular methods, including maximum likelihood and Bayesian Markov chain Monte Carlo (BMCMC) algorithms. We then determine for which evolutionary models reconstructing the tree topology under a mixture distribution is (im)possible. We prove that every model either has ambiguous distributions, in which case reconstruction is impossible in general, or admits linear tests that identify the topology. This duality theorem relies on our notion of linear tests and uses ideas from linear programming duality. Linear tests are closely related to linear invariants, which were first introduced by Lake. Joint work with Eric Vigoda.
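To make the setting concrete, the following is a minimal illustrative sketch and not the speakers' construction. It assumes the two-state symmetric (CFN) model on a four-taxon tree with topology 12|34, and it treats a linear test as a linear functional on site-pattern distributions. All function names and parameter values are hypothetical. The sketch builds two distributions on the same topology with different branch lengths, mixes them, and checks that a linear functional of the mixture equals the weighted average of its values on the components. This linearity is why a functional that has the same sign on every single-tree distribution of a topology keeps that sign on every mixture.

```python
from itertools import product


def edge_prob(p, a, b):
    """Probability of seeing state b at one end of an edge given state a at
    the other, where p is the edge's flip probability (CFN model)."""
    return p if a != b else 1.0 - p


def quartet_distribution(pendant, internal):
    """Site-pattern distribution for the quartet tree 12|34.

    pendant  -- flip probabilities of the edges to leaves 1, 2, 3, 4
    internal -- flip probability of the internal edge u-v
    Leaves 1 and 2 hang off internal node u; leaves 3 and 4 hang off v.
    """
    p1, p2, p3, p4 = pendant
    dist = {}
    for leaves in product((0, 1), repeat=4):
        total = 0.0
        # Sum over the unobserved states of the two internal nodes.
        for u, v in product((0, 1), repeat=2):
            total += (
                0.5  # uniform distribution at node u
                * edge_prob(p1, u, leaves[0])
                * edge_prob(p2, u, leaves[1])
                * edge_prob(internal, u, v)
                * edge_prob(p3, v, leaves[2])
                * edge_prob(p4, v, leaves[3])
            )
        dist[leaves] = total
    return dist


def mixture(components, weights):
    """Convex combination of site-pattern distributions."""
    return {
        pattern: sum(w * d[pattern] for w, d in zip(weights, components))
        for pattern in components[0]
    }


def linear_functional(dist, coeffs):
    """Evaluate sum over patterns of coeffs[pattern] * dist[pattern]."""
    return sum(coeffs[pattern] * dist[pattern] for pattern in dist)


# Two hypothetical branch-length assignments on the same topology.
d1 = quartet_distribution((0.05, 0.30, 0.05, 0.30), 0.10)
d2 = quartet_distribution((0.30, 0.05, 0.30, 0.05), 0.10)
weights = (0.3, 0.7)
mix = mixture([d1, d2], weights)

# An arbitrary linear functional, chosen only to demonstrate linearity:
# +1 when leaves 1 and 2 agree, -1 otherwise.
coeffs = {pat: (1.0 if pat[0] == pat[1] else -1.0) for pat in mix}

value_on_mixture = linear_functional(mix, coeffs)
weighted_average = (
    weights[0] * linear_functional(d1, coeffs)
    + weights[1] * linear_functional(d2, coeffs)
)
```

Here `value_on_mixture` and `weighted_average` agree up to floating-point error. An ambiguous distribution, in the sense of the abstract, would be a single distribution obtainable as such a mixture on two different topologies; the sketch does not attempt to construct one.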