Bayesian Inference of Phylogeny Using Variable Number of Tandem Repeats and Markov Chain Monte Carlo
Summary
We implement and evaluate methods for inferring the phylogeny of tuberculosis isolates from Variable Number of Tandem Repeats (VNTR) data through Bayesian inference and Markov Chain Monte Carlo, using an existing transition rate matrix (Sainudiin et al., 2004). By also inferring the phylogeny under the model of Hasegawa, Kishino and Yano (HKY) from nucleotide data of the same isolates, we are able to compare the phylogenies obtained under the two models both quantitatively and qualitatively. Using simulated data, we assess how well the true phylogeny can be inferred under the Sainudiin and HKY models at different levels of mutational saturation. We show how the Sainudiin and HKY models can be combined to yield a phylogeny that is better resolved and more accurate than one obtained from either model alone. By changing the model for the mutation rate proportionality in the Sainudiin model, we are able to use the estimates of the model parameters to speculate on the mechanisms by which VNTRs mutate. The developed methods have been made available in the package BEASTvntr for BEAST 2.