Multiple Sequence Alignment Using Model-Based Evolutionary Algorithms
Summary
Constructing high-quality multiple sequence alignments (MSAs) is an important problem in bioinformatics. MSAs are used for a wide range of purposes, such as phylogenetic analysis, conserved motif identification and structure prediction. In this thesis we present an approach to the MSA problem based on constructing high-quality multiple alignment profiles. The profiles used in this thesis are positional weight matrices (PWMs). We construct PWMs with a standard evolutionary algorithm and with two variants of the Gene-Pool Optimal Mixing Evolutionary Algorithm (GOMEA). We compare the performance and scalability of three algorithms: the evolutionary algorithm of Botta and Negro [9], the univariate GOMEA and the Linkage Tree Genetic Algorithm. We show that the Linkage Tree Genetic Algorithm performs significantly better than both the algorithm of Botta and Negro and the univariate GOMEA. We also show that there is no significant performance difference between the algorithm of Botta and Negro and the univariate GOMEA. Finally, we show that both GOMEA variants scale quite poorly in comparison to the algorithm of Botta and Negro.