Alexsandro Oliveira Alexandrino
Sorting Permutations by Weighted Operations
One of the main problems in Computational Biology is to find the evolutionary distance among species. In most approaches, such distance only involves rearrangements, which are mutations that alter large pieces of the species’ genome. Considering that the genome has no repeated genes, we can represent them as signed permutations, where each element corresponds to a synteny block (region of high similarity between the genomes compared) and the sign of each element corresponds to the orientation of such blocks. When using permutations, the problem of transforming one genome into another is equivalent to the problem of Sorting Permutations by Rearrangement Operations. The traditional approach is to consider that any rearrangement has the same probability to happen, and so the goal is to find a minimum sequence of operations which sorts the permutation. However, studies have shown that some rearrangements are more likely to happen than others, and so a weighted approach is more realistic. In a weighted approach, the goal is to find a sequence which sorts the permutations, such that the sum of the rearrangements’ cost of that sequence is minimum. In this work, we presented approximation algorithms for two new variations of the Sorting Permutations by Weighted Operations problem. The first variation uses a cost function related to the amount of fragmentation caused by a rearrangement. The second variation uses a cost function proportional to the rearrangement’s length, along with the constraint that operations must be short. For each variation, we considered five problems with the following rearrangement models: unsigned reversals, signed reversals, transpositions, unsigned reversals and transpositions, and signed reversals and transpositions. Considering the problems of Sorting Permutations by Fragmentation-Weighted Operations, we presented an analysis of the relation between the traditional approach and this variation, 2-approximation algorithms for each rearrangement model, experimental results of these algorithms, and upper and lower bounds for the diameter of these problems. Besides that, we showed properties of simple permutations and a 1.5-asymptotic approximation algorithm for this class of permutation considering signed reversals and/or transpositions. Considering the problems of Sorting Permutations by Length-Weighted Short Operations, we presented an analysis of the relation between the traditional approach and this variation, approximation algorithms with constant factor for each rearrangement model, and experimental results for these algorithms. Besides that, we analyzed the approximation factor for the algorithms we developed when the cost function is equal to l^α, where l is the rearrangement’s length and α > 1 is a constant.
Advisors: Zanoni Dias and Carla Negri Lintzmayer
Master's thesis (In Portuguese), University of Campinas, Institute of Computing
Download