27 Mar 2024
09:00 Doctoral defense Fully distance
Theme
Variations of the Rearrangement Distance Problem
Student
Alexsandro Oliveira Alexandrino
Advisor / Teacher
Zanoni Dias - Co-supervisor: Ulisses Martins Dias
Brief summary
Considering a pair of genomes from organisms of related species, rearrangement distance problems aim to estimate how far one of them is in relation to the other in terms of genome rearrangements, which are mutational events capable of modifying the genetic material or position relative of segments of a genome. Considering the Principle of Maximum Parsimony, the term distance, or rearrangement distance, is defined as the minimum number of genome rearrangements necessary to transform one genome into another. The first works that studied the distance of rearrangements assumed that the compared genomes have the same set of genes (balanced genomes) and, furthermore, only the relative order of the genes and their orientations, when known, are used in the mathematical representation of the genomes. These restrictions imply that it is possible to transform one genome into another using only rearrangements that do not change the amount of genetic material in the genome (conservative rearrangements). In this case, genomes are represented as permutations, which gave rise to the problems of Ordering Permutations by Rearrangements. The main problems of Ordering Permutations by Rearrangements consider DCJs, reversals, transpositions, or the combination of reversals and transpositions, and they have known complexity. In addition to these, other problems were studied that combine transpositions with one or more of the following rearrangements: inverse transpositions, revrevs and reversals. Although there are approximation algorithms in the literature for these problems, their complexities remain open. One of the results of this thesis is the proof of the complexity of these problems that combine transpositions with inverse transpositions, revrevs and reversals. Furthermore, we present a new 1.375-approximation algorithm for Sorting Permutations by Transpositions that has better time complexity. As the area advanced, new work began to consider unbalanced genomes and incorporate the size distribution of intergenic regions. When considering unbalanced genomes, it is necessary to use insertions and deletions to transform one genome into another. In this thesis, we study both the Rearrangement Distance problem in unbalanced genomes considering only the gene sequence and their orientations (when known), and the Intergenic Rearrangement Distance problem in unbalanced genomes, which incorporates the sizes of intergenic regions in the representation of genomes, in addition to the use of gene sequences and their orientations (when known). We present new structures and concepts for problems involving reversals, transpositions, and the combination of reversals and transpositions, which are used in complexity proofs and approximation algorithms. Furthermore, we carried out experiments on synthetic genomes and real genomes, demonstrating the applicability of our algorithms.
Examination Board
Headlines:
Zanoni Dias IC / UNICAMP
Marie-France Sagot INRIA
Maria Emília Machado Telles Walter CIC / UnB
Orlando Lee IC / UNICAMP
Guilherme Pimentel Telles IC / UNICAMP
Substitutes:
Fábio Luiz Usberti IC / UNICAMP
Eduardo Candido Xavier Mercado Livre
Luis Antonio Brasil Kowada IC / UFF