Exercises marked with (*) require further reading or research beyond the suggested texts.
1. Compute the DCJ distance between the two unichromosomal, linear genomes below.
Answer:
The genomic distance is d = b - c, where b is the number of breakpoints and c is the number of cycles in the comparison graph of the two genomes. We assume that genomes A and B have the same gene content, organized into synteny blocks. A synteny block (SB) is a maximal sequence of genes on a chromosome of genome A that occurs unchanged in genome B.
Once the synteny blocks are identified, we connect with black lines the SBs that are adjacent in genome A, and with gray lines the SBs that are adjacent in genome B. Then caps are added to mark both ends of each linear chromosome. Here both genomes are unichromosomal, so each genome gets just two caps. Notice that A-caps are distinct from B-caps, at least at this point of the algorithm.
Phase 0 of the algorithm of Yancopoulos et al. (2005) closes the two AB paths by identifying the A cap at one end of each path with the B cap at its other end. One cycle vanishes at this point. After closing the paths, each vertex has degree 2, so the graph consists of separate cycles that alternate black and gray lines (see image below). The number of breakpoints in genome A (black lines not parallel to a gray line) equals the number of breakpoints in genome B (gray lines not parallel to a black line), and is 11 in this particular case.
There are four separate cycles in the graph, shown in different colors below.
Hence the genomic distance is:
d = b - c
d = 11 - 4
d = 7
This is the minimum number of DCJ operations required to get from genome A to genome B.
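The counting above can be checked mechanically. The Python sketch below is only an illustration under stated assumptions: each synteny block is written as a signed integer, each genome is a list of such integers, and the function names (edges, breakpoints_and_cycles, dcj_distance) as well as the cap labels "L" and "R" are hypothetical. It builds the black and gray lines, identifies A caps with B caps in both possible ways, and keeps the pairing that gives the smaller value of b - c. The small genomes in the usage lines are made up for testing and are not the genomes of this exercise.

```python
def edges(genome, left_cap, right_cap):
    """Map every vertex to its neighbor along this genome's adjacency lines."""
    ends = [left_cap]
    for g in genome:
        tail, head = (abs(g), "t"), (abs(g), "h")
        # A block read forward shows tail then head; reversed, head then tail.
        ends += [tail, head] if g > 0 else [head, tail]
    ends.append(right_cap)
    nbr = {}
    # Consecutive pairs: cap-first end, then each block junction, then last end-cap.
    for u, v in zip(ends[0::2], ends[1::2]):
        nbr[u], nbr[v] = v, u
    return nbr


def breakpoints_and_cycles(a, b, swap_caps=False):
    """Return (b, c) for one way of identifying the A caps with the B caps."""
    black = edges(a, "L", "R")
    gray = edges(b, "R", "L") if swap_caps else edges(b, "L", "R")
    # A black line is a breakpoint when no gray line joins the same two vertices.
    # Each such line is seen from both of its endpoints, hence the division by 2.
    bp = sum(1 for u in black if gray[u] != black[u]) // 2
    seen, cycles = set(), 0
    for start in black:
        if start in seen or gray[start] == black[start]:
            continue  # already visited, or a trivial cycle (parallel lines)
        cycles += 1
        u = start
        while u not in seen:  # walk black, then gray, until back at start
            seen.add(u)
            v = black[u]
            seen.add(v)
            u = gray[v]
    return bp, cycles


def dcj_distance(a, b):
    """d = b - c, minimized over the two possible cap identifications."""
    return min(bp - c for bp, c in
               (breakpoints_and_cycles(a, b, s) for s in (False, True)))


print(dcj_distance([1, 2, 3], [1, -2, 3]))  # one inversion
```

Trying both cap identifications is a design choice of this sketch: with a single linear chromosome there are only two ways to pair the caps, so checking both avoids having to classify the paths first.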
© 2015 Joao Meidanis