In a perfect phylogeny, the total number of changes would be 14 (2 for C1, 2 for C2, 2 for C3, 3 for C4, 4 for C5, and 1 for C6). This data set does not admit a perfect phylogeny because characters C1 and C2 are incompatible (notice the existence of configurations 00, 01, 10, and 11 in these two characters).
Therefore, at least 15 state changes are needed, and a tree with 15 state changes, if one exists, is most parsimonious. The tree below has 15 changes and is therefore a most parsimonious tree.
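The lower-bound argument above can be sketched in a few lines of Python. The per-character counts come from the text; the two columns `c1` and `c2` are made-up values used only to show the 00/01/10/11 check, since the exam's data matrix is not reproduced here. The helper name `shows_all_four` is likewise hypothetical.

```python
from itertools import product

# Changes each character would need in a perfect phylogeny (figures from the text).
min_changes = {"C1": 2, "C2": 2, "C3": 2, "C4": 3, "C5": 4, "C6": 1}
perfect_total = sum(min_changes.values())  # 14


def shows_all_four(col_a, col_b):
    """Return True if the pairs 00, 01, 10 and 11 all occur across the taxa."""
    seen = set(zip(col_a, col_b))
    return all(pair in seen for pair in product("01", repeat=2))


# Hypothetical columns for illustration only; NOT the exam's data matrix.
c1 = "0011"
c2 = "0101"

# If some pair of characters is incompatible, no perfect phylogeny exists,
# so at least one extra change is needed on top of the perfect total.
lower_bound = perfect_total + (1 if shows_all_four(c1, c2) else 0)
print(lower_bound)  # 15
```

Any tree that reaches this bound is then most parsimonious, which is the reasoning used for the 15-change tree.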
π = {fhat, aheh, etct, chbh, btdt, zhzt}
σ = {chft, aheh, etbh, btdt}
dSCJ(π, σ) = |π - σ| + |σ - π| = 4 + 2 = 6.
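As a quick check, the SCJ computation above can be reproduced with plain set differences. This is a minimal sketch: each adjacency is stored as an unordered pair of extremities (gene letter plus `h` or `t`), and the helper names `adj` and `scj_distance` are illustrative, not part of any library.

```python
def adj(x, y):
    """An adjacency is an unordered pair of gene extremities."""
    return frozenset((x, y))


def scj_distance(a, b):
    """SCJ distance: adjacencies to cut (a - b) plus adjacencies to join (b - a)."""
    return len(a - b) + len(b - a)


pi = {adj("fh", "at"), adj("ah", "eh"), adj("et", "ct"),
      adj("ch", "bh"), adj("bt", "dt"), adj("zh", "zt")}
sigma = {adj("ch", "ft"), adj("ah", "eh"), adj("et", "bh"), adj("bt", "dt")}

print(len(pi & sigma))          # 2 shared adjacencies: aheh and btdt
print(scj_distance(pi, sigma))  # 6
```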
One way of solving this question is by constructing the breakpoint graph with caps, then identifying the ends of AB-paths, linking the ends of AA- and BB-paths with new edges, and then computing b - c (YAF method). This will result in 7 breakpoints and 3 cycles, for a distance of 4.
Another way of solving this question is by constructing the breakpoint graph without caps, and applying the rules seen in Exercise 2 of 2015-04-23. This will result in two 2-cycles, one 4-path, and two 1-paths. They contribute 2 × 0, 1 × 2, and 2 × 1, respectively, to the DCJ distance, yielding a total of 4.
Yet another way of solving this question is by building the adjacency graph (AG) and using the formula N - C - AB-paths/2. In this case, N = 7, there are 2 cycles in AG, and two AB-paths. The formula yields 7 - 2 - 2/2 = 4.
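The adjacency graph formula N - C - AB-paths/2 can be checked mechanically. The sketch below is one possible way to do it, not the method used in class: instead of drawing the graph, it groups extremities that share an adjacency in either genome (union-find), then classifies each component by how many of its extremities are telomeres in each genome. The function name `dcj_distance` and the string encoding of extremities are assumptions made for this example.

```python
from collections import defaultdict


def dcj_distance(adj_a, adj_b, genes):
    """Return (distance, cycles, ab_paths) using N - C - AB-paths/2."""
    extremities = [g + end for g in genes for end in ("h", "t")]
    parent = {x: x for x in extremities}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Extremities joined by an adjacency of A or of B lie in the same component.
    for x, y in list(adj_a) + list(adj_b):
        parent[find(x)] = find(y)

    in_a = {x for pair in adj_a for x in pair}
    in_b = {x for pair in adj_b for x in pair}

    # For each component, count [telomeres of A, telomeres of B].
    telomeres = defaultdict(lambda: [0, 0])
    for x in extremities:
        counts = telomeres[find(x)]
        counts[0] += x not in in_a
        counts[1] += x not in in_b

    cycles = sum(1 for t in telomeres.values() if t == [0, 0])
    ab_paths = sum(1 for t in telomeres.values() if t == [1, 1])
    return len(genes) - cycles - ab_paths // 2, cycles, ab_paths


pi = [("fh", "at"), ("ah", "eh"), ("et", "ct"),
      ("ch", "bh"), ("bt", "dt"), ("zh", "zt")]
sigma = [("ch", "ft"), ("ah", "eh"), ("et", "bh"), ("bt", "dt")]

distance, cycles, ab_paths = dcj_distance(pi, sigma, "abcdefz")
print(distance)  # 4
```

Components with no telomere are cycles; components with exactly one telomere from each genome are AB-paths. AA- and BB-paths do not enter the formula.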
πchr = (f a -e c -b d -d b -c e -a -f)
σchr = (c f -f -c)(a -e -b d -d b e -a)(z -z)
σchr πchr⁻¹ = (-a)(-b f -c e c)(-d)(-e)(-f a)(b)(d)(-z z) = (-b f -c e c)(-f a)(-z z)
dalg(π, σ) = ||σchr πchr⁻¹||/2 = 6/2 = 3.
One can also use the adjacency representations:
πadj = (-f a)(-a -e)(e c)(-c -b)(b d)(-z z)
σadj = (-c f)(-a -e)(e -b)(b d)
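The algebraic distance can also be checked from the two adjacency representations. The sketch below rests on two assumptions: the norm of a permutation is taken as |E| minus its number of orbits (fixed points included, |E| = 14 here), and since adjacency representations are involutions, πadj is its own inverse. The helper names are illustrative only.

```python
def from_cycles(cycles, elements):
    """Build a permutation (as a dict) from its cycles; other elements stay fixed."""
    perm = {x: x for x in elements}
    for cyc in cycles:
        for i, x in enumerate(cyc):
            perm[x] = cyc[(i + 1) % len(cyc)]
    return perm


def compose(alpha, beta):
    """Product alpha beta, applying beta first: x -> alpha(beta(x))."""
    return {x: alpha[beta[x]] for x in beta}


def count_orbits(perm):
    """Number of cycles of the permutation, counting fixed points."""
    seen, orbits = set(), 0
    for x in perm:
        if x not in seen:
            orbits += 1
            while x not in seen:
                seen.add(x)
                x = perm[x]
    return orbits


# The 14 extremities: each gene with and without a minus sign.
elements = [sign + g for g in "abcdefz" for sign in ("", "-")]

pi_adj = from_cycles([("-f", "a"), ("-a", "-e"), ("e", "c"),
                      ("-c", "-b"), ("b", "d"), ("-z", "z")], elements)
sigma_adj = from_cycles([("-c", "f"), ("-a", "-e"), ("e", "-b"), ("b", "d")],
                        elements)

product = compose(sigma_adj, pi_adj)
norm = len(elements) - count_orbits(product)
print(norm, norm // 2)  # 6 3
```

For this example the product has one 5-cycle, two 2-cycles, and five fixed points, so the norm is 6, matching the value obtained from the chromosomal representations.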
2.5: 15 changes, all characters correct
2.3: 16 changes, all characters correct
2.0: 17 changes, all characters correct
1.7: 5 characters correct
1.4: 4 characters correct
1.1: 3 characters correct
0.8: 2 characters correct
0.5: 1 character correct
0.0: 0 characters correct
+1.0 each: adjacency sets of the genomes
+1.0 each: adjacency sets A-B and B-A
+1.0: computed A ∩ B
+1.0 each: algebraic representation of the genomes
-0.5: counted telomeres among the adjacencies
-0.5: used genes (not adjacencies) to compute |A|, |B|
-0.5: forgot parentheses (merged two cycles into one)
2.5: correct distance and explanation
+1.5: construction of capped breakpoint graph
+1.5: construction of YAF graph
+1.0: listed cuts and joins
+0.5: estimation of distance based on cuts and joins
+0.5: AB-path closure rule
+0.5: AA-path and BB-path closure rule
+0.5: N=7 (genes) or |E|=14 (extremities)
-0.5: incorrect handling of BB-paths
+1.0 each: πchr and σchr
+1.0 each: πadj and σadj
+0.3: computed σ π⁻¹
-1.0: forgot 2nd strand in chromosomal representation
-0.5: incorrect representation of circular, unigenic chromosome
-0.5: used |E|=7 instead of |E|=14
-0.5: computed number of orbits incorrectly
-0.5: forgot (z -z) in σ π⁻¹
-0.5: wrote σ π⁻¹ formula, but did not develop it
© 2015 Joao Meidanis