Exercises marked with (*) require further reading/search beyond the suggested texts.
2. For the same data as in Exercise 1, find ancestral states as follows. For each internal node, compute an SCJ weighted median using all descendant leaves with equal weights. How does this differ from the solution given in Exercise 1?
Answer:
By definition, an SCJ median is a genome that minimizes the sum of SCJ distances to the genomes in a given set. Applying Lemma 4.2 from FM2011, the internal nodes of the topology would be as follows:
However, the SCJ median need not be unique when the number of genomes is even, so nodes 2 and 4 can have more than one median genome. In node 2, for instance, G1 and G2 have several medians, as shown below:
If we take M as the ancestral genome, the total distance is 4: dSCJ(M,G1) = 2, accounting for the addition of the 2 missing adjacencies, and dSCJ(M,G2) = 2 for the same reason. Alternatively, we can use G1 as the median. Then dSCJ(G1,G1) = 0 and dSCJ(G1,G2) = 4, which add up to the same minimum total of 4. The same result is reached when G2 is used as the median.
A fourth alternative is to use G5 as median. Then, dSCJ(G1,G5) = 3 and dSCJ(G2,G5) = 1, which also sum to 4. Thus, genomes {YhXt}, G1, G2, or G5 can be used as medians in node 2.
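The arithmetic above can be checked with a small sketch. It assumes genomes over the same gene set are represented as sets of adjacencies, so that the SCJ distance is the size of the symmetric difference. Only the adjacency YhXt comes from the text; the names `adj_a` to `adj_d` are hypothetical placeholders chosen so that the distances match those stated above. They are not the actual adjacencies of G1, G2, and G5.

```python
def scj_distance(g, h):
    """SCJ distance between two genomes with the same gene set,
    each given as a set of adjacencies: |g - h| + |h - g|."""
    return len(g ^ h)


def total_distance(candidate, genomes):
    """Sum of SCJ distances from a candidate median to each genome."""
    return sum(scj_distance(candidate, g) for g in genomes)


# M is the genome {YhXt} from the text. The extra adjacencies are
# placeholders (assumptions), picked only to reproduce the stated distances.
M = frozenset({"YhXt"})
G1 = M | {"adj_a", "adj_b"}
G2 = M | {"adj_c", "adj_d"}
G5 = M | {"adj_c"}

candidates = {"M": M, "G1": G1, "G2": G2, "G5": G5}
totals = {name: total_distance(c, [G1, G2]) for name, c in candidates.items()}
print(totals)  # every candidate reaches the same total of 4
```

Under these assumptions, all four candidates give the same sum, which is why none of them can be preferred at node 2 on the basis of the median criterion alone.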
The same principle applies to node 4, where the ancestral genome can be {YhXt}, G4, or G5, each of which gives a total distance of 2.
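The source of these ties can be sketched with a weighted majority rule. The function below is an illustration, not the formulation used in FM2011: it collects the adjacencies carried by more than half of the total weight, and separately those carried by exactly half. Tied adjacencies can be kept or dropped without changing the sum of distances, provided the resulting genome stays conflict-free; that check is not performed here. The function name and the placeholder adjacencies `adj_a` to `adj_d` are assumptions made for the example.

```python
from collections import defaultdict


def scj_weighted_median(genomes, weights=None):
    """Return (majority, tied) adjacency sets for the given genomes.

    majority: adjacencies supported by more than half of the total weight.
    tied:     adjacencies supported by exactly half of the total weight.
    Integer weights are assumed so the comparisons are exact.
    """
    if weights is None:
        weights = [1] * len(genomes)  # equal weights, as in the exercise
    total = sum(weights)
    support = defaultdict(int)
    for genome, w in zip(genomes, weights):
        for adj in genome:
            support[adj] += w
    majority = frozenset(a for a, s in support.items() if 2 * s > total)
    tied = frozenset(a for a, s in support.items() if 2 * s == total)
    return majority, tied


# Placeholder genomes (assumptions): only YhXt is taken from the text.
M = frozenset({"YhXt"})
G1 = M | {"adj_a", "adj_b"}
G2 = M | {"adj_c", "adj_d"}
G5 = M | {"adj_c"}

# Even number of genomes: every non-shared adjacency is tied.
majority_even, tied_even = scj_weighted_median([G1, G2])

# Odd number of genomes with equal weights: no ties are possible.
majority_odd, tied_odd = scj_weighted_median([G1, G2, G5])
```

With two genomes, the majority set is {YhXt} and every other adjacency is tied, which matches the several medians found at node 2. With three equally weighted genomes no adjacency can be tied, so the majority set is the only median.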
This differs from the solution of Exercise 1 in that more than one alternative is given for some of the internal nodes. However, these alternatives do not yield another solution to the small phylogeny problem, since using them would lead to ancestral states with a total branch length greater than that of Exercise 1.
© 2015 Joao Meidanis