Tutorial 6 Phylogenetic Trees
• Measuring distance
• Bottom-up algorithm (Neighbor Joining)
• Distance based algorithm
• Assuming star structure
• Not assuming star
• Relative distance based
Measuring Distance
• Problem: the fraction of differences between unrelated sequences approaches the fraction expected by chance, so the distance measure converges.
• Jukes-Cantor
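A minimal sketch of the Jukes-Cantor correction named above, assuming aligned DNA sequences. It uses the standard formula d = -(3/4) ln(1 - (4/3) p), where p is the observed fraction of differing sites. The function name and the gap handling are illustrative assumptions, not taken from the slides.

```python
import math

def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two aligned DNA sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: columns with a gap ("-") in either sequence are skipped.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    # p: observed fraction of sites that differ.
    p = sum(x != y for x, y in pairs) / len(pairs)
    if p == 0:
        return 0.0
    # At p >= 3/4 the sequences look random and the correction diverges.
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

# One difference in ten sites: the corrected distance is slightly above 0.1.
d = jukes_cantor_distance("ACGTACGTAC", "ACGTACGTAT")
```

The corrected value grows faster than p as sequences diverge, which counters the convergence problem described on this slide.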
Measuring Distance (cont)
Euclidean distance: given a multiple sequence alignment, calculate the square root of the sum of the squared scores at every position (i) between two sequences (a, b). The score increases in proportion to the dissimilarity between residues.
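A rough sketch of the Euclidean distance between two aligned sequences. It assumes the per-position scores are squared before summing (the usual Euclidean form) and uses a hypothetical 0/1 mismatch score, since the slides do not specify a scoring scheme.

```python
import math
from typing import Callable

def mismatch_score(x: str, y: str) -> float:
    # Hypothetical score: 0 for identical residues, 1 for any difference.
    return 0.0 if x == y else 1.0

def euclidean_distance(seq_a: str, seq_b: str,
                       score: Callable[[str, str], float] = mismatch_score) -> float:
    """Square root of the sum of squared per-position scores."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return math.sqrt(sum(score(x, y) ** 2 for x, y in zip(seq_a, seq_b)))

# Four mismatching positions give sqrt(4) = 2.0 under the 0/1 score.
d = euclidean_distance("AAAA", "CCCC")
```

Any score function that grows with residue dissimilarity can be passed in place of `mismatch_score`.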
Star Structure
Assumption: divergence of sequences is assumed to occur at a constant rate, so the distance to the root is equal for all sequences.
[Figure: star diagram with leaves a, b, c, d]
Basic Algorithm
Constructs a rooted tree.
[Figure: distance matrix and initial star diagram with leaves a, b, c, d]
Selection step
Choose the two nodes with the shortest distance and fuse them.
[Figure: star diagram with leaves a, b, c, d]
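The selection step can be sketched as a scan over the upper triangle of the distance matrix. The matrix values below are hypothetical, chosen only to illustrate the search.

```python
def closest_pair(dist):
    """Return (i, j, d) for the pair of nodes with the smallest distance."""
    n = len(dist)
    best = None
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; the matrix is symmetric
            if best is None or dist[i][j] < best[2]:
                best = (i, j, dist[i][j])
    if best is None:
        raise ValueError("need at least two nodes")
    return best

# Hypothetical distances between four leaves (indices 0..3 stand for a, b, c, d).
example = [
    [0, 2, 4, 6],
    [2, 0, 4, 6],
    [4, 4, 0, 6],
    [6, 6, 6, 0],
]
i, j, d = closest_pair(example)  # leaves a and b, at distance 2
```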
Quick review
[Figure: successive fusion steps on leaves a, b, c, d, with fused pairs (a,d) and (c,b), new nodes e and f, and distances Dce, Daf, Dbf, Dde]
Neighbor Joining’ (Not assuming equal divergence)
• Step by step summary:
1. Calculate all pairwise distances.
2. Pick two nodes (i and j) for which the distance is minimal.
3. Define a new node (x) and re-calculate the distances from the free nodes to the new node.
4. Calculate Dix and Djx, the distances of the chosen nodes i and j to the new node x, as well as the distance from x to all other nodes.
5. Continue until two nodes remain, then connect them with an edge.
Nodes 5 and 6 are fused. Node 10 is the new node.
Re-calculate the distances from the new node
• i, j: the fused nodes (5, 6)
• x: the newly added node (node 10)
• m: the remaining nodes in the star
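The formula from this slide did not survive extraction. A sketch using the standard neighbor-joining update, Dxm = (Dim + Djm - Dij)/2, is given here as an assumption about what the slide showed. The node labels and distances in the example are hypothetical.

```python
def distances_from_new_node(dist, i, j):
    """Distances from the new node x (replacing i and j) to every remaining node m.

    dist is a symmetric dict of dicts: dist[a][b] is the distance between a and b.
    """
    return {
        m: (dist[i][m] + dist[j][m] - dist[i][j]) / 2.0
        for m in dist
        if m not in (i, j)
    }

# Hypothetical additive distances: i and j are neighbors, m1 and m2 remain.
dist = {
    "i":  {"i": 0.0, "j": 0.5, "m1": 0.3, "m2": 0.6},
    "j":  {"i": 0.5, "j": 0.0, "m1": 0.6, "m2": 0.9},
    "m1": {"i": 0.3, "j": 0.6, "m1": 0.0, "m2": 0.5},
    "m2": {"i": 0.6, "j": 0.9, "m1": 0.5, "m2": 0.0},
}
new_row = distances_from_new_node(dist, "i", "j")  # about {"m1": 0.2, "m2": 0.5}
```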
Calculate Dix and Djx
• r: approximately the average distance from a node to the other nodes
• L: the number of leaves left in the tree (leaves are nodes representing taxa, sequences, etc.)
Calculate Dix and Djx (cont)
r5 = ΣD5k/(L-2) = 3.22406/(9-2) = 0.46058
r6 = ΣD6k/(L-2) = 3.22758/(9-2) = 0.461083
Calculate Dix and Djx (cont)
D10,5 = (D5,6 + r5 - r6)/2 = (0.06088 + 0.46058 - 0.461083)/2 = 0.0301886
D10,6 = D5,6 - D10,5 = 0.06088 - 0.0301886 = 0.0306914
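The arithmetic above can be checked with a short sketch. The function and variable names are illustrative; the input values (D5,6, the two row sums, and L = 9) are the ones given on the slides.

```python
def branch_lengths(d_ij, sum_i, sum_j, n_leaves):
    """Return (r_i, r_j, D_ix, D_jx) for nodes i and j fused into new node x."""
    r_i = sum_i / (n_leaves - 2)  # r = sum of distances / (L - 2)
    r_j = sum_j / (n_leaves - 2)
    d_ix = (d_ij + r_i - r_j) / 2.0
    d_jx = d_ij - d_ix
    return r_i, r_j, d_ix, d_jx

# Values from the slides: nodes 5 and 6 fused into node 10, with 9 leaves left.
r5, r6, d10_5, d10_6 = branch_lengths(0.06088, 3.22406, 3.22758, 9)
print(f"r5={r5:.5f} r6={r6:.6f} D10,5={d10_5:.7f} D10,6={d10_6:.7f}")
```

The two branch lengths always sum to D5,6, so the fused pair keeps its original distance.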
[Figure: node 10 joined to node 5 by a branch of length 0.0301886 and to node 6 by a branch of length 0.0306914]
Step 2
[Figure: branch lengths 0.080375 and 0.044625]
Step 3
[Figure: branch lengths 0.069258 and 0.040447]
Problems
[Figure: unrooted tree with leaves 1, 2, 3, 4 and branch lengths 0.1, 0.1, 0.1, 0.4, 0.4]
Neighbor Joining (Not assuming equal divergence)
• Step by step summary:
1. Calculate all pairwise distances.
2. Pick two nodes (i and j) for which the relative distance is minimal (lowest).
3. Define a new node (x) and re-calculate the distances from the free nodes to the new node.
4. Calculate Dix and Djx, the distances of the chosen nodes i and j to the new node x, as well as the distance from x to all other nodes.
5. Continue until two nodes remain, then connect them with an edge.
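The steps above can be sketched as one loop. This is a minimal illustration rather than a reference implementation. It assumes the standard forms Mij = Dij - ri - rj and Dxm = (Dim + Djm - Dij)/2, names internal nodes "X0", "X1", ... (a hypothetical convention, so leaf names must not collide with them), and returns the tree as a list of edges.

```python
from itertools import combinations

def neighbor_joining(names, matrix):
    """Return the unrooted tree as a list of (node, node, branch_length) edges."""
    # Copy into a dict of dicts so the caller's matrix is left untouched.
    dist = {a: {b: float(matrix[i][j]) for j, b in enumerate(names)}
            for i, a in enumerate(names)}
    edges = []
    next_id = 0
    while len(dist) > 2:
        n = len(dist)
        # r_i: sum of distances from i to the other nodes, divided by (L - 2).
        r = {a: sum(dist[a][b] for b in dist if b != a) / (n - 2) for a in dist}
        # Step 2: pick the pair with the lowest relative distance M_ij.
        i, j = min(combinations(dist, 2),
                   key=lambda p: dist[p[0]][p[1]] - r[p[0]] - r[p[1]])
        # Step 3: define the new node x.
        x = f"X{next_id}"
        next_id += 1
        # Step 4: branch lengths from i and j to x, then distances from x.
        d_ix = (dist[i][j] + r[i] - r[j]) / 2.0
        d_jx = dist[i][j] - d_ix
        edges.append((i, x, d_ix))
        edges.append((j, x, d_jx))
        new_row = {m: (dist[i][m] + dist[j][m] - dist[i][j]) / 2.0
                   for m in dist if m not in (i, j)}
        # Replace i and j by x in the matrix.
        del dist[i]
        del dist[j]
        for m in dist:
            del dist[m][i]
            del dist[m][j]
            dist[m][x] = new_row[m]
        new_row[x] = 0.0
        dist[x] = new_row
    # Step 5: two nodes remain; connect them with an edge.
    a, b = dist
    edges.append((a, b, dist[a][b]))
    return edges

# Hypothetical four-leaf input: leaves 1 and 3 are neighbors, as are 2 and 4.
names = ["1", "2", "3", "4"]
matrix = [
    [0.0, 0.3, 0.5, 0.6],
    [0.3, 0.0, 0.6, 0.5],
    [0.5, 0.6, 0.0, 0.9],
    [0.6, 0.5, 0.9, 0.0],
]
for edge in neighbor_joining(names, matrix):
    print(edge)
```

A tree with n leaves comes back with 2n - 3 edges. Ties in the relative distance are broken by iteration order, so the internal node names may differ between equivalent inputs while the tree itself stays the same.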
Step 2. Pick two nodes (i and j) for which the relative distance is minimal (lowest).
[Figure: nodes I and J joined at the new node X; M is a remaining node]
• Mij takes negative values.
• As the average distance from the common ancestor to the rest of the nodes increases, Mij takes a lower value.
• Select the pair that produces the lowest value.
• Re-evaluate M at every iteration.
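A small sketch of the relative distance, assuming the standard form Mij = Dij - ri - rj with ri = ΣDik/(L-2). The example assumes a four-leaf tree with short branches (0.1) to leaves 1 and 2, long branches (0.4) to leaves 3 and 4, an internal branch of 0.1, and leaves 1 and 3 as neighbors. That topology is an assumption read from the figure residue, not stated in the slides.

```python
def relative_distances(dist):
    """Return {(i, j): M_ij} for all pairs i < j of a symmetric distance matrix."""
    n = len(dist)
    # The diagonal is zero, so the row sum equals the sum over the other nodes.
    r = [sum(row) / (n - 2) for row in dist]
    return {(i, j): dist[i][j] - r[i] - r[j]
            for i in range(n) for j in range(i + 1, n)}

# Indices 0..3 stand for leaves 1..4 of the assumed tree.
dist = [
    [0.0, 0.3, 0.5, 0.6],
    [0.3, 0.0, 0.6, 0.5],
    [0.5, 0.6, 0.0, 0.9],
    [0.6, 0.5, 0.9, 0.0],
]
m = relative_distances(dist)
closest_raw = min(m, key=lambda p: dist[p[0]][p[1]])  # (0, 1): leaves 1 and 2
closest_rel = min(m, key=m.get)                       # a true neighbor pair
```

Here the raw minimum picks leaves 1 and 2, which are not neighbors in the assumed tree, while the lowest Mij (about -1.2) falls on the pairs (1, 3) and (2, 4).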
Re-calculate the distances from the new node
[Figure: nodes I and J joined at the new node X; M is a remaining node]
Example
[Tables: original distance matrix and relative distance matrix (Mij)]
The Mij table is used only to choose the closest pair, not to calculate the distances.
Problems with phylogenetic trees
[Figure: trees over the same taxa (Bacillus, Pseudomonas, Aeromonas, Burkholderias, Lechevaliera, E.coli, Salmonella) drawn with different branch orders]
Software
• PHYLIP: http://evolution.gs.washington.edu/phylip.html
• PAUP: http://paup.csit.fsu.edu/
• MEGA3: http://www.megasoftware.net/
• More: http://evolution.genetics.washington.edu/phylip/software.html