Distance matrix methods calculate a measure of distance between each pair of species, then find a tree that predicts the observed set of distances.
Branch lengths and times: in distance matrix methods, branch lengths reflect the expected amount of evolution in different branches of the tree. The branch length is

$$v_i = r_i \, t_i$$

where $r_i$ is the rate of evolution and $t_i$ is the elapsed time.
The least squares method: minimise the difference between the observed matrix of distances and the matrix of distances predicted by the tree.
[Table: observed matrix.]
The least squares method: expected matrix.
[Figure: tree with tips a, b, c, d, e and branch lengths 0.08, 0.10, 0.05, 0.06, 0.03, 0.07, 0.05.]
Each entry of the expected matrix is the sum of the branch lengths on the path between two species, for example 0.08 + 0.05 + 0.10 = 0.23.
The least squares method: Q is a measure of the discrepancy between the observed and the expected matrix.

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(D_{ij} - d_{ij}\right)^2$$

where $D_{ij}$ is the observed distance between species $i$ and $j$, and $d_{ij}$ is the expected distance between species $i$ and $j$.
The least squares method: distances can be weighted or not. The weight $w_{ij}$ can be $1$ (unweighted), $1/D_{ij}^2$ or $1/D_{ij}$.

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(D_{ij} - d_{ij}\right)^2$$
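The definition of Q can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `least_squares_q`, the `weight` labels and the 3-species matrices are all made up for the example.

```python
def least_squares_q(observed, expected, weight="unweighted"):
    """Weighted sum of squared differences over all ordered pairs i != j."""
    n = len(observed)
    q = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # D_ii = d_ii = 0, so the diagonal adds nothing
            big_d = observed[i][j]
            if weight == "unweighted":
                w = 1.0
            elif weight == "1/D":
                w = 1.0 / big_d
            elif weight == "1/D^2":
                w = 1.0 / big_d ** 2
            else:
                raise ValueError("unknown weight scheme: " + weight)
            q += w * (big_d - expected[i][j]) ** 2
    return q


# Hypothetical matrices, for illustration only.
observed = [[0.0, 0.2, 0.4],
            [0.2, 0.0, 0.5],
            [0.4, 0.5, 0.0]]
expected = [[0.0, 0.25, 0.4],
            [0.25, 0.0, 0.45],
            [0.4, 0.45, 0.0]]

for scheme in ("unweighted", "1/D", "1/D^2"):
    print(scheme, least_squares_q(observed, expected, scheme))
```

Because the double sum runs over all i and j, every pair of species is counted twice; this doubles Q but does not change which branch lengths minimise it.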
The least squares method: $x_{ij,k}$ is a handy variable.
[Figure: tree with tips a, b, c, d, e and branches v1 to v7.]

$$x_{ij,k} = \begin{cases} 1 & \text{if branch } k \text{ is on the path between species } i \text{ and } j \\ 0 & \text{if branch } k \text{ is not on the path between species } i \text{ and } j \end{cases}$$

For example, $x_{ab,1} = 1$, $x_{ab,7} = 1$ and $x_{ab,3} = 0$.
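One way to compute the indicator is to describe each branch by the set of tips on one side of it: a branch lies on the path between two species exactly when it separates them. The sketch below assumes a topology inferred from the equations later in these slides (v1 to v5 lead to a, b, c, d, e; v6 separates c and e from the rest; v7 separates b and d from the rest). The names `SPLITS`, `x` and `branches_on_path` are hypothetical.

```python
# Tips on one side of each branch; the topology is an assumption,
# read off the equations for v1..v7 rather than the lost figure.
SPLITS = {
    1: {"a"},
    2: {"b"},
    3: {"c"},
    4: {"d"},
    5: {"e"},
    6: {"c", "e"},
    7: {"b", "d"},
}


def x(i, j, k):
    """Return 1 if branch k is on the path between species i and j, else 0."""
    side = SPLITS[k]
    return 1 if (i in side) != (j in side) else 0


def branches_on_path(i, j):
    """List the branches that separate species i from species j."""
    return [k for k in SPLITS if x(i, j, k) == 1]


print(x("a", "b", 1), x("a", "b", 7), x("a", "b", 3))
print(branches_on_path("a", "b"))
```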
The least squares method: rewrite $d_{ij}$, the expected values.

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \left(D_{ij} - d_{ij}\right)^2, \qquad d_{ij} = \sum_k x_{ij,k} v_k$$

so that

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \Big(D_{ij} - \sum_k x_{ij,k} v_k\Big)^2$$
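The least squares method: differentiate Q and equate the derivative to zero. The inner sum uses the index $m$ to keep it distinct from the branch $k$ being differentiated.

$$Q = \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \Big(D_{ij} - \sum_m x_{ij,m} v_m\Big)^2$$

$$\frac{\partial Q}{\partial v_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} w_{ij} \, x_{ij,k} \Big(D_{ij} - \sum_m x_{ij,m} v_m\Big)$$

For the unweighted case:

$$\frac{\partial Q}{\partial v_k} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,k} \Big(D_{ij} - \sum_m x_{ij,m} v_m\Big) = 0$$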
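The unweighted derivative can be checked numerically against a finite-difference slope of Q. The sketch below is an illustration under stated assumptions: the tree topology is the one inferred from the equations for v1 to v7, the observed distances and trial branch lengths are made up, and each pair of species is counted once rather than twice (which halves both Q and its derivative but leaves their agreement intact).

```python
from itertools import combinations

TIPS = "abcde"
# Tips on one side of branches v1..v7 (assumed topology).
SPLITS = [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}, {"c", "e"}, {"b", "d"}]
PAIRS = list(combinations(TIPS, 2))


def x(pair, k):
    """1 if branch k separates the two species in pair, else 0."""
    i, j = pair
    return 1 if (i in SPLITS[k]) != (j in SPLITS[k]) else 0


def expected(pair, v):
    """d_ij: sum of the branch lengths on the path between the pair."""
    return sum(x(pair, m) * v[m] for m in range(len(SPLITS)))


def q(D, v):
    """Unweighted Q, each unordered pair counted once."""
    return sum((D[p] - expected(p, v)) ** 2 for p in PAIRS)


def dq_dv(D, v, k):
    """Analytic derivative of q with respect to branch length v[k]."""
    return -2 * sum(x(p, k) * (D[p] - expected(p, v)) for p in PAIRS)


def numeric_slope(D, v, k, h=1e-5):
    """Central finite difference of q along v[k]."""
    up, down = list(v), list(v)
    up[k] += h
    down[k] -= h
    return (q(D, up) - q(D, down)) / (2 * h)


# Hypothetical observed distances and trial branch lengths.
D = {p: 0.1 * (n + 1) for n, p in enumerate(PAIRS)}
v = [0.05] * 7

for k in range(7):
    print(k + 1, dq_dv(D, v, k), numeric_slope(D, v, k))
```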
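The least squares method: written in full for $v_1$.

$$\frac{\partial Q}{\partial v_1} = -2 \sum_{i=1}^{n} \sum_{j \neq i} x_{ij,1} \Big(D_{ij} - \sum_k x_{ij,k} v_k\Big) = 0$$

Listing each pair of species once:

$$\begin{aligned}
& x_{AB,1}\big(D_{AB} - \textstyle\sum_k x_{AB,k} v_k\big) + x_{AC,1}\big(D_{AC} - \textstyle\sum_k x_{AC,k} v_k\big) + x_{AD,1}\big(D_{AD} - \textstyle\sum_k x_{AD,k} v_k\big) + x_{AE,1}\big(D_{AE} - \textstyle\sum_k x_{AE,k} v_k\big) && (i=1;\ j=2,3,4,5) \\
& + x_{BC,1}\big(D_{BC} - \textstyle\sum_k x_{BC,k} v_k\big) + x_{BD,1}\big(D_{BD} - \textstyle\sum_k x_{BD,k} v_k\big) + x_{BE,1}\big(D_{BE} - \textstyle\sum_k x_{BE,k} v_k\big) && (i=2;\ j=3,4,5) \\
& + x_{CD,1}\big(D_{CD} - \textstyle\sum_k x_{CD,k} v_k\big) + x_{CE,1}\big(D_{CE} - \textstyle\sum_k x_{CE,k} v_k\big) && (i=3;\ j=4,5) \\
& + x_{DE,1}\big(D_{DE} - \textstyle\sum_k x_{DE,k} v_k\big) = 0 && (i=4;\ j=5)
\end{aligned}$$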
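The least squares method: many terms are zero.

$$\frac{\partial Q}{\partial v_1} = -2 \sum_{i=1}^{n} \sum_{j=1}^{n} x_{ij,1} \Big(D_{ij} - \sum_k x_{ij,k} v_k\Big) = 0$$

$$\begin{aligned}
& 1\big(D_{AB} - \textstyle\sum_k x_{AB,k} v_k\big) + 1\big(D_{AC} - \textstyle\sum_k x_{AC,k} v_k\big) + 1\big(D_{AD} - \textstyle\sum_k x_{AD,k} v_k\big) + 1\big(D_{AE} - \textstyle\sum_k x_{AE,k} v_k\big) \\
& + 0\big(D_{BC} - \textstyle\sum_k x_{BC,k} v_k\big) + 0\big(D_{BD} - \textstyle\sum_k x_{BD,k} v_k\big) + 0\big(D_{BE} - \textstyle\sum_k x_{BE,k} v_k\big) \\
& + 0\big(D_{CD} - \textstyle\sum_k x_{CD,k} v_k\big) + 0\big(D_{CE} - \textstyle\sum_k x_{CE,k} v_k\big) + 0\big(D_{DE} - \textstyle\sum_k x_{DE,k} v_k\big) = 0
\end{aligned}$$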
The least squares method: non-zero terms expanded.

$$\big(D_{AB} - \textstyle\sum_k x_{AB,k} v_k\big) + \big(D_{AC} - \textstyle\sum_k x_{AC,k} v_k\big) + \big(D_{AD} - \textstyle\sum_k x_{AD,k} v_k\big) + \big(D_{AE} - \textstyle\sum_k x_{AE,k} v_k\big) = 0$$

Each inner sum picks out the branches on the path between the two species in the tree with tips a, b, c, d, e and branches v1 to v7. For the first two terms:

$$\sum_k x_{AB,k} v_k = 1 \cdot v_1 + 1 \cdot v_2 + 0 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 0 \cdot v_6 + 1 \cdot v_7$$

$$\sum_k x_{AC,k} v_k = 1 \cdot v_1 + 0 \cdot v_2 + 1 \cdot v_3 + 0 \cdot v_4 + 0 \cdot v_5 + 1 \cdot v_6 + 0 \cdot v_7$$
The least squares method: rearranging gives the equation for $v_1$.

$$\big(D_{AB} - \textstyle\sum_k x_{AB,k} v_k\big) + \big(D_{AC} - \textstyle\sum_k x_{AC,k} v_k\big) + \big(D_{AD} - \textstyle\sum_k x_{AD,k} v_k\big) + \big(D_{AE} - \textstyle\sum_k x_{AE,k} v_k\big) = 0$$

$$D_{AB} + D_{AC} + D_{AD} + D_{AE} - 4v_1 - v_2 - v_3 - v_4 - v_5 - 2v_6 - 2v_7 = 0$$

$$D_{AB} + D_{AC} + D_{AD} + D_{AE} = 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7$$
The least squares method: mutatis mutandis for $v_2$ and all other branches. Each line is the equation for one branch, $v_1$ to $v_7$ in order.

$$\begin{aligned}
D_{AB} + D_{AC} + D_{AD} + D_{AE} &= 4v_1 + v_2 + v_3 + v_4 + v_5 + 2v_6 + 2v_7 && (v_1) \\
D_{AB} + D_{BC} + D_{BD} + D_{BE} &= v_1 + 4v_2 + v_3 + v_4 + v_5 + 2v_6 + 3v_7 && (v_2) \\
D_{AC} + D_{BC} + D_{CD} + D_{CE} &= v_1 + v_2 + 4v_3 + v_4 + v_5 + 3v_6 + 2v_7 && (v_3) \\
D_{AD} + D_{BD} + D_{CD} + D_{DE} &= v_1 + v_2 + v_3 + 4v_4 + v_5 + 2v_6 + 3v_7 && (v_4) \\
D_{AE} + D_{BE} + D_{CE} + D_{DE} &= v_1 + v_2 + v_3 + v_4 + 4v_5 + 3v_6 + 2v_7 && (v_5) \\
D_{AC} + D_{AE} + D_{BC} + D_{BE} + D_{CD} + D_{DE} &= 2v_1 + 2v_2 + 3v_3 + 2v_4 + 3v_5 + 6v_6 + 4v_7 && (v_6) \\
D_{AB} + D_{AD} + D_{BC} + D_{CD} + D_{BE} + D_{DE} &= 2v_1 + 3v_2 + 2v_3 + 3v_4 + 2v_5 + 4v_6 + 6v_7 && (v_7)
\end{aligned}$$
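The coefficients in these seven equations are the entries of $X^T X$, where $X$ has one row per pair of species and one column per branch, filled with the indicators $x_{ij,k}$. The sketch below rebuilds that matrix and solves the system with plain Gauss-Jordan elimination. It rests on assumptions: the topology is inferred from the equations themselves, the helper names are hypothetical, and the branch lengths used to generate test distances are simply the seven values from the earlier figure assigned to v1 to v7 in an arbitrary order.

```python
from itertools import combinations

TIPS = "abcde"
# Tips on one side of branches v1..v7 (assumed topology).
SPLITS = [{"a"}, {"b"}, {"c"}, {"d"}, {"e"}, {"c", "e"}, {"b", "d"}]
PAIRS = list(combinations(TIPS, 2))
# One row per pair, one column per branch.
X = [[1 if (i in s) != (j in s) else 0 for s in SPLITS] for i, j in PAIRS]


def normal_matrix(rows):
    """X^T X: entry (a, b) counts the pairs whose path uses branches a and b."""
    m = len(rows[0])
    return [[sum(r[a] * r[b] for r in rows) for b in range(m)] for a in range(m)]


def solve(A, b):
    """Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [[float(val) for val in A[r]] + [float(b[r])] for r in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                f = M[r][col] / M[col][col]
                M[r] = [a - f * c for a, c in zip(M[r], M[col])]
    return [M[r][n] / M[r][r] for r in range(n)]


def fit_branch_lengths(D):
    """D maps each pair (i, j) to an observed distance."""
    rhs = [sum(X[p][k] * D[pair] for p, pair in enumerate(PAIRS))
           for k in range(len(SPLITS))]
    return solve(normal_matrix(X), rhs)


# Hypothetical assignment of branch lengths, used only to generate distances.
true_v = [0.08, 0.10, 0.05, 0.06, 0.03, 0.07, 0.05]
D = {pair: sum(xk * vk for xk, vk in zip(X[p], true_v))
     for p, pair in enumerate(PAIRS)}

for row in normal_matrix(X):
    print(row)
print(fit_branch_lengths(D))
```

Because the test distances are generated from the tree itself, the fitted branch lengths should reproduce the values they were built from; with real data the fit would only be the closest match in the least squares sense.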
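The least squares method: solving linear equations with matrices.

$$\begin{aligned} x + 2y &= 4 \\ 3x - 5y &= 1 \end{aligned}$$

$$A = \begin{pmatrix} 1 & 2 \\ 3 & -5 \end{pmatrix}, \qquad B = \begin{pmatrix} 4 \\ 1 \end{pmatrix}$$

$$A^{-1} = \frac{1}{|A|} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = \frac{1}{1 \cdot (-5) - 3 \cdot 2} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix}$$

$$X = A^{-1} B = -\frac{1}{11} \begin{pmatrix} -5 & -2 \\ -3 & 1 \end{pmatrix} \begin{pmatrix} 4 \\ 1 \end{pmatrix} = -\frac{1}{11} \begin{pmatrix} -22 \\ -11 \end{pmatrix} = \begin{pmatrix} 2 \\ 1 \end{pmatrix}$$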
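The same 2 x 2 calculation can be reproduced with exact fractions. This is a small sketch of the inverse-matrix route shown on the slide; the function name `solve_2x2` is made up for the example.

```python
from fractions import Fraction


def solve_2x2(a, b):
    """Solve a @ [x, y] = b for a 2 x 2 matrix via its inverse."""
    (p, q), (r, s) = a
    det = Fraction(p * s - q * r)
    if det == 0:
        raise ValueError("matrix is singular")
    # Inverse of [[p, q], [r, s]] is (1 / det) * [[s, -q], [-r, p]].
    inv = [[s / det, -q / det],
           [-r / det, p / det]]
    return [inv[0][0] * b[0] + inv[0][1] * b[1],
            inv[1][0] * b[0] + inv[1][1] * b[1]]


A = [[1, 2], [3, -5]]
B = [4, 1]
print(solve_2x2(A, B))  # x = 2, y = 1
```

For the seven branch-length equations the same idea applies with a 7 x 7 matrix, where elimination is usually more practical than writing out the inverse by hand.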
Clustering algorithms: clustering methods do not optimise a criterion; they apply an algorithm that produces a tree directly.
Clustering algorithms: UPGMA. UPGMA (Unweighted Pair Group Method with Arithmetic mean) assumes that evolutionary rates are the same in all lineages, so the result is an ultrametric tree.
Clustering algorithms: UPGMA, the steps.
1. Find the species i and j with the smallest distance.
2. Calculate the branch length between i and j.
3. Lump i and j into a new group.
4. Compute the distance between the new group and all other groups (weight by the number of species in each group).
First pass: sealion and seal are joined. [Figure: tree with sealion and seal, branch label 12.]
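The four steps can be sketched as a short loop. This is a minimal illustration rather than a definitive implementation: the function name `upgma`, the tuple representation of groups and the 4-species distances are all made up, and step 2 is implemented as "place the new node at half the distance between the two groups", which is the usual reading for an ultrametric tree.

```python
from itertools import combinations


def upgma(tip_distances):
    """tip_distances maps (tip_i, tip_j) to an observed distance.

    Returns a list of (group_i, group_j, node_height) in joining order.
    """
    # Every group is a tuple of tip names; start with one group per tip.
    d = {frozenset([(i,), (j,)]): float(v) for (i, j), v in tip_distances.items()}
    groups = {g for pair in d for g in pair}
    merges = []
    while len(groups) > 1:
        # Step 1: find the two groups with the smallest distance.
        i, j = min(combinations(sorted(groups), 2), key=lambda p: d[frozenset(p)])
        # Step 2: the joining node sits at half that distance.
        height = d[frozenset((i, j))] / 2
        # Step 3: lump i and j into a new group.
        new = tuple(sorted(i + j))
        groups -= {i, j}
        # Step 4: distance to every other group, weighted by group size.
        for k in groups:
            d[frozenset((new, k))] = (
                len(i) * d[frozenset((i, k))] + len(j) * d[frozenset((j, k))]
            ) / len(new)
        groups.add(new)
        merges.append((i, j, height))
    return merges


# Hypothetical distances between four species p, q, r, s.
example = {
    ("p", "q"): 4, ("p", "r"): 10, ("q", "r"): 10,
    ("p", "s"): 14, ("q", "s"): 16, ("r", "s"): 18,
}
for group_i, group_j, height in upgma(example):
    print(group_i, group_j, height)
```

The branch length from a joining node down to one of its groups is the node height minus the height at which that group was itself formed (zero for a single species).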
Clustering algorithms: UPGMA, second pass. The same four steps are repeated: find the smallest distance, calculate the branch length, lump the pair into a new group, and compute the distance between the new group and all other groups (weight by the number of species in each group).
Bear and raccoon are joined. [Figure: tree with bear and raccoon, branch label 13, next to sealion and seal, branch label 12.]
Clustering algorithms: UPGMA, third pass. The same four steps are repeated.
The bear and raccoon group is joined to the sealion and seal group. [Figure: tree with raccoon, bear, sealion and seal; labels 13, 12, 18.75, 6.75 and 5.75.] The labels are consistent with 18.75 - 12 = 6.75 and 18.75 - 13 = 5.75.
Clustering algorithms: UPGMA, fourth pass. The same four steps are repeated.
Weasel is joined to the group BRSS (bear, raccoon, sealion, seal). [Figure: tree with raccoon, bear, sealion, seal and weasel; labels 13, 12, 19.75, 6.75 and 5.75.]
The distance between the new group and another group is weighted by the number of species, with 4 species in BRSS and 1 species in weasel:

$$\frac{4 \times 44.5 + 1 \times 51}{5}$$
Clustering algorithms: UPGMA, fifth pass. The same four steps are repeated.
Dog is joined to the group BRSSW (BRSS plus weasel). [Figure: tree with raccoon, bear, sealion, seal, weasel and dog; labels 13, 12, 19.75, 22.9, 6.75 and 5.75.]
The distance between the new group and another group is weighted by the number of species, with 5 species in BRSSW and 1 species in dog:

$$\frac{5 \times 88.2 + 1 \times 98}{6}$$
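The two weighted averages from the slides can be evaluated directly. The helper name `group_distance` is hypothetical; the numbers are the ones given in the slides.

```python
def group_distance(size_i, d_ik, size_j, d_jk):
    """Distance from the merged group (i + j) to another group k,
    weighted by the number of species in i and in j."""
    return (size_i * d_ik + size_j * d_jk) / (size_i + size_j)


# 4 species in BRSS, 1 species in weasel.
after_weasel = group_distance(4, 44.5, 1, 51)
# 5 species in BRSSW, 1 species in dog.
after_dog = group_distance(5, 88.2, 1, 98)

print(round(after_weasel, 2))      # 45.8
print(round(after_weasel / 2, 2))  # 22.9
print(round(after_dog, 2))         # 89.83
```

Half of the first result is 22.9, which matches the label shown where dog joins the tree, so the first average is presumably the distance from the BRSS plus weasel group to dog.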