Explore the significance of sequence conservation in understanding functional aspects using Vriend's rules of sequence analysis. Learn about CMA, correlations, and the relationship between conservation, importance, and variability. Discover how entropy and variability patterns aid in identifying functionally important residues. Uncover the power of variability-entropy analysis in detailing correlation studies. Note the complexity of data acquisition and the necessity of precise methodologies in exploring sequence functionalities.
Vriend’s first rule of sequence analysis If it is conserved, it is important. Regulation is most important, and thus most conserved. Second most conserved is the location of function. Third is function, fourth is structure. And sequence is least conserved in evolution. However, sequence conservation is easiest to determine, so that is what people do research into...
Vriend’s second rule of sequence analysis If it is very conserved, it is very important
Consequences: If something is conserved within each sub-family but differs between sub-families, it is involved in a sub-family specific function.
What is CMA? CMA stands for correlated mutation analysis. Function is never just one residue.
QWERTYASDFGRGH
QSLMTYLNDFHRPM
QAGTTNMKDTRRKC
QPRSTNRGDTRRVW
Red = conserved; Green = variable; Blue = correlated
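The three column classes can be illustrated on the four toy sequences above. The following is a minimal sketch, not the CMA method itself: it assumes that "conserved" means one residue type in every sequence and that "correlated" means two non-conserved columns that split the sequences into exactly the same groups. The helper names (`column`, `grouping`, `classify`) are hypothetical, and columns are numbered from 0.

```python
from itertools import combinations

# Toy alignment from the slide.
alignment = [
    "QWERTYASDFGRGH",
    "QSLMTYLNDFHRPM",
    "QAGTTNMKDTRRKC",
    "QPRSTNRGDTRRVW",
]


def column(aln, i):
    """Return the residues found at 0-based column i."""
    return [seq[i] for seq in aln]


def grouping(col):
    """Relabel residues by order of first appearance, e.g. YYNN -> (0, 0, 1, 1)."""
    labels = {}
    return tuple(labels.setdefault(res, len(labels)) for res in col)


def classify(aln):
    """Return (conserved columns, pairs of perfectly co-varying columns)."""
    n_cols = len(aln[0])
    conserved = [i for i in range(n_cols) if len(set(column(aln, i))) == 1]
    # Assumption: a column can only carry a correlation signal if it has
    # more than one residue type and at least one type shared by sequences.
    informative = [
        i for i in range(n_cols) if 1 < len(set(column(aln, i))) < len(aln)
    ]
    correlated = [
        (i, j)
        for i, j in combinations(informative, 2)
        if grouping(column(aln, i)) == grouping(column(aln, j))
    ]
    return conserved, correlated


conserved, correlated = classify(alignment)
print("conserved columns:", conserved)
print("correlated pairs:", correlated)
```

Under these assumptions the Q, T, D and R columns come out as conserved, and the Y/N column pairs with the F/T column. Real alignments are noisy, so a practical analysis would score partial agreement instead of demanding identical groupings.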
Part of the big alignment We see correlations between columns and between ‘things’.
Correlations Residues can correlate with residues, and when that happens we have found a function, no matter the conservation or variability. Residues that have a function correlate with that function.
Example correlation: Which cysteines form a pair in this protein family? Shown are aligned peptides from five different bacteria.
ASDFGCHIKLMCNPQRSCTVW
YSDYGCNIKLFCQPQRSCT--
ATDYPVQIKLMCNPQKSCSMW
YTDFGCHVKLLVQPNRSVTVW
-TDFGVHVKLMCNPQKSCSFW
Wilma Kuipers Thesis
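The cysteine question can be worked through mechanically. This is a minimal sketch under one stated assumption: two cysteines that form a pair are expected to be present or absent together, so candidate partners are columns with an identical Cys presence pattern across the five peptides. The function names are hypothetical, and columns are numbered from 1.

```python
from itertools import combinations

# Aligned peptides from the slide ('-' marks a gap).
peptides = [
    "ASDFGCHIKLMCNPQRSCTVW",
    "YSDYGCNIKLFCQPQRSCT--",
    "ATDYPVQIKLMCNPQKSCSMW",
    "YTDFGCHVKLLVQPNRSVTVW",
    "-TDFGVHVKLMCNPQKSCSFW",
]


def cysteine_patterns(aln):
    """Map each 1-based column holding at least one Cys to its presence pattern."""
    patterns = {}
    for i in range(len(aln[0])):
        pattern = tuple(seq[i] == "C" for seq in aln)
        if any(pattern):
            patterns[i + 1] = pattern
    return patterns


def candidate_pairs(aln):
    """Return column pairs whose cysteines appear and disappear together."""
    patterns = cysteine_patterns(aln)
    return [
        (i, j)
        for i, j in combinations(sorted(patterns), 2)
        # A fully conserved Cys column gives no co-variation signal: skip it.
        if patterns[i] == patterns[j] and not all(patterns[i])
    ]


print(sorted(cysteine_patterns(peptides)))
print(candidate_pairs(peptides))
```

Cysteines occur in columns 6, 12 and 18. Columns 12 and 18 lose their cysteine in the same peptide (the fourth), while column 6 loses it in the third and fifth, so this sketch proposes the 12/18 pair.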
Summary correlation If it's conserved, it's important; if it's important, it remains conserved. If residue positions show correlation with ‘something’, they are involved in that ‘something’. ‘Something’ can be any of a very large number of functions (optimal wavelength of an opsin; cellular localisation; binding an ion; binding over an interface; involved in the same internal motion; collaborating to bind the substrate; etcetera). Wilma Kuipers Thesis
Wilma Conserved or very conserved? Recalcitrant.
VT1V1TVC11TRC1RT1C?VV
ASDFGCHIKLMCNPQRSCTVW
YSDYGCNIKLFCNPQRSCT--
ATDLPVQIKLMANPQKSCSVW
LSDFGCHIKLMCNPQRSCTVW
YTDFGCHVKLLVQPNRSVAFW
-SDAGVHVKLMVQPNKSVSF-
YTDFGCHVKLLVQPNRSVVFW
-TDSGVHVKLMIQPDKSVSFW
V = Variable / not important; T = Conserved type; 1 = Conserved; ? = No idea; R = Recalcitrant
Left R is certainly recalcitrant. Left one is, or is not. What is the concept?
Entropy and variability So far we saw that conservation and correlation can help us find functionally important residues. Can variability patterns also tell us something?
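Entropy Sequence entropy $E_i$ at position $i$ is calculated from the frequencies $p_{i,a}$ of the twenty amino acid types $a$ at position $i$:
$$E_i = -\sum_{a=1}^{20} p_{i,a} \ln p_{i,a}$$
With this (Shannon) form, $E_i = 0$ for a fully conserved position and reaches its maximum, $\ln 20 \approx 3.0$, when all twenty types are equally frequent.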
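The entropy definition can be sketched in a few lines of Python. This assumes the usual Shannon form with a leading minus sign and natural logarithm, and it assumes that gaps and non-standard symbols are simply ignored; the slides do not say how those are handled. The name `column_entropy` is hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def column_entropy(column):
    """Shannon entropy (natural log) of one alignment column.

    Assumption: gaps and non-standard symbols are skipped.
    """
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Types with zero count contribute nothing, so only observed types are summed.
    return -sum((n / total) * math.log(n / total) for n in counts.values())


print(column_entropy("CCCVC"))   # mostly conserved column
print(column_entropy("ACDEFGHIKLMNPQRSTVWY"))  # every type once
```

A fully conserved column gives 0, and a column with all twenty types at equal frequency gives ln 20.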
Variability Sequence variability Vi is the number of amino acid types observed at position i in more than 0.5% of all sequences.
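Variability is a thresholded count, which a short sketch makes concrete. It assumes the 0.5% cutoff is taken relative to the total number of sequences in the column (gapped ones included), following the wording above; the function and parameter names are hypothetical.

```python
from collections import Counter

AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def column_variability(column, min_fraction=0.005):
    """Count amino acid types seen in more than min_fraction of all sequences.

    Assumption: the denominator is the number of sequences, gaps included.
    """
    n_sequences = len(column)
    if n_sequences == 0:
        return 0
    counts = Counter(res for res in column if res in AMINO_ACIDS)
    return sum(1 for n in counts.values() if n / n_sequences > min_fraction)


# 1000 sequences: a type seen 4 times (0.4%) is ignored, one seen 10 times (1%) counts.
print(column_variability("A" * 996 + "V" * 4))
print(column_variability("A" * 990 + "V" * 10))
```

The cutoff keeps rare residues, which may be sequencing or alignment errors, from inflating the count. In small alignments every observed type exceeds 0.5%, so the cutoff only matters for large families.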
Summary variability analysis Variability patterns hold information. Entropy and Variability are two (of the) ways to measure variability patterns. Entropy and Variability patterns can say something about the type of function, and thus add detail to correlation studies.
Conclusions: Data is difficult, but we need it (sic); life would be so nice if we could do without it. PDB files are the worst. Nomenclature is not homogeneous. Ontologies…. Much data has been carefully hidden in the literature, where it can be recovered only with great difficulty. Residue numbering is difficult but very necessary. Variability-entropy analysis is powerful, but requires very 'good' alignments.