A continuous probabilistic model of local RNA 3-D structure

A continuous probabilistic model of local RNA 3-D structure. Jes Frellsen The Bioinformatics Centre Department of Molecular Biology University of Copenhagen. Background. 3D structure is important for understanding the function of non-coding RNA molecules


Presentation Transcript


  1. THE BIOINFORMATICS CENTRE. A continuous probabilistic model of local RNA 3-D structure. Jes Frellsen, The Bioinformatics Centre, Department of Molecular Biology, University of Copenhagen

  2. Background
     • 3D structure is important for understanding the function of non-coding RNA molecules
     • Experimental methods for determining 3D structure are time-consuming and sometimes difficult
     • Local structure is typically modeled using discretization
        • E.g. fragment libraries are used in current methods for structure prediction
     • Our group has recently built a continuous probabilistic model of local protein structure, with great success [PLoS Comput Biol 2006, 2:1121-113]
        • Dynamic Bayesian Networks
        • Directional statistics
     • We have used a similar approach to model the local structure of RNA

  3. Representation of RNA
     • Each nucleotide in an RNA molecule can be represented by its base type and 7 dihedral angles
     • This allows accurate conversion into coordinates of all atoms in the structure using standard values
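To make the dihedral-angle representation concrete, here is a minimal sketch of how one dihedral angle is computed from four consecutive atom positions. The function name `dihedral` and the helper names are illustrative choices, not taken from the presentation, and the sign convention is assumed to be the usual one (positive for a clockwise rotation when looking along the central bond).

```python
import math


def _sub(a, b):
    """Component-wise difference a - b of two 3-D points."""
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    """Dot product of two 3-D vectors."""
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    """Cross product of two 3-D vectors."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def dihedral(p0, p1, p2, p3):
    """Dihedral angle (radians, in [-pi, pi]) defined by four points.

    The angle is measured around the central bond p1 -> p2.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    x = _dot(n1, n2)
    y = b1_len * _dot(b0, n2)
    return math.atan2(y, x)


if __name__ == "__main__":
    # Four points forming a right-angle twist around the z-axis.
    angle = dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
    print(math.degrees(angle))
```

Going the other way (from angles back to atom coordinates) additionally needs the fixed bond lengths and bond angles, which is presumably what the slide means by "standard values".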

  4. Angle distributions
     • Each variable lies on a circle
        • Requires directional statistics
     • Each variable is multi-modal
        • Can be described by a mixture of simple distributions
        • Von Mises distribution
     • The angles co-vary both within nucleotides and between consecutive nucleotides
        • We model this by a sequential model
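As an illustration of the mixture idea, the sketch below evaluates and samples a two-component von Mises mixture on the circle. The weights, means and concentrations (`WEIGHTS`, `MUS`, `KAPPAS`) are made-up values for demonstration only, not parameters from the model, and all function names are hypothetical.

```python
import math
import random

# Illustrative parameters only (not fitted to any RNA data).
WEIGHTS = [0.6, 0.4]
MUS = [-1.0, 2.0]      # mean directions in radians
KAPPAS = [4.0, 8.0]    # concentrations (larger = narrower peak)


def bessel_i0(x, terms=50):
    """Modified Bessel function of the first kind, order 0 (power series)."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / 2.0) ** 2 / (k * k)
        total += term
    return total


def von_mises_pdf(theta, mu, kappa):
    """Density of the von Mises distribution at angle theta."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))


def mixture_pdf(theta, weights, mus, kappas):
    """Density of a weighted mixture of von Mises components."""
    return sum(w * von_mises_pdf(theta, m, k) for w, m, k in zip(weights, mus, kappas))


def sample_mixture(rng, weights, mus, kappas):
    """Draw one angle: pick a component, then sample from it."""
    i = rng.choices(range(len(weights)), weights=weights)[0]
    angle = rng.vonmisesvariate(mus[i], kappas[i])
    # Wrap the result into [-pi, pi] for a consistent representation.
    return math.atan2(math.sin(angle), math.cos(angle))


if __name__ == "__main__":
    rng = random.Random(0)
    print([round(sample_mixture(rng, WEIGHTS, MUS, KAPPAS), 2) for _ in range(5)])
```

Because the density is periodic, a multi-modal angle distribution can be captured without the wrap-around problems that an ordinary Gaussian mixture would have near ±180°.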

  5. Our model
     • A DBN with 3 random variables per angle:
        • Discrete input variable indicating the angle type (7 states)
        • Hidden variable with 20 states
        • Output variable representing the angle value; its CPD given the hidden state is modelled by a von Mises distribution
     • Structure of an IOHMM with continuous output (except bookkeeping)
     • Does not impose a grouping of the angles
     • Parameters are estimated by stochastic EM from experimental data
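The generative process described on this slide can be sketched roughly as follows. This is a toy under stated assumptions: the parameters are random rather than estimated by stochastic EM, the hidden-state transition is assumed to be conditioned on the angle type (the exact edges of the network are not given in the transcript), and every name in the code is hypothetical.

```python
import math
import random

N_ANGLE_TYPES = 7  # from the slide: the input variable has 7 states
N_HIDDEN = 20      # from the slide: the hidden variable has 20 states


def random_distribution(rng, n):
    """A random probability vector of length n (toy stand-in for trained values)."""
    raw = [rng.random() for _ in range(n)]
    total = sum(raw)
    return [r / total for r in raw]


def make_toy_parameters(rng):
    """Random parameters; a real model would estimate these from data."""
    initial = random_distribution(rng, N_HIDDEN)
    # ASSUMPTION: transition[d][h_prev] is a distribution over the next hidden state.
    transition = [
        [random_distribution(rng, N_HIDDEN) for _ in range(N_HIDDEN)]
        for _ in range(N_ANGLE_TYPES)
    ]
    # One von Mises (mu, kappa) pair per hidden state.
    emission = [
        (rng.uniform(-math.pi, math.pi), rng.uniform(1.0, 20.0))
        for _ in range(N_HIDDEN)
    ]
    return initial, transition, emission


def sample_angles(rng, n_nucleotides, params):
    """Sample 7 angles per nucleotide, one time slice per angle."""
    initial, transition, emission = params
    angles = []
    hidden = None
    for t in range(n_nucleotides * N_ANGLE_TYPES):
        angle_type = t % N_ANGLE_TYPES  # deterministic input variable
        dist = initial if hidden is None else transition[angle_type][hidden]
        hidden = rng.choices(range(N_HIDDEN), weights=dist)[0]
        mu, kappa = emission[hidden]
        value = rng.vonmisesvariate(mu, kappa)
        angles.append((angle_type, math.atan2(math.sin(value), math.cos(value))))
    return angles


if __name__ == "__main__":
    rng = random.Random(1)
    params = make_toy_parameters(rng)
    for angle_type, value in sample_angles(rng, 2, params):
        print(angle_type, round(value, 3))
```

Treating each angle as its own time slice is what lets the chain run across nucleotide boundaries without grouping the angles into fixed blocks.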

  6. Evaluating the model: individual angle distributions
     • The model captures the distribution of the individual angles
     • E.g. the -angle and the -angle:

  7. Evaluating the model: pairwise distribution
     • The model captures the pairwise dependencies between the angles
     • E.g. the pairwise distribution of  and  (inter-nucleotide)

  8. Proof of concept: generating decoys for a target structure
     • A simple simulated annealing scheme:
        1. Sample a whole structure, S, without clashes
        2. Make a new structure, S', by resampling four consecutive angles in S (randomly picked)
        3. Evaluate S'
           • If it has clashes, it is rejected
           • If it has a better energy than S, then S' is set to be the new S
           • If it has a worse energy, then with probability p, S' is set to be the new S (otherwise it is rejected)
        4. Go to step 2
     • In the scheme we used
        • p = exp((E - E')/T), where E and E' are the energies of S and S', and T decreases with time
        • a simple "energy function" that promotes structures with the same Watson-Crick base pairs as are found in the target structure
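The scheme above can be sketched generically as below. Everything problem-specific is a placeholder: `sample_angle` stands in for sampling from the probabilistic model, `has_clash` for the clash check, and `toy_energy` for the base-pair-promoting energy function. The geometric cooling schedule is also an assumption, since the slide only says that T decreases with time.

```python
import math
import random


def anneal(rng, n_angles, energy, has_clash, sample_angle,
           n_steps=5000, t_start=1.0, t_end=0.01):
    """Simulated annealing over a list of angles (n_angles >= 4, n_steps >= 2)."""
    # Step 1: sample a whole clash-free structure.
    while True:
        current = [sample_angle(rng) for _ in range(n_angles)]
        if not has_clash(current):
            break
    e_current = energy(current)

    for step in range(n_steps):
        # ASSUMPTION: geometric cooling from t_start down to t_end.
        temperature = t_start * (t_end / t_start) ** (step / (n_steps - 1))

        # Step 2: resample four consecutive, randomly placed angles.
        start = rng.randrange(n_angles - 3)
        candidate = list(current)
        for i in range(start, start + 4):
            candidate[i] = sample_angle(rng)

        # Step 3: reject clashes, accept improvements, otherwise accept
        # with probability p = exp((E - E') / T).
        if has_clash(candidate):
            continue
        e_candidate = energy(candidate)
        if e_candidate <= e_current or rng.random() < math.exp(
            (e_current - e_candidate) / temperature
        ):
            current, e_current = candidate, e_candidate
        # Step 4: loop back to step 2.
    return current, e_current


# Toy stand-ins (hypothetical): uniform angle proposals, no clashes, and an
# energy that is lowest when every angle equals a fixed target value.
TARGET = 0.5


def toy_energy(angles):
    return sum(1.0 - math.cos(a - TARGET) for a in angles)


def uniform_angle(rng):
    return rng.uniform(-math.pi, math.pi)


def never_clashes(angles):
    return False


if __name__ == "__main__":
    rng = random.Random(42)
    final, e_final = anneal(rng, 12, toy_energy, never_clashes, uniform_angle)
    print(round(e_final, 3))
```

Because E' > E in the third case, the exponent is negative and p falls in (0, 1); as T shrinks, worse structures are accepted less and less often.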

  9. Results of generating 1,500 decoys for 5 different structures
     [Figure: target structure and best decoy, shown for 1ZIH]

  10. Perspectives
     • The model assigns a probability distribution to the conformational space and describes many aspects of local RNA structure well
     • It has numerous applications!
     • It allows for fast probabilistic sampling of locally RNA-like structures
        • Can thus be used in RNA 3D structure prediction
     • The model can be used to calculate the probabilities of seeing different local structures
        • Can thus be used for quality validation of experimentally determined structures

  11. Acknowledgements
     • The research was conducted in the structural bioinformatics group, led by Thomas Hamelryck, by Jes Frellsen, Ida Moltke, Martin Thiim and Thomas Hamelryck
     • We would like to thank
        • Our collaborator, Senior Research Professor Kanti V. Mardia from the University of Leeds, for his contributions on directional statistics
        • The Richardson Lab at Duke University for making their RNA dataset available
     • JF thanks IMA for the invitation to the conference
     • JF is funded by The Danish Council for Strategic Research
     • TH is funded by The Danish Council for Technology and Innovation

  12. Bayesian Networks and Dynamic Bayesian Networks
     • A BN is a DAG where
        • Nodes are random variables
        • Edges represent conditional dependencies in the factorization of the joint probability
     • The graph encodes conditional independencies
        • E.g. A and D are conditionally independent given C
     • DBNs are the time series expansion of BNs
        • E.g. an HMM:
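A small numerical check can make the conditional-independence statement tangible. The graph on the original slide is not preserved in the transcript, so the sketch below assumes a simple chain A → C → D with made-up binary probability tables; it only illustrates the general principle that the factorization implies the independence.

```python
from itertools import product

# ASSUMPTION: a chain A -> C -> D with illustrative binary tables.
P_A = {0: 0.3, 1: 0.7}
P_C_GIVEN_A = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}
P_D_GIVEN_C = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.25, 1: 0.75}}


def joint(a, c, d):
    """Joint probability from the BN factorization P(A) P(C|A) P(D|C)."""
    return P_A[a] * P_C_GIVEN_A[a][c] * P_D_GIVEN_C[c][d]


def max_independence_gap():
    """Largest |P(a, d | c) - P(a | c) P(d | c)| over all value combinations."""
    gap = 0.0
    for c in (0, 1):
        p_c = sum(joint(a, c, d) for a, d in product((0, 1), repeat=2))
        for a, d in product((0, 1), repeat=2):
            p_ad_given_c = joint(a, c, d) / p_c
            p_a_given_c = sum(joint(a, c, dd) for dd in (0, 1)) / p_c
            p_d_given_c = sum(joint(aa, c, d) for aa in (0, 1)) / p_c
            gap = max(gap, abs(p_ad_given_c - p_a_given_c * p_d_given_c))
    return gap


if __name__ == "__main__":
    # A gap of (numerically) zero means A and D are independent given C.
    print(max_independence_gap())
```

An HMM has the same shape repeated along a sequence: each hidden state depends only on the previous one, and each observation depends only on its own hidden state.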
