Learn how DNA sequence, transcription factors, and nucleosome occupancies influence gene regulation through chromatin structure. Discover data-driven models for DNA mechanics and nucleosome binding affinities.
A biophysical approach to predicting intrinsic and extrinsic nucleosome positioning signals Alexandre V. Morozov Department of Physics & Astronomy and the BioMaPS Institute for Quantitative Biology, Rutgers University morozov@physics.rutgers.edu IPAM, Nov. 26 2007
Introduction to chromatin scales Electron micrograph of D. melanogaster chromatin: arrays of regularly spaced nucleosomes, each ~80 Å across.
Overview of gene regulation (diagram: RNA Pol II + TAFs, [mRNA], gene, [TF1], [TF2], [TF3], [nucleosomes]) • Prediction and design of gene expression levels from DNA sequence: • Prediction of transcription factor and nucleosome occupancies in vitro and in vivo from genomic sequence • Prediction of levels of mRNA production from transcription factor and nucleosome occupancies
Data for modeling eukaryotic gene regulation Available data sources: • DNA sequence data for multiple organisms (…accagtttacgt…) • Genome-wide transcription factor occupancy data (ChIP-chip) • Structural data for 100s of protein-DNA complexes • Nucleosome positioning data: MNase digestion + sequencing or microarrays
Biophysical picture of gene transcription Wray, G. A. et al. Mol Biol Evol 2003 20:1377-1419
Structure of the nucleosome core particle (NCP) Left-handed superhelix (1.84 turns, 147 bp, R = 41.9 Å, P = 25.9 Å) PDB code: 1kx5 T.J. Richmond: K. Luger et al., Nature 1997 (2.8 Å); T.J. Richmond & C.A. Davey, Nature 2003 (1.9 Å)
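As a quick consistency check on the superhelix parameters quoted above, the contour length of an ideal helix can be computed from its radius, pitch, and number of turns. This is a minimal sketch: the function name is hypothetical, and it assumes the quoted R and P describe the helical path of the DNA axis and that all 147 bp lie on it.

```python
import math

def superhelix_contour_length(radius, pitch, turns):
    """Arc length of an ideal helix: each turn has length sqrt((2*pi*R)^2 + P^2)."""
    length_per_turn = math.sqrt((2.0 * math.pi * radius) ** 2 + pitch ** 2)
    return turns * length_per_turn

# Values quoted on the slide (in Angstroms).
contour_length = superhelix_contour_length(radius=41.9, pitch=25.9, turns=1.84)

# Average contour length per base pair, assuming 147 bp along the path.
rise_per_bp = contour_length / 147
print(f"contour length = {contour_length:.1f} A, rise per bp = {rise_per_bp:.2f} A")
```

The result, about 3.3 Å per base pair, is close to the canonical rise of B-form DNA, so the three numbers are mutually consistent.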
Gene regulation through chromatin structure • Transcription factor-DNA interactions are affected by the chromatin • Chromatin remodeling by ATP-dependent complexes • Histone variants (H2A.Z) • Post-translational histone modifications (“histone code”) (figure labels: H2A, H2B, H3, H4, H3 tail)
Experimental validation of the histone-DNA interaction model (Jon Widom) • Adding key dinucleotide motifs increases nucleosome affinity • Deleting dinucleotide motifs or disrupting their spacing decreases affinity (figure: sequence positions 8-138 marked every 10 bp, with the dyad indicated)
Histone-DNA interaction model and DNA flexibility • Nucleosome affinity depends on the presence and spacing of key dinucleotide motifs (e.g. TA, CA) • Nucleosome affinity can be explained by DNA flexibility
Data-driven model for DNA elastic energy (DNABEND) Geometry distributions for TA steps in ~100 non-homologous protein-DNA complexes: • Quadratic sequence-specific DNA elastic energy: • mean = $\langle\theta\rangle$ • width ~ $\langle(\theta - \langle\theta\rangle)^2\rangle^{-1}$ • Matrix of force constants: F W.K. Olson et al., PNAS 1998
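The energy expression itself did not survive on this slide. A harmonic form consistent with the bullets above (a mean geometry, a spread taken from the observed distributions, and a force-constant matrix F) can be sketched as follows. This is an assumption in the spirit of the cited Olson et al. approach, not a transcription of the slide: the step-parameter vector, the index s, the covariance matrix C, and the omission of any overall factor of k_B T are all choices made here for illustration.

```latex
% Assumed harmonic form. \theta^{(s)} collects the six parameters
% (twist, roll, tilt, shift, slide, rise) of base step s, and
% n_s n_{s+1} is the dinucleotide at that step.
\[
E_{\mathrm{el}} = \frac{1}{2} \sum_{s=1}^{N_{bs}}
  \left(\theta^{(s)} - \langle\theta\rangle_{n_s n_{s+1}}\right)^{T}
  F_{n_s n_{s+1}}
  \left(\theta^{(s)} - \langle\theta\rangle_{n_s n_{s+1}}\right),
\qquad
F = C^{-1}, \quad
C_{ij} = \left\langle (\theta_i - \langle\theta_i\rangle)
                      (\theta_j - \langle\theta_j\rangle) \right\rangle .
\]
```

With six degrees of freedom per base step, a sequence of N_bs steps has 6N_bs geometric variables in total.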
Elastic rod model DNA looping induced by a Lac repressor tetramer
Elastic energy and geometry of DNA constrained to follow an arbitrary curve (DNABEND) (figure label: Δr) • Total energy: sequence-specific DNA elastic energy + “constraint” energy • Minimize to determine energy & geometry • System of linear equations: ½ × 6N_bs × 6N_bs
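Why minimization reduces to a linear system can be seen in a toy one-dimensional analogue: when both the elastic term and the constraint term are quadratic in the geometric variables, setting the gradient to zero gives linear equations. The sketch below is not DNABEND. It assumes one bend angle per step, a diagonal stiffness, and a single quadratic penalty tying the summed bend to a target value; all names and numbers are hypothetical.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def minimize_constrained_bend(preferred, stiffness, target_total, weight):
    """Minimize 0.5*sum f_i*(t_i - p_i)^2 + 0.5*weight*(sum t_i - target_total)^2.

    Setting the gradient to zero gives (diag(f) + weight * ones) t = f*p + weight*target.
    Returns the minimizing angles and the elastic part of the energy.
    """
    n = len(preferred)
    matrix = [[(stiffness[i] if i == j else 0.0) + weight for j in range(n)]
              for i in range(n)]
    rhs = [stiffness[i] * preferred[i] + weight * target_total for i in range(n)]
    angles = solve_linear_system(matrix, rhs)
    elastic = 0.5 * sum(f * (t - p) ** 2
                        for f, t, p in zip(stiffness, angles, preferred))
    return angles, elastic

# Four steps that prefer to be straight; the last ones are stiffer.
angles, elastic = minimize_constrained_bend(
    preferred=[0.0, 0.0, 0.0, 0.0],
    stiffness=[1.0, 1.0, 2.0, 4.0],
    target_total=10.0,
    weight=1000.0,
)
print(angles, elastic)
```

In this toy setting the softer steps absorb most of the imposed bend, which is the qualitative behaviour one would expect from a sequence-dependent elastic model.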
Example of DNA geometry prediction: nucleosome structure Ideal superhelix Prediction for NCP (1kx5)
Predictions of nucleosome binding affinities • Experimental techniques: • nucleosome dialysis (A. Thastrom et al., J. Mol. Biol. 1999, 2004; P.T. Lowary & J. Widom, J. Mol. Biol. 1998) • nucleosome exchange (T.E. Shrader & D.M. Crothers, PNAS 1989; T.E. Shrader & D.M. Crothers, J. Mol. Biol. 1990) • Alignment model (Segal E. et al., Nature 2006): • Collect nucleosome-bound sequences in yeast • Center-align sequences • Construct nucleosome-DNA model using observed dinucleotide frequencies
Alignment Model (in vivo selection) • MNase digestion • Extract DNA, clone into plasmids • Sequence and center-align (142-152 bp): AGGTTTATAG.. AGGTTAATCG.. AGGTAAATAA.. ……………….. • Dinucleotide log score:
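The scoring formula is not recoverable from the slide, so the following is only a sketch of one plausible way to turn aligned sequences into a dinucleotide log score: count dinucleotides at each alignment position, convert counts to smoothed frequencies, and sum log-ratios against a background. The function names, the pseudocount, and the uniform 1/16 background are assumptions made here; the published alignment model may differ in detail.

```python
import math
from collections import Counter
from itertools import product

DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]

def build_dinucleotide_model(aligned, pseudocount=1.0):
    """Position-specific dinucleotide frequencies from equal-length aligned sequences."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    total = len(aligned) + pseudocount * len(DINUCLEOTIDES)
    model = []
    for pos in range(length - 1):
        counts = Counter(seq[pos:pos + 2] for seq in aligned)
        model.append({d: (counts[d] + pseudocount) / total for d in DINUCLEOTIDES})
    return model

def log_score(seq, model, background=1.0 / 16.0):
    """Sum over positions of log(model frequency / background frequency)."""
    return sum(math.log(model[pos][seq[pos:pos + 2]] / background)
               for pos in range(len(model)))

# The three 10-bp fragments shown on the slide, used as a tiny alignment.
aligned = ["AGGTTTATAG", "AGGTTAATCG", "AGGTAAATAA"]
model = build_dinucleotide_model(aligned)
score_member = log_score("AGGTTTATAG", model)
score_other = log_score("CCCCCCCCCC", model)
print(score_member, score_other)
```

A sequence resembling the alignment scores above zero and an unrelated sequence scores below it; with only three training sequences the numbers themselves carry no biological meaning.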
From nucleosome energies to probabilities and occupancies • Use dynamic programming to find the partition function and thus the probabilities and occupancies of each DNA-binding factor, e.g. nucleosomes (plots: nucleosome energy vs. chromosomal coordinate; nucleosome probability & occupancy vs. chromosomal coordinate)
Nucleosome occupancy is dynamic • Nucleosome-free site: TGACGTCA • Nucleosome-occluded site: TGACGTCA • Nucleosome is displaced by the bound TF: TGACGTCA
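The dynamic-programming step can be illustrated with a standard forward-backward recursion for non-overlapping particles on a line. This is a minimal sketch for a single species (nucleosomes only) with a fixed footprint; the function name, the chemical potential parameter, and the units (energies in kT) are assumptions, and no rescaling is done to guard against overflow on genome-length inputs.

```python
import math

def nucleosome_occupancy(energies, footprint, chemical_potential=0.0, kT=1.0):
    """Start probabilities and per-bp occupancies for non-overlapping nucleosomes.

    energies[i] is the energy of a nucleosome covering bp i .. i+footprint-1.
    """
    n_starts = len(energies)
    n_bp = n_starts + footprint - 1
    weights = [math.exp(-(e - chemical_potential) / kT) for e in energies]

    # forward[j]: partition function of the first j base pairs.
    forward = [1.0] * (n_bp + 1)
    for j in range(1, n_bp + 1):
        forward[j] = forward[j - 1]  # bp j-1 left unoccupied
        if j >= footprint:           # or a nucleosome ends at bp j-1
            forward[j] += weights[j - footprint] * forward[j - footprint]

    # backward[j]: partition function of base pairs j .. n_bp-1.
    backward = [1.0] * (n_bp + 1)
    for j in range(n_bp - 1, -1, -1):
        backward[j] = backward[j + 1]  # bp j left unoccupied
        if j + footprint <= n_bp:      # or a nucleosome starts at bp j
            backward[j] += weights[j] * backward[j + footprint]

    z_total = forward[n_bp]
    start_prob = [forward[i] * weights[i] * backward[i + footprint] / z_total
                  for i in range(n_starts)]

    # Occupancy of a bp: total probability of all nucleosomes covering it.
    occupancy = [0.0] * n_bp
    for i, p in enumerate(start_prob):
        for x in range(i, i + footprint):
            occupancy[x] += p
    return start_prob, occupancy

# Toy example: 3 bp of DNA, a 2-bp footprint, two equally favourable positions.
start_prob, occupancy = nucleosome_occupancy([0.0, 0.0], footprint=2)
print(start_prob, occupancy)
```

In the toy example there are three equally weighted configurations (empty, nucleosome at the first position, nucleosome at the second), so each start probability is 1/3 and the middle base pair is covered with probability 2/3. Handling several competing factors, as the slide describes, would add one term per factor and footprint to each recursion.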
Nucleosome occupancy of TATA boxes explains gene expression levels
Nucleosome occupancy in the vicinity of TATA boxes: default repression TATA
Functional sites by ChIP-chip: in vivo genome-wide measurements of TF occupancy • Genome-wide occupancies for 203 transcription factors in yeast by ChIP-chip (Harbison et al., Nature 2004: “Transcriptional regulatory code”) • MacIsaac et al., BMC Bioinformatics 2006: “An improved map of phylogenetically conserved regulatory sites” (98 factor specificities + 26 more from the literature)
Nucleosome occupancy of transcription factor binding sites: default repression • <Occ(functional sites)> - <Occ(non-functional sites)> • In vitro: nucleosomes compete for DNA sequence only with each other DNABEND: Nucleosomes p < 0.05
Nucleosome occupancy of transcription factor binding sites • <Occ(functional sites)> - <Occ(non-functional sites)> • In vivo: nucleosomes compete for DNA sequence with TFs DNABEND: Nucleosomes + TFs p < 0.05
Functional transcription factor sites are clustered DNABEND: Nucleosomes + TFs, randomized functional sites p < 0.05 functional sites non-functional sites Clustering!
Functional transcription factor sites are not occupied by nucleosomes in vivo Yuan et al. microarray experiment DNABEND + Transcription Factors DNABEND Alignment model
Nucleosome-induced cooperativity Nucleosome-occluded TF sites: no separate binding TGACGTCA TAAGGCCT Nucleosome-occluded TF sites: cooperative binding TAAGGCCT TGACGTCA Miller and Widom, Mol.Cell.Biol. 2003
Nucleosome occupancy of TF sites in a model system TF sites pCYC1
Nucleosome position predictions: GAL1-10 locus GAL10 GAL1 Nucleosomes in vitro Nucleosomes in vivo TBP GAL4
Nucleosome position predictions: HIS3-PET56 locus Nucleosomes in vitro Nucleosomes in vivo TBP GCN4
Conclusions • Predicted histone-DNA binding affinities and genome-wide nucleosome occupancies using a DNA mechanics model + a thermodynamic model of nucleosomes competing with other factors for genomic sequence • Chromatin structure around ORF starts is consistent with microarray-based measurements of nucleosome positions, and can be explained with a simple model of nucleosomes “phasing off” bound TBPs • Nucleosome-induced cooperativity (brought about by clustering of functional transcription factor binding sites) is responsible for the increased accessibility of functional sites
Future Directions • Lots of nucleosome positioning sequences [soon to become] available – can a better model of dinucleotide (base stacking) energies be built? {Anirvan Sengupta, Rutgers} • Can such a model be used to inform a better DNA mechanics model? Conversely, can a DNA mechanics model be “compressed”, i.e. encapsulated in a simple set of dinucleotide energies? {Anirvan Sengupta, Rutgers} • DNABEND extensions to non-nucleosome systems, i.e. nucleoid proteins, DNA loops etc.? {John Marko, Jon Widom, Northwestern} • Prediction of in vivo nucleosome positions in gene expression libraries {Ligr et al., Genetics 2006: random libraries of yeast promoters; Lu Bai et al., unpublished}
Acknowledgements PEOPLE: • Eric Siggia (Rockefeller University) • Jon Widom (Northwestern University) • Harmen Bussemaker (Columbia University) FUNDING: • Leukemia & Lymphoma Society Fellowship • BioMaPS Institute, Rutgers University
Induced periodicity of stable nucleosomes (figure labels: stable, stable)