Learning Regulatory Networks that Represent Regulator States and Roles
Keith Noto (noto@cs.wisc.edu) and Mark Craven
K. Noto and M. Craven, Learning Regulatory Network Models that Represent Regulator States and Roles. To appear in Lecture Notes in Bioinformatics.
Task
• Given:
  • Gene expression data
  • Other sources of data (e.g. sequence data, transcription factor binding sites, transcription unit predictions)
• Do:
  • Construct a model that captures regulatory interactions in a cell
Key Ideas: States and Roles
• Regulator states
  • Cannot be observed
  • Depend on more than regulator expression
  • We use cellular conditions as surrogates/predictors of regulation effectors
• Regulator roles
  • Is a regulator an activator or a repressor?
  • We use sequence analysis to predict these roles
[Figure: diagram with nodes labeled Regulator Expression, Effector, Cellular Condition, Regulator State, and Regulatee Expression]
Network Variables and Structure
• Regulators: expression states represented as a mixture of Gaussians
• Cellular Conditions: “stationary growth phase”, “heat shock”, ...
• Hidden Regulator States: “activated” or “inactivated”; relevant parents are selected from the regulators and cellular conditions
• Regulatees: expression states represented as a mixture of Gaussians; connected to regulator states where we have evidence of regulation
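As a rough illustration of what "expression states represented as a mixture of Gaussians" can mean, the sketch below computes the posterior probability of each expression state for a single measurement. The two state names, the mixture weights, the means, and the standard deviations are hypothetical values chosen for the example; they are not taken from the slides.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a univariate Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def state_posterior(x, components):
    """Return P(state | x) for a mixture given as state -> (weight, mean, std)."""
    joint = {s: w * gaussian_pdf(x, m, sd) for s, (w, m, sd) in components.items()}
    total = sum(joint.values())
    return {s: p / total for s, p in joint.items()}

# Hypothetical mixture: one component per expression state (assumed values).
components = {"low": (0.5, -1.0, 0.5), "high": (0.5, 1.0, 0.5)}

# A measurement near the "high" mean is assigned mostly to the "high" state.
posterior = state_posterior(0.8, components)
```

With equal weights and variances, a measurement exactly halfway between the two means is assigned to each state with probability 0.5.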
Network Parameters: Hidden Nodes use CPD-Trees
• Parents selected from regulator expression, cellular conditions
• May contain context-sensitive independence
[Figure: CPD-tree for the hidden node "metJ state", with candidate parents metJ, Growth Phase, Growth Medium, and Heat Shock. The tree splits on metJ (= Low expression vs. ≠ Low expression) and on Growth Phase (= Log Phase vs. ≠ Log Phase). Leaf values of P(metJ state = activated): 0.001, 0.004, 0.994]
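A CPD-tree can be read as a nested sequence of tests on parent values that ends in a probability. The sketch below encodes the metJ example that way. The tuple encoding and the `lookup` helper are hypothetical, and the assignment of the three leaf probabilities to branches is an assumption (low metJ expression gives 0.001; otherwise log phase gives 0.994 and any other phase gives 0.004), because the slide layout does not survive in this transcript.

```python
# A leaf is a float P(metJ state = activated). An internal node is a tuple
# (variable, test_value, subtree_if_equal, subtree_if_not_equal).
# ASSUMPTION: which leaf hangs off which branch is guessed, not read from the slide.
METJ_STATE_TREE = (
    "metJ", "low",
    0.001,                                   # metJ = Low expression
    ("growth_phase", "log", 0.994, 0.004),   # metJ != Low expression
)

def lookup(tree, assignment):
    """Walk the tree using the parent values in `assignment` until a leaf is reached."""
    while not isinstance(tree, float):
        variable, value, if_equal, if_not_equal = tree
        tree = if_equal if assignment[variable] == value else if_not_equal
    return tree

# Context-sensitive independence: once metJ expression is low, the growth
# phase is never consulted, so it can be left out of the assignment.
p_low = lookup(METJ_STATE_TREE, {"metJ": "low"})
p_log = lookup(METJ_STATE_TREE, {"metJ": "high", "growth_phase": "log"})
```

Growth Medium and Heat Shock do not appear in the tree at all, which is how a CPD-tree expresses that a candidate parent was not selected.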
Initializing Roles
[Figure: the metA transcription unit, showing the transcription start site*, the -35 position, and DNA binding sites labeled Upstream and Downstream]
• metR binds upstream; considered an activator
• metJ binds downstream; considered a repressor
CPT for regulatee metA:
metR state     metJ state     P(Low)   P(High)
activated      activated      0.6      0.4
activated      inactivated    0.2      0.8
inactivated    activated      0.9      0.1
inactivated    inactivated    0.5      0.5
*Predicted transcription start sites from Bockhorst et al., ISMB ‘03
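To make the role encoding concrete, the sketch below stores the metA CPT from this slide as a dictionary and computes the expected probability of high metA expression when the two hidden regulator states are uncertain. The function name `expected_p_high` is hypothetical, and treating the beliefs about metR and metJ as independent is a simplifying assumption made only for this example.

```python
# CPT for regulatee metA, keyed by (metR state, metJ state); values from the slide.
CPT_META = {
    ("activated", "activated"): {"low": 0.6, "high": 0.4},
    ("activated", "inactivated"): {"low": 0.2, "high": 0.8},
    ("inactivated", "activated"): {"low": 0.9, "high": 0.1},
    ("inactivated", "inactivated"): {"low": 0.5, "high": 0.5},
}

def expected_p_high(p_metr_activated, p_metj_activated):
    """Marginalise P(metA = High) over the two hidden states.

    ASSUMPTION: the two beliefs are combined as if independent.
    """
    total = 0.0
    for (metr, metj), dist in CPT_META.items():
        p_r = p_metr_activated if metr == "activated" else 1.0 - p_metr_activated
        p_j = p_metj_activated if metj == "activated" else 1.0 - p_metj_activated
        total += p_r * p_j * dist["high"]
    return total

# Activator on, repressor off: the CPT row with the highest P(High).
best_case = expected_p_high(1.0, 0.0)
```

The roles are visible in the table itself: activating metR raises P(High) for either metJ state, and activating metJ lowers it for either metR state.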
Training the Model
• Initialize the parameters
  • Activators tend to bind more upstream than repressors
• Use an EM algorithm to set parameters
  • E-Step: Determine expected states of regulators
  • M-Step: Update CPDs
  • Repeat until convergence
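The E-step/M-step loop above can be sketched on a deliberately tiny model: one hidden binary regulator state with two observed binary regulatees. This is a toy stand-in, not the model from the talk (which uses Gaussian mixtures and CPD-trees). The data, the starting parameters, and all function names are hypothetical.

```python
import math

def e_step(data, pi, theta):
    """Return P(hidden = 1 | observation) for each observation and the log likelihood.

    pi is P(hidden = 1); theta[j][h] is P(regulatee j = 1 | hidden = h).
    """
    posts, loglik = [], 0.0
    for obs in data:
        joint = []
        for h, prior in ((0, 1.0 - pi), (1, pi)):
            p = prior
            for j, x in enumerate(obs):
                p *= theta[j][h] if x == 1 else 1.0 - theta[j][h]
            joint.append(p)
        total = joint[0] + joint[1]
        posts.append(joint[1] / total)
        loglik += math.log(total)
    return posts, loglik

def m_step(data, posts):
    """Re-estimate pi and theta from expected counts (no smoothing applied)."""
    pi = sum(posts) / len(data)
    theta = []
    for j in range(len(data[0])):
        row = []
        for h in (0, 1):
            weights = [q if h == 1 else 1.0 - q for q in posts]
            on = sum(w for w, obs in zip(weights, data) if obs[j] == 1)
            row.append(on / sum(weights))
        theta.append(row)
    return pi, theta

def run_em(data, pi, theta, max_iters=100, tol=1e-8):
    """Alternate E and M steps until the log likelihood stops improving."""
    history = []
    for _ in range(max_iters):
        posts, loglik = e_step(data, pi, theta)
        history.append(loglik)
        if len(history) > 1 and history[-1] - history[-2] < tol:
            break
        pi, theta = m_step(data, posts)
    return pi, theta, history

# Hypothetical binary observations of two regulatees across 20 experiments.
data = [(1, 1)] * 8 + [(0, 0)] * 8 + [(1, 0)] * 2 + [(0, 1)] * 2
pi, theta, history = run_em(data, 0.5, [[0.3, 0.7], [0.4, 0.6]])
```

The starting values play the part of the role-based initialization: each regulatee starts out more likely to be on when the hidden state is 1, and EM then refines the parameters from the data.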
Experimental Data and Procedure
• Expression measurements from Affymetrix microarrays (Fred Blattner’s lab, University of Wisconsin-Madison)
• Regulator binding site predictions from TRANSFAC, EcoCyc, cross-species comparison (McCue et al., Genome Research 12, 2002)
• Experimental data consists of:
  • 90 experiments
  • 6 cellular condition variables (between two and seven values each)
  • 296 regulatees
  • 64 regulators
• Cross-fold validation
  • Microarrays held aside for testing
  • Conditions from test microarrays do not appear in training set
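One way to honor the constraint that test conditions never appear in training is to assign whole conditions, rather than individual microarrays, to folds. The sketch below does that. The function name, the number of folds, and the condition labels are hypothetical, and the slides do not say how the folds were actually constructed.

```python
from collections import defaultdict

def condition_folds(experiment_conditions, n_folds):
    """Split experiment indices into folds so that no condition spans train and test.

    experiment_conditions[i] is the (sortable) condition label of experiment i.
    Returns a list of (train_indices, test_indices) pairs, one per fold.
    """
    by_condition = defaultdict(list)
    for index, condition in enumerate(experiment_conditions):
        by_condition[condition].append(index)

    # Deal whole conditions out to folds in round-robin order.
    fold_members = [[] for _ in range(n_folds)]
    for k, condition in enumerate(sorted(by_condition)):
        fold_members[k % n_folds].extend(by_condition[condition])

    folds = []
    for test in fold_members:
        held_out = set(test)
        train = [i for i in range(len(experiment_conditions)) if i not in held_out]
        folds.append((train, sorted(test)))
    return folds

# Hypothetical labels for six microarrays.
labels = ["heat shock", "heat shock", "log", "log", "stationary", "stationary"]
folds = condition_folds(labels, n_folds=3)
```

Every microarray lands in exactly one test fold, and a fold's training set never shares a condition label with its test set.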
Model | Classification Error | Average Squared Error | Log Likelihood
22.16% | 0.75 | -13,363
Baseline #1 (No hidden nodes, no cellular conditions)
12.42% | 0.51 | -12,193
Baseline #2 (No hidden nodes, using cellular conditions)
13.34% | 0.51 | -12,004
Our Model (3 iterations of adding missing TFs)
Random Initialization (3 iterations of adding missing TFs)
14.19% | 0.54 | -11,893