
Genetic network inference: from co-expression clustering to reverse engineering


Presentation Transcript


  1. Genetic network inference: from co-expression clustering to reverse engineering • Patrik D’haeseleer, Shoudan Liang and Roland Somogyi

  2. The goal of this review • Principles of genetic network organization • Computational methods for extracting network architectures from experimental data

  3. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions and Outlook

  4. Genes encode proteins, some of which in turn regulate other genes • The aim is to determine the structure of this intricate network of genetic regulatory interactions

  5. Traditional approach: local • Examining and collecting data on a single gene, a single protein or a single reaction at a time • Alternative: functional genomics

  6. Functional Genomics • Specifically, functional genomics refers to the development and application of global experimental approaches to assess gene function by making use of the information and reagents provided by structural genomics. • High-throughput, large-scale experimental methodologies combined with statistical and computational analysis of the results.

  7. Functional Genomics (Cont.) • We need to define the mapping from sequence space to functional space.

  8. Intermediate representation • Focus at the level of single cells • A biological system can be considered to be a state machine, where the change in internal state of the system depends on both its current internal state and any external inputs.

  9. The goal • Observe the state of a cell and how it changes under different circumstances, and from this derive a model of how these state changes are generated • The state of a cell • All the variables that determine its behavior

  10. Example • A simple, 6-node regulatory network

  11. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions and Outlook

  12. The global gene expression pattern is the result of the collective behavior of individual regulatory pathways • Gene function depends on its cellular context; thus understanding the network as a whole is essential.

  13. Boolean Networks • Each gene is considered as a binary variable (either ON or OFF) regulated by other genes through logical or Boolean functions. • Even with this simplification, the network behavior is already extremely rich.

  14. Boolean Networks (Cont.) • Cell differentiation corresponds to transitions from one global gene expression pattern to another.
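
To make the Boolean picture concrete, here is a minimal sketch of a synchronous Boolean network. The three genes and their rules are hypothetical (they are not taken from the slides); the point is only to show how global expression patterns settle into attractors, which is the sense in which a change of pattern can stand in for differentiation.

```python
from itertools import product

# Hypothetical 3-gene network (an assumption for illustration only):
# A and B repress each other, C simply follows A.
RULES = {
    "A": lambda s: not s["B"],
    "B": lambda s: not s["A"],
    "C": lambda s: s["A"],
}
GENES = sorted(RULES)  # fixed gene order: A, B, C


def step(state):
    """Synchronously update every gene from the current state tuple."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)


def attractor(state):
    """Iterate until a state repeats; return the cycle that was reached."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return seen[seen.index(state):]


def all_attractors():
    """Collect the distinct attractors reachable from every initial state."""
    found = set()
    for start in product([False, True], repeat=len(GENES)):
        cycle = attractor(start)
        i = cycle.index(min(cycle))  # rotate to a canonical starting state
        found.add(tuple(cycle[i:] + cycle[:i]))
    return found
```

With these toy rules, `all_attractors()` finds two fixed points (A on, B off, and the reverse) plus one two-state cycle, so even three genes already give several distinct stable expression patterns.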

  15. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions and Outlook

  16. Scoring methods • Whether there has been a significant change at any one condition • Whether there has been a significant aggregate change over all conditions • Whether the fluctuation pattern shows high diversity according to Shannon entropy
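
The entropy-based score can be sketched as follows. This is an illustration under assumptions: the slides do not say how expression values are discretized, so the equal-width binning and the default of three bins are choices made here, and `shannon_entropy` is a hypothetical helper name.

```python
import math
from collections import Counter


def shannon_entropy(values, n_bins=3):
    """Entropy (in bits) of one gene's expression values across conditions.

    Values are discretized into n_bins equal-width bins (an assumption;
    other binning schemes are possible).
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a flat profile carries no information
    width = (hi - lo) / n_bins
    bins = [min(int((v - lo) / width), n_bins - 1) for v in values]
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(bins).values())
```

A gene whose levels spread evenly over the bins scores highest, while a gene that barely changes scores near zero.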

  17. Guilt By Association • Select a gene • Determine its nearest neighbors in expression space within a certain user-defined distance cut-off
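
The two steps above can be sketched in a few lines. The Euclidean distance and the dictionary layout (gene name mapped to a list of expression values) are assumptions made for this example; any other distance in expression space would slot in the same way.

```python
import math


def euclidean(p, q):
    """Euclidean distance between two expression profiles of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def neighbors(profiles, query, cutoff):
    """Genes within `cutoff` of the query gene, nearest first.

    profiles: dict mapping gene name -> list of expression values.
    Returns a sorted list of (distance, gene) pairs.
    """
    target = profiles[query]
    hits = [(euclidean(target, prof), gene)
            for gene, prof in profiles.items() if gene != query]
    return sorted((d, g) for d, g in hits if d <= cutoff)
```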

  18. Clustering • extract groups of genes that are tightly co-expressed over a range of different experiments.

  19. Caution • Different clustering methods can have very different results • It’s not yet clear which clustering methods are most useful for gene expression analysis.

  20. Definition: Gene Expression Profile • An expression profile $e_j$ over an ordered list of $N$ samples ($k = 1$ to $N$) for a particular gene $j$ is a vector of scaled expression values $v_{jk}$ • The expression profile is: • $e_j = (v_{j1}, v_{j2}, v_{j3}, \ldots, v_{jN})$

  21. Definition: Gene Expression Profile (Cont.) • The difference between two genes $p$ and $q$ may be estimated as an $N$-dimensional metric “distance” between $e_p$ and $e_q$. • Euclidean distance: • $d(e_p, e_q) = \sqrt{\sum_{k=1}^{N} (v_{pk} - v_{qk})^2}$

  22. Clustering algorithms • Non-hierarchical methods • Cluster N objects into K groups in an iterative process until certain goodness criteria are optimized • E.g. K-means
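
As an illustration of the iterative scheme, here is a bare-bones K-means sketch. It assumes Euclidean distance and, to stay deterministic, seeds the centroids with the first K profiles; practical implementations usually use random or smarter initialization and several restarts.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def mean(points):
    """Component-wise mean of a non-empty list of profiles."""
    return [sum(col) / len(points) for col in zip(*points)]


def kmeans(profiles, k, max_iter=100):
    """Partition profiles into k clusters; returns (labels, centroids)."""
    centroids = [list(p) for p in profiles[:k]]  # naive seeding (assumption)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each profile goes to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: euclidean(p, centroids[c]))
                      for p in profiles]
        if new_labels == labels:
            break  # assignments stable: the criterion stopped improving
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(profiles, labels) if lab == c]
            if members:
                centroids[c] = mean(members)
    return labels, centroids
```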

  23. Clustering algorithms • Hierarchical methods • Return a hierarchy of nested clusters, where each cluster typically consists of the union of two or more smaller clusters. • Agglomerative methods • Start with single-object clusters and recursively merge them into larger clusters • Divisive methods • Start with the cluster containing all objects and recursively divide it into smaller clusters

  24. Other applications of co-expression clusters • Extraction of regulatory motifs • Genes in the same expression cluster share biological functions • Inference of functional annotation • Functions of unknown genes may be hypothesized from genes with known function within the same cluster • As a molecular signature in distinguishing cell or tissue types • mRNA expression
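
The agglomerative variant can be sketched as below. Single linkage (the distance between two clusters is the distance between their closest members) and Euclidean distance are assumptions chosen for brevity; average or complete linkage would only change the `single_linkage` helper.

```python
import math


def euclidean(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def single_linkage(profiles, a, b):
    """Distance between clusters a and b (tuples of profile indices)."""
    return min(euclidean(profiles[i], profiles[j]) for i in a for j in b)


def agglomerate(profiles):
    """Merge the two closest clusters until one remains.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    which is enough to draw the nested hierarchy.
    """
    clusters = [(i,) for i in range(len(profiles))]
    merges = []
    while len(clusters) > 1:
        pairs = [(single_linkage(profiles, clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        dist, x, y = min(pairs)
        merges.append((clusters[x], clusters[y], dist))
        merged = clusters[x] + clusters[y]
        clusters = [c for i, c in enumerate(clusters) if i not in (x, y)]
        clusters.append(merged)
    return merges
```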

  25. Which clustering method to use? • There is no single best criterion for obtaining a partition because no precise and workable definition of ‘cluster’ exists. • Clusters can be of any arbitrary shapes and sizes in a multidimensional pattern space.

  26. Challenge in cluster analysis • A gene could be a member of several clusters, each reflecting a particular aspect of its function and control • Solutions • clustering methods that partition genes into non-exclusive clusters • Several clustering methods could be used simultaneously

  27. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions and Outlook

  28. Level of biochemical detail • Abstract: Boolean networks • Concrete: full biochemical interaction models with stochastic kinetics, as in Arkin et al. (1998)

  29. Forward and inverse modeling • Forward modeling approach • Inverse modeling, or reverse engineering • Given an amount of data, what can we deduce about the unknown underlying regulatory network? • Requires the use of a parametric model, the parameters of which are then fit to the real-world data.

  30. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions and Outlook

  31. Goal of network inference • Construct a coarse-scale model of the network of regulatory interactions between the genes • It’s possible to reverse engineer a network from its activity profiles

  32. Data requirements • We need to observe the expression of that gene under many different combinations of expression levels of its regulatory inputs • Use data from different sources • Deal with different data types

  33. Estimates for network models • Consider a sparse network model of N genes, where each gene is affected by only K other genes on average • This corresponds to a sparsely connected, directed graph with N nodes and NK edges.

  34. Estimates for network models (Cont.) • To specify the correct model, we need on the order of $NK \log(N/K)$ bits of information.
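
As a rough sanity check on the size of this search space, one can count the directed graphs with $N$ nodes and $NK$ edges. The counting below is a sketch under the assumption that edges are chosen freely among all $N^2$ ordered node pairs (self-loops allowed); it uses the standard bounds $(n/m)^m \le \binom{n}{m} \le (en/m)^m$.

```latex
\[
\#\text{graphs} = \binom{N^2}{NK},
\qquad
NK \log_2 \frac{N}{K}
\;\le\;
\log_2 \binom{N^2}{NK}
\;\le\;
NK \log_2 \frac{eN}{K}.
\]
```

Picking one graph out of this set therefore takes a number of bits that grows like $NK \log(N/K)$, far fewer than the $N^2$ bits needed to describe an arbitrary, fully connected wiring diagram.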

  35. Correlation Metric Construction • Adam Arkin and John Ross • A method to reconstruct reaction networks from measured time series of the component chemical species. • The system is driven using inputs for some of the chemical species and the concentration of all the species is monitored over time.

  36. Correlation Metric Construction (Cont.) • The time-lagged correlation matrix is calculated • From this a distance matrix is constructed based on the maximum correlation between any two chemical species • This distance matrix is then fed into a simple clustering algorithm to generate a tree of connections between the species • The results are mapped into a two-dimensional graph for visualization
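
The first two steps can be sketched as follows. This is an illustration, not the published procedure: the Pearson correlation over overlapping windows, the range of lags, and the conversion $d = \sqrt{2(1 - c)}$ from maximum absolute correlation $c$ to distance are all assumptions made here. The clustering and two-dimensional mapping steps are left out.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is flat)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def max_lagged_corr(x, y, max_lag):
    """Largest |correlation| between x(t) and y(t + lag) for |lag| <= max_lag."""
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a, b = x[:len(x) - lag], y[lag:]
        else:
            a, b = x[-lag:], y[:len(y) + lag]
        best = max(best, abs(pearson(a, b)))
    return best


def distance_matrix(series, max_lag):
    """series: dict name -> time series. Returns dict (p, q) -> distance."""
    names = sorted(series)
    return {(p, q): math.sqrt(max(0.0, 2.0 * (1.0 - max_lagged_corr(
                series[p], series[q], max_lag))))
            for p in names for q in names}
```

Two species that are uncorrelated at lag zero but strongly correlated after a delay end up close together, which is what lets the method pick up causal ordering in a driven system.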

  37. Additive regulation models • Property • The regulatory inputs are combined using a weighted sum • Can be used as a first-order approximation to the gene network

  38. Additive regulation models • The change in each variable over time is given by a weighted sum of all other variables: • $\frac{dy_i}{dt} = \sum_j w_{ji} y_j + b_i$ • $y_i$ is the level of the i-th variable • $b_i$ is a bias term indicating whether i is expressed or not in the absence of regulatory inputs • $w_{ji}$ represents the influence of j on the regulation of i

  39. Use of such models • We can infer regulatory interactions directly from the data, by fitting these simple network models to large scale gene expression data.
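
A minimal sketch of such a fit, assuming the additive model has the form $dy_i/dt = \sum_j w_{ji} y_j + b_i$ and that expression is sampled at evenly spaced time points. The derivative is approximated by a finite difference and the weights are found by ordinary least squares through the normal equations. The function names are hypothetical, and real data would need more time points than unknowns plus some regularization.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_additive(series, dt=1.0):
    """Fit weights and biases from a list of state vectors y(t).

    Returns (W, bias) where W[i][j] is the estimated influence of gene j
    on gene i (w_ji in the notation above) and bias[i] is b_i.
    """
    n = len(series[0])
    X = [list(y) + [1.0] for y in series[:-1]]  # inputs plus a constant column
    dY = [[(b - a) / dt for a, b in zip(y0, y1)]
          for y0, y1 in zip(series, series[1:])]  # finite-difference derivative
    # Normal equations (X^T X) theta = X^T dY_i, solved once per target gene.
    xtx = [[sum(r[p] * r[q] for r in X) for q in range(n + 1)]
           for p in range(n + 1)]
    W, bias = [], []
    for i in range(n):
        xty = [sum(r[p] * d[i] for r, d in zip(X, dY)) for p in range(n + 1)]
        theta = solve(xtx, xty)
        W.append(theta[:n])
        bias.append(theta[n])
    return W, bias
```

On noise-free data generated from a known weight matrix, the fit recovers that matrix up to rounding error; with real measurements the estimates are only as good as the sampling and the linear approximation allow.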

  40. Outline • Introduction • A conceptual approach to complex network dynamics • Inference of regulation through clustering of gene expression data • Modeling methodologies • Gene network inference: reverse engineering • Conclusions

  41. Conclusion • Conceptual foundations for understanding complex biological networks • Several practical methods for data analysis
