
KDD Group Research Seminar Fall, 2001 – Presentation 2b of 11

This presentation discusses the AIS-BN algorithm for adaptive importance sampling on large Bayesian networks, including updates to the importance function and comparison to other sampling algorithms.


Presentation Transcript


  1. KDD Group Research Seminar Fall, 2001 – Presentation 2b of 11
     Adaptive Importance Sampling on Bayesian Networks (AIS-BN)
     Friday, 05 October 2001
     Julie A. Stilson
     http://www.cis.ksu.edu/~jas3466
     Reference: Cheng, J. and Druzdzel, M. (2000). “AIS-BN: An Adaptive Importance Sampling Algorithm for Evidential Reasoning in Large Bayesian Networks.” Journal of Artificial Intelligence Research, 13, 155-188.

  2. Outline
     • Basic Algorithm
       • Definitions
       • Updating importance function
       • Example using Sprinkler-Rain
     • Why Adaptive Importance Sampling?
       • Heuristic initialization
       • Sampling with unlikely evidence
     • Different Importance Sampling Algorithms
       • Forward Sampling (FS)
       • Logic Sampling (LS)
       • Self-Importance Sampling (SIS)
       • Differences between SIS, AIS-BN
     • Gathering results
       • How RMSE values are collected
       • Sample results for FS, AIS-BN

  3. Definitions
     • Importance Conditional Probability Tables (ICPTs)
       • Probability tables that represent the learned importance function
       • Initially equal to the CPTs
       • Updated after each updating interval (see below)
     • Learning Rate
       • The rate at which the true importance function is being learned
       • Learning rate = a * (b / a) ^ (k / kmax)
       • a = initial learning rate, b = learning rate in the last step, k = number of updates made so far, kmax = total number of updates that will be made
     • Frequency Table
       • Stores the frequency with which each instantiation of each query node occurs
       • Used to update the importance function
     • Updating Interval
       • AIS-BN updates the importance function after this many samples
       • If 1000 total samples are to be taken and the updating interval is 100, then 10 updates will be made in total
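The learning-rate schedule and the update count defined above can be sketched in a few lines. The function names and the values chosen for a and b are illustrative assumptions, not taken from the slides.

```python
def learning_rate(k: int, k_max: int, a: float, b: float) -> float:
    """Learning rate after k of k_max updates: a * (b / a) ** (k / k_max)."""
    return a * (b / a) ** (k / k_max)


def number_of_updates(total_samples: int, updating_interval: int) -> int:
    """How many times the importance function is updated during sampling."""
    return total_samples // updating_interval


# Illustrative endpoints only (assumptions, not stated on the slides).
A_INITIAL, B_FINAL = 0.4, 0.14

k_max = number_of_updates(1000, 100)  # the slide's example: 1000 samples, interval 100
schedule = [learning_rate(k, k_max, A_INITIAL, B_FINAL) for k in range(k_max + 1)]
```

The schedule starts at a when k = 0, ends at b when k = kmax, and decays geometrically in between, so early updates move the ICPTs more than late ones.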

  4. Basic Algorithm
     k := number of updates so far, m := desired number of samples, l := updating interval, T := total samples

     // Learning phase: adapt the importance function
     for (int i = 1; i <= m; i++) {
         if (i mod l == 0) {
             k++;
             update importance function Pr^k(X\E) based on total samples
         }
         generate a sample s according to Pr^k(X\E), add to total samples
         totalweight += Pr(s, e) / Pr^k(s)
     }

     // Sampling phase: sample from the final importance function
     totalweight = 0; T = null;
     for (int i = 1; i <= m; i++) {
         generate a sample s according to Pr^kmax(X\E), add to total samples
         totalweight += Pr(s, e) / Pr^kmax(s)
         compute RMSE value of s using totalweight
     }
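The two phases of the basic algorithm can be illustrated with a simplified, runnable sketch on a four-node sprinkler network with evidence Ground = Wet. Everything concrete here is an assumption made for illustration: the network structure (Cloudy is the parent of Sprinkler and Rain, which are both parents of Ground), the CPT numbers, the learning-rate endpoints, and the choice to estimate each ICPT row from the weighted samples of the most recent interval. It is not the authors' implementation.

```python
import random

# ASSUMED CPTs for a textbook-style sprinkler network (not from the slides).
P_C = 0.5
P_S = {True: 0.1, False: 0.5}   # P(Sprinkler = on | Cloudy)
P_R = {True: 0.8, False: 0.2}   # P(Rain = yes | Cloudy)
P_G = {(True, True): 0.99, (True, False): 0.9,
       (False, True): 0.9, (False, False): 0.0}  # P(Ground = wet | S, R)

# For each query node: (index in the sample tuple, function giving its parent row).
NODES = {"C": (0, lambda s: ()), "S": (1, lambda s: (s[0],)), "R": (2, lambda s: (s[0],))}


def bern(p, value):
    return p if value else 1.0 - p


def draw(icpt, rng):
    """Sample (c, s, r) from the importance function; also return its probability."""
    c = rng.random() < icpt["C"][()]
    s = rng.random() < icpt["S"][(c,)]
    r = rng.random() < icpt["R"][(c,)]
    q = bern(icpt["C"][()], c) * bern(icpt["S"][(c,)], s) * bern(icpt["R"][(c,)], r)
    return (c, s, r), q


def weight(sample, q):
    """Importance weight Pr(s, e) / Pr^k(s) with evidence Ground = wet."""
    c, s, r = sample
    return bern(P_C, c) * bern(P_S[c], s) * bern(P_R[c], r) * P_G[(s, r)] / q


def update(icpt, batch, rate):
    """Move every ICPT row toward its weighted-frequency estimate."""
    for node, (idx, parents_of) in NODES.items():
        for row in icpt[node]:
            total = sum(w for smp, w in batch if parents_of(smp) == row)
            if total == 0:
                continue  # no usable samples for this parent configuration
            hits = sum(w for smp, w in batch if parents_of(smp) == row and smp[idx])
            icpt[node][row] += rate * (hits / total - icpt[node][row])


def ais_bn(m_learn, m_sample, interval, a=0.4, b=0.14, seed=0):
    rng = random.Random(seed)
    # ICPTs start out equal to the CPTs.
    icpt = {"C": {(): P_C},
            "S": {(True,): P_S[True], (False,): P_S[False]},
            "R": {(True,): P_R[True], (False,): P_R[False]}}
    k, k_max, batch = 0, m_learn // interval, []
    for i in range(1, m_learn + 1):          # learning phase
        sample, q = draw(icpt, rng)
        batch.append((sample, weight(sample, q)))
        if i % interval == 0:
            k += 1
            update(icpt, batch, a * (b / a) ** (k / k_max))
            batch = []
    num = den = 0.0
    for _ in range(m_sample):                # sampling phase
        sample, q = draw(icpt, rng)
        w = weight(sample, q)
        den += w
        num += w * sample[1]                 # sample[1] is Sprinkler = on
    return num / den, icpt


estimate, learned = ais_bn(m_learn=2000, m_sample=20000, interval=200)
```

Here `estimate` approximates Pr(Sprinkler = on | Ground = wet), and `learned` holds the adapted ICPTs, whose Sprinkler and Rain rows drift away from the original CPT values toward the evidence.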

  5. Updating Importance Function
     • Theorem: Xi in X, Xi not in Anc(E) => Pr(Xi | Pa(Xi), E) = Pr(Xi | Pa(Xi))
       • Proved using d-connectivity
       • Only ancestors of evidence nodes need to have their importance function learned
       • The ICPTs of all other nodes do not change throughout sampling
     • Algorithm for Updating Importance Function:
       1. Sample l points independently according to the current importance function, Pr^k(X\E)
       2. For every query node Xi that is an ancestor of evidence, estimate Pr’(xi | pa(Xi), e) based on the samples
       3. Update Pr^k(X\E) according to the following formula:
          Pr^(k+1)(xi | pa(Xi), e) = Pr^k(xi | pa(Xi), e) + LRate * (Pr’(xi | pa(Xi), e) – Pr^k(xi | pa(Xi), e))
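As a minimal sketch of the two ideas on this slide, the snippet below finds the ancestors of the evidence nodes (the only nodes whose ICPTs need learning) and applies the update formula to one ICPT row. The function names and the dictionary-of-parents representation are assumptions made for illustration.

```python
def ancestors(parents, nodes):
    """Return all ancestors of the given nodes in a DAG given as {node: [parents]}."""
    seen = set()
    stack = list(nodes)
    while stack:
        node = stack.pop()
        for parent in parents[node]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def update_icpt_row(current, estimated, rate):
    """Pr^(k+1) = Pr^k + LRate * (Pr' - Pr^k), applied entry by entry to one row."""
    updated = [p + rate * (q - p) for p, q in zip(current, estimated)]
    total = sum(updated)  # already 1 up to rounding; renormalize to be safe
    return [p / total for p in updated]


# Hypothetical sprinkler-style structure: C -> S, C -> R, S -> G, R -> G.
parents = {"C": [], "S": ["C"], "R": ["C"], "G": ["S", "R"]}
to_learn = ancestors(parents, ["G"])
new_row = update_icpt_row([0.5, 0.5], [0.9, 0.1], rate=0.4)
```

Because the update is a convex combination of two probability distributions whenever the rate lies between 0 and 1, the updated row remains a valid distribution.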

  6. Example Using Sprinkler-Rain
     [Figure: Bayesian network with nodes Cloudy (C: Yes, No), Sprinkler (S: On, Off), Rain (R: Yes, No), and Ground (G: Wet, Dry)]
     • Imagine Ground is evidence, instantiated to Wet
     • This makes it more probable that Sprinkler is on and that it is raining
     • The ICPTs update the probabilities of the ancestors of evidence nodes to reflect this
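The shift in probabilities can be checked by exact enumeration on a small sprinkler network. The structure (Cloudy as the parent of Sprinkler and Rain, both of which are parents of Ground) and all CPT numbers below are assumptions chosen for illustration; the slides do not give them.

```python
from itertools import product

# ASSUMED CPTs (textbook-style values, not from the slides).
P_C = 0.5
P_S = {True: 0.1, False: 0.5}   # P(Sprinkler = on | Cloudy)
P_R = {True: 0.8, False: 0.2}   # P(Rain = yes | Cloudy)
P_G = {(True, True): 0.99, (True, False): 0.9,
       (False, True): 0.9, (False, False): 0.0}  # P(Ground = wet | S, R)


def bern(p, value):
    return p if value else 1.0 - p


def joint(c, s, r, g):
    """Joint probability of one full instantiation of the network."""
    return bern(P_C, c) * bern(P_S[c], s) * bern(P_R[c], r) * bern(P_G[(s, r)], g)


def posterior(index, ground_wet=True):
    """P(variable = true | Ground), where index picks Cloudy (0), Sprinkler (1), or Rain (2)."""
    num = den = 0.0
    for values in product([True, False], repeat=3):
        p = joint(*values, ground_wet)
        den += p
        if values[index]:
            num += p
    return num / den


prior_sprinkler = P_C * P_S[True] + (1 - P_C) * P_S[False]
prior_rain = P_C * P_R[True] + (1 - P_C) * P_R[False]
post_sprinkler = posterior(1)
post_rain = posterior(2)
```

Under these assumed numbers, both posteriors exceed their priors, which is the direction the ICPTs of Sprinkler and Rain have to move once Ground is observed to be Wet.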

  7. Why Adaptive Importance Sampling?
     • Heuristic Initialization: Parents of Evidence Nodes
       • Changes the probabilities of the parents of evidence nodes to a uniform distribution when the probability of that evidence is sufficiently small
       • Parents of evidence nodes are most affected by the instantiation of evidence
       • A uniform distribution helps the importance function be learned faster
     • Heuristic Initialization: Extremely Small Probabilities
       • Extremely low probabilities would usually not be sampled much
       • This makes the true importance function slow to learn
       • AIS-BN raises extremely low probabilities to a set threshold and lowers extremely high probabilities accordingly
     • Sampling with Unlikely Evidence
       • The importance function is very different from the CPTs when evidence is unlikely
       • It is difficult to sample accurately without changing the probability distributions
       • AIS-BN performs better than other sampling algorithms with unlikely evidence
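Both initialization heuristics amount to small edits of an ICPT row before sampling starts. The sketch below is one plausible reading of the slide: the threshold value, the rule of taking the added mass from the largest entry, and the function names are assumptions, not details given in the presentation.

```python
def uniform_row(n_states):
    """Uniform distribution used for parents of very unlikely evidence."""
    return [1.0 / n_states] * n_states


def apply_threshold(row, theta=0.04):
    """Raise entries below theta to theta and take the added mass from the largest entry.

    theta = 0.04 is an illustrative assumption, not a value from the slides.
    """
    raised = [max(p, theta) for p in row]
    excess = sum(raised) - sum(row)
    largest = max(range(len(raised)), key=raised.__getitem__)
    raised[largest] -= excess
    return raised


adjusted = apply_threshold([0.001, 0.999])
unchanged = apply_threshold([0.3, 0.7])
flat = uniform_row(4)
```

After the adjustment the row still sums to 1, but the rare state is now sampled often enough for its importance weight to be estimated during learning.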

  8. Different Importance Sampling Algorithms
     • Forward Sampling / Likelihood Weighting (FS)
       • Similar to AIS-BN, but the importance function is not learned
       • Performs well under most circumstances
       • Doesn’t do well when evidence is unlikely
     • Logic Sampling (LS)
       • Network is sampled randomly without regard to evidence
       • Samples that don’t match evidence are then discarded
       • Simplest importance sampling algorithm
       • Also performs poorly with unlikely evidence
       • Inefficient when many nodes are evidence
     • Self-Importance Sampling (SIS)
       • Also updates an importance function
       • Does not obtain samples from the learned importance function
       • Updates to the importance function do not use sampling information
       • For large numbers of samples, performs worse than FS
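The difference between logic sampling and likelihood weighting shows up even on a two-node network with unlikely evidence. The network A -> E and its probabilities below are hypothetical values chosen for illustration.

```python
import random

# Hypothetical two-node network A -> E, queried with unlikely evidence E = true.
P_A = 0.3
P_E_GIVEN_A = {True: 0.02, False: 0.001}


def logic_sampling(n, rng):
    """Sample every node, then discard samples that contradict the evidence."""
    kept = hits = 0
    for _ in range(n):
        a = rng.random() < P_A
        e = rng.random() < P_E_GIVEN_A[a]
        if e:
            kept += 1
            hits += a
    estimate = hits / kept if kept else None
    return estimate, kept


def likelihood_weighting(n, rng):
    """Fix the evidence and weight each sample by the evidence likelihood."""
    num = den = 0.0
    for _ in range(n):
        a = rng.random() < P_A
        w = P_E_GIVEN_A[a]
        den += w
        num += w * a
    return num / den


rng = random.Random(0)
ls_estimate, ls_kept = logic_sampling(5000, rng)
lw_estimate = likelihood_weighting(5000, rng)
exact = P_A * P_E_GIVEN_A[True] / (
    P_A * P_E_GIVEN_A[True] + (1 - P_A) * P_E_GIVEN_A[False]
)
```

Logic sampling keeps only the small fraction of samples in which E happens to come out true, while likelihood weighting uses every sample. Neither method adapts its sampling distribution toward the evidence, which is the gap AIS-BN addresses.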

  9. Gathering Results
     • Relative Root Mean Square Error: [equation not preserved in transcript]
       • P(i) is the exact probability of the sample
       • P^(i) is the estimated probability of the sample, taken from the frequency table
       • M := arity, T := number of samples
     • RMSE Collection
       • Relative RMSE is computed for each sample
       • Each RMSE value is stored in an output file: printings.txt
     • Graphing Results
       • Open the output file in Excel
       • Graph results using “Chart”
     • Example Chart
       • ALARM network, 10000 samples
       • Compares FS, AIS-BN
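The slide's exact error formula is not available in this transcript, so the following is only a sketch under an assumed definition: the root of the mean squared relative error between the exact probabilities P(i) and the estimates P^(i) over the M states of a node. The function name and this definition are assumptions; the original formula may differ, for example in how T enters.

```python
import math


def relative_rmse(exact, estimated):
    """Root mean square of (P^(i) - P(i)) / P(i) over the M states (assumed definition)."""
    if len(exact) != len(estimated):
        raise ValueError("exact and estimated must have the same number of states")
    m = len(exact)
    squared = [((q - p) / p) ** 2 for p, q in zip(exact, estimated)]
    return math.sqrt(sum(squared) / m)


perfect = relative_rmse([0.5, 0.5], [0.5, 0.5])
off_by_a_bit = relative_rmse([0.2, 0.8], [0.25, 0.75])
```

Recomputing a value like this after each sample and writing it to a file would produce the kind of error-versus-samples curve the slide describes charting.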
