Simulation Modelling



  1. Simulation Modelling IE 519

  2. Contents • Input Modelling 3 • Random Number Generation 41 • Generating Random Variates 80 • Output Analysis 134 • Resampling Methods 205 • Comparing Multiple Systems 219 • Simulation Optimization 248 • Metamodels 278 • Variance Reduction 292 • Case Study 350

  3. Input Modelling

  4. Input Modelling • You make custom Widgets • How do you model the input process? • Is it deterministic? • Is it random? • Look at some data

  5. Orders Now what?

  6. Histogram

  7. Other Observations • Trend? • Stationary or non-stationary process • Seasonality • May require multiple processes

  8. Choices for Modelling • Use the data directly (trace-driven simulation) • Use the data to fit an empirical distribution • Use the data to fit a theoretical distribution

  9. Assumptions • To fit a distribution, the data should be independent and identically distributed (IID) observations • Could it be from more than one distribution? • Statistical test • Is it independent? • Statistical test

  10. Activity I • Hypothesize families of distributions • Look at the data • Determine what is a reasonable process • Summary statistics • Histograms • Quantile summaries and box plots

  11. Activity II • Estimate the parameters • Maximum likelihood estimator (MLE) • Sometimes a very simple statistic • Sometimes requires numerical calculations
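As an illustration of the "very simple statistic" case: for an exponential distribution, the MLE of the mean is just the sample mean. The sketch below assumes an exponential model; the function name and the data values are made up for illustration and do not come from the slides.

```python
def exponential_mle(samples):
    """MLE of the mean of an exponential distribution: the sample mean."""
    if not samples:
        raise ValueError("need at least one observation")
    return sum(samples) / len(samples)


# Hypothetical interarrival times (minutes), invented for this example.
interarrival_times = [1.2, 0.4, 2.7, 0.9, 1.8]
mean_hat = exponential_mle(interarrival_times)
rate_hat = 1.0 / mean_hat  # estimated arrival rate per minute
print(mean_hat, rate_hat)
```

Other families, such as the gamma, have no closed-form MLE for every parameter, which is the case where numerical calculation is needed.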

  12. Activity III • Determine quality of fit • Compare theoretical distribution with observations graphically • Goodness of fit tests • Chi-square tests • Kolmogorov-Smirnov test • Software

  13. Chi-Square Test • Formal comparison of a histogram and the probability density/mass function • Divide the range of the fitted distribution into intervals • Count the number of observations in each interval

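  14. Chi-Square Test • Compute the expected proportion $p_j$ of observations in each interval $j$ under the fitted distribution • Test statistic is $\chi^2 = \sum_{j=1}^{k} \frac{(N_j - n p_j)^2}{n p_j}$, where $N_j$ is the observed count in interval $j$, $k$ is the number of intervals, and $n$ is the sample size • Reject if too large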
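The chi-square procedure from the last two slides can be sketched as follows. This is a minimal illustration, not the course's implementation: it assumes a fitted exponential distribution with mean 1, four equal-probability intervals, and invented data. The statistic is the sum over intervals of (observed count − expected count)² / expected count, where the expected count is the sample size times the interval probability.

```python
import math


def chi_square_statistic(data, cdf, edges):
    """Chi-square GoF statistic for intervals [edges[j-1], edges[j])."""
    n = len(data)
    stat = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        observed = sum(1 for x in data if lo <= x < hi)   # N_j
        expected = n * (cdf(hi) - cdf(lo))                # n * p_j
        stat += (observed - expected) ** 2 / expected
    return stat


def exp_cdf(x):
    """CDF of the (assumed) fitted exponential distribution with mean 1."""
    return 1.0 - math.exp(-x)


# Four equal-probability intervals: boundaries at the 0, 1/4, 1/2, 3/4 quantiles.
edges = [-math.log(1.0 - j / 4.0) for j in range(4)] + [math.inf]

# Hypothetical observations, invented for this example.
data = [0.05, 0.1, 0.2, 0.4, 0.5, 0.8, 1.5, 2.0]
stat = chi_square_statistic(data, exp_cdf, edges)
print(stat)
```

The resulting value is then compared with a chi-square critical value looked up for the chosen significance level; the degrees of freedom depend on the number of intervals and on how many parameters were estimated from the data.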

  15. How good is the data? • Assumption of IID observations • Sometimes time-dependent (non-stationary) • Assessment • Correlation plot • Scatter diagram • Nonparametric tests

  16. Correlation Plot • Calculate the sample correlation $\hat{\rho}_j$ between observations $j$ positions apart and plot it against the lag $j$
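A minimal sketch of the calculation behind a correlation plot. Several slightly different estimators of the lag-j sample correlation are in use; the one below (lagged cross-products divided by the total sum of squares) is one common choice and is an assumption here, as is the function name.

```python
def sample_autocorrelation(x, lag):
    """Lag-`lag` sample autocorrelation of the sequence x."""
    n = len(x)
    mean = sum(x) / n
    total = sum((v - mean) ** 2 for v in x)
    lagged = sum((x[i] - mean) * (x[i + lag] - mean) for i in range(n - lag))
    return lagged / total


# Hypothetical data, invented for this example: a strictly alternating sequence.
x = [1.0, -1.0, 1.0, -1.0]
for lag in range(1, 3):
    print(lag, sample_autocorrelation(x, lag))
```

For IID data the values at every nonzero lag should be close to zero; values far from zero, as in this alternating sequence, indicate dependence.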

  17. Scatter Diagram • Plot pairs of consecutive observations $(X_i, X_{i+1})$ • Should be scattered randomly through the plane • If there is a pattern then this indicates correlation

  18. Multiple Data Sets • Often you have multiple data sets (e.g., different days, weeks, operators) • Is the data drawn from the same process (homogeneous) and can thus be combined? • Kruskal-Wallis test

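  19. Kruskal-Wallis (K-W) Statistic • Pool the observations from all $k$ data sets; assign rank 1 to the smallest observation, rank 2 to the second smallest, etc. • Calculate $T = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1)$, where $R_i$ is the sum of the ranks in data set $i$, $n_i$ is its size, and $n = \sum_{i=1}^{k} n_i$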

  20. K-W Test • The null hypothesis is H0: All the population distributions are identical; H1: At least one tends to be larger than at least one other • We reject H0 at level $\alpha$ if the K-W statistic $T$ exceeds $\chi^2_{k-1,1-\alpha}$ • In other words, under H0 the test statistic approximately follows a chi-square distribution with $k-1$ degrees of freedom
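The K-W calculation can be sketched as below: pool the data sets, rank the pooled observations, sum the ranks per data set, and combine the rank sums into the statistic 12 / (n(n+1)) · Σ R_i² / n_i − 3(n+1). The sketch assumes there are no tied observations (ties need average ranks and a correction, which are omitted); the function name and data are invented for illustration.

```python
def kruskal_wallis(samples):
    """K-W statistic for a list of k data sets, assuming no ties."""
    # Pool all observations, remembering which data set each came from.
    pooled = sorted((value, idx)
                    for idx, sample in enumerate(samples)
                    for value in sample)
    n = len(pooled)
    rank_sums = [0.0] * len(samples)
    for rank, (_, idx) in enumerate(pooled, start=1):
        rank_sums[idx] += rank
    weighted = sum(r * r / len(s) for r, s in zip(rank_sums, samples))
    return 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)


# Hypothetical service times from three operators, invented for this example.
operators = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
t_stat = kruskal_wallis(operators)
print(t_stat)  # compare with the chi-square critical value for k - 1 = 2 df
```

Here the statistic is 7.2, which exceeds the 5% chi-square critical value for 2 degrees of freedom (about 5.99), so these three data sets would not be pooled.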

  21. Absence of Data • We have assumed that we had data to fit a distribution • Sometimes no data is available • Try to obtain minimum, maximum, and mode and/or mean of the distribution • Documentation • Subject-matter experts (SMEs)

  22. Triangular Distribution
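When only a minimum, a most likely value, and a maximum are available, a triangular distribution can be sampled directly from those three numbers. The sketch below inverts the triangular CDF; the function name and the example values are assumptions made for illustration.

```python
import math
import random


def triangular_variate(u, low, mode, high):
    """Map a uniform number u in [0, 1] to a triangular(low, mode, high) value."""
    split = (mode - low) / (high - low)  # CDF value at the mode
    if u < split:
        return low + math.sqrt(u * (high - low) * (mode - low))
    return high - math.sqrt((1.0 - u) * (high - low) * (high - mode))


# Hypothetical expert estimates of a task duration (minutes).
rng = random.Random(519)
durations = [triangular_variate(rng.random(), 2.0, 5.0, 10.0) for _ in range(5)]
print(durations)
```

Python's standard library offers the same distribution as `random.triangular(low, high, mode)`; the explicit version is shown only to make the transformation visible.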

  23. Symmetric Beta Distributions (density curves for a=b=2, a=b=3, a=b=5, a=b=10)

  24. Skewed Beta Distributions (density curve for a=2, b=4)

  25. Beta Parameters

  26. Benefits of Fitting a Parametric Distribution • We have focused mainly on the approach where we fit a distribution to data • Benefits: • Fill in gaps and smooth data • Make sure tail behavior is represented • Extreme events are very important to the simulation but may not be represented • Can easily incorporate changes in the input process • Change mean, variability, etc. • Reflect dependencies in the inputs

  27. What About Dependencies? • Assumed so far an IID process • Many processes are not: • A customer places a monthly order. Since the customer keeps inventory of the product, a large order is often followed by a small order • A distributor with several warehouses places monthly orders, and these warehouses can supply the same customers • The behavior of customers logging on to a web site depends on age, gender, income, and where they live • Do not ignore it!

  28. Solutions • A customer places a monthly order. • Should use a time-series model that captures the autocorrelation • A distributor with several warehouses • Need a vector time-series model • Customers logging on to a web site • Need a random vector model where each component may have a different distribution

  29. Taxonomy of Input Models (with examples of models) • Time-independent, univariate: discrete (binomial, etc.); continuous (normal, gamma, beta, etc.); mixed (empirical/trace-driven) • Time-independent, multivariate: discrete (independent binomial); continuous (multivariate normal); mixed (bivariate exponential) • Stochastic processes, discrete-time: discrete-state (Markov chains; stationary?); continuous-state (time-series models) • Stochastic processes, continuous-time: discrete-state (Poisson process; stationary?); continuous-state (Markov process)

  30. What if it Changes over Time? • Do not ignore it! • Non-stationary input process • Examples: • Arrivals of customers to a restaurant • Arrivals of email to a server • Arrivals of bug discovery in software • Could model as nonhomogeneous Poisson process
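One way to generate arrivals from a nonhomogeneous Poisson process is thinning: generate candidate arrivals at a constant rate that bounds the true rate, and keep each candidate with probability equal to the ratio of the true rate to the bound. The slides do not say which method the course uses, so thinning is an assumption here, and the rate function and parameter values are invented for the restaurant example.

```python
import math
import random


def nhpp_arrivals(rate, rate_max, horizon, rng):
    """Arrival times on (0, horizon] by thinning; needs rate(t) <= rate_max."""
    t = 0.0
    arrivals = []
    while True:
        t += rng.expovariate(rate_max)         # next candidate arrival
        if t > horizon:
            return arrivals
        if rng.random() < rate(t) / rate_max:  # accept with prob rate(t)/rate_max
            arrivals.append(t)


def lunch_rate(t):
    """Hypothetical arrival rate (customers/hour) peaking mid-way through a 12-hour day."""
    return 5.0 + 15.0 * math.sin(math.pi * t / 12.0)


arrivals = nhpp_arrivals(lunch_rate, 20.0, 12.0, random.Random(519))
print(len(arrivals), arrivals[:3])
```

The tighter the bound `rate_max` is to the actual rate, the fewer candidates are discarded.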

  31. Goodness-of-Fit Test • The distribution fitted is tested using goodness-of-fit tests (GoF) • How good are those tests? • The null hypothesis is that the data is drawn from the chosen distribution with the estimated parameters • Is it true?

  32. Power of GoF Tests • The null hypothesis is always false! • If the GoF test is powerful enough then it will always be rejected • What we see in practice: • Few data points: no distribution is rejected • A great deal of data: all distributions are rejected • At best, GoF tests should be used as a guide

  33. Input Modeling Software • Many software packages exist for input modeling (fitting distributions) • Each has at least 20-30 distributions • You input IID data, the software gives you a ranked list of distributions (according to GoF tests) • Pitfalls?

  34. Why Fit a Distribution at All? • There is a growing sentiment that we should never fit distributions (not a consensus, just growing) • A couple of issues: • You don’t always benefit from data • Fitting a distribution can be misleading

  35. Is Data Reality? • Data is often • Distorted • Poorly communicated, mistranslated or recorded • Dated • Data is always old by definition • Deleted • Some of the data is often missing • Dependent • Often only summaries, or collected at certain times • Deceptive • This may all be on purpose!

  36. Problems with Fitting • Fitting an input distribution can be misleading for numerous reasons • There is rarely a theoretical justification for the distribution. Simulation is often sensitive to the tails and this is where the problem is! • Selecting the correct model is futile • The model gives the simulation practitioner a false sense of the model being well-defined

  37. Alternative • Use empirical/trace-driven simulation when there is sufficient data • Treat other cases as if there is no data, and use a beta distribution

  38. Empirical Distribution

  39. Beta Distribution Shapes

  40. What to Do? • Old rule of thumb based on number of data points available: • <20: Not enough data to fit • 21-50: Fit, rule out poor choices • 50-200: Fit a distribution • >200: Use empirical distribution

  41. Random Number Generation

  42. Random-Number Generation • Any simulation with random components requires generating a sequence of random numbers • E.g., we have talked about arrival times, service times being drawn from a particular distribution • We do this by first generating a random number (uniform on [0,1]) and then transforming it appropriately
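A minimal sketch of the "generate a uniform number, then transform it" idea, using the inverse-transform method for an exponential service time. The choice of distribution, the function name, and the mean of 2.0 are assumptions made for illustration.

```python
import math
import random


def exponential_variate(u, mean):
    """Transform a uniform number u in [0, 1) into an exponential variate."""
    # Inverting the CDF F(x) = 1 - exp(-x / mean) gives x = -mean * ln(1 - u).
    return -mean * math.log(1.0 - u)


rng = random.Random(519)
u = rng.random()                               # step 1: uniform on [0, 1)
service_time = exponential_variate(u, 2.0)     # step 2: transform
print(u, service_time)
```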

  43. Three Alternatives • True random numbers • Throw a die • Not possible to do with a computer • Pseudo-random numbers • Deterministic sequence that is statistically indistinguishable from a random sequence • Quasi-random numbers • A regular distribution of numbers over the desired interval

  44. Why is this Important? • Validity • The simulation model may not be valid due to cycles and dependencies in the model • Precision • You can improve the output analysis by carefully choosing the random numbers

  45. Pseudo-Random Numbers • Want an iterative algorithm that outputs numbers on a fixed interval • When we subject this sequence to a number of statistical tests, we cannot distinguish it from a random sequence • In reality, it is completely deterministic

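  46. Linear Congruential Generators (LCG) • Introduced in the early 1950s and still in very wide use today • Recursive formula $Z_i = (a Z_{i-1} + c) \bmod m$ • Every number is determined by these four values: the modulus $m$, the multiplier $a$, the increment $c$, and the seed $Z_0$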

  47. Transform to Unit Uniform • Simply divide by m: $U_i = Z_i / m$, where $Z_i$ is the $i$-th integer produced by the LCG • What values can $U_i$ take?
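The LCG recursion and the division by m can be sketched in a few lines. Each new integer is (a · previous + c) mod m, and dividing it by m gives a number in [0, 1). The toy parameters below (m = 16, a = 5, c = 3, seed 7) are chosen only to keep the sequence easy to follow by hand; they are not taken from the slides and are far too small for real use.

```python
from itertools import islice


def lcg(m, a, c, seed):
    """Yield (Z_i, U_i) pairs from a linear congruential generator."""
    z = seed
    while True:
        z = (a * z + c) % m   # recursive formula
        yield z, z / m        # integer and its unit-uniform transform


first_five = list(islice(lcg(16, 5, 3, 7), 5))
for z, u in first_five:
    print(z, u)
```

Because every integer lies in {0, 1, ..., m−1}, the transformed numbers can only take the m values 0, 1/m, ..., (m−1)/m.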

  48. Examples

  49. Characteristics • All LCGs loop • The length of the cycle is the period • An LCG whose period equals m has full period • This happens if and only if all three of the following hold: • The only positive integer that divides both m and c is 1 • If q is a prime that divides m, then q divides a-1 • If 4 divides m then 4 divides a-1
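The three full-period conditions can be checked mechanically and compared with a brute-force cycle count. This is a small sketch with hypothetical helper names and toy parameters; the brute-force check is only practical for small m.

```python
import math


def prime_factors(n):
    """Set of prime divisors of n (trial division, fine for small n)."""
    factors, q = set(), 2
    while q * q <= n:
        while n % q == 0:
            factors.add(q)
            n //= q
        q += 1
    if n > 1:
        factors.add(n)
    return factors


def has_full_period(m, a, c):
    """Check the three full-period conditions for an LCG."""
    if math.gcd(m, c) != 1:                                 # m and c share no divisor > 1
        return False
    if any((a - 1) % q != 0 for q in prime_factors(m)):     # each prime q | m divides a-1
        return False
    return not (m % 4 == 0 and (a - 1) % 4 != 0)            # 4 | m implies 4 | a-1


def cycle_length(m, a, c, seed):
    """Length of the cycle the generator eventually enters, by brute force."""
    seen, z, step = {}, seed, 0
    while z not in seen:
        seen[z] = step
        z = (a * z + c) % m
        step += 1
    return step - seen[z]


print(has_full_period(16, 5, 3), cycle_length(16, 5, 3, 7))
print(has_full_period(16, 3, 3), cycle_length(16, 3, 3, 7))
```

With m = 16, the multiplier a = 5 satisfies all three conditions and the generator visits all 16 values, while a = 3 fails the third condition and loops after 8 values.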

  50. Types of LCGs • If c=0 then it is called multiplicative LCG, otherwise mixed LCG • Mixed and multiplicative LCG behave rather differently
