MEGN 537 – Probabilistic Biomechanics
Ch. 3 – Quantifying Uncertainty
Anthony J Petrella, PhD
Bioengineering
Big Picture
• Why study traditional probability (Ch. 2)?
  • Unions and intersections allow us to conceptualize system-level variability with multiple sources
• Chapter 3 deals with uncertainty in variables
  • Chapter 2 presupposes failure probabilities, but these are often not absolute or fixed
  • This chapter allows us to characterize variability in parameters
Random Variables
Continuous (variable) Data
• Data measured on an infinitely divisible scale or continuum
• No gaps between possible values
• Examples
  • Height, weight
  • Blood glycogen level
  • Joint contact force
  • Kinematic measures
  • Response time (milliseconds)
Discrete (attribute) Data
• Discrete data measures attributes, qualitative conditions, counts
• Gaps between possible values
• Examples
  • Surgery or placebo
  • Fusion or dynamic
  • Number of surgeries per week or per year
  • Pain/satisfaction surveys
  • Number of patients
Population vs. Sample
• Population: the total set of all process results
• Sample: a subset of a population

Measure                  Sample   Population
Number of data points    n        N
Mean                     x̄        μ
Standard deviation       s        σ
Frequency Histograms
• Histograms visually represent data centering, variability, and shape
• A histogram is a graphical tool used to depict the frequency of numerical data by categories (classes or bins)
• Properties
  • All data will fall into a class or bin
  • No data will overlap (each data point falls in exactly one bin)
Descriptors of Uncertainty
• Used to characterize measured data and distributions
• Common descriptors
  • Mean
  • Standard deviation
  • Coefficient of variation
  • Skewness
Measures of Location (Central Tendency)
• Mean: also known as the average; the sum of all values divided by the number of values
  • Grand average: overall average, or average of the averages
• Median: midpoint of the data
  • Arrange the data from lowest to highest; the median is the middle data point
  • 50% of the data points fall below the median and the other 50% fall above it
• Mode: the most frequent data point, or the value occurring most often
Example - Measures of Location
• Given the following data on salaries: $50k, $30k, $170k, $45k, $30k, $55k, $40k
• Mean = (50+30+170+45+30+55+40)/7 = 420/7 = $60k
• Median (midpoint): 30, 30, 40, 45, 50, 55, 170, so the median is $45k
• Mode (most frequent): $30k

Value ($k)    30   40   45   50   55   170
# data pts     2    1    1    1    1     1

[Figure: salary data on a number line with the mode, median, and mean marked]
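
The salary example can be checked with a few lines of Python. This is a minimal sketch that uses only the standard-library `statistics` module; the variable names are chosen here for illustration and do not come from the slides.

```python
import statistics

# Salaries from the example, in thousands of dollars
salaries = [50, 30, 170, 45, 30, 55, 40]

mean_salary = statistics.mean(salaries)      # sum of values / number of values
median_salary = statistics.median(salaries)  # middle value of the sorted data
mode_salary = statistics.mode(salaries)      # most frequently occurring value

print("mean:", mean_salary, "median:", median_salary, "mode:", mode_salary)
```

The single large salary ($170k) pulls the mean well above the median, which is why the three measures of location disagree for this data set.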
Measures of Dispersion (Variation or Spread)
• Range: total width of a distribution
  • Range = Maximum Value - Minimum Value
• Variance (V): measure of the spread in data about the mean
  • Second central moment
• Standard deviation (S): the most common measure of dispersion; the square root of the variance
Standard Deviation
• Standard deviation is a measure of variation telling us about consistency around the mean
• Low spread: low standard deviation, consistent performance
• High spread: high standard deviation, poor performance
[Figure: a narrow distribution and a wide distribution compared on the same axis]
Other Descriptors
• Coefficient of variation (COV): relative indicator of uncertainty in a variable
  • Ratio of the standard deviation to the mean
• Skewness: measure of the asymmetry of the data about the mean, emphasizing the shape of the distribution
  • Third central moment
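Skewness
• Skewness coefficient (θx): nondimensional measure
  • θx = 0: symmetric
  • θx > 0 (positive skewness): most values below the mean
  • θx < 0 (negative skewness): most values above the mean
[Figure: f(x) versus x for a positively skewed and a negatively skewed distribution]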
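
The descriptors on the preceding slides (mean, variance, standard deviation, COV, skewness coefficient) can be computed directly from data. The sketch below is illustrative only: the slide formulas did not survive in this transcript, so the choice of n - 1 in the variance and n in the third central moment is an assumption, and the function name `descriptors` is hypothetical.

```python
import math

def descriptors(data):
    """Return common descriptors of uncertainty for a list of numbers."""
    n = len(data)
    mean = sum(data) / n
    # Sample variance: second central moment with n - 1 (assumed convention)
    variance = sum((x - mean) ** 2 for x in data) / (n - 1)
    std = math.sqrt(variance)
    # Coefficient of variation: ratio of standard deviation to mean
    cov = std / mean
    # Third central moment, averaged over n (assumed convention)
    third_moment = sum((x - mean) ** 3 for x in data) / n
    # Nondimensional skewness coefficient
    skew_coeff = third_moment / std ** 3
    return {"mean": mean, "variance": variance, "std": std,
            "cov": cov, "skewness_coeff": skew_coeff}

# Salary data (in $k) from the measures-of-location example
salary_stats = descriptors([50, 30, 170, 45, 30, 55, 40])
for name, value in salary_stats.items():
    print(f"{name}: {value:.3f}")
```

For the salary data the skewness coefficient comes out positive, consistent with most values lying below the mean.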
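Probability Density Function
• The PDF is the smooth counterpart of the typical histogram (for example, the bell curve)
• fX(x) is a probability density; the probability that x falls in a specific bin is the area under fX(x) over that bin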
Cumulative Distribution Function
• CDF ranges from 0 to 1
• Integral of the PDF
[Figure: PDF f(x) and CDF for μ = 0, σ = 1]
Cumulative Distribution Function
• The CDF gives the probability of a continuous random variable having a value less than or equal to a specific value
• Relationship between PDF and CDF: $F_X(x) = \int_{-\infty}^{x} f_X(t)\,dt$
• The CDF has the following values
  • F(x → -∞) = 0
  • F(x = μx) = 0.5 (for a symmetric distribution)
  • F(x → +∞) = 1
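
The statement that the CDF is the integral of the PDF can be checked numerically. The sketch below assumes a normal distribution and integrates its PDF with the trapezoidal rule, starting 10 standard deviations below the mean as a stand-in for -∞; the function names and the step count are illustrative choices, not part of the course material.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal probability density at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def cdf_by_integration(x, mu=0.0, sigma=1.0, steps=10000):
    """Approximate F(x) by trapezoidal integration of the PDF."""
    lower = mu - 10.0 * sigma  # practical stand-in for -infinity
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (normal_pdf(lower, mu, sigma) + normal_pdf(x, mu, sigma))
    for i in range(1, steps):
        total += normal_pdf(lower + i * h, mu, sigma)
    return total * h

def normal_cdf_exact(x, mu=0.0, sigma=1.0):
    """Closed-form normal CDF via the error function, for comparison."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

for x in (-3.0, 0.0, 1.0, 3.0):
    print(x, cdf_by_integration(x), normal_cdf_exact(x))
```

At x = μ the integrated value is approximately 0.5, and it approaches 0 and 1 in the two tails.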
Creating Histograms or PDFs
• Arrange data in increasing order
• Create evenly spaced bins and count how many data points occur in each bin
  • Note: the number of bins can affect the appearance of the histogram
  • Rule of thumb: k = 1 + 3.3 log10(n), where k = number of bins and n = number of data points
• Plot the number of observations versus the variable
  • Note: for PDFs, plot frequency = (# of observations) / n
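
The histogram steps above can be sketched in plain Python. The slide does not say how to turn the rule-of-thumb value into a whole number of bins, so rounding to the nearest integer is an assumption here, as are the function names `sturges_bins` and `histogram`.

```python
import math

def sturges_bins(n):
    """Rule of thumb k = 1 + 3.3 log10(n), rounded to the nearest integer (assumed)."""
    return max(1, round(1 + 3.3 * math.log10(n)))

def histogram(data, k=None):
    """Return bin edges, counts, and relative frequencies (count / n)."""
    values = sorted(data)              # arrange data in increasing order
    n = len(values)
    if k is None:
        k = sturges_bins(n)
    lo, hi = values[0], values[-1]
    width = (hi - lo) / k or 1.0       # avoid zero width if all values are equal
    counts = [0] * k
    for x in values:
        # The maximum value is placed in the last bin
        idx = min(int((x - lo) / width), k - 1)
        counts[idx] += 1
    rel_freq = [c / n for c in counts]
    edges = [lo + i * width for i in range(k + 1)]
    return edges, counts, rel_freq

edges, counts, rel_freq = histogram([50, 30, 170, 45, 30, 55, 40])
print("edges:", edges)
print("counts:", counts)
print("relative frequency:", rel_freq)
```

Passing a different `k` is an easy way to see how strongly the number of bins affects the appearance of the histogram for a small data set.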
Creating CDFs
• Arrange data in increasing order
• For each data point
  • Create an index i = 1, 2, 3, …, n
  • Compute F(x) = i / (n+1)
• Plot F(x) as a function of the variable x
• Demo in Excel (come back to this)
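
The CDF recipe above translates almost line for line into code. This is a minimal sketch using the plotting position F(x) = i / (n+1) from the slide; the function name `empirical_cdf` is hypothetical, and plotting is left out so the example has no dependencies.

```python
def empirical_cdf(data):
    """Return (x, F(x)) pairs using F(x) = i / (n + 1)."""
    values = sorted(data)  # arrange data in increasing order
    n = len(values)
    # Index i runs from 1 to n over the sorted data
    return [(x, i / (n + 1)) for i, x in enumerate(values, start=1)]

cdf_points = empirical_cdf([50, 30, 170, 45, 30, 55, 40])
for x, f in cdf_points:
    print(x, f)
```

Dividing by n + 1 rather than n keeps every F(x) strictly between 0 and 1, so the largest observation is not assigned a cumulative probability of exactly 1.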
Multiple Random Variables
• Joint distributions or joint PDFs can be defined for multiple variables
• A joint PDF in 3D represents how the variables depend on each other
Covariance and Correlation
• Covariance indicates the degree of linear relationship between two random variables
  • Cov(X,Y) = E(XY) – E(X)E(Y), where E( ) is the expected value
  • Covariance is the second moment about the respective means
  • Covariance = 0 for statistically independent variables
• Correlation coefficient (nondimensional) represents the degree of linear dependence between two random variables
  • ρx,y = Cov(X,Y)/(σx σy)
  • Correlation coefficient can range from -1 to +1 (Haldar p. 53)
  • ρ = 0: no correlation
  • ρ = +1: perfectly correlated / proportionate
  • ρ = -1: perfectly correlated / inversely proportionate
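
The covariance and correlation formulas above can be applied to paired data as follows. This sketch estimates each expected value E( ) with a simple average over the data (a population-style estimate, which is an assumption), and the function names are illustrative.

```python
import math

def mean(values):
    return sum(values) / len(values)

def covariance(x, y):
    """Cov(X,Y) = E(XY) - E(X)E(Y), with E() estimated by the data average."""
    return mean([a * b for a, b in zip(x, y)]) - mean(x) * mean(y)

def correlation(x, y):
    """rho = Cov(X,Y) / (sigma_x * sigma_y); Cov(X,X) is the variance of X."""
    sigma_x = math.sqrt(covariance(x, x))
    sigma_y = math.sqrt(covariance(y, y))
    return covariance(x, y) / (sigma_x * sigma_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]    # proportional to x
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]  # inversely related to x

print(correlation(x, y_up))    # close to +1
print(correlation(x, y_down))  # close to -1
```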
Project Demo
• Assume the following parameters are normally distributed with a coefficient of variation of 0.05: |rquad.knee|, rtubercle.x, rtubercle.y, rham.knee.y
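
One way to generate trials for normally distributed parameters with a given COV is sketched below. The mean values are hypothetical placeholders chosen only so the code runs; the actual project means are not given in this transcript. The standard deviation of each parameter is taken as COV times the magnitude of its mean, and the dictionary keys are illustrative spellings of the parameter names.

```python
import random

COV = 0.05  # coefficient of variation stated on the slide

# HYPOTHETICAL mean values for illustration only; not taken from the project
MEANS = {
    "r_quad_knee": 0.040,
    "r_tubercle_x": 0.030,
    "r_tubercle_y": -0.050,
    "r_ham_knee_y": -0.020,
}

def sample_parameters(means, cov, n_trials, seed=0):
    """Draw n_trials sets of normally distributed parameters."""
    rng = random.Random(seed)  # seeded for repeatable trials
    trials = []
    for _ in range(n_trials):
        # sigma = COV * |mean| for each parameter
        trial = {name: rng.gauss(mu, cov * abs(mu)) for name, mu in means.items()}
        trials.append(trial)
    return trials

if __name__ == "__main__":
    for trial in sample_parameters(MEANS, COV, n_trials=5):
        print(trial)
```

Each trial is one set of input parameters that could then be passed to the model being studied.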
Demos…
• First: Excel for PDF, CDF, trials
• Second: MATLAB demo with trials
• Third: NESSUS