
2.1 Probability: Joint & Conditional Probabilities, Statistical Independence, Random Variables & Distributions



  1. 2.1 Probability • Joint & Conditional Probabilities • Statistical Independence • Random Variables & Distributions

  2. 2.1 Probability
Consider an experiment with several possible outcomes, si
• Sample space S = {s1, s2, s3, ..., sn} ≡ set of all possible outcomes (possibly an infinite set)
• An event A ⊂ S consists of some possible outcomes (samples), e.g. A = {s2, s3}
• Event Ā ≡ complement of A
• A ∩ Ā = ∅
• A ∪ Ā = S
• Mutually exclusive events ≡ A ∩ B = ∅

  3. Statistical probability of A occurring = P(A)
• (1) P(A) ≥ 0
• (2) P(S) = 1
• (3) let Ai, i = 1, 2, ... be a (possibly infinite) number of events
• assume Ai ∩ Aj = ∅ for i ≠ j
• then P(A1 ∪ A2 ∪ ... ∪ An) = Σi P(Ai)

  4. Joint Events and Joint Probabilities
• joint events: consider multiple experiments
• consider two experiments with outcomes Ai, Bj
• sample space S of all outcomes consists of all 2-tuples (i, j)
Let Ai and Bj be possible outcomes from the 2 experiments
• joint event ≡ events Ai and Bj both occur
• joint outcome ≡ (Ai, Bj)
• joint probability ≡ P(Ai, Bj) such that 0 ≤ P(Ai, Bj) ≤ 1

  5. Mutually Exclusive Joint Events
Let all Bj outcomes, for j = 1 to m, be mutually exclusive; then
P(Ai) = Σj P(Ai, Bj), j = 1, ..., m (2.1-1)
Let all Ai outcomes, for i = 1 to n, be mutually exclusive; then
P(Bj) = Σi P(Ai, Bj), i = 1, ..., n (2.1-2)
Let all outcomes from both experiments be mutually exclusive; then
Σi Σj P(Ai, Bj) = 1 (2.1-3)
The case of 2 experiments generalizes to the case of k experiments

  6. e.g. combined experiment = die toss and coin toss
A1 = event of rolling a 1 on the die toss
A2 = event of rolling a 2 or 3 on the die toss
A3 = event of rolling a 4, 5, or 6 on the die toss
B1 = event of a tail on the coin toss
P(B1) = P(A1, B1) + P(A2, B1) + P(A3, B1)
since the die and coin tosses are independent we have
P(B1) = P(A1)·P(B1) + P(A2)·P(B1) + P(A3)·P(B1)
(a numerical check of this total-probability sum follows below)
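
A minimal sketch (not from the slides) that enumerates the die/coin example above and confirms the joint probabilities sum back to P(B1); the exact fractions assume a fair die and a fair coin.

```python
# Verify total probability for the die-and-coin example.
from fractions import Fraction

# Event probabilities for the die partition A1, A2, A3 and the coin event B1.
P_A = [Fraction(1, 6), Fraction(2, 6), Fraction(3, 6)]   # P(A1), P(A2), P(A3)
P_B1 = Fraction(1, 2)                                    # P(tail)

# Independence => P(Ai, B1) = P(Ai) * P(B1); summing over the partition
# recovers P(B1), since P(A1) + P(A2) + P(A3) = 1.
P_joint = [p * P_B1 for p in P_A]
assert sum(P_joint) == P_B1
print(sum(P_joint))   # 1/2
```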

  7. Conditional Probabilities – Joint Events
• consider two events A and B in sample space S
• let P(A, B) = probability of both A and B (A ∩ B) occurring
• P(A|B) = probability of A occurring given B has occurred
P(A|B) = P(A, B) / P(B) (2.1-4)
P(A, B) = P(A|B) P(B) = P(B|A) P(A) (2.1-5)
• Relationships also apply to a combined experiment
• P(A, B) = probability of a joint event occurring
• P(A|B) = probability of A occurring given B has occurred

  8. Special cases of the conditional probability P(A|B):
(1) if A ∩ B = ∅ then P(A|B) = 0 (2.1-6)
• event B occurred ⇒ event A could not have occurred
• e.g. if the die toss = 6 it cannot = 1
(2) if A ⊂ B then A ∩ B = A and P(A|B) = P(A ∩ B)/P(B) = P(A)/P(B) (2.1-7)
• event B occurred ⇒ event A could possibly have occurred
• e.g. if A = {2} and B = {2, 4}: given the die toss was 2 or 4, it may have been 2
(3) if B ⊂ A then A ∩ B = B and P(A|B) = P(A ∩ B)/P(B) = P(B)/P(B) = 1 (2.1-8)
• event B occurred ⇒ event A must have occurred
• e.g. if A = {2, 4} and B = {2}: given a 2 was tossed, a 2 or 4 was certainly tossed

  9. Bayes Theorem
• assume n mutually exclusive events Ai, i = 1, 2, ..., n, such that ∪i Ai = S
• assume B is an arbitrary event with P(B) > 0
P(Ai|B) = P(Ai, B) / P(B) = P(B|Ai) P(Ai) / Σj P(B|Aj) P(Aj) (2.1-9)

  10. Bayes Theorem
• used to derive the optimal receiver structure for digital communications
• Ai, i = 1, 2, ..., n represents the n possible transmitted messages
• P(Ai) = fixed (a priori) probability that Ai is sent
• B represents the received Ai corrupted by noise
P(Ai|B) = P(B|Ai) P(Ai) / P(B) = probability that Ai was sent given the received B
(a small numeric sketch follows below)
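
A minimal numeric sketch of how a receiver might apply 2.1-9; the prior and likelihood values below are invented for illustration, not taken from the slides.

```python
import numpy as np

# Posterior P(Ai|B) for n candidate messages Ai given a received observation B.
priors = np.array([0.5, 0.3, 0.2])        # P(Ai): prior message probabilities
likelihoods = np.array([0.1, 0.6, 0.3])   # P(B|Ai): channel/noise model

# Bayes theorem (2.1-9): normalize likelihood * prior over all candidates.
posterior = likelihoods * priors / np.sum(likelihoods * priors)
print(posterior)             # P(Ai|B) for each candidate message
print(np.argmax(posterior))  # index of the most probable transmitted message
```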

  11. e.g. Conditional Probabilities & Bayes Theorem
[Figure: a 40-card deck laid out as 4 suits, each with the 10 values 1–9 and 0.]
Assume a deck of 40 cards with 4 suits, each with 10 unique values
S = {A1, A2, ..., A0}
Ai = event of picking a specific blue-suited card ⇒ P(Ai) = 1/40
B = event of picking any blue-suited card ⇒ P(B) = 1/4
• Ai ⊂ B ⇒
• P(Ai|B) = P(Ai) / P(B) = (1/40)/(1/4) = 1/10
• P(B|Ai) = P(Ai) / P(Ai) = 1
Check via Bayes: P(Ai|B) = P(B|Ai) P(Ai) / [P(B|A1) P(A1) + ... + P(B|A0) P(A0)] = 1/10

  12. Statistical Independence
• consider multiple experiments or repeated trials of a single experiment
• events A & B are statistically independent if the occurrence of A is independent of B:
P(A|B) = P(A) (2.1-10)
then by 2.1-5 the joint probability of events A & B both occurring is given by:
P(A, B) = P(A) P(B) (2.1-11)
more generally, for statistically independent events,
P(A1, A2, ..., An) = P(A1) P(A2) ... P(An) (2.1-12)

  13. 2.1.1 Random Variables, Probability Distributions, Probability Densities
• Assume an experiment with
• sample space S representing all possible outcomes
• elements s ∈ S
• then the random variable (RV) X(s) is a real function with
• domain = S
• range = set of real numbers
• experimental outcomes can be
• discrete (digital) ⇒ discrete RV
• continuous (analog) ⇒ continuous RV

  14. CDF and PDF
Assume X is a random variable & x = any real number
(1) F(x) = cumulative distribution function (CDF) of X
• consider the event {X ≤ x}
• let P(X ≤ x) = probability of the event occurring
F(x) = P(X ≤ x) for −∞ < x < ∞ (2.1-13)
(2) p(x) = probability density function (PDF) of X
• consider the event {X = x}
p(x) = P(X = x) for −∞ < x < ∞ (2.1-14)
• (strictly, 2.1-14 holds for a discrete RV; for a continuous RV, p(x) dx = P(x < X ≤ x + dx) and p(x) = dF(x)/dx, eq. 2.1-16)

  15. Properties relating F(x) and p(x):
F(x) = ∫_{−∞}^{x} p(u) du (2.1-15)
p(x) = dF(x)/dx for −∞ < x < ∞ (2.1-16)
[Figure: PDF p(x) and CDF F(x); the area under p(x) up to x1 equals F(x1), and F(x) rises from 0 to 1.]
• properties of F(x)
• 0 ≤ F(x) ≤ 1
• F(−∞) = 0
• F(∞) = 1
(a numerical illustration of 2.1-15 and these properties follows below)
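
A short numerical sketch of 2.1-15 and the listed CDF properties; the standard Gaussian p(x) is an illustrative assumption (numpy and scipy assumed available).

```python
import numpy as np
from scipy import stats

# Build F(x) from p(x) by numerical integration, per eq. 2.1-15.
x = np.linspace(-6, 6, 2001)
pdf = stats.norm.pdf(x)

# Cumulative trapezoidal integral approximates F(x) = integral of p(u), u <= x.
cdf = np.concatenate(([0.0], np.cumsum((pdf[1:] + pdf[:-1]) / 2 * np.diff(x))))

assert np.all((cdf >= 0) & (cdf <= 1 + 1e-9))   # 0 <= F(x) <= 1
print(cdf[0], cdf[-1])                          # ~0 at -inf, ~1 at +inf
print(np.allclose(cdf, stats.norm.cdf(x), atol=1e-4))  # matches the exact CDF
```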

  16. (i) for a discrete or mixed (continuous & discrete) RV
• the PDF contains impulses at the discontinuity points of F(x)
• xi represents the discrete values of the RV
• the discrete part of the CDF can be written as
F(x) = Σi P(X = xi) u(x − xi), where u(·) is the unit step (2.1-17)

  17. (ii) For a specific range (x1, x2) ⇒ find P(x1 < X ≤ x2)
• probability RV X falls within (x1, x2)
• consider the event {X ≤ x2} as the union of 2 mutually exclusive events
• (1) {X ≤ x1}
• (2) {x1 < X ≤ x2}
• then
• P(X ≤ x2) = P(x1 < X ≤ x2) + P(X ≤ x1)
• P(x1 < X ≤ x2) = P(X ≤ x2) − P(X ≤ x1)
P(x1 < X ≤ x2) = F(x2) − F(x1) = ∫_{x1}^{x2} p(x) dx (2.1-18)

  18. 2.1.1.1 Multiple RVs with Joint CDFs and Joint PDFs
• are the result of either combined experiments or repeated trials on a single experiment
• consider 2 random variables X1 & X2
joint CDF: F(x1, x2) = P(X1 ≤ x1, X2 ≤ x2) = ∫_{−∞}^{x1} ∫_{−∞}^{x2} p(u1, u2) du2 du1 (2.1-19)
joint PDF: p(x1, x2) = ∂²F(x1, x2) / ∂x1 ∂x2 (2.1-20)

  19. For the joint PDF p(x1, x2) the marginal PDFs p(x1) and p(x2) are given by:
p(x1) = ∫_{−∞}^{∞} p(x1, x2) dx2 (2.1-21)
p(x2) = ∫_{−∞}^{∞} p(x1, x2) dx1 (2.1-22)
and
F(∞, ∞) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} p(x1, x2) dx1 dx2 = 1 (2.1-23)
(a numerical sketch of marginalization follows below)
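
A numerical sketch of 2.1-21 and 2.1-23; the correlated bivariate Gaussian joint PDF is an illustrative assumption.

```python
import numpy as np
from scipy import stats

# Integrate the joint PDF over x2 to recover the marginal p(x1),
# and check that the joint integrates to 1.
x1 = np.linspace(-6, 6, 601)
x2 = np.linspace(-8, 8, 801)
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
joint = stats.multivariate_normal(mean=[0, 0], cov=[[1.0, 0.5], [0.5, 2.0]]).pdf(
    np.dstack((X1, X2)))

p_x1 = np.trapz(joint, x2, axis=1)     # p(x1) = integral of p(x1, x2) over x2
total = np.trapz(p_x1, x1)             # F(inf, inf) = 1

print(np.max(np.abs(p_x1 - stats.norm.pdf(x1))))  # small: N(0,1) marginal recovered
print(round(total, 4))                            # ~1.0
```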

  20. Multidimensional RVs & Joint Distributions
• assume there are RVs given by Xi, i = 1, 2, ..., n
(1) joint PDF given by
p(x1, ..., xn) = P(X1 = x1, ..., Xn = xn) for discrete RVs; in general
p(x1, ..., xn) = ∂ⁿF(x1, ..., xn) / ∂x1 ... ∂xn (2.1-24)
(2) joint CDF given by
F(x1, ..., xn) = P(X1 ≤ x1, ..., Xn ≤ xn) = ∫_{−∞}^{x1} ... ∫_{−∞}^{xn} p(u1, ..., un) du1 ... dun (2.1-25)
and
• F(x1, x2, ∞, x4, ..., xn) = F(x1, x2, x4, ..., xn) (setting an argument to ∞ marginalizes it out)
• F(x1, x2, −∞, x4, ..., xn) = 0

  21. P(X1 x1 | X2  x2) = = 2.1.1.2Conditional Probability Distribution Functions Consider RVs X1 & X2 with joint PDF p(x1,x2) (i) Determine the probability of event(X1 x1| X2  x2) • probability of X1 x1 given that X2  x2

  22. (ii) Determine the probability of the event (X1 ≤ x1 | x2 − Δx2 < X2 ≤ x2)
• probability of X1 ≤ x1 conditioned on x2 − Δx2 < X2 ≤ x2
• Δx2 = some positive increment
• from eqns (2.1-4) and (2.1-18)
P(X1 ≤ x1 | x2 − Δx2 < X2 ≤ x2) = [F(x1, x2) − F(x1, x2 − Δx2)] / [F(x2) − F(x2 − Δx2)] (2.1-26)

  23. (iii) Conditional CDF of X1 given X2
• assume the PDFs p(x1, x2) & p(x2) are continuous over (x2 − Δx2, x2)
• divide 2.1-26 by Δx2 & take the limit as Δx2 → 0
the conditional CDF of X1 given X2 is by definition P(X1 ≤ x1 | X2 = x2) ≡ F(x1 | x2)
then
F(x1 | x2) = ∫_{−∞}^{x1} p(u, x2) du / p(x2) (2.1-27)
• and
• F(−∞ | x2) = 0
• F(∞ | x2) = 1

  24. differentiation of 2.1-27 by x1 yields the PDF p(x1 | x2):
p(x1 | x2) = p(x1, x2) / p(x2) (2.1-28)
alternatively, express the joint PDF p(x1, x2) in terms of the conditional PDFs:
p(x1, x2) = p(x1 | x2) p(x2) = p(x2 | x1) p(x1) (2.1-29)

  25. Multidimensional Conditional RVs
Assume there are RVs given by Xi, i = 1, 2, ..., n and an integer k, 1 < k < n
the joint probability of {Xi}, i = 1, 2, ..., n is
p(x1, ..., xn) = p(x1, ..., xk | xk+1, ..., xn) p(xk+1, ..., xn) (2.1-30)
the joint conditional PDF is
p(x1, ..., xk | xk+1, ..., xn) = p(x1, ..., xn) / p(xk+1, ..., xn) (2.1-31)
the joint conditional CDF is
F(x1, ..., xk | xk+1, ..., xn) = ∫_{−∞}^{x1} ... ∫_{−∞}^{xk} p(u1, ..., uk | xk+1, ..., xn) du1 ... duk (2.1-32)
• F(∞, x2, ..., xk | xk+1, ..., xn) = F(x2, ..., xk | xk+1, ..., xn)
• F(−∞, x2, ..., xk | xk+1, ..., xn) = 0

  26. Statistically Independent Random Variables
• assume RVs defined on sample space S are generated by either
• (i) combined experiments
• (ii) repeated trials of a single experiment
• extend the idea of statistical independence to multiple events on S
• Let oi = ith outcome of some experiment
• assume mutually exclusive outcomes: i ≠ j ⇒ oi ∩ oj = ∅
• p(oi) is independent of any p(oj)
• thus the joint probability of the outcomes factors into the product of the probabilities for each outcome
• p(o1, o2, ..., on) = p(o1) p(o2) ... p(on)

  27. multidimensional RVs are statistically independent if & only if
F(x1, x2, ..., xn) = F(x1) F(x2) ... F(xn) (2.1-33)
or alternatively
p(x1, x2, ..., xn) = p(x1) p(x2) ... p(xn) (2.1-34)
RVs corresponding to each oi are independent in the sense that their joint PDFs factor into products of marginal PDFs

  28. 2.1.2 Functions of Random Variables
• given RVs X and Y characterized by PDFs pX(x) and pY(y)
• assume Y = g(X), where g(X) is some function of X
(i) determine pY(y) in terms of pX(x) if the mapping X → Y is one-to-one
let Y = aX + b, a > 0 (the function is linear & has a monotonic mapping)
FY(y) = P(Y ≤ y) = P(aX + b ≤ y) = P(X ≤ (y − b)/a) = FX((y − b)/a) (2.1-35)

  29. Differentiate both sides of 2.1-35 with respect to y:
pY(y) = (1/a) pX((y − b)/a) (2.1-36)
[Figure: a rectangular pX(x) on (−1, 1) maps to pY(y) on (b − a, b + a), with height scaled by 1/a.]
(a Monte Carlo check of 2.1-36 follows below)
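
A Monte Carlo sketch of 2.1-36; the values a = 2, b = 3 and the uniform pX on (−1, 1) are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Sample Y = aX + b and compare the empirical density against (1/a) pX((y-b)/a).
rng = np.random.default_rng(0)
a, b = 2.0, 3.0
x = rng.uniform(-1, 1, 200_000)          # X uniform on (-1, 1), pX = 1/2
y = a * x + b                            # Y lands on (b - a, b + a)

hist, edges = np.histogram(y, bins=50, density=True)
centers = (edges[:-1] + edges[1:]) / 2
predicted = (1 / a) * stats.uniform(loc=-1, scale=2).pdf((centers - b) / a)

print(np.max(np.abs(hist - predicted)))  # small: histogram matches (1/a) pX((y-b)/a)
```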

  30. (ii) determine pY(y) when the mapping X → Y is not one-to-one
let Y = aX² + b, a > 0; then
FY(y) = P(Y ≤ y) = P(aX² + b ≤ y) = P(−√((y − b)/a) ≤ X ≤ √((y − b)/a))
= FX(√((y − b)/a)) − FX(−√((y − b)/a)) (2.1-37)
differentiate both sides of 2.1-37 with respect to y to obtain
pY(y) = [pX(√((y − b)/a)) + pX(−√((y − b)/a))] / (2√(a(y − b))), y ≥ b (2.1-38)

  31. Note that y = g(x) = ax² + b has 2 real solutions
x1 = +√((y − b)/a), x2 = −√((y − b)/a) (2.1-40)
⇒ thus the PDF of Y consists of two terms
General Case of g(x) = y with real roots x1, x2, ..., xn: the PDF of Y = g(X) is expressed as
pY(y) = Σi pX(xi) / |g′(xi)| (2.1-39)
where the roots xi, i = 1, 2, ..., n are functions of y, and here g′(x) = dg(x)/dx = 2ax
(a numerical check of the two-root case follows below)
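
A Monte Carlo sketch of 2.1-38/2.1-39; the standard Gaussian pX and the values a = 2, b = 1 are illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Sample Y = aX^2 + b and compare against the two-root formula.
rng = np.random.default_rng(1)
a, b = 2.0, 1.0
y = a * rng.standard_normal(500_000) ** 2 + b

hist, edges = np.histogram(y, bins=60, range=(b + 0.05, b + 8), density=True)
yc = (edges[:-1] + edges[1:]) / 2
root = np.sqrt((yc - b) / a)             # x1 = +root, x2 = -root, g'(x) = 2ax
predicted = (stats.norm.pdf(root) + stats.norm.pdf(-root)) / (2 * a * root)

# Both roots contribute to pY(y); the residual is small (largest near the
# 1/sqrt(y - b) edge at y = b, where the density varies within a bin).
print(np.max(np.abs(hist - predicted)))
```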

  32. Multidimensional Functions of RVs
Assume for i = 1, 2, ..., n
• there are RVs Xi with joint PDF given by pX(x1, x2, ..., xn)
• RVs Yi, where for each Yi there exists a function gi(·) such that
Yi = gi(X1, X2, ..., Xn) (2.1-41)
• gi(·) is
• - a single-valued function with continuous partial derivatives
• - and invertible ⇒ there exists gi⁻¹(·) such that
Xi = gi⁻¹(Y1, Y2, ..., Yn) (2.1-42)
• gi⁻¹(·) is single-valued with continuous partial derivatives

  33. Given pX(x1, x2, ..., xn) ⇒ determine pY(y1, y2, ..., yn)
• Assume
• X = n-dimensional space of the RVs Xi
• Y = 1-1 mapping of X defined by the functions Yi = gi(X1, X2, ..., Xn)
then
• for i = 1, 2, ..., n substitute xi = gi⁻¹(y1, y2, ..., yn)
• for notation, let gi⁻¹ ≡ gi⁻¹(y1, y2, ..., yn)

  34. The joint CDF of the Yi equals the probability that X falls in the region that maps to {Yi ≤ yi, i = 1, ..., n} (2.1-43)
J = Jacobian of the transformation, defined by the determinant
J = det[∂xi/∂yj], i, j = 1, ..., n
The desired joint PDF of the Yi is given by differentiation of 2.1-43:
pY(y1, y2, ..., yn) = pX(x1 = g1⁻¹, x2 = g2⁻¹, ..., xn = gn⁻¹) |J| (2.1-44)

  35. e.g. if there is a linear relation between 2 sets of n-dimensional RVs:
Y = AX, i.e. Yi = Σj aij Xj, i = 1, 2, ..., n, where the {aij} are constants (2.1-45)
then
X = A⁻¹Y, i.e. Xi = Σj bij Yj, i = 1, 2, ..., n, where the {bij} are the elements of A⁻¹ (2.1-46)
J = 1/det(A), so the joint PDF is
pY(y1, ..., yn) = pX(A⁻¹y) / |det(A)| (2.1-47)
(a numerical check for a 2-dimensional Gaussian case follows below)
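
A sketch of 2.1-47 for n = 2; the matrix A, the evaluation point, and the iid standard-normal X are illustrative assumptions. For Gaussian X the result can be cross-checked against the exact N(0, AAᵀ) density.

```python
import numpy as np
from scipy import stats

# Evaluate pY(y) = pX(A^{-1} y) / |det A| at one point.
A = np.array([[2.0, 1.0], [0.5, 1.5]])
y = np.array([0.7, -0.3])                       # evaluation point

x = np.linalg.solve(A, y)                       # x = A^{-1} y
p_x = stats.norm.pdf(x).prod()                  # pX(x1, x2) = pX(x1) pX(x2)
p_y = p_x / abs(np.linalg.det(A))               # eq. 2.1-47

# Cross-check: Y = AX of iid N(0,1) is exactly N(0, A A^T).
ref = stats.multivariate_normal(mean=[0, 0], cov=A @ A.T).pdf(y)
print(np.isclose(p_y, ref))                     # True
```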

  36. 2.1.3 Statistical Averages of RVs
• Averages are important for characterizing
• outcomes of experiments
• RVs defined on the sample space of experiments
• Averages of specific interest include
• (1) 1st & 2nd moments of a single RV
• (2) joint moments between two RVs in a multidimensional set of RVs
• - correlation
• - covariance
• (3) characteristic function of a single RV
• (4) joint characteristic function of a multidimensional set of RVs

  37. (1) Given a single RV X with PDF p(x):
1st moment of X (aka mean or expected value) given by:
E[X] = mx = ∫_{−∞}^{∞} x p(x) dx (2.1-48)
(i) nth moment of X given by
E[Xⁿ] = ∫_{−∞}^{∞} xⁿ p(x) dx (2.1-49)
for RV Y = g(X), where g(X) is an arbitrary function of X,
E[Y] = E[g(X)] = ∫_{−∞}^{∞} g(x) p(x) dx (2.1-50)

  38. (ii) nth Central Moment
Let Y = (X − mx)ⁿ, where mx = mean of RV X; then
E[Y] = E[(X − mx)ⁿ] = ∫_{−∞}^{∞} (x − mx)ⁿ p(x) dx (2.1-51)
2nd central moment of RV X (aka Variance)
• measures the dispersion of RV X
σx² = E[(X − mx)²] = E[X²] − E[X]² = E[X²] − mx² (2.1-52)
Standard deviation σx = √(σx²)
(a numerical check of the two variance forms follows below)
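
A quick sample-based sketch of the two equivalent variance forms in 2.1-52; the exponential distribution is an illustrative assumption.

```python
import numpy as np

# Compare E[(X - mx)^2] against E[X^2] - mx^2 on sampled data.
rng = np.random.default_rng(3)
x = rng.exponential(scale=2.0, size=1_000_000)

m = x.mean()
var_central = np.mean((x - m) ** 2)       # E[(X - mx)^2]
var_moments = np.mean(x ** 2) - m ** 2    # E[X^2] - mx^2

print(var_central, var_moments)           # both ~4.0 for scale = 2
```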

  39. (2) Given the joint PDF p(x1, x2) of 2 RVs, X1 & X2:
(i) joint moment
E[X1^k X2^n] = ∫∫ x1^k x2^n p(x1, x2) dx1 dx2 (2.1-53)
(ii) joint central moment
E[(X1 − m1)^k (X2 − m2)^n] = ∫∫ (x1 − m1)^k (x2 − m2)^n p(x1, x2) dx1 dx2 (2.1-54)
where mi = E[Xi]

  40. correlation & covariance are the joint moments for k = n = 1, used for pairs of RVs
(iii) correlation of X1 & X2 = joint moment with k = n = 1
E[X1 X2] = ∫∫ x1 x2 p(x1, x2) dx1 dx2 (2.1-55)
• indicates whether 2 RVs vary in a similar manner
(iv) covariance of X1 & X2 = joint central moment with k = n = 1
E[(X1 − m1)(X2 − m2)] = ∫∫ (x1 − m1)(x2 − m2) p(x1, x2) dx1 dx2 (2.1-56)
• (normalizing the covariance by σ1σ2 gives the correlation coefficient, a normalized indication of whether 2 RVs vary in a similar manner)

  41. more generally, assume RVs Xi, i = 1, 2, ..., n with joint PDF p(x1, ..., xn), and let p(xi, xj) be the joint PDF of Xi and Xj:
(i) correlation of Xi & Xj
E[Xi Xj] = ∫∫ xi xj p(xi, xj) dxi dxj (2.1-57)
(ii) covariance of Xi & Xj ≡ μij
μij = E[(Xi − mi)(Xj − mj)] = ∫∫ (xi − mi)(xj − mj) p(xi, xj) dxi dxj = E[Xi Xj] − mi mj (2.1-58)
(iii) the covariance matrix of {Xi} is the n×n matrix with elements μij

  42. For two uncorrelated RVs Xi & Xj:
correlation E[Xi Xj] = E[Xi] E[Xj] = mi mj (2.1-59)
covariance μij = E[Xi Xj] − mi mj = 0
Given any two RVs Xi & Xj:
• statistical independence of Xi & Xj implies they are uncorrelated
• uncorrelated Xi & Xj are not necessarily statistically independent (see the sketch below)
• Xi & Xj are orthogonal if E[Xi Xj] = 0 (2.1-60)
• Xi & Xj are orthogonal when
• (i) they are uncorrelated
• (ii) either/both have 0-mean
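
A sketch of the "uncorrelated does not imply independent" point; the construction Y = X² with X ~ N(0, 1) is a standard illustrative example, not from the slides.

```python
import numpy as np

# With X ~ N(0,1) and Y = X^2, E[XY] = E[X^3] = 0 = E[X] E[Y], so X and Y
# are uncorrelated -- yet Y is a deterministic function of X (fully dependent).
rng = np.random.default_rng(4)
x = rng.standard_normal(1_000_000)
y = x ** 2

cov = np.mean(x * y) - x.mean() * y.mean()
print(round(cov, 2))             # ~0: uncorrelated
print(np.corrcoef(x, y)[0, 1])   # ~0 as well, despite full dependence
```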

  43. Characteristic Functions
• given RV X with PDF p(x), the characteristic function ψ(jv) is defined as the statistical average of exp(jvX):
ψ(jv) ≡ E[exp(jvX)] = ∫_{−∞}^{∞} p(x) exp(jvx) dx (2.1-61)
• v = real variable, j = √(−1)
• ψ(jv) can be described as the Fourier transform of p(x) (with the sign of the exponent reversed)
• the inverse Fourier transform is given by
p(x) = (1/2π) ∫_{−∞}^{∞} ψ(jv) exp(−jvx) dv (2.1-62)

  44. (1) Moments of a RV can be determined from the characteristic function
The 1st derivative of ψ(jv) with respect to v gives
dψ(jv)/dv = j ∫_{−∞}^{∞} x p(x) exp(jvx) dx (2.1-63)
• evaluated at v = 0 this yields the 1st moment (mean):
E[X] = mx = −j dψ(jv)/dv |_{v=0} (2.1-64)
the nth derivative of ψ(jv) evaluated at v = 0 yields the nth moment:
E[Xⁿ] = (−j)ⁿ dⁿψ(jv)/dvⁿ |_{v=0} (2.1-65)
(a symbolic check for the Gaussian case follows below)
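
A symbolic sketch of 2.1-64/2.1-65 using sympy; the Gaussian characteristic function ψ(jv) = exp(jmv − v²s²/2) is a known closed form, with the mean m and standard deviation s kept symbolic.

```python
import sympy as sp

# Recover the first two moments of a Gaussian RV from its characteristic function.
v, m, s = sp.symbols("v m s", real=True)
psi = sp.exp(sp.I * m * v - v**2 * s**2 / 2)

first = sp.simplify(-sp.I * sp.diff(psi, v).subs(v, 0))             # E[X]
second = sp.simplify((-sp.I) ** 2 * sp.diff(psi, v, 2).subs(v, 0))  # E[X^2]

print(first)              # m
print(sp.expand(second))  # m**2 + s**2, i.e. variance + mean^2
```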

  45. (2) The moments of a RV can be related to the characteristic function if ψ(jv) can be expanded into a Taylor series about the point v = 0:
ψ(jv) = Σ_{n=0}^{∞} [dⁿψ(jv)/dvⁿ |_{v=0}] vⁿ/n! (2.1-66)
by substitution of 2.1-65, ψ(jv) is related to the sum of the moments of X:
ψ(jv) = Σ_{n=0}^{∞} E[Xⁿ] (jv)ⁿ/n! (2.1-67)

  46. (3) Given a set of statistically independent RVs {Xi}, i = 1..n, with joint PDF p(x1, x2, ..., xn),
let RV Y be given as Y = Σ_{i=1}^{n} Xi
• find p(y), the PDF of Y, using ψ(jv) & the inverse Fourier transform
ψY(jv) = E[exp(jvY)] = E[exp(jv Σ_{i} Xi)] = E[Π_{i=1}^{n} exp(jvXi)] (2.1-68)

  47. for statistically independent RVs ⇒ p(x1, ..., xn) = p(x1) p(x2) ... p(xn)
• thus the nth-order integral reduces to a product of n single integrals
• each integral i corresponds to the characteristic function of Xi:
ψXi(jv) = ∫_{−∞}^{∞} p(xi) exp(jvxi) dxi (2.1-69)
thus
ψY(jv) = Π_{i=1}^{n} ψXi(jv) (2.1-70)

  48. The PDF of Y is determined from the IFT of ψY(jv):
p(y) = (1/2π) ∫_{−∞}^{∞} ψY(jv) exp(−jvy) dv (2.1-71)
if the Xi's have identical distributions (iid), all ψXi(jv) = ψX(jv), so
ψY(jv) = [ψX(jv)]ⁿ ⇒ p(y) = (1/2π) ∫_{−∞}^{∞} [ψX(jv)]ⁿ exp(−jvy) dv (2.1-72)
(a numerical illustration via repeated convolution follows below)
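
A time-domain illustration of the same result: multiplying characteristic functions corresponds to convolving PDFs, so the PDF of a sum of iid RVs is the n-fold convolution of the common PDF. The uniform-on-(0,1) PDF and n = 3 are illustrative assumptions.

```python
import numpy as np

# PDF of Y = X1 + X2 + X3 for iid uniform Xi, by repeated convolution.
dx = 0.001
x = np.arange(0, 1, dx)
p = np.ones_like(x)                  # uniform PDF on (0, 1)

p_y = p
for _ in range(2):                   # convolve twice: n = 3 terms in the sum
    p_y = np.convolve(p_y, p) * dx   # scale by dx to keep unit area

y = np.arange(len(p_y)) * dx
print(np.trapz(p_y, y))              # ~1: still a valid PDF
print(y[np.argmax(p_y)])             # ~1.5: peak at n/2 (Irwin-Hall mode)
```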

  49. For statistically independent RVs {Xi}, i = 1..n, and Y = Σ_{i} Xi:
• ψY(jv) = product of the ψXi(jv)
• p(y) = n-fold convolution of the p(xi), usually difficult to solve directly
(4) Characteristic Function for Joint RVs
For n-dimensional RVs {Xi}, i = 1..n, with joint PDF p(x1, ..., xn), the n-dimensional characteristic function is defined as
ψ(jv1, jv2, ..., jvn) = E[exp(j Σ_{i=1}^{n} vi Xi)] = ∫...∫ p(x1, ..., xn) exp(j Σ_{i} vi xi) dx1 ... dxn

  50. e.g. the characteristic function for n = 2:
ψ(jv1, jv2) = E[exp(j(v1X1 + v2X2))] (2.1-74)
• use partial derivatives of ψ(jv1, jv2) with respect to v1 & v2 to generate the joint moments
• the correlation of X1 & X2 is given by
E[X1 X2] = −∂²ψ(jv1, jv2) / ∂v1 ∂v2 |_{v1 = v2 = 0} (2.1-75)
• higher-order moments are generated in a similar manner
