
Random Variables



Presentation Transcript


  1. Random Variables Supplementary Notes Prepared by Raymond Wong Presented by Raymond Wong

  2. e.g.1 (Page 3) • Suppose that we flip the coin 5 times. • The sample space of flipping the coin 5 times contains all 32 sequences of H and T, e.g., HHHHH, HTHHT, THTHT, TTTTT, .... • Suppose that we are interested in the total number of heads when we flip the coin 5 times. We define a random variable X to denote the total number of heads, e.g., X(HHHHH) = 5, X(HTHHT) = 3, X(THTHT) = 2, X(TTTTT) = 0.
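The sample space and the random variable X of e.g.1 can be sketched in a few lines of Python (this sketch is illustrative and not part of the original notes):

```python
from itertools import product

# Enumerate the sample space of 5 coin flips as strings of H and T.
sample_space = ["".join(flips) for flips in product("HT", repeat=5)]

def X(outcome):
    """The random variable X: map an outcome such as 'HTHHT' to its number of heads."""
    return outcome.count("H")
```

For example, X("HHHHH") evaluates to 5 and X("TTTTT") to 0, matching the slide, and the sample space has 2^5 = 32 outcomes.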

  3. e.g.2 (Page 6) • Consider the example of flipping the coin 1 time. • Let X be the random variable where X = 1 if the flip is successful (i.e., shows a head) and X = 0 if the flip is unsuccessful (i.e., does not show a head). • X is called a Bernoulli random variable. • The flip is called a Bernoulli trial.

  4. e.g.3 (Page 8) • Consider the example of flipping the coin 5 times. • Let Xi be the random variable where Xi = 1 if the i-th flip is successful (i.e., shows a head) and Xi = 0 if the i-th flip is unsuccessful (i.e., does not show a head). • Each Xi is called a Bernoulli random variable. • Each flip is called a Bernoulli trial. • Flipping the coin 5 times is a Bernoulli trial process. • Suppose that we are interested in the number of heads. • It equals X1 + X2 + X3 + X4 + X5 (i.e., the sum of Bernoulli random variables).
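A Bernoulli trial process can be simulated directly; the sketch below (an illustration, not part of the notes; the seed and p = 0.5 are arbitrary choices) shows the number of heads as the sum of the indicator variables:

```python
import random

random.seed(42)  # arbitrary seed, for a reproducible illustration

def bernoulli_trial(p):
    """One Bernoulli trial: return 1 (success, i.e., a head) with probability p, else 0."""
    return 1 if random.random() < p else 0

# Flipping the coin 5 times is a Bernoulli trial process; the number of
# heads is the sum of the indicators X1 + X2 + X3 + X4 + X5.
flips = [bernoulli_trial(0.5) for _ in range(5)]
total_heads = sum(flips)
```

Each entry of `flips` is 0 or 1, so `total_heads` is always between 0 and 5.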

  5. e.g.4 (Page 9) • We have 5 Bernoulli trials with probability p of success on each trial. • Let S denote success and F denote failure. • What is the probability of the following? (a) SSSFF (b) FFSSS (c) SFSFS (d) any particular ordering of three S's and two F's (e.g., FSFSS). (a) Since the trials are independent, we have P(SSSFF) = P(S) x P(S) x P(S) x P(F) x P(F) = p x p x p x (1-p) x (1-p) = p^3(1-p)^2.

  6. e.g.4 (b) Since the trials are independent, we have P(FFSSS) = P(F) x P(F) x P(S) x P(S) x P(S) = (1-p) x (1-p) x p x p x p = p^3(1-p)^2.

  7. e.g.4 (c) Since the trials are independent, we have P(SFSFS) = P(S) x P(F) x P(S) x P(F) x P(S) = p x (1-p) x p x (1-p) x p = p^3(1-p)^2.

  8. e.g.4 (d) Since the trials are independent, any particular ordering of three S's and two F's multiplies three factors of p and two factors of (1-p). Thus P(any particular ordering of three S's and two F's) = p^3(1-p)^2.

  9. e.g.5 (Page 10) • We have 5 Bernoulli trials with probability p of success on each trial. • Let S denote success and F denote failure. • What is the probability that the 5 trials contain exactly 3 successes? Is it equal to p^3(1-p)^2? No. The total number of orderings containing 3 successes and 2 failures is C(5,3) = 10, so P(5 trials contain exactly 3 successes) = P(SSSFF) + P(SSFSF) + ... + P(FFSSS) = p^3(1-p)^2 + p^3(1-p)^2 + ... + p^3(1-p)^2 = C(5,3) p^3(1-p)^2.
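The counting argument of e.g.5 can be checked numerically by summing over every ordering (a sketch, not part of the notes; p = 0.4 is an arbitrary illustration value):

```python
from itertools import product
from math import comb

p = 0.4  # arbitrary success probability for illustration

# Probability of one particular ordering with 3 S's and 2 F's.
single_ordering = p**3 * (1 - p)**2

# Sum the probability over all orderings that contain exactly 3 S's.
total = sum(
    p**seq.count("S") * (1 - p)**seq.count("F")
    for seq in product("SF", repeat=5)
    if seq.count("S") == 3
)

# The closed form from the slide: C(5,3) p^3 (1-p)^2.
closed_form = comb(5, 3) * single_ordering
```

There are C(5,3) = 10 such orderings, and `total` agrees with `closed_form`.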

  10. e.g.6 (Page 12) • The sample space for the binomial random variable X on n trials partitions into the events X = 0, X = 1, X = 2, ..., X = n, with probabilities C(n,0) p^0(1-p)^(n-0), C(n,1) p^1(1-p)^(n-1), C(n,2) p^2(1-p)^(n-2), ..., C(n,n) p^n(1-p)^(n-n).

  11. e.g.7 (Page 12) • The binomial theorem is (x + y)^n = sum from k = 0 to n of C(n,k) x^k y^(n-k). • If x = p and y = 1-p, we have sum from k = 0 to n of C(n,k) p^k (1-p)^(n-k) = (p + (1-p))^n = 1.

  12. e.g.8 (Page 13) • There are 10 questions in a test. • A student takes this test. • Suppose that a student who knows 80% of the course material has probability 0.8 of answering any question correctly (and probability 0.2 of answering it incorrectly), independently of how he does on the other questions. • (a) What is the probability that he answers exactly 8 questions correctly? • (b) What is the probability that he answers exactly 9 questions correctly? • (c) What is the probability that he answers exactly 10 questions correctly?

  13. e.g.8 This example is a Bernoulli trial process: a trial is answering a question, a success is answering it correctly, and a failure is answering it incorrectly. Thus, we can use the formula for the Bernoulli trial process (i.e., the binomial random variable). Let X be the total number of questions answered correctly, so P(X = k) = C(10,k) 0.8^k 0.2^(10-k) if 0 <= k <= 10, and 0 otherwise. (a) P(X = 8) = C(10,8) 0.8^8 0.2^2 ≈ 0.302. (b) P(X = 9) = C(10,9) 0.8^9 0.2^1 ≈ 0.268. (c) P(X = 10) = C(10,10) 0.8^10 0.2^0 ≈ 0.107.
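The three answers of e.g.8 follow from one binomial PMF; a small sketch (illustrative, not from the notes):

```python
from math import comb

def binom_pmf(k, n, p):
    """P(X = k) for a binomial random variable: C(n,k) p^k (1-p)^(n-k), or 0 outside 0..n."""
    if 0 <= k <= n:
        return comb(n, k) * p**k * (1 - p)**(n - k)
    return 0.0

# Parts (a), (b), (c): n = 10 questions, success probability p = 0.8.
probs = {k: binom_pmf(k, 10, 0.8) for k in (8, 9, 10)}
```

Rounding to three decimals reproduces the slide's values 0.302, 0.268, and 0.107.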

  14. e.g.9 (Page 15) • Suppose that we flip a fair coin TWICE. • The sample space is {HH, HT, TH, TT}, where HH gives 2 heads, HT gives 1 head, TH gives 1 head, and TT gives 0 heads.

  15. e.g.10 (Page 15) • Suppose that we flip a fair coin THREE times. • The sample space is {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, where HHH gives 3 heads; HHT, HTH, and THH give 2 heads each; HTT, THT, and TTH give 1 head each; and TTT gives 0 heads.

  16. e.g.11 (Page 16) • Illustration 1 for Page 16. [Slide diagram of a game between "You" and "I": Step 1 (the amount at stake is $???), Step 2 (flip 3 coins; the outcome is HHT), Step 3 (the amount shown is $2).]

  17. e.g.12 (Page 16) • Illustration 2 for Page 16. [Slide diagram of a game between "You" and "I": Step 1 (the amount at stake is $???), Step 2 (flip 3 coins; the outcome is TTT), Step 3 (the amount shown is $0).]

  18. e.g.13 (Page 17) • Let X be a random variable denoting a number equal to 0, 1, 2, or 3. • The sample space where we consider random variable X partitions into the events X = 0, X = 1, X = 2, and X = 3, with P(X = xi) = 1/4 for each value xi. What is E(X)? E(X) = 0 x 1/4 + 1 x 1/4 + 2 x 1/4 + 3 x 1/4 = 3/2.

  19. e.g.14 (Page 17) • Suppose that we flip a fair coin THREE times. • Each of the 8 outcomes HHH, HHT, HTH, THH, HTT, THT, TTH, TTT has probability 1/8; they give 0, 1, 1, 1, 2, 2, 2, and 3 tails respectively. Let X be the random variable denoting the number of tails. What is E(X)? E(X) = 3 x 1/8 + 2 x 1/8 + 2 x 1/8 + 2 x 1/8 + 1 x 1/8 + 1 x 1/8 + 1 x 1/8 + 0 x 1/8 = 1.5.

  20. e.g.15 (Page 17) • Suppose that we flip a biased coin THREE times, where P(tail) = 2/3 and P(head) = 1/3. • In the sample space, TTT (3 tails) has probability 8/27; TTH, THT, and HTT (2 tails each) have probability 4/27 each; THH, HTH, and HHT (1 tail each) have probability 2/27 each; and HHH (0 tails) has probability 1/27. Let X be the random variable denoting the number of tails. What is E(X)? E(X) = 3 x 8/27 + 2 x 4/27 + 2 x 4/27 + 2 x 4/27 + 1 x 2/27 + 1 x 2/27 + 1 x 2/27 + 0 x 1/27 = 2.

  21. e.g.16 (Page 17) • Suppose that we flip a biased coin THREE times, where P(tail) = 2/3 and P(head) = 1/3. • Let X be the random variable denoting the number of tails. • The sample space where we consider random variable X partitions into the events X = 0, X = 1, X = 2, and X = 3, with P(X = k) = C(3,k) (2/3)^k (1/3)^(3-k). What is E(X)? E(X) = 0 x C(3,0)(2/3)^0(1/3)^3 + 1 x C(3,1)(2/3)^1(1/3)^2 + 2 x C(3,2)(2/3)^2(1/3)^1 + 3 x C(3,3)(2/3)^3(1/3)^0 = 2.

  22. e.g.17 (Page 18) • Suppose that I want to throw one 6-sided die. • The sample space consists of the six faces (1 spot, 2 spots, 3 spots, 4 spots, 5 spots, 6 spots), each with probability 1/6. • Let X be the number of spots shown. • What is E(X)? E(X) = 1 x 1/6 + 2 x 1/6 + 3 x 1/6 + 4 x 1/6 + 5 x 1/6 + 6 x 1/6 = 7/2.
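The expected-value formula E(X) = sum of x times P(X = x) can be written as a small helper and applied to the fair die (a sketch, not part of the notes):

```python
from fractions import Fraction

def expectation(dist):
    """E(X) = sum of x * P(X = x) over a distribution given as {value: probability}."""
    return sum(x * p for x, p in dist.items())

# A fair six-sided die: each of the six faces has probability 1/6.
die = {spots: Fraction(1, 6) for spots in range(1, 7)}
```

Using exact `Fraction` arithmetic, `expectation(die)` is exactly 7/2, matching the slide.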

  23. e.g.18 (Page 18) • Suppose that I want to throw two fair dice. • Let Y be the random variable denoting the total number of spots shown. The distribution is P(Y = 2) = 1/36, P(Y = 3) = 2/36, P(Y = 4) = 3/36, P(Y = 5) = 4/36, P(Y = 6) = 5/36, P(Y = 7) = 6/36, P(Y = 8) = 5/36, P(Y = 9) = 4/36, P(Y = 10) = 3/36, P(Y = 11) = 2/36, P(Y = 12) = 1/36.

  24. e.g.19 (Page 18) • Suppose that I want to throw two fair dice. • Let Y be the random variable denoting the total number of spots shown. E(Y) = 2 x 1/36 + 3 x 2/36 + 4 x 3/36 + 5 x 4/36 + 6 x 5/36 + 7 x 6/36 + 8 x 5/36 + 9 x 4/36 + 10 x 3/36 + 11 x 2/36 + 12 x 1/36 = 7.
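The two-dice distribution and E(Y) = 7 can be recovered by counting the 36 equally likely outcomes (an illustrative sketch, not from the notes):

```python
from fractions import Fraction
from itertools import product

# Count, for each total, how many of the 36 outcomes produce it.
counts = {}
for a, b in product(range(1, 7), repeat=2):
    counts[a + b] = counts.get(a + b, 0) + 1

# Turn counts into the distribution of Y and compute E(Y) exactly.
dist = {total: Fraction(c, 36) for total, c in counts.items()}
expected = sum(total * p for total, p in dist.items())
```

The counts reproduce the slide's row (1/36, 2/36, ..., 6/36, ..., 1/36) and `expected` is exactly 7.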

  25. e.g.20 (Page 20) • Suppose that we flip a fair coin THREE times. • For each outcome s in the sample space {HHH, HHT, HTH, THH, HTT, THT, TTH, TTT}, P(s) = 1/8 and X(s) is the number of tails in s, where X is the random variable denoting the number of tails. What is E(X)? E(X) = 0 x 1/8 + 1 x 1/8 + 1 x 1/8 + 1 x 1/8 + 2 x 1/8 + 2 x 1/8 + 2 x 1/8 + 3 x 1/8 = 1.5.

  26. e.g.21 (Page 25) Lemma 5.9: If a random variable X is defined on a (finite) sample space S, then its expected value is given by E(X) = sum over s in S of X(s)P(s). Theorem 5.10: Suppose X and Y are random variables on the (finite) sample space S. Then E(X + Y) = E(X) + E(Y). IMPORTANT: X and Y can be independent, and X and Y can be dependent. Why is it correct? Let Z = X + Y. That is, given an outcome s in S, Z(s) = X(s) + Y(s). According to Lemma 5.9, E(X + Y) = E(Z) = sum over s of Z(s)P(s) = sum over s of [X(s) + Y(s)]P(s) = sum over s of [X(s)P(s) + Y(s)P(s)] = sum over s of X(s)P(s) + sum over s of Y(s)P(s) = E(X) + E(Y).

  27. e.g.22 (Page 26) • Suppose that we flip a fair coin. • We have two random variables X and Y, where X = 1 if head and X = 0 if tail, and Y = 0 if head and Y = 1 if tail. • (a) What is E(X)? • (b) What is E(Y)? • (c) What is E(X + Y) (without using Theorem 5.10)? • (d) What is E(X + Y) (by using Theorem 5.10)?

  28. e.g.22 X = 1 if head, 0 if tail; Y = 0 if head, 1 if tail. (a) E(X) = 1 x 1/2 + 0 x 1/2 = 1/2. (b) E(Y) = 0 x 1/2 + 1 x 1/2 = 1/2. (c) Consider two cases. Case 1: head, so X = 1 and Y = 0, giving X + Y = 1. Case 2: tail, so X = 0 and Y = 1, giving X + Y = 1. Thus E(X + Y) = 1 x 1/2 + 1 x 1/2 = 1. (d) By using the theorem, E(X + Y) = E(X) + E(Y) = 1/2 + 1/2 = 1.
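The computation in e.g.22 can be checked over the two-outcome sample space using Lemma 5.9 (a sketch, not part of the notes):

```python
from fractions import Fraction

# One fair flip; X and Y are the dependent indicators from e.g.22:
# X = 1 on a head, Y = 1 on a tail.
outcomes = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
X = {"H": 1, "T": 0}
Y = {"H": 0, "T": 1}

def E(rv):
    """Expected value via Lemma 5.9: sum of rv(s) * P(s) over outcomes s."""
    return sum(rv[s] * p for s, p in outcomes.items())

Z = {s: X[s] + Y[s] for s in outcomes}  # Z = X + Y is 1 on every outcome
```

Here E(X) = E(Y) = 1/2 and E(Z) = 1 = E(X) + E(Y), even though X and Y are dependent.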

  29. e.g.23 (Page 26) • Suppose that we flip a fair coin. • We have two random variables X and Y, where X = 1 if head and X = 0 if tail, and Y = 0 if head and Y = 1 if tail. • (a) What is E(X)? • (b) What is E(Y)? • (c) What is E(XY)? • (d) Is “E(XY) = E(X)E(Y)”?

  30. e.g.23 X = 1 if head, 0 if tail; Y = 0 if head, 1 if tail. (a) E(X) = 1/2. (b) E(Y) = 1/2. (c) Consider two cases. Case 1: head, so X = 1 and Y = 0, giving XY = 0. Case 2: tail, so X = 0 and Y = 1, giving XY = 0. Thus E(XY) = 0 x 1/2 + 0 x 1/2 = 0. (d) Consider E(X)E(Y) = 1/2 x 1/2 = 1/4. We know that E(XY) = 0 (from part (c)). Thus, E(XY) ≠ E(X)E(Y).
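The failure of the product rule in e.g.23 can be verified the same way (a sketch, not from the notes; the same dependent indicators as above):

```python
from fractions import Fraction

# One fair flip; X = 1 on a head, Y = 1 on a tail, so the
# product XY is 0 on every outcome.
outcomes = {"H": Fraction(1, 2), "T": Fraction(1, 2)}
X = {"H": 1, "T": 0}
Y = {"H": 0, "T": 1}

def E(rv):
    """Expected value: sum of rv(s) * P(s) over outcomes s."""
    return sum(rv[s] * p for s, p in outcomes.items())

XY = {s: X[s] * Y[s] for s in outcomes}
```

E(XY) = 0 while E(X)E(Y) = 1/4, so for these dependent variables E(XY) ≠ E(X)E(Y).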

  31. e.g.24 (Page 26) • In all cases (i.e., in general), E(X + Y) = E(X) + E(Y). • In some cases, E(XY) ≠ E(X)E(Y). • In some other cases, E(XY) = E(X)E(Y).

  32. e.g.24 (Page 27) • Illustration of Theorem 5.11 • E.g., E(2X) = 2E(X). The reason is E(2X) = E(X + X) = E(X) + E(X) = 2E(X).

  33. e.g.25 (Page 36) • Consider the Derangement Problem. • Suppose that there are 5 (or n) students. • They put their backpacks along the wall. • Someone mixed up the backpacks, so the students get back “random” backpacks.

  34. e.g.25 • Let X be the total number of students who get their own backpacks back. • Let Xi be an indicator random variable for the event Ei that student i gets his backpack back. • (a) Are E1 and E2 independent when n = 2? • (b) What is E(X) when n = 5?

  35. e.g.25 (a) Suppose that student 1 is “Raymond” and student 2 is “Peter”. E1: the event that “Raymond” gets his backpack back. E2: the event that “Peter” gets his backpack back. There are only two cases. Case 1: each gets his own backpack. Case 2: the two backpacks are swapped. P(E1) = P(“Raymond” gets his backpack back) = 1/2. P(E2) = P(“Peter” gets his backpack back) = 1/2. P(E1 ∩ E2) = P(“Raymond” and “Peter” both get their backpacks back) = 1/2 (only Case 1). Note that P(E1) x P(E2) = 1/2 x 1/2 = 1/4. Thus, P(E1) x P(E2) ≠ P(E1 ∩ E2), so E1 and E2 are not independent.

  36. e.g.25 (b) Note that Xi = 1 if student i takes his backpack back correctly and Xi = 0 otherwise, so X = X1 + X2 + X3 + X4 + X5. The events Ei (and the corresponding random variables Xi) are not independent, but we can still use linearity of expectation: E(X) = E(X1 + X2 + X3 + X4 + X5) = E(X1) + E(X2) + E(X3) + E(X4) + E(X5). The next question is: what is E(Xi)? E(Xi) = 1 x P(student i takes his backpack back correctly) + 0 x P(student i takes his backpack back incorrectly) = P(student i takes his backpack back correctly).

  37. e.g.25 (b, continued) By linearity of expectation, E(X) = E(X1) + E(X2) + E(X3) + E(X4) + E(X5), where E(Xi) = P(student i takes his backpack back correctly).

  38. e.g.25 (b, continued) What is E(Xi)? There are (5-1)! assignments in which student i (say, Raymond) gets his OWN backpack back, out of 5! assignments in total. Thus E(Xi) = P(student i takes his backpack back correctly) = (5-1)!/5! = 4!/5! = 1/5.

  39. e.g.25 (b, continued) Thus, E(X) = E(X1) + E(X2) + E(X3) + E(X4) + E(X5) = 1/5 + 1/5 + 1/5 + 1/5 + 1/5 = 1. Additional question: if n can be any number, what is E(X)? E(X) = n x (1/n) = 1. Note that it is independent of n. E.g., if n = 1000, we still expect only one student to get his backpack back.
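The answer E(X) = 1 for n = 5 can be confirmed by brute force over all 5! assignments (an illustrative check, not part of the notes):

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

# For n = 5, average the number of fixed points (students who get
# their own backpack back) over all 5! equally likely assignments.
n = 5
total_fixed = sum(
    sum(1 for i in range(n) if assignment[i] == i)
    for assignment in permutations(range(n))
)
expected_fixed = Fraction(total_fixed, factorial(n))
```

The brute-force average is exactly 1, matching the linearity-of-expectation argument (and the same holds for any n).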

  40. e.g.26 (Page 40) • Suppose that we flip a coin. • The sample space of flipping a coin is {H, T}, with P(H) = 1/2 and P(T) = 1/2. Suppose that I flip a coin repeatedly and we want to see a head. Do you think that we “expect” to see a head within TWO flips?

  41. e.g.27 (Page 40) • Suppose that we throw two dice. • In the sample space of throwing two dice, P(sum = 7) = 6/36 = 1/6. Suppose that I throw two dice repeatedly and we want to see the sum equal to 7. Do you think that we “expect” to see “sum = 7” within SIX throws?
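Both questions point at the same fact: the expected trial number of the first success is 1/p (2 flips for a head, 6 throws for sum = 7). A truncated-series check (a sketch, not part of the notes; the truncation length is an arbitrary choice that makes the tail negligible):

```python
# E(first success) = sum over i >= 1 of i * p * (1-p)^(i-1),
# truncated at a large number of terms since the tail vanishes.
def expected_first_success(p, terms=5000):
    return sum(i * p * (1 - p)**(i - 1) for i in range(1, terms + 1))
```

Numerically, `expected_first_success(0.5)` is 2 and `expected_first_success(1/6)` is 6, agreeing with the intuition on these two slides.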

  42. e.g.28 (Page 43) • Suppose that the trial process is “FFFS”, where F corresponds to a failure and S corresponds to a success. • Let X be a random variable denoting the trial number where the first success occurs. • Let p be the probability of success. • (a) What is X(FFFS)? X(FFFS) = 4. • (b) What is P(FFFS)? P(FFFS) = (1-p)^3 p.

  43. e.g.29 (Page 44) • We know the following facts. • (1) Theorem 4.6: for any real number x ≠ 1, sum from i = 1 to n of i x^i = [n x^(n+2) - (n+1) x^(n+1) + x] / (1-x)^2. • (2) For any real number -1 < x < 1, sum from i = 1 to infinity of i x^i = x / (1-x)^2. You don’t need to recite (2): if we have (1), we can derive (2). Why? Because n x^n is equal to 0 when n is very large. If n is very large and -1 < x < 1, then what is the value of n x^n? Consider lim as n goes to infinity of n x^n = lim of n / x^(-n) = (by L’Hospital’s rule) lim of 1 / [x^(-n) (ln x)(-1)] = lim of x^n / [(ln x)(-1)] = 0 (this is because lim of x^n is 0).

  44. e.g.29 Thus, starting from Theorem 4.6, sum from i = 1 to n of i x^i = [n x^(n+2) - (n+1) x^(n+1) + x] / (1-x)^2. Consider n x^(n+2) = n x^n · x^2 = 0 · x^2 = 0 if n is very large; similarly, (n+1) x^(n+1) = 0 if n is very large. Hence, for -1 < x < 1, sum from i = 1 to infinity of i x^i = x / (1-x)^2.
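The limit identity derived on these two slides can be checked numerically (a sketch, not part of the notes; truncation is harmless precisely because n x^n tends to 0):

```python
# Check that sum over i >= 1 of i * x^i equals x / (1-x)^2 for -1 < x < 1,
# using a long truncated sum in place of the infinite series.
def truncated_sum(x, terms=10_000):
    return sum(i * x**i for i in range(1, terms + 1))
```

For example, at x = 1/2 the closed form gives (1/2) / (1/2)^2 = 2, and the truncated sum agrees to high precision; negative x in (-1, 0) works too.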
