
Unit3: Statistical Inferences



Presentation Transcript


  1. Unit3: Statistical Inferences Wenyaw Chan Division of Biostatistics School of Public Health University of Texas - Health Science Center at Houston

  2. Estimation • Point Estimates • A point estimate of a parameter θ is a single number used as an estimate of the value of θ. • e.g. A natural estimate of the population mean µ is the sample mean X̄. • Interval Estimation • If a random interval I = (L, U) satisfies Pr(L < θ < U) = 1 - α, then the observed values of L and U for a given sample are called a 1 - α confidence interval estimate for θ. • Which one is more accurate? Which one is more precise?

  3. Estimation What to estimate? • B(n, p): proportion • Poisson(µ): mean • N(µ, σ²): mean and/or variance

  4. Estimation of the Mean of a Distribution • A point estimator of the population mean µ is the sample mean X̄. • The sampling distribution of X̄ is the distribution of values of x̄ over all possible samples of size n that could have been selected from the reference population.

  5. Estimation • An estimator of a parameter is an unbiased estimator if its expectation is equal to the parameter. • Note: Unbiasedness is not sufficient as the only criterion for choosing an estimator. • The unbiased estimator with the minimum variance (MVUE) is preferred. • If the population is normal, then X̄ is the MVUE of µ.

  6. Sample Mean • Standard error (of the mean) = standard deviation of the sample mean = σ/√n • The estimated standard error is s/√n, where s is the sample standard deviation.
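
The estimated standard error s/√n can be computed directly from a sample. The following is a minimal sketch using only the Python standard library; the function name and the data values are hypothetical and chosen purely for illustration.

```python
import math
import statistics


def estimated_standard_error(sample):
    """Return s / sqrt(n), the estimated standard error of the sample mean."""
    n = len(sample)
    s = statistics.stdev(sample)  # sample standard deviation (n - 1 divisor)
    return s / math.sqrt(n)


# Hypothetical measurements, for illustration only.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
x_bar = statistics.mean(sample)        # point estimate of the population mean
se = estimated_standard_error(sample)  # estimated standard error of x_bar
print(x_bar, se)
```

Note that statistics.stdev uses the n - 1 divisor, which matches the usual definition of the sample standard deviation s.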

  7. Central Limit Theorem • Let X1,…,Xn be a random sample from some population with mean µ and variance σ². Then, for large n, X̄ is approximately distributed as N(µ, σ²/n).

  8. Interval Estimation • Let X1,…,Xn be a random sample from a normal population N(µ, σ²). If σ² is known, a 95% confidence interval (C.I.) for µ is (x̄ - 1.96 σ/√n, x̄ + 1.96 σ/√n). Why? (next slide)
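
A known-variance confidence interval for the mean has the form x̄ ± z·σ/√n, where z is the upper α/2 point of the standard normal distribution (about 1.96 for 95%). The sketch below computes it with the standard library; the function name and the input values are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(x_bar, sigma, n, conf_level=0.95):
    """Two-sided C.I. for a normal mean when sigma is known."""
    alpha = 1.0 - conf_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for 95%
    half_width = z * sigma / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width


# Hypothetical inputs: sample mean 120, known sigma 10, n = 25.
lower, upper = z_confidence_interval(x_bar=120.0, sigma=10.0, n=25)
print(lower, upper)  # roughly (116.08, 123.92)
```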

  9. Interval Estimation

  10. Interval Estimation Interpretation of Confidence Interval • Over the collection of 95% confidence intervals that could be constructed from repeated random samples of size n, 95% of them will contain the parameter µ. • It is wrong to say: There is a 95% chance that the parameter µ will fall within a particular 95% confidence interval.
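
The repeated-sampling interpretation can be checked by simulation: draw many samples from a population with a known mean, build a 95% interval from each, and count how often the interval covers the true mean. This is a sketch under assumed values (normal population, known σ, a fixed random seed); none of the numbers come from the slides.

```python
import math
import random
from statistics import NormalDist, mean


def coverage(mu, sigma, n, n_repeats, conf_level=0.95, seed=0):
    """Fraction of known-sigma intervals that contain the true mean mu."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    half_width = z * sigma / math.sqrt(n)
    hits = 0
    for _ in range(n_repeats):
        x_bar = mean([rng.gauss(mu, sigma) for _ in range(n)])
        if x_bar - half_width < mu < x_bar + half_width:
            hits += 1
    return hits / n_repeats


rate = coverage(mu=120.0, sigma=10.0, n=25, n_repeats=2000)
print(rate)  # expected to be close to 0.95
```

Each individual interval either contains µ or it does not; the 95% describes the long-run fraction over repeated samples, which is what the loop estimates.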

  11. Interval Estimation • Note: • When σ and n are fixed, a 99% C.I. is wider than a 95% C.I. • If the width of the C.I. is specified, the sample size can be determined. • n ↑ ⇒ length ↓; σ ↑ ⇒ length ↑
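
For a known-σ interval the total width is 2·z·σ/√n, so a target width L gives n = (2·z·σ/L)², rounded up. The sketch below assumes this known-σ setting; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_width(sigma, width, conf_level=0.95):
    """Smallest n whose known-sigma C.I. is no wider than `width`."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - conf_level) / 2.0)
    return math.ceil((2.0 * z * sigma / width) ** 2)


# Hypothetical inputs: sigma = 10, desired total width = 4.
n_95 = sample_size_for_width(sigma=10.0, width=4.0, conf_level=0.95)
n_99 = sample_size_for_width(sigma=10.0, width=4.0, conf_level=0.99)
print(n_95, n_99)
```

The 99% interval needs a larger n for the same width, which mirrors the statement that a 99% C.I. is wider than a 95% C.I. when σ and n are fixed.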

  12. Hypothesis Testing • Null hypothesis (H0): the statement to be tested, usually reflecting the status quo. • Alternative hypothesis (H1): the logical complement of H0. • Note: the null hypothesis is analogous to the defendant in court. It is presumed to be true unless the data argue overwhelmingly to the contrary.

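  13. Hypothesis Testing • Four possible outcomes of the decision: H0 true and not rejected (correct); H0 true and rejected (Type I error); H1 true and H0 not rejected (Type II error); H1 true and H0 rejected (correct). • Notation: α = Pr(Type I error) = level of significance; β = Pr(Type II error); 1 - β = power = Pr(reject H0 | H1 is true)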

  14. Hypothesis Testing • Goal: to make α and β both small • Facts: α ↓ then β ↑; β ↓ then α ↑ • General Strategy: fix α, minimize β

  15. Testing for the Population Mean • When the sample is from a normal population: H0: µ = 120 vs H1: µ < 120 • The best test is based on X̄, which is called the test statistic. The "best test" means that the test has the highest power among all tests with a given type I error. Is there any bad test? Yes. • Rejection Region: the range of values of the test statistic for which H0 is rejected.

  16. One-tailed test • Our rejection region is •  Now,

  17. Result • To test H0: µ = µ0 vs H1: µ < µ0, based on a sample taken from a normal population with mean µ and unknown variance, the test statistic is t = (x̄ - µ0)/(s/√n). • Assume the level of significance is α. Then: • if t < tn-1,α, we reject H0. • if t ≥ tn-1,α, we do not reject H0.

  18. P-value • The minimum α-level at which we can reject H0 based on the sample. • The p-value can also be thought of as the probability of obtaining a test statistic as extreme as or more extreme than the actual test statistic obtained from the sample, given that the null hypothesis is true.

  19. Remarks • Two different approaches to determining statistical significance: • Critical value method • P-value method.

  20. One-tailed test • Testing H0: µ = µ0 vs H1: µ > µ0 when σ is unknown and the population is normal. Test Statistic: t = (x̄ - µ0)/(s/√n). Rejection Region: t > tn-1,1-α. p-value = 1 - Ft,n-1(t), where Ft,n-1( ) is the cdf of the t distribution with df = n-1. • Note: If σ is known, the s in the test statistic is replaced by σ, tn-1,1-α in the rejection region is replaced by z1-α, and Ft,n-1(t) is replaced by Φ(t).

  21. Testing For Two-Sided Alternative • Let X1,…,Xn be a random sample from the population N(µ, σ²), where σ² is unknown. • H0: µ = µ0 vs H1: µ ≠ µ0 • Test Statistic: t = (x̄ - µ0)/(s/√n) • Rejection Region: |t| > tn-1,1-α/2 • p-value = 2·Ft,n-1(t) if t ≤ 0; 2·[1 - Ft,n-1(t)] if t > 0 (see figures on next slide). • Warning: the exact p-value requires use of a computer.
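
The p-value rules can be sketched in code for the known-σ case, where the t cdf is replaced by the standard normal cdf Φ (the standard library has no t distribution, so the unknown-σ version would need a statistics package). The function name, the `alternative` labels, and the data values are hypothetical.

```python
import math
from statistics import NormalDist


def z_test(x_bar, mu0, sigma, n, alternative="two-sided"):
    """Return (z, p_value) for a one-sample z test with known sigma."""
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    cdf = NormalDist().cdf(z)  # Phi(z)
    if alternative == "less":          # H1: mu < mu0
        p_value = cdf
    elif alternative == "greater":     # H1: mu > mu0
        p_value = 1.0 - cdf
    elif alternative == "two-sided":   # H1: mu != mu0
        p_value = 2.0 * cdf if z <= 0 else 2.0 * (1.0 - cdf)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value


# Hypothetical inputs: x_bar = 116, mu0 = 120, sigma = 10, n = 25.
z, p_two = z_test(116.0, 120.0, 10.0, 25)
_, p_less = z_test(116.0, 120.0, 10.0, 25, alternative="less")
print(z, p_two, p_less)
```

With these inputs z = -2, so the two-sided p-value is twice the lower-tail probability, as in the t ≤ 0 branch of the rule.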

  22. Testing For Two-Sided Alternative • (Figures: p-value for x̄ > µ0; p-value for x̄ ≤ µ0)

  23. The Power of A Test • To test H0: µ = µ0 vs H1: µ < µ0 in a normal population with known variance σ², the power is Φ(zα + (µ0 - µ1)√n/σ), where µ1 is the true mean under H1. • Review: Power = Pr[rejecting H0 | H0 is false] • Factors Affecting the Power
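
For the lower-tailed test with known σ, H0 is rejected when x̄ < µ0 + zα·σ/√n, so under a true mean µ1 the power works out to Φ(zα + (µ0 - µ1)√n/σ), with zα the α-th percentile of the standard normal. The sketch below evaluates that expression; the function name and the numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def power_lower_tailed(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the level-alpha z test of H0: mu = mu0 vs H1: mu < mu0."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(alpha)  # alpha-th percentile (negative)
    return std_normal.cdf(z_alpha + (mu0 - mu1) * math.sqrt(n) / sigma)


# Hypothetical inputs: mu0 = 120, true mean 115, sigma = 10.
power_n25 = power_lower_tailed(120.0, 115.0, 10.0, 25)
power_n50 = power_lower_tailed(120.0, 115.0, 10.0, 50)
size = power_lower_tailed(120.0, 120.0, 10.0, 25)  # equals alpha when mu1 = mu0
print(power_n25, power_n50, size)
```

The expression shows the factors that drive power: it rises with a larger α, a larger n, a larger distance |µ0 - µ1|, and a smaller σ.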

  24. The Power of The 1-Sample T Test • To test H0: µ = µ0 vs H1: µ < µ0 in a normal population with unknown variance σ², the power, for true mean µ1 and true s.d. σ, is F(tn-1,.05), where F( ) is the c.d.f. of the non-central t distribution with df = n-1 and non-centrality (µ1 - µ0)√n/σ.

  25. Power Function For Two-Sided Alternative • To test H0: µ = µ0 vs H1: µ ≠ µ0 in a normal population with known variance σ², the power is Φ(-z1-α/2 + (µ0 - µ1)√n/σ) + Φ(-z1-α/2 + (µ1 - µ0)√n/σ), where µ1 is the true mean under the alternative.

  26. Case of Unknown Variance • For the same test in a population with unknown variance, the power is F(-tn-1,1-α/2) + 1 - F(tn-1,1-α/2), where F( ) is the c.d.f. of the non-central t distribution with df = n-1 and non-centrality (µ1 - µ0)√n/σ.

  27. Sample Size Determination • For example: H0: µ = µ0 vs H1: µ < µ0 • power: Φ(zα + (µ0 - µ1)√n/σ) = 1 - β • Hence, n = σ²(z1-α + z1-β)²/(µ0 - µ1)²

  28. Factors Affecting Sample Size 1. 2. 3. 4. • To test H0: µ = µ0 vs H1: µ ≠ µ0, where σ² is known, the sample size calculation is n = σ²(z1-α/2 + z1-β)²/(µ0 - µ1)².
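
Setting the known-σ power equal to a target 1 - β and solving for n gives the usual formula n = σ²(z1-α + z1-β)²/(µ0 - µ1)² for a one-sided test, with z1-α/2 in place of z1-α as the standard approximation for a two-sided test. The sketch below assumes those formulas; the function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(mu0, mu1, sigma, alpha=0.05, power=0.80, two_sided=False):
    """Approximate n for a one-sample z test with known sigma."""
    std_normal = NormalDist()
    tail = alpha / 2.0 if two_sided else alpha
    z_a = std_normal.inv_cdf(1.0 - tail)  # z_{1-alpha} or z_{1-alpha/2}
    z_b = std_normal.inv_cdf(power)       # z_{1-beta}
    return math.ceil(sigma ** 2 * (z_a + z_b) ** 2 / (mu0 - mu1) ** 2)


# Hypothetical inputs: detect a shift from 120 to 115 with sigma = 10.
n_one_sided = sample_size(120.0, 115.0, 10.0)
n_two_sided = sample_size(120.0, 115.0, 10.0, two_sided=True)
print(n_one_sided, n_two_sided)
```

The formula makes the dependencies visible: n grows with σ² and with the required power, grows as α shrinks, and falls as |µ0 - µ1| grows.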

  29. Relationship between Hypothesis Testing and Confidence Interval • To test H0: µ = µ0 vs H1: µ ≠ µ0, H0 is rejected with a two-sided level α test if and only if the two-sided 100(1 - α)% confidence interval for µ does not contain µ0.
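
This equivalence can be checked numerically. The sketch below uses the known-σ (z) versions of the test and the interval as a simplifying assumption, and compares the two decisions over a few hypothetical values of µ0; all names and numbers are illustrative.

```python
import math
from statistics import NormalDist


def z_test_rejects(x_bar, mu0, sigma, n, alpha=0.05):
    """True if the two-sided level-alpha z test rejects H0: mu = mu0."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    z = (x_bar - mu0) / (sigma / math.sqrt(n))
    return abs(z) > z_crit


def ci_excludes(x_bar, mu0, sigma, n, alpha=0.05):
    """True if the two-sided 100(1 - alpha)% C.I. does not contain mu0."""
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z_crit * sigma / math.sqrt(n)
    return not (x_bar - half_width <= mu0 <= x_bar + half_width)


# Hypothetical null values tested against x_bar = 120, sigma = 10, n = 25.
candidates = [110.0, 114.0, 116.5, 118.0, 120.0, 123.0, 124.5, 130.0]
agreement = [
    z_test_rejects(120.0, m, 10.0, 25) == ci_excludes(120.0, m, 10.0, 25)
    for m in candidates
]
print(all(agreement))
```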

  30. One Sample Test for the Variance of A Normal Population

  31. One Sample Test for A Proportion

  32. Exact Method • If p(hat) < p0, the p-value • If p(hat) ≥ p0, the p-value
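
The formulas for the two cases are not preserved in this transcript. A common convention, assumed here rather than taken from the slide, doubles the binomial tail probability on the side of the observed count and caps the result at 1. The function names and the example counts are hypothetical.

```python
from math import comb


def binom_pmf(k, n, p):
    """Pr(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p ** k * (1.0 - p) ** (n - k)


def exact_binomial_p_value(x, n, p0):
    """Two-sided exact p-value for H0: p = p0 (doubled-tail convention, assumed)."""
    if x / n < p0:
        tail = sum(binom_pmf(k, n, p0) for k in range(0, x + 1))  # Pr(X <= x)
    else:
        tail = sum(binom_pmf(k, n, p0) for k in range(x, n + 1))  # Pr(X >= x)
    return min(1.0, 2.0 * tail)


# Hypothetical data: 2 successes in 10 trials, testing p0 = 0.5.
p_low = exact_binomial_p_value(2, 10, 0.5)
p_mid = exact_binomial_p_value(5, 10, 0.5)
print(p_low, p_mid)
```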

  33. Power and Sample size

  34. One-Sample Inference for the Poisson Distribution • X ~ Poisson with mean µ • To test H0: µ = µ0 vs H1: µ ≠ µ0 at the α level of significance: • Obtain a two-sided 100(1 - α)% C.I. for µ, say (C1, C2). • If µ0 ∈ (C1, C2), we accept H0; otherwise we reject H0.

  35. One-Sample Inference for the Poisson Distribution • The p-value (for above two-sided test) • If observed X < µ0, then • If observed X > µ0, Where F(x |µ0) is the Poisson c.d.f with mean = µ0.
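
The two p-value expressions are not preserved in this transcript. One common convention, assumed here rather than taken from the slide, is p = min[2·F(x | µ0), 1] when x < µ0 and p = min[2·(1 - F(x - 1 | µ0)), 1] otherwise, with F the Poisson c.d.f. The function names and the example values are hypothetical.

```python
import math


def poisson_cdf(x, mu):
    """F(x | mu) = Pr(X <= x) for X ~ Poisson(mu); returns 0 for x < 0."""
    return sum(math.exp(-mu) * mu ** k / math.factorial(k) for k in range(0, x + 1))


def poisson_exact_p_value(x, mu0):
    """Two-sided exact p-value for H0: mu = mu0 (doubled-tail convention, assumed)."""
    if x < mu0:
        tail = poisson_cdf(x, mu0)              # Pr(X <= x)
    else:
        tail = 1.0 - poisson_cdf(x - 1, mu0)    # Pr(X >= x)
    return min(1.0, 2.0 * tail)


# Hypothetical data: observed counts 0 and 3 when mu0 = 3.
p_zero = poisson_exact_p_value(0, 3.0)
p_three = poisson_exact_p_value(3, 3.0)
print(p_zero, p_three)
```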

  36. Large-Sample Test for Poisson(for µ0≥ 10) • To test H0 :µ=µ0vs H1 : µ≠µ0 at α level of significance, • Test Statistic: • Rejection Region: • p-value:
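
The test statistic, rejection region, and p-value are not preserved in this transcript. A standard large-sample choice, assumed here, uses the normal approximation z = (x - µ0)/√µ0, rejects when |z| > z1-α/2, and reports p = 2·[1 - Φ(|z|)]; this is equivalent to comparing z² with a chi-square distribution with 1 df. The function name and the inputs are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_large_sample_test(x, mu0, alpha=0.05):
    """Return (z, p_value, reject) using the normal approximation (assumed form)."""
    std_normal = NormalDist()
    z = (x - mu0) / math.sqrt(mu0)  # Poisson variance equals its mean under H0
    p_value = 2.0 * (1.0 - std_normal.cdf(abs(z)))
    reject = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    return z, p_value, reject


# Hypothetical data: 20 events observed when 12 are expected under H0.
z_stat, p_val, reject = poisson_large_sample_test(20, 12.0)
print(z_stat, p_val, reject)
```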
