Introduction to Hypothesis Testing


Everyday phenomenon We ask ourselves questions every day. Here are a few examples: • Is it going to rain today? • Will the coin turn up Heads? • Will this movie be good? • Uh-oh! Is this exam going to be hard? • Can I jump off the top of Miriam Hall and survive? (Actually, you may not want to test this one, at least not every day.)

Everyday phenomenon • Clearly, we do make decisions based on our answers to these questions. • How do we make our decisions? • Usually based on apriori probabilities or aposteriori probabilities

Apriori and Aposteriori probabilities • An apriori probability is one that can be mathematically computed based on the possible outcomes. • An aposteriori probability is one that is estimated based on experience or an understanding of the factors that affect the phenomenon.

Apriori and Aposteriori probabilities • The probability of the outcome of a coin-tossing experiment can be determined prior to tossing the coin. Hence, apriori. • The probability of precipitation on a given day cannot be determined apriori. • The probability of rain on a particular day can only be determined after studying the various phenomena associated with rain. Hence, aposteriori.

Questions lead to hypotheses • The earlier questions can also be re-phrased as hypotheses. • We then decide to either accept the hypotheses or reject them and take action accordingly.

Questions lead to hypotheses • For example, • Is it going to rain today? can be rephrased as “It is not going to rain today” or “It is going to rain today” • etc.

We test such hypotheses everyday • Typically, we form hypotheses based on some observation, result, expectation, or suspicion. • Examples of such observations or expectations: • The suspect’s hat was found near the victim’s body. • The biochemistry of a new drug suggests that it must be more effective in reducing cancerous cells than the current drug. • This sample of herring looks somewhat bigger than the herring we are used to (the Atlantic herring, Clupea harengus) – they probably belong to another species. (I have to include at least one fish example!) • There are more mosquito larvae in this stream when compared to the one upstream.

The initial hypothesis • Our first hypothesis is conservative, where we deny our expectation or observation. • For the examples given, our hypotheses would be: • The suspect is not guilty. • This new drug is no more effective in reducing cancerous cells than the current drug. • Although this sample of herring looks somewhat bigger than Clupea harengus, they do not belong to another species. • The mosquito larval density in this stream is no different from that of the one upstream.

The hypothesis of no difference • Note the language in forming the hypotheses. We have used terms such as not, no more, do not, no different from. • That is, we are stating that the observation or result in question could easily have come from the “regular” or “original” population. In other words, it is no different from the values in the original or regular population.

The reference population is not the expected population • The regular or original population in our examples would be • The population of innocent people • The population of results when cancerous cells are treated with the current drug • The population of Clupea harengus • The mosquito larval population in the upstream stretch of the river.

The null hypothesis • Because we are conservative in this hypothesis and state that the observed result (e.g., sample mean) is no different from the regular population mean, • We term this initial hypothesis the null hypothesis, or hypothesis of no difference. • The null hypothesis is symbolized by H0.

The logic • OK. We have formed the null hypothesis. What next? • Well, we then test it. What does that entail? • This is done by computing the probability of obtaining evidence at least as extreme as what we observed, assuming the null hypothesis is true. (How we do this depends on the variable we are dealing with; to be discussed at length during the semester.) • If this probability is very low then we reject the null hypothesis; if the probability is high, we accept it.

The alternative hypothesis • Obviously, if we reject the null hypothesis, we need to accept some other hypothesis – an alternative hypothesis. • So we really begin by making two hypotheses – the null hypothesis and an alternative hypothesis (symbolized H1). • We test the null hypothesis. If we accept it, that’s the end of that story. If we reject it, we then are forced to accept the alternative hypothesis.

The alternative hypothesis • Note that only the null hypothesis (H0) is tested, and if H0 is rejected, the alternative hypothesis (H1) is automatically accepted (without further testing). • It is, therefore, very important that we frame the hypotheses carefully. • The two hypotheses must also complement each other (i.e., there should be no third alternative).

Example • The saola (Pseudoryx nghetinhensis) is a species of ruminant from Vietnam that was unknown to science until 1993. • (I’m making this part up): 17 specimens were found, of which 14 were males and 3 were females. • Because of this disparity the question might arise: is the sex-ratio in this species really 1:1?

Example…contd. • How do we test this? • The possibilities are the following: • The sex-ratio of this species is really 1:1 but it just happened by chance that this sample had a skewed ratio. • The sex-ratio of this species is biased in favor of males. • The two sexes are unequal in frequency.

Example…contd. • The question is: • Which of these scenarios is true? • And how do we find out?

Example…contd. • One way of resolving the issue of course, is to go out into the jungles of Vietnam and find more saolas. • An easier and cheaper way (if less fun) is to use probability theory and what we have learnt about known distributions (in this case, the binomial distribution). • Of course, we have to assume that the group of saolas was randomly sampled.

Example…contd. • It seems reasonable to assume the binomial distribution (remember the conditions?). • We now approach the problem thus: • Our null hypothesis is that the population sex-ratio is truly 1:1, and that any observed difference is just by chance.

Example…contd. • How do we test if the null hypothesis is true? • We determine the probability of obtaining a sex-ratio of 14:3 or worse (i.e., more biased) in a sample of 17 randomly chosen animals, when the population sex-ratio of the saolas is truly 1:1. • In other words, what is the probability of observing a female proportion of 3/17 = 0.1765 or less in a random sample of 17 animals if the true proportion is 0.5? • We can answer this question by computing binomial probabilities with n = 17 and p = q = 0.5.

Example…contd. • n = 17, p = q = 0.5 • $P(Y) = \binom{n}{Y} p^{Y} q^{n-Y}$ • P(Y ≤ 3) = 0.006,363,42
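The binomial computation on this slide can be reproduced in a few lines. The following is a minimal Python sketch, assuming Y is the number of females among the n = 17 animals; the function names binom_pmf and binom_cdf are illustrative and not part of the lecture.

```python
from math import comb

def binom_pmf(y, n, p):
    """P(Y = y) for a binomial variable: nCy * p^y * q^(n - y)."""
    q = 1.0 - p
    return comb(n, y) * p**y * q**(n - y)

def binom_cdf(y, n, p):
    """P(Y <= y): sum of the individual probabilities for 0, 1, ..., y."""
    return sum(binom_pmf(k, n, p) for k in range(y + 1))

n, p = 17, 0.5                  # sample size and female proportion under H0
p_lower = binom_cdf(3, n, p)    # probability of 3 or fewer females
print(f"P(Y <= 3) = {p_lower:.5f}")
```

Summing the four terms by hand gives (1 + 17 + 136 + 680) / 2^17 = 834/131072, roughly 0.00636, in line with the value shown on the slide.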

Example…contd. • Therefore, P(Y ≤ 3) = 0.006,363,42. • Typically, the probability of an outcome must be at least 5% (P ≥ 0.05) before we accept it as not unlikely. This means that if a result at least this extreme would turn up at least 5% of the time when the null hypothesis is correct, we do not want to take the risk of going wrong by rejecting it. • Does 5% look like it’s too small? Do you want stronger support for the null hypothesis before you accept it?

Why only 5%? • Think about the examples we saw earlier. • Before you condemn a person to jail or worse, don’t you want to be as certain of his guilt as possible before you reject the null hypothesis of not guilty? • Don’t you want to be as sure as possible that drug #2 is better than the current drug before you spend $$ on it? After all, the current drug has been known to work. • You don’t want to look like a chump, so don’t you want to be as certain as you can be before announcing to the world that you’ve found a new species of herring? • Let us assume that tons of $$ and time have already been spent on researching the rshI gene. So now, if you want people to shift focus from rshI to lasI, then you want to be very sure that lasI is indeed more involved in biofilm formation than rshI.

Example…contd. • By this standard, the probability of our result under the null hypothesis (getting a sample sex-ratio of 14:3 or more biased from a population with true ratio 1:1) would be considered very low. • So we can reasonably conclude that the sample is unlikely to have come from a population whose sex-ratio is 1:1.

Example…contd. • So we have decided to reject the null hypothesis that the sex-ratio of the saola population is 1:1. • Now, if the sample did not come from a population with sex-ratio 1:1, what kind of a population did it come from? That is, what is our alternative hypothesis? • The way we approach this question depends upon what we suspect. • For example, in the present problem, we suspect that the sex-ratio is biased in favor of males. In such cases, the alternative hypothesis can be just that: the sex-ratio is biased in favor of males.

Example…contd. • If our alternative hypothesis is that males outnumber females in this species, • then because we have rejected the null hypothesis of 1:1 ratio, • we have to accept the alternative hypothesis that males outnumber females. • Note that we did not test the alternative hypothesis, although this was the hypothesis of interest. • Instead, we tested the null hypothesis, and decided, based on the test, whether to accept it or accept the alternative hypothesis.

Example…contd. • The area of interest is in only one tail of the distribution. • [Figure: distribution with labels “Sex ratio of 1:1”, “Sex-ratio in favor of males”, “Sex-ratio in favor of females”, and P(Y ≤ 3).]

Example…contd. • If, on the other hand, we have reason to believe that the sex-ratio could be biased in either direction, • then because we have rejected the hypothesis of 1:1 ratio, • the alternative that we have to accept is that the two sexes are different in frequency: the males could be more than the females or vice-versa.

Example…contd. • The alternative scenario being considered here is that a sample of 3 females and 14 males is as likely as a sample of 3 males and 14 females. • That is, if Y is the number of females, P(Y ≤ 3) and P(Y ≥ 14) should be the same. • Therefore, the probability of either one happening is given by the sum of the two probabilities.

Example…contd. [P(females ≤ 3)] • n = 17, p = q = 0.5 • $P(Y) = \binom{n}{Y} p^{Y} q^{n-Y}$ • P(Y ≤ 3) = 0.006,363,42

Example…contd. [P(females ≥ 14)] • n = 17, p = q = 0.5 • $P(Y) = \binom{n}{Y} p^{Y} q^{n-Y}$ • P(Y ≥ 14) = 0.006,363,42

Example…contd. • Therefore, P(#females ≤ 3) = 0.006,363,42, P(#females ≥ 14) = 0.006,363,42

Example…contd. • Probability now in both tails of the distribution. • [Figure: distribution with labels “Sex ratio of 1:1”, “Sex-ratio in favor of males”, “Sex-ratio in favor of females”, P(#females ≤ 3), and P(#females ≥ 14).]

Example…contd. • Therefore, P(#females or #males ≤ 3) = 0.006,363,42 + 0.006,363,42 = 0.012,726,84 • Therefore, if our alternative hypothesis is simply a biased sex-ratio, we need to look at this probability, • and decide if it is large enough to accept the hypothesis of 1:1 ratio, or small enough to reject it and accept the hypothesis of a biased sex-ratio
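The two-tailed probability is simply the sum of the two tail areas. The following is a minimal Python sketch, assuming Y counts the females in the sample; the helper names lower_tail and upper_tail are illustrative and not part of the lecture.

```python
from math import comb

def pmf(y, n, p):
    """P(Y = y) for a binomial variable."""
    return comb(n, y) * p**y * (1 - p)**(n - y)

def lower_tail(y, n, p):
    """P(Y <= y)."""
    return sum(pmf(k, n, p) for k in range(0, y + 1))

def upper_tail(y, n, p):
    """P(Y >= y)."""
    return sum(pmf(k, n, p) for k in range(y, n + 1))

n, p = 17, 0.5
few_females = lower_tail(3, n, p)      # 3 or fewer females
many_females = upper_tail(14, n, p)    # 14 or more females, i.e. 3 or fewer males
two_tailed = few_females + many_females
print(f"{few_females:.5f} + {many_females:.5f} = {two_tailed:.5f}")
```

Because p = q = 0.5, the distribution is symmetric and the two tails are equal. With any other null proportion the two tails would differ and would have to be computed separately.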

The risk of accepting the wrong hypothesis • Of course, since we do not know the truth, we accept or reject the null hypothesis at some risk. • There are two kinds of risk.


The Hypotheses…contd. (The risks) • The probability of rejecting a true null hypothesis is termed the Type I Error (also “Level of Significance”). • It is symbolized by α and is typically sought to be minimized because it is considered the more serious error.

The Hypotheses…contd. (The risks) • The probability of accepting a false null hypothesis is termed the Type II Error, and is symbolized by β, • and is typically considered the less serious error. • Let us now formally answer the question of the sex-ratio in the saola.
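To make the two risks concrete, the sketch below computes them exactly for a one-tailed binomial test with n = 17. It is only an illustration: the rejection rule (reject when the number of females is at or below some cutoff) mirrors the saola example, but the alternative female proportion of 0.25 used for the Type II error is a hypothetical value chosen here, not a figure from the lecture.

```python
from math import comb

def cdf(y, n, p):
    """P(Y <= y) for a binomial variable with n trials and success probability p."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

n = 17
alpha = 0.05      # nominal level of significance
p_null = 0.5      # female proportion under H0 (1:1 sex-ratio)
p_alt = 0.25      # HYPOTHETICAL true female proportion, for illustration only

# Largest cutoff c for which P(Y <= c | H0) is still below alpha.
cutoff = max(c for c in range(n + 1) if cdf(c, n, p_null) < alpha)

type_1 = cdf(cutoff, n, p_null)       # P(reject H0 | H0 is true)
type_2 = 1 - cdf(cutoff, n, p_alt)    # P(accept H0 | true proportion is p_alt)
print(f"Reject H0 when Y <= {cutoff}")
print(f"Type I error  = {type_1:.4f}")
print(f"Type II error = {type_2:.4f}")
```

Since Y can only take whole-number values, the attainable Type I error here falls somewhat below the nominal 0.05. The Type II error depends entirely on which alternative proportion is assumed.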

The Hypotheses…contd. • The saola (Pseudoryx nghetinhensis) is a species of ruminant from Vietnam that was unknown to science until 1993. • Again, as before, 17 specimens were found, of which 14 were males and 3 were females. Two questions, each leading to a different alternative hypothesis: • Are there more males in the population than females? • The question could also have been: Is the sex-ratio different from 1:1? • Let us test these two one at a time.

The Hypotheses…contd. • H0: There are no more males than females in the saola (Pseudoryx nghetinhensis) population (i.e., the sex ratio is 1:1). • H1: There are more males than females in the saola. (Therefore, this is a one-tailed hypothesis.) • Level of significance: α = 0.05 and 0.01 (This is the level of risk one is willing to take in rejecting the null hypothesis when it is actually true; the so-called Type I error.)

The Hypotheses…contd. • Test statistic: T = P(Y ≤ 3), where Y (the number of females) follows the binomial distribution. • Decision criteria: If P(Y ≤ 3) ≥ α, then accept H0. If P(Y ≤ 3) < α, then reject H0. • Computation of the test statistic: P(Y ≤ 3) = 0.006,363,42

The Hypotheses…contd. • Decision: Since P(Y ≤ 3) < α (0.05 or 0.01), we reject H0. • Therefore, we accept H1. • Conclusion: From the sample evidence we conclude that there are more males than females in the saola (P < 0.01).

The Hypotheses…contd. • What if we had made the other alternative hypothesis? • H0: The sex-ratio in the saola (Pseudoryx nghetinhensis) is 1:1. • H1: The sex-ratio in the saola (Pseudoryx nghetinhensis) is not 1:1. (Therefore, this is a two-tailed hypothesis.) • Level of significance: α = 0.05 and 0.01

The Hypotheses…contd. • Test statistic: T = P(#males ≤ 3) + P(#females ≤ 3), where each count follows the binomial distribution. • Computation of the test statistic: P(#males ≤ 3) + P(#females ≤ 3) = 2(0.006,363,42) = 0.012,726,84

The Hypotheses…contd. • Decision: Since T is itself a probability here, we can compare it directly to α. • In this case, T < 0.05, but T > 0.01. • Therefore, we reject H0 at a risk ≤ 5% (confidence ≥ 95%). • But since T > 0.01, we cannot be > 99% confident that we are correct in rejecting the null hypothesis. • Therefore, depending on the level of confidence we are looking for, we either reject H0 or accept it.

The Hypotheses…contd. • Conclusion: From the sample evidence we conclude that we can be > 95% confident that the sex-ratio of the saola is significantly different from 1:1. • We cannot, however, be more than 99% confident that the sex-ratio is different from 1:1.
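The decision step in both tests reduces to comparing the computed probability with α. The following is a minimal Python sketch of that comparison, using the rule from the decision criteria (reject H0 when the probability is below α); the function name decide is illustrative and not part of the lecture.

```python
from math import comb

def lower_tail(y, n, p):
    """P(Y <= y) for a binomial variable."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(y + 1))

def decide(prob, alpha):
    """Apply the decision criterion: reject H0 only if prob < alpha."""
    return "reject H0" if prob < alpha else "accept H0"

n, p = 17, 0.5
one_tailed = lower_tail(3, n, p)   # H1: more males than females
two_tailed = 2 * one_tailed        # H1: sex-ratio is not 1:1 (tails equal since p = 0.5)

for label, prob in [("one-tailed", one_tailed), ("two-tailed", two_tailed)]:
    for alpha in (0.05, 0.01):
        print(f"{label}: P = {prob:.4f}, alpha = {alpha}: {decide(prob, alpha)}")
```

The same sample leads to rejection at both levels under the one-tailed alternative, but only at the 5% level under the two-tailed alternative, which is one reason the hypotheses need to be framed carefully.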

Example 2 • You are a commercial fish farmer who has grown rainbow trout (Oncorhynchus mykiss) for many years. • You have spent a good deal of money and time in training your staff to produce good yields year after year.

Example 2 • By the end of eight months, your fish average 15 inches, with a standard deviation of 4 inches. • Along comes a smart-aleck who claims to know more about growing rainbow trout than you do.
