Hypothesis Testing in Statistics: Proportions, 1-Tailed Z-Test
Let's look at another example in statistics where we need to decide whether to reject or fail to reject the null hypothesis. This time the data deals with proportions rather than the mean (or average) of the population, and we will use a 1-tailed test rather than the 2-tailed test demonstrated in our last blog article.
The example I have chosen for this exercise in hypothesis testing is stated as follows:
The National Football League believes the chance of the Baltimore Ravens winning the next coin toss is greater than 50%. In a random sample of 200 coin tosses, the Ravens won 118 times. Is there enough evidence to suggest the Baltimore Ravens are cheaters?
As in the previous hypothesis testing examples I've shown you, the first step is to state both the null hypothesis and the alternative hypothesis. The null hypothesis in this example is what is currently accepted regarding coin tosses: the probability of winning a coin toss is 50%. Since we must assume that the Baltimore Ravens are innocent until proven guilty, we accept this commonly held belief as our starting point. This 50% represents a proportion rather than an average or mean of the population. Therefore, the null hypothesis is written as:

H0: P = 0.50
Here, we're saying that we believe the probability of winning a coin toss is P = 0.50. The example, however, suggests that the probability of the Ravens winning the coin toss is greater than 50% due to some form of cheating. So, the alternative hypothesis can be stated as:

HA: P > 0.50
Here, we're saying that instead of a fair coin yielding a 50% probability of winning a coin toss, the probability has somehow been shifted in favor of the Baltimore Ravens, so that their probability of winning is greater than 50%.
Any time you have a greater-than or less-than condition in the alternative hypothesis, a 1-tailed test is conducted rather than a 2-tailed test. Now, let's look at the normal distribution curve for this test, centered on the hypothesized probability of 50%, with a level of confidence of 99%. This implies a level of significance of α = 0.01, as shown in the graph below:
We are concerned only with the tail of this normal distribution that lies above the 50% value located in the middle of the curve. The red shaded region, which has an area of 0.01, is therefore on the right (the right tail) and represents our rejection region. We do not include the left tail because we are not concerned with probabilities less than 50%. This is why this example is a 1-tailed test. We could have chosen another value for α, such as α = 0.05, but I wanted to be more confident in our decision regarding the Baltimore Ravens, with a level of confidence of 99% rather than 95%, and so chose α = 0.01 and thus C = 1 - 0.01 = 0.99, or 99%.
Now that we have completed steps 1 and 2 of our hypothesis testing (in step 1 we stated our null and alternative hypotheses, and in step 2 we chose our level of confidence), the next step is to find the critical value. Critical values are the values that separate the tail(s) from the rest of the curve, and can be either t-values or z-values. When working with proportions in statistics, we use the z-value if the sample size is large enough. What do we mean by "large enough"? We can use the following conditions to make that determination:

N × P > 5 and N × (1 - P) > 5
If both of the above conditions are met, where P represents the hypothesized proportion of 50% and N represents the sample size, then we will use the z-test; otherwise, the normal approximation is unreliable and the z-test should not be used (an exact binomial test is the usual alternative).
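As a rough sketch, the rule of thumb above can be written as a small Python check. The function name `normal_approximation_ok` and the `threshold` parameter are my own illustrative choices, not part of the original example:

```python
def normal_approximation_ok(p: float, n: int, threshold: float = 5.0) -> bool:
    """Return True when both n*p and n*(1 - p) exceed the threshold.

    p is the hypothesized proportion and n is the sample size.
    The threshold of 5 is the rule of thumb used in this article.
    """
    return n * p > threshold and n * (1 - p) > threshold


# Values from the coin toss example: P = 0.50, N = 200
print(normal_approximation_ok(p=0.50, n=200))  # True
```

A much smaller sample, such as 8 tosses, would fail the check because 8 × 0.50 = 4 is not greater than 5.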
With P = 0.50 and N = 200, the first condition is met: N × P = 100 > 5. For the second condition, N × (1 - P) = (1 - 0.50) × 200 = 100 > 5. Both conditions are met, so the sample size is large enough, and we will use the z-test with the z-distribution (the standard normal distribution). The critical value, or c.v., is the point on the normal distribution where the red region is separated from the rest of the curve. How do we determine the c.v.? We use a z-table, such as the one found in any CRC mathematics reference book. Let's look at the z-table we're going to use for this problem:
Looking at our z-table, recall that we have a 1-tailed test with α = 0.01, so we look at the column headed "Area in one tail..." and follow it down until we locate the value 0.01, then look across to the "z-score" column to find our value of z, which is 2.326. This is the c.v. for this problem: the point on our normal distribution where the red region is separated from the rest of the curve.
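If you don't have a z-table handy, the same lookup can be sketched with Python's standard library. This is only an alternative to the table, assuming Python 3.8 or later for `statistics.NormalDist`; the variable names are my own:

```python
from statistics import NormalDist

alpha = 0.01  # level of significance for the right tail

# For a right-tailed test, the critical value leaves an area of alpha
# in the upper tail, i.e. an area of 1 - alpha to its left.
critical_value = NormalDist().inv_cdf(1 - alpha)

print(round(critical_value, 3))  # 2.326
```

Changing `alpha` to 0.05 would give the critical value for a 95% level of confidence instead.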
Our next step is to choose our test statistic. Since we are using the z-distribution to determine the c.v., we will perform a z-test for this problem. When proportions are involved rather than mean values, the formula for computing Z looks like this:

Z = (P-bar - P) / sqrt(P × (1 - P) / n)
In the above formula, P-bar is the sample proportion (not the population proportion). In the problem we were told that out of 200 coin tosses, the Baltimore Ravens won 118, so P-bar = 118/200 = 0.59. The population proportion under the null hypothesis is P = 0.50. With the sample size n = 200, we can plug all of these values into the equation for Z to arrive at:

Z = (0.59 - 0.50) / sqrt(0.50 × 0.50 / 200) = 0.09 / 0.0354 ≈ 2.55
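The same calculation can be sketched in Python as a quick check on the arithmetic. The function name `one_proportion_z` and its parameter names are hypothetical, chosen only for this illustration:

```python
from math import sqrt


def one_proportion_z(successes: int, n: int, p0: float) -> float:
    """Z statistic for a one-sample proportion test.

    successes / n is the sample proportion (P-bar) and p0 is the
    proportion assumed under the null hypothesis.
    """
    p_bar = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_bar - p0) / standard_error


# 118 wins out of 200 tosses, tested against P = 0.50
z = one_proportion_z(successes=118, n=200, p0=0.50)
print(round(z, 2))  # 2.55
```

Note that the standard error uses the hypothesized proportion (0.50), not the sample proportion (0.59), because the test is carried out assuming the null hypothesis is true.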
Since Z = 2.55 > c.v. = 2.326, Z falls in the red region of the graph, which is the rejection region. We can now perform the last step in testing hypotheses, which is to draw a conclusion.
The conclusion we can draw in this example is that we reject the null hypothesis, H0, that P = 0.50 in favor of the alternative hypothesis, HA, that P > 0.50: the probability of winning the coin toss is skewed in favor of the Ravens instead of conforming to what would be expected from chance.
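Putting the steps together, the whole decision can be sketched as one small function. This is a minimal illustration under the same assumptions as the worked example (right-tailed test, large sample); the name `right_tailed_proportion_test` is my own:

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_proportion_test(successes, n, p0, alpha):
    """Return (z, critical_value, reject) for H0: P = p0 vs HA: P > p0."""
    p_bar = successes / n
    z = (p_bar - p0) / sqrt(p0 * (1 - p0) / n)
    critical_value = NormalDist().inv_cdf(1 - alpha)
    # Reject H0 only when Z lands in the right-tail rejection region.
    return z, critical_value, z > critical_value


z, cv, reject = right_tailed_proportion_test(118, 200, 0.50, 0.01)
print(f"Z = {z:.2f}, c.v. = {cv:.3f}, reject H0: {reject}")
# Z = 2.55, c.v. = 2.326, reject H0: True
```

Had the Ravens won only 110 of the 200 tosses, Z would be about 1.41, which is below 2.326, and we would fail to reject the null hypothesis.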
Therefore, to answer the question asked in this problem, "Is there enough evidence to suggest the Baltimore Ravens are cheaters?", we can say at the 1% level of significance (a 99% level of confidence) that there is indeed enough evidence to suggest that the Baltimore Ravens are cheating in the coin toss.