State null hypothesis: NH: E(Y) = 255.
State alternative hypothesis: AH: E(Y) > 255.
Calculate test statistic: t-statistic = (mY − 255)/(sY/√n) = (278.6033 − 255)/(53.8656/√30) = 2.40.
Set significance level: 5%.
Look up p-value: The area to the right of the t-statistic (2.40) for the t-distribution with 29 degrees of freedom is less than 0.025 but greater than 0.01 (since, from Table C.1, the 97.5th percentile of this t-distribution is 2.045 and the 99th percentile is 2.462); thus, the upper-tail p-value is between 0.01 and 0.025.
Make decision: Since the p-value is between 0.01 and 0.025, it must be less than the significance level (0.05), so we reject the null hypothesis in favor of the alternative.
Interpret in the context of the situation: The 30 sample sale prices suggest that a population mean of $255,000 seems implausible; the sample data favor a value greater than this (at a significance level of 5%). A short code check of these calculations follows this list.
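To see where the tabulated bounds come from, here is a minimal Python sketch (using scipy, and assuming the sample summaries used in the calculation above) that computes the t-statistic and the exact upper-tail p-value instead of bracketing it from Table C.1.

```python
import numpy as np
from scipy import stats

# Sample summaries for the 30 home sale prices (as used above; treat as illustrative inputs)
n = 30
sample_mean = 278.6033   # thousands of dollars
sample_sd = 53.8656
null_mean = 255          # hypothesized population mean under NH

# t-statistic for testing NH: E(Y) = 255 versus AH: E(Y) > 255
se = sample_sd / np.sqrt(n)
t_stat = (sample_mean - null_mean) / se      # about 2.40

# Exact upper-tail p-value from the t-distribution with n - 1 = 29 degrees of freedom
p_value = stats.t.sf(t_stat, df=n - 1)       # about 0.012, i.e., between 0.01 and 0.025

print(round(t_stat, 2), round(p_value, 4))
```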
To conduct an upper-tail hypothesis test for a univariate mean using the p-value method:
State null hypothesis: NH: E(Y) = c, where c is the hypothesized value of the population mean.
State alternative hypothesis: AH: E(Y) > c.
Calculate test statistic: t-statistic = (mY − c)/(sY/√n), where mY is the sample mean, sY is the sample standard deviation, and n is the sample size. (In most cases, this should be a positive number.)
Set significance level: 5%.
Look up p-value in Table C.1: Find which percentiles of the t-distribution with n − 1 degrees of freedom are either side of the t-statistic; the p-value is between the corresponding upper-tail areas.
Make decision: If the p-value is less than the significance level, then reject the null hypothesis in favor of the alternative. Otherwise, fail to reject the null hypothesis.
Interpret in the context of the situation: If we have rejected the null hypothesis in favor of the alternative, then the sample data suggest that a population mean of c seems implausible; the sample data favor a value greater than this (at a significance level of 5%). If we have failed to reject the null hypothesis, then we have insufficient evidence to conclude that the population mean is greater than c. A code sketch of this procedure follows this list.
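As a compact summary of these steps, here is one way the upper-tail p-value method could be wrapped in a small Python function. This is a sketch only; the function and argument names are illustrative, not from the text.

```python
import numpy as np
from scipy import stats

def upper_tail_t_test(y, null_mean, sig_level=0.05):
    """Upper-tail t-test of NH: E(Y) = null_mean versus AH: E(Y) > null_mean.

    y is a 1-D array of sample values; returns the t-statistic, the
    upper-tail p-value, and the resulting decision.
    """
    y = np.asarray(y, dtype=float)
    n = len(y)
    t_stat = (y.mean() - null_mean) / (y.std(ddof=1) / np.sqrt(n))
    p_value = stats.t.sf(t_stat, df=n - 1)   # area to the right of t_stat
    decision = "reject NH" if p_value < sig_level else "fail to reject NH"
    return t_stat, p_value, decision
```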
For a lower-tail test, everything is the same except:
Alternative hypothesis: AH: E(Y) < c.
In most cases, the t-statistic should be a negative number.
p-value: Find which percentiles of the t-distribution with n − 1 degrees of freedom are either side of the t-statistic; the p-value is between the corresponding lower-tail areas.
For a two-tail test, everything is the same except:
Alternative hypothesis: AH: E(Y) ≠ c.
The t-statistic could be positive or negative.
p-value: Find which percentiles of the t-distribution with n − 1 degrees of freedom are either side of the t-statistic; p-value/2 is between the corresponding upper-tail areas (if the t-statistic is positive) or lower-tail areas (if the t-statistic is negative). Thus, the p-value is between the corresponding tail areas multiplied by two. The code sketch below illustrates all three cases.
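The only change across the three alternatives is which tail area (or doubled tail area) becomes the p-value. A brief sketch in Python, using only the t-statistic (2.40) and degrees of freedom (29) quoted above:

```python
from scipy import stats

def t_test_p_value(t_stat, df, alternative="greater"):
    """p-value for a one-sample t-test, for each of the three alternatives."""
    if alternative == "greater":          # AH: E(Y) > c, upper-tail area
        return stats.t.sf(t_stat, df)
    if alternative == "less":             # AH: E(Y) < c, lower-tail area
        return stats.t.cdf(t_stat, df)
    # AH: E(Y) != c, two-tail: double the tail area beyond |t|
    return 2 * stats.t.sf(abs(t_stat), df)

# For the home prices example (t-statistic 2.40, 29 degrees of freedom):
print(t_test_p_value(2.40, 29, "greater"))   # about 0.012
print(t_test_p_value(2.40, 29, "two-tail"))  # about 0.023
```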
Figure 1.6 shows why the rejection region method and the p-value method will always lead to the same decision (since if the t-statistic is in the rejection region, then the p-value must be smaller than the significance level, and vice versa). Why do we need two methods if they will always lead to the same decision? Well, when learning about hypothesis tests and becoming comfortable with their logic, many people find the rejection region method a little easier to apply. However, when we start to rely on statistical software for conducting hypothesis tests in later chapters of the book, we will find the p-value method easier to use. At this stage, when doing hypothesis test calculations by hand, it is helpful to use both the rejection region method and the p-value method to reinforce learning of the general concepts. This also provides a useful way to check our calculations, since if we reach a different conclusion with each method, we will know that we have made a mistake.
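The equivalence is also easy to check numerically: for an upper-tail test, the t-statistic exceeds the critical value exactly when the tail area beyond it drops below the significance level. A small illustration (the t-statistics here are arbitrary values chosen for the check, not data):

```python
from scipy import stats

df, sig_level = 29, 0.05
critical_value = stats.t.ppf(1 - sig_level, df)   # 95th percentile, about 1.699

for t_stat in (0.5, 1.5, 2.0, 2.40, 3.0):
    p_value = stats.t.sf(t_stat, df)
    # The rejection region decision and the p-value decision always agree
    assert (t_stat > critical_value) == (p_value < sig_level)
```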
State null hypothesis: NH: E(Y) = 255.
State alternative hypothesis: AH: E(Y) ≠ 255.
Calculate test statistic: t-statistic = (278.6033 − 255)/(53.8656/√30) = 2.40.
Set significance level: 5%.
Look up t-table: critical value: The 97.5th percentile of the t-distribution with 29 degrees of freedom is 2.045 (from Table C.1); the rejection region is therefore any t-statistic greater than 2.045 or less than −2.045 (we need the 97.5th percentile in this case because this is a two-tail test, so we need half the significance level in each tail). p-value: The area to the right of the t-statistic (2.40) for the t-distribution with 29 degrees of freedom is less than 0.025 but greater than 0.01 (since, from Table C.1, the 97.5th percentile of this t-distribution is 2.045 and the 99th percentile is 2.462); thus, the upper-tail area is between 0.01 and 0.025 and the two-tail p-value is twice as big as this, that is, between 0.02 and 0.05.
Make decision: Since the t-statistic of 2.40 falls in the rejection region, we reject the null hypothesis in favor of the alternative. Since the p-value is between 0.02 and 0.05, it must be less than the significance level (0.05), so we reject the null hypothesis in favor of the alternative.
Interpret in the context of the situation: The 30 sample sale prices suggest that a population mean of $255,000 seems implausible; the sample data favor a value different from this (at a significance level of 5%). These two-tail calculations are sketched in code below.
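Under the same assumed sample summaries as before, the two-tail calculation might look like this in Python; both the rejection region and the p-value decisions come out the same way. (If the 30 raw sale prices were available, scipy's ttest_1samp would give the same two-tail answer directly.)

```python
import numpy as np
from scipy import stats

n, sample_mean, sample_sd = 30, 278.6033, 53.8656   # summaries quoted above
null_mean, sig_level = 255, 0.05

t_stat = (sample_mean - null_mean) / (sample_sd / np.sqrt(n))   # about 2.40

# Rejection region method: compare |t| with the 97.5th percentile of t(29)
critical_value = stats.t.ppf(1 - sig_level / 2, df=n - 1)       # 2.045
reject_by_region = abs(t_stat) > critical_value

# p-value method: double the upper-tail area beyond |t|
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 1)                 # about 0.023
reject_by_p = p_value < sig_level

print(reject_by_region, reject_by_p)   # both True, so the two methods agree
```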
1.6.3 Hypothesis test errors
When we introduced significance levels in Section 1.6.1, we saw that the person conducting the hypothesis test gets to choose this value. We now explore this notion a little more fully.
Whenever we conduct a hypothesis test, either we reject the null hypothesis in favor of the alternative or we do not reject the null hypothesis. Not rejecting a null hypothesis is not quite the same as accepting it. All we can say in such a situation is that we do not have enough evidence to reject the null; recall the legal analogy where defendants are not found innocent but rather are found not guilty. Anyway, regardless of the precise terminology we use, we hope to reject the null when it really is false and to fail to reject it when it really is true. Anything else will result in a hypothesis test error. There are two types of error that can occur, as illustrated in the following table:

Hypothesis test errors

                        Decision: fail to reject NH    Decision: reject NH in favor of AH
Reality: NH is true     correct decision               type 1 error
Reality: NH is false    type 2 error                   correct decision
A type 1 error can occur if we reject the null hypothesis when it is really true; the probability of this happening is precisely the significance level. If we set the significance level lower, then we lessen the chance of a type 1 error occurring. Unfortunately, lowering the significance level increases the chance of a type 2 error occurring, which happens when we fail to reject the null hypothesis even though we should have rejected it because it was false. Thus, we need to make a trade-off and set the significance level low enough that type 1 errors have a low chance of happening, but not so low that we greatly increase the chance of a type 2 error happening. The default value of 5% tends to work reasonably well in many applications at balancing both goals. However, other factors also affect the chance of a type 2 error happening for a specific significance level. For example, the chance of a type 2 error tends to decrease as the sample size increases.
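One way to see the sample-size effect is by simulation. The sketch below repeatedly draws samples from a population whose true mean lies above the hypothesized value and estimates how often the upper-tail test fails to reject, that is, the type 2 error rate, for several sample sizes. The specific numbers (true mean 280, hypothesized 255, standard deviation 54) are made up for illustration, loosely echoing the home prices example.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

def type2_error_rate(n, true_mean=280.0, null_mean=255.0, sd=54.0,
                     sig_level=0.05, reps=10_000):
    """Estimated probability of failing to reject NH: E(Y) = null_mean
    (upper-tail test) when the population mean is actually true_mean."""
    critical_value = stats.t.ppf(1 - sig_level, df=n - 1)
    samples = rng.normal(true_mean, sd, size=(reps, n))
    t_stats = (samples.mean(axis=1) - null_mean) / (samples.std(axis=1, ddof=1) / np.sqrt(n))
    return np.mean(t_stats <= critical_value)   # proportion of "fail to reject"

for n in (10, 30, 100):
    print(n, round(type2_error_rate(n), 2))     # the type 2 error rate shrinks as n grows
```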
1.7 Random Errors and Prediction
So far, we have focused on estimating a univariate population mean, E(Y), and quantifying our uncertainty about the estimate via confidence intervals or hypothesis tests. In this section, we consider a different problem, that of prediction. In particular, rather than estimating the mean of a population of Y-values based on a sample, Y1, Y2, ..., Yn, consider predicting an individual Y-value picked at random from the population.
Intuitively, this sounds like a more difficult problem. Imagine that rather than just estimating the mean sale price of single-family homes in the housing market based on our sample of 30 homes, we have to predict the sale price of an individual single-family home that has just come onto the market. Presumably, we will be less certain about our prediction than we were about our estimate of the population mean (since it seems likely that we could be further from the truth with our prediction than when we estimated the mean; for example, there is a chance that the new home could be a real bargain or totally overpriced). Statistically speaking, Figure 1.5 illustrates this extra uncertainty that arises with prediction: the population distribution of data values, Y (more relevant to prediction problems), is much more variable than the sampling distribution of sample means, mY (more relevant to mean estimation problems).
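A quick simulation makes the Figure 1.5 point concrete: individual values drawn from a population spread out far more than averages of samples of 30 drawn from the same population. The population parameters below are invented purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
pop_mean, pop_sd, n = 280.0, 54.0, 30    # made-up population, sample size 30

individual_values = rng.normal(pop_mean, pop_sd, size=10_000)
sample_means = rng.normal(pop_mean, pop_sd, size=(10_000, n)).mean(axis=1)

# The sample means cluster much more tightly: their standard deviation is
# roughly pop_sd / sqrt(n), about 9.9, compared with about 54 for individual values.
print(round(individual_values.std(), 1), round(sample_means.std(), 1))
```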
We can tackle prediction problems with a similar process to that of using a confidence interval to tackle estimating a population mean. In particular, we can calculate a prediction interval of the form point estimate ± uncertainty, or (point estimate − uncertainty, point estimate + uncertainty). The point estimate is the same one that we used for estimating the population mean, that is, the observed sample mean, mY. This is because mY is an unbiased estimate of the population mean, E(Y), and we assume that the individual value we are predicting is a member of this population. As discussed in the preceding paragraph, however, the uncertainty is larger for prediction intervals than for confidence intervals. To see how much larger, we need to return to the notion of a model that we introduced in Section 1.2.