Single sample tests of means and proportions

From OPOSSEM

Revision as of 09:15, 9 July 2011 by Chris Lawrence (talk | contribs) (Introduction)


Objectives

Introduction

Sometimes social scientists want to be able to test a hypothesis comparing the mean of a sample to some "ideal" mean. For example, a local government might want to see if, on average, drivers are obeying a 25 mph (40 km/h) speed limit in a school zone, or the State Department of Motor Vehicles might want to ensure that the average wait time for driver's license applicants at various offices is less than 15 minutes.

In this module we introduce statistical <a _fcknotitle="true" href="Hypothesis testing">hypothesis testing</a> by extending the concept of <a href="Confidence interval">confidence intervals</a> to these problems.

The Z test

<math>Z = \frac {\bar{y}-\mu_0}{\sigma_{\bar{y}}}</math>, where <math>\mu_0 </math> is the test value and <math>\sigma_{\bar{y}} = \frac{\sigma}{\sqrt{n}}</math> is the standard error of the mean.

Example: The university administration is concerned that students are spending an inordinate amount of time at the end of the semester waiting in line at the campus bookstore to sell back their books, and sets a benchmark wait time of no more than 6 minutes on average. They decide to monitor the situation by measuring the average wait time on one day during the buy-back period, and find that 212 students waited in line for an average (mean) of 8 minutes. The population standard deviation is assumed to be 3 minutes. Assuming this day is typical, can the administration be confident, at the 95% confidence level, that the benchmark is exceeded throughout the buy-back period?
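The arithmetic for this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it treats the question as a one-tailed test (is the true mean greater than the 6-minute benchmark?), and the variable names are our own rather than part of the example.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the bookstore example
n = 212        # students observed
y_bar = 8.0    # sample mean wait time (minutes)
mu_0 = 6.0     # benchmark, used as the test value (minutes)
sigma = 3.0    # assumed population standard deviation (minutes)

std_error = sigma / sqrt(n)        # standard error of the mean
z = (y_bar - mu_0) / std_error     # Z test statistic

# Assumption: one-tailed test at the 95% confidence level
z_critical = NormalDist().inv_cdf(0.95)

print(f"Z = {z:.2f}, critical value = {z_critical:.3f}")
print("Reject H0" if z > z_critical else "Fail to reject H0")
```

Under these assumptions the statistic is roughly 9.7, far beyond the one-tailed critical value of about 1.645, so the administration would conclude that the benchmark is being exceeded.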

The t test

Just as in the section on <a _fcknotitle="true" href="Confidence intervals">confidence intervals</a>, we encounter the problem that it is unrealistic in most circumstances to believe that we would know the population standard deviation without also knowing the population mean; if we knew both quantities already, there would be no need to engage in statistical inference to begin with! But, as William Gosset found, we cannot simply use the sample standard deviation as a stand-in for the population standard deviation. So again we use his Student's t distribution, this time to compute a test statistic:

<math>t = \frac {\bar{y}-\mu_0}{s_{\bar{y}}}</math>, where <math>\mu_0 </math> is the test value, <math>s_{\bar{y}} = \frac{s}{\sqrt{n}}</math> is the estimated standard error of the mean, and <math>\text{df} = n-1</math>.


Example: A South Carolina state legislator believes that increasing the speed limit on I-95 in the state will increase compliance with the law. She argues that increasing the speed limit to 75 miles per hour (120 km/h) will mean that the average driver will now be obeying the posted speed limit. The state department of transportation conducts a study of traffic on the highway and finds that, based on monitoring the speeds of 421 cars, the mean speed of the cars is 74 mph with a standard deviation of 4 mph. Assuming this is a random sample of I-95 traffic, can we be confident (at the 95% confidence level) that the true average speed of the "population" of all traffic is 75 miles per hour or less?
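A rough Python sketch of this calculation follows. It makes two assumptions that are not part of the example: the test is treated as lower-tailed (is the true mean below 75 mph?), and because the Python standard library has no t distribution, the critical value is approximated by the standard normal quantile, which is very close to the t quantile when df = 420. A t table or a statistical package gives the exact value.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from the I-95 example
n = 421        # cars observed
y_bar = 74.0   # sample mean speed (mph)
mu_0 = 75.0    # proposed speed limit, used as the test value (mph)
s = 4.0        # sample standard deviation (mph)

std_error = s / sqrt(n)            # estimated standard error of the mean
t = (y_bar - mu_0) / std_error     # t test statistic
df = n - 1                         # degrees of freedom

# Assumption: lower-tailed test at the 95% level, with the normal quantile
# standing in for the t critical value (reasonable only because df is large)
approx_critical = NormalDist().inv_cdf(0.05)

print(f"t = {t:.2f}, df = {df}, approximate critical value = {approx_critical:.3f}")
print("Reject H0" if t < approx_critical else "Fail to reject H0")
```

With these numbers the statistic is about -5.13, well below the approximate critical value of -1.645, which supports the conclusion that the true average speed is below 75 mph.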


The Z test for proportions

In situations involving nominal or ordinal data, as discussed in the section on <a _fcknotitle="true" href="Confidence intervals">confidence intervals</a>, we cannot use the mean as a statistic. Instead, we can use a proportions test to arrive at similar conclusions. For example, a researcher interested in the effectiveness of a state university's affirmative action program may want to test whether or not the proportion of non-white students admitted to the university is consistent with the state's percentage of non-white high school graduates.

In the Z test of proportions, the null hypothesis is that the true population proportion <math>\pi</math> is equal to the test value <math>\pi_0</math>, or:

<math>H_0: \pi = \pi_0</math>

To test this hypothesis we use the Z test formula for proportions, specified below:

<math>Z = \frac {p-\pi_0}{\sigma_\pi}</math>, where <math>\pi_0 </math> is the test value.

As we learned in the section on confidence intervals, the standard error of the proportion is given by

<math>\sigma_\pi = \frac{\sqrt{p(1-p)}}{\sqrt{n}}</math>, where <math>p</math> is the sample proportion and <math>n</math> is the sample size.
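The two formulas above can be combined into a small Python sketch. The function name `proportion_z_test` and the data (120 non-white students among 400 admitted, tested against a state figure of 35%) are hypothetical and chosen only for illustration. The two-tailed p-value is likewise our own choice, and the standard error follows the formula given in this section, which is based on the sample proportion.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(successes, n, pi_0):
    """Return (p, Z, two-tailed p-value) for H0: pi = pi_0.

    Hypothetical helper; uses the sample-proportion standard error
    sqrt(p(1-p)) / sqrt(n) as given in this section.
    """
    p = successes / n                           # sample proportion
    std_error = sqrt(p * (1 - p)) / sqrt(n)     # standard error of the proportion
    z = (p - pi_0) / std_error                  # Z test statistic
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p, z, p_value

# Hypothetical data: 120 of 400 admitted students, test value 0.35
p, z, p_value = proportion_z_test(successes=120, n=400, pi_0=0.35)
print(f"p = {p:.3f}, Z = {z:.2f}, two-tailed p-value = {p_value:.3f}")
```

For these made-up numbers the statistic is about -2.18, so at the 95% confidence level the hypothetical researcher would reject the hypothesis that the admitted proportion matches the state figure.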

Most statistical software packages also include a set of nonparametric binomial tests, which are more accurate than the Z test for small samples. These procedures are discussed in the Computing Notes section below.
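To see why the sample size matters, the following Python sketch compares an exact binomial tail probability with the Z approximation. It is not the procedure any particular package implements: the function `binomial_upper_tail`, the one-tailed framing, and the data (7 successes in 10 trials, test value 0.5) are all assumptions made for illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_upper_tail(successes, n, pi_0):
    """Exact P(X >= successes) when X ~ Binomial(n, pi_0). Hypothetical helper."""
    return sum(
        comb(n, k) * pi_0**k * (1 - pi_0)**(n - k)
        for k in range(successes, n + 1)
    )

# Hypothetical small sample: 7 successes in 10 trials, test value 0.5
successes, n, pi_0 = 7, 10, 0.5

exact_p = binomial_upper_tail(successes, n, pi_0)

# One-tailed Z approximation using the standard error formula from this section
p_hat = successes / n
z = (p_hat - pi_0) / (sqrt(p_hat * (1 - p_hat)) / sqrt(n))
approx_p = 1 - NormalDist().cdf(z)

print(f"exact p-value = {exact_p:.4f}, Z approximation = {approx_p:.4f}")
```

In this made-up case the exact probability is about 0.17 while the Z approximation gives roughly 0.08, a gap large enough to matter in practice with only ten observations.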

The relationship between the t and Z tests and confidence intervals

The single-sample Z and t tests can be thought of as a simple transformation of the construction of <a _fcknotitle="true" href="Confidence intervals">confidence intervals</a>, although the purpose of each approach is somewhat different. When constructing confidence intervals we are interested in finding the values of the population parameter (mean or proportion) that are likely to have led to the sample statistic that we observe; the single-sample tests, on the other hand, are concerned with the question of whether it is plausible for a researcher to believe that a specific value of the population parameter led to the sample statistic we have found. In other words, while confidence intervals are largely exploratory in nature, the single-sample Z and t tests are true hypothesis tests; we are evaluating whether or not a specific hypothesis is true, rather than finding a range of values for which we might find the hypothesis to be true.
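The connection can be checked numerically. The sketch below reuses the bookstore figures (212 students, mean of 8 minutes, assumed population standard deviation of 3 minutes) and assumes a two-tailed test at the 95% level. The test values in the loop are arbitrary and the function names are our own. Under those assumptions, a test value is rejected exactly when it falls outside the 95% confidence interval.

```python
from math import sqrt
from statistics import NormalDist

# Bookstore example figures
n, y_bar, sigma = 212, 8.0, 3.0
std_error = sigma / sqrt(n)

# Two-tailed critical value for the 95% confidence level
z_crit = NormalDist().inv_cdf(0.975)

# 95% confidence interval for the population mean
lower = y_bar - z_crit * std_error
upper = y_bar + z_crit * std_error

def rejects(mu_0):
    """True if the two-tailed Z test rejects H0: mu = mu_0."""
    return abs((y_bar - mu_0) / std_error) > z_crit

def outside_interval(mu_0):
    """True if mu_0 lies outside the 95% confidence interval."""
    return mu_0 < lower or mu_0 > upper

print(f"95% CI: ({lower:.3f}, {upper:.3f})")
for mu_0 in (6.0, 7.7, 8.3, 9.0):   # arbitrary test values
    print(mu_0, rejects(mu_0), outside_interval(mu_0))
```

The two columns of True/False values agree for every test value, which is the sense in which the test is a transformation of the interval.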

Computing Notes

Most statistical software packages, including SPSS, Stata, and R, when asked for a single-sample test will automatically do a t test, even for large samples; as discussed above, the Z test is really only useful when the population standard deviation is known, so it is of limited use in practical applications. If for some reason you want to calculate the Z test, you will have to do it by hand, using the formula presented in this chapter.

The single-sample test procedures in these programs are also capable of providing <a _fcknotitle="true" href="Confidence intervals">confidence intervals</a> for the population mean, again using the t distribution.

SPSS

A single-sample test of means can be obtained in SPSS using the menus via Analyze → Compare Means → One-Sample T Test.

SPSS does not implement the single-sample test of proportions (at least in the menus). However, the more exact binomial proportions test, discussed above, is available in SPSS using Analyze → Nonparametric Tests → Binomial.


Stata

Single-sample tests of means are available in Stata using the ttest procedure.

Single-sample tests of proportions are available using prtest (for large sample sizes) and bitest (for small sample sizes).


R

Single-sample tests of means are available in R using the t.test function.

Single-sample tests of proportions are available using prop.test (for large sample sizes) and binom.test (for small sample sizes).



Conclusion

References

<references group=""></references>

Discussion questions

Problems

Glossary

