# Chapter 11.docx

Western University

Statistical Sciences

Statistical Sciences 2244A/B

Jennifer Waugh

Spring

11.1 Overview
● We want to introduce a procedure for testing the hypothesis that three or more population
means are equal, so a typical null hypothesis will be H0: μ1 = μ2 = μ3, etc.
○ the alternative hypothesis then would be that at least one mean is different from
the others
● In section 8-3 we already presented procedures for testing the hypothesis that two
population means are equal but the methods of that section do not apply when three or
more means are involved
● Analysis of variance (ANOVA) is a method of testing the equality of three or more
population means by analyzing sample variances
● Why can’t we just test two samples at a time; why do we need a new procedure when we
can test for equality of two means by using methods presented in chapter 8?
● For example, if we have four populations and wish to compare their means, why can’t we
compare two means at a time (resulting in 6 total tests and 6 null hypotheses)?
● This is because in general, as we increase the number of individual tests of significance,
we increase the likelihood of finding a difference by chance alone (instead of a real
difference in means)
● The risk of a type I error is far too high; the method of ANOVA helps us avoid that
particular pitfall by using one test for equality of several means
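
The growth in type I error risk can be sketched numerically. The snippet below is an illustration only: it assumes a significance level of 0.05 for each test and treats the six pairwise tests as independent, which is a simplification (pairwise tests that share samples are not truly independent). The function name `familywise_error_rate` is made up for this sketch.

```python
from math import comb

def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one type I error across num_tests tests.

    Assumes every test uses the same alpha and that the tests are
    independent (a simplifying assumption, not a property of real
    pairwise comparisons).
    """
    return 1 - (1 - alpha) ** num_tests

k = 4                    # number of populations being compared
num_pairs = comb(k, 2)   # number of two-at-a-time comparisons
rate = familywise_error_rate(num_pairs)

print(num_pairs, round(rate, 3))  # prints: 6 0.265
```

Under these assumptions, six separate tests at the 0.05 level carry roughly a 26% chance of at least one false rejection, compared with 5% for a single test.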
F Distribution
● The ANOVA methods of this chapter require the F distribution; it has the following
important properties:
○ the F distribution is not symmetric; it is skewed to the right
○ the values of F can be 0 or positive, but not negative
○ there is a different F distribution for each pair of degrees of freedom for the
numerator and denominator
● ANOVA is based on a comparison of two different estimates of the variance common to
the different populations
○ the estimates are the variance between samples and the variance within samples
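
The listed properties of the F distribution can be checked with a small simulation. This is a rough sketch, not part of the textbook's method: it relies on the fact that the ratio of two independent sample variances drawn from the same normal population follows an F distribution with (n1 - 1, n2 - 1) degrees of freedom. The sample sizes, seed, and number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

def variance_ratio(n1: int, n2: int) -> float:
    """Ratio of two sample variances from the same normal population."""
    a = [random.gauss(0, 1) for _ in range(n1)]
    b = [random.gauss(0, 1) for _ in range(n2)]
    return statistics.variance(a) / statistics.variance(b)

# Sample sizes 4 and 17 give degrees of freedom (3, 16).
ratios = [variance_ratio(4, 17) for _ in range(5000)]

print("smallest ratio:", min(ratios))            # never negative
print("mean:  ", statistics.mean(ratios))
print("median:", statistics.median(ratios))      # mean above median suggests right skew
```

Changing the two sample sizes changes the degrees of freedom and therefore the shape of the simulated distribution.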
● The term one-way is used because the sample data are separated into groups according to
one characteristic or factor
● In section 11-3 we will introduce two-way analysis of variance, which allows us to
compare populations separated into categories using two characteristics (or factors)
● We suggest that you begin section 11-2 by focusing on this key concept: we are using a
procedure to test a claim that three or more means are equal; although the details of the
calculations are complicated, the procedure will be easy because it is based on a P-value
○ if the P-value is small, reject equality of means; otherwise fail to reject equality of
means
11.2 One-Way ANOVA
● In this section we consider tests of hypotheses that three or more population means are all
equal, as in H0: μ1 = μ2 = μ3, etc. We recommend the following approach:
● Understand that a small P-value leads to rejection of the null hypothesis of equal means
● Develop an understanding of the underlying rationale by studying the example in this section
● Become acquainted with the nature of the sum of squares (SS) and mean square (MS)
values and their role in determining the F test statistic
● The method we use is called one-way analysis of variance because we use a single
property or characteristic for categorizing the populations; the characteristic is sometimes
referred to as a treatment or factor
● A treatment or factor is a property or characteristic that allows us to distinguish the
different populations from one another
Rationale
● The method of analysis of variance is based on this fundamental concept: with the
assumption that the populations all have the same variance σ², we estimate the common
value of σ² using two different approaches
● A large F test statistic is evidence against equal population means
○ a small F test statistic means that the P-value is large, thus the sample means are
all close and so we fail to reject the null hypothesis of equal means
○ a large F test statistic means that the P-value is small, thus at least one sample
mean is very different so we reject the null hypothesis of equal means
● The two approaches for estimating the common value of σ² are as follows:
○ the variance between samples (variance due to treatment) is an estimate of the
common population variance σ² that is based on the variation among the sample
means
○ the variance within samples (variance due to error) is an estimate of the common
population variance σ² based on the sample variances
● Test Statistic for One-Way ANOVA:
○ F = (variance between samples) / (variance within samples)
● The estimate of variance in the denominator depends on the sample variances and is not
affected by differences among the sample means
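
The rationale can be illustrated with a small sketch on made-up data. It assumes equal sample sizes and uses the shortcut of n times the variance of the sample means for the numerator and the mean of the sample variances for the denominator. The data values and the function name `f_parts` are invented for illustration; shifting one sample by a constant moves its mean without changing any sample variance.

```python
from statistics import mean, variance

def f_parts(samples):
    """Return (between, within, F) for samples that all have the same size."""
    n = len(samples[0])
    between = n * variance([mean(s) for s in samples])  # based on sample means
    within = mean([variance(s) for s in samples])       # based on sample variances
    return between, within, between / within

# Made-up data: three samples whose means are close together.
close = [[4, 5, 6, 7, 8], [5, 6, 7, 8, 9], [4, 6, 6, 7, 7]]
# Same data, but every value in the first sample is raised by 10.
shifted = [[x + 10 for x in close[0]], close[1], close[2]]

b1, w1, f1 = f_parts(close)
b2, w2, f2 = f_parts(shifted)

print("close means:  within =", round(w1, 3), " F =", round(f1, 3))
print("shifted mean: within =", round(w2, 3), " F =", round(f2, 3))
```

The within-samples estimate is the same in both runs, while F rises from about 0.77 to 70, which is the pattern that leads to rejecting equal means.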
Calculations with Equal Sample Sizes n
● If the data sets all have the same sample size, the required calculations aren’t
overwhelmingly difficult
● First find the variance between samples by evaluating n·s_x̄², where s_x̄² is the variance
of the sample means and n is the size of each of the samples
○ consider the sample means to be an ordinary set of values and calculate the
variance
● Next estimate the variance within samples by calculating s_p², which is the pooled variance
obtained by finding the mean of the sample variances
● The critical value of F is found by assuming a right tailed test because large F values
correspond to significant differences among means
● With k samples each having n values, the numbers of degrees of freedom are as follows:
○ numerator degrees of freedom = k-1
○ denominator degrees of freedom = k(n-1)
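
The equal-sample-size steps can be sketched as follows. This is a minimal illustration on made-up data, not the textbook's worked example; the function name `one_way_anova_equal_n` and the dictionary keys are hypothetical. It stops at the F test statistic and the degrees of freedom, since the P-value or critical value would come from a table or statistical software.

```python
from statistics import mean, variance

def one_way_anova_equal_n(samples):
    """F test statistic and degrees of freedom when all samples share size n."""
    k = len(samples)       # number of samples
    n = len(samples[0])    # common sample size
    if any(len(s) != n for s in samples):
        raise ValueError("this shortcut assumes every sample has the same size")

    sample_means = [mean(s) for s in samples]
    between = n * variance(sample_means)            # n times variance of the means
    pooled = mean([variance(s) for s in samples])   # mean of the sample variances

    return {
        "between": between,
        "within": pooled,
        "F": between / pooled,
        "df_num": k - 1,
        "df_den": k * (n - 1),
    }

# Made-up data: k = 3 samples with n = 4 values each.
data = [[7, 8, 9, 10], [5, 6, 7, 8], [9, 10, 11, 12]]
result = one_way_anova_equal_n(data)
print(result)
```

For this data the sketch gives F of about 9.6 with 2 numerator and 9 denominator degrees of freedom.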
● The variance within a sample isn’t affected when we add a constant to every value in that
sample; if we do this to the first sample, the change in the F test statistic and the P-value
is attributable only to the change in x̄1
● This illustrates that the F test statistic is very sensitive to sample means, even though it is
obtained through two different estimates of the common population variance
Calculations with Unequal Sample Sizes
● While the calculations for cases with equal sample sizes are reasonable, they become
c