# Chapter 6.docx

Western University

Statistical Sciences

Statistical Sciences 2244A/B

Jennifer Waugh

Spring

Description

6.1 Overview
● In this chapter, we begin working with the true core of inferential statistics as we use
sample data to make inferences about population parameters
● Two major applications of inferential statistics involve the use of sample data to:
○ estimate the value of a population parameter
○ test some claim (or hypothesis) about a population
● This chapter includes important inferential methods involving population proportions,
population means, and population variances (or standard deviations); we begin with
proportions because:
○ We all see proportions frequently in the media and in articles in professional
journals
○ Proportions are generally easier to work with than means or variances so we can
better focus on the important principles of estimating parameters and testing
hypotheses when those principles are first introduced
6.2 Estimating a Population Proportion
● The main objective of this section is: given a sample proportion, estimate the value of the
population proportion p
● This section will only consider cases in which the normal distribution can be used to
approximate the sampling distribution of sample proportions
○ In a binomial procedure with n trials and probability p (with q = 1 - p), if np >= 5 and nq >= 5,
then the binomial random variable has a probability distribution that can be
approximated by a normal distribution
● Requirements for Using a Normal Distribution as an Approximation to a Binomial
Distribution
○ The sample is a simple random sample
○ The conditions for the binomial distribution are satisfied: a fixed number of trials,
independent trials, two categories of outcomes, and probabilities that remain constant
○ Normal distribution can be used to approximate the distribution of sample
proportions because np >= 5 and nq >= 5 are both satisfied
○ The methods of this section cannot be used with sampling methods other than
simple random sampling, such as stratified, cluster, and convenience sampling
● Once the requirements above have been met, we can use the sample as a basis for
estimating the value of the population proportion p
● Notation for Proportions:
○ p = proportion of successes in the entire population
○ p(hat) = x/n = sample proportion of x successes in a sample of size n
○ q(hat) = 1 - p(hat) = sample proportion of failures in a sample of size n
● If we want to estimate a population proportion with a single value, the best estimate is
p(hat); because p(hat) consists of a single value, it is called a point estimate
○ a point estimate is a single value or point used to approximate a population
parameter
○ the sample proportion p(hat) is the best point estimate of the population proportion p;
it is unbiased and the most consistent of the estimators that could be used
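The notation and the np >= 5, nq >= 5 check above can be sketched in a few lines of Python. This is only an illustration: the counts (120 successes in 400 trials) are made up, the function names are hypothetical, and because p is unknown in practice the check substitutes p(hat) for p, which is an assumption of this sketch.

```python
def point_estimate(x: int, n: int) -> tuple[float, float]:
    """Return (p_hat, q_hat) for x successes in a sample of size n."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n          # sample proportion of successes
    q_hat = 1 - p_hat      # sample proportion of failures
    return p_hat, q_hat


def normal_approx_ok(x: int, n: int, threshold: int = 5) -> bool:
    """Check n*p_hat >= 5 and n*q_hat >= 5 (p_hat stands in for the unknown p).

    n * p_hat equals x and n * q_hat equals n - x, so the counts are
    compared directly to avoid floating-point rounding.
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    return x >= threshold and (n - x) >= threshold


# Hypothetical sample: 120 successes out of 400 trials.
p_hat, q_hat = point_estimate(120, 400)
print(f"p_hat = {p_hat:.3f}, q_hat = {q_hat:.3f}")
print("normal approximation reasonable:", normal_approx_ok(120, 400))
```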
Why Do We Need Confidence Intervals?
● Despite knowing that our best point estimate of the population proportion p is p(hat), we
do not know how good it is, just that it’s the best
● Because the point estimate has the flaw of not revealing how good it is, an estimate called
a confidence interval or interval estimate is used
● A confidence interval is a range of values used to estimate the true value of a population
parameter; it is associated with a confidence level such as 0.95
● The confidence level gives us the success rate of the procedure used to construct the
confidence interval; often expressed as the probability or area of 1 - α
○ it is the proportion of times that the confidence interval actually does contain the
population parameter, assuming the estimation process is repeated a large number
of times
○ α is the complement of the confidence level; for 0.95 confidence level, α=0.05
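The "success rate of the procedure" idea can be checked with a small simulation. The sketch below is not from the textbook: it assumes a made-up true proportion p = 0.3 and samples of size n = 500, builds the usual interval p(hat) ± z·sqrt(p(hat)q(hat)/n) for each simulated sample, and counts how often the interval contains p. The function name and all parameter values are hypothetical.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage_rate(p, n, confidence, repetitions, seed=0):
    """Fraction of simulated confidence intervals that contain the true p."""
    rng = random.Random(seed)                 # fixed seed for repeatability
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for this level
    hits = 0
    for _ in range(repetitions):
        # Simulate n independent trials, each a success with probability p.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin < p < p_hat + margin:
            hits += 1
    return hits / repetitions


# Assumed values for illustration only.
rate = coverage_rate(p=0.3, n=500, confidence=0.95, repetitions=2000)
print(f"fraction of intervals containing p: {rate:.3f}")
```

The printed fraction should land near 0.95. It describes how often the procedure succeeds over many repetitions, not any single interval.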
Critical Values
● The methods of this section include the use of a standard z score that can be used to
distinguish between sample statistics that are likely to occur and those that are unlikely
● Such a z score is called a critical value and is based on the following observation
○ under certain conditions, sampling distribution of sample proportions can be
approximated by a normal distribution
○ sample proportions have a relatively small chance of falling in one of the tails
○ by the rule of complements, there is a probability of 1-α that a sample proportion will
fall within the inner region between the two tails
○ the z score separating the right tail region is commonly denoted as z(α/2) (z with
subscript α/2) and is referred to as a critical value
● The critical value z(α/2) is the positive z value at the vertical boundary separating an
area of α/2 in the right tail of the standard normal distribution; the value -z(α/2) is at the
vertical boundary separating an area of α/2 in the left tail
● A critical value is the number on the borderline separating sample statistics that are likely
to occur from those that are unlikely to occur
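A critical value can be computed instead of read from a table. The sketch below uses `statistics.NormalDist` from the Python standard library; the function name `critical_value` is hypothetical. The critical value is the z score with an area of α/2 to its right, which is the same as an area of 1 - α/2 to its left.

```python
from statistics import NormalDist


def critical_value(confidence: float) -> float:
    """Positive z score leaving an area of alpha/2 in the right tail."""
    if not 0 < confidence < 1:
        raise ValueError("confidence must be strictly between 0 and 1")
    alpha = 1 - confidence
    # inv_cdf takes the cumulative (left-side) area, so pass 1 - alpha/2.
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"confidence {level:.2f}: critical value {critical_value(level):.3f}")
```

For the 0.95 confidence level this gives approximately 1.96, the value found in standard normal tables.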
Margin of Error
● The difference between sample proportion and the population proportion can be thought
of as an error
● When data from a simple random sample are used to estimate a population proportion p,
the margin of error, denoted by E, is the maximum likely (with probability of 1-α)
difference between the observed sample proportion p(hat) and the true value of the
population proportion p
○ Also called the maximum error of the estimate and can be found by multiplying
the critical value and the standard deviation of sample proportions
● There is a probability of 1-α that a sample proportion will be in error (different from the
population proportion p) by no more than E, and there is a probability of α that the
sample proportion will be in error by more than E
● Confidence Interval for the population proportion p:
○ p(hat) - E < p < p(hat) + E
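As a worked sketch of the margin of error and the interval above: E is the critical value times the standard deviation of sample proportions, which is estimated here by sqrt(p(hat)q(hat)/n). The counts (120 successes in 400 trials) are made up for illustration and are not the textbook's example; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_confidence_interval(x, n, confidence=0.95):
    """Return (lower, upper, margin) for the population proportion p."""
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n
    q_hat = 1 - p_hat
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # critical value
    margin = z * sqrt(p_hat * q_hat / n)      # margin of error E
    return p_hat - margin, p_hat + margin, margin


# Hypothetical sample: 120 successes out of 400 trials.
low, high, margin = proportion_confidence_interval(120, 400)
print(f"E = {margin:.3f}")
print(f"{low:.3f} < p < {high:.3f}")
```

With these assumed counts, p(hat) = 0.3 and E is about 0.045, so the 95% interval runs from roughly 0.255 to 0.345.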
Interpreting a Confidence Interval
● Correct example of an interpretation:
○ We are 95% confident that the interval from 0.226 to 0.298 actually does contain the true value of p
■ This means that if we were to conduct many different experiments and
construct the corresponding confidence intervals, 95% of them would
actually contain the value of the population proportion p
■ note the level of 95% refers to the success rate of the process being used to
estimate the proportion, it does not refer to the population proportion itself
● Incorrect example of an interpretation:
○ There is a 95% chance that the true value of p will fall between 0.226 and 0.298
● Seriously read the paragraph on pages 264-265; it explains perfectly how to interpret a
confidence interval correctly
● Read the rationale for Margin of Error as well, great explanation
Determining Sample Size
● Suppose we want to collect sample data with the objective of estimating some population
proportion; how do we know how many sample items must be obtained?
● If we begin with the expression for the margin of error E, then solve for n, yo
