CH.6: Samples and Populations
• Population (or universe): a set of individuals who share at least one
characteristic, such as citizenship, membership, ethnicity, etc.
• Researchers usually study only a sample, a smaller number of individuals drawn
from the population, and generalize from the sample to the population from
which it was taken.
• Random Sampling: every member of the population has an equal chance of being
drawn into the sample.
• Sampling error: the mean of a sample is symbolized as $\bar{X}$ and the mean of
a population as μ; the standard deviation of a sample is symbolized as s and the
standard deviation of its population as σ. A sample mean will almost never be
exactly the same as the population mean, and a sample standard deviation will
almost never be the same as the population standard deviation. This difference
between sample and population, the sampling error, results regardless of how
carefully the sample is drawn.
• Sampling distribution of means: a frequency distribution of a large number of
random sample means drawn from the same population, with the means arranged
from high to low.
• Characteristics of a sampling distribution of means:
1) The sampling distribution of means approximates a normal curve. This is true
of all sampling distributions of means, regardless of the shape of the
distribution of raw scores in the population from which the means are drawn, as
long as the sample size is reasonably large (over 30). If the raw data are
normally distributed to begin with, the distribution of sample means is normal
regardless of sample size.
2) The mean of a sampling distribution of means (the mean of means) is equal to
the true population mean (the two values are interchangeable).
3) The standard deviation of a sampling distribution of means is smaller than
the standard deviation of the population.
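The three characteristics above can be checked with a small simulation. This is only a sketch: the skewed population, the sample size of 40, and the number of repeated samples are all made-up values chosen for illustration.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: 10,000 exponential scores (mean near 10).
population = [random.expovariate(1 / 10) for _ in range(10_000)]
mu = statistics.fmean(population)
sigma = statistics.pstdev(population)

# Draw many random samples of size N and record each sample mean.
N = 40
sample_means = [
    statistics.fmean(random.sample(population, N)) for _ in range(2_000)
]

mean_of_means = statistics.fmean(sample_means)
sd_of_means = statistics.pstdev(sample_means)

print(f"population mean      = {mu:.2f}")
print(f"mean of sample means = {mean_of_means:.2f}")
print(f"population SD        = {sigma:.2f}")
print(f"SD of sample means   = {sd_of_means:.2f}")
print(f"sigma / sqrt(N)      = {sigma / N ** 0.5:.2f}")
```

Even though the raw scores are far from normal, the mean of the sample means should land close to the population mean, and the spread of the sample means should be much smaller than the spread of the raw scores.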
• The sampling distribution of means as a normal curve: to locate a sample mean
in the sampling distribution in terms of the number of standard deviations it falls
from the centre, we obtain the z score:
$z = \dfrac{\bar{X} - \mu}{\sigma_{\bar{X}}}$
$\bar{X}$ = sample mean in the distribution
$\mu$ = mean of means
$\sigma_{\bar{X}}$ = standard deviation of the sampling distribution of means
• Standard error of the mean (σ = population standard deviation, N = sample size):
$\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}}$
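As a minimal sketch, the standard error (the population standard deviation divided by the square root of N) and the z score of a sample mean can be computed as follows. The function names and the numbers in the example are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard error of the mean: sigma divided by the square root of N."""
    return sigma / math.sqrt(n)


def z_for_sample_mean(sample_mean, mu, sigma, n):
    """Number of standard errors the sample mean falls from the mean of means."""
    return (sample_mean - mu) / standard_error(sigma, n)


# Hypothetical values: population mean 100, population SD 15, sample of 36.
se = standard_error(15, 36)
z = z_for_sample_mean(105, 100, 15, 36)
print(se, z)
```

Here the standard error is 15 / 6 = 2.5, so a sample mean of 105 lies 5 / 2.5 = 2 standard errors above the mean of means.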
Confidence intervals: a range of mean values within which the population mean is
likely to fall, used when μ is unknown.
• Find the standard error of the mean first.
68% CI = $\bar{X} \pm \sigma_{\bar{X}}$
$\bar{X}$ = sample mean; $\sigma_{\bar{X}}$ = standard error of the sample mean
- By convention we use a wider, less precise confidence interval that has a
better probability of yielding a true estimate of the population mean,
hence the 95% confidence interval. To find the z score for, say, 95%
confidence, take the remaining 5%, divide it by 2 (.025), and look for that
value in column C; the corresponding z score is 1.96.
95% CI = $\bar{X} \pm 1.96\,\sigma_{\bar{X}}$
99% CI = $\bar{X} \pm 2.58\,\sigma_{\bar{X}}$
$\bar{X}$ = sample mean; $\sigma_{\bar{X}}$ = standard error of the sample mean
- The precision of an estimate is determined by the margin of error, obtained
by multiplying the standard error by the z score for the desired level
of confidence. The larger the margin of error, the wider the confidence interval.
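The table lookup described above (split the leftover area between the two tails, then find the matching z score) can be imitated in code. This sketch assumes the population standard deviation is known and uses Python's standard-library `NormalDist` in place of the printed table; the function name and example numbers are hypothetical.

```python
import math
from statistics import NormalDist


def z_confidence_interval(sample_mean, sigma, n, confidence=0.95):
    """Return (lower, upper, z) for a z-based confidence interval."""
    alpha = 1 - confidence                    # area left over, e.g. .05
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z score cutting off alpha/2 per tail
    margin = z * sigma / math.sqrt(n)         # margin of error = z * standard error
    return sample_mean - margin, sample_mean + margin, z


# Hypothetical sample: mean 100, known sigma 15, N = 36.
low95, high95, z95 = z_confidence_interval(100, 15, 36, 0.95)
low99, high99, z99 = z_confidence_interval(100, 15, 36, 0.99)
print(round(z95, 2), round(low95, 2), round(high95, 2))
print(round(z99, 2), round(low99, 2), round(high99, 2))
```

The 99% interval comes out wider than the 95% interval because its z score, and therefore its margin of error, is larger.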
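• The t distribution: estimating the population variance and standard deviation from sample data
$\hat{\sigma}^2 = \dfrac{\sum (X - \bar{X})^2}{N - 1}$ and $\hat{\sigma} = \sqrt{\dfrac{\sum (X - \bar{X})^2}{N - 1}}$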
- Two purposes for calculating the variance and standard deviation: 1) to describe
the extent of variability within a sample of cases or respondents; 2) to make an
inference, or generalize, about the extent of variability within the larger population of
cases from which the sample was drawn.
- Estimated standard error of the mean: $s_{\bar{X}} = \dfrac{s}{\sqrt{N - 1}}$
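- One more problem arises when we estimate the standard error of the mean:
the sampling distribution of means is no longer quite normal if we do not know the
population standard deviation, and the t distribution is used instead. The departure
from normality matters most for small samples (roughly df below 30).
$t = \dfrac{\bar{X} - \mu}{s_{\bar{X}}}$, with df = N - 1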
- Degrees of freedom (df) indicate how closely the t distribution approximates the
normal distribution in a particular instance: the larger the sample size, the greater
the degrees of freedom and the closer the t distribution gets to the normal distribution.
- α = 1 - level of confidence; e.g. for a 95% CI, α = .05, and for a 99% CI, α = .01
(α is the area in the tails of the t distribution).
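- Confidence interval: CI = $\bar{X} \pm t\,s_{\bar{X}}$ when σ is unknown, OR CI = $\bar{X} \pm z\,\sigma_{\bar{X}}$ with $\sigma_{\bar{X}} = \sigma/\sqrt{N}$ when σ is known, where z is the z score for the chosen level of confidence.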
- Steps: confidence interval using t
1) Find the mean of the sample.
2) Obtain the standard deviation of the sample (square the raw scores):
$s = \sqrt{\dfrac{\sum X^2}{N} - \bar{X}^2}$ (answer to 3 decimal places).
3) Obtain the estimated standard error of the mean: $s_{\bar{X}} = \dfrac{s}{\sqrt{N - 1}}$.
4) Determine the value of t from table C (use df = N - 1 and look under α = .05 for a 95% CI).
5) Obtain the margin of error by multiplying the standard error of the mean by the
result from step 4: margin of error = $t\,s_{\bar{X}}$.
6) Add and subtract this product from the sample mean to find the 95% confidence interval.
• The greater the level of confidence, the more likely it is that the confidence interval
actually includes the true population mean.
• The greater the level of confidence, the larger the z score.
• The greater the level of confidence, the wider the confidence interval.
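The steps for a t-based confidence interval listed earlier can be sketched in code. This is an illustration only: the scores are made up, and the t value is passed in by hand because it is meant to be read from a t table (2.262 is the standard two-tailed table value for df = 9 at α = .05). The standard error here is the sample standard deviation divided by the square root of N - 1.

```python
import math


def t_confidence_interval(scores, t_value):
    """Confidence interval for the mean when sigma is unknown.

    t_value must be looked up in a t table for df = N - 1 and the chosen alpha.
    """
    n = len(scores)
    mean = sum(scores) / n                                      # step 1
    s = math.sqrt(sum(x * x for x in scores) / n - mean ** 2)   # step 2
    se = s / math.sqrt(n - 1)                                   # step 3
    margin = t_value * se                                       # step 5
    return mean - margin, mean + margin                         # step 6


# Hypothetical sample of 10 scores, so df = 9.
scores = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
low, high = t_confidence_interval(scores, 2.262)               # step 4: t from the table
print(round(low, 3), round(high, 3))
```

Dividing the N-denominator standard deviation by the square root of N - 1 gives the same standard error as dividing the N - 1 estimate of the standard deviation by the square root of N, so the two formulas in these notes agree.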
• Estimating proportions: standard error of the proportion
$s_P = \sqrt{\dfrac{P(1 - P)}{N}}$
$s_P$ = standard error of the proportion; P = sample proportion; N = total number in the sample
- The t distribution was used for constructing confidence intervals for the
population mean when both the population mean and the population
standard deviation were unknown and had to be estimated. For
proportions, only one quantity is unknown: we estimate the population
proportion by the sample proportion, and we use the z distribution.
95% CI = $P \pm 1.96\,s_P$
P = sample proportion; $s_P$ = standard error of the proportion
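A short sketch of the proportion interval, taking the standard error as the square root of P(1 - P)/N. The function name and the example (a sample proportion of .50 in a sample of 100) are hypothetical.

```python
import math


def proportion_confidence_interval(p, n, z=1.96):
    """Return (lower, upper, standard error) for a z-based proportion interval."""
    sp = math.sqrt(p * (1 - p) / n)   # standard error of the proportion
    margin = z * sp                   # margin of error
    return p - margin, p + margin, sp


# Hypothetical survey: 50% of 100 respondents answer "yes".
low, high, sp = proportion_confidence_interval(0.5, 100)
print(round(sp, 3), round(low, 3), round(high, 3))
```

With these numbers the standard error is .05, so the 95% interval runs from about .402 to about .598.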
We use the z distribution when the standard deviation of the population (σ) is KNOWN.
We use the t distribution when the standard deviation of the population (σ) is UNKNOWN.
CH.7: Testing differences between means
• Hypothesis testing is designed to detect significant differences: differences that
are unlikely to have occurred by random chance.
• In the “one sample” case, we compare a random sample (from a large group) to a
population (z test or t test for one sample).
• The null hypothesis: no difference between means
- The two samples are assumed to be drawn from equivalent populations, so any
observed difference between the samples is regarded as a chance occurrence
resulting from sampling error alone. An obtained difference between two sample
means does not represent a true difference between their population means.
$\mu_1 = \mu_2$, where $\mu_1$ = mean of the first population and $\mu_2$ = mean of the second population
- The null hypothesis does not deny the possibility of obtaining differences between sample means.
On the contrary, it seeks to explain such differences by attributing them to the
operation of sampling error. Retaining the null hypothesis means we are merely unable
to reject it due to a lack of contradictory evidence.
- Retaining or rejecting the null hypothesis does not prove that the population means are
equal or unequal.
• The research hypothesis: a difference between means
- The research hypothesis is shown symbolically as $\mu_1 \neq \mu_2$ when dealing with
two samples, and as $\bar{X} \neq \mu$ when dealing with a sample and a population.
- If we reject the null hypothesis, that is, if we find that our hypothesis of no
difference between means probably does not hold, we automatically accept the research
hypothesis that a true population difference does exist. The research hypothesis says
that the two samples have been taken from populations having different means: the
obtained difference between sample means is too large to be accounted for by
sampling error.
• Sampling distribution of differences between means
- a frequency distribution of a large number of differences between sample
means that have been randomly drawn from a given population.
- The sampling distribution of differences between means approximates a normal
curve whose mean (the mean of differences between means) is zero. As in any
normal curve, most of the differences between sample means fall close to zero,
its middlemost point; relatively few differences between means have extreme
values in either direction from the mean of these differences.
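A small simulation can illustrate why this distribution centres on zero when both samples come from the same population. The population (normal, mean 50, SD 10), the sample size, and the number of repetitions are all hypothetical choices for this sketch.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population from which BOTH samples are drawn.
population = [random.gauss(50, 10) for _ in range(10_000)]

N = 30
differences = []
for _ in range(2_000):
    mean_1 = statistics.fmean(random.sample(population, N))
    mean_2 = statistics.fmean(random.sample(population, N))
    differences.append(mean_1 - mean_2)

mean_difference = statistics.fmean(differences)
sd_difference = statistics.pstdev(differences)

# Share of differences lying within one standard deviation of zero.
share_near_zero = sum(abs(d) < sd_difference for d in differences) / len(differences)

print(f"mean of differences = {mean_difference:.3f}")
print(f"SD of differences   = {sd_difference:.3f}")
print(f"share within 1 SD   = {share_near_zero:.2f}")
```

The mean of the differences should sit close to zero, with roughly two thirds of the differences falling within one standard deviation of it, as expected for a normal curve.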
• Testing hypotheses with the distribution of differences between means
- We translate our sample mean difference into a z score:
$z = \dfrac{(\bar{X}_1 - \bar{X}_2) - 0}{\sigma_{\bar{X}_1 - \bar{X}_2}}$
$\bar{X}_1$ = mean of the first sample; $\bar{X}_2$ = mean of the second sample
0 = zero, the value of the mean of the sampling distribution of
differences between means (we assume $\mu_1 - \mu_2 = 0$)
$\sigma_{\bar{X}_1 - \bar{X}_2}$ = standard deviation of the sampling distribution of differences
between means
- Because the mean of the distribution of differences between means is assumed
to be zero, we can drop it from the z score formula without altering the result:
$z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sigma_{\bar{X}_1 - \bar{X}_2}}$
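As a minimal sketch, the z score for a difference between two sample means is the difference divided by the standard deviation of the sampling distribution of differences. These notes do not say how that standard deviation is obtained, so it is simply passed in here; the function name and the numbers are hypothetical.

```python
def z_for_difference(mean_1, mean_2, sd_of_differences):
    """z score of an observed difference, assuming mu_1 - mu_2 = 0 under the null."""
    return (mean_1 - mean_2) / sd_of_differences


# Hypothetical sample means of 75 and 70, with a standard deviation of 2
# for the sampling distribution of differences.
z = z_for_difference(75, 70, 2)
print(z)
```

The observed difference of 5 points lies 2.5 standard deviations from zero, the mean expected under the null hypothesis.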
• Level of significance, α (alpha): the level of probability at which the
null hypothesis can be rejected with confidence and the research hypothesis
can be accepted with confidence. We reject the null hypothesis if the probability
is very small (less than 5 in 100) that the sample difference is a product of
sampling error. This small probability is symbolized as p < .05.
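The decision rule can be sketched as follows. The sketch assumes a two-tailed test (matching a research hypothesis of "not equal") and uses the standard normal curve from Python's standard library; the function names are hypothetical.

```python
from statistics import NormalDist


def two_tailed_p(z):
    """Probability of a z score at least this extreme in either tail."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def decide(z, alpha=0.05):
    """Reject the null hypothesis only when p falls below the significance level."""
    return "reject null" if two_tailed_p(z) < alpha else "retain null"


print(decide(2.5))   # difference far from zero
print(decide(1.0))   # difference easily explained by sampling error
```

A z score beyond 1.96 in either direction has a two-tailed probability below .05, which is why 1.96 serves as the cut-off at the .05 level of significance.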