University of Toronto Scarborough
Statistics

STAB22H3

Ken Butler

Fall


STAB22 LEC18
CHAPTER 18: SAMPLING DISTRIBUTION MODELS
(COVERS 18,19)
[223]
OPINION POLL
- SRS; n = 1000 Canadians, \hat{p} = 0.91
- 91% say that Canada is better than US with respect to health care system.
- QN: how accurate is this 91%?
- note that sampling variability exists: another sample has different people, so the \hat{p}
they get may not be 91%
- assume p = 0.91 = \hat{p}
- Now, how far from 91% might another \hat{p} (sample proportion) be?
- get SD = \sqrt{pq/n} = \sqrt{(0.91)(0.09)/1000} ≈ 0.0090
- gives an idea of spread of sample prop's
- QN: based on what we got for SD, how likely is sample prop higher than 95%?
- CHECK to see if we can use Normal model in the first place:
(np) = 910 ≥ 10
(nq) = 90 ≥ 10 => both at least 10, so using Normal model is OK
- calculate z-score using x = 0.95, mean = 0.91, SD = 0.0090
- get z = (0.95 - 0.91)/0.0090 = 4.44
- find prop. above z = 4.44 from Z-table, and get it to be v.close to 0
- that SD that we calculated (using \hat{p} instead of p) is the STANDARD ERROR
=> it is telling us that most of the time, sample prop. is b/ween 0.91 ± 2(0.0090), ie. b/ween
0.892 and 0.928
- in other words, another sample prop is unlikely to be more than 2% away
from truth, with n=1000, p ~ 0.91.
Key point in this example:
[*] SAMPLE PROPORTION (in this example) unlikely to be >2% away from
truth, with n = 1000 and p near 0.91
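The poll arithmetic above can be checked with a short Python sketch. It assumes, as the notes do, that p is set equal to the sample proportion 0.91. The variable names are illustrative only, and the standard library's `statistics.NormalDist` stands in for the Z-table.

```python
from math import sqrt
from statistics import NormalDist

n = 1000        # sample size from the poll
p_hat = 0.91    # sample proportion, used in place of the unknown p
q_hat = 1 - p_hat

# Check that the Normal model is OK: both counts should be at least 10
assert n * p_hat >= 10 and n * q_hat >= 10

# Standard error of the sample proportion
se = sqrt(p_hat * q_hat / n)

# z-score for a sample proportion of 0.95
z = (0.95 - p_hat) / se

# Proportion of the Normal model above that z-score
tail = 1 - NormalDist().cdf(z)

# Range that holds most (about 95%) of sample proportions
low, high = p_hat - 2 * se, p_hat + 2 * se

print(f"SE = {se:.4f}, z = {z:.2f}, P(p-hat > 0.95) = {tail:.6f}")
print(f"most sample proportions fall in ({low:.3f}, {high:.3f})")
```

Note that the notes round the SD to 0.0090 before dividing, which gives z = 4.44; with the unrounded SD the z-score comes out at about 4.42. Either way the proportion above it is v.close to 0.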
Random sampling
- assuming that this is SRS
- b/c it is random, there is no bias on who is sampled
- what we are doing here is assuming that the sample prop we got is the popn prop, and
then because n is large enough, we can use Normal model: mean = popn prop =
sample prop (we are guessing), and SD is calculated using \hat{p} in place of p
- then, Normal model shows us to what extent other sample prop's would vary.
[224]
SAMPLING DISTRIBUTION FOR SAMPLE MEANS
Lottery example
- have big chance of winning nothing, small chance of winning something worthwhile
- how much might we win, given we play lots of times?
- mean winnings = popn mean = -0.62 (dollars per play)
>- this is telling us that if we play game once, then we lose (on avg) 62 cents
>- this is expected value (aka mean) of random variable
Implication of law of Large numbers:
- given that we have sufficiently large sample, sample mean will be pretty close to popn
mean
=> if we play 1000 times, then we will lose, on avg, close to 0.62 per play.
- if large sample, then sample mean close to popn mean
- doesn't tell u:
a) how large of sample u need, and
b) how close u will get to popn mean
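The Law of Large Numbers can be seen in a small simulation. The lecture's actual payoff table is not in these notes, so the sketch below uses a HYPOTHETICAL one, chosen only to match the two facts given: a 90% chance of losing $1 on a play, and a popn mean of -0.62. The outcomes, probabilities, seed and function name are all assumptions.

```python
import random

# HYPOTHETICAL payoff table (an assumption, not from the lecture):
# net winnings per play, with P(lose $1) = 0.90 and mean = -0.62.
outcomes = [-1, 2, 10]
probs = [0.90, 0.09, 0.01]

# Popn mean = expected value of one play
pop_mean = sum(x * p for x, p in zip(outcomes, probs))

rng = random.Random(22)  # fixed seed so the run is repeatable

def mean_winnings(n_plays):
    """Average net winnings per play over n_plays simulated plays."""
    plays = rng.choices(outcomes, weights=probs, k=n_plays)
    return sum(plays) / n_plays

results = {n: mean_winnings(n) for n in (20, 1000, 100_000)}

print(f"popn mean = {pop_mean:.2f}")
for n, m in results.items():
    print(f"n = {n:>6}: sample mean = {m:.3f}")
```

With a small n the sample mean can land far from -0.62; with a very large n it should sit close to it. The simulation still does not say in advance how large n has to be, which is the limitation noted above.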
[225]
SAMPLING DISTRIBUTION OF SAMPLE MEAN FOR VARIOUS SAMPLE SIZES
- what kind of sample means might u get for diff. sample sizes?
- ie. when n varies (ex. play 100 times, 20 times etc.)
n = 20
- playing 20 times
- if play 20 times, then still pretty likely u win nothing at all (frequency for -1 is the highest)
- ie. u lose $1 every single time
- but occasionally, might get lucky
if you graph histogram of popn, you get that it is..
- skewed to right
- this is done by plotting this data:
- use the probability as (frequency)
n = 50
- in gen, will lose more times, and win fewer times
- still skewed to right
- note that beyond the 0 point, you are actually profiting from the game
- but here it's saying if you play 50 times, v.unlikely that you make profit
n = 100
- can see that -1 bin (losing every single time) is not v.likely anymore
- even tho we have 90% chance of losing on any one play
- goes up to 0.2 (ie. some chance of averaging a 20 cent profit per play)
- shape is becoming more Normal
n = 500
- if play 500 times, mean is about -0.62 (lose about 62 cents per play), and v.unlikely to win
v.much more, or v.much less
- bell-shape
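The pattern across the four histograms can be imitated by simulation. This is a sketch only: the payoff table is HYPOTHETICAL (picked to give a 90% chance of losing $1 and a mean of -0.62), and the number of repetitions, the seed and the function names are assumptions. For each n it records how often every play was lost, how often the session made a profit, and the skewness of the sample means.

```python
import random
from statistics import mean, pstdev

# HYPOTHETICAL payoff table (an assumption, not from the lecture):
# net winnings per play, with P(lose $1) = 0.90 and mean = -0.62.
outcomes = [-1, 2, 10]
probs = [0.90, 0.09, 0.01]

rng = random.Random(18)  # fixed seed so the run is repeatable

def sample_means(n, reps=2000):
    """Simulate `reps` sessions of n plays; return each session's mean winnings."""
    return [sum(rng.choices(outcomes, weights=probs, k=n)) / n for _ in range(reps)]

def skewness(values):
    """Third standardized moment; positive means skewed to the right."""
    m, s = mean(values), pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

summary = {}
for n in (20, 50, 100, 500):
    means = sample_means(n)
    summary[n] = {
        "lost_every_time": sum(m == -1 for m in means) / len(means),
        "made_profit": sum(m > 0 for m in means) / len(means),
        "skewness": skewness(means),
    }

for n, row in summary.items():
    print(f"n = {n:>3}: lost every time = {row['lost_every_time']:.3f}, "
          f"profit = {row['made_profit']:.3f}, skewness = {row['skewness']:.2f}")
```

The chance of losing every single play is 0.9^n, about 0.12 for n = 20 but nearly 0 for n = 100, which is why the -1 bin fades. The skewness should also shrink toward 0 as n grows, matching the move toward a bell shape.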
[226]
QQ Plots (these are the corresponding QQ plots for each of the histograms on
the prev. slide)
- RECALL: For Normal to be OK for a distribution, points should be as close to the black line as
possible
- ie. the plot should be as straight as possible
n = 20
- skewed to right
n = 100
- curved, but not much
- normal not bad for n = 100, but better for n = 500
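One way to put a number on "how straight is the QQ plot" is the correlation between the sorted sample means and the matching Normal quantiles. The sketch below is not the method used in lecture, just an illustration. It reuses a HYPOTHETICAL payoff table (90% chance of losing $1, mean -0.62), and the plotting positions (i - 0.5)/k, the seed and the function names are assumptions.

```python
import random
from statistics import NormalDist, mean

# HYPOTHETICAL payoff table (an assumption, not from the lecture)
outcomes = [-1, 2, 10]
probs = [0.90, 0.09, 0.01]

rng = random.Random(226)  # fixed seed so the run is repeatable

def sample_means(n, reps=1000):
    """Simulate `reps` sessions of n plays; return each session's mean winnings."""
    return [sum(rng.choices(outcomes, weights=probs, k=n)) / n for _ in range(reps)]

def qq_correlation(values):
    """Correlation between sorted data and matching Normal quantiles.

    Values near 1 mean the QQ plot is close to a straight line.
    """
    k = len(values)
    data = sorted(values)
    # Theoretical N(0, 1) quantiles at plotting positions (i - 0.5) / k
    theo = [NormalDist().inv_cdf((i - 0.5) / k) for i in range(1, k + 1)]
    mx, my = mean(theo), mean(data)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theo, data))
    sxx = sum((x - mx) ** 2 for x in theo)
    syy = sum((y - my) ** 2 for y in data)
    return sxy / (sxx * syy) ** 0.5

qq = {n: qq_correlation(sample_means(n)) for n in (20, 100, 500)}
for n, r in qq.items():
    print(f"n = {n:>3}: QQ correlation = {r:.4f}")
```

A correlation closer to 1 for n = 500 than for n = 20 says the same thing as the plots: the larger the sample, the straighter the QQ plot.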
Trend:
- as n gets larger and larger, the distribution goes from skewness or bimodality to
looking Normal.
[227]
WHERE DID NORMAL COME FROM?
- Not the popn
- recall that from this example, the popn distribution itself was right-skewed
- rather, it is b/c we have large sample and we are particularly looking at sample means
CENTRAL LIMIT THEOREM (the remarkable fact)
- for ANY popn, sampling distrib. of sample mean is approx. normal, given that
sample is large
=> when have big enough sample (n big enough), then sample means u get will follow
normal distrib, regardless of what popn looked like to begin with
- ex. ours was R-skewed
- regardless of what distrib start with (ie. what the popn is), if we take bigger and
bigger samples from it, and consider what type of sample mean we will get, they will
vary approx. according to Normal distrib.
- but you would need smaller n if popn distribution is already close to/is Normal
- ex. 30 would be big enough
[228]
MEAN AND SD OF SAMPLING DISTRIBUTION OF SAMPLE MEAN
Sampling distribution of sample mean
- is approx. normal
- mean is \mu (popn mean)
- SD is given by: \sigma/\sqrt{n} (popn SD divided by \sqrt{n})
- as sample size (n) gets larger, variability of sample mean decr's
=> sample mean will get closer to popn mean (conseq. of LLN (Law of Large
Numbers))
- we use this distribution when we are interested in knowing what sample mean would
be
-ie if want to know what kind of sample mean will get, normal distrib helps
approx.
- for normal distrib, need to know mean, SD
- what kind of sample mean might we get?
- on avg, sample mean and popn mean likely to be same
- ie. are going to be close (sample mean is same as popn mean, on avg)
- if sample size is big, then are dividing popn SD by a huge # (\sqrt{n})
- so you make the SD smaller
- LLN: the sample means we get will be on avg the popn mean, and will be
v.close to it as n gets larger, b/c SD becomes small (variability of sample mean decr's)
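How fast the SD of the sample mean shrinks can be seen by plugging a few sample sizes into popn SD/\sqrt{n}. In this sketch the popn SD of 10 and the chosen values of n are illustrative assumptions, as is the function name.

```python
from math import sqrt

sigma = 10  # popn SD; an illustrative value (an assumption for this sketch)

def sd_of_sample_mean(sigma, n):
    """SD of the sampling distribution of the sample mean: sigma / sqrt(n)."""
    return sigma / sqrt(n)

table = {n: sd_of_sample_mean(sigma, n) for n in (1, 25, 100, 400)}

for n, sd in table.items():
    print(f"n = {n:>3}: SD of sample mean = {sd}")
```

Because n sits under a square root, the sample size has to be multiplied by 4 to cut the SD in half (25 to 100, or 100 to 400).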
[229]
CALCULATIONS FOR SAMPLE MEAN
Q: Sample of size n=25 is randomly selected from popn with:
- mean = 40
- SD = 10
... Now, what is the probability that sample mean will be b/ween 36 and 44?
- Assume that CLT applies; (o/w we cannot do below calculations)
Just by popn mean and SD, what does this give us an idea of?
- the val's you will see will be about 20 to 60 (2 SD's away ~ 95%)
- NOTE: here they are GIVING us info about popn (so we KNOW popn mean, which we
set as mean of Normal, and know SD of popn as well)
Now, we want to get data for sampling distribution of sample mean (Normal
distribution), using data we have about popn
>- what is mean of Normal?
= popn mean (40)
>- what is SD of Normal?
= popn SD/\sqrt{n} = 10/\sqrt{25} = 2
- so we have SD and mean now, and we want the interval (36, 44) on the Normal model
- this prop will tell us how likely it is that we get a sample mean that'll be b/ween
36 and 44
- to do this, get z-scores for 36, 44
- we get magnitude of 2 for both, meaning that they are "w/in 2 SD's of mean"
sample mean:
mu = 40
SD = popn SD/\sqrt{n} = 2
- how likely is sample mean w/in that range?
- get z-score for 36
- (36 - 40)/2 = -2
- get z-score for 44
- (44 - 40)/2 = 2
- w/in 2 SD's of mean: P is about 95% (0.9544)
- so will happen about 95% of time
- about 95% of time, sample mean will be b/ween 36 and 44
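The same calculation can be sketched in Python, assuming the CLT applies as stated in the question. The variable names are illustrative, and `statistics.NormalDist` from the standard library takes the place of the Z-table.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 40, 10, 25   # popn mean, popn SD, sample size from the question

# SD of the sampling distribution of the sample mean
sd_mean = sigma / sqrt(n)

# z-scores for the two ends of the interval
z_low = (36 - mu) / sd_mean
z_high = (44 - mu) / sd_mean

# Normal model for the sample mean, and the proportion b/ween 36 and 44
sampling_dist = NormalDist(mu, sd_mean)
prob = sampling_dist.cdf(44) - sampling_dist.cdf(36)

print(f"SD of sample mean = {sd_mean}")
print(f"z-scores: {z_low}, {z_high}")
print(f"P(36 < sample mean < 44) = {prob:.4f}")
```

The computed proportion is about 0.9545; the 0.9544 in the notes comes from subtracting rounded Z-table entries.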
Implication
- val's drawn from popn are at about 20 to 60
- when u avg them up (when take random sample of size 25), get some low and some high,
and they will roughly balance out
- so sample mean most of the time b/ween 36 and 44
- in other words, sample mean is unlikely to be more than 4 away
from popn mean
- these sampling distributions are giving us a sense of how far apart popn
mean/prop might be from sample mean/prop, given that SRS was taken to collect data
- if have decently large sample (ex. n = 25), then sample mean and popn mean will be
pretty close, and can quantify how close, like how we calculated on this slide
- in this example, v.rarely will we get bizarre samples with means way far away
from popn mean
So,
- have just 25 observations, and based on those, I will get sample mean no more than
4 away from popn mean most of the time
