POL 322 week 8 Lec Nov 5, 2012
Inferential Statistics: An Initial Foray
-aim is to make conclusions about population
-today focused on how well a sample statistic reflects a population
-estimate what population looks like
Random Sampling
-allows us to estimate pop from sample
-any non-random sample (ex. snowball) can’t be used to estimate the pop
-gets rid of systematic error, but not sampling error
-sometimes, even when you pick randomly, you get an unrepresentative sample by chance
-end product: population statistic (parameter) = sample statistic +/- amount of sampling error willing
to incur
-amount of sampling error willing to incur = t(s / square root of n)
-cannot just say your data is exactly what the real population feels
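The two formula lines above can be combined into one expression. A sketch in standard notation: the symbols \mu (population mean) and \bar{x} (sample mean) are assumed here and do not appear in the notes; t, s and n are as above.

```latex
\mu \approx \bar{x} \pm t \cdot \frac{s}{\sqrt{n}}
```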
Process
-ex. work our way through:
-views about Thomas Mulcair (NDP leader)
-population: students in the class
-interval level variable (rate him on a scale)
-some people don’t fill out the survey properly, some don’t show up etc.
-we assume this still represents the whole pop
-statistic measured is the mean
-Mulcair: 49.7 (thermometer score)
-normally we don’t know how Canadians would feel about Mulcair
-here we pretend we have all the pop data
-sample of 30 students
-Mulcair: 51.8
-different than pop: 49.7
-assume we don’t know the whole pop
-need to create a basket of confidence (or allowable error) around sample mean
-assuming, as in reality, we didn’t know the pop results, how do we take into
account the basket of error?
-the sample mean was off by 2.1 points (51.8 vs. 49.7)
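A rough way to see sampling error in action is to pretend we know the whole population, draw a few random samples of 30, and watch the sample means miss the population mean by different amounts. This is only a sketch: the 350 scores below are randomly generated stand-ins, not the class's real ratings, and the seed and variable names are arbitrary choices.

```python
import random
import statistics

random.seed(322)  # fixed seed so the sketch is repeatable

# HYPOTHETICAL population: 350 made-up thermometer scores from 0 to 100.
# These are NOT the real class ratings.
population = [random.randint(0, 100) for _ in range(350)]
pop_mean = statistics.mean(population)

# Draw several simple random samples of 30 (without replacement)
# and record each sample's mean.
sample_means = []
for _ in range(5):
    sample = random.sample(population, 30)
    sample_means.append(statistics.mean(sample))

# Each sample mean lands near the population mean, but rarely exactly on it.
for m in sample_means:
    print(f"sample mean = {m:.1f}, off by {m - pop_mean:+.1f}")
```

Each draw is random, so none is systematically biased, yet each still misses by some amount. That leftover miss is the sampling error.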
Sampling Error
-a.k.a standard error
-2 components: 1. variance and 2. sample size
1. Variance: spread of data around statistic
-more variance, more sampling error
-why? b/c with less variance in the pop, the sample is less likely to be made up of extreme cases
-but, we don’t know the pop variance (often), so need to take estimate of it
-why not the sample statistic variance?
-standard deviation
-denoted as “s” for sample
-given direct positive relationship with error, it is a numerator
-population statistic (parameter) = sample statistic +/- t(s / square root of n),
where s is the standard deviation
-recall standard deviation is the typical spread of the data around the mean
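-For Mulcair: s = 17.0 (mean 51.8)
-mean +/- one s --> 34.8 - 68.8 (typical spread of individual scores; not yet the confidence interval, which also needs t and square root of n)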
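To make "s" concrete, here is a minimal sketch of computing a sample mean and sample standard deviation. The five ratings are hypothetical, chosen only to keep the arithmetic easy; they are not the class data.

```python
import statistics

# HYPOTHETICAL ratings, for illustration only (not the class data).
ratings = [30, 40, 50, 60, 70]

mean = statistics.mean(ratings)  # 50
s = statistics.stdev(ratings)    # sample standard deviation (divides by n - 1)

print(f"mean = {mean}, s = {s:.2f}")  # mean = 50, s = 15.81
```

`statistics.stdev` is the sample version (n - 1 in the denominator), which is the one to use when the sample stands in for an unknown population.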
2. Sample Size: greater sample, less error
-informs us of amount of error, broadly
-corrective measure for variance
-but not uniform decrease in error
-curvilinear
-diminishing returns as sample gets huge
-at a certain point you get a plateau of results
-why do more than you need to? expensive, difficult
-to reflect this depreciation we square-root sample size
-the more people you have the more likely you have reliable results
-the sampling error drops
-increase sample size/ decrease sampling error
-square rooting moderates effect as sample size increases
-ex. square root of 100, 1000, or 1100 gives 10, 31.6, or 33.2
-huge change in sample size, small change in results as sample
grows
-sample size in sample data is denoted as “n”
-because it’s a negative relationship with error, put as denominator
-Mulcair: n = 30 (sample size)
-not a high sample size, but potentially enough to start
-it would be better to use 60 but 100 might not be worth it in a pop of 350
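The 30 vs. 60 vs. 100 comparison can be checked with the s / square root of n part of the formula, using s = 17.0 from the Mulcair example. A small sketch; the function name `standard_error` is just a label chosen here.

```python
import math

s = 17.0  # sample standard deviation from the Mulcair example


def standard_error(s, n):
    """s / sqrt(n): the part of the formula before multiplying by t."""
    return s / math.sqrt(n)


# Compare the sample sizes mentioned in the notes.
for n in (30, 60, 100):
    print(f"n = {n:>3}: s / sqrt(n) = {standard_error(s, n):.2f}")
```

Going from 30 to 60 drops the value from about 3.1 to about 2.2, while going all the way to 100 only brings it down to 1.7, so each extra respondent buys less than the one before.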
-variance and sample size work in unison
-low variance, high sample size = low sampling error
-high variance, low sample size = high sampling error
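Putting variance and sample size together for the Mulcair sample gives the following sketch. The mean, s and n come from the notes. The t value is an assumption: the notes do not state a confidence level, so 2.045 (the t-table value for 95% confidence with n - 1 = 29 degrees of freedom) is used for illustration.

```python
import math

mean = 51.8  # sample mean (from the notes)
s = 17.0     # sample standard deviation (from the notes)
n = 30       # sample size (from the notes)
t = 2.045    # ASSUMPTION: 95% confidence, 29 degrees of freedom (t-table)

# amount of sampling error = t * (s / sqrt(n))
margin = t * (s / math.sqrt(n))
lower, upper = mean - margin, mean + margin

print(f"{mean} +/- {margin:.1f} -> {lower:.1f} to {upper:.1f}")
```

Under that assumed t, the interval runs from roughly 45.5 to 58.1, which does contain the population mean of 49.7.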
Confidence Interval
1. samples possess sampling error
2. to account for this error we need to understand variance, but we only have the sample variance,
so we use our best estimate: the sample standard deviation
3. sa