University of Toronto Scarborough
Ken Butler

STAB22 LEC06 (Covers the remaining part of Chapter 6 and start of Chapter 7)

---------------[CHAPTER6]-----------------

[59] EXAMPLE OF NPP
Given data, how do we know whether a normal distribution works or not?
- can easily tell by making & looking at an NPP (normal probability plot)
(ex) Potassium data for breakfast cereals
- skewed to right
- this is indicated by a bunch of high val's spread out, and a bunch of low val's clumped together
- for a normal distribution, the pts have to be pretty close to, or on, the black line

[60] EXAMPLE OF NPP -> actual normal data
- realistic data typically does not follow blk line precisely
- it looks OK to use a normal curve for this because val's are pretty close to blk line

[61] EXAMPLE OF NPP -> cereal calorie data
- notice that it curves down, and then curves up
- forms an S-shaped pattern
- this one is symmetrical, but too many outliers for it to be normal

[62] EXAMPLE OF NPP -> cereal sugars
- although it "wiggles" (meaning it curves), overall the pts are not far away from line
- OK for normal distribution

[63] EXAMPLE OF NPP -> cereal sugars histogram
- it has a hole in the middle, sth that the NPP could not tell you
- b/c of this, it is not really normal: the shape is not as symmetric, even tho. there aren't any outliers

[64] 68-95-99.7 RULE
- applies to normal distribution only
- says that
  - within +/- 1 SD away from mean, there is about 68% of the data
  - within +/- 2 SD away from mean, there is about 95% of the data
  - within +/- 3 SD away from mean, there is about 99.7% of the data
- this rule gives us prop's for certain SD's without having to use z-table.
Rule (con.)
- its name tells you what % of the curve lies within 1, 2 and 3 SD's of the mean
- does not matter what the original mean and SD of the data were, this rule still applies, as long as the distribution of the data is normal, or roughly normal.
(ex) Roma Tomatoes
- mean = 74g
- SD = 2.5g
======
Question1: What interval of weights will be covered by 95%?
Solution:
- 95% => +/- 2 SD away from mean
- to get lower val. for interval,
  - val = 74 - 2(2.5) = 69g
- to get upper val. for interval,
  - val = 74 + 2(2.5) = 79g
=> about 95% of the weights will be between 69g and 79g.

Question2: What proportion of the weights will be between 71.5g and 76.5g?
Solution
- get z-scores for these val's
  - For 71.5g, z = (71.5 - 74)/2.5 = -1.00
  - For 76.5g, z = (76.5 - 74)/2.5 = 1.00
- you want to find what % is b/ween -1.00 SD and 1.00 SD away from mean, which is about 68%, by the 68-95-99.7 Rule.

[65] Roma Tomatoes
- mean = 74g
- SD = 2.5g
======
Question: Approximately what fraction of weights will be greater than 79g?
Solution:
- Get z-score: z = (79 - 74)/2.5 = 2.00
- corresponding prop = 0.9772 => 97.72% below => ~2.3% above
Portions of the curve:
- shaded parts both together make up 95% (within +/- 2 SD of mean)
- on the upper end, that leaves 2.5% (unshaded part), and there is one on the lower end that is also 2.5%
=> by the rule, about 2.5% of it is bigger than 79 (the table gives the more precise ~2.3%)

[66] Roma Tomatoes
- mean = 74g
- SD = 2.5g
======
Question: About what proportion will be between 74 and 81.5g?
Solution
Method 1
- Get z-scores
  - For 74, z = 0 => 50% of the data is below this
  - For 81.5, z = (81.5 - 74)/2.5 = 3.00 => about 99.87% of the data is below this
- So to find what is between these two val's, we subtract 50% from 99.87% => 99.87 - 50.00 = 49.87%
Method 2
- 74g is the mean
- The two points that are 3 SD's away from the mean are:
  - val 3 SD below mean = 74 - 3(2.5) = 66.5
  - val 3 SD above mean = 74 + 3(2.5) = 81.5
- together the region within 3 SD of the mean is about 99.7%
  - so this is from 66.5 to 81.5
- we only want 74 to 81.5
- by symmetry, the proportion between 74 and 81.5 is half of 99.7%, which is 49.85%

[68] WHEN YOU DON'T HAVE NORMAL TABLE
Can use the following formula:
About the formula
- only accurate to 2 d.p.
Note about exam
- will have NORMAL TABLE to work with in exams
(ex) Roma tomatoes
- mean = 74
- SD = 2.5
- prop. less than 77.4 gives z = (77.4 - 74)/2.5 = 1.36
- to work with the formula, you must find the positive z-value that gives the equivalent proportion.
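The Roma tomato answers above can be checked without a z-table. The following is a minimal Python sketch using only the standard library. `normal_cdf` is a helper name introduced here for illustration (it is not from the lecture), built on the standard identity linking the normal CDF to `math.erf`. It is not the approximation formula referred to in slide [68].

```python
from math import erf, sqrt

def normal_cdf(x, mean=0.0, sd=1.0):
    """Proportion of a normal(mean, sd) distribution lying below x."""
    z = (x - mean) / sd                      # z-score of x
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))  # standard normal CDF at z

MEAN, SD = 74.0, 2.5  # Roma tomato weights in grams (from the notes)

# Interval covering about 95% of weights: mean +/- 2 SD
interval_95 = (MEAN - 2 * SD, MEAN + 2 * SD)

# Proportion between 71.5g and 76.5g (z = -1.00 to z = 1.00)
prop_71_5_to_76_5 = normal_cdf(76.5, MEAN, SD) - normal_cdf(71.5, MEAN, SD)

# Proportion greater than 79g (z = 2.00)
prop_above_79 = 1.0 - normal_cdf(79.0, MEAN, SD)

# Proportion between 74g and 81.5g (z = 0 to z = 3.00)
prop_74_to_81_5 = normal_cdf(81.5, MEAN, SD) - normal_cdf(74.0, MEAN, SD)

# Proportion less than 77.4g (z = 1.36)
prop_below_77_4 = normal_cdf(77.4, MEAN, SD)

print("95% interval:", interval_95)
print("between 71.5 and 76.5:", round(prop_71_5_to_76_5, 4))
print("above 79:", round(prop_above_79, 4))
print("between 74 and 81.5:", round(prop_74_to_81_5, 4))
print("below 77.4:", round(prop_below_77_4, 4))
```

The exact proportions differ slightly from the rule's rounded figures (for example, roughly 68.3% rather than 68% within 1 SD), which is why the rule is stated with "about".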
----------------------------

---------------[CHAPTER7]-----------------

[69] SCATTERPLOTS, ASSOCIATION & CORRELATION
- now we're looking at 2 quan. var's together
- first tool introduced: scatterplot
  - plot of val's of one q.var against val's of another q.var.

[70] EXAMPLE OF SCATTERPLOT
-> airport in Oakland recorded # of passengers leaving each month from 1990 to 2006
-> this scatterplot has:
   a) time (years since 1990) for x-axis
   b) passengers (#) for y-axis
About this scatterplot
- can see a general upward trend => as time goes by, number of passengers seems to incr.
- each blue dot rep. one month's worth of data
  - ex. a month in 2001 where there were about 7000 passengers is rep. by 1 unique point
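To make "one point per month" concrete, here is a small Python sketch of how monthly records could be turned into (x, y) pairs for such a scatterplot. The helper `years_since_1990`, the convention that January 1990 maps to x = 0, and all passenger counts below are assumptions made up for illustration. They are not the Oakland data.

```python
def years_since_1990(year, month):
    """Convert a calendar month to the x-axis value 'years since 1990'.

    Assumption: January 1990 maps to 0 and each month adds 1/12 of a year.
    """
    return (year - 1990) + (month - 1) / 12

# Hypothetical (year, month, passengers) records, invented for illustration only.
monthly_records = [
    (1990, 1, 3000),
    (1995, 7, 5000),
    (2001, 7, 7000),
    (2006, 12, 9000),
]

# Each record becomes exactly one (x, y) point on the scatterplot.
points = [(years_since_1990(y, m), p) for y, m, p in monthly_records]

for x, passengers in points:
    print(f"x = {x:.2f} years since 1990, y = {passengers} passengers")
```

Plotting these pairs with any charting tool gives one dot per month, with time on the x-axis and passengers on the y-axis.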