# Stats Notes Everything.pdf

82 Pages

School: McMaster University
Department: Statistics
Course: STATS 2B03
Professor: Aaron Childs
Semester: Fall

Description
Lecture 1-2
January-07-13 8:38 AM

1.2-1.4: Some Terminology

The (target) population consists of all the subjects that are being studied (e.g. Canadians, Mac students).

A sample is a group of subjects selected from the population.

Descriptive statistics (Ch. 2) consists of summarizing and presenting data (e.g. the average mark on a test, or a histogram of the marks).

Inferential statistics (Ch. 6 on) consists of using a sample to draw a conclusion about a population using probability (Ch. 3-5) (e.g. taking an opinion poll of a sample of a population and making an inference about the whole population based on it).
- Larger samples lead to more accurate estimates in inferential statistics.

The sampled population is the population from which the sample is drawn.
- Ex: The sampled population depends on how you collect the sample (if you were interested in Canadians but only chose from people in Hamilton, the sampled population is people in Hamilton).
- Ex 2: If I am interested in the average age of Mac students, then the target population is all Mac students. If I take a sample of 30 students from this class, then the sampled population is everyone in this class.

In the display, the first column (the stem) shows the leading digit (4 in 40, 5 in 50). The second column (the leaf) shows only the ones digit (1 in 41, 5 in 55).

[Figure: stem-and-leaf display of the example data set]

Definition: The median of an ordered data set with n observations is the middle value if n is odd, and the average of the middle two values if n is even.
e.g. 5, 8, 10, 21, 25 (n = 5): median = 10
e.g. 6, 12, 15, 18, 26, 71 (n = 6): median = (15 + 18)/2 = 16.5

Example (cont'd): (Grouping: 40-45, 45-50, 50-55, etc.)

Example: 2.8, 2.9, 3.0, 3.2, 5.4, 6.7, 6.9
The leaf unit is used to recover the original value.
The leaf unit can be a negative power of ten (for example, 0.1 for the data above).
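The median definition above can be checked with a short Python sketch. The function name `median` is chosen here for illustration and is not part of the course material (Python's standard library also provides `statistics.median`). The data are the two examples from the notes.

```python
def median(values):
    """Return the median of a data set, following the definition in the notes."""
    ordered = sorted(values)          # the definition assumes ordered data
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:                    # odd n: the single middle value
        return ordered[mid]
    # even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median([5, 8, 10, 21, 25]))         # 10
print(median([6, 12, 15, 18, 26, 71]))    # 16.5
```

Sorting inside the function means the caller does not have to order the data first.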
Note: the median cannot be split between two different rows. It is always a single number, and rows do not overlap, since we use the conventional method.

2.4: Measures of Central Tendency

The average of a sample $x_1, x_2, \ldots, x_n$ is called the sample mean (or mean) and is denoted by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$

The average of the population $x_1, x_2, \ldots, x_N$ is called the population mean and is denoted by

$$\mu = \frac{1}{N}\sum_{i=1}^{N} x_i$$

We usually cannot calculate the population mean, since we do not have the whole population, but we can estimate it.

Note: N is the population size; n is the sample size.

The mode of a data set is the value that occurs most often.
e.g. 7, 8, 8, 8, 9, 10: mode = 8
e.g. 7, 8, 8, 8, 9, 9, 9, 10, 10, 11: mode = 8 and 9

In a grouped data set, the modal class is the group with the largest frequency, e.g.

| Grouping | Frequency |
| --- | --- |
| 40-50 | 1 |
| 50-60 | 1 |
| 60-70 | 3 (modal class) |
| 70-80 | 2 |
| 80-90 | 1 |

Lecture 3
January-11-13 10:31 AM

2.5: Measures of Variation

The range R of a data set is the largest value ($x_L$) minus the smallest value ($x_S$), i.e. $R = x_L - x_S$.

The sample variance (or variance) of a data set $x_1, x_2, \ldots, x_n$ is defined by

$$s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$$

The population variance of a population of values $x_1, x_2, \ldots, x_N$ is defined by

$$\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$$

Example: 55, 63, 72, 41, 87, 75, 64, 60

$R = 87 - 41 = 46$

$\bar{x} = 64.625$ (NEVER ROUND a value that is needed for another calculation!)

$$s^2 = \frac{(55 - 64.625)^2 + (63 - 64.625)^2 + \cdots + (60 - 64.625)^2}{8 - 1} = 191.125$$

$$s^2 = \frac{55^2 + 63^2 + \cdots + 60^2 - 8(64.625)^2}{8 - 1} = 191.125$$

*** Do not use a rounded $\bar{x}$ in the formula for $s^2$. ***

The sample standard deviation (or standard deviation) is defined to be

$$s = \sqrt{s^2}$$

In our example, $s = \sqrt{191.125} = 13.82$. This gives the "average" amount by which the data values differ from $\bar{x}$.

The population can be the sample itself, but then you wouldn't need statistics.

The coefficient of variation is a measure of relative variation that can be used to compare the variation of two different data sets, possibly measured in different units. It is defined by

$$CV = \frac{s}{\bar{x}} \times 100\%$$

There is no particular reason for multiplying by 100, other than expressing the ratio as a percentage.

Example:
a. Ages in years: 3, 4, 5, 6; $\bar{x} = 4.5$, $s_1^2 = 1.667$
b. The same ages in months: 36, 48, 60, 72; $\bar{x} = 54$, $s_2^2 = 240$

Which data set has more variation?

$CV_1 = \frac{\sqrt{1.667}}{4.5} \times 100\% = 28.69\%$ and $CV_2 = \frac{\sqrt{240}}{54} \times 100\% = 28.69\%$

- Note: both have the same amount of relative variation, even though the $s^2$ values differ, because they are the same data in different units.

Note: the coefficient of variation is independent of the scale of measurement.

Percentiles and Quartiles

The pth percentile is the number x with the property that p percent of the data is less than x.
e.g. The 50th percentile is the median.

The quartiles $Q_1, Q_2, Q_3$ divide the data set into roughly four equal parts.

Lecture 4
January-14-13 9:07 AM

3.2-3.4: Probability

Definition: The sample space, S, for an experiment is the set of all possible outcomes.

Ex: In a family of 3 children:
S = {BBB, GBB, BGB, BBG, GGB, GBG, BGG, GGG} (gender combinations)
S = {0, 1, 2, 3} (possible numbers of boys)
The outcomes in the second set are not equally likely, unlike the gender combinations, so it should not be used as the sample space.

Definition: An event, E, is a subset of the sample space that satisfies a given condition.

Example (cont'd): E = "exactly one boy" = {GGB, GBG, BGG}

Basic Rule of Probability

If the outcomes in S are equally likely, then the probability of E is

$$P(E) = \frac{\text{\# of outcomes in } E}{\text{\# of outcomes in } S} = \frac{\text{\# of ways } E \text{ can occur}}{\text{total \# of possible outcomes}}$$

Example (cont'd): P(exactly one boy) = 3/8

Some Rules:

Rule P1: $0 \le P(E) \le 1$

Rule P2: $P(E) = 1 - P(\bar{E})$, where $\bar{E}$ ("E complement") is the event that E does not occur. Note that $\bar{E}$ is the opposite of E.

Ex (cont'd): P(at least 1 boy) = 1 - P(no boys) = 1 - 1/8 = 7/8

Definition: Two events are mutually exclusive if they cannot both occur at the same time.
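As a numerical check of the Lecture 3 material, the sketch below computes the sample variance of the example data both from the definition and from the computational form. It then compares the coefficient of variation of the ages measured in years and in months. The function names are illustrative choices, not taken from the course.

```python
import math


def sample_variance(data):
    """Definition: sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    mean = sum(data) / n              # keep full precision; do not round
    return sum((x - mean) ** 2 for x in data) / (n - 1)


def sample_variance_shortcut(data):
    """Computational form: (sum of squares - n * mean^2) / (n - 1)."""
    n = len(data)
    mean = sum(data) / n
    return (sum(x * x for x in data) - n * mean ** 2) / (n - 1)


def coefficient_of_variation(data):
    """Sample standard deviation expressed as a percentage of the mean."""
    mean = sum(data) / len(data)
    return math.sqrt(sample_variance(data)) / mean * 100


marks = [55, 63, 72, 41, 87, 75, 64, 60]
print(max(marks) - min(marks))                       # range: 46
print(round(sample_variance(marks), 3))              # 191.125
print(round(sample_variance_shortcut(marks), 3))     # 191.125
print(round(math.sqrt(sample_variance(marks)), 2))   # 13.82

years = [3, 4, 5, 6]
months = [12 * age for age in years]                 # 36, 48, 60, 72
print(round(sample_variance(years), 3))              # 1.667
print(round(sample_variance(months), 3))             # 240.0
print(round(coefficient_of_variation(years), 2))     # 28.69
print(round(coefficient_of_variation(months), 2))    # 28.69
```

The two variances differ by a factor of 144 (12 squared), while the coefficient of variation is unchanged by the change of units.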
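The three-children example from Lecture 4 can be reproduced by enumerating the sample space. This is a minimal sketch that assumes, as the notes do, that the eight gender combinations are equally likely. The helper `probability` is a hypothetical name used only for illustration.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 2^3 equally likely gender combinations, e.g. "BBG"
sample_space = ["".join(kids) for kids in product("BG", repeat=3)]


def probability(event):
    """Basic rule: (# of outcomes in E) / (# of outcomes in S)."""
    favourable = [outcome for outcome in sample_space if event(outcome)]
    return Fraction(len(favourable), len(sample_space))


p_exactly_one_boy = probability(lambda o: o.count("B") == 1)
p_no_boys = probability(lambda o: o.count("B") == 0)
p_at_least_one_boy = 1 - p_no_boys        # Rule P2 (complement)

print(len(sample_space))      # 8
print(p_exactly_one_boy)      # 3/8
print(p_at_least_one_boy)     # 7/8

# Counting boys shows why {0, 1, 2, 3} is not an equally likely sample space
boys = Counter(o.count("B") for o in sample_space)
print(sorted(boys.items()))   # [(0, 1), (1, 3), (2, 3), (3, 1)]
```

Using `Fraction` keeps the answers as exact ratios, matching the 3/8 and 7/8 in the notes.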
[Figure: Venn diagram]

Definition: $A \cup B$ ("A union B") is the event that A occurs or B occurs (or both).

Rule P3: If A and B are mutually exclusive, then $P(A \cup B) = P(A) + P(B)$.

Ex: Table 1

| Parental handedness (father × mother) | Right-handed offspring | Left-handed offspring | Total |
| --- | --- | --- | --- |
| Right × Right (RR) | 303 | 37 | 340 |
| Right × Left (RL) | 29 | 9 | 38 |
| Left × Right (LR) | 16 | 6 | 22 |
| Total | 348 | 52 | 400 |

If a person is selected at random from the above 400 people, find the probability that their parents were RR or LR.

$P(RR \cup LR) = P(RR) + P(LR) = 340/400 + 22/400 = 362/400$

Definition: $A \cap B$ ("A intersect B") is the event that A occurs and B occurs.

Rule P4: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$

… ≥ 10)

$$= \binom{20}{10}\left(\frac{1}{5}\right)^{10}\left(\frac{4}{5}\right)^{10} + \binom{20}{11}\left(\frac{1}{5}\right)^{11}\left(\frac{4}{5}\right)^{9} + \cdots + \binom{20}{20}\left(\frac{1}{5}\right)^{20}\left(\frac{4}{5}\right)^{0} = 0.0026$$

Note: that is the probability of passing. The probability of failing, its complement, is 1 - 0.0026 = 0.9974.
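Rules P3 and P4 can be illustrated with the counts in Table 1. In the sketch below, the first calculation is the RR-or-LR example from the notes. The second (RR parents or left-handed offspring) is an extra illustration that is not in the notes, added only to show the overlap term in Rule P4. Variable names are hypothetical.

```python
from fractions import Fraction

# Counts from Table 1: (right-handed offspring, left-handed offspring)
table = {"RR": (303, 37), "RL": (29, 9), "LR": (16, 6)}
total = sum(right + left for right, left in table.values())   # 400


def p_parents(kind):
    """Probability that a randomly chosen person has parents of this kind."""
    return Fraction(sum(table[kind]), total)


# Rule P3: RR and LR are mutually exclusive, so the probabilities add
p_rr_or_lr = p_parents("RR") + p_parents("LR")
print(p_rr_or_lr)        # 181/200, i.e. 362/400

# Rule P4 (illustration, not from the notes): RR parents OR left-handed offspring
p_left = Fraction(sum(left for _, left in table.values()), total)   # 52/400
p_rr_and_left = Fraction(table["RR"][1], total)                     # 37/400
p_rr_or_left = p_parents("RR") + p_left - p_rr_and_left
print(p_rr_or_left)      # 71/80, i.e. 355/400
```

Subtracting the overlap avoids counting the 37 left-handed people with RR parents twice.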
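The sum at the end of the notes can be evaluated directly. This sketch assumes the truncated expression is the upper tail of a binomial count with n = 20 trials and success probability 1/5, which is what the visible terms suggest. The function name `binomial_tail` is a hypothetical helper, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def binomial_tail(n, p, k):
    """Sum of C(n, x) * p^x * (1 - p)^(n - x) for x = k, ..., n."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))


p_pass = binomial_tail(20, 1 / 5, 10)
print(round(p_pass, 4))        # 0.0026
print(round(1 - p_pass, 4))    # 0.9974 (complement: probability of failing)
```

The complement is taken from the unrounded tail probability, in line with the earlier advice not to round intermediate values.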