Mitch Mc Ivor

AN INTRODUCTION TO STATISTICS FOR CANADIAN SOCIAL SCIENTISTS: MICHAEL HAAN

CH 1. WHY SHOULD I WANT TO LEARN STATISTICS?
• Probabilities
• Sampling Error

CH 2. HOW MUCH MATH DO I NEED TO LEARN STATISTICS?
• Order of operations: BEDMAS
  (1) # of hours at the mall = 0.2 × # of friends + (0.01 × disposable income − 4 × age) − 0.2 × # of security guards at mall + 2
  (2) Y = ax + b
  (3) 0.2 × 3 + (0.01 × 10,000 − 4 × 20) − 0.2 × 25 + 2
  (4) 0.6 + (100 − 80) − 5 + 2
  (5) 0.6 + 20 − 3 = 17.6 hours/month
• Exponents: $2^2 = 4$
• Logarithm: $\log 1000 = 3$ or $\log_{10} 1000 = 3$. A logarithm states the power the base must be raised to: with 10 as the base, it takes three factors of 10 ($10 \times 10 \times 10 = 10^3$) to obtain the product 1,000.

LEVELS OF MEASUREMENT

Level    | Description                                                                                                          | Example
---------|----------------------------------------------------------------------------------------------------------------------|--------
Nominal  | Numbers are only the names of things. No level or order of significance.                                             | How many marathon runners are from each country?
Ordinal  | Data can be placed in rank order (e.g. Likert scales), but the size of the difference between ranks cannot be measured. | Who came first, second, and third in the marathon?
Interval | Can be ordered, added, or subtracted, but not multiplied or divided.                                                 | On a scale from 1 (very unhappy) to 5 (very happy), how did each competitor feel as they crossed the finish line?
Ratio    | Has a meaningful "0" value and allows the exact difference between observations to be measured.                      | How much time did it take to cross the finish line?

CH 3. UNIVARIATE STATISTICS
• Frequencies tell us the number of times an item or a response category comes up in a sample
• E.g. Number of Males and Females in Canadian Population, 2006 Census of Canada

Sex (x) | Frequency  | %     | Cumulative Frequency | Cumulative %
--------|------------|-------|----------------------|-------------
F       | 16,136,930 | 51.05 | 16,136,930           | 51.05
M       | 15,475,970 | 48.95 | 31,612,895           | 100.00

• Bar Charts
  o Response categories must always appear on the x-axis; frequency should always be on the y-axis
• Pie Chart
  o Legends, titles, data sources
• Ratios
  o The number of observations in one category compared to the number of observations in another category.
    1. E.g. 16,136,930 F for every 15,475,970 M in Canada
    2. To simplify, divide both numbers by 100,000 and round: approximately 161:155
• Rates
  o Rates are usually used to present continuous data.
  o Use the same denominator
• Rates, ratios, and percentages are STANDARDIZED: they use the same unit of measurement

CH 4. INTRODUCTION TO PROBABILITY
• Probability is a number between 0 and 1
  o 0 = event NEVER occurs
  o 1 = event ALWAYS occurs
• Sample Space = All Possible Outcomes
  o Contains all of the theoretically possible outcomes of an event
  o Each probability is a fraction of All Possible Outcomes
• Law of Large Numbers
  o If you repeat a random experiment many, MANY times, the outcome reaches a level of stability: the observed proportion approaches the theoretical probability
• Theoretical Probability
  o What is predicted to occur
• Empirical Probability
  o What is observed when trials are actually conducted
• Mutually Exclusive
  o Outcomes do not overlap: they cannot occur at the same time
• Not Mutually Exclusive
  o When outcomes OVERLAP, you must subtract the duplicate
  o E.g. Draw a king or a heart from a deck of cards = 4/52 + 13/52 − 1/52 = 16/52 = 4/13 (the king of hearts would otherwise be counted twice)
• Probability of Unrelated Events
  o Independent: the outcome of one event does not affect the probability of the other
• Probability of Related Events
  o Dependent: the probability of an event depends on an event that occurred before it. E.g. 1 bullet in 6 rounds; person A shoots a blank, therefore your chances turn to 1/5
• Mutually Exclusive Probabilities That Are Interchangeable
  o E.g. What is the probability of rolling a die and getting EITHER 1 or 6?
    ▪ ONE = 1/6
    ▪ SIX = 1/6
    ▪ ONE or SIX = 1/6 + 1/6 = 2/6 = 1/3
• Non-Mutually Exclusive or Interchangeable Probabilities
  o When two events can occur simultaneously, they are considered non-mutually exclusive
  o They overlap, therefore subtract the duplicate

CH 5. THE NORMAL CURVE
• Central Limit Theorem: related to the Law of Large Numbers, except that the CLT applies to the shape of the graph (the distribution), not just to a single outcome
• ASYMPTOTIC: the distribution approaches the normal curve as trials accumulate; the more trials you run, the more "normal" it becomes
• Unimodal: one hump (Mode: the most frequently occurring value in a data set, on a variable of interest)
• Bimodal: two modes, e.g. gender (F and M)
• Multimodal: multiple modes
• SKEW vs. SYMMETRICAL (normal bell curve)

CH 6. MEASURES OF CENTRAL TENDENCY
• MEAN: Average, e.g. grade calculation
• MEDIAN: The value in the middle
• MODE: The most frequently occurring value
• RANGE = Highest Value − Lowest Value
  o Used to see how wide or spread out the values are
  o E.g. 5 years of income values: $25,000 − $5,000 = $20,000
• MEAN DEVIATION
  o Tells you how far observations are from the average value, on average (the average distance)
  o MEAN DEVIATION = $\frac{\sum |x - \bar{x}|}{N}$
  o Translation: the sum of |each observation − the mean|, all divided by the total number of values
  o Steps to calculate:
    1. Calculate the average
    2. Subtract the mean/average from each value
    3. Sum the absolute values (make values positive)
    4. Divide the sum by the total number of observations
    ▪ E.g. Average is 70, Student A gets 60, Student B gets 80
    ▪ Student A's absolute deviation = 10, Student B's absolute deviation = 10, so the mean deviation = (10 + 10) / 2 = 10
• VARIANCE AND STANDARD DEVIATION
  o Like the mean deviation, the variance and standard deviation are measures of how far the average observation is from the mean
  o BUT the variance and standard deviation maintain the integrity of the data by transforming all of the deviations (every one is squared) rather than just some of them (absolute values change only the negative ones)
  o $s^2 = \frac{\sum (x - \bar{x})^2}{N}$
  o Steps to calculate the variance:
    1. Calculate the average
    2. Subtract the mean/average from each value
    3. Square the differences: $(x - \bar{x})^2$
    4. Add the squared deviations together
    5. Divide the sum by the number of observations
  o Then take the square root of the result to get the standard deviation: $s = \sqrt{\frac{\sum (x - \bar{x})^2}{N}}$

CH 7. STANDARD DEVIATIONS, STANDARD SCORES, & THE NORMAL DISTRIBUTION
• What would the law of large numbers look like if it were drawn out?
• How to compare the values of different groups within a sample
  o Z-score
• The continuum is the normal curve
• Standard deviation and standard score are used to determine the rank of an observation. Standard deviation is like money (a unit of measurement): it is always possible to compare 2 observations to each other by using standard deviation scores
• Using the mean and S.D., the normal curve provides info about the characteristics of a variable
• The mean provides a useful "cut-point" for assessing how a person ranks, e.g. exam scores: are you above or below the average?
• μ = mu (mean): sits at a standard score of zero
• 68-95-99: about 68%, 95%, and 99% of observations fall within 1, 2, and 3 standard deviations of the mean
• Knowing the SD allows us to estimate the proportion of all hours where the values are above and/or below the grand sample mean; we can also determine the distance a particular observation is from the mean
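
The CH 6 steps for mean deviation, variance, and standard deviation, plus the CH 7 z-score, can be sketched in Python using the two-student example (grades 60 and 80, average 70). This is a minimal sketch, not the textbook's own code: the function names are made up for illustration, the variance divides by N as in the notes (not N − 1), and the z-score uses the standard definition z = (x − mean) / s, which the notes name but do not write out.

```python
import math


def mean(values):
    # Step 1 in both procedures: calculate the average.
    return sum(values) / len(values)


def mean_deviation(values):
    # Sum of |x - mean| divided by the number of observations.
    m = mean(values)
    return sum(abs(x - m) for x in values) / len(values)


def variance(values):
    # Sum of squared deviations divided by N (as in the notes, not N - 1).
    m = mean(values)
    return sum((x - m) ** 2 for x in values) / len(values)


def standard_deviation(values):
    # Square root of the variance.
    return math.sqrt(variance(values))


def z_score(x, values):
    # Assumed standard definition: distance from the mean in SD units.
    return (x - mean(values)) / standard_deviation(values)


grades = [60, 80]  # Student A and Student B from the CH 6 example
print(mean(grades))                # 70.0
print(mean_deviation(grades))      # 10.0
print(variance(grades))            # 100.0
print(standard_deviation(grades))  # 10.0
print(z_score(60, grades))         # -1.0
print(z_score(80, grades))         # 1.0
```

Student A sits one standard deviation below the mean and Student B one above, which is the kind of comparison the z-score is meant to make possible.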