# FEB03.pdf

University of Toronto St. George

Political Science

POL101Y1

Kenichi Ariga

Winter

amyc
pol322
feb.03.2014
2. Center of Data
• Mode
• Median
• Mean
• Median:
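◦ definition: order the observations from smallest value to largest value. The middle value is the median
◦ if even number of observations, then add the two middle numbers and divide by 2
◦ if odd number of observations, then the median is the single middle value (in practice, compute via MS Excel)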
• Mean:
◦ avg. value
◦ $\bar{y} = \frac{1}{N} \sum_{i=1}^{N} y_i$
◦ N = # of observations
◦ y = the variable being measured
◦ i = index of the observation
◦ $y_i$ = value of observation i
◦ $\sum$ = sum of
◦ example: (1/11)(48 + 58 + 52 + ...) ≈ 52.09 (approx.)
◦ the mean is sensitive: the mean may differ if you include/exclude outliers
◦ outliers are extreme values. Outliers pull the mean toward themselves: to the right if they are large values, to the left if they are small values
◦ example: if you exclude outliers on the left, the mean is larger
◦ example: if you exclude outliers on the right, the mean is smaller
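
A minimal sketch of the median and mean definitions above, in Python. The scores are hypothetical (they are not the lecture's data set), and the helper names median_of and mean_of are made up for illustration.

```python
def median_of(values):
    """Order the values, then take the middle one (or average the two middle ones)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        # odd number of observations: the single middle value
        return ordered[mid]
    # even number of observations: add the two middle values and divide by 2
    return (ordered[mid - 1] + ordered[mid]) / 2


def mean_of(values):
    """Sum of all observations divided by N."""
    return sum(values) / len(values)


# Hypothetical scores, chosen only for illustration.
scores = [48, 58, 52, 50, 55]
print(median_of(scores))         # middle of the ordered list
print(median_of(scores + [60]))  # even count: average of the two middle values
print(mean_of(scores))

# One extreme value on the right pulls the mean up far more than the median.
with_outlier = scores + [150]
print(mean_of(with_outlier), median_of(with_outlier))
```

Dropping the 150 again (excluding an outlier on the right) brings the mean back down, which matches the last example in the list.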
3. Variability of Data
• Why?
◦ the spread of the distribution is greater in b) than in a), even though both have the same mean, so the mean alone does not describe the data
• Variance and Standard Deviation
◦ $S^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^2$
◦ definition: sum of squared differences between each observation's value and the mean ($\bar{y}$), divided by the number of observations minus one
◦ $y_i - \bar{y}$ = difference between each observation and the mean
◦ $(y_i - \bar{y})^2$ = square of that difference
◦ $\sum_{i=1}^{N} (y_i - \bar{y})^2$ = sum of squared differences for all observations
◦ $\frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{y})^2$ = that sum divided by the number of observations minus one
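
The four steps in the variance definition can be traced one at a time. This is only a sketch: the values are hypothetical and sample_variance is an illustrative name, not something from the lecture.

```python
def sample_variance(values):
    n = len(values)
    y_bar = sum(values) / n                   # the mean
    deviations = [y - y_bar for y in values]  # difference between each observation and the mean
    squared = [d ** 2 for d in deviations]    # square of each difference
    total = sum(squared)                      # sum of squared differences
    return total / (n - 1)                    # divide by number of observations minus one


# Hypothetical values, chosen so the arithmetic is easy to follow by hand:
# mean = 5, differences = -3, -1, -1, 1, 4, squares = 9, 1, 1, 1, 16, sum = 28
values = [2, 4, 4, 6, 9]
print(sample_variance(values))  # 28 / 4
```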
• Standard Deviation
◦ square root of the variance
◦ $S = \sqrt{S^2}$
◦ standard deviation ≈ avg. distance between each observation's value ($y_i$) and the mean ($\bar{y}$)
◦ we want to show the variability of a variable's values around its mean
◦ the avg. distance between each observation and the mean would be nice, but if we compute it with signed differences, it always equals 0, because the positive and negative differences cancel out!
◦ thus, take the square of each distance to change everything into +ve values, average the squares, then take the square root of that avg. to express it in the original unit of the variable
◦ we use N-1 instead of N because N-1 is more appropriate for statistical inference
◦ so, going back to a) and b), we can now describe the EXTENT of the 2 spreads:
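
A rough illustration of the points above, assuming two made-up data sets a and b that share a mean but differ in spread. They stand in for the lecture's distributions a) and b), whose actual values are not in these notes.

```python
import math


def mean_of(values):
    return sum(values) / len(values)


def sample_sd(values):
    """Square root of the variance, using N-1 in the denominator."""
    y_bar = mean_of(values)
    squared = [(y - y_bar) ** 2 for y in values]
    return math.sqrt(sum(squared) / (len(values) - 1))


# Hypothetical stand-ins for distributions a) and b): same mean, different spread.
a = [4, 5, 5, 5, 6]
b = [1, 3, 5, 7, 9]

# The signed differences from the mean cancel out, so their sum (and average) is 0.
signed_a = [y - mean_of(a) for y in a]
print(sum(signed_a))

print(mean_of(a), mean_of(b))      # identical means
print(sample_sd(a), sample_sd(b))  # b has the larger standard deviation
```

The two means are equal, so only the standard deviations show that b is more spread out than a.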
• Percentile, Quartile, IQR
◦ percentile: the pth percentile is a value such that p percent of the observations fall below it
◦ the median is the 50th percentile
◦ ex: the 10th percentile here is 32.3
◦ lower quartile: the 25th percentile. ¼ of the data fall below the lower quartile
◦ ex: figure 2
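
A small sketch of quartiles and the IQR using Python's standard statistics module. The data are hypothetical, and the IQR is taken here as the upper quartile minus the lower quartile. Software packages use slightly different interpolation rules for percentiles, so other tools (including MS Excel) may give slightly different cut points for the same data.

```python
from statistics import median, quantiles

# Hypothetical data: 11 evenly spaced values, not the lecture's data set.
data = [12, 15, 18, 21, 24, 27, 30, 33, 36, 39, 42]

# n=4 splits the data into four groups and returns the three cut points:
# lower quartile (25th percentile), median (50th), upper quartile (75th).
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1  # spread of the middle half of the data

print(q1, q2, q3, iqr)
print(q2 == median(data))  # the 50th percentile is the median

# n=10 returns the 10th, 20th, ..., 90th percentiles; the first is the 10th.
tenth = quantiles(data, n=10)[0]
print(tenth)
```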
• Box Plot
◦ a type of graph
• Box Plot: Outliers
