Midterm

Cheat sheet for midterm.doc


Department: Statistics
Course: STAB22H3
Professor: unknown
Semester: Fall

Description
Chapter 1

Cases: the objects described by a set of data (customers, companies, subjects in a study).
Label: a special variable used in some data sets to distinguish the different cases.
Variable: a characteristic of a case.
Categorical variable: places a case into one of several groups or categories. Graphs: bar graphs, pie charts.
Quantitative variable: takes numerical values for which arithmetic operations make sense. Graphs: stemplots, histograms, boxplots.
Distribution of a variable: tells what values it takes and how often it takes these values.
Distribution of a categorical variable: lists the categories and gives either the count or the percent of cases that fall in each category.
Histogram: the vertical axis can show frequency, percent (relative frequency), or density. Describe the overall pattern by shape (e.g. symmetric), centre (midpoint), and spread, and note any outliers.
Outlier: an individual value that falls outside the overall pattern.
Mean: $\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$
Median M: order the observations from smallest to largest. If n is odd, M is the centre observation, at position (n+1)/2. If n is even, M is the mean of the two centre observations (these two observations then count in the halves used for Q1 and Q3). The median is the 50th percentile.
pth percentile: the value such that p percent of the observations fall at or below it.
First quartile Q1: the median of the observations whose position in the ordered list is to the left of the location of the overall median.
Third quartile Q3: the median of the observations whose position in the ordered list is to the right of the location of the overall median.
Five-number summary: Minimum, Q1, M, Q3, Maximum.
Boxplot: a graph of the five-number summary.
IQR = Q3 − Q1
1.5 × IQR rule for outliers: an observation is a suspected outlier if it falls more than 1.5 × IQR below Q1 or above Q3.
Example: Q3 = 87, Q1 = 52, so IQR = 87 − 52 = 35 and 1.5 × 35 = 52.5. Upper limit = 87 + 52.5 = 139.5; lower limit = 52 − 52.5 = −0.5.
Modified boxplot: suspected outliers are identified individually.
Variance: the variance s^2 of a set of observations is the sum of the squared deviations of the observations from their mean, divided by n − 1: $s^2 = \frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}$
Standard deviation s: the square root of the variance; measures spread about the mean. s = 0 when there is no spread; otherwise s > 0.
Density curve: always on or above the horizontal axis, with area exactly 1 underneath it.
Mode: the location where the curve is highest (peak point).
Notation: the mean of an idealized distribution is written μ (mu) and its standard deviation σ (sigma).
The 68-95-99.7 rule:
− Approx. 68% of the observations fall within σ of the mean μ
− Approx. 95% of the observations fall within 2σ of μ
− Approx. 99.7% of the observations fall within 3σ of μ
z-score: $z = \frac{x - \mu}{\sigma}$
Standard Normal distribution: if X is N(μ, σ), then $Z = \frac{X - \mu}{\sigma}$ is N(0, 1), with mean 0 and standard deviation 1.
Example (area above a value): for N(22, 0.7), the proportion of observations greater than 24 is found from z = (24 − 22)/0.7 = 2.86. The table gives 0.9979 as the area to the left, so the area to the right is 1 − 0.9979 = 0.0021, or 0.21%.
Bimodal distribution: two peaks.

Chapter 2

Response variable: measures the outcome of a study.
Explanatory (independent, x) variable: explains or causes changes in the response variable; the variable you can manipulate.
Scatterplot: shows the relationship between two quantitative variables. Describe the overall pattern by form (e.g. linear), direction (e.g. positive), and strength (weak, moderate, strong).
Positive association: above-average values of one variable tend to accompany above-average values of the other, and below-average values also tend to occur together.
Negative association: above-average values of one variable tend to accompany below-average values of the other, and vice versa.
Correlation r: measures the direction and strength of the linear relationship between two quantitative variables.
− r is always a number between −1 and 1 and measures the strength of the linear relationship between two variables
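The Chapter 1 formulas can be checked numerically. The Python sketch below is only an illustration: the function names are hypothetical, the sample data are made up, and the quartiles follow the median-of-halves rule described in these notes (statistical software often uses other quartile conventions, so its output may differ slightly). It reuses the two worked examples from the notes: Q1 = 52, Q3 = 87 for the 1.5 × IQR rule, and N(22, 0.7) with the cut-off 24 for the Normal calculation.

```python
from statistics import NormalDist, median


def five_number_summary(data):
    """Return (min, Q1, M, Q3, max) using the median-of-halves rule."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # observations left of the median's location
    upper = xs[half + (n % 2):]  # observations right of the median's location
    return xs[0], median(lower), median(xs), median(upper), xs[-1]


def iqr_limits(q1, q3):
    """Return the (lower, upper) limits of the 1.5 x IQR rule."""
    step = 1.5 * (q3 - q1)
    return q1 - step, q3 + step


def sample_variance(data):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(data)
    xbar = sum(data) / n
    return sum((x - xbar) ** 2 for x in data) / (n - 1)


def z_score(x, mu, sigma):
    """Standardize x relative to a distribution with mean mu and sd sigma."""
    return (x - mu) / sigma


# Worked example from the notes: Q1 = 52, Q3 = 87.
print(iqr_limits(52, 87))  # (-0.5, 139.5)

# Made-up data with one large value, to show the rule flagging it.
data = [10, 12, 13, 15, 16, 18, 19, 21, 60]
minimum, q1, m, q3, maximum = five_number_summary(data)
low, high = iqr_limits(q1, q3)
suspected_outliers = [x for x in data if x < low or x > high]
print(suspected_outliers)  # [60]

# Worked example from the notes: area above 24 under N(22, 0.7).
z = z_score(24, 22, 0.7)
area_above = 1 - NormalDist().cdf(z)  # area to the right of z under N(0, 1)
print(round(z, 2), round(area_above, 4))
```

The Normal calculation here uses the unrounded z, whereas a printed table needs z rounded to two decimals first, so the two answers can differ in the last digit.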