STAT 1400 Lecture 15: 4.18 Stat Notes (Ch. 9)

University of Missouri - Columbia
STAT 1400
Margaret Bryan

Stat 1400 4.18.2017 9:30 am
Correlation and Linear Regression

How can we investigate whether two variables are associated with one another?
• Is there a relationship between cardiac mortality and the consumption of wine?
  o Do a study!
• Is there an association between marital relationships and health problems?
  o Study!

Bivariate data
• For each individual studied, we record data on TWO variables.
• We then examine whether there is a relationship between these two variables: do changes in one variable tend to be associated with specific changes in the other variable?

Scatterplots
• A scatterplot is used to display quantitative bivariate data.
• Each variable makes up one axis.
  o Each individual is a point on the graph.

Explanatory and response variables
• A response (dependent) variable measures an outcome of a study.
  o Example: Weight is dependent on Age
• An explanatory (independent) variable may explain or influence changes in a response variable.
  o Example: Gestational Age influences the Birth Weight
  o When there is an obvious explanatory variable, it is plotted on the x (horizontal) axis of the scatterplot.

How to scale a scatterplot
• Both variables should be given a similar amount of space:
  o Plot is roughly square
  o Points should fill the plot area, leaving little blank space
    ▪ *colored box is around best graph

Interpreting Scatterplots
• After plotting two variables on a scatterplot, we describe the overall pattern of the relationship. Specifically, we look for …
  o Form: linear, curved, clusters, no pattern, etc.
    ▪ The example above is linear
  o Direction: positive, negative, no direction (is y increasing or decreasing as x increases?)
    ▪ The example above is negative
  o Strength: how closely the points fit the “form” (how scattered are they?)
    ▪ The example above is relatively weak, with considerable variation
• … and clear deviations from that pattern
  o Outliers of the relationship
• Example:
  [Scatterplot: manatee deaths from powerboat collisions (y-axis) vs. powerboats registered, in thousands (x-axis)]
  o Form: Linear
  o Direction: Positive
  o Strength: Moderate
  o No outliers
• Example:
  o Form: Linear? Cluster? We need more data
  o Direction: Negative
  o Strength: Weak
  o Outlier around (17, 120)

Adding categorical variables to scatterplots
• Two or more relationships can be compared on a single scatterplot when we use different symbols for groups of points on the graph.
• To add a categorical variable, use a different plot color or symbol for each category.
• Consider the relationship between mean SAT verbal score and percent of high-school grads taking the SAT for each state
  o Orange dots are southern states and blue dots are northern states
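
Direction and strength can also be checked with a number, and a categorical variable can be handled by splitting the points into one series per category. The Python sketch below is one possible way to do both. It uses the correlation coefficient r, which the lecture title points to but which these notes do not define, so treat it as a preview, not as the method from class. The function names, the 0.3 and 0.7 strength cutoffs, and the data are all made up for illustration.

```python
from collections import defaultdict
from math import sqrt


def pearson_r(xs, ys):
    """Sample correlation coefficient r for paired quantitative data."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when one variable never changes")
    return sxy / sqrt(sxx * syy)


def describe(r, weak=0.3, strong=0.7):
    """Turn r into (direction, strength) labels.

    The cutoffs are an assumption for this sketch, not a standard.
    """
    if r > 0:
        direction = "positive"
    elif r < 0:
        direction = "negative"
    else:
        direction = "no direction"
    size = abs(r)
    if size < weak:
        strength = "weak"
    elif size < strong:
        strength = "moderate"
    else:
        strength = "strong"
    return direction, strength


def group_by_category(rows):
    """Split (category, x, y) rows into one (xs, ys) series per category."""
    groups = defaultdict(lambda: ([], []))
    for category, x, y in rows:
        groups[category][0].append(x)
        groups[category][1].append(y)
    return dict(groups)


# Hypothetical data: (region, explanatory x, response y). Not from the lecture.
rows = [
    ("south", 1, 10), ("south", 2, 8), ("south", 3, 7), ("south", 4, 3),
    ("north", 1, 2), ("north", 2, 4), ("north", 3, 5), ("north", 4, 9),
]

summary = {}
for category, (xs, ys) in group_by_category(rows).items():
    summary[category] = describe(pearson_r(xs, ys))
    print(category, summary[category])
```

Each series returned by group_by_category is what would be drawn with its own color or symbol on the scatterplot. Note that r only measures how closely the points follow a straight line, so the form (linear, curved, clusters) and any outliers still have to be judged by looking at the plot.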