# Chapter 2 Looking at Data Relationships Exam Review

University of Toronto Scarborough

Statistics

STAB22H3

Moras

Fall

Description

Chapter 2: Looking at Data Relationships

- **Associated**: a term used to describe a relationship between two variables (e.g. breed and life span).
- When examining relationships, ask:
  - What individuals or cases do the data describe?
  - What variables are present? How are they measured?
  - Which variables are quantitative and which are categorical? (Example on p. 85.)
- **Response variable**: measures an outcome of a study.
- **Explanatory variable**: explains or causes changes in the response variable.
  - Many explanatory-response relationships do not involve direct causation. For example, the SAT scores of high school students help predict future college grades, but high SAT scores do not CAUSE high college grades.
- Independent variables are also called explanatory variables; dependent variables are also called response variables. Response variables rely on explanatory variables.

## 2.1 Scatterplots

- A scatterplot shows the relationship between two quantitative variables measured on the same individuals.
- The explanatory variable goes on the x axis and is called x. If there is no explanatory variable, either variable can go on either axis.
- The response variable goes on the y axis and is called y.

Interpreting scatterplots:

- Look for the overall pattern and for deviations from that pattern. An outlier, for example, falls outside the overall pattern of the relationship.
- Describe the overall pattern by the form, direction, and strength of the relationship.
- Form, e.g. clusters: Fig. 2.1 (p. 87) has two clusters. Clusters are groups of points on the graph. They suggest that the data describe several distinct kinds of individuals.
- Two variables are **positively associated** when above-average values of one tend to accompany above-average values of the other, and below-average values also tend to occur together.
- Two variables are **negatively associated** when above-average values of one tend to accompany below-average values of the other, and vice versa.
- **Linear relationship**: the points roughly follow a straight line.
- The strength of a relationship is determined by how closely the points follow a clear form.
- To add a categorical variable to a scatterplot, use a different plot colour or symbol for each category.
- **Smoothing**: systematic methods of extracting the overall pattern are helpful. They use resistant calculations, so they are not affected by outliers in the plot.
- To display a relationship between a categorical explanatory variable and a quantitative response variable, make a side-by-side comparison of the distributions of the response for each category.

## 2.2 Correlation

- Data analysis supplements the graph with a numerical measure, since our eyes are not good judges of how strong a relationship is.
- The correlation r is positive when there is a positive association between the variables. For example, height and weight have a positive correlation.
- Correlation measures the direction and strength of the linear relationship between two quantitative variables. It is usually written as r.
- Suppose we have data on variables x and y for n individuals. The means and standard deviations of the two variables are $\bar{x}$ and $s_x$ for the x values, and $\bar{y}$ and $s_y$ for the y values. The correlation r between x and y is

$$r = \frac{1}{n-1} \sum_{i=1}^{n} \left(\frac{x_i - \bar{x}}{s_x}\right)\left(\frac{y_i - \bar{y}}{s_y}\right)$$
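
The correlation formula in the notes above can be sketched in plain Python. This is only a minimal illustration, not the textbook's own code: the function names (`mean`, `sample_sd`, `correlation`) and the height/weight numbers are made up for the example, and the standard deviations are assumed to use the n - 1 divisor, matching the formula.

```python
from math import sqrt


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def sample_sd(values):
    """Sample standard deviation (divides by n - 1)."""
    m = mean(values)
    return sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))


def correlation(xs, ys):
    """Correlation r: average product of standardized x and y values."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with n >= 2")
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = sample_sd(xs), sample_sd(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when a variable has no spread")
    # Sum the products of the standardized values, then divide by n - 1.
    total = sum(((x - x_bar) / s_x) * ((y - y_bar) / s_y)
                for x, y in zip(xs, ys))
    return total / (n - 1)


# Hypothetical data: heights (cm) and weights (kg) for five individuals.
heights_cm = [150, 160, 170, 180, 190]
weights_kg = [52, 58, 69, 75, 88]

r = correlation(heights_cm, weights_kg)
print(round(r, 3))  # positive: taller individuals here tend to be heavier
```

Because each value is standardized before multiplying, r has no units and does not change if heights are recorded in inches instead of centimetres. Points that fall exactly on an upward-sloping line give r = 1, and points exactly on a downward-sloping line give r = -1.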
