STAT 2020 Lecture Notes - Lecture 3: Standard Deviation, Scatter Plot
Document Summary
If the data are skewed, report percentiles rather than the standard deviation. Modified box plots can help show outliers. When calculating the standard deviation, x̄ (the sample mean) stays constant throughout the calculation.

Correlation is used when both variables are quantitative. Bivariate (paired) data = data analyzed to find an association between two variables. Correlation = a measure of the strength and direction of a linear association between two variables. An association between two variables can be investigated with a scatter plot. Strength is determined by how closely the data points in the scatter plot follow the line of best fit. Perfect positive correlation: r = 1. Perfect negative correlation: r = -1. Linear correlation coefficient (r) = a measure of the strength of a linear association in a sample. Correlation coefficient of a population = ρ (rho). Requirements for making assumptions about ρ using r. RStudio command to determine the correlation coefficient of a sample:
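
To make the definition of r concrete, here is a minimal Python sketch that computes the sample correlation coefficient by hand. It is an illustration only, separate from the RStudio command mentioned in these notes. It assumes the standard Pearson formula with sample standard deviations (divisor n - 1). The function name sample_correlation and the data sets hours, rising and falling are hypothetical and chosen for the example.

```python
import statistics


def sample_correlation(xs, ys):
    """Pearson's r for paired (bivariate) data, computed from its definition."""
    if len(xs) != len(ys):
        raise ValueError("paired data need the same number of x and y values")
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two pairs")

    # The sample means are computed once and stay fixed for every deviation.
    x_bar = statistics.mean(xs)
    y_bar = statistics.mean(ys)

    # Sample standard deviations (divisor n - 1).
    s_x = statistics.stdev(xs)
    s_y = statistics.stdev(ys)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when one variable does not vary")

    # Sum of the products of paired deviations from the means.
    cross = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return cross / ((n - 1) * s_x * s_y)


# Hypothetical example data.
hours = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # points fall exactly on an upward line
falling = [10, 8, 6, 4, 2]   # points fall exactly on a downward line

r_up = sample_correlation(hours, rising)      # approximately 1
r_down = sample_correlation(hours, falling)   # approximately -1
```

With points lying exactly on a line, r comes out at (up to rounding) 1 or -1, depending on the direction of the line. Data that only loosely follow the line of best fit give values of r closer to 0.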