STATS 250 Lecture 4: STATS 280 NOTES 9:20
STATS 280 NOTES 9/20/2016
To convert from standard units (SU) to a percentile, you must either
o Assume the histogram follows a normal curve, and then
Use a normal table
Or use the R command pnorm (its inverse, qnorm, converts a percentile back to SU)
o If you cannot assume a normal curve, do everything manually. As an intermediate
step, you’ll need to convert to the original score (original score = mean + SU * SD)
You will need to know the SD and mean
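The two routes above can be sketched in Python. This is only an illustration: statistics.NormalDist stands in for R's pnorm and qnorm, the function names are made up for this example, the SD is assumed to be the population SD, and "percentile" is taken to mean the percent of values at or below a score.

```python
from statistics import NormalDist, fmean, pstdev

# Standard normal curve: mean 0, SD 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def su_to_percentile(z):
    """Percent of a normal histogram below z standard units (R: 100 * pnorm(z))."""
    return 100.0 * standard_normal.cdf(z)

def percentile_to_su(pct):
    """Standard units that have pct percent of a normal histogram below them (R: qnorm(pct / 100))."""
    return standard_normal.inv_cdf(pct / 100.0)

def su_to_original(z, mean, sd):
    """Intermediate step for the manual route: undo the standard-unit conversion."""
    return mean + z * sd

def percentile_from_data(z, data):
    """Manual route (no normal assumption): convert to the original score,
    then count the share of the data at or below that score."""
    score = su_to_original(z, fmean(data), pstdev(data))  # population SD is an assumption
    at_or_below = sum(1 for x in data if x <= score)
    return 100.0 * at_or_below / len(data)

print(round(su_to_percentile(1.0), 1))  # about 84.1
```

For a list that is far from normal, the two routes can give noticeably different percentiles for the same SU value, which is why the normal assumption matters.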
Summarizing the two data sets
o Overlay two histograms
o Create two box plots
Use these to compare a quantitative set across a qualitative set (groups)
o Scatterplot
Use this to compare two quantitative sets
Correlation
o Suppose you have two lists of numbers of the same length: X1, X2, …, Xn and Y1,
Y2, …, Yn
o Correlation is defined as the average of the products of X in standard units and Y in standard units
X and Y are lists
Correlation = Average((X in SU) * (Y in SU))
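The definition above can be tried out directly. The following is a minimal sketch, assuming the SD is the population SD (divide by n); the function names and the sample lists are invented for illustration.

```python
from statistics import fmean, pstdev

def standard_units(values):
    """Convert a list to standard units: (value - average) / SD."""
    average = fmean(values)
    sd = pstdev(values)  # population SD (assumption); a constant list has SD 0 and cannot be converted
    return [(v - average) / sd for v in values]

def correlation(xs, ys):
    """Average of (X in SU) * (Y in SU), taken pair by pair."""
    if len(xs) != len(ys):
        raise ValueError("X and Y must be lists of the same length")
    products = [a * b for a, b in zip(standard_units(xs), standard_units(ys))]
    return fmean(products)

# Hypothetical data: Y is exactly 2 * X, so the correlation is 1 up to rounding error.
print(correlation([1, 2, 3, 4, 5], [2, 4, 6, 8, 10]))
```

Reversing the order of the Y list in this example gives a correlation of -1, again up to rounding error.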
Covariance
o You can factor out the SDs in the previous equation because they are constant for every
value:
Correlation = Average((X - average(X)) * (Y - average(Y))) / (SD(X) * SD(Y))
o The numerator is known as the covariance of X and Y, written Cov(X, Y)
o You can think of correlation as a “normalized” version of covariance
o Covariance is equal to average(X*Y) – average(X)*average(Y)
o The variance of X is the covariance of X with itself
Var(X) = Cov(X, X)
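The covariance facts above can be checked numerically. This sketch assumes population (divide by n) versions of the SD and variance; the helper names and the sample lists are hypothetical.

```python
from statistics import fmean, pstdev, pvariance

def covariance(xs, ys):
    """Cov(X, Y) = average(X*Y) - average(X)*average(Y)."""
    return fmean([x * y for x, y in zip(xs, ys)]) - fmean(xs) * fmean(ys)

def correlation_from_covariance(xs, ys):
    """Correlation as "normalized" covariance: Cov(X, Y) / (SD(X) * SD(Y))."""
    return covariance(xs, ys) / (pstdev(xs) * pstdev(ys))

# Hypothetical sample lists.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

print(covariance(xs, ys))                   # covariance of X and Y
print(covariance(xs, xs), pvariance(xs))    # Cov(X, X) agrees with Var(X)
print(correlation_from_covariance(xs, ys))  # covariance divided by both SDs
```

The shortcut formula average(X*Y) - average(X)*average(Y) is convenient by hand, but it can lose precision in floating point when the averages are large relative to the spread.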
Why Correlation?
o Correlation is a measure of the strength of linear association between X and Y
o Correlation is unaffected by changes of scale
If you add a constant to every X number, the correlation won’t change. If
you multiply every X number by a positive constant, the correlation won’t change
(multiplying by a negative constant only flips its sign)
o Correlation is always between -1 and 1
A correlation of 1 means perfect positive (linear) association
A correlation of -1 means perfect negative (linear) association
A correlation of 0 means no (linear) association
o The value of the correlation doesn’t have an easy interpretation
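The change-of-scale property can be checked on a small example. This is a sketch with invented data and a hypothetical correlation helper, using the population SD. One caveat it makes visible: a negative multiplier keeps the size of the correlation but flips its sign.

```python
from statistics import fmean, pstdev

def correlation(xs, ys):
    """Average of (X in SU) * (Y in SU), using the population SD (assumption)."""
    mean_x, mean_y = fmean(xs), fmean(ys)
    sd_x, sd_y = pstdev(xs), pstdev(ys)
    return fmean([((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
                  for x, y in zip(xs, ys)])

# Hypothetical sample lists.
xs = [1, 2, 3, 4, 5]
ys = [2, 1, 4, 3, 5]

r = correlation(xs, ys)
r_shifted = correlation([x + 100 for x in xs], ys)   # add a constant to every X
r_scaled = correlation([2.54 * x for x in xs], ys)   # multiply by a positive constant
r_flipped = correlation([-3 * x for x in xs], ys)    # multiply by a negative constant

print(r, r_shifted, r_scaled, r_flipped)
```

The first three values agree up to rounding error, and the last one is the same number with the opposite sign.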