# COMPSCI C8 Lecture 11: Week 11 Study Guide (Lecture & Textbook Notes)

- School: University of California - Berkeley
- Department: Computer Science
- Course: COMPSCI C8
- Professor: John DeNero
- Semester: Spring

## Mon 43 [28]: Correlation

Reading: 13, 13.1

- Prediction
- Correlation

New Methods Today: correlation

Lecture:

- Prediction Problems
  - For some sample, we know all the characteristics.
- Relation Between Two Variables
  - Before: how does one variable vary across a sample? Now we talk about TWO variables.
- Association and Trend
  - Positive association
  - Negative association
- Pattern: any discernible shape
  - Linear (points are near a single straight line)
  - Nonlinear (scatterplot looks like a curve, etc.)
- What the process should be: visualize, then quantify (what is the slope of the line? what kind of shape do you see?)
- Demo: Hybrid Cars Scatterplot
  - Is there an association between acceleration and MSRP? Yes. It's almost linear (unclear, but almost so).
  - Is there an association between MPG and MSRP? Yes. It doesn't look very linear, but we can say it's a negative association. It looks like an L shape.
- Standard Units: a common scale for describing all the values in some collection
  - A scatterplot of msrp against standard_units(msrp) gives you a STRAIGHT line.
- Correlation Coefficient r
  - Measures linear association
  - Based on standard units
  - -1 ≤ r ≤ 1
  - r = 1: scatter is a perfect straight line sloping up
  - r = -1: scatter is a perfect straight line sloping down
  - r = 0: no linear association; uncorrelated
- Demo: function r_scatter, which makes up data from 2 different normal distributions
- Correlation coefficient formula: convert both x and y into standard units, multiply them pair by pair, and then average the products.

## Wed 45 [29]: Linear Regression

Reading: 13.2, 13.3

- Regression Line
  - In general, individuals who are away from average on one variable are expected to be not quite as far away from average on the other. This is called the regression effect.
- Method of Least Squares

New Methods Today:

Lecture:

- Review of Correlation Coefficient r
  - Measures linear association
  - Based on standard units
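
The recipe for r in these notes (convert both variables to standard units, multiply, average) can be sketched in a few lines of Python. This is a minimal illustration that uses only the standard library, not the course's own code: the function names `standard_units` and `correlation` and the sample lists are assumptions made for this example, and the SD is taken to be the population SD.

```python
from statistics import fmean, pstdev


def standard_units(values):
    """Convert each value to standard units: (value - mean) / SD.

    Assumption for this sketch: SD is the population SD (pstdev).
    """
    mean = fmean(values)
    sd = pstdev(values)
    return [(v - mean) / sd for v in values]


def correlation(x, y):
    """Return r: the average of the products of x and y in standard units."""
    x_su = standard_units(x)
    y_su = standard_units(y)
    products = [a * b for a, b in zip(x_su, y_su)]
    return fmean(products)


# Made-up data for illustration only.
x = [1, 2, 3, 4, 5]
y_up = [2, 4, 6, 8, 10]    # perfect straight line sloping up
y_down = [10, 8, 6, 4, 2]  # perfect straight line sloping down

print(correlation(x, y_up))    # close to 1
print(correlation(x, y_down))  # close to -1
```

Because r is built from standard units, changing the units of x or y (for example, dollars to thousands of dollars) leaves r unchanged.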
