# INFO 2020 Lecture 4: Week 4-6 Notes

7 Feb 2017
Part 2 (must be a spreadsheet)
- Organized and easy to understand
- Scatter plots: DV on the left (vertical) axis, IV on the bottom (horizontal) axis
- Observations from the correlation matrix highlighted and noted
- It doesn't matter if the correlations are bad; they just need to be interpreted correctly
Correlations
Strong correlations between the IVs and the DV are good
Strong correlations among the IVs themselves are not necessarily good (possible multicollinearity)
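A minimal sketch of building the correlation matrix in Python rather than a spreadsheet. The column names (`price`, `sqft`, `age`) and all values are made up for illustration and are not course data; `price` plays the DV and the other two are IVs.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(columns):
    """Return {(name_a, name_b): r} for every pair of columns."""
    names = list(columns)
    return {(a, b): pearson(columns[a], columns[b]) for a in names for b in names}

# Hypothetical data: price is the DV, sqft and age are IVs.
data = {
    "price": [200, 250, 310, 330, 400],
    "sqft": [1000, 1300, 1500, 1700, 2100],
    "age": [30, 22, 18, 12, 5],
}
matrix = correlation_matrix(data)
for (a, b), r in matrix.items():
    if a < b:  # print each pair only once
        print(f"{a} vs {b}: r = {r:.2f}")
```

In this made-up data the IV-to-DV pairs are strongly correlated, but so is the `sqft` vs `age` pair, which is the IV-to-IV situation the notes warn about.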
Diagnostics
Standardized residuals should mostly fall between -2 and 2 if the model is good
Would like residuals to be independent
Residuals should be identically distributed
o Normally distributed
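The -2 to 2 check can be sketched in plain Python for a one-IV model. This assumes each residual is divided by the residual standard error with n - 2 degrees of freedom; spreadsheet tools may scale slightly differently. The function names and the data are hypothetical.

```python
import math

def fit_line(x, y):
    """Least-squares slope and intercept for one independent variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def standardized_residuals(x, y):
    """Residuals divided by the residual standard error (n - 2 df, an assumption)."""
    slope, intercept = fit_line(x, y)
    residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
    s = math.sqrt(sum(e ** 2 for e in residuals) / (len(x) - 2))
    return [e / s for e in residuals]

def flag_outside(std_resid, limit=2.0):
    """Indices of observations whose standardized residual is outside +/- limit."""
    return [i for i, z in enumerate(std_resid) if abs(z) > limit]

# Hypothetical data: y = 2x + 1, except the sixth point is pushed up by 20.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [3, 5, 7, 9, 11, 33, 15, 17, 19, 21]
print(flag_outside(standardized_residuals(x, y)))  # -> [5]
```

A flagged index is a prompt to look at that observation, not proof that the model is bad.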
What to do with New Data
1. Data cleaning
2. Exploration
o Descriptive stats
o Histograms
o Correlations
o Scatter plots
3. Regression
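The first two steps can be sketched with the Python standard library. The data are made up, and dropping missing values is only one possible cleaning policy; correlations, scatter plots, and the regression itself are not covered in this sketch.

```python
import statistics
from collections import Counter

# Hypothetical raw column with missing entries (None).
raw = [12.0, 15.5, None, 14.2, 13.8, None, 18.1, 16.4]

# 1. Data cleaning: drop missing values (an assumed policy).
clean = [v for v in raw if v is not None]

# 2. Exploration: descriptive statistics.
summary = {
    "n": len(clean),
    "mean": statistics.mean(clean),
    "median": statistics.median(clean),
    "stdev": statistics.stdev(clean),  # sample standard deviation
    "min": min(clean),
    "max": max(clean),
}
for name, value in summary.items():
    print(f"{name}: {value}")

# 2. Exploration: a text histogram with bins of width 2.
bin_width = 2
bins = Counter(int(v // bin_width) * bin_width for v in clean)
for start in sorted(bins):
    print(f"{start:>3} to {start + bin_width:<3} {'#' * bins[start]}")
```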
Correlation Cut-Offs
What is good? What is bad?
o Depends on industry
o Strong: roughly .7 or more (in absolute value)
o No correlation: .1 or less
o Average: between .1 and .7
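These rough cut-offs can be written as a small helper. The function name is hypothetical, and ignoring the sign (so that -0.8 counts as strong) is an assumption; as the notes say, the thresholds depend on the industry.

```python
def correlation_strength(r, strong=0.7, none=0.1):
    """Label a correlation coefficient using the rough cut-offs from the notes."""
    magnitude = abs(r)  # assumption: only the strength matters, not the sign
    if magnitude >= strong:
        return "strong"
    if magnitude <= none:
        return "no correlation"
    return "average"

for r in (0.85, -0.75, 0.4, 0.05):
    print(r, correlation_strength(r))
```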
Coefficient of determination (R²)
Measures the predictive power of the model
Example: with R² = 0.77, I can predict 77% of the change in my dependent variable based on changes in my independent variable
Model
o 77% is predicted, 23% is error
Is Significance F good? It should be < 0.05 (it is the p-value for the overall model)
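A minimal sketch of where R² and the F statistic come from for a one-IV model, using made-up data and a hypothetical function name. Significance F is the p-value of this F statistic; converting F to a p-value needs the F distribution, which the Python standard library does not provide, so that step is left out here (a statistics package such as SciPy offers it, for example `scipy.stats.f.sf`).

```python
def regression_fit_stats(x, y):
    """R-squared and F statistic for a one-IV least-squares line."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Total variation in the DV, and the part the line leaves unexplained.
    sst = sum((b - mean_y) ** 2 for b in y)
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ssr = sst - sse  # variation explained by the model

    r_squared = ssr / sst
    k = 1  # number of independent variables
    f_stat = (ssr / k) / (sse / (n - k - 1))
    return r_squared, f_stat

# Hypothetical data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r_squared, f_stat = regression_fit_stats(x, y)
print(f"R-squared = {r_squared:.2f}: {r_squared:.0%} predicted, {1 - r_squared:.0%} error")
print(f"F = {f_stat:.2f}")
```

For this data the split is 60% predicted and 40% error, the same reading as the 77% / 23% example in the notes.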