INFO 2020 Lecture 4: Week 4-6 Notes

Part 2 (must be a spreadsheet)
- Organized and easy to understand
Scatter plots: DV on the left (vertical) axis, IV on the bottom (horizontal) axis
Observations from the correlation matrix should be highlighted and noted
It doesn't matter if the correlations are bad; you just need to interpret them correctly
Correlations between the IVs and the DV are good
Correlations between IVs and other IVs are not necessarily good
Standardized residuals should fall between -2 and 2 if the model is good
Would like residuals to be independent
Residuals should be identically distributed
o Normally distributed
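The residual check above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a simple one-IV regression, residuals standardized by dividing by the residual standard error (a spreadsheet's regression output may use a slightly different denominator), and function names that are made up for this example.

```python
import math

def fit_line(x, y):
    """Least-squares intercept and slope for one IV (x) and one DV (y)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def standardized_residuals(x, y):
    """Residuals divided by the residual standard error (n - 2 degrees of freedom)."""
    intercept, slope = fit_line(x, y)
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    std_error = math.sqrt(sum(r ** 2 for r in residuals) / (len(x) - 2))
    return [r / std_error for r in residuals]

def flag_outliers(std_resid, limit=2.0):
    """Positions of observations whose standardized residual is outside +/- limit."""
    return [i for i, r in enumerate(std_resid) if abs(r) > limit]
```

Calling flag_outliers(standardized_residuals(x, y)) returns the positions of the observations worth a second look. Independence and normality of the residuals still need to be checked separately, for example with a residual plot and a histogram.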
What to do with New Data
1. Data cleaning
2. Exploration
o Descriptive stats
o Histograms
o Correlations
o Scatter plots
3. Regression
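The three steps above could look roughly like this in Python, using only the standard library. The column names (ad_spend, sales) and the sample rows are invented for illustration, and "cleaning" here only means dropping rows with a missing value; histograms and scatter plots are left out because they need a plotting tool.

```python
import math
import statistics

def clean(rows):
    """Step 1: drop any row that contains a missing value (None)."""
    return [row for row in rows if None not in row.values()]

def describe(values):
    """Step 2: descriptive statistics for one column."""
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "max": max(values),
    }

def pearson_r(x, y):
    """Step 2: Pearson correlation between two columns."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def simple_regression(x, y):
    """Step 3: least-squares intercept and slope for one IV."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical data: one IV (ad_spend) and one DV (sales).
rows = [
    {"ad_spend": 1.0, "sales": 3.0},
    {"ad_spend": 2.0, "sales": None},   # removed during cleaning
    {"ad_spend": 3.0, "sales": 7.0},
    {"ad_spend": 4.0, "sales": 9.0},
    {"ad_spend": 5.0, "sales": 11.0},
]
cleaned = clean(rows)
iv = [row["ad_spend"] for row in cleaned]
dv = [row["sales"] for row in cleaned]
print(describe(dv))
print(pearson_r(iv, dv))
print(simple_regression(iv, dv))
```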
Correlation Cut-Offs
What is good? What is bad?
o Depends on industry
o Strong: maybe .7 or more (in absolute value, so -.7 also counts as strong)
o No correlation: .1 or less
o Average: between .1 and .7
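As a quick illustration, the cut-offs above can be written as a small helper. The thresholds are the rough ones from these notes, not a universal standard (they depend on industry), and the function name is made up for this sketch. It assumes the cut-offs apply to the absolute value of r, so strong negative correlations count as strong.

```python
def correlation_strength(r, weak=0.1, strong=0.7):
    """Label a correlation coefficient using rule-of-thumb cut-offs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must be between -1 and 1")
    size = abs(r)  # the sign gives direction, not strength
    if size >= strong:
        return "strong"
    if size <= weak:
        return "none"
    return "average"

for r in (-0.85, 0.40, 0.05):
    print(r, correlation_strength(r))
```

The weak and strong parameters can be changed to match whatever cut-offs a particular industry uses.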
Adjusted R^2
Coefficient of determination
Measures the predictive power of the model
Example: with an adjusted R^2 of 0.77, I can explain 77% of the variation in my dependent variable from changes in my independent variables
o 77% is explained, 23% is error
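For reference, adjusted R^2 is usually computed from the ordinary R^2 as 1 - (1 - R^2)(n - 1)/(n - k - 1), where n is the number of observations and k is the number of independent variables. A minimal sketch follows; the function name and the example numbers are made up for illustration.

```python
def adjusted_r_squared(r_squared, n_obs, n_ivs):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs <= n_ivs + 1:
        raise ValueError("need more observations than IVs plus one")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_ivs - 1)

# Hypothetical model: R^2 = 0.80 from 21 observations and 2 IVs.
adj = adjusted_r_squared(0.80, n_obs=21, n_ivs=2)
print(f"explained: {adj:.0%}, error: {1 - adj:.0%}")
```

The adjustment penalizes extra independent variables, so adjusted R^2 is never higher than R^2 and is the fairer number when comparing models with different numbers of IVs.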
What to start with
Is Significance F good? It is the p-value for the overall model and should be < 0.05