HDF 315L Study Guide - Midterm Guide: Linear Regression, Spurious Relationship, Dependent And Independent Variables

Understanding Linear regression
Linear regression is used to make predictions of scores on some measure
Using the independent variable(s) as predictors, a single straight-line equation is calculated
The predicted (or expected) value of the dependent variable can be calculated from the independent
variables
The general equation for a straight line is:
Y = a + bX
Where:
Y is the score on variable Y (the score to be predicted)
a is the intercept (the score on Y when X is zero)
b is the slope (the change in Y for each one-unit increase in X; this determines the angle of the line)
X is the score on variable X
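As a minimal sketch of how the intercept and slope of Y = a + bX can be estimated by least squares, consider the following. The data (hours of tutoring and test scores) and the function names `fit_line` and `predict` are hypothetical and only for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Return (a, b) for the least-squares line Y = a + bX."""
    x_bar, y_bar = mean(xs), mean(ys)
    # Slope: covariation of X and Y divided by the variation in X.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b = sxy / sxx
    # Intercept: the predicted Y when X is zero.
    a = y_bar - b * x_bar
    return a, b


def predict(a, b, x):
    """Predicted (expected) Y for a given X."""
    return a + b * x


# Hypothetical data: hours of tutoring (X) and test score (Y).
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 68]

a, b = fit_line(hours, scores)
predicted_at_6 = predict(a, b, 6)
print(f"Y = {a:.1f} + {b:.1f}X; predicted Y at X = 6: {predicted_at_6:.1f}")
```

With these made-up numbers the fitted line is approximately Y = 47.7 + 4.1X, so each additional hour is associated with a predicted gain of about 4.1 points.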
Multiple Regression
Usually, researchers perform regression analysis when there are multiple independent variables, or
predictors, a technique known as multiple regression
The b-coefficients (or slope coefficients, or regression coefficients) indicate the strength and direction of the
relation between the predictor(s) and the outcome
The R² for the equation provides an estimate of the accuracy of the predictions
o R² is the proportion of variance in the dependent variable that the predictors account for; informally, how much better we predict than by guessing the mean of Y for everyone
o R² = .10 means that the predictors account for 10% of the variance in Y
o R² = .50 means that the predictors account for 50% of the variance in Y
Cohen’s conventions for R²
Small effect: R² = .02
Medium effect: R² = .13
Large effect: R² = .26
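One common way to compute R² is to compare the squared prediction errors of the regression line with the squared errors from predicting the mean of Y for everyone. The sketch below assumes that definition; the observed and predicted scores are hypothetical.

```python
def r_squared(observed, predicted):
    """R^2 = 1 - (residual sum of squares / total sum of squares)."""
    y_bar = sum(observed) / len(observed)
    # Error when every case is predicted to score at the mean of Y.
    ss_total = sum((y - y_bar) ** 2 for y in observed)
    # Error that remains when the regression predictions are used.
    ss_residual = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1 - ss_residual / ss_total


# Hypothetical observed scores and the predictions from a fitted line.
observed = [52, 55, 61, 64, 68]
predicted = [51.8, 55.9, 60.0, 64.1, 68.2]

r2 = r_squared(observed, predicted)
print(round(r2, 3))
```

If the predictions were no better than the mean of Y, the two sums of squares would be equal and R² would be 0.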
Interpreting regression coefficients
b-coefficients are expressed in the original metric:
Y = 2.5 + 1.45(age)
For each unit of age months, years, Y increases by 1.45 units
Beta coefficients (β) are expressed in a standardized metric (standard deviation units)
Y = .32(age)
For each one standard deviation increase in age, Y increases by .32 standard deviations
Beta coefficients can be compared to one another; b-coefficients cannot:
Nonstandardized equation:
Y = 2.5 + 1.45(age) + .72(income)
We can’t tell if the effect for age is greater than the effect for income, because age is measured in years and
income is measured in dollars
Standardized equation:
Y = .32(age) + .45(income)
In the standardized equation, the strength of the coefficients can be directly compared: in this case, income
is a stronger predictor than age
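Beta coefficients usually range from -1 to +1 and are interpreted much like correlation coefficients (when predictors are highly correlated with one another, a beta can occasionally fall outside this range)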
Cohen’s conventions for β
Small effect: β = .10
Medium effect: β = .30
Large effect: β = .50
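To illustrate why b-coefficients depend on the unit of measurement while beta coefficients do not, the sketch below fits the same hypothetical data with age in years and in months. It uses the standard conversion β = b × (SD of X / SD of Y) for a single predictor; the data and function names are assumptions made for the example.

```python
from statistics import mean, stdev


def slope(xs, ys):
    """Unstandardized slope (b) of the least-squares line."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return sxy / sxx


def standardized_beta(xs, ys):
    """Beta: the slope rescaled into standard deviation units."""
    return slope(xs, ys) * stdev(xs) / stdev(ys)


# Hypothetical data: the same children, age in years and in months.
age_years = [4, 6, 8, 10, 12]
age_months = [12 * years for years in age_years]
scores = [10, 14, 15, 19, 22]

b_years = slope(age_years, scores)    # Y units per year of age
b_months = slope(age_months, scores)  # Y units per month of age
beta_years = standardized_beta(age_years, scores)
beta_months = standardized_beta(age_months, scores)

print(f"b (years) = {b_years:.2f}, b (months) = {b_months:.3f}")
print(f"beta (years) = {beta_years:.3f}, beta (months) = {beta_months:.3f}")
```

The b-coefficient shrinks by a factor of 12 when age is recorded in months, but the beta is identical in both versions, which is why betas can be compared across predictors measured in different units.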
Sequential regression models
Often, researchers will test sequential regression models
The purpose is to determine whether later models (usually more complex models) are a significant
improvement over the earlier models in the sequence
Researchers stay with the simpler models unless there is compelling evidence to accept the more complex
models
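One standard way to test whether a later model significantly improves on an earlier one is an F test on the change in R². The notes do not say which test is used, so this is a sketch under that assumption; the R² values, predictor counts, and sample size are hypothetical.

```python
def f_change(r2_reduced, r2_full, k_reduced, k_full, n):
    """F statistic for the R^2 increase from a simpler to a more complex model.

    k_reduced and k_full are the numbers of predictors in each model and
    n is the sample size. Returns (F, numerator df, denominator df).
    """
    df_num = k_full - k_reduced  # number of predictors added
    df_den = n - k_full - 1      # residual df of the larger model
    f = ((r2_full - r2_reduced) / df_num) / ((1 - r2_full) / df_den)
    return f, df_num, df_den


# Hypothetical sequence: Model 1 has 2 predictors (R^2 = .20),
# Model 2 adds 2 more predictors (R^2 = .26), with n = 100.
f, df1, df2 = f_change(0.20, 0.26, k_reduced=2, k_full=4, n=100)
print(f"F({df1}, {df2}) = {f:.2f}")
```

The resulting F is compared with the critical value for those degrees of freedom; if it does not exceed that value, the simpler model is retained.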
Categorical predictors
Categorical predictors can easily be handled in regression
Commonly, categorical predictors are coded 0,1:
o Gender (female)
Male = 0
Female = 1
Categorical predictors with more than two categories require additional variables
The number of variables is (usually) one less than the number of categories
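A minimal sketch of 0/1 (dummy) coding follows, assuming one category is chosen as the reference group and left out, which is why the number of variables is one less than the number of categories. The function name `dummy_code` and the marital-status example are hypothetical.

```python
def dummy_code(values, reference):
    """Return 0/1 indicator columns for every category except the reference."""
    categories = sorted(set(values) - {reference})
    return {
        f"is_{category}": [1 if value == category else 0 for value in values]
        for category in categories
    }


# Two categories need one variable (Male = 0, Female = 1).
gender = ["male", "female", "female", "male"]
gender_dummies = dummy_code(gender, reference="male")

# Three categories need two variables; "single" is the reference group.
marital = ["single", "married", "divorced", "married"]
marital_dummies = dummy_code(marital, reference="single")

print(gender_dummies)
print(marital_dummies)
```

Each coefficient for a dummy variable is then read as the predicted difference between that category and the reference group.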
Correlation does not imply causation
Just because two variables are empirically related does not mean that a causal relation exists
Criteria for establishing causality
Variables are empirically related (causation does imply correlation)
The cause must precede the effect in time
The observed relation is not due to a third variable that acts as a common cause (i.e., the relation is not
spurious)
Example: Suppose a study finds a significant correlation between children’s participation in extra-curricular
activities and level of self-esteem.
r(activities, esteem) = .40, p < .05
Spurious relations: A spurious relation (or spurious correlation) is one in which the observed relation between two
variables is produced by a third variable that influences both of them
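One common way to check whether a relation might be spurious is a partial correlation, which removes the part of the correlation that both variables share with a third variable. The sketch below applies the standard first-order partial correlation formula to the r = .40 example; the third variable (family income) and its two correlations are hypothetical values chosen for illustration.

```python
from math import sqrt


def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y after controlling for Z."""
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


r_activities_esteem = 0.40  # observed correlation from the example
r_activities_income = 0.60  # hypothetical: activities with family income
r_esteem_income = 0.60      # hypothetical: self-esteem with family income

partial = partial_correlation(
    r_activities_esteem, r_activities_income, r_esteem_income
)
print(f"Partial r controlling for income = {partial:.4f}")
```

Under these assumed values the correlation drops from .40 to about .06 once income is controlled, which is the pattern expected if the original relation were largely spurious.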
Is there an alternative explanation?
OBSERVED RELATION: Children whose mothers stay at home perform better in school than children whose
mothers work outside the home.
CAUSAL CONCLUSION: Maternal employment leads to cognitive deficits in children
Research design: How can we show that the independent variable causes the dependent variable?
Causal Inference and the Counterfactual
A counterfactual is the alternate outcome: What would have happened if the treatment had not occurred?
Causal Effect = Y_t(u) - Y_c(u)
(Y_t(u) is the outcome for unit u with the treatment; Y_c(u), the outcome for the same unit without the treatment, is the counterfactual)
Rubin defined a causal effect as the difference between what happened after the treatment is administered
and what would have happened to the same person had the treatment not been administered
Causality can never be directly observed
To estimate the value of the counterfactual, the assumptions must be specified
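The sketch below illustrates Rubin's definition (the outcome with the treatment minus the outcome without it, for the same unit) using hypothetical potential outcomes. In real data only one of the two outcomes is ever observed for each unit, which is why the effect cannot be observed directly; both are listed here purely for illustration.

```python
# Hypothetical potential outcomes for three units. In practice only one of
# y_t (outcome if treated) and y_c (outcome if not treated) is observed.
units = [
    {"unit": "u1", "y_t": 14, "y_c": 10, "treated": True},
    {"unit": "u2", "y_t": 9, "y_c": 8, "treated": False},
    {"unit": "u3", "y_t": 12, "y_c": 12, "treated": True},
]


def causal_effect(u):
    """Treated outcome minus untreated outcome for the same unit."""
    return u["y_t"] - u["y_c"]


def observed_outcome(u):
    """The outcome the researcher actually gets to see."""
    return u["y_t"] if u["treated"] else u["y_c"]


def counterfactual_outcome(u):
    """The outcome that did not happen and therefore cannot be observed."""
    return u["y_c"] if u["treated"] else u["y_t"]


effects = [causal_effect(u) for u in units]
observed = [observed_outcome(u) for u in units]
missing = [counterfactual_outcome(u) for u in units]
print(effects, observed, missing)
```

Because the `missing` values are never available in a real study, the counterfactual has to be estimated, and the estimate is only as good as the assumptions behind it.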
Three research design approaches are available that address the fundamental problem of causal inference
o May be employed alone or in combination