SOC 222 -- MEASURING the SOCIAL WORLD
Session #4 -- RAT RAT RELATIONSHIPS
Linneman: ch. 7
Kranzler: ch. 8: 79-87
ch. 9: 90-93
Today’s Objectives: Know…
1. How to obtain and interpret a scatterplot with a fit line.
2. Similarity of a scatterplot and a crosstab
3. How a regression fit line indicates relationship direction and effect size
4. The parts of a linear equation
5. How to understand covariance
6. How to get regression coefficients and R in SPSS
7. How to use a regression equation to make predictions
8. PVE measures of effect size for regressions, correlations, and comparing means
Terms to Know
X and Y axes
regression fit line
coefficient of determination R²
proportion of variation explained (PVE)
correlation coefficient “r”
RAT RAT RELATIONSHIPS
- Both the independent variable and the dependent variable are quantitative
- A scatterplot is a good way to see three things about a ratio-ratio relationship:
1. It shows all the cases plotted, so we can see the relationship between the dependent and independent variables
2. We can see what direction the relationship has: positive or negative
3. We can see how strong the relationship is, i.e. what the effect size is.
RQ: Do immigrants settle in provinces which are more urban?
- The dependent variable would be the proportion of the provincial population who are immigrants.
• percent of provincial population living in cities
• percent of provincial population who are immigrants
- Open the data set.
- Graphs, Chart Builder; first window, click OK; window 2, scan data
- The Chart Builder box is open: choose Scatter/Dot under Gallery
- Simple Scatter: drag it to the window beside the variables
- Indep.: percent of people living in cities; dep.: proportion of immigrants
- ^ drag these into the X-axis and Y-axis areas
- Click OK
RUNNING a SCATTERPLOT
A scatterplot is a good way to see whether there’s a relationship between two ratio
(quantitative) variables, what direction it is (positive or negative), and (roughly) how
strong the relationship is.
1. On the menu bar (second bar from the top), click “Graphs”, then “Chart Builder”
• This brings up a box about levels of measurement.
• If your two variables are both ratio (“scale”), click “OK”
2. This brings up two boxes.
1. “Element Properties”. Ignore it.
2. “Chart Builder”. This is the one you work with.
3. The “Chart Builder” box has a lot of parts.
• Top left: a list of the variables in the data set.
• You’ll use this.
• Directly under: Two “category” lines. Ignore.
• Top right: the working “Chart Preview” box.
• You’ll use this.
• Under these boxes: four tabs. “Gallery” shows. Ignore others.
• Bottom left: A “Choose from” list of types of charts.
• Next to it: For each type of chart, a picture of possible sub-types.
• To the right: Two more buttons: Element Properties and Options.
• Bottom: Five action buttons.
• “OK” when you’re ready to make the chart
• “Reset” when you want to start over.
In the “Chart Builder” box
4. In “Choose from” box, click on “Scatter/Dot”
5. In Subtypes box (to the right), you get 8 icons.
• Mouse hover on top left picture. It says “Simple Scatterplot”
• Drag it up to the “Chart preview” box
• If this opens another “Element Properties” box, close it.
• In the “preview” box you now have a chart with a dotted box where the
Y-axis variable will go, and another for the X-axis variable.
6. From your “Variables” list (top left),
• Select your dependent variable name and drag it to the “Y-axis” box
• Drag your independent variable to the “X-axis” box
7. Click “OK”.
8. In the left outline pane, your scatterplot shows under the title “GGraph”.
• The variable names on the plot are the variable labels
Adding a Regression Fit Line
9. Double-click on an empty grey part of the graph.
• This opens a new window called “Chart Editor”
• It has multiple bars with tiny icons for all the things you can do with the chart
10. Find the fifth icon from the end.
• Mouse hover. Shows, “Add Fit Line at Total”
• Click this.
• This opens yet another box called “Properties”
• It wants to give you the right fit line.
• However, it has already given you a default fit line (linear).
11. The default fit line is fine for now.
• It has the regression equation superimposed on the line.
12. Close the Chart Editor box. This takes you back to your Output window.
• Save your output
Interpreting Your Scatterplot
Is there a relationship?
• If the line isn’t flat horizontal, or straight-up vertical, there’s a relationship.
What direction is it?
• If it slopes upwards (left to right), it’s positive
• The greater the “X”, the greater the “Y”
• If it slopes downwards, it’s negative
• The greater the “X”, the less the “Y”
- The independent variable always goes on the bottom (horizontal) axis
- What can we tell?
o There is a relationship between the two variables: as the percent in cities
increases, we get higher values of the proportion of immigrants. We do have a
relationship.
o The relationship is positive because as X gets larger, Y gets larger.
Scatterplots & Crosstabs
- Divide the scatterplot into four
- We can count the cities and provinces in each of 4 categories
- Convert into a crosstab:
                      Percent in Cities
Percent Immigrants    Low    High
High                  0      3
Low                   2      8
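The conversion from scatterplot to crosstab can be sketched in a few lines of Python. This is only an illustration: the data points and the two cut points below are hypothetical (they are not the class data set), and the function name `crosstab` is made up for this sketch.

```python
# Hypothetical (percent_in_cities, percent_immigrants) pairs, one per case.
# These are NOT the class data; they only illustrate the counting.
cases = [(45.0, 3.0), (55.0, 4.0), (60.0, 12.0),
         (80.0, 6.0), (82.0, 20.0), (86.0, 27.0)]

# Hypothetical cut points separating "Low" from "High" on each variable.
X_CUT = 70.0  # percent in cities
Y_CUT = 10.0  # percent immigrants


def crosstab(points, x_cut, y_cut):
    """Count how many cases fall in each of the four parts of the plot."""
    counts = {("High", "Low"): 0, ("High", "High"): 0,
              ("Low", "Low"): 0, ("Low", "High"): 0}
    for x, y in points:
        row = "High" if y >= y_cut else "Low"  # dependent variable -> row
        col = "High" if x >= x_cut else "Low"  # independent variable -> column
        counts[(row, col)] += 1
    return counts


table = crosstab(cases, X_CUT, Y_CUT)
print("Percent Immigrants    Low  High   (Percent in Cities)")
for row in ("High", "Low"):
    print(f"{row:<20} {table[(row, 'Low')]:>4} {table[(row, 'High')]:>5}")
```

Each case lands in exactly one cell, so the four counts always add up to the number of cases.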
REGRESSION FIT LINE
- A straight line fitted to the points of the scatterplot. The line has to be straight
- This line is called the fit line or regression line
- It’s the straight line that best fits all the points
- Once we have the line we can see if the line has a slope
Gives us two things:
1. It tells us the effect size: the steeper the slope, the greater the effect, i.e. the more
the dependent variable increases with the independent variable. If the line were
flat, we would have no relationship
2. The direction of the relationship: does it slope up (positive) or down (negative)?
- Double-click the graph to get the Chart Editor
- Click “Add Fit Line at Total”; hover the mouse over the icons to find it
- The line shows up right away
- By default it gives a linear fit line
- Close the box to get rid of it
We can see:
1. The line is pretty steep and slopes upward, so we have a positive relationship with a
good effect size.
2. The points are moderately close to the line; some are farther away. We look at the
vertical distance between the line and each point. We call this the scatter: how far the
points are scattered from the line. This affects the effect size: if some points are not
too close to the line, then we have a moderate effect size.
3. How do we get that fit line? How do we find it?
- Any straight line can be represented by an equation. This is the straight-line equation.
- Linneman: pp. 213 to 215
Y = a + b (X)
Four parts: X, Y, a, b
- SPSS gives the equation of the line
- The two variables are X and Y
- a is called the constant. It does not vary as X varies, because it is not
multiplied by X
- The constant is where the regression line crosses the vertical Y axis
- When X is 0, Y equals a, so that is where the line crosses the Y axis
- Substitute 0 for X and you will know the value of Y at that point
- The coefficient b is called the slope of the line
- The interpretation: every time the value of X increases by one, b is how
much the value of Y is going to change. In this case it is going to go up.
- So every time we increase X by one %, Y will increase … Slope tells us
- The slope tells us how steep the line is; it is a measure of effect size:
- how much of an effect X has on Y
- Linneman: p. 214
• Every time the value of X increases by 1, this is how much the value of Y goes up
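To see how the equation Y = a + b(X) is used for prediction, here is a minimal Python sketch. The values of a and b are hypothetical placeholders, not the coefficients SPSS reports for the class data; substitute the constant and slope from your own output.

```python
def predict(a, b, x):
    """Return the predicted Y for a given X using Y = a + b(X)."""
    return a + b * x


# Hypothetical coefficients, for illustration only.
A = 2.0   # constant: the predicted Y when X is 0
B = 0.25  # slope: the change in predicted Y when X goes up by 1

for x in (0, 40, 60, 80):
    print(f"X = {x:>2}  ->  predicted Y = {predict(A, B, x)}")

# Raising X by exactly one unit changes the prediction by exactly b.
print(predict(A, B, 61) - predict(A, B, 60))
```

With X = 0 the prediction is just the constant a, and each one-unit step in X adds b to the prediction, which matches the two interpretations given above.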
Finding the Regression Line Coefficients
- How do we find a and b?
- Important for finding value of b
- Variance is the “variation” or “dispersion” or “spread” of one variable
$$\text{variance} = s^2 = \frac{\sum (x - \bar{x})^2}{N - 1}$$
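The variance formula can be checked by hand with a short Python sketch. The values are illustrative only, and `sample_variance` is a name chosen for this example; Python's standard `statistics.variance` function is used as a cross-check because it also divides by N - 1.

```python
import statistics


def sample_variance(values):
    """Sum of squared differences from the mean, divided by N - 1."""
    n = len(values)
    mean = sum(values) / n
    squared_diffs = [(v - mean) ** 2 for v in values]
    return sum(squared_diffs) / (n - 1)


values = [1, 3, 5, 7, 9, 11]  # illustrative values, mean = 6

print(sample_variance(values))      # step-by-step version
print(statistics.variance(values))  # standard-library cross-check
```

The squared differences here are 25, 9, 1, 1, 9, 25, which sum to 70; dividing by N - 1 = 5 gives a variance of 14.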
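- For each case, find its value on variable X and subtract the mean of X
- Find its value on variable Y and subtract the mean of Y
- Multiply those two differences
- Do that for all cases, for all X and Y values. That is what the Σ represents: the sum over all cases
$$\text{covariance} = \frac{\sum (x - \bar{x})(y - \bar{y})}{N - 1}$$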
The Logic of Covariance
- Why does covariance matter?
1. One variable
• Values: 1, 3, 5, 7, 9, 11
2. Product of Two Differences
• Small: 2, 3
• Big: 11, 12
• A small difference times a small difference is a small number
• 2 * 3 = 6
• A small difference times a big difference is a big number
• 2 * 11 = 22
• A big difference times a big difference is a very big number
• 11 * 12 = 132
• If a case is close to the mean on both X and Y, then those two differences
are going to be small, and if you multiply them the product is going to be small
• If a case is far from the mean on both X and Y, each of those differences
will be large and the product will be a large number
X   Y   Differences from mean (X, Y)   Product
1   2   -2, -2                         4
3   4    0,  0                         0
5   6    2,  2                         4
• Products: 4, 0, 4 (one for each of the three cases)
• Sum: 4 + 0 + 4 = 8
X   Y   Differences from mean (X, Y)   Product
1   2   -2, -2                         4
3   6    0,  2                         0
5   4    2,  0                         0
• Products: 4, 0, 0
• Sum: 4 + 0 + 0 = 4
• Covariance is related to the strength of the relationship.
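The two small tables above can be reproduced with a short Python sketch. It uses the X and Y values from the tables; the function name `covariance` and its return format are choices made for this example only.

```python
def covariance(xs, ys):
    """Return the per-case products of differences and the covariance."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # One product per case: (x - mean of X) * (y - mean of Y)
    products = [(x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)]
    return products, sum(products) / (n - 1)


# First table: Y rises steadily with X.
products_1, cov_1 = covariance([1, 3, 5], [2, 4, 6])
# Second table: same values, but Y no longer rises steadily with X.
products_2, cov_2 = covariance([1, 3, 5], [2, 6, 4])

print(products_1, sum(products_1), cov_1)
print(products_2, sum(products_2), cov_2)
```

The sums of products are 8 and 4, as in the tables. Dividing by N - 1 = 2 gives covariances of 4 and 2, so the data set with the tighter relationship has the larger covariance.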
Slope Coefficient “b”
- Once we have the covariance, we need to convert it to the slope
slope = coefficient “b”
Finding the term to convert covariance to slope:
1. Select the independent variable X
2. Calculate for each case the difference between the case value and the mean
3. Square these differences
4. Add the squared differences up and divide by N – 1
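The four steps above produce the variance of X. As a sketch of how that term is used, the code below applies the standard least-squares result that the slope b is the covariance of X and Y divided by the variance of X, and that the constant is a = (mean of Y) - b(mean of X). These notes do not state that division explicitly, so treat it as an assumption taken from the standard formula; the function name is also just illustrative.

```python
def regression_coefficients(xs, ys):
    """Return (a, b) for the fit line Y = a + b(X)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of X and Y: sum of products of differences over N - 1.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    # Variance of X: steps 1 to 4 above.
    var_x = sum((x - mean_x) ** 2 for x in xs) / (n - 1)
    b = cov_xy / var_x       # slope (standard least-squares formula)
    a = mean_y - b * mean_x  # constant: the line passes through the two means
    return a, b


# The two small data sets from the covariance tables.
a1, b1 = regression_coefficients([1, 3, 5], [2, 4, 6])
a2, b2 = regression_coefficients([1, 3, 5], [2, 6, 4])

print(f"Y = {a1} + {b1}(X)")
print(f"Y = {a2} + {b2}(X)")
```

For the first data set the covariance (4) and the variance of X (4) are equal, so the slope is 1; for the second the covariance is 2, so the slope drops to 0.5.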