# ENVS178 Lecture Notes - Lecture 5: Waskesiu Lake, Environment And Climate Change Canada, Statistical Inference

Department: Environmental Studies
Course Code: ENVS178
Professor: Jeff Casello
Lecture: 5

LOOKING AT DATA RELATIONSHIPS:
CORRELATION AND REGRESSION
De Veaux et al., Chapters 7 and 8
Sometimes, data analysis is defined by the number of variables considered. Here are several
cases:
Univariate analysis : »
Bivariate analysis : »
Multivariate analysis : »
In bivariate and multivariate analysis, you are usually interested in showing that one variable
helps to explain another. In this case special terms are used.
Response variable : »
Explanatory variable : »
Estimating relationships among variables is often best done by visual inspection.
One tool that helps is the “scatterplot”.
Scatterplots:
Each dot represents: »
The scatter of dots represents: »
Example of a scatterplot. Both plots illustrate the same data; note the effect of scale.
[Figure: two scatterplots, each titled "Male versus Female Death Rates in Canada between 1921 and 1997". Horizontal axis: male deaths per 1000 population. Vertical axis: female deaths per 1000 population. In the first plot the axes run from 0 to 15 (horizontal) and 0 to 12 (vertical); in the second they run from 6 to 13 (horizontal) and 4 to 12 (vertical).]

Which dots represent which years?
In the above example, the direction of the scatter is clear: as the values for one variable increase
(or decrease), so do the values for the second variable. This is called
»
By contrast, if the values of one variable increased as the values of the other variable decreased,
this is referred to as
»
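The direction of a scatter can also be checked numerically. The following is a minimal sketch, assuming made-up values (not the Canadian death-rate data) and a hypothetical helper `direction_sign`: it returns +1 when the two variables tend to move together and -1 when one tends to rise as the other falls, based on the sign of the sample covariance.

```python
def covariance(xs, ys):
    """Sample covariance of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def direction_sign(xs, ys):
    """Return +1, -1, or 0 according to the sign of the covariance."""
    cov = covariance(xs, ys)
    if cov > 0:
        return 1
    if cov < 0:
        return -1
    return 0


# Made-up values, for illustration only.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 7, 9]    # y tends to increase as x increases
falling = [9, 7, 6, 4, 1]   # y tends to decrease as x increases

print(direction_sign(x, rising))
print(direction_sign(x, falling))
```

The sign only describes direction; it says nothing about how tightly the dots cluster or whether the form is linear.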
A scatterplot also provides information on the form of the relationship.
• Linear relationship: »
• Curvilinear relationship: »
What is the form of the relationship between male death rate and female death rate in Canada
over time?
You will have noted that there was little scatter in the death rate plot. This says something
about the strength of the relationship.
What would the following data sets look like graphically?
• Strong linear relationship
»
• Moderately strong linear relationship
»
• Weak relationship (form and direction unclear)
»
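One way to get a feel for these three cases is to simulate them. The sketch below is an illustration under stated assumptions: the data are synthetic (a straight line plus random noise), the helper names `pearson_r` and `noisy_line` are hypothetical, and strength is summarised with the Pearson correlation coefficient r, which has not yet been defined at this point in the notes. Less noise around the line gives a tighter scatter and an r closer to 1.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def noisy_line(n, noise_sd, seed):
    """Synthetic data: y = x + Gaussian noise, with x spread evenly over 0..10."""
    rng = random.Random(seed)
    xs = [10 * i / (n - 1) for i in range(n)]
    ys = [x + rng.gauss(0, noise_sd) for x in xs]
    return xs, ys


# The noise levels are arbitrary choices made for this illustration.
r_strong = pearson_r(*noisy_line(200, 0.3, seed=1))
r_moderate = pearson_r(*noisy_line(200, 2.5, seed=2))
r_weak = pearson_r(*noisy_line(200, 15.0, seed=3))

print(round(r_strong, 2), round(r_moderate, 2), round(r_weak, 2))
```

Plotting each `(xs, ys)` pair as a scatterplot would show the dots hugging the line in the first case and spreading into a shapeless cloud in the last.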

Correlation Analysis
Correlation analysis was invented by Sir Francis Galton (1822-1911), while he was thinking
about the resemblance between parents and children. Statisticians in Victorian England were
fascinated by family resemblances and gathered data on the topic.
One such hypothetical data set, which provides the adult heights of 36 father-son pairs, is
depicted below. I have superimposed the best-fit line on the scatter. Note how it compares to the
line where X = Y.
What does each of the following represent in the above scatterplot?
• each dot »
• the x-co-ordinate of the dot (on the horizontal axis) (explanatory variable) »
• the y-co-ordinate of the dot (on the vertical axis) (response variable) »
What does the 45-degree line tell us? »
Where is the best-fit line positioned? »
Using the best-fit line, can we accurately predict a son's height based solely on the father's
height? »
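The questions above can be explored with a small computation. This is a minimal sketch, assuming a least-squares best-fit line and eight made-up father/son heights in centimetres (these are not the 36 pairs from the lecture); the function name `least_squares_line` is hypothetical. It fits the line, compares its slope with the slope of 1 that the X = Y line has, and lists the prediction errors (residuals).

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # forces the line through (mean_x, mean_y)
    return slope, intercept


# Hypothetical adult heights in cm, invented for this example.
fathers = [165, 168, 170, 173, 175, 178, 180, 185]
sons = [170, 169, 174, 172, 177, 176, 181, 180]

slope, intercept = least_squares_line(fathers, sons)

# Predicted son's height for each father, and how far off each prediction is.
predicted = [intercept + slope * f for f in fathers]
residuals = [s - p for s, p in zip(sons, predicted)]

print(f"slope = {slope:.2f} (the X = Y line has slope 1)")
print("residuals (cm):", [round(e, 1) for e in residuals])
```

For these invented numbers the fitted slope is below 1, so the best-fit line is flatter than the 45-degree line, and the non-zero residuals show that the line gives an estimate of a son's height rather than an exact value.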