# GGR 270 Lecture Notes

University of Toronto St. George

Geography

GGR270H1

Stephen Swales

Fall

Description

September 11, 2013
What Is This Course About?
• Statistical tools or techniques
• Help support research projects, papers, reports
• Skill set for future employment
• Understand and critically comment on the work of others
We will…
• Examine the role of statistics in research
• Apply the most appropriate technique or set of techniques
• Understand and interpret the result
• “What does this result mean to my research problem?”
General Course Topics
• Describing data
• Graphs
• Simple measures – one variable
• Average
• Measures of center and measures of dispersion
• Getting your grade back and asking for the class average
• Simple measures – two variables
• Probability and distributions
• Simple probability
• Sampling distributions
• Statistical estimation
• Process
• Techniques and tests
What are (is) Statistics?
• Any collection of numerical data
o Vital Statistics – birth rates, death rates (vital statistics for
Toronto are released every year)
o Economic Indicators – unemployment rates, income levels
A way of measuring economic success or failure
o Social Statistics – poverty rates, crime rates
When looking at neighborhood change, causality is often
misattributed – crime could be caused by other
variables such as government influence, etc.
• Methodology for collecting, presenting and analyzing data
o Summarize findings
o Theory validation
GGR270 – Damian Dupuy
o Forecasting
E.g. “The labor force is growing at this rate now; we estimate
that it will be this much in 20 years”
o Evaluate
Help us understand what is going on in the subset of a
bigger picture
o Select among alternatives
Descriptive and Inferential Statistics
• Descriptive
o Organization and summary of data
o Replace large set of numbers with small summary measures
E.g. Average household income in Toronto is $60,000 –
summarizing everybody’s income
o Goal of techniques is to minimize information loss
• Inferential
o Links descriptive statistics to probability theory
Gives us the ability to speak with a higher degree of
confidence
o Generalize results of smaller group to a much larger one
Instead of asking everyone in the class in order to find the
mean score on the final exam, we can take a subset of
about 50 and average their scores – the result should be
close to that for the whole class if the sampling is done
correctly
• It is time-consuming to ask everyone, and in the
real world it would be expensive
o Goal is to ‘infer’ something about a larger group by looking at a
smaller one
Population and Sample
• Population
o Total set of elements (objects, persons, regions, etc.) under
examination
o For example, all potential voters in an urban area
We could examine voter behavior
o Denoted as N
• Sample
o Subset of elements in the population
o Used to make inferences about certain characteristics of the
population
o Try to predict the behavior of the population by looking closely at
the sample
If the sample is set up and conducted properly,
we can draw a fairly accurate conclusion
o Denoted as n
[Figure: Population – N; Sample 1 (n1), Sample 2 (n2), Sample 3 (n3)]
September 18, 2013
Variables and Data
• Variable
o Characteristic of the population that changes or varies over time
Examples include temperature, income, education, etc.
o Observe and measure variables
o Two key categories
Quantitative – numerical, e.g. number of students who passed
GGR270 and who failed GGR270
• Discrete (1, 2, 3, 4…) or continuous (1.5, 2.76, 3.445…)
Qualitative – non-numerical, e.g. male/female, plant species
• Data
Results from measuring variables – set of measurements
Different categories – Univariate, Bivariate, Multivariate
Variables – Scales of Measurement
• Scale defines the amount of information a variable contains and what
statistical techniques can be used
Four scales, from lowest to highest information:
Nominal (lowest)
Ordinal
Interval
Ratio (highest)
Nominal
• Lowest scale of measurement, no numerical value attached
• Classifies observations into mutually exclusive and collectively exhaustive
groups
• Simply the name or category of the variable
• Often called ‘categorical’ data
o E.g. occupation type, gender, place of birth
Ordinal
• Stronger scale as it allows data to be ordered or ranked
o E.g. 12 largest towns in a region, income by group (high, middle,
low)
Interval
• Unit distance separating numbers is important
o E.g. temperature (F or C)
• But, it does not allow for ratios and does not have a true ‘Zero’
Ratio
• Strongest scale of measurement
• Ratios of distances on a number scale
• Presence of an absolute ‘ZERO’
o E.g. temperature (Kelvin), income from all sources ($), population
of a city
• In practice, we consider interval/ratio scales together
Graphs
• Pie charts
o Circular graph where measurements are distributed among
categories
• Bar Graph
Graph where measurements are distributed among categories
Relative Frequency Histogram
Graphs quantitative, rather than qualitative data
• Quantitative data – e.g. income levels, scores, etc.
Vertical axis (Y) shows “how often” measurements fall into a particular class or
subinterval
Classes are plotted on the horizontal (X) axis
Rules of thumb
5 to 12 intervals or categories
To interpret or explain the data: with fewer than that the graph is useless, with more
than that it is hectic
k = 1 + 3.3 log10(# of observations)
Must be mutually exclusive and collectively exhaustive
Intervals should be the same width
Don’t lengthen the last bar to make it fit, you make another one
RELATIVE FREQUENCY HISTOGRAM DOES NOT HAVE GAPS IN BETWEEN
BARS
• Observations:
1, 11, 14, 21, 23, 27, 28, 33, 35, 50
Number of Classes – (Rows In “Excel Table”)
k = 1 + 3.3 log10(10)
= 4.3 rounded up to 5
• Class Width – (Range Per “Excel Row”)
(Largest number – smallest number) / number of classes
= (50 – 1) / 5
= 9.8 rounded up to 10
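The class-count and class-width arithmetic above can be checked with a short sketch. The function names are illustrative, and starting the first class at the smallest observation is an assumption, not something the notes specify.

```python
import math

def class_count(n_obs):
    """Number of classes: k = 1 + 3.3 * log10(n), rounded up."""
    return math.ceil(1 + 3.3 * math.log10(n_obs))

def class_width(values, k):
    """Class width: (largest - smallest) / k, rounded up to a whole number."""
    return math.ceil((max(values) - min(values)) / k)

observations = [1, 11, 14, 21, 23, 27, 28, 33, 35, 50]

k = class_count(len(observations))      # 4.3 rounded up to 5
width = class_width(observations, k)    # 9.8 rounded up to 10

# Assumption: the first class starts at the smallest observation.
low = min(observations)
counts = [0] * k
for x in observations:
    counts[(x - low) // width] += 1

# Relative frequency = class count / total number of observations.
rel_freq = [c / len(observations) for c in counts]

print(k, width, counts, rel_freq)
```

Because every class has the same width and every observation falls into exactly one class, the classes are mutually exclusive and collectively exhaustive, as the rules of thumb require.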
Observing The Graph – Skewness
Is the distribution symmetric?
• If the graph is skewed it is usually outliers pulling the graph; exerting a
greater degree of influence
Observing The Graph – Mode
How many peaks are there?
• Unimodal – 1 peak (e.g. the normal distribution)
• Bimodal – 2 peaks
Observing The Graph – Kurtosis
How peaked is the distribution?
• Mesokurtic – semi-high
• Platykurtic – flat peak
• Leptokurtic – very high with little data beside peak
itself
Describing Data: Measures of the Center & Measures of Variability
Statistics and Parameters
• Graphs are limited in what they can tell us
• Difficulty making inferences about a population when looking at a subset
or sample
• Therefore, we need to use numerical measures
• Measures associated with the population are called parameters
• Measures associated with a sample are called statistics
Measures of the Center
• Mean
o Most commonly used measure of central tendency
o Sum of all values or observations divided by the number of
observations
E.g.
Temperature data: 7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2
Mean = 67 / 7
= 9.57
≈ 9.6
• Median
o Value occupying the ‘middle position’ in an ordered set of
observations
o Order the observations, lowest to highest, and find the middle
position
Middle position = .5 (n + 1) for a sample, .5 (N + 1) for a population
1. E.g. (odd number of observations)
Temperature data: 7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2
Ordered: 7.3, 8.2, 8.4, 9.1, 9.4, 10.7, 13.9
Using formula .5 (7 + 1) = 4th position in the ordered
set, so the median is 9.1
2. (Even number of observations) e.g. .5 (6 + 1) = 3.5th position, so
average the 3rd and 4th values
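A minimal sketch of the mean and the middle-position rule for the median, using the temperature data from these notes. The helper name `median_by_position` is made up for illustration; Python's built-in `statistics.median` gives the same answer.

```python
import statistics

temps = [7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2]

# Mean: sum of all observations divided by the number of observations.
mean_temp = statistics.mean(temps)

def median_by_position(values):
    """Median via the middle position .5 * (n + 1) in the ordered set."""
    ordered = sorted(values)
    position = 0.5 * (len(ordered) + 1)
    lower = int(position)                 # whole part of the position
    if position == lower:                 # odd n: a single middle value
        return ordered[lower - 1]
    # even n: position ends in .5, so average the two neighbouring values
    return (ordered[lower - 1] + ordered[lower]) / 2

median_temp = median_by_position(temps)   # 4th position in the ordered set

print(round(mean_temp, 2), median_temp)
```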
September 25, 2013
Measures of Center
Mode
• Value that occurs with the highest frequency
• Allows you to locate the peak of a relative frequency histogram
Choosing an Appropriate Measure
• Mean is usually the best measure as it is sensitive to a change in any single
observation
o Yields the most information
Minimizes information loss
o But not a good measure when
Distribution is bimodal (2 modes)
• Mean and median are in the same position
between the two modes
Distribution is skewed
• Outliers (extreme values) are present in the data set
e.g. {2, 5, 6, 8, 9, 21, 22}
Measures of Dispersion
• Range
o Simplest measure of dispersion
o Takes the difference between smallest and largest value in the
dataset, at the interval/ratio scale
Range = X max – X min
o But, influenced by outliers
• Quartiles
o Can yield more information and lessen impact of outliers
o Data are divided into quartiles (4 groups)
• Standard Deviation and Variance
o Two of the most commonly used measures of dispersion
o Compare each value to the mean
• Two key properties of the deviations from the mean, $X_i - \bar{X}$
o Sum of differences will always add up to zero
o Sum of squared differences will be the minimum sum possible.
Called ‘Least Squares’ property
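The range and the two properties of deviations from the mean can be verified numerically. This is only a sketch: it reuses the outlier data set {2, 5, 6, 8, 9, 21, 22} from the notes, and the comparison points (the median, and the mean shifted by 1) are arbitrary choices for illustration.

```python
values = [2, 5, 6, 8, 9, 21, 22]

# Range = X max - X min (pulled wide by the two large values).
data_range = max(values) - min(values)

mean = sum(values) / len(values)

# Property 1: deviations from the mean sum to zero (up to rounding error).
sum_of_deviations = sum(x - mean for x in values)

def sum_of_squares(center):
    """Sum of squared differences between each value and a chosen center."""
    return sum((x - center) ** 2 for x in values)

# Property 2 ('Least Squares'): no other center gives a smaller sum.
at_mean = sum_of_squares(mean)
at_median = sum_of_squares(8)            # 8 is the median of this data set
at_shifted = sum_of_squares(mean + 1)

print(data_range, round(sum_of_deviations, 10), at_mean, at_median, at_shifted)
```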
Sample Standard Deviation
• Skewness
o Measures the degree of symmetry in a frequency distribution
o Determines how evenly (or unevenly) the values are distributed
on either side of the mean
Skewness = $3(\bar{X} - \text{median}) / S$
• Coefficient of Variation
o Allows for comparison of variability between spatial samples
o Tests which sample has the greatest variability
o Standard Deviation or Variance are absolute measures so, they are
influenced by the size of the values in the dataset
o To allow a comparison of variation across two or more geographic
samples, can use a relative measure of dispersion called
Coefficient of Variation
CV = $S / \bar{X}$ (standard deviation divided by the mean)
• Empirical Rule: (Normal Distribution)
$\bar{X}$ = 20
S = 5
1.) 68% fall between 20 +/- 5 or between 15 – 25
2.) 95% fall between 10 – 30
3.) 99.7% fall between 5 – 35
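A sketch of the dispersion measures in this section. It assumes the usual sample standard deviation with n – 1 in the denominator (which is what `statistics.stdev` computes), since the formula itself does not appear in these notes. The two small samples used for the coefficient of variation are made-up numbers.

```python
import statistics

def pearson_skewness(values):
    """Skewness = 3 * (mean - median) / S, with S the sample standard deviation."""
    s = statistics.stdev(values)
    return 3 * (statistics.mean(values) - statistics.median(values)) / s

def coefficient_of_variation(values):
    """CV = S / mean, a relative measure of dispersion."""
    return statistics.stdev(values) / statistics.mean(values)

def empirical_rule(mean, s):
    """Intervals expected to hold about 68%, 95% and 99.7% of normal data."""
    return {pct: (mean - k * s, mean + k * s)
            for k, pct in ((1, 68), (2, 95), (3, 99.7))}

temps = [7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2]
skew = pearson_skewness(temps)        # positive: the mean exceeds the median

# Hypothetical samples with the same S (2.0) but very different means.
sample_a = [10, 12, 14]
sample_b = [100, 102, 104]
cv_a = coefficient_of_variation(sample_a)
cv_b = coefficient_of_variation(sample_b)

intervals = empirical_rule(20, 5)

print(round(skew, 2), round(cv_a, 3), round(cv_b, 3), intervals)
```

The two hypothetical samples have identical standard deviations, yet the sample with the smaller mean has the larger CV, which is why a relative measure is needed when comparing samples of different magnitude.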
October 2, 2013
Z scores
• Standard scores are referred to as Z scores
• Indicate how many standard deviations separate a particular value from
the mean
• Z scores can be positive or negative depending on whether the value is greater than
or less than the mean
• Z score of the mean is 0, and a value one standard deviation above or below the mean has a Z score of +1 or –1
• Table of Normal Values provides probability information on a standardized
scale
• But, we can also calculate Z scores
• Formula involves comparing values to the mean value, and dividing by the
standard deviation
• Result is interpreted as the ‘number of standard deviations an observation
lies above or below the mean’
(E.g.) Rainfall in Toronto
Mean = 39.95 inches of rainfall
S = 7.5 inches
What is the Z score for 48 inches?
Z = (48 – 39.95) / 7.5
= 8.05 / 7.5
= 1.07
Therefore 48 inches is 1.07 standard deviations above the mean
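The rainfall calculation as a small function (the name `z_score` is just illustrative). Note the parentheses: the mean is subtracted before dividing by the standard deviation.

```python
def z_score(value, mean, sd):
    """Number of standard deviations a value lies above (+) or below (-) the mean."""
    return (value - mean) / sd

z_rain = z_score(48, 39.95, 7.5)        # about 1.07
z_mean = z_score(39.95, 39.95, 7.5)     # the mean itself has a Z score of 0
z_low = z_score(32.45, 39.95, 7.5)      # one standard deviation below the mean

print(round(z_rain, 2), z_mean, round(z_low, 2))
```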
Describing Bivariate Data
• Correlation
o Allows us to observe, statistically, the relationship
between two variables
Correlation alone never justifies the leap to
concluding that one variable causes the other
o Looking at the strength and direction of the relationship between
two variables
o Most common graphic technique is the scatterplot
• Direction of the Bivariate Relationship
o Positive
Going upwards and to the right
o Negative
To the right and downwards
o Neutral
Cluster in particular area, typically the middle
• Strength of the Bivariate Relationship
o Perfect association
Consistent upwards to the right in a straight line
o Strong association
Clump of points heading upwards to the right, not in a
straight line, but does appear closer together
o Weak association
Less of a straight line heading downwards to the right,
suggesting a negative direction of the scatter
o No association
Points are in a cluster, no sign of a straight line
• Correlation Coefficients
More rigorous approach to observing and measuring strength and direction of a
bivariate relationship
Most are constructed to have a maximum absolute value of 1.0 and can be positive or negative:
o +1.0 Perfect Positive Relationship
o –1.0 Perfect Negative Relationship
• Most common measure is Pearson’s Product Moment Correlation or
Pearson’s r
o Used for Interval / Ratio scale data
• Pearson’s r and Covariance
o At the basis of this is Covariance
o Covariance measures the degree to which two variables vary
together
o Begins with deviations around means of both variables, or:
Covariance formula: $\mathrm{COV}(X, Y) = \sum (X - \bar{X})(Y - \bar{Y}) / (n - 1)$
Pearson’s r
• Pearson’s r is expressed as the ratio of the Covariance of X and Y to the
product of the Standard Deviations of X and Y
• The higher the absolute value of r (closer to +1.0 or –1.0), the stronger the relationship is
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y}) / (n - 1)}{S_X S_Y}$$
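A sketch of Pearson's r built the way this section describes it: the covariance of X and Y divided by the product of their standard deviations. It assumes the sample (n – 1) versions of both, and the x/y values are invented purely to show the two extreme cases and one in-between case.

```python
import math

def covariance(xs, ys):
    """Sample covariance: sum of products of deviations, divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))

def pearson_r(xs, ys):
    """r = covariance(X, Y) / (S_X * S_Y)."""
    return covariance(xs, ys) / (sample_sd(xs) * sample_sd(ys))

# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
r_positive = pearson_r(x, [2, 4, 6, 8, 10])   # perfect positive relationship
r_negative = pearson_r(x, [10, 8, 6, 4, 2])   # perfect negative relationship
r_mixed = pearson_r(x, [2, 1, 4, 3, 5])       # positive but not perfect

print(round(r_positive, 3), round(r_negative, 3), round(r_mixed, 3))
```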
Probability
• Studying spatial patterns is a key concern of geographers
o Try to understand what has led to those patterns
• Geography is about describing, explaining, and predicting geographic
patterns and processes
• Use probability for situations when patterns have some degree of
uncertainty
o E.g. weather forecasts – Probability of Precipitation
• Probability focuses on the occurrence of an event
o Where one of several possible outcomes could result
o Outcomes are (and must be) mutually exclusive
• Can be thought of as frequency of an event occurring relative to all
possible outcomes
P (A) = F (A) / F (E) where…
• P (A) = probability of outcome A occurring
• F (A) = Absolute frequency of A
• F (E) = Frequency of all outcomes
E.g. 1
• Die has 6 faces numbered 1 – 6
• What is the likelihood of rolling a 6 in one throw?
P (6) = 1 / 6 or 1 in 6 or .167 or 16.7% chance
• Same probability exists for each of the other outcomes too
E.g. 2
Examine the record of wet and dry days over a 100 day period
62 days recorded as dry
38 days recorded as wet
What is the probability of a wet day occurring
P (wet) = # of wet days / total # of days = 38 / 100 = .38
Can also say 38% chance a wet day will occur
Alternatively, Probability of a Dry day occurring is…
P (dry) = 62 / 100 = .62
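The relative-frequency definition P (A) = F (A) / F (E) applied to both examples. The function and variable names are illustrative only.

```python
def probability(freq_outcome, freq_all):
    """P(A) = F(A) / F(E): frequency of A relative to all outcomes."""
    return freq_outcome / freq_all

# E.g. 1: one face out of six on a die
p_six = probability(1, 6)

# E.g. 2: 100-day record of wet and dry days
record = {"dry": 62, "wet": 38}
total_days = sum(record.values())
p_wet = probability(record["wet"], total_days)
p_dry = probability(record["dry"], total_days)

print(round(p_six, 3), p_wet, p_dry)
```

Wet and dry are the only two outcomes and cannot both occur on the same day, so their probabilities add up to 1.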
Rules
• Maximum probability of an outcome is 1.0 (100%)
o Each probability lies between 0 and 1:
0.0 ≤ P (A) ≤ 1.0
o Probabilities of all possible outcomes must add up to 1
• Addition Rule
o Used when finding the probability of one or another of several mutually exclusive outcomes of a single event
P (A or B) = P (A) + P (B)
• Even though the word or is in the statement, you’re
still adding them up
E.g. what is the probability of rolling a 6 or a 5 in a single throw?
• P (6) = .167
• P (5) = .167
o Therefore probability of a 5 OR a 6 = P (6) + P
(5) = .334 (more exactly 2/6 ≈ .333), which is about a 33% chance of
throwing a 6 or 5 in a single throw
The more outcomes you accept, the higher your
probability
• Multiplication Rule
o Used when finding probability of multiple independent events
P (A and B) = P(A) x P(B)
E.g. what is the probability of rolling two sixes in consecutive
throws?
• P (6) = .167
• P (6) = .167
Therefore, the probability of a 6 and a 6 = P (6) x P (6) =
.02778 or a 2.8% chance of throwing two sixes in a row
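Both rules can be checked with exact fractions, which avoids the small rounding drift that comes from adding or multiplying .167. This is a sketch for a fair six-sided die; the variable names are illustrative.

```python
from fractions import Fraction

p_five = Fraction(1, 6)
p_six = Fraction(1, 6)

# Addition rule: a 5 OR a 6 in a single throw (mutually exclusive outcomes).
p_five_or_six = p_five + p_six        # 1/3

# Multiplication rule: a 6 AND then another 6 in two independent throws.
p_two_sixes = p_six * p_six           # 1/36

print(p_five_or_six, round(float(p_five_or_six), 3))
print(p_two_sixes, round(float(p_two_sixes), 4))
```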
Probability Distributions
• Often see consistent or typical patterns of probabilities in certain situations
• These are called Probability Distributions
o Similar to frequency distributions
Y axis contains probability of outcomes rather than
frequency of outcomes
o Discrete and continuous
o 3 key types
Binomial
• Discrete probability distribution
• Used to determine the probability of multiple events in
independent trials
• Each independent event has only 2 possible
outcomes
o E.g. rain / no rain or flood / no flood
• Probability of event occurring is = p
• Probability of event not occurring = 1 – p = q
Poisson
• Discrete probability distribution
• Used when looking at events that occur randomly in
space and time
• Especially used for distributions over space
particularly with quadrat analysis of point patterns
o E.g. hail showers or a tornado touching down
in a particular location
• Also used when the probability of an event occurring is
much smaller than the probability of it not occurring (rare events)
Normal
• Most commonly applied distribution
• It is the basis for sampling theory and statistical
inference
• Mathematical formula is complex
• But, easier to understand via a symmetric graph
• Provides theoretical basis for sampling and statistical
inference
• Need to look at the ‘area under the curve’
• Total area under the curve represents 100% of
possible outcomes
• 50% of values lie to the right of the mean, and 50% to
the left – symmetry
• Need a methodology to effectively determine probability of values on the
distribution
• Could use integral calculus
• Easier to use the Table of Normal Values
• Observations must be standardized to use the table
o To do this, convert them to Z scores (Z-score table: check Blackboard)
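In place of a printed Table of Normal Values, Python's standard library (3.8+) offers `statistics.NormalDist`, whose `cdf` method returns the area under the curve to the left of a value. The sketch below reuses the Toronto rainfall figures from the Z-score example and assumes, for illustration only, that rainfall is normally distributed.

```python
from statistics import NormalDist

mean, sd = 39.95, 7.5

# Step 1: standardize the observation.
z = (48 - mean) / sd

# Step 2: look up the area under the standard normal curve (mean 0, sd 1).
standard_normal = NormalDist()
p_below = standard_normal.cdf(z)      # share of values at or below 48 inches
p_above = 1 - p_below                 # share of values above 48 inches

# Symmetry: half of the area lies on each side of the mean.
p_left_of_mean = standard_normal.cdf(0)

print(round(z, 2), round(p_below, 3), round(p_above, 3), p_left_of_mean)
```

Standardizing first and then using the standard normal curve gives the same area as working directly with `NormalDist(mean, sd).cdf(48)`.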
Sampling
• Aim of inferential statistics is to generalize about characteristics of larger
population
• So, need a process to obtain a sample
• Sampling can be spatial or non-spatial
o Geographers often deal with spatial questions; where things
actually happen
Any time the location of a variable’s value matters to
our research, the sampling is spatial – e.g. census tract
• Therefore, an essential skill for any geographer to have
Why Sample?
• Necessary in cases of extremely large populations
• Efficient and cost-effective way of understanding the population
• Highly detailed information can be obtained easily
• Allows for follow-up activity or repetition
Sampling Error
• If a sample is representative then it will accurately reflect the
characteristics of the population, without bias
• Element of randomness must be introduced to preserve the representative
sample
• Can never eliminate bias, only minimize it. Reducing bias means reducing
error
• Precision and accuracy help categorize sources of error
o Precision relates to the size of the sample – the larger the sample
and the closer it is to the population, the more precise the result
o Precision and accuracy may sound the same, but are different
Sampling Designs
A number of different sampling designs exist:
• Simple random
o A subset of individuals (a sample) is chosen from a larger set (a
population)
o Each individual is chosen at random
• Systematic sampling
o Implies some kind of structure – every 10th element, for instance
• Stratified
o When sub-populations vary considerably, it is advantageous to sample
each sub-population (stratum) independently
o Stratification is the process of grouping members of the population into
relatively homogeneous subgroups before sampling
o E.g. a political survey – if the respondents need to reflect the
diversity of the population, the researcher would specifically seek to
include participants from various minority groups (defined by, e.g., race or
religion) in proportion to their share of the total population
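A sketch of the three non-spatial designs using Python's `random` module. The population of 100 numbered elements, the two strata (70 and 30 members) and the sample size of 10 are all invented for illustration; the fixed seed only makes the run repeatable.

```python
import random

rng = random.Random(42)                     # seeded so the sketch is repeatable
population = list(range(1, 101))            # hypothetical element IDs 1..100
sample_size = 10

# Simple random: every element has the same chance of being chosen.
simple = rng.sample(population, sample_size)

# Systematic: random starting point, then every 10th element.
step = len(population) // sample_size
start = rng.randrange(step)
systematic = population[start::step]

# Stratified: sample each stratum separately, in proportion to its size.
strata = {"A": list(range(1, 71)), "B": list(range(71, 101))}
stratified = {}
for name, members in strata.items():
    share = round(sample_size * len(members) / len(population))
    stratified[name] = rng.sample(members, share)

print(simple)
print(systematic)
print(stratified)
```

With a 70/30 split, the stratified design draws 7 elements from stratum A and 3 from stratum B, so each subgroup appears in proportion to its share of the population.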
Can also have spatial sampling designs:
• Use Cartesian coordinates
o Grid system – reading coordinates on a grid
• Simple random
• Stratified random
o Impose a series of squares on the map like a grid
o Ensures that every location within each square has an equal probability of
being chosen
• Transect
o Tries to ensure that you have coverage of the entire land space
o Within the Cartesian points
o You can put as many lines as you wish
o Measure the length of each line
o Add up the proportion of land use that each line cuts through
Sampling Distribution
• Sample statistics will change or vary for each random sample selected
• Probability distributions for statistics are called sampling distributions
o A sampling distribution is the distribution of a statistic that is drawn
from all possible samples of a given size n
• Can be developed for any statistic, not just the mean
Central Limit Theorem
• Sampling distribution will have its own mean and standard deviation
• But… the mean of a sampling distribution has important properties –
summarized by Central Limit Theorem
• If all samples are randomly drawn and are independent, then the mean of
the Sampling Distribution of sample means will be the population mean μ
• The frequency distribution of sample means will be normally distributed
• What this means for us is that…
o When the sample size is large, the sample mean is likely to be
quite close to the population mean
• Mean of a large sample is more likely to be closer to the true population
mean than the mean of a smaller sample
Central Limit Theorem – Variability
Standard Deviation of the Sampling Distribution is equal to the sample standard
deviation divided by the square root of the sample size
This is called the Standard Error of the mean
Indicates how much a typical sample mean is likely to differ from the true
population mean
Measures the amount of Sampling Error
The larger the n the smaller the amount of sampling error
n > 30 = large sample & n < 30 = small sample
Standard error of the mean: $S / \sqrt{n}$
Standard error of a proportion: $\sqrt{pq / n}$, where q = 1 – p
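A small simulation can illustrate these claims. Everything here is an assumption made for the sketch: a deliberately skewed (exponential) population of 10,000 made-up values, samples of n = 50, and 2,000 repeated samples. Because the whole population is known in a simulation, the standard error is computed from the population standard deviation; in practice the sample standard deviation is used as its estimate, as in the notes.

```python
import math
import random
import statistics

rng = random.Random(0)                          # seeded so the run is repeatable

# Hypothetical skewed population.
population = [rng.expovariate(1.0) for _ in range(10_000)]
pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)

n = 50                                          # a "large" sample (n > 30)
sample_means = [statistics.mean(rng.sample(population, n))
                for _ in range(2_000)]

# Mean of the sampling distribution should be close to the population mean.
mean_of_means = statistics.mean(sample_means)

# Its standard deviation should be close to the standard error, sd / sqrt(n).
sd_of_means = statistics.stdev(sample_means)
standard_error = pop_sd / math.sqrt(n)

# In practice only one sample is available, so S / sqrt(n) is the estimate.
one_sample = rng.sample(population, n)
estimated_se = statistics.stdev(one_sample) / math.sqrt(n)

print(round(pop_mean, 3), round(mean_of_means, 3))
print(round(standard_error, 3), round(sd_of_means, 3), round(estimated_se, 3))
```

Increasing n in this sketch shrinks the spread of the sample means, which is the sense in which a larger n means less sampling error.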
• How large is large?
• If sampled population is normal, then sampling distribution of means will
also be normal, no matter what the sample size
• If the sampled population is approximately normal, then the sampling
distribution of means will be approximately normal for relatively small
sample sizes
• When the population is skewed, the sample size must be large (n>30)
before the sampling distribution wil
