GGR 270 Lecture Notes

University of Toronto St. George
Stephen Swales

GGR270 – Damian Dupuy

September 11, 2013

What Is This Course About?
• Statistical tools or techniques
• Help support research projects, papers, reports
• Skill set for future employment
• Understand and critically comment on the work of others

We will…
• Examine the role of statistics in research
• Apply the most appropriate technique or set of techniques
• Understand and interpret the result
• Ask: “What does this result mean to my research problem?”

General Course Topics
• Describing data
• Graphs
• Simple measures – one variable
• Average
• Measures of center and measures of dispersion
• E.g. getting your grade back and asking for the class average
• Simple measures – two variables
• Probability and distributions
• Simple probability
• Sampling distributions
• Statistical estimation
• Process
• Techniques and tests

What are (is) Statistics?
• Any collection of numerical data
  o Vital statistics – birth rates, death rates (vital statistics for Toronto are released every year)
  o Economic indicators – unemployment rates, income levels
    ▪ A way of measuring economic success or failure
  o Social statistics – poverty rates, crime rates
    ▪ When looking at neighborhood change, causality is often misplaced: crime could be caused by other variables, such as government influence
• Methodology for collecting, presenting and analyzing data
  o Summarize findings
  o Theory validation
  o Forecasting
    ▪ “The labor force is growing at this rate now; we estimate that it will be this much in 20 years”
  o Evaluate
    ▪ Helps us understand what is going on in a subset of a bigger picture
  o Select among alternatives

Descriptive and Inferential Statistics
• Descriptive
  o Organization and summary of data
  o Replace a large set of numbers with a few summary measures
    ▪ E.g. average household income in Toronto is $60,000 – summarizing everybody’s income
  o Goal of the techniques is to minimize information loss
• Inferential
  o Links descriptive statistics to probability theory
    ▪ Gives us the ability to speak with a higher degree of confidence
  o Generalize results from a smaller group to a much larger one
    ▪ Instead of asking everyone in the class in order to find the mean score on the final exam, we can take a subset of about 50 students and average their scores; if done correctly, the result should be very close to the result for the whole class
      • Asking everyone is time consuming and, in the real world, expensive
  o Goal is to ‘infer’ something about a larger group by looking at a smaller one

Population and Sample
• Population
  o Total set of elements (objects, persons, regions, etc.) under examination
  o For example, all potential voters in an urban area
    ▪ We could examine voter behavior
  o Denoted as N
• Sample
  o Subset of elements in the population
  o Used to make inferences about certain characteristics of the population
  o Try to predict the behavior of the population by looking closely at the sample
    ▪ If the sample is set up and conducted properly, we can draw a fairly accurate conclusion
  o Denoted as n

[Figure: a population (N) containing three samples: Sample 1 (n1), Sample 2 (n2) and Sample 3 (n3)]

September 18, 2013

Variables and Data
• Variable
  o Characteristic of the population that changes or varies over time
    ▪ Examples include temperature, income, education, etc.
  o Observe and measure variables
  o Two key categories
    ▪ Quantitative – numerical, e.g. number of students who…
      • Discrete (1, 2, 3, 4…) or continuous (1.5, 2.76, 3.445…)
        o Who passed GGR270 and who failed GGR270
    ▪ Qualitative – non-numerical, e.g. male/female, plant species
• Data
  o Results from measuring variables – a set of measurements
  o Different categories – univariate, bivariate, multivariate

Variables – Scales of Measurement
• The scale defines the amount of information a variable contains and which statistical techniques can be used
• Four scales, from lowest to highest information: Nominal, Ordinal, Interval, Ratio

Nominal
• Lowest scale of measurement, no numerical value attached
• Classifies observations into mutually exclusive and collectively exhaustive groups
• Simply the name or category of the variable
• Often called ‘categorical’ data
  o E.g. occupation type, gender, place of birth

Ordinal
• Stronger scale, as it allows data to be ordered or ranked
  o E.g. 12 largest towns in a region, income by group (high, middle, low)

Interval
• Unit distance separating numbers is important
  o E.g. temperature (F or C)
• But it does not allow for ratios and does not have a true ‘Zero’

Ratio
• Strongest scale of measurement
• Ratios of distances on a number scale
• Presence of an absolute ‘ZERO’
  o E.g. temperature (Kelvin), income from all sources ($), population of a city
• In practice, we consider interval/ratio scales together

Graphs
• Pie chart
  o Circular graph where measurements are distributed among categories
• Bar graph
  o Graph where measurements are distributed among categories

Relative Frequency Histogram
• Graphs quantitative, rather than qualitative, data
  o Quantitative data – e.g. income levels, scores, etc.
• Vertical axis (Y) shows “how often” measurements fall into a particular class or subinterval
• Classes are plotted on the horizontal (X) axis

Rules of thumb
• Use 5 to 12 intervals or categories
  o Fewer than that and the graph is useless for interpreting or explaining the data; more than that and it is hectic
• Number of classes: $k = 1 + 3.3 \log_{10}(\text{number of observations})$
• Classes must be mutually exclusive and collectively exhaustive
• Intervals should be the same width
  o Don’t lengthen the last bar to make the data fit; add another class
• RELATIVE FREQUENCY HISTOGRAM DOES NOT HAVE GAPS IN BETWEEN BARS

Example
• Observations: 1, 11, 14, 21, 23, 27, 28, 33, 35, 50
• Number of classes (rows in the “Excel table”): $k = 1 + 3.3 \log_{10}(10) = 4.3$, rounded up to 5
• Class width (range per “Excel row”): (largest value – smallest value) / number of classes = (50 – 1) / 5 = 9.8, rounded up to 10

Observing The Graph – Skewness
• Is the distribution symmetric?
• If the graph is skewed, it is usually because outliers are pulling it, exerting a greater degree of influence

Observing The Graph – Mode
• How many peaks are there?
• Unimodal – 1 peak (e.g. the normal distribution)
• Bimodal – 2 peaks

Observing The Graph – Kurtosis
• How peaked is the distribution?
• Mesokurtic – moderately peaked
• Platykurtic – flat peak
• Leptokurtic – very high peak, with little data beside the peak itself

Describing Data: Measures of the Center & Measures of Variability

Statistics and Parameters
• Graphs are limited in what they can tell us
• It is difficult to make inferences about a population when looking at a subset or sample
• Therefore, we need to use numerical measures
• Measures associated with the population are called parameters
• Measures associated with a sample are called statistics

Measures of the Center
• Mean
  o Most commonly used measure of central tendency
  o Sum of all values or observations divided by the number of observations
    ▪ E.g. temperature data: 7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2; mean = 67 / 7 = 9.57, or about 9.6
• Median
  o Value occupying the ‘middle position’ in an ordered set of observations
  o Order the observations, lowest to highest, and find the middle position
    ▪ Middle position = .5 (n + 1) (n for a sample, N for a population)
  1. E.g. (odd number of observations) temperature data: 7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2
     Ordered: 7.3, 8.2, 8.4, 9.1, 9.4, 10.7, 13.9
     Using the formula, .5 (7 + 1) = 4, so the median is in the 4th position of the ordered set: 9.1
  2. (Even number of observations) .5 (6 + 1) = 3.5, so the median lies at position 3.5: add the 3rd and 4th values and divide by 2

September 25, 2013

Measures of Center
• Mode
  o Value that occurs with the highest frequency
  o Allows you to locate the peak of a relative frequency histogram

Choosing an Appropriate Measure
• Mean is usually the best measure, as it is sensitive to a change in a single observation
  o Yields the most information
    ▪ Minimizes information loss
  o But it is not a good measure when
    ▪ The distribution is bimodal (2 modes)
    ▪ The distribution is skewed
      • Outliers (extreme values) are present in the data set, e.g. {2, 5, 6, 8, 9, 21, 22}
      • Bimodal (mean and median are in the same position, between the two modes)

Measures of Dispersion
• Range
  o Simplest measure of dispersion
  o Takes the difference between the smallest and largest values in the dataset, at the interval/ratio scale
  o But it is influenced by outliers
  o Range = $X_{max} - X_{min}$
• Quartiles
  o Can yield more information and lessen the impact of outliers
  o Data are divided into quartiles (4 groups)
• Standard Deviation and Variance
  o Two of the most commonly used measures of dispersion
  o Compare each value to the mean
• Two key properties of the differences between each value and the mean, $(X_i - \bar{X})$
  o Sum of differences will always add up to zero
  o Sum of squared differences will be the minimum sum possible.
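As a rough check of the measures of center and the two deviation properties described above, the following Python sketch reuses the temperature data from the notes. The helper name `sum_sq` and the comparison against the median are illustrative choices, not part of the lecture.

```python
from statistics import mean, median

# Temperature data from the notes (with 9.4, as in the median example)
temps = [7.3, 10.7, 9.1, 8.4, 13.9, 9.4, 8.2]

x_bar = mean(temps)    # sum of observations / number of observations
med = median(temps)    # middle value of the ordered observations

# Property 1: deviations from the mean sum to zero (up to float rounding)
deviations = [x - x_bar for x in temps]
sum_dev = sum(deviations)

def sum_sq(center, data):
    """Sum of squared differences between each value and `center` (hypothetical helper)."""
    return sum((x - center) ** 2 for x in data)

# Property 2: the sum of squares is smallest when taken about the mean
ss_mean = sum_sq(x_bar, temps)
ss_median = sum_sq(med, temps)

print(round(x_bar, 2))        # 9.57
print(med)                    # 9.1
print(abs(sum_dev) < 1e-9)    # True
print(ss_mean < ss_median)    # True
```

Any other center value (not just the median) should give a sum of squares at least as large as `ss_mean`.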
    ▪ Called the ‘Least Squares’ property

Sample Standard Deviation
• $S = \sqrt{\dfrac{\sum (X_i - \bar{X})^2}{n - 1}}$

• Skewness
  o Measures the degree of symmetry in a frequency distribution
  o Determines how evenly (or unevenly) the values are distributed either side of the mean
  o $\text{Skewness} = \dfrac{3(\bar{X} - \text{median})}{S}$
• Coefficient of Variation
  o Allows for comparison of variability across spatial samples
  o Tests which sample has the greatest variability
  o Standard deviation and variance are absolute measures, so they are influenced by the size of the values in the dataset
  o To allow a comparison of variation across two or more geographic samples, we can use a relative measure of dispersion called the Coefficient of Variation
  o $CV = \dfrac{S}{\bar{X}}$, i.e. standard deviation divided by mean
• Empirical Rule (Normal Distribution), with $\bar{X} = 20$ and $S = 5$
  1.) 68% of values fall between 20 +/- 5, or between 15 and 25
  2.) 95% fall between 10 and 30
  3.) 99.7% fall between 5 and 35

October 2, 2013

Z scores
• Standard scores are referred to as Z scores
• Indicate how many standard deviations separate a particular value from the mean
• Z scores can be positive or negative, depending on whether the value is greater than or less than the mean
• Z score of the mean is 0, and one standard deviation is + or – 1
• Table of Normal Values provides probability information on a standardized scale
• But we can also calculate Z scores
• Formula compares a value to the mean and divides by the standard deviation: $Z = \dfrac{X_i - \bar{X}}{S}$
• Result is interpreted as the ‘number of standard deviations an observation lies above or below the mean’

E.g. rainfall in Toronto
• Mean = 39.95 inches of rainfall
• S = 7.5 inches
• What is the Z score for 48 inches?
• Z = (48 – 39.95) / 7.5 = 8.05 / 7.5 = 1.07
• Therefore 48 inches is 1.07 standard deviations above the mean

Describing Bivariate Data
• Correlation
  o Allows us to observe, statistically, the relationship between two variables
    ▪ You can never make the leap of faith of concluding that this is a ‘causality’ effect
  o Looking at the strength and direction of the relationship between two variables
  o Most common graphic technique is the scatterplot
• Direction of the Bivariate Relationship
  o Positive
    ▪ Going upwards and to the right
  o Negative
    ▪ To the right and downwards
  o Neutral
    ▪ Cluster in a particular area, typically the middle
• Strength of the Bivariate Relationship
  o Perfect association
    ▪ Consistently upwards to the right in a straight line
  o Strong association
    ▪ Clump of points heading upwards to the right, not in a straight line, but close together
  o Weak association
    ▪ Less of a straight line, heading downwards to the right, suggesting a negative direction of the scatter
  o No association
    ▪ Points are in a cluster, with no sign of a straight line
• Correlation Coefficients
  o More rigorous approach to observing and measuring the strength and direction of a bivariate relationship
  o Most are constructed to have a maximum value of 1.0 and can be positive or negative:
    1. +1.0 Perfect Positive Relationship
    2. –1.0 Perfect Negative Relationship
• Most common measure is Pearson’s Product Moment Correlation, or Pearson’s r
  o Used for interval/ratio scale data
• Pearson’s r and Covariance
  o Covariance is the basis of Pearson’s r
  o Covariance measures the degree to which two variables vary together
  o Begins with deviations around the means of both variables

Covariance Formula
• $\mathrm{Cov}(X, Y) = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{n - 1}$

Pearson’s r
• Pearson’s r is expressed as the ratio of the covariance of X and Y to the product of the standard deviations of X and Y
• $r = \dfrac{\sum (X_i - \bar{X})(Y_i - \bar{Y}) / (n - 1)}{S_X S_Y}$
• The closer the value of r is to +1.0 or –1.0, the stronger the relationship is

Probability
• Studying spatial patterns is a key concern of geographers
  o Try to understand what has led to those patterns
• Geography is about describing, explaining, and predicting geographic patterns and processes
• Use probability for situations where patterns have some degree of uncertainty
  o E.g. weather forecasts – Probability of Precipitation
• Probability focuses on the occurrence of an event
  o Where one of several possible outcomes could result
  o Outcomes are (and must be) mutually exclusive
• Can be thought of as the frequency of an event occurring relative to all possible outcomes

P (A) = F (A) / F (E) where…
• P (A) = probability of outcome A occurring
• F (A) = absolute frequency of A
• F (E) = frequency of all outcomes

E.g. 1
• Die has 6 faces numbered 1 – 6
• What is the likelihood of rolling a 6 in one throw?
• P (6) = 1 / 6, or 1 in 6, or .167, or a 16.7% chance
• Same probability exists for each of the other outcomes too

E.g. 2
• Examine the record of wet and dry days over a 100 day period
• 62 days recorded as dry
• 38 days recorded as wet
• What is the probability of a wet day occurring?
• P (wet) = # of wet days / total # of days = 38 / 100 = .38
• Can also say there is a 38% chance a wet day will occur
• Alternatively, the probability of a dry day occurring is P (dry) = 62 / 100 = .62

Rules
• Maximum probability of an outcome is 1.0 (100%)
  o All probabilities must add up to 1, and for each outcome
    ▪ 0.0 ≤ P (A) ≤ 1.0
• Addition Rule
  o Used when finding the probability of one or another of several mutually exclusive outcomes of a single event
    ▪ P (A or B) = P (A) + P (B)
      • Even though the word ‘or’ is in the statement, you are still adding the probabilities
  o E.g. what is the probability of rolling a 6 or a 5 in a single throw?
    ▪ P (6) = .167
    ▪ P (5) = .167
    ▪ Therefore the probability of a 5 OR a 6 = P (6) + P (5) = .334 (exactly 2 / 6, or about .333), a roughly 33% chance of throwing a 6 or a 5 in a single throw
    ▪ The more outcomes you include, the higher the probability
• Multiplication Rule
  o Used when finding the probability of multiple independent events
    ▪ P (A and B) = P (A) x P (B)
  o E.g. what is the probability of rolling two sixes in subsequent throws?
    ▪ P (6) = .167
    ▪ P (6) = .167
    ▪ Therefore, the probability of a 6 and a 6 = P (6) x P (6) = .0278, or a 2.8% chance of throwing a 6 and then another 6

Probability Distributions
• Often see consistent or typical patterns of probabilities in certain situations
• These are called Probability Distributions
  o Similar to frequency distributions
    ▪ Y axis contains the probability of outcomes rather than the frequency of outcomes
  o Discrete and continuous
  o 3 key types
    ▪ Binomial
      • Discrete probability distribution
      • Used to determine the probability of multiple events in independent trials
      • Each independent event has only 2 possible outcomes
        o E.g. rain / no rain or flood / no flood
      • Probability of the event occurring = p
      • Probability of the event not occurring = 1 – p = q
    ▪ Poisson
      • Discrete probability distribution
      • Used when looking at events that occur randomly in space and time
      • Especially used for distributions over space, particularly with quadrat analysis of point patterns
        o E.g. hail showers or a tornado touching down in a particular location
      • Also used when the probability of an event occurring is much lower than the probability of it not occurring (rare events)
    ▪ Normal
      • Most commonly applied distribution
      • It is the basis for sampling theory and statistical inference
      • Mathematical formula is complex
      • But it is easier to understand via a symmetric graph
      • Need to look at the ‘area under the curve’
      • Total area under the curve represents 100% of possible outcomes
      • 50% of values lie to the right of the mean, and 50% to the left – symmetry
      • Need a methodology to effectively determine the probability of values on the distribution
      • Could use integral calculus
      • Easier to use the Table of Normal Values
      • Observations must be standardized to use the table
        o To do this, convert them to Z scores (Z-score table: check Blackboard)

Sampling
• Aim of inferential statistics is to generalize about characteristics of a larger population
• So, we need a process to obtain a sample
• Sampling can be spatial or non-spatial
  o Geographers often deal with spatial questions: where things actually happen
    ▪ Any time the location of a variable’s values matters to our research, the sampling is spatial, e.g. census tracts
• Therefore, sampling is an essential skill for any geographer to have

Why Sample?
• Necessary in cases of extremely large populations
• Efficient and cost-effective way of understanding the population
• Highly detailed information can be obtained easily
• Allows for follow-up activity or repetition

Sampling Error
• If a sample is representative, it will accurately reflect the characteristics of the population, without bias
• An element of randomness must be introduced to keep the sample representative
• Can never eliminate bias, only minimize it. Reducing bias means reducing error
• Precision and accuracy help categorize sources of error
  o Precision relates to the size of the sample: the larger the sample and the closer it is to the population, the more exact the result
  o Precision and accuracy may sound the same, but they are different

Sampling Designs
A number of different sampling designs exist:
• Simple random
  o A subset of individuals (a sample) chosen from a larger set (a population)
    ▪ Each individual is chosen at random
• Systematic
  o Implies some kind of structure – every 10th element, for instance
• Stratified
  o When sub-populations vary considerably, it is advantageous to sample each sub-population (stratum) independently
  o Stratification is the process of grouping members of the population into relatively homogeneous subgroups before sampling
    ▪ E.g. a political survey – if the respondents need to reflect the diversity of the population, the researcher would specifically seek to include participants from various minority groups (for example by race or religion), in proportion to their share of the total population

Can also have spatial sampling designs:
• Simple random
  o Use Cartesian coordinates: a grid system, reading coordinates on a grid
• Stratified random
  o Impose a series of squares on the map, like a grid
  o Ensures that every location within each square has an equal probability of being chosen
• Transect
  o Tries to ensure coverage of the entire land area
  o Within the Cartesian grid, you can draw as many lines as you wish
  o Measure the length of each line
  o Add up the proportion of each land use that the lines cut through

Sampling Distribution
• Sample statistics will change or vary for each random sample selected
• Probability distributions for statistics are called sampling distributions
  o A sampling distribution is the distribution of a statistic drawn from all possible samples of a given size n
• Can be developed for any statistic, not just the mean

Central Limit Theorem
• A sampling distribution will have its own mean and standard deviation
• But the mean of a sampling distribution has important properties, summarized by the Central Limit Theorem
• If all samples are randomly drawn and are independent, then the mean of the sampling distribution of sample means will be the population mean μ
• The frequency distribution of sample means will be normally distributed
• What this means for us is that…
  o When the sample size is large, the sample mean is likely to be quite close to the population mean
• Mean of a large sample is more likely to be close to the true population mean than the mean of a smaller sample

Central Limit Theorem – Variability
• Standard deviation of the sampling distribution is equal to the sample standard deviation divided by the square root of the sample size
• This is called the Standard Error of the mean
• Indicates how much a typical sample mean is likely to differ from the true population mean
• Measures the amount of sampling error
• The larger the n, the smaller the amount of sampling error
• n > 30 = large sample and n < 30 = small sample
• Standard error of the mean: $\sigma_{\bar{X}} = \dfrac{S}{\sqrt{n}}$
• Standard error of a proportion: $\sigma_{p} = \sqrt{\dfrac{p q}{n}}$, where q = 1 – p

How large is large?
• If the sampled population is normal, then the sampling distribution of means will also be normal, no matter what the sample size
• If the sampled population is approximately normal, then the sampling distribution of means will be approximately normal for relatively small sample sizes
• When the population is skewed, the sample size must be large (n>30) before the sampling distribution wil
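
To make the standard error idea concrete, here is a small Python sketch. The standard error of the mean follows the description in the notes (sample standard deviation divided by the square root of n). The formula used for the standard error of a proportion, sqrt(p(1 – p) / n), is the usual textbook form and is an assumption here, since the notes only name it. The skewed population, the seed, and the function names are all hypothetical, chosen only for illustration.

```python
import math
import random
from statistics import mean, stdev

def standard_error_mean(sample):
    """Sample standard deviation divided by the square root of the sample size."""
    return stdev(sample) / math.sqrt(len(sample))

def standard_error_proportion(p, n):
    """Assumed textbook form: sqrt(p * q / n) with q = 1 - p."""
    return math.sqrt(p * (1 - p) / n)

# Hypothetical skewed population (exponential, mean about 50) for illustration
random.seed(270)
population = [random.expovariate(1 / 50) for _ in range(20_000)]
pop_mean = mean(population)

def sample_means(n, draws=1000):
    """Means of `draws` simple random samples, each of size n."""
    return [mean(random.sample(population, n)) for _ in range(draws)]

means_small = sample_means(5)    # small samples (n < 30)
means_large = sample_means(50)   # large samples (n > 30)

# Spread of the sampling distribution shrinks as n grows
spread_small = stdev(means_small)
spread_large = stdev(means_large)

print("population mean:", round(pop_mean, 2))
print("mean of large-sample means:", round(mean(means_large), 2))
print("spread of sample means, n=5 vs n=50:",
      round(spread_small, 2), round(spread_large, 2))

# Wet-day example from the notes: p = .38 over n = 100 days
print("SE of proportion:", round(standard_error_proportion(0.38, 100), 4))
```

In this sketch the spread of the sample means for n = 50 should come out well below the spread for n = 5, which is the "larger n, smaller sampling error" point made above.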