Lectures 1 and 2
Geographical / Spatial Data
A datum is regarded as spatial if it can be associated with a location.
Spatial data that have reference to a location on the Earth’s surface are
termed geographical or georeferenced data.
Spatial vs. non-spatial statistical analysis
Non-spatial analysis
o Spatial (geographical) data are analyzed using conventional
(non-spatial) statistical methods.
o The geographical coordinates are excluded from the computations.
o The results are independent of the spatial arrangement of
the geographical data.
Spatial analysis
o Spatial (geographical) data are analyzed using spatial statistical
methods.
o The geographical coordinates are included in the computations.
o The results depend on the spatial arrangement of the geographical
data.
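The contrast above can be illustrated with a small sketch. The grid, the random seed, and the helper neighbour_contrast are hypothetical and chosen only for illustration; they are not part of the lecture. The same values are placed on a grid twice, once in a smooth west-to-east arrangement and once shuffled, and a non-spatial summary is compared with a simple summary that uses location.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: a 10 x 10 grid whose values rise smoothly from west to east.
grid = np.tile(np.arange(10, dtype=float), (10, 1))

# The same values, randomly reassigned to locations.
shuffled = rng.permutation(grid.ravel()).reshape(grid.shape)


def neighbour_contrast(z):
    """Mean absolute difference between horizontally and vertically adjacent cells."""
    dx = np.abs(np.diff(z, axis=1))
    dy = np.abs(np.diff(z, axis=0))
    return (dx.sum() + dy.sum()) / (dx.size + dy.size)


# Non-spatial summaries ignore location, so both arrangements agree.
mean_same = bool(np.isclose(grid.mean(), shuffled.mean()))
var_same = bool(np.isclose(grid.var(), shuffled.var()))

# The spatial summary uses location, so it changes when values are rearranged.
contrast_original = neighbour_contrast(grid)
contrast_shuffled = neighbour_contrast(shuffled)

print(mean_same, var_same, contrast_original, contrast_shuffled)
```

The mean and variance are identical for both grids, while the neighbour contrast is small for the smooth arrangement and much larger after shuffling.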
Properties of Spatial Data
Spatial dependence (or spatial autocorrelation)
o The first law of geography: everything is related to everything else,
but near things are more related than distant things.
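One common way to put a number on the first law is Moran's I, which is not named in these notes and is shown here only as an illustration. With weights w_ij, deviations d_i = z_i - mean(z), and S0 equal to the sum of all weights, I = (n / S0) * sum_ij(w_ij * d_i * d_j) / sum_i(d_i^2). The sketch below assumes a regular grid with binary rook (shared-edge) weights; the function names rook_weights and morans_i are hypothetical.

```python
import numpy as np


def rook_weights(nrows, ncols):
    """Binary weights: w[i, j] = 1 if grid cells i and j share an edge."""
    n = nrows * ncols
    w = np.zeros((n, n))
    for r in range(nrows):
        for c in range(ncols):
            i = r * ncols + c
            if c + 1 < ncols:  # neighbour to the east
                w[i, i + 1] = w[i + 1, i] = 1
            if r + 1 < nrows:  # neighbour to the south
                w[i, i + ncols] = w[i + ncols, i] = 1
    return w


def morans_i(values, w):
    """Moran's I for values on the grid, flattened in row-major order."""
    z = np.asarray(values, dtype=float).ravel()
    d = z - z.mean()
    return (z.size / w.sum()) * (d @ w @ d) / (d @ d)


w = rook_weights(4, 4)

# Similar values side by side: positive spatial autocorrelation.
clustered = np.array([[1, 1, 0, 0]] * 4)

# Every cell differs from all of its neighbours: negative autocorrelation.
checkerboard = np.indices((4, 4)).sum(axis=0) % 2

i_clustered = morans_i(clustered, w)
i_checkerboard = morans_i(checkerboard, w)
print(i_clustered, i_checkerboard)
```

Values near +1 indicate that nearby observations are alike, values near 0 indicate no spatial pattern, and negative values indicate that neighbours tend to differ.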
Spatial heterogeneity (or non-stationarity)
o The second law of geography: conditions vary (“smoothly”) over
the Earth’s surface.
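Spatial heterogeneity can be seen by comparing local summaries with the global one. The sketch below assumes a hypothetical surface with a smooth west-to-east trend and a hypothetical helper block_means; neither comes from the lecture.

```python
import numpy as np

# Hypothetical surface: values increase smoothly from west to east.
x = np.arange(20)
surface = np.tile(0.5 * x, (20, 1))


def block_means(z, size):
    """Mean of each non-overlapping size x size block of the grid."""
    r, c = z.shape
    trimmed = z[: r - r % size, : c - c % size]
    blocks = trimmed.reshape(r // size, size, c // size, size)
    return blocks.mean(axis=(1, 3))


global_mean = surface.mean()
local_means = block_means(surface, 5)

print(global_mean)
print(local_means)
```

A single global mean hides the fact that the local means change steadily across the study area, which is what non-stationarity refers to.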
The properties of geographical (spatial) data present a fundamental
challenge to classical (non-spatial) statistics.
Spatial dependence violates the classical assumption of independence, and
spatial heterogeneity violates the classical assumption of homogeneity.
Spatial statistics: methods specifically designed to analyze the properties
of geographical data.
Exploratory Spatial Data Analysis (ESDA)
What is exploratory data analysis (EDA)?
o Objectives of EDA
Pattern detection in data
Hypothesis formulation from data
Some aspects of model assessment
o EDA’s methods
Graphical and visual methods (histograms, box plots, scatter plots)
Descriptive methods rather than formal hypothesis testing
Importance of “staying close to the original data”
What is exploratory spatial data analysis (ESDA)?
o Extension of EDA to detect spatial properties of data
o Additional techniques to those found in EDA for:
Detecting spatial dependence
Detecting spatial heterogeneity (non-stationarity)
GIS Data and Linking
o GIS data
Attribute data (tables, graphs, etc.)
Geographical data (maps)
o Linking: Dynamic graphics
Linking attribute data (histogram) and geographical data
Visualizing data in the attribute space and the geographical
space at the same time
Useful for exploring the spatial stationarity (homogeneity) of
spatial patterns and processes.
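The linking idea can be sketched without any plotting: selecting ("brushing") one histogram bin returns the map locations of the observations in that bin. The coordinates, attribute values, and the function brush below are hypothetical and only illustrate the mechanism; an interactive GIS tool would highlight the selected locations on the map.

```python
import numpy as np

# Hypothetical observations: (x, y) coordinates and one attribute per location.
coords = np.array(
    [[0, 0], [1, 0], [2, 0], [0, 1], [1, 1], [2, 1]], dtype=float
)
values = np.array([3.0, 4.0, 12.0, 2.0, 5.0, 14.0])

# Attribute space: a histogram of the values.
counts, edges = np.histogram(values, bins=3)


def brush(bin_index):
    """Boolean mask of the observations that fall in one histogram bin."""
    lo, hi = edges[bin_index], edges[bin_index + 1]
    is_last_bin = bin_index == len(edges) - 2
    # np.histogram treats the final bin as closed on the right.
    upper = values <= hi if is_last_bin else values < hi
    return (values >= lo) & upper


# Geographical space: where are the observations from the highest bin?
selected = coords[brush(2)]
print(counts)
print(selected)
```

Here the highest bin maps onto the two easternmost locations, which suggests that the attribute is not homogeneous across the study area.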
o Spatial Heterogeneity (homogeneity)
• Linking attribute data (histogram) and geographical
data
Linked box plot (box-and-whisker plot)
• Provides a graphical summary of important features
of a dataset
o Median: the middle value of the ordered data
Also described as the second quartile (Q2)
o Q1 – the lower quartile (25% of the data lie below it)
o Q3 – the upper quartile (25% of the data lie above it)
o Interquartile range (data between th