IS415 Geospatial Analytics for Business Intelligence Lesson 10: Geospatial Data Analysis-Point Patterns Analysis
What you will learn from this lesson • The differences between GIS analysis and geospatial data analysis • Challenges faced in analysing geospatial data • The basic concepts of point patterns and point patterns analysis techniques
Core Competencies • Able to apply appropriate spatial point analysis techniques to gain insights • Able to interpret spatial point analysis results accurately
Background of the study • Shanghai retail (tobacco) audit study • Account classification • Total market volume • Volume across key price points and channels • Individual brand profiles
The study area • [Map of Shanghai with district labels: Zhabei District, Yangpu Area, Putuo District, Pudong New Area, Changning District, Huangpu District, Xuhui District, Luwan District]
Sales by channel • Mom & Pop outlets account for 58% of total sales volume. • The average sales per Mom & Pop outlet are much lower than for Supermarkets, Hypermarkets and Convenience stores. • 16% of the market is made up of brands priced less than 3 RMB. 11% of this volume is generated by Mom & Pops. • Supermarkets are the second most important channel, with 15% of total sales generated through this channel. • Convenience stores generate 13% of total sales, while tobacconists are the fourth most important channel.
Questions: • Where are the locations of the different channel stores? • Do these channel stores tend to cluster together, or are they evenly distributed? • Where are the locations of the top 10% channel stores? • Are the locations of the top 10% channel stores evenly distributed spatially? • Is there any association between the distribution of the top 10% channel stores and the distribution of the offices?
Spatial point pattern analysis methods • Kernel density estimation • Ripley’s K function • L function • D function • K.hat12 function
Kernel density estimation (Silverman 1986) • A method to compute the intensity of a point distribution • The general formula: • Graphically:
The kernel functions • Normal distribution, quartic, triangular
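The general formula on the preceding slide is not preserved in this transcript. As a minimal sketch, assuming the common form in which the intensity at a location is the sum of kernel weights of all events divided by the squared bandwidth, a quartic-kernel estimate could look like the following. The function names, the bandwidth and the store coordinates are illustrative assumptions, not material from the study, and no edge correction is applied.

```python
import math

def quartic_kernel(u):
    """Quartic (biweight) kernel on the unit disc; u = distance / bandwidth."""
    if u >= 1.0:
        return 0.0
    # The 3/pi factor makes the kernel integrate to 1 over the unit disc.
    return (3.0 / math.pi) * (1.0 - u * u) ** 2

def kernel_intensity(location, events, bandwidth):
    """Estimate the intensity (events per unit area) at one location."""
    x, y = location
    total = 0.0
    for ex, ey in events:
        u = math.hypot(x - ex, y - ey) / bandwidth
        total += quartic_kernel(u) / bandwidth ** 2
    return total

# Hypothetical store coordinates (arbitrary units), not data from the study.
stores = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (4.0, 4.0)]
near_cluster = kernel_intensity((1.0, 1.0), stores, bandwidth=1.0)
far_away = kernel_intensity((2.5, 2.5), stores, bandwidth=1.0)
```

Evaluating `kernel_intensity` over a regular grid of locations would give the kind of density surface the slide shows graphically. Swapping `quartic_kernel` for a normal or triangular kernel changes the smoothness of the surface, while the bandwidth usually matters more.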
Ripley’s K function (Ripley, 1981) • A method to estimate the second-order properties of a point process
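A minimal sketch of a naive K estimate may help make the idea concrete. It assumes the simple estimator that counts ordered pairs of events within distance r and scales the count by area / n²; it applies no edge correction, which packaged implementations normally do. The function name `k_hat` and the coordinates are illustrative assumptions.

```python
import math

def k_hat(points, r, area):
    """Naive Ripley's K estimate at distance r, without edge correction."""
    n = len(points)
    pairs = 0
    for i, (xi, yi) in enumerate(points):
        for j, (xj, yj) in enumerate(points):
            # Count ordered pairs of distinct events no more than r apart.
            if i != j and math.hypot(xi - xj, yi - yj) <= r:
                pairs += 1
    return area * pairs / (n * n)

# Hypothetical pattern in a 10 x 10 region: a tight cluster plus one outlier.
pts = [(1.0, 1.0), (1.5, 1.0), (1.0, 1.5), (9.0, 9.0)]
k_at_1 = k_hat(pts, r=1.0, area=100.0)
csr_benchmark = math.pi * 1.0 ** 2  # expected K(r) under complete spatial randomness
```

Here `k_at_1` is far above `csr_benchmark`, which is what a clustered pattern produces at short distances.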
The L function (Besag 1977) • In practice, the K function is normalised to obtain a benchmark of zero • L(r)>0 indicates that the observed distribution is geographically concentrated • L(r)<0 implies dispersion • L(r)=0 indicates complete spatial randomness (CSR)
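The normalisation itself is not shown in this transcript. A common choice that gives the zero benchmark described above is L(r) = sqrt(K(r) / π) − r, and the sketch below assumes that form. The function name and the K values are illustrative.

```python
import math

def l_from_k(k_value, r):
    """Transform K(r) so that complete spatial randomness maps to zero."""
    return math.sqrt(k_value / math.pi) - r

r = 2.0
csr_k = math.pi * r ** 2                 # K(r) expected under CSR
l_csr = l_from_k(csr_k, r)               # zero: matches the CSR benchmark
l_clustered = l_from_k(2.0 * csr_k, r)   # positive: more close pairs than CSR
l_dispersed = l_from_k(0.5 * csr_k, r)   # negative: fewer close pairs than CSR
```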
Monte Carlo simulation test of CSR • Perform m independent simulations (e.g. m = 999) of n events in the study region. • For each simulated point pattern, estimate K(d), and use the maximum and minimum of these functions across the simulated patterns to define an upper and a lower simulation envelope. • If the estimated K(d) for the observed pattern lies above the upper envelope or below the lower envelope, the departure from CSR is statistically significant.
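The steps above can be sketched as follows. This is a minimal illustration under stated assumptions: a rectangular study region, a naive K estimate without edge correction, m = 99 simulations to keep the run short, and a made-up observed pattern. All names are hypothetical.

```python
import math
import random

def k_hat(points, r, area):
    """Naive Ripley's K estimate at distance r, without edge correction."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= r
    )
    return area * pairs / (n * n)

def csr_envelope(n, width, height, distances, m=99, seed=0):
    """Min and max of K over m simulated CSR patterns of n events each."""
    rng = random.Random(seed)
    area = width * height
    lower = [math.inf] * len(distances)
    upper = [-math.inf] * len(distances)
    for _ in range(m):
        # One CSR realisation: n events placed uniformly in the rectangle.
        sim = [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]
        for idx, d in enumerate(distances):
            k = k_hat(sim, d, area)
            lower[idx] = min(lower[idx], k)
            upper[idx] = max(upper[idx], k)
    return lower, upper

# Hypothetical observed pattern: 20 events packed into one corner of a 10 x 10 region.
obs_rng = random.Random(1)
observed = [(obs_rng.uniform(0, 2), obs_rng.uniform(0, 2)) for _ in range(20)]
distances = [1.0, 2.0]
lower, upper = csr_envelope(len(observed), 10.0, 10.0, distances, m=99)
k_obs = [k_hat(observed, d, 100.0) for d in distances]
# True where the observed K falls outside the simulation envelope.
significant = [k > hi or k < lo for k, lo, hi in zip(k_obs, lower, upper)]
```

With m simulations, an observed value outside the min/max envelope at a single distance corresponds to a pointwise two-sided significance level of about 2 / (m + 1).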
Question: • Is the observed pattern of one set of events just a random subset of the overall pattern of a set of combined point patterns?
D function (Diggle & Chetwynd 1991) • Assumes heterogeneity of the distribution • The significance of D(r) can be tested by performing Monte Carlo simulation
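A minimal sketch of the idea, assuming the usual definition D(r) = K_cases(r) − K_controls(r) and a random-labelling Monte Carlo test in which case and control labels are shuffled over the pooled locations. The K estimate is naive (no edge correction), and the names and coordinates are illustrative assumptions, with "cases" standing in for the top 10% stores and "controls" for the remaining stores.

```python
import math
import random

def k_hat(points, r, area):
    """Naive Ripley's K estimate at distance r, without edge correction."""
    n = len(points)
    pairs = sum(
        1
        for i, (xi, yi) in enumerate(points)
        for j, (xj, yj) in enumerate(points)
        if i != j and math.hypot(xi - xj, yi - yj) <= r
    )
    return area * pairs / (n * n)

def d_hat(cases, controls, r, area):
    """Difference between the K estimates of the two labelled patterns."""
    return k_hat(cases, r, area) - k_hat(controls, r, area)

def random_labelling_envelope(cases, controls, r, area, m=99, seed=0):
    """Min and max of D(r) when labels are shuffled over the pooled points."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    sims = []
    for _ in range(m):
        rng.shuffle(pooled)
        sims.append(d_hat(pooled[:len(cases)], pooled[len(cases):], r, area))
    return min(sims), max(sims)

# Hypothetical data in a 10 x 10 region: clustered cases, controls on a regular grid.
cases = [(5.5, 5.5), (5.7, 5.6), (5.4, 5.7), (5.6, 5.3), (5.3, 5.4), (5.6, 5.6)]
controls = [(x + 0.5, y + 0.5) for x in range(0, 10, 2) for y in range(0, 10, 2)]
d_obs = d_hat(cases, controls, r=1.0, area=100.0)
lower, upper = random_labelling_envelope(cases, controls, r=1.0, area=100.0)
```

Because both K estimates are computed over the same heterogeneous background, D(r) near zero is consistent with the cases being a random subset of the pooled pattern, while a value above the envelope suggests extra clustering among the cases.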
Bivariate point process • Is the spatial distribution of the top 10% stores independent of the distribution of office locations?
Bivariate K function • The general formula: • The significance of the estimated K.hat12 can be tested using Monte Carlo simulation
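The formula on this slide is not preserved in the transcript. As a rough sketch, assuming the simple cross-type estimator that counts pairs made of one event from each pattern within distance r and scales the count by area / (n1 × n2), the calculation could look like this. No edge correction is applied, and the function name and coordinates are hypothetical.

```python
import math

def k12_hat(points1, points2, r, area):
    """Naive bivariate K estimate: type-2 events within r of type-1 events."""
    count = sum(
        1
        for (x1, y1) in points1
        for (x2, y2) in points2
        if math.hypot(x1 - x2, y1 - y2) <= r
    )
    return area * count / (len(points1) * len(points2))

# Hypothetical coordinates in a 10 x 10 region, not data from the study.
top_stores = [(2.0, 2.0), (2.5, 2.0), (8.0, 8.0)]
offices = [(2.2, 2.1), (2.0, 2.6), (8.5, 8.0), (5.0, 5.0)]
k12_at_1 = k12_hat(top_stores, offices, r=1.0, area=100.0)
independence_benchmark = math.pi * 1.0 ** 2  # expected value if the patterns are independent
```

A value well above π r² suggests that stores and offices attract each other at that distance, and a value well below suggests repulsion. Whether the gap is significant would still need a simulation envelope.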
SPA (Spatial Point Pattern Analysis) • A collection of spatial point pattern analysis functions available within a GIS environment • Tight (shared) coupling • [Architecture diagram components: GIS, R, COM Server, Data Library]
Sample interface: splancs - Kernel Density • Kernel density (kernel2d) • K-function (Khat, KenvCsr, KenvLabel and KenvTor) • L-function (Lhat, LenvCsr, LenvTor) • D-function • K.hat12
Useful Spatial Point Data Analysis Tools • Spatstat: An R library for spatial statistics (http://www.spatstat.org/spatstat/) • CrimeStat: A Spatial Statistics Program for the Analysis of Crime Incident Locations (http://www.icpsr.umich.edu/CrimeStat/) • SaTScan™: free software that analyzes spatial, temporal and space-time data using spatial, temporal, or space-time scan statistics (http://www.satscan.org/)