CrimeStat Group 10 Norine Wilczek & Brad Johnston CSCI 5980 December 4, 2012
Organization • About CrimeStat • CrimeStat analysis tools • Problem & importance • Data • Challenges • Tools & methods used • Processes & map outputs • Limitations • Contributions to computers and society
About CrimeStat • Analysis tool used in • Law enforcement • Public health • The environment • Free Download from National Institute of Justice • http://www.icpsr.umich.edu/CrimeStat/
CrimeStat Continued • Spatial statistics program • Windows based • Purpose: provide supplemental statistical tools to aid law enforcement agencies and criminal justice researchers in crime mapping • Uses GIS shapefiles to perform object-based analysis • Primary file • Incident locations with X,Y coordinate system • Secondary file • For comparison • Reference file • Grid overlay for measurement, used to model the interaction between two points
CrimeStat Analysis Options • Spatial Description • Distance Analysis • Nearest neighbor, linear nearest neighbor, or Ripley’s K statistic distance between incidents • Calculates distance between incidents from 2 files and places on a grid • Spatial Autocorrelation • Stats for describing amount of spatial autocorrelation between incidents • Spatial Distribution • Mean center • Center of minimum distance • Hotspot Analysis • Mode, fuzzy mode, hierarchical nearest neighbor clustering • Risk-adjusted nearest neighbor hierarchical clustering -> ellipses or convex hull output • Spatial and Temporal Analysis of Crime (STAC), K-means cluster, Anselin’s local Moran, Getis-Ord local G statistics -> ellipses or convex hulls
More CrimeStat Analysis Options • Spatial Modeling • Regression modeling • Analyzes relationship between a dependent variable and one or more independent variables • Journey to Crime • Serial offender data – likely location based on distribution of incidents and travel behavior • Space-time Analysis • Clustering in time and space (serial offender data) • Interpolation • Single-variable kernel density • Dual-variable kernel density (comparing to baseline) • Crime Travel Demand Models • Trip Generation • Predicts the number of crimes originating in each zone (origins) and ending in each zone (destinations) • Trip Distribution • 2nd stage; distributes trips from each zone to every other zone using a gravity model • Mode Split • Splits the predicted number of zone-to-zone trips using a function that approximates the share of one mode relative to other modes • Network Assignment • A shortest-path algorithm predicts the likely path of trips from each zone to every other zone. Requires a travel network (transit, one-way streets, roads, etc.)
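The trip distribution stage can be illustrated with a minimal gravity-model sketch. This is not CrimeStat's implementation: the function name `gravity_trip_distribution`, the exponential impedance function, and the `beta` parameter are all assumptions chosen for illustration. It spreads each zone's predicted origins across destination zones in proportion to destination size, discounted by distance.

```python
import math

def gravity_trip_distribution(origins, destinations, distances, beta=0.5):
    """Origin-constrained gravity model (illustrative sketch).

    origins[i]       -- predicted trips starting in zone i
    destinations[j]  -- predicted trips ending in zone j (attractiveness)
    distances[i][j]  -- travel distance or cost from zone i to zone j
    beta             -- hypothetical decay rate of the assumed exponential impedance
    Returns a matrix where trips[i][j] is the estimated trips from zone i to zone j.
    """
    trips = []
    for i, o_i in enumerate(origins):
        # Attraction of each destination, reduced as distance grows.
        weights = [d_j * math.exp(-beta * distances[i][j])
                   for j, d_j in enumerate(destinations)]
        total = sum(weights)
        # Scale the weights so that each row sums to the zone's origins.
        trips.append([o_i * w / total if total > 0 else 0.0 for w in weights])
    return trips

if __name__ == "__main__":
    matrix = gravity_trip_distribution(
        origins=[10, 20],
        destinations=[5, 5],
        distances=[[0.0, 1.0], [1.0, 0.0]],
    )
    for row in matrix:
        print([round(t, 2) for t in row])
```

Because the model is constrained on origins, each row of the result sums to that zone's predicted number of trips, and nearer zones receive a larger share.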
Problem & Importance • Problem • Crime occurs globally • Statistical analysis is necessary • Patterns, trends, high crime areas, potential re-offending predictions • Importance • Response • Prevention • Crime • Injuries • Death • Utilize resources • Mitigation of economic losses • Lost/Recovered property
Data • University of Minnesota Police Department • 9/2011 – 9/2012 • September 2011 • (all crimes) • Theft from building • (9/11 – 9/12) • Bicycle thefts
Challenges • Data • Process through a GIS • View results with a GIS • .shp, .dbf (uses and produces shapefiles, not feature classes) • Clean up received data • Time/Date field • City, state, zip field → Google • Proper geo-coding in ArcMap
Tools & Methods Used • Spatial Distribution Tool • Distance Analysis Tool • Hotspot Analysis
Spatial Distribution Tool • Mean & median center, center of minimum distance • Standard deviation • Half of the crimes in a cluster will be within one standard deviation ellipse of the mean center; around 90% will be found within two standard deviation ellipses • Forecasting: identifying where crime is likely to occur
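The centers named on this slide can be sketched in a few lines of Python. This is an illustrative approximation rather than CrimeStat's own code: the function names are hypothetical, the standard distance uses a simple population divisor, and the center of minimum distance is found with Weiszfeld's iterative method, which is one common way to compute it.

```python
import math
from statistics import mean, median

def mean_center(points):
    """Average X and average Y of the incident locations."""
    xs, ys = zip(*points)
    return (mean(xs), mean(ys))

def median_center(points):
    """Median X and median Y of the incident locations."""
    xs, ys = zip(*points)
    return (median(xs), median(ys))

def standard_distance(points):
    """Root mean squared distance of incidents from the mean center
    (population divisor N, an assumption made for this sketch)."""
    cx, cy = mean_center(points)
    return math.sqrt(sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points)
                     / len(points))

def center_of_minimum_distance(points, iterations=200, tol=1e-9):
    """Point minimizing the summed distance to all incidents (Weiszfeld's method)."""
    cx, cy = mean_center(points)  # start from the mean center
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for x, y in points:
            # Clamp tiny distances to avoid dividing by zero.
            d = max(math.hypot(x - cx, y - cy), tol)
            num_x += x / d
            num_y += y / d
            denom += 1.0 / d
        nx, ny = num_x / denom, num_y / denom
        if math.hypot(nx - cx, ny - cy) < tol:
            return (nx, ny)
        cx, cy = nx, ny
    return (cx, cy)

if __name__ == "__main__":
    incidents = [(0, 0), (1, 0), (5, 0)]  # made-up coordinates
    print(mean_center(incidents))
    print(median_center(incidents))
    print(center_of_minimum_distance(incidents))
```

For the three made-up points, the mean center is pulled toward the outlying incident while the center of minimum distance settles on the middle point, which is why the two measures are reported separately.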
Result: Map
Distance Analysis Tool • Distance Analysis 1 • Point pattern of clustering and dispersion • Distances between the points and reference locations as indicator (distance based tests) • Number of points in a given area for basis of test statistics • If distance is smaller than what it would be under complete spatial randomness, it suggests clustering • If distance tends to be larger, then it suggests dispersion
Result: Chart
Nearest neighbor analysis:
--------------------------
Sample size........: 26
Measurement type...: Direct
Start time.........: 03:49:10 PM, 11/05/2012
Mean Nearest Neighbor Distance ..: 109.91 m, 360.58 ft, 0.06829 mi
Standard Dev of Nearest Neighbor Distance ...............: 183.07 m, 600.61 ft, 0.11375 mi
Minimum Distance ................: 0.00 m, 0.00 ft, 0.00000 mi
Maximum Distance ................: 2611.89 m, 8569.19 ft, 1.62295 mi
Based on Bounding Rectangle:
Area ............................: 3808465.65 sq m 40993983.06 sq ft 1.47046 sq mi
Mean Random Distance ............: 191.36 m, 627.83 ft, 0.11891 mi
Mean Dispersed Distance .........: 411.25 m, 1349.25 ft, 0.25554 mi
Nearest Neighbor Index ..........: 0.5743
Standard Error ..................: 19.62 m, 64.36 ft, 0.01219 mi
Test Statistic (Z) ..............: -4.1523
p-value (one tail) ..............: 0.0001
p-value (two tail) ..............: 0.0001
Based on User Input Area:
Area ............................: 10.00 sq m 107.64 sq ft 0.00000 sq mi
Mean Random Distance ............: 0.31 m, 1.02 ft, 0.00019 mi
Mean Dispersed Distance .........: 0.67 m, 2.19 ft, 0.00041 mi
Nearest Neighbor Index ..........: 354.4364
Standard Error ..................: 0.03 m, 0.10 ft, 0.00002 mi
Test Statistic (Z) ..............: 3447.6946
p-value (one tail) ..............: 0.0001
p-value (two tail) ..............: 0.0001
        Mean Nearest            Expected Nearest        Nearest
Order   Neighbor Distance (m)   Neighbor Distance (m)   Neighbor Index
*****   *********************   *********************   **************
    1   109.9061                0.3101                  354.43636
.5743
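The bounding-rectangle figures in this chart can be approximately reproduced with the standard nearest neighbor (Clark and Evans) formulas. The sketch below is not CrimeStat's code; the function names are hypothetical, and the inputs are the sample size, area, and mean nearest neighbor distance reported in the chart. An index below 1 suggests clustering and an index above 1 suggests dispersion.

```python
import math

def nearest_neighbor_distances(points):
    """Distance from each point to its closest other point (direct measurement)."""
    return [
        min(math.hypot(px - qx, py - qy)
            for j, (qx, qy) in enumerate(points) if j != i)
        for i, (px, py) in enumerate(points)
    ]

def nearest_neighbor_index(mean_nn, n, area):
    """Compare an observed mean nearest neighbor distance with the value
    expected under complete spatial randomness for n points in `area`."""
    spacing = math.sqrt(area / n)
    mean_random = 0.5 * spacing                      # expected under randomness
    mean_dispersed = (2 ** 0.5 / 3 ** 0.25) * spacing  # expected for an even lattice
    std_error = math.sqrt((4 - math.pi) * area / (4 * math.pi * n ** 2))
    return {
        "mean_random": mean_random,
        "mean_dispersed": mean_dispersed,
        "index": mean_nn / mean_random,
        "std_error": std_error,
        "z": (mean_nn - mean_random) / std_error,
    }

if __name__ == "__main__":
    # Values taken from the bounding-rectangle section of the chart.
    result = nearest_neighbor_index(mean_nn=109.9061, n=26, area=3808465.65)
    for name, value in result.items():
        print(f"{name}: {value:.4f}")
```

With these inputs the index comes out near 0.574 and the Z statistic near -4.15, in line with the chart: the observed spacing is well below the random expectation, which points to clustering. The user-input-area section of the chart shows how sensitive the index is to the area supplied.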
Hotspot Analysis • Hotspot: dense area of incidents, in this case a spatial concentration of crime • "Geographic area representing a small percentage of the study area which contains a high percentage of the studied phenomena" • Spatial Description • Hotspot Analysis I • Fuzzy Mode • Identifies the geographic coordinates, plus a user-specified surrounding radius, with the highest number of incidents • Nearest Neighbor Hierarchical Spatial Clustering (NNH) • Interpolation method • Minimum points per cluster • Results: • 7 NNH clusters • 10 or more building thefts within a 1,500 sq. meter area • Calculates mean X,Y of ellipses in the output table
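The difference between the mode and the fuzzy mode can be shown with a small sketch. This is an illustrative reading of the description above, not CrimeStat's routine: the function names and the sample coordinates are hypothetical. Each distinct location is scored by the number of incidents within the user-specified radius, and a radius of zero reduces to the plain mode.

```python
import math

def fuzzy_mode(points, radius):
    """Rank each distinct location by how many incidents fall within `radius` of it.

    Returns a list of ((x, y), count) pairs, highest count first.
    """
    ranked = []
    for cx, cy in sorted(set(points)):
        count = sum(1 for x, y in points
                    if math.hypot(x - cx, y - cy) <= radius)
        ranked.append(((cx, cy), count))
    # Stable sort: ties keep their coordinate order.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked

def mode(points):
    """Plain mode: incidents at exactly the same coordinates (radius of zero)."""
    return fuzzy_mode(points, 0)

if __name__ == "__main__":
    # Made-up incident coordinates for illustration only.
    incidents = [(0, 0), (0, 0), (1, 0), (1.5, 0), (10, 10)]
    print(mode(incidents)[0])
    print(fuzzy_mode(incidents, radius=1)[0])
```

In the made-up data the plain mode picks the location with two identical records, while the fuzzy mode picks a neighboring location that has more incidents nearby. This is useful when geocoding places incidents at slightly different coordinates.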
Mode vs. Fuzzy Mode • Mode • Fuzzy Mode
Hotspot Analysis Result Map • Theft from Buildings, 2011-2012
Kernel Density Estimation • Most popular type of map in crime analysis • Generalized over larger areas (compared to Hotspot) • Interpolation method • Creates “risk areas” • Kernel size and weight determined by user, smoothed (linear relationship) throughout kernel • Multiple points at one location, kernels aggregate to total in grid cell
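A minimal sketch of the idea on this slide: each incident spreads a weight over the grid that falls off linearly with distance until it reaches the edge of the kernel, and overlapping kernels add up in each grid cell. The function name, the bandwidth value, and the unnormalized weights are assumptions made for illustration; CrimeStat offers several kernel shapes and its own scaling options.

```python
import math

def triangular_kernel_density(points, grid_cells, bandwidth):
    """Sum linearly decaying kernel weights from every incident at each grid cell.

    points      -- incident (x, y) coordinates
    grid_cells  -- (x, y) centers of the reference grid cells
    bandwidth   -- kernel radius chosen by the user
    Returns one relative density value per grid cell (not normalized).
    """
    densities = []
    for gx, gy in grid_cells:
        total = 0.0
        for x, y in points:
            d = math.hypot(x - gx, y - gy)
            if d < bandwidth:
                # Weight is 1 at the incident and falls to 0 at the kernel edge.
                total += 1.0 - d / bandwidth
        densities.append(total)
    return densities

if __name__ == "__main__":
    # Made-up data: two incidents share one location, a third lies 5 units away.
    incidents = [(0, 0), (0, 0), (5, 0)]
    cells = [(0, 0), (2.5, 0), (10, 0)]
    print(triangular_kernel_density(incidents, cells, bandwidth=5))
```

The two incidents at the same location contribute twice to nearby cells, which matches the point that multiple points at one location aggregate in the grid cell total.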
Kernel Density Estimation Analysis Result Map • Theft from Buildings, 2011-2012
Nearest Neighbor Hierarchical & Kernel Density Estimation • Nearest neighbor clusters and kernel density estimation analysis overlay • Mondale Hall & Carlson • Coffman Memorial Union • Walter, Appleby, & Johnston Hall area • Rec Center
Limitations • CrimeStat uses data with latitude and longitude • Does not pick up “on the fly” projection • Spatial references need to match • Our data was missing X,Y columns • Add XY Data tool in ArcMap • Experimented with and added in XY data for use in CrimeStat • Size of geographic region • CrimeStat is more useful for areas larger than the U of M campuses • Clusters would show up more clearly at a city or regional level, where some areas have crime that is less likely to occur (could include stats on bike ridership, socioeconomic conditions)
Contributions to Computers & Society • Analysis tool for Law Enforcement, Public Health, and the Environment • Visual analysis vs. statistical analysis • Benefits of CrimeStat: • Calculates spatial statistics, which can calculate correlations between geographic variables and detect subtle changes in the geography of a pattern over time that the eye does not see • Law enforcement resource allocation • Faster response time • Citizen awareness
End of Presentation • Questions?