24th October, EFGS 2013 Conference, Sofia. Disaggregation methods for georeferencing inhabitants with unknown place of residence: the case study of population census 2011 in the Czech Republic. Ing. Jaroslav Kraus, Ph.D., Mgr. Štěpán Moravec.
Starting Situation
• Total number of usually resident population: 10 436 560
• Georeferenced inhabited building points stored in the Register of Census Districts and Buildings managed by the CZSO: 1 790 122
• Georeferenced population with an exact place of usual residence (x,y coordinates): 10 343 479
• High coverage of georeferenced data (above 99 %): 93 thousand inhabitants are not linked to the exact place of their usual residence (0.9 % of the total census population): 10 436 560 − 10 343 479 = 93 081
• However, the census data of these inhabitants are linked to the level of statistical districts
Description of the problem
Cause: missing, incomplete or incorrect address data
Structure of the people with unknown place of residence:
• homeless people
• people living in emergency buildings or shelters
• people living in buildings without final approval
Possible solution for distributing these people into buildings with x,y coordinates or into grids:
• Application of a disaggregation method
• Testing of 3 disaggregation methods in ArcGIS software
CASE STUDY
Case study: the small town of Abertamy in the northern part of the Czech Republic
• Total number of census population: 1 213
• Number of not georeferenced inhabitants: 46
• Total number of statistical districts: 6
• Number of affected statistical districts: 6
• Number of inhabited buildings: 214
1. Layer of statistical districts with the number of not georeferenced inhabitants
2. Layer of population grids with the number of georeferenced inhabitants
METHOD 1: CREATING NEW RANDOM BUILDING POINTS
• Creates a specified number of random point features. Random points can be generated in an extent window, inside polygon features, on point features, or along line features
• Parameters:
• Number of Points
• Minimum Allowed Distance
• Others
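The idea of generating random points with a minimum allowed distance can be sketched outside ArcGIS in plain Python. This is only an illustration under stated assumptions: the function name `random_points`, the rectangular extent, the point count and the distance value are all hypothetical, and simple rejection sampling is used here, which is not necessarily how the ArcGIS tool works internally.

```python
import math
import random

def random_points(n, extent, min_distance=0.0, seed=None, max_tries=10_000):
    """Draw up to n random points inside extent = (xmin, ymin, xmax, ymax).

    A candidate is rejected when it lies closer than min_distance to a point
    that was already accepted (simple rejection sampling, an assumption made
    for this sketch).
    """
    rng = random.Random(seed)
    xmin, ymin, xmax, ymax = extent
    points = []
    tries = 0
    while len(points) < n and tries < max_tries:
        tries += 1
        candidate = (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
        if all(math.dist(candidate, p) >= min_distance for p in points):
            points.append(candidate)
    return points

# Hypothetical extent of 1 000 x 1 000 m, 10 new building points, 50 m apart.
pts = random_points(10, (0.0, 0.0, 1000.0, 1000.0), min_distance=50.0, seed=1)
```

The sketch works on an extent window only; restricting the points to a statistical district polygon would additionally need a point-in-polygon test.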
METHOD 1: RECALCULATION OF POPULATION BY RANDOM BUILDING POINTS (1)
Source: Using field calculator: Create Random Values, Iowa State University
METHOD 1: METHODOLOGY
• Creating new random building points
• Defining the population (from a random interval) for the new random building points
• Recalculation to the limit number of inhabitants (e.g. defined by information from the statistical district)
• Source: ArcGIS 10 Help
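The second and third steps can be illustrated with a short Python sketch. The slides do not say how the limit is enforced, so the adjustment loop below is an assumption, as are the function name `assign_population`, the interval bounds and the example numbers.

```python
import random

def assign_population(n_points, limit, low=1, high=8, seed=None):
    """Give each new random building point a population from [low, high],
    then adjust the values so that they sum exactly to `limit`
    (e.g. the number of not georeferenced persons in the district).

    The one-by-one cyclic adjustment is an assumption of this sketch.
    """
    if n_points <= 0 or limit < 0:
        raise ValueError("n_points must be positive and limit non-negative")
    rng = random.Random(seed)
    values = [rng.randint(low, high) for _ in range(n_points)]
    i = 0
    while sum(values) > limit:        # too many persons: remove one at a time
        idx = i % n_points
        if values[idx] > 0:
            values[idx] -= 1
        i += 1
    while sum(values) < limit:        # too few persons: add one at a time
        values[i % n_points] += 1
        i += 1
    return values

# Hypothetical example: 10 new points sharing 46 persons.
population = assign_population(10, 46, seed=1)
```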
Method 2: Creating population centers of gravity (1)
• ArcGIS 10: method MedianCenter (or MeanCenter, CentralFeature)
MedianCenter (Spatial Statistics)
• Identifies the location that minimizes overall Euclidean distance to the features in a dataset
• Mean Center and Median Center are both measures of central tendency
• For line and polygon features, feature centroids are used in distance computations
• The Case Field is used to group features for separate median center computations (e.g. by statistical districts)
• Source: ArcGIS 10 Help
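To make the difference between the two measures concrete, here is a minimal Python sketch of a weighted mean center and of a median center found by Weiszfeld's iteration. It illustrates the definitions only and is not the ArcGIS implementation; the function names and the sample coordinates are hypothetical.

```python
import math

def mean_center(points, weights=None):
    """Weighted arithmetic mean of the x and y coordinates."""
    weights = weights or [1.0] * len(points)
    total = sum(weights)
    x = sum(w * px for (px, _), w in zip(points, weights)) / total
    y = sum(w * py for (_, py), w in zip(points, weights)) / total
    return (x, y)

def median_center(points, weights=None, iterations=200, tol=1e-9):
    """Approximate the point minimizing the weighted sum of Euclidean
    distances (Weiszfeld's iteration, started from the mean center)."""
    weights = weights or [1.0] * len(points)
    x, y = mean_center(points, weights)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for (px, py), w in zip(points, weights):
            d = math.hypot(px - x, py - y)
            if d < tol:
                continue  # simplification: skip a point coinciding with the estimate
            num_x += w * px / d
            num_y += w * py / d
            denom += w / d
        if denom == 0.0:
            break
        new_x, new_y = num_x / denom, num_y / denom
        moved = math.hypot(new_x - x, new_y - y)
        x, y = new_x, new_y
        if moved < tol:
            break
    return (x, y)

# Hypothetical building coordinates weighted by their population.
buildings = [(0.0, 0.0), (4.0, 0.0), (0.0, 3.0), (10.0, 10.0)]
inhabitants = [5, 3, 2, 1]
print(mean_center(buildings, inhabitants))
print(median_center(buildings, inhabitants))
```

Weighting by the number of inhabitants pulls both centers towards the more populated buildings; the median center is less sensitive to a single remote building than the mean center.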
Method 2: METHODOLOGY (1)
• Calculating Central Value (Mean Center, Median Center) → Layer of spatially weighted population centers of gravity
Method 2: METHODOLOGY (2)
• Spatial join for linking persons with unknown place of residence to the weighted center of gravity
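The linking step can be sketched in Python as follows. As a simplification, the sketch joins on a district identifier (playing the role of the Case Field) rather than performing a true geometric spatial join, and it uses the population-weighted mean center as the center of gravity. All names and sample values are hypothetical.

```python
from collections import defaultdict

def centers_by_district(buildings):
    """buildings: iterable of (district_id, x, y, population).
    Returns the population-weighted mean center of each district."""
    sums = defaultdict(lambda: [0.0, 0.0, 0.0])
    for district, x, y, pop in buildings:
        s = sums[district]
        s[0] += pop * x
        s[1] += pop * y
        s[2] += pop
    return {d: (sx / sp, sy / sp) for d, (sx, sy, sp) in sums.items() if sp > 0}

def join_unlocated(centers, unlocated):
    """unlocated: dict district_id -> number of persons without coordinates.
    Districts without any center (no inhabited building) are left out."""
    return {
        d: {"x": centers[d][0], "y": centers[d][1], "persons": n}
        for d, n in unlocated.items()
        if d in centers
    }

# Hypothetical sample: two districts with buildings, one without.
buildings = [("A", 0.0, 0.0, 10), ("A", 10.0, 0.0, 30), ("B", 5.0, 5.0, 4)]
unlocated = {"A": 3, "B": 1, "C": 2}
joined = join_unlocated(centers_by_district(buildings), unlocated)
```

District "C" has no inhabited building and therefore no center of gravity, so its persons remain unassigned and would need a separate rule.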
Method 3: Calculation of population weights of grids
Aim:
• To distribute the not georeferenced population only into grids, not into particular buildings (x,y coordinates)
• To respect the known spatial distribution of the population (based on the georeferenced population only)
Methodology:
• To calculate a population weight of each inhabited grid segment within the affected statistical district
Population weight of grid segment i = (Population number of grid segment i) / (Total population of statistical district j)
1. Layer of population grids with number of georeferenced inhabitants 2. Layer of population grids with relative population weight
Method 3: Methodology
• To calculate the population number distributed to each inhabited grid segment within the affected statistical district:
Population number distributed to grid segment i = Population weight of grid segment i × Total number of not georeferenced persons within statistical district j
• Rounding of the population number distributed to each inhabited grid segment to an integer value
• Adding the number of distributed not georeferenced persons to the initial number of georeferenced inhabitants for each grid segment
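The three steps of Method 3 can be written as a short Python sketch. One assumption is made explicit here: the denominator of the weight is taken as the georeferenced population of the district's grid segments, so that the weights sum to 1. The function name and the sample figures are hypothetical.

```python
import math

def distribute_to_grids(grid_population, unlocated):
    """grid_population: dict grid_id -> georeferenced inhabitants in one district.
    unlocated: number of not georeferenced persons in the same district."""
    total = sum(grid_population.values())  # assumed denominator of the weight
    result = {}
    for grid_id, pop in grid_population.items():
        weight = pop / total
        # round half up to an integer number of persons
        added = math.floor(weight * unlocated + 0.5)
        result[grid_id] = {"weight": weight, "added": added, "total": pop + added}
    return result

# Hypothetical district: three inhabited grid segments, 6 persons to distribute.
grids = {"g1": 40, "g2": 40, "g3": 20}
distributed = distribute_to_grids(grids, 6)
added_total = sum(v["added"] for v in distributed.values())
```

In this example the unrounded shares are 2.4, 2.4 and 1.2, so the rounded values add up to 5 rather than 6: independent rounding does not necessarily preserve the district total.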
2. Layer of population grids with relative population weight 3. Layer of population grids with number of additionally distributed persons
Method 3: Methodological issues
Different types of irregularities and deviations:
• Problem with rounding (increase or decrease of the distributed population number)
• Problem with statistical districts without inhabited buildings
• Problem with grids with the same population weight
Definition of additional assumptions and consequent manual corrections required
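One conventional way to avoid the rounding problem without manual correction is the largest remainder method, sketched below in Python. This is a suggestion for illustration, not a procedure described in the presentation; the tie-breaking rule (larger population first, then grid identifier) is an arbitrary assumption, as are all names and figures.

```python
import math

def largest_remainder(grid_population, unlocated):
    """Distribute `unlocated` persons over grid segments proportionally to
    their population so that the integer results sum exactly to `unlocated`."""
    total = sum(grid_population.values())
    if total == 0:
        # district without inhabited grid segments: needs a separate rule
        raise ValueError("no inhabited grid segment in this district")
    shares = {g: unlocated * pop / total for g, pop in grid_population.items()}
    alloc = {g: math.floor(s) for g, s in shares.items()}
    leftover = unlocated - sum(alloc.values())
    # Assumed tie-break: larger remainder, then larger population, then grid id.
    order = sorted(
        shares,
        key=lambda g: (-(shares[g] - alloc[g]), -grid_population[g], g),
    )
    for g in order[:leftover]:
        alloc[g] += 1
    return alloc

# Hypothetical district with two grid segments of equal weight.
allocation = largest_remainder({"g1": 40, "g2": 40, "g3": 20}, 6)
```

The district total is preserved by construction, but grid segments with the same population weight are still separated only by the arbitrary tie-break, so that issue needs an explicit assumption in any case.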
Conclusion
Pluses and minuses of methods 1 and 2:
• inhabitants distributed to the level of buildings
• distribution according to the spatial distribution of inhabited buildings
Pluses and minuses of method 3:
• distribution according to the spatial distribution of the population
• inhabitants distributed to the level of grids
• All the mentioned methods are used for recalculation of people with unknown exact place of residence
• A considerable amount of manual work is still required → some automation of the processes is important
• Ultimately, recalculation at the level of single (personal) records is the aim of the whole process