Towards an Enhanced UK Spatial Interaction Data Service

Adam Dennett, Oliver Duke-Williams and John Stillwell, School of Geography, University of Leeds. Presentation for the British Society for Population Studies, University of St Andrews, 11-13 September 2007.

Presentation Transcript


  1. Towards an Enhanced UK Spatial Interaction Data Service Adam Dennett, Oliver Duke-Williams and John Stillwell School of Geography, University of Leeds Presentation for British Society for Population Studies, University of St Andrews, 11-13 September 2007

  2. Outline of presentation • Introduction: relevant background on interaction data and CIDER and WICID • Audit of Interaction Data Sources: a brief overview of the variety of interaction data sources available in the UK • What were the recommendations of the audit? • How do we propose to take things forward to create an enhanced UK spatial interaction data service? • The new INTERACTION system: overview of the issues and challenges involved • The new data: an overview of the individual characteristics of each of the new proposed datasets

  3. Introduction – CIDER • CIDER: the Centre for Interaction Data Estimation and Research • Based now, principally, at the University of Leeds though software runs at Manchester • Data Support Unit: part of the ESRC-funded UK Census Programme

  4. Access to Census Data and the Census Data Support Units • Public access to key statistics, census area statistics and standard tables through National Statistics and NOMIS. Provide essentially the same data for 2001, although CDU gives access to data from 1981 and 1991 as well. • Census Registration Service (University of Essex) • Based at the University of Manchester. Provides access to census aggregate outputs from 1981 to 2001 through the interface. • CeLSIUS, based at the London School of Hygiene and Tropical Medicine, provides access to the Longitudinal Study dataset, comprising linked records for 1% of the population of England and Wales from 1971. • Based at the University of St Andrews, the Scottish LS is a replica of the England and Wales LS, although it samples 5.3% of the Scottish population. • Based at Edina at the University of Edinburgh. Provides access to digital boundary data associated with census outputs, as well as look-up tables for geographical conversion. • Based principally in the School of Geography at the University of Leeds. Provides access to interaction datasets through the interface. • SARs for small samples of households and individuals are supported by the Cathie Marsh Centre for Census and Survey Research (CCSR) based at the University of Manchester.

  5. Introduction - CIDER • We currently administer interaction (flow) data from the 1981, 1991 and 2001 Censuses: migration and commuting

  6. Introduction – CIDER Data Sets and Geographies • 2001 Census: Special Workplace Statistics (SWS) (Levels 1, 2 & 3) • 2001 Census: Special Travel Statistics (STS) (Scotland Levels 1, 2 & 3 and Level 2 Scottish postal sectors) • 2001 Census: Special Migration Statistics (SMS) (Levels 1, 2 & 3) • Also comparable datasets from 1991 and 1981 • In addition to the standard District, Ward and OA geographies, different aggregations of these basic units and various bespoke geographies are available for different data years

  7. Introduction - WICID

  8. Introduction – CIDER’s Ongoing Objectives CIDER’s objectives of relevance to this presentation: • To gather/estimate further UK census-based data sets and include them in the system • To expand the WICID system to incorporate a range of UK interaction data sets from outside of the census • To undertake research based on the current and future interaction data sets held within the software system

  9. Interaction Datasets in the UK: An Audit Purpose of the Audit: • Before adding new datasets to WICID, we needed to know what was out there! • To identify and evaluate sources of interaction data in the UK that might complement the current census datasets held in WICID • To make recommendations relating to the inclusion of the most useful datasets in a new, expanded version of WICID called INTERACTION

  10. Interaction Datasets in the UK: An Audit

  11. Census data sources of interaction data

  12. Major administrative sources of interaction data

  13. Important surveys containing interaction data

  14. Recommendations coming out of the Audit… Additional data should be included in the new system from the following four sources: • 2001 Census: the large and more complex matrices of migration and commuting flows commissioned from ONS that have national coverage at district and sub-district spatial scales • NHSCR: annual flows, from 1975 to 1998, of NHSCR patient re-registration movements between 100 FHSA-based zones, disaggregated by age and sex; and annual flows, from 1998/99 onwards, of NHS patient movements between HAs, disaggregated by age and sex

  15. Recommendations coming out of the Audit… • HESA: annual flows, from 2001 onwards, of student movements between MLSOA of parental domicile and HEI, disaggregated by various characteristics • NHS IC: annual flows, from 2001 onwards, of hospital patients from LLSOA or MLSOA of residence to hospital, disaggregated by various attributes

  16. Implications for CIDER • CIDER is currently in negotiation with the custodians of these targeted data sets to see if incorporation of the data into an extended version of WICID is possible. • All current indications are positive, but due to the differing availability and cost of particular data sets, it is likely that the acquisition and incorporation of some data will happen before others. • Securing additional funding via the Census Development Programme should allow for the purchase of data and the trial of a new, improved INTERACTION data system which incorporates these new data sources.

  17. Towards an Enhanced Spatial Interaction Data Service… • Overview of the issues and challenges involved with adding new non-census datasets to the new INTERACTION system. • The new data: A more detailed look at the individual characteristics of each of the new proposed datasets.

  18. WICID – The current system • Web client: IE, Firefox, Opera, etc. • Web server: Apache 2.0.59, supporting PHP 5 • Database server: PostgreSQL 8.2.4, holding session data, metadata and interaction data

  19. WICID - Inbuilt flexibility • System originally designed to handle a variety of primary (migration) data • Metadata is key as it describes the primary data held in the database. The system relies on this metadata to recognise the range of primary data stored • The system has very few ‘hardcoded’ assumptions about the data – it is all looked up whenever a data page on the user’s browser is produced • Data need only have a single origin and destination identifier, with a set of fields (generally a set of counts disaggregating the flow)

  20. WICID – The metadata • datatype: 1. 1991 SMS set 1; 5. 2001 SMS level 1; 6. 2001 SMS level 2; 10. 2001 SWS level 1; 45. 2001 Migrants by Religion • relname: 1. data_1991sms1; 7. data_2001sms3; 22. data_c0648_religion • familytype: 1. commuting data; 2. migration data • orig_geogtype: 1. 1991 SMS counties; 7. UK interaction data wards 2001; 11. UK interaction data districts 2001; 22. UK Standard table wards 2001; 36. 1981 SMS foreign origins • dest_geogtype: 1. 1991 SMS counties; 7. UK interaction data wards 2001; 11. UK interaction data districts 2001; 22. UK Standard table wards 2001; 36. 1981 SMS foreign origins • Example data table: 2001 data, Total Migrants By Religion; Origin: England ST Ward; Destination: England ST Ward

  21. WICID – sample of table in SQL database • Unique origin and destination identifiers allow for swift look-up and extraction • Pairwise table: origin/destination metadata is generally set up at this stage • Variables = counts of Christian, Muslim, Jewish etc. • The table in the database is given a name, e.g. data_c0648_religion
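A minimal sketch of the pairwise layout described on this slide: one row per origin/destination pair, with one column per count. It uses Python's built-in sqlite3 as a stand-in for the PostgreSQL server; the column names (orig_id, dest_id, christian, muslim, jewish), the helper flows_from and all counts are assumptions made for illustration, and only the table name comes from the slide.

```python
import sqlite3

# In-memory SQLite stands in for the PostgreSQL server named in the slides.
conn = sqlite3.connect(":memory:")

# Hypothetical layout: one row per origin/destination pair, one column per count.
conn.execute(
    """
    CREATE TABLE data_c0648_religion (
        orig_id INTEGER NOT NULL,
        dest_id INTEGER NOT NULL,
        christian INTEGER NOT NULL,
        muslim INTEGER NOT NULL,
        jewish INTEGER NOT NULL,
        PRIMARY KEY (orig_id, dest_id)
    )
    """
)

# Made-up counts, for illustration only.
rows = [
    (1, 2, 40, 5, 1),
    (1, 3, 12, 0, 2),
    (2, 1, 33, 7, 0),
]
conn.executemany("INSERT INTO data_c0648_religion VALUES (?, ?, ?, ?, ?)", rows)


def flows_from(origin_id):
    """Return (dest_id, christian, muslim, jewish) rows for one origin."""
    cur = conn.execute(
        "SELECT dest_id, christian, muslim, jewish "
        "FROM data_c0648_religion WHERE orig_id = ? ORDER BY dest_id",
        (origin_id,),
    )
    return cur.fetchall()
```

The composite primary key on the two identifiers is one plausible way to get the swift look-up the slide mentions; the real schema may differ.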

  22. WICID – finalising the metadata Rest of the metadata is set up through the web-interface Essentially the process involves letting the system know there is a new table in the database, and the structure of this table in terms of the variables included

  23. WICID – The finished product. Once the metadata is set up, defining the structure of the table in WICID (in terms of the type of data, the variables included and the geographies that apply), the web-based query builder can be used to extract any data selected by the user from the database
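The idea of a query builder that relies on metadata rather than hardcoded assumptions can be sketched as follows. This is an illustration only, not the WICID implementation: the METADATA dictionary, its keys, the column names and the function build_query are all hypothetical, and only the table name and geography labels are taken from the slides.

```python
# Hypothetical metadata registry: keys and field names are illustrative only.
METADATA = {
    "data_c0648_religion": {
        "family": "migration data",
        "orig_geogtype": "UK Standard table wards 2001",
        "dest_geogtype": "UK Standard table wards 2001",
        "variables": ["christian", "muslim", "jewish"],
    },
}


def build_query(relname, variables):
    """Build a parameterised SELECT using only names found in the metadata."""
    if relname not in METADATA:
        raise KeyError(f"unknown table: {relname}")
    known = METADATA[relname]["variables"]
    unknown = [v for v in variables if v not in known]
    if unknown:
        raise ValueError(f"unknown variables: {unknown}")
    columns = ", ".join(["orig_id", "dest_id"] + list(variables))
    return f"SELECT {columns} FROM {relname} WHERE orig_id = ? AND dest_id = ?"
```

Because table and column names are checked against the registry before they reach the SQL string, adding a new dataset in this sketch means adding a metadata entry rather than changing the builder.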

  24. From WICID to INTERACTION • Flexible nature of the current WICID system should allow for the addition of non-census datasets as long as the data is prepared in the required pair-wise origin, destination, variable format Main challenges: • Re-designing the interface to handle time-series data. Current data are discrete, cross-sectional data • Some of the datasets (HES for example) present issues related to geographies: Currently, HES destination is a specific point, rather than an area • Metadata redesign to clearly identify different datasets and characteristics for users • Incorporation of ‘on-the-fly’ disclosure control routines for datasets like HESA
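As an illustration of the pair-wise origin, destination, variable format, the sketch below flattens a square flow matrix into one record per pair. The function name matrix_to_pairwise, the record keys and the example zones are assumptions for illustration; the slides do not specify how incoming data would be reshaped.

```python
def matrix_to_pairwise(zones, matrix, variable):
    """Flatten a square origin-by-destination matrix into pairwise records."""
    if len(matrix) != len(zones) or any(len(row) != len(zones) for row in matrix):
        raise ValueError("matrix must be square and match the zone list")
    records = []
    for i, origin in enumerate(zones):
        for j, destination in enumerate(zones):
            # One record per origin/destination pair, keyed by the variable name.
            records.append(
                {"origin": origin, "destination": destination, variable: matrix[i][j]}
            )
    return records
```

For example, matrix_to_pairwise(["A", "B"], [[0, 3], [5, 0]], "total") yields four records, one for each ordered pair of zones.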

  25. INTERACTION – Example issues • Current data selection layout • Possible future data selection layout

  26. INTERACTION – Example issues • Output complexities will need to be solved, with extra dimensions to the data output e.g. Current: origin/destination by age by sex Could be: origin/destination by age by sex by year
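One way to picture the extra output dimension is a tabulation routine that takes the list of dimensions as a parameter, so that adding year is a change to the request rather than to the code. The records, field names and the function tabulate below are made up for illustration.

```python
from collections import defaultdict

# Made-up records: (origin, destination, age, sex, year, count).
RECORDS = [
    ("A", "B", "16-24", "F", 2001, 10),
    ("A", "B", "16-24", "M", 2001, 8),
    ("A", "B", "16-24", "F", 2002, 12),
    ("A", "B", "25-34", "F", 2002, 4),
]
FIELDS = ("origin", "destination", "age", "sex", "year")


def tabulate(records, dimensions):
    """Sum counts over every field not listed in `dimensions`."""
    positions = [FIELDS.index(name) for name in dimensions]
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[p] for p in positions)
        totals[key] += record[-1]  # the count is the last element
    return dict(totals)
```

Calling tabulate(RECORDS, ("origin", "destination", "sex")) collapses the years together, while adding "year" to the tuple keeps one cell per year.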

  27. INTERACTION – Example issues • Currently, census data supplied to us has already been subjected to statistical disclosure control methods, such that small counts are suppressed before the data is put onto the system - this can affect the accuracy of query results • Because some new datasets will be supplied in primary unit form, we have the opportunity to apply statistical disclosure control only where it is necessary, thus increasing data accuracy for the end user • Different techniques will need to be trialled and evaluated before data is made widely available
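The sketch below illustrates the general idea of applying disclosure control at query time: unit records are aggregated first, and suppression is applied once to the result. Small-count suppression with a threshold of 3 is an assumption chosen for illustration; the slides say only that different techniques will need to be trialled, and the function names are hypothetical.

```python
def suppress_small_counts(table, threshold=3):
    """Replace non-zero counts below `threshold` with None (suppressed).

    The threshold of 3 is an illustrative assumption, not a value from the slides.
    """
    return {
        key: (None if 0 < count < threshold else count)
        for key, count in table.items()
    }


def aggregate_then_suppress(unit_records, threshold=3):
    """Count unit records per (origin, destination), then apply suppression once."""
    table = {}
    for origin, destination in unit_records:
        table[(origin, destination)] = table.get((origin, destination), 0) + 1
    return suppress_small_counts(table, threshold)
```

Suppressing after aggregation means a cell is altered only if the user's final selection is small, rather than every small cell in the stored data being altered in advance.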

  28. The New Data Three new non-census data sets would be included in INTERACTION: • National Health Service Central Register (NHSCR) data from 1975 to present • Hospital Episode Statistics (HES) data from 2001 to present • Higher Education Statistics Agency (HESA) student data from 2001 to present

  29. NHSCR Data • NHSCR data will be available as a time series for a consistent set of 100 zones based on the FHSA geography from 1975 to 1998 • Post-1998 data will be available for Health Authority areas in England and Wales and equivalent areas in Scotland and Northern Ireland • Variables will be restricted to broad age and sex categories

  30. Changing patterns of net migration as shown by NHSCR data (map panels: 1980-82 and 1988-90). Source: Stillwell (1994), Environment and Planning A
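Net migration for a zone is its inflow minus its outflow, and it can be derived directly from pairwise flow records. The sketch below uses hypothetical zone labels and counts; the function name net_migration is an assumption, and the code is not taken from the cited source.

```python
def net_migration(flows):
    """Return inflow minus outflow per zone from {(origin, destination): count}."""
    net = {}
    for (origin, destination), count in flows.items():
        if origin == destination:
            continue  # moves within a zone do not change its net balance
        net[origin] = net.get(origin, 0) - count
        net[destination] = net.get(destination, 0) + count
    return net
```

Across a closed set of zones the balances sum to zero, which is a convenient check when comparing periods.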

  31. HES data • We would be aiming to include HES data from 2001 until the present • Data contains information on all in-patient episodes relating to Hospitals in England • Origins are as detailed as Ward or SOA. Destinations are available down to Postcode Unit level • The ‘journey to hospital’ data can be disaggregated by a huge variety of variables, including:

  32. HES data • Age (at end and start of hospital episode) • Sex • Ethnicity • Duration of episode • Type of episode (related to treatment given) • Diagnosis category (International Classification of Diseases and related health problems [ICD-10] classification) – contains information on every known illness/disease/injury • Separate classifications for maternity and mental health episodes • Type of operation (if applicable)

  33. Total number of in-patient visits (including repeats) made to Yeovil and Weston hospitals from England in 2000/01

  34. HES data – research opportunities • Hospital Episode Statistics provide a unique opportunity to study hospital catchment areas in relation to specific treatments and enable measurements of ‘market penetration’ – something becoming more relevant under the new NHS Patient Choice directive which allows patients more choice over where they are treated • Spatial interaction modelling will enable analyses of the frictional effect of distance on the ‘commute’ to hospital, and the testing of ‘what if’ scenarios in relation to the opening and closing of hospitals • Optimum locations for new hospitals or treatment centres in relation to demand could be explored through location-allocation modelling
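The frictional effect of distance and the 'what if' scenarios mentioned here can be illustrated with a production-constrained spatial interaction model using exponential distance decay. This is a standard textbook form chosen for illustration, not necessarily the model the authors have in mind; the function name, the attractiveness weights, the distances and the value of beta are all assumptions.

```python
import math


def production_constrained_flows(origins, attractiveness, distance, beta):
    """Allocate each origin's total across destinations with exponential distance decay.

    T_ij = O_i * W_j * exp(-beta * d_ij) / sum_k W_k * exp(-beta * d_ik)
    """
    flows = {}
    for i, total in origins.items():
        # Attractiveness of each destination, discounted by distance from origin i.
        weights = {
            j: w * math.exp(-beta * distance[(i, j)])
            for j, w in attractiveness.items()
        }
        denominator = sum(weights.values())
        for j, weight in weights.items():
            flows[(i, j)] = total * weight / denominator
    return flows
```

A hospital closure can be explored in this sketch by removing that destination from attractiveness and re-running the function: the origin totals are preserved and reallocated across the remaining hospitals. A larger beta means distance deters travel more strongly.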

  35. HESA data • We would be aiming to include HESA data from 2001 until the present • Data contains information on the student's home address and the higher or further education institution attended • Origins could be as detailed as MLSOA, with destinations only as accurate as the location of the HE institution attended; there is no way to ascertain exactly where the student is living • Student migrations can be disaggregated by:

  36. HESA data • Age group (5 years) • Disability (disabled/not known to be disabled/not known) • Ethnicity (white/non-white/unknown) • Domicile (middle layer Super Output Area) • Postcode of HEI headquarters • Level of study (postgraduate, first degree, other undergraduate) • Subject area • Term-time accommodation • Major source of tuition fees • Mode of study (full-time/part-time) • Gender

  37. HESA data – research opportunities • Students are the section of the population most actively involved in internal migration in Britain • Increasing numbers of students are entering higher education, with large numbers of students becoming features of many of Britain’s major urban centres • Students have significant social, cultural, economic and environmental impacts on the areas where they live, with issues such as ‘studentification’ becoming active topics of political debate • Time series and cross-sectional analysis of student migration data in Britain should allow for greater understanding and prediction of student in-migration impacts

  38. Conclusions: • An extensive audit of interaction data in the UK led to CIDER identifying a number of key sources that could be incorporated into an updated version of the WICID system • New data sources would complement existing census-based interaction datasets and would move CIDER towards providing a more complete interaction data service • A number of technical challenges will need to be overcome as we move from WICID to INTERACTION • Easy access to new interaction data sources will provide unique opportunities for substantive research to be carried out in relation to internal migration in the UK

  39. Thank you Adam Dennett, Centre for Interaction Data Estimation and Research, School of Geography, University of Leeds a.r.dennett@leeds.ac.uk http://www.geog.leeds.ac.uk/people/a.dennett/ For the full audit: http://www.geog.leeds.ac.uk/wpapers/index.html
