This presentation discusses the emerging trends in data exchange and data hubbing in the context of data dissemination and communication. It covers topics such as the evolution of statistical communication, new methods of communication, structured data exchange using SDMX, and the concept of data hubbing. The presentation also highlights the benefits of SDMX and the impact of the SDMX registry on reporting efficiency.
Emerging Trends in Data Exchange and Data Hubbing
Jacob Assa, UN Statistics Division
Regional Workshop on Data Dissemination and Communication
Manila, the Philippines, June 20-22, 2012
United Nations Statistics Division 2012
Outline of the Presentation
• Data Dissemination in Context
• Dissemination History at UNSD
• Dissemination versus Communication
• Data Exchange and SDMX
• Data Hubbing Nationally and Globally
Data Dissemination in Context
• Virtual Value Chain (Svend and Hollensen, 2001)
• Dissemination: the last but not least step
• Often done as an afterthought
• Can be made more efficient and effective:
  • From Data Publishing to Data Exchange
  • From Data Silos to Data Hubbing
[Figure: virtual value chain. Define information problem -> Organize, select and compile information -> Synthesize information -> Distribute information -> Value]
Dissemination History in UNSD
• League of Nations 1919-1948 – print publications
• United Nations
  • 1948-1995 – print publications (yearbooks, manuals)
  • 1995-2000 – CD-ROM, static web pages
  • 2000-2008 – online databases, dynamic web queries (UN Comtrade, UN Common Database)
  • 2008 – launch of UNdata – UN System data portal
  • 2010 – World Statistics Pocketbook app for iPhones and iPads
  • 2012 – launch of CountryData – UN national data portal
Dissemination versus Communication
One-way vs. two-way communication
• Considerable evolution of statistical communication over recent years
• Traditionally, statistical organizations focused on
  • Dissemination through printed publications
  • One-way communication through a few media channels
    • Newspapers
    • Radio
    • Television
• Since the 1990s, an acknowledged need to do more than just disseminate data
  • Employing communication professionals
  • Widespread use of the Internet
  • New methods of communication and dissemination
Dissemination versus Communication
New methods of communication:
• Web 2.0 technologies
  • Blogs
  • Wikis
  • Social networks
• Interactive websites
  • Allow users to upload data and create graphs
  • Sharing and discussion with other users
Data Exchange - Unstructured
• Paper questionnaires
• Excel sheets
• CSV files
• Email
Semi-structured
• XML files
• However, XML in itself is simply a mark-up language and does not standardize the data structure between exchanging parties
XML - Example Philippines, GDP in constant 2000 US$ (World Bank)
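To illustrate why XML alone does not standardize structure, here is a minimal sketch in which two providers encode the same observation in different XML layouts, so the receiver needs a separate parser for each. Both layouts, the element and attribute names, and the value 100.0 are hypothetical placeholders; they are not World Bank figures or any real agency's format.

```python
import xml.etree.ElementTree as ET

# Two hypothetical XML layouts for the same observation.
# The value 100.0 is a placeholder, not a real GDP figure.
PROVIDER_A = """
<data>
  <record country="PHL" indicator="GDP_CONST_2000_USD" year="2010" value="100.0"/>
</data>
"""

PROVIDER_B = """
<series>
  <country>Philippines</country>
  <indicator>GDP (constant 2000 US$)</indicator>
  <obs><date>2010</date><amount>100.0</amount></obs>
</series>
"""


def parse_a(text):
    """Layout A: one <record> element per observation, everything in attributes."""
    root = ET.fromstring(text)
    return [(r.get("country"), int(r.get("year")), float(r.get("value")))
            for r in root.iter("record")]


def parse_b(text):
    """Layout B: nested elements, country given by name rather than by code."""
    root = ET.fromstring(text)
    country = root.findtext("country")
    return [(country, int(o.findtext("date")), float(o.findtext("amount")))
            for o in root.iter("obs")]


rows_a = parse_a(PROVIDER_A)
rows_b = parse_b(PROVIDER_B)
print(rows_a)
print(rows_b)
# Each parser finds nothing in the other provider's file.
print(parse_a(PROVIDER_B))
```

Both files are well-formed XML, yet the receiver still has to know each provider's element names and country coding in advance. That agreement on structure is the gap a shared standard is meant to close.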
Data Exchange - Structured
Statistical Data and Metadata Exchange (SDMX)
• What is it?
  • An initiative to foster standards for the electronic exchange of statistical information
  • Goal: explore e-standards that could improve efficiency and avoid duplication
  • Sponsored by BIS, ECB, EUROSTAT, IMF, OECD, UN, WB
• What it is not
  • Not a technology, but implemented using technology (XML, EDIFACT syntax and the GESMES/TS message)
• How does it work?
  • Exchange partners agree on Data Structure Definitions
  • Data and metadata are exported and imported accordingly
Benefits of SDMX
Protection of existing technology investments
• Many different types:
  • Data warehouses
  • OLAP cubes
  • GESMES/TS
  • Publication systems
• SDMX standardizes formats and protocols at the point where data and metadata pass between counterparties
SDMX Registry/Repository
[Diagram: SDMX Registry interfaces]
• REGISTRY, Data Set/Metadata Set: indexes data and metadata (interfaces: Register, Query)
• REPOSITORY, Provisioning Metadata: describes data and metadata sources and reporting processes (interfaces: Submit, Query)
• REPOSITORY, Structural Metadata: describes data and metadata structures (interfaces: Submit, Query)
• Subscription/Notification
Impact of the SDMX Registry
• The SDMX Registry allows for one of the major efficiency gains possible with SDMX:
  • Shifting from “push”-based reporting to “pull”-based reporting
• This can save considerable time and avoid duplication of effort
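The shift from push to pull can be sketched with a toy in-memory registry. This is an illustration only: the class, function names, dataflow id and `example` URLs are hypothetical and do not reflect the actual SDMX Registry interfaces. The point is that a provider publishes once and registers the location, and any number of collectors then discover and fetch the data themselves.

```python
class Registry:
    """Toy registry: it indexes where data lives; it does not store the data."""

    def __init__(self):
        self._index = {}        # dataflow id -> list of source URLs
        self._subscribers = {}  # dataflow id -> list of callbacks

    def register(self, dataflow, url):
        """Record a new data source and notify subscribers of that dataflow."""
        self._index.setdefault(dataflow, []).append(url)
        for notify in self._subscribers.get(dataflow, []):
            notify(dataflow, url)

    def query(self, dataflow):
        """Return the known source URLs for a dataflow."""
        return list(self._index.get(dataflow, []))

    def subscribe(self, dataflow, callback):
        """Ask to be called whenever a new source is registered."""
        self._subscribers.setdefault(dataflow, []).append(callback)


# Stand-in for the providers' web servers: URL -> published content.
PUBLISHED = {}


def publish(registry, dataflow, url, content):
    """Provider side: publish once on its own site, then register the URL."""
    PUBLISHED[url] = content
    registry.register(dataflow, url)


def pull(registry, dataflow):
    """Collector side: discover URLs through the registry, then fetch each one."""
    return {url: PUBLISHED[url] for url in registry.query(dataflow)}


registry = Registry()
notifications = []
registry.subscribe("EXTERNAL_DEBT", lambda flow, url: notifications.append(url))

# Hypothetical providers and placeholder payloads.
publish(registry, "EXTERNAL_DEBT", "https://provider-a.example/debt.xml", "<placeholder-a/>")
publish(registry, "EXTERNAL_DEBT", "https://provider-b.example/debt.xml", "<placeholder-b/>")

collected = pull(registry, "EXTERNAL_DEBT")
print(sorted(collected))
print(notifications)
```

Under push-based reporting each provider would send a separate report to every collector; here each provider publishes once, and adding another collector costs the provider nothing.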
What is a Data Structure Definition?
• Specifies a set of concepts which describe and identify a set of data
• Tells which concepts are dimensions (identification and description) and which are attributes (description only)
• Tells which code lists provide the possible values for the dimensions and attributes
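The three roles of a Data Structure Definition can be sketched as a small Python class. This is a simplified stand-in, not the SDMX information model: the class, the concept names, the code lists and the observation values below are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataStructureDefinition:
    """Minimal stand-in for a DSD: concepts, their roles, and code lists."""
    dimensions: list   # concepts that identify (and describe) an observation
    attributes: list   # concepts that only describe it
    code_lists: dict = field(default_factory=dict)  # concept -> allowed codes

    def validate(self, observation):
        """Return a list of problems; an empty list means the observation conforms."""
        problems = []
        for dim in self.dimensions:
            if dim not in observation:
                problems.append(f"missing dimension: {dim}")
        for concept, value in observation.items():
            if concept == "OBS_VALUE":  # the measured value itself
                continue
            if concept not in self.dimensions and concept not in self.attributes:
                problems.append(f"unknown concept: {concept}")
            allowed = self.code_lists.get(concept)
            if allowed is not None and value not in allowed:
                problems.append(f"invalid code for {concept}: {value}")
        return problems

    def key(self, observation):
        """The dimension values, in order, identify the observation."""
        return tuple(observation[d] for d in self.dimensions)


# Hypothetical DSD agreed between two exchange partners.
dsd = DataStructureDefinition(
    dimensions=["REF_AREA", "INDICATOR", "TIME_PERIOD"],
    attributes=["OBS_STATUS"],
    code_lists={
        "REF_AREA": {"PH", "KH"},
        "INDICATOR": {"GDP"},
        "OBS_STATUS": {"A", "E"},
    },
)

# Placeholder observations; 100.0 is not a real figure.
good = {"REF_AREA": "PH", "INDICATOR": "GDP", "TIME_PERIOD": "2010",
        "OBS_STATUS": "A", "OBS_VALUE": 100.0}
bad = {"REF_AREA": "XX", "INDICATOR": "GDP", "OBS_STATUS": "A", "OBS_VALUE": 100.0}

print(dsd.validate(good))
print(dsd.key(good))
print(dsd.validate(bad))
```

Because both partners validate against the same definition, a missing dimension or an unlisted code is caught mechanically at the point of exchange rather than by manual inspection.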
What is Data Hubbing?
• In general, a hub is the central part of a wheel where the spokes come together. The term is familiar to frequent fliers who travel through airport "hubs" to make connecting flights from one point to another
• In data communications, a hub is a place of convergence where data arrives from one or more directions and is forwarded out in one or more other directions
(Source: http://searchnetworking.techtarget.com)
Data Hubbing at the National Level
Cambodia – DFID Project Objectives
• Improve coordination in the National Statistical System
• Collate development data in one place/hub
• Make access to national data easier
• Reduce data request burden
• Use of the latest IT software and practices
Project Dissemination Model
[Diagram: data flow across Line Ministries, the National Statistical Office and the United Nations. Components: Line Ministry Database, National Repository DB, National Indicator Registry, DevInfo, Mapping tool, Scripts, SDMX-ML, XLS. Actions: Upload, Publish, Register files, Post notification, Download]
Data Hubbing at the International Level (1)
The Joint External Debt Hub (JEDH)
Jointly developed by
• Bank for International Settlements (BIS)
• International Monetary Fund (IMF)
• Organisation for Economic Co-operation and Development (OECD)
• World Bank (WB)
JEDH Site before SDMX
[Diagram: BIS, IMF, OECD and World Bank each supply data in various formats to the website, on a 3-month production cycle]
JEDH with SDMX
[Diagram: BIS, IMF, OECD and World Bank each provide SDMX-ML; the JEDH site also shows SDMX-ML (debtor database)]
• Info about data is registered in the SDMX Registry
• SDMX “Agent” discovers data and URLs through the SDMX Registry
• Agent retrieves data from the sites
• Data is loaded into the JEDH DB
• Data is provided in real time to the JEDH site
Data Hubbing at the International Level (2)
UNdata Portal
• Before, a researcher interested in analyzing the effects of population, health and education on per capita income growth would need to visit:
  • UNSD website for population figures
  • WHO website for health indicators
  • UNESCO website for education indicators
  • UNSD/World Bank/IMF websites for income data
• Now all these indicators are available in one place through a single user interface
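The "one place, single interface" idea can be sketched as combining separately held series into one table keyed by country and year. This is a toy illustration and not how UNdata is implemented; the source names, keys and values are hypothetical placeholders.

```python
# Hypothetical per-agency series keyed by (country, year).
# All values are placeholders, not real statistics.
SOURCES = {
    "population": {("PH", 2010): 1.0},
    "health": {("PH", 2010): 2.0},
    "education": {("PH", 2010): 3.0, ("KH", 2010): 4.0},
}


def build_hub(sources):
    """Combine separate series into one table keyed by (country, year)."""
    hub = {}
    for indicator, series in sources.items():
        for key, value in series.items():
            hub.setdefault(key, {})[indicator] = value
    return hub


hub = build_hub(SOURCES)

# One lookup now returns every available indicator for a country and year.
print(hub[("PH", 2010)])
print(hub[("KH", 2010)])
```

The researcher in the example above would make a single query against the hub instead of visiting each agency's site and reconciling the results by hand.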
Data Hubbing at the International Level (3)
European Central Bank (ECB)
• Push vs. pull plus a hybrid approach
• Central Hub to which all member banks submit their SDMX data
• The ECB then pulls the entire dataset from the Central Hub
• SDMX-based visualizations
Resources
• UNSD - Handbook of Statistical Organization (3rd ed.): http://unstats.un.org/unsd/dnss/hb/default.aspx
• UNECE - Making Data Meaningful (2 parts): http://www.unece.org/stats/documents/writing/
• SDMX: http://sdmx.org/
Contacts
• United Nations Statistics Hotline - statistics@un.org
• Jacob Assa, UNSD - assaj@un.org