This article discusses the IMF's approach to storing and disseminating metadata for macroeconomic statistics, including collaborations with other organizations and the use of data standards.
IMF Approach to Storing Metadata with Macroeconomic Statistics UNECE Workshop on the Common Metadata Framework (Vienna, Austria, 4-6 July 2007)
Dissemination Standards Bulletin Board (DSBB) • data standards initiative (SDDS/GDDS) • countries’ dissemination practices • information that SDDS countries provide to the IMF on their dissemination practices • direct links to the economic and financial data that countries disseminate under the SDDS • information that GDDS countries make available to the IMF on their statistical practices • http://dsbb.imf.org
Collaboration with OECD • Dec 2006 – agreement to use Dotstat and MetaStore as the basis of the IMF data warehouse • Jan 2007 – software available on joint Team Foundation Server (TFS) • Feb 2007 – IMF.Stat installed with the assistance of OECD • May 2007 – loaded International Financial Statistics (IFS), World Economic Outlook (WEO), and Sub Saharan Africa Regional Economic Outlook (REO) • June 2007 – signed an MOU supporting a collaborative approach to future enhancements, for the mutual benefit of both organizations
IMF.Stat Data Model [diagram with two panels, Data and Referential Metadata, each showing the same dimension tables around a fact table]
• Country Group: CouGrpID 100, ParentID Null, Code 156, Label Canada
• Concept: ConceptID 250, ParentID 200, Code NGDP, Label Gross ...
• Unit Of Measure: UofMID 10, ParentID Null, Code N, Label Nat Curr
• DataSource: DatSrceID 3000, ParentID Null, Code WEO, Label World ...
• Time & Frequency: TimeFreqID 25, ParentID Null, Code 200401, Label 2004 Q1
• Status: StatusID 2, ParentID Null, Code SHARE, Label Shareable
• Data Fact table: CouGrpID 100, ConceptID 250, DataSourceID 3000, UnitOfMeasID 10, TimeFreqID 25, StatusID 2, Observation 158.1, Flag E
• Metadata Fact table: CouGrpID 100, ConceptID 250, DataSourceID 3000, UnitOfMeasID 10, TimeFreqID 25, StatusID 2, MetadataID 5487
• Metadata: MetadataID 5487, Text "Chain-linked GDP volume measures are expressed in ..."
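The data model on this slide can be sketched as a small star schema. The following is only an illustration using Python's built-in sqlite3 module, not the actual IMF.Stat or Dotstat implementation. Table names, the use of the fact-table key names for the dimension tables, and the rule that a metadata row attaches to an observation when all six dimension keys match are assumptions; the sample values are the ones shown on the slide, with truncated labels left as they appear.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Assumed layout: every dimension table has the same four columns
# (ID, ParentID, Code, Label), as the slide suggests.
DIMENSIONS = {
    "CountryGroup": "CouGrpID",
    "Concept": "ConceptID",
    "DataSource": "DataSourceID",
    "UnitOfMeasure": "UnitOfMeasID",
    "TimeFrequency": "TimeFreqID",
    "Status": "StatusID",
}
for table, key in DIMENSIONS.items():
    cur.execute(
        f"CREATE TABLE {table} "
        f"({key} INTEGER PRIMARY KEY, ParentID INTEGER, Code TEXT, Label TEXT)"
    )

# Both fact tables carry the same six dimension keys.
fact_keys = ", ".join(f"{key} INTEGER" for key in DIMENSIONS.values())
cur.execute(f"CREATE TABLE DataFact ({fact_keys}, Observation REAL, Flag TEXT)")
cur.execute(f"CREATE TABLE MetadataFact ({fact_keys}, MetadataID INTEGER)")
cur.execute("CREATE TABLE Metadata (MetadataID INTEGER PRIMARY KEY, Text TEXT)")

# Sample rows copied from the slide (ParentID Null becomes None).
dimension_rows = {
    "CountryGroup": (100, None, "156", "Canada"),
    "Concept": (250, 200, "NGDP", "Gross ..."),
    "DataSource": (3000, None, "WEO", "World ..."),
    "UnitOfMeasure": (10, None, "N", "Nat Curr"),
    "TimeFrequency": (25, None, "200401", "2004 Q1"),
    "Status": (2, None, "SHARE", "Shareable"),
}
for table, row in dimension_rows.items():
    cur.execute(f"INSERT INTO {table} VALUES (?, ?, ?, ?)", row)

cur.execute(
    "INSERT INTO DataFact VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (100, 250, 3000, 10, 25, 2, 158.1, "E"),
)
cur.execute(
    "INSERT INTO MetadataFact VALUES (?, ?, ?, ?, ?, ?, ?)",
    (100, 250, 3000, 10, 25, 2, 5487),
)
cur.execute(
    "INSERT INTO Metadata VALUES (?, ?)",
    (5487, "Chain-linked GDP volume measures are expressed in ..."),
)

# Assumed join rule: metadata belongs to an observation when all six keys match.
same_keys = " AND ".join(f"mf.{key} = d.{key}" for key in DIMENSIONS.values())
row = cur.execute(
    f"""
    SELECT cg.Label, c.Code, tf.Label, d.Observation, d.Flag, m.Text
    FROM DataFact d
    JOIN CountryGroup cg ON cg.CouGrpID = d.CouGrpID
    JOIN Concept c ON c.ConceptID = d.ConceptID
    JOIN TimeFrequency tf ON tf.TimeFreqID = d.TimeFreqID
    LEFT JOIN MetadataFact mf ON {same_keys}
    LEFT JOIN Metadata m ON m.MetadataID = mf.MetadataID
    """
).fetchone()
print(row)
```

Keeping the observation and its descriptive text in separate fact tables that share the same keys means a note can be stored once and looked up alongside the number it describes.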
Structural metadata • Economic concepts – mapped as many time series as possible to the Catalogue of Time Series (CTS) and loaded them to IMF.Stat • Countries and groups – used the IFS version of country names and codes as the authoritative source for codes and labels • Unit – chose to combine unit and scale, e.g. Millions of US dollars • Storing data in native units, i.e. not trying to convert observations to a common unit • Status, Source, and Time and Frequency have been reasonably straightforward so far, but will become more problematic when we introduce versioning
Referential Metadata • Working through existing metadata from IFS publications and the production system • Where necessary and possible, cleaning it up, standardizing it, and loading it to MetaStore • WEO – metadata sourced from the external web site, reformatted, and stored in MetaStore, then exported to IMF.Stat • All referential metadata is loaded to MetaStore and then exported to IMF.Stat
Data – IFS • All time series that could be mapped to the Catalogue of Time Series (CTS) • Includes • Exchange rates • Balance of Payments • International Investment Position • Real Sector Statistics • International Liquidity • Money and Banking non-SRF data • Excludes • Government Finance • Money and Banking SRF data • Fund Accounts • 191 concepts • 233 countries • 39 groups • 7.6 million observations
Data – WEO • Two most recent editions • Includes series published externally as well as other series available internally • Concept – generally consistent with the CTS • Country and group – some differences in the codes used, so mapped where possible; some groups added • Unit – limited number of units used, mainly consistent across countries
Data – REO: Sub Saharan Africa REO Structural Metadata • Concepts – virtually no codes or labels in common with the CTS • Able to map the series published in the REO, but the supporting series proved too difficult; now working through them case by case to determine which, if any, map to the CTS • Country and group – country codes and labels mainly consistent with WEO; the groups all differ, even where they share a label • Units – mainly ratios, which were added to the authoritative list
Sub Saharan Africa REO Referential Metadata • Have sourced top-level referential metadata only • Will work with the Africa Department after the data are loaded to identify any usable referential metadata
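The mapping work described for WEO and REO (map source codes to an authoritative list where possible, and set the rest aside for case-by-case review) can be sketched as follows. This is a minimal illustration, not the Fund's actual procedure: the function name, the result type, and every code except NGDP are hypothetical. It matches on codes only, never on labels, since the slides note that REO groups can share a label yet differ.

```python
from dataclasses import dataclass, field


@dataclass
class MappingResult:
    mapped: dict = field(default_factory=dict)    # source code -> authoritative code
    unmapped: list = field(default_factory=list)  # codes left for manual review


def map_to_authoritative(source_codes, explicit_mapping, authoritative_codes):
    """Map each source code to the authoritative list, or flag it for review."""
    result = MappingResult()
    for code in source_codes:
        # Use an explicit mapping when one exists; otherwise try the code itself.
        target = explicit_mapping.get(code, code)
        if target in authoritative_codes:
            result.mapped[code] = target
        else:
            result.unmapped.append(code)
    return result


# Hypothetical inputs: only NGDP appears in the slides.
authoritative = {"NGDP"}
explicit = {"REO_GDP": "NGDP"}
result = map_to_authoritative(["NGDP", "REO_GDP", "REO_AUX_1"], explicit, authoritative)
print(result.mapped)
print(result.unmapped)
```

Returning the unmapped codes rather than dropping or guessing them mirrors the case-by-case review of the REO supporting series.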
MetaStore • Some modifications with assistance from OECD • Now includes • structural metadata • mappings to authoritative lists • referential metadata
SchemaLogic • In future may integrate structural metadata in MetaStore or replace
Alignment with SDMX • Have used 42 ‘types’ to categorize our referential metadata • Added one type to the OECD set, which is consistent with SDMX
Managing metadata within the IMF • Locate relevant sources of metadata • Locate potential warehouse content • Central repositories for data and metadata • Harmonizing and mapping to a preferred term • Authoritative lists • Working with Information Services Division (ISD) to ensure information management best practice • Assigning data stewards to manage metadata
Governance • Establishing groups and individuals with certain roles and responsibilities for management of metadata • Economic Data Advisory Group • Representation from departments across the Fund • Includes several working groups with specific focus • Information Services Division • Responsible for provision of metadata • Metadata and Standards team • New group in the Statistics Department currently focusing on metadata used in the data warehouse
Next Steps • Changes to work practices across the Fund • Identify a data steward for each dimension in IMF.Stat • Standardization, authoritative sources • Reuse of metadata across systems • Raise awareness of the value of quality metadata • Tie together basic schemas
EDW Top Level Diagram [diagram] • Data sources: internal (IFS, WEO) and external (DataStream, Haver) • ETL • MetaStore, with a user interface • IMF.stat, with a user interface, for end-users • Flows labelled: data flow, structural metadata, referential metadata, referential and structural metadata • Dimensions shown: Concept, Country Group, Data Source, Time & Freq • Sample country codes: 111 USA, 112 UK, 273 MEX, ... • Sample records: 2005;USA,GDP,548.25 2004;USA,GDP,526.25 ...
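The ETL step in this diagram can be illustrated with the sample records it shows. The sketch below assumes each record is laid out as period;country,concept,value and that the three-digit codes listed beside the country labels are the warehouse's country keys. Both are readings of the diagram rather than documented formats, and the function and field names are hypothetical.

```python
# Country codes as listed in the diagram (111 USA, 112 UK, 273 MEX).
COUNTRY_CODES = {"USA": "111", "UK": "112", "MEX": "273"}


def parse_record(line):
    """Parse one 'period;country,concept,value' record (assumed layout)."""
    period, rest = line.split(";", 1)
    country, concept, value = rest.split(",")
    return {
        "period": period.strip(),
        # Raises KeyError for a country missing from the code list, so that
        # unknown labels surface at load time instead of being stored.
        "country_code": COUNTRY_CODES[country.strip()],
        "concept": concept.strip(),
        "value": float(value),
    }


records = [
    parse_record(line)
    for line in ["2005;USA,GDP,548.25", "2004;USA,GDP,526.25"]
]
for record in records:
    print(record)
```

Resolving labels to codes during loading keeps the warehouse keyed on the authoritative list instead of on source-specific labels.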