
Big Data, threats or opportunities?

Big Data, threats or opportunities?. Ki-Jong Woo Commissioner Statistics Korea (KOSTAT). Friday Seminar “Big Data for Policy, Development and Official Statistics”. 22 February 2013, New York. Contents. Global Responses to Big Data Big Data Use in Public Sector of Korea


Presentation Transcript


  1. Big Data, threats or opportunities?
     Ki-Jong Woo, Commissioner, Statistics Korea (KOSTAT)
     Friday Seminar “Big Data for Policy, Development and Official Statistics”
     22 February 2013, New York

  2. Contents
     • Global Responses to Big Data
     • Big Data Use in Public Sector of Korea
     • KOSTAT’s preparation in response to Big Data
     • KOSTAT’s pilot project on applicability of Big Data
     • Threats or Opportunities? What should we prepare?

  3. 1. Global Responses to Big Data
     “The Future Belongs to the One who Rules the Data”
     • National Strategy for Big Data is a global trend
       - US: ‘Big Data R&D Initiative’, investing in R&D on big data technology (Mar 2012)
       - Japan: ‘Basic Strategy on Using Big Data’ as a major task for the ‘Active Japan’ initiative (May 2012)
     • Korean Government’s Response
       - Public-Private-Academic Partnerships
         - “Big Data Strategy Forum” for market analysis and future strategy (Mar 2012)
         - “Korea Big Data Forum” for market growth and industrial transformation (Sep 2012)
       - National Master Plan on Big Data
         - Established by 5 concerned ministries, including the National Science & Technology Commission and the Ministry of Education, Science and Technology (Dec 2012)
         - Identifying the national agenda for big data use and its relevant infrastructure to ensure applicability and benefits to the public

  4. 2. Big Data Use in Public Sector of Korea
     • 3Vs in Reality?
       - Does using big data mean handling only ‘LARGE-SCALE’ and ‘REAL-TIME’ data?
       - For example, what about ‘administrative data’ in the public sector?
         → It could count, because it can provide greater INSIGHTS for reproduction of data
     • National Master Plan on Big Data
       - First step: develop a system that can cross-link the huge amount of administrative data kept by different government agencies
       - 3 priority tasks in 2013:
         - Prevention of criminal incidents by predicting the location and time of crimes
         - Early detection and prediction of natural disasters
         - A traffic accident reduction scheme with a bottom-up method (citizen participation)

  5. 2. Case Example: Supporting Youth At Risk Policy
     • Pilot Project: Recognizing Youth Risk Pattern based on Social Data
       - Youth at Risk Policy has been insufficient in fighting serious/life-threatening youth problems
       - Analyzing social data (buzz patterns related to suicide among youth) allows formulation of policies to effectively prevent suicide by better understanding the circumstances and psychological states of the youth
     • Identify risk factors based on analysis of suicide context (e.g. Suicide Context Pool)
     • Prevent spread of harmful content through popular URLs or SNS (e.g. online spread of harmful content)
     • Develop a cooperative mechanism with online “Big Mouths” (influencers) (e.g. influence by major players online)
     Figure labels: Harmful Contents; Suicide risk factor
     (Source) National Information Society Agency Pilot Project (Jul-Sept, 2012)

  6. 3. What should KOSTAT do?
     Scientifically collected data vs. huge amount of data
     • Statistics Win! • Big Data Win! • Challenge!
       - Quantity creates quality
       - Statistics can also be biased
       - MORE timely
       - Survey environment is challenging
       - On the basis of mathematical theories
       - Represent target population
       - Timely
     “All Models are Wrong, but Some are Useful” – George Box
     “Will the Data Deluge Make the Scientific Method Obsolete?” – Chris Anderson, 『The End of Theory』

  7. 3. What does KOSTAT currently do?
     • Study Group
       - Since October 2012
       - Aims to raise awareness of big data among KOSTAT staff
       - Organizes monthly seminars to share up-to-date information and trends in big data at the global and national level
     • Pilot Project
       - A pilot project on the use of big data in the process of editing existing national statistics
       - Using media data to examine outliers when producing the Index of Industrial Production (IIP)
       - Completing the system set-up by this February for a pilot trial run in March and April 2013

  8. 4. Case Example: Pilot Project on IIP(1)
     • Monthly Survey on Mining and Manufacturing
       - Original Survey: 83 26 Industrial Types (Division level); 633 Products; 8,300 Establishments
       - Pilot Project Coverage: 4 Industrial Types; 162 Products; 1,438 Establishments
     • Industrial types covered by the pilot
       - C21 (Pharmaceuticals)
       - C24 (Primary Metals)
       - C26 (Electronic Components, …)
       - C28 (Electric Equipment)

  9. 4. Case Example: Pilot Project on IIP(2)
     • Compiling IIP: AS-IS vs. TO-BE Work Flow
     • AS-IS
       - Survey & Data Entry (3rd-19th of every month)
       - Produce Preliminary Index (20th of every month)
       - Review goods total and make inquiries (20th-23rd of every month)
       - Confirm Index and Upload Data (23rd of every month)
       - Analyze Index and Prepare Reports (24th-26th of every month)
     • TO-BE (Application of Big Data)
       - Visualization of Surveyed Data: volume by product and establishment; index by industry and product
       - Data Edit
       - Telephone Inquiries
       - Analysis of Unstructured Data from Media: check related information on the Internet; comprehensive internet information on products and establishments

  10. 4. Case Example: Pilot Project on IIP(3)
     • Collecting and Analyzing Media Data
     • Data Collection: crawling of related articles and web documents online from the 1st of the previous month to the present
       - Collect internet data on products or establishments indicating increases and decreases in trends
       - Crawl internet news on a real-time basis
       - Download materials (i.e. PDF, DOC) on websites and upload them to the server for analysis
     • Data Analysis: analysis of unstructured data (IBM’s VIVISIMO)
       - Consolidate documents downloaded from websites and internet news for comprehensive analysis
       - Enhance search accuracy by registering similar product names in advance
       - Provide websites and internet news in order of search accuracy
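The slide's idea of registering similar product names in advance and then returning documents in order of search accuracy can be sketched roughly as follows. This is only an illustrative sketch and not the pilot system or the VIVISIMO product: the synonym registry, the function names `relevance` and `rank_documents`, and the scoring rule (a plain count of term occurrences) are all assumptions made for the example.

```python
import re

# Hypothetical registry: canonical product name -> similar names registered in advance.
PRODUCT_SYNONYMS = {
    "semiconductor": ["semiconductor", "memory chip", "dram"],
}


def relevance(text, terms):
    """Count case-insensitive occurrences of any registered term in the text.

    Plain substring counting is an assumption of this sketch; it would also
    match a term inside a longer word.
    """
    lowered = text.lower()
    return sum(len(re.findall(re.escape(term.lower()), lowered)) for term in terms)


def rank_documents(documents, product):
    """Return (document, score) pairs with score > 0, highest score first.

    Ties keep the original document order.
    """
    # Fall back to the product name itself when nothing has been registered.
    terms = PRODUCT_SYNONYMS.get(product, [product])
    scored = [(relevance(doc, terms), idx) for idx, doc in enumerate(documents)]
    matched = [pair for pair in scored if pair[0] > 0]
    matched.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(documents[idx], score) for score, idx in matched]


if __name__ == "__main__":
    docs = [
        "DRAM prices rise as memory chip demand grows",
        "Steel output falls",
        "Semiconductor exports up",
    ]
    for doc, score in rank_documents(docs, "semiconductor"):
        print(score, doc)
```

Registering variants up front means an article that never uses the canonical product name can still be retrieved, which appears to be the point of the slide's "similar product names" step.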

  11. 4. Case Example: Pilot Project on IIP(4)
     • Coverage of Materials to be Analyzed (Media Data)
       - Information posted on websites and internet news articles

  12. 4. Case Example: Pilot Project on IIP(5)
     • Visualizing Surveyed Data
     • Direct Link to IPS DB
       - Direct access to the Industrial Production System (IPS) database
     • Visualizing data
       - Visualize data as graphs by industry, establishment and product for time series analysis of IIP and output volume
     • Visualization of structured data
       - Volume data: graph of output by product/establishment (volume / comparison with previous month / comparison with same month of previous year)
       - Index data: graph of index by industry/product (volume / comparison with previous month / comparison with same month of previous year)
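The two comparisons named on this slide (against the previous month and against the same month of the previous year) are simple percentage changes. A minimal sketch follows, assuming the series is held as a dictionary keyed by `(year, month)`; the function names, the data layout and the sample figures are hypothetical and are not taken from the IPS database.

```python
def pct_change(current, base):
    """Percentage change of current relative to base."""
    return (current - base) / base * 100.0


def previous_month(year, month):
    """Return the (year, month) key that precedes the given month."""
    return (year - 1, 12) if month == 1 else (year, month - 1)


def compare_periods(series, year, month):
    """Compute both comparisons for one month of a series.

    series maps (year, month) -> output volume or index value.
    A comparison is omitted when its base period is missing or zero.
    """
    current = series[(year, month)]
    bases = {
        "vs_previous_month": series.get(previous_month(year, month)),
        "vs_same_month_previous_year": series.get((year - 1, month)),
    }
    return {
        name: pct_change(current, base)
        for name, base in bases.items()
        if base  # skips None and 0
    }


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = {(2012, 1): 100.0, (2012, 12): 110.0, (2013, 1): 121.0}
    print(compare_periods(sample, 2013, 1))
```

Keying by `(year, month)` keeps the January case explicit: the previous month of January 2013 is December 2012, not month 0 of 2013.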

  13. 4. Case Example: Pilot Project on IIP(6)

  14. 5. Crisis of NSOs?
     • NSOs will have to decide whether to produce official statistics based on big data. Someday!
     • What would users prefer?

  15. 5. What should NSOs prepare? Representativeness
     • Encourage research comparing statistics produced on the basis of big data with existing statistics
       - e.g. the 2-year pilot project on using big data for a price index in Europe

  16. 5. What should NSOs prepare? Accessibility: Openness vs. Privacy
     • A legal framework and system to facilitate data sharing and access should be established to ensure effective use of big data among agencies.
       - KOSTAT provides 540 subjects of statistics for free and plans to develop an OpenAPI this year
     • There is growing demand for microdata with the advent of big data.
       - Clever solutions are needed for the issues pertaining to microdata, mainly the conflict between the need for data sharing and ensuring privacy protection

  17. 5. What should NSOs prepare? Data Scientists
     • 14,000 data scientists will be needed by 2017, while there are currently about 100 in Korea.
     • Data scientists should be cultivated in the public and private sectors.
       - Strategies are needed to train big data specialists in cooperation with the private, academic and research sectors
       - A curriculum to foster data scientists needs to be developed, concentrating on business analysis, statistics and IT studies

  18. The future belongs to those who rule the data

  19. Thank You 감사합니다
