
Overcoming I/O Bottlenecks in Full Data Path Processing

Join the Collaborative Expedition Workshop #74 to learn about intelligent and scalable data management techniques for overcoming I/O bottlenecks in full data path processing. Experts from GSA, NSF, NARA, and NIST will discuss strategies for enabling access and discovery in data ingest and computation.


Presentation Transcript


  1. Collaborative Expedition Workshop #74, June 10, 2008. Overcoming I/O Bottlenecks in Full Data Path Processing: Intelligent, Scalable Data Management from Data Ingest to Computation, Enabling Access and Discovery. Susan Turnbull, GSA; Almadena Chtchelkanova, NSF; Robert Chadduck, NARA; Richard Spivack, NIST

  2. Collaborative Expedition Workshops Purpose: Monthly open workshops to transcend insularity, encourage collaboration and demonstrate promising capabilities emerging from IT research and development • Organize around common purpose, larger than any institution, to appreciate potentials and realities • Improve quality of dialogue and collaborative prototyping at intergovernmental crossroads • Participants, representing many forms of expertise, return to their settings with a larger perspective of the “whole”

  3. Collaborative Expedition Workshops Create conducive conditions for “Breakthrough” Innovations • Need to Know -> Need to Share -> Build to Share • To be Informed (not Overwhelmed) by the Combined Complexity of our multiple forms of Expertise • Communities of Practice • Agile Framework for Building Intergovernmental Services • Open Collaboration, Open Standards

  4. VASA – 1628 In design we either hobble or support people’s natural ability to express forms of expertise.

  5. Building Sustainable Stewardship Practices Across Communities • Collaborative Expedition Workshops and Collaborative Work Environment (http://www.gsa.gov/collaborate) Co-sponsors: • 1. GSA's Intergovernmental Solutions Office • 2. Emerging Technology SC (ETSC), Architecture and Infrastructure Committee of the Federal CIO Council – http://cio.gov • 3. Subcommittee on Networking and Information Technology R & D (NITRD) CGs: including Social, Economic and Workforce Implications of IT and IT Workforce Development (SEW), Human-Computer Interaction-Information Management (HCI-IM) and High End Computing (HEC) – http://nitrd.gov

  6. Emerging Technology Subcommittee (ET SC), CIO Council Tuning ET Together From Stovepipes to Wind Chimes Purpose: “Incubator” organizing process to accelerate shared discovery, maturation, and validation of community-based capabilities. Common understanding of scenarios • Greater foresight and discernment • Improved collaboration • Sustainable life-cycles

  7. Emerging Technology Subcommittee - ET SC Key FY08 Activities • Conduct Collaborative Expedition Workshops with GSA and Subcommittee on Networking for IT Research and Development • Conduct ET Life-cycle process – http://ET.gov (StratML) ET SC Co-chairs

  8. Key FY08 Activities 1. Conduct Collaborative Expedition Workshops Purpose: Monthly open workshops to encourage collaboration among government and community implementers of IT and to demonstrate promising capabilities emerging from IT research that aligns with FEA principles • “Facilitate strategic dialogue among communities of interest. Through the Expedition Workshops, sponsored by AIC, interested participants experience and learn about new opportunities to adhere to sound architectural principles and implement shared, service-oriented solutions.” from CIOC Strategic Plan • Leadership in virtual collaboration (i.e. Data Reference Model, Geospatial Profile) Key FY08 Activities/Deliverables • Organize around business scenarios from ET.gov & IT R&D communities that address CIOC Strategic Plan and Architecture Principles for the US Government. • Organize around CIO requests. ET SC co-chairs SEW CG co-chairs

  9. Key FY08 Activities 2. Conduct http://ET.gov Purpose: “Continue to develop more efficient and effective methods for sharing information on emerging technologies.” CIOC Strategic Plan ET.gov stages: 1. Identification: anyone registers ET component using XML schema 2. Subscription: community forms around high potential component 3. Stewardship: community recognized by ET SC (i.e. IPv6, StratML) 4. Graduation: component recognized by Services SC for inclusion in CORE.gov Key FY08 Actions • Explore partnering with other federal settings involved in technology evaluation and transfer • Conduct Collaborative Expedition workshops to support networking among ET communities Contact Information

  10. Lessons Learned Connecting the Cultural DOTs - Dialogue, Openness, Transparency 1. Create environment to appreciate the “whole picture” – transcend insularity 2. Practice plausible scenarios “on Purpose” • Monthly public workshops, no fee, supports remote participants (shared screen, chat room) and public archive, including audio files, discussion forum • Assume strategic leadership roles while “thinking out loud together” 3. Shared Purpose is the organizing force in public workshops • Purpose that is larger than any organization – including government; influences structure and participation more than lines of control

  11. Lessons Learned – Summary of Purpose • Improved Ability to Appreciate the Whole Picture • overcome cultural differences in order to increase returns and decrease risk • Improved Ability to Engage in Sustained Dialogue • low-cost, low risk opportunities to dialogue and exchange views on emerging issues, enabling trust and mutual sense of purpose to meet future challenges together • Improved Resource Allocation Process for Achieving Results • find common ground and shared understanding across funding, implementation, and accountability processes, to eliminate delays, disincentives, and indecisiveness from non-aligned processes

  12. Lessons Learned – Organizing via Communities of Practice. Collaborative Space Augments People’s Natural Ability for Dialogue and Sharing. Solid: • Past contributions and conversations always available • Content never lost, wiki changes visible/accountable by name • High confidence level in 24/7 availability • Hosted on high performance infrastructure • Platform independent • Any file format in shared repository • Fine-grained access – “virtual pointer on infinite whiteboard” (persistent identifiers). Fluid: • Augments flow of purposeful conversations • Sharing is paramount • Context advances understanding • Supports quality of dialogue, openness and transparency needed to build trust • Supports CoP planning and development of events and documents • Uses only everyday tools: phone and browser • Open or closed communities • Community sets the pace

  13. Key Findings Building Sustainable Stewardship Practices Across Communities • FY03 - Agile business components not easily discovered by e-government managers resulting in lost opportunities • FY04 - Emerging Technologies (web services, grid computing, and semantic web) to tune up Innovation Pipeline with better linkages. • FY05 - Collaborative Work Environment (including wiki) expands effective networking across intergovernmental communities • FY06-07 - Diverse Communities co-organizing the workshops • FY08 - Practicing alignment with “real” national scenarios and joint workshops with NITRD Subcommittee Coordinating Groups (Networking for Multiplicative Returns) • Building shared understanding of fundamental concepts needed for communities representing diverse forms of expertise, to work together to leverage toward improved citizen service delivery at lower cost.

  14. Going Forward: From Stovepipes to Wind-Chimes "Frontier Outpost" to open up quality conversations, augmented by “light-weight” tools, to leverage collaborative capacity of united, but diverse sectors of society, seeking to discover, frame, and act on national potentials. • 73 workshops since March, 2001 • 60-80 participants per workshop, many Communities of Practice • Wiki, shared files, discussion forum, chat room, shared screen display • FY06: 1.1 million visits to site, 3.88 million file downloads, FY07: 1.7 million visits to site, 5.62 million file downloads • FY08 Alignment: Networking for Multiplicative Returns • Building shared understanding of fundamental concepts needed for communities representing diverse forms of expertise, to work together to leverage toward improved citizen service delivery at lower cost.

  15. Common Workshop Questions • How can multiple Communities of Practice (CoP) organize around common mission needs to build shared understanding in a manner that encourages creativity, trust, agility, and greater value from assets? • How can shared understanding around urgent cross-boundary scenarios be accelerated and what is the role of collaborative prototyping? • How can maturing, light-weight (Web 2.0) tools support governance and transformational potential of inter-organizational communities and their host institutions? Workshop Questions 2008

  16. Today’s Workshop • 8:30am - Check-in and Coffee • 8:45am - Workshop Overview - Susan Turnbull, Robert Chadduck, Almadena Chtchelkanova, Richard Spivack • 9:00am - Welcome • Christopher Greer, Ph.D., Director, The National Coordination Office, Networking and Information Technology Research and Development, The Executive Office of the President • 9:15am – Introductions: Attendee self introductions, including brief statements of interests and questions in light of perspectives and practical operational challenges to optimize throughput in processing ultra-large scale data collections - Robert Chadduck, NARA, Moderator

  17. Today’s Workshop • 10:00am – Panel One: Data Management Approaches Contributing to Optimize I/O in Full Data Path Processing, Robert Chadduck, NARA, Moderator • Dr. Michael Folk, Ph.D., Director, The HDF Group (“THG”), HDF5 Experiences with I/O Bottlenecks • Louis Reich, NASA Goddard Space Flight Center/ Computer Sciences Corporation, Research Findings Concerning the Utility and Scalability of the XFDU and Related Technologies in the Packaging and Validation of Very Large Digital Information Products • Dr. David Du, Ph.D., Program Director, Directorate for Computer and Information Science and Engineering, Division of Computer and Network Systems, The National Science Foundation, Long term End-to-End Security, Privacy, and Provenance • 11:30am – Lunch

  18. Today’s Workshop • 12:30pm – Panel Two: Mass Storage Systems & Technologies Interests and Research • Dr. Reagan Moore, Ph.D., Director, Data Intensive Cyberinfrastructure Environments Groups, The University of California, San Diego, Moderator and Managing Massive Data Collections • Michelle Butler, Technical Program Manager, Storage Enabling Technologies Group, The National Center for Supercomputer Applications (“NCSA”) • Dr. Ethan Miller, Ph.D., Associate Professor, The Department of Computer Science, The University of California, Santa Cruz (UCSC), Search and Indexing for Petabyte-scale Storage and Beyond • Paul Nowoczynski, Advanced Data Management Specialist, The Pittsburgh Supercomputer Center

  19. Today’s Workshop • 2:00pm – Panel Three: File Systems and I/O Interests and Research • Dr. Gary Grider, Ph.D., Los Alamos National Laboratory, Moderator and Highlights of File Systems and I/O Research and Implications for Information Lifecycle Management (ILM) Challenges • Dr. Garth Gibson, Ph.D., Carnegie-Mellon University, Data Reliability, Failure, Failure/ Operational Data Release • Dr. Henry Newman, Ph.D., Instrumental, Inc., Emerging Role of Standards in Information Lifecycle Management (ILM) • Bob Rogers, Chief Technology Officer, Application Matrix, LLC and ISM SNIA working group, Challenges and Opportunities of Information Lifecycle Management (ILM)

  20. Today’s Workshop • 3:30pm – Open Discussion: Opportunities for Synergy in Next Steps, Including Potential Commonalities in Technologies or Approaches in Response to “Hard Problems” to Optimize Full Data Path Throughput • 4:00pm – Poster Sessions Led by University Researchers: HEC-URA FSIO research program, mass storage technologies, data management approaches, etc. • Greg Ganger, The Carnegie Mellon University • Matthew Wolf, The Georgia Institute of Technology • Phil Carns, Argonne National Laboratory • Walt Ligon, Clemson University • Pete Wyckoff, The Ohio Supercomputer Center • Alok Choudhary, Northwestern University • Remzi Arpaci-Dusseau, University of Wisconsin • Xian-He Sun, The Illinois Institute of Technology

  21. Today’s Workshop • Tzi-cker Chiueh, The State University of New York at Stony Brook • Xiaodong Zhang, The Ohio State University • Mahmut Kandemir, Pennsylvania State University • Scott Brandt, The University of California, Santa Cruz • Paul Nowoczynski & Jared Yanovich; The Pittsburgh Supercomputer Center; Results and Developments in Scalable Lightweight Storage Hierarchies • Yong Chen; Department of Computer Science, The Illinois Institute of Technology; Server-Push Architecture for Improving I/O Access Performance • Nawab Ali, Department of Computer Science, The Ohio State University; Redesigning Parallel File Systems Using Object-based Storage Devices • Xin Li, University of Rochester, Reference-Driven Performance Anomaly Identification • 5:30pm or 6:00pm – Adjourn

  22. Invitation to Upcoming Expedition Workshops • July 15 - Peer Review and Scientific Knowledge Validation • August 19 - Broad Public Participation/ Green IT • Sept. 16 - Science of Science and Innovation Policy (being developed with NITRD CGs) Contact Susan.Turnbull at gsa.gov • 202-501-6214 • Questions?? • Discussion
