
Building an Information System for a Distributed Testbed


Presentation Transcript


  1. Building an Information System for a Distributed Testbed
     Warren Smith, Texas Advanced Computing Center
     Shava Smallen, San Diego Supercomputer Center

  2. FutureGrid Goals
     • Provide a high-quality distributed testbed
       • High availability, easy to use, quality documentation, knowledgeable support
     • Support a variety of experiments
       • Cloud, grid, high-performance computing, data intensive
       • Computer and computational science
     • Allow rigorous experiments
       • Data gathering
     • Support education and training

  3. FutureGrid Overview
     • Funded by NSF through September 2014
     • PI: Geoffrey Fox of Indiana University
     • Diverse and distributed set of resources
       • Compute, storage, network
       • Connected by high-performance networks
     • Variety of software environments
       • OpenStack, Nimbus, Eucalyptus
       • Torque/Moab, MPI, OpenMP
       • Hadoop
     • Software to support experiments
       • Pre-configured virtual machine images
       • Performance measurement tools
       • Experiment execution tools

  4. FutureGrid Deployment
     [Diagram: FutureGrid sites connected by a dedicated FutureGrid network and the XSEDE network. NID: Network Impairment Device]

  5. FutureGrid Hardware

  6. Motivation
     • Measure a variety of information useful to users
       • Resource configuration and load
       • Software and service descriptions
       • Resource and service status
       • Resource usage
       • Detailed performance monitoring
     • Provide this information to users in a consistent way

  7. Approach
     • Use existing monitoring tools
       • Many good ones to choose from
     • Integrate the information they provide
       • Common publishing mechanism: publish/subscribe messaging
       • Common storage mechanism: SQL database
       • Common representation language: JavaScript Object Notation (JSON)
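
To make the publishing mechanism concrete, a producer in this scheme can serialize a monitoring record as JSON and publish it to a RabbitMQ topic exchange. The sketch below uses the Python pika client; the exchange name, routing key, and record fields are illustrative assumptions, not FutureGrid's actual configuration.

    # Minimal publishing sketch: one JSON monitoring record to RabbitMQ.
    # Exchange, routing key, and record fields are hypothetical.
    import json
    import pika

    record = {
        "resource": "sierra.example.org",   # hypothetical resource name
        "metric": "load_one",
        "value": 0.42,
        "timestamp": "2013-06-01T12:00:00Z",
    }

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange="monitoring", exchange_type="topic")
    channel.basic_publish(exchange="monitoring",
                          routing_key="status.sierra",
                          body=json.dumps(record))
    connection.close()

Any consumer that understands JSON can then subscribe to the routing keys it cares about without knowing which tool produced the record.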

  8. Monitoring Tools
     • Inca: periodic user-level tests of software and services
     • Information Publishing Framework: static and dynamic information from cluster schedulers and clouds, represented as GLUE v2
     • perfSONAR: all-to-all iperf bandwidth measurements
     • SNAPP: SNMP network data
     • Ganglia: detailed node data
     • NetLogger: lets users instrument their own software and services
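
As an example of the common representation, a scheduler snapshot from the Information Publishing Framework might be rendered as a JSON document whose attribute names follow GLUE v2's ComputingShare entity. The field values below are made up for illustration, not taken from the deployment:

    {
      "ComputingShare": {
        "Name": "batch",
        "TotalJobs": 120,
        "RunningJobs": 96,
        "WaitingJobs": 24,
        "FreeSlots": 64
      }
    }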

  9. Architecture
     [Diagram: monitoring tools (Inca, Ganglia, Information Publishing Framework, NetLogger, perfSONAR, SNAPP) run on the resources and administrative servers; an extract/translate/publish layer converts their tool-specific output to JSON and publishes it over AMQP to RabbitMQ; an information server stores the records in PostgreSQL; consumers (user portal, Phantom, experiment recorder, user tools) read the data via AMQP & JSON or PostgreSQL & JSON]
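
The storage side of this architecture can be sketched as a small consumer that subscribes to the message stream and inserts each JSON record into PostgreSQL. The queue binding, table name, and schema below are assumptions for illustration, not the deployed schema.

    # Consumer sketch: RabbitMQ -> PostgreSQL, using pika and psycopg2.
    import json
    import pika
    import psycopg2

    db = psycopg2.connect("dbname=monitoring")  # hypothetical database
    cur = db.cursor()

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange="monitoring", exchange_type="topic")
    queue = channel.queue_declare(queue="", exclusive=True).method.queue
    channel.queue_bind(exchange="monitoring", queue=queue, routing_key="status.#")

    def on_message(ch, method, properties, body):
        record = json.loads(body)
        # Store the raw JSON document alongside an extracted key field.
        cur.execute("INSERT INTO records (resource, doc) VALUES (%s, %s)",
                    (record.get("resource"), json.dumps(record)))
        db.commit()

    channel.basic_consume(queue=queue, on_message_callback=on_message,
                          auto_ack=True)
    channel.start_consuming()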

  10. Performance Experiments
     • Ensure the design will meet performance needs before deploying it
     • See if the design could meet the needs of other projects
     • Emulation environment:
       [Diagram: producers and consumers on the Sierra cluster (San Diego), behind 1 Gb/s cluster switches, communicate over a 10 Gb/s link with the messaging service and database running on a virtual machine cluster in a server farm at Indiana University]
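
A producer for this kind of throughput emulation can be as simple as the sketch below, which publishes a fixed number of fixed-size JSON messages and reports the achieved rate. The message count and payload size are arbitrary choices here, not the parameters used in the experiments.

    # Throughput probe sketch: time N publishes of a padded JSON payload.
    import json
    import time
    import pika

    N = 100000
    payload = json.dumps({"metric": "load_one", "value": 0.42, "pad": "x" * 256})

    connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = connection.channel()
    channel.exchange_declare(exchange="emulation", exchange_type="topic")

    start = time.time()
    for _ in range(N):
        channel.basic_publish(exchange="emulation", routing_key="producer.0",
                              body=payload)
    elapsed = time.time() - start
    print("%d messages in %.1f s (%.0f msg/s)" % (N, elapsed, N / elapsed))
    connection.close()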

  11. N-to-N Throughput

  12. Infrastructure Emulation
     • Emulate FutureGrid and other infrastructures
     • Characterize the producers and consumers of information

  13. Emulation Configurations

  14. Messaging Performance

  15. Database Performance

  16. Excess Capacity for User Data
     [Chart: remaining messaging capacity under emulated FutureGrid, OSG, and XSEDE workloads, each also at double (x2) scale]

  17. Current Status
     • Majority deployed on FutureGrid
       • SNMP and iperf are the exception
     • Used behind the scenes
       • Not yet documented and supported for users

  18. Conclusions
     • There is a large amount of information of interest to testbed users
       • Generated by a variety of tools
     • Providing information in a common way makes it easier to use
       • Common mechanisms
       • Common representation language
     • Current technologies have sufficient performance
       • RabbitMQ pub/sub messaging
       • PostgreSQL relational database
     • Excess pub/sub capacity can be made available to users

  19. Future Work
     • FutureGrid is wrapping up
       • Complete the system and make it user-visible for feedback
     • XSEDE information services
       • Deployed RabbitMQ
       • Inca/JSON publishing to RabbitMQ
       • Information Publishing Framework & GLUE v2
