
Report on the INFN-GRID Globus evaluation



  1. Report on the INFN-GRID Globus evaluation Massimo Sgaravatto INFN Padova for the INFN Globus group globus@infn.it

  2. Why Globus? • Some basic services (security, information services, resource management, …) must be deployed in order to implement and use a Grid for real applications • Globus identified as a possible Grid framework providing these services • WP “Installation and Evaluation of the Globus Toolkit” of the INFN-GRID Project • Evaluation of the Globus toolkit (effectiveness, completeness, robustness, ease of use, …) • Provide feedback to the Globus team • Bringing attention to existing problems and requirements • Providing fixes to some problems

  3. Globus activities within INFN • Activities driven by the following work plan • Evaluation of Globus security services • Evaluation of Grid Information Service • Evaluation of Globus services for resource management • Evaluation of Globus tools for data management • Evaluation of Globus HBM for fault monitoring • Evaluation of Globus GEM for execution environment management • Globus deployment and installation tools • Not only a simple evaluation • Some existing shortcomings addressed • Specific configurations and customizations implemented • INFN-GRID Globus evaluation activities performed between June 2000 and January 2001 • “Official” Globus 1.1.3 (1.1.4 for MPICH-G2) release tested

  4. Globus security services • The Globus GSI security model seems to satisfy the current security requirements of the HENP community • One-time login mechanism • Use of X.509 certificates • Possibility of extending relations of trust to multiple CAs without interfering with their X.500 naming schemes • Some shortcomings • Need for limited (by scope or purpose) proxies • Globus team is already addressing this problem • Memory leaks in the GAA library • Fixed: patches provided by INFN • Cryptic diagnostics • Now partially solved with newer code • Interface between GSI and AFS • Already addressed with gsiklog • No tools for group management • Addressed with the new CAS service

  5. INFN customizations on security • INFN-CA • CRL distribution • Centralized management of the grid-mapfile • Goal: make it easier for groups of hosts with a common purpose to share the same access policies (represented by the grid-mapfiles) • Proposed system • Central repository (LDAP server) to store user certificates (subjects) and to define groups of users • Certificates published by the CA manager • Group manager responsible for editing group memberships (using an LDAP client) • Resource owners (Globus administrators) periodically (e.g. via a cron job) “connect” to this repository, “download” the subjects of the certificates that meet a specified criterion (e.g. all users of group X), and produce grid-mapfile entries
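
The last step of the proposed system (turning the downloaded certificate subjects into grid-mapfile entries) can be sketched as follows. This is a minimal illustration, not the INFN tool: the LDAP query is replaced by an in-memory list, the `CertEntry` type, the example subjects, the group names and the policy of mapping every group member to one shared local account are all assumptions made for the example. Only the entry format (a quoted subject followed by a local account name) follows the usual grid-mapfile convention.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class CertEntry:
    """One record as it might come back from the central repository (hypothetical shape)."""
    subject: str             # certificate subject DN
    groups: FrozenSet[str]   # groups the group manager has assigned


def grid_mapfile_lines(entries: Iterable[CertEntry], group: str,
                       local_account: str) -> List[str]:
    """Build grid-mapfile lines for all users that belong to `group`.

    Assumption: every member of the group is mapped to the same local account.
    """
    selected = sorted(e.subject for e in entries if group in e.groups)
    # The subject is quoted because DNs normally contain spaces.
    return [f'"{subject}" {local_account}' for subject in selected]


# Hypothetical repository content standing in for the LDAP query result.
repository = [
    CertEntry("/C=IT/O=INFN/L=Padova/CN=Example User One", frozenset({"groupX"})),
    CertEntry("/C=IT/O=INFN/L=Milano/CN=Example User Two", frozenset({"groupY"})),
    CertEntry("/C=IT/O=INFN/L=Milano/CN=Example User Three", frozenset({"groupX", "groupY"})),
]

if __name__ == "__main__":
    for line in grid_mapfile_lines(repository, "groupX", "gridx"):
        print(line)
```

Run periodically, a script of this kind would regenerate the file from the repository, so that a change made by the group manager reaches every host that applies the same criterion.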

  6. Globus Information Services • INFN implemented a hierarchical structure of GIS based on geographical entities • Site GIISs • Local GRISs registered at the site GIIS • Root GIIS where the site GIISs are registered

  7. INFN GIS Topology [Diagram: top-level INFN GIIS (dc=infn, dc=it, o=grid) • Milano site GIIS (dc=mi, dc=infn, dc=it, o=grid) • Padova site GIIS (dc=pd, dc=infn, dc=it, o=grid) • GRIS below the site GIISs]

  8. [Diagram: root GIIS above several GIISs, each with a number of GRISs] • Root GIIS: a global view • 1st level query: focus on a set of resources (scheduling / resource discovery) • High availability: ldbm backend (?), GIIS replication (?) • 2nd and 3rd level query (GIIS, GRIS): get more up-to-date info

  9. Globus Information Services • Problems • Performance • When querying the root GIIS server, in the worst case the whole namespace must be searched • The overall response time is bounded by the slowest response of a descendant • Poor GRIS performance (shell backend) • Example (querying a site GIIS): • ~1 s when the cache is valid • ~5-10 s when the cache has expired and GIIS and GRIS are not busy • > 1 min when the cache has expired and the GRIS is busy
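
The effect of the slowest descendant can be made concrete with a small model. This is only a sketch under stated assumptions: the `Node` type, the latencies and the tree are hypothetical, and the model assumes that a server with a valid cache answers on its own, while a server with an expired cache forwards the query to all its children in parallel and waits for the slowest one.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A GIIS or GRIS in the hierarchy (hypothetical model)."""
    name: str
    latency: float                 # seconds this server needs for its own work
    cache_valid: bool = False      # True if it can answer without asking children
    children: List["Node"] = field(default_factory=list)


def response_time(node: Node) -> float:
    """Time to answer a query arriving at `node`.

    Assumes children are queried in parallel, so the slowest child dominates.
    """
    if node.cache_valid or not node.children:
        return node.latency
    return node.latency + max(response_time(child) for child in node.children)


# Illustrative numbers only; they are not measurements.
padova = Node("site GIIS Padova", 1.0, cache_valid=False, children=[
    Node("GRIS pd-1", 5.0),
    Node("GRIS pd-2 (busy)", 60.0),
])
milano = Node("site GIIS Milano", 1.0, cache_valid=True, children=[
    Node("GRIS mi-1", 5.0),
])
root = Node("root GIIS", 0.5, cache_valid=False, children=[padova, milano])

if __name__ == "__main__":
    print(f"root query takes {response_time(root):.1f} s")
```

In this model a single busy GRIS behind an expired cache sets the response time of the whole query at the root, which is why timeouts and a faster GRIS backend matter.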

  10. Globus Information Services • Other problems • Pull model • Mixed push/pull model more suitable • Security and access controls • Any GRIS can register itself to a GIIS • No access control when searching the GIS • Fault tolerance • No automatic failover mechanisms

  11. Globus Information Services • Most of these problems have been or are being addressed in the new MDS development • Improved GRIS performance • Improved GIIS performance (e.g. support for GIIS timeouts) • Integration of GSI security and access control • Support for customized indexes • Support for pluggable information providers • Support for both registration and invitation • …

  12. Globus Information Services • Other INFN customisations • INFN-GIS browser • Tools (MRTG based) to monitor LDAP servers • Entries returned • Connections

  13. INFN-GIS browser

  14. Resource Management • Evaluation of Globus GRAM • Focus on possible use of GRAM as a uniform interface to different underlying local resource management systems (LRMS) • Tests with Condor, LSF and PBS as LRMS • INFN WAN Condor pool as Globus resource • The model is fine, but it lacks the “robustness” needed for real production environments • Memory leaks in the Globus job manager • Fixed: patches provided by our group were fed back to Globus • Scalability (one job manager for each job) • Reliability (the job manager is not persistent) • Addressed with the new jobmanager (by the Condor team) • New resource management architecture foreseen with GRAM-2

  15. Resource Management • The default GRAM Reporter (information providers) is not sufficient for our needs (in particular for PC farms): • Many attributes are useless (at least for our needs), some are not calculated (always defined as 0), some are not properly calculated, and important information (e.g. needed by a resource broker) is missing • We are addressing this problem in the context of the DataGrid Project • Submission of Condor jobs to Globus resources • Condor-G • Useful as a reliable job submission service • Persistent queue of jobs • Logging information • Exploitation of the new persistent Globus jobmanager • Reliable (two-phase commit) submission protocol • GlideIn • Evaluation of MPICH-G2 vs. MPICH • Some shortcomings found (lack of support for shared memory, worse latency for small messages compared with MPICH)

  16. Data management • Tests with GASS • Tests with GridFTP alpha release 2 • Capability of resuming an interrupted file transfer successfully tested • Support for the GSI authentication mechanisms successfully tested • Throughput tests • Increasing number of parallel streams and fixed file size • Increasing file size and fixed number of streams • Increasing TCP buffer size • Increasing block size
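
Why the number of parallel streams and the TCP buffer size matter in such throughput tests can be illustrated with the standard window-limited bound: a single TCP stream cannot move more than one buffer (window) of data per round trip, and the buffer needed to fill a link is the bandwidth-delay product. The sketch below is a back-of-the-envelope model, not part of the INFN tests; the function names, the link speed and the round-trip time are assumptions chosen for illustration.

```python
def bandwidth_delay_product_bytes(bandwidth_bits_per_s: float, rtt_s: float) -> float:
    """Buffer size (bytes) needed for one stream to keep the link full."""
    return bandwidth_bits_per_s * rtt_s / 8


def max_throughput_bits_per_s(buffer_bytes: float, rtt_s: float,
                              streams: int, link_bits_per_s: float) -> float:
    """Upper bound on aggregate throughput for `streams` parallel TCP streams.

    Each stream is limited to one buffer per round trip; the total is
    additionally capped by the link capacity. Losses and congestion control
    are ignored in this simplified model.
    """
    window_limited = streams * buffer_bytes * 8 / rtt_s
    return min(window_limited, link_bits_per_s)


if __name__ == "__main__":
    link = 100e6   # hypothetical 100 Mbit/s path
    rtt = 0.02     # hypothetical 20 ms round-trip time
    print("BDP (bytes):", bandwidth_delay_product_bytes(link, rtt))
    for n in (1, 2, 4):
        bound = max_throughput_bits_per_s(64 * 1024, rtt, n, link)
        print(f"{n} stream(s), 64 KiB buffers: at most {bound / 1e6:.1f} Mbit/s")
```

Under these assumed numbers a single stream with a 64 KiB buffer is far from filling the link, and either a larger buffer or more parallel streams raises the bound until the link capacity is reached.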

  17. Other services • Fault Monitoring (HBM) • Evaluation of HBM for fault detection (for “system” and “user” processes) • … but the HBM package is not seeing active development • Execution Environment Management (GEM) • Evaluation of GEM as service for “code migration” • … but Globus now provides only limited capabilities (executable staging)

  18. Globus installation tools • INFN-GRID Globus installation toolkit • To make the installation of the Globus toolkit easier and more “automatic” • To shorten the installation time (very long with the standard install procedures) • Support for specific customisations and configurations • Quick distribution of patches • Support for distribution of new tools and packages • Proven to be successful • Used to set up an INFN-GRID testbed and also outside INFN (CERN, FNAL, …) • Used as installation tool for DataGrid Testbed 0

  19. INFN-GRID Installation toolkit • Characteristics • Distribution of binary files • Distribution of the packages needed to install/use Globus • Distribution of various Globus build flavours (Kerberos, MPICH, AFS) • Support for the most used platforms in the HENP community (Linux RH, Solaris) • Binary file relocation supported • Latest patches included (e.g. fixes for Globus jobmanager memory leaks) • Support for local customisations (hook to support different CAs, support for different GIS configurations, support for different LRMS, …) • Support for distribution of new tools and packages (certretrieve, GDMP, …) • Upgrade and uninstall procedures • Documentation

  20. New Globus packaging • Modular packages for individual components • More open development process • Possibility to build and install only desired packages • Simpler customization • Contributions from INFN included

  21. Conclusions • The Globus toolkit can provide basic services useful to create and deploy usable Grids, but various shortcomings and issues must be addressed • Globus developers have already addressed, or are addressing, most of them • Other info • Report on the INFN-GRID Globus Evaluation • http://www.infn.it/globus/Docs/infn-globus-evaluation.pdf • Response from Globus team to “Report on the INFN-GRID Globus Evaluation” • http://www.isi.edu/~annc/infn/responsetoinfn.pdf • http://www.infn.it/globus
