GridFTP

GridFTP. Steve Tuecke Argonne National Laboratory. Overview. Motivation for GridFTP Working Group Requirements GridFTP Solution GridFTP Working Group Documents Role of GridFTP Working Group. GridFTP Working Group Motivation.



  1. GridFTP Steve Tuecke Argonne National Laboratory

  2. Overview • Motivation for GridFTP Working Group • Requirements • GridFTP Solution • GridFTP Working Group Documents • Role of GridFTP Working Group

  3. GridFTP Working Group Motivation • Data transfer solutions have been developed by the Globus Project over the past ~5 years; GridFTP is the 3rd generation • Grid Forum started ~1 year ago to promote and develop Grid technologies • Critical mass of people working in this area • Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP • Community effort to move GridFTP forward

  4. Some Important Definitions • Resource • Network protocol • Network enabled service • Application Programming Interface (API) • Syntax • Software Development Kit (SDK)

  5. Resource • Entity that is to be shared • Includes computers, storage, data, software • Does not have to be physical entity • Condor pool, distributed file system, … • Defined in terms of interfaces, not devices • E.g. LSF defines compute resource • Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS

  6. Network Protocol • A formal description of message formats and a set of rules for message exchange • Rules may define sequence of message exchanges • Protocol may define state-change in endpoint, e.g. file system state change • Good protocols are designed to do one thing • Protocols can be layered • Examples of protocols • IP, TCP, TLS, FTP, HTTP, Kerberos

  7. Network Enabled Services • Implementation of a protocol that defines a set of capabilities • Protocol defines interaction with service • All services require protocols • Not all protocols are used to provide services (e.g. IP, TLS) • Examples: FTP and Web servers • [Figure: two protocol stacks. FTP Server: FTP Protocol over Telnet Protocol over TCP Protocol over IP Protocol. Web Server: HTTP Protocol over TLS Protocol over TCP Protocol over IP Protocol.]

  8. API (Application Programming Interface) • A specification for a set of routines to facilitate application development • Refers to definition, not implementation, e.g. there are many implementations of MPI • Spec often language-specific (or IDL) • Routine name, number, order and type of arguments; mapping to language constructs • Behavior or function of routine • Examples • GSS-API, MPI

  9. Syntax • A specification for how a defined set of information is encoded into bits • A syntax may be defined as part of a protocol or API • Protocol messages have defined syntax • A syntax may be used as API function argument • But syntax can also stand alone • Good syntax designed to do one thing • Syntaxes can be layered • Examples • XML, ASN.1, X.509, LDIF

  10. SDK (Software Development Kit) • A particular instantiation of an API • SDK consists of libraries and tools • Provides implementation of API specification • Can have multiple SDKs for an API • Examples of SDKs • MPICH, Motif Widgets

  11. Multiple APIs but a Single Protocol. Example: TCP/IP • Multiple APIs: BSD sockets, Winsock, System V streams, … • Different programs use different APIs • Interoperability: programs using different APIs can exchange information • [Figure: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicating over the TCP/IP protocol (reliable byte streams).]

  12. Single API, but Multiple Protocols. E.g., GSS-API • GSS-API provides portability: any correct program compiles & runs on a platform • Does not provide interoperability: all processes must link against same SDK • E.g., GSI and Kerberos versions of GSS-API • [Figure: two applications, both written to GSS-API. One links the GSI SDK and speaks the GSI protocol over TCP/IP; the other links the Kerberos SDK and speaks the Kerberos protocol over TCP/IP. The two protocols use different message formats, exchange sequences, etc.]

  13. I.e., Standard APIs and Protocols are Both Important: For Different Reasons • Standard APIs/SDKs are important • They enable application portability • But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?) • Standard protocols are important • Enable cross-site interoperability • Enable shared infrastructure • But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)

  14. Grid Data Needs • Transfer of large amounts of data (terabytes or petabytes) between storage systems • Access to large amounts of data (gigabytes or terabytes) by many geographically distributed applications and users for analysis, visualization, etc.

  15. Requirements • Grid Security Infrastructure (GSI) and Kerberos support • Third-party control of data transfer • Parallel data transfer • Striped data transfer • Partial file transfer • Automatic negotiation of TCP buffer/window size • Support for reliable/recoverable data transfer
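
The parallel and partial transfer requirements above both come down to moving byte ranges of a file independently. A minimal sketch of how a client might divide a file among parallel data streams follows; the function name `split_ranges` and the even-split policy are assumptions made for illustration, not something the slides or the GridFTP drafts prescribe.

```python
from typing import List, Tuple


def split_ranges(file_size: int, streams: int) -> List[Tuple[int, int]]:
    """Split [0, file_size) into at most `streams` contiguous (offset, length) blocks.

    Hypothetical helper: block sizes differ by at most one byte, and
    empty blocks are omitted when the file is smaller than the stream count.
    """
    if file_size < 0 or streams < 1:
        raise ValueError("file_size must be >= 0 and streams >= 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` blocks absorb the remainder, one byte each.
        length = base + (1 if i < extra else 0)
        if length == 0:
            break
        ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in split_ranges(10, 3):
        print(f"stream reads {length} bytes starting at offset {offset}")
```

Each (offset, length) pair is also a valid partial file request on its own, which is why the two requirements fit together naturally.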

  16. Candidate Standards • FTP • Defined by a set of IETF RFCs • No partial file, parallel/striped, GSI, etc. • Separate control & data channels • WebDAV • New extension to HTTP • No third party transfer, parallel/striped, etc. • Combined control & data channel

  17. Separate Control & Data Channels • WebDAV combines control and data over single channel • FTP splits control and data • Supports multiple, user selectable data channel protocols • Advantage to split channels • Third party transfers handled cleanly • Can (cleanly) define new data channel protocols • E.g. parallel/striped transfer, automatic TCP buffer/window negotiation • Amenable to high-performance proxies • E.g. For firewalls, load balancing, etc.
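
The claim that split channels handle third-party transfers cleanly can be illustrated with standard FTP commands: the client asks one server to listen (PASV), reads the address from its 227 reply, and hands that address to the other server (PORT), so data flows server to server while the client holds only the two control channels. The sketch below covers just the address translation step. The function names are hypothetical, and it assumes the common parenthesized form of the 227 reply.

```python
import re
from typing import Tuple


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (host, port) from a reply such as
    '227 Entering Passive Mode (h1,h2,h3,h4,p1,p2)'."""
    if not reply.startswith("227"):
        raise ValueError("not a 227 reply: %r" % reply)
    match = re.search(r"\((\d{1,3}(?:,\d{1,3}){5})\)", reply)
    if match is None:
        raise ValueError("no address found in reply: %r" % reply)
    nums = [int(n) for n in match.group(1).split(",")]
    if any(n > 255 for n in nums):
        raise ValueError("address field out of range: %r" % reply)
    host = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]  # port is sent as two bytes, high byte first
    return host, port


def build_port_command(host: str, port: int) -> str:
    """Build the PORT command that points a server at (host, port)."""
    octets = host.split(".")
    if len(octets) != 4 or not 0 <= port <= 65535:
        raise ValueError("need a dotted IPv4 address and a 16-bit port")
    return "PORT " + ",".join(octets + [str(port >> 8), str(port & 0xFF)])


if __name__ == "__main__":
    # Reply received on the control channel to the passive-side server.
    host, port = parse_pasv_reply("227 Entering Passive Mode (192,168,1,10,19,137)")
    # Command sent on the control channel to the active-side server.
    print(build_port_command(host, port))
```

After this exchange the client would typically issue STOR to one server and RETR to the other; no file data passes through the client.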

  18. GridFTP Solution • Built on existing FTP standards • RFC 959: File Transfer Protocol • RFC 2228: FTP Security Extensions • RFC 2389: Feature Negotiation for the File Transfer Protocol • Draft: FTP Extensions • Extends standards with • Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning
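
Why TCP buffer/window size tuning matters can be seen from the bandwidth-delay product, a common rule of thumb for how much data must be in flight to keep a path full. The sketch below is only that rule of thumb, not the negotiation method of the GridFTP drafts; the function names and the even division across parallel streams are assumptions.

```python
def bdp_bytes(bandwidth_bits_per_s: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up.

    bits/s * ms / 1000 gives bits in flight; dividing by 8 gives bytes.
    Integer arithmetic avoids floating-point rounding.
    """
    if bandwidth_bits_per_s < 0 or rtt_ms < 0:
        raise ValueError("bandwidth and RTT must be non-negative")
    return -(-(bandwidth_bits_per_s * rtt_ms) // 8000)  # ceiling division


def per_stream_buffer(bandwidth_bits_per_s: int, rtt_ms: int, streams: int) -> int:
    """Hypothetical policy: share the total buffer evenly among parallel streams."""
    if streams < 1:
        raise ValueError("streams must be >= 1")
    total = bdp_bytes(bandwidth_bits_per_s, rtt_ms)
    return -(-total // streams)


if __name__ == "__main__":
    # Illustrative path: 1 Gbit/s with a 50 ms round-trip time.
    print(bdp_bytes(1_000_000_000, 50))
    print(per_stream_buffer(1_000_000_000, 50, 4))
```

On such a path a small default buffer (tens of kilobytes) would leave most of the link idle, which is the motivation for tuning the buffer or opening several streams.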

  19. GridFTP Implementation Status • Modified wu-ftpd server • Most features • Modified ncftp client • Security, TCP buffer setting • Modified HPSS & Unitree ftpd server • Security • Globus Toolkit client and server SDKs, and command line tools • Most features • Striped FTP server (aka DPSS2)

  20. GridFTP Working Group Documents • GridFTP: A Data Transfer Protocol for the Grid • Overview of working group activities and documents • Requirements • Informational draft • GridFTP: FTP Extensions for the Grid • Protocol specification

  21. GridFTP Protocol Specifications • Existing standards • RFC 959: File Transfer Protocol • RFC 2228: FTP Security Extensions • RFC 2389: Feature Negotiation for the File Transfer Protocol • Draft: FTP Extensions • New drafts • GridFTP: FTP Extensions for the Grid
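
Feature negotiation (RFC 2389) is what lets a client discover which extensions a server supports before using them. A minimal sketch of parsing a FEAT reply follows. The function name is hypothetical, and the feature names in the example are ordinary FTP extensions chosen for illustration, not the features a GridFTP server would advertise.

```python
from typing import Dict


def parse_feat_reply(reply: str) -> Dict[str, str]:
    """Map each advertised feature name to its parameter string.

    A multi-line reply starts with '211-', ends with '211 ', and lists
    one feature per line, each line beginning with a single space.
    A single-line '211 ...' reply means no features are advertised.
    """
    lines = reply.splitlines()
    if not lines or not lines[0].startswith("211"):
        raise ValueError("not a 211 reply")
    if lines[0].startswith("211 "):
        return {}
    if not lines[-1].startswith("211 "):
        raise ValueError("multi-line reply is not terminated by '211 '")
    features = {}
    for line in lines[1:-1]:
        if not line.startswith(" "):
            continue  # skip lines that are not feature lines
        name, _, params = line.strip().partition(" ")
        features[name.upper()] = params
    return features


if __name__ == "__main__":
    reply = "211-Features:\r\n MDTM\r\n REST STREAM\r\n SIZE\r\n211 End\r\n"
    supported = parse_feat_reply(reply)
    print("REST" in supported, supported.get("REST"))
```

A client would check the returned mapping before issuing an extended command, falling back to plain FTP behavior when a feature is absent.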

  22. GridFTP APIs • Should there be standard API(s)? • POSIX I/O • SRB client • grid_storage • globus_ftp_client • MPI-IO • HDF5 • etc. • Beyond scope of this working group • Common protocol beneath these APIs would allow interoperability

  23. Role of GridFTP Working Group • Bring together those who are interested in the future of GridFTP to help foster the… • continued specification and standardization of GridFTP • development of inter-operable GridFTP implementations • widespread adoption of GridFTP as a transfer protocol for the Grid • Develop drafts which together define GridFTP • May submit some of them to IETF • Move GridFTP forward to better address Grid data transfer requirements

  24. NOT Goals of GridFTP Working Group • This working group will not start from first principles • Starting point is roughly GridFTP as it now exists • FTP base is assumed • It's not design by committee • Seeking rough consensus, with broad input • Draft authors and WG chair have final say

  25. GF5 GridFTP Working Session • Is this appropriate for Grid Forum? • Who is interested in participating, and in what capacity? • Is the problem scoped appropriately (at least for now)? • What are the right drafts to write? • Establish rough timeline for drafts

  26. A Call To Arms • The Grid Forum data working group needs to do more than just gather 3 times a year to chat about data management. • But Grid Forum is only appropriate for this activity if people meaningfully participate. • I will be doing this regardless. • But it will hopefully be done better and faster with broad participation. • If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.
