Matchmaking: Distributed Resource Management for High Throughput Computing
Rajesh Raman, Miron Livny, Marvin Solomon
Presenter: Mahindra B. Pai (mpai@usc.edu)
Flow of Presentation
• Introduction: RMS (resource management system)
• Characteristics of HTC
• Resource Management in HTC
• Matchmaking framework & Requirements
• Class-ads
• Matching and Claiming
• Matchmaking in Condor
• Gang-matching
• Future Research
• Related Work
• Summary and References
Introduction
• Conventional RMS
  • System model
  • Centralized scheduler
• Obstacles to the conventional approach in distributed systems
  • Heterogeneity of resources
  • Distributed ownership of resources
• Alternative approach: HTC RMS (e.g., Condor)
  • Classified advertisements
  • Matching and claiming
Characteristics of HTC
• An HTC environment is constantly on the lookout for additional resources.
• The services of any provider of computing power are always accepted, regardless of the resource's characteristics, degree of availability, or duration of service.
• The main obstacle to inter-domain execution is access to the environment from which the application was submitted, such as its input and output.
• A list of tasks is executed by a group of workers (resources) under the supervision of a master.
• Losing workers (resources) prematurely is what HTC customers worry about most.
Resource Management in HTC
• Groups involved
  • Owners
  • System administrators
  • Application writers
  • Customers
• An RMS is successful when
  • It runs continuously and reliably
  • Owners and customers are satisfied with the delivered quality of service and reliability
  • System administrators and application writers can rely on the robustness and flexibility of the system
• Principal layers of an RMS
  • Local RM
  • Owner
  • System
  • Customer
  • Application
  • Application RM
The Matchmaking Framework
• Class-ad specification
  • Defines a language for expressing characteristics and constraints, and the semantics for evaluating these attributes
• Advertising protocol
  • Defines the format of a class-ad and the method of sending it to the matchmaker
• Matchmaking algorithm
  • Defines how the contents of class-ads and the state of the system determine the matches created
• Matchmaking protocol
  • Defines how matched entities are notified, and what information they are given after a successful match
• Claiming protocol
  • Defines the actions matched entities take to enable the discharge of service
Matchmaking Requirements
• Portability
  • Entities involved in matchmaking are not of a fixed type or architecture
• Self-describing
  • The matchmaker should function correctly independent of the descriptions it receives
  • Resources can be compute nodes, software licenses, storage space or network bandwidth
• Well-defined and robust semantics
  • Handle entities whose characteristics are incorrect or missing
• Decoupled protocols
  • Advertising, matchmaking and claiming must be decoupled
• Robustness and scalability
Class-ads
• Definition
  • A highly flexible and extensible data model that can be used to represent arbitrary services and constraints on their allocation.
• Aspects of class-ads
  • Semi-structured data model, with no specific schema.
  • Folds the query language into the data model.
  • A class-ad is a mapping from attribute names to expressions.
  • Avoids assumptions about the nature and characteristics of a resource.
  • Owners can place restrictions on customers.
  • Requests can place restrictions on offers.
Class-ad for a Workstation
[
  Type          = "Machine";
  Activity      = "Idle";
  DayTime       = 36107;      // current time in seconds since midnight
  KeyboardIdle  = 1432;       // seconds
  Disk          = 323496;     // kbytes
  Memory        = 64;         // megabytes
  State         = "Unclaimed";
  LoadAvg       = 0.042969;
  Mips          = 104;
  Arch          = "INTEL";
  OpSys         = "SOLARIS251";
  KFlops        = 21893;
  Name          = "leonardo.cs.wisc.edu";
  ResearchGroup = { "raman", "miron", "solomon", "jbasney" };
  Friends       = { "tannenba", "wright" };
  Untrusted     = { "rival", "riffraff" };
  Rank          = member(other.Owner, ResearchGroup) * 10 + member(other.Owner, Friends);
  Constraint    = !member(other.Owner, Untrusted) && Rank >= 10 ? True :
                  Rank > 0 ? LoadAvg < 0.3 && KeyboardIdle > 15*60 :
                  DayTime < 8*60*60 || DayTime > 18*60*60;
]
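To make the workstation's policy concrete, the following is a minimal Python sketch of its Rank and Constraint expressions. It is a transcription for illustration, not Condor's evaluator. It assumes C-like precedence, in which `&&` binds tighter than `?:`, and the function names `rank` and `constraint` are hypothetical.

```python
# Owner lists copied from the workstation class-ad.
RESEARCH_GROUP = {"raman", "miron", "solomon", "jbasney"}
FRIENDS = {"tannenba", "wright"}
UNTRUSTED = {"rival", "riffraff"}


def rank(owner: str) -> int:
    """Rank = member(Owner, ResearchGroup) * 10 + member(Owner, Friends)."""
    return (owner in RESEARCH_GROUP) * 10 + (owner in FRIENDS)


def constraint(owner: str, load_avg: float, keyboard_idle: int, day_time: int) -> bool:
    """Constraint expression, read with && binding tighter than ?: (assumption)."""
    r = rank(owner)
    if owner not in UNTRUSTED and r >= 10:
        return True  # trusted research-group member: always accepted
    if r > 0:
        # Friends: only when the machine is lightly loaded and idle for 15 minutes.
        return load_avg < 0.3 and keyboard_idle > 15 * 60
    # Everyone else: only before 08:00 or after 18:00.
    return day_time < 8 * 60 * 60 or day_time > 18 * 60 * 60


# Machine state taken from the ad: LoadAvg, KeyboardIdle, DayTime (about 10:01).
state = (0.042969, 1432, 36107)
for owner in ("raman", "wright", "stranger"):
    print(owner, rank(owner), constraint(owner, *state))
```

Under this reading, an owner in the Untrusted list has Rank 0 and falls through to the time-of-day branch. If the intent is to exclude such owners outright, the conditional would need parentheses after the `&&`.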
Matching and Claiming…(1)
[Figure: Provider and requestor entities (1) send advertisements to the matchmaker, which (2) runs the matching algorithm and (3) sends a match notification to each matched entity; the matched entities then (4) carry out claiming directly with each other.]
• Advertising protocol
• Matchmaking algorithm
• Matchmaking protocol
• Claiming protocol
Matching and Claiming…(2)
• Actions involved in matchmaking and claiming
• Entities construct class-ads and send them to the matchmaker.
  • Ads conform to the advertising protocol (e.g. Constraint indicates compatibility, Rank indicates desirability).
• Matchmaking algorithm
  • Expressions refer to attributes as self.attribute-name and other.attribute-name.
  • With multiple matches, the one with the highest Rank value is chosen.
  • A reference to a non-existent attribute evaluates to undefined.
• Matchmaking protocol notifies the two matched entities.
  • Send matching ads.
  • Send token of active entity.
• Claiming protocol establishes a working relationship with the provider.
  • The customer negotiates directly with the provider.
  • Identity can be verified.
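The matching rules above can be sketched in Python as follows. This is an illustrative simplification, not the Condor matchmaker: ads are plain dictionaries, Constraint and Rank are callables taking `(self_ad, other_ad)`, a missing attribute is modelled by catching `KeyError`, and an undefined Rank is assumed to count as 0. All function names and the second and third machine names are hypothetical.

```python
UNDEFINED = object()  # stand-in for the class-ad "undefined" value


def evaluate(ad, other, name):
    """Evaluate the expression attribute `name` of `ad` against `other`."""
    expr = ad.get(name, UNDEFINED)
    if expr is UNDEFINED:
        return UNDEFINED
    try:
        return expr(ad, other)
    except KeyError:
        # The expression referenced an attribute that does not exist.
        return UNDEFINED


def compatible(a, b):
    """Both Constraints must evaluate to exactly True; undefined never matches."""
    return evaluate(a, b, "Constraint") is True and evaluate(b, a, "Constraint") is True


def rank_of(ad, other):
    value = evaluate(ad, other, "Rank")
    return 0.0 if value is UNDEFINED else value  # assumption: undefined Rank counts as 0


def best_match(request, offers):
    """Return the compatible offer the request ranks highest, or None."""
    candidates = [o for o in offers if compatible(request, o)]
    if not candidates:
        return None
    return max(candidates, key=lambda o: rank_of(request, o))


def not_rival(me, other):
    return other["Owner"] != "rival"


machines = [
    {"Name": "leonardo.cs.wisc.edu", "Arch": "INTEL", "Memory": 64,
     "KFlops": 21893, "Constraint": not_rival},
    {"Name": "hypothetical-a", "Arch": "INTEL", "Memory": 128,
     "KFlops": 15000, "Constraint": not_rival},
    # No Memory attribute: the job's Constraint evaluates to undefined.
    {"Name": "hypothetical-b", "Arch": "INTEL",
     "KFlops": 99999, "Constraint": not_rival},
]

job = {
    "Owner": "raman",
    "ImageSize": 28,
    "Constraint": lambda me, other: other["Arch"] == "INTEL"
    and other["Memory"] >= me["ImageSize"],
    "Rank": lambda me, other: other["KFlops"] / 1e3 + other["Memory"] / 32,
}

match = best_match(job, machines)
print(match["Name"] if match else None)
```

The third machine has the largest KFlops value but is never considered, because the job's Constraint cannot be evaluated against it.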
Why Separate Matching and Claiming?
• Weak consistency requirements
  • A match may be made with a stale advertisement.
• Authentication
  • The claiming protocol can use cryptographic techniques for the provider and customer to convince each other of their identities.
• Bilateral specialization
  • Allocation models are supplied by the entities involved in providing and using services.
• End-to-end verification
  • The matchmaker does not need to retain any state about the match.
Matchmaking in Condor…(1)
• Class-ad specification
  • Resources: Resource-owner Agents (RA)
    • Enforce the policies stipulated by the resource owner.
    • Periodically probe the resource to determine its current state.
    • Encapsulate information about the resource in a class-ad, along with the owner's usage policy.
  • Customers: Customer Agents (CA)
    • Maintain per-customer queues of submitted jobs, represented as lists of class-ads.
• Advertising protocol
  • The Constraint attribute rejects customers listed in the Untrusted attribute.
  • The Rank expression states that research jobs have higher priority than friends' jobs, which in turn have higher priority than other jobs.
  • Ads should contain "contact addresses".
  • The RA includes an "authorization ticket" with its ad.
Matchmaking in Condor…(2)
• Matchmaking algorithm
  • Pool manager
    • RAs and CAs periodically send class-ads to the pool manager.
    • The pool manager periodically enters a negotiation cycle (invokes the matchmaking algorithm).
• Matchmaking protocol
  • The pool manager sends the matched principals each other's class-ads at the contact addresses specified in their class-ads.
  • The pool manager forwards the authorization ticket (supplied by the RA) to the CA.
• Claiming protocol
  • The CA contacts the RA and sends the authorization ticket.
  • The RA accepts the resource request only if:
    • The ticket matches.
    • The request matches the RA's constraints with respect to the updated state of the request and resource.
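The claiming step can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not Condor's wire protocol: the `ResourceAgent` class, its method names, the `Ticket` attribute name and the load-based policy are all hypothetical, and the ticket is simply a random token compared in constant time.

```python
import hmac
import secrets


class ResourceAgent:
    """Toy RA: advertises its state plus a ticket, then validates claims."""

    def __init__(self, state, constraint):
        self.state = dict(state)              # current state of the resource
        self.constraint = constraint          # owner policy: (state, request) -> bool
        self.ticket = secrets.token_hex(16)   # authorization ticket

    def advertise(self):
        """Build the ad sent to the pool manager (state snapshot + ticket)."""
        ad = dict(self.state)
        ad["Ticket"] = self.ticket
        return ad

    def claim(self, ticket, request):
        """Accept only if the ticket matches AND the policy holds right now."""
        if not hmac.compare_digest(ticket, self.ticket):
            return False
        # Re-evaluate against the *current* state, not the advertised snapshot.
        return bool(self.constraint(self.state, request))


def lightly_loaded(state, request):
    return state["LoadAvg"] < 0.3


ra = ResourceAgent({"LoadAvg": 0.04}, lightly_loaded)
ad = ra.advertise()              # the pool manager would pass ad["Ticket"] to the CA
request = {"Owner": "wright"}

print(ra.claim(ad["Ticket"], request))   # valid ticket, policy holds
print(ra.claim("bogus", request))        # wrong ticket

ra.state["LoadAvg"] = 0.9                # the owner returned; the ad is now stale
print(ra.claim(ad["Ticket"], request))   # valid ticket, policy no longer holds
```

Because the RA re-checks its policy at claim time, a match made from a stale advertisement is rejected by the RA itself, without the pool manager tracking any state.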
Gang-matching…(1)
• PROBLEM: A job requires both a workstation and a software license to run successfully. However, there are a limited number of licenses, each of which is valid only on certain workstations.
• Gang-matching
  • The request carries an explicit list of required bilateral matches.
  • The matchmaker picks job class-ads in priority order and then uses a top-down, backtracking algorithm to marshal the required gang.
Gang-matching…(2)
Gangmatch request
[
  Type  = "Job";
  Owner = "raman";
  Cmd   = "run_sim";
  Ports = {
    [ // request a workstation
      Label      = "cpu";
      ImageSize  = 28M;
      Rank       = cpu.KFlops/1E3 + cpu.Memory/32;
      Constraint = cpu.Type=="Machine" && cpu.Arch=="INTEL" &&
                   cpu.OpSys=="LINUX" && cpu.Memory>=ImageSize;
    ],
    [ // request a license
      Label      = "license";
      Host       = cpu.Name; // cpu name
      Rank       = 0;
      Constraint = license.Type=="License" && license.App==Cmd;
    ]
  }
]
License Advertisement
[
  Type      = "License";
  App       = "sim_app";
  ValidHost = "foo.cs.wisc.edu";
  Ports = {
    [ Label      = "requester";
      Rank       = 0;
      Constraint = requester.Type=="Job" && requester.Host==ValidHost
    ]
  }
]
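The top-down, backtracking search for a gang can be sketched in Python as below. This is only an illustration under several assumptions: it returns the first feasible gang and ignores Rank, ads are dictionaries, each port Constraint is a callable over the job and the bindings made so far, and the offer-side check is a hypothetical `Accepts` callable. The machine `bar.cs.wisc.edu` is invented for the example, and the license's App is set to "run_sim" so that it equals the job's Cmd (the slide's values "sim_app" and "run_sim" would not satisfy `license.App==Cmd`).

```python
def marshal_gang(job, ports, offers, bound=None, used=frozenset()):
    """Bind each port, in order, to a distinct offer; backtrack on dead ends."""
    bound = bound or {}
    if len(bound) == len(ports):
        return bound  # every port is filled: the gang is complete
    port = ports[len(bound)]
    for i, offer in enumerate(offers):
        if i in used:
            continue
        trial = {**bound, port["Label"]: offer}
        # The request's port and the offer must accept each other (bilateral match).
        if port["Constraint"](job, trial) and offer["Accepts"](job, trial):
            gang = marshal_gang(job, ports, offers, trial, used | {i})
            if gang is not None:
                return gang
        # Otherwise fall through and try the next offer for this port.
    return None


job = {"Type": "Job", "Owner": "raman", "Cmd": "run_sim", "ImageSize": 28}

ports = [
    {"Label": "cpu",
     "Constraint": lambda job, g: g["cpu"].get("Type") == "Machine"
     and g["cpu"].get("Arch") == "INTEL"
     and g["cpu"].get("OpSys") == "LINUX"
     and g["cpu"].get("Memory", 0) >= job["ImageSize"]},
    {"Label": "license",
     "Constraint": lambda job, g: g["license"].get("Type") == "License"
     and g["license"].get("App") == job["Cmd"]},
]


def accept_any(job, g):
    return True


offers = [
    {"Type": "Machine", "Name": "bar.cs.wisc.edu", "Arch": "INTEL",
     "OpSys": "LINUX", "Memory": 128, "Accepts": accept_any},
    {"Type": "Machine", "Name": "foo.cs.wisc.edu", "Arch": "INTEL",
     "OpSys": "LINUX", "Memory": 64, "Accepts": accept_any},
    {"Type": "License", "App": "run_sim", "ValidHost": "foo.cs.wisc.edu",
     # The license is only valid on the host named in ValidHost.
     "Accepts": lambda job, g: job["Type"] == "Job"
     and g["cpu"]["Name"] == g["license"]["ValidHost"]},
]

gang = marshal_gang(job, ports, offers)
print(gang["cpu"]["Name"] if gang else None)
```

The search first binds the cpu port to bar.cs.wisc.edu, finds no license valid there, backtracks, and settles on foo.cs.wisc.edu together with the license.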
Future Research
• Automatically aggregating class-ads so that matches may be performed in groups.
• Methods for identifying constraints which can never be satisfied by the pool.
• Optimization techniques to increase the efficiency of identifying gangs.
Related Work
• NQE, PBS, LSF, LoadLeveler
• Globus
  • Resource Specification Language (RSL)
  • Resource brokers (specialization)
  • Co-allocator
  • Resource manager
  • Information Service
• Legion
  • Object-oriented approach: matching = object placement
  • Object mapper
  • Inheritance to define resources
  • Jurisdiction Manager to support autonomy
Summary
• The matchmaking framework is a flexible and general method of resource management in pools of resources which exhibit physical and ownership distribution.
• The representation and protocols facilitate both static and dynamic heterogeneity of resources, which results in a robust, scalable and flexible framework that can evolve with changing resources.
References
• Rajesh Raman, Miron Livny, and Marvin Solomon, "Matchmaking: Distributed Resource Management for High Throughput Computing", Proceedings of the Seventh IEEE International Symposium on High Performance Distributed Computing, July 28-31, 1998, Chicago, IL.
• Foster and Kesselman, The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers, Inc., 1999, Chapter 13, pp. 311-337.
• Rajesh Raman, Miron Livny, and Marvin Solomon, "Resource Management through Multilateral Matchmaking", Proceedings of the Ninth IEEE Symposium on High Performance Distributed Computing (HPDC9), Pittsburgh, Pennsylvania, August 2000, pp. 290-291.
• http://www.cs.wisc.edu/condor/publications.html