Virtual Machines in Condor
Virtual Machines
• VMware
  • Pros: Full virtualization, no modification for guest OS, and checkpoint/restart capability
  • Cons: Commercial product and performance issue compared to Xen
• Xen
  • Pros: Open source, good performance, checkpoint/restart and live migration capability
  • Cons: Requires OS modification and must divide memory between host and VMs in advance
• UML (User Mode Linux) etc.
Benefits of using Virtual Machines in Condor
• Sandbox
  • Security and isolation
• Independent environment
  • Customized environment for Condor
  • Several OSes on a single physical machine
  • Support for a wider variety of jobs
• Finer resource control
  • Memory size is assigned to each VM explicitly
• Checkpoint and migration
  • The entire memory of a VM can be saved (or suspended) and restarted (or resumed) later
Difficulties of using Virtual Machines in Condor
• Hard to manage system memory efficiently
• The VM needs to know some information about the host machine
• Some environment setup is needed inside the VM
• If a VM cannot use the distributed file system, Condor's file transfer or remote I/O mechanism should be used
• Each VM needs its own IP address
How to use VM in Condor: Scenario 1
• An already launched VM is ready to be used as an execution machine for Condor jobs
• Condor daemons should be installed and run on both the virtual and the host machine, both of which are exposed to the pool
• The Condor startd on the host machine controls when a launched VM is used for Condor
• Supported by Condor 6.7.18 and all future releases
• Pros: Easy to implement
• Cons: Inefficient memory management
[Figure: Scenario 1 architecture. Labels: Submit machine, Schedd; Central Manager, Collector, Negotiator; Execution machine, Host Machine, Virtual machine, Startd (two instances); communication pathways.]
Current Implementation: How can a VM get the information for the host machine?
• Configuration
  • Host Machine: VMP_VM_LIST = vmware1.domain.com
  • Virtual Machine: VMP_HOST_MACHINE = host.domain.com
• Steps
  • 1. The VM queries the ClassAd for the host
  • 2. The ClassAd for the host machine is returned to the VM
• ClassAd for host
  Name = "host.domain.com"
  TotalLoadAvg = 1.860000
  KeyboardIdle = 50
  …
• ClassAd for VM, before query
  Name = "vmware1.domain.com"
  TotalLoadAvg = 2.670000
  KeyboardIdle = 517656
  …
• ClassAd for VM, after query
  Name = "vmware1.domain.com"
  TotalLoadAvg = 2.670000
  KeyboardIdle = 517656
  …
  HOST_Name = "host.domain.com"
  HOST_TotalLoadAvg = 1.860000
  HOST_KeyboardIdle = 50
  …
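The before/after ClassAds on this slide suggest that host attributes are copied into the VM's ad under a HOST_ prefix. A minimal Python sketch of that merge follows, using plain dictionaries in place of real ClassAds. The function name merge_host_ad is hypothetical and is not part of Condor.

```python
def merge_host_ad(vm_ad, host_ad, prefix="HOST_"):
    """Return a copy of vm_ad with every host attribute added under a prefixed name.

    Hypothetical helper: dictionaries stand in for ClassAds here.
    """
    merged = dict(vm_ad)  # copy so the VM's original ad is left untouched
    for name, value in host_ad.items():
        merged[prefix + name] = value
    return merged


# Values taken from the slide; the elided attributes ("…") are left out.
vm_ad = {"Name": "vmware1.domain.com", "TotalLoadAvg": 2.67, "KeyboardIdle": 517656}
host_ad = {"Name": "host.domain.com", "TotalLoadAvg": 1.86, "KeyboardIdle": 50}

merged = merge_host_ad(vm_ad, host_ad)
for key in sorted(merged):
    print(key, "=", merged[key])
```

Prefixing keeps the VM's own Name, TotalLoadAvg, and KeyboardIdle intact, so an expression evaluated inside the VM can refer to both machines at once.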
Current Implementation: How does a VM get permission from the host machine?
• Configuration
  • Host Machine: VMP_VM_LIST = vmware1.domain.com
  • Virtual Machine: VMP_HOST_MACHINE = host.domain.com
• Steps
  • 1. The VM sends VM_REGISTER to the host machine
  • 2. The host machine replies with the permission, which depends on whether the host status is 'owner' or 'unclaimed'
• ClassAd for VM
  • If permission == yes: START = ((KeyboardIdle > 150) && (HOST_KeyboardIdle > 150) && (LoadAvg <= 0.3) && (HOST_TotalLoadAvg <= 0.3))
  • else: START = False
• ClassAd for host
  • Before permission is sent: START = ((KeyboardIdle > 150) && (LoadAvg <= 0.3))
  • After permission is sent: START = False
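The permission exchange and the VM's START expression can be sketched in Python as follows. This is only an illustration with dictionaries in place of ClassAds. It assumes that permission is granted when the host status is 'owner' or 'unclaimed', which is one reading of the slide, and the function names are hypothetical.

```python
def host_grants_permission(host_status):
    """Assumed rule: the host says yes only in the 'owner' or 'unclaimed' state."""
    return host_status.lower() in ("owner", "unclaimed")


def vm_start(ad, permission, idle_threshold=150, load_threshold=0.3):
    """Evaluate the VM's START expression from the slide.

    Without permission START is False; with permission both the VM's own
    attributes and the HOST_-prefixed host attributes must pass the thresholds.
    """
    if not permission:
        return False
    return (
        ad["KeyboardIdle"] > idle_threshold
        and ad["HOST_KeyboardIdle"] > idle_threshold
        and ad["LoadAvg"] <= load_threshold
        and ad["HOST_TotalLoadAvg"] <= load_threshold
    )


# Illustrative values only: an idle VM on a host whose keyboard was used recently.
busy_host_ad = {"KeyboardIdle": 517656, "LoadAvg": 0.1,
                "HOST_KeyboardIdle": 50, "HOST_TotalLoadAvg": 1.86}
# The same VM on a host that has been idle and lightly loaded.
idle_host_ad = {"KeyboardIdle": 517656, "LoadAvg": 0.1,
                "HOST_KeyboardIdle": 900, "HOST_TotalLoadAvg": 0.05}

permission = host_grants_permission("Unclaimed")
print(vm_start(busy_host_ad, permission))
print(vm_start(idle_host_ad, permission))
```

Because the host conditions are part of the VM's own START expression, a job is kept off the VM whenever the physical machine is in use, even though the VM itself looks idle.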
Issues in the current implementation for Scenario 1
• Problem: the host machine can no longer be used for Condor after it sends permission to a virtual machine.
• Possibility: a user may want to use both the virtual and the host machine on an SMP machine.
• Possible solution: after sending permission, the host machine does not change its START expression. Instead, the virtual machine sends its status to the host machine periodically, and the host machine decides the permission for each virtual machine when a Condor job is assigned.
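The possible solution above could look roughly like the Python sketch below: virtual machines report their status periodically and the host decides per VM at the moment a job is assigned. The slide does not specify the decision policy, so the rule used here (running jobs on the host and its VMs may not exceed the CPU count, and a VM with no recent report is refused) is purely an assumption, as are the class and method names.

```python
import time


class HostArbiter:
    """Hypothetical host-side bookkeeping for per-VM permission decisions."""

    def __init__(self, cpus, stale_after=60.0):
        self.cpus = cpus                # assumed limit on concurrent jobs
        self.stale_after = stale_after  # seconds before a report is too old
        self.reports = {}               # VM name -> (is_busy, timestamp)

    def report(self, vm_name, is_busy, now=None):
        """Record a periodic status message from a VM."""
        stamp = time.time() if now is None else now
        self.reports[vm_name] = (is_busy, stamp)

    def permit(self, vm_name, host_busy, now=None):
        """Decide, when a job is assigned, whether vm_name may run it."""
        now = time.time() if now is None else now
        last = self.reports.get(vm_name)
        if last is None or now - last[1] > self.stale_after:
            return False  # never heard from this VM, or its report is stale
        # Count jobs already running on the other VMs and on the host itself.
        running = sum(1 for name, (busy, _) in self.reports.items()
                      if busy and name != vm_name)
        if host_busy:
            running += 1
        return running < self.cpus


arbiter = HostArbiter(cpus=2)
arbiter.report("vmware1.domain.com", is_busy=False, now=0.0)
first = arbiter.permit("vmware1.domain.com", host_busy=True, now=1.0)
arbiter.report("vmware2.domain.com", is_busy=True, now=2.0)
second = arbiter.permit("vmware1.domain.com", host_busy=True, now=3.0)
print(first, second)
```

Under this assumed policy the host keeps its own START expression, so on a two-CPU machine it can run one job itself while a VM runs another.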
How to use VM in Condor: Scenario 2
• A virtual machine is launched on demand to serve a Condor job
• Checkpoint and migration can be used on a per-virtual-machine basis
• The startd on the host machine may have to advertise more than one OS
• A specific daemon in the virtual machine needs to communicate with the host machine. The daemon receives a command from the host machine and executes it when a Condor job is assigned.
How to use VM in Condor: Scenario 2
• Not yet implemented in Condor but we hope to do it soon.
• Pros: Efficient memory management
• Cons: Complex to implement
[Figure: Scenario 2 architecture. Labels: Central Manager, Collector, Negotiator; Submit machine 1, Submit machine 2, Schedd, Shadow; Execution machine, Host Machine, Startd, Starter, Virtual machine, Daemon; arrows for communication pathway, creating/forking process, and launching.]
Issues in Scenario 2
• When a user returns to the host machine, stop the VM and save its entire memory instead of suspending the running Condor job.
• During migration, if there is no shared file system, the files used by a Condor job, including the program file, should be transferred, because copying the entire disk image is very hard.
• Xen's live migration technique can be used for direct migration without checkpointing.
[Figure: Scenario 2 - Migration. Labels: Central Manager, Collector, Negotiator; Submit machine, Schedd, Shadow; Execution machine 1 and Execution machine 2, each with Host Machine and Virtual machine; Startd, Starter, Daemon; arrows for migration, launching, creating/forking process, and communication pathways before and after migration.]
Summary
• Virtual machines offer a flexible solution in Condor
  • Sandbox for security
  • Can provide more than one OS on a single physical machine
  • Can provide a customized environment for Condor
• Scenario 1 has been supported since Condor 6.7.18
• Scenario 2 is not yet implemented in Condor; it is future work.