
GPU passthrough in the cloud: A comparison of KVM, Xen, VMware ESXi and LXC


Presentation Transcript


  1. GPU passthrough in the cloud: A comparison of KVM, Xen, VMware ESXi and LXC. John Paul Walters, Project Leader, USC Information Sciences Institute, jwalters@isi.edu

  2. ISI@USC: Summary • Large, vibrant, path-breaking research Institute • Part of USC’s Viterbi School of Engineering located in “Marina Tech Campus” (Marina del Rey) and in Arlington, VA • ~300 people mostly research staff • Mission: Perform traditional basic and applied research combined with education and building real systems • Strength: World-class research and development in information sciences targeting issues of societal and commercial importance • Culture: Academic environment that supports researchers setting their own agendas to address important research problems with a full-time focus on research • Technology: Broad research profile across computer sciences, mathematics, engineering, and applied physics • Sponsors: DoD research agencies, intelligence community, NIH, DOE technical offices, and industrial/commercial entities

  3. Motivation • Scientific workloads demand increasing performance with greater power efficiency • Architectures have been driven towards specialization, heterogeneity • We’ve implemented support for heterogeneity in OpenStack • Support GPUs via LXC, large shared memory machines, bare metal provisioning, etc. • Most clouds are still homogeneous, without GPU access • Of the major providers, only Amazon offers virtual machine access to GPUs in the public cloud

  4. Cloud Computing and GPUs • GPU passthrough has historically been hard • Specific to particular GPUs, hypervisors, host OS • Legacy VGA BIOS support, etc. • Today we can access GPUs through most of the major hypervisors • KVM, VMware ESXi, Xen, LXC
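
As a concrete illustration of what "accessing a GPU through a hypervisor" looks like on the KVM side, the sketch below attaches a GPU to a libvirt-managed guest as a PCI hostdev. This is a minimal sketch, not the configuration used in this work: the domain name "gpu-guest" and the PCI address 0000:03:00.0 are hypothetical, and the host is assumed to already have the IOMMU enabled with the GPU bound to a passthrough-capable driver (pci-stub or vfio-pci).

    # Minimal sketch: attach a GPU to a KVM guest as a PCI hostdev via libvirt.
    # Assumes the libvirt Python bindings are installed, the IOMMU is enabled,
    # and the GPU is already detached from the host driver (vfio-pci/pci-stub).
    import libvirt

    GUEST = "gpu-guest"                    # hypothetical domain name
    GPU_BDF = (0x0000, 0x03, 0x00, 0x0)    # hypothetical PCI address 0000:03:00.0

    hostdev_xml = """
    <hostdev mode='subsystem' type='pci' managed='yes'>
      <source>
        <address domain='0x{:04x}' bus='0x{:02x}' slot='0x{:02x}' function='0x{:x}'/>
      </source>
    </hostdev>
    """.format(*GPU_BDF)

    conn = libvirt.open("qemu:///system")
    dom = conn.lookupByName(GUEST)
    # Persist the device in the guest definition; it appears on the next boot
    # (VIR_DOMAIN_AFFECT_LIVE could be used for a running guest that supports hotplug).
    dom.attachDeviceFlags(hostdev_xml, libvirt.VIR_DOMAIN_AFFECT_CONFIG)
    conn.close()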

  5. Supporting GPUs • We’re pursuing multiple approaches for GPU support in OpenStack • LXC support for container-based VMs • Xen support for fully virtualized guests • KVM support for fully virtualized guests, SR-IOV • Also compare against VMware ESXi • Our OpenStack work currently supports GPU-enabled LXC containers • Xen prototype implementation as well • Given widespread support for GPUs across hypervisors, does hypervisor choice impact performance?
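
The LXC path does not use PCI passthrough at all: the container shares the host kernel and NVIDIA driver, so GPU support reduces to exposing the /dev/nvidia* device nodes and allowing them in the container's device cgroup. The sketch below generates the relevant config lines for an LXC 1.x-era container; the container path and device list are illustrative, not the exact configuration from our OpenStack work.

    # Minimal sketch: expose host NVIDIA device nodes to an LXC container by
    # emitting device-cgroup and bind-mount entries for its config file.
    # Major number 195 is the NVIDIA character device; the nvidia-uvm major
    # is assigned dynamically and is not handled here.
    import glob

    devices = glob.glob("/dev/nvidia*")   # e.g. /dev/nvidia0, /dev/nvidiactl

    lines = ["lxc.cgroup.devices.allow = c 195:* rwm"]   # allow NVIDIA devices
    for dev in devices:
        # Bind-mount each device node into the container's /dev
        lines.append(
            "lxc.mount.entry = {0} {1} none bind,optional,create=file".format(
                dev, dev.lstrip("/"))
        )

    # Append to the container's config (path is illustrative)
    with open("/var/lib/lxc/gpu-container/config", "a") as f:
        f.write("\n".join(lines) + "\n")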

  6. GPU Performance Across Hypervisors • 2 CPU Architectures, 2 GPU Architectures, 4 Hypervisors • Sandy Bridge + Kepler, Westmere + Fermi • KVM (kernel 3.13), Xen 4.3, LXC (CentOS 6.4), VMware ESXi 5.5.0 • KVM and Xen installed from Arch Linux 2013.10.01 base systems • Standardize on a common CentOS 6.4 base system for comparison • Same 2.6.32-358.23.2 kernel across all guests and LXC • Control for NUMA effects
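
"Control for NUMA effects" means making sure the benchmark CPUs sit on the socket closest to the GPU. One way to check that on the host is sketched below: read the GPU's NUMA node from sysfs and restrict the current process to that node's CPUs. The PCI address is hypothetical, and for the guests themselves the pinning is done at the hypervisor level, which this sketch does not show.

    # Minimal sketch: pin the current process to the NUMA node local to a GPU.
    # The PCI address below is hypothetical; adjust it to the GPU under test.
    import os

    GPU_BDF = "0000:03:00.0"   # hypothetical GPU PCI address

    # NUMA node the GPU is attached to (-1 means "no NUMA information")
    with open("/sys/bus/pci/devices/{}/numa_node".format(GPU_BDF)) as f:
        node = int(f.read().strip())

    if node >= 0:
        # CPU list of that node, e.g. "0-7,16-23"
        with open("/sys/devices/system/node/node{}/cpulist".format(node)) as f:
            cpulist = f.read().strip()

        cpus = set()
        for part in cpulist.split(","):
            lo, _, hi = part.partition("-")
            cpus.update(range(int(lo), int(hi or lo) + 1))

        # Restrict this process (and its children) to the GPU-local CPUs
        os.sched_setaffinity(0, cpus)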

  7. K20 Results - SHOC

  8. C2075 Results - SHOC

  9. K20 Results – SHOC Outliers

  10. C2075 Results – SHOC Outliers

  11. SHOC Observations • Overall both Fermi and Kepler systems perform near-native • This is especially true for KVM and LXC • Xen on the C2075 system shows some overhead • Likely because Xen couldn’t activate large page tables • Some unexpected performance improvement for Kepler Spmv • Appears to be a performance regression in the base CentOS 6.4 system • Performance nearly matches the comparable Arch Linux base.

  12. K20 Results – GPU-LIBSVM

  13. C2075 Results – GPU-LIBSVM

  14. GPU-LIBSVM Observations • Unexpected performance improvement for KVM on both systems • Most pronounced on the Westmere/Fermi platform • This is due to the use of transparent hugepages (THP) • THP backs the entire guest memory with hugepages • Improves TLB performance • Disabling hugepages on the Westmere/Fermi platform reduces performance to 80-87% of the base system
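
To make the hugepage effect concrete: transparent hugepages are toggled globally through sysfs, and the sketch below is one way to check and flip the setting on the KVM host before re-running GPU-LIBSVM. It assumes a Linux host with THP compiled in and root privileges; it is illustrative, not the exact procedure used for these measurements.

    # Minimal sketch: inspect and set transparent hugepage (THP) policy on the
    # KVM host. Requires root; valid policies are "always", "madvise", "never".
    THP = "/sys/kernel/mm/transparent_hugepage/enabled"

    def current_policy():
        # File looks like "always [madvise] never"; brackets mark the active policy
        with open(THP) as f:
            return f.read().split("[")[1].split("]")[0]

    def set_policy(policy):
        with open(THP, "w") as f:
            f.write(policy)

    if __name__ == "__main__":
        print("THP before:", current_policy())
        set_policy("never")      # e.g. disable THP to reproduce the slower case
        print("THP after:", current_policy())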

  15. Multi-GPU with GPUDirect • Many real applications extend beyond a single node’s capabilities • Test multi-node performance with InfiniBand SR-IOV and GPUDirect • 2 Sandy Bridge nodes equipped with K20 GPUs • ConnectX-3 IB with SR-IOV enabled • Ported Mellanox OFED 2.1-1 to the 3.13 kernel • KVM hypervisor • Test with HOOMD, a commonly used GPUDirect-enabled particle dynamics simulator
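
Before a virtual function can be handed to a guest, the host has to expose SR-IOV VFs on the ConnectX-3 adapter; on mlx4 this is typically done through the mlx4_core num_vfs module option, after which the VFs show up as ordinary PCI functions. The sketch below only verifies that step through the generic PCI sysfs interface; the physical function address is hypothetical.

    # Minimal sketch: confirm SR-IOV virtual functions exist for an IB adapter
    # by reading the generic PCI sysfs entries of its physical function.
    # (On ConnectX-3/mlx4 the VF count itself is usually set via the
    #  mlx4_core "num_vfs" module parameter rather than written here.)
    import glob
    import os

    PF_BDF = "0000:06:00.0"   # hypothetical ConnectX-3 physical function
    pf_dir = "/sys/bus/pci/devices/{}".format(PF_BDF)

    def read(name):
        with open(os.path.join(pf_dir, name)) as f:
            return f.read().strip()

    print("VFs supported:", read("sriov_totalvfs"))
    print("VFs enabled:  ", read("sriov_numvfs"))

    # Each enabled VF appears as a "virtfnN" symlink pointing at its PCI address
    for link in sorted(glob.glob(os.path.join(pf_dir, "virtfn*"))):
        print(link, "->", os.path.basename(os.readlink(link)))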

  16. GPUDirect Advantage (image source: http://old.mellanox.com/content/pages.php?pg=products_dyn&product_family=116)

  17. HOOMD Performance

  18. Multi-node Discussion • SR-IOV and GPUDirect are possible • Performance is near-native • Scales with problem size • Next step is to extend this to a much larger cluster and expand applications

  19. Current Status • Source code is available now • https://github.com/usc-isi/nova • Includes support for heterogeneity • GPU-enabled LXC instances • Bare-metal provisioning • Architecture-aware scheduler • Prototype Xen with GPU passthrough implementation
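
For completeness, requesting one of the GPU-enabled LXC instances from that tree looks like an ordinary nova boot against a GPU flavor. The sketch below uses the python-novaclient API of that era; the credentials, flavor, and image IDs are placeholders rather than values tied to the usc-isi/nova branch.

    # Minimal sketch: boot an instance through python-novaclient (2.x-era API).
    # Credentials and the flavor/image IDs below are placeholders.
    from novaclient.client import Client

    nova = Client("2", "USERNAME", "PASSWORD", "PROJECT",
                  "http://keystone:5000/v2.0")

    server = nova.servers.create(
        name="gpu-lxc-test",      # hypothetical instance name
        image="IMAGE_UUID",       # image registered for LXC guests
        flavor="GPU_FLAVOR_ID",   # flavor the scheduler maps to a GPU node
    )
    print(server.id, server.status)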

  20. Future Work • Primary focus: multi-node • Greater range of applications, larger systems • Integrate GPU passthrough support for KVM • This might come free with the existing OpenStack PCI passthrough work • NUMA support • This work assumes perfect NUMA mapping • OpenStack should be NUMA-aware
