
Longbow InfiniBand Extension





Presentation Transcript


  1. Longbow InfiniBand Extension
  Dr. David T. Southwell, President & CEO

  2. Agenda
  • The fundamentals of InfiniBand flow control
  • InfiniBand range limitations – two mechanisms
  • Longbow InfiniBand range extension technology
  • Potential applications at CERN
  Obsidian Research Corporation - CERN

  3. InfiniBand flow control
  • InfiniBand is credit based
  • On initialisation, each fabric end-point declares its capacity to receive data
  • This capacity is described as its "buffer credit"
  • As buffers are freed up, end-points post messages updating their credit status
  • InfiniBand is therefore lossless – data is never thrown away, since…
  • …InfiniBand flow control happens before the transmission, not after it!
  • Note that the buffer credit mechanism applies to every point-to-point link (not end-to-end)
  • This mechanism contrasts with Ethernet's loss-based "flow control":
  • On network over-subscription, packets are simply thrown away
  • Detected packet loss triggers retransmissions and adjustments to the injection rate
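The credit mechanism described above can be sketched in a few lines of Python. This is purely illustrative (it is not the InfiniBand wire protocol): the sender consumes one credit per packet and blocks when none remain, so the receive buffer can never overflow and nothing is ever discarded.

```python
# Illustrative sketch of credit-based link-level flow control.
# The sender may transmit only while it holds receive credits,
# so packets are never dropped - the sender simply waits.

from collections import deque

class CreditLink:
    def __init__(self, rx_buffer_slots):
        # On initialisation the receiver advertises its buffer capacity.
        self.credits = rx_buffer_slots
        self.rx_buffer = deque()
        self.dropped = 0          # never incremented: credit gating makes drops impossible

    def send(self, packet):
        if self.credits == 0:
            return False          # sender blocks; nothing is thrown away
        self.credits -= 1         # consume a credit *before* transmitting
        self.rx_buffer.append(packet)
        return True

    def consume(self):
        # Receiver frees a buffer slot and posts a credit update back.
        packet = self.rx_buffer.popleft()
        self.credits += 1
        return packet

link = CreditLink(rx_buffer_slots=4)
sent = sum(link.send(n) for n in range(10))  # only 4 succeed before blocking
assert sent == 4 and link.dropped == 0
link.consume()                               # freeing a slot restores a credit
assert link.send("retry") is True
```

The key contrast with Ethernet is visible in `send()`: the decision happens before transmission, so there is no loss to detect and recover from afterwards.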

  4. InfiniBand range limitations
  • As commercialised today, InfiniBand addresses the cluster/supercomputer market…
  • High equipment packing density (rack-to-rack connections are short)
  • InfiniBand switches cascade easily (very low latency), so multi-hop is OK
  • High port-count switches (large ICs)
  • NASA's "Columbia" – 10,240 Itaniums (NUMALink + InfiniBand interconnect)

  5. Mechanism (1) – physical layer
  • These applications are served by standard InfiniBand cables:
  • Balanced copper cables (twin-axial, shielded, tight impedance control)
  • Cheaper than optics, but with a range < 20 m (RF losses) @ 2.5 Gbit/s ("SDR")
  • At DDR (5 Gbit/s) and QDR (10 Gbit/s) rates per channel, cables get even shorter
  • There exists a parallel-optic multi-mode fibre solution (simple E-O-E)
  • More expensive (especially the parallel fibre bundles themselves)
  • Self-limits @ ~200 m
  • A good solution for longer inter-rack runs or for links between floors
  • MPO will see more use at DDR/QDR rates

  6. Mechanism (2) – link layer
  • Optimised for a short signal flight time; small buffers are used inside the ICs:
  • Facilitates switch IC implementation, but limits effective range to ~300 m
  Undersized buffers restrict the sustained data flow rate – in this case data is only moving in phases 1 and 5! The inefficiency is caused by an inability to keep the pipe full by restoring the receive credits fast enough to avoid a break-up of the burst. The longer the flight time, the lower the effective transfer rate. This limits the useful length of an InfiniBand link no matter what the physical transport is capable of. (N.B. this has no impact on copper InfiniBand links – receive buffers >> 2x wire data capacity.)
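The link-layer range limit above is a bandwidth-delay-product effect. A back-of-the-envelope sketch, using assumed numbers (a ~2 KB receive buffer and ~5 ns/m signal propagation; the real per-VL buffer sizes are implementation-specific), shows the effective rate collapsing once the credit round trip exceeds the time needed to drain the buffer:

```python
# Back-of-the-envelope model (assumed numbers) of why small receive
# buffers cap InfiniBand link length: when the credit round-trip time
# exceeds buffer / line_rate, the sender stalls waiting for credits.

LINE_RATE = 1.0e9          # bytes/s: 4x SDR data rate, ~1 GB/s after 8b/10b coding
PROP_DELAY = 5.0e-9        # s/m: ~5 ns per metre of fibre

def effective_rate(buffer_bytes, link_metres):
    rtt = 2 * link_metres * PROP_DELAY   # data out + credit update back
    in_flight = LINE_RATE * rtt          # bytes needed to keep the pipe full
    return LINE_RATE * min(1.0, buffer_bytes / in_flight)

buf = 2048                 # assumed ~2 KB of receive buffering
assert effective_rate(buf, 300) < LINE_RATE   # throughput sags at ~300 m...
assert effective_rate(buf, 20) == LINE_RATE   # ...but copper lengths stay at wire speed
```

This also explains the N.B. above: over a 20 m copper run the pipe holds only ~200 bytes, far less than the receive buffer, so credits always return in time.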

  7. Longbow Technology
  • Obsidian has developed a technology that performs InfiniBand encapsulation over 10GbE, Packet Over SONET/SDH and ATM WANs at 4x InfiniBand speeds: Longbow XR
  • Looks like a 2-port InfiniBand switch to the InfiniBand fabric
  • Designed for 100,000 km+ ranges; prototypes publicly tested over 1,500 km and 8,500 km OC-192c networks (SC|04, OFC'05, SC|05)
  • 950+ MBytes/s sustained performance in a single logical flow
  • ~4% CPU load (Opteron 242s using RDMA transport)
  • IPv6 Packet Over SONET & ATM modes
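One way to see what such a range extender must provide: it terminates the short local credit loop and buffers a full WAN round trip's worth of data in flight, so the local HCA always sees plentiful credits. A rough, assumed-numbers calculation (function name and ~5 ns/m figure are mine, not Obsidian's) of the buffering implied by the 8,500 km OC-192c path mentioned above:

```python
# Rough sizing sketch (assumed numbers) of the per-extender buffer
# needed to keep a long-haul link full: one WAN round trip's worth
# of data must be in flight at the WAN line rate.

def wan_buffer_bytes(line_rate_bps, wan_km, prop_ns_per_m=5.0):
    """Bytes of buffering needed to cover one WAN round trip."""
    rtt_s = 2 * wan_km * 1000 * prop_ns_per_m * 1e-9
    return line_rate_bps / 8 * rtt_s

# An 8,500 km path at ~10 Gbit/s needs on the order of 100 MB of buffer:
need = wan_buffer_bytes(10e9, 8500)
assert 90e6 < need < 120e6
```

This is the same bandwidth-delay arithmetic as the link-layer limit, scaled from hundreds of metres to thousands of kilometres, which is why on-chip buffers cannot do the job and a dedicated device with large memories is required.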

  8. Longbow Transport [diagram slide]

  9. Longbow @ SC|05 [figure slide]

  10. The Obsidian Longbow XR
  • Transparent to InfiniBand hardware, stacks and applications
  • Very user-friendly long-haul wire-speed InfiniBand data pump
  • Compatible with all InfiniBand equipment and stacks, including OpenFabrics
  • High-availability architecture – telecom-grade equipment
  • A managed device (HTTP GUI, SSH CLI, SNMP) – 10/100 Ethernet / serial console
  • Also encapsulates two Gigabit Ethernet channels along with the 4x SDR InfiniBand channel

  11. Potential Application…ATLAS
  In collaboration with Dr. Bryan Caron (University of Alberta, Canada), Bill St. Arnaud (CANARIE Inc. – Canada's high-performance research network) and others, Obsidian will soon launch a multi-stage Long Haul InfiniBand project that will demonstrate reliable 10 Gbit/s transfer of bulk data back and forth across the Atlantic. CERN would be the preferred end point for such a demonstration – CANARIE has confirmed that the entire lightpath would be available for sustained streaming demonstrations.

  12. Longbow Campus and Metro
  • Obsidian also sees applications for the range extension technology over SONET/SDH networks for Metro Area Networks (up to 120 km), and for dark-fibre campus applications (up to 10 km):
  • Remote InfiniBand storage (replication, distributed SAN)
  • Visualisation applications; tap directly and natively into distant clusters
  • Aggregate remote InfiniBand clusters into larger compute resources
  • Campus and Metro versions are currently in development; they will be optimised for latency and for more efficient use of smaller networks.

  13. Conclusions
  InfiniBand is becoming a critical element in high-performance computing architectures. With demonstrated, uncompromising long-haul capability, InfiniBand and Longbow technology may represent an excellent long-term platform for globally distributing the relentless data streams the LHC will emit during its lifetime. InfiniBand, global optical network transports and Longbow technologies will scale in performance over time, continuing to offer a compelling system-level solution that presents a stable interface to applications software. Thank you for your attention. http://www.obsidianresearch.com (P.S. Thanks for the Web, too!)
