
Tests and tools for ENEA GRID

Tests and tools for ENEA GRID. Performance test: HPL (High Performance Linpack). Network monitoring. Funel, December 11, 2007. HPL test: HPL measures the floating-point execution rate for solving a system of linear equations AX = B.


Presentation Transcript


  1. Tests and tools for ENEA GRID
     • Performance test: HPL (High Performance Linpack)
     • Network monitoring
     • Funel
     • December 11, 2007

  2. HPL test
     • HPL measures the floating-point execution rate for solving a system of linear equations AX = B
     • HPL requires the availability of MPI and libraries for linear algebra (BLAS, VSIPL, ATLAS)
     • HPL is scalable: parallel efficiency stays constant with respect to the per-processor memory usage
     • www.netlib.org/benchmark/hpl
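To make the kind of problem HPL times more concrete, here is a minimal single-process sketch in Python that solves AX = B by Gaussian elimination with partial pivoting. It is only an illustration: HPL itself is a distributed, blocked LU solver built on MPI and BLAS, and the function names `solve_linear_system` and `max_residual` are hypothetical, not part of HPL.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    a is a list of n rows of n floats, b a list of n floats.
    Toy illustration only; not how HPL is implemented.
    """
    n = len(a)
    # Work on an augmented copy [a | b] so the caller's data is untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for k in range(n):
        # Partial pivoting: pick the row with the largest entry in column k.
        pivot = max(range(k, n), key=lambda r: abs(m[r][k]))
        if m[pivot][k] == 0.0:
            raise ValueError("matrix is singular")
        m[k], m[pivot] = m[pivot], m[k]
        # Eliminate column k from the rows below the pivot row.
        for i in range(k + 1, n):
            factor = m[i][k] / m[k][k]
            for j in range(k, n + 1):
                m[i][j] -= factor * m[k][j]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def max_residual(a, x, b):
    """Largest absolute component of a x - b, a simple correctness check."""
    n = len(b)
    return max(abs(sum(a[i][j] * x[j] for j in range(n)) - b[i]) for i in range(n))
```

Timing a call to `solve_linear_system` and dividing the operation count by the elapsed time gives a floating-point rate in the same spirit as the HPL figure, although a pure-Python loop will be far slower than a BLAS-backed solver.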

  3. HPL Results (1)
     • $A X = B$, with $A$ an $n \times n$ matrix
     • $\mathrm{GFLOPS} = \dfrac{\frac{2}{3} n^3 + \frac{3}{2} n^2}{t_h \times 10^9}$, where $t_h$ = CPU time
     • Th. Peak = # of cores × CPU clock speed × FPO issue rate
     • LSF submission:
       • Linux (bw305): 15 cores, Th. Peak = 72 GFLOPS: test completed
       • AIX (sp4-2): 32 cores, Th. Peak 96 GFLOPS; AIX (sp4-3-4): 32 cores, Th. Peak = 122 GFLOPS: test did not complete!
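The two formulas on this slide can be sketched in a few lines of Python. The function and parameter names are illustrative. The slide gives only the resulting peak values, so the clock speed (2.4 GHz) and issue rate (2 floating-point operations per cycle) in the usage note are assumptions chosen because they reproduce the 72 GFLOPS figure for 15 cores; they are not stated in the source.

```python
def hpl_gflops(n, t_h):
    """Measured rate in GFLOPS for an n x n system solved in t_h seconds.

    Uses the operation count from the slide: (2/3) n^3 + (3/2) n^2.
    """
    flop_count = (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2
    return flop_count / (t_h * 1e9)


def theoretical_peak_gflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak = # of cores x CPU clock speed x FPO issue rate.

    clock_ghz is in GHz, so the result is directly in GFLOPS.
    """
    return cores * clock_ghz * flops_per_cycle
```

For example, `theoretical_peak_gflops(15, 2.4, 2)` evaluates to 72 GFLOPS under the assumed per-core values, and `hpl_gflops(n, t_h)` can then be compared against that peak.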

  4. HPL Results (2)
     • Expected CPU time $t_h$ = # FPO / (# of cores × CPU clock speed × FPO issue rate)
     • Used: ATLAS version 3.6
     • Linux (bw305), P×Q = 3×5 cores (LSF submission): HPL completed
     • High user wait time (not CPU time), maybe due to the network interconnects (public when the test was done)
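As a rough sketch of the expected-time estimate above, the following Python computes the ideal CPU time for a given problem size and process grid. The names are hypothetical, the operation count is the (2/3)n³ + (3/2)n² figure used by HPL, and the clock speed and issue rate passed in the usage note are assumptions, since the slides do not give per-core values.

```python
def hpl_flop_count(n):
    """Floating-point operations HPL counts for an n x n system."""
    return (2.0 / 3.0) * n**3 + (3.0 / 2.0) * n**2


def expected_cpu_time(n, p, q, clock_hz, flops_per_cycle):
    """Ideal solve time in seconds on a p x q process grid.

    Expected time = # FPO / (# of cores x CPU clock speed x FPO issue rate),
    with # of cores = p * q. Ignores communication and memory effects.
    """
    cores = p * q
    return hpl_flop_count(n) / (cores * clock_hz * flops_per_cycle)
```

With the assumed values `expected_cpu_time(10000, 3, 5, 2.4e9, 2)` gives about 9.26 seconds. A measured CPU time well above such an estimate is what the slides report.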

  5. A point-to-point communication test using MPI
     • HPL point-to-point communication between processors is based on the MPI routines MPI_Send and MPI_Recv

  6. HPL Results (3)
     • Problems for AIX (LSF submission): HPL makes the machines hang, and the test does not complete even if memory usage is < 10%
     • Only a few CPU seconds over days of running time! Under investigation
     • Interactive submissions:
       • AIX (sp4-1), 4×8 = 32 cores: HPL completed
       • AIX (sp4-2), 4×8 = 32 cores, 20% of total (32 GB) memory: HPL not completed
       • AIX (ostro), 4×4 = 16 cores, 20% of total (16 GB) memory: HPL not completed

  7. Network monitoring (coll. G. Guarnieri)
     • A tool has been provided to detect whether the communication speed between two hosts (client and server) of the ENEA GRID changes over time
     • The test measures the round-trip time it takes to send a small packet of data (10, 100, 1000 bytes) and receive it back
     • Small packets: not chopped (no spurious delay effects); fast fluctuations are not hidden by the final integrated average time needed to wait for big packets
     • 60 packets sent in sequence each second
     • TCP/IP protocol
     • Both client and server block until the full packet is sent/received: no loss of data
     • [Diagram: client and server exchange, with start and stop timing marks]
     • www.afs.enea.it/funel
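The measurement described on this slide can be sketched as a blocking TCP echo test. This is not the ENEA tool itself, whose source is not shown here. It is a minimal Python illustration that runs client and server on the loopback interface, with hypothetical function names, and it does not model the pacing of 60 packets each second.

```python
import socket
import threading
import time


def recv_exact(sock, size):
    """Block until exactly `size` bytes have been received."""
    buf = bytearray()
    while len(buf) < size:
        chunk = sock.recv(size - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection")
        buf.extend(chunk)
    return bytes(buf)


def echo_server(listener, size, count):
    """Accept one client and echo back `count` packets of `size` bytes."""
    conn, _ = listener.accept()
    with conn:
        for _ in range(count):
            conn.sendall(recv_exact(conn, size))


def measure_round_trips(size=100, count=60):
    """Return a list of round-trip times in seconds for `count` packets."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # loopback and OS-chosen port: assumptions
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=echo_server, args=(listener, size, count))
    server.start()
    payload = bytes(size)
    rtts = []
    with socket.create_connection(("127.0.0.1", port)) as client:
        # Disable Nagle's algorithm so small packets are sent immediately.
        client.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
        for _ in range(count):
            start = time.perf_counter()
            client.sendall(payload)   # blocks until the full packet is sent
            recv_exact(client, size)  # blocks until the full echo is back
            rtts.append(time.perf_counter() - start)
    server.join()
    listener.close()
    return rtts
```

Calling `measure_round_trips(size=10)`, `measure_round_trips(size=100)` and `measure_round_trips(size=1000)` mirrors the three packet sizes on the slide. Between two real hosts the server part would run on the remote machine, and the per-packet times would be logged over time to reveal spikes.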

  8. Network monitoring (2)
     • [Plot: client eurofel00, server bw305-2]
     • [Plot: client kleos, server feronix0]
     • Plot annotations: high spikes clearly detected; overall communication delay

  9. Conclusions
     • HPL benchmark test:
       • Linux (LSF): the test completes, however:
         • Obtained CPU time >> expected CPU time, hence $\mathrm{Peak}_{\mathrm{exp}} < \mathrm{Peak}_{\mathrm{th}}$
         • Too much (user) time to complete
       • AIX (LSF): the test does not complete:
         • Only a few CPU seconds over days of running time!
       • AIX (interactive submission):
         • Only sp4-1 (32 cores, 10% of total memory) tested
         • Test completed, but still (CPU time) >> (expected CPU time)
         • User wait time 35 minutes
     • Network monitoring:
       • A tool has been provided to detect variations in the communication speed between two hosts of the ENEA GRID
       • Useful for improving the overall network efficiency
