Porting from the Cray T3E to the IBM SP



  1. Porting from the Cray T3E to the IBM SP
     Jonathan Carter, NERSC User Services

  2. Overview
     • Focus is on Fortran programs using MPI for communication
     • Outline common pitfalls:
       • f90 vs. xlf Fortran compiler
       • Cray vs. IBM MPI library
       • Math libraries
       • System libraries
       • I/O

  3. f90 vs. xlf - Main Differences
     • f90
       • compiles for parallel (MPI) automatically
       • accepts file suffix .f90, .F90
       • default optimization is -O2
       • allows access to full memory on a PE by default
     • xlf
       • compiler is accessed by several names, each name "packages" options together
       • by default, only file suffix .f and .F allowed
       • default is no optimization
       • restricted amount of memory available by default

  4. xlf Compiler Options
     • Compiler name can have three parts:
       • optional prefix "mp" indicates MPI library is automatically linked
       • compiler name, xlf, xlf90, or xlf95, indicates language mode
       • optional postfix "_r" indicates threads, or OpenMP capability
     • Examples:
       • mpxlf90 - Fortran 90 language compiler with MPI library available
       • mpxlf_r - Fortran 77 language compiler with MPI library, threads, and OpenMP capability available
     • If you want to use MPI I/O, the thread-capable compiler must be used.

  5. xlf Compiler Options
     • To use different file suffixes, e.g. .f90 and .F90:
         -qsuffix=f=f90,F=F90
     • For optimization we recommend:
         -O3 -qtune=pwr3 -qarch=pwr3 -qstrict
     • xlf defaults to 32 Kbytes for stack space and 128 Mbytes for heap space. To increase to the maximums of 256 Mbytes for stack and 2 Gbytes for heap:
         -bmaxstack:0x10000000 -bmaxdata:0x80000000

  6. Default Datatypes
     • Double Complex is a language extension
     • Assume -dp flag for f90
     • The xlf compiler has -qrealsize=8 to promote all default reals and real constants to 8 bytes, and -qintsize=8 to promote all default integers and logicals.
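An alternative to relying on promotion flags is to request precision explicitly through kind parameters, so the same source gives the same sizes under f90 and xlf. The following is a minimal sketch, not taken from the slides; the names r8 and i8 are illustrative, and it assumes both compilers provide a real kind with at least 15 decimal digits and an integer kind with at least 18.

```fortran
program kinds_demo
  implicit none
  ! Kind parameters chosen by required precision, not by compiler flag.
  ! On IEEE platforms these typically map to 8-byte types (an assumption).
  integer, parameter :: r8 = selected_real_kind(15, 307)
  integer, parameter :: i8 = selected_int_kind(18)
  real(kind=r8)    :: x
  integer(kind=i8) :: n

  ! Constants carry the kind suffix, so no -qrealsize or -dp style
  ! promotion is needed to get full precision.
  x = 1.0_r8 / 3.0_r8
  n = 2_i8**40

  print *, 'decimal digits in x:', precision(x)
  print *, 'x =', x
  print *, 'n =', n
end program kinds_demo
```

Code written this way does not change meaning when the promotion flags are switched on or off, which can make it easier to compare results between the two machines.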

  7. Available Datatypes • Fortran 77 “*” syntax is also available to explicitly define a datatype

  8. MPI Differences
     • Different default datatypes between T3E and SP
     • More error checking of arguments on the SP
     • Default amount of buffering is different
     • Different subset of MPI I/O implemented

  9. Available MPI Datatypes

  10. Default MPI Datatypes

  11. MPI - Argument Checking
     • The T3E MPI library has several collective routines which do not check arguments in accordance with the MPI standard. The SP does check arguments.
     • Examples:
       • MPI_Bcast "count" argument is not checked for consistency on T3E
       • MPI_Gatherv array of "counts" is not checked for consistency on T3E

  12. MPI - Buffering
     • If your program depends on the buffering of standard MPI Sends and Receives, you may see different behavior between the T3E and the SP.
     • Classic case, in which both PEs send before either posts a receive:
         ...
         if (mype.eq.0) then
            call mpi_send(buf,count,type,1,tag,MPI_COMM_WORLD,ierr)
            call mpi_recv(buf,count,type,1,tag,MPI_COMM_WORLD,status,ierr)
         else if (mype.eq.1) then
            call mpi_send(buf,count,type,0,tag,MPI_COMM_WORLD,ierr)
            call mpi_recv(buf,count,type,0,tag,MPI_COMM_WORLD,status,ierr)
         end if
         ...
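One common way to remove the dependence on buffering in an exchange like this is to let the library pair the send and the receive with MPI_Sendrecv. The sketch below is an illustration rather than code from the slides: the program name, buffer names, message length n, and tag value are all hypothetical, and it assumes the job is started with at least two tasks.

```fortran
program exchange
  implicit none
  include 'mpif.h'
  integer, parameter :: n = 1000          ! hypothetical message length
  double precision :: sendbuf(n), recvbuf(n)
  integer :: mype, npes, partner, tag, ierr
  integer :: status(MPI_STATUS_SIZE)

  call mpi_init(ierr)
  call mpi_comm_rank(MPI_COMM_WORLD, mype, ierr)
  call mpi_comm_size(MPI_COMM_WORLD, npes, ierr)
  tag = 99                                ! hypothetical tag value

  ! Only PEs 0 and 1 take part, as in the classic case.
  if (npes >= 2 .and. mype < 2) then
     partner = 1 - mype
     sendbuf = dble(mype)
     ! Send and receive are posted as one operation with separate
     ! buffers, so completion does not rely on system buffering.
     call mpi_sendrecv(sendbuf, n, MPI_DOUBLE_PRECISION, partner, tag, &
                       recvbuf, n, MPI_DOUBLE_PRECISION, partner, tag, &
                       MPI_COMM_WORLD, status, ierr)
     print *, 'PE', mype, 'received', recvbuf(1), 'from PE', partner
  end if

  call mpi_finalize(ierr)
end program exchange
```

Other standard options include reversing the send/receive order on one of the two PEs, or using nonblocking mpi_isend and mpi_irecv followed by a wait; which is preferable depends on the surrounding code.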

  13. MPI - Buffering
     • On the T3E, messages of up to 4 Kbytes are buffered. This can be changed by setting the environment variable MPI_BUFFER_MAX.
     • On the SP, the default size depends on the number of processors:
         Number of processors   Default size (bytes)
         1 to 16                4096
         17 to 32               2048
         33 to 64               1024
         65 to 128              512
         129 to 256             256
         257 and over           128
     • This can be changed by setting the environment variable MP_EAGER_LIMIT.

  14. Cray SciLib and IBM ESSL
     • Both vendors provide libraries of commonly used linear algebra subroutines
     • On the T3E the library is linked by default; on the SP use "-lessl"
     • These libraries are faster than the public domain BLAS, LAPACK, etc.

  15. Using BLAS
     • BLAS levels 1 through 3 are completely compatible between the two machines
     • Note which precision of BLAS is being called:
       • On the T3E
           real*8 a(n), b(n), x
           …
           x = sdot(n,a,1,b,1)
       • On the SP
           real*8 a(n), b(n), x
           …
           x = ddot(n,a,1,b,1)
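As a small illustration of the SP form, the sketch below calls ddot on 8-byte data and cross-checks it against the Fortran intrinsic dot_product. It is not from the slides: the program name and data values are made up, and it assumes a BLAS implementation (for example ESSL via -lessl) is linked. Declaring the function's return type explicitly matters, since under implicit typing ddot would otherwise be treated as a default real.

```fortran
program dot_demo
  implicit none
  integer, parameter :: n = 3
  double precision :: a(n), b(n), x
  ! Double-precision BLAS name, as used on the SP. On the T3E the
  ! 8-byte routine is named sdot instead.
  double precision, external :: ddot

  a = (/ 1.0d0, 2.0d0, 3.0d0 /)
  b = (/ 4.0d0, 5.0d0, 6.0d0 /)

  x = ddot(n, a, 1, b, 1)

  ! The intrinsic gives an independent check of the library result.
  print *, 'ddot        :', x
  print *, 'dot_product :', dot_product(a, b)
end program dot_demo
```

If the two printed values disagree after a port, a precision mismatch between the data and the BLAS routine being called is one of the first things worth checking.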

  16. Using BLAS
     • Instead of changing program source, loader options can be used to map one routine to another
     • To resolve a call to sdot by a call to ddot on the SP:
         xlf -o a.out -brename:sdot,ddot b.f
     • To resolve a call to ddot by a call to sdot on the T3E:
         f90 -o a.out -Wl"-Dequiv(DDOT)=SDOT" b.f

  17. LAPACK routines
     • Most other linear algebra routines in Cray SciLib and IBM ESSL are compatible with LAPACK.
     • In ESSL there are a few incompatibilities (x may be C, D, S, Z):
         xGEEV xSPEV xSPSV xHPEV xHPSV xGEGV xSYGV
     • Use the installed LAPACK library for these.

  18. ScaLAPACK library
     • Cray SciLib and IBM PESSL support pieces of the standard ScaLAPACK library.
     • Check precision of routines:
       • For real*8 on the T3E, routine names start with "PS"
       • For real*8 on the SP, routine names start with "PD"
     • On the SP, you must call BLACS_GET followed by either BLACS_GRIDINIT or BLACS_GRIDMAP. On the T3E, only a call to one of the latter two routines is required.
     • Public domain ScaLAPACK is also installed on both machines.
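The initialization order required on the SP can be sketched as follows. This is an illustrative outline using the standard BLACS routines, not code from the slides; the program name and the 1 x nprocs grid shape are arbitrary choices, and it assumes a BLACS library (PESSL or the public domain version) is linked.

```fortran
program grid_setup
  implicit none
  integer :: iam, nprocs, ictxt, nprow, npcol, myrow, mycol

  ! Find this process's id and the total number of processes.
  call blacs_pinfo(iam, nprocs)

  ! Hypothetical grid shape: a single row holding every process.
  nprow = 1
  npcol = nprocs

  ! Obtain the default system context first; this is the step
  ! that is required on the SP.
  call blacs_get(0, 0, ictxt)

  ! Then build a row-major process grid in that context.
  call blacs_gridinit(ictxt, 'R', nprow, npcol)
  call blacs_gridinfo(ictxt, nprow, npcol, myrow, mycol)

  print *, 'process', iam, 'is at grid position', myrow, mycol

  ! Release the grid and shut down the BLACS.
  call blacs_gridexit(ictxt)
  call blacs_exit(0)
end program grid_setup
```

Keeping the BLACS_GET call in place on both machines means one source file serves both.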

  19. System Libraries
     • Generally, any routines which interact with the operating system, and provide extensions to the Fortran language.
     • Cray provides very many such routines. Some are available on the SP, for example:

  20. System Libraries
     • A more comprehensive list is available at:
         http://hpcf.nersc.gov/computers/SP/port.html
     • Some routines have changed names and slightly different arguments.
     • There are sometimes identically or similarly named routines on the SP which are designed to be called from C only. Calling them from Fortran will cause unexpected behavior.
     • For example, calling exit instead of exit_ will cause the program to end without flushing any Fortran I/O buffer.

  21. Fortran I/O
     • Unformatted I/O
       • The primitive datatypes on the T3E and SP are compatible (provided they are of the same length), but control words inserted by the Fortran language I/O layer prevent transferability of sequential access files.
       • Direct access files can be freely transferred between the two machines, as can MPI I/O files.
     • Namelist Input/Output
       • Users familiar with assign -f77 on the T3E, which causes old-style namelist input to be written or read, can set the following environment variable on the SP to obtain the same effect:
           setenv XLFRTEOPTS "namelist=old"
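Because direct access files carry no record control words, writing data that must move between the machines as direct access records is one way to apply the point above. The following is a minimal sketch under stated assumptions: the file name data.bin, the unit number, and the record layout are all hypothetical, and the record length is obtained with inquire(iolength=...) because the units of recl differ between compilers.

```fortran
program direct_io
  implicit none
  integer, parameter :: r8 = selected_real_kind(15, 307)
  integer, parameter :: n = 4, nrec = 3
  real(kind=r8) :: a(n), b(n)
  integer :: reclen, i

  ! Ask the compiler for the record length in its own recl units.
  inquire(iolength=reclen) a

  ! Write nrec fixed-length records; record i holds the value i.
  open(unit=10, file='data.bin', access='direct', form='unformatted', &
       recl=reclen, status='replace')
  do i = 1, nrec
     a = real(i, r8)
     write(10, rec=i) a
  end do
  close(10)

  ! Reopen and read one record back by its record number.
  open(unit=10, file='data.bin', access='direct', form='unformatted', &
       recl=reclen, status='old')
  read(10, rec=2) b
  close(10)

  print *, 'record 2 holds', b
end program direct_io
```

The reading program must use the same element size and record layout as the writer, which is the "same length" condition noted for the primitive datatypes.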

  22. Further Information
     • T3E and SP webpages and software webpages contain further information and links to vendor documentation:
         http://hpcf.nersc.gov/computers
         http://hpcf.nersc.gov/software
