Parallel Computing With Rmpi

Outline
• Parallel computing for R
• Rmpi programming
• Parallelisation with Rmpi
• Conclusion

Parallel Computing for R
• Parallel computing for scientific computing
  • Expensive calculations: run faster
  • Massive data: handle larger datasets
  • Split problems across many processors and run them in parallel
• R
  • A language and free software environment for statistical computing and graphics
• R parallel computing
  • Multiple implementations for R parallel computing
  • Most are available on CRAN (the Comprehensive R Archive Network)
  • E.g. Rmpi, snow, R/parallel, etc.

Parallel Programming
• Parallel programming
  • MPI, OpenMP, mixed-mode, …
  • Suitable for different parallel computer architectures
• MPI
  • Message-Passing Interface
  • Standardized and portable
  • Widely used on current parallel computers
  • Different implementations: MPICH/MPICH2, LAM/MPI, Open MPI, vendor MPIs, etc.

Rmpi
• A package for R parallel programming, developed by Hao Yu, The University of Western Ontario
  • An interface (wrapper) to MPI APIs
  • Provides a task farm environment to R (master/slaves)
  • Includes a number of R-specific extensions, e.g. for R objects
  • Available for download from CRAN
  • License: GPL version 2 or newer
• Helps to hide C/C++/FORTRAN from R users
• Supports multiple MPI implementations
  • LAM/MPI, MPICH/MPICH2, Open MPI, etc.
• Can be run under various distributions of Linux, Windows, and Mac OS X
• Installation needs to match the system and the MPI implementation

An Example of Rmpi

    # Load the Rmpi package
    library("Rmpi")

    # Spawn 2 slaves
    mpi.spawn.Rslaves(nslaves = 2)

    # Function to be executed: print an identifying message
    myId <- function() {
      myrank <- mpi.comm.rank()
      totalSize <- mpi.comm.size()
      message("I am ", myrank, " of ", totalSize, " ranks\n")
    }

    # Send the function to the slaves and run it
    mpi.bcast.Robj2slave(myId)
    mpi.remote.exec(myId())

    # Tell all slaves to close down, and exit the program
    mpi.close.Rslaves()
    mpi.quit()

An Example of Rmpi (cont.)

    master (rank 0, comm 1) of size 3 is running on: nid09466
    slave1 (rank 1, comm 1) of size 3 is running on: nid09467
    slave2 (rank 2, comm 1) of size 3 is running on: nid09468
    I am 0 of 3 ranks
    I am 1 of 3 ranks
    I am 2 of 3 ranks

Rmpi Basic Program Structure
• Load Rmpi, and spawn slaves
• Create functions
  • Create the functions containing the code run by the slaves
• Initialisation
  • Send all the required data and functions to the slaves
• Tell the slaves to execute their functions
• Communication and synchronisation
• Gather/operate on the results
• Close the slaves and quit

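The steps on this slide can be sketched as follows. This is a minimal illustration, not code from the talk: `partial_sum_sq` and `run_with_rmpi` are hypothetical names, and the sum-of-squares workload is an assumption chosen only because its result is easy to check. The driver needs a working Rmpi/MPI installation, so it is wrapped in a function and not called here; the serial check at the end runs in plain R.

```r
# Work for one slave: sum the squares of its share of x.
# Slaves are ranked 1..n_slaves; rank 0 is the master.
partial_sum_sq <- function(x, rank, n_slaves) {
  mine <- (seq_along(x) - 1) %% n_slaves == rank - 1  # cyclic share of indices
  sum(x[mine]^2)
}

# Hypothetical driver following the slide's steps (requires Rmpi and MPI).
run_with_rmpi <- function(x, n_slaves = 2) {
  library("Rmpi")                          # load Rmpi
  mpi.spawn.Rslaves(nslaves = n_slaves)    # spawn slaves
  on.exit(mpi.close.Rslaves())             # close the slaves when done

  # Initialisation: send the data and the function to every slave
  mpi.bcast.Robj2slave(x)
  mpi.bcast.Robj2slave(partial_sum_sq)

  # Tell the slaves to execute their function; one result per slave comes back
  parts <- mpi.remote.exec(
    partial_sum_sq(x, mpi.comm.rank(), mpi.comm.size() - 1)
  )

  # Gather/operate on the results
  sum(unlist(parts))
}

# Serial check of the decomposition: the shares must add up to the full sum.
x <- 1:10
parts <- sapply(1:3, function(r) partial_sum_sq(x, r, 3))
stopifnot(sum(parts) == sum(x^2))
```

On a machine with MPI, `run_with_rmpi(1:10, n_slaves = 2)` would be followed by `mpi.quit()` at the end of the script; `mpi.quit()` is left out of the function because it terminates the R session.
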
Rmpi Programming
• Straightforward to start coding
  • Similar to standard MPI usage
  • Existing R code can be modified directly
• Could be very complex depending on your code
  • May require restructuring of the original serial code
  • Select proper decomposition strategies
  • Parallelisation implementations

The Fios Project
• Fios Genomics Ltd. & EPCC
• Focus on parts of genotyping Bioconductor packages, e.g. crlmm
• Aims to analyse larger datasets
• Original platform
  • CPU rate: 2.6 GHz
  • Total memory: 32 GB
• Target platform: HECToR
  • Latest national high-performance computing service for the UK academic community
  • Cray XT4 system
  • 5664 AMD 2.3 GHz quad-core processors, i.e. a total of 22,656 cores
  • Theoretical peak performance of 208 Tflops
  • 8 GB per processor (one per node), shared by its 4 cores
  • Total memory: 45.3 TB

Identify The Bottlenecks
• Understand the code
• R profiling
  • Functions: Rprof, summaryRprof
  • Memory: Rprof(memory.profiling = TRUE), tracemem, Rprofmem
  • R proftools package: call tree, graph…
• Manual profiling

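A small sketch of function-level and manual profiling in base R. The functions `slow_part`, `fast_part`, and `analysis` are made-up stand-ins for real analysis code; only `Rprof`, `summaryRprof`, and `system.time` come from R itself.

```r
# Hypothetical workload: one deliberately slow loop, one vectorised call.
slow_part <- function(n) {
  s <- 0
  for (i in seq_len(n)) s <- s + sqrt(i)
  s
}
fast_part <- function(n) sum(sqrt(seq_len(n)))
analysis  <- function(n) slow_part(n) + fast_part(n)

# Function profiling: sample the call stack every 10 ms while the code runs.
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
result <- analysis(1e7)
Rprof(NULL)                       # stop profiling

prof <- summaryRprof(prof_file)
print(head(prof$by.self))         # time spent in each function itself

# Manual profiling: time a suspected bottleneck directly.
print(system.time(slow_part(1e7)))
```

The `by.self` table ranks functions by the time spent in their own code, which is usually the first place to look for a bottleneck worth parallelising.
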
Prepare Serial Code for Parallelisation
• Code restructuring is required to make the code parallelisable
  • Parallel parts should be as independent as possible
  • Modify the code to reduce the required communications
• More complex when C/C++/FORTRAN extensions are involved
  • Reduce transfers between R and the C/C++/FORTRAN extensions
  • Rmpi communications happen at the R level
• Correctness checking is important!
• The parallel version could be slower than the original serial code

Parallel Implementation Using Rmpi
• Select a proper decomposition strategy
  • Simple tasks: equal shares for all slaves
  • Task farms: better load balance, more communications
• Again, check correctness!
• Communication overheads vs. computation performance gain
• Synchronisations
  • Necessary for correctness
  • Very expensive
  • Only use when you have to

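The trade-off between the two decomposition strategies can be illustrated with a serial simulation. The task costs below are invented numbers, not measurements, and the model ignores communication overhead, which is exactly the extra price a real task farm pays.

```r
# Hypothetical per-task costs in seconds (illustrative only).
costs    <- c(8, 1, 1, 1, 1, 1, 1, 2)
n_slaves <- 2

# Strategy 1: equal shares. Each slave gets a contiguous block of tasks.
owner  <- ceiling(seq_along(costs) * n_slaves / length(costs))
shares <- split(seq_along(costs), owner)
static_time <- max(sapply(shares, function(i) sum(costs[i])))

# Strategy 2: task farm. Each task goes to the slave that becomes free first.
busy_until <- numeric(n_slaves)
for (cost in costs) {
  s <- which.min(busy_until)
  busy_until[s] <- busy_until[s] + cost
}
farm_time <- max(busy_until)

print(c(equal_shares = static_time, task_farm = farm_time))
# equal shares finish after 11 s, the task farm after 8 s
```

With evenly sized tasks the two strategies finish at the same time, and equal shares need less communication. The task farm only pays off when task costs vary, as they do in this made-up example.
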
The Fios Project (2)
• Original
  • Serial crlmm package
  • CPU rate: 2.6 GHz
  • 32 GB memory
  • Up to 200 datasets
  • 1700 seconds
• Now
  • Parallelised crlmm code
  • HECToR: 2.3 GHz
  • Allows many more datasets
  • 200 datasets on 10 nodes (80 GB memory in total): 810 seconds
  • 512 datasets on 16 nodes (128 GB memory in total): 1100 seconds

Pros and Cons of Rmpi Parallelisation
• Pros
  • Provides an interface to portable MPI on HPC facilities
  • Enables existing R code to be parallelised directly
  • Provides faster calculation
  • Allows larger datasets
• Cons
  • Rmpi package installation depends on the system and the MPI implementation
  • Code modification is required
  • Maximum speed-up is limited by the fraction of the code that runs in parallel
  • MPI communication overheads

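The limit on the maximum speed-up is usually expressed with Amdahl's law: if a fraction p of the run time can be parallelised over n processes, the best possible speed-up is 1 / ((1 - p) + p / n). A short sketch follows; the 90% parallel fraction is an assumed figure for illustration, not a measurement from the Fios project, and the formula ignores communication overheads.

```r
# Amdahl's law: ideal speed-up for parallel fraction p on n processes.
amdahl_speedup <- function(p, n) 1 / ((1 - p) + p / n)

p <- 0.9                                   # assumed parallel fraction
n <- c(1, 10, 100)
print(round(amdahl_speedup(p, n), 2))      # 1.00 5.26 9.17

# However many processes are added, the speed-up cannot exceed 1 / (1 - p).
print(1 / (1 - p))                         # 10
```
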
Conclusion
• Rmpi is useful for parallel computing in R
• Rmpi programming is easy to start with, but can become more complex depending on your code
• Parallelisation with Rmpi
  • Enables faster computing with larger datasets
  • Parallel coding will be required
  • Parallel performance tuning may be required

Reference
• R project: http://www.r-project.org/
• Rmpi: http://www.stats.uwo.ca/faculty/yu/Rmpi/
• Rmpi tutorial: http://math.acadiau.ca/ACMMaC/Rmpi/
• CRLMM: http://www.bioconductor.org
• EPCC: http://www.epcc.ed.ac.uk/
• HECToR: http://www.hector.ac.uk/
• Fios Genomics Ltd.: http://www.fiosgenomics.com/