
CSS490 Group Communication and MPI Textbook Ch3



1. CSS490 Group Communication and MPI, Textbook Ch. 3. Instructor: Munehiro Fukuda. These slides were compiled from the course textbook, the reference books, and the instructor's original materials.

2. Group Communication
• Communication types:
  • One-to-many: broadcast
  • Many-to-one: synchronization, collective communication
  • Many-to-many: gather and scatter
• Group addressing
  • Using a special network address: IP Class D and UDP
  • Emulating a broadcast with repeated one-to-one communication:
    • Performance drawback on bus-type networks
    • Simpler for switching-based networks
• Semantics
  • Send-to-all vs. bulletin-board semantics
  • 0-, 1-, m-out-of-n, or all-reliable delivery

3. Atomic Multicast
• Send-to-all semantics and all-reliable delivery
• Simple emulation:
  • A repetition of one-to-one communication with acknowledgment
• What if a receiver fails?
  • Time-out and retransmission
• What if a sender fails before all receivers receive the message?
  • All receivers forward the message to the same group.
  • A receiver discards the second and subsequent copies (see the sketch below).
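As an illustration of the duplicate-suppression step, here is a minimal, single-process C++ simulation (the class and message names are invented for this sketch; a real implementation would add sockets, acknowledgments, and time-out retransmission):

  #include <iostream>
  #include <set>
  #include <string>
  #include <vector>

  struct Message { int id; std::string body; };       // id uniquely identifies the multicast

  class Receiver {
  public:
    explicit Receiver(int rank) : rank_(rank) {}
    // Deliver each message exactly once; re-forwarded copies are discarded.
    void receive(const Message& m) {
      if (seen_.count(m.id)) return;                  // discard the 2nd and following copies
      seen_.insert(m.id);
      std::cout << "rank " << rank_ << " delivers msg " << m.id
                << ": " << m.body << std::endl;
    }
  private:
    int rank_;
    std::set<int> seen_;
  };

  int main() {
    std::vector<Receiver> group = { Receiver(0), Receiver(1), Receiver(2) };
    Message m = { 42, "update replica" };
    for (size_t i = 0; i < group.size(); i++) group[i].receive(m);  // original multicast
    for (size_t i = 0; i < group.size(); i++) group[i].receive(m);  // re-forwarded copies: dropped
    return 0;
  }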

4. Message Ordering
• R1 and R2 may receive m1 and m2 in different orders!
• Some message ordering is required:
  • Absolute ordering
  • Consistent ordering
  • Causal ordering
  • FIFO ordering
(Figure: senders S1 and S2 each multicast a message; R1 receives m1 before m2, while R2 receives m2 before m1.)

5. Absolute Ordering
• Rule:
  • mi must be delivered before mj if Ti < Tj
• Implementation:
  • A clock synchronized among machines
  • A sliding time window used to commit delivery of messages whose timestamps fall within the window (see the sketch below)
• Example:
  • Distributed simulation
• Drawbacks:
  • Too strict a constraint
  • No absolutely synchronized clock
  • No guarantee of catching all tardy messages
(Figure: with Ti < Tj, every receiver must deliver mi before mj.)
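A minimal sketch of the sliding-window commit, assuming a globally synchronized clock (the window size, timestamps, and names below are invented for illustration): arriving messages are held in timestamp order and committed only once they are older than the window.

  #include <iostream>
  #include <queue>
  #include <string>
  #include <vector>

  struct TimedMsg {
    double timestamp;                   // synchronized global send time
    std::string body;
  };
  struct Later {                        // orders the priority queue by earliest timestamp first
    bool operator()(const TimedMsg& a, const TimedMsg& b) const { return a.timestamp > b.timestamp; }
  };

  int main() {
    const double window = 5.0;          // commit only messages older than (now - window)
    std::priority_queue<TimedMsg, std::vector<TimedMsg>, Later> holdback;

    holdback.push(TimedMsg{12.0, "m2"});   // messages may arrive in any order
    holdback.push(TimedMsg{10.0, "m1"});

    double now = 16.0;                  // current synchronized clock reading
    while (!holdback.empty() && holdback.top().timestamp <= now - window) {
      std::cout << "deliver " << holdback.top().body << std::endl;   // m1 commits; m2 must still wait
      holdback.pop();
    }
    return 0;
  }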

6. Consistent Ordering
• Rule:
  • Messages are received in the same order at all receivers (regardless of their timestamps).
• Implementation:
  • A message is sent to a sequencer, assigned a sequence number, and finally multicast to the receivers
  • A message is retrieved in incremental sequence order at each receiver (see the sketch below)
• Example:
  • Replicated database update
• Drawback:
  • A centralized algorithm
(Figure: although Ti < Tj, both receivers deliver mj before mi; the order is the same everywhere.)
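The receiver side of the sequencer scheme can be sketched as follows (names invented for this sketch; the sequencer itself is not shown): each message carries the sequencer-assigned number, and a receiver holds a message back until all lower-numbered messages have been delivered.

  #include <iostream>
  #include <map>
  #include <string>

  class ConsistentReceiver {
  public:
    ConsistentReceiver() : next_(0) {}
    void receive(int seq, const std::string& body) {
      holdback_[seq] = body;                  // seq was assigned by the central sequencer
      while (holdback_.count(next_)) {        // deliver in strictly increasing sequence order
        std::cout << "deliver #" << next_ << ": " << holdback_[next_] << std::endl;
        holdback_.erase(next_++);
      }
    }
  private:
    int next_;                                // next sequence number to deliver
    std::map<int, std::string> holdback_;
  };

  int main() {
    ConsistentReceiver r;
    r.receive(1, "m_j");                      // arrives first but is held back
    r.receive(0, "m_i");                      // fills the gap: both delivered, in order
    return 0;
  }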

7. Causal Ordering
• Rule: the happened-before relation
  • If e_i^k, e_i^l ∈ h_i and k < l, then e_i^k → e_i^l
  • If e_i = send(m) and e_j = receive(m), then e_i → e_j
  • If e → e' and e' → e'', then e → e''
• Implementation:
  • Use of a vector message (see the next slide)
• Example:
  • Distributed file system
• Drawbacks:
  • Vector overhead on every message
  • Broadcast assumed
(Figure: from R2's viewpoint, m1 → m2, so m2 must not be delivered before m1.)

8. Vector Message
• Delivery condition for a message stamped with vector S arriving from source i at a receiver whose current vector is R (see the sketch below):
  • S[i] = R[i] + 1, where i is the source id
  • S[j] ≤ R[j] for every j ≠ i
(Figure: sites A–D exchange vector-stamped messages; a message whose vector fails the condition arrived too early and is delayed until the missing messages have been delivered.)
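Below is a small sketch of this delivery test (the function name deliverable and the 0-based site indices are assumptions for illustration, not from the textbook): the message is deliverable only if it is the very next message from its source and depends on nothing the receiver has not yet seen.

  #include <iostream>
  #include <vector>

  // True if the message may be delivered now; otherwise it must be delayed.
  bool deliverable(const std::vector<int>& S, const std::vector<int>& R, int source) {
    for (int j = 0; j < (int)S.size(); j++) {
      if (j == source) { if (S[j] != R[j] + 1) return false; }  // the very next message from the source
      else             { if (S[j] > R[j])      return false; }  // no causally earlier message is missing
    }
    return true;
  }

  int main() {
    std::vector<int> R = {1, 1, 0, 0};                          // receiver's current vector
    std::cout << deliverable({2, 1, 0, 0}, R, 0) << std::endl;  // 1: next expected message from site 0
    std::cout << deliverable({3, 1, 0, 0}, R, 0) << std::endl;  // 0: a message from site 0 is missing
    std::cout << deliverable({2, 1, 0, 1}, R, 0) << std::endl;  // 0: depends on an unseen message from site 3
    return 0;
  }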

9. FIFO Ordering
• Rule:
  • Messages from the same sender are received in the order they were sent.
• Implementation:
  • Each message is assigned a per-sender sequence number (see the sketch below)
• Example:
  • TCP
• This is the weakest of the four orderings.
(Figure: m1–m4 take different routes through two routers, but the receiver still delivers them in send order.)
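A per-sender variant of the hold-back idea, sketched below with invented names: each sender numbers its own messages, and the receiver keeps one expected counter per sender, so FIFO order holds per sender while messages from different senders may interleave freely.

  #include <iostream>
  #include <map>
  #include <string>

  class FifoReceiver {
  public:
    void receive(int sender, int seq, const std::string& body) {
      holdback_[sender][seq] = body;
      int& next = next_[sender];                // expected sequence number for this sender (starts at 0)
      std::map<int, std::string>& q = holdback_[sender];
      while (q.count(next)) {                   // deliver this sender's messages in send order
        std::cout << "from " << sender << " deliver #" << next << ": " << q[next] << std::endl;
        q.erase(next++);
      }
    }
  private:
    std::map<int, std::map<int, std::string> > holdback_;
    std::map<int, int> next_;                   // operator[] default-initializes new counters to 0
  };

  int main() {
    FifoReceiver r;
    r.receive(0, 1, "m2");                      // out of order from sender 0: held back
    r.receive(1, 0, "x1");                      // sender 1's first message: delivered immediately
    r.receive(0, 0, "m1");                      // gap filled: m1 then m2 delivered
    return 0;
  }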

10. Why High-Level Message-Passing Tools?
• Data formatting
  • Data formatted into appropriate types at the user level
• Non-blocking communication
  • Polling and interrupts handled at the system-call level
• Process addressing
  • Inflexible, hardwired addressing with machine id + local id
• Group communication
  • A group server implemented at the user level
  • Broadcasting simulated by a repetition of one-to-one communication

11. PVM and MPI
• PVM: Parallel Virtual Machine
  • Developed in the 1980s
  • The pioneering library providing high-level message-passing functions
  • A PVM daemon process takes care of message transfer for user processes in the background
• MPI: Message Passing Interface
  • Defined in the 1990s
  • A specification of high-level message-passing functions
  • Several implementations available: mpich, mpi-lam
  • Library functions are linked directly into user programs (no background daemons)
• The detailed differences are described in PVMvsMPI.ps

12. Getting Started with MPI
• Website: http://www-unix.mcs.anl.gov/mpi/mpich/
• Create a hostfile:
  [mfukuda@UW1-320-00 mfukuda]$ vi hosts
    uw1-320-00
    uw1-320-01
    uw1-320-02
    uw1-320-03
• Compile a source program:
  [mfukuda@UW1-320-00 mfukuda]$ mpiCC source.cpp -o myProg
• Run the executable file:
  [mfukuda@UW1-320-00 mfukuda]$ mpirun -np 4 myProg args

13. Program Using MPI

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    MPI::Init(argc, argv);                      // Start MPI computation
    int rank = MPI::COMM_WORLD.Get_rank();      // Process ID (from 0 to #processes - 1)
    int size = MPI::COMM_WORLD.Get_size();      // # participating processes
    cout << "Hello World! I am " << rank << " of " << size << endl;
    MPI::Finalize();                            // Finish MPI computation
  }
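If compiled and launched as on the previous slide (mpirun -np 4 myProg), each of the four processes would print one line such as "Hello World! I am 2 of 4"; because the processes run concurrently, the four lines may appear in any order.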

14. MPI_Send and MPI_Recv

  void MPI::COMM_WORLD.Send(
      void*           message    /* in */,
      int             count      /* in */,
      MPI::Datatype   datatype   /* in */,
      int             dest       /* in */,
      int             tag        /* in */ )

  void MPI::COMM_WORLD.Recv(
      void*           message    /* out */,
      int             count      /* in */,
      MPI::Datatype   datatype   /* in */,
      int             source     /* in */,    // or MPI::ANY_SOURCE
      int             tag        /* in */,
      MPI::Status&    status     /* out */ )  // can be omitted

• MPI::Datatype: CHAR, SHORT, INT, LONG, UNSIGNED_CHAR, UNSIGNED_SHORT, UNSIGNED, UNSIGNED_LONG, FLOAT, DOUBLE, LONG_DOUBLE, BYTE, PACKED
• MPI::Status: the source, tag, and error code of a received message are available through status.Get_source(), status.Get_tag(), and status.Get_error()

15. MPI_Send and MPI_Recv

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    int tag0 = 0;
    MPI::Init(argc, argv);                                              // Start MPI computation
    if ( MPI::COMM_WORLD.Get_rank() == 0 ) {                            // rank 0 ... sender
      int loop = 3;
      MPI::COMM_WORLD.Send( "Hello World!", 13, MPI::CHAR, 1, tag0 );   // 13 chars, including '\0'
      MPI::COMM_WORLD.Send( &loop, 1, MPI::INT, 1, tag0 );
    } else {                                                            // rank 1 ... receiver
      int loop;
      char msg[13];
      MPI::COMM_WORLD.Recv( msg, 13, MPI::CHAR, 0, tag0 );
      MPI::COMM_WORLD.Recv( &loop, 1, MPI::INT, 0, tag0 );
      for ( int i = 0; i < loop; i++ )
        cout << msg << endl;
    }
    MPI::Finalize();                                                    // Finish MPI computation
  }

16. Message Ordering in MPI
• FIFO ordering within each message type: messages sent from the same source with the same tag are received in the order they were sent
• Messages of different types can be reordered, since the receiver picks the tag it wants to receive next (see the sketch below)
(Figure: the destination retrieves tag 3, then tag 2, then tag 1, the reverse of the send order.)
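A minimal two-process sketch of this reordering (the values and tags below are made up for illustration; it also assumes the MPI implementation buffers these small sends eagerly, so the sender does not block while the receiver waits for tag 3):

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    MPI::Init(argc, argv);
    if ( MPI::COMM_WORLD.Get_rank() == 0 ) {              // sender
      int v1 = 10, v2 = 20, v3 = 30;
      MPI::COMM_WORLD.Send( &v1, 1, MPI::INT, 1, 1 );     // tag 1
      MPI::COMM_WORLD.Send( &v2, 1, MPI::INT, 1, 2 );     // tag 2
      MPI::COMM_WORLD.Send( &v3, 1, MPI::INT, 1, 3 );     // tag 3
    } else {                                              // receiver picks tags in reverse order
      int v;
      MPI::COMM_WORLD.Recv( &v, 1, MPI::INT, 0, 3 );      // receives 30 first
      cout << v << endl;
      MPI::COMM_WORLD.Recv( &v, 1, MPI::INT, 0, 2 );      // then 20
      cout << v << endl;
      MPI::COMM_WORLD.Recv( &v, 1, MPI::INT, 0, 1 );      // then 10
      cout << v << endl;
    }
    MPI::Finalize();
  }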

17. MPI_Bcast

  void MPI::COMM_WORLD.Bcast(
      void*           message    /* in/out */,
      int             count      /* in */,
      MPI::Datatype   datatype   /* in */,
      int             root       /* in */ )

  Example: MPI::COMM_WORLD.Bcast( &msg, 1, MPI::INT, 2 );
(Figure: rank 2, the root, sends msg to ranks 0, 1, 3, and 4.)
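A complete, minimal program around the call above (the value 99 and the variable name msg are arbitrary, and at least three processes are assumed so that root rank 2 exists): rank 2 initializes msg, and Bcast copies it into every other rank's msg.

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    MPI::Init(argc, argv);
    int msg = 0;
    if ( MPI::COMM_WORLD.Get_rank() == 2 )
      msg = 99;                                           // only the root's value is significant
    MPI::COMM_WORLD.Bcast( &msg, 1, MPI::INT, 2 );        // root = rank 2
    cout << "rank " << MPI::COMM_WORLD.Get_rank()
         << " now has " << msg << endl;                   // every rank prints 99
    MPI::Finalize();
  }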

18. MPI_Reduce

  void MPI::COMM_WORLD.Reduce(
      void*           operand    /* in */,
      void*           result     /* out */,
      int             count      /* in */,
      MPI::Datatype   datatype   /* in */,
      MPI::Op         operator   /* in */,
      int             root       /* in */ )

• MPI::Op: MPI::MAX (maximum), MPI::MIN (minimum), MPI::SUM (sum), MPI::PROD (product), MPI::LAND (logical and), MPI::BAND (bitwise and), MPI::LOR (logical or), MPI::BOR (bitwise or), MPI::LXOR (logical xor), MPI::BXOR (bitwise xor), MPI::MAXLOC (max location), MPI::MINLOC (min location)

  Example: MPI::COMM_WORLD.Reduce( &msg, &result, 1, MPI::INT, MPI::SUM, 2 );
(Figure: ranks 0–4 hold 15, 10, 12, 8, and 4; the sum, 49, is collected at the root, rank 2.)
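A runnable sketch around the Reduce call above (the per-rank values are arbitrary, and at least three processes are assumed): every rank contributes one integer and the sum is collected at the root, rank 2, as in the figure.

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    MPI::Init(argc, argv);
    int rank = MPI::COMM_WORLD.Get_rank();
    int msg = (rank + 1) * 10;                            // each rank's local operand
    int result = 0;
    MPI::COMM_WORLD.Reduce( &msg, &result, 1, MPI::INT, MPI::SUM, 2 );
    if ( rank == 2 )                                      // only the root's result is meaningful
      cout << "sum = " << result << endl;
    MPI::Finalize();
  }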

19. MPI_Allreduce

  void MPI::COMM_WORLD.Allreduce(
      void*           operand    /* in */,
      void*           result     /* out */,
      int             count      /* in */,
      MPI::Datatype   datatype   /* in */,
      MPI::Op         operator   /* in */ )

(Figure: the reduction result is delivered to every rank, 0 through 7, not just to a single root.)
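The same reduction as on the previous slide, sketched with Allreduce: there is no root argument, and every rank ends up with the identical sum.

  #include <iostream.h>
  #include "mpi++.h"

  int main(int argc, char *argv[]) {
    MPI::Init(argc, argv);
    int msg = MPI::COMM_WORLD.Get_rank();                 // each rank contributes its own rank number
    int result = 0;
    MPI::COMM_WORLD.Allreduce( &msg, &result, 1, MPI::INT, MPI::SUM );
    cout << "rank " << msg << " sees sum = " << result << endl;   // identical value on every rank
    MPI::Finalize();
  }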

20. Exercises (No turn-in)
• Consider an application requiring both one-to-many and many-to-one communication.
• Consider an application requiring atomic multicast.
• Assume that four processes communicate with one another in causal ordering. Their current vectors are shown below. If process A sends a message, which processes can receive it immediately?
• Consider the pros and cons of PVM's daemon-based and MPI's library-linking-based message passing.
• Why can MPI maintain FIFO ordering?
