  1. Parallel CORBA Objects
  May 22nd, 2000
  ARC « Couplage »
  Christophe René (IRISA/IFSIC)

  2. Contents
  • Introduction
  • Parallel CORBA object concept
  • Performance evaluation
  • Encapsulation example
  • Conclusion

  3. Introduction
  • Objective
    • To design a Problem Solving Environment able to integrate a large number of codes that simulate a physical problem
    • To perform multi-physics simulations (code coupling)
  • Constraints
    • Simulation codes may be located on different machines: distributed processing
    • Simulation codes may require high-performance computers: parallel processing
  • Approach
    • Combine parallel and distributed technologies using a component approach (MPI + CORBA)

  4. CORBA Generalities
  • CORBA: Common Object Request Broker Architecture
  • Open standard for distributed object computing, defined by the OMG
  • Object-oriented software bus
  • Remote invocation mechanism
  • Hardware, operating system and programming language independence
  • Vendor independence (interoperability)
  • Problems to face
    • Performance issues
    • Poor integration of high-performance computing environments with CORBA

  5. How does CORBA work?
  • Interface Definition Language (IDL)
    • Describes the remote object
  • IDL compiler
    • Generates the stub and skeleton code
  • IDL stub (proxy)
    • Handles the remote invocation on the client side
  • IDL skeleton
    • Links the object implementation to the ORB

    interface MatrixOperations {
      const long SIZE = 100;
      typedef double Vector[ SIZE ];
      typedef double Matrix[ SIZE ][ SIZE ];
      void multiply( in Matrix A, in Vector B, out Vector C );
    };

  [Figure: the IDL compiler generates the client-side stub and the server-side skeleton; a client invocation travels through the stub, the Object Request Broker (ORB) and the object adapter (OA) to the object implementation on the server.]
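  To make the invocation path concrete, here is a minimal client sketch, not from the slides: it assumes the C++ stubs generated from the MatrixOperations IDL above (the header name varies between ORBs) and a stringified object reference passed as argv[1].

    // Minimal CORBA client sketch (assumptions noted above).
    #include "MatrixOperations.h"   // generated by the IDL compiler (name varies)

    int main( int argc, char* argv[] )
    {
      // Initialize the ORB and convert the stringified reference (IOR).
      CORBA::ORB_var orb = CORBA::ORB_init( argc, argv );
      CORBA::Object_var obj = orb->string_to_object( argv[ 1 ] );

      // Narrow the generic reference to the MatrixOperations interface.
      MatrixOperations_var ops = MatrixOperations::_narrow( obj );

      MatrixOperations::Matrix A;   // filled in by the application
      MatrixOperations::Vector B, C;

      // The stub marshals A and B, sends the request through the ORB,
      // and unmarshals C from the reply.
      ops->multiply( A, B, C );
      return 0;
    }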

  6. Encapsulating MPI-based parallel codes into CORBA objects
  • Master/slave approach
    • One SPMD process acts as the master whereas the others act as slaves
    • The master drives the execution of the slaves through message passing
  • Drawbacks
    • Lack of scalability when communicating through the ORB
    • Requires modifications to the original MPI code
  • Advantage
    • Can be used with any CORBA implementation

  [Figure: a client invokes, through the CORBA ORB, the stub/skeleton pair of the MPI master process; the master's scheduler drives the SPMD slave processes over the MPI communication layer.]
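  A minimal sketch of this master/slave pattern, with hypothetical names (the slides do not show this code): the CORBA method body on the master broadcasts a command to the MPI slaves, which wait in a service loop.

    // Hedged sketch of the master/slave pattern (all names hypothetical).
    #include <mpi.h>

    void spmd_multiply();   // the original SPMD kernel (hypothetical)

    enum Command { CMD_MULTIPLY = 1, CMD_STOP = 2 };

    // On the master (rank 0), the body of the CORBA method wakes up the
    // slaves with a command, then joins the SPMD computation itself.
    void master_invoke_multiply()
    {
      int cmd = CMD_MULTIPLY;
      MPI_Bcast( &cmd, 1, MPI_INT, 0, MPI_COMM_WORLD );
      spmd_multiply();
    }

    // Every slave rank blocks in this service loop, waiting for commands.
    void slave_loop()
    {
      for( ;; ) {
        int cmd;
        MPI_Bcast( &cmd, 1, MPI_INT, 0, MPI_COMM_WORLD );
        if( cmd == CMD_STOP ) break;
        if( cmd == CMD_MULTIPLY ) spmd_multiply();
      }
    }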

  7. Master/slave approach in detail
  • The master has to
    • Select the method to invoke within the slave processes
    • Scatter data to the slave processes
    • Gather data from the slave processes
  • Master process: CORBA + MPI initialization
  • Slave processes: MPI initialization

  [Figure: the client stub talks through the CORBA ORB to the skeleton of the MPI master process; a dispatcher in each MPI slave process receives the selected method and its data from the scheduler over the MPI communication layer.]

  8. Parallel CORBA object concept
  • A collection of identical CORBA objects
  • Transparent to the client
  • Parallel remote invocation
  • Data distribution

  [Figure: the stub of a sequential client issues one request per object of the collection through the CORBA ORB; each object of the parallel server runs an SPMD code behind its own skeleton and object adapter, and the SPMD codes communicate through the MPI communication layer.]

  9. Problems to face
  • Communication between
    • a sequential client and a parallel server
    • a parallel client and a sequential server
    • a parallel client and a parallel server
  • Implementation constraint
    • Do not modify the ORB core, to keep the interoperability features
  • Approach
    • Modify the stub and skeleton code
    • Extend the IDL compiler

  10. Extended-IDL
  • Collection specification
    • Size specification = number of requests to send
    • Shape specification, used to distribute arrays
  • Data distribution specification
    • Scatter and gather the elements of an array
  • Reduction operator specification
    • Perform collective operations on the request replies

  11. Specifying the number of objects in a collection
  • Several ways:
    • an integer value
    • an interval of integer values
    • a mathematical function (power, exponential, multiple)
    • the character “*”

    interface[ 4 ] Example1 { /* ... */ };
    interface[ 2 .. 8 ] Example2 { /* ... */ };
    interface[ 2 ^ n ] Example3 { /* ... */ };
    interface[ * ] Example4 { /* ... */ };

  12. Shape of the object collection
  • The shape depends on the data distribution specification, but users may add special requirements
  • How can we organize 8 objects? (for instance as 1×8, 2×4, 4×2 or 8×1 grids)

  13. Shape of the object collection (cont’d)
  • Specification of the shape: the size of each dimension is given as
    • an integer value
    • a mathematical function (e.g. a multiple)
    • a dependence between dimensions

    interface[ 8: 2, 4 ] Example1 { /* ... */ };
    interface[ *: 2 ] Example2 { /* ... */ };
    interface[ *: *, 2 ] Example3 { /* ... */ };
    interface[ *: 2 * n ] Example4 { /* ... */ };
    interface[ x ^ 2: n, n ] Example5 { /* ... */ };

  14. Inheritance mechanism
  • Allowed only under some constraints:
    • the numbers of processors must match
    • the shapes of the virtual node arrays must match

  • Inheritance allowed:

    interface[ * ] Example1 { /* ... */ };
    interface[ 2 ^ n ] Example2 : Example1 { /* ... */ };

    interface[ *: 2 ] Example1 { /* ... */ };
    interface[ *: 3 ] Example2 : Example1 { /* ... */ };

  • Inheritance not allowed:

    interface[ 2 ^ n ] Example1 { /* ... */ };
    interface[ * ] Example2 : Example1 { /* ... */ };

    interface[ * ] Example1 { /* ... */ };
    interface[ * : 2 ] Example2 : Example1 { /* ... */ };

  15. Specifying data distribution
  • New keyword: dist
  • Only arrays and sequences may be distributed
  • Available distribution modes:
    • BLOCK
    • BLOCK( size )
    • CYCLIC
    • CYCLIC( size )
    • “*”

    interface[ * ] Example {
      typedef double Arr1[ 8 ];
      typedef Arr1 Arr2[ 8 ];
      typedef sequence< double > Seq;
      void Op1( in dist[ CYCLIC ] Arr1 A, in Arr1 B,
                out dist[ BLOCK ][ * ] Arr2 C );
      void Op2( in dist[ BLOCK ] Seq A, inout Seq B );
    };

  16. Distribution examples on 2 processors
  • BLOCK( 5 ): blocks of 5 consecutive elements
  • BLOCK = BLOCK( BlockSize ), with BlockSize = ( ArrayLength + ProcNb - 1 ) / ProcNb
  • CYCLIC( 3 ): blocks of 3 elements dealt out to the processors in round-robin fashion
  • CYCLIC = CYCLIC( 1 )
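  As a concrete illustration (not from the slides), the sketch below computes which processor owns each element of an 8-element array on 2 processors, following the HPF-style rules above.

    // Hedged sketch: owner computation for BLOCK and CYCLIC distributions.
    #include <cstdio>

    int block_owner( int i, int length, int procs )
    {
      // BLOCK = BLOCK( BlockSize ) with the formula from the slide.
      int blockSize = ( length + procs - 1 ) / procs;
      return i / blockSize;
    }

    int cyclic_owner( int i, int procs, int size = 1 )  // CYCLIC = CYCLIC( 1 )
    {
      return ( i / size ) % procs;
    }

    int main()
    {
      // 8 elements on 2 processors:
      //   BLOCK       -> owners 0 0 0 0 1 1 1 1
      //   CYCLIC      -> owners 0 1 0 1 0 1 0 1
      //   CYCLIC( 3 ) -> owners 0 0 0 1 1 1 0 0
      for( int i = 0; i < 8; ++i )
        std::printf( "%d: block=%d cyclic=%d cyclic3=%d\n", i,
                     block_owner( i, 8, 2 ), cyclic_owner( i, 2 ),
                     cyclic_owner( i, 2, 3 ) );
      return 0;
    }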

  17. Mapping
  • Vector distribution on a processor matrix
  • Extended-IDL specification:

    interface[ * ] Example {
      typedef double Arr[ 8 ];
      void Op1( in dist[ BLOCK ] Arr A );
      void Op2( in dist[ BLOCK, 2 ] Arr A );
    };

  [Figure: mapping of the 8-element vector onto the processor matrix for Op1 and Op2; the elements map to processors as 0 0 1 1 2 2 3 3.]

  18. Mapping (cont’d)
  • Specifications allowed:

    typedef double Arr1[ 8 ];
    typedef Arr1 Arr2[ 8 ];
    interface[ * ] Example1 {
      void Op1( in dist[ * ][ CYCLIC ] Arr2 A );
      void Op2( in dist[ CYCLIC, 2 ][ * ] Arr2 A );
    };
    interface[ * ] Example2 {
      void Op1( in dist[ BLOCK, 2 ][ BLOCK, 1 ] Arr2 A,
                out dist[ BLOCK ][ BLOCK ] Arr2 B );
    };

  • Specification not allowed:

    interface[ * ] Example3 {
      void Op2( in dist[ CYCLIC, 2 ][ CYCLIC ] Arr2 A );
    };

  19. Reduction operators
  • Available reduction operators:
    • min, max
    • addition (sum), multiplication (prod)
    • bitwise operations (and, or, xor)
    • logical operations (and, or, xor)

    interface[ * ] Example1 {
      typedef double Arr[ 8 ];
      cland boolean Op1( in dist[ BLOCK ] Arr A, in double B );
      void Op2( in dist[ CYCLIC ] Arr A, inout cmin double B );
      void Op3( in dist[ CYCLIC( 3 ) ] Arr A, out csum double B );
    };
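  The reduction is applied to the replies of the parallel invocation. A hedged sketch of what the client side could do for a csum double result (a hypothetical helper, not the generated code):

    // Hedged sketch: the stub collects one reply per object of the
    // collection and reduces them; shown here for csum (a sum).
    #include <numeric>
    #include <vector>

    double reduce_csum( const std::vector<double>& replies )
    {
      // csum: add the partial results returned by the member objects.
      return std::accumulate( replies.begin(), replies.end(), 0.0 );
    }

    // For cmin, the stub would instead keep the smallest reply:
    //   *std::min_element( replies.begin(), replies.end() );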

  20. Summary
  • Standard IDL:

    interface MatrixOperations {
      const long SIZE = 100;
      typedef double Vector[ SIZE ];
      typedef double Matrix[ SIZE ][ SIZE ];
      void multiply( in Matrix A, in Vector B, out Vector C );
      double minimum( in Vector A );
    };

  • With the collection specification:

    interface[ * ] MatrixOperations {
      const long SIZE = 100;
      typedef double Vector[ SIZE ];
      typedef double Matrix[ SIZE ][ SIZE ];
      void multiply( in Matrix A, in Vector B, out Vector C );
      double minimum( in Vector A );
    };

  • With the data distribution specification:

    interface[ * ] MatrixOperations {
      const long SIZE = 100;
      typedef double Vector[ SIZE ];
      typedef double Matrix[ SIZE ][ SIZE ];
      void multiply( in dist[ BLOCK ][ * ] Matrix A, in Vector B,
                     out dist[ BLOCK ] Vector C );
      double minimum( in Vector A );
    };

  • With the reduction operator specification:

    interface[ * ] MatrixOperations {
      const long SIZE = 100;
      typedef double Vector[ SIZE ];
      typedef double Matrix[ SIZE ][ SIZE ];
      void multiply( in dist[ BLOCK ][ * ] Matrix A, in Vector B,
                     out dist[ BLOCK ] Vector C );
      csum double minimum( in Vector A );
    };

  21. Code generation problems
  • New type for distributed parameters: the distributed array
    • The amount of data to be sent to each remote object is only known at runtime
    • An extension of the CORBA sequence
    • The data distribution specification is stored in the distributed array
  • Skeleton code generation
    • Provide access to the data distribution specification
  • Stub code generation
    • Scatter and gather data among the remote objects
    • Manage the remote operation invocations
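  A hedged sketch of what such a distributed-array type could look like (hypothetical names; the actual generated types are Matrix_DArray and the like): sequence-style storage of the local piece plus the distribution specification.

    // Hedged sketch of a distributed array (all names hypothetical).
    #include <cstddef>
    #include <vector>

    enum DistMode { DIST_NONE, DIST_BLOCK, DIST_CYCLIC };

    struct DimDistribution {
      DistMode    mode;       // BLOCK, CYCLIC or * (none)
      std::size_t blockSize;  // e.g. CYCLIC( 3 )
    };

    template< typename T >
    class DArray {
    public:
      // Like a CORBA sequence, the length is set at runtime: only the
      // relevant piece of the distributed parameter is transmitted.
      void length( std::size_t n )   { data_.resize( n ); }
      std::size_t length() const     { return data_.size(); }
      T& operator[]( std::size_t i ) { return data_[ i ]; }

      // One entry per array dimension; the skeleton reads this to know
      // how the received data is distributed.
      std::vector<DimDistribution> distribution;

    private:
      std::vector<T> data_;
    };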

  22. Stub code generation
  • The client calls the operation once on the parallel CORBA object:

    pco->multiply( A, B, C );

  • From the Extended-IDL specification

    void multiply( in dist[ BLOCK ][ * ] Matrix A, in Vector B,
                   out dist[ BLOCK ] Vector C );

    the stub scatters A by blocks, replicates B, sends one request per object of the collection (#1, #2, ...), and gathers the blocks of C from the replies.

  [Figure: signatures along the invocation path. Client-side stub: void multiply( const Matrix A, const Vector B, Vector C ). Skeleton: void multiply( const Matrix_DArray A, const Vector_DArray B, Vector_DArray C ). Object implementation: void multiply( const Matrix_Seq A, const Vector_Seq B, Vector_Seq C ). The SPMD codes behind the skeletons communicate through the MPI communication layer.]

  23. Parallel CORBA object as client
  • Stub code generation when the client is parallel
    • Assignment of remote object references to the stubs
    • Use of distributed data types as operation parameters in the stubs
    • Exchange of data through MPI by the stubs
      • to build the requests
      • to propagate the results

  24. Parallel CORBA object as client (cont’d)
  • With a sequential server, a single client process does most of the work:
    • gather the distributed data from the other processes
    • send the single request
    • scatter the distributed data back to the other processes
    • broadcast the values of non-distributed data

  [Figure: the three SPMD codes of the parallel client each hold a stub; only one stub sends the request through the CORBA ORB to the sequential server's object implementation, and the stubs exchange data over the MPI communication layer.]
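  A hedged MPI sketch of this gather / invoke / redistribute pattern (names such as ServerProxy and localBlock are hypothetical, and a BLOCK-distributed double array is assumed):

    #include <mpi.h>

    // Hypothetical stand-in for the generated CORBA stub of the server.
    struct ServerProxy { void compute( const double* data, int len ); };
    extern ServerProxy* server;

    // Called by every process of the parallel client; localBlock is this
    // process's piece of the BLOCK-distributed parameter.
    void invoke_from_parallel_client( double* localBlock, int blockLen,
                                      double* fullArray, int fullLen )
    {
      int rank;
      MPI_Comm_rank( MPI_COMM_WORLD, &rank );

      // 1. Gather the distributed data on the single invoking process.
      MPI_Gather( localBlock, blockLen, MPI_DOUBLE,
                  fullArray, blockLen, MPI_DOUBLE, 0, MPI_COMM_WORLD );

      // 2. Only rank 0 sends the single request to the sequential server.
      if( rank == 0 )
        server->compute( fullArray, fullLen );

      // 3. Scatter the (possibly updated) data back; non-distributed
      //    results would be propagated with MPI_Bcast.
      MPI_Scatter( fullArray, blockLen, MPI_DOUBLE,
                   localBlock, blockLen, MPI_DOUBLE, 0, MPI_COMM_WORLD );
    }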

  25. Parallel CORBA object as client (cont’d)
  • With a parallel server, the p client requests are dispatched among the n server objects (cyclic distribution):
    • p < n: data distribution handled by the stub
    • p > n: data distribution handled by the skeleton
    • p = n: user's choice

  [Figure: a parallel client of size p invokes a parallel server of size n; the p stubs send their requests through the CORBA ORB to the n skeletons, and each side communicates internally through its MPI communication layer.]
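  One plausible reading of the cyclic dispatch (an illustration, not the actual PaCO code): request k is sent to object k mod n, so when p > n every object serves several requests.

    // Hedged sketch: cyclic dispatch of p requests over n objects.
    #include <cstdio>

    int main()
    {
      const int p = 8, n = 3;   // illustrative sizes
      for( int k = 0; k < p; ++k )
        std::printf( "request %d -> object %d\n", k, k % n ); // cyclic
      return 0;
    }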

  26. Naming Service
  • Currently (as defined by the OMG):
    • Provides methods to access a remote object through a symbolic name
    • Associates a symbolic name with one object reference, and only one
  • Our needs:
    • Associate a symbolic name with a collection of object references
  • Implementation constraint:
    • The object reference to the standard Naming Service and to the parallel Naming Service must be the same:

    orb->resolve_initial_references( "NameService" );

  • Our solution:
    • Add new methods to the Naming Service interface

  27. Extension to the Naming Service
  • Extension to the CosNaming IDL specification:

    module CosNaming {
      ...
      interface NamingContext {
        ...
        typedef sequence< Object > ObjectCollection;
        void join_collection( in Name n, in Object obj );
        void leave_collection( in Name n, in Object obj );
        ObjectCollection resolve_collection( in Name n );
      };
    };

  • Server side:

    Example_impl* obj = new Example_impl();
    NamingService->join_collection( Matrix_name, obj );
    ...
    NamingService->leave_collection( Matrix_name, obj );

  • Client side:

    objs = NamingService->resolve_collection( Matrix_name );
    srv = Example::_narrow( objs );
    ...
    srv->op1( A, B, C );

  28. Implementation
  • Based on the MICO implementation of CORBA
  • Library (not included in the ORB core)
    • Parallel CORBA object base class
    • Functions to handle distributed data
    • Data redistribution library interface
  • Extended-IDL compiler (an extension of the MICO IDL compiler)
    • Parser
    • Semantic analyzer
    • Code generator
  • Experimental platforms
    • Cluster of PCs
    • Parallel machine (NEC Cenju)

  29. Comparison between CORBA and MPI
  • Benchmark: send / receive

    interface Bench {
      typedef sequence< long > Vector;
      void sendrecv( in Vector in_a, out Vector out_a );
    };

  • Platform:
    • two dual-Pentium III 500 MHz PCs
    • Ethernet 100 Mb/s
  • Latency:
    • MPI: 0.35 ms
    • CORBA: 0.52 ms
  • Differences due to:
    • protocol
    • memory allocation

  [Figure: throughput (Mb/s, axis up to 90) versus message size (0 to 80,000 bytes); the MPI curve lies above the CORBA curve.]
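  For reference, a minimal MPI version of such a send/receive benchmark might look as follows (a sketch with illustrative message size and repetition count, not the code actually used for the measurements):

    // Hedged sketch of an MPI send/receive (ping-pong) benchmark.
    #include <mpi.h>
    #include <cstdio>
    #include <vector>

    int main( int argc, char* argv[] )
    {
      MPI_Init( &argc, &argv );
      int rank;
      MPI_Comm_rank( MPI_COMM_WORLD, &rank );

      const int len = 10000;          // message size, in longs
      const int reps = 100;
      std::vector<long> buf( len );

      double t0 = MPI_Wtime();
      for( int i = 0; i < reps; ++i ) {
        if( rank == 0 ) {
          MPI_Send( buf.data(), len, MPI_LONG, 1, 0, MPI_COMM_WORLD );
          MPI_Recv( buf.data(), len, MPI_LONG, 1, 0, MPI_COMM_WORLD,
                    MPI_STATUS_IGNORE );
        } else if( rank == 1 ) {
          MPI_Recv( buf.data(), len, MPI_LONG, 0, 0, MPI_COMM_WORLD,
                    MPI_STATUS_IGNORE );
          MPI_Send( buf.data(), len, MPI_LONG, 0, 0, MPI_COMM_WORLD );
        }
      }
      if( rank == 0 )
        std::printf( "mean round trip: %g ms\n",
                     ( MPI_Wtime() - t0 ) * 1000.0 / reps );
      MPI_Finalize();
      return 0;
    }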

  30. Performance evaluation (cont’d)
  • Four experiments coupling two MPI codes (Code 1 and Code 2):
    • master/slave encapsulation, coupling through file exchange, ASCII files
    • master/slave encapsulation, coupling through file exchange, XDR files
    • master/slave encapsulation, coupling through the ORB
    • parallel CORBA objects, coupling through the ORB

  [Figure: in the master/slave setups, each code's MPI master process carries the stub/skeleton pair and drives its slave processes through a scheduler; in the parallel CORBA object setup, every SPMD process of each code has its own object implementation, skeleton and object adapter, and the two collections exchange data directly through the CORBA ORB.]

  31. Performance evaluation (cont’d)
  • Matrix order = 256; element type = long

  [Figure: two charts of coupling time (ms, axes up to 250 and 2500) versus the number of objects in the collection (1, 2, 4, 8), for the four experiments (ASCII, XDR, ORB, PaCO); the annotated throughputs are approximately 37 Mb/s, 47 Mb/s, 56 Mb/s and 224 Mb/s.]

  32. Performance evaluation (cont’d)
  • Matrix orders = 512 and 1024

  [Figure: coupling time (ms) versus the number of objects in the collection (1, 2, 4, 8), ORB versus PaCO, for matrix order 512 (axis up to 900 ms) and matrix order 1024 (axis up to 3500 ms); the annotated throughputs are approximately 258 Mb/s and 293 Mb/s.]

  33. Encapsulation example
  Step 1: Adapt the original source code
  • Remove the invocations of MPI_Init() and MPI_Finalize()
  • Rename the main function

  Original code:

    int main( int argc, char* argv[] )
    {
      /* ... */
      MPI_Init( &argc, &argv );
      MPI_Comm_rank( MPI_COMM_WORLD, &id );
      MPI_Comm_size( MPI_COMM_WORLD, &size );
      /* ... */
      MPI_Send( ... );
      MPI_Recv( ... );
      /* ... */
      MPI_Finalize();
    }

  New code:

    int app_main( int argc, char* argv[] )
    {
      /* ... */
      /* MPI_Init( &argc, &argv ); */
      MPI_Comm_rank( MPI_COMM_WORLD, &id );
      MPI_Comm_size( MPI_COMM_WORLD, &size );
      /* ... */
      MPI_Send( ... );
      MPI_Recv( ... );
      /* ... */
      /* MPI_Finalize(); */
    }

  34. Encapsulation example (cont’d)
  Step 2: Define the IDL interface

    typedef sequence< string > arg_type;
    interface[ * ] Wrapper {
      void compute( in string directory, in arg_type arguments );
      void stop();
    };

  Step 3: Write the method implementation

    void Wrapper_impl::compute( const char* directory,
                                const arg_type_DArray& arguments )
    {
      int argc = arguments.length() + 1;
      char** argv = new char*[ argc ];
      argv[ 0 ] = strdup( "..." ); /* application name */
      for( int i0 = 1; i0 < argc; ++i0 )
        argv[ i0 ] = strdup( arguments[ i0 - 1 ] );
      chdir( directory );
      app_main( argc, argv );
      for( int i1 = 0; i1 < argc; ++i1 )
        free( argv[ i1 ] );
      delete [] argv;
    }

  35. Encapsulation example (cont’d)
  Step 4: Write the main function of the server

    int main( int argc, char* argv[] )
    {
      /* ... */
      PaCO_DL_init( &argc, &argv );
      orb = ORB_init( argc, argv );
      /* ... */
      srv = new Wrapper_impl();
      /* ... */
      orb->run();
      /* ... */
      PaCO_DL_exit();
    }

  Step 5: Write the client

    int main( int argc, char* argv[] )
    {
      /* ... */
      arg_type arguments;
      arguments.length( 2 );
      arguments[ 0 ] = strdup( "..." ); /* arg 1 */
      arguments[ 1 ] = strdup( "..." ); /* arg 2 */
      pos->compute( working_directory, arguments );
      /* ... */
    }

  36. Conclusion
  • We have shown that MPI and CORBA can be combined for distributed and parallel programming
  • Our implementation depends on the underlying CORBA product: a standardized API for the ORB is needed
  • Response to the OMG RFI “Supporting Aggregated Computing in CORBA”
