
Climate Simulation using Ninf-G on the ApGrid Testbed

Yoshio Tanaka, Hiroshi Takemiya, Kazuyuki Shudo, Satoshi Sekiguchi. Grid Technology Research Center, AIST.





Presentation Transcript


  1. Climate Simulation using Ninf-G on the ApGrid Testbed
     Yoshio Tanaka, Hiroshi Takemiya, Kazuyuki Shudo, Satoshi Sekiguchi
     Grid Technology Research Center, AIST

  2. Elements of this DEMO
     • Application: Climate Simulation
       • Originally developed by Dr. Tanaka (U. of Tsukuba)
     • Portal: Grid PSE Builder
       • Any Unix-command application can be integrated into a Web portal
     • Middleware used to implement the Grid-enabled climate simulation: Ninf-G
       • GridRPC middleware based on the Globus Toolkit, used here to gridify the original (sequential) application
     • Testbed: ApGrid Testbed
       • International Grid testbed spanning the Asia-Pacific region

  3. Application: Climate Simulation
     • Goal
       • Long-term, global climate simulation
         • Winding of the jet stream
         • Blocking phenomenon of high atmospheric pressure
     • Barotropic S-Model
       • Climate simulation model proposed by Prof. Tanaka
       • Simple and precise
         • Models complicated 3D turbulence as horizontal turbulence
         • Keeps high precision over long periods
     • Taking a statistical ensemble mean
       • Several hundred simulations
       • Introducing a perturbation at every time step
     • Typical parameter survey
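The ensemble-mean procedure on this slide can be sketched in C as follows. This is only an illustration under stated assumptions: `run_member` is a hypothetical toy model (a scalar relaxing toward 1.0), not the Barotropic S-Model, and the noise generator, step count and amplitude are made-up parameters.

```c
#include <assert.h>

/* Tiny deterministic pseudo-random generator (an LCG).
 * Returns a value in [-0.5, 0.5). Illustrative only. */
static double next_noise(unsigned long *state)
{
    *state = (*state * 1103515245UL + 12345UL) & 0x7fffffffUL;
    return (double)*state / 2147483648.0 - 0.5;
}

/* Hypothetical stand-in for one simulation run: a scalar that relaxes
 * toward 1.0 and is perturbed at every time step.
 * This is NOT the Barotropic S-Model. */
double run_member(unsigned long seed, int steps, double amplitude)
{
    unsigned long state = seed;
    double x = 0.0;
    for (int t = 0; t < steps; t++) {
        x += 0.1 * (1.0 - x);                 /* deterministic part */
        x += amplitude * next_noise(&state);  /* perturbation at this step */
    }
    return x;
}

/* Ensemble mean: average the final state over `members` independent runs,
 * each started from a different seed. */
double ensemble_mean(int members, int steps, double amplitude)
{
    assert(members > 0);
    double sum = 0.0;
    for (int m = 0; m < members; m++)
        sum += run_member((unsigned long)m + 1UL, steps, amplitude);
    return sum / members;
}
```

Calling `ensemble_mean(300, 100, 0.01)` averages 300 perturbed runs. Because the runs are independent of one another, each `run_member` call is a natural unit of work to hand to a remote server.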

  4. PSE: Grid PSE Builder
     • Generates a web interface for running a Unix-command application.
     • The interface is written in XML:
       <application>
         <appname>ls</appname>
         <argspec>/bin/ls %option% %width%</argspec>
         <arglist>
           <args use="required">
             <title>option</title>
             <radio name="option">
               <option value="-a">do not hide entries …</option>
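The slides do not say how Grid PSE Builder turns the `<argspec>` template into a command line. As a rough sketch, assuming plain textual substitution of `%name%` placeholders with the values chosen in the web form, the step might look like this. The function `expand_argspec` and the struct `arg_value` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <string.h>

/* One placeholder name and the value the user selected in the form. */
struct arg_value {
    const char *name;
    const char *value;
};

/* Expand %name% placeholders in `spec` into `out` (capacity `cap` bytes).
 * Returns 0 on success, or -1 if the output would overflow, a placeholder
 * is unterminated, or a placeholder name is unknown. */
int expand_argspec(const char *spec, const struct arg_value *args,
                   size_t nargs, char *out, size_t cap)
{
    size_t used = 0;
    assert(cap > 0);
    out[0] = '\0';
    while (*spec != '\0') {
        if (*spec != '%') {                 /* ordinary character: copy it */
            if (used + 1 >= cap) return -1;
            out[used++] = *spec++;
            out[used] = '\0';
            continue;
        }
        const char *end = strchr(spec + 1, '%');
        if (end == NULL) return -1;         /* unterminated placeholder */
        size_t len = (size_t)(end - spec - 1);
        const char *value = NULL;
        for (size_t i = 0; i < nargs; i++) {
            if (strlen(args[i].name) == len &&
                strncmp(args[i].name, spec + 1, len) == 0) {
                value = args[i].value;
                break;
            }
        }
        if (value == NULL) return -1;       /* unknown placeholder */
        size_t vlen = strlen(value);
        if (used + vlen >= cap) return -1;
        memcpy(out + used, value, vlen + 1); /* copies the trailing NUL too */
        used += vlen;
        spec = end + 1;
    }
    return 0;
}
```

With `option` set to `-a` and `width` set to `-w 80`, the template `/bin/ls %option% %width%` expands to `/bin/ls -a -w 80`. A real portal would also have to validate or quote the values before handing the string to a shell.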

  5. PSE: Grid PSE Builder (cont’d)
     [Architecture diagram. Labels: user; client auth.; HTTP server + Servlet (Apache + Tomcat); Grid PSE Core; SignOn/SignOff; Job Control (submission/query/cancel); Job Queuing Manager & Signing Server; globusrun; JDBC Interface (TCP/IP); Accounting DB (PostgreSQL); accounting information]

  6. Middleware: Ninf-G (GridRPC System)
     • Utilization of remote supercomputers
       ① Call remote procedures (call remote libraries)
       ② Notify results
     • Large-scale computing utilizing multiple supercomputers on the Grid
     [Diagram labels: user; Internet]

  7. Middleware: Ninf-G (cont’d)
     • Requires no detailed knowledge of the Grid infrastructure
     • RPC library on the Grid
       • Built on top of the Globus Toolkit
         • MDS: managing stub information
         • GRAM: invocation of server programs
         • GSI: secure communication between a client and a server
     • Simple and easy-to-use programming interface
       • Hides the complicated mechanisms of the Grid
       • Provides RPC semantics
     • Sequential version:
       for (i = start; i <= end; i++) {
           /* sequential search */
           SDP_search(argv[1], i, &value[i]);
       }
     • Gridified version:
       grpc_function_handle_init(&hdl, …, "SDP/search");
       for (i = start; i <= end; i++) {
           /* parallel search using async. call */
           grpc_call_async(&hdl, argv[1], i, &value[i]);
       }
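The gridified loop on this slide only issues the calls. With asynchronous RPC, the results in `value[i]` are not valid until the client has waited for the outstanding calls (GridRPC provides `grpc_wait_all()` for this). The sketch below imitates that control flow with a purely local mock so that it compiles without Ninf-G. Every name in it (`mock_search`, `mock_call_async`, `mock_wait_all`, `parallel_search`) is hypothetical and is not part of the Ninf-G API, and the mock runs the work sequentially at wait time instead of on remote servers.

```c
#include <assert.h>

#define MAX_PENDING 64

/* Hypothetical stand-in for the remote routine; NOT the real SDP/search. */
static double mock_search(int i)
{
    return (double)i * (double)i;
}

/* A pending call: which input to use and where to store the result. */
struct pending_call {
    int input;
    double *result;
};

static struct pending_call queue[MAX_PENDING];
static int n_pending = 0;

/* Mock asynchronous call: records the request and returns immediately.
 * The result slot is NOT filled in yet.
 * Returns 0 on success, -1 if the queue is full. */
int mock_call_async(int input, double *result)
{
    if (n_pending >= MAX_PENDING) return -1;
    queue[n_pending].input = input;
    queue[n_pending].result = result;
    n_pending++;
    return 0;
}

/* Mock wait-for-all: completes every outstanding request and
 * returns how many were completed. */
int mock_wait_all(void)
{
    int done = n_pending;
    for (int k = 0; k < n_pending; k++)
        *queue[k].result = mock_search(queue[k].input);
    n_pending = 0;
    return done;
}

/* Issue one call per index in start..end, then wait before any
 * value[i] is read. Returns the number of completed calls, or -1. */
int parallel_search(int start, int end, double *value)
{
    for (int i = start; i <= end; i++)
        if (mock_call_async(i, &value[i]) != 0) return -1;
    return mock_wait_all();
}
```

The point of the pattern is the ordering: all calls are issued first so they can overlap, and results are read only after the wait returns.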

  8. Testbed: ApGrid Testbed
     http://www.apgrid.org/

  9. Ninfy the original (seq.) climate simulation
     • Dividing the program into two parts as a client-server system
       • Client
         • Pre-processing: reading input data
         • Post-processing: averaging the results of the ensembles
       • Server
         • Climate simulation, visualization
     [Diagram. S-model program labels: Reading data; Solving Equations (×3); Averaging results; Visualize. Other labels: user; Web browser; Grid Lib; Ninf-G]

  10. Testbed
      • UME Cluster (AIST): jobmanager-grd, 40 CPUs + 20 CPUs
      • AMATA Cluster (KU): jobmanager-sqms, 6 CPUs
      • Galley Cluster (Doshisha U.): jobmanager-pbs, 10 CPUs
      • Gideon Cluster (HKU): jobmanager-pbs, 15 CPUs
      • PRESTO Cluster (TITECH): jobmanager-pbs, 4 CPUs
      • VENUS Cluster (KISTI): jobmanager-pbs, 16 CPUs
      • ASE Cluster (NCHC): jobmanager-fork, 2 CPUs

  11. Climate Simulation
      [Diagram labels: client; server]
      • Front node
        - public IP
        - Globus
        - gatekeeper
        - jobmanager
        - pbs, grd, sqms
        - NAT
      • Backend nodes
        - private IP or public IP
        - Globus SDK
        - Ninf-G Lib

  12. Lessons Learned
      • Difficulties caused by the bottom-up approach and by problems installing the Globus Toolkit
        • Most resources are not dedicated to the ApGrid Testbed.
        • Each site’s policy should be respected.
      • There were requirements to modify software configurations, environments, etc.
        • Upgrading the Globus Toolkit (GT1.1.4 -> GT2.0 -> GT2.2)
        • Applying patches, installing additional packages
        • Building bundles using other flavors
      • Different users have different requirements for the Globus Toolkit.
        • Middleware developers need the newest version.
        • Application developers are satisfied with a stable (older) version.
        • It is not easy to keep up with the frequent version upgrades of the Globus Toolkit.
      • The ApGrid software package should solve some of these problems.

  13. Lessons Learned (cont’d)
      • Problems in scalability
        • Initialization of function handles
          • Initializing a function handle takes from several seconds to several tens of seconds
            • Overhead caused by contacting the gatekeeper (GSI authentication) and invoking a jobmanager
            • Overhead caused by MDS lookup
          • The current Ninf-G implementation needs to contact the gatekeeper once per function handle, one by one
          • Although Globus GRAM can invoke multiple jobs with a single contact to the gatekeeper, the GRAM API is not sufficient to control each job individually.
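To see why one-by-one handle initialization limits scalability, a back-of-the-envelope estimate helps. The sketch below assumes every gatekeeper contact costs the same fixed time and that contacts happen strictly in sequence. The second function models a hypothetical scheme with one contact per cluster, which the slide says the GRAM API did not adequately support. The function names and the numbers in the usage note are illustrative, not measurements from the demo.

```c
#include <assert.h>

/* Total start-up time when every function handle needs its own
 * gatekeeper contact and the contacts are made one after another. */
double init_time_one_by_one(int handles, double seconds_per_contact)
{
    assert(handles >= 0 && seconds_per_contact >= 0.0);
    return handles * seconds_per_contact;
}

/* Hypothetical alternative: a single contact per cluster starts all of
 * that cluster's jobs, so the cost scales with the number of clusters. */
double init_time_per_cluster(int clusters, double seconds_per_contact)
{
    assert(clusters >= 0 && seconds_per_contact >= 0.0);
    return clusters * seconds_per_contact;
}
```

For example, with an assumed 10 seconds per contact, 100 handles initialized one by one take 1000 seconds before any computation starts, whereas one contact for each of 7 clusters would take 70 seconds.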

  14. Lessons Learned (cont’d)
      • We observed that Ninf-G applications did not work correctly because of unexpected cluster configurations
        • GSI authentication failed when establishing connections for file transfers using GASS.
          • Backend nodes do not have host certificates.
        • Because of the configuration of the local scheduler (PBS), Ninf-G executables were not activated.
          • Example:
            • PBS jobmanager on a 16-node cluster
            • grpc_call was called 16 times on the cluster; the application developer expected 16 Ninf-G executables to be invoked simultaneously.
            • The PBS Queue Manager configuration set the maximum number of simultaneous job invocations per user to 9.
            • 9 Ninf-G executables were launched, but the remaining 7 were not activated.
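The arithmetic behind the PBS example is simple but easy to overlook when sizing a run. A minimal sketch, assuming the scheduler's only relevant constraint is a per-user cap on simultaneously running jobs (`apply_user_limit` and `launch_outcome` are hypothetical names):

```c
#include <assert.h>

/* How many requested executables start at once, and how many do not. */
struct launch_outcome {
    int activated;
    int not_activated;
};

/* Apply a per-user cap on simultaneously running jobs to a request. */
struct launch_outcome apply_user_limit(int requested, int max_per_user)
{
    assert(requested >= 0 && max_per_user >= 0);
    struct launch_outcome o;
    o.activated = requested < max_per_user ? requested : max_per_user;
    o.not_activated = requested - o.activated;
    return o;
}
```

With 16 requests and a cap of 9, this gives 9 activated and 7 not activated, matching the slide. A client could use such a check to request no more simultaneous executables per cluster than the site's queue policy allows.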

  15. Special Thanks (for technical support) to:
      • Kasetsart University (Thailand): Sugree Phatanapherom
      • Doshisha University (Japan): Yusuke Tanimura
      • University of Hong Kong (Hong Kong): CHEN Lin, Elaine
      • KISTI (Korea): Gee-Bum Koo, Jae-Hyuck
      • Tokyo Institute of Technology (Japan): Ken’ichiro Shirose
      • NCHC (Taiwan): Julian Yu-Chung Chen
      • AIST (Japan): Grid Support Team
      • APAN: HK, TW, JP
