
Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam


  1. Distributed parallel processing analysis framework for Belle II and Hyper Suprime-Cam MINEO Sogo (Univ. Tokyo), ITOH Ryosuke, KATAYAMA Nobu (KEK), LEE Soohyung (Korea Univ.)

  2. Distributed parallel framework
  • Analysis framework: ROOBASF
  • Extended from BASF (Belle’s framework)
  • Controls analysis workflow
  • For MPI distributed-memory systems*
  • With a Python interface*
  • ROOT embedded*
  • For the use of:
  • Belle II (high energy physics)
  • Hyper Suprime-Cam (astrophysics)
  (* newly added features)

  3. Table of contents
  • Motivation
  • Hyper Suprime-Cam & Belle II
  • Distributed parallel framework
  • MPI & Python
  • Test pipeline
  • Summary

  4. MOTIVATION

  5. Hyper Suprime-Cam (HSC) & Belle II
  • Hyper Suprime-Cam (HSC)
  • Next-generation camera aiming at dark energy
  • On the prime focus of the Subaru Telescope
  • Data rate: 2 GB/shot
  • 10 times larger than the current camera’s
  • Belle II
  • Next-generation B factory
  • With SuperKEKB: a new high-luminosity e−e+ collider at KEK
  • Data rate: 600 MB/sec
  • > 40 times larger than the current Belle detector’s
  An efficient, distributed parallel analysis system is necessary.

  6. Analyses on HSC images
  • Chip-by-chip correction (pedestal correction, gain correction)
  • Easily data-parallelized: chips are assigned to processes one by one
  • 116 CCD sensors cover the focal plane
  • “Mosaicking”: superpose chips and determine positions by matching celestial objects
  • Parallelization is not trivial: processes need communication
  • Processes must exchange object position information, pixel information, etc.

  7. Use case in Belle II
  • ROOT-based data format
  • DAQ cluster needs cooperation

  8. Existing framework
  • BASF: the framework for the Belle experiment
  • Successfully used for 10 years
  • Involved in nearly all of the experiment: data acquisition, simulation, users’ analysis
  • Software pipeline architecture (analysis modules on a path)
  • Enables modular structure of analysis paths
  • Flexible and dynamic module linking
  • Event-by-event parallel analysis
  • Issues to be improved:
  • Large data rate: distributed parallelization with inter-process communication
  • ROOT support / object-oriented data flow
  Upgrade BASF for Belle II, and also for HSC.

  9. DISTRIBUTED PARALLEL FRAMEWORK

  10. Parallel framework (ROOBASF)
  • Controls analysis paths (analysis modules on a path), like BASF in Belle
  • Data parallel
  • Inter-process communication
  • Program parallel
  • Python user interface
  • ROOT utilization

  11. Parallelization
  • ROOBASF uses the Message Passing Interface (MPI)
  • De-facto standard of distributed parallel computing
  • Expected to run in various environments
  • Analysis modules use MPI to perform data-parallel algorithms
  • Each pipeline stage is given an MPI group (communicator)
  • Modules perform parallel processing within their given group, just like stand-alone MPI programs
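The per-stage process groups above are what MPI_Comm_split produces from a "color" per rank. A minimal sketch of that coloring (illustrative only, not ROOBASF's actual code; the function name and stage layout are assumptions):

```python
# Sketch: assign each MPI rank to a pipeline stage, as done with
# MPI_Comm_split.  `stage_sizes` lists how many processes each stage
# gets; the returned "color" is the stage index, and MPI builds one
# communicator (process group) per distinct color.
def stage_color(rank, stage_sizes):
    bound = 0
    for stage, size in enumerate(stage_sizes):
        bound += size
        if rank < bound:
            return stage
    raise ValueError("rank %d exceeds total processes %d" % (rank, bound))

# With mpi4py, this color would be passed to Comm.Split:
#   stage_comm = MPI.COMM_WORLD.Split(stage_color(rank, sizes))
# Example: 6 ranks split into two stages of 3 processes each.
colors = [stage_color(r, [3, 3]) for r in range(6)]
# colors == [0, 0, 0, 1, 1, 1]
```

Within its sub-communicator, a module sees ranks 0..n-1 for its own group only, which is why it can run unmodified stand-alone MPI code.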

  12. Two layers of analysis paths
  • Sequential paths
  • A sequence of analysis modules, with conditional branches
  → All executed in one process
  • Parallel paths
  • A sequence of processes and conditional branches
  • Each of the processes executes a “sequential path”
  • Program parallelization
  • Multiple copies run simultaneously: data parallelization
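The two layers can be sketched as follows (a toy model; the class and function names are illustrative, not ROOBASF's API): a sequential path is a list of modules run in order inside one process, and a parallel path runs copies of that sequential path on different events.

```python
# Toy sketch of the two path layers.
class Module:
    def event(self, ev):
        return ev

class Double(Module):          # stand-in analysis module
    def event(self, ev):
        return ev * 2

class AddOne(Module):          # stand-in analysis module
    def event(self, ev):
        return ev + 1

def run_sequential_path(modules, ev):
    # Sequential path: every module runs in order, in one process.
    for m in modules:
        ev = m.event(ev)
    return ev

# Parallel path (data parallelism): each "process" -- simulated by a
# plain loop here -- runs the same sequential path on its own event.
path = [Double(), AddOne()]
results = [run_sequential_path(path, ev) for ev in (1, 2, 3)]
# results == [3, 5, 7]
```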

  13. Data flow
  • Events
  • Event or image data to be analyzed
  • Broadcast messages
  • Experiment parameters, observation parameters, etc.
  • Have to be sent to all modules
  • Must not switch order with events: at a conditional branch, a broadcast is suspended until it arrives from all branches, so it cannot overtake an event
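The ordering rule at a branch merge point can be sketched like this (an illustration of the idea, not ROOBASF's implementation): a broadcast is held back until every upstream branch has delivered it, so it can never overtake an event that preceded it on some branch.

```python
# Sketch of broadcast/event ordering at a conditional-branch merge.
class BroadcastMerger:
    def __init__(self, n_branches):
        self.n_branches = n_branches
        self.pending = {}              # broadcast id -> arrival count

    def receive(self, msg):
        kind, payload = msg
        if kind == "event":
            return msg                 # events pass through immediately
        self.pending[payload] = self.pending.get(payload, 0) + 1
        if self.pending[payload] == self.n_branches:
            del self.pending[payload]
            return msg                 # seen on all branches: release it
        return None                    # suspend until the rest arrive

m = BroadcastMerger(2)
out = [m.receive(("bcast", 1)),        # from branch A: suspended
       m.receive(("event", "e1")),     # passes through untouched
       m.receive(("bcast", 1))]        # from branch B: now released
# out == [None, ("event", "e1"), ("bcast", 1)]
```

Because the broadcast is only released after both branches deliver it, it stays behind any event that was ahead of it on either branch.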

  14. Utilization of Python
  • Analysis paths are described in the Python language
  • Modules can also be described inline in the script
  • Modules can be quickly developed in Python
  • CPU-costly modules can then be rewritten in C++
  → Efficient development of analysis modules
  • Implemented with the boost.python library
  • Python scripts can call native code (C++ etc.); native code can call Python scripts
  • This bidirectional calling is a unique feature of boost.python, absent from SWIG

  15. Python script
  • Define a Python module, create an instance of the ROOBASF framework, create a sequential path “main”, then dlopen() “Astr1Chip.so”, link the plugin code, and set its parameter:

      import boostpbasf as basf

      # Define a Python module
      class Load(basf.CModule):
          def __init__(self, namefmt):
              basf.CModule.__init__(self)
              self.namefmt = namefmt
              self.count = 0
          def event(self, status, ev, comm):
              if status == 0:
                  ev.SetFile(self.namefmt % self.count)
                  (……)

      # Create an instance of the ROOBASF framework
      f = basf.CFrame()

      # Create a sequential path “main”
      load = Load("/data/img%03d.fits")
      f.Seq_Add("main", load)
      f.Seq_Add("main", "Astr1Chip")

      # dlopen() “Astr1Chip.so” (native plugin) and set its parameter
      f.Plug_Module("Astr1Chip").SetParam("config", "matching.scamp")

  16. TEST PIPELINE

  17. Pipeline for the test
  • Data-parallel analysis path (for on-line monitoring):
  • Performs pedestal/gain correction
  • Checks data quality
  • Performs 1-chip astrometry
  • Tiny modules in Python: error detector, time watch, etc.
  • Each CCD image flows through the stage chain OSS → FLAT → AGP → STAT → SEXT → ASTR (correction, data-quality check, 1-chip astrometry) in multi-threaded ROOBASF processes
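The test path above can be sketched as a stage chain applied to many CCD images in parallel (the stage functions are trivial stand-ins for the real OSS/FLAT/AGP/STAT/SEXT/ASTR native plugins; a thread pool stands in for the MPI processes):

```python
# Hedged sketch of the test pipeline: the same sequential stage chain
# applied to several CCD images concurrently.
from concurrent.futures import ThreadPoolExecutor

def oss(img):  return img + ["overscan-subtracted"]   # stand-in stage
def flat(img): return img + ["flat-fielded"]          # stand-in stage
def astr(img): return img + ["astrometry-solved"]     # stand-in stage

STAGES = [oss, flat, astr]

def analyze(img):
    # Run the image through every stage in order (one sequential path).
    for stage in STAGES:
        img = stage(img)
    return img

# Data parallelism: three images analyzed by three workers at once.
images = [["img%03d" % i] for i in range(3)]
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(analyze, images))
# results[0] == ["img000", "overscan-subtracted", "flat-fielded",
#                "astrometry-solved"]
```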

  18. Test environment
  • 3 PCs only: x64, 4-core CPUs, Gigabit-Ethernet-linked
  • Each PC holds input/output images on a local HDD (shared via NFS); one PC also holds the programs
  • Number of processes (each process multi-threaded): 1, 3x1, 3x2, 3x3
  • Parallelization will not scale linearly (though each CPU has 4 cores) because the modules are multi-threaded

  19. Parallelization efficiency
  [Chart: speedup, measured as analysis time per image (inverted), versus number of processes with threads (1, 3, 6, 9), compared against the ideal linear speedup.]
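For reference, the quantities plotted above are conventionally defined as follows (the timing numbers in the example are made up for illustration, not the measured results):

```python
# Speedup and parallelization efficiency from per-image analysis times.
def speedup(t_serial, t_parallel):
    # How many times faster the parallel run is than the serial run.
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_proc):
    # Fraction of the ideal linear speedup actually achieved.
    return speedup(t_serial, t_parallel) / n_proc

# Hypothetical example: 30 s/image on 1 process, 5 s/image on 9 processes.
s = speedup(30.0, 5.0)         # 6.0x
e = efficiency(30.0, 5.0, 9)   # ~0.67, i.e. 67% of ideal
```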

  20. SUMMARY

  21. Summary
  • Analysis framework: ROOBASF
  • Distributed memory (MPI)
  • Python scripting
  • ROOT I/O
  • We built a parallel analysis path for astronomical images.
  • Feasibility for Belle II is yet to be confirmed.
