Parallel Reconstruction of CLEO III Data

Parallel Reconstruction of CLEO III Data • Gregory J. Sharp and Christopher D. Jones • Wilson Synchrotron Laboratory, Cornell University

Outline • Overview of CLEO reconstruction environment • The problems with the old reconstruction system • The solution - finer-grained parallelism • The benefits

CLEO III Reconstruction Environment • Uses a farm of more than 130 Sun Netras • Sun Grid Engine ™ manages CPU allocation • Data read from & written to Objectivity/DB™ • Events must be written to DB in event-number order • Reconstruction rate has to equal average DAQ rate

Former Reconstruction System • Output was written directly to the offline database • ~130 runs may be processed in parallel on the farm • Each run is processed in its entirety by a single CPU • Up to 9 days to reconstruct a single run on a single CPU • All failures required operator and/or DBA intervention

Problems • Need to maximize CPU utilization • Load balancing between farms is difficult • Takes a long time to stop the farm safely • Output of the first few runs must be checked • Debugging reconstruction code

More Problems • Low I/O rates to the database • Many locks held for long periods • Large window for failures to occur • Failure leaves database in an invalid state • No automation of failure detection and recovery

The Solution • Split each run into roughly equal-sized chunks • Assign each chunk to a CPU • Save sub-job output in intermediate binary files in event-number order • Once all sub-jobs complete, collate binary files into database in event-number order
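The split-and-collate steps above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the CLEO implementation (which was written in Perl). The function names `split_run` and `collate` are hypothetical, events are modeled as `(event_number, payload)` tuples, and the chunks are assumed to be contiguous event ranges, which the slides do not specify.

```python
import heapq

def split_run(first_event, last_event, n_chunks):
    """Split an inclusive event range into contiguous, roughly equal chunks."""
    total = last_event - first_event + 1
    base, extra = divmod(total, n_chunks)
    chunks = []
    start = first_event
    for i in range(n_chunks):
        # The first `extra` chunks take one additional event each.
        size = base + (1 if i < extra else 0)
        if size == 0:
            continue  # more chunks requested than events available
        chunks.append((start, start + size - 1))
        start += size
    return chunks

def collate(chunk_outputs):
    """Merge per-chunk event lists into a single list in event-number order.

    Each input list must already be sorted by event number, mirroring the
    intermediate files that each sub-job writes in order.
    """
    return list(heapq.merge(*chunk_outputs, key=lambda event: event[0]))
```

With contiguous chunks, collation reduces to concatenating the files in chunk order. The k-way merge is used here only because it also stays correct if chunks interleave.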

The Job Manager • The JM submits all the reconstruction sub-jobs and monitors their progress, retrying failures • Once all reconstruction completes successfully the JM starts the collation sub-job • Once collation completes successfully the JM starts the merge histogram sub-job • Can be restarted at any time if it dies
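The staged, restartable control flow described above might look roughly like the sketch below. It is a Python illustration, not the actual Job Manager (which was written in Perl and submitted jobs through Sun Grid Engine). The function `run_job_manager`, its callback parameters, the JSON state file, and the `max_retries` limit are all assumptions introduced for the example. Sub-jobs run one after another here for brevity, whereas the real system ran them in parallel on the farm.

```python
import json
import os

def run_job_manager(chunk_ids, run_subjob, collate, merge_histograms,
                    state_path, max_retries=3):
    """Drive one run through reconstruction, collation, and histogram merging.

    Progress is saved to state_path after every completed step, so a
    restarted manager resumes where the previous one stopped.
    chunk_ids are assumed to be strings so they survive the JSON round trip.
    """
    state = {"done": [], "collated": False, "merged": False}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)

    def save():
        with open(state_path, "w") as f:
            json.dump(state, f)

    # Stage 1: reconstruction sub-jobs, retrying failures.
    for chunk_id in chunk_ids:
        if chunk_id in state["done"]:
            continue  # finished before a restart
        for _ in range(max_retries):
            if run_subjob(chunk_id):  # True means the sub-job succeeded
                state["done"].append(chunk_id)
                save()
                break
        else:
            raise RuntimeError(f"sub-job {chunk_id} failed {max_retries} times")

    # Stage 2: collation starts only after every sub-job has succeeded.
    if not state["collated"]:
        collate(chunk_ids)
        state["collated"] = True
        save()

    # Stage 3: histogram merging starts only after collation.
    if not state["merged"]:
        merge_histograms(chunk_ids)
        state["merged"] = True
        save()
    return state
```

Keeping the progress record on disk is what makes the "can be restarted at any time" property cheap: a second call with the same `state_path` skips every stage already marked complete.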

Structure Diagram

Automation • JM restarts sub-jobs with transient failures • Runs may be submitted automatically when SGE queue is (almost) empty • A cron job generates status web pages
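The "submit when the queue is (almost) empty" rule above can be expressed as a small decision function. This is a hypothetical Python sketch: the name `select_runs_to_submit`, the `low_water_mark` threshold, and the `batch_size` parameter are assumptions, and the caller is expected to obtain the queued-job count from the batch system by whatever means it has.

```python
def select_runs_to_submit(queued_jobs, pending_runs, low_water_mark=5, batch_size=1):
    """Return the runs to submit now, given the current queue depth.

    Nothing is submitted while the queue holds more than low_water_mark
    jobs; otherwise the next batch_size pending runs are released.
    """
    if queued_jobs > low_water_mark:
        return []
    return list(pending_runs[:batch_size])
```

Called periodically (for example from the same cron job that builds the status pages), a function like this keeps the farm busy without flooding the queue.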

Implementation Details • Written in Perl • Uses Sun Grid Engine to submit and track jobs • Uses CLEO III software infrastructure for reconstruction and population • Uses PAW for merging histograms

Benefits • Less operator intervention/management • Faster debugging • Increased CPU utilization, which offsets extra CPU use • 20% faster completion of reconstruction • Just-in-time pre-staging of data from HSM file system • The January ice storm

Future Steps • Automate staging of data to cache disks • Automate posting of staged runs info to Reconstruction

Conclusions • Multiple file formats made this possible • Substantial productivity gains • Higher utilization of computing resources • For more details: http://www.lepp.cornell.edu/~gregor/projects/parallelpass2
