PHENIX and the data grid • >400 collaborators • Active on 3 continents + Brazil • 100s of TB of data per year • Complex data with multiple disparate physics goals
Grid use that would help PHENIX • Data management • Replica management to/from remote sites • Management of simulated data • Replica management within RCF • Job management • Simulated events generation and analysis • Centralized analysis of summary data at remote sites
Replica management: export to remote sites • Export of PHENIX data • Send data by network or FedEx net to Japan, France (IN2P3) and US collaborator sites • Network to Japan via APAN using bbftp (right?) • Network to France using bbftp (right?) • Network within US using bbftp and globus-url-copy • Currently transfers are initiated & logged by hand • Most transfers use disks as a buffer • Goals • Automate data export and logging into the replica catalog • Allow transfer of data from the most convenient site, rather than only from the central repository at RCF
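The "most convenient site" goal can be sketched as a lookup against a replica catalog with per-site transfer costs. This is a hypothetical illustration, not PHENIX code: the catalog entries, site names, and cost figures are all invented placeholders (real costs might come from logged bbftp transfer statistics).

```python
# Hypothetical sketch: pick the "most convenient" source site for a file
# from a replica catalog, instead of always pulling from RCF.
# Catalog contents and cost numbers below are illustrative only.

REPLICA_CATALOG = {
    "dst/run2/file_0001.root": ["RCF", "CCJ", "IN2P3"],
}

# Lower cost = more convenient (e.g. derived from logged transfer rates)
SITE_COST = {"RCF": 10, "CCJ": 3, "IN2P3": 5}

def best_source(logical_name, catalog=REPLICA_CATALOG, cost=SITE_COST):
    """Return the cheapest site currently holding a replica of logical_name."""
    sites = catalog.get(logical_name, [])
    if not sites:
        raise KeyError(f"no replica registered for {logical_name}")
    # Unknown sites get infinite cost so they are never preferred
    return min(sites, key=lambda s: cost.get(s, float("inf")))
```

With the illustrative numbers above, a transfer of `dst/run2/file_0001.root` would be sourced from CCJ rather than the central RCF repository.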
Simulated data management • Simulations are performed at • CC-J (RIKEN/Wako), Vanderbilt, UNM, LLNL, USB • Will add other sites, including IN2P3, for Run 3 • Simulated hits data are imported to RCF • For detector response, reconstruction, analysis • Simulation projects managed by C. Maguire • actual simulation jobs are run by an expert at each site • Data transfers initiated by scripts or by hand • Goals • Automate importation and archiving of simulated data • Ideally by merging with a centralized job submission utility • Export PHENIX software effectively to allow remote-site detector response and reconstruction
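Automating the importation step amounts to registering each arriving file in a catalog with its project metadata, rather than tracking transfers by hand. A minimal sketch, with invented field names (not a real PHENIX schema):

```python
# Hypothetical sketch of automated simulated-data import: on arrival at
# RCF, append a catalog record for each file. The record fields below
# (project tag, origin site, size) are illustrative assumptions.

import time

def register_simulated_file(logical_name, project, site, size_bytes, catalog):
    """Append a catalog record for an imported simulation file."""
    record = {
        "logical_name": logical_name,
        "project": project,        # e.g. the simulation project tag
        "origin_site": site,       # where the simulation job ran
        "size_bytes": size_bytes,
        "imported_at": time.time(),
    }
    catalog.append(record)
    return record
```

In practice the `catalog` list would be a database table, and the registration call would be triggered by the transfer tool itself so the archive and the catalog never drift apart.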
Replica management within RCF • VERY important short-term goal! • PHENIX tools have been developed • Replica catalog, including DAQ/production/QA info • lightweight POSTGRES version as well as Objy • logical/physical filename translator • Goals • Use and optimize existing tools at RCF • Investigate merging with Globus middleware • relation to GDMP? • different from Magda: carries more file info (?) • Integrate into job management/submission • Can we collect statistics for optimization and scheduling?
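The logical/physical filename translator can be illustrated with a small per-site prefix table standing in for the POSTGRES-backed catalog. Site names and paths here are made up for the example:

```python
# Hypothetical sketch of a logical/physical filename translator: the same
# logical name resolves to a different physical path at each site.
# The prefix table is a stand-in for the POSTGRES replica catalog;
# these paths are illustrative, not real PHENIX storage layouts.

SITE_PREFIX = {
    "RCF": "/phenix/data",
    "CCJ": "/ccj/phenix",
}

def logical_to_physical(logical_name, site, prefixes=SITE_PREFIX):
    """Translate a logical filename to a site-specific physical path."""
    if site not in prefixes:
        raise KeyError(f"no storage prefix known for site {site}")
    return f"{prefixes[site]}/{logical_name}"
```

Keeping jobs on logical names and translating only at access time is what makes the catalog, rather than hard-coded paths, the single source of truth for replica locations.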
Job management • Currently use scripts and batch queues at each site • Have two kinds of jobs we should manage better • Simulations • User analysis jobs
Requirements for simulation jobs • Job specifications • Conditions & particle types to simulate • Number of events • May need embedding into real events (multiplicity effects) • I/O requirements • Input: database access for run # ranges, detector geometry • Output: the big requirement • send files to RCF for further processing • eventually can reduce to DST volume for RCF import • Job sequence requirements • Initially rather small; the only interaction is the random # seed • Eventually: hits generation -> response -> reconstruction • Site selection criteria • CPU cycles! Also buffer disk space & access for an expert
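The job specification above could be captured as a single structured record that a centralized submission utility passes to each site. A sketch with assumed field names (not a real PHENIX schema):

```python
# Hypothetical sketch of a simulation job specification covering the
# requirements listed above: conditions, event count, database run range,
# the random seed (the only inter-job coupling initially), optional
# embedding into real events, and output destined for RCF.
# All field names are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class SimJobSpec:
    particle_type: str            # species to simulate
    conditions: str               # beam/centrality conditions
    n_events: int                 # number of events to generate
    run_range: tuple              # run # range for geometry DB access
    random_seed: int              # only inter-job interaction at first
    embed_real_events: bool = False   # multiplicity effects
    output_site: str = "RCF"      # files sent to RCF for further processing
```

A scheduler could then match `n_events` against available CPU cycles and the expected output volume against a site's buffer disk, the two selection criteria named above.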