
PHENIX and the data grid


Presentation Transcript


  1. PHENIX and the data grid
  • >400 collaborators
  • Active on 3 continents + Brazil
  • 100’s of TB of data per year
  • Complex data with multiple disparate physics goals

  2. Grid use that would help PHENIX
  • Data management
    • Replica management to/from remote sites
    • Management of simulated data
    • Replica management within RCF
  • Job management
    • Simulated event generation and analysis
    • Centralized analysis of summary data at remote sites

  3. Replica management: export to remote sites
  • Export of PHENIX data
    • Send data by network or FedEx net to Japan, France (IN2P3) and US collaborator sites
    • Network to Japan via APAN using bbftp (right?)
    • Network to France using bbftp (right?)
    • Network within the US using bbftp and globus-url-copy
    • Currently transfers are initiated & logged by hand
    • Most transfers use disks as a buffer
  • Goals
    • Automate data export and logging into the replica catalog (see the sketch below)
    • Allow transfer of data from the most convenient site, rather than only from the central repository at RCF
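As a concrete illustration of the "automate data export and logging" goal, here is a minimal Python sketch of one export step: copy a file to a remote site with globus-url-copy and record the new replica only if the transfer succeeds. The catalog interface (register_replica), the log file, and the URLs are assumptions for illustration, not existing PHENIX tools; bbftp transfers would need an analogous wrapper.

```python
"""Sketch of an automated export step: copy one file to a remote site with
globus-url-copy and record the new replica.  The catalog interface and the
site URLs below are placeholders, not existing PHENIX tools."""

import subprocess
import sys
from datetime import datetime, timezone

CATALOG_LOG = "replica_catalog.log"   # hypothetical flat-file stand-in for the real catalog


def register_replica(logical_name: str, physical_url: str) -> None:
    """Append a replica entry; a real version would update the PHENIX replica catalog."""
    with open(CATALOG_LOG, "a") as log:
        stamp = datetime.now(timezone.utc).isoformat()
        log.write(f"{stamp}\t{logical_name}\t{physical_url}\n")


def export_file(source_url: str, dest_url: str, logical_name: str) -> None:
    """Run globus-url-copy and register the replica only if the transfer succeeds."""
    result = subprocess.run(["globus-url-copy", source_url, dest_url])
    if result.returncode != 0:
        sys.exit(f"transfer failed for {logical_name}")
    register_replica(logical_name, dest_url)


if __name__ == "__main__":
    # Example: push one DST file from RCF to a remote site (URLs are placeholders).
    export_file(
        "gsiftp://rcf.example.bnl.gov/phenix/dst/run02_example.root",
        "gsiftp://ccin2p3.example.fr/phenix/dst/run02_example.root",
        "run02_example.root",
    )
```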

  4. Simulated data management
  • Simulations are performed at
    • CC-J (RIKEN/Wako), Vanderbilt, UNM, LLNL, USB
    • Will add other sites, including IN2P3, for Run 3
  • Simulated hits data were imported to RCF
    • For detector response, reconstruction, analysis
  • Simulation projects managed by C. Maguire
    • Actual simulation jobs run by an expert at each site
    • Data transfers initiated by scripts or by hand
  • Goals
    • Automate importation and archiving of simulated data (see the sketch below)
    • Ideally by merging with a centralized job submission utility
    • Export PHENIX software effectively to allow remote-site detector response and reconstruction
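A minimal sketch of the "automate importation and archiving" goal, assuming a simple drop-directory workflow: files arriving from a remote simulation site are checksummed, moved to a local archive area, and reported for catalog registration. The directory names and the workflow itself are illustrative assumptions, not the scripts currently in use.

```python
"""Minimal sketch of an automated import step for simulated hits files arriving
from a remote simulation site.  Directory names and the registration step are
placeholders, not part of the existing PHENIX scripts."""

from pathlib import Path
import hashlib
import shutil

INCOMING = Path("/data/incoming_sim")    # hypothetical drop area filled by remote sites
ARCHIVE = Path("/data/sim_archive")      # hypothetical local archive area


def md5sum(path: Path) -> str:
    """Checksum used to confirm the transfer before the file is archived."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def import_new_files() -> None:
    """Archive every file in the drop area and report it for catalog registration."""
    ARCHIVE.mkdir(parents=True, exist_ok=True)
    for hits_file in sorted(INCOMING.glob("*.root")):
        checksum = md5sum(hits_file)
        target = ARCHIVE / hits_file.name
        shutil.move(str(hits_file), target)
        # Here the real system would register the file in the replica catalog
        # and queue it for detector response / reconstruction at RCF.
        print(f"archived {target} md5={checksum}")


if __name__ == "__main__":
    import_new_files()
```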

  5. Replica management within RCF
  • VERY important short-term goal!
  • PHENIX tools have been developed
    • Replica catalog, including DAQ/production/QA info
    • Lightweight POSTGRES version as well as Objy
    • Logical/physical filename translator (sketched below)
  • Goals
    • Use and optimize existing tools at RCF
    • Investigate merging with Globus middleware
      • Relation to GDMP?
      • Different from Magda: carries more file info (?)
    • Integrate into job management/submission
    • Can we collect statistics for optimization and scheduling?
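To make the logical/physical filename translation concrete, here is a sketch of the kind of lookup the replica catalog performs. SQLite stands in for the lightweight POSTGRES (or Objectivity) back end, and the table and column names are invented for illustration rather than taken from the actual PHENIX schema.

```python
"""Sketch of the logical-to-physical filename translation done by the replica
catalog.  SQLite stands in for the lightweight POSTGRES (or Objy) back end;
table and column names are illustrative only."""

import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS replicas (
    logical_name TEXT NOT NULL,      -- collaboration-wide file name
    site         TEXT NOT NULL,      -- e.g. RCF, CC-J, IN2P3
    physical_url TEXT NOT NULL,      -- where the bytes actually live
    run_number   INTEGER,            -- DAQ/production info carried with the file
    PRIMARY KEY (logical_name, site)
);
"""


def physical_names(conn: sqlite3.Connection, logical_name: str) -> list[str]:
    """Return every known physical location of a logical file."""
    rows = conn.execute(
        "SELECT physical_url FROM replicas WHERE logical_name = ?", (logical_name,)
    )
    return [url for (url,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute(
        "INSERT INTO replicas VALUES (?, ?, ?, ?)",
        ("run02_example.root", "RCF", "hpss:/phenix/dst/run02_example.root", 12345),
    )
    print(physical_names(conn, "run02_example.root"))
```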

  6. Job management
  • Currently use scripts and batch queues at each site (see the submission sketch below)
  • Have two kinds of jobs we should manage better
    • Simulations
    • User analysis jobs
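A sketch of the kind of per-site submission script in use today: wrap a simulation or user analysis command and hand it to the local batch queue. An LSF-style bsub front end, the queue names, and the example commands are assumptions; each site's actual batch system and scripts will differ.

```python
"""Sketch of a per-site batch submission wrapper.  An LSF-style `bsub` front
end and the queue names are assumptions; each site's batch system differs."""

import subprocess


def submit(command: str, queue: str, logfile: str) -> None:
    """Send one job to the local batch queue (assumed LSF-like)."""
    subprocess.run(["bsub", "-q", queue, "-o", logfile, command], check=True)


if __name__ == "__main__":
    # Example: one simulation job and one user analysis job (commands are placeholders).
    submit("pisa_run.sh --events 1000", queue="phenix_sim", logfile="sim_0001.out")
    submit("root -b -q analyze_dst.C", queue="phenix_ana", logfile="ana_0001.out")
```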

  7. Requirements for simulation jobs
  • Job specifications (captured in the sketch below)
    • Conditions & particle types to simulate
    • Number of events
    • May need embedding into real events (multiplicity effects)
  • I/O requirements
    • I = database access for run # ranges, detector geometry
    • O = the big requirement
      • Send files to RCF for further processing
      • Eventually can reduce to DST volume for RCF import
  • Job sequence requirements
    • Initially rather small; the only interaction is the random # seed
    • Eventually: hits generation -> response -> reconstruction
  • Site selection criteria
    • CPU cycles! Also buffer disk space & access for an expert
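The requirements above could be captured in a machine-readable job specification. The sketch below is one possible shape for it; the class and field names are illustrative assumptions, not an existing PHENIX format.

```python
"""Sketch of a machine-readable simulation job specification carrying the
fields listed on the slide above.  Names are illustrative only."""

from dataclasses import dataclass, field


@dataclass
class SimulationJobSpec:
    # Job specification
    conditions: str                        # e.g. "Au+Au 200 GeV, central"
    particle_types: list[str]              # species to generate
    n_events: int
    embed_into_real_events: bool = False   # multiplicity effects
    random_seed: int = 1                   # initially the only inter-job interaction

    # I/O requirements
    run_range: tuple[int, int] = (0, 0)    # database access for geometry / run info
    output_destination: str = "RCF"        # output files sent to RCF for further processing

    # Job sequence: hits generation -> detector response -> reconstruction
    stages: list[str] = field(
        default_factory=lambda: ["hits", "response", "reconstruction"]
    )


if __name__ == "__main__":
    spec = SimulationJobSpec(
        conditions="Au+Au 200 GeV",
        particle_types=["pi0"],
        n_events=10000,
        embed_into_real_events=True,
        run_range=(30000, 31000),
    )
    print(spec)
```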

  8. Current user analysis approach

  9. Requirements for analysis using grid
