LHCb Computing - Philippe Charpentier, CERN
LHCb in brief • Experiment dedicated to studying CP violation • Responsible for the dominance of matter over antimatter • Matter-antimatter difference studied using the b-quark (beauty) • High-precision physics (tiny differences…) • Single-arm spectrometer • Looks like a fixed-target experiment • Smallest of the 4 big LHC experiments • ~500 physicists • Nevertheless, computing is also a challenge…
LHCb data processing software (built on the Gaudi framework): Simulation (Gauss → GenParts, MCParts, MCHits), Digitisation (Boole → Raw data), Trigger (Moore), Reconstruction (Brunel → (r)DST), Analysis (DaVinci → DST, µDST), sharing a common Event model / Physics event model and a Conditions Database. LHCb Computing, PhC
LHCb Basic Computing principles • Raw data shipped in real time to Tier-0 • Registered in the Grid File Catalog (LFC) • Raw data provenance kept in a Bookkeeping database (query-enabled) • Resilience enforced by a second copy at Tier-1s • Rate: ~2000 evts/s × 35 kB ≈ 70 MB/s • All data processing up to final µDST or Tuple production is distributed • Not possible to perform first-pass reconstruction of all data at Tier0 • The Tier0 is therefore also considered a distributed resource • First-pass reconstruction runs at all Tier1s, like re-processing • Analysis performed at Analysis Facilities • In the Computing Model: AFs at Tier1s • Part of the analysis is not data-related • Extracting physics parameters on CP violation (toy-MC, complex fitting procedures…) • Also using distributed computing resources
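The quoted raw data rate follows directly from the trigger output figures on the slide; a minimal sanity check:

```python
# Back-of-the-envelope check of the raw data rate quoted above.
# Figures from the slide: ~2000 events/s at ~35 kB per raw event.
event_rate_hz = 2000      # events per second written out by the trigger
event_size_kb = 35        # average raw event size in kB

rate_mb_per_s = event_rate_hz * event_size_kb / 1000.0
print(f"Raw data rate: {rate_mb_per_s:.0f} MB/s")  # 70 MB/s
```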
Basic principles (cont'd) • LHCb runs jobs where the data are • All data are placed explicitly • Analysis made possible by reduction of datasets • many different channels of interest • very few events in each channel (from 10² to 10⁶ events/year) • a physicist deals with at most 10⁷ events • small and simple events • final dataset manageable on a physicist's desktop (hundreds of GB) • Calibration and alignment performed on a selected part of the data stream (at CERN) • Alignment and tracking calibration using dimuons (~5/s) • Also used for validation of new calibrations • PID calibration using Ks, D*
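The "hundreds of GB on a desktop" claim is easy to reproduce. A rough sketch, where the ~20 kB per selected event is an illustrative assumption, not a figure from the slides:

```python
# Rough check of the "final dataset on a desktop" claim above.
# The per-event size (~20 kB) is an assumed, illustrative value.
n_events = 10**7            # maximum number of events a physicist deals with
event_size_bytes = 20_000   # assumed size of a selected, stripped event

dataset_gb = n_events * event_size_bytes / 1e9
print(f"Final dataset: ~{dataset_gb:.0f} GB")  # ~200 GB, i.e. hundreds of GB
```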
LHCb dataflow (diagram): Online → Tier0 (Raw, MSS-SE) → Tier1s, where reconstruction (Raw/Digi → rDST), stripping (rDST+Raw → DST) and analysis (DST) take place; Tier2s run simulation and ship their Digi output to the Tier1 MSS-SEs.
Comments on the LHCb Distributed Computing • Only the last part of the analysis is foreseen to be "interactive" • Either analysing ROOT trees or using GaudiPython/pyROOT • User analysis at Tier1s - why? • Analysis is very delicate and needs careful file placement • Tier1s are easier to check, less prone (in principle) to outages • CPU requirements are very modest • What is LHCb's concept of the Grid? • It is a set of computing resources working in a collaborative way • Provides computing resources for the collaboration as a whole • Recognition of contributions is independent of the type of jobs run at a site • There are no noble and less noble tasks; all are needed to make the experiment a success • Resources are not reserved for nationals • High availability of resources is the key issue
Further comments on Analysis • Preliminary: currently being discussed • Local analysis (Tier3) • Non-pledged resources, reserved for local users (no CE access, local batch queues, no central accounting) • Storage may be a Grid-SE (i.e. SRM-enabled) or not • Copy or replication performed by DIRAC DMS tools • Grid-SE: replication, can use third-party transfers • Replicas should be registered in the LFC • Non-Grid-SE: copy from a local node • LFC registration more problematic (no SRM), but possible • Analysis at Tier2s • Pledged resources, therefore available to the whole collaboration • Resources should be additional (dedicated to analysis) • We have just enough at Tier2s for simulation… • Storage and data access handled by the local team (no central manpower available) • Data fully replicated in a Grid-SE (LFC) • CEs centrally banned in case of failures (as for Tier1s)
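The Grid-SE vs. non-Grid-SE distinction above boils down to a placement decision. A hypothetical sketch of that logic (plain Python, not the actual DIRAC DMS API; all names are illustrative):

```python
# Hypothetical sketch of the Tier3 copy/replication decision described above.
# Not the real DIRAC interface: function and return values are illustrative.
def place_file(lfn, dest_se, is_grid_se):
    """Return the (transfer method, catalogue action) for a Tier3 destination."""
    if is_grid_se:
        # SRM-enabled storage: replicate, possibly via third-party transfer,
        # then register the new replica in the LFC.
        return ("third-party replication", "register replica in LFC")
    # Plain local storage: copy from a local node; LFC registration is
    # still possible, but more problematic without an SRM endpoint.
    return ("copy from local node", "manual LFC registration (no SRM)")

print(place_file("/lhcb/user/analysis.dst", "TIER3-disk", is_grid_se=True))
```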
EGEE, EGI? How best to achieve Distributed Computing? • Data Management is paramount • Availability of Storage Elements at Tier1s • Reliability of SRM and transfers • Efficiency of data access protocols (rfio, (gsi)dcap, xrootd…) • Infrastructure is vital • Resource management • 24x7 support coverage • Reliable and powerful networks (OPN) • Resource sharing is a must • Less support needed • Best resource usage (fewer idle CPUs, empty tapes, unused networks…) • Shares must be long-term, with no hard limit on the number of slots • … but opportunistic resources should not be neglected…
LHCb Distributed Computing software • Integrated WMS and DMS: DIRAC • Distributed analysis portal: GANGA • Uses DIRAC WMS & DMS as back-end • DIRAC's main characteristics • Implements late job scheduling • Overlay network (pilot jobs, central task queue) • Pull paradigm • Generic pilot jobs: allow running multiple payloads • Allows LHCb policy to be enforced • Reduces the level of support required from sites • LHCb services designed to be redundant and hence highly available (multiple instances with failover, VO-BOXes)
WMS with pilot jobs • Jobs are submitted with the credentials of their owner (VOMS proxy) • The proxy is renewed automatically inside the WMS repository • The Pilot Job fetches the User Job and its proxy • The User Job is executed with its owner's proxy, which is used to access SEs, catalogs, etc.
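The pull paradigm from the two slides above can be sketched as a pilot draining a central task queue, with each payload running under its owner's proxy. A minimal illustration (names are illustrative, not the real DIRAC classes):

```python
# Minimal sketch of the pull paradigm: generic pilot jobs run on worker
# nodes and pull matching payloads from a central task queue. Illustrative
# structures only, not the actual DIRAC implementation.
from collections import deque

task_queue = deque([
    {"owner": "alice", "payload": "analysis job 1"},
    {"owner": "bob",   "payload": "MC production 7"},
])

def pilot(queue, executed):
    """A generic pilot: fetch payloads until the central queue is empty."""
    while queue:
        job = queue.popleft()             # pilot asks the queue for work
        proxy = f"{job['owner']}-proxy"   # user job runs with its owner's proxy
        executed.append((job["payload"], proxy))

done = []
pilot(task_queue, done)
print(done)
```

Late scheduling falls out of this design: the binding of a payload to a worker node happens only when a pilot is already running there and asks for work.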
The LHCb Tier1s • 6 Tier1s • CNAF (IT, Bologna) • GridKa (DE, Karlsruhe) • IN2P3 (FR, Lyon) • NIKHEF (NL, Amsterdam) • PIC (ES, Barcelona) • RAL (UK, Didcot) • Contribute to • Reconstruction • Stripping • Analysis • Keep copies on MSS of • Raw (2 copies shared) • Locally produced rDST • DST (2 copies) • MC data (2 copies) • Keep copies on disk of • DST (7 copies)
LHCb Computing: a few numbers • Event sizes • on persistent medium (not in memory) • Processing time • Best estimates as of today • Requirements for 2009-10 • 6 × 10⁶ seconds of beam • ~10⁹ MC b-events • ~3 × 10⁹ MC non-b events
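Combining the beam time above with the 70 MB/s raw rate from the earlier slide gives the scale of the 2009-10 raw data sample (ignoring duty-cycle and deadtime corrections):

```python
# Integrated raw data volume for 2009-10, from figures on the slides.
# No duty-cycle or deadtime correction applied: an upper-bound estimate.
rate_mb_per_s = 70      # raw data rate out of the trigger
beam_seconds = 6e6      # 6 x 10^6 seconds of beam

raw_tb = rate_mb_per_s * beam_seconds / 1e6
print(f"Raw data for 2009-10: ~{raw_tb:.0f} TB")  # ~420 TB
```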
Resource requirements for 2009-10: CPU, disk and tape requirements (charts). IN2P3-CC represents 25% of the LHCb Tier1 pledges.
LHCb and LCG-France: statistics since 1 January 2009
CPU (in days) from January 2009
Jobs in France
Tests for analysis
Analysis data access (cont'd)
Conclusions • LHCb has proposed a Computing Model adapted to its specific needs (number of events, event size, low number of physics candidates) • Reconstruction, stripping and analysis resources located at Tier1s (and possibly some Tier2s with enough storage and CPU capacity) • CPU requirements dominated by Monte Carlo, assigned to Tier2s and opportunistic sites • With DIRAC, even idle desktops / laptops could be used ;-) • LHCb@home? • Requirements are modest compared to other experiments • DIRAC is well suited and adapted to this computing model • Integrated WMS and DMS