LHCb Computing - Philippe Charpentier, CERN
LHCb in brief • Experiment dedicated to studying CP violation • Responsible for the dominance of matter over antimatter • Matter-antimatter difference studied using the b-quark (beauty) • High-precision physics (tiny differences…) • Single-arm spectrometer • Looks like a fixed-target experiment • Smallest of the 4 big LHC experiments • ~500 physicists • Nevertheless, computing is also a challenge…
LHCb data processing software (built on the Gaudi framework): Simulation (Gauss → GenParts, MCParts, MCHits), Digitisation (Boole → Raw data), Trigger (Moore), Reconstruction (Brunel → (r)DST), Analysis (DaVinci → DST, µDST), sharing a common Event model / Physics event model and a Conditions Database. LHCb Computing, PhC
LHCb Basic Computing principles • Raw data shipped in real time to Tier-0 • Registered in the Grid File Catalog (LFC) • Raw data provenance kept in a Bookkeeping database (query-enabled) • Resilience enforced by a second copy at Tier-1s • Rate: ~2000 evts/s × 35 kB ≈ 70 MB/s • All data processing up to final µDST or Tuple production is distributed • Not possible to perform first-pass reconstruction of all data at Tier0 • The Tier0 is therefore also considered a distributed resource • First-pass reconstruction runs at all Tier1s, like re-processing • Analysis performed at Analysis Facilities • In the Computing Model: AFs at Tier1s • Part of the analysis is not data-related • Extracting physics parameters on CP violation (toy-MC, complex fitting procedures…) • Also using distributed computing resources
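The quoted raw data rate follows directly from the trigger output figures on the slide; a minimal sanity check:

```python
# Back-of-the-envelope check of the raw data rate quoted above.
# Figures from the slide: ~2000 events/s at ~35 kB per raw event.
event_rate_hz = 2000      # events per second written out by the trigger
event_size_kb = 35        # average raw event size in kB

rate_mb_per_s = event_rate_hz * event_size_kb / 1000.0
print(f"Raw data rate: {rate_mb_per_s:.0f} MB/s")  # 70 MB/s
```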
Basic principles (cont'd) • LHCb runs jobs where the data are • All data are placed explicitly • Analysis made possible by reduction of datasets • many different channels of interest • very few events in each channel (from 10² to 10⁶ events/year) • a physicist deals with at most 10⁷ events • small and simple events • final dataset manageable on a physicist's desktop (hundreds of GB) • Calibration and alignment performed on a selected part of the data stream (at CERN) • Alignment and tracking calibration using dimuons (~5/s) • Also used for validation of new calibrations • PID calibration using Ks, D*
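The "hundreds of GB on a desktop" claim is easy to reproduce. A rough sketch, where the ~20 kB per selected event is an illustrative assumption, not a figure from the slides:

```python
# Rough check of the "final dataset on a desktop" claim above.
# The per-event size (~20 kB) is an assumed, illustrative value.
n_events = 10**7            # maximum number of events a physicist deals with
event_size_bytes = 20_000   # assumed size of a selected, stripped event

dataset_gb = n_events * event_size_bytes / 1e9
print(f"Final dataset: ~{dataset_gb:.0f} GB")  # ~200 GB, i.e. hundreds of GB
```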
LHCb dataflow (diagram): Online → Tier0 (Raw, MSS-SE) → Tier1s, where reconstruction (Raw/Digi → rDST), stripping (rDST+Raw → DST) and analysis (DST) take place; Tier2s run simulation and ship their Digi output to the Tier1 MSS-SEs.
Comments on the LHCb Distributed Computing • Only the last part of the analysis is foreseen to be "interactive" • Either analysing ROOT trees or using GaudiPython/pyROOT • User analysis at Tier1s - why? • Analysis is very delicate and needs careful file placement • Tier1s are easier to check, less prone (in principle) to outages • CPU requirements are very modest • What is LHCb's concept of the Grid? • It is a set of computing resources working in a collaborative way • Provides computing resources for the collaboration as a whole • Recognition of contributions is independent of the type of jobs run at a site • There are no noble and less noble tasks; all are needed to make the experiment a success • Resources are not reserved for nationals • High availability of resources is the key issue
Further comments on Analysis • Preliminary: currently being discussed • Local analysis (Tier3) • Non-pledged resources, reserved for local users (no CE access, local batch queues, no central accounting) • Storage may be a Grid-SE (i.e. SRM-enabled) or not • Copy or replication performed by DIRAC DMS tools • Grid-SE: replication, can use third-party transfers • Replicas should be registered in the LFC • Non-Grid-SE: copy from a local node • LFC registration more problematic (no SRM), but possible • Analysis at Tier2s • Pledged resources, therefore available to the whole collaboration • Resources should be additional (dedicated to analysis) • We have just enough at Tier2s for simulation… • Storage and data access handled by the local team (no central manpower available) • Data fully replicated in a Grid-SE (LFC) • CEs centrally banned in case of failures (as for Tier1s)
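The Grid-SE vs. non-Grid-SE distinction above boils down to a placement decision. A hypothetical sketch of that logic (plain Python, not the actual DIRAC DMS API; all names are illustrative):

```python
# Hypothetical sketch of the Tier3 copy/replication decision described above.
# Not the real DIRAC interface: function and return values are illustrative.
def place_file(lfn, dest_se, is_grid_se):
    """Return the (transfer method, catalogue action) for a Tier3 destination."""
    if is_grid_se:
        # SRM-enabled storage: replicate, possibly via third-party transfer,
        # then register the new replica in the LFC.
        return ("third-party replication", "register replica in LFC")
    # Plain local storage: copy from a local node; LFC registration is
    # still possible, but more problematic without an SRM endpoint.
    return ("copy from local node", "manual LFC registration (no SRM)")

print(place_file("/lhcb/user/analysis.dst", "TIER3-disk", is_grid_se=True))
```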
EGEE, EGI? How best to achieve Distributed Computing? • Data Management is paramount • Availability of Storage Elements at Tier1s • Reliability of SRM and transfers • Efficiency of data access protocols (rfio, (gsi)dcap, xrootd…) • Infrastructure is vital • Resource management • 24x7 support coverage • Reliable and powerful networks (OPN) • Resource sharing is a must • Less support needed • Best resource usage (fewer idle CPUs, empty tapes, unused networks…) • Shares must be long-term, with no hard limit on the number of slots • … but opportunistic resources should not be neglected…
LHCb Distributed Computing software • Integrated WMS and DMS: DIRAC • Distributed analysis portal: GANGA • Uses DIRAC WMS & DMS as back-end • DIRAC's main characteristics • Implements late job scheduling • Overlay network (pilot jobs, central task queue) • Pull paradigm • Generic pilot jobs: allow running multiple payloads • Allows LHCb policy to be enforced • Reduces the level of support required from sites • LHCb services designed to be redundant and hence highly available (multiple instances with failover, VO-BOXes)
WMS with pilot jobs • Jobs are submitted with the credentials of their owner (VOMS proxy) • The proxy is renewed automatically inside the WMS repository • The Pilot Job fetches the User Job and its proxy • The User Job is executed with its owner's proxy, which is used to access SEs, catalogs, etc.
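The pull paradigm from the two slides above can be sketched as a pilot draining a central task queue, with each payload running under its owner's proxy. A minimal illustration (names are illustrative, not the real DIRAC classes):

```python
# Minimal sketch of the pull paradigm: generic pilot jobs run on worker
# nodes and pull matching payloads from a central task queue. Illustrative
# structures only, not the actual DIRAC implementation.
from collections import deque

task_queue = deque([
    {"owner": "alice", "payload": "analysis job 1"},
    {"owner": "bob",   "payload": "MC production 7"},
])

def pilot(queue, executed):
    """A generic pilot: fetch payloads until the central queue is empty."""
    while queue:
        job = queue.popleft()             # pilot asks the queue for work
        proxy = f"{job['owner']}-proxy"   # user job runs with its owner's proxy
        executed.append((job["payload"], proxy))

done = []
pilot(task_queue, done)
print(done)
```

Late scheduling falls out of this design: the binding of a payload to a worker node happens only when a pilot is already running there and asks for work.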
The LHCb Tier1s • 6 Tier1s • CNAF (IT, Bologna) • GridKa (DE, Karlsruhe) • IN2P3 (FR, Lyon) • NIKHEF (NL, Amsterdam) • PIC (ES, Barcelona) • RAL (UK, Didcot) • Contribute to • Reconstruction • Stripping • Analysis • Keep copies on MSS of • Raw (2 copies shared) • Locally produced rDST • DST (2 copies) • MC data (2 copies) • Keep copies on disk of • DST (7 copies)
LHCb Computing: a few numbers • Event sizes • on persistent medium (not in memory) • Processing time • Best estimates as of today • Requirements for 2009-10 • 6 × 10⁶ seconds of beam • ~10⁹ MC b-events • ~3 × 10⁹ MC non-b events
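Combining the beam time above with the 70 MB/s raw rate from the earlier slide gives the scale of the 2009-10 raw data sample (ignoring duty-cycle and deadtime corrections):

```python
# Integrated raw data volume for 2009-10, from figures on the slides.
# No duty-cycle or deadtime correction applied: an upper-bound estimate.
rate_mb_per_s = 70      # raw data rate out of the trigger
beam_seconds = 6e6      # 6 x 10^6 seconds of beam

raw_tb = rate_mb_per_s * beam_seconds / 1e6
print(f"Raw data for 2009-10: ~{raw_tb:.0f} TB")  # ~420 TB
```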
Resource requirements for 2009-10: CPU, disk and tape requirements (charts). IN2P3-CC represents 25% of the LHCb Tier1 pledges.
LHCb and LCG-France: statistics since 1 January 2009
CPU (in days) from January 2009
Jobs in France
Tests for analysis
Analysis data access (cont'd)
Conclusions • LHCb has proposed a Computing Model adapted to its specific needs (number of events, event size, low number of physics candidates) • Reconstruction, stripping and analysis resources located at Tier1s (and possibly some Tier2s with enough storage and CPU capacity) • CPU requirements dominated by Monte Carlo, assigned to Tier2s and opportunistic sites • With DIRAC, even idle desktops / laptops could be used ;-) • LHCb@home? • Requirements are modest compared to other experiments • DIRAC is well suited and adapted to this computing model • Integrated WMS and DMS