The ATLAS Data Acquisition & Trigger: concept, design & status
Kostas KORDAS, INFN – Frascati
10th Topical Seminar on Innovative Particle & Radiation Detectors (IPRD06), Siena, 1-5 Oct. 2006
ATLAS Trigger & DAQ: concept
p-p collisions at 40 MHz; full info / event: ~1.6 MB / 25 ns ≈ 60 TB/s.
• LVL1: hardware based, no dead time → 100 kHz (160 GB/s)
• LVL2: algorithms on PC farms, seeded by the previous level, decide fast, work w/ min. data volume → ~3.5 kHz (~3+6 GB/s)
• Event Filter → ~200 Hz (~300 MB/s)
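As a speaker-note sketch (not from the slides themselves), each quoted bandwidth is just rate × event size; the short Python below redoes that arithmetic with the slide's numbers.

```python
# Back-of-envelope check of the slide's rates and bandwidths (slide numbers assumed).
EVENT_SIZE = 1.6e6  # bytes: full ATLAS event, ~1.6 MB

stages = {                         # stage -> rate in Hz
    "LVL1 input (collisions)": 40e6,
    "LVL1 accept": 100e3,
    "LVL2 accept": 3.5e3,
    "EF accept (to storage)": 200.0,
}

for stage, rate in stages.items():
    gb_per_s = rate * EVENT_SIZE / 1e9
    print(f"{stage:25s}: {gb_per_s:10.1f} GB/s")
# LVL1 input : ~64,000 GB/s, the slide's ~60 TB/s
# LVL1 accept: 160 GB/s into the Read-Out Systems
# EF accept  : ~0.3 GB/s (~300 MB/s) to local storage
```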
From the detector into the Level-1 Trigger
Calo, muon trigger chambers (MuTrCh) and other detectors feed Level 1 at 40 MHz; front-end (FE) pipelines hold the data during the 2.5 μs LVL1 latency.
Upon LVL1 accept: buffer data & get RoIs
On L1 accept (100 kHz, 160 GB/s), the Read-Out Drivers (RODs) push the detector data over Read-Out Links (S-LINK) into the Read-Out Buffers (ROBs) hosted by the Read-Out Systems (ROSs); the LVL1 result also goes to the Region of Interest Builder (ROIB).
On average, LVL1 finds ~2 Regions of Interest (in η-φ) per event.
LVL2: work with "interesting" ROSs/ROBs
The ROIB passes the RoI records to a LVL2 Supervisor (L2SV), which assigns each event to a LVL2 Processing Unit (L2PU). Over the LVL2 network, the L2PUs send RoI requests to the ROSs and collect only the RoI data (~2% of the full event, ~3 GB/s), deciding in ~10 ms per event.
A much smaller read-out network, at the cost of higher control traffic.
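A minimal sketch (nothing assumed beyond the slide's numbers) of why RoI-guided access shrinks the LVL2 network:

```python
# RoI-based LVL2 data volume vs. reading full events (slide numbers assumed).
L1_ACCEPT_HZ = 100e3     # LVL1 accept rate
EVENT_SIZE_B = 1.6e6     # full event, bytes
ROI_FRACTION = 0.02      # LVL2 requests only ~2% of each event

full = L1_ACCEPT_HZ * EVENT_SIZE_B / 1e9   # GB/s if LVL2 pulled full events
roi = full * ROI_FRACTION                  # GB/s with RoI requests only
print(f"full events: {full:.0f} GB/s, RoI data: {roi:.1f} GB/s")
# -> 160 GB/s vs ~3 GB/s: a much smaller network, paid for with extra control traffic
```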
After LVL2: Event Builder makes full events
On an L2 accept (~3.5 kHz), the DataFlow Manager (DFM) assigns the event to a Sub-Farm Input (SFI), which pulls all fragments from the ROSs over the Event Builder network (~3+6 GB/s total).
Event Filter: deals with full events
Fully built events (L2 accept, ~3.5 kHz) go from the SFIs over the Event Filter network to the Event Filter Processors, a farm of PCs taking ~1 s per event → ~200 Hz out.
From Event Filter to local (TDAQ) storage
Events accepted by the EF (~0.2 kHz) are sent to the Sub-Farm Outputs (SFOs) and written to local storage at ~300 MB/s (~200 Hz).
TDAQ, High Level Trigger & DataFlow
The system splits into the DataFlow (ROS, Event Builder, SFO) and the High Level Trigger (LVL2 + Event Filter).
High Level Trigger (HLT):
• Algorithms developed offline (with HLT in mind)
• HLT infrastructure (TDAQ job): "steer" the order of algorithm execution
• Alternate steps of "feature extraction" & "hypothesis testing" → fast rejection (min. CPU; sketched below)
• Reconstruction in Regions of Interest → min. processing time & network resources
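To make the steering idea concrete, here is a toy Python sketch of alternating feature extraction and hypothesis testing with early rejection; the Step/run_chain names and the thresholds are illustrative, not the actual ATLAS HLT code.

```python
# Toy HLT steering: alternate feature extraction and hypothesis testing,
# stopping at the first failed hypothesis (early rejection -> min. CPU).
from dataclasses import dataclass
from typing import Callable

@dataclass
class Step:
    extract: Callable[[dict], dict]   # feature extraction (within an RoI)
    test: Callable[[dict], bool]      # hypothesis test on the features so far

def run_chain(event: dict, steps: list[Step]) -> bool:
    features: dict = {}
    for step in steps:
        features.update(step.extract(event))
        if not step.test(features):   # early rejection: no further steps run
            return False
    return True

# Toy chain: a calorimeter-cluster step, then a track-match step.
steps = [
    Step(lambda ev: {"et": ev["cluster_et"]},     lambda f: f["et"] > 20.0),
    Step(lambda ev: {"match": ev["track_match"]}, lambda f: f["match"]),
]
print(run_chain({"cluster_et": 25.0, "track_match": True}, steps))  # True: accepted
print(run_chain({"cluster_et": 5.0,  "track_match": True}, steps))  # False: rejected at step 1
```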
High Level Trigger & DataFlow: PCs (Linux)
~150 DataFlow (ROS) nodes, ~500 LVL2 nodes, ~100 Event Builder nodes, ~1600 Event Filter nodes; plus infrastructure for control, communication and databases.
TDAQ at the ATLAS site
• UX15 (detector cavern): ATLAS detector; first-level trigger; Timing Trigger Control (TTC); Read-Out Drivers (RODs) on VME, with dedicated links to the surface.
• USA15 (underground counting room): ~150 Read-Out Subsystem (ROS) PCs receiving the data of events accepted by the first-level trigger over 1600 Read-Out Links (event data pushed @ ≤ 100 kHz, 1600 fragments of ~1 kByte each); RoI Builder.
• SDX1 (surface, dual-CPU nodes on Gigabit Ethernet): LVL2 Supervisor; ~500-node LVL2 farm; pROS (stores the LVL2 output); DataFlow Manager; ~100 SubFarm Inputs (SFIs) forming the Event Builder; ~1600-node Event Filter (EF); ~30 SubFarm Outputs (SFOs) with local storage; network switches carrying the Regions of Interest, event data requests, requested event data and delete commands. Event data pulled: partial events @ ≤ 100 kHz, full events @ ~3 kHz.
• Event rate ~200 Hz to data storage at the CERN computer centre.
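A toy sketch of the "event data pulled" step in this diagram: an SFI gathers the ROS fragments that share a LVL1 event ID into one full event. The SFI class below is illustrative, not the real dataflow software.

```python
# Toy event building: collect fragments per LVL1 event ID; the event is
# complete once all expected fragments (1600 in ATLAS) have arrived.
from collections import defaultdict

class SFI:
    def __init__(self, n_fragments: int):
        self.n = n_fragments
        self.partial = defaultdict(dict)   # event ID -> {source ID: fragment}

    def add_fragment(self, l1_id: int, source: int, payload: bytes):
        self.partial[l1_id][source] = payload
        if len(self.partial[l1_id]) == self.n:   # all fragments arrived
            return self.partial.pop(l1_id)       # full event, ready for the EF
        return None

sfi = SFI(n_fragments=3)                         # tiny toy event, 3 fragments
for src in range(3):
    event = sfi.add_fragment(l1_id=42, source=src, payload=b"\x00" * 1000)
print("full event built" if event else "incomplete")   # full event built
```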
TDAQ testbeds
"Pre-series" DataFlow: ~10% of the final TDAQ (same architecture as above), used for realistic measurements, assessment and validation of the TDAQ dataflow & HLT.
Large-scale system tests (on PC clusters with ~700 nodes) demonstrated the required system performance & scalability for the online infrastructure.
Muon + HAD calorimeter cosmics run with LVL1
LVL1: calorimeter, muon and central trigger logic in the production and installation phases for both hardware & software.
August 2006: first combined cosmic-ray run • Muon section at the feet of ATLAS • Tile (HAD) calorimeter, triggered by the muon trigger chambers.
ReadOut Systems: all 153 PCs in place
ROS units are PCs housing 12 Read-Out Buffers on 4 custom PCI-X cards (ROBINs), taking input from the detector Read-Out Drivers.
• All 153 ROSs installed and commissioned standalone
• 44 ROSs connected to detectors and fully commissioned: the full LAr barrel (EM), half of Tile (HAD) and the Central Trigger Processor
• Taking data with the final DAQ (Event Building at the ROS level)
• Commissioning of the other detector read-outs: expect to complete most of it by end 2006
EM + HAD calo cosmics run using installed ROSs
Event Building needs: bandwidth decides
Throughput requirement: LVL2 accept rate of 3.5 kHz into the Event Builder × 1.6 MB event size → 5600 MB/s total input, flowing from the Read-Out Subsystems (ROSs) over Gbit links and network switches to the Event Building nodes (SFIs), orchestrated by the DFM.
Network limited (fast CPUs): event building uses 60-70% of a Gbit link → ~70 MB/s into each SFI.
→ We need ~100 SFIs for the full ATLAS (arithmetic sketched below).
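The SFI count quoted above is just this division (a sketch with the slide's numbers):

```python
# SFI sizing arithmetic from the slide (slide numbers assumed).
total_input_MBps = 3.5e3 * 1.6   # 3.5 kHz LVL2 accepts * 1.6 MB = 5600 MB/s
per_sfi_MBps = 70.0              # ~60-70% of a Gbit link per SFI
n_sfi = total_input_MBps / per_sfi_MBps
print(f"{total_input_MBps:.0f} MB/s / {per_sfi_MBps:.0f} MB/s per SFI "
      f"-> {n_sfi:.0f} SFIs")    # -> 80, hence ~100 SFIs with headroom
```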
For HLT, CPU power is important
At the TDR we assumed: • 100 kHz LVL1 accept rate • 500 dual-CPU PCs for LVL2 • each CPU has to handle 100 Hz, i.e. 10 ms average latency per event in each CPU • 8 GHz per CPU at LVL2 (arithmetic sketched below).
Test with an AMD dual-core, dual-CPU machine @ 1.8 GHz, 4 GB total memory: preloaded the ROS with muon events, ran muFast @ LVL2.
• 8 GHz per CPU will not come (soon)
• But dual-core, dual-CPU PCs show scaling: we should reach the necessary performance per PC (the longer we wait, the better the machines we get)
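The TDR sizing above follows from Little's law (events in flight = rate × latency); a quick sketch with the slide's numbers:

```python
# LVL2 farm sizing via Little's law (slide numbers assumed).
l1_rate_hz = 100e3   # LVL1 accept rate into LVL2
latency_s = 10e-3    # mean LVL2 decision time per event

in_flight = l1_rate_hz * latency_s   # 1000 events processed concurrently
cpus = 2 * 500                       # 500 dual-CPU PCs
print(f"{in_flight:.0f} concurrent events over {cpus} CPUs "
      f"-> {l1_rate_hz / cpus:.0f} Hz and {in_flight / cpus:.0f} event(s) per CPU")
# -> 100 Hz per CPU, i.e. one event in flight per CPU at any time
```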
DAQ / HLT commissioning
Online infrastructure: • a useful fraction operational since last year, growing according to need • final network almost done
DAQ / HLT commissioning (cont'd)
• First DAQ/HLT-I slice of the final system within weeks: 153 ROSs (done); 47 Event Building + HLT-infrastructure PCs; 20 Local File Servers; 24 local switches; 20 operations PCs. Might add the pre-series L2 (30 PCs) and EF (12 PCs) racks → ~300 machines on the final network.
• First 4 full racks of HLT machines (~100) early 2007; another ~500-600 machines can be procured within 2007; the rest not before 2008.
• TDAQ will provide significant trigger rates (LVL1, LVL2, EF) in 2007: LVL1 rate 40 kHz; EB rate 1.9 kHz; physics storage rate up to 85 Hz; final bandwidth for storage and calibration.
Summary
• ATLAS TDAQ design: 3-level trigger hierarchy; LVL2 works with Regions of Interest → small data movement; feature extraction + hypothesis testing → fast rejection with min. CPU power
• The architecture has been validated via deployment on testbeds
• We are in the installation phase of the system; cosmic runs with the central calorimeters + muon system
• An initial but fully functional TDAQ system will be installed, commissioned and integrated with the detectors by the end of 2006
• TDAQ will provide significant trigger rates (LVL1, LVL2, EF) in 2007
Thank you
ATLAS Trigger & DAQ: RoI concept
In this example there are 4 Regions of Interest: 2 muons and 2 electrons → 4 RoI η-φ addresses.
Event size by subsystem: Inner detector, Calorimetry, Muon system, Trigger. ATLAS total event size = 1.5 MB; total no. of ROLs = 1600.
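A one-line consistency check (speaker-note sketch) tying this slide to the "1600 fragments of ~1 kByte each" on the architecture slide:

```python
# Average fragment size from this slide's totals (slide numbers assumed).
event_size_bytes = 1.5e6   # total ATLAS event size
n_rols = 1600              # total number of Read-Out Links
print(f"{event_size_bytes / n_rols:.0f} bytes/fragment")   # ~940 B, i.e. ~1 kB
```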
Scalability of the LVL2 system
• The L2SV gets the RoI info from the RoIB, assigns an L2PU to work on the event, and load-balances its L2PU sub-farm (see the sketch after this list)
• Can this scheme cope with the LVL1 rate? Test with RoI info preloaded into the RoIB, which triggers the TDAQ chain, emulating LVL1
• The LVL2 system is able to sustain the LVL1 input rate: a 1-L2SV system for a LVL1 rate of ~35 kHz; a 2-L2SV system for ~70 kHz (50%-50% sharing); rate per L2SV stable within 1.5%
→ ATLAS will have a handful of L2SVs: can easily manage the 100 kHz LVL1 rate
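A toy sketch of the L2SV's assignment role described above: keep a least-loaded queue of L2PUs and hand each LVL1-accepted event to one of them. The class and method names are illustrative only.

```python
# Toy L2SV: assign each event to the least-loaded L2PU in its sub-farm.
import heapq

class L2SV:
    def __init__(self, n_l2pu: int):
        # min-heap of (outstanding events, L2PU id): least-loaded L2PU first
        self.farm = [(0, pu) for pu in range(n_l2pu)]
        heapq.heapify(self.farm)

    def assign(self, l1_id: int) -> int:
        load, pu = heapq.heappop(self.farm)
        heapq.heappush(self.farm, (load + 1, pu))
        return pu                                  # L2PU that works on this event

    def done(self, pu: int):
        # L2PU finished an event: decrement its outstanding count
        self.farm = [(n - (p == pu), p) for n, p in self.farm]
        heapq.heapify(self.farm)

sv = L2SV(n_l2pu=4)
print([sv.assign(l1_id=i) for i in range(8)])      # events spread over the 4 L2PUs
```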
Tests of LVL2 algorithms & RoI collection
Setup: 8 emulated ROSs, 1 L2SV, 1 L2PU, 1 pROS, 1 DFM; plus 1 Online Server and 1 MySQL database server.
Di-jet, μ and e simulated events preloaded on the ROSs, RoI info on the L2SV (the electron sample is pre-selected).
1) The majority of events are rejected fast
2) Processing takes ~all of the latency: the RoI data-collection time is small
3) Small RoI data request per event
Note: neither the trigger menu nor the data files are a representative mix of ATLAS (this is the aim for a late-2006 milestone).
ATLAS Trigger & DAQ: need
p-p collisions at 40 MHz; full info / event: ~1.6 MB / 25 ns ≈ 60 TB/s.
Need high luminosity to observe the (rare) very interesting events; need on-line selection so that we write to disk (~300 MB/s, ~200 Hz) mostly the interesting events.
ATLAS Trigger & DAQ: LVL1 concept
Input: 40 MHz (full info / event: ~1.6 MB / 25 ns).
LVL1: • hardware based • no dead-time • calo & muon info (coarse granularity) • identifies Regions of Interest for the next trigger level → 100 kHz (160 GB/s)
ATLAS Trigger & DAQ: LVL2 concept
Input: 100 kHz (160 GB/s).
LVL2: • software (specialized algorithms) • uses the LVL1 Regions of Interest • all sub-detectors: full granularity • emphasis on early rejection → ~3.5 kHz (~3+6 GB/s)
ATLAS Trigger & DAQ: Event Filter concept
Input: ~3.5 kHz (~3+6 GB/s).
Event Filter: • offline algorithms • seeded by the LVL2 result • works with the full event • full calibration/alignment info → ~200 Hz (~300 MB/s)
ATLAS Trigger & DAQ: concept summary
• LVL1 (latency 2.5 μs): hardware based (FPGA, ASIC); calo/muon, coarse granularity; pipeline memories hold the 40 MHz input → ~100 kHz
• LVL2 (~10 ms): software (specialised algs); uses the LVL1 Regions of Interest, served by the Read-Out Subsystems hosting the Read-Out Buffers; all sub-dets, full granularity; emphasis on early rejection → ~3 kHz
• Event Filter (~1 s, on the Event Filter farm behind the event-builder cluster): offline algorithms; seeded by the LVL2 result; works with the full event; full calibration/alignment info → ~200 Hz
• LVL2 + Event Filter form the High Level Trigger; local storage: ~300 MB/s
ATLAS Trigger & DAQ: design
Full info / event ~1.6 MB / 25 ns at 40 MHz → Level 1 (2.5 μs) → L1 accept (100 kHz, 160 GB/s) into the RODs/ROSs; RoIB + L2SV seed Level 2 (~10 ms; RoI requests, RoI data ~2% of the event) → L2 accept (~3.5 kHz, ~3+6 GB/s) → DFM + SFIs build full events → Event Filter (~sec) → EF accept (~0.2 kHz) → SFOs, ~300 MB/s at ~200 Hz. DataFlow + High Level Trigger together form the TDAQ.
High Level Trigger & DataFlow: recap
Latencies & rates: LVL1 2.5 μs @ 40 MHz → ~100 kHz (160 GB/s); LVL2 ~10 ms → ~3.5 kHz (~3+6 GB/s); Event Filter ~1 s → ~200 Hz (~300 MB/s to storage via the SFOs).
(Backup: diagram of the pre-series testbed slice — Event Builder SFIs, EFDs, L2PUs, SFOs, pROS, 1 DFM, 2 L2SVs, 12 Read-Out Subsystems (ROSs), 1 RoI Builder, Timing Trigger Control (TTC), network switches; S-link and Gbit connections.)