Pulsar Functional requirements. Part I: Pulsar as a Pulser and Recorder (test stand tools). Ted Liu, Level 2 group meeting, Feb. 22nd, 2002. Part I outline: Pulser mode. The need for an L2 teststand and its goals; pulser board design concept; what should the pulser board be able to do?
Pulsar Functional requirements. Part I: Pulsar as a Pulser and Recorder (test stand tools). Ted Liu, Level 2 group meeting, Feb. 22nd, 2002
Part I outline: Pulser mode
• The need for an L2 teststand, and its goals
• Pulser board design concept
• What should the pulser board be able to do?
• L2 input data for each subsystem
• Firmware design considerations for pulser mode
L2 Review committee Recommendation (Dec. 14th, 2001):
“In the longer term, the test-stand system should be aggressively pursued. This will allow completion of the development effort and longer-term maintenance of the full system.”
As recommended by the committee, we will schedule a workshop for the L2 group to discuss the specifications for this system…
“The committee thinks that the test station system, combined with the 2nd test crate sounds like the perfect way to provide various types of simulated event data (different luminosity, trigger types, suspected failure modes etc.). Providing that this effort won't impact any activities needed to make the baseline system work, it should be strongly supported, and prototyping and testing work should go ahead at full speed.”
“Hold a design workshop by the rest of the Level 2 groups in order to ensure that what is built is safe to use and is capable of exercising all the important parts of the system in a realistic fashion”.
How could you use expanded teststand capabilities? (from this meeting's agenda)
• If we have a teststand with a general input pulser, how would you want to use it?
--- debugging broken/spare boards;
--- testing firmware modifications;
--- ???
• What type of tests would you want to run?
--- are fixed patterns with fixed timing good enough?
--- data from real events?
--- test multiple boards and check for interference?
--- randomly timed L1A patterns?
--- random/user-controlled latency of input data?
--- L1As etc. driven in a deterministic way? (TESTCLK)
--- L1As etc. driven by TS?
--- ???
• What kind of software tools will you need for your testing?
Basic requirement for an L2 trigger data source board:
• upon L1A for buffer n, start a counter for buffer n;
• at the same time, clock the test-pattern data from RAM into the FIFO;
• once the counter reaches the latency threshold, clock the data out of the FIFO, toward the L2 interface board input, at a speed that matches that of the subsystem.
The actual latency is controlled by when the data is clocked out of the FIFO.
[Figure: test pattern RAM feeding a FIFO; the data block reaches the L2 interface board input one latency after the L1A for buffer n.]
This is an oversimplified picture. Each subsystem is somewhat different, and designing a universal test board is not all that easy…
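The three steps above can be sketched as a small clock-tick simulation. This is only an illustration in Python, not firmware: the function name `pulse_event`, the one-word-per-tick RAM readout and the `ticks_per_word` rate divider are assumptions made for the example.

```python
from collections import deque


def pulse_event(test_pattern, latency_ticks, ticks_per_word=1):
    """Simulate one L1A and return a list of (tick, word) output times.

    Ticks are counted from the L1A. All timing parameters are
    hypothetical; real values depend on the subsystem being mimicked.
    """
    if ticks_per_word < 1:
        raise ValueError("ticks_per_word must be at least 1")
    fifo = deque()
    ram_index = 0
    sent = []
    counter = 0  # started by the L1A for this buffer
    while len(sent) < len(test_pattern):
        # RAM -> FIFO: starts right at the L1A, one word per tick (assumed).
        if ram_index < len(test_pattern):
            fifo.append(test_pattern[ram_index])
            ram_index += 1
        # FIFO -> output: only after the latency threshold, at the
        # (slower) rate that matches the subsystem.
        due = counter >= latency_ticks
        if due and fifo and (counter - latency_ticks) % ticks_per_word == 0:
            sent.append((counter, fifo.popleft()))
        counter += 1
    return sent
```

For example, `pulse_event([0xA, 0xB, 0xC], latency_ticks=5)` sends the three words on ticks 5, 6 and 7, even though they were already in the FIFO by tick 2: the latency is set purely by when the FIFO is read out.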
Can one design a universal L2 test (pulser) board, to enhance the testability of L2? It should “provide various types of simulated event data (different luminosity, trigger types, suspected failure modes etc.)”.
[Figure: L2 crate inputs. L1: L1 cables; Reces (x4): 12 fibers (Taxi); XTRP and SVT: one SVT cable each; CLIST: 6 fibers (Hotlink) plus 1 LVDS cable; ISO: 7 fibers (Taxi); MUON: 16 fibers (Hotlink). The boards share the Magicbus with the Alphas (x4).]
[Figure: Level 2 decision crate and its inputs. L1: L1 cables; Reces (x4): 12 fibers (Taxi); XTRP and SVT: one SVT cable each; CLIST: 6 fibers (Hotlink) plus 1 LVDS cable; ISO: 7 fibers (Taxi); MUON: 16 fibers (Hotlink). The boards share the Magicbus with the Alphas (x4); the crate also holds the TRACER and an MVME.]
• After L1A, data arrives at each interface board with a different latency (L1 within 132 ns, XTRP within ~1 us, then the rest; SVT takes 10 us or so, the longest);
• most boards (L1, XTRP, CLIST, IsoList and SVT) will then request the bus and send their data to the alpha over the Magicbus;
• the alpha will process the data; if it needs muon and Reces data (for some events), it will get that data via programmed I/O over the Magicbus from the muon and Reces boards;
• once the decision is made, the alpha will handshake with the Trigger Supervisor (via the L2-TS cable).
Level 2 trigger input data paths SVT XTRP L1 CLIST ISO Muon Reces Incoming data Clock rate 30Mhz 7.6Mhz 7.6Mhz 20Mhz 12Mhz 30Mhz cdfclk x 4 7.6 Mhz cdfclk SVT cable L1 cable Hotlink+fiber Taxi+fiber Taxi+fiber Interface hardware SVT cable Hotlink+fiber 46bits/clu data size range 150bits/trk 21 bits/trk 96 bits/evt 145bits/clu 11Kbits/evt 1.5Kb/evt ~1us - 10us Latency range* ~132 ns ~1-20us ~few us ~1-5 us ~ 6 us ~10-100us Fixed or variable data length? variable fixed fixed variable variable variable fixed yes yes yes yes yes no yes Data with Buffer#? no yes yes no yes - EOE with data? (or from separate path?) - no no no no yes BC# B0 marker? BC# yes yes no Data gap within one event? yes no no no Flow control ? no no no no not used no Not used * Latency range also depends on L1A history …
Design issues for a universal test board:
• Hardware requirement is clear:
--- have all hardware interfaces for all data paths;
• Firmware requirements need more thinking:
--- variable data size for some subsystems;
--- variable latency (from event to event);
--- correct buffer number in the data for a given L1A;
--- gaps for certain data paths;
--- record real data and reproduce it in the test stand;
--- response to HRR etc.;
--- ….???
Basic hardware requirement: have all hardware interfaces. Pulsar is designed to have all the data interfaces that the Level 2 decision crate has. It is a data source for all trigger inputs to the Level 2 decision crate, and it can be used to record data from upstream as well. PulsAR: Pulser And Recorder.
[Figure: Pulsar interfaces (Hotlink IO, Taxi IO, SVT/XTRP, L1, TS, CDFctrl, VME) feeding the L2 decision crate boards: L1, Reces, α, SVT, XTRP, CLIST, ISO, MUON.]
What are the test tools ? -> two types of board: Reces MMB Magic Mystery Board HP scope CDFctrl SVT format M M B L 1 α S V T X T R P C L I S T I S O M U O N Magicbus L2 test crate TS P U L S A R P U L S A R P U L S A R P U L S A R P U L S A R L2 inputs L2 decision crate Logical Analyzer HP scope Hotlink IO Pulsar CDFctrl Taxi IO SVT/XTRP VME L1 TS
Level2_Pulsar design as test stand tool 9U VME (VME FPGA not shown) 3 Altera APEX 20K400 FPGAs Optical IO Mezz card connectors T S T R K CDF ctrl Control Optical IO L 1 T R K Pulsar: Pulser and Recorder (as Level 2 test stand tools)
Front-panel (double width) PULSAR design as test tool only T S Mezzanine Card L 1 IO Hotlink/ Taxi L 1 Ctrl S V T IO S V T component side Other connectors (1 L1, 1 TS) will stay inside the board. The mezzanine card connectors are used for optical I/O (hotlink and taxi)
CustomMezzanine cards • Hotlink: Tx and Rx (CLIST, Muon data paths) • Taxi: Tx and Rx (Iso, Reces data paths) Altera EP1K30_144 FPGA FIFO J3 FIFO CMC connector FIFO J1 FIFO Hotlink Optical Tx/Rx: HFBR-1119T/2119T Taxi Optical Tx/Rx: HFBR-1414T/2416T Hotlink or Taxi Tx/Rx chips Hotlink Tx/Rx: CY7B923JC/933JC Taxi Tx/Rx: AM7968/7969-175JC
Hotlink Rx mezzanine Hotlink Tx mezzanine Usually has 4 fiber connectors. add one LVDS connector for CLIST case: only two fiber connector (left side) will be loaded for one Mezzanine card, and one LVDS connector will be loaded on the right side instead of two fiber connectors
Pulsar boards communication lines on P2
• Will use SVT-like P2 user-defined signals for inter-board communication:
--- Pulsar_init* (P2 A1);
--- Pulsar_error* (P2 A2);
--- Pulsar_freeze* (P2 A3);
--- Pulsar_Lostlock* (P2 A4) (for SLINK etc.);
--- Pulsar_spare* (P2 A5).
Any Pulsar board can drive and listen to these signal lines.
Control unit L1A Buf # controller L1 data SVT data XTRP data FIFOs Internal Test RAM The latency is controlled by when the data is clocked out the FIFO If we want to load large number of events, we will need to use external SRAM. SRAM
hotlink examples: Muon case (only one mezzanine card shown) Optical IO Unit L1A Buf # controller FIFO FIFO FIFO FIFO Internal Test RAM The latency is controlled by when the data is clocked out the FIFO SRAM
Another hotlink example outputs: 6 fibers + LVDS FIFO Optical IO Unit for CLIST case FIFO FIFO FIFO F E D C B A 8 bits each @50ns, one cluster is encoded in 6 8-bit words in all 6 fibers FIFO FIFO LVDS FIFO 8 bits data streams will be pushed into the FIFOs in the mezzanine card after L1A, later on they will be clocked out onto fibers. The end of event marker comes out via LVDS connector.
Simple way to load test patterns and send them out (optical paths as an example): the FPGA internal RAM holds one region per buffer (buffer0 to buffer3), 8 bits wide per fiber and 128 words deep, and feeds a clocked FIFO in front of each fiber Tx. The actual latency is controlled by when the data is clocked out of the FIFO after L1A (set by a register via VME). This means the latency is fixed for a given test run. That is not good enough to mimic the real system, where the latency varies from event to event, but it may be good enough for testing spares.
Another way to load the test pattern memory: use a 36-bit data width. 32 bits are for 4 fiber outputs (4 x 8), and the highest 4 bits are used as control bits to mark the content of the data. For each event's worth of data, the first word is the header, whose 32 data bits contain the latency (and number of words, etc.) for this particular event and this particular path. The last word is the trailer, which can contain other information if needed (such as what the L2 decision should be). Either the internal RAM or an external SRAM with 16-bit address and 36-bit data can be used. The highest two address bits are controlled by the buffer number, which automatically divides the memory among the 4 buffers.
[Figure: 36-bit word layout, 4 control bits followed by four 8-bit data fields; memory regions for Buffer0 to Buffer3 data.]
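A minimal Python sketch of this word layout and addressing scheme follows. The control bit positions (35 header, 34 trailer, 33 gap, 32 reserved) are the ones listed on the next slide; which byte goes to which fiber, and the helper names `pack_word`, `unpack_word` and `sram_address`, are assumptions made for illustration.

```python
HEADER_BIT, TRAILER_BIT, GAP_BIT, RESERVED_BIT = 35, 34, 33, 32


def pack_word(fiber_bytes, header=False, trailer=False, gap=False):
    """Build one 36-bit memory word from four 8-bit fiber values."""
    if len(fiber_bytes) != 4 or any(not 0 <= b <= 0xFF for b in fiber_bytes):
        raise ValueError("need exactly four 8-bit values")
    word = 0
    for i, value in enumerate(fiber_bytes):
        # Assumed ordering: fiber i occupies bits 8*i .. 8*i+7.
        word |= value << (8 * i)
    word |= int(header) << HEADER_BIT
    word |= int(trailer) << TRAILER_BIT
    word |= int(gap) << GAP_BIT
    return word


def unpack_word(word):
    """Split a 36-bit memory word back into control flags and fiber bytes."""
    return {
        "header": bool((word >> HEADER_BIT) & 1),
        "trailer": bool((word >> TRAILER_BIT) & 1),
        "gap": bool((word >> GAP_BIT) & 1),
        "fibers": [(word >> (8 * i)) & 0xFF for i in range(4)],
    }


def sram_address(buffer_number, offset):
    """16-bit address whose two highest bits are the L1A buffer number."""
    if not 0 <= buffer_number <= 3:
        raise ValueError("buffer number must be 0..3")
    if not 0 <= offset < (1 << 14):
        raise ValueError("offset must fit in the lower 14 address bits")
    return (buffer_number << 14) | offset
```

With this split each buffer owns a contiguous quarter of the memory (16K words of a 64K-word SRAM), so the firmware never needs to compute a base address: the buffer number from the L1A simply drives the top two address lines.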
How does it work:
(1) after L1A, read the first word (header) to get the latency, and at the same time start a counter;
(2) continue to read out the rest of the data words from the memory and clock them into a FIFO, until the trailer is reached (the L2 decision information can be found there);
(3) once the counter reaches the latency threshold, clock the data out of the FIFO at a speed that matches the subsystem.
This way the latency for each event and each data path can be individually controlled by the user. One could have more control by inserting gaps in between data words, etc., using the 4 control bits, to better mimic the real situation for certain data paths. This approach seems to be quite flexible.
[Figure: Buffer 0 data memory holding the 1st event: a header (latency for this event, and other info), data words, and a trailer (other information, such as what the L2 decision should be).]
Ctrl bit 35: header. Ctrl bit 34: trailer. Ctrl bit 33: gap. Ctrl bit 32: reserved.
Initial thoughts on tester firmware design
• Latch L1A + buf#;
• read the 1st word from RAM;
• save the latency and compare it with the counter;
• continue reading data from RAM to the FIFOs until the last word;
• once the counter counts up to the latency, enable the FIFO output and the control signals for the Tx chips;
• ready for the next L1A.
[Figure: a state machine takes L1A and buf# and drives the RAM address; the 36-bit RAM (Buffer 0 to Buffer 3 data, loaded via VME) feeds four FIFOs on the mezzanine card side, together with control signals; counters counter0 to counter3 time the latency.]
Notes:
(1) The counter will be reset by the state machine, to rearm for the next L1A;
(2) if the counter has already counted beyond the latency for a given L1A, clock the data out right away;
(3) the RAM address is controlled by an address counter and the buffer number of the current event;
(4) HRR: reset the L1A FIFO and all output FIFOs, and wait for the B0 marker to pass by before enabling…
(5) to keep things simple, for now just use single-event data for a given buffer (to keep the RAM address simple);
(6) what is shown is for the 4-fiber case; one more internal RAM is needed for 8 fibers… if external SRAM is used, ping-ponging is needed…
Possible implementation A:
[Figure: state diagram with states null, header, RAMtoFIFO, latency, output, done; the L1A starts the sequence.]
Comments:
Pro: simple.
Con: not so elegant, as the state machine has to finish sending all data out of the FIFO before it is able to process the next event. This may be OK if it runs at a higher clock rate. There will be some intrinsic delay between events.
Good starting point; it allows us to simulate the board soon. It would be better to separate the RAM-to-FIFO part from the actual data sending part.
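The state sequence of implementation A (null, header, RAMtoFIFO, latency, output, done) can be sketched in Python as below. This is a behavioural illustration only: the memory is modelled as a list of `(kind, payload)` tuples instead of 36-bit words, one state-machine step is assumed per clock tick, gap words are not modelled, and the name `run_event` is hypothetical.

```python
from collections import deque


def run_event(memory):
    """Process one L1A; return ([(tick, word), ...], trailer_info).

    `memory` holds one event: ("header", latency_in_ticks), any number
    of ("data", word) entries, then ("trailer", info).
    Ticks are counted from the L1A.
    """
    state = "header"  # left the "null" state because an L1A arrived
    counter = 0       # latency counter, started at the L1A
    addr = 0
    latency = 0
    trailer_info = None
    fifo = deque()
    sent = []
    while state != "done":
        if state == "header":
            kind, latency = memory[addr]
            if kind != "header":
                raise ValueError("event must start with a header word")
            addr += 1
            state = "ram_to_fifo"
        elif state == "ram_to_fifo":
            kind, payload = memory[addr]
            addr += 1
            if kind == "trailer":
                trailer_info = payload
                state = "latency"
            else:
                fifo.append(payload)
        elif state == "latency":
            # If the counter is already past the latency, this fires at once.
            if counter >= latency:
                state = "output"
        if state == "output":
            if fifo:
                sent.append((counter, fifo.popleft()))
            if not fifo:
                state = "done"
        counter += 1
    return sent, trailer_info
```

With a header latency of 6 and two data words, the words leave on ticks 6 and 7. With a latency of 2, the RAM readout itself takes longer than the latency, so the words leave as soon as the trailer has been read (ticks 4 and 5). The sketch also makes the "Con" visible: nothing else can start until the output state has drained the FIFO.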
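Possible implementation B:
[Figure: two cooperating parts. The first goes null, header, RAMtoFIFO, done on each L1A and fills the FIFO; it passes the latency and a "data ready to be sent" flag through counterQ to a separate output controller, which reads the FIFO.]
Comments:
Pro: more elegant.
Con: somewhat more involved. Implement this later, for the real thing.
We decided to go for implementation A first.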
A few comments:
• On average, the largest data size comes from the muon path. Each fiber can send up to 30 x 4 = 120 8-bit words per event (with 16 fibers total). If we use an SRAM with 36 data bits and 16 address bits, we can load up to a few hundred different events for muon for a given test run. Many more can be loaded for other subsystems.
• The latency for each event and for each data path (arrival time into the L2 decision crate after L1A) can be controlled by the user to better mimic the real system.
• The long data gaps or long delays of some subsystems can be simulated this way.
• Where does the latency for individual events come from?
* estimate it based on data size (i.e. more clusters -> longer latency etc.);
* it may be possible to record the real data with Pulsar in recorder mode and time stamp the incoming data during recording; the data can then be saved and used later to reproduce the real data with real timing;
* note: the actual latency also depends on the history of L1As.
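The "few hundred events" estimate can be checked with a short calculation. The sketch below assumes that one 36-bit SRAM word carries one 8-bit word for each of 4 fibers (so one SRAM serves 4 fibers), that each event costs one header and one trailer word on top of its data, and that the memory is split equally among the 4 L1A buffers; the function name is hypothetical.

```python
def events_that_fit(address_bits=16, data_words_per_event=120,
                    overhead_words=2, n_buffers=4):
    """Return (events per buffer, events in the whole SRAM)."""
    total_words = 1 << address_bits                 # 65536 words for 16 bits
    words_per_event = data_words_per_event + overhead_words
    per_buffer = (total_words // n_buffers) // words_per_event
    return per_buffer, per_buffer * n_buffers


per_buffer, total = events_that_fit()
print(per_buffer, total)
```

Under these assumptions a full-size muon event takes 122 words, so about 134 events fit per buffer and 536 in the whole SRAM, which is consistent with "a few hundred". Subsystems with smaller events scale up accordingly.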
For example, it may be possible to record XTRP data with timing information: would this be useful?
[Figure: a counter started by the L1A for buffer n feeds a time-recording RAM, alongside a data-recording RAM for the XTRP input.]
Both the data and its arrival time can be recorded, using the data strobe from XTRP. The data latency AND the “gap” information can be recorded this way, and can be reproduced in test stand mode. Since Pulsar has both XTRP input and output connectors, spying on the data is possible. For fiber connections, one needs to use fiber splitters to spy on the data, or take short test runs.
Pulsar in recorder mode. Hotlink example: muon case (only one mezzanine card shown).
[Figure: Optical IO Unit with an L1A buf# controller, four FIFOs, internal test RAM and external SRAM, read out via VME.]
Configure the (S)RAM as a circular buffer for recording (for each L1A); recording can be stopped and the buffer read out via VME. Each Optical IO FPGA looks at 8 fiber channels, while the SRAM has 32+4 bits, so ping-ponging is needed for recording (recording is at twice the incoming data rate, 60 MHz).
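The circular-buffer recording idea can be illustrated with a few lines of Python. This is a software sketch under assumed names (`CircularRecorder`, `record`, `stop`, `read_out`); it stores a time stamp next to each word, and it leaves out the ping-ponging between the two 4-fiber halves.

```python
class CircularRecorder:
    """Fixed-depth memory that always keeps the most recent words."""

    def __init__(self, depth):
        if depth < 1:
            raise ValueError("depth must be at least 1")
        self.depth = depth
        self.mem = [None] * depth
        self.write_ptr = 0
        self.count = 0
        self.running = True

    def record(self, timestamp, word):
        """Store one incoming word with its arrival time, overwriting the oldest."""
        if not self.running:
            return
        self.mem[self.write_ptr] = (timestamp, word)
        self.write_ptr = (self.write_ptr + 1) % self.depth
        self.count = min(self.count + 1, self.depth)

    def stop(self):
        """Freeze the buffer, e.g. on an error condition, before readout."""
        self.running = False

    def read_out(self):
        """Return the stored (timestamp, word) pairs, oldest first."""
        start = (self.write_ptr - self.count) % self.depth
        return [self.mem[(start + i) % self.depth] for i in range(self.count)]


rec = CircularRecorder(depth=4)
for t in range(6):
    rec.record(t, 0x10 + t)
rec.stop()
print(rec.read_out())
```

After six writes into a depth-4 buffer, the readout holds only the last four words (time stamps 2 to 5), oldest first. Stopping on an error and reading back the history is what makes "catch errors and reproduce them" possible.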
Ideal test stand setup: Alpha + Pulsar + interface board(s) (this setup has already been done at VME speed by Steve, Matt and Peter, with an SVT Merger board acting as Pulsar).
[Figure: test crate with TRACER, ALPHA, PULSAR, interface board and ROC; Pulsar drives the interface board data input, and the interface board reaches the Alpha over the Magicbus.]
Data source: Level2_Pulsar. Data sink: Alpha.
• can test an individual board;
• can test the full data path;
• can also test multiple boards and check for interference;
• note that with only 2 Pulsar boards one can source data at the same time for L1, XTRP, SVT, CLIST, Iso and muon;
• need one extra Pulsar to drive one Reces;
• …. what else?
Possible data patterns: (1) hand made; (2) derived from MC; (3) derived from data bank; (4) recorded from upstream, to catch errors and reproduce them.
Another possible setup: Pulsar + MMB + Interface board Magicbus Analyzer M M B Magic bus or TDC backplane T R A C E R P U L S A R I N T F A C E or TESTCLK? R O C data input Data source: Level2_Pulsar Data sink: MMB+Pulsar/GB in case Alpha is not available, it is possible to use MMB to sink the data and convert into SVT data format then send the data into Pulsar or a GhostBuster board.
Possible to drive the full system Magicbus watchdog M M B L 1 Reces x 4 Magic bus alphas X T R P S V T C L I S T I S O M U O N 16 fibers X 3 =48 16 fibers T R A C E R P U L SAR P ULSAR P U L S A R P U L S A R P U L S A R with different R O C L2 test crate L1A rate & Event patterns 5 Pulsar boards needed to drive the full system
CLIST cluster information from one LOCOS: six 8-bit words per cluster on one fiber input, arriving 50 ns apart.
train no.   I          II        III          IV         V         VI
-----------------------------------------------------------------------------
sig_0       1          em(5)     1            had(5)     1         crate_sel
sig_1       L1AB(0)    em(6)     passbit(0)   had(6)     phi(0)    ntow(0)
sig_2       L1AB(1)    em(7)     passbit(1)   had(7)     phi(1)    ntow(1)
sig_3       em(0)      em(8)     had(0)       had(8)     eta(0)    ntow(2)
sig_4       em(1)      em(9)     had(1)       had(9)     eta(1)    ntow(3)
sig_5       em(2)      em(10)    had(2)       had(10)    eta(2)    ntow(4)
sig_6       em(3)      em(11)    had(3)       had(11)    eta(3)    ntow(5)
sig_7       em(4)      em(12)    had(4)       had(12)    eta(4)    ntow(6)
Data format from Monica.
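For building test patterns, the table above can be turned into a pack/unpack pair. The Python sketch below assumes that sig_n is bit n of each 8-bit word; the function and field names are chosen for the example and are not taken from any existing CDF software.

```python
def pack_cluster(em, had, eta, phi, ntow, l1a_buffer, passbits, crate_sel):
    """Encode one cluster as the six 8-bit words I..VI of the table."""
    widths = {"em": 13, "had": 13, "eta": 5, "phi": 2, "ntow": 7,
              "l1a_buffer": 2, "passbits": 2, "crate_sel": 1}
    values = {"em": em, "had": had, "eta": eta, "phi": phi, "ntow": ntow,
              "l1a_buffer": l1a_buffer, "passbits": passbits,
              "crate_sel": crate_sel}
    for name, width in widths.items():
        if not 0 <= values[name] < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
    return [
        1 | (l1a_buffer << 1) | ((em & 0x1F) << 3),   # I:   em(4:0), L1AB, 1
        em >> 5,                                      # II:  em(12:5)
        1 | (passbits << 1) | ((had & 0x1F) << 3),    # III: had(4:0), pass, 1
        had >> 5,                                     # IV:  had(12:5)
        1 | (phi << 1) | (eta << 3),                  # V:   eta(4:0), phi, 1
        crate_sel | (ntow << 1),                      # VI:  ntow(6:0), crate_sel
    ]


def unpack_cluster(words):
    """Decode six 8-bit words back into the cluster fields."""
    w1, w2, w3, w4, w5, w6 = words
    return {
        "em": (w1 >> 3) | (w2 << 5),
        "had": (w3 >> 3) | (w4 << 5),
        "eta": w5 >> 3,
        "phi": (w5 >> 1) & 0x3,
        "ntow": w6 >> 1,
        "l1a_buffer": (w1 >> 1) & 0x3,
        "passbits": (w3 >> 1) & 0x3,
        "crate_sel": w6 & 0x1,
    }
```

A round trip through `pack_cluster` and `unpack_cluster` returns the original fields, which makes it a convenient self-check when generating hand-made or MC-derived CLIST patterns.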
1 CLIQUE connection (via 10-pin twisted ribbon cable). The LVDS signals are driven by a 16.7 nsec clock which is a divided-by-8 copy of the 132 nsec CDF clock:
pin 1   BUF_DONE(0)+
pin 2   BUF_DONE(0)-
pin 3   BUF_DONE(1)+
pin 4   BUF_DONE(1)-
pin 5   CRSUM_SEND+ (not received by CLIST)
pin 6   CRSUM_SEND- (not received by CLIST)
pin 7   EVENT_DONE*+
pin 8   EVENT_DONE*-
pin 9   unused
pin 10  unused
The time of EVENT_DONE* with respect to the last cluster found in the event is fixed.
8-bit wide cluster data (assume this is the last cluster for the event):
Em(4:0), buff(1:0), 1
Em(12:5)
Had(4:0), pass(1:0), 1
Had(12:5)
Eta(4:0), phi(1:0), 1
Ntow(6:0), crate_sel
LVDS output, CLIQUE control word: Evt_done, buff(1:0)
Information from Eric James about muon input:
Each matchbox card sends up to 32 24-bit words for each fiber. The transfer time for each word is 132 ns. Each 24-bit word is encoded into a 32-bit transfer over hotlink and comes in as four groups of 8-bit words.
Transfer A (1st 33 ns): bit0 to bit6 - data(0) to data(6); bit7 - VCC
Transfer B (2nd 33 ns): bit0 to bit6 - data(7) to data(13); bit7 - VCC
Transfer C (3rd 33 ns): bit0 to bit6 - data(14) to data(20); bit7 - Bunch Zero Marker
Transfer D (4th 33 ns): bit0 to bit2 - data(21) to data(23); bit3 to bit5 - GND; bit6 - L2 Endmark; bit7 - GND
There is a register on the Matchbox card which gives one the ability to send zero, ten, twenty, or thirty words to L2. This feature was included in case we needed to complete our data transfer within a given time window to make the system work. The central trigger primitives are sent in the first ten words, the forward trigger primitives are sent in the next ten words, and L1 trigger decision data is sent in the last ten words. If one looks at the table in section 29.5.1 of CDF4152, the words which get sent to L2 begin with the High Pt CMU East bits (P0+3) and end with the IMB Diagnostic bits (P0+32). The output ordering of the words is the same as that shown in the table.
The pre-match connections work in exactly the same way. There are only 16 24-bit words output to L2 from each pre-match card. From the table in section 29.5.2 of CDF4152, the first word which gets sent is the CMP primitives for stacks 00-23 (P0+2). The last word sent is CMP/CSP west matches for stacks 72-95 (P0+17). The ordering is the same as in the table.
There are also register bits on the Pre-Match to control the number of words being sent. For this card one would transfer either zero, eight, or sixteen words.
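The four-transfer encoding of one 24-bit muon word can be written out as a short Python sketch, which may help when preparing hotlink test patterns. VCC is taken as logic 1 and GND as logic 0; the function names are hypothetical.

```python
def encode_muon_word(word24, bunch_zero=False, endmark=False):
    """Split one 24-bit word into the four 8-bit transfers A, B, C, D."""
    if not 0 <= word24 < (1 << 24):
        raise ValueError("word must fit in 24 bits")
    a = (word24 & 0x7F) | 0x80                          # data(0..6),  bit7 = VCC
    b = ((word24 >> 7) & 0x7F) | 0x80                   # data(7..13), bit7 = VCC
    c = ((word24 >> 14) & 0x7F) | (int(bunch_zero) << 7)  # data(14..20), bit7 = B0
    d = ((word24 >> 21) & 0x07) | (int(endmark) << 6)   # data(21..23), bit6 = endmark
    return [a, b, c, d]


def decode_muon_word(transfers):
    """Rebuild (word24, bunch_zero, endmark) from transfers A, B, C, D."""
    a, b, c, d = transfers
    word24 = ((a & 0x7F)
              | ((b & 0x7F) << 7)
              | ((c & 0x7F) << 14)
              | ((d & 0x07) << 21))
    return word24, bool(c & 0x80), bool(d & 0x40)


print([hex(t) for t in encode_muon_word(0xFFFFFF)])
```

An all-ones data word encodes to `0xff, 0xff, 0x7f, 0x07`: bit 7 of transfer C and the upper bits of transfer D stay clear unless the bunch zero marker or the L2 endmark is set.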
CDF Muon bank data format Muon data as an example. Each word is 24 bits (sent as 4 8-bit words over hotlink).
Sourcing data for each individual data path may not be hard, but driving the full system to test its robustness and rate capability needs more thinking. To keep things simple, it will involve the TS, and it will most likely take system time if we need the TS to use L2 decisions (this can be done when there is no beam). The calibration L1A strobe is only good for the initial tests; the real test requires L1A patterns close to the real situation. Can the TS generate these patterns?
From test stand tools to a possible upgrade path
• Only a few minor modifications are needed:
--- add a P3 connector for SLINK IO;
--- make the L1 and SVT/XTRP inputs visible to all 3 FPGAs.
• Note: the input mezzanine card connector is already compatible with SLINK cards.
• Since the modifications are simple at the hardware level, it doesn't hurt to add them in, to make the board more general purpose. They provide the interface to a PC, which could be very useful as a test tool as well.
Level2_Pulsar baseline design (as a tester only) 9U VME (VME FPGA not shown) 3 Altera APEX 20K400 FPGAs Optical IO Mezz card connectors T S T R K CDF ctrl Control Optical IO L 1 T R K Pulsar: Pulser and Recorder (as Level 2 test stand tools)
Level2_Pulsar design (enhanced version): 9U VME (VME FPGA not shown), 3 Altera APEX 20K400 FPGAs. From test stand tool to a more general purpose board: only a few minor changes are needed. SLINK signal lines are added on P3, and the L1 and SVT/TRK inputs are also made available to all FPGAs.
[Figure: board layout with P1, P2 and P3 connectors, two Optical IO FPGAs with mezzanine card connectors, a Control FPGA, and the TS, L1, TRK and CDF ctrl connections.]
Front-panel (double width) PULSAR design T S Mezzanine Card L 1 IO Hotlink/ Taxi/ Slink etc L 1 C LSC S V T IO To/ from PC S V T SLINK component side Other connectors (L1 out, TS in) will stay inside the board, only used in pulser mode, The mezzanine card connectors can be used either for optical I/O or SLINK cards
Loaded with SLINK Mezzanine cards Can simply use CDF CAL backplane. CERN sent us two transition modules The transition module is very simple(just a few SLINK CMC connectors). Will make our own (it is not commercially available). It uses P2 type connector for P3.
Examples of SLINK products (we ordered them and have most of them) LSC (Link Source Card), LDC (LINK Destination Card) Mezzanine card which can plug onto motherboard via CMC(Common Mezzanine Card) Connector (just like PMC). PCI to SLINK SLINK to PCI Proven technology, has been used by a few experiments to take hundreds of TB data without problem in the past few years
ATLAS SLINK data format CDF SLINK format will look very similar…. can follow TL2D bank format for each subsystem, Will need to define the format for each subsystem soon. Inputs welcome. Output from each Pulsar interface board will be in SLINK format
Design philosophy:Simplicity/Uniformity&Testability ONE for ALL and ALL for ONE design Level2_PULSAR stands for Level2_Processor Controller (3) pre-processor/merger (2) pULSer (1) And Readout (2) (1) as pulser; (2) as pre-processor/merger & readout; (3) as Processor Controller (with SLINK to PCI) Backward compatible (with existing system) Based on proven modern technology: CERN SLINK products (use as much as possible commercial available products)
L1 trk svt clist Iso reces mu SumEt,MEt Tracks Jets electrons photons muons Taus Tags Met SumEt • What does Level 2 really do? • Create all the trigger objects needed, then • Count objects above thresholds, or, • Cut on kinematics quantities From Henry Fish’s comments on L2 upgrade
New System Reces Pre-processor X 3 Reces Pre-processor X 3 16 x3 fibers 3 Reces Pre-processor + 1 merger TS Reces/trk L2 decision SLINK Slink to PCI cluster 6 fibers Global Processor Controller Cluster Pre-processor SLINK CPU Cluster/trk Iso 7 fibers PCI to Slink SLINK L1 S VT Muon/trk 16 fibers Muon Pre-processor SumET,MET from a L1 type cable (another mezzanine card) XTRP L1 bits
New system configuration 16x3 fibers taxi TS Reces/Trk Cluster/trk SLINK 6 hotlink + LVDS CPU Muon/trk 7 taxi fibers SLINK L1 SVT 12 hotlink matchbox SumEt,MET 4 hotlink prematchbox Red: FPGAs require largest internal buffering capability L1 XTRP
M U O N R E CE S R E CE S R E CE S M E R G E R P R O C E S S O R T R A CE R C L U S T E R New L2 Decision crate R O C TS Reces/trk Cluster/trk Muon/trk SumEt,MEt Slink to PCI mem All Pulsar boards take two slots (due to mezzanine cards) Total: 7 pulsar boards = 14 slots PCI to Slink Baseline design: use pre-processors to simply suppress/organize data, use processor Controller to simply pass data to CPU via Slink to PCI and also handshake with TS. All trigger algorithm will be handled by CPU. CPU GHz PC or VME processor