DQ2 status & plans

  1. DQ2 status & plans BNL workshop October 3, 2007

  2. Outline Focusing mostly on site services • Releases • Observations from 0.3.x • 0.4.x • Future plans

  3. Release status • 0.3.2 (‘stable’) in use on OSG [ I think :-) ] • 0.4.0 being progressively rolled out • and 0.4.0_rc8 in use on all LCG sites • 0.3.x was focused on moving the central catalogues to a better DB • The site services were essentially the same as in 0.2.12 • 0.4.x is about the site services only • In fact, the 0.4.x version number applies to the site services only; clients remain on 0.3.x

  4. Observations from 0.3.x • Some problems were solved: • overloading the central catalogues • big datasets • choice of the next files to transfer • … but this introduced another problem: • as we became more ‘successful’ in filling up the site services’ queue of files… • the site services became overloaded with expensive ORDERing queries (e.g. to choose which files to transfer next - see the sketch below)
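
A rough illustration of the kind of query that became expensive (table and column names below are invented for the example, not the real DQ2 schema): with hundreds of thousands of queued rows, re-sorting the whole backlog on every polling cycle just to pick the next batch is costly.

    # Hypothetical example of a per-cycle "pick the next files" query that
    # sorts the entire backlog; with 10^5-10^6 queued rows this full sort
    # runs again on every poll.
    PICK_NEXT_SQL = """
        SELECT file_guid, dataset, source_site
          FROM transfer_queue              -- invented table name
         WHERE state = 'QUEUED'
      ORDER BY priority DESC, request_time ASC   -- the expensive full ordering
         LIMIT 200
    """

    def pick_next_files(conn):
        # conn is any DB-API connection to the site services database
        cur = conn.cursor()
        cur.execute(PICK_NEXT_SQL)
        return cur.fetchall()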

  5. Observations from 0.3.x • The ramp-up in simulation production, followed by poor throughput, created a large backlog • Also, the site services were (still are) insufficiently protected against requests that are impossible to fulfill: • “transfer this dataset, but keep polling for data to be created - or streamed from another site” … but the data never arrived, was never produced, or was subscribed from a source that was never meant to have it (see the sketch below)
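
One possible guard, sketched here with invented names and an assumed one-week grace period, is to mark subscriptions as BROKEN once no source replica has ever appeared, so the site services stop polling for data that will never come:

    import time

    MAX_WAIT_SECONDS = 7 * 24 * 3600   # assumed grace period, not a DQ2 setting

    def expire_hopeless_subscriptions(subscriptions, now=None):
        """Mark subscriptions whose files never showed up anywhere as BROKEN."""
        now = now or time.time()
        for sub in subscriptions:                       # sub: dict with invented keys
            never_seen = not sub.get('known_replicas')  # no source ever had the data
            too_old = now - sub['created'] > MAX_WAIT_SECONDS
            if never_seen and too_old:
                sub['state'] = 'BROKEN'   # stop polling; can be resubscribed later
        return subscriptions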

  6. Observations from 0.3.x • The most relevant changes to the site services were for handling large datasets • less important for PanDA production, but very relevant for LCG, where all datasets were large

  7. Implementation of 0.3.x • The implementation was also too simplistic for the load we observed: • the MySQL database servers were not extensively tuned, there was too much reliance on the database for IPC, and heavy overhead came from ‘polling’ FTS and the LFCs (e.g. no GSI session reuse for LFC - see the sketch below) • loads >10 were common (I/O+CPU), in particular with the MySQL database on the same machine (~10 processes doing GSI plus mysqld doing ORDER BYs)
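
For the LFC point, session reuse can be illustrated as follows; open_lfc_session and lookup_replica are stand-ins, not the real LFC client calls. The point is that one GSI handshake is amortized over a whole batch of catalogue lookups instead of being paid per file:

    def lookup_replicas(lfc_host, guids, open_lfc_session, lookup_replica):
        # open_lfc_session/lookup_replica are hypothetical callables; the
        # session object is assumed to expose close()
        session = open_lfc_session(lfc_host)      # single expensive GSI handshake...
        try:
            return dict((g, lookup_replica(session, g)) for g in guids)  # ...reused per lookup
        finally:
            session.close()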

  8. Deployment of 0.3.x • FTS @ T1s is usually run on high-performing hardware, with the FTS agents split from the database • … but FTS usually holds only a few thousand files in the system at a typical T1 • DQ2 is usually deployed with the DB on the same host as the site services • and its queue holds hundreds of thousands of files at a typical T1 • DQ2 is the system throttling FTS (see the sketch below) • it is where the expensive brokering decisions are made and where the larger queue is maintained
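
A simplified sketch of this throttling role (the cap and all function names are assumptions, not the actual DQ2 code): DQ2 keeps the big backlog in its own database and only tops FTS up to a small per-channel ceiling.

    FTS_CHANNEL_CAP = 2000   # assumed ceiling, of the order FTS holds at a T1

    def top_up_channel(channel, count_active_in_fts, pick_next_from_backlog, submit_to_fts):
        """Submit just enough files to keep the FTS channel full, no more."""
        free_slots = FTS_CHANNEL_CAP - count_active_in_fts(channel)
        if free_slots <= 0:
            return 0                               # FTS already has enough work
        batch = pick_next_from_backlog(channel, free_slots)  # brokering stays in DQ2
        submit_to_fts(channel, batch)
        return len(batch)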

  9. Evolution of 0.3.x • During 0.3.x, the solution ended up being to simplify 0.3.x, in particular reducing its ORDERing • Work on 0.4.x had already started • to fix the database interactions while maintaining essentially the same logic

  10. 0.4.x • New DB schema, with more reliance on newer MySQL features (e.g. triggers, as sketched below) and less on expensive features (e.g. foreign key constraints) • Able to sustain and order large queues (e.g. FZK/LYON are running with >1.5M files in their queues) • One instance of DQ2 0.4.x is used to serve ALL M4 + Tier-0 test data • One instance of DQ2 0.4.x is used to serve 26 sites (FZK + T2s AND LYON + T2s)
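
As a purely hypothetical illustration of the trigger-based direction (table and column names invented, not the 0.4.x schema): a trigger can keep a per-dataset summary counter in sync on insert, avoiding foreign-key checks on every row.

    # Invented example: maintain a pending-file counter via a MySQL trigger.
    CREATE_TRIGGER_SQL = """
    CREATE TRIGGER trg_file_queued AFTER INSERT ON files_queue
    FOR EACH ROW
      UPDATE dataset_summary
         SET pending_files = pending_files + 1
       WHERE dataset_id = NEW.dataset_id
    """

    def install_trigger(conn):
        # conn is any DB-API connection to the MySQL backend
        cur = conn.cursor()
        cur.execute(CREATE_TRIGGER_SQL)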

  11. What remains to be done • Need another (smaller) iteration on channel allocation and transfer ordering • e.g. in-memory buffers to prevent I/O on the database, creating temporary queues in front of each channel with tentative ‘best files to transfer next’ (see the sketch below) • Some work remains to make the services resilient to failures • e.g. a dropped MySQL connection • Still need to tackle some ‘holes’ • e.g. queues of files for which we cannot find replicas may still grow forever • if a replica does eventually appear for one of them, the system may take too long to consider that file for transfer • … but we have already introduced BROKEN subscriptions to release some load
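
A minimal sketch of the planned per-channel buffer idea, with all names assumed: keep a small pre-ordered in-memory queue per channel so the expensive database ordering runs once per refill rather than once per request.

    import heapq

    class ChannelBuffer(object):
        """Tentative 'best files to transfer next' kept in memory per channel."""
        def __init__(self, channel, refill_from_db, size=500):
            self.channel = channel
            self.refill_from_db = refill_from_db   # callable hitting the DB once per refill
            self.size = size
            self._heap = []                        # entries: (negative_priority, file_guid)

        def next_files(self, n):
            if len(self._heap) < n:                # refill only when running low
                for prio, guid in self.refill_from_db(self.channel, self.size):
                    heapq.heappush(self._heap, (-prio, guid))
            return [heapq.heappop(self._heap)[1]
                    for _ in range(min(n, len(self._heap)))]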

  12. What remains to be done • More local monitoring of the site services • work nearly completed • will be deployed when we are confident it does not cause any harm to the database • we still observe deadlocks • next slides…

  13. Expected patches to 0.4.x • The 0.4.x branch will continue to focus on the site services only • channel allocation, source replica lookup and submit queues + monitoring • Still, the DDM problem as a whole can only be solved by having LARGE files • while we can sustain queues with MANY files, if we continue with the current file size the “per-event transfer throughput” will remain very inefficient (see the back-of-the-envelope sketch below) • plus more aggressive policies on denying/forgetting about subscription requests
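
A back-of-the-envelope sketch with invented numbers, only to show why file size matters: per-file overhead (SRM negotiation, FTS slot, catalogue entries) is roughly constant, so for the same link, fewer and larger files move far more events per hour.

    # All constants below are assumptions for illustration, not measurements.
    PER_FILE_OVERHEAD_S = 10.0      # assumed fixed cost per transferred file
    LINK_RATE_MB_S = 30.0           # assumed rate during the transfer itself

    def events_per_hour(file_size_mb, events_per_mb=2.0):
        seconds_per_file = PER_FILE_OVERHEAD_S + file_size_mb / LINK_RATE_MB_S
        return 3600.0 * file_size_mb * events_per_mb / seconds_per_file

    # e.g. events_per_hour(50) is several times smaller than events_per_hour(2000)
    # on the same link, purely because of the constant per-file overhead.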

  14. After 0.4.x • 0.5.x will include an extension to the central catalogues • location catalogue only • This change follows a request from user analysis and LCG production • The goal is to provide a central overview of incomplete datasets (missing files) • but also to handle dataset deletion (returning the list of files to delete at a site, while coping with overlapping datasets - quite a hard problem! see the sketch below) • First integration efforts (as the prototype is now complete) are expected to begin mid-Nov
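
The overlap problem can be stated concisely; the sketch below uses invented data structures and is not the prototype's actual algorithm: when deleting a dataset at a site, only files not referenced by any other dataset still resident at that site may be removed from storage.

    def files_safe_to_delete(dataset_to_delete, datasets_at_site):
        """datasets_at_site: dict mapping dataset name -> set of file GUIDs at the site."""
        victims = set(datasets_at_site.get(dataset_to_delete, set()))
        for name, guids in datasets_at_site.items():
            if name != dataset_to_delete:
                victims -= guids      # shared with another resident dataset: keep it
        return victims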

  15. After 0.5.x • Background work has begun on a central catalogue update • Important schema change: new timestamp-oriented unique identifiers for datasets (illustrated below) • allowing the backend DB to be partitioned transparently to the user and providing a more efficient storage schema • 2007 datasets on one instance, 2008 datasets on another… • Work has started, on a longer timescale, as the change will be fully backward compatible • old clients will continue to operate as today, to facilitate any 0.3.x -> 1.0 migration
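
Purely illustrative of the idea (not the real identifier scheme): if the identifier's leading characters encode the creation date, the backend can route, say, 2007 and 2008 datasets to different database instances without clients noticing.

    import time, uuid

    def new_dataset_id(now=None):
        # Invented format: date prefix + random suffix, e.g. '20071003.3f2a...'
        ts = time.gmtime(now or time.time())
        return '%04d%02d%02d.%s' % (ts.tm_year, ts.tm_mon, ts.tm_mday, uuid.uuid4().hex)

    def backend_for(dataset_id, backends):
        # backends is an assumed mapping, e.g. {2007: db1, 2008: db2, 'default': db1}
        year = int(dataset_id[:4])
        return backends.get(year, backends['default'])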

  16. Constraints • In March we decided to centralize DQ2 services on LCG • as a motivation to understand the problems with production simulation transfers • as our Tier-0 -> Tier-1 tests had always been quite successful using DQ2 • now, 6 months later, we finally start to see some improvement • many design decisions of the site services were altered to adapt to production simulation behaviour (e.g. many fairly “large” open datasets) • We expect to need to keep operating all LCG DQ2 instances centrally for a while longer • support and operations are now being set up • but there is a significant lack of technical people who know the MySQL/DQ2 internals

  17. Points • Longish threads and the use of Savannah for error reports • e.g. recently we kept getting internal Panda error messages from the worker node for some failed jobs, which were side-effects of a failure in the central Panda part • Propose a single (or at least a primary) contact point for central catalogue issues (Tadashi + Pedro?) • For site services, and whenever the problem is clear, please also post a report on Savannah • We have missed minor bug reports because of this • Clarify DQ2’s role on OSG and identify possible contributions
