CASTOR External Workshop - ASGC Site Status Report
Jason Shih, ASGC Grid Operation
Nov 13, Academia Sinica
Outline
• Current status:
  • Current architecture & implementation
  • H/W profile
  • Milestones
  • Limitations of the current implementation
• Future planning
  • CASTOR v2 implementation
  • Stager catalogue migration
  • DB migration for NS, from MySQL to Oracle
  • Split instances to support multiple VOs
  • Resource expansion planning
H/W Profile (I) - Disk
• Stager/NS: blade system, SMP Xeon 3.0, 4 GB ECC memory (Fig 1)
• Disk servers: blade system, SMP Xeon 3.0, 4 GB ECC, x86_64 (kernel 2.6.9-34.EL.cernsmp) (Fig 1)
  • DAS to backend storage (up to 10.5 TB for a single partition per chassis) (Fig 2)
  • RAID 6, 24 bays in the same enclosure, single controller only (see the capacity sketch below)
  • Total capacity 45 TB
• Sun/STK RAID system shared within a specific zoning for the CASTOR group (redundant controller configuration)
  • FLX-B280 - 20 TB and FLX-B380 - 30 TB (Fig 3)
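As a sanity check on the quoted per-chassis partition limit, a minimal sketch of the RAID 6 arithmetic, assuming the 24 bays hold 500 GB drives (the drive size is an assumption, not stated on the slide):

```python
# Minimal sketch: usable capacity of one RAID 6 chassis. The 500 GB
# drive size is an assumption; RAID 6 reserves two drives' worth of
# capacity for parity.

def raid6_usable_tb(bays: int, drive_tb: float) -> float:
    """Usable capacity in TB for a single RAID 6 group."""
    return (bays - 2) * drive_tb  # two parity drives in RAID 6

if __name__ == "__main__":
    per_chassis = raid6_usable_tb(bays=24, drive_tb=0.5)
    # ~11 TB raw usable, consistent with the ~10.5 TB partition cap
    print(f"usable per chassis: {per_chassis:.1f} TB")
```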
H/W Profile (II) - Tape Lib
• IBM 3584 (Fig 1)
  • Base frame L52 and D52 expansion
  • 700 LTO3 cartridges, 400 GB each
  • Total capacity 280 TB (see the arithmetic below)
• Tape drives (Fig 2)
  • 4 x IBM 3592 LTO3 tape drives
  • 80 MB/s max throughput
• Tape servers (Fig 3)
  • IBM 306m, Pentium dual core, 4 GB ECC
  • 1 for CASTOR v1
  • 2 for CASTOR v2
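The library capacity and the aggregate drive rate follow directly from the per-cartridge and per-drive figures on the slide; a minimal sketch of the arithmetic:

```python
# Back-of-the-envelope tape library figures from the slide:
# 700 LTO3 cartridges at 400 GB each, 4 drives at 80 MB/s peak.

CARTRIDGES = 700
CART_GB = 400
DRIVES = 4
DRIVE_MBPS = 80

capacity_tb = CARTRIDGES * CART_GB / 1000  # 280 TB total
peak_mbps = DRIVES * DRIVE_MBPS            # 320 MB/s if all drives stream at once

print(f"library capacity: {capacity_tb:.0f} TB, peak tape throughput: {peak_mbps} MB/s")
```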
Current architecture & implementation
• Stager/NS - CASTOR v1
  • MySQL backend
• Backend disk servers
  • CASTOR v2 (gridftp and rfio)
• Round-robin SRM headnodes (see the DNS sketch below)
• Tape servers
  • Based on CASTOR v2
  • v1 VDQM, communicating directly with the rtcp daemon (skipping migrator, MigHunter, error handlers, etc.)
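The round-robin SRM headnodes are presumably published behind a single DNS alias; a minimal sketch of listing the hosts behind such an alias (the hostname used here is a placeholder, not the production name):

```python
# Minimal sketch: enumerate the A records behind a round-robin SRM
# alias. "srm.example.sinica.tw" is a placeholder hostname.

import socket

def rr_members(alias: str):
    """Return all addresses published under a DNS round-robin alias."""
    _, _, addrs = socket.gethostbyname_ex(alias)
    return addrs

if __name__ == "__main__":
    for addr in rr_members("srm.example.sinica.tw"):
        print(addr)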
Some milestones
• Dec 2005 - split CASTOR fabrics into SC and production
• Jan-Feb
  • FS of disk servers reformatted from ext3 to xfs (performance)
  • CASTOR v2 gridftp package adopted
• Feb-Mar - kernel upgrade from 2.4 to 2.6 (tcp buffer size, bdflush issue)
• May - CASTOR-SC relocated; disk servers migrated to dual-core servers (IPMI)
• Jul - LSF license procurement
• Aug - CASTOR v2 testbed implemented
• Sep - data migrated from the Sun/STK B280 RAID system to Infotrend, because of a high IO-wait ratio and low throughput (resolved recently by rebuilding with a larger segment size, >256 kB)
• Sep - disk servers migrated from dual-core servers to the SMP Xeon blade system, reinstalling the OS with x86_64 because of a long-standing XFS kernel bug
  • stack overflow on SLC4+XFS
• Sep - Stager/NS split from the SRM headnodes; transfer quality improved
• Sep - 11 disk servers installed for CSA06 (total capacity 65 TB)
• Oct - Nagios plug-ins to help monitor disk-server status and service availability (rfio and gridftp); see the check sketch below
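A minimal sketch of what such a Nagios-style availability check could look like, assuming the conventional default ports (rfiod on 5001, gridftp on 2811); this is an illustration, not the plug-in actually deployed:

```python
#!/usr/bin/env python
# Minimal sketch of a Nagios-style TCP check for the disk-server
# services named on the slide. Port numbers are conventional defaults
# and are assumptions here.

import socket
import sys

SERVICES = {"rfio": 5001, "gridftp": 2811}

def check(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    failed = [name for name, port in SERVICES.items() if not check(host, port)]
    if failed:
        print("CRITICAL - down: " + ", ".join(failed))
        sys.exit(2)  # Nagios CRITICAL
    print("OK - rfio and gridftp reachable")
    sys.exit(0)      # Nagios OK
```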
Limitations of the current setup
• Two CASTOR group setups, both based on v1
  • The stager design is not scalable in v1
  • Dedicated disk servers serving each stager (v2)
  • Lack of scheduling functionality for tape access (v2)
  • Limited resource sharing; transfer requests go through generic scheduling (v2)
  • Monitoring of internal migration (v2)
• In the recent Biomed DC2 we noticed:
  • CASTOR performance limited by the current version (rf*, srm or gridftp)
  • The stager catalogue is limited to the order of 100k entries, but in DC2 more than 1300k files were registered
  • More than 12 hr needed to reinitialize the stager
  • High CPU load observed when client transfer requests use SRM push mode (Fig 1)
CSA06 disk server transfer
• Disk-to-disk nominal rate
  • ASGC currently reaches 100+ MB/s sustained throughput
  • Round-robin SRM headnodes associated with 4 disk servers, each providing ~20-30 MB/s (see the arithmetic sketch below)
  • Kernel/CASTOR software issues were debugged in the early stage of SC4 (throughput reduced to 25% only, without further tuning)
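The aggregate figure is consistent with the per-server rates quoted above; a minimal sketch of the arithmetic:

```python
# Minimal sketch: the aggregate disk-to-disk rate follows from the
# per-server figures (4 disk servers behind the round-robin SRM heads,
# each sustaining roughly 20-30 MB/s).

servers = 4
per_server_mbps = (20, 30)  # observed range per disk server

low, high = (servers * rate for rate in per_server_mbps)
# 80-120 MB/s expected, consistent with the 100+ MB/s observed
print(f"expected aggregate: {low}-{high} MB/s")
```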
Summary
• During CSA06, high CPU load was observed when clients used an FTS channel with SRM push mode enabled (SRM implementation in v1)
• The stager catalogue is limited in size (v1), as seen in the current Biomed DC2 (the limitation was first raised in Jul 2005)
• Strange H/W problems when connecting to the LTO3 tape drives (IBM 3592)
• Integrated the v1 stager with v2 gridftp, rfio and tape daemons
• Successful validation in CSA06 and ATLAS DDM, except for the well-known SRM push-mode problem
Future planning
• Migrate part of the components from v1 to v2
  • Pure v2 implementation - stager/name server - end of Nov
  • Oracle RAC for NS - end of Nov (see the migration sketch below)
  • Multiple instances serving different VOs - end of Nov
  • Remount disk servers into the new production v2 env. - mid Dec
  • Performance and stress testing - end of Dec
• Internal pre-production testbed
  • Service validation for new CASTOR v2 releases
  • Internal performance tuning before applying changes to the production system
• MoU commitment:
  • 400 TB of disk servers installed in Q4 of 2006 (end of Dec)
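A minimal sketch of what the planned NS move from MySQL to Oracle could look like; the table name, column list, connection strings, and the choice of MySQLdb/cx_Oracle as drivers are all assumptions for illustration, not the actual migration procedure:

```python
# Minimal sketch: batched copy of name-server rows from MySQL to
# Oracle. Table name, columns, hosts, and credentials are placeholders.

import MySQLdb
import cx_Oracle

BATCH = 10000  # commit in batches to keep the Oracle redo manageable

src = MySQLdb.connect(host="ns-mysql", db="cns_db", user="reader", passwd="...")
dst = cx_Oracle.connect("castor/secret@ns-rac")

rcur = src.cursor()
wcur = dst.cursor()

# Hypothetical column subset; a real migration would carry every column.
rcur.execute("SELECT fileid, parent_fileid, name, filesize FROM Cns_file_metadata")
while True:
    rows = rcur.fetchmany(BATCH)
    if not rows:
        break
    wcur.executemany(
        "INSERT INTO Cns_file_metadata (fileid, parent_fileid, name, filesize) "
        "VALUES (:1, :2, :3, :4)",
        rows,
    )
    dst.commit()  # one commit per batch

src.close()
dst.close()
```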
Acknowledgment
• Special thanks for assistance from FIO and GD:
  • Olof, Sebastien, Giuseppe, Hugo
  • Maarten
• People involved in DM:
  • HungChe Jen, Dave Wei
• Exp. OP:
  • ATLAS DQ2: Suijian Zhou
  • CMS PhEDEx: ChiaMing Kuo