CASTOR External Workshop - ASGC Site Status Report
Jason Shih, ASGC Grid Operation
Nov 13, Academia Sinica
Outline
• Current status:
  • Current architecture & implementation
  • H/W profile
  • Milestones
  • Limitations of the current implementation
• Future planning
  • CASTOR v2 implementation
  • Stager catalogue migration
  • DB migration for NS, from MySQL to Oracle
  • Split instances to support multiple VOs
  • Resource expansion planning
H/W Profile (I) - Disk
• Stager/NS: blade system, SMP Xeon 3.0, 4 GB ECC memory (Fig 1)
• Disk servers: blade system, SMP Xeon 3.0, 4 GB ECC, x86_64 (kernel 2.6.9-34.EL.cernsmp) (Fig 1)
  • DAS to backend storage (up to 10.5 TB for a single partition per chassis) (Fig 2)
  • RAID 6, 24 bays in the same enclosure, single controller only (see the capacity sketch below)
  • Total capacity 45 TB
• Sun/STK RAID system shared within a specific zoning for the CASTOR group (redundant controller configuration)
  • FLX-B280 - 20 TB and FLX-B380 - 30 TB (Fig 3)
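As a sanity check on the quoted per-chassis partition limit, a minimal sketch of the RAID 6 arithmetic, assuming the 24 bays hold 500 GB drives (the drive size is an assumption, not stated on the slide):

```python
# Minimal sketch: usable capacity of one RAID 6 chassis. The 500 GB
# drive size is an assumption; RAID 6 reserves two drives' worth of
# capacity for parity.

def raid6_usable_tb(bays: int, drive_tb: float) -> float:
    """Usable capacity in TB for a single RAID 6 group."""
    return (bays - 2) * drive_tb  # two parity drives in RAID 6

if __name__ == "__main__":
    per_chassis = raid6_usable_tb(bays=24, drive_tb=0.5)
    # ~11 TB raw usable, consistent with the ~10.5 TB partition cap
    print(f"usable per chassis: {per_chassis:.1f} TB")
```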
H/W Profile (II) - Tape Lib
• IBM 3584 (Fig 1)
  • Base frame L52 and D52 expansion
  • 700 LTO3 cartridges, 400 GB each
  • Total capacity 280 TB (see the arithmetic below)
• Tape drives (Fig 2)
  • 4 x IBM 3592 LTO3 tape drives
  • 80 MB/s max throughput
• Tape servers (Fig 3)
  • IBM 306m, Pentium dual core, 4 GB ECC
  • 1 for CASTOR v1
  • 2 for CASTOR v2
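The library capacity and the aggregate drive rate follow directly from the per-cartridge and per-drive figures on the slide; a minimal sketch of the arithmetic:

```python
# Back-of-the-envelope tape library figures from the slide:
# 700 LTO3 cartridges at 400 GB each, 4 drives at 80 MB/s peak.

CARTRIDGES = 700
CART_GB = 400
DRIVES = 4
DRIVE_MBPS = 80

capacity_tb = CARTRIDGES * CART_GB / 1000  # 280 TB total
peak_mbps = DRIVES * DRIVE_MBPS            # 320 MB/s if all drives stream at once

print(f"library capacity: {capacity_tb:.0f} TB, peak tape throughput: {peak_mbps} MB/s")
```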
Current architecture & implementation
• Stager/NS - CASTOR v1
  • MySQL backend
• Backend disk servers
  • CASTOR v2 (gridftp and rfio)
• Round-robin SRM headnodes (see the DNS sketch below)
• Tape servers
  • Based on CASTOR v2
  • v1 VDQM, communicating directly with the rtcp daemon (skipping migrator, MigHunter, error handlers, etc.)
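The round-robin SRM headnodes are presumably published behind a single DNS alias; a minimal sketch of listing the hosts behind such an alias (the hostname used here is a placeholder, not the production name):

```python
# Minimal sketch: enumerate the A records behind a round-robin SRM
# alias. "srm.example.sinica.tw" is a placeholder hostname.

import socket

def rr_members(alias: str):
    """Return all addresses published under a DNS round-robin alias."""
    _, _, addrs = socket.gethostbyname_ex(alias)
    return addrs

if __name__ == "__main__":
    for addr in rr_members("srm.example.sinica.tw"):
        print(addr)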
Some milestones
• Dec 2005 - split CASTOR fabrics into SC and production
• Jan-Feb
  • FS of disk servers reformatted from ext3 to xfs (performance)
  • CASTOR v2 gridftp package adopted
• Feb-Mar - kernel upgrade from 2.4 to 2.6 (tcp buffer size, bdflush issue)
• May - CASTOR-SC relocated; disk servers migrated to dual-core servers (IPMI)
• Jul - LSF license procurement
• Aug - CASTOR v2 testbed implemented
• Sep - data migrated from the Sun/STK B280 RAID system to Infotrend, because of a high IO-wait ratio and low throughput (resolved recently by rebuilding with a larger segment size, >256 kB)
• Sep - disk servers migrated from dual-core servers to the SMP Xeon blade system, reinstalling the OS with x86_64 because of a long-standing XFS kernel bug
  • stack overflow on SLC4+XFS
• Sep - Stager/NS split from the SRM headnodes; transfer quality improved
• Sep - 11 disk servers installed for CSA06 (total capacity 65 TB)
• Oct - Nagios plug-ins to help monitor disk-server status and service availability (rfio and gridftp); see the check sketch below
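A minimal sketch of what such a Nagios-style availability check could look like, assuming the conventional default ports (rfiod on 5001, gridftp on 2811); this is an illustration, not the plug-in actually deployed:

```python
#!/usr/bin/env python
# Minimal sketch of a Nagios-style TCP check for the disk-server
# services named on the slide. Port numbers are conventional defaults
# and are assumptions here.

import socket
import sys

SERVICES = {"rfio": 5001, "gridftp": 2811}

def check(host: str, port: int, timeout: float = 5.0) -> bool:
    """True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

if __name__ == "__main__":
    host = sys.argv[1] if len(sys.argv) > 1 else "localhost"
    failed = [name for name, port in SERVICES.items() if not check(host, port)]
    if failed:
        print("CRITICAL - down: " + ", ".join(failed))
        sys.exit(2)  # Nagios CRITICAL
    print("OK - rfio and gridftp reachable")
    sys.exit(0)      # Nagios OK
```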
Limitations of the current setup
• Two CASTOR group setups, both based on v1
  • The stager design is not scalable in v1
  • Dedicated disk servers serving each stager (v2)
  • Lack of scheduling functionality for tape access (v2)
  • Limited resource sharing; transfer requests go through generic scheduling (v2)
  • Monitoring of internal migration (v2)
• In the recent Biomed DC2 we noticed:
  • CASTOR performance limited by the current version (rf*, srm or gridftp)
  • The stager catalogue is limited to the order of 100k entries, but in DC2 more than 1300k files were registered
  • More than 12 hr needed to reinitialize the stager
  • High CPU load observed when client transfer requests use SRM push mode (Fig 1)
CSA06 disk server transfer
• Disk-to-disk nominal rate
  • ASGC currently reaches 100+ MB/s sustained throughput
  • Round-robin SRM headnodes associated with 4 disk servers, each providing ~20-30 MB/s (see the arithmetic sketch below)
  • Kernel/CASTOR software issues were debugged in the early stage of SC4 (throughput reduced to 25% only, without further tuning)
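The aggregate figure is consistent with the per-server rates quoted above; a minimal sketch of the arithmetic:

```python
# Minimal sketch: the aggregate disk-to-disk rate follows from the
# per-server figures (4 disk servers behind the round-robin SRM heads,
# each sustaining roughly 20-30 MB/s).

servers = 4
per_server_mbps = (20, 30)  # observed range per disk server

low, high = (servers * rate for rate in per_server_mbps)
# 80-120 MB/s expected, consistent with the 100+ MB/s observed
print(f"expected aggregate: {low}-{high} MB/s")
```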
Summary
• During CSA06, high CPU load was observed when clients used an FTS channel with SRM push mode enabled (SRM implementation in v1)
• The stager catalogue is limited in size (v1), as seen in the current Biomed DC2 (the limitation was first raised in Jul 2005)
• Strange H/W problems when connecting to the LTO3 tape drives (IBM 3592)
• Integrated the v1 stager with v2 gridftp, rfio and tape daemons
• Successful validation in CSA06 and ATLAS DDM, except for the well-known SRM push-mode problem
Future planning
• Migrate part of the components from v1 to v2
  • Pure v2 implementation - stager/name server - end of Nov
  • Oracle RAC for NS - end of Nov (see the migration sketch below)
  • Multiple instances serving different VOs - end of Nov
  • Remount disk servers into the new production v2 env. - mid Dec
  • Performance and stress testing - end of Dec
• Internal pre-production testbed
  • Service validation for new CASTOR v2 releases
  • Internal performance tuning before applying changes to the production system
• MoU commitment:
  • 400 TB of disk servers installed in Q4 of 2006 (end of Dec)
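A minimal sketch of what the planned NS move from MySQL to Oracle could look like; the table name, column list, connection strings, and the choice of MySQLdb/cx_Oracle as drivers are all assumptions for illustration, not the actual migration procedure:

```python
# Minimal sketch: batched copy of name-server rows from MySQL to
# Oracle. Table name, columns, hosts, and credentials are placeholders.

import MySQLdb
import cx_Oracle

BATCH = 10000  # commit in batches to keep the Oracle redo manageable

src = MySQLdb.connect(host="ns-mysql", db="cns_db", user="reader", passwd="...")
dst = cx_Oracle.connect("castor/secret@ns-rac")

rcur = src.cursor()
wcur = dst.cursor()

# Hypothetical column subset; a real migration would carry every column.
rcur.execute("SELECT fileid, parent_fileid, name, filesize FROM Cns_file_metadata")
while True:
    rows = rcur.fetchmany(BATCH)
    if not rows:
        break
    wcur.executemany(
        "INSERT INTO Cns_file_metadata (fileid, parent_fileid, name, filesize) "
        "VALUES (:1, :2, :3, :4)",
        rows,
    )
    dst.commit()  # one commit per batch

src.close()
dst.close()
```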
Acknowledgment
• Special thanks for assistance from FIO and GD:
  • Olof, Sebastien, Giuseppe, Hugo
  • Maarten
• People involved in DM:
  • HungChe Jen, Dave Wei
• Exp. OP:
  • ATLAS DQ2: Suijian Zhou
  • CMS PhEDEx: ChiaMing Kuo