
Site Report



Presentation Transcript


  1. Site Report US CMS T2 Workshop Samir Cury on behalf of T2_BR_UERJ Team

  2. Servers' Hardware profile • SuperMicro machines • 2 x Intel Xeon dual-core @ 2.0 GHz • 4 GB RAM • RAID 1 – 120 GB HDs

  3. Nodes Hardware profile (40) • Dell PowerEdge 2950 • 2 x Intel Xeon quad-core @ 2.33 GHz • 16 GB RAM • RAID 0 – 6 x 1 TB hard drives • CE Resources • 8 batch slots • 66.5 HS06 per node • 2 GB RAM / slot • SE Resources • 5.8 TB usable for dCache or Hadoop • Private network only

  4. Nodes Hardware profile (2+5) • Dell R710 • 2 are Xen servers – not worker nodes • 2 x Intel Xeon quad-core @ 2.4 GHz • 16 GB RAM • RAID 0 – 6 x 2 TB hard drives • CE • 8 batch slots (or more?) • 124.41 HS06 per node • 2 GB RAM / slot • SE • 11.8 TB for dCache or Hadoop • Private network only

  5. First-phase nodes profile (82) • SuperMicro server • 2 x Intel Xeon single-core @ 2.66 GHz • 2 GB RAM • 500 GB hard drive & 40 GB hard drive • CE Resources • Not used – old CPUs & low RAM per node • SE Resources • 500 GB per node

  6. Plans for the future – Hardware • Buying 5 more Dell R710 • Deploying 5 R710 when the disks arrive • 80 more cores • 120 TB more storage • 1,244 HS06 more in total • CE – 40 PE 2950 + 10 R710 = 400 cores || 3.9 kHS06 • SE – 240 + 120 + 45 = 405 TB
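
As a back-of-the-envelope check (not from the slides), these totals follow from the per-node figures on the earlier slides, reading the per-node benchmark numbers as HS06 so that the quoted 3.9 kHS06 total comes out; the 45 TB first-phase contribution is taken as quoted:

```latex
\begin{align*}
\text{Cores: }   & 40 \times 8 + 10 \times 8 = 400 \\
\text{HS06: }    & 40 \times 66.5 + 10 \times 124.41 \approx 3904 \approx 3.9~\text{kHS06} \\
\text{Storage: } & 40 \times 6~\text{TB} + 10 \times 12~\text{TB} + 45~\text{TB} = 405~\text{TB}
\end{align*}
```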

  7. Software profile – CE • OS – CentOS 5.3, 64-bit • 2 OSG gatekeepers • Both running OSG 1.2.x • Maintenance tasks eased by redundancy – fewer downtimes • GUMS 1.2.15 • Condor 7.0.3 for job scheduling

  8. Software profile – SE • OS – CentOS 4.7, 32-bit • dCache 1.8 • 4 GridFTP servers • PNFS 1.8 • PhEDEx 3.2.0

  9. Plans for the future: Software/Network • SE migration • Right now we use dCache/PNFS • We plan to migrate to BeStMan/Hadoop • Early efforts have already produced results • Add the new nodes to the Hadoop SE • Migrate the data (one possible copy loop is sketched below) • Test with a real production environment • Jobs and users accessing it • Network improvement • RNP (our network provider) plans to deliver a 10 Gbps link to us before the next Supercomputing conference.
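
The slides do not describe how the data will be moved from dCache/PNFS into Hadoop. Purely as a sketch of one possible approach, assuming the PNFS namespace is mounted under a /pnfs path and that files are staged through local scratch with the dccp client before being written into HDFS with the standard hadoop CLI (all paths and names below are hypothetical):

```python
#!/usr/bin/env python3
# Rough sketch of a dCache -> HDFS copy loop, not the site's actual migration
# procedure (which the slides do not describe). Assumes the PNFS namespace is
# mounted under /pnfs and that the dccp and hadoop CLIs are in PATH; all paths
# below are hypothetical.
import os
import subprocess
import sys

PNFS_ROOT = "/pnfs/hepgrid.uerj.br/data/cms"  # hypothetical dCache namespace root
HDFS_ROOT = "/store"                          # hypothetical HDFS destination root
SCRATCH = "/tmp/se-migration"                 # local staging area

def migrate_file(pnfs_path):
    rel = os.path.relpath(pnfs_path, PNFS_ROOT)
    local_copy = os.path.join(SCRATCH, os.path.basename(pnfs_path))
    hdfs_path = os.path.join(HDFS_ROOT, rel)
    # Stage the file out of dCache onto local disk...
    subprocess.check_call(["dccp", pnfs_path, local_copy])
    # ...then push it into HDFS and clean up the staged copy.
    subprocess.check_call(["hadoop", "fs", "-mkdir", "-p", os.path.dirname(hdfs_path)])
    subprocess.check_call(["hadoop", "fs", "-put", local_copy, hdfs_path])
    os.remove(local_copy)

def main(file_list):
    if not os.path.isdir(SCRATCH):
        os.makedirs(SCRATCH)
    with open(file_list) as fh:
        for line in fh:
            path = line.strip()
            if path:
                migrate_file(path)

if __name__ == "__main__":
    main(sys.argv[1])  # argument: text file with one PNFS path per line
```

In practice one would also verify file sizes or checksums after each copy before deleting anything from the dCache side.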

  10. T2 analysis model & associated physics groups • We have reserved 30 TB for each of the groups: • Forward Physics • B-Physics • Studying the possibility of reserving space for Exotica • The group has had several MSc & PhD students working on CMS analysis for a long time – these users are well supported • Some Grid users submit jobs, sometimes run into trouble and give up – they don't ask for support

  11. Developments • A Condor mechanism based on job suspension that gives priority to a small pool of important users: • 1 pair of batch slots per core • When a priority user's job arrives, it suspends the normal job running on the paired batch slot • Once it finishes and vacates the slot, the paired job automatically resumes • Documentation can be made available to anyone interested • Developed by Diego Gomes
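
The suspension policy itself lives in the site's Condor configuration (a SUSPEND/CONTINUE policy by Diego Gomes) and is not shown in the slides. Purely to illustrate the pairing idea, here is a hypothetical external poller written against the modern htcondor Python bindings, which did not yet exist for Condor 7.0.3; the slot-naming scheme and the priority-user list are assumptions, not the site's actual setup:

```python
#!/usr/bin/env python3
# Illustration only: an external poller mimicking the slot-pairing idea described
# above. The real mechanism is a SUSPEND/CONTINUE policy in the site's Condor
# configuration; the slot naming and priority-user pool below are made up.
import htcondor  # modern HTCondor Python bindings (an anachronism for Condor 7.0.3)

PRIORITY_USERS = {"alice", "bob"}  # hypothetical pool of important users
PAIR_OFFSET = 8                    # assume slotN is paired with slot(N + 8)

def partner_slot(remote_host):
    """Map e.g. 'slot3@node01' to its assumed partner 'slot11@node01'."""
    if "@" not in remote_host:
        return ""
    slot, host = remote_host.split("@", 1)
    n = int(slot.replace("slot", ""))
    partner = n + PAIR_OFFSET if n <= PAIR_OFFSET else n - PAIR_OFFSET
    return "slot%d@%s" % (partner, host)

def poll_once(schedd):
    # JobStatus: 2 = running, 7 = suspended.
    jobs = schedd.query("JobStatus == 2 || JobStatus == 7",
                        ["ClusterId", "ProcId", "Owner", "JobStatus", "RemoteHost"])
    by_slot = {str(j.get("RemoteHost", "")): j for j in jobs}
    for job in jobs:
        if str(job["Owner"]) in PRIORITY_USERS:
            continue  # never touch the priority users' own jobs
        partner = by_slot.get(partner_slot(str(job.get("RemoteHost", ""))))
        partner_is_priority = (partner is not None
                               and str(partner["Owner"]) in PRIORITY_USERS
                               and int(partner["JobStatus"]) == 2)
        job_id = "%d.%d" % (job["ClusterId"], job["ProcId"])
        if int(job["JobStatus"]) == 2 and partner_is_priority:
            schedd.act(htcondor.JobAction.Suspend, [job_id])   # pause the normal job
        elif int(job["JobStatus"]) == 7 and not partner_is_priority:
            schedd.act(htcondor.JobAction.Continue, [job_id])  # priority job gone: resume

if __name__ == "__main__":
    poll_once(htcondor.Schedd())
```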

  12. Developments • Condor4Web • Web interface to visualize the Condor queue • Shows grid DNs • Useful for Grid users who want to know how their job is being scheduled inside the site – http://monitor.hepgrid.uerj.br/condor • Available at http://condor4web.sourceforge.net • Still has much room to evolve, but it already works • Developed by Samir
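
Condor4Web itself is the code published at the SourceForge link above. As a rough sketch of the same idea (dumping the Condor queue, including grid DNs, to a static HTML table), assuming the modern htcondor Python bindings and the standard x509userproxysubject job attribute:

```python
#!/usr/bin/env python3
# Minimal sketch in the spirit of Condor4Web: dump the Condor queue, including
# grid DNs, as a static HTML table. Not the actual Condor4Web code -- see
# http://condor4web.sourceforge.net for that.
import html

import htcondor  # modern HTCondor Python bindings (an anachronism for Condor 7.0.3)

STATUS = {1: "Idle", 2: "Running", 5: "Held", 7: "Suspended"}

def queue_as_html():
    schedd = htcondor.Schedd()
    # x509userproxysubject holds the grid DN for jobs submitted with an X.509 proxy.
    jobs = schedd.query("true", ["ClusterId", "ProcId", "Owner",
                                 "JobStatus", "x509userproxysubject"])
    rows = []
    for j in jobs:
        dn = str(j.get("x509userproxysubject", "(local job)"))
        rows.append("<tr><td>%s.%s</td><td>%s</td><td>%s</td><td>%s</td></tr>" % (
            j["ClusterId"], j["ProcId"],
            html.escape(str(j["Owner"])),
            STATUS.get(int(j.get("JobStatus", 0)), "?"),
            html.escape(dn)))
    return ("<table border='1'><tr><th>Job</th><th>Owner</th>"
            "<th>Status</th><th>Grid DN</th></tr>" + "".join(rows) + "</table>")

if __name__ == "__main__":
    # A cron job could regenerate this file for a web server to publish,
    # e.g. behind a URL like http://monitor.hepgrid.uerj.br/condor
    with open("condor_queue.html", "w") as out:
        out.write("<html><body><h1>Condor queue</h1>%s</body></html>" % queue_as_html())
```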

  13. CMS Center @ UERJ • During LISHEP 2009 (January) we inaugurated a small control room for CMS at UERJ.

  14. Shifts @ CMS Center • Our computing team has participated in tutorials, and we now have four potential CSP shifters

  15. CMS Centre (quick) profile • Hardware • 4 Dell workstations with 22" monitors • 2 x 47" TVs • Polycom SoundStation • Software • All conferences, including those with the other CMS Centres, are done via EVO

  16. Cluster & Team • Alberto Santoro (General supervisor) • Andre Sznajder (Project coordinator) • Eduardo Revoredo (Hardware coordinator) • Jose Afonso (Software coordinator) • Samir Cury (Site admin) • Fabiana Fortes (Site admin) • Douglas Milanez (Trainee) • Raul Matos (Trainee)

  17. 2009/2010 goals • In 2009 we worked mostly on • Getting rid of infrastructure problems • Insufficient electrical power • Air conditioning – many downtimes due to this • These are solved now • Besides those problems, we were • Running official production on small workflows • Doing private production & analysis for local and Grid users • 2010 goal • Use the new hardware and infrastructure for a more reliable site • Run heavier workflows and increase participation and presence in official production.

  18. Thanks! • I want to formally thank Fermilab, USCMS and OSG for their financial help in bringing a UERJ representative here. • I also want to thank USCMS for this very useful meeting.

  19. Questions? Comments?
