CVS Service at CERN status and LCG-dedicated service. Manuel.Guijarro@cern.ch Sebastian.Lopienski@cern.ch CERN IT/PS/UI October 2003. Outline. CERN Central CVS Service Overview What does it offer? Projects Architecture Status Interactions with users Failures and fail-over Tools.
CVS Service at CERN: status and LCG-dedicated service Manuel.Guijarro@cern.ch Sebastian.Lopienski@cern.ch CERN IT/PS/UI October 2003
Outline CERN Central CVS Service • Overview • What does it offer? • Projects • Architecture • Status • Interactions with users • Failures and fail-over • Tools CVS Service for LCG • The challenge • Architecture • Failure recovery • Advantages, disadvantages • Plans for the future
CERN Central CVS Service Overview • central service hosting CERN-related software projects • created following a service request • collection of user requirements • architecture proposals • implementation based on assigned resources • in production since the end of August 2002 • currently hosting around 45 projects, 3 GB of data (mostly source code)
CERN Central CVS Service What does it offer? • secure and robust CVS service with up-to-date server software • data integrity (mirror every hour, daily archiving) • several access methods: Kerberos IV, SSH, pserver • service support through Remedy • automatic CVS lock monitoring and reporting • good performance • Web interfaces: CVSWeb, ViewCVS BUT: • the service is not a project management tool
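As an illustration of the three access methods listed above, the following minimal sketch builds the CVSROOT string a client would use for each one. The method prefixes (`:kserver:` for Kerberos IV, `:ext:` with `CVS_RSH=ssh` for SSH, `:pserver:` for password access) are standard CVS conventions; the host name, user name and repository path in the usage note are placeholders, not the actual values used by the CERN service.

```python
# Standard CVS access-method prefixes for the three methods named on the slide.
METHOD_PREFIX = {
    "kerberos4": "kserver",  # Kerberos IV
    "ssh": "ext",            # external transport; needs CVS_RSH=ssh
    "pserver": "pserver",    # password server
}


def cvsroot(method, user, host, repo_path):
    """Return a CVSROOT string of the form :method:user@host:/path."""
    prefix = METHOD_PREFIX[method]
    return f":{prefix}:{user}@{host}:{repo_path}"


def client_env(method):
    """Return extra environment variables the CVS client needs."""
    # Only the :ext: method needs to be told which remote shell to use.
    return {"CVS_RSH": "ssh"} if method == "ssh" else {}
```

For example, `cvsroot("ssh", "jdoe", "cvs.example.org", "/cvs/ProjectX")` (all hypothetical values) gives `:ext:jdoe@cvs.example.org:/cvs/ProjectX`, to be used together with `CVS_RSH=ssh`.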
CERN Central CVS Service Architecture • automatic and transparent load-balancing and fault tolerance (via an ISS DNS alias that distributes CVS requests among a farm of four servers) • dependency on AFS and DNS • monitoring availability every 10 min. • service availability higher than 99.98% so far
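The periodic availability monitoring mentioned above could be sketched roughly as follows. This is not the service's actual monitoring script: it assumes that a node counts as available when a TCP connection to the default CVS pserver port (2401) succeeds, and that the service as a whole is available while at least one node behind the alias answers. The node names in the usage note are hypothetical.

```python
import socket

PSERVER_PORT = 2401  # default CVS pserver port; assumed to be the probed port


def tcp_probe(host, port=PSERVER_PORT, timeout=5.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_nodes(nodes, probe=tcp_probe):
    """Probe every node and return a {node: is_up} mapping."""
    return {node: probe(node) for node in nodes}


def service_available(status):
    """The load-balanced service is up if at least one node is up."""
    return any(status.values())
```

A scheduler such as cron could call `check_nodes(["node1", "node2", "node3", "node4"])` every 10 minutes and record the result of `service_available`. The `probe` argument can be replaced by another check, which also makes the logic testable without network access.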
CERN Central CVS Service Failures and fail-over • One of the four nodes is frequently down, due to: • software upgrades • 3ware disk controller failures • hardware tests performed to investigate hangs (lxcvs02 has hung more than 20 times this year) • 4 disks failed (no problem since they are mirrored) • Automatic fail-over made the disruptions mentioned above transparent to CVS users • Total down-time of the service this year < 12 hours (est.): • Computer Centre power cuts • several short network interruptions (mainly outside working hours) • some other partial interruptions: • AFS problems - affecting only some users • xinetd configuration (wrong pserver port number, limit on kserver processes)
CERN Central CVS Service Status • Software upgrades: • operating system → Red Hat 7.3 • CVS → 1.12.1 (newest feature version) • applying patches when necessary • Two additional nodes (SEIL 2x2.4GHz) have been added to the cluster – currently four servers • Fully automated installation and configuration of the machines using WP4 tools (as in lxbatch): • CDB templates • kickstart files • ... but still some SUE features to be translated into WP4 components
CERN Central CVS Service Interactions with users • Almost 200 user requests or questions received and answered so far • around 90 on the Remedy system • others via e-mail • Full documentation on the Web (http://cern.ch/cvs) • user documentation • manuals • howto's • list of CVS books • technical documentation for administrators • Web tools for users: • configuring access type for web interfaces: Public, Restricted or None - modified CVSWeb and ViewCVS • encrypting passwords for pserver access
CERN Central CVS Service Tools A series of tools was developed for service maintenance: • detecting project AFS volumes that are nearly full and increasing their size • CVS lock detection and reporting to librarians • ISS statistics • cluster node information (availability; load; up-time; ISS information; enabling and disabling a node) • cvsserver SUE feature (for automatic machine configuration) • scripts for creating and deleting projects • setup, availability and backup checking scripts • etc.
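To give an idea of what CVS lock detection involves, here is a minimal sketch and not the tool actually used by the service. It assumes that locks live inside the repository tree under the standard CVS names (`#cvs.lock`, `#cvs.rfl.*`, `#cvs.wfl.*`) and treats a lock as stale once it is older than a threshold; the one-hour default and the function name are illustrative choices.

```python
import os
import time

# Standard CVS lock names: master lock, read locks, write locks.
LOCK_PREFIXES = ("#cvs.lock", "#cvs.rfl", "#cvs.wfl")


def find_stale_locks(repo_root, max_age_seconds=3600, now=None):
    """Return sorted paths of CVS locks older than max_age_seconds."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, dirnames, filenames in os.walk(repo_root):
        # #cvs.lock is a directory, read/write locks are files: check both.
        for name in dirnames + filenames:
            if name.startswith(LOCK_PREFIXES):
                path = os.path.join(dirpath, name)
                age = now - os.lstat(path).st_mtime
                if age > max_age_seconds:
                    stale.append(path)
    return sorted(stale)
```

The returned paths could then be grouped by project and mailed to the corresponding librarians. If a repository places its locks elsewhere (the `LockDir` setting in `CVSROOT/config`), that directory would have to be scanned instead.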
CVS Service for LCG The LCG challenge • The LCG (LHC Computing Grid – http://cern.ch/LCG) group has requested a non-AFS-based CVS Service • Several proposals were prepared to meet this demand • One solution was chosen and implemented, and is to be evaluated by LCG • The old CVS Service will remain in production, available for non-LCG-related projects
CVS Service for LCG Architecture An "N+1" cluster: • N active nodes with repositories on local file systems • additional passive node ("slave server") – backup for all active servers • data replication – repositories copied to the slave server • for each repository there is a DNS alias pointing to the node hosting this repository • when a node is down, aliases are redirected to the slave server • currently 3+1 machines
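CVS Service for LCG Architecture [Diagram: active servers 1, 2, N holding repositories; a slave server holding copies of the repositories; the alias X.cvs.cern.ch for Project X points to the server hosting that repository]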
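The alias redirection behind the "N+1" scheme can be modelled in a few lines. The sketch below is a toy in-memory model under stated assumptions, not the service's implementation: real fail-over means updating DNS records, while here the aliases are just a dictionary. The class name, method names and node names are hypothetical; only the alias pattern X.cvs.cern.ch comes from the slides.

```python
class NPlusOneCluster:
    """Toy model: N active nodes plus one slave holding all replicas."""

    def __init__(self, active_nodes, slave):
        self.active_nodes = set(active_nodes)
        self.slave = slave
        self.home = {}   # repository -> active node that normally hosts it
        self.alias = {}  # DNS alias -> node the alias currently points to

    @staticmethod
    def alias_name(repo):
        # One alias per repository, following the X.cvs.cern.ch pattern.
        return f"{repo}.cvs.cern.ch"

    def add_repository(self, repo, node):
        if node not in self.active_nodes:
            raise ValueError(f"{node} is not an active node")
        self.home[repo] = node
        self.alias[self.alias_name(repo)] = node

    def fail_over(self, node):
        """Point every alias served by the failed node at the slave."""
        moved = sorted(a for a, n in self.alias.items() if n == node)
        for a in moved:
            self.alias[a] = self.slave
        return moved

    def restore(self, node):
        """Point the aliases of a repaired node back at their home."""
        restored = []
        for repo, home in sorted(self.home.items()):
            if home == node:
                self.alias[self.alias_name(repo)] = node
                restored.append(self.alias_name(repo))
        return restored
```

In this model a client always connects to the alias of its project, so a fail-over changes nothing on the client side. It also makes the limits of the design visible: load is distributed per repository rather than per request, and a failure of the slave leaves no backup.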
CVS Service for LCG Advantages and disadvantages Advantages: • fastest possible access to the repositories (local file system) • independent of AFS • regular PCs as servers (no special hardware bought) Disadvantages: • constant mirroring may affect performance • load balancing is at the repository level (not the request level) • slave server down => no fault tolerance • fail-over requires human intervention (for the time being) Plans for the future: Automatic fail-over (system decides when to switch to the slave server)
CVS Service for LCG Other information • data replication is done by CVSup • access to repositories and user home directories on a file-system level via NFS • Web interface: CVSWeb (sample URL: http://X.cvs.cern.ch:8180/cgi-bin/X.cgi) • instant DNS update • CVS Service for LCG web page: http://cern.ch/lcgcvs
Thank you! Any questions? More information at: http://cern.ch/cvs http://cern.ch/lcgcvs