
Venturing into VCS




  1. Venturing into VCS Luis Londoño Big Planet luisl@istaken.com

  2. Outline • Introduction • Our current configuration and why • Installation/Upgrade: good and bad • Group Dependencies • Developing custom agents • A few lessons • What’s next? • Summary

  3. Big Planet - Introduction • Based in Provo, Utah • 2-year-old company with 18 months of service to the public • One data center • We are a network marketing company with over 60,000 sales representatives • Recently acquired by Nu Skin Enterprises

  4. Big Planet - Introduction • National Internet Service Provider • Over 2000 dial-up POPs • Standard ISP services • Flagship device: I-Phone - internet phone

  5. Data Center Environment • Several Sun E450, E250 and E4000 servers • A couple of HP K-400 series servers • EMC Symmetrix 3700 Storage • 1.5 Terabytes • SCSI connected today • Databases: • Oracle: ~180GB • DB2: ~10GB • NFS servers • Sun • Veritas QuickLog • Drank a lot of the Veritas Kool-aid!

  6. Our Goal • A lot of dependencies on backend systems • We require 99.9% uptime for these systems, but really want 99.95% • Tricky with: • SW maintenance • HW maintenance • Understaffed, so ease of management is critical • 5 minutes per week and the easy life!
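The difference between the two uptime targets is easier to judge as a downtime budget. The following is a small sketch of that arithmetic; the function name is illustrative and not from the slides.

```sh
#!/bin/sh
# Allowed downtime, in minutes per week, for a given availability percentage.
# A week has 7 * 24 * 60 = 10080 minutes.
downtime_minutes_per_week() {
    awk -v pct="$1" 'BEGIN { printf "%.2f\n", (100 - pct) / 100 * 7 * 24 * 60 }'
}

downtime_minutes_per_week 99.9    # the required target
downtime_minutes_per_week 99.95   # the desired target
```

At 99.95% the budget works out to roughly 5 minutes a week, which lines up with the "5 minutes per week" on the slide; 99.9% allows about twice that.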

  7. Choosing VCS • Upgraded from FirstWatch • We had two configurations • Oracle in a 2-to-1 setup • NFS asymmetric two-server configuration • FirstWatch was definitely not the easy life, particularly with the 2-to-1 configuration • Had already learned a lot though, and the systems were pretty well set up • One high-availability package for both HP-UX and Solaris • Much easier to deploy and configure • More Veritas Kool-aid!

  8. Cluster Configuration Overview [diagram; labels: db2/http, NFS (Production), Oracle, standby (Testing)]

  9. Resource Groups

  10. Resource Groups (cont’d)

  11. Resource Groups (cont’d)

  12. Resource Groups (cont’d)

  13. Resource Groups (cont’d)

  14. Why this way? • One big cluster vs. three little clusters • Ease of management and monitoring • Would like to start interrelating all servers when SAN arrives • Looks cool in the gui and my director was impressed! • Independent large groups vs. dependent smaller groups • Experimented and we were not convinced • Group relationships not well defined - more later • Instead we thought carefully about criticality of resources in a group.

  15. Installation/Upgrade: good • Good: • ALMOST NO DOWNTIME!!!! • Really easy to configure the communications layers • Decide on names/ids up front • Upgrade from 1.0.2 to 1.1 went very smoothly (thanks Veritas!!) • Resource attribute localization for NIC and IP resources --- NOT IN THE MANUAL!* • hares -local res attr • * I think :-)

  16. Installation/Upgrade: bad • Bad: • ALMOST NO DOWNTIME!!!! • VCS and typical VxVM imports do not like each other • To check, try: vxprint -ag <dgname> <dgname> • Output should be: dg <dg> … noautoimport=on … • Unfortunately, if it is not, you have to deport and let VCS reimport, or vxdg -t import … • SOAP BOX • Give thought to seeding • Consultant originally set up GAB as gabconfig -c -x • Suffered split brain personality as a result of no seeding and a bug in gab v1.0.2 • We are running gabconfig -c -n5 today for 7 nodes • I don’t know that there is a right answer, but there is a wrong one!
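The noautoimport check above can be wrapped in a small helper so it is easy to run against every disk group. This is a minimal sketch: the helper name and the sample disk group name are hypothetical, and it only looks for the noautoimport=on flag that the slide says should appear in the vxprint output.

```sh
#!/bin/sh
# Reads vxprint output on stdin and succeeds only if it contains
# noautoimport=on, the flag the slide says the disk group record should show.
has_noautoimport() {
    grep -q 'noautoimport=on'
}

# Intended use on a cluster node ("oradg" is a placeholder disk group name):
#   vxprint -ag oradg oradg | has_noautoimport || echo "oradg: reimport with vxdg -t import"
```

The helper reads from a pipe rather than calling vxprint itself, so it can be tried against saved output before it is pointed at a live system.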

  17. Group Dependencies [diagram; labels: Ora db1, Ora db1 Disk and IPs] • Dependencies not well defined • We would have liked to set up smaller groups and link them “online local” • There is no concept of “critical” groups; they are all “non-critical” • Here is what we found...

  18. Group Dep: Online Local • Forced failure of resource in topgrp, and nothing happened • Forced failure of resource in bottomgrp, and VCS migrated both to another system • Tried to switch to another machine, but could not do it

  19. Group Dep: Offline Local and Online Remote • Forced failure of resource in topgrp, and nothing happened • With topgrp on host A, forced failure of bottomgrp on host B, and VCS swapped them (top->B,bot->A) • With topgrp offline, switched bottomgrp and topgrp did not come online

  20. Customizing VCS • Local attribute on resources • Writing script custom agents • BPVxQuickLog Agent • QuickLog functionality does not come with VCS and Professional Services did not appear to have an agent. • LAST SOAP BOX • If you are not careful it will kill your data! • We wrote our own • Generic Application Agent • Kind of like a Swiss army agent - very handy tool to have around • Used it to monitor and control DB2

  21. Writing Script Agents 1. Decide on attributes - Think generic! • What information do you need to start/stop/monitor? • Create the type entry. The easiest way is to hand-edit the config files, force-stop ha on all nodes and force-start it again. • ArgList lists the parameters that get passed to your scripts after the resource name, so the first entry in ArgList is the second argument to your scripts. • NameRule specifies a default name for your resource. Look for examples in types.cf - lots of options.

  22. Writing Script Agents (cont’d) 2. Create the agent directory • If type in types.cf or equiv is Bob, then create a directory $VCS_HOME/bin/Bob • Copy $VCS_HOME/bin/ScriptAgent to $VCS_HOME/bin/Bob/BobAgent 3. Write monitor script first • Beware that online may not be called • Common Exit Codes: • 100 -- Resource is offline • 110 -- Resource is online 4. Write online and offline scripts • Exit Code 0 means success • Use logger or $VCS_HOME/bin/halog -add
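The argument order and exit codes described in steps 1 to 4 can be sketched as a monitor script. This is only an illustration under stated assumptions: it assumes a hypothetical type whose ArgList starts with a PidFile attribute, and it treats "the process in the pid file is alive" as the definition of online.

```sh
#!/bin/sh
# Monitor sketch for a hypothetical type whose ArgList begins with PidFile
# (the attribute name and the pid-file check are assumptions).
monitor_resource() {
    res_name=$1     # first argument: the resource name
    pid_file=$2     # second argument: the first ArgList entry
    [ -r "$pid_file" ] || return 100            # no pid file: offline
    pid=$(cat "$pid_file")
    [ -n "$pid" ] || return 100                 # empty pid file: offline
    if kill -0 "$pid" 2>/dev/null; then
        return 110                              # process exists: online
    fi
    return 100                                  # stale pid: offline
}

# As the last lines of $VCS_HOME/bin/Bob/monitor this would be:
#   monitor_resource "$@"
#   exit $?
```

Note that kill -0 only reports on processes the caller is allowed to signal, so the check is reliable when the script runs as root.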

  23. BPVxQuickLog • Script Agent with online, offline, monitor • Attributes: • DiskGroup • QLogDev • AccelVolume • MountPoint • online steps 1. /sbin/vxld_logck 2. /sbin/vxld_mntlog WARNING: Make sure you remove the /etc/rcS.d/S88vxld-startup script, or it will cause the system to go into single-user mode during boot, since VCS will not have imported the diskgroup yet.

  24. BPVxQuickLog (cont’d) • offline steps 1. /sbin/vxld_umntfs if necessary 2. /sbin/vxld_umntlog • monitor steps Use vxld_print to: 1. Check the QuickLog Device - make sure status is RUNNING 2. Check the QuickLog Volume - make sure status is OPENED
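The monitor steps amount to "report online only if every check passes". A minimal sketch of that structure follows; the check functions it would be given (for example one that inspects the QuickLog device and one that inspects the QuickLog volume) are hypothetical and are not shown, since the slides do not include the vxld_print output they would parse.

```sh
#!/bin/sh
# Runs each named check in order. Returns 110 (online) only if all of them
# succeed, and 100 (offline) as soon as one fails.
monitor_all() {
    for check in "$@"; do
        "$check" || return 100
    done
    return 110
}

# Hypothetical use inside a monitor script:
#   monitor_all check_qlog_device check_qlog_volume
#   exit $?
```

Keeping each check in its own function makes it easy to log which one failed before the agent reports the resource offline.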

  25. Generic Application Agent • Very generic, can do a bunch of things • Attributes: • PidFileDir • PidFile • FileExistsDir • FileExists • MonitorProc1 • MonitorProc2 • MonitorProc3 • MonitorProc4 • StartUser • StartDir • StartScript • StartParams • StopUser • StopDir • StopScript • StopParams
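One plausible way the Start* attributes fit together is sketched below. The semantics are an assumption, not something the slides confirm: the sketch assumes StartScript is run with StartParams as StartUser, from StartDir, and the function name and sample values are placeholders.

```sh
#!/bin/sh
# Builds the start command from the four Start* attributes.
# Assumed semantics: run StartScript with StartParams as StartUser from StartDir.
build_start_cmd() {
    start_user=$1
    start_dir=$2
    start_script=$3
    start_params=$4
    printf "su - %s -c 'cd %s && %s %s'\n" \
        "$start_user" "$start_dir" "$start_script" "$start_params"
}

# Dry run with placeholder values; an online script would execute the result.
build_start_cmd db2inst1 /home/db2inst1 ./start_db2.sh ""
```

Printing the command instead of running it keeps the sketch safe to try; values containing quotes or spaces would need extra escaping before this could be used for real.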

  26. A few lessons • Resource names matter . . . ok not to VCS • hastatus -summary is not that honest • engine logs are not all the same • To NetworkHosts, or not to NetworkHosts • Oracle SQLNet listener monitor is picky about caps • Overstated, but remember to check major and minor numbers for NFS failover

  27. What’s next? • We will be developing additional agents for: • SUN SIMS • LDAP servers • Create an HP-UX 2 node cluster • Waiting to complete the upgrade to 11.0 • Integration with our very young SAN • SNMP integration with HP-OpenView and Netcool • Ops training

  28. Summary • VCS has made our lives easier • Very quick to install and almost no downtime • Very customizable • Very stable

  29. “If all else fails, immortality can always be assured by spectacular error.” - John Kenneth Galbraith
