IT Briefing Agenda 7/17/05

Agenda • Microsoft Agreement (John Ellis) • SPSS Site License (Marcy Alexander) • IMAP Polling (Ken Guyton) • NetReg/CAT Update (Alan White) • Research Cluster (Keven Haynes) • Premiere Support (Karen Jenkins) • NetCom Q&A (Paul Petersen)

General Updates

Eagle Mail Performance An opportunity to make it faster Ken Guyton

Architecture • Eagle Mail consists of three services: • Relay • Delivery • Reading

Architecture • Relay • Moving email from computer to computer (SMTP)

Architecture • Delivery • Delivering messages into an INBOX.

Architecture • Reading • Users retrieving their messages to read them, mark as read, delete, etc. (IMAP)

Architecture [Diagram: Relay servers with virus and spam scanning and LDAP routing; other servers; Read/Delivery servers with disk; IMAP proxy; Webmail; firewall; clients]

The Situation • Reading and Delivery live on the same servers.

The Situation [Chart: CPU utilization of Read/Delivery servers 1 to 5, values in slide order: 90%, 80%, 80%, 50%, 50%]

The Situation [Chart: users per server against CPU utilization for Read/Delivery servers 1 to 5. Users, in slide order: 12300, 12809, 8313, 4168, 75. CPU utilization, in slide order: 90%, 80%, 80%, 50%, 50%]

The Question • What are 75 users doing to use 50% of a Read/Delivery server?
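To put the question in numbers, here is a rough sketch that converts the chart figures into CPU cost per 1,000 users. Pairing each CPU figure with a user count by left-to-right order on the slide is an assumption; only the 75-user, 50% pairing is confirmed by the question itself.

```python
# Figures read off the "Users per server" slide.
# ASSUMPTION: CPU figures and user counts are paired in slide order.
# Only the last pairing (75 users, 50% CPU) is stated outright in the talk.
servers = {
    1: (12300, 90),
    2: (12809, 80),
    3: (8313, 80),
    4: (4168, 50),
    5: (75, 50),
}


def cpu_per_thousand_users(users, cpu_percent):
    """CPU percentage consumed per 1,000 users on one server."""
    return cpu_percent / users * 1000


for number, (users, cpu) in servers.items():
    cost = cpu_per_thousand_users(users, cpu)
    print(f"server {number}: {cost:8.2f}% CPU per 1000 users")
```

Under that pairing, the 75 users on server 5 cost far more CPU per head than the users on any other server, which is what makes the question worth asking.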

Observations • Make some measurements of busy IMAP processes • Tracing with truss • Profile processes • Packet snooping

Observations • ...to answer questions: • What are these processes doing? • What system calls are using the most CPU time? • What IMAP commands are being sent? • ...and how often?

Results • Processes are doing a lot of disk I/O. • The system calls that account for the vast majority of CPU time are read() and alarm(). • The IMAP command being sent is SELECT.

More Observations • Instrument the imapd daemon (we have the source code!) • Log SELECTs per user and per mailbox • Plot behavior

Results [Chart: logged SELECT behavior grouped by interval: ≤ 1 min, ≤ 5 min, > 5 min]
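The logging and grouping described above could be sketched as follows. The log format, the sample data, and the function names are all hypothetical; the talk does not show what the instrumented imapd actually wrote.

```python
from collections import Counter, defaultdict

# HYPOTHETICAL log format (not the real imapd output):
# "<epoch seconds> <user> <mailbox>", one line per SELECT.
SAMPLE_LOG = """\
1000 alice INBOX
1060 alice INBOX
1120 alice INBOX
1000 bob INBOX
1300 bob INBOX
1000 carol INBOX
2000 carol INBOX
"""


def bucket(seconds):
    """Map a gap between two SELECTs onto the three groups from the slide."""
    if seconds <= 60:
        return "<= 1 min"
    if seconds <= 300:
        return "<= 5 min"
    return "> 5 min"


def select_interval_histogram(log_text):
    """Count gaps between consecutive SELECTs per (user, mailbox) pair."""
    times = defaultdict(list)
    for line in log_text.splitlines():
        if not line.strip():
            continue
        stamp, user, mailbox = line.split(maxsplit=2)
        times[(user, mailbox)].append(int(stamp))

    counts = Counter()
    for stamps in times.values():
        stamps.sort()
        for earlier, later in zip(stamps, stamps[1:]):
            counts[bucket(later - earlier)] += 1
    return counts


histogram = select_interval_histogram(SAMPLE_LOG)
print(dict(histogram))
```

Grouping by user and mailbox, rather than by user alone, keeps a client that watches several folders from looking like one very fast poller.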

Conclusions • High-frequency SELECTs are killing us • A new server per 75 users is EXPENSIVE!

Hypothesis • When clients check for new email they send a SELECT • (They should send a NOOP) • Users are setting their clients to check for email every minute
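For illustration, a client that polls the way the hypothesis recommends might look like the sketch below, using Python's standard imaplib. It selects the mailbox once and then polls with NOOP. The host, credentials, round count, and interval are placeholders, and this is not a description of how any particular mail client is implemented.

```python
import imaplib
import time


def poll_with_noop(conn, rounds, interval_seconds):
    """Poll an already-SELECTed mailbox with NOOP.

    Returns the message counts the server reported via untagged EXISTS
    responses. imaplib keeps untagged responses until they are read, so
    the first count may be the one left over from the initial SELECT.
    """
    counts = []
    for _ in range(rounds):
        conn.noop()  # lightweight poll; no mailbox re-open on the server
        _, data = conn.response("EXISTS")  # returns and clears stored EXISTS data
        if data and data[0] is not None:
            counts.append(int(data[0]))
        time.sleep(interval_seconds)
    return counts


def check_mail(host, user, password):
    """Placeholder driver: SELECT once, then poll every 10 minutes."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(user, password)
    conn.select("INBOX", readonly=True)  # the only SELECT in the session
    try:
        return poll_with_noop(conn, rounds=3, interval_seconds=600)
    finally:
        conn.logout()
```

The point of the split is that the expensive step, SELECT, happens once per session, while each later check is a NOOP.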

Final Notes • Webmail checks every five minutes (and does use SELECT) • Some clients have a drop-down menu to select this time (1-min, 5-min, etc.)

Our Plea • See if your users are polling < 5 min • 10 min is better • You can always manually check for new email • Help them change their polling time if needed
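As a back-of-the-envelope check on why the polling interval matters, the sketch below counts checks per client per day at the intervals mentioned above. It assumes a client that stays open around the clock, which overstates the load from most desktops.

```python
MINUTES_PER_DAY = 24 * 60


def polls_per_day(interval_minutes):
    """Checks per day for one client that is left running all day (assumption)."""
    return MINUTES_PER_DAY // interval_minutes


for interval in (1, 5, 10):
    print(f"every {interval:2d} min -> {polls_per_day(interval)} checks per client per day")
```

Moving from a 1-minute to a 10-minute interval cuts the number of checks from one client by a factor of ten.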

The UNIX Group • Chris Alexander • Bruce Anderson • Karla Fields • Amanda Gagnon • Ken Guyton • Curt Tucker • Eric Van Wieren

NetReg/CAT Update Alan White

Emory University High Performance Computing Cluster Keven Haynes

Need for High Performance • Large number of computations • Large data set • Complex computations • Specialized applications • More disciplines doing computational work

Need for Shared Resources • Most researchers do not have the physical resources to house large computing systems. Air conditioning, power, and security are all important and often overlooked. • Many researchers lack the technical expertise required to manage systems, especially Linux/Unix. • Most personally owned systems are underutilized and therefore not as cost-effective. • Money pooled together can buy bigger and better systems.

Emory High Performance Computing Cluster • Partnership between Emory College, BIMCORE (School of Medicine) and ITD. • Emory College and individual faculty (Jaeger, Prinz) provided funding for purchase of the cluster. • BIMCORE provides software expertise, cost-recovery infrastructure. • ITD provides facility and system administration.

The Cluster - Hardware • 63 dual-processor (AMD 2.2 GHz Opteron 248) Sun V20z “Compute Nodes” • 1 quad-processor (AMD 2.2 GHz Opteron 848) Sun V40z “Master Node” • Compute Nodes have 2 GB of RAM each; the Master Node has 8 GB • Each node has 73 GB of local disk space (RAID 1) • Master Node has 550+ GB of local disk space • Two 47U APC powered rack enclosures

The Cluster - Networking • All nodes connected via gigabit Ethernet (copper) on private network • Two SMC gigabit switches • 21 Nodes are connected via 4 gigabit Myrinet • Service processors connected via 100 Mb Ethernet • Two MRV serial console switches

The Cluster - Software • Red Hat Enterprise Linux - Version 3, x86_64, Advanced Server and Workstation • Kernel 2.4.21, glibc 2.3.2, gcc version 3.2.3 • 64-bit operating system/runtime environment • Sun Grid Engine™: • Manages queuing and prioritization of jobs • Performs job and user accounting for time-shared resource • Can support up to 200,000 jobs simultaneously • Heterogeneous support allows connection of Mac OS X, Solaris and other execution hosts
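To give a feel for how work reaches a Grid Engine queue, here is a minimal sketch that builds a job script with embedded `#$` directives, which would then be handed to `qsub`. The job name and the `./simulate` command are hypothetical, and any queue, parallel environment, or resource limits specific to this cluster are not shown because the talk does not give them.

```python
def sge_job_script(job_name, command, tasks=1):
    """Build a Grid Engine job script as text.

    Lines starting with "#$" are read by qsub as submission options:
      -N    job name
      -S    shell used to run the script
      -cwd  run in the directory the job was submitted from
      -j y  merge stderr into stdout
      -t    submit an array job with the given task range
    """
    lines = [
        "#!/bin/bash",
        f"#$ -N {job_name}",
        "#$ -S /bin/bash",
        "#$ -cwd",
        "#$ -j y",
    ]
    if tasks > 1:
        lines.append(f"#$ -t 1-{tasks}")
    lines.append(command)
    return "\n".join(lines) + "\n"


# Hypothetical example: ten independent runs, one per array task.
# SGE_TASK_ID is set by Grid Engine for each task of an array job.
script = sge_job_script("demo", "./simulate --seed $SGE_TASK_ID", tasks=10)
print(script)
```

Saving the output to a file and running `qsub` on it would place the ten tasks in the queue, where the scheduler decides when and on which nodes each one runs.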

Current and Future Applications • Genesis (neural simulator, Jaeger) • Pattern Generation and Homeostasis in Neural Circuits (Prinz) • Pharmacology (Severson) • Others: • Animation Rendering • Statistical Analysis (R) • Numerical Analysis (MATLAB) • Bioinformatics (BLAST) • Large Population Studies, GIS

Who may use the Cluster? • Researchers, namely PIs • Open to anyone affiliated with Emory, possibly some external research • Subscription Fee: ~$3,000/year or $750/quarter

Questions?

Premiere Support Overview Karen Jenkins

Premiere Support • Advanced/escalated support for specific set of customers • Local support and other campus technical professionals • Executive leadership (later phase) • Benefits • Dedicated number to reduce wait times • Direct entry to high level support technicians

When to use • Reporting of system down and other performance issues only. • University Enterprise applications & Network • Examples include: Network Outage affecting a large department, building, or campus; Eagle Mail is down; PeopleSoft is crawling; other “strange” behavior • Non-critical or other work requests should go through Manage IT or ESR. • Examples include: account requests, virus reporting, suggestions to improve service, etc.

Logistics • Hours: M-F 8:00am – 5:00pm • After-hours calls are automatically forwarded to the help desk (which forwards to on-call after help desk hours). • Premiere Support Team: • Call Center supervisor • Craig Myers • Linda Ellis

Responsibilities • Obtain technical input/details • Escalate to proper Tier 3 team • Provide regular communication and updates • Via Manage IT Bulletin Board • Provide final debrief / explanation of problem • Again via Manage IT

Setup • Dedicated line rings on the primary team members' phone sets • After 4 rings, the call rolls over to help desk FTE phones … if those are busy, to the queue • Premiere Support calls are placed at the front of the queue. • Investigating adding a visual indicator for the help desk FTEs.

Pre-Requisites • Requires a support account in Manage IT • Participation in the local-l listserv • Caller ID information displayed

… and the number is … • Will be posted on the Manage IT Bulletin Board • Available on TBD

INPUT, QUESTIONS, COMMENTS

Quick Manage IT Update • New Manage IT Major Features for 8/31/05 • Flashboards • Port Status Table • 2-Way email • Assignment permissions group • Target Features for 9/30/05 • PS Status Table • Emory Reports • Magic View • Resolution / Communication to requester • Manage IT Training Session scheduled for September 13th @ 1:00pm, NDB Auditorium or Kennessaw

Quick ESR Update • New ESR Major Features by 8/31/05 • Change pop-ups to long names • Web & DB self-service forms • Target Features for 9/30/05 • On-behalf of (in Manage IT too) • Communication box

NetCom Q&A
