Performance Analysis

Performance Analysis Chapter 25

Introduction
• Performance is one of the most visible characteristics of any system, and it is often high on the list of complaints from users.
• Many users are convinced that their computers could run twice as fast if only the administrator knew how to properly tune the system to release its vast, untapped potential.
• In reality this is almost never true.
Chapter 25 - Performance Analysis

Introduction
• System performance is not entirely out of your control. It is just that the road to good performance is not paved with magic fixes and romantic kernel patches.
• There are two basic rules:
  • Don’t overload your system or your network.
  • Collect and review historical information about your system.
• Keep regular baselines in your hip pocket to pull out in an emergency.
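
One way to start collecting a baseline is sketched below. It is a minimal illustration, not a recommended tool: the function names and the CSV layout are assumptions made for this example, and it records only the load averages reported by Python's os.getloadavg().

```python
import csv
import os
import time


def baseline_row(now, load):
    """Format one sample: a local timestamp plus the 1/5/15-minute load averages."""
    stamp = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(now))
    return [stamp] + ["%.2f" % value for value in load]


def append_sample(path):
    """Append the current load averages to a CSV file.

    'path' is a hypothetical log location chosen by the caller.
    """
    row = baseline_row(time.time(), os.getloadavg())
    with open(path, "a", newline="") as handle:
        csv.writer(handle).writerow(row)
    return row
```

Calling append_sample() from a periodic job (cron, for example) builds up the history that the second rule asks for.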

Introduction
• This chapter focuses on the performance of systems that are used as servers.
• Desktop systems typically do not experience the same types of performance issues that servers do.
• And the answer to the question of how to improve performance of a desktop is almost always “Upgrade the hardware.”
• Users like this answer because it means they get fancy new systems on their desk more often.

1. What you can do to Improve Performance
• Here are some specific things you can do to improve performance:
• Make sure the system has enough memory.
  • Memory has a major influence on performance.
  • Load every performance-sensitive machine to the gills.

1. What you can do to Improve Performance
• Correct problems of usage:
  • Caused by users
    • too many jobs at once
    • inefficient programming practices
    • jobs run at excessive priority
    • large jobs run at inappropriate times of the day
  • Caused by the system
    • quotas
    • CPU accounting
    • unwanted daemons

1. What you can do to Improve Performance
• Use a load balancing appliance.
  • There are devices that make several servers appear to be one logical server to the outside world.
  • These also provide useful redundancy should a server go down or a traffic spike hit.
• Organize the system’s hard disks and filesystems so that load is evenly balanced, maximizing I/O throughput.
  • For databases, consider a RAID setup.

1. What you can do to Improve Performance
• Monitor your network.
  • Make sure it is not saturated with traffic and that the error rate is low.
  • Use the netstat command and see Chapter 20.
• Configure your kernel to eliminate unwanted drivers.
  • This was covered in Chapter 12.
• Identify situations in which the system is fundamentally inadequate to satisfy the demands being made of it.

2. Factors that affect Performance
• Introduction:
  • Perceived performance is determined by the efficiency with which the system’s resources are allocated and shared.
  • Only the following four resources have much effect on performance:
    • CPU time
    • Memory
    • Hard disk I/O bandwidth
    • Network I/O bandwidth

2. Factors that affect Performance
• Introduction (cont):
  • All processes consume a portion of the system’s resources.
  • If resources are still left after active processes have taken what they want, the system’s performance is about as good as it can be.
  • If there are not enough resources to go around, processes must take turns.
  • The amount of time spent waiting is one of the basic measures of performance degradation.

2. Factors that affect Performance
• Introduction (cont):
  • CPU time is one of the easiest resources to measure.
  • Many people assume that the speed of the CPU is the most important factor affecting a system’s overall performance.
  • In the everyday world CPU speed is relatively unimportant.

2. Factors that affect Performance
• Introduction (cont):
  • The most common performance bottleneck on UNIX systems is actually disk bandwidth.
  • Each disk access causes a stall worth millions of CPU instructions.
  • Because of virtual memory, disk bandwidth and memory are directly related.
  • Swapping and paging caused by bloated software is performance enemy #1 on most workstations.

3. System Performance Checkup
• Introduction:
  • Most performance analysis tools tell you what is going on at a particular point in time.
  • However, the number and character of loads will probably change throughout the day.
  • Be sure to gather a cross-section of data before taking action.
  • The best information on system performance often becomes clear only after a long period of data collection.

3. System Performance Checkup
• Analyzing CPU usage:
  • You will probably want to gather three kinds of CPU data:
    • overall utilization (is CPU speed the bottleneck?)
    • load averages (give an impression of overall system performance)
    • per-process CPU consumption (is someone being a hog?)

3. System Performance Checkup
• Analyzing CPU usage (cont):
  • Overall usage:
    • You can obtain summary information with the vmstat command (on Solaris and HP-UX you can use sar -u).
    • CPU numbers that are heavy on user time generally indicate computation, and high system numbers indicate that processes are making a lot of system calls or performing I/O.
    • Long-term averages of CPU statistics allow you to determine whether there is fundamentally enough CPU power to go around.
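
The user/system split described above can also be computed directly on Linux, where the aggregate "cpu" line of /proc/stat carries cumulative user, nice, system, and idle tick counters. The sketch below is an illustration under that assumption; the function names are made up for this example, and the remaining counters on the line (iowait, irq, and so on) are deliberately ignored.

```python
def parse_cpu_line(line):
    """Return (user, system, idle) ticks from the aggregate 'cpu' line of /proc/stat.

    Niced user time is folded into the user figure.
    """
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    user, nice, system, idle = (int(value) for value in fields[1:5])
    return user + nice, system, idle


def cpu_split(before, after):
    """Percentages of user, system, and idle time between two samples."""
    deltas = [later - earlier for earlier, later in zip(before, after)]
    total = sum(deltas)
    if total == 0:
        return (0.0, 0.0, 0.0)
    return tuple(100.0 * delta / total for delta in deltas)
```

To use it, read the first line of /proc/stat twice, a few seconds apart, and pass both parsed samples to cpu_split(). Averaging many such intervals over days gives the long-term picture.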

3. System Performance Checkup
• Analyzing CPU usage (cont):
  • Load average:
    • The average number of runnable processes.
    • It gives a good idea of how many pieces the CPU pie is being divided into.
    • Many commands report it; uptime is one of them.
    • The higher the load average, the more important the system’s aggregate performance becomes.

3. System Performance Checkup
• Analyzing CPU usage (cont):
  • Load average (cont):
    • Modern systems do not deal well with load averages over about 6.0.
      • You may have to ask people to run jobs at night
      • or use the nice command.
    • The system load average is an excellent metric to track as part of a system baseline.
    • If you know your system’s load average on a normal day and it is in that same range on a bad day, this is a hint that you should look elsewhere (such as the network) for performance problems.
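
The baseline comparison in the last bullet can be sketched as follows. This is only an illustration: the function names are hypothetical, and the 25% tolerance is an arbitrary assumption that should be replaced by whatever spread your own baseline data shows.

```python
import os


def load_within_baseline(current, baseline, tolerance=0.25):
    """True if 'current' lies within baseline * (1 +/- tolerance)."""
    low = baseline * (1.0 - tolerance)
    high = baseline * (1.0 + tolerance)
    return low <= current <= high


def check_now(baseline):
    """Compare the current 1-minute load average with a known normal value."""
    one_minute, _, _ = os.getloadavg()
    return load_within_baseline(one_minute, baseline)
```

If check_now() returns True on a day when users are complaining, the CPU is probably not the culprit.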

3. System Performance Checkup
• Analyzing CPU usage (cont):
  • Per-process CPU consumption:
    • Another way to view CPU usage is to run the ps command (ps -aux or ps -elf).
    • More likely than not, on a busy system, at least 70% of the CPU will be consumed by one or two processes.
    • Deferring the execution of the CPU hogs (or reducing their priority) will make the CPU available to other processes.
    • An excellent alternative to ps is a program called top.
    • top itself can be quite a CPU hog, so be judicious in your use of it.
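
Finding the one or two processes that dominate the CPU can be automated. The sketch below assumes a ps that accepts the POSIX options -A and -o with the pid, pcpu, and comm fields; the function names are invented for this example, and the parser expects one header line followed by three-column rows.

```python
import subprocess


def top_consumers(ps_text, count=3):
    """Parse 'pid pcpu comm' rows and return the heaviest CPU users first."""
    rows = []
    for line in ps_text.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 2)
        if len(parts) != 3:
            continue
        pid, pcpu, command = parts
        rows.append((float(pcpu), int(pid), command.strip()))
    rows.sort(reverse=True)
    return [(pid, pcpu, command) for pcpu, pid, command in rows[:count]]


def snapshot():
    """Run ps once and return its raw output (assumes POSIX-style options)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,pcpu,comm"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling top_consumers(snapshot()) runs ps only once per check, which is lighter than leaving top running.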

3. System Performance Checkup
• How UNIX manages memory:
  • UNIX manages memory in units called pages that are usually 4K or larger.
  • Disk blocks are usually smaller than pages (1K or 512 bytes), so the kernel has to associate several disk blocks with each page that is written out.
  • UNIX uses an LRU (least recently used) algorithm for moving pages in and out.
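
The page-to-block relationship is simple arithmetic. The sketch below assumes 512-byte and 1K block sizes for illustration; the function name is hypothetical, and mmap.PAGESIZE is used to look up the page size of the machine running the code.

```python
import mmap


def blocks_per_page(page_size, block_size):
    """Number of disk blocks the kernel must associate with one memory page."""
    if block_size <= 0 or page_size % block_size:
        raise ValueError("page size must be a whole multiple of the block size")
    return page_size // block_size


if __name__ == "__main__":
    for block_size in (512, 1024):
        count = blocks_per_page(mmap.PAGESIZE, block_size)
        print("%d-byte blocks per %d-byte page: %d" % (block_size, mmap.PAGESIZE, count))
```

With a 4K page, that works out to eight 512-byte blocks or four 1K blocks per page.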

3. System Performance Checkup
• How UNIX manages memory (cont):
  • Swapping is handled somewhat differently than paging.
  • If a process is known to have been idle for a long time (tens of seconds), it makes sense to write out all its pages at once rather than waiting for the paging algorithm to collect them.
  • It is a very bad sign if the kernel forcibly swaps out runnable processes.
  • This is called thrashing and indicates an extreme memory shortage.

3. System Performance Checkup
• Analyzing memory usage:
  • On a workstation your best memory analysis tools are your ears.
  • The amount of paging activity is generally proportional to the amount of crunching you hear from the disk.
  • There are basically two numbers that quantify memory activity:
    • The total amount of active virtual memory
      • total demand for memory
    • The paging rate
      • proportion of memory that is actively being used
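
On Linux the paging rate can be measured without listening to the disk: /proc/vmstat exposes cumulative counters such as pswpin and pswpout (pages swapped in and out). The sketch below assumes that file format; the function names are made up for this example, and which counters to sum is a choice left to the caller.

```python
def parse_vmstat(text):
    """Turn the 'name value' lines of /proc/vmstat into a dict of integers."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def paging_rate(before, after, seconds, keys=("pswpin", "pswpout")):
    """Pages moved per second between two snapshots taken 'seconds' apart."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    moved = sum(after[key] - before[key] for key in keys)
    return moved / seconds
```

Take two snapshots of /proc/vmstat a known interval apart, parse each, and pass them to paging_rate(). A rate that stays high while the system feels slow points at a memory shortage.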

3. System Performance Checkup
• Analyzing memory usage (cont):
  • The goal is to reduce the activity or increase the memory until the paging remains at an acceptable level.
  • The amount of swap space can be determined with one of these commands:
    • swap -l under Solaris
    • swapinfo under HP-UX
    • swapon -s on Red Hat
    • pstat -s on FreeBSD

3. System Performance Checkup
• Analyzing disk I/O:
  • Most systems allow disk throughput to be monitored with the iostat command.
  • Each hard disk has columns kps, tps, and serv, indicating kilobytes transferred per second, total transfers per second, and average “service times” (seek times) in milliseconds.
  • The ratio between kps and tps tells you whether there are a few large transfers or lots of small ones.
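
The kps/tps ratio is just the average transfer size in kilobytes. The helper below is a small illustration of that arithmetic; the function name is hypothetical, and the figures passed to it would come from your own iostat output.

```python
def average_transfer_kb(kps, tps):
    """Average kilobytes per transfer, given iostat's kps and tps columns."""
    if tps == 0:
        return 0.0  # an idle disk has no transfers to average
    return kps / tps
```

A disk moving 512 kps in 8 tps averages 64K per transfer (a few large transfers), while the same 512 kps spread over 256 tps averages only 2K (lots of small ones).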

3. System Performance Checkup
• Analyzing disk I/O (cont):
  • Some systems support iostat -D, which gives the percentage utilization of each disk.
  • Filesystem design and layout can improve (lower) this utilization.
  • Some systems allow you to set up /tmp as a memory-based filesystem.
  • This is generally a good thing.

3. System Performance Checkup
• Analyzing disk I/O (cont):
  • Some software degrades the system’s performance by delaying basic operations.
  • Two examples are disk quotas and CPU accounting.
  • Disk caching helps to soften the impact of these features, but they may still have a slight effect on performance and should not be enabled unless you really need them.

3. System Performance Checkup
• procinfo: display Red Hat performance data
  • Red Hat’s procinfo command provides a nice summary of system performance information, much like vmstat but in a more understandable form.
  • The information it provides about PC interrupts is especially useful.
  • If you have a spare terminal window and can spare the CPU cycles, you can run procinfo -f to show updates every 5 seconds.

3. System Performance Checkup
• pstat: print random FreeBSD statistics
  • Another useful tool available on FreeBSD systems is the pstat command.
  • It dumps the contents of various kernel tables in an almost human-readable form.

4. Help! My system just got really slow!
• Introduction:

5. Recommended Reading
• Cockcroft, Adrian and Richard Pettit. Sun Performance and Tuning: Java and the Internet. Upper Saddle River, NJ: Prentice Hall. 1998.
• Loukides, Mike. System Performance Tuning. Sebastopol, CA: O’Reilly. 1991.

