NetBSD ships a variety of performance monitoring tools with the system. Most of these tools are common on all UNIX systems. This section gives example usage of several of them, along with an interpretation of their output.
The top monitor does exactly what its name says: it displays the top CPU consumers on the system. To run the monitor, simply type top at the prompt. Without any arguments, the display looks like this:
load averages: 0.09, 0.12, 0.08 20:23:41
21 processes: 20 sleeping, 1 on processor
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Memory: 15M Act, 1104K Inact, 208K Wired, 22M Free, 129M Swap free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
13663 root 2 0 1552K 1836K sleep 0:08 0.00% 0.00% httpd
127 root 10 0 129M 4464K sleep 0:01 0.00% 0.00% mount_mfs
22591 root 2 0 388K 1156K sleep 0:01 0.00% 0.00% sshd
108 root 2 0 132K 472K sleep 0:01 0.00% 0.00% syslogd
22597 jrf 28 0 156K 616K onproc 0:00 0.00% 0.00% top
22592 jrf 18 0 828K 1128K sleep 0:00 0.00% 0.00% tcsh
203 root 10 0 220K 424K sleep 0:00 0.00% 0.00% cron
1 root 10 0 312K 192K sleep 0:00 0.00% 0.00% init
205 root 3 0 48K 432K sleep 0:00 0.00% 0.00% getty
206 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
208 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
207 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
13667 nobody 2 0 1660K 1508K sleep 0:00 0.00% 0.00% httpd
9926 root 2 0 336K 588K sleep 0:00 0.00% 0.00% sshd
200 root 2 0 76K 456K sleep 0:00 0.00% 0.00% inetd
182 root 2 0 92K 436K sleep 0:00 0.00% 0.00% portsentry
180 root 2 0 92K 436K sleep 0:00 0.00% 0.00% portsentry
13666 nobody -4 0 1600K 1260K sleep 0:00 0.00% 0.00% httpd
The top utility is great for finding CPU hogs, runaway processes or groups of processes that may be causing problems. The output shown above indicates that this particular system is in good health. Now, the next display should show some very different results:
load averages: 0.34, 0.16, 0.13 21:13:47
25 processes: 24 sleeping, 1 on processor
CPU states: 0.5% user, 0.0% nice, 9.0% system, 1.0% interrupt, 89.6% idle
Memory: 20M Act, 1712K Inact, 240K Wired, 30M Free, 129M Swap free
PID USERNAME PRI NICE SIZE RES STATE TIME WCPU CPU COMMAND
5304 jrf -5 0 56K 336K sleep 0:04 66.07% 19.53% bonnie
5294 root 2 0 412K 1176K sleep 0:02 1.01% 0.93% sshd
108 root 2 0 132K 472K sleep 1:23 0.00% 0.00% syslogd
187 root 2 0 1552K 1824K sleep 0:07 0.00% 0.00% httpd
5288 root 2 0 412K 1176K sleep 0:02 0.00% 0.00% sshd
5302 jrf 28 0 160K 620K onproc 0:00 0.00% 0.00% top
5295 jrf 18 0 828K 1116K sleep 0:00 0.00% 0.00% tcsh
5289 jrf 18 0 828K 1112K sleep 0:00 0.00% 0.00% tcsh
127 root 10 0 129M 8388K sleep 0:00 0.00% 0.00% mount_mfs
204 root 10 0 220K 424K sleep 0:00 0.00% 0.00% cron
1 root 10 0 312K 192K sleep 0:00 0.00% 0.00% init
208 root 3 0 48K 432K sleep 0:00 0.00% 0.00% getty
210 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
209 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
211 root 3 0 48K 424K sleep 0:00 0.00% 0.00% getty
217 nobody 2 0 1616K 1272K sleep 0:00 0.00% 0.00% httpd
184 root 2 0 336K 580K sleep 0:00 0.00% 0.00% sshd
201 root 2 0 76K 456K sleep 0:00 0.00% 0.00% inetd
At first glance it is obvious which process is hogging the system; what is interesting in this case is why. The bonnie program is a disk benchmark tool that writes large files in a variety of sizes and ways. The previous output indicates only that bonnie is a CPU hog, not why.
A careful reading of the top(1) manual page shows that much more can be done with top. For example, processes can have their priority changed or be killed, and filters can be set to restrict which processes are displayed.
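Picking the busy process out of a top display can also be done mechanically. The following is a minimal Python sketch, not part of top itself, that parses the first three process rows copied from the second display above and ranks them by the WCPU column. The function names `parse_top_rows` and `busiest` and the 1.0% threshold are assumptions made for illustration.

```python
# Rows copied from the second top display above (first three processes).
SAMPLE = """\
 5304 jrf      -5    0    56K  336K sleep     0:04 66.07% 19.53% bonnie
 5294 root      2    0   412K 1176K sleep     0:02  1.01%  0.93% sshd
  108 root      2    0   132K  472K sleep     1:23  0.00%  0.00% syslogd
"""

# Column names in the order they appear in the top header line.
FIELDS = ["pid", "username", "pri", "nice", "size", "res",
          "state", "time", "wcpu", "cpu", "command"]


def parse_top_rows(text):
    """Turn whitespace-separated top process rows into dictionaries."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, len(FIELDS) - 1)
        # Skip headers, blank lines, and anything that is not a process row.
        if len(parts) != len(FIELDS) or not parts[0].isdigit():
            continue
        row = dict(zip(FIELDS, parts))
        row["pid"] = int(row["pid"])
        row["wcpu"] = float(row["wcpu"].rstrip("%"))
        row["cpu"] = float(row["cpu"].rstrip("%"))
        rows.append(row)
    return rows


def busiest(rows, threshold=1.0):
    """Return rows at or above the WCPU threshold, busiest first."""
    hot = [r for r in rows if r["wcpu"] >= threshold]
    return sorted(hot, key=lambda r: r["wcpu"], reverse=True)


if __name__ == "__main__":
    for r in busiest(parse_top_rows(SAMPLE)):
        print(r["pid"], r["command"], r["wcpu"])
```

With the sample rows, bonnie is listed first and sshd second. As the text notes, this only identifies which process is busy; it says nothing about why.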
As the systat(1) manual page indicates, the systat utility shows a variety of system statistics using the curses library. While it is running, the screen is split into two parts: the upper window shows the current load average, while the content of the lower window depends on user commands. The exception to the split-window view is the vmstat display, which takes up the whole screen. The following is what systat looks like on a fairly idle system when invoked with no arguments:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average |
/0 /10 /20 /30 /40 /50 /60 /70 /80 /90 /100
<idle> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
There is a lot of idle time there. Now have a look with an argument provided, in this case systat inet.tcp, which looks like this:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average |
0 connections initiated 19 total TCP packets sent
0 connections accepted 11 data
0 connections established 0 data (retransmit)
8 ack-only
0 connections dropped 0 window probes
0 in embryonic state 0 window updates
0 on retransmit timeout 0 urgent data only
0 by keepalive 0 control
0 by persist
29 total TCP packets received
11 potential rtt updates 17 in sequence
11 successful rtt updates 0 completely duplicate
9 delayed acks sent 0 with some duplicate data
0 retransmit timeouts 4 out of order
0 persist timeouts 0 duplicate acks
0 keepalive probes 11 acks
0 keepalive timeouts 0 window probes
0 window updates
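One way to read the right-hand column of this display is as a breakdown of the packet totals into categories. The short Python sketch below copies the send-side counters from the output above, checks that they add up to the reported total, and computes each category's share. The dictionary layout and the `shares` helper are illustrative assumptions, not anything systat provides.

```python
# Send-side counters copied from the systat inet.tcp display above.
TOTAL_SENT = 19
SENT = {
    "data": 11,
    "data (retransmit)": 0,
    "ack-only": 8,
    "window probes": 0,
    "window updates": 0,
    "urgent data only": 0,
    "control": 0,
}


def shares(counters, total):
    """Return each counter as a percentage of the total (0.0 if total is 0)."""
    if total == 0:
        return {name: 0.0 for name in counters}
    return {name: 100.0 * count / total for name, count in counters.items()}


if __name__ == "__main__":
    # In this sample the categories account for every packet sent.
    print("categories sum to total:", sum(SENT.values()) == TOTAL_SENT)
    for name, pct in shares(SENT, TOTAL_SENT).items():
        print(f"{name:20s} {pct:5.1f}%")
```

In this sample all sent packets are either data or ack-only, and none are retransmissions, which is what one would hope to see on a healthy, lightly loaded connection.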
Now that is informative. The first poll is cumulative, so quite a lot of information is visible in the output as soon as systat is invoked. While that may be interesting, how about a look at the buffer cache with systat bufcache:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average
There are 1642 buffers using 6568 kBytes of memory.
File System Bufs used % kB in use % Bufsize kB % Util %
/ 877 53 6171 93 6516 99 94
/var/tmp 5 0 17 0 28 0 60
Total: 882 53 6188 94 6544 99
Again, a pretty boring system, but great information to have available. While this is all nice to look at, it is time to put an artificial load on the system to see how systat can be used as a performance monitoring tool. As with top, bonnie++ will be used to put a high load on the I/O subsystems and a little on the CPU. The bufcache display will be examined again to see if there are any noticeable differences:
/0 /1 /2 /3 /4 /5 /6 /7 /8 /9 /10
Load Average |||
There are 1642 buffers using 6568 kBytes of memory.
File System Bufs used % kB in use % Bufsize kB % Util %
/ 811 49 6422 97 6444 98 99
Total: 811 49 6422 97 6444 98
First, notice that the load average shot up, which is to be expected. Second, while most of the numbers are close to the earlier ones, utilization is now at 99%. Throughout the time that bonnie++ was running, the utilization percentage remained at 99. That makes sense here, but in a real troubleshooting situation it could indicate a process doing heavy I/O on one particular file or file system.
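The percentage columns in the bufcache display can be reproduced from the raw counts, which helps when interpreting them. The Python sketch below uses the numbers from the two displays above. The formulas (each percentage taken against the buffer or kByte totals, utilization as kB in use over Bufsize kB, all truncated to whole numbers) are inferred from these sample figures and are an assumption, not taken from the systat source.

```python
def pct(part, whole):
    """Integer percentage, truncated; 0 when the denominator is 0."""
    return 100 * part // whole if whole else 0


# (bufs used, kB in use, bufsize kB) per file system, from the displays above.
IDLE = {
    "buffers": 1642,
    "kbytes": 6568,
    "filesystems": {"/": (877, 6171, 6516), "/var/tmp": (5, 17, 28)},
}
LOADED = {
    "buffers": 1642,
    "kbytes": 6568,
    "filesystems": {"/": (811, 6422, 6444)},
}


def report(sample):
    """Recompute the four percentage columns for every file system."""
    rows = {}
    for mount, (bufs, kb_used, kb_size) in sample["filesystems"].items():
        rows[mount] = {
            "bufs_pct": pct(bufs, sample["buffers"]),
            "kb_used_pct": pct(kb_used, sample["kbytes"]),
            "kb_size_pct": pct(kb_size, sample["kbytes"]),
            "util_pct": pct(kb_used, kb_size),
        }
    return rows


if __name__ == "__main__":
    for label, sample in (("idle", IDLE), ("loaded", LOADED)):
        for mount, cols in report(sample).items():
            print(label, mount, cols)
```

Under these assumptions, utilization for / rises from 94 to 99 between the idle and loaded samples because nearly all of the space held by its buffers is in use while bonnie++ runs.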