Annotation of wikisrc/guide/tuning.mdwn, revision 1.1

1.1     ! jdf         1: # Tuning NetBSD
        !             2: 
        !             3: ## Introduction
        !             4: 
        !             5: ### Overview
        !             6: 
        !             7: This section covers a variety of performance tuning topics. It attempts to span
        !             8: tuning from the perspective of the system administrator to that of the systems
        !             9: programmer. The art of performance tuning itself is very old. To tune something
        !            10: means to make it operate more efficiently. Whether one is referring to a
        !            11: NetBSD-based technical server or a vacuum cleaner, the goal is to improve
        !            12: something: the way it is done, how it works, or how it is put together.
        !            13: 
        !            14: #### What is Performance Tuning?
        !            15: 
        !            16: A view from 10,000 feet pretty much dictates that everything we do is task
        !            17: oriented; this pertains to a NetBSD system as well. When the system boots, it
        !            18: automatically begins to perform a variety of tasks. When a user logs in, they
        !            19: usually have a wide variety of tasks they have to accomplish. In the scope of
        !            20: these documents, however, performance tuning strictly means to improve how
        !            21: efficiently a NetBSD system performs.
        !            22: 
        !            23: The most common thought that crops up in someone's mind when they think "tuning"
        !            24: is some sort of speed increase or a decrease in the size of the kernel. While
        !            25: those are ways to improve performance, they are not the only measures an
        !            26: administrator may have to take to increase efficiency. For our purposes,
        !            27: performance tuning means this: *To make a NetBSD system operate in an optimum
        !            28: state.*
        !            29: 
        !            30: That could mean a variety of things, not necessarily speed enhancements. A good
        !            31: example of this is filesystem formatting parameters. On a system that has a lot
        !            32: of small files (say, a source repository) an administrator may need to
        !            33: increase the number of inodes by lowering the amount of data space allotted to
        !            34: each inode (say, down to 1024 bytes) when the filesystem is created. Nothing
        !            35: runs faster as a result, but it keeps the administrator from getting those nasty
        !            36: out of inodes messages, which ultimately makes the system more efficient.
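
As a rough illustration of the inode example above, the sketch below estimates how many inodes a filesystem of a given size would receive at a given bytes-per-inode density. The function name `estimate_inodes` and the sizes in the comments are made up for illustration. On NetBSD the density is normally chosen at creation time (newfs(8) is assumed here to take it through its `-i` option; check the manual page on your release), and the real count also depends on the filesystem layout, so treat the result as an approximation only.

```shell
#!/bin/sh
# Hypothetical helper: approximate inode count for a filesystem.
#   $1 = filesystem size in bytes
#   $2 = bytes of data space per inode (the density)
estimate_inodes() {
    echo $(( $1 / $2 ))
}

# Example: a 1 GB filesystem at two different densities.
#   estimate_inodes 1073741824 8192   # prints 131072
#   estimate_inodes 1073741824 1024   # prints 1048576
```

Lowering the density from 8192 to 1024 bytes per inode gives roughly eight times as many inodes on the same partition, at the cost of more space spent on inode tables.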
        !            37: 
        !            38: Tuning normally revolves around finding and eliminating bottlenecks. Most of the
        !            39: time, such bottlenecks are spurious. For example, a release of Mozilla that does
        !            40: not handle Java applets too well can start crunching the CPU, especially with
        !            41: applets that are not done well. Occasions when processes seem to spin off into
        !            42: nowhere and eat CPU are almost always resolved with a kill. There are instances,
        !            43: however, when resolving bottlenecks takes a lot longer. For example, say an
        !            44: rsynced server is just getting larger and larger. Slowly, performance begins to
        !            45: fade and the administrator may have to take some sort of action to speed things
        !            46: up. That situation, however, is far less urgent than an emergency like an
        !            47: instantly spiked CPU.
        !            48: 
        !            49: #### When does one tune?
        !            50: 
        !            51: Many NetBSD users rarely have to tune a system. The GENERIC kernel may run just
        !            52: fine and the layout/configuration of the system may do the job as well. By the
        !            53: same token, as a practical matter it is always good to know how to tune a system.
        !            54: Most often tuning comes as a result of a sudden bottleneck issue (which may occur
        !            55: randomly) or a gradual loss of performance. It does happen, in a sense, to
        !            56: everyone at some point: one process that is eating the CPU is a bottleneck as
        !            57: much as a gradual increase in paging. So, the question should not be when to
        !            58: tune so much as when to learn to tune.
        !            59: 
        !            60: One last time to tune is ahead of time: if you can tune in a preventive manner
        !            61: (and you think you might need to), then do it. One example of this was a system
        !            62: that needed to be able to reboot quickly. Instead of waiting, I did everything I
        !            63: could to trim the kernel and make sure there was absolutely nothing running that
        !            64: was not needed. I even removed drivers for devices that were present but never
        !            65: used (lp). The result was reducing reboot time by nearly two-thirds. In the long
        !            66: run, it was a smart move to tune it before it became an issue.
        !            67: 
        !            68: #### What these Documents Will Not Cover
        !            69: 
        !            70: Before I wrap up the introduction, I think it is important to note what these
        !            71: documents will not cover. This guide will pertain only to the core NetBSD
        !            72: system. In other words, it will not cover tuning a web server's configuration to
        !            73: make it run better; however, it might mention how to tune NetBSD to run better
        !            74: as a web server. The logic behind this is simple: web servers, database
        !            75: software, etc. are third party and almost limitless. I could easily get mired
        !            76: down in details that do not apply to the NetBSD system. Almost all third party
        !            77: software has its own documentation about tuning anyhow.
        !            78: 
        !            79: #### How Examples are Laid Out
        !            80: 
        !            81: Since there is ample man page documentation, only the options and arguments
        !            82: used in the examples are discussed. In some cases, material is truncated for
        !            83: brevity and not thoroughly discussed because, quite simply, there is too much.
        !            84: For example, not every device driver entry in the kernel will be discussed;
        !            85: however, an example of determining whether or not a given system needs one will
        !            86: be. Nothing in this Guide is concrete, as tuning and performance are very
        !            87: subjective. Instead, it is a guide for the reader to learn what some of the
        !            88: tools available to them can do.
        !            89: 
        !            90: ## Tuning Considerations
        !            91: 
        !            92: Tuning a system is not really too difficult when pro-active tuning is the
        !            93: approach. This document approaches tuning from a *before it comes up* angle.
        !            94: While tuning in spare time is considerably easier than tuning, say, a server
        !            95: that is almost completely bogged down at 0.1% idle time, there are still a few
        !            96: things that should be mulled over about tuning before actually doing it,
        !            97: hopefully before a system is even installed.
        !            98: 
        !            99: ### General System Configuration
        !           100: 
        !           101: Of course, how the system is set up makes a big difference. Sometimes small items
        !           102: can be overlooked which may in fact cause some sort of long term performance
        !           103: problem.
        !           104: 
        !           105: #### Filesystems and Disks
        !           106: 
        !           107: How the filesystem is laid out relative to disk drives is very important. On
        !           108: hardware RAID systems, it is not such a big deal, but many NetBSD users
        !           109: specifically use NetBSD on older hardware where hardware RAID simply is not an
        !           110: option. The idea of `/` being close to the first drive is a good one, but if
        !           111: there are several drives to choose from that could be the first one, is the
        !           112: best performing drive the one that `/` will be on? On a related note, is it
        !           113: wise to split off `/usr`? Will the system see heavy usage say in `/usr/pkgsrc`?
        !           114: It might make sense to slap a fast drive in and mount it under `/usr/pkgsrc`, or
        !           115: it might not. Like all things in performance tuning, this is subjective.
        !           116: 
        !           117: #### Swap Configuration
        !           118: 
        !           119: There are three schools of thought on swap size and about fifty on using split
        !           120: swap files with prioritizing and how that should be done. In the swap size
        !           121: arena, the vendor schools (at least most commercial ones) usually have their own
        !           122: formulas per OS. As an example, on a particular version of HP-UX with a
        !           123: particular version of Oracle the formula was:
        !           124: 
        !           125: 2.5 GB \* Number\_of\_processor
        !           126: 
        !           127: Well, that all really depends on what type of usage the database is seeing and
        !           128: how large it is. For instance, if it is so large that it must be distributed,
        !           129: that formula does not fit well.
        !           130: 
        !           131: The next school of thought about swap sizing is sort of strange but makes some
        !           132: sense. It says, if possible, get a reference amount of memory used by the
        !           133: system. It goes something like this:
        !           134: 
        !           135:  1. Start up a machine and estimate total memory needs by running everything that
        !           136:     may ever be needed at once. Databases, web servers .... whatever. Total up
        !           137:     the amount.
        !           138:  2. Add a few MB for padding.
        !           139:  3. Subtract the amount of physical RAM from this total.
        !           140: 
        !           141: If the amount leftover is 3 times the size of physical RAM, consider getting
        !           142: more RAM. The problem, of course, is figuring out what is needed and how much
        !           143: space it will take. There is also another flaw in this method: some programs do
        !           144: not behave well. A glaring example of misbehaved software is web browsers. On
        !           145: certain versions of Netscape, when something went wrong it had a tendency to
        !           146: run away and eat swap space. So, the more spare space available, the more time
        !           147: to kill it.
        !           148: 
        !           149: Last but not least is the tried and true PHYSICAL\_RAM \* 2 method. On modern
        !           150: machines and even older ones (with limited purpose of course) this seems to work
        !           151: best.
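
The two sizing approaches that can be computed directly, the reference method and the PHYSICAL\_RAM \* 2 rule, can be sketched as small shell functions. The function names and the numbers in the comments are invented for illustration, and every size is assumed to be in megabytes. This is only a worked version of the arithmetic described above, not a recommendation for any particular machine.

```shell
#!/bin/sh
# Hypothetical helpers; all sizes are in megabytes.

# Rule of thumb from the text: swap = physical RAM * 2.
#   $1 = physical RAM
swap_rule_of_thumb() {
    echo $(( $1 * 2 ))
}

# Reference method from the text: total need + padding - physical RAM.
#   $1 = measured total memory need, $2 = padding, $3 = physical RAM
# Prints the amount left over for swap (never less than 0).
swap_from_reference() {
    leftover=$(( $1 + $2 - $3 ))
    if [ "$leftover" -lt 0 ]; then
        leftover=0
    fi
    echo "$leftover"
}

# Succeeds (exit status 0) when the leftover is at least 3 times physical
# RAM, the point at which the text suggests considering more RAM.
#   $1 = leftover, $2 = physical RAM
needs_more_ram() {
    [ "$1" -ge $(( $2 * 3 )) ]
}

# Example with made-up figures: 700 MB needed, 32 MB padding, 256 MB RAM.
#   swap_from_reference 700 32 256   # prints 476
#   swap_rule_of_thumb 256           # prints 512
```

With these sample figures the two methods land close to each other, which is one reason the simpler rule of thumb is often good enough.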
        !           152: 
        !           153: All in all, it is hard to tell when swapping will start. Even on small 16MB RAM
        !           154: machines (and less) NetBSD has always worked well for most people until
        !           155: misbehaving software is running.
        !           156: 
        !           157: ### System Services
        !           158: 
        !           159: On servers, system services have a large impact. Getting them to run at their
        !           160: best almost always requires some sort of network level change or a fundamental
        !           161: speed increase in the underlying system (which of course is what this is all
        !           162: about). There are instances when some simple solutions can improve services. One
        !           163: example: an ftp server is becoming slower and a new release of the ftp server
        !           164: that is shipped with the system comes out that just happens to run faster. By
        !           165: upgrading the ftp software, a performance boost is accomplished.
        !           166: 
        !           167: Another good example where services are concerned is the age old question: *To
        !           168: use inetd or not to use inetd?* A great service example is pop3. Pop3
        !           169: connections can conceivably clog up inetd. While the pop3 service itself starts
        !           170: to degrade slowly, other services that are multiplexed through inetd will also
        !           171: degrade (in some cases more than pop3). Setting up pop3 to run outside of inetd
        !           172: and on its own may help.
        !           173: 
        !           174: ### The NetBSD Kernel
        !           175: 
        !           176: The NetBSD kernel obviously plays a key role in how well a system performs.
        !           177: While rebuilding and tuning the kernel is covered later in the text, it is worth
        !           178: discussing in the local context from a high level.
        !           179: 
        !           180: Tuning the NetBSD kernel really involves three main areas:
        !           181: 
        !           182:  1. removing unrequired drivers
        !           183:  2. configuring options
        !           184:  3. system settings
        !           185: 
        !           186: #### Removing Unrequired Drivers
        !           187: 
        !           188: Taking drivers out of the kernel that are not needed achieves several results:
        !           189: first, the system boots faster since the kernel is smaller; second, again since
        !           190: the kernel is smaller, more memory is free to users and processes; and third,
        !           191: the kernel tends to respond quicker.
        !           192: 
        !           193: #### Configuring Options
        !           194: 
        !           195: Configuring options such as enabling/disabling certain subsystems, specific
        !           196: hardware and filesystems can also improve performance pretty much the same way
        !           197: removing unrequired drivers does. A very simple example of this is an FTP server
        !           198: that only hosts ftp files - nothing else. On this particular server there is no
        !           199: need to have anything but native filesystem support and perhaps a few options to
        !           200: help speed things along. Why would it ever need NTFS support for example?
        !           201: Besides, if it did, support for NTFS could be added at some later time. In an
        !           202: opposite case, a workstation may need to support a lot of different filesystem
        !           203: types to share and access files.
        !           204: 
        !           205: #### System Settings
        !           206: 
        !           207: System wide settings are controlled by the kernel. A few examples are filesystem
        !           208: settings, network settings and core kernel settings such as the maximum number
        !           209: of processes. Almost all system settings can be at least looked at or modified
        !           210: via the sysctl facility. Examples using the sysctl facility are given later on.
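
As a small preview of looking at a setting, the sketch below extracts one value from sysctl-style output. It assumes that sysctl(8) prints each variable as `name = value`, and the helper name `sysctl_value` is hypothetical. The commands in the trailing comments are meant for a NetBSD system and are not executed by the snippet itself; the value 1024 is only an example.

```shell
#!/bin/sh
# Hypothetical helper: pull one value out of "name = value" lines on stdin,
# the layout sysctl(8) is assumed to print.
#   $1 = variable name, for example kern.maxproc
sysctl_value() {
    awk -v key="$1" -F' = ' '$1 == key { print $2 }'
}

# On a NetBSD system (not run here):
#   sysctl kern.maxproc | sysctl_value kern.maxproc
#   sysctl -w kern.maxproc=1024      # as root; 1024 is only an example value
```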
        !           211: 
        !           212: ## Visual Monitoring Tools
        !           213: 
        !           214: NetBSD ships a variety of performance monitoring tools with the system. Most of
        !           215: these tools are common on all UNIX systems. In this section some example usage
        !           216: of the tools is given with interpretation of the output.
        !           217: 
        !           218: ### The top Process Monitor
        !           219: 
        !           220: The [top(1)](http://netbsd.gw.com/cgi-bin/man-cgi?top+1+NetBSD-current)
        !           221: monitor does exactly what it says: it displays the CPU hogs on the
        !           222: system. To run the monitor, simply type top at the prompt. Without any
        !           223: arguments, it should look like:
        !           224: 
        !           225:     load averages:  0.09,  0.12,  0.08                                     20:23:41
        !           226:     21 processes:  20 sleeping, 1 on processor
        !           227:     CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
        !           228:     Memory: 15M Act, 1104K Inact, 208K Wired, 22M Free, 129M Swap free
        !           229:     
        !           230:       PID USERNAME PRI NICE   SIZE   RES STATE     TIME   WCPU    CPU COMMAND
        !           231:     13663 root       2    0  1552K 1836K sleep     0:08  0.00%  0.00% httpd
        !           232:       127 root      10    0   129M 4464K sleep     0:01  0.00%  0.00% mount_mfs
        !           233:     22591 root       2    0   388K 1156K sleep     0:01  0.00%  0.00% sshd
        !           234:       108 root       2    0   132K  472K sleep     0:01  0.00%  0.00% syslogd
        !           235:     22597 jrf       28    0   156K  616K onproc    0:00  0.00%  0.00% top
        !           236:     22592 jrf       18    0   828K 1128K sleep     0:00  0.00%  0.00% tcsh
        !           237:       203 root      10    0   220K  424K sleep     0:00  0.00%  0.00% cron
        !           238:         1 root      10    0   312K  192K sleep     0:00  0.00%  0.00% init
        !           239:       205 root       3    0    48K  432K sleep     0:00  0.00%  0.00% getty
        !           240:       206 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           241:       208 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           242:       207 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           243:     13667 nobody     2    0  1660K 1508K sleep     0:00  0.00%  0.00% httpd
        !           244:      9926 root       2    0   336K  588K sleep     0:00  0.00%  0.00% sshd
        !           245:       200 root       2    0    76K  456K sleep     0:00  0.00%  0.00% inetd
        !           246:       182 root       2    0    92K  436K sleep     0:00  0.00%  0.00% portsentry
        !           247:       180 root       2    0    92K  436K sleep     0:00  0.00%  0.00% portsentry
        !           248:     13666 nobody    -4    0  1600K 1260K sleep     0:00  0.00%  0.00% httpd
        !           249: 
        !           250: The top(1) utility is great for finding CPU hogs, runaway processes or groups of
        !           251: processes that may be causing problems. The output shown above indicates that
        !           252: this particular system is in good health. Now, the next display should show some
        !           253: very different results:
        !           254: 
        !           255:     load averages:  0.34,  0.16,  0.13                                     21:13:47
        !           256:     25 processes:  24 sleeping, 1 on processor
        !           257:     CPU states:  0.5% user,  0.0% nice,  9.0% system,  1.0% interrupt, 89.6% idle
        !           258:     Memory: 20M Act, 1712K Inact, 240K Wired, 30M Free, 129M Swap free
        !           259:     
        !           260:       PID USERNAME PRI NICE   SIZE   RES STATE     TIME   WCPU    CPU COMMAND
        !           261:      5304 jrf       -5    0    56K  336K sleep     0:04 66.07% 19.53% bonnie
        !           262:      5294 root       2    0   412K 1176K sleep     0:02  1.01%  0.93% sshd
        !           263:       108 root       2    0   132K  472K sleep     1:23  0.00%  0.00% syslogd
        !           264:       187 root       2    0  1552K 1824K sleep     0:07  0.00%  0.00% httpd
        !           265:      5288 root       2    0   412K 1176K sleep     0:02  0.00%  0.00% sshd
        !           266:      5302 jrf       28    0   160K  620K onproc    0:00  0.00%  0.00% top
        !           267:      5295 jrf       18    0   828K 1116K sleep     0:00  0.00%  0.00% tcsh
        !           268:      5289 jrf       18    0   828K 1112K sleep     0:00  0.00%  0.00% tcsh
        !           269:       127 root      10    0   129M 8388K sleep     0:00  0.00%  0.00% mount_mfs
        !           270:       204 root      10    0   220K  424K sleep     0:00  0.00%  0.00% cron
        !           271:         1 root      10    0   312K  192K sleep     0:00  0.00%  0.00% init
        !           272:       208 root       3    0    48K  432K sleep     0:00  0.00%  0.00% getty
        !           273:       210 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           274:       209 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           275:       211 root       3    0    48K  424K sleep     0:00  0.00%  0.00% getty
        !           276:       217 nobody     2    0  1616K 1272K sleep     0:00  0.00%  0.00% httpd
        !           277:       184 root       2    0   336K  580K sleep     0:00  0.00%  0.00% sshd
        !           278:       201 root       2    0    76K  456K sleep     0:00  0.00%  0.00% inetd
        !           279: 
        !           280: At first, it should seem rather obvious which process is hogging the system;
        !           281: however, what is interesting in this case is why. The bonnie program is a disk
        !           282: benchmark tool which can write large files in a variety of sizes and ways. What
        !           283: the previous output indicates is only that the bonnie program is a CPU hog, but
        !           284: not why.
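
Spotting the hog by eye works for one screen of output; for logging or alerting the same check can be scripted. The sketch below filters top-style output for processes at or above a WCPU threshold. The function name `cpu_hogs` and the threshold are invented for illustration, and the column positions (WCPU ninth, COMMAND eleventh) are taken from the displays above, so they may differ on other versions. top(1) is assumed to offer a batch mode (`-b`) suitable for piping; check the manual page before relying on it.

```shell
#!/bin/sh
# Hypothetical helper: list processes whose WCPU is at or above a threshold.
# Reads top-style output on stdin and prints "PID COMMAND WCPU" per match.
#   $1 = threshold in percent, for example 50
cpu_hogs() {
    awk -v min="$1" '
        # Process lines only appear after the column header has been seen.
        seen && NF >= 11 {
            wcpu = $9
            sub(/%/, "", wcpu)          # "66.07%" -> "66.07"
            if (wcpu + 0 >= min) print $1, $11, $9
        }
        $1 == "PID" { seen = 1 }        # the header line itself is skipped
    '
}

# On a NetBSD system (not run here; -b is assumed to select batch mode):
#   top -b | cpu_hogs 50
```

Fed the second display above with a threshold of 50, this would report only the bonnie process.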
        !           285: 
        !           286: #### Other Neat Things About Top
        !           287: 
        !           288: A careful examination of the manual page
        !           289: [top(1)](http://netbsd.gw.com/cgi-bin/man-cgi?top+1+NetBSD-5.0.1+i386) shows
        !           290: that there is a lot more that can be done with top. For example, processes can
        !           291: be killed or have their priority changed. Additionally, filters can be set for
        !           292: looking at processes.
        !           293: 
        !           294: ### The systat utility
        !           295: 
        !           296: As the man page
        !           297: [systat(1)](http://netbsd.gw.com/cgi-bin/man-cgi?sysstat+1+NetBSD-5.0.1+i386)
        !           298: indicates, the systat utility shows a variety of system statistics using the
        !           299: curses library. While it is running the screen is shown in two parts: the upper
        !           300: window shows the current load average while the lower window depends on user
        !           301: commands. The exception to the split window view is the vmstat display, which
        !           302: takes up the whole screen. Following is what systat looks like on a
        !           303: fairly idle system with no arguments given when it was invoked:
        !           304: 
        !           305:                        /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
        !           306:          Load Average   |
        !           307:     
        !           308:                              /0   /10  /20  /30  /40  /50  /60  /70  /80  /90  /100
        !           309:                       <idle> XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
        !           310: 
        !           311: Basically a lot of dead time there. Now have a look with some arguments
        !           312: provided, in this case `systat inet.tcp`, which looks like this:
        !           313: 
        !           314:                         /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
        !           315:          Load Average   |
        !           316:     
        !           317:             0 connections initiated           19 total TCP packets sent
        !           318:             0 connections accepted            11   data
        !           319:             0 connections established          0   data (retransmit)
        !           320:                                                8   ack-only
        !           321:             0 connections dropped              0   window probes
        !           322:             0   in embryonic state             0   window updates
        !           323:             0   on retransmit timeout          0   urgent data only
        !           324:             0   by keepalive                   0   control
        !           325:             0   by persist
        !           326:                                               29 total TCP packets received
        !           327:            11 potential rtt updates           17   in sequence
        !           328:            11 successful rtt updates           0   completely duplicate
        !           329:             9 delayed acks sent                0   with some duplicate data
        !           330:             0 retransmit timeouts              4   out of order
        !           331:             0 persist timeouts                 0   duplicate acks
        !           332:             0 keepalive probes                11   acks
        !           333:             0 keepalive timeouts               0   window probes
        !           334:                                                0   window updates
        !           335: 
        !           336: Now that is informative. The first poll is cumulative, so it is possible to
        !           337: see quite a lot of information in the output when systat is invoked. Now, while
        !           338: that may be interesting, how about a look at the buffer cache with `systat
        !           339: bufcache`:
        !           340: 
        !           341:                         /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
        !           342:          Load Average
        !           343:     
        !           344:     There are 1642 buffers using 6568 kBytes of memory.
        !           345:     
        !           346:     File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
        !           347:     /                          877  53        6171  93        6516  99      94
        !           348:     /var/tmp                     5   0          17   0          28   0      60
        !           349:     
        !           350:     Total:                     882  53        6188  94        6544  99
        !           351: 
        !           352: Again, a pretty boring system, but great information to have available. While
        !           353: this is all nice to look at, it is time to put an artificial load on the system
        !           354: to see how systat can be used as a performance monitoring tool. As with top,
        !           355: bonnie++ will be used to put a high load on the I/O subsystems and a little on
        !           356: the CPU. The bufcache will be looked at again to see if there are any noticeable
        !           357: differences:
        !           358: 
        !           359:                         /0   /1   /2   /3   /4   /5   /6   /7   /8   /9   /10
        !           360:          Load Average   |||
        !           361:     
        !           362:     There are 1642 buffers using 6568 kBytes of memory.
        !           363:     
        !           364:     File System          Bufs used   %   kB in use   %  Bufsize kB   %  Util %
        !           365:     /                          811  49        6422  97        6444  98      99
        !           366:     
        !           367:     Total:                     811  49        6422  97        6444  98
        !           368: 
        !           369: First, notice that the load average shot up, which is to be expected of course.
        !           370: Then, while most of the numbers are close, notice that utilization is at 99%.
        !           371: Throughout the time that bonnie++ was running the utilization percentage
        !           372: remained at 99. This of course makes sense; however, in a real troubleshooting
        !           373: situation, it could be indicative of a process doing heavy I/O on one particular
        !           374: file or filesystem.
        !           375: 
        !           376: ## Monitoring Tools
        !           377: 
        !           378: In addition to screen oriented monitors and tools, the NetBSD system also ships
        !           379: with a set of command line oriented tools. Many of the tools that ship with a
        !           380: NetBSD system can be found on other UNIX and UNIX-like systems.
        !           381: 
        !           382: ### fstat
        !           383: 
        !           384: The [fstat(1)](http://netbsd.gw.com/cgi-bin/man-cgi?fstat+1+NetBSD-5.0.1+i386)
        !           385: utility reports the status of open files on the system. While it is not what
        !           386: many administrators consider a performance monitor, it can help find out if a
        !           387: particular user or process is using an inordinate number of files, generating
        !           388: large files, and similar information.
        !           389: 
        !           390: Following is a sample of some fstat output:
        !           391: 
        !           392:     USER     CMD          PID   FD MOUNT      INUM MODE         SZ|DV R/W
        !           393:     jrf      tcsh       21607   wd /         29772 drwxr-xr-x     512 r
        !           394:     jrf      tcsh       21607    3* unix stream c057acc0 <-> c0553280
        !           395:     jrf      tcsh       21607    4* unix stream c0553280 <-> c057acc0
        !           396:     root     sshd       21597   wd /             2 drwxr-xr-x     512 r
        !           397:     root     sshd       21597    0 /         11921 crw-rw-rw-    null rw
        !           398:     nobody   httpd       5032   wd /             2 drwxr-xr-x     512 r
        !           399:     nobody   httpd       5032    0 /         11921 crw-rw-rw-    null r
        !           400:     nobody   httpd       5032    1 /         11921 crw-rw-rw-    null w
        !           401:     nobody   httpd       5032    2 /         15890 -rw-r--r--  353533 rw
        !           402:     ...
        !           403: 
        !           404: The fields are pretty self explanatory. Again, this tool, while not as
        !           405: performance oriented as others, can come in handy when trying to find out
        !           406: information about file usage.
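
To turn a listing like the one above into a quick "who has the most files open" summary, the output can be counted per user and command. The sketch below is one way to do that. The function name `open_files_by_cmd` is hypothetical, and it assumes the first line is the column header and that USER and CMD are the first two columns, as in the sample.

```shell
#!/bin/sh
# Hypothetical helper: count open-file entries per user and command.
# Reads fstat-style output on stdin and prints "COUNT USER CMD",
# highest count first.
open_files_by_cmd() {
    # NR > 1 skips the header; NF >= 3 skips blank or truncated lines.
    awk 'NR > 1 && NF >= 3 { n[$1 " " $2]++ }
         END { for (k in n) print n[k], k }' | sort -rn
}

# On a NetBSD system (not run here):
#   fstat | open_files_by_cmd | head
```

A process or user that sits far above the rest in this list is a reasonable place to start looking.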
        !           407: 
        !           408: ### iostat
        !           409: 
The [iostat(8)](http://netbsd.gw.com/cgi-bin/man-cgi?iostat+8+NetBSD-5.0.1+i386)
command does exactly what its name suggests: it reports the status of the I/O
subsystems on the system. When iostat is employed, the user typically runs it
with an interval between reports and a number of reports, like so:
        !           414: 
        !           415:     $ iostat 5 5
        !           416:           tty            wd0             cd0             fd0             md0             cpu
        !           417:      tin tout  KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in id
        !           418:        0    1  5.13   1 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           419:        0   54  0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           420:        0   18  0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           421:        0   18  8.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           422:        0   28  0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           423: 
The above output is from a very quiet ftp server. The columns are grouped by
I/O device: the tty (which, ironically, is the most active because iostat is
running); wd0, the primary IDE disk; cd0, the CD-ROM drive; fd0, the floppy
drive; and md0, the memory filesystem. The final group shows how CPU time is
split between user (`us`), nice (`ni`), system (`sy`), interrupt (`in`) and
idle (`id`).
        !           428: 
Now, let's see if we can pummel the system with some heavy usage: a large ftp
transfer of a tarball of the netbsd-current source, with the `bonnie++` disk
benchmark program running at the same time.
        !           432: 
        !           433:     $ iostat 5 5
        !           434:           tty            wd0             cd0             fd0             md0             cpu
        !           435:      tin tout  KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in id
        !           436:        0    1  5.68   1 0.00   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  0  0 100
        !           437:        0   54 61.03 150 8.92   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0 18  4 78
        !           438:        0   26 63.14 157 9.71   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0 20  4 75
        !           439:        0   20 43.58  26 1.12   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   0  0  9  2 88
        !           440:        0   28 19.49  82 1.55   0.00   0 0.00   0.00   0 0.00   0.00   0 0.00   1  0  7  3 89
        !           441: 
As expected, wd0 is very active. What is interesting about this output is how
the processor's system (`sy`) and interrupt (`in`) time seems to rise in
proportion to the activity on wd0. This makes perfect sense; however, it can
only be observed because this ftp server is otherwise hardly being used. If,
for example, the CPU was already under a moderate load and the disk subsystem
was under the same load as it is now, it could appear that the CPU is the
bottleneck when in fact it is the disk. In such a case, we can observe that
*one tool* is rarely enough to completely analyze a problem. A quick glance at
the process list (after watching iostat) would probably tell us which
processes were causing problems.
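
To watch that relationship without reading every column, the disk throughput
and the non-idle CPU percentage can be printed side by side. This is a minimal
sketch that assumes the layout shown above, where the first disk's `MB/s` is
the fifth field and the idle percentage is the last one; other machines will
have different devices, and the name `disk_vs_cpu` is made up for illustration.

```shell
# Read iostat(8) output on stdin and print, for each sample line,
# the first disk's MB/s followed by the busy CPU percentage (100 - idle).
# Header lines are skipped because their first field is not a number.
disk_vs_cpu() {
    awk '$1 ~ /^[0-9]+$/ { print $5, 100 - $NF }'
}

# Typical use on a live system:
#   iostat 5 5 | disk_vs_cpu
```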
        !           452: 
        !           453: ### ps
        !           454: 
Using the [ps(1)](http://netbsd.gw.com/cgi-bin/man-cgi?ps+1+NetBSD-5.0.1+i386)
(process status) command, a great deal of information about the system can be
discovered. Most of the time, ps is used to isolate a particular process by
name, group, owner, etc. Invoked with no options or arguments, ps simply prints
information about the processes belonging to the user executing it.
        !           460: 
        !           461:     $ ps
        !           462:       PID TT STAT    TIME COMMAND
        !           463:     21560 p0 Is   0:00.04 -tcsh
        !           464:     21564 p0 I+   0:00.37 ssh jrf.odpn.net
        !           465:     21598 p1 Ss   0:00.12 -tcsh
        !           466:     21673 p1 R+   0:00.00 ps
        !           467:     21638 p2 Is+  0:00.06 -tcsh
        !           468: 
Not very exciting. The fields are self-explanatory with the exception of
`STAT`, which is the state a process is in. The flags are all documented in the
man page; in the above example, `I` is idle, `S` is sleeping, `R` is runnable,
`+` means the process is in the foreground, and `s` means the process is a
session leader. This all makes sense when looking at the listing: PID 21560,
for example, is a shell, it is idle and (as would be expected) the shell is the
session leader.
        !           476: 
In most cases, someone is looking for something very specific in the process
listing. As an example, `-a` shows the processes of all users, `-ax` adds the
processes without controlling terminals, and `aux` gives a much more verbose
listing (basically everything plus information about the impact processes are
having):
        !           482: 
        !           483:     # ps aux
        !           484:     USER     PID %CPU %MEM    VSZ  RSS TT STAT STARTED    TIME COMMAND
        !           485:     root       0  0.0  9.6      0 6260 ?? DLs  16Jul02 0:01.00 (swapper)
        !           486:     root   23362  0.0  0.8    144  488 ?? S    12:38PM 0:00.01 ftpd -l
        !           487:     root   23328  0.0  0.4    428  280 p1 S    12:34PM 0:00.04 -csh
        !           488:     jrf    23312  0.0  1.8    828 1132 p1 Is   12:32PM 0:00.06 -tcsh
        !           489:     root   23311  0.0  1.8    388 1156 ?? S    12:32PM 0:01.60 sshd: jrf@ttyp1
        !           490:     jrf    21951  0.0  1.7    244 1124 p0 S+    4:22PM 0:02.90 ssh jrf.odpn.net
        !           491:     jrf    21947  0.0  1.7    828 1128 p0 Is    4:21PM 0:00.04 -tcsh
        !           492:     root   21946  0.0  1.8    388 1156 ?? S     4:21PM 0:04.94 sshd: jrf@ttyp0
        !           493:     nobody  5032  0.0  2.0   1616 1300 ?? I    19Jul02 0:00.02 /usr/pkg/sbin/httpd
        !           494:     ...
        !           495: 
Again, most of the fields are self-explanatory with the exception of `VSZ` and
`RSS`, which can be a little confusing. `RSS` is the real size of a process in
1024 byte units, while `VSZ` is the virtual size. This is all great, but again,
how can ps help? Well, for one, take a look at this modified version of the
same output:
        !           501: 
        !           502:     # ps aux
        !           503:     USER     PID %CPU %MEM    VSZ  RSS TT STAT STARTED    TIME COMMAND
        !           504:     root       0  0.0  9.6      0 6260 ?? DLs  16Jul02 0:01.00 (swapper)
        !           505:     root   23362  0.0  0.8    144  488 ?? S    12:38PM 0:00.01 ftpd -l
        !           506:     root   23328  0.0  0.4    428  280 p1 S    12:34PM 0:00.04 -csh
        !           507:     jrf    23312  0.0  1.8    828 1132 p1 Is   12:32PM 0:00.06 -tcsh
        !           508:     root   23311  0.0  1.8    388 1156 ?? S    12:32PM 0:01.60 sshd: jrf@ttyp1
        !           509:     jrf    21951  0.0  1.7    244 1124 p0 S+    4:22PM 0:02.90 ssh jrf.odpn.net
        !           510:     jrf    21947  0.0  1.7    828 1128 p0 Is    4:21PM 0:00.04 -tcsh
        !           511:     root   21946  0.0  1.8    388 1156 ?? S     4:21PM 0:04.94 sshd: jrf@ttyp0
        !           512:     nobody  5032  9.0  2.0   1616 1300 ?? I    19Jul02 0:00.02 /usr/pkg/sbin/httpd
        !           513:     ...
        !           514: 
Given that our baseline for this server indicates a relatively quiet system,
PID 5032 is using an unusually large share of `%CPU`. Sometimes this also shows
up as high `TIME` numbers. The output of ps can be filtered with grep by PID,
username or process name, which helps track down processes that may be
experiencing problems.
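
Rather than scanning the listing by eye, the output can be sorted on the `%CPU`
column. The following is a minimal sketch, assuming the `ps aux` layout shown
above with `%CPU` as the third column; the function name `top_cpu` is made up
for illustration.

```shell
# Read "ps aux" output on stdin and print the N busiest processes,
# ordered by the %CPU column (third field). N defaults to 5.
top_cpu() {
    n=${1:-5}
    # Drop the header, sort numerically on %CPU, keep the top N lines.
    sed 1d | sort -rn -k3,3 | head -n "$n"
}

# Typical use on a live system:
#   ps aux | top_cpu 3
```

Run against the modified listing above, the httpd process with 9.0 `%CPU`
comes out on top.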
        !           520: 
        !           521: ### vmstat
        !           522: 
        !           523: Using
        !           524: [vmstat(1)](http://netbsd.gw.com/cgi-bin/man-cgi?vmstat+1+NetBSD-5.0.1+i386),
        !           525: information pertaining to virtual memory can be monitored and measured. Not
        !           526: unlike iostat, vmstat can be invoked with a count and interval. Following is
        !           527: some sample output using `5 5` like the iostat example:
        !           528: 
        !           529:     # vmstat 5 5
        !           530:      procs   memory     page                       disks         faults      cpu
        !           531:      r b w   avm   fre  flt  re  pi   po   fr   sr w0 c0 f0 m0   in   sy  cs us sy id
        !           532:      0 7 0 17716 33160    2   0   0    0    0    0  1  0  0  0  105   15   4  0  0 100
        !           533:      0 7 0 17724 33156    2   0   0    0    0    0  1  0  0  0  109    6   3  0  0 100
        !           534:      0 7 0 17724 33156    1   0   0    0    0    0  1  0  0  0  105    6   3  0  0 100
        !           535:      0 7 0 17724 33156    1   0   0    0    0    0  0  0  0  0  107    6   3  0  0 100
        !           536:      0 7 0 17724 33156    1   0   0    0    0    0  0  0  0  0  105    6   3  0  0 100
        !           537: 
Yet again, relatively quiet. For comparison, the exact same load that was put
on this server in the iostat example will be used: a large file transfer and
the bonnie benchmark program.
        !           541: 
        !           542:     # vmstat 5 5
        !           543:      procs   memory     page                       disks         faults      cpu
        !           544:      r b w   avm   fre  flt  re  pi   po   fr   sr w0 c0 f0 m0   in   sy  cs us sy id
        !           545:      1 8 0 18880 31968    2   0   0    0    0    0  1  0  0  0  105   15   4  0  0 100
        !           546:      0 8 0 18888 31964    2   0   0    0    0    0 130  0  0  0 1804 5539 1094 31 22 47
        !           547:      1 7 0 18888 31964    1   0   0    0    0    0 130  0  0  0 1802 5500 1060 36 16 49
        !           548:      1 8 0 18888 31964    1   0   0    0    0    0 160  0  0  0 1849 5905 1107 21 22 57
        !           549:      1 7 0 18888 31964    1   0   0    0    0    0 175  0  0  0 1893 6167 1082  1 25 75
        !           550: 
Just a little different. Notice that, since most of the work was I/O based,
very little additional memory was used. Since this system uses mfs for `/tmp`,
however, it can certainly get beaten up. Have a look at this:
        !           554: 
        !           555:     # vmstat 5 5
        !           556:      procs   memory     page                       disks         faults      cpu
        !           557:      r b w   avm   fre  flt  re  pi   po   fr   sr w0 c0 f0 m0   in   sy  cs us sy id
        !           558:      0 2 0 99188   500    2   0   0    0    0    0  1  0  0  0  105   16   4  0  0 100
        !           559:      0 2 0111596   436  592   0 587  624  586 1210 624  0  0  0  741  883 1088  0 11 89
        !           560:      0 3 0123976   784  666   0 662  643  683 1326 702  0  0  0  828  993 1237  0 12 88
        !           561:      0 2 0134692  1236  581   0 571  563  595 1158 599  0  0  0  722  863 1066  0  9 90
        !           562:      2 0 0142860   912  433   0 406  403  405  808 429  0  0  0  552  602 768  0  7 93
        !           563: 
        !           564: Pretty scary stuff. That was created by running bonnie in `/tmp` on a memory
        !           565: based filesystem. If it continued for too long, it is possible the system could
        !           566: have started thrashing. Notice that even though the VM subsystem was taking a
        !           567: beating, the processors still were not getting too battered.
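
The paging columns are the ones to watch here. The sketch below pulls out the
page-in (`pi`) and page-out (`po`) values, assuming the header layout shown
above. It locates the columns from the header and counts from the end of each
line, because wide values at the start of a line can run together (as in
`0 2 0111596` above); it would still be confused by columns fused further to
the right. The function name `paging_activity` is made up for illustration.

```shell
# Read vmstat(1) output on stdin and print "pi po" for every sample line.
paging_activity() {
    awk '
        $1 == "r" {                       # column header line
            for (i = 1; i <= NF; i++) {
                if ($i == "pi") pi = NF - i
                if ($i == "po") po = NF - i
            }
            next
        }
        $1 ~ /^[0-9]+$/ && (pi || po) {   # sample lines, once a header was seen
            print $(NF - pi), $(NF - po)
        }'
}

# Typical use on a live system:
#   vmstat 5 5 | paging_activity
```

Sustained non-zero values in both columns, as in the last example, are the
pattern that tends to come before thrashing.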
        !           568: 
        !           569: ## Network Tools
        !           570: 
Sometimes a performance problem does not lie with a particular machine but
with the network, or with some device on the network such as another host, a
router, etc. How other machines that provide a service or some sort of
connectivity to a particular NetBSD system behave can have a very large impact
on the performance of the NetBSD system itself, or on the perception of
performance by users. A really great example of this is when a DNS server that
a NetBSD machine is using suddenly disappears: lookups take a long time and
eventually fail. An inexperienced user logged into the NetBSD machine would
undoubtedly (provided they had no other evidence) blame the NetBSD system. One
of my personal favorites, *the Internet is broke*, usually means that either
DNS service or a router/gateway has dropped offline. Whatever the case may be,
a NetBSD system comes adequately armed for finding out what network issues may
be cropping up, whether they are the fault of the local system or of something
else.
        !           584: 
        !           585: ### ping
        !           586: 
The classic
[ping(8)](http://netbsd.gw.com/cgi-bin/man-cgi?ping+8+NetBSD-5.0.1+i386) utility
can tell us whether there is plain connectivity; it can also tell us whether
host name resolution (depending on what `nsswitch.conf` dictates) is working.
Following is some typical ping output on a local network with a count of 3
specified:
        !           592: 
        !           593:     # ping -c 3 marie
        !           594:     PING marie (172.16.14.12): 56 data bytes
        !           595:     64 bytes from 172.16.14.12: icmp_seq=0 ttl=255 time=0.571 ms
        !           596:     64 bytes from 172.16.14.12: icmp_seq=1 ttl=255 time=0.361 ms
        !           597:     64 bytes from 172.16.14.12: icmp_seq=2 ttl=255 time=0.371 ms
        !           598:     
        !           599:     ----marie PING Statistics----
        !           600:     3 packets transmitted, 3 packets received, 0.0% packet loss
        !           601:     round-trip min/avg/max/stddev = 0.361/0.434/0.571/0.118 ms
        !           602: 
        !           603: Not only does ping tell us if a host is alive, it tells us how long it took and
        !           604: gives some nice details at the very end. If a host cannot be resolved, just the
        !           605: IP address can be specified as well:
        !           606: 
        !           607:     # ping -c 1 172.16.20.5
        !           608:     PING ash (172.16.20.5): 56 data bytes
        !           609:     64 bytes from 172.16.20.5: icmp_seq=0 ttl=64 time=0.452 ms
        !           610:     
        !           611:     ----ash PING Statistics----
        !           612:     1 packets transmitted, 1 packets received, 0.0% packet loss
        !           613:     round-trip min/avg/max/stddev = 0.452/0.452/0.452/0.000 ms
        !           614: 
Now, as with any other tool, the times are relative, especially when it comes
to networking. For example, while the times in the examples are good, take a
look at the localhost ping:
        !           618: 
        !           619:     # ping -c 4 localhost
        !           620:     PING localhost (127.0.0.1): 56 data bytes
        !           621:     64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0.091 ms
        !           622:     64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0.129 ms
        !           623:     64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=0.120 ms
        !           624:     64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=0.122 ms
        !           625:     
        !           626:     ----localhost PING Statistics----
        !           627:     4 packets transmitted, 4 packets received, 0.0% packet loss
        !           628:     round-trip min/avg/max/stddev = 0.091/0.115/0.129/0.017 ms
        !           629: 
Much smaller, because the request never left the machine. Pings can be used to
gather information about how well a network is performing. They are also good
for problem isolation: for instance, if there are three NetBSD systems of
roughly the same size on a network and one of them simply has horrible ping
times, chances are something is wrong on that one particular machine.
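
For that kind of comparison, only the average round-trip time is needed. The
sketch below extracts it from the summary line, assuming the
`round-trip min/avg/max/stddev = ...` format shown in the examples above; the
function name `ping_avg` is made up for illustration.

```shell
# Read ping(8) output on stdin and print the average round-trip time
# (the second of the four slash-separated values on the summary line).
ping_avg() {
    awk -F'=' '/^round-trip/ { split($2, t, "/"); print t[2] }'
}

# Typical use on a live system:
#   ping -c 3 marie | ping_avg
```

Running it against each of the hosts in turn gives one number per host to
compare.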
        !           635: 
        !           636: ### traceroute
        !           637: 
        !           638: The
        !           639: [traceroute(8)](http://netbsd.gw.com/cgi-bin/man-cgi?traceroute+8+NetBSD-5.0.1+i386)
        !           640: command is great for making sure a path is available or detecting problems on a
        !           641: particular path. As an example, here is a trace between the example ftp server
        !           642: and ftp.NetBSD.org:
        !           643: 
        !           644:     # traceroute ftp.NetBSD.org
        !           645:     traceroute to ftp.NetBSD.org (204.152.184.75), 30 hops max, 40 byte packets
        !           646:      1  208.44.95.1 (208.44.95.1)  1.646 ms  1.492 ms  1.456 ms
        !           647:      2  63.144.65.170 (63.144.65.170)  7.318 ms  3.249 ms  3.854 ms
        !           648:      3  chcg01-edge18.il.inet.qwest.net (65.113.85.229)  35.982 ms  28.667 ms  21.971 ms
        !           649:      4  chcg01-core01.il.inet.qwest.net (205.171.20.1)  22.607 ms  26.242 ms  19.631 ms
        !           650:      5  snva01-core01.ca.inet.qwest.net (205.171.8.50)  78.586 ms  70.585 ms  84.779 ms
        !           651:      6  snva01-core03.ca.inet.qwest.net (205.171.14.122)  69.222 ms  85.739 ms  75.979 ms
        !           652:      7  paix01-brdr02.ca.inet.qwest.net (205.171.205.30)  83.882 ms  67.739 ms  69.937 ms
        !           653:      8  198.32.175.3 (198.32.175.3)  72.782 ms  67.687 ms  73.320 ms
        !           654:      9  so-1-0-0.orpa8.pf.isc.org (192.5.4.231)  78.007 ms  81.860 ms  77.069 ms
        !           655:     10  tun0.orrc5.pf.isc.org (192.5.4.165)  70.808 ms  75.151 ms  81.485 ms
        !           656:     11  ftp.NetBSD.org (204.152.184.75)  69.700 ms  69.528 ms  77.788 ms
        !           657: 
All in all, not bad. The trace went from the host to the local router, then out
onto the provider network and finally out onto the Internet looking for the
final destination. How to interpret a traceroute is, again, subjective, but
abnormally high times in portions of a path can indicate a bottleneck on a
piece of network equipment. As with ping, if the host itself is suspect, run
traceroute from another host to the same destination. Now, for the worst case
scenario:
        !           665: 
        !           666:     # traceroute www.microsoft.com
        !           667:     traceroute: Warning: www.microsoft.com has multiple addresses; using 207.46.230.220
        !           668:     traceroute to www.microsoft.akadns.net (207.46.230.220), 30 hops max, 40 byte packets
        !           669:      1  208.44.95.1 (208.44.95.1)  2.517 ms  4.922 ms  5.987 ms
        !           670:      2  63.144.65.170 (63.144.65.170)  10.981 ms  3.374 ms  3.249 ms
        !           671:      3  chcg01-edge18.il.inet.qwest.net (65.113.85.229)  37.810 ms  37.505 ms  20.795 ms
        !           672:      4  chcg01-core03.il.inet.qwest.net (205.171.20.21)  36.987 ms  32.320 ms  22.430 ms
        !           673:      5  chcg01-brdr03.il.inet.qwest.net (205.171.20.142)  33.155 ms  32.859 ms  33.462 ms
        !           674:      6  205.171.1.162 (205.171.1.162)  39.265 ms  20.482 ms  26.084 ms
        !           675:      7  sl-bb24-chi-13-0.sprintlink.net (144.232.26.85)  26.681 ms  24.000 ms  28.975 ms
        !           676:      8  sl-bb21-sea-10-0.sprintlink.net (144.232.20.30)  65.329 ms  69.694 ms  76.704 ms
        !           677:      9  sl-bb21-tac-9-1.sprintlink.net (144.232.9.221)  65.659 ms  66.797 ms  74.408 ms
        !           678:     10  144.232.187.194 (144.232.187.194)  104.657 ms  89.958 ms  91.754 ms
        !           679:     11  207.46.154.1 (207.46.154.1)  89.197 ms  84.527 ms  81.629 ms
        !           680:     12  207.46.155.10 (207.46.155.10)  78.090 ms  91.550 ms  89.480 ms
        !           681:     13  * * *
        !           682:     .......
        !           683: 
In this case, the trace never reaches the Microsoft server: from hop 13 onward
the probes go unanswered, most likely because a system somewhere along the
line does not reply to them or filters them. (The warning about multiple
addresses only means that traceroute picked one of them.) At that point, one
might think to try ping. In the Microsoft case, a ping gets no reply either,
most likely because ICMP is blocked somewhere on their network.
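
Long traces are easier to read when only the suspicious hops are shown. The
following is a minimal sketch, assuming the traceroute output format shown
above; the function name `slow_hops` and the default limit of 50 ms are
arbitrary choices for illustration.

```shell
# Read traceroute(8) output on stdin. Print each hop that has a probe
# slower than LIMIT milliseconds (hop number, host, first slow time),
# and each hop that did not answer at all.
slow_hops() {
    limit=${1:-50}
    awk -v limit="$limit" '
        $1 !~ /^[0-9]+$/ { next }         # skip banner and warnings
        $2 == "*" { print $1, "no reply"; next }
        {
            for (i = 3; i < NF; i++)
                if ($(i + 1) == "ms" && $i + 0 > limit) { print $1, $2, $i; next }
        }'
}

# Typical use on a live system:
#   traceroute ftp.NetBSD.org | slow_hops 80
```

A hop that is slow while the hops after it are fast again is usually just a
router that answers probes at low priority; a jump that persists for the rest
of the path is the more interesting sign.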
        !           689: 
        !           690: ### netstat
        !           691: 
Another problem that can crop up on a NetBSD system is routing table issues.
These issues are not always the system's fault. The
[route(8)](http://netbsd.gw.com/cgi-bin/man-cgi?route+8+NetBSD-5.0.1+i386) and
[netstat(1)](http://netbsd.gw.com/cgi-bin/man-cgi?netstat+1+NetBSD-5.0.1+i386)
commands can show information about routes and network connections
(respectively).
        !           698: 
        !           699: The route command can be used to look at and modify routing tables while netstat
        !           700: can display information about network connections and routes. First, here is
        !           701: some output from `route show`:
        !           702: 
        !           703:     # route show
        !           704:     Routing tables
        !           705:     
        !           706:     Internet:
        !           707:     Destination      Gateway            Flags
        !           708:     default          208.44.95.1        UG
        !           709:     loopback         127.0.0.1          UG
        !           710:     localhost        127.0.0.1          UH
        !           711:     172.15.13.0      172.16.14.37       UG
        !           712:     172.16.0.0       link#2             U
        !           713:     172.16.14.8      0:80:d3:cc:2c:0    UH
        !           714:     172.16.14.10     link#2             UH
        !           715:     marie            0:10:83:f9:6f:2c   UH
        !           716:     172.16.14.37     0:5:32:8f:d2:35    UH
        !           717:     172.16.16.15     link#2             UH
        !           718:     loghost          8:0:20:a7:f0:75    UH
        !           719:     artemus          8:0:20:a8:d:7e     UH
        !           720:     ash              0:b0:d0:de:49:df   UH
        !           721:     208.44.95.0      link#1             U
        !           722:     208.44.95.1      0:4:27:3:94:20     UH
        !           723:     208.44.95.2      0:5:32:8f:d2:34    UH
        !           724:     208.44.95.25     0:c0:4f:10:79:92   UH
        !           725:     
        !           726:     Internet6:
        !           727:     Destination      Gateway            Flags
        !           728:     default          localhost          UG
        !           729:     default          localhost          UG
        !           730:     localhost        localhost          UH
        !           731:     ::127.0.0.0      localhost          UG
        !           732:     ::224.0.0.0      localhost          UG
        !           733:     ::255.0.0.0      localhost          UG
        !           734:     ::ffff:0.0.0.0   localhost          UG
        !           735:     2002::           localhost          UG
        !           736:     2002:7f00::      localhost          UG
        !           737:     2002:e000::      localhost          UG
        !           738:     2002:ff00::      localhost          UG
        !           739:     fe80::           localhost          UG
        !           740:     fe80::%ex0       link#1             U
        !           741:     fe80::%ex1       link#2             U
        !           742:     fe80::%lo0       fe80::1%lo0        U
        !           743:     fec0::           localhost          UG
        !           744:     ff01::           localhost          U
        !           745:     ff02::%ex0       link#1             U
        !           746:     ff02::%ex1       link#2             U
        !           747:     ff02::%lo0       fe80::1%lo0        U
        !           748: 
The flags column shows the status of each route and whether or not it goes
through a gateway. In this case we see `U`, `H` and `G` (`U` is up, `H` is host
and `G` is gateway; see the man page for additional flags).
        !           752: 
        !           753: Now for some netstat output using the `-r` (routing) and `-n` (show network
        !           754: numbers) options:
        !           755: 
        !           756:     Routing tables
        !           757:     
        !           758:     Internet:
        !           759:     Destination        Gateway            Flags     Refs     Use    Mtu  Interface
        !           760:     default            208.44.95.1        UGS         0   330309   1500  ex0
        !           761:     127                127.0.0.1          UGRS        0        0  33228  lo0
        !           762:     127.0.0.1          127.0.0.1          UH          1     1624  33228  lo0
        !           763:     172.15.13/24       172.16.14.37       UGS         0        0   1500  ex1
        !           764:     172.16             link#2             UC         13        0   1500  ex1
        !           765:     ...
        !           766:     Internet6:
    Destination                   Gateway                   Flags     Refs     Use    Mtu  Interface
    ::/104                        ::1                       UGRS        0        0  33228  lo0 =>
    ::/96                         ::1                       UGRS        0        0
        !           772: 
The above output is a little more verbose. So, how can this help? Well, a good
example is when routes between networks get changed while users are connected.
I saw this happen several times when someone was rebooting routers all day
long, once after each change. Several users called up saying they were getting
kicked out and that it was taking very long to log back in. As it turned out,
the clients connecting to the system were being redirected to another router
(which took a very long route) to reconnect. I observed the `M` flag (modified
dynamically by redirect) on their routes. I deleted the routes, had the users
reconnect and promptly followed up with the offending technician.
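
Spotting such routes in a long table is easy to script. The sketch below lists
the destination and gateway of every route whose flags contain `M`, assuming
the `netstat -rn` column layout shown above with the flags in the third field;
the function name `redirected_routes` is made up for illustration.

```shell
# Read "netstat -rn" output on stdin and print "destination gateway"
# for every route carrying the M (modified by redirect) flag.
# Header and wrapped lines are skipped: their third field is either
# missing, not alphanumeric, or the word "Flags" (which contains no M).
redirected_routes() {
    awk 'NF >= 3 && $3 ~ /^[A-Za-z0-9]+$/ && $3 ~ /M/ { print $1, $2 }'
}

# Typical use on a live system:
#   netstat -rn | redirected_routes
```

Each destination it prints is a candidate for removal with `route delete`,
once it is clear that the redirect was unwanted.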
        !           782: 
        !           783: ### tcpdump
        !           784: 
        !           785: Last, and definitely not least is
        !           786: [tcpdump(8)](http://netbsd.gw.com/cgi-bin/man-cgi?tcpdump+8+NetBSD-5.0.1+i386),
        !           787: the network sniffer that can retrieve a lot of information. In this discussion,
        !           788: there will be some sample output and an explanation of some of the more useful
        !           789: options of tcpdump.
        !           790: 
        !           791: Following is a small snippet of tcpdump in action just as it starts:
        !           792: 
        !           793:     # tcpdump
        !           794:     tcpdump: listening on ex0
    14:07:29.920651 mail.ssh > 208.44.95.231.3551: P 2951836801:2951836845(44) ack 2476972923 win 17520 <nop,nop,timestamp 1219259 128519450> [tos 0x10]
    14:07:29.950594 12.125.61.34 >  208.44.95.16: ESP(spi=2548773187,seq=0x3e8c) (DF)
    14:07:29.983117 smtp.somecorp.com.smtp > 208.44.95.30.42828: . ack 420285166 win 16500 (DF)
    14:07:29.984406 208.44.95.30.42828 > smtp.somecorp.com.smtp: . 1:1376(1375) ack 0 win 7431 (DF)
        !           802:     ...
        !           803: 
Given that this particular server is a mail server, what is shown makes perfect
sense. However, the utility is very verbose, so I prefer to initially run
tcpdump with no options and send the text output into a file for later
digestion, like so:
        !           808: 
        !           809:     # tcpdump > tcpdump.out
        !           810:     tcpdump: listening on ex0
        !           811: 
So, what precisely in the mishmash are we looking for? In short, anything that
does not seem to fit. For example, messed up packet lengths (as in a lot of
them) will show up as improper lengths or malformed packets (basically
garbage). If, however, we are looking for something specific, tcpdump may be
able to help, depending on the problem.
        !           817: 
        !           818: #### Specific tcpdump Usage
        !           819: 
        !           820: These are just examples of a few things one can do with tcpdump.
        !           821: 
        !           822: Look for duplicate IP addresses:
        !           823: 
        !           824:     tcpdump -e host ip-address
        !           825: 
        !           826: For example:
        !           827: 
        !           828:     tcpdump -e host 192.168.0.2
        !           829: 
        !           830: Look for routing problems:
        !           831: 
        !           832:     tcpdump icmp
        !           833: 
        !           834: There are plenty of third-party tools available; however, NetBSD ships with a
        !           835: good tool set for tracking down network-level performance problems.
        !           836: 
        !           837: ## Accounting
        !           838: 
        !           839: The NetBSD system comes equipped with a great many performance monitors for
        !           840: active monitoring, but what about long-term monitoring? Of course, the output
        !           841: of a variety of commands can be sent to files and re-parsed later with a
        !           842: suitable shell script or program. NetBSD does, by default, offer some
        !           843: extraordinarily powerful low-level monitoring tools for the programmer,
        !           844: administrator or really astute hobbyist.
        !           845: 
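Sending command output to files for later parsing can be as simple as the
following sketch. The helper name `snapshot` and the throwaway log file are
assumptions made for illustration; any of the monitoring commands covered
earlier could be substituted for the demonstration command.

```sh
# Append the output of a command to a log file under a timestamped header,
# so that later scripts can split the log into individual samples.
snapshot() {
    snap_log=$1; shift
    {
        printf '=== %s: %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$*"
        "$@"
    } >> "$snap_log" 2>&1
}

# Demonstration with a throwaway log file and a harmless command.
demo_log=$(mktemp)
snapshot "$demo_log" uname -s
cat "$demo_log"
rm -f "$demo_log"
```

Run periodically, for example from cron(8) with `vmstat` or `netstat -i` as the
command, this builds up a history that can be compared or graphed later.
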
        !           846: ### Accounting
        !           847: 
        !           848: While accounting gives system usage at an almost userland level, kernel
        !           849: profiling with gprof provides explicit system call usage.
        !           850: 
        !           851: Using the accounting tools can help figure out what possible performance
        !           852: problems may be lying in wait, such as increased usage of compilers or network
        !           853: services.
        !           854: 
        !           855: Starting accounting is actually fairly simple: as root, use the
        !           856: [accton(8)](http://netbsd.gw.com/cgi-bin/man-cgi?accton+8+NetBSD-5.0.1+i386)
        !           857: command. The syntax to start accounting is: `accton filename`
        !           858: 
        !           859: Accounting information is appended to filename. The lastcomm command, which
        !           860: reads from an accounting output file, looks in `/var/account/acct` by default,
        !           861: so I tend to just use that location; however, lastcomm can be told to look
        !           862: elsewhere.
        !           863: 
        !           864: To stop accounting, simply type accton with no arguments.
        !           865: 
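Put together, starting and stopping accounting might look like the sketch
below. It relies on one detail beyond the text above: accton(8) describes its
argument as an existing file, so the file is created first. The `run` wrapper
and the `DRY_RUN` variable are hypothetical conveniences that print the commands
instead of executing them; the real commands must be run as root.

```sh
# Start process accounting, creating the accounting file first if needed.
# With DRY_RUN=1 the commands are only printed, which is handy for review.
ACCT_FILE=${ACCT_FILE:-/var/account/acct}

run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_accounting() {
    run mkdir -p "$(dirname "$ACCT_FILE")" &&
    run touch "$ACCT_FILE" &&       # accton expects an existing file
    run accton "$ACCT_FILE"
}

stop_accounting() {
    run accton                      # no argument turns accounting off
}

# Show what would be executed without touching the system.
DRY_RUN=1
start_accounting
```
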
        !           866: ### Reading Accounting Information
        !           867: 
        !           868: To read accounting information, there are two tools that can be used:
        !           869: 
        !           870:  * [lastcomm(1)](http://netbsd.gw.com/cgi-bin/man-cgi?lastcomm+1+NetBSD-5.0.1+i386)
        !           871:  * [sa(8)](http://netbsd.gw.com/cgi-bin/man-cgi?sa+8+NetBSD-5.0.1+i386)
        !           872: 
        !           873: #### lastcomm
        !           874: 
        !           875: The lastcomm command shows every previously executed command, most recent
        !           876: first. It can, however, select by user. Here is some sample output:
        !           877: 
        !           878:     $ lastcomm jrf
        !           879:     last       -       jrf      ttyp3      0.00 secs Tue Sep  3 14:39 (0:00:00.02)
        !           880:     man        -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:01:49.03)
        !           881:     sh         -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:01:49.03)
        !           882:     less       -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:01:49.03)
        !           883:     lastcomm   -       jrf      ttyp3      0.02 secs Tue Sep  3 14:38 (0:00:00.02)
        !           884:     stty       -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:00:00.02)
        !           885:     tset       -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:00:01.05)
        !           886:     hostname   -       jrf      ttyp3      0.00 secs Tue Sep  3 14:38 (0:00:00.02)
        !           887:     ls         -       jrf      ttyp0      0.00 secs Tue Sep  3 14:36 (0:00:00.00)
        !           888:     ...
        !           889: 
        !           890: Pretty nice. The lastcomm command gets its information from the default location
        !           891: of `/var/account/acct`; however, another file may be specified with the `-f`
        !           892: option.
        !           893: 
        !           894: As may seem obvious, the output of lastcomm could get a little heavy on large
        !           895: multi user systems. That is where sa comes into play.
        !           896: 
        !           897: #### sa
        !           898: 
        !           899: The sa command (meaning "print system accounting statistics") can be used to
        !           900: maintain accounting files. It can also be used interactively to create reports.
        !           901: Following is the default output of sa:
        !           902: 
        !           903:     $ sa
        !           904:           77       18.62re        0.02cp        8avio        0k
        !           905:            3        4.27re        0.01cp       45avio        0k   ispell
        !           906:            2        0.68re        0.00cp       33avio        0k   mutt
        !           907:            2        1.09re        0.00cp       23avio        0k   vi
        !           908:           10        0.61re        0.00cp        7avio        0k   ***other
        !           909:            2        0.01re        0.00cp       29avio        0k   exim
        !           910:            4        0.00re        0.00cp        8avio        0k   lastcomm
        !           911:            2        0.00re        0.00cp        3avio        0k   atrun
        !           912:            3        0.03re        0.00cp        1avio        0k   cron*
        !           913:            5        0.02re        0.00cp       10avio        0k   exim*
        !           914:           10        3.98re        0.00cp        2avio        0k   less
        !           915:           11        0.00re        0.00cp        0avio        0k   ls
        !           916:            9        3.95re        0.00cp       12avio        0k   man
        !           917:            2        0.00re        0.00cp        4avio        0k   sa
        !           918:           12        3.97re        0.00cp        1avio        0k   sh
        !           919:     ...
        !           920: 
        !           921: From left to right: total times called, real time in minutes, sum of user and
        !           922: system time in minutes, average number of I/O operations per execution, size
        !           923: and command name.
        !           924: 
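Because every column carries a unit suffix, a saved report needs a little
clean-up before it can be sorted or graphed. The sketch below strips the `re`
suffix and ranks commands by real time; `by_real_time` is a hypothetical helper
name, and the sample lines are taken from the output above.

```sh
# Rank commands in default sa output by real time (the "re" column).
# Prints: real-minutes  calls  command. The totals line has no command
# name (five fields only) and is skipped.
by_real_time() {
    awk 'NF == 6 {
             re = $2
             sub(/re$/, "", re)          # "4.27re" -> "4.27"
             printf "%.2f %d %s\n", re, $1, $6
         }' "$@" | sort -k1,1nr
}

by_real_time <<'EOF'
      77       18.62re        0.02cp        8avio        0k
       3        4.27re        0.01cp       45avio        0k   ispell
       2        0.68re        0.00cp       33avio        0k   mutt
      10        3.98re        0.00cp        2avio        0k   less
EOF
```

With a live system the input would come from a pipe, as in `sa | by_real_time`.
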
        !           925: The sa command can also be used to create summary files or reports based on some
        !           926: options. For example, here is the output when specifying a sort by CPU-time
        !           927: average memory usage:
        !           928: 
        !           929:     $ sa -k
        !           930:           86       30.81re        0.02cp        8avio        0k
        !           931:           10        0.61re        0.00cp        7avio        0k   ***other
        !           932:            2        0.00re        0.00cp        3avio        0k   atrun
        !           933:            3        0.03re        0.00cp        1avio        0k   cron*
        !           934:            2        0.01re        0.00cp       29avio        0k   exim
        !           935:            5        0.02re        0.00cp       10avio        0k   exim*
        !           936:            3        4.27re        0.01cp       45avio        0k   ispell
        !           937:            4        0.00re        0.00cp        8avio        0k   lastcomm
        !           938:           12        8.04re        0.00cp        2avio        0k   less
        !           939:           13        0.00re        0.00cp        0avio        0k   ls
        !           940:           11        8.01re        0.00cp       12avio        0k   man
        !           941:            2        0.68re        0.00cp       33avio        0k   mutt
        !           942:            3        0.00re        0.00cp        4avio        0k   sa
        !           943:           14        8.03re        0.00cp        1avio        0k   sh
        !           944:            2        1.09re        0.00cp       23avio        0k   vi
        !           945: 
        !           946: The sa command is very helpful on larger systems.
        !           947: 
        !           948: ### How to Put Accounting to Use
        !           949: 
        !           950: Accounting reports, as was mentioned earlier, offer a way to help predict
        !           951: trends. For example, cc and make being used more and more on a system may
        !           952: indicate that in a few months some changes will need to be made to keep the
        !           953: system running at an optimum level. Another good example is web server usage. If
        !           954: it begins to gradually increase, again, some sort of action may need to be taken
        !           955: before it becomes a problem. Luckily, with accounting tools, said actions can be
        !           956: reasonably predicted and planned for ahead of time.
        !           957: 
        !           958: ## Kernel Profiling
        !           959: 
        !           960: Profiling a kernel is normally employed when the goal is to compare the
        !           961: difference of new changes in the kernel to a previous one or to track down some
        !           962: sort of low level performance problem. Two sets of data about profiled code
        !           963: behavior are recorded independently: function call frequency and time spent in
        !           964: each function.
        !           965: 
        !           966: ### Getting Started
        !           967: 
        !           968: First, take a look at both [[Kernel Tuning|guide/tuning#kernel]] and [[Compiling
        !           969: the kernel|guide/kernel]]. The only difference in procedure for setting up a
        !           970: kernel with profiling enabled is that you add the `-p` option when running
        !           971: config. The build area is `../compile/<KERNEL_NAME>.PROF`; for example, for a
        !           972: GENERIC kernel it would be `../compile/GENERIC.PROF`.
        !           973: 
        !           974: Following is a quick summary of how to compile a kernel with profiling enabled
        !           975: on the i386 port. The assumptions are that the appropriate sources are available
        !           976: under `/usr/src` and the GENERIC configuration is being used. Of course, that
        !           977: may not always be the situation:
        !           978: 
        !           979:  1. **`cd /usr/src/sys/arch/i386/conf`**
        !           980:  2. **`config -p GENERIC`**
        !           981:  3. **`cd ../compile/GENERIC.PROF`**
        !           982:  4. **`make depend && make`**
        !           983:  5. **`cp /netbsd /netbsd.old`**
        !           984:  6. **`cp netbsd /`**
        !           985:  7. **`reboot`**
        !           986: 
        !           987: Once the new kernel is in place and the system has rebooted, it is time to turn
        !           988: on the monitoring and start looking at results.
        !           989: 
        !           990: #### Using kgmon
        !           991: 
        !           992: To start kgmon:
        !           993: 
        !           994:     $ kgmon -b
        !           995:     kgmon: kernel profiling is running.
        !           996: 
        !           997: Next, send the data into the file `gmon.out`:
        !           998: 
        !           999:     $ kgmon -p
        !          1000: 
        !          1001: Now, it is time to make the output readable:
        !          1002: 
        !          1003:     $ gprof /netbsd > gprof.out
        !          1004: 
        !          1005: Since gprof is looking for `gmon.out`, it should find it in the current working
        !          1006: directory.
        !          1007: 
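The three steps above, bracketed around a workload, can be collected into one
small script. This is a sketch under assumptions: `profile_workload`, `run` and
`DRY_RUN` are hypothetical names, the workload command is a placeholder, and
kgmon's `-r` (reset) and `-h` (halt) flags are used in addition to the `-b` and
`-p` flags shown above so that only the workload is measured.

```sh
# Print commands instead of running them when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then echo "$*"; else "$@"; fi
}

# Profile a single workload and write the gprof report to the named file.
profile_workload() {
    report=$1; shift
    run kgmon -r &&          # reset the profile buffers
    run kgmon -b &&          # begin collecting
    run "$@" &&              # the workload to measure
    run kgmon -h &&          # halt collecting
    run kgmon -p &&          # dump the buffers to ./gmon.out
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "gprof /netbsd > $report"
    else
        gprof /netbsd > "$report"
    fi
}

# Dry run with a placeholder workload; leave DRY_RUN unset to run it for real.
DRY_RUN=1
profile_workload gprof.out ls -lR /usr/src
```
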
        !          1008: By just running kgmon alone, you may not get the information you need. If you
        !          1009: are comparing the differences between two different kernels, then a known good
        !          1010: baseline should be used. Note that it is generally a good idea to stress the
        !          1011: subsystem in question, if you know what it is, both in the baseline and with the
        !          1012: newer (or different) kernel.
        !          1013: 
        !          1014: ### Interpretation of kgmon Output
        !          1015: 
        !          1016: Now that kgmon can run, collect and parse information, it is time to actually
        !          1017: look at some of that information. In this particular instance, a GENERIC kernel
        !          1018: ran with profiling enabled for about an hour with only system processes and no
        !          1019: adverse load. In the fault insertion section, the example will be large enough
        !          1020: that even under a minimal load, detection of the problem should be easy.
        !          1021: 
        !          1022: #### Flat Profile
        !          1023: 
        !          1024: The flat profile is a list of functions, the number of times they were called
        !          1025: and how long it took (in seconds). Following is sample output from the quiet
        !          1026: system:
        !          1027: 
        !          1028:     Flat profile:
        !          1029:     
        !          1030:     Each sample counts as 0.01 seconds.
        !          1031:       %   cumulative   self              self     total
        !          1032:      time   seconds   seconds    calls  ns/call  ns/call  name
        !          1033:      99.77    163.87   163.87                             idle
        !          1034:       0.03    163.92     0.05      219 228310.50 228354.34  _wdc_ata_bio_start
        !          1035:       0.02    163.96     0.04      219 182648.40 391184.96  wdc_ata_bio_intr
        !          1036:       0.01    163.98     0.02     3412  5861.66  6463.02  pmap_enter
        !          1037:       0.01    164.00     0.02      548 36496.35 36496.35  pmap_zero_page
        !          1038:       0.01    164.02     0.02                             Xspllower
        !          1039:       0.01    164.03     0.01   481968    20.75    20.75  gettick
        !          1040:       0.01    164.04     0.01     6695  1493.65  1493.65  VOP_LOCK
        !          1041:       0.01    164.05     0.01     3251  3075.98 21013.45  syscall_plain
        !          1042:     ...
        !          1043: 
        !          1044: As expected, idle was the highest in percentage; however, there were still some
        !          1045: things going on. For example, a little further down there is the `vn_lock`
        !          1046: function:
        !          1047: 
        !          1048:     ...
        !          1049:       0.00    164.14     0.00     6711     0.00     0.00  VOP_UNLOCK
        !          1050:       0.00    164.14     0.00     6677     0.00  1493.65  vn_lock
        !          1051:       0.00    164.14     0.00     6441     0.00     0.00  genfs_unlock
        !          1052: 
        !          1053: This is to be expected, since locking still has to take place, regardless.
        !          1054: 
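On an idle system the interesting entries are easy to lose beneath `idle`. A
small filter can pull the busiest remaining functions out of a saved report.
The following is a sketch that assumes the flat profile layout shown above (self
seconds in the third column, function name last); `top_functions` is a
hypothetical helper name.

```sh
# Print "self-seconds name" for the first N non-idle functions of the
# flat profile section, which gprof already sorts by self time.
top_functions() {
    n=$1; shift
    awk '/Flat profile:/ { flat = 1; next }
         /Call graph/    { flat = 0 }
         flat && $1 ~ /^[0-9.]+$/ && $NF != "idle" { print $3, $NF }' "$@" |
        head -n "$n"
}

top_functions 3 <<'EOF'
Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ns/call  ns/call  name
 99.77    163.87   163.87                             idle
  0.03    163.92     0.05      219 228310.50 228354.34  _wdc_ata_bio_start
  0.02    163.96     0.04      219 182648.40 391184.96  wdc_ata_bio_intr
  0.01    163.98     0.02     3412  5861.66  6463.02  pmap_enter
  0.01    164.02     0.02                             Xspllower
EOF
```

Against a real report the call would be `top_functions 10 gprof.out`.
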
        !          1055: #### Call Graph Profile
        !          1056: 
        !          1057: The call graph is an augmented version of the flat profile showing subsequent
        !          1058: calls from the listed functions. First, here is some sample output:
        !          1059: 
        !          1060:                          Call graph (explanation follows)
        !          1061:     
        !          1062:     
        !          1063:     granularity: each sample hit covers 4 byte(s) for 0.01% of 164.14 seconds
        !          1064:     
        !          1065:     index % time    self  children    called     name
        !          1066:                                                      <spontaneous>
        !          1067:     [1]     99.8  163.87    0.00                 idle [1]
        !          1068:     -----------------------------------------------
        !          1069:                                                      <spontaneous>
        !          1070:     [2]      0.1    0.01    0.08                 syscall1 [2]
        !          1071:                     0.01    0.06    3251/3251        syscall_plain [7]
        !          1072:                     0.00    0.01     414/1660        trap [9]
        !          1073:     -----------------------------------------------
        !          1074:                     0.00    0.09     219/219         Xintr14 [6]
        !          1075:     [3]      0.1    0.00    0.09     219         pciide_compat_intr [3]
        !          1076:                     0.00    0.09     219/219         wdcintr [5]
        !          1077:     -----------------------------------------------
        !          1078:     ...
        !          1079: 
        !          1080: Now this can be a little confusing. The trailing number at the end of each line
        !          1081: maps to the index number of that function's own entry, for example:
        !          1082: 
        !          1083:     ...
        !          1084:                     0.00    0.01      85/85          dofilewrite [68]
        !          1085:     [72]     0.0    0.00    0.01      85         soo_write [72]
        !          1086:                     0.00    0.01      85/89          sosend [71]
        !          1087:     ...
        !          1088: 
        !          1089: Here we see that dofilewrite [68] called soo_write [72], which in turn called
        !          1090: sosend [71]. Any index can be looked up the same way; here is the entry for 64:
        !          1091: 
        !          1092:     ...
        !          1093:                     0.00    0.01     101/103         ffs_full_fsync <cycle 6> [58]
        !          1094:     [64]     0.0    0.00    0.01     103         bawrite [64]
        !          1095:                     0.00    0.01     103/105         VOP_BWRITE [60]
        !          1096:     ...
        !          1097: 
        !          1098: And so on, in this way, a "visual trace" can be established.
        !          1099: 
        !          1100: At the end of the call graph right after the terms section is an index by
        !          1101: function name which can help map indexes as well.
        !          1102: 
        !          1103: ### Putting it to Use
        !          1104: 
        !          1105: In this example, I have modified an area of the kernel I know will create a problem that will be blatantly obvious.
        !          1106: 
        !          1107: Here is the top portion of the flat profile after running the system for about an hour with little interaction from users:
        !          1108: 
        !          1109:     Flat profile:
        !          1110:     
        !          1111:     Each sample counts as 0.01 seconds.
        !          1112:       %   cumulative   self              self     total
        !          1113:      time   seconds   seconds    calls  us/call  us/call  name
        !          1114:      93.97    139.13   139.13                             idle
        !          1115:       5.87    147.82     8.69       23 377826.09 377842.52  check_exec
        !          1116:       0.01    147.84     0.02      243    82.30    82.30  pmap_copy_page
        !          1117:       0.01    147.86     0.02      131   152.67   152.67  _wdc_ata_bio_start
        !          1118:       0.01    147.88     0.02      131   152.67   271.85  wdc_ata_bio_intr
        !          1119:       0.01    147.89     0.01     4428     2.26     2.66  uvn_findpage
        !          1120:       0.01    147.90     0.01     4145     2.41     2.41  uvm_pageactivate
        !          1121:       0.01    147.91     0.01     2473     4.04  3532.40  syscall_plain
        !          1122:       0.01    147.92     0.01     1717     5.82     5.82  i486_copyout
        !          1123:       0.01    147.93     0.01     1430     6.99    56.52  uvm_fault
        !          1124:       0.01    147.94     0.01     1309     7.64     7.64  pool_get
        !          1125:       0.01    147.95     0.01      673    14.86    38.43  genfs_getpages
        !          1126:       0.01    147.96     0.01      498    20.08    20.08  pmap_zero_page
        !          1127:       0.01    147.97     0.01      219    45.66    46.28  uvm_unmap_remove
        !          1128:       0.01    147.98     0.01      111    90.09    90.09  selscan
        !          1129:     ...
        !          1130: 
        !          1131: As is obvious, there is a large difference in performance. Right off the bat the
        !          1132: idle time is noticeably less. The main difference here is that one particular
        !          1133: function has a large time across the board with very few calls. That function is
        !          1134: `check_exec`. While at first, this may not seem strange if a lot of commands
        !          1135: had been executed, when compared to the flat profile of the first measurement,
        !          1136: proportionally it does not seem right:
        !          1137: 
        !          1138:     ...
        !          1139:       0.00    164.14     0.00       37     0.00 62747.49  check_exec
        !          1140:     ...
        !          1141: 
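When comparing the two runs, note that the first profile reports per-call times
in nanoseconds (`ns/call`) while the second uses microseconds (`us/call`). The
arithmetic can be checked directly from the figures above; `us_per_call` is a
hypothetical helper used only for this illustration.

```sh
# Average time per call in microseconds: total seconds / number of calls.
us_per_call() {
    awk -v secs="$1" -v calls="$2" \
        'BEGIN { printf "%.2f\n", secs * 1000000 / calls }'
}

# Faulty kernel: 8.69 self seconds over 23 calls of check_exec.
us_per_call 8.69 23

# Total time per call, faulty kernel (us) against baseline (ns -> us).
awk 'BEGIN { printf "about %.0f times slower\n", 377842.52 / (62747.49 / 1000) }'
```

The first command reproduces the `us/call` figure in the flat profile, and the
second shows a slowdown of several thousand times per call, far more than a
change in workload would explain.
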
        !          1142: The call in the first measurement is made 37 times and performs far better.
        !          1143: Obviously something in or around that function is wrong. To eliminate other
        !          1144: functions, a look at the call graph can help. Here is the first instance of
        !          1145: `check_exec`:
        !          1146: 
        !          1147:     ...
        !          1148:     -----------------------------------------------
        !          1149:                     0.00    8.69      23/23          syscall_plain [3]
        !          1150:     [4]      5.9    0.00    8.69      23         sys_execve [4]
        !          1151:                     8.69    0.00      23/23          check_exec [5]
        !          1152:                     0.00    0.00      20/20          elf32_copyargs [67]
        !          1153:     ...
        !          1154: 
        !          1155: Notice how the time of 8.69 seems to affect the two previous functions. It is
        !          1156: possible that there is something wrong with them, however, the next instance of
        !          1157: `check_exec` seems to prove otherwise:
        !          1158: 
        !          1159:     ...
        !          1160:     -----------------------------------------------
        !          1161:                     8.69    0.00      23/23          sys_execve [4]
        !          1162:     [5]      5.9    8.69    0.00      23         check_exec [5]
        !          1163:     ...
        !          1164: 
        !          1165: Now we can see that the problem, most likely, resides in `check_exec`. Of
        !          1166: course, problems are not always this simple. In fact, here is the simplistic
        !          1167: code that was inserted right after `check_exec` (the function is in
        !          1168: `sys/kern/kern_exec.c`):
        !          1169: 
        !          1170:     ...
        !          1171:             /* A Cheap fault insertion */
        !          1172:             for (x = 0; x < 100000000; x++) {
        !          1173:                     y = x;
        !          1174:             }
        !          1175:     ...
        !          1176: 
        !          1177: Not exactly glamorous, but enough to register a large change with profiling.
        !          1178: 
        !          1179: ### Summary
        !          1180: 
        !          1181: Kernel profiling can be enlightening for anyone and provides a much more refined
        !          1182: method of hunting down performance problems that are not as easy to find using
        !          1183: conventional means. It is also not nearly as hard as most people think: if you
        !          1184: can compile a kernel, you can get profiling to work.
        !          1185: 
        !          1186: ## System Tuning
        !          1187: 
        !          1188: Now that monitoring and analysis tools have been addressed, it is time to look
        !          1189: into some actual methods. This section addresses tools and methods that affect
        !          1190: how the system performs and that can be applied without recompiling the kernel;
        !          1191: the next section examines kernel tuning by recompiling.
        !          1192: 
        !          1193: ### Using sysctl
        !          1194: 
        !          1195: The sysctl utility can be used to look at and in some cases alter system
        !          1196: parameters. There are so many parameters that can be viewed and changed that
        !          1197: they cannot all be shown here. For the first example, here is a simple usage of
        !          1198: sysctl to look at the PATH value that finds all the standard utilities:
        !          1199: 
        !          1200:     $ sysctl user.cs_path
        !          1201:     user.cs_path = /usr/bin:/bin:/usr/sbin:/sbin:/usr/pkg/bin:/usr/pkg/sbin:/usr/local/bin:/usr/local/sbin
        !          1202: 
        !          1203: Fairly simple. Now something that is actually related to performance. As an
        !          1204: example, let's say a system with many users is having file open issues. By
        !          1205: examining and perhaps raising the kern.maxfiles parameter, the problem may be
        !          1206: fixed. But first, a look:
        !          1207: 
        !          1208:     $ sysctl kern.maxfiles
        !          1209:     kern.maxfiles = 1772
        !          1210: 
        !          1211: Now, to change it, as root with the -w option specified:
        !          1212: 
        !          1213:     # sysctl -w kern.maxfiles=1972
        !          1214:     kern.maxfiles: 1772 -> 1972
        !          1215: 
        !          1216: Note that when the system is rebooted, the old value will return. There are two
        !          1217: cures for this: first, modify that parameter in the kernel and recompile; second
        !          1218: (and simpler), add this line to `/etc/sysctl.conf`:
        !          1219: 
        !          1220:     kern.maxfiles=1972
        !          1221: 
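Appending the line by hand works, but repeated edits can leave duplicate
entries behind. The sketch below replaces any existing assignment before adding
the new one; `set_sysctl_conf` is a hypothetical helper, and it is demonstrated
on a scratch file rather than on the real `/etc/sysctl.conf`.

```sh
# Set NAME=VALUE in a sysctl.conf-style file, dropping any earlier
# assignment of the same name. Comments and other lines are kept.
set_sysctl_conf() {
    conf=$1 name=$2 value=$3
    tmp=$(mktemp) || return 1
    if [ -f "$conf" ]; then
        awk -v n="$name" '{
            key = $0
            sub(/=.*/, "", key)        # text before the first "="
            gsub(/[ \t]/, "", key)     # ignore surrounding blanks
            if (key != n) print
        }' "$conf" > "$tmp"
    fi
    printf '%s=%s\n' "$name" "$value" >> "$tmp"
    cat "$tmp" > "$conf" && rm -f "$tmp"
}

# Demonstration on a scratch copy.
scratch=$(mktemp)
printf '%s\n' '# local tuning' 'kern.maxfiles=1772' > "$scratch"
set_sysctl_conf "$scratch" kern.maxfiles 1972
cat "$scratch"
rm -f "$scratch"
```
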
        !          1222: ### tmpfs & mfs
        !          1223: 
        !          1224: NetBSD's *ramdisk* implementations keep all data in RAM, and if that is
        !          1225: full, the swap space is used as backing store. NetBSD comes with two
        !          1226: implementations, the traditional BSD memory-based file system
        !          1227: [mfs](http://netbsd.gw.com/cgi-bin/man-cgi?mount_mfs+8+NetBSD-current)
        !          1228: and the more modern
        !          1229: [tmpfs](http://netbsd.gw.com/cgi-bin/man-cgi?mount_tmpfs+8+NetBSD-current).
        !          1230: While the former can only grow in size, the latter can also shrink if space is
        !          1231: no longer needed.
        !          1232: 
        !          1233: Deciding when to use a memory-based filesystem can be hard on large multi-user
        !          1234: systems. In some cases, however, it makes pretty good sense. For example, on a
        !          1235: development machine used by only one developer at a time, the obj directory
        !          1236: might be a good place, or some of the tmp directories for builds. In a case like
        !          1237: that, it makes sense on machines that have a fair amount of RAM. On the other
        !          1238: side of the coin, if a system only has 16MB of RAM and `/var/tmp` is mfs-based,
        !          1239: applications could run into severe problems.
        !          1240: 
        !          1241: The GENERIC kernel has both tmpfs and mfs enabled by default. To use one on a
        !          1242: particular directory, first determine where the swap space is that you wish to
        !          1243: use. In the example case, a quick look in `/etc/fstab` indicates that
        !          1244: `/dev/wd0b` is the swap partition:
        !          1245: 
        !          1246:     mail% cat /etc/fstab
        !          1247:     /dev/wd0a / ffs rw 1 1
        !          1248:     /dev/wd0b none swap sw 0 0
        !          1249:     /kern /kern kernfs rw
        !          1250: 
        !          1251: This system is a mail server, so I only want to use `/tmp` with tmpfs. Also, on
        !          1252: this particular system I have linked `/tmp` to `/var/tmp` to save space (they
        !          1253: are on the same drive). All I need to do is add the following entry:
        !          1254: 
        !          1255:     /dev/wd0b /var/tmp tmpfs rw 0 0
        !          1256: 
        !          1257: If you want to use mfs instead of tmpfs, replace `tmpfs` with `mfs` in that entry.
        !          1258: 
        !          1259: Now, a word of warning: make sure said directories are empty and nothing is
        !          1260: using them when you mount the memory file system! After changing `/etc/fstab`,
        !          1261: you can either run `mount -a` or reboot the system.
        !          1262: 
        !          1263: ### Soft-dependencies
        !          1264: 
        !          1265: Soft-dependencies (softdeps) is a mechanism that does not write meta-data to
        !          1266: disk immediately, but instead writes it in an ordered fashion, which keeps the
        !          1267: filesystem consistent in case of a crash. The main benefit of softdeps is
        !          1268: processing speed. Soft-dependencies have some sharp edges, so beware! Also note
        !          1269: that soft-dependencies will not be present in any releases past 5.x. See
        !          1270: [[Journaling|guide/tuning#system-logging]] for information about WAPBL, which is
        !          1271: the replacement for soft-dependencies.
        !          1272: 
        !          1273: Soft-dependencies can be enabled by adding `softdep` to the filesystem options
        !          1274: in `/etc/fstab`. Let's look at an example of `/etc/fstab`:
        !          1275: 
        !          1276:     /dev/wd0a / ffs rw 1 1
        !          1277:     /dev/wd0b none swap sw 0 0
        !          1278:     /dev/wd0e /var ffs rw 1 2
        !          1279:     /dev/wd0f /tmp ffs rw 1 2
        !          1280:     /dev/wd0g /usr ffs rw 1 2
        !          1281: 
        !          1282: Suppose we want to enable soft-dependencies for all file systems, except for the
        !          1283: `/` partition. We would change it to (note the added `softdep` option):
        !          1284: 
        !          1285:     /dev/wd0a / ffs rw 1 1
        !          1286:     /dev/wd0b none swap sw 0 0
        !          1287:     /dev/wd0e /var ffs rw,softdep 1 2
        !          1288:     /dev/wd0f /tmp ffs rw,softdep 1 2
        !          1289:     /dev/wd0g /usr ffs rw,softdep 1 2
        !          1290: 
        !          1291: More information about softdep capabilities can be found on the
        !          1292: [author's page](http://www.mckusick.com/softdep/index.html).
        !          1293: 
        !          1294: ### Journaling
        !          1295: 
        !          1296: Journaling is a mechanism which puts written data in a so-called *journal*
        !          1297: first, and in a second step the data from the journal is written to disk. In the
        !          1298: event of a system crash, data that was not written to disk but that is in the
        !          1299: journal can be replayed, and will thus get the disk into a proper state. The
        !          1300: main effect of this is that no file system check (fsck) is needed after a rough
        !          1301: reboot. As of 5.0, NetBSD includes WAPBL, which provides journaling for FFS.
        !          1302: 
        !          1303: Journaling can be enabled by adding `log` to the filesystem options in
        !          1304: `/etc/fstab`. Here is an example which enables journaling for the root (`/`),
        !          1305: `/var`, and `/usr` file systems:
        !          1306: 
        !          1307:     /dev/wd0a /    ffs rw,log 1 1
        !          1308:     /dev/wd0e /var ffs rw,log 1 2
        !          1309:     /dev/wd0g /usr ffs rw,log 1 2
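
To confirm which entries actually carry the `log` option after editing, the file can be scanned with awk. This is a sketch under assumptions: `fstab_log_report` is a hypothetical helper, and the sample file below is illustrative (it adds a `/tmp` entry without `log` to show both outcomes).

```sh
#!/bin/sh
# fstab_log_report is a hypothetical helper: for every ffs entry in an
# fstab-style file it prints the mount point and whether "log" is set.
fstab_log_report() {
    awk '
        /^[ \t]*#/ { next }                 # skip comment lines
        $3 == "ffs" {
            n = split($4, opts, ",")        # options are comma separated
            state = "no log"
            for (i = 1; i <= n; i++)
                if (opts[i] == "log") state = "log"
            print $2 ": " state
        }
    ' "$1"
}

# Demonstration on a sample file instead of the real /etc/fstab.
sample=$(mktemp)
cat > "$sample" <<'EOF'
/dev/wd0a /    ffs rw,log 1 1
/dev/wd0b none swap sw 0 0
/dev/wd0e /var ffs rw,log 1 2
/dev/wd0f /tmp ffs rw 1 2
EOF
fstab_log_report "$sample"
rm -f "$sample"
```

On a live system the helper would be pointed at `/etc/fstab`. It only reads the file, so it is safe to run before `mount -a` or a reboot.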
        !          1310: 
        !          1311: ### LFS
        !          1312: 
        !          1313: LFS, the log structured filesystem, writes data to disk in a way that is
        !          1314: sometimes too aggressive and leads to congestion. The following sysctls can be
        !          1315: used to throttle writing and, for the `vfs.lfs.stats` nodes, to observe it:
        !          1316: 
        !          1317:     vfs.sync.delay
        !          1318:     vfs.sync.filedelay
        !          1319:     vfs.sync.dirdelay
        !          1320:     vfs.sync.metadelay
        !          1321:     vfs.lfs.flushindir
        !          1322:     vfs.lfs.clean_vnhead
        !          1323:     vfs.lfs.dostats
        !          1324:     vfs.lfs.pagetrip
        !          1325:     vfs.lfs.stats.segsused
        !          1326:     vfs.lfs.stats.psegwrites
        !          1327:     vfs.lfs.stats.psyncwrites
        !          1328:     vfs.lfs.stats.pcleanwrites
        !          1329:     vfs.lfs.stats.blocktot
        !          1330:     vfs.lfs.stats.cleanblocks
        !          1331:     vfs.lfs.stats.ncheckpoints
        !          1332:     vfs.lfs.stats.nwrites
        !          1333:     vfs.lfs.stats.nsync_writes
        !          1334:     vfs.lfs.stats.wait_exceeded
        !          1335:     vfs.lfs.stats.write_exceeded
        !          1336:     vfs.lfs.stats.flush_invoked
        !          1337:     vfs.lfs.stats.vflush_invoked
        !          1338:     vfs.lfs.stats.clean_inlocked
        !          1339:     vfs.lfs.stats.clean_vnlocked
        !          1340:     vfs.lfs.stats.segs_reclaimed
        !          1341:     vfs.lfs.ignore_lazy_sync
        !          1342: 
        !          1343: Besides tuning those parameters, disabling write-back caching on
        !          1344: [wd(4)](http://netbsd.gw.com/cgi-bin/man-cgi?wd+4+NetBSD-5.0.1+i386) devices may
        !          1345: be beneficial. See the
        !          1346: [dkctl(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dkctl+8+NetBSD-5.0.1+i386) man
        !          1347: page for details.
        !          1348: 
        !          1349: More is available in the NetBSD mailing list archives. See
        !          1350: [this](http://mail-index.NetBSD.org/tech-perform/2007/04/01/0000.html) and
        !          1351: [this](http://mail-index.NetBSD.org/tech-perform/2007/04/01/0001.html) mail.
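
Before changing any of the sysctls listed above, it helps to record their current values. The sketch below assumes nothing about suitable settings. `show_sysctls` is a hypothetical helper, and it prints `unavailable` for nodes that do not exist on the running system (for instance when LFS is not in the kernel).

```sh
#!/bin/sh
# show_sysctls is a hypothetical helper: it prints "name = value" for each
# sysctl node given, or "name = unavailable" if the node cannot be read.
show_sysctls() {
    for name in "$@"; do
        value=$(sysctl -n "$name" 2>/dev/null) || value="unavailable"
        echo "$name = $value"
    done
}

show_sysctls vfs.sync.delay vfs.sync.filedelay \
             vfs.sync.dirdelay vfs.sync.metadelay
```

A value can then be changed at run time with `sysctl -w name=value` and made persistent in `/etc/sysctl.conf`. Which values help is workload dependent, so none are suggested here.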
        !          1352: 
        !          1353: ## Kernel Tuning
        !          1354: 
        !          1355: Many system parameters can be changed with sysctl, and further improvements
        !          1356: can be achieved through enhanced system software, the layout of the system and
        !          1357: the management of services (moving them in and out of inetd, for example).
        !          1358: Tuning the kernel itself, however, will provide better performance, even if the
        !          1359: gain appears to be marginal.
        !          1360: 
        !          1361: ### Preparing to Recompile a Kernel
        !          1362: 
        !          1363: First, get the kernel sources for the release as described in
        !          1364: [[Obtaining the sources|guide/fetch]]. Reading
        !          1365: [[Compiling the kernel|guide/kernel]] for more information on building the
        !          1366: kernel is recommended. This document can also be used for tuning -current;
        !          1367: in that case, read the
        !          1368: [[Tracking -current|tracking_current]] documentation first, as much of the
        !          1369: information there is repeated here.
        !          1370: 
        !          1371: ### Configuring the Kernel
        !          1372: 
        !          1373: Configuring a kernel in NetBSD can be daunting because of the multiple line
        !          1374: dependencies within the configuration file itself. There is a benefit to this
        !          1375: method, however: all it really takes to get a new kernel configured is an
        !          1376: ASCII editor and some dmesg output. The kernel configuration file is under
        !          1377: `src/sys/arch/ARCH/conf` where ARCH is your architecture (for example, on a
        !          1378: SPARC it would be under `src/sys/arch/sparc/conf`).
        !          1379: 
        !          1380: After you have located your kernel config file, copy it and remove (comment out)
        !          1381: all the entries you don't need. This is where
        !          1382: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
        !          1383: becomes your friend. A clean
        !          1384: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)-output
        !          1385: will show all of the devices detected by the kernel at boot time. Using
        !          1386: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
        !          1387: output, the device options really needed can be determined.
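
The list of attached drivers can be pulled out of the dmesg output mechanically. The following is a rough sketch: `attached_drivers` is a hypothetical helper, it relies on autoconfiguration lines having the form `name at parent ...`, and stripping trailing digits to get the driver name is a heuristic that may misfire on drivers whose names end in a digit.

```sh
#!/bin/sh
# attached_drivers is a hypothetical helper: it reads dmesg-style text on
# stdin and prints the sorted, de-duplicated driver names of all lines of
# the form "<name><unit> at <parent> ...".
attached_drivers() {
    awk '$2 == "at" { sub(/[0-9]+$/, "", $1); print $1 }' | sort -u
}

# Demonstration on a few sample lines; on a live system the input would
# come from "dmesg | attached_drivers".
attached_drivers <<'EOF'
cd0 at atapibus0 drive 0: <CD-540E, , 1.0A> type 5 cdrom removable
cd0: 32-bit data port
ex0 at pci0 dev 17 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x64)
ex1 at pci0 dev 19 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
exphy0 at ex1 phy 24: 3Com internal media interface
EOF
```

For the sample input this prints `cd`, `ex` and `exphy`, one per line. The result is a starting point for deciding which device lines to keep, not a replacement for reading the dmesg output.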
        !          1388: 
        !          1389: #### Some example Configuration Items
        !          1390: 
        !          1391: In this example, an ftp server's kernel is being reconfigured to run with the
        !          1392: bare minimum drivers and options and any other items that might make it run
        !          1393: faster (again, not necessarily smaller, although it will be). The first thing to
        !          1394: do is take a look at some of the main configuration items. So, in
        !          1395: `/usr/src/sys/arch/i386/conf` the GENERIC file is copied to FTP, and the file
        !          1396: FTP is then edited.
        !          1397: 
        !          1398: At the start of the file there are a bunch of options beginning with maxusers,
        !          1399: which will be left alone; on larger multi-user systems, however, it might help
        !          1400: to crank that value up a bit. Next is CPU support. Looking at the dmesg output,
        !          1401: this is seen:
        !          1402: 
        !          1403:     cpu0: Intel Pentium II/Celeron (Deschutes) (686-class), 400.93 MHz
        !          1404: 
        !          1405: This indicates that only the `I686_CPU` option needs to be used. In the next
        !          1406: section, all options are left alone except `PIC_DELAY`, which is recommended
        !          1407: unless it is an older machine. In this case it is enabled since the 686 is
        !          1408: *relatively new*.
        !          1409: 
        !          1410: Between the last section all the way down to compat options there really was no
        !          1411: need to change anything on this particular system. In the compat section,
        !          1412: however, there are several options that do not need to be enabled. Because
        !          1413: this machine is strictly an FTP server, all compat options were turned
        !          1414: off.
        !          1415: 
        !          1416: The next section is File systems. Again, for this server very few need to be
        !          1417: on. The following were left on:
        !          1418: 
        !          1419:     # File systems
        !          1420:     file-system     FFS             # UFS
        !          1421:     file-system     LFS             # log-structured file system
        !          1422:     file-system     MFS             # memory file system
        !          1423:     file-system     CD9660          # ISO 9660 + Rock Ridge file system
        !          1424:     file-system     FDESC           # /dev/fd
        !          1425:     file-system     KERNFS          # /kern
        !          1426:     file-system     NULLFS          # loopback file system
        !          1427:     file-system     PROCFS          # /proc
        !          1428:     file-system     UMAPFS          # NULLFS + uid and gid remapping
        !          1429:     ...
        !          1430:     options         SOFTDEP         # FFS soft updates support.
        !          1431:     ...
        !          1432: 
        !          1433: Next comes the network options section. The only options left on were:
        !          1434: 
        !          1435:     options         INET            # IP + ICMP + TCP + UDP
        !          1436:     options         INET6           # IPV6
        !          1437:     options         IPFILTER_LOG    # ipmon(8) log support
        !          1438: 
        !          1439: `IPFILTER_LOG` is a nice one to have around since the server will be running
        !          1440: ipf.
        !          1441: 
        !          1442: The next section is verbose messages for various subsystems. Since this machine
        !          1443: is already running and had no major problems, all of them are commented out.
        !          1444: 
        !          1445: #### Some Drivers
        !          1446: 
        !          1447: The configurable items in the config file are relatively few and easy to cover;
        !          1448: device drivers, however, are a different story. In the following examples, two
        !          1449: drivers are examined and their associated *areas* in the file trimmed down.
        !          1450: First, a small example: the cdrom, which appears in dmesg as the following lines:
        !          1451: 
        !          1452:     ...
        !          1453:     cd0 at atapibus0 drive 0: <CD-540E, , 1.0A> type 5 cdrom removable
        !          1454:     cd0: 32-bit data port
        !          1455:     cd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 2
        !          1456:     pciide0: secondary channel interrupting at irq 15
        !          1457:     cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (using DMA data transfer
        !          1458:     ...
        !          1459: 
        !          1460: Now, it is time to track that section down in the configuration file. Notice
        !          1461: that the `cd`-drive is on an atapibus and requires pciide support. The section
        !          1462: that is of interest in this case is the kernel config's "IDE and related
        !          1463: devices" section. It is worth noting at this point that in and around the IDE
        !          1464: section are also ISA, PCMCIA etc. On this machine, the
        !          1465: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
        !          1466: output shows no PCMCIA devices, so it stands to reason that all PCMCIA
        !          1467: references can be removed. But first, the `cd` drive.
        !          1468: 
        !          1469: At the start of the IDE section is the following:
        !          1470: 
        !          1471:     ...
        !          1472:     wd*     at atabus? drive ? flags 0x0000
        !          1473:     ...
        !          1474:     atapibus* at atapi?
        !          1475:     ...
        !          1476: 
        !          1477: Well, it is pretty obvious that those lines need to be kept. Next is this:
        !          1478: 
        !          1479:     ...
        !          1480:     # ATAPI devices
        !          1481:     # flags have the same meaning as for IDE drives.
        !          1482:     cd*     at atapibus? drive ? flags 0x0000       # ATAPI CD-ROM drives
        !          1483:     sd*     at atapibus? drive ? flags 0x0000       # ATAPI disk drives
        !          1484:     st*     at atapibus? drive ? flags 0x0000       # ATAPI tape drives
        !          1485:     uk*     at atapibus? drive ? flags 0x0000       # ATAPI unknown
        !          1486:     ...
        !          1487: 
        !          1488: The only device type that was in the
        !          1489: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
        !          1490: output was the cd, so the rest can be commented out.
        !          1491: 
        !          1492: The next example is slightly more difficult: network interfaces. This machine
        !          1493: has two of them:
        !          1494: 
        !          1495:     ...
        !          1496:     ex0 at pci0 dev 17 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x64)
        !          1497:     ex0: interrupting at irq 10
        !          1498:     ex0: MAC address 00:50:04:83:ff:b7
        !          1499:     UI 0x001018 model 0x0012 rev 0 at ex0 phy 24 not configured
        !          1500:     ex1 at pci0 dev 19 function 0: 3Com 3c905B-TX 10/100 Ethernet (rev. 0x30)
        !          1501:     ex1: interrupting at irq 11
        !          1502:     ex1: MAC address 00:50:da:63:91:2e
        !          1503:     exphy0 at ex1 phy 24: 3Com internal media interface
        !          1504:     exphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto
        !          1505:     ...
        !          1506: 
        !          1507: At first glance it may appear that there are in fact three devices. However, a
        !          1508: closer look at this line:
        !          1509: 
        !          1510:     exphy0 at ex1 phy 24: 3Com internal media interface
        !          1511: 
        !          1512: reveals that there are only two physical cards. As with the cdrom, simply
        !          1513: removing names that are not in dmesg will do the job. In the beginning of the
        !          1514: network interfaces section is:
        !          1515: 
        !          1516:     ...
        !          1517:     # Network Interfaces
        !          1518:     
        !          1519:     # PCI network interfaces
        !          1520:     an*     at pci? dev ? function ?        # Aironet PC4500/PC4800 (802.11)
        !          1521:     bge*    at pci? dev ? function ?        # Broadcom 570x gigabit Ethernet
        !          1522:     en*     at pci? dev ? function ?        # ENI/Adaptec ATM
        !          1523:     ep*     at pci? dev ? function ?        # 3Com 3c59x
        !          1524:     epic*   at pci? dev ? function ?        # SMC EPIC/100 Ethernet
        !          1525:     esh*    at pci? dev ? function ?        # Essential HIPPI card
        !          1526:     ex*     at pci? dev ? function ?        # 3Com 90x[BC]
        !          1527:     ...
        !          1528: 
        !          1529: There is the ex device. So all of the rest under the PCI section can be removed.
        !          1530: Additionally, every single line all the way down to this one:
        !          1531: 
        !          1532:     exphy*  at mii? phy ?                   # 3Com internal PHYs
        !          1533: 
        !          1534: can be commented out, as can the remaining lines after it.
        !          1535: 
        !          1536: #### Multi Pass
        !          1537: 
        !          1538: When I tune a kernel, I like to do it remotely in an X windows session, with
        !          1539: the dmesg output in one window and the config file in the other. It can
        !          1540: sometimes take a few passes to rebuild a very trimmed kernel since it is easy to
        !          1541: accidentally remove dependencies.
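
One way to catch an accidentally removed dependency before running config is to check that every driver seen in dmesg still has an active line in the trimmed file. This is only a sketch: `check_drivers` is a hypothetical helper, and it looks only at device lines that start with the driver name followed by `*` or a unit number.

```sh
#!/bin/sh
# check_drivers is a hypothetical helper: given a kernel config file and a
# list of driver names, it reports whether each driver still has an
# uncommented device line such as "ex*  at pci? ..." or "ex0 at ...".
check_drivers() {
    config_file=$1
    shift
    for drv in "$@"; do
        if grep -Eq "^${drv}(\*|[0-9]+)[[:space:]]" "$config_file"; then
            echo "$drv: present"
        else
            echo "$drv: MISSING"
        fi
    done
}

# Demonstration: exphy was commented out by mistake in this sample.
sample=$(mktemp)
cat > "$sample" <<'EOF'
#ep*     at pci? dev ? function ?        # 3Com 3c59x
ex*     at pci? dev ? function ?        # 3Com 90x[BC]
#exphy*  at mii? phy ?                   # 3Com internal PHYs
cd*     at atapibus? drive ? flags 0x0000       # ATAPI CD-ROM drives
EOF
check_drivers "$sample" cd ex exphy
rm -f "$sample"
```

The sample run reports `cd` and `ex` as present and `exphy` as missing. It does not understand attachment dependencies between buses, so a clean report still needs to be confirmed by an actual build.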
        !          1542: 
        !          1543: ### Building the New Kernel
        !          1544: 
        !          1545: Now it is time to build the kernel and put it in place. In the conf directory on
        !          1546: the ftp server, the following command prepares the build:
        !          1547: 
        !          1548:     $ config FTP
        !          1549: 
        !          1550: When it is done, a message reminding me to run make depend is displayed. Next:
        !          1551: 
        !          1552:     $ cd ../compile/FTP
        !          1553:     $ make depend && make
        !          1554: 
        !          1555: When it is done, I back up the old kernel and drop the new one in place:
        !          1556: 
        !          1557:     # cp /netbsd /netbsd.orig
        !          1558:     # cp netbsd /
        !          1559: 
        !          1560: Now reboot. If the kernel cannot boot, stop the boot process when prompted and
        !          1561: type `boot netbsd.orig` to boot from the previous kernel.
        !          1562: 
        !          1563: ### Shrinking the NetBSD kernel
        !          1564: 
        !          1565: When building a kernel for embedded systems, it's often necessary to modify the
        !          1566: kernel binary to reduce its disk space or memory footprint.
        !          1567: 
        !          1568: #### Removing ELF sections and debug information
        !          1569: 
        !          1570: We already know how to remove kernel support for drivers and options that you
        !          1571: don't need, thus saving memory and space, but you can save some kilobytes of
        !          1572: space by removing debugging symbols and two ELF sections if you don't need them:
        !          1573: `.comment` and `.ident`. They are used for storing RCS strings viewable with
        !          1574: [ident(1)](http://netbsd.gw.com/cgi-bin/man-cgi?ident+1+NetBSD-5.0.1+i386) and a
        !          1575: [gcc(1)](http://netbsd.gw.com/cgi-bin/man-cgi?gcc+1+NetBSD-5.0.1+i386) version
        !          1576: string. The following examples assume you have your `TOOLDIR` under
        !          1577: `/usr/src/tooldir.NetBSD-2.0-i386` and the target architecture is `i386`.
        !          1578: 
        !          1579:     $ /usr/src/tooldir.NetBSD-2.0-i386/bin/i386--netbsdelf-objdump -h /netbsd
        !          1580:     
        !          1581:     /netbsd:     file format elf32-i386
        !          1582:     
        !          1583:     Sections:
        !          1584:     Idx Name          Size      VMA       LMA       File off  Algn
        !          1585:       0 .text         0057a374  c0100000  c0100000  00001000  2**4
        !          1586:                       CONTENTS, ALLOC, LOAD, READONLY, CODE
        !          1587:       1 .rodata       00131433  c067a380  c067a380  0057b380  2**5
        !          1588:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1589:       2 .rodata.str1.1 00035ea0  c07ab7b3  c07ab7b3  006ac7b3  2**0
        !          1590:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1591:       3 .rodata.str1.32 00059d13  c07e1660  c07e1660  006e2660  2**5
        !          1592:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1593:       4 link_set_malloc_types 00000198  c083b374  c083b374  0073c374  2**2
        !          1594:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1595:       5 link_set_domains 00000024  c083b50c  c083b50c  0073c50c  2**2
        !          1596:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1597:       6 link_set_pools 00000158  c083b530  c083b530  0073c530  2**2
        !          1598:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1599:       7 link_set_sysctl_funcs 000000f0  c083b688  c083b688  0073c688  2**2
        !          1600:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1601:       8 link_set_vfsops 00000044  c083b778  c083b778  0073c778  2**2
        !          1602:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1603:       9 link_set_dkwedge_methods 00000004  c083b7bc  c083b7bc  0073c7bc  2**2
        !          1604:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1605:      10 link_set_bufq_strats 0000000c  c083b7c0  c083b7c0  0073c7c0  2**2
        !          1606:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1607:      11 link_set_evcnts 00000030  c083b7cc  c083b7cc  0073c7cc  2**2
        !          1608:                       CONTENTS, ALLOC, LOAD, READONLY, DATA
        !          1609:      12 .data         00048ae4  c083c800  c083c800  0073c800  2**5
        !          1610:                       CONTENTS, ALLOC, LOAD, DATA
        !          1611:      13 .bss          00058974  c0885300  c0885300  00785300  2**5
        !          1612:                       ALLOC
        !          1613:      14 .comment      0000cda0  00000000  00000000  00785300  2**0
        !          1614:                       CONTENTS, READONLY
        !          1615:      15 .ident        000119e4  00000000  00000000  007920a0  2**0
        !          1616:                       CONTENTS, READONLY
        !          1617: 
        !          1618: In the third column we can see the size of the sections in hexadecimal form. By
        !          1619: summing `.comment` and `.ident` sizes we know how much we're going to save with
        !          1620: their removal: around 120KB (0xcda0 + 0x119e4 = 52640 + 72164 bytes). To remove the
        !          1621: sections and debugging symbols that may be present, we're going to use
        !          1622: [strip(1)](http://netbsd.gw.com/cgi-bin/man-cgi?strip+1+NetBSD-5.0.1+i386):
        !          1623: 
        !          1624:     # cp /netbsd /netbsd.orig
        !          1625:     # /usr/src/tooldir.NetBSD-2.0-i386/bin/i386--netbsdelf-strip -S -R .ident -R .comment /netbsd
        !          1626:     # ls -l /netbsd /netbsd.orig
        !          1627:     -rwxr-xr-x  1 root  wheel  8590668 Apr 30 15:56 netbsd
        !          1628:     -rwxr-xr-x  1 root  wheel  8757547 Apr 30 15:56 netbsd.orig
        !          1629: 
        !          1630: Since we also removed debugging symbols, the total amount of disk space saved is
        !          1631: around 160KB.
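
The figures quoted in this section can be rechecked with shell arithmetic. The numbers below are copied from the objdump and ls listings above. Attributing the remainder to debugging symbols is an assumption, since other bookkeeping in the file may change as well.

```sh
#!/bin/sh
# Sizes taken from the listings in this section.
sections=$((0xcda0 + 0x119e4))     # .comment + .ident, in bytes
total=$((8757547 - 8590668))       # netbsd.orig minus stripped netbsd

echo "sections:  $sections bytes (about $(($sections / 1024)) KiB)"
echo "total:     $total bytes (about $(($total / 1024)) KiB)"
echo "remainder: $(($total - $sections)) bytes"
```

The two sections account for 124804 bytes of the 166879 bytes saved in total.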
        !          1632: 
        !          1633: #### Compressing the Kernel
        !          1634: 
        !          1635: On some architectures, the bootloader can boot a compressed kernel. You can save
        !          1636: several megabytes of disk space by using this method, but the bootloader will
        !          1637: take longer to load the kernel.
        !          1638: 
        !          1639:     # cp /netbsd /netbsd.plain
        !          1640:     # gzip -9 /netbsd
        !          1641: 
        !          1642: To see how much space we've saved:
        !          1643: 
        !          1644:     $ ls -l /netbsd.plain /netbsd.gz
        !          1645:     -rwxr-xr-x  1 root  wheel  8757547 Apr 29 18:05 /netbsd.plain
        !          1646:     -rwxr-xr-x  1 root  wheel  3987769 Apr 29 18:05 /netbsd.gz
        !          1647: 
        !          1648: Note that you can only use gzip compression, as produced by
        !          1649: [gzip(1)](http://netbsd.gw.com/cgi-bin/man-cgi?gzip+1+NetBSD-5.0.1+i386); bzip2
        !          1650: is not supported by the NetBSD bootloaders!
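
To estimate the saving before replacing anything, the kernel can be compressed to a pipe instead of in place. This is a sketch only: `gzip_saving` is a hypothetical helper, and the second command simply recomputes the saving from the ls listing above.

```sh
#!/bin/sh
# gzip_saving is a hypothetical helper: it prints how many bytes gzip -9
# would save on the given file, without modifying the file itself.
gzip_saving() {
    plain=$(wc -c < "$1")
    packed=$(gzip -9 -c "$1" | wc -c)
    echo $(($plain - $packed))
}

# Saving for the listing above: /netbsd.plain minus /netbsd.gz, in bytes.
echo $((8757547 - 3987769))
```

On a live system the helper would be called as `gzip_saving /netbsd`. Running `gzip -t /netbsd.gz` afterwards verifies the integrity of the compressed file before rebooting.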
        !          1651: 