Annotation of wikisrc/guide/raidframe.mdwn, revision 1.4
1.1 jdf 1: # NetBSD RAIDframe
2:
3: ## RAIDframe Introduction
4:
5: ### About RAIDframe
6:
7: NetBSD uses the [CMU RAIDframe](http://www.pdl.cmu.edu/RAIDframe/) software for
8: its RAID subsystem. NetBSD is the primary platform for RAIDframe development.
9: RAIDframe can also be found in older versions of FreeBSD and OpenBSD. NetBSD
10: also has another way of bundling disks, the
11: [ccd(4)](http://netbsd.gw.com/cgi-bin/man-cgi?ccd+4+NetBSD-5.0.1+i386) subsystem
1.3 jdf 12: (see [Concatenated Disk Device](/guide/ccd)). You should possess some [basic
1.1 jdf 13: knowledge](http://www.acnc.com/04_00.html) about RAID concepts and terminology
14: before continuing. You should also be at least familiar with the different
15: levels of RAID - Adaptec provides an [excellent
16: reference](http://www.adaptec.com/en-US/_common/compatibility/_education/RAID_level_compar_wp.htm),
17: and the [raid(4)](http://netbsd.gw.com/cgi-bin/man-cgi?raid+4+NetBSD-5.0.1+i386)
18: manpage contains a short overview too.
19:
20: ### A warning about Data Integrity, Backups, and High Availability
21:
22: RAIDframe is a Software RAID implementation, as opposed to Hardware RAID. As
23: such, it does not need special disk controllers supported by NetBSD. System
24: administrators should give a great deal of consideration to whether software
25: RAID or hardware RAID is more appropriate for their "Mission Critical"
26: applications. For some projects you might consider the use of many of the
27: hardware RAID devices [supported by
28: NetBSD](http://www.NetBSD.org/support/hardware/). It is truly at your discretion
29: what type of RAID you use, but it is recommended that you consider factors such
30: as manageability, commercial vendor support, load-balancing, and failover.
31:
32: Depending on the RAID level used, RAIDframe does provide redundancy in the event
33: of a hardware failure. However, it is *not* a replacement for reliable backups!
34: Software and user-error can still cause data loss. RAIDframe may be used as a
35: mechanism for facilitating backups in systems without backup hardware, but this
36: is not an ideal configuration. Finally, with regard to "high availability", RAID
37: is only a very small component of ensuring data availability.
38:
39: Once more for good measure: *Back up your data!*
40:
41: ### Hardware versus Software RAID
42:
43: If you run a server, it will most probably already have a Hardware RAID
44: controller. There are reasons for and against using a Software RAID, depending
45: on the scenario.
46:
47: In general, a Software RAID is well suited for low-IO system disks. If you run a
48: Software RAID, you can exchange disks and disk controllers, or even move the
49: disks to a completely different machine. The computational overhead for the RAID
50: is negligible if there are only a few disk IO operations.
51:
52: If you need high IO throughput, you should use a Hardware RAID. With a Software RAID, the
53: redundancy data has to be transferred via the bus your disk controller is
54: connected to. With a Hardware RAID, you transfer data only once - the redundancy
55: computation and transfer are done by the controller.
56:
57: ### Getting Help
58:
59: If you encounter problems using RAIDframe, you have several options for
60: obtaining help.
61:
62: 1. Read the RAIDframe man pages:
63: [raid(4)](http://netbsd.gw.com/cgi-bin/man-cgi?raid+4+NetBSD-5.0.1+i386) and
64: [raidctl(8)](http://netbsd.gw.com/cgi-bin/man-cgi?raidctl+8+NetBSD-5.0.1+i386)
65: thoroughly.
66:
67: 2. Search the mailing list archives. Unfortunately, there is no NetBSD list
68: dedicated to RAIDframe support. Depending on the nature of the problem, posts
69: tend to end up in a variety of lists. At a very minimum, search
70: [netbsd-help](http://mail-index.NetBSD.org/netbsd-help/),
71: [netbsd-users@NetBSD.org](http://mail-index.NetBSD.org/netbsd-users/),
72: [current-users@NetBSD.org](http://mail-index.NetBSD.org/current-users/). Also
73: search the list for the NetBSD platform on which you are using RAIDframe:
74: port-*`${ARCH}`*@NetBSD.org.
75:
76: ### Caution
77:
78: Because RAIDframe is constantly undergoing development, some information in
79: mailing list archives has the potential of being dated and inaccurate.
80:
81: 3. Search the [Problem Report
82: database](http://www.NetBSD.org/support/send-pr.html).
83:
84: 4. If your problem persists: Post to the mailing list most appropriate
85: (judgment call). Collect as much verbosely detailed information as possible
86: before posting: Include your
87: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
88: output from `/var/run/dmesg.boot`, your kernel
89: [config(5)](http://netbsd.gw.com/cgi-bin/man-cgi?config+5+NetBSD-5.0.1+i386),
90: your `/etc/raid[0-9].conf`, any relevant errors on `/dev/console`,
91: `/var/log/messages`, or to `stdout/stderr` of
92: [raidctl(8)](http://netbsd.gw.com/cgi-bin/man-cgi?raidctl+8+NetBSD-5.0.1+i386).
93: The output of **raidctl -s** (if available) will be useful as well. Also
94: include details on the troubleshooting steps you've taken thus far, exactly
95: when the problem started, and any notes on recent changes that may have
96: prompted the problem to develop. Remember to be patient when waiting for a
97: response.
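
The items listed in step 4 can be gathered into a single file before posting. The following is only a minimal sketch: the function name `collect_raid_report`, the default output path, and the choice of `raid0` as the device are assumptions made for illustration, and missing files or commands are reported rather than treated as errors.

```shell
# collect_raid_report [OUTPUT]: gather RAIDframe troubleshooting data.
# (Hypothetical helper; adjust the device name and paths to your system.)
collect_raid_report() {
    out=${1:-/tmp/raidframe-report.txt}
    {
        echo "== dmesg.boot =="
        if [ -r /var/run/dmesg.boot ]; then
            cat /var/run/dmesg.boot
        else
            echo "(not readable)"
        fi

        echo "== RAID configuration files =="
        for conf in /etc/raid[0-9].conf; do
            # The glob stays literal when nothing matches; skip it then.
            [ -r "$conf" ] || continue
            echo "-- $conf --"
            cat "$conf"
        done

        echo "== raidctl -s =="
        if command -v raidctl >/dev/null 2>&1; then
            raidctl -s raid0 2>&1
        else
            echo "(raidctl not found)"
        fi
    } > "$out"
    # Print the report location so it can be attached to the post.
    echo "$out"
}
```

Calling `collect_raid_report` with no argument writes to the assumed default path. Your kernel configuration and the relevant lines from `/var/log/messages` still have to be added by hand.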
98:
99: ## Setup RAIDframe Support
100:
101: The use of RAID will require software and hardware configuration changes.
102:
103: ### Kernel Support
104:
105: The GENERIC kernel already has support for RAIDframe. If you have built a custom
106: kernel for your environment the kernel configuration must have the following
107: options:
108:
109: pseudo-device raid 8 # RAIDframe disk driver
110: options RAID_AUTOCONFIG # auto-configuration of RAID components
111:
112: The RAID support must be detected by the NetBSD kernel, which can be checked by
113: looking at the output of the
114: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386)
115: command.
116:
117: # dmesg|grep -i raid
118: Kernelized RAIDframe activated
119:
120: Historically, the kernel had to contain static mappings between bus addresses
121: and device nodes in `/dev`. This was used to ensure consistency of devices within
122: RAID sets in the event of a device failure after reboot. Since NetBSD 1.6,
123: however, using the auto-configuration features of RAIDframe has been recommended
124: over statically mapping devices. The auto-configuration features allow drives to
125: move around on the system, and RAIDframe will automatically determine which
126: components belong to which RAID sets.
127:
128: ### Power Redundancy and Disk Caching
129:
130: If your system has an Uninterruptible Power Supply (UPS) or
131: redundant power supplies, or your disk controller has a battery, you should
132: consider enabling the read and write caches on your drives. On systems with
133: redundant power, this will improve drive performance. On systems without
134: redundant power, the write cache could endanger the integrity of RAID data in
135: the event of a power loss.
136:
137: The [dkctl(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dkctl+8+NetBSD-5.0.1+i386)
138: utility can be used for this on all kinds of disks that support the operation
139: (SCSI, EIDE, SATA, ...):
140:
141: # dkctl wd0 getcache
142: /dev/rwd0d: read cache enabled
143: /dev/rwd0d: read cache enable is not changeable
144: /dev/rwd0d: write cache enable is changeable
145: /dev/rwd0d: cache parameters are not savable
146: # dkctl wd0 setcache rw
147: # dkctl wd0 getcache
148: /dev/rwd0d: read cache enabled
149: /dev/rwd0d: write-back cache enabled
150: /dev/rwd0d: read cache enable is not changeable
151: /dev/rwd0d: write cache enable is changeable
152: /dev/rwd0d: cache parameters are not savable
153:
154: ## Example: RAID-1 Root Disk
155:
156: This example explains how to set up a RAID-1 root disk. With RAID-1, components are
157: mirrored and therefore the server can be fully functional in the event of a
158: single component failure. The goal is to provide a level of redundancy that will
159: allow the system to encounter a component failure on either component disk in
160: the RAID and:
161:
162: * Continue normal operations until a maintenance window can be scheduled.
163: * Or, in the unlikely event that the component failure causes a system reboot,
164: be able to quickly reconfigure the system to boot from the remaining
165: component (platform dependent).
166:
1.4 ! jdf 167: 
1.1 jdf 168: **RAID-1 Disk Logical Layout**
169:
170: Because RAID-1 provides both redundancy and performance improvements, its most
171: practical application is on critical "system" partitions such as `/`, `/usr`,
172: `/var`, `swap`, etc., where read operations are more frequent than write
173: operations. For other file systems, such as `/home` or `/var/`, other RAID
174: levels might be considered (see the references above). If one were simply
175: creating a generic RAID-1 volume for a non-root file system, the cookie-cutter
176: examples from the man page could be followed, but because the root volume must
177: be bootable, certain special steps must be taken during initial setup.
178:
179: *Note*: This example will outline a process that differs only slightly between
180: the i386 and sparc64 platforms. In an attempt to reduce excessive duplication of
181: content, where differences do exist and are cosmetic in nature, they will be
182: pointed out using a section such as this. If the process is drastically
183: different, the process will branch into separate, platform dependent steps.
184:
185: ### Pseudo-Process Outline
186:
187: Although a much more refined process could be developed using a custom copy of
188: NetBSD installed on custom-developed removable media, presently the NetBSD
189: install media lacks RAIDframe tools and support, so the following pseudo process
190: has become the de facto standard for setting up RAID-1 Root.
191:
192: 1. Install a stock NetBSD onto Disk0 of your system.
193:
1.4 ! jdf 194: 
1.1 jdf 195: **Perform generic install onto Disk0/wd0**
196:
197: 2. Use the installed system on Disk0/wd0 to set up a RAID Set composed of
198: Disk1/wd1 only.
199:
1.4 ! jdf 200: 
1.1 jdf 201: **Setup RAID Set**
202:
203: 3. Reboot the system off Disk1/wd1 with the newly created RAID volume.
204:
1.4 ! jdf 205: 
1.1 jdf 206: **Reboot using Disk1/wd1 of RAID**
207:
208: 4. Add / re-sync Disk0/wd0 back into the RAID set.
209:
1.4 ! jdf 210: 
1.1 jdf 211: **Mirror Disk1/wd1 back to Disk0/wd0**
212:
213: ### Hardware Review
214:
215: At present, the alpha, amd64, i386, pmax, sparc, sparc64, and vax NetBSD
216: platforms support booting from RAID-1. Booting is not supported from any other
217: RAID level. Booting from a RAID set is accomplished by teaching the 1st stage
218: boot loader to understand both 4.2BSD/FFS and RAID partitions. The 1st boot
219: block code only needs to know enough about the disk partitions and file systems
220: to be able to read the 2nd stage boot blocks. Therefore, at any time, the
221: system's BIOS / firmware must be able to read a drive with 1st stage boot blocks
222: installed. On the i386 platform, configuring this is entirely dependent on the
223: vendor of the controller card / host bus adapter to which your disks are
224: connected. On sparc64 this is controlled by the IEEE 1275 Sun OpenBoot Firmware.
225:
226: This article assumes two identical IDE disks (`/dev/wd{0,1}`) which we are going
227: to mirror (RAID-1). These disks are identified as:
228:
229: # grep ^wd /var/run/dmesg.boot
230: wd0 at atabus0 drive 0: <WDC WD100BB-75CLB0>
231: wd0: drive supports 16-sector PIO transfers, LBA addressing
232: wd0: 9541 MB, 19386 cyl, 16 head, 63 sec, 512 bytes/sect x 19541088 sectors
233: wd0: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
234: wd0(piixide0:0:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
235:
236: wd1 at atabus1 drive 0: <WDC WD100BB-75CLB0>
237: wd1: drive supports 16-sector PIO transfers, LBA addressing
238: wd1: 9541 MB, 19386 cyl, 16 head, 63 sec, 512 bytes/sect x 19541088 sectors
239: wd1: drive supports PIO mode 4, DMA mode 2, Ultra-DMA mode 5 (Ultra/100)
240: wd1(piixide0:1:0): using PIO mode 4, Ultra-DMA mode 2 (Ultra/33) (using DMA data transfers)
241:
242: *Note*: If you are using SCSI, replace `/dev/{,r}wd{0,1}` with
243: `/dev/{,r}sd{0,1}`.
244:
245: In this example, both disks are jumpered as Master on separate channels on the
246: same controller. You usually wouldn't want to have both disks on the same bus on
247: the same controller; this creates a single point of failure. Ideally you would
248: have the disks on separate channels on separate controllers. Nonetheless, in
249: most cases the most critical point is the hard disk, so having redundant
250: channels or controllers is not that important. Plus, having more channels or
251: controllers increases costs. Some SCSI controllers have multiple channels on the
252: same controller, however, a SCSI bus reset on one channel could adversely affect
253: the other channel if the ASIC/IC becomes overloaded. The trade-off with two
254: controllers is that twice the bandwidth is used on the system bus. For purposes
255: of simplification, this example shows two disks on different channels on the
256: same controller.
257:
258: *Note*: RAIDframe requires that all components be of the same size. More precisely,
259: it will use the size of the smallest component when components differ in size. For
260: purposes of illustration, the example uses two disks of identical geometries.
261: Also, consider the availability of replacement disks if a component suffers a
262: critical hardware failure.
263:
264: *Tip*: Two disks of identical vendor model numbers could have different
265: geometries if the drive possesses "grown defects". Use a low-level program to
266: examine the grown defects table of the disk. These disks are obviously
267: suboptimal candidates for use in RAID and should be avoided.
268:
269: ### Initial Install on Disk0/wd0
270:
271: Perform a very generic installation onto your Disk0/wd0. Follow the `INSTALL`
272: instructions for your platform. Install all the sets but do not bother
273: customizing anything other than the kernel as it will be overwritten.
274:
275: *Tip*: On i386, during the sysinst install, when prompted if you want to `use
276: the entire disk for NetBSD`, answer `yes`.
277:
1.3 jdf 278: * [Installing NetBSD: Preliminary considerations and preparations](/guide/inst)
1.1 jdf 279: * [NetBSD/i386 Install Directions](http://ftp.NetBSD.org/pub/NetBSD/NetBSD-5.0.2/i386/INSTALL.html)
280: * [NetBSD/sparc64 Install Directions](http://ftp.NetBSD.org/pub/NetBSD/NetBSD-5.0.2/sparc64/INSTALL.html)
281:
282: Once the installation is complete, you should examine the
283: [disklabel(8)](http://netbsd.gw.com/cgi-bin/man-cgi?disklabel+8+NetBSD-5.0.1+i386)
284: and [fdisk(8)](http://netbsd.gw.com/cgi-bin/man-cgi?fdisk+8+NetBSD-5.0.1+i386) /
285: [sunlabel(8)](http://netbsd.gw.com/cgi-bin/man-cgi?sunlabel+8+NetBSD-5.0.1+i386)
286: outputs on the system:
287:
288: # df
289: Filesystem 1K-blocks Used Avail %Cap Mounted on
290: /dev/wd0a 9487886 502132 8511360 5% /
291:
292: On i386:
293:
294: # disklabel -r wd0
295: type: unknown
296: disk: Disk00
297: label:
298: flags:
299: bytes/sector: 512
300: sectors/track: 63
301: tracks/cylinder: 16
302: sectors/cylinder: 1008
303: cylinders: 19386
304: total sectors: 19541088
305: rpm: 3600
306: interleave: 1
307: trackskew: 0
308: cylinderskew: 0
309: headswitch: 0 # microseconds
310: track-to-track seek: 0 # microseconds
311: drivedata: 0
312:
313: 16 partitions:
314: # size offset fstype [fsize bsize cpg/sgs]
315: a: 19276992 63 4.2BSD 1024 8192 46568 # (Cyl. 0* - 19124*)
316: b: 264033 19277055 swap # (Cyl. 19124* - 19385)
317: c: 19541025 63 unused 0 0 # (Cyl. 0* - 19385)
318: d: 19541088 0 unused 0 0 # (Cyl. 0 - 19385)
319:
320: # fdisk /dev/rwd0d
321: Disk: /dev/rwd0d
322: NetBSD disklabel disk geometry:
323: cylinders: 19386, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
324: total sectors: 19541088
325:
326: BIOS disk geometry:
327: cylinders: 1023, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
328: total sectors: 19541088
329:
330: Partition table:
331: 0: NetBSD (sysid 169)
332: start 63, size 19541025 (9542 MB, Cyls 0-1216/96/1), Active
333: 1: <UNUSED>
334: 2: <UNUSED>
335: 3: <UNUSED>
336: Bootselector disabled.
337: First active partition: 0
338:
339: On sparc64 the command and output differ slightly:
340:
341: # disklabel -r wd0
342: type: unknown
343: disk: Disk0
344: [...snip...]
345: 8 partitions:
346: # size offset fstype [fsize bsize cpg/sgs]
347: a: 19278000 0 4.2BSD 1024 8192 46568 # (Cyl. 0 - 19124)
348: b: 263088 19278000 swap # (Cyl. 19125 - 19385)
349: c: 19541088 0 unused 0 0 # (Cyl. 0 - 19385)
350:
351: # sunlabel /dev/rwd0c
352: sunlabel> P
353: a: start cyl = 0, size = 19278000 (19125/0/0 - 9413.09Mb)
354: b: start cyl = 19125, size = 263088 (261/0/0 - 128.461Mb)
355: c: start cyl = 0, size = 19541088 (19386/0/0 - 9541.55Mb)
356:
357: ### Preparing Disk1/wd1
358:
359: Once you have a stock install of NetBSD on Disk0/wd0, you are ready to begin.
360: Disk1/wd1 will be visible and unused by the system. To set up Disk1/wd1, you will
361: use
362: [disklabel(8)](http://netbsd.gw.com/cgi-bin/man-cgi?disklabel+8+NetBSD-5.0.1+i386)
363: to allocate the entire second disk to the RAID-1 set.
364:
365: *Tip*: The best way to ensure that Disk1/wd1 is completely empty is to 'zero'
366: out the first few sectors of the disk with
367: [dd(1)](http://netbsd.gw.com/cgi-bin/man-cgi?dd+1+NetBSD-5.0.1+i386). This will
368: erase the MBR (i386) or Sun disk label (sparc64), as well as the NetBSD disk
369: label. If you make a mistake at any point during the RAID setup process, you can
370: always refer to this process to restore the disk to an empty state.
371:
372: *Note*: On sparc64, use `/dev/rwd1c` instead of `/dev/rwd1d`!
373:
374: # dd if=/dev/zero of=/dev/rwd1d bs=8k count=1
375: 1+0 records in
376: 1+0 records out
377: 8192 bytes transferred in 0.003 secs (2730666 bytes/sec)
378:
379: Once this is complete, on i386, verify that both the MBR and NetBSD disk labels
380: are gone. On sparc64, verify that the Sun Disk label is gone as well.
381:
382: On i386:
383:
384: # fdisk /dev/rwd1d
385:
386: fdisk: primary partition table invalid, no magic in sector 0
387: Disk: /dev/rwd1d
388: NetBSD disklabel disk geometry:
389: cylinders: 19386, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
390: total sectors: 19541088
391:
392: BIOS disk geometry:
393: cylinders: 1023, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
394: total sectors: 19541088
395:
396: Partition table:
397: 0: <UNUSED>
398: 1: <UNUSED>
399: 2: <UNUSED>
400: 3: <UNUSED>
401: Bootselector disabled.
402:
403: # disklabel -r wd1
404:
405: [...snip...]
406: 16 partitions:
407: # size offset fstype [fsize bsize cpg/sgs]
408: c: 19541025 63 unused 0 0 # (Cyl. 0* - 19385)
409: d: 19541088 0 unused 0 0 # (Cyl. 0 - 19385)
410:
411: On sparc64:
412:
413: # sunlabel /dev/rwd1c
414:
415: sunlabel: bogus label on `/dev/wd1c' (bad magic number)
416:
417: # disklabel -r wd1
418:
419: [...snip...]
420: 3 partitions:
421: # size offset fstype [fsize bsize cpg/sgs]
422: c: 19541088 0 unused 0 0 # (Cyl. 0 - 19385)
423: disklabel: boot block size 0
424: disklabel: super block size 0
425:
426: Now that you are certain the second disk is empty, on i386 you must establish
427: the MBR on the second disk using the values obtained from Disk0/wd0 above. We
428: must remember to mark the NetBSD partition active or the system will not boot.
429: You must also create a NetBSD disklabel on Disk1/wd1 that will enable a RAID
430: volume to exist upon it. On sparc64, you will need to simply
431: [disklabel(8)](http://netbsd.gw.com/cgi-bin/man-cgi?disklabel+8+NetBSD-5.0.1+i386)
432: the second disk which will write the proper Sun Disk Label.
433:
434: *Tip*:
435: [disklabel(8)](http://netbsd.gw.com/cgi-bin/man-cgi?disklabel+8+NetBSD-5.0.1+i386)
436: will use your shell's `$EDITOR` environment variable to choose the editor for the
437: disklabel. The default is
438: [vi(1)](http://netbsd.gw.com/cgi-bin/man-cgi?vi+1+NetBSD-5.0.1+i386).
439:
440: On i386:
441:
442: # fdisk -0ua /dev/rwd1d
443: fdisk: primary partition table invalid, no magic in sector 0
444: Disk: /dev/rwd1d
445: NetBSD disklabel disk geometry:
446: cylinders: 19386, heads: 16, sectors/track: 63 (1008 sectors/cylinder)
447: total sectors: 19541088
448:
449: BIOS disk geometry:
450: cylinders: 1023, heads: 255, sectors/track: 63 (16065 sectors/cylinder)
451: total sectors: 19541088
452:
453: Do you want to change our idea of what BIOS thinks? [n]
454:
455: Partition 0:
456: <UNUSED>
457: The data for partition 0 is:
458: <UNUSED>
459: sysid: [0..255 default: 169]
460: start: [0..1216cyl default: 63, 0cyl, 0MB]
461: size: [0..1216cyl default: 19541025, 1216cyl, 9542MB]
462: bootmenu: []
463: Do you want to change the active partition? [n] y
464: Choosing 4 will make no partition active.
465: active partition: [0..4 default: 0] 0
466: Are you happy with this choice? [n] y
467:
468: We haven't written the MBR back to disk yet. This is your last chance.
469: Partition table:
470: 0: NetBSD (sysid 169)
471: start 63, size 19541025 (9542 MB, Cyls 0-1216/96/1), Active
472: 1: <UNUSED>
473: 2: <UNUSED>
474: 3: <UNUSED>
475: Bootselector disabled.
476: Should we write new partition table? [n] y
477:
478: # disklabel -r -e -I wd1
479: type: unknown
480: disk: Disk1
481: label:
482: flags:
483: bytes/sector: 512
484: sectors/track: 63
485: tracks/cylinder: 16
486: sectors/cylinder: 1008
487: cylinders: 19386
488: total sectors: 19541088
489: [...snip...]
490: 16 partitions:
491: # size offset fstype [fsize bsize cpg/sgs]
492: a: 19541025 63 RAID # (Cyl. 0*-19385)
493: c: 19541025 63 unused 0 0 # (Cyl. 0*-19385)
494: d: 19541088 0 unused 0 0 # (Cyl. 0 -19385)
495:
496: On sparc64:
497:
498: # disklabel -r -e -I wd1
499: type: unknown
500: disk: Disk1
501: label:
502: flags:
503: bytes/sector: 512
504: sectors/track: 63
505: tracks/cylinder: 16
506: sectors/cylinder: 1008
507: cylinders: 19386
508: total sectors: 19541088
509: [...snip...]
510: 3 partitions:
511: # size offset fstype [fsize bsize cpg/sgs]
512: a: 19541088 0 RAID # (Cyl. 0 - 19385)
513: c: 19541088 0 unused 0 0 # (Cyl. 0 - 19385)
514:
515: # sunlabel /dev/rwd1c
516: sunlabel> P
517: a: start cyl = 0, size = 19541088 (19386/0/0 - 9541.55Mb)
518: c: start cyl = 0, size = 19541088 (19386/0/0 - 9541.55Mb)
519:
520: *Note*: On i386, the `c:` and `d:` slices are reserved. `c:` represents the
521: NetBSD portion of the disk. `d:` represents the entire disk. Because we want to
522: allocate the entire NetBSD MBR partition to RAID, and because `a:` resides
523: within the bounds of `c:`, the `a:` and `c:` slices have the same size and offset
524: values. The offset must start at a track boundary (an increment of
525: sectors matching the sectors/track value in the disk label). On sparc64 however,
526: `c:` represents the entire NetBSD partition in the Sun disk label and `d:` is
527: not reserved. Also note that sparc64's `c:` and `a:` require no offset from the
528: beginning of the disk; however, if an offset is needed, it must start
529: at a cylinder boundary (an increment of sectors matching the sectors/cylinder
530: value).
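
The boundary rule in the note above amounts to a simple remainder test. As a small sketch (the helper name `is_aligned` is hypothetical; the numbers are taken from the disklabels shown earlier), the offsets can be checked with shell arithmetic:

```shell
# is_aligned OFFSET UNIT: succeed when OFFSET is a whole multiple of UNIT.
# (Hypothetical helper; UNIT is sectors/track or sectors/cylinder.)
is_aligned() {
    [ $(( $1 % $2 )) -eq 0 ]
}

# i386: the a: slice starts at sector 63, with 63 sectors/track.
is_aligned 63 63 && echo "i386 offset 63 is track-aligned"

# sparc64: the a: slice starts at sector 0, with 1008 sectors/cylinder.
is_aligned 0 1008 && echo "sparc64 offset 0 is cylinder-aligned"
```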
531:
532: ### Initializing the RAID Device
533:
534: Next we create the configuration file for the RAID set / volume. Traditionally,
535: RAIDframe configuration files belong in `/etc` and would be read and initialized
536: at boot time, however, because we are creating a bootable RAID volume, the
537: configuration data will actually be written into the RAID volume using the
538: *auto-configure* feature. Therefore, files are needed only during the initial
539: setup and should not reside in `/etc`.
540:
541: # vi /var/tmp/raid0.conf
542: START array
543: 1 2 0
544:
545: START disks
546: absent
547: /dev/wd1a
548:
549: START layout
550: 128 1 1 1
551:
552: START queue
553: fifo 100
554:
555: Note that `absent` means a non-existing disk. This will allow us to establish
556: the RAID volume with a bogus component that we will substitute for Disk0/wd0 at
557: a later time.
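
For readers who prefer to script this step, the sketch below writes the same configuration with one comment line per section naming its fields. The function name `write_raid0_conf` is hypothetical, and the field names in the comments follow the style of the examples in raidctl(8) as best recalled; treat them as annotations, not as an authoritative reference.

```shell
# write_raid0_conf PATH: write the degraded RAID-1 configuration to PATH.
# (Hypothetical helper; the values are those from the example above.)
write_raid0_conf() {
    cat > "$1" <<'EOF'
START array
# numRow numCol numSpare
1 2 0

START disks
# column 0 is a placeholder, column 1 is the real component
absent
/dev/wd1a

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level
128 1 1 1

START queue
# queueing method and queue size
fifo 100
EOF
}
```

Running `write_raid0_conf /var/tmp/raid0.conf` would produce a file equivalent to the one edited by hand above.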
558:
559: Next we configure the RAID device and initialize the serial number to something
560: unique. In this example we use a "YYYYMMDD*`Revision`*" scheme. The format you
561: choose is entirely at your discretion, however the scheme you choose should
562: ensure that no two RAID sets use the same serial number at the same time.
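
One way to generate a serial number in the "YYYYMMDD*`Revision`*" scheme is sketched here. The helper name `raid_serial` and the two-digit revision width are assumptions chosen to match the `2009122601` value used in this example.

```shell
# raid_serial REVISION [YYYYMMDD]: print a serial number for raidctl -I.
# REVISION is a small integer without leading zeros; the date defaults to today.
raid_serial() {
    day=${2:-$(date +%Y%m%d)}
    printf '%s%02d\n' "$day" "$1"
}

raid_serial 1 20091226    # prints 2009122601
```

The result could then be passed on as `raidctl -v -I "$(raid_serial 1)" raid0`.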
563:
564: After that we initialize the RAID set for the first time, safely ignoring the
565: errors regarding the bogus component.
566:
567: # raidctl -v -C /var/tmp/raid0.conf raid0
568: Ignoring missing component at column 0
569: raid0: Component absent being configured at col: 0
570: Column: 0 Num Columns: 0
571: Version: 0 Serial Number: 0 Mod Counter: 0
572: Clean: No Status: 0
573: Number of columns do not match for: absent
574: absent is not clean!
575: raid0: Component /dev/wd1a being configured at col: 1
576: Column: 0 Num Columns: 0
577: Version: 0 Serial Number: 0 Mod Counter: 0
578: Clean: No Status: 0
579: Column out of alignment for: /dev/wd1a
580: Number of columns do not match for: /dev/wd1a
581: /dev/wd1a is not clean!
582: raid0: There were fatal errors
583: raid0: Fatal errors being ignored.
584: raid0: RAID Level 1
585: raid0: Components: component0[**FAILED**] /dev/wd1a
586: raid0: Total Sectors: 19540864 (9541 MB)
587: # raidctl -v -I 2009122601 raid0
588: # raidctl -v -i raid0
589: Initiating re-write of parity
590: raid0: Error re-writing parity!
591: Parity Re-write status:
592:
593: # tail -1 /var/log/messages
594: Dec 26 00:00:30 /netbsd: raid0: Error re-writing parity!
595: # raidctl -v -s raid0
596: Components:
597: component0: failed
598: /dev/wd1a: optimal
599: No spares.
600: component0 status is: failed. Skipping label.
601: Component label for /dev/wd1a:
602: Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
603: Version: 2, Serial Number: 2009122601, Mod Counter: 7
604: Clean: No, Status: 0
605: sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
606: Queue size: 100, blocksize: 512, numBlocks: 19540864
607: RAID Level: 1
608: Autoconfig: No
609: Root partition: No
610: Last configured as: raid0
611: Parity status: DIRTY
612: Reconstruction is 100% complete.
613: Parity Re-write is 100% complete.
614: Copyback is 100% complete.
615:
616: ### Setting up Filesystems
617:
618: *Caution*: The root filesystem must begin at sector 0 of the RAID device. If
619: not, the primary boot loader will be unable to find the secondary boot loader.
620:
621: The RAID device is now configured and available. The RAID device is a pseudo
622: disk-device. It will be created with a default disk label. You must now
623: determine the proper sizes for disklabel slices for your production environment.
624: For purposes of simplification in this example, our system will have 8.5
625: gigabytes dedicated to `/` as `/dev/raid0a` and the rest allocated to `swap`
626: as `/dev/raid0b`.
627:
628: *Caution*: This is an unrealistic disk layout for a production server; the
629: NetBSD Guide can expand on proper partitioning technique. See [Installing
630: NetBSD: Preliminary considerations and preparations](inst).
631:
632: *Note*: 1 GB is 2\*1024\*1024=2097152 blocks (1 block is 512 bytes, or
633: 0.5 kilobytes). Regardless of the underlying hardware composing a RAID set,
634: the RAID pseudo disk will always have 512 bytes/sector.
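
The conversion in the note above can be done with shell arithmetic when planning slice sizes. This is a small sketch; the helper names `gb_to_sectors` and `mb_to_sectors` are hypothetical, and 1 GB is taken as 1024 MB as in the note.

```shell
# gb_to_sectors GB: number of 512-byte sectors in GB gigabytes.
gb_to_sectors() {
    echo $(( $1 * 1024 * 1024 * 2 ))
}

# mb_to_sectors MB: number of 512-byte sectors in MB megabytes.
mb_to_sectors() {
    echo $(( $1 * 1024 * 2 ))
}

gb_to_sectors 1     # prints 2097152
mb_to_sectors 256   # prints 524288
```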
635:
636: *Note*: In our example, the space allocated to the underlying `a:` slice
637: composing the RAID set differed between i386 and sparc64, therefore the total
638: sectors of the RAID volumes differ:
639:
640: On i386:
641:
642: # disklabel -r -e -I raid0
643: type: RAID
644: disk: raid
645: label: fictitious
646: flags:
647: bytes/sector: 512
648: sectors/track: 128
649: tracks/cylinder: 8
650: sectors/cylinder: 1024
651: cylinders: 19082
652: total sectors: 19540864
653: rpm: 3600
654: interleave: 1
655: trackskew: 0
656: cylinderskew: 0
657: headswitch: 0 # microseconds
658: track-to-track seek: 0 # microseconds
659: drivedata: 0
660:
661: # size offset fstype [fsize bsize cpg/sgs]
662: a: 19015680 0 4.2BSD 0 0 0 # (Cyl. 0 - 18569)
663: b: 525184 19015680 swap # (Cyl. 18570 - 19082*)
664: d: 19540864 0 unused 0 0 # (Cyl. 0 - 19082*)
665:
666: On sparc64:
667:
668: # disklabel -r -e -I raid0
669: [...snip...]
670: total sectors: 19539968
671: [...snip...]
672: 3 partitions:
673: # size offset fstype [fsize bsize cpg/sgs]
674: a: 19251200 0 4.2BSD 0 0 0 # (Cyl. 0 - 18799)
675: b: 288768 19251200 swap # (Cyl. 18800 - 19081)
676: c: 19539968 0 unused 0 0 # (Cyl. 0 - 19081)
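
Before writing the label it can be worth checking that the two slices are contiguous and together fill the volume. The sketch below uses a hypothetical helper, `check_layout`, and assumes that `a:` starts at sector 0 as required above; the figures are those from the two labels.

```shell
# check_layout TOTAL A_SIZE B_OFFSET B_SIZE
# Succeeds when b: starts where a: ends and both slices fill the volume.
# (Hypothetical helper; assumes a: begins at sector 0.)
check_layout() {
    [ "$3" -eq "$2" ] && [ $(( $2 + $4 )) -eq "$1" ]
}

check_layout 19540864 19015680 19015680 525184 && echo "i386 layout is consistent"
check_layout 19539968 19251200 19251200 288768 && echo "sparc64 layout is consistent"
```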
677:
678: Next, format the newly created `/` partition as a 4.2BSD FFSv1 File System:
679:
680: # newfs -O 1 /dev/rraid0a
681: /dev/rraid0a: 9285.0MB (19015680 sectors) block size 16384, fragment size 2048
682: using 51 cylinder groups of 182.06MB, 11652 blks, 23040 inodes.
683: super-block backups (for fsck -b #) at:
684: 32, 372896, 745760, 1118624, 1491488, 1864352, 2237216, 2610080, 2982944,
685: ...............................................................................
686:
687: # fsck -fy /dev/rraid0a
688: ** /dev/rraid0a
689: ** File system is already clean
690: ** Last Mounted on
691: ** Phase 1 - Check Blocks and Sizes
692: ** Phase 2 - Check Pathnames
693: ** Phase 3 - Check Connectivity
694: ** Phase 4 - Check Reference Counts
695: ** Phase 5 - Check Cyl groups
696: 1 files, 1 used, 4679654 free (14 frags, 584955 blocks, 0.0% fragmentation)
697:
698: ### Migrating System to RAID
699:
700: The new RAID filesystems are now ready for use. We mount them under `/mnt` and
701: copy all files from the old system. This can be done using
702: [dump(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dump+8+NetBSD-5.0.1+i386) or
703: [pax(1)](http://netbsd.gw.com/cgi-bin/man-cgi?pax+1+NetBSD-5.0.1+i386).
704:
705: # mount /dev/raid0a /mnt
706: # df -h /mnt
707: Filesystem Size Used Avail %Cap Mounted on
708: /dev/raid0a 8.9G 2.0K 8.5G 0% /mnt
709: # cd /; pax -v -X -rw -pe . /mnt
710: [...snip...]
711:
712: The NetBSD install now exists on the RAID filesystem. We need to fix the
713: mount-points in the new copy of `/etc/fstab` or the system will not come up
714: correctly. Replace instances of `wd0` with `raid0`.
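
The substitution can also be done with sed(1) instead of an editor. This is only a sketch: the function name `fstab_to_raid` is hypothetical, and it assumes that every entry refers to the disk by its `/dev/wd0` path, as in a stock install. Review the result before relying on it.

```shell
# fstab_to_raid: read an fstab on stdin, print it with wd0 devices renamed to raid0.
# (Hypothetical helper; only rewrites entries of the form /dev/wd0*.)
fstab_to_raid() {
    sed 's|/dev/wd0|/dev/raid0|g'
}

printf '%s\n' '/dev/wd0a / ffs rw 1 1' '/dev/wd0b none swap sw 0 0' | fstab_to_raid
```

In practice the input would be `/mnt/etc/fstab`, with the output written to a temporary file that is checked and then moved into place.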
715:
716: The swap should be unconfigured upon shutdown to avoid parity errors on the RAID
717: device. This can be done with a simple, one-line setting in `/etc/rc.conf`.
718:
719: # vi /mnt/etc/rc.conf
720: swapoff=YES
721:
722: Next, the boot loader must be installed on Disk1/wd1. Failure to install the
723: loader on Disk1/wd1 will render the system un-bootable if Disk0/wd0 fails. You
724: should hope your system won't have to reboot when wd0 fails.
725:
726: *Tip*: Because the BIOS/CMOS menus in many i386 based systems are misleading
727: with regard to device boot order, I highly recommend utilizing the `-o
728: timeout=X` option supported by the i386 1st stage boot loader. Set up unique
729: values for each disk as a point of reference so that you can easily determine
730: from which disk the system is booting.
731:
732: *Caution*: Although it may seem logical to install the 1st stage boot block into
733: `/dev/rwd1{c,d}` (which is historically correct with NetBSD 1.6.x
734: [installboot(8)](http://netbsd.gw.com/cgi-bin/man-cgi?installboot+8+NetBSD-5.0.1+i386)),
735: this is no longer the case. If you make this mistake, the boot sector will
736: become irrecoverably damaged and you will need to start the process over again.
737:
738: On i386, install the boot loader into `/dev/rwd1a`:
739:
740: # /usr/sbin/installboot -o timeout=30 -v /dev/rwd1a /usr/mdec/bootxx_ffsv1
741: File system: /dev/rwd1a
742: Primary bootstrap: /usr/mdec/bootxx_ffsv1
743: Ignoring PBR with invalid magic in sector 0 of `/dev/rwd1a'
744: Boot options: timeout 30, flags 0, speed 9600, ioaddr 0, console pc
745:
746: On sparc64, install the boot loader into `/dev/rwd1a` as well; however, the `-o`
747: flag is unsupported (and unneeded thanks to OpenBoot):
748:
749: # /usr/sbin/installboot -v /dev/rwd1a /usr/mdec/bootblk
750: File system: /dev/rwd1a
751: Primary bootstrap: /usr/mdec/bootblk
752: Bootstrap start sector: 1
753: Bootstrap byte count: 5140
754: Writing bootstrap
755:
756: Finally, the RAID set must be made auto-configurable and the system should be
757: rebooted. After the reboot everything is mounted from the RAID devices.
758:
759: # raidctl -v -A root raid0
760: raid0: Autoconfigure: Yes
761: raid0: Root: Yes
762: # tail -2 /var/log/messages
763: raid0: New autoconfig value is: 1
764: raid0: New rootpartition value is: 1
765: # raidctl -v -s raid0
766: [...snip...]
767: Autoconfig: Yes
768: Root partition: Yes
769: Last configured as: raid0
770: [...snip...]
771: # shutdown -r now
772:
773: ### Warning
774:
775: Always use
776: [shutdown(8)](http://netbsd.gw.com/cgi-bin/man-cgi?shutdown+8+NetBSD-5.0.1+i386)
777: when shutting down. Never simply use
778: [reboot(8)](http://netbsd.gw.com/cgi-bin/man-cgi?reboot+8+NetBSD-5.0.1+i386).
779: [reboot(8)](http://netbsd.gw.com/cgi-bin/man-cgi?reboot+8+NetBSD-5.0.1+i386)
780: will not properly run shutdown RC scripts and will not safely disable swap. This
781: will cause dirty parity at every reboot.
782:
783: ### The first boot with RAID
784:
785: At this point, temporarily configure your system to boot Disk1/wd1. See notes in
786: [[Testing Boot Blocks|guide/rf#adding-text-boot]] for details on this process.
787: The system should boot now and all filesystems should be on the RAID devices.
788: The RAID will operate with a single component; however, the set is degraded
789: because the bogus drive (wd9) has failed.
790:
791: # egrep -i "raid|root" /var/run/dmesg.boot
792: raid0: RAID Level 1
793: raid0: Components: component0[**FAILED**] /dev/wd1a
794: raid0: Total Sectors: 19540864 (9541 MB)
795: boot device: raid0
796: root on raid0a dumps on raid0b
797: root file system type: ffs
798:
799: # df -h
800: Filesystem Size Used Avail Capacity Mounted on
801: /dev/raid0a 8.9G 196M 8.3G 2% /
802: kernfs 1.0K 1.0K 0B 100% /kern
803:
804: # swapctl -l
805: Device 1K-blocks Used Avail Capacity Priority
806: /dev/raid0b 262592 0 262592 0% 0
807: # raidctl -s raid0
808: Components:
809: component0: failed
810: /dev/wd1a: optimal
811: No spares.
812: component0 status is: failed. Skipping label.
813: Component label for /dev/wd1a:
814: Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
815: Version: 2, Serial Number: 2009122601, Mod Counter: 65
816: Clean: No, Status: 0
817: sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
818: Queue size: 100, blocksize: 512, numBlocks: 19540864
819: RAID Level: 1
820: Autoconfig: Yes
821: Root partition: Yes
822: Last configured as: raid0
823: Parity status: DIRTY
824: Reconstruction is 100% complete.
825: Parity Re-write is 100% complete.
826: Copyback is 100% complete.
827:
828: ### Adding Disk0/wd0 to RAID
829:
830: We will now add Disk0/wd0 as a component of the RAID. This will destroy the
831: original file system structure. On i386, the MBR disklabel will be unaffected
832: (remember we copied wd0's label to wd1 anyway), so there is no need to
833: "zero" Disk0/wd0. However, we need to relabel Disk0/wd0 to have the same
834: NetBSD disklabel layout as Disk1/wd1. Then we add Disk0/wd0 as a "hot spare" to
835: the RAID set and initiate the parity reconstruction for all RAID devices,
836: effectively bringing Disk0/wd0 into the RAID-1 set and "syncing up" both disks.
837:
838: # disklabel -r wd1 > /tmp/disklabel.wd1
839: # disklabel -R -r wd0 /tmp/disklabel.wd1
840:
841: As a last-minute sanity check, you might want to use
842: [diff(1)](http://netbsd.gw.com/cgi-bin/man-cgi?diff+1+NetBSD-5.0.1+i386) to
843: ensure that the disklabels of Disk0/wd0 match Disk1/wd1. You should also back up
844: these files for reference in the event of an emergency.
845:
846: # disklabel -r wd0 > /tmp/disklabel.wd0
847: # disklabel -r wd1 > /tmp/disklabel.wd1
848: # diff /tmp/disklabel.wd0 /tmp/disklabel.wd1
849: # fdisk /dev/rwd0 > /tmp/fdisk.wd0
850: # fdisk /dev/rwd1 > /tmp/fdisk.wd1
851: # diff /tmp/fdisk.wd0 /tmp/fdisk.wd1
852: # mkdir /root/RFbackup
853: # cp -p /tmp/{disklabel,fdisk}* /root/RFbackup
854:
855: Once you are sure, add Disk0/wd0 as a spare component, and start reconstruction:
856:
857: # raidctl -v -a /dev/wd0a raid0
858: /netbsd: Warning: truncating spare disk /dev/wd0a to 241254528 blocks
859: # raidctl -v -s raid0
860: Components:
861: component0: failed
862: /dev/wd1a: optimal
863: Spares:
864: /dev/wd0a: spare
865: [...snip...]
866: # raidctl -F component0 raid0
867: RECON: initiating reconstruction on col 0 -> spare at col 2
868: 11% |**** | ETA: 04:26 \
869:
870: Depending on the speed of your hardware, the reconstruction time will vary. You
871: may wish to watch it on another terminal (note that you can interrupt
872: `raidctl -S` any time without stopping the synchronisation):
873:
874: # raidctl -S raid0
875: Reconstruction is 0% complete.
876: Parity Re-write is 100% complete.
877: Copyback is 100% complete.
878: Reconstruction status:
879: 17% |****** | ETA: 03:08 -
880:
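If you would rather poll progress from a script than watch the meter, the percentage can be pulled out of the status text. This is only a sketch, assuming the "Reconstruction is N% complete." line keeps the wording shown in the output above. The helper name `recon_percent` is made up for this example and is not part of raidctl.

```shell
# recon_percent: hypothetical helper, not shipped with NetBSD.
# Reads raidctl status output on stdin and prints the number from the
# "Reconstruction is N% complete." line; prints nothing if no such line.
recon_percent() {
    sed -n 's/^[[:space:]]*Reconstruction is \([0-9][0-9]*\)% complete\..*$/\1/p'
}

# Example use:
#   raidctl -s raid0 | recon_percent
```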
881: After reconstruction, the log reports completion and Disk0/wd0 is listed as *used\_spare*; it becomes *optimal* after the reboot described below.
882:
883: # tail -f /var/log/messages
884: raid0: Reconstruction of disk at col 0 completed
885: raid0: Recon time was 1290.625033 seconds, accumulated XOR time was 0 us (0.000000)
886: raid0: (start time 1093407069 sec 145393 usec, end time 1093408359 sec 770426 usec)
887: raid0: Total head-sep stall count was 0
888: raid0: 305318 recon event waits, 1 recon delays
889: raid0: 1093407069060000 max exec ticks
890:
891: # raidctl -v -s raid0
892: Components:
893: component0: spared
894: /dev/wd1a: optimal
895: Spares:
896: /dev/wd0a: used_spare
897: [...snip...]
898:
899: When the reconstruction is finished we need to install the boot loader on
900: Disk0/wd0. On i386, install the boot loader into `/dev/rwd0a`:
901:
902: # /usr/sbin/installboot -o timeout=15 -v /dev/rwd0a /usr/mdec/bootxx_ffsv1
903: File system: /dev/rwd0a
904: Primary bootstrap: /usr/mdec/bootxx_ffsv1
905: Boot options: timeout 15, flags 0, speed 9600, ioaddr 0, console pc
906:
907: On sparc64:
908:
909: # /usr/sbin/installboot -v /dev/rwd0a /usr/mdec/bootblk
910: File system: /dev/rwd0a
911: Primary bootstrap: /usr/mdec/bootblk
912: Bootstrap start sector: 1
913: Bootstrap byte count: 5140
914: Writing bootstrap
915:
916: And finally, reboot the machine one last time before proceeding. This is
917: required to change the status of Disk0/wd0 from "used\_spare" to "optimal" as
918: component0. Refer to notes in the next section regarding verification of clean
919: parity after each reboot.
920:
921: # shutdown -r now
922:
923: ### Testing Boot Blocks
924:
925: At this point, you need to ensure that your system's hardware can properly boot
926: using the boot blocks on either disk. On i386, this is a hardware-dependent
927: process that may be done via your motherboard CMOS/BIOS menu or your controller
928: card's configuration menu.
929:
930: On i386, use the menu system on your machine to set the boot device order /
931: priority to Disk1/wd1 before Disk0/wd0. The examples here depict a generic Award
932: BIOS.
933:
1.4 ! jdf 934: 
1.1 jdf 935: **Award BIOS i386 Boot Disk1/wd1**
936:
937: Save changes and exit:
938:
939: >> NetBSD/i386 BIOS Boot, Revision 5.2 (from NetBSD 5.0.2)
940: >> (builds@b7, Sun Feb 7 00:30:50 UTC 2010)
941: >> Memory: 639/130048 k
942: Press return to boot now, any other key for boot menu
943: booting hd0a:netbsd - starting in 30
944:
945: You can determine that the BIOS is reading Disk1/wd1 because the timeout of the
946: boot loader is 30 seconds instead of 15. After the reboot, re-enter the BIOS and
947: configure the drive boot order back to the default:
948:
1.4 ! jdf 949: 
1.1 jdf 950: **Award BIOS i386 Boot Disk0/wd0**
951:
952: Save changes and exit:
953:
954: >> NetBSD/i386 BIOS Boot, Revision 5.2 (from NetBSD 5.0.2)
955: >> Memory: 639/130048 k
956: Press return to boot now, any other key for boot menu
957: booting hd0a:netbsd - starting in 15
958:
959: Notice how your custom kernel detects controller/bus/drive assignments
960: independent of what the BIOS assigns as the boot disk. This is the expected
961: behavior.
962:
963: On sparc64, use the Sun OpenBoot **devalias** command to list the disk aliases, then boot from each disk to confirm that both are bootable:
964:
965: Sun Ultra 5/10 UPA/PCI (UltraSPARC-IIi 400MHz), No Keyboard
966: OpenBoot 3.15, 128 MB memory installed, Serial #nnnnnnnn.
967: Ethernet address 8:0:20:a5:d1:3b, Host ID: nnnnnnnn.
968:
969: ok devalias
970: [...snip...]
971: cdrom /pci@1f,0/pci@1,1/ide@3/cdrom@2,0:f
972: disk /pci@1f,0/pci@1,1/ide@3/disk@0,0
973: disk3 /pci@1f,0/pci@1,1/ide@3/disk@3,0
974: disk2 /pci@1f,0/pci@1,1/ide@3/disk@2,0
975: disk1 /pci@1f,0/pci@1,1/ide@3/disk@1,0
976: disk0 /pci@1f,0/pci@1,1/ide@3/disk@0,0
977: [...snip...]
978:
979: ok boot disk0 netbsd
980: Initializing Memory [...]
981: Boot device /pci/pci/ide@3/disk@0,0 File and args: netbsd
982: NetBSD IEEE 1275 Bootblock
983: >> NetBSD/sparc64 OpenFirmware Boot, Revision 1.13
984: >> (builds@b7.netbsd.org, Wed Jul 29 23:43:42 UTC 2009)
985: loadfile: reading header
986: elf64_exec: Booting [...]
987: symbols @ [....]
988: Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
989: 2006, 2007, 2008, 2009
990: The NetBSD Foundation, Inc. All rights reserved.
991: Copyright (c) 1982, 1986, 1989, 1991, 1993
992: The Regents of the University of California. All rights reserved.
993: [...snip...]
994:
995: And the second disk:
996:
997: ok boot disk2 netbsd
998: Initializing Memory [...]
999: Boot device /pci/pci/ide@3/disk@2,0: File and args:netbsd
1000: NetBSD IEEE 1275 Bootblock
1001: >> NetBSD/sparc64 OpenFirmware Boot, Revision 1.13
1002: >> (builds@b7.netbsd.org, Wed Jul 29 23:43:42 UTC 2009)
1003: loadfile: reading header
1004: elf64_exec: Booting [...]
1005: symbols @ [....]
1006: Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005,
1007: 2006, 2007, 2008, 2009
1008: The NetBSD Foundation, Inc. All rights reserved.
1009: Copyright (c) 1982, 1986, 1989, 1991, 1993
1010: The Regents of the University of California. All rights reserved.
1011: [...snip...]
1012:
1013: At each boot, the following should appear in the NetBSD kernel
1014: [dmesg(8)](http://netbsd.gw.com/cgi-bin/man-cgi?dmesg+8+NetBSD-5.0.1+i386):
1015:
1016: Kernelized RAIDframe activated
1017: raid0: RAID Level 1
1018: raid0: Components: /dev/wd0a /dev/wd1a
1019: raid0: Total Sectors: 19540864 (9541 MB)
1020: boot device: raid0
1021: root on raid0a dumps on raid0b
1022: root file system type: ffs
1023:
1024: Once you are certain that both disks are bootable, verify the RAID parity is
1025: clean after each reboot:
1026:
1027: # raidctl -v -s raid0
1028: Components:
1029: /dev/wd0a: optimal
1030: /dev/wd1a: optimal
1031: No spares.
1032: [...snip...]
1033: Component label for /dev/wd0a:
1034: Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
1035: Version: 2, Serial Number: 2009122601, Mod Counter: 67
1036: Clean: No, Status: 0
1037: sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
1038: Queue size: 100, blocksize: 512, numBlocks: 19540864
1039: RAID Level: 1
1040: Autoconfig: Yes
1041: Root partition: Yes
1042: Last configured as: raid0
1043: Component label for /dev/wd1a:
1044: Row: 0, Column: 1, Num Rows: 1, Num Columns: 2
1045: Version: 2, Serial Number: 2009122601, Mod Counter: 67
1046: Clean: No, Status: 0
1047: sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
1048: Queue size: 100, blocksize: 512, numBlocks: 19540864
1049: RAID Level: 1
1050: Autoconfig: Yes
1051: Root partition: Yes
1052: Last configured as: raid0
1053: Parity status: clean
1054: Reconstruction is 100% complete.
1055: Parity Re-write is 100% complete.
1056: Copyback is 100% complete.
1057:
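The check after each reboot can also be automated. The following is a minimal sketch, assuming the status output contains a "Parity status:" line worded as in the listing above. The helper name `parity_is_clean` is made up for this example and is not part of raidctl. It reads status output on standard input and succeeds only when the parity is reported clean.

```shell
# parity_is_clean: hypothetical helper, not shipped with NetBSD.
# Exit status 0 if the input contains "Parity status: clean",
# non-zero otherwise (for example when the status is DIRTY).
parity_is_clean() {
    grep -q 'Parity status: clean'
}

# Example use:
#   raidctl -s raid0 | parity_is_clean || echo "raid0: parity is not clean"
```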