recently. In this case, it was a root disk. Thankfully, however,
it was mirrored with SVM (Solaris Volume Manager). Unfortunately, disk
failures aren't the type of thing that happens too frequently, so it
can be easy to overlook steps in the recovery process. The following
details both my oversight and recovery of the failed root disk, mirrored
with SVM. Our host details are:
HOST:   helios
PROMPT: helios [0]
SYSTEM: Sun Fire V490
OS:     Solaris 9

After being alerted to an issue with one of my root disks on helios, I
logged into the host to see the following:
helios [0] /usr/sbin/metastat
d11: Mirror
    Submirror 0: d9
      State: Needs maintenance
    Submirror 1: d10
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 246686592 blocks (117 GB)

d9: Submirror of d11
    State: Unavailable
    Size: 246686592 blocks (117 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s6          0     No              -   Yes

d10: Submirror of d11
    State: Okay
    Size: 246686592 blocks (117 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s6          0     No           Okay   Yes

<snip...>
<truncated, mirrors d8 and d5 were in the same state as d11>
<snip...>

d2: Mirror
    Submirror 0: d0
      State: Needs maintenance
    Submirror 1: d1
      State: Okay
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 10501632 blocks (5.0 GB)

d0: Submirror of d2
    State: Needs maintenance
    Invoke: metareplace d2 c1t0d0s0 <new device>
    Size: 10501632 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s0          0     No    Maintenance   Yes

d1: Submirror of d2
    State: Okay
    Size: 10501632 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s0          0     No           Okay   Yes

Device Relocation Information:
Device   Reloc  Device ID
c1t1d0   Yes    id1,ssd@w500000e011c5c0c0
c1t0d0   Yes    id1,ssd@w500000e011c61500

The output of 'metastat' reports a state of "Needs maintenance" for d5,
d8, and d11, while also listing no state information ("-") for one side
of each mirror. Mirror d2 is also set to "Needs maintenance", but its
components list states of "Maintenance" and "Okay", respectively.
Having seen a drop-off in I/O activity to c1t0d0 in Cacti and verified
the situation with system logs and 'iostat -En', I knew I had a failed
disk. (Sorry, I didn't retain the output for inclusion here.) A further
check with 'metadb' also confirms this, showing write errors to the
state database replicas on c1t0d0:
helios [0] /usr/sbin/metadb -i
        flags           first blk       block count
     Wm  p  l          16              8192            /dev/dsk/c1t0d0s3
     W   p  l          8208            8192            /dev/dsk/c1t0d0s3
     W   p  l          16400           8192            /dev/dsk/c1t0d0s3
     W   p  l          16              8192            /dev/dsk/c1t0d0s4
     W   p  l          8208            8192            /dev/dsk/c1t0d0s4
     W   p  l          16400           8192            /dev/dsk/c1t0d0s4
     a   p  luo        16              8192            /dev/dsk/c1t1d0s3
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s3
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s3
     a   p  luo        16              8192            /dev/dsk/c1t1d0s4
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s4
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s4
 r - replica does not have device relocation information
 o - replica active prior to last mddb configuration change
 u - replica is up to date
 l - locator for this replica was read successfully
 c - replica's location was in /etc/lvm/mddb.cf
 p - replica's location was patched in kernel
 m - replica is master, this is replica selected as input
 W - replica has device write errors
 a - replica is active, commits are occurring to this replica
 M - replica had problem with master blocks
 D - replica had problem with data blocks
 F - replica had format problems
 S - replica is too small to hold current data base
 R - replica had device read errors

Note that 'metadb -i' includes the flag definitions below the state
databases, as seen above. At this point, I retrieved a spare disk in
preparation to swap into the machine. On to recovery: first, delete the
state databases on the failed disk with 'metadb -d /dev/dsk/cWtXdYsZ':
helios [0] /usr/sbin/metadb -d /dev/dsk/c1t0d0s3
helios [0] /usr/sbin/metadb -d /dev/dsk/c1t0d0s4
helios [0] /usr/sbin/metadb
        flags           first blk       block count
     a   p  luo        16              8192            /dev/dsk/c1t1d0s3
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s3
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s3
     a   p  luo        16              8192            /dev/dsk/c1t1d0s4
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s4
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s4

The subsequent 'metadb' no longer shows the replicas on the failed disk.
Had I thought it through a little further, I could have saved myself
some trouble later. By this, I mean that I should have also removed the
failed (c1t0d0) devices from each mirror before continuing. Instead, I
skipped this step and continued by hot-swapping in the new disk, which
just means I will have to remove those devices further below. Had I
done it up front, it would have looked something like the following
sketch.
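This is the same force-detach and clear sequence I end up running
further below, just moved ahead of the disk swap; the mirror/submirror
pairs come straight from the 'metastat' output above:

/usr/sbin/metadetach -f d11 d9   # force-detach the erred submirrors on c1t0d0
/usr/sbin/metadetach -f d8 d6
/usr/sbin/metadetach -f d5 d3
/usr/sbin/metaclear d9           # then clear the detached concat/stripes
/usr/sbin/metaclear d6
/usr/sbin/metaclear d3

With that noted, on with the hot-swap: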
helios [0] /usr/sbin/devfsadm -C
helios [0] echo | /usr/sbin/format
Searching for disks...done

c1t0d0: configured with capacity of 136.71GB


AVAILABLE DISK SELECTIONS:
       0. c1t0d0 <SUN146G cyl 14087 alt 2 hd 24 sec 848>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w2100000c505ec040,0
       1. c1t1d0 <FUJITSU-MAW3147FCSUN146G-1203 cyl 14087 alt 2 hd 24 sec 848>
          /pci@9,600000/SUNW,qlc@2/fp@0,0/ssd@w500000e011c5c0c1,0
       2. c4t0d0 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2/sd@0,0
       3. c4t0d1 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2/sd@0,1
       4. c4t0d2 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2/sd@0,2
       5. c4t0d3 <STK-FLEXLINE380-0615 cyl 40958 alt 2 hd 64 sec 64>
          /pci@8,600000/fibre-channel@2/sd@0,3
       6. c5t0d0 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2,1/sd@0,0
       7. c5t0d1 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2,1/sd@0,1
       8. c5t0d2 <STK-FLEXLINE380-0615 cyl 65533 alt 2 hd 64 sec 169>
          /pci@8,600000/fibre-channel@2,1/sd@0,2
       9. c5t0d3 <STK-FLEXLINE380-0615 cyl 40958 alt 2 hd 64 sec 64>
          /pci@8,600000/fibre-channel@2,1/sd@0,3
Specify disk (enter its number): Specify disk (enter its number):
helios [0] /usr/sbin/prtvtoc -h /dev/rdsk/c1t0d0s2
prtvtoc: /dev/rdsk/c1t0d0s2: Unable to read Disk geometry errno = 0x5
helios [1]

After the failed disk was swapped out for one of the same size,
'devfsadm -C' was run to clean up any dead device links, create new
ones, etc., followed by 'format' to verify the new disk was seen. While
it was picked up (disk 0 above), 'prtvtoc' suggests a problem, so back
to 'format'. The output below shows a simple label issue, so we check
the drive type and set it to auto configure (which shouldn't have been
necessary). After this, the disk is labeled with a standard SMI label
and verified with 'verify', showing the new partition table:
helios [1] /usr/sbin/format c1t0d0
c1t0d0: configured with capacity of 136.71GB
selecting c1t0d0
[disk formatted]


FORMAT MENU:
        disk       - select a disk
        type       - select (define) a disk type
        partition  - select (define) a partition table
        current    - describe the current disk
        format     - format and analyze the disk
        repair     - repair a defective sector
        label      - write label to the disk
        analyze    - surface analysis
        defect     - defect list management
        backup     - search for backup labels
        verify     - read and display labels
        save       - save new disk/partition definitions
        inquiry    - show vendor, product and revision
        volname    - set 8-character volume name
        !<cmd>     - execute <cmd>, then return
        quit
format> verify
Warning: Could not read primary label.
Warning: Could not read backup labels.
Warning: Check the current partitioning and 'label' the disk or use the
         'backup' command.
format> type


AVAILABLE DRIVE TYPES:
        0. Auto configure
        1. Quantum ProDrive 80S
        2. Quantum ProDrive 105S
        3. CDC Wren IV 94171-344
        4. SUN0104
        5. SUN0207
        6. SUN0327
        7. SUN0340
        8. SUN0424
        9. SUN0535
        10. SUN0669
        11. SUN1.0G
        12. SUN1.05
        13. SUN1.3G
        14. SUN2.1G
        15. SUN2.9G
        16. Zip 100
        17. Zip 250
        18. SUN146G
        19. other
Specify disk type (enter its number)[18]: 0
c1t0d0: configured with capacity of 136.71GB
<SUN146G cyl 14087 alt 2 hd 24 sec 848>
selecting c1t0d0
[disk formatted]
Disk not labeled.  Label it now? yes
format> verify

Primary label contents:

Volume name = <        >
ascii name  = <SUN146G cyl 14087 alt 2 hd 24 sec 848>
pcyl        = 14089
ncyl        = 14087
acyl        =    2
nhead       =   24
nsect       =  848
Part      Tag    Flag     Cylinders        Size            Blocks
  0       root    wm       0 -    12      129.19MB    (13/0/0)       264576
  1       swap    wu      13 -    25      129.19MB    (13/0/0)       264576
  2     backup    wu       0 - 14086      136.71GB    (14087/0/0) 286698624
  3 unassigned    wm       0                0         (0/0/0)            0
  4 unassigned    wm       0                0         (0/0/0)            0
  5 unassigned    wm       0                0         (0/0/0)            0
  6        usr    wm      26 - 14086      136.46GB    (14061/0/0) 286169472
  7 unassigned    wm       0                0         (0/0/0)            0
format> quit

With the disk labeled, we return to 'prtvtoc' to identify the current
slices on c1t1d0 and verify that the slice 2 sizes match between c1t1d0
and c1t0d0, the replaced disk. Since they match, 'prtvtoc' piped to
'fmthard' copies c1t1d0's VTOC over to c1t0d0, and the result is
verified with 'prtvtoc'.
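Rather than eyeballing the full tables, slice 2's sector count can be
pulled out directly. A minimal sketch, assuming the 'prtvtoc -h' column
order shown below (slice, tag, flags, first sector, sector count, last
sector):

/usr/sbin/prtvtoc -h /dev/rdsk/c1t1d0s2 | /usr/bin/awk '$1 == 2 {print $5}'
/usr/sbin/prtvtoc -h /dev/rdsk/c1t0d0s2 | /usr/bin/awk '$1 == 2 {print $5}'

Both should print 286698624. On with the copy and verification: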
helios [0] /usr/sbin/prtvtoc -h /dev/rdsk/c1t1d0s2
       0      2    00          0  10501632  10501631
       1      3    01   10583040   8323968  18907007
       2      5    00          0 286698624 286698623
       3      0    00   10501632     40704  10542335
       4      0    00   10542336     40704  10583039
       5      7    00   19029120  20982912  40012031
       6      4    00   40012032 246686592 286698623
helios [0] /usr/sbin/prtvtoc -h /dev/rdsk/c1t0d0s2
       0      2    00          0    264576    264575
       1      3    01     264576    264576    529151
       2      5    01          0 286698624 286698623
       6      4    00     529152 286169472 286698623
helios [0] /usr/sbin/prtvtoc -h /dev/rdsk/c1t1d0s2 | \
> /usr/sbin/fmthard -s - /dev/rdsk/c1t0d0s2
fmthard:  New volume table of contents now in place.
helios [0] /usr/sbin/prtvtoc -h /dev/rdsk/c1t0d0s2
       0      2    00          0  10501632  10501631
       1      3    01   10583040   8323968  18907007
       2      5    00          0 286698624 286698623
       3      0    00   10501632     40704  10542335
       4      0    00   10542336     40704  10583039
       5      7    00   19029120  20982912  40012031
       6      4    00   40012032 246686592 286698623

With c1t0d0 prepared, we add back in the state database replicas via
'metadb -a -c #', verifying with 'metadb', which shows them to be up to
date and active:
helios [0] /usr/sbin/metadb -a -c 3 /dev/dsk/c1t0d0s3
helios [0] /usr/sbin/metadb -a -c 3 /dev/dsk/c1t0d0s4
helios [0] /usr/sbin/metadb
        flags           first blk       block count
     a        u         16              8192            /dev/dsk/c1t0d0s3
     a        u         8208            8192            /dev/dsk/c1t0d0s3
     a        u         16400           8192            /dev/dsk/c1t0d0s3
     a        u         16              8192            /dev/dsk/c1t0d0s4
     a        u         8208            8192            /dev/dsk/c1t0d0s4
     a        u         16400           8192            /dev/dsk/c1t0d0s4
     a   p  luo        16              8192            /dev/dsk/c1t1d0s3
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s3
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s3
     a   p  luo        16              8192            /dev/dsk/c1t1d0s4
     a   p  luo        8208            8192            /dev/dsk/c1t1d0s4
     a   p  luo        16400           8192            /dev/dsk/c1t1d0s4

Now to recover the mirrors. Starting with mirror d2, we do an in-place
replacement via 'metareplace -e', which automatically begins resyncing
the mirror:
helios [0] /usr/sbin/metastat | /bin/grep 'metareplace'
    Invoke: metareplace d2 c1t0d0s0 <new device>
helios [0] /usr/sbin/metareplace -e d2 c1t0d0s0
d2: device c1t0d0s0 is replaced with c1t0d0s0
helios [0] /usr/sbin/metastat d2
d2: Mirror
    Submirror 0: d0
      State: Resyncing
    Submirror 1: d1
      State: Okay
    Resync in progress: 15 % done
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 10501632 blocks (5.0 GB)

d0: Submirror of d2
    State: Resyncing
    Size: 10501632 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s0          0     No      Resyncing   Yes

d1: Submirror of d2
    State: Okay
    Size: 10501632 blocks (5.0 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s0          0     No           Okay   Yes

Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w2000000c505ec040
c1t1d0   Yes    id1,ssd@w500000e011c5c0c0

Earlier I stated that I could have saved myself some trouble by removing
the broken side of the mirrors. Since I didn't, I get to fix it here.
The output from 'metastat' still shows a lot of "Needs maintenance" and
"Unavailable", which 'metareplace' won't work on:
helios [0] /usr/sbin/metastat | /usr/bin/egrep '^d|State:'
d11: Mirror
      State: Needs maintenance
      State: Okay
d9: Submirror of d11
    State: Unavailable
d10: Submirror of d11
    State: Okay
d8: Mirror
      State: Needs maintenance
      State: Okay
d6: Submirror of d8
    State: Unavailable
d7: Submirror of d8
    State: Okay
d5: Mirror
      State: Needs maintenance
      State: Okay
d3: Submirror of d5
    State: Unavailable
d4: Submirror of d5
    State: Okay
d2: Mirror
      State: Okay
      State: Okay
d0: Submirror of d2
    State: Okay
d1: Submirror of d2
    State: Okay

Since the submirrors (plexes) are in an erred state, we'll have to force
detach d9 from d11, d6 from d8, and d3 from d5, and subsequently clear
them from SVM. These detachments and clears are what I should have
done earlier:
helios [0] /usr/sbin/metadetach d11 d9
metadetach: helios: d11: attempt an operation on a submirror that has erred components
helios [1] /usr/sbin/metadetach -f d11 d9
d11: submirror d9 is detached
helios [0] /usr/sbin/metadetach -f d8 d6
d8: submirror d6 is detached
helios [0] /usr/sbin/metadetach -f d5 d3
d5: submirror d3 is detached
helios [0] /usr/sbin/metaclear d3
d3: Concat/Stripe is cleared
helios [0] /usr/sbin/metaclear d6
d6: Concat/Stripe is cleared
helios [0] /usr/sbin/metaclear d9
d9: Concat/Stripe is cleared

With the stale "erred" components removed, we can re-add those same
components (on the new disk) back into their respective mirrors via
'metainit' and 'metattach':
helios [0] /usr/sbin/metainit d3 1 1 c1t0d0s1
d3: Concat/Stripe is setup
helios [0] /usr/sbin/metainit d6 1 1 c1t0d0s5
d6: Concat/Stripe is setup
helios [0] /usr/sbin/metainit d9 1 1 c1t0d0s6
d9: Concat/Stripe is setup
helios [0] /usr/bin/metattach d11 d9
d11: submirror d9 is attached
helios [0] /usr/bin/metattach d8 d6
d8: submirror d6 is attached
helios [0] /usr/bin/metattach d5 d3
d5: submirror d3 is attached

Running 'metastat' on d9 shows the state is now "Okay" rather than
"Unavailable". On d11, we see that SVM is resyncing the data from d10
to d9:
helios [0] /usr/sbin/metastat d9
d9: Concat/Stripe
    Size: 246686592 blocks (117 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s6          0     No           Okay   Yes

Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w2000000c505ec040
helios [0] /usr/sbin/metastat d11
d11: Mirror
    Submirror 0: d9
      State: Resyncing
    Submirror 1: d10
      State: Okay
    Resync in progress: 1 % done
    Pass: 1
    Read option: roundrobin (default)
    Write option: parallel (default)
    Size: 246686592 blocks (117 GB)

d9: Submirror of d11
    State: Resyncing
    Size: 246686592 blocks (117 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t0d0s6          0     No           Okay   Yes

d10: Submirror of d11
    State: Okay
    Size: 246686592 blocks (117 GB)
    Stripe 0:
        Device     Start Block  Dbase        State Reloc Hot Spare
        c1t1d0s6          0     No           Okay   Yes

Device Relocation Information:
Device   Reloc  Device ID
c1t0d0   Yes    id1,ssd@w2000000c505ec040
c1t1d0   Yes    id1,ssd@w500000e011c5c0c0

Running 'metastat' on each of the other mirrors would show similar
output for
those mirrors. Without options, 'metastat' will show all mirrors and
plexes, allowing us to keep track of the resync operations until complete.
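To keep an eye on the progress without re-running 'metastat' by hand, a
quick poll also works; a minimal sketch in sh, looping until the resync
lines disappear from the output:

while /usr/sbin/metastat | /bin/grep 'Resync in progress'
do
        /usr/bin/sleep 60
done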
As a final step, since the faulted disk was a root disk, we still need
the ability to boot from it. Here we turn to 'installboot' to install
the boot blocks:
helios [0] /usr/sbin/installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c1t0d0s0
helios [0]

The end result of all of this is that the box is back to a healthy
state with all mirrors functional. Had I not overlooked a step, the
above could have been a little shorter. Still, it illustrates that
either way, the situation is recoverable.