12 January 2013

Repartitioning Contiguous Space in Solaris

Recently, a system owner asked me for help because they had run out of space on one of their filesystems (FS), which in turn was causing problems with the application running on it. Unfortunately, this FS held database files, so we couldn't simply remove or compress the stored files. For a variety of reasons, the solution we decided upon was to repartition the disk in question, leaving the configured filesystems intact and growing the fully consumed FS into space reclaimed from the next adjacent partition. I've recreated this scenario here with our host details being:
        HOST:           sunspot
        PROMPTS:        multi user:     sunspot [0]
                        single user:    sunspot-cons [0]
        OS:             Solaris 10 x86
        FS TYPE:        UFS
        DISK:           c1t1d0
        NOTE:           I've used the following (with minor changes)
                        with no issues on previous versions of Solaris
                        (sparc and x86) with UFS filesystems as well.

Before continuing, a similar resolution to this scenario would have been to use SVM (Solaris Volume Manager) to concatenate the original partitioned FS with another partition (adjacent or not) and simply grow the FS. Since SVM isn't otherwise used on this host and the partition adjacent to the original was available, reclaiming the adjacent partition was chosen instead.

Starting things off, our application resides solely on disk "c1t1d0" on partitions "s0" and "s3", mounted under "/opt/myapp". There is also a swap partition at "c1t1d0s1". Below, our 'df' output shows the FS in question on "c1t1d0s0" (at 100% capacity), our swap volumes, and the partition table on "c1t1d0" as reported by 'format':
        sunspot [0] /bin/df -h | /bin/egrep 'File|/dev/dsk'
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t0d0s0       14G   3.5G    10G    26%    /
        /dev/dsk/c1t0d0s3      996M    66M   871M     8%    /var
        /dev/dsk/c1t1d0s3      482M   109M   324M    26%    /opt/myapp/bin
        /dev/dsk/c1t1d0s0      482M   482M     0K   100%    /opt/myapp/logs
        sunspot [0] /usr/sbin/swap -l
        swapfile             dev  swaplo blocks   free
        /dev/dsk/c1t0d0s1   32,1       8 2104504 2104504
        /dev/dsk/c1t1d0s1   32,129      8 1048568 1048568
        sunspot [0] echo "verify" | /usr/sbin/format c1t1d0
        selecting c1t1d0
        [disk formatted]
        Warning: Current Disk has mounted partitions.
        /dev/dsk/c1t1d0s0 is currently mounted on /opt/myapp/logs. Please see umount(1M).
        /dev/dsk/c1t1d0s1 is currently used by swap. Please see swap(1M).
        /dev/dsk/c1t1d0s3 is currently mounted on /opt/myapp/bin. Please see umount(1M).


        FORMAT MENU:
        <snip...>
        format>
        Primary label contents:

        Volume name = <        >
        ascii name  = <DEFAULT cyl 2044 alt 2 hd 128 sec 32>
        pcyl        = 2046
        ncyl        = 2044
        acyl        =    2
        bcyl        =    0
        nhead       =  128
        nsect       =   32
        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  256      512.00MB    (256/0/0)  1048576
          1 unassigned    wm     257 -  512      512.00MB    (256/0/0)  1048576
          2     backup    wu       0 - 2043        3.99GB    (2044/0/0) 8372224
          3 unassigned    wm     513 -  768      512.00MB    (256/0/0)  1048576
          4 unassigned    wm       0               0         (0/0/0)          0
          5 unassigned    wm       0               0         (0/0/0)          0
          6 unassigned    wm       0               0         (0/0/0)          0
          7 unassigned    wm       0               0         (0/0/0)          0
          8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
          9 unassigned    wm       0               0         (0/0/0)          0

        format>
        sunspot [1]
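For the curious, the sizes and block counts in that table follow directly from the geometry: nhead (128) times nsect (32) gives the sectors per cylinder, and each block is 512 bytes. A quick sketch of the arithmetic for slice 0, runnable in any POSIX shell:

```shell
# Sectors per cylinder, from the 'format' geometry above: nhead * nsect.
SEC_PER_CYL=$((128 * 32))                        # 4096
# Slice 0 spans cylinders 1-256 inclusive, i.e. 256 cylinders.
BLOCKS=$((256 * SEC_PER_CYL))
echo "$BLOCKS"                                   # 1048576, matching the table
# At 512 bytes per block, that works out to the table's 512.00MB.
echo "$((BLOCKS * 512 / 1024 / 1024))"           # 512
```

The same arithmetic applied to the backup slice (2044 cylinders) yields 8372224 blocks, which is the 3.99GB shown for slice 2.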

Since "c1t1d0s1" is swap space, and in this case, we have additional space at the end of the disk, we are going to repartition this disk. The result will be that partition "c1t1d0s1" is removed, "c1t1d0s0" will be grown to encompass the space previously held by "c1t1d0s1", and "c1t1d0s4" will be partitioned to the same size as the original "c1t1d0s1" so that we can retain the swap allocation. To prepare for this, use your favorite editor (vi?) to update the entries for "c1t1d0" in "/etc/vfstab" from this:
        sunspot [1] /bin/grep c1t1 /etc/vfstab
        /dev/dsk/c1t1d0s1       -       -       swap    -       no      -
        /dev/dsk/c1t1d0s0       /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
        /dev/dsk/c1t1d0s3       /dev/rdsk/c1t1d0s3      /opt/myapp/bin  ufs     2       yes     -
        sunspot [0]

to this:
        sunspot [0] /bin/grep c1t1 /etc/vfstab
        #/dev/dsk/c1t1d0s1      -       -       swap    -       no      -
        #/dev/dsk/c1t1d0s0      /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
        #/dev/dsk/c1t1d0s3      /dev/rdsk/c1t1d0s3      /opt/myapp/bin  ufs     2       yes     -
        sunspot [0]
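If you'd rather script that edit, it can be previewed on a scratch copy first. A sketch (the sample file and "vfstab.new" names are hypothetical; run this against a copy, never the live "/etc/vfstab", until you've eyeballed the result):

```shell
# Build a small sample to preview the edit on, rather than the live file.
cat > vfstab.sample <<'EOF'
/dev/dsk/c1t1d0s1       -       -       swap    -       no      -
/dev/dsk/c1t1d0s0       /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
/dev/dsk/c1t0d0s0       /dev/rdsk/c1t0d0s0      /       ufs     1       no      -
EOF
# Prefix '#' to every c1t1d0 line; entries for other disks are untouched.
sed '/c1t1d0/s/^/#/' vfstab.sample > vfstab.new
grep -c '^#' vfstab.new        # 2
```

Once the preview looks right, the same 'sed' expression can be applied to a backup copy of the real file.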

Once complete, we'll need to reboot into "single user mode". This ensures that the "c1t1d0" filesystems are not in use and removes "c1t1d0s1" from swap usage. (If you can otherwise stop I/O to the "c1t1d0" filesystems and remove "c1t1d0s1" from swap usage, you could technically do this in "multi user mode".):
        sunspot [0] /usr/sbin/reboot -- -smverbose
        Jan 12 15:41:10 sunspot reboot: rebooted by root
        updating /platform/i86pc/boot_archive
        syncing file systems... done
        Rebooting...

Of note regarding the reboot above, the parameters I've passed to 'reboot' are "-smverbose". This breaks down to:
        -s              single user mode
        -m verbose      Solaris should detail the boot process

I've simply run the parameters together to produce "-smverbose". (In Solaris versions prior to Solaris 10, the command would have been '/usr/sbin/reboot -- -sv'.) Once the host starts to reboot, a new GRUB entry, "Solaris_reboot_transient", is available and highlighted. This is a temporary entry generated from the options we passed to 'reboot', and it is the entry we need to boot from. (If this were a sparc host, rather than a GRUB menu you would see the normal OpenBoot process and wouldn't need to do anything here.):
         GNU GRUB  version 0.97  (639K lower / 1571776K upper memory)

        Solaris 10 10/09 s10x_u8wos_08a X86
        Solaris failsafe
        Solaris_reboot_transient                <====================


           Use the <up> and <down> keys to select which entry is highlighted.
           Press enter to boot the selected OS, 'e' to edit the
           commands before booting, or 'c' for a command-line.


        The highlighted entry will be booted automatically in 10 seconds.

Once the screen has refreshed, we should start to see the Solaris boot process (sparc and x86):
        SunOS Release 5.10 Version Generic_141445-09 64-bit.
        Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
        Use is subject to license terms.
        NOTICE: MPO disabled because memory is interleaved

        Booting to milestone "milestone/single-user:default".
        [ network/pfil:default starting (packet filter) ]
        [ network/loopback:default starting (loopback network interface) ]
        <snip...>
        [ milestone/single-user:default starting (single-user milestone) ]
        Requesting System Maintenance Mode
        SINGLE USER MODE

        Root password for system maintenance (control-d to bypass): _

Log in at the prompt above. Once logged in, I've simply reset PS1 for the clarity of this writeup:
        single-user privilege assigned to /dev/console.
        Entering System Maintenance Mode

        Jan 12 15:42:50 su: 'su root' succeeded for root on /dev/console
        Sun Microsystems Inc.   SunOS 5.10      Generic January 2005
        # PS1=`/usr/bin/hostname`'-cons [$?] '
        sunspot-cons [0]

At this point, we get to repartition "c1t1d0". It must be stressed: before we do any repartitioning, one of the following must be true: the adjacent partition has been evacuated of its data (if it contained an FS), the adjacent partition was in use but can be reclaimed (as with our swap slice here), or unused contiguous space exists immediately after the partition we're growing (c1t1d0s0). If none of these holds and you proceed any further, you risk data loss.

Moving along, I've called 'format c1t1d0' to operate on that disk, moved into the "partition" submenu, and verified the existing partition table via "print":
        sunspot-cons [0] /usr/sbin/format c1t1d0
        selecting c1t1d0
        [disk formatted]


        FORMAT MENU:
        <snip...>
        format> partition


        PARTITION MENU:
                0      - change `0' partition
        <snip...>
        partition> print
        Current partition table (original):
        Total disk cylinders available: 2044 + 2 (reserved cylinders)

        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  256      512.00MB    (256/0/0)  1048576
          1 unassigned    wm     257 -  512      512.00MB    (256/0/0)  1048576
          2     backup    wu       0 - 2043        3.99GB    (2044/0/0) 8372224
          3 unassigned    wm     513 -  768      512.00MB    (256/0/0)  1048576
          4 unassigned    wm       0               0         (0/0/0)          0
          5 unassigned    wm       0               0         (0/0/0)          0
          6 unassigned    wm       0               0         (0/0/0)          0
          7 unassigned    wm       0               0         (0/0/0)          0
          8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
          9 unassigned    wm       0               0         (0/0/0)          0

Our first order of business is to zero out slice 1 (c1t1d0s1, the swap partition). After that, I've selected slice 0, retaining the starting cylinder (1), but changing the partition size to end on cylinder 512 (512e), just as slice 1 originally did. (In this context, I'm using slice and partition interchangeably since Solaris normally refers to slices but 'format' likes calling them partitions.) Finally, I've reconfigured our swap space to slice 4 at the same size as "c1t1d0s1", setting the starting cylinder to be "769" (the next available) and the size to be "512m":
        partition> 1
        Part      Tag    Flag     Cylinders        Size            Blocks
          1 unassigned    wm     257 -  512      512.00MB    (256/0/0)  1048576

        Enter partition id tag[unassigned]:
        Enter partition permission flags[wm]:
        Enter new starting cyl[257]: 0
        Enter partition size[1048576b, 256c, 255e, 512.00mb, 0.50gb]: 0
        partition> 0
        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  256      512.00MB    (256/0/0)  1048576

        Enter partition id tag[unassigned]:
        Enter partition permission flags[wm]:
        Enter new starting cyl[1]:
        Enter partition size[1048576b, 256c, 256e, 512.00mb, 0.50gb]: 512e
        partition> 4
        Part      Tag    Flag     Cylinders        Size            Blocks
          4 unassigned    wm       0               0         (0/0/0)          0

        Enter partition id tag[unassigned]:
        Enter partition permission flags[wm]:
        Enter new starting cyl[0]: 769
        Enter partition size[0b, 0c, 769e, 0.00mb, 0.00gb]: 512m
        partition>
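Before labeling, it's worth double-checking that none of the data slices in the new layout overlap. A hedged sketch of that check with portable POSIX tools (the scratch file name is made up; slices 2 and 8 overlap the others by design and are excluded):

```shell
# The three data slices in the new layout: slice, start cyl, end cyl.
cat > layout.txt <<'EOF'
0 1 512
3 513 768
4 769 1024
EOF
# Sorted by starting cylinder, each slice must begin after the previous
# one ends; any violation is flagged as an overlap.
sort -k2 -n layout.txt |
    awk 'prev != "" && $2 <= prev {bad = 1}
         {prev = $3}
         END {print (bad ? "overlap" : "layout ok"); exit bad}'
```

For a table this small the check is easy by eye, but on a busier disk it catches the classic mistake of a new slice's starting cylinder landing inside its neighbor.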

Below, I've confirmed our new partition table (print), labeled the disk (written the new partition table to disk), and quit out of the "partition" submenu and 'format' (see note 0):
        partition> print
        Current partition table (unnamed):
        Total disk cylinders available: 2044 + 2 (reserved cylinders)

        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  512     1024.00MB    (512/0/0)  2097152
          1 unassigned    wm       0               0         (0/0/0)          0
          2     backup    wu       0 - 2043        3.99GB    (2044/0/0) 8372224
          3 unassigned    wm     513 -  768      512.00MB    (256/0/0)  1048576
          4 unassigned    wm     769 - 1024      512.00MB    (256/0/0)  1048576
          5 unassigned    wm       0               0         (0/0/0)          0
          6 unassigned    wm       0               0         (0/0/0)          0
          7 unassigned    wm       0               0         (0/0/0)          0
          8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
          9 unassigned    wm       0               0         (0/0/0)          0

        partition> label
        Ready to label disk, continue? yes

        partition> quit
        format> quit
        sunspot-cons [0]

Since we've removed "c1t1d0s1", we need to update "/etc/vfstab" to set our swap volume to be "c1t1d0s4" and uncomment our "c1t1d0" entries. Using your favorite text editor (vi?), update "/etc/vfstab" from this:
        sunspot-cons [0] /bin/grep c1t1 /etc/vfstab
        #/dev/dsk/c1t1d0s1      -       -       swap    -       no      -
        #/dev/dsk/c1t1d0s0      /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
        #/dev/dsk/c1t1d0s3      /dev/rdsk/c1t1d0s3      /opt/myapp/bin  ufs     2       yes     -

to this:
        sunspot-cons [0] /bin/grep c1t1 /etc/vfstab
        /dev/dsk/c1t1d0s4       -       -       swap    -       no      -
        /dev/dsk/c1t1d0s0       /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
        /dev/dsk/c1t1d0s3       /dev/rdsk/c1t1d0s3      /opt/myapp/bin  ufs     2       yes     -
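As before, that edit can be scripted and previewed on a scratch copy: strip the leading '#' from the "c1t1d0" entries and retarget the swap entry from "s1" to "s4". A sketch (file names hypothetical; verify against a copy before touching the live file):

```shell
# Sample of the commented-out entries from the earlier edit.
cat > vfstab.sample <<'EOF'
#/dev/dsk/c1t1d0s1      -       -       swap    -       no      -
#/dev/dsk/c1t1d0s0      /dev/rdsk/c1t1d0s0      /opt/myapp/logs ufs     2       yes     -
EOF
# Uncomment the c1t1d0 lines and move swap from slice 1 (now gone) to slice 4.
sed -e '/c1t1d0/s/^#//' -e 's/c1t1d0s1/c1t1d0s4/g' vfstab.sample > vfstab.new
grep c1t1d0s4 vfstab.new       # the swap entry, now on s4
```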

While there shouldn't be any issue, run 'fsck' against "/dev/rdsk/c1t1d0s0" to ensure that our contained FS is still sane:
        sunspot-cons [0] /usr/sbin/fsck -y /dev/rdsk/c1t1d0s0
        ** /dev/rdsk/c1t1d0s0
        ** Last Mounted on /opt/myapp/logs
        ** Phase 1 - Check Blocks and Sizes
        ** Phase 2 - Check Pathnames
        ** Phase 3a - Check Connectivity
        ** Phase 3b - Verify Shadows/ACLs
        ** Phase 4 - Check Reference Counts
        ** Phase 5 - Check Cylinder Groups
        38163 files, 492097 used, 166 free (166 frags, 0 blocks, 0.0% fragmentation)

If our FS checks out to be clean (which it does), below we 'mount' the "s0" FS to "/opt/myapp/logs", run 'df' to see the existing size and usage, extend the UFS FS on "s0" using 'growfs', and finally recheck our size and usage via 'df':
        sunspot-cons [0] /usr/sbin/mount /opt/myapp/logs
        sunspot-cons [0] /bin/df -h /opt/myapp/logs
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t1d0s0      482M   482M     0K   100%    /opt/myapp/logs
        sunspot-cons [0] /usr/sbin/growfs -M /opt/myapp/logs /dev/rdsk/c1t1d0s0
        /dev/rdsk/c1t1d0s0:     2097152 sectors in 512 cylinders of 128 tracks, 32 sectors
                1024.0MB in 32 cyl groups (16 c/g, 32.00MB/g, 15360 i/g)
        super-block backups (for fsck -F ufs -o b=#) at:
         32, 65600, 131168, 196736, 262304, 327872, 393440, 459008, 524576, 590144,
         1442528, 1508096, 1573664, 1639232, 1704800, 1770368, 1835936, 1901504,
         1967072, 2032640
        sunspot-cons [0] /bin/df -h /opt/myapp/logs
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t1d0s0      963M   482M   434M    53%    /opt/myapp/logs
        sunspot-cons [0] /bin/ls /opt/myapp/logs
        acct          etc           log           patch         sm.bin
        aculog        exacct        lost+found    pkg           softinfo
        adm           fm            lp            pool          spellhist
        <snip...>

Above, the second 'df' shows that our FS has indeed been extended to consume the space reclaimed from "c1t1d0s1", and the 'ls' confirms our files are still accessible. Below, we 'umount' "/opt/myapp/logs" and run 'fsck' once more on "c1t1d0s0" to verify that the FS is still sane after extending it with 'growfs':
        sunspot-cons [0] /usr/sbin/umount /opt/myapp/logs
        sunspot-cons [0] /usr/sbin/fsck -y /dev/rdsk/c1t1d0s0
        ** /dev/rdsk/c1t1d0s0
        ** Last Mounted on /opt/myapp/logs
        ** Phase 1 - Check Blocks and Sizes
        ** Phase 2 - Check Pathnames
        ** Phase 3a - Check Connectivity
        ** Phase 3b - Verify Shadows/ACLs
        ** Phase 4 - Check Reference Counts
        ** Phase 5 - Check Cylinder Groups
        38163 files, 492097 used, 493478 free (166 frags, 61664 blocks, 0.0% fragmentation)
        sunspot-cons [0] /usr/sbin/reboot
        syncing file systems... done
        Rebooting...

Provided our FS is clean (which it is), above we reboot the host (no parameters this time) so that Solaris comes up in "multi user mode". When we get our GRUB menu, select the normal boot option this time:
         GNU GRUB  version 0.97  (639K lower / 1571776K upper memory)

        Solaris 10 10/09 s10x_u8wos_08a X86     <====================
        Solaris failsafe
        Solaris_reboot_transient


           Use the <up> and <down> keys to select which entry is highlighted.
           Press enter to boot the selected OS, 'e' to edit the
           commands before booting, or 'c' for a command-line.


        The highlighted entry will be booted automatically in 10 seconds.

(screen refresh)
        SunOS Release 5.10 Version Generic_141445-09 64-bit.
        Copyright 1983-2009 Sun Microsystems, Inc.  All rights reserved.
        Use is subject to license terms.
        NOTICE: MPO disabled because memory is interleaved

        Hostname: sunspot
        /dev/rdsk/c1t1d0s0 is clean
        /dev/rdsk/c1t1d0s3 is clean
        Reading ZFS config: done.

        sunspot console login: _

With our host now back online, we simply verify that "c1t1d0s4" is allocated to swap using 'swap -l' and that our "c1t1d0" filesystems are mounted:
        sunspot [0] /usr/sbin/swap -l
        swapfile             dev  swaplo blocks   free
        /dev/dsk/c1t0d0s1   32,1       8 2104504 2104504
        /dev/dsk/c1t1d0s4   32,132      8 1048568 1048568
        sunspot [0] /bin/df -h | /bin/egrep 'File|c1t1'
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t1d0s0      963M   482M   386M    56%    /opt/myapp/logs
        /dev/dsk/c1t1d0s3      482M   109M   324M    26%    /opt/myapp/bin
        sunspot [0]

Our work is now complete. For the curious, the reason we can re-lay out the partition table in this manner is that a slice (partition) can be treated as simply a container for the FS it holds. Provided you are not reducing the size of the slice (container) or shifting the slice boundaries into space occupied by an FS, your FS is not otherwise impacted and remains sane.
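That container idea can be demonstrated with a toy sketch (scratch file and offsets are made up; this uses plain 'dd' on a file, not a real device): data written at an absolute offset stays put when only the slice's recorded length changes.

```shell
# Toy model: the "disk" is a scratch file; a slice is just an offset
# plus a length into it.
dd if=/dev/zero of=disk.img bs=1024 count=100 2>/dev/null
# The "filesystem" writes its data at the slice's absolute offset.
printf 'FSDATA' | dd of=disk.img bs=1 seek=10240 conv=notrunc 2>/dev/null
# "Growing" the slice changes only its recorded length; the start is
# unchanged, so re-reading the same absolute offset finds the data intact.
dd if=disk.img bs=1 skip=10240 count=6 2>/dev/null; echo
```

Shrinking the slice, or moving its starting offset, is exactly the case where this guarantee breaks down, which is why the warnings above matter.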


Note 0:
  The 'format' command performs a check of devices, including entries in
  "/etc/vfstab". If we didn't comment out the "c1t1d0" entries, 'format'
  would have complained about them:
        sunspot-cons [0] /usr/sbin/format c1t1d0
        selecting c1t1d0
        [disk formatted]
        /dev/dsk/c1t1d0s0 is normally mounted on /opt/myapp/logs according to /etc/vfstab. \
          Please remove this entry to use this device.
        /dev/dsk/c1t1d0s1 is normally mounted on  according to /etc/vfstab. Please remove \
          this entry to use this device.
        /dev/dsk/c1t1d0s3 is normally mounted on /opt/myapp/bin according to /etc/vfstab. \
          Please remove this entry to use this device.


        FORMAT MENU:
        <snip...>
        format> partition

        PARTITION MENU:
                0      - change `0' partition
        <snip...>
        partition> print
        Current partition table (original):
        Total disk cylinders available: 2044 + 2 (reserved cylinders)

        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  256      512.00MB    (256/0/0)  1048576
          1 unassigned    wm     257 -  512      512.00MB    (256/0/0)  1048576
          2     backup    wu       0 - 2043        3.99GB    (2044/0/0) 8372224
          3 unassigned    wm     513 -  768      512.00MB    (256/0/0)  1048576
          4 unassigned    wm       0               0         (0/0/0)          0
          5 unassigned    wm       0               0         (0/0/0)          0
          6 unassigned    wm       0               0         (0/0/0)          0
          7 unassigned    wm       0               0         (0/0/0)          0
          8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
          9 unassigned    wm       0               0         (0/0/0)          0

        partition> 1
        <snip...>

  The above partition table is our original setup. After our
  modifications, seen below, since 'format' knows that we have
  "/etc/vfstab" entries related to "c1t1d0", it will not allow us to
  label the disk with the new table:
        partition> print
        Current partition table (unnamed):
        Total disk cylinders available: 2044 + 2 (reserved cylinders)

        Part      Tag    Flag     Cylinders        Size            Blocks
          0 unassigned    wm       1 -  512     1024.00MB    (512/0/0)  2097152
          1 unassigned    wm       0               0         (0/0/0)          0
          2     backup    wu       0 - 2043        3.99GB    (2044/0/0) 8372224
          3 unassigned    wm     513 -  768      512.00MB    (256/0/0)  1048576
          4 unassigned    wm     769 - 1024      512.00MB    (256/0/0)  1048576
          5 unassigned    wm       0               0         (0/0/0)          0
          6 unassigned    wm       0               0         (0/0/0)          0
          7 unassigned    wm       0               0         (0/0/0)          0
          8       boot    wu       0 -    0        2.00MB    (1/0/0)       4096
          9 unassigned    wm       0               0         (0/0/0)          0

        partition> label
        Cannot label disk when partitions are in use as described.
        partition> quit
        format> quit
        sunspot-cons [0]

  It is for that reason that we commented out the entries in
  "/etc/vfstab".


see also:
    Repartitioning Contiguous Space in Linux

4 comments:

Trixter said...

Is there any particular reason why you didn't implement ZFS instead? If you had, you wouldn't have these issues.

troy said...

Trixter,

I completely agree, ZFS would have negated this issue and the end resolution.

The original host (to which this post refers) was configured and is managed to a strict adherence towards a particular vendor's requirements. When the requirements were initially presented, I suggested to the system owner the usage of ZFS. Unfortunately, that was rejected by the vendor as they didn't support their application on ZFS. In order to maintain the support of the application, the configuration needed to adhere to their requirements, so sadly, no ZFS.

-troy

Trixter said...

Thanks for the clarification. When I've been presented with similar situations in the past, I've ignored similar vendor requirements :-)

troy said...

Trixter,

Not a problem. Unfortunately, this was one of those times when we weren't permitted to ignore the requirements, regardless of an improved configuration.

-troy