21 October 2010

Removing / Recovering an Open File (Solaris)

While working with filesystems, one will inevitably have to deal with a
file that is quickly growing and becoming unmanageable.  Unfortunately,
this situation normally becomes evident because the space of a volume
has been nearly exhausted.  If retention is necessary, you could set up
the file for log rotation and compression.  If the data isn't crucial,
you could simply zero out the file or remove it.  The details below
concern a file removed while still open by a process, and that still
needs to be retained.  An easy example would be /var/adm/messages held
open by syslogd.  While the following uses Solaris 10, the same steps
should be applicable on Solaris 8 and 9, as well.

INFO

        host:           snorkle
        OS:             Solaris 10
        prompt:         snorkle [0]
        device:         c1t2d0s0
        file:           /mnt/logfiles/messages

DETAILS

To set up our test case above, I've simply run a tail in another terminal
against /mnt/logfiles/messages.  The purpose is to hold open a file
descriptor to the file, similar to how syslogd would (though syslogd
would hold it open for writes):

        snorkle [0] /usr/bin/tty
        /dev/pts/3
        snorkle [0] pwd
        /
        snorkle [0] /usr/bin/tail -f /mnt/logfiles/messages
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy

In the details that follow, our determined admin (Wendy) wouldn't
initially have knowledge of the above.  For reference, the following is
relevant to our admin who has just logged in:

        snorkle [0] /usr/bin/tty
        /dev/pts/1
        snorkle [0] pwd
        /

In checking out the filesystem, Wendy notes that the FS on c1t2d0s0 is
at 47%, not necessarily a problem, but she checks it out anyway:

        snorkle [0] /usr/sbin/df -h /mnt/logfiles
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t2d0s0      477M   198M   231M    47%    /mnt/logfiles
        snorkle [0] cd /mnt/logfiles
        snorkle [0] /usr/bin/ls -li
        total 404092
                 8 -rw-r--r--   1 root     root       71680 Oct 21 16:16 ftplog
                 4 -rw-r--r--   1 root     root      307200 Oct 21 16:15 httpd
                 3 drwx------   2 root     root        8192 Oct 21 16:11 lost+found/
                 6 -rw-r--r--   1 root     root      507904 Oct 21 16:16 maillog
                 5 -rw-r--r--   1 root     root     205541050 Oct 21 18:42 messages
                10 -rw-r--r--   1 root     root       23985 Oct 21 16:23 messages.0.gz
                 7 -rw-r--r--   1 root     root      282624 Oct 21 16:17 snmpd.log

Being the good admin that she is, upon seeing messages at 196 MB
(205541050 bytes), Wendy decides to rotate out the file.  Instead of
moving the file so it can be compressed, in her haste Wendy instead
removes the file:

        snorkle [0] /usr/bin/rm messages
        snorkle [0] /usr/bin/ls -li
        total 2412
                 8 -rw-r--r--   1 root     root       71680 Oct 21 16:16 ftplog
                 4 -rw-r--r--   1 root     root      307200 Oct 21 16:15 httpd
                 3 drwx------   2 root     root        8192 Oct 21 16:11 lost+found/
                 6 -rw-r--r--   1 root     root      507904 Oct 21 16:16 maillog
                10 -rw-r--r--   1 root     root       23985 Oct 21 16:23 messages.0.gz
                 7 -rw-r--r--   1 root     root      282624 Oct 21 16:17 snmpd.log

Well, that's a predicament.  A momentary lapse in thought before typing
and we see above that 'messages' has indeed been removed; so much for
rotating it out.  Well, at least it should have reduced the currently
used space on the filesystem:

        snorkle [0] /usr/sbin/df -h /mnt/logfiles
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t2d0s0      477M   198M   231M    47%    /mnt/logfiles

Huh, no good.  The used space hasn't diminished at all.  The 'fuser'
command will come in handy here (see note 1):

        snorkle [0] /usr/sbin/fuser -cu /mnt/logfiles
        /mnt/logfiles:     1073c(root)    1062o(root)     499c(root)

Our 'fuser' output shows three active processes under /mnt/logfiles, one
of them may be our problem by holding open 'messages'.  Also, we see that
the three processes are euid root in the parens '()' following the PIDs.
Of note, the options passed to 'fuser' and the characters after the PIDs:

        options:
        -c      reports on a mount point and the files contained within
        -u      reports the user ID in parens following the PIDs

        output chars:
        c       the process is using the file as its current directory
        o       the process is holding open a file

Supporting the output from 'fuser', Wendy runs 'ps' to determine any
further information about the processes:

        snorkle [0] /usr/bin/ps -f -p "$(/usr/sbin/fuser -c /mnt/logfiles 2>/dev/null)"
             UID   PID  PPID   C    STIME TTY         TIME CMD
            root   499   493   0 15:13:00 pts/1       2:33 /bin/ksh
            root  1062   704   0 18:43:52 pts/3       0:00 /usr/bin/tail -f /mnt/logfiles/messages

The 'ps' above requests a full listing (-f) for the selected PIDs (-p),
using 'fuser' output for the PID selection.  (Redirecting STDERR from
'fuser' results in only the PIDs being displayed, thus being usable for
input to 'ps'.)  Since we saw earlier that Wendy is logged into tty1,
we can disregard PID 499.  Next, it seems that a process run by 'root'
with a PID of 1062 is running a 'tail' against 'messages'.  To quote
Adam Savage, "well that's your problem."  For our purposes, we'll
see why this isn't actually a problem, but instead saves the day.
In UNIX, if a file is still held open when it is removed, the filename
and inode are cleared.  Until the process holding open said file closes
the file's file descriptor, the space consumed by the file still remains.
It is also potentially possible to recover that file.

Using 'pfiles', Wendy checks out any opened files by process 1062:

        snorkle [0] /usr/bin/pfiles /proc/1062
        1062:   /usr/bin/tail -f /mnt/logfiles/messages
          Current rlimit: 256 file descriptors
           0: S_IFREG mode:0644 dev:32,128 ino:5 uid:0 gid:0 size:205541050
              O_RDONLY|O_LARGEFILE
           1: S_IFCHR mode:0620 dev:287,0 ino:12582922 uid:0 gid:7 rdev:24,3
              O_RDWR|O_NOCTTY|O_LARGEFILE 
              /devices/pseudo/pts@0:3
           2: S_IFCHR mode:0620 dev:287,0 ino:12582922 uid:0 gid:7 rdev:24,3
              O_RDWR|O_NOCTTY|O_LARGEFILE 
              /devices/pseudo/pts@0:3

Alright, based on the output, entry '0' is our file.  It has the same
inode number (ino:5) and file size (size:205541050) that we saw in our
earlier directory listing prior to the file's removal.  (If the file
descriptor is still being written to, the file size will have increased
since the previous 'ls' output.)  Since the file descriptor is still
open on what used to be 'messages', we can still access the file contents
in memory thanks to the /proc pseudo filesystem:

        snorkle [0] /usr/bin/ls -l /proc/1062/fd    
        total 401680
        -r--r--r--   0 root     root     205541050 Oct 21 18:42 0
        c---------   1 root     tty       24,  3 Oct 21 18:43 1
        c---------   1 root     tty       24,  3 Oct 21 18:43 2

Above, Wendy runs 'ls' on the file descriptors directory for process
1062.  Lo and behold, our file is still there only as '0', which we
identified with 'pfiles'.  (The other two files are character devices
relevant to the tty associated with the 'tail' process.)  Time to recover
it before we kill that 'tail' process:

        snorkle [0] /usr/bin/cp /proc/1062/fd/0 /tmp/messages.recov
        snorkle [0] /usr/bin/ls -l /tmp/messages.recov
        -r--r--r--   1 root     root     205541050 Oct 21 18:51 /tmp/messages.recov
        snorkle [0] /usr/bin/tail -10 /tmp/messages.recov
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boy
        all work and no play makes jack a dull boysnorkle [0]

It looks like our file is good.  The output of messages.recov corresponds
to what we expect for this file as well as the original output we saw
from the initial setup.  Cool, now that we recovered 'messages', it's time
for cleanup.  Assuming the process that was holding open this file was a
daemon, the file 'messages' may or may not be recreated during a restart
or a HUP of the process.  For the sake of tidiness, recreate the file.
(Of note, you may need to reset the appropriate permissions on the file,
which we already gathered from our 'pfiles' command):

        snorkle [0] /usr/bin/touch messages
        snorkle [0] /usr/bin/ls -l messages
        -rw-r--r--   1 root     root           0 Oct 21 18:55 messages

Since it's no longer needed and is tailing a "non-existent" file anyway,
we can 'kill' the 'tail' process from earlier:

        snorkle [0] /usr/bin/kill 1062
        snorkle [0] /usr/bin/ls /proc/1062
        /proc/1062: No such file or directory
        snorkle [2] /usr/bin/df -h /mnt/logfiles
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t2d0s0      477M   2.2M   428M     1%    /mnt/logfiles

The 'tail' process is now dead, the /proc directory for the process is
gone, and our 'df' of the filesystem on c1t2d0s0 shows the reclaimed
space.  Below, Wendy compresses the recovered messages file in /tmp
and moves it to its rightful place under /mnt/logfiles:

        snorkle [0] /usr/bin/gzip /tmp/messages.recov
        snorkle [0] /usr/bin/ls -l /tmp/messages.recov.gz
        -r--r--r--   1 root     root      598060 Oct 21 18:51 /tmp/messages.recov.gz
        snorkle [0] /usr/bin/mv /tmp/messages.recov.gz /mnt/logfiles/.
        snorkle [0] /usr/bin/ls -li
        total 3612
                 8 -rw-r--r--   1 root     root       71680 Oct 21 16:16 ftplog
                 4 -rw-r--r--   1 root     root      307200 Oct 21 16:15 httpd
                 3 drwx------   2 root     root        8192 Oct 21 16:11 lost+found/
                 6 -rw-r--r--   1 root     root      507904 Oct 21 16:16 maillog
                 5 -rw-r--r--   1 root     root           0 Oct 21 19:00 messages
                10 -rw-r--r--   1 root     root       23985 Oct 21 16:23 messages.0.gz
                 9 -r--r--r--   1 root     root      598060 Oct 21 18:51 messages.recov.gz
                 7 -rw-r--r--   1 root     root      282624 Oct 21 16:17 snmpd.log
        snorkle [0] /usr/bin/df -h /mnt/logfiles
        Filesystem             size   used  avail capacity  Mounted on
        /dev/dsk/c1t2d0s0      477M   2.8M   427M     1%    /mnt/logfiles

A few closing notes.  The above recovery was possible only because another
process held open a file descriptor to the file removed.  Had that not
been the case, file recovery would have been much more difficult, if
not impossible.  Additionally, the setup shows one reason why removing
a file may not initially reclaim filesystem space.

Note 1:  Thanks to Blog O' Matty for noting the 'fuser trick' picked up
    from Pete Shanahan.

Note 2:  Mike Tremell's 'file recovery by inode number' somewhat inspired
    this particular post.  Have a look.

see also:
    Find Open Files in Solaris
    Removing / Recovering an Open File (Linux)