31 October 2011

Fixing an Overly Eager chown in Linux

A while ago, someone asked me how to recover from a mistyped recursive
'chown'.  They had mistyped the path, so it executed against the root
FS (/), though they caught it before it acted on everything.  Ideally,
one would have a backup to recover from; however, that wasn't an option
in either the original situation or the one detailed herein.
Our host details are:
        HOSTs:          cobblepot, adler
        PROMPT:         HOST [0]
        OS:             CentOS 6.0
        NOTE:           The following should reasonably work on prior
                        versions of CentOS (or Red Hat based distros).
Before proceeding, note that this is a fairly long write-up.  The
situation presented is tedious, though not overly difficult.  Now some
background: we have user "winston" who is a member of group "appuser"
and needs to be able to maintain application files under "/usr/app1.0".
In fact, that directory and its contents should be owned by user:group
"appuser:appuser".  Being the good sysadmin, we log in to the host,
check the UID:GID for "winston" and "appuser", and recursively 'chown'
the "/usr/app1.0" directory to "appuser:appuser":
        cobblepot [0] pwd
        /
        cobblepot [0] /usr/bin/whoami
        root
        cobblepot [0] /usr/bin/who am i
        root     pts/0        2011-10-30 15:03 (glados-vb)
        cobblepot [0] /bin/find /usr/app1.0 -exec /bin/ls -ld  {} \;
        drwxr-xr-x. 3 root root 4096 Oct 30 15:04 /usr/app1.0
        drwxr-xr-x. 2 root root 4096 Oct 30 15:04 /usr/app1.0/bin
        -r-xr--r--. 1 root root 4922 Oct 30 15:04 /usr/app1.0/bin/audit-up.pl
        -r-xr--r--. 1 root root 77151 Oct 30 15:04 /usr/app1.0/bin/audit.soc
        -r-xr--r--. 1 root root 592 Oct 30 15:04 /usr/app1.0/bin/audit-inc.ksh
        -r-xr-xr-x. 1 root root 13433 Oct 30 15:04 /usr/app1.0/bin/lprtdiag.pl
        <snip...>
        cobblepot [0] id -a winston
        uid=501(winston) gid=500(appuser) groups=500(appuser)
        cobblepot [0] id -a appuser
        uid=500(appuser) gid=500(appuser) groups=500(appuser)
        cobblepot [0] /bin/chown -R 500:500 / usr/app1.0
        /bin/chown: changing ownership of `/proc/sys/kernel/sched_child_runs_first': Operation not permitted
        /bin/chown: changing ownership of `/proc/sys/kernel/sched_min_granularity_ns': Operation not permitted
        /bin/chown: changing ownership of `/proc/sys/kernel/sched_latency_ns': Operation not permitted
        <snip...>
        /bin/chown: changing ownership of `/proc/1300/io': Permission denied
        /bin/chown: changing ownership of `/proc/1300': Operation not permitted
        cobblepot [1]
This is a problem: there's a space in our directory path, so 'chown'
saw two separate directories to execute against, "/" and "usr/app1.0",
rather than "/usr/app1.0".  Since the 'chown' ran to completion,
every file on the system was modified and over 1000 errors were produced.
We can see some of the damage below via 'ls':
        cobblepot [1] /bin/ls -l /
        total 108
        drwxr-xr-x.  2 appuser appuser  4096 Oct  3 00:23 a
        dr-xr-xr-x.  2 appuser appuser  4096 Oct  2 23:39 bin
        dr-xr-xr-x.  4 appuser appuser  4096 Oct  2 23:16 boot
        drwxr-xr-x. 10 appuser appuser  4096 Oct 30 15:03 cgroup
        drwxr-xr-x. 16 appuser appuser  3520 Oct 30 15:03 dev
        drwxr-xr-x. 99 appuser appuser 12288 Oct 30 15:07 etc
        drwxr-xr-x.  4 appuser appuser  4096 Oct 30 15:07 home
        dr-xr-xr-x. 11 appuser appuser  4096 Oct  3 00:32 lib
        dr-xr-xr-x. 10 appuser appuser 12288 Oct  2 23:16 lib64
        drwx------.  2 appuser appuser 16384 Oct  2 23:01 lost+found
        drwxr-xr-x.  2 appuser appuser  4096 Nov 11  2010 media
        drwxr-xr-x.  2 appuser appuser  4096 Nov 11  2010 mnt
        drwxr-xr-x.  3 appuser appuser  4096 Oct  3 00:37 opt
        dr-xr-xr-x. 77 appuser appuser     0 Oct 30 15:03 proc
        dr-xr-x---.  2 appuser appuser  4096 Oct 25 23:31 root
        dr-xr-xr-x.  2 appuser appuser 12288 Oct  2 23:15 sbin
        drwxr-xr-x.  7 appuser appuser     0 Oct 30 15:03 selinux
        drwxr-xr-x.  2 appuser appuser  4096 Nov 11  2010 srv
        drwxr-xr-x. 13 appuser appuser     0 Oct 30 15:03 sys
        drwxrwxrwt.  3 appuser appuser  4096 Oct 30 15:08 tmp
        drwxr-xr-x. 14 appuser appuser  4096 Oct 30 15:04 usr
        drwxr-xr-x. 22 appuser appuser  4096 Oct  2 23:15 var
        cobblepot [0] /bin/ls -ld /bin/rpm
        -rwxr-xr-x. 1 appuser appuser 20392 Nov 11  2010 /bin/rpm
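The stray space in the 'chown' above is easy to miss at the prompt.
As a hedged sketch (not something used in the original incident), a
small wrapper can refuse a recursive 'chown' whenever any argument
resolves to "/".  The function name 'safe_chown_r' is hypothetical;
'--preserve-root' is GNU chown's own guard against recursing into "/":

```shell
# Hypothetical helper: recursive chown that refuses to touch "/".
safe_chown_r() {
    owner=$1
    if [ "$#" -lt 2 ]; then
        echo "usage: safe_chown_r OWNER[:GROUP] PATH..." >&2
        return 2
    fi
    shift
    for target in "$@"; do
        # readlink -f canonicalizes the path, so "/", "//" and "/usr/.."
        # are all recognized as the root directory.
        resolved=$(readlink -f -- "$target" 2>/dev/null)
        if [ "$resolved" = "/" ]; then
            echo "refusing to recurse into '/' (argument: '$target')" >&2
            return 1
        fi
    done
    # --preserve-root is a second line of defence built into GNU chown.
    chown -R --preserve-root "$owner" -- "$@"
}
```

With this in place, 'safe_chown_r 500:500 / usr/app1.0' would stop
with an error before changing any file.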
Since we don't have a backup to recover from, let's see if the RPM
database can help us.  Above we saw that 'rpm' was now owned by
"appuser:appuser".  Since we are "root", our binaries should still
execute, including 'rpm'.  Here we use 'rpm' to identify the package
that owns '/bin/rpm' and the expected ownership, "root root".  We can
also use 'rpm' to verify the files associated with the 'rpm' package
and see that the "user" (U) and "group" (G) ownership is now wrong:
        cobblepot [0] /bin/rpm -qf /bin/rpm
        rpm-4.8.0-12.el6.x86_64
        cobblepot [0] /bin/rpm -q --dump rpm-4.8.0-12.el6.x86_64 | grep '/bin/rpm '
        /bin/rpm 20392 1289521670 88d60915477cee96d4de6fb63bd3ca3db7efb05a83325e69375c54af3933bc90 0100755 root root 0 0 0 X
        cobblepot [0] /bin/rpm -Vf /bin/rpm
        .....UG..    /bin/rpm
        .....UG..    /etc/rpm
        .....UG..    /usr/bin/rpm2cpio
        .....UG..    /usr/bin/rpmdb
        .....UG..    /usr/bin/rpmquery
        .....UG..    /usr/bin/rpmsign
        <snip...>
        .....UG..  d /usr/share/man/ru/man8/rpm.8.gz
        .....UG..  d /usr/share/man/ru/man8/rpm2cpio.8.gz
        .....UG..  d /usr/share/man/sk/man8/rpm.8.gz
        cobblepot [1]
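In the verify output above, each line starts with a string of flags;
a "U" in the sixth position marks a user mismatch and a "G" in the
seventh a group mismatch.  As a rough sketch, the filter below pulls
out just the affected paths.  The function name is hypothetical, and
it assumes paths contain no whitespace:

```shell
# Hypothetical helper: read `rpm -V` style lines on stdin and print the
# path of every file whose user (6th flag) or group (7th flag) differs.
ownership_mismatches() {
    awk 'substr($1, 6, 1) == "U" || substr($1, 7, 1) == "G" {print $NF}'
}
```

It would be used as '/bin/rpm -Vf /bin/rpm | ownership_mismatches'.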
With 'rpm' still usable despite its user:group ownership, we can
use it to reset the user:group for the entire "rpm" package.  A further
verify via 'rpm' and 'ls' shows that it worked:
        cobblepot [1] /bin/rpm --setugids rpm-4.8.0-12.el6.x86_64
        chown: cannot access `/var/lib/rpm/__db.005': No such file or directory
        chgrp: cannot access `/var/lib/rpm/__db.005': No such file or directory
        chown: cannot access `/var/lib/rpm/__db.006': No such file or directory
        chgrp: cannot access `/var/lib/rpm/__db.006': No such file or directory
        chown: cannot access `/var/lib/rpm/__db.007': No such file or directory
        chgrp: cannot access `/var/lib/rpm/__db.007': No such file or directory
        chown: cannot access `/var/lib/rpm/__db.008': No such file or directory
        chgrp: cannot access `/var/lib/rpm/__db.008': No such file or directory
        chown: cannot access `/var/lib/rpm/__db.009': No such file or directory
        chgrp: cannot access `/var/lib/rpm/__db.009': No such file or directory
        cobblepot [0] /bin/rpm -Vf /bin/rpm
        cobblepot [0] /bin/ls -ld /bin/rpm
        -rwxr-xr-x. 1 root root 20392 Nov 11  2010 /bin/rpm
        cobblepot [0]
Rather than manually go through each package doing this (cobblepot has
633 packages installed), we'll just use a 'for' loop to query the "RPM"
database and reset the ownership for us:
        cobblepot [0] for i in `/bin/rpm -qa` ; do echo "** working on ${i} **" ;
        > /bin/rpm --setugids ${i} ; done
        ** working on usbutils-0.86-2.el6.x86_64 **
        ** working on pixman-0.16.6-1.el6.x86_64 **
        ** working on filesystem-2.4.30-2.1.el6.x86_64 **
        chown: cannot access `/mnt/cdrom': No such file or directory
        chgrp: cannot access `/mnt/cdrom': No such file or directory
        chown: cannot access `/mnt/floppy': No such file or directory
        chgrp: cannot access `/mnt/floppy': No such file or directory
        chown: cannot access `/usr/share/man/aa': No such file or directory
        <snip...>
        ** working on prelink-0.4.3-4.el6.x86_64 **
        chown: cannot access `/var/lib/prelink/full': No such file or directory
        chgrp: cannot access `/var/lib/prelink/full': No such file or directory
        chown: cannot access `/var/lib/prelink/quick': No such file or directory
        chgrp: cannot access `/var/lib/prelink/quick': No such file or directory
        chown: cannot access `/var/log/prelink/prelink.log': No such file or directory
        chgrp: cannot access `/var/log/prelink/prelink.log': No such file or directory
        ** working on hdparm-9.16-3.4.el6.x86_64 **
        cobblepot [0]
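The loop above interleaves progress lines with several hundred
'chown' / 'chgrp' errors.  A minimal variation, assuming one would
rather review those errors afterwards, sends them to a log file.  The
function name, the "RPM" variable, and the default log path are all
hypothetical choices:

```shell
RPM=${RPM:-/bin/rpm}    # may be overridden, e.g. to test against a stub

# Reset ownership for every installed package, keeping the chown/chgrp
# errors in a log file instead of on the terminal.
reset_all_ownership() {
    log=${1:-/tmp/setugids-errors.log}
    : > "$log"
    "$RPM" -qa | while IFS= read -r pkg; do
        echo "** working on ${pkg} **"
        # stdin is redirected so the command cannot consume the package list
        "$RPM" --setugids "$pkg" </dev/null 2>>"$log"
    done
}
```

Reading the package list with 'while read' rather than 'for i in'
makes no practical difference for package names, but keeps the same
pattern usable for lists of file paths.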
The errors above can be safely ignored.  Many of them concern
non-existent internationalization files for 'man'.  Alright, 'rpm' has
updated ownership on our files; however, it appears that in changing
the ownership, the setuid / setgid modes were removed (on Linux, 'chown'
clears those bits on executables whenever ownership changes), as seen
below in lines identified ".M.......":
        cobblepot [0] for i in `/bin/rpm -qa` ; do /bin/rpm -V ${i} ;
        > if [ $? != 0 ]; then echo "** files above from ${i} **" ; fi ; done
        .M.......    /usr/libexec/pt_chown
        ** files above from glibc-common-2.12-1.7.el6.x86_64 **
        .M.......    /lib64/dbus-1/dbus-daemon-launch-helper
        ** files above from dbus-1.2.24-3.el6.x86_64 **
        .M.......    /bin/mount
        .M.......    /bin/umount
        .M.......    /usr/bin/chfn
        .M.......    /usr/bin/chsh
        .M.......    /usr/bin/write
        ** files above from util-linux-ng-2.17.2-6.el6.x86_64 **
        .M.......    /usr/sbin/postdrop
        .M.......    /usr/sbin/postqueue
        ** files above from postfix-2.6.6-2.el6.x86_64 **
        <snip...>
        .M.......    /usr/bin/pkexec
        .M.......    /usr/libexec/polkit-1/polkit-agent-helper-1
        ** files above from polkit-0.96-2.el6.x86_64 **
        .M.......    /usr/sbin/lockdev
        ** files above from lockdev-1.0.1-18.el6.x86_64 **
        cobblepot [0]

        adler [0] for i in `/bin/rpm -qa` ; do /bin/rpm -V ${i} ;
        > if [ $? != 0 ]; then echo "** files above from ${i} **" ; fi ; done
        S.5....T.  c /etc/pki/nssdb/pkcs11.txt
        ** files above from nss-sysinit-3.12.7-2.el6.x86_64 **
        missing     /var/cache/libvirt/qemu
        ** files above from libvirt-0.8.1-27.el6.x86_64 **
        .......T.  c /etc/inittab
        ** files above from initscripts-9.03.17-1.el6.centos.x86_64 **
        S.5....T.  c /etc/nslcd.conf
        ** files above from nss-pam-ldapd-0.7.5-3.el6.x86_64 **
        ....L....  c /etc/pam.d/fingerprint-auth
        ....L....  c /etc/pam.d/password-auth
        ....L....  c /etc/pam.d/smartcard-auth
        ....L....  c /etc/pam.d/system-auth
        ** files above from pam-1.1.1-4.el6.x86_64 **
To see what we should expect, I logged into another host (adler),
which is identical to "cobblepot", and ran the same 'rpm' validation
query (the second listing above).  To determine the correct
permissions, I arbitrarily chose package
"util-linux-ng-2.17.2-6.el6.x86_64" and looked at its files
'mount|umount|chfn|chsh|write' via 'rpm' and 'ls', then confirmed
the permissions on "adler":
        cobblepot [0] /bin/rpm -q --dump util-linux-ng-2.17.2-6.el6.x86_64 |
        > /bin/egrep '/bin/(mount|umount)|/usr/bin/(chfn|chsh|write)'
        /bin/mount 74680 1289547235 30a503ced768d0d0b991df84eeca2c0b7dddf8fa01bb1eb8e65736271e6163d4 0104755 root root 0 0 0 X
        /bin/umount 49280 1289547235 82589c2779372072bc667cea3f28809b2fae9a3f98278698e6fd9b338d3ba9d0 0104755 root root 0 0 0 X
        /usr/bin/chfn 20120 1289547237 3a19316118bc01912730dc2dd5b71a013030d52cf010af3a225b1fbf91b98894 0104711 root root 0 0 0 X
        /usr/bin/chsh 18104 1289547237 e1c8a69cbede6b0570319ab0f662869b79d3f668ca8694b17c3866e3538a298e 0104711 root root 0 0 0 X
        /usr/bin/write 12024 1289547238 cc576a56d9e433fe21f54bc86608f1ffab8ffe80ad8a35254781af1b9be8cd7e 0102755 root tty 0 0 0 X
        cobblepot [0] /bin/ls -ld /bin/mount /bin/umount /usr/bin/chfn /usr/bin/chsh \
        > /usr/bin/write
        -rwxr-xr-x. 1 root root 74680 Nov 12  2010 /bin/mount
        -rwxr-xr-x. 1 root root 49280 Nov 12  2010 /bin/umount
        -rwx--x--x. 1 root root 20120 Nov 12  2010 /usr/bin/chfn
        -rwx--x--x. 1 root root 18104 Nov 12  2010 /usr/bin/chsh
        -rwxr-xr-x. 1 root tty  12024 Nov 12  2010 /usr/bin/write
        cobblepot [0]

        adler [0] /bin/ls -ld /bin/mount /bin/umount /usr/bin/chfn /usr/bin/chsh \
        > /usr/bin/write
        -rwsr-xr-x. 1 root root 74680 Nov 12  2010 /bin/mount
        -rwsr-xr-x. 1 root root 49280 Nov 12  2010 /bin/umount
        -rws--x--x. 1 root root 20120 Nov 12  2010 /usr/bin/chfn
        -rws--x--x. 1 root root 18104 Nov 12  2010 /usr/bin/chsh
        -rwxr-sr-x. 1 root tty  12024 Nov 12  2010 /usr/bin/write
        adler [0]
In the above, using 'mount' as an example, both 'rpm' on "cobblepot" and
'ls' on "adler" suggest the mode should be setuid plus read, write,
execute for the user and read, execute for group and other, thus "4755"
or "rwsr-xr-x".  Below, we reset the modes for our sample files based
on what we saw above, followed by verifying the changes via 'ls' and
'rpm':
        cobblepot [0] /bin/chmod u+s /bin/mount /bin/umount /usr/bin/chfn /usr/bin/chsh
        cobblepot [0] /bin/chmod g+s /usr/bin/write
        cobblepot [0] /bin/ls -ld /bin/mount /bin/umount /usr/bin/chfn /usr/bin/chsh \
        > /usr/bin/write
        -rwsr-xr-x. 1 root root 74680 Nov 12  2010 /bin/mount
        -rwsr-xr-x. 1 root root 49280 Nov 12  2010 /bin/umount
        -rws--x--x. 1 root root 20120 Nov 12  2010 /usr/bin/chfn
        -rws--x--x. 1 root root 18104 Nov 12  2010 /usr/bin/chsh
        -rwxr-sr-x. 1 root tty  12024 Nov 12  2010 /usr/bin/write
        cobblepot [0] /bin/rpm -V util-linux-ng-2.17.2-6.el6.x86_64
        cobblepot [0]
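The fifth field of the '--dump' output shown earlier holds the full
octal mode, e.g. "0104755": the leading characters encode the file
type and the last four digits are what 'chmod' expects.  A small
sketch of that conversion (the function name is hypothetical):

```shell
# Hypothetical helper: reduce an `rpm --dump` mode such as 0104755 to the
# four digits usable with chmod (special bits plus rwx bits).
dump_mode_to_chmod() {
    printf '%s\n' "$1" | sed -e 's/^.*\(....\)$/\1/'
}
```

For the files above, "0104755" becomes "4755" (setuid) and "0102755"
becomes "2755" (setgid).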
Given the number of files we have to update, in the first 'for' loop
below, we correlate each offending file to its "RPM" package and output it
to a temporary file.  In the second 'for' loop, we read in that temporary
file, set up variable "a" to be our package name, variable "b" to be
the file from that package, variable "c" to be the four digit mode /
permissions for that file, and finally execute 'chmod' to reset the mode:
        cobblepot [0] for i in `/bin/rpm -qa` ; do /bin/rpm -V ${i}  |
        > /bin/awk '/^.M./ {print "'$i':"$2}' ; done >> /tmp/mod-files
        cobblepot [0] /bin/cat /tmp/mod-files
        glibc-common-2.12-1.7.el6.x86_64:/usr/libexec/pt_chown
        dbus-1.2.24-3.el6.x86_64:/lib64/dbus-1/dbus-daemon-launch-helper
        postfix-2.6.6-2.el6.x86_64:/usr/sbin/postdrop
        postfix-2.6.6-2.el6.x86_64:/usr/sbin/postqueue
        <snip...>
        cobblepot [0] for i in `/bin/cat /tmp/mod-files` ; do a=`echo "$i" |
        > /bin/cut -d: -f1` ; b=`echo "$i" | /bin/cut -d: -f2` ;
        > c=`/bin/rpm -q --dump ${a} | /bin/grep "^${b} " | /bin/awk '{print $5}' |
        > /bin/sed -e 's/^...//g'` ; echo "chmod ${c} ${b}" ; /bin/chmod ${c} ${b} ;
        > done
        chmod 4711 /usr/libexec/pt_chown
        chmod 4750 /lib64/dbus-1/dbus-daemon-launch-helper
        chmod 2755 /usr/sbin/postdrop
        chmod 2755 /usr/sbin/postqueue
        chmod 4711 /usr/sbin/userhelper
        chmod 4755 /usr/bin/chage
        chmod 4755 /usr/bin/gpasswd
        <snip...>
        chmod 4755 /usr/libexec/polkit-1/polkit-agent-helper-1
        chmod 2711 /usr/sbin/lockdev
        cobblepot [0]
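The two loops above can also be folded into a single dry-run helper.
This is only a sketch under a few assumptions: the function name and
the "RPM" variable are hypothetical, paths contain no whitespace, and
the commands are printed for review rather than executed:

```shell
RPM=${RPM:-/bin/rpm}    # may be overridden, e.g. to test against a stub

# Print (but do not run) a chmod command for every file in package $1
# whose mode differs from what the RPM database records.
mode_fix_commands() {
    pkg=$1
    "$RPM" -V "$pkg" 2>/dev/null |
    awk 'substr($1, 2, 1) == "M" {print $NF}' |
    while IFS= read -r path; do
        # Field 5 of --dump is the octal mode; keep its last four digits.
        mode=$("$RPM" -q --dump "$pkg" </dev/null |
               awk -v p="$path" '$1 == p {print substr($5, length($5) - 3); exit}')
        if [ -n "$mode" ]; then
            echo "chmod ${mode} ${path}"
        fi
    done
}
```

Once the printed commands look right, the output can be piped to 'sh'
to apply them.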
A quick check against the "RPM" database shows that we have fixed those
files.  After copying the temporary file to "adler", the same command
output mirrors our output on "cobblepot":
        cobblepot [0] for i in `/bin/awk -F: '{print $1}' /tmp/mod-files |
        > /bin/sort -n | /usr/bin/uniq` ; do /bin/rpm -V ${i} ; done
        .......T.  c /etc/inittab
        ....L....  c /etc/pam.d/fingerprint-auth
        ....L....  c /etc/pam.d/password-auth
        ....L....  c /etc/pam.d/smartcard-auth
        ....L....  c /etc/pam.d/system-auth
        cobblepot [0]

        adler [0] for i in `/bin/awk -F: '{print $1}' /tmp/mod-files |
        > /bin/sort -n | /usr/bin/uniq` ; do /bin/rpm -V ${i} ; done
        .......T.  c /etc/inittab
        ....L....  c /etc/pam.d/fingerprint-auth
        ....L....  c /etc/pam.d/password-auth
        ....L....  c /etc/pam.d/smartcard-auth
        ....L....  c /etc/pam.d/system-auth
        adler [0]
Unfortunately, our work is still not finished.  We check below for any
files still owned by "appuser", UID 500:
        cobblepot [0] /bin/find / -uid 500 -print > /tmp/appuser-owned 2>/dev/null
        cobblepot [1] /usr/bin/wc -l /tmp/appuser-owned
        18049 /tmp/appuser-owned
        cobblepot [0] /usr/bin/head -10 /tmp/appuser-owned
        /a
        /.kshpr
        /dev/vcsa6
        /dev/vcs6
        /dev/vcsa5
        /dev/vcs5
        /dev/vcsa4
        /dev/vcs4
        /dev/vcsa3
        /dev/vcs3
        cobblepot [0] /bin/grep /etc /tmp/appuser-owned | /usr/bin/head -10
        /etc/ssh/ssh_host_rsa_key
        /etc/ssh/ssh_host_dsa_key
        /etc/ssh/ssh_host_rsa_key.pub
        /etc/ssh/ssh_host_dsa_key.pub
        /etc/ssh/ssh_host_key.pub
        /etc/ssh/ssh_host_key
        /etc/tgt
        /etc/libvirt/qemu/networks/autostart/default.xml
        /etc/libvirt/qemu/networks/default.xml
        /etc/plymouth
        cobblepot [0]
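With a list this long, a per-directory summary helps decide where to
start.  The following is a minimal sketch (the function name is
hypothetical) that counts the entries of a path list such as
"/tmp/appuser-owned" by top-level directory:

```shell
# Hypothetical helper: count the paths in file $1 by top-level directory,
# largest count first.
count_by_top_dir() {
    LC_ALL=C awk -F/ '{count["/" $2]++} END {for (d in count) print count[d], d}' "$1" |
        LC_ALL=C sort -k1,1nr -k2
}
```

Running 'count_by_top_dir /tmp/appuser-owned' would show at a glance
how much of the total sits under directories like "/proc" or "/sys".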
Wow, still 18049 files owned by "appuser:appuser".  Fortunately, we don't
have to fix all of them, and some we can take care of with a recursive
'chown'.  (We'll be more careful this time so as not to end up back in
this situation.)  The following are pseudo filesystems (pseudofs) which
will be recreated appropriately when we reboot, so ignore them:
        /dev                    pseudofs        # don't bother
        /sys                    pseudofs        # don't bother
        /proc                   pseudofs        # don't bother
        /selinux                pseudofs        # don't bother
The following directories and files, such as "/etc" and its contents,
should all be owned by root:root, with few exceptions:
        /etc                    root:root       # includes subdirs and files
        /etc/ntp/crypto         root:ntp
        /etc/pam.d/atd          root:daemon
        /lost+found             root:root
        /boot                   root:root       # includes subdirs and files
        /lib                    root:root       # includes subdirs and files
        /cgroup                 root:root       # includes subdirs and files
        /root                   root:root       # includes subdirs and files
        /usr/share              root:root       # includes subdirs and files
        /usr/lib64              root:root       # includes subdirs and files
        /usr/lib                root:root       # includes subdirs and files
        /usr/lib64/vte/gnome-pty-helper root:utmp
        /usr/libexec            root:root       # includes subdirs and files
        /usr/libexec/utempter   root:utmp       # includes subdirs and files
Using the above, we used 'chown' to set the appropriate ownership,
including the ownership of symlinks.  As an aside, a subsequent 'ls' of
"/" is looking a lot better:
        cobblepot [1] /bin/chown -R -h root:root /etc
        cobblepot [0] /bin/chown root:ntp /etc/ntp/crypto
        cobblepot [0] /bin/chown root:daemon /etc/pam.d/atd
        cobblepot [0] /bin/chown -R root:root /lost+found
        cobblepot [0] /bin/chown root:root /lost+found
        cobblepot [0] for i in /boot /lib /cgroup /root /usr/share /usr/lib64 \
        > /usr/lib /usr/libexec ; do /bin/chown -R -h root:root ${i} ; done
        cobblepot [0] /bin/chown root:utmp /usr/lib64/vte/gnome-pty-helper
        cobblepot [0] /bin/chown root:utmp /usr/libexec/utempter
        cobblepot [0] /bin/ls -l /
        total 108
        drwxr-xr-x.  2 appuser appuser  4096 Oct  3 00:23 a
        dr-xr-xr-x.  2 root    root     4096 Oct  2 23:39 bin
        dr-xr-xr-x.  4 root    root     4096 Oct  2 23:16 boot
        drwxr-xr-x. 10 root    root     4096 Oct 30 19:12 cgroup
        drwxr-xr-x. 16 root    root     3520 Oct 30 19:12 dev
        drwxr-xr-x. 99 root    root    12288 Oct 30 19:20 etc
        drwxr-xr-x.  4 root    root     4096 Oct 30 19:20 home
        dr-xr-xr-x. 11 root    root     4096 Oct  3 00:32 lib
        dr-xr-xr-x. 10 root    root    12288 Oct  2 23:16 lib64
        drwx------.  2 root    root    16384 Oct  2 23:01 lost+found
        drwxr-xr-x.  2 root    root     4096 Nov 11  2010 media
        drwxr-xr-x.  2 root    root     4096 Nov 11  2010 mnt
        drwxr-xr-x.  3 root    root     4096 Oct  3 00:37 opt
        dr-xr-xr-x. 80 root    root        0 Oct 30 19:12 proc
        dr-xr-x---.  2 root    root     4096 Oct 25 23:31 root
        dr-xr-xr-x.  2 root    root    12288 Oct  2 23:15 sbin
        drwxr-xr-x.  7 root    root        0 Oct 30 19:12 selinux
        drwxr-xr-x.  2 root    root     4096 Nov 11  2010 srv
        drwxr-xr-x. 13 root    root        0 Oct 30 19:12 sys
        drwxrwxrwt.  3 root    root     4096 Oct 30 19:34 tmp
        drwxr-xr-x. 14 root    root     4096 Oct 30 19:20 usr
        drwxr-xr-x. 22 root    root     4096 Oct  2 23:15 var
        cobblepot [0]
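The ownership table and the 'chown' commands above can be kept in
sync by generating one from the other.  As a sketch, assuming a
hypothetical table format of "PATH OWNER:GROUP [R]" where "R" marks a
recursive entry, the function below prints the commands for review
instead of running them:

```shell
# Hypothetical helper: turn "PATH OWNER:GROUP [R]" lines on stdin into
# chown commands.  Blank lines and lines starting with "#" are skipped.
ownership_plan() {
    while read -r path owner recurse; do
        case "$path" in
            ''|'#'*) continue ;;
        esac
        if [ "$recurse" = "R" ]; then
            echo "chown -R -h ${owner} ${path}"
        else
            echo "chown ${owner} ${path}"
        fi
    done
}
```

Order matters in such a table: the recursive entry for a directory
has to come before the exceptions beneath it, just as "/etc" was
handled before "/etc/ntp/crypto" above.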
Still, a 'find' for UID 500 files, excluding those under our pseudofs
and non-system directories, shows 5211 files:
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null
        /a
        /.kshpr
        /dev/vcsa6
        <snip...>
        /var/lock/subsys/rpcbind
        /var/lock/subsys/ksmtuned
        /.autofsck
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(dev|sys|proc|selinux|opt|usr/local|usr/app1.0)' \
        > >> /tmp/appuser-leftovers
        cobblepot [0] /usr/bin/wc -l /tmp/appuser-leftovers
        5211 /tmp/appuser-leftovers
        cobblepot [0] /usr/bin/head -10 /tmp/appuser-leftovers
        /a
        /.kshpr
        /tmp/yum.log
        /tmp/.ICE-unix
        /tmp/svc-onl
        /usr/bin/mailq
        /usr/bin/javaws
        /usr/bin/servertool
        /usr/bin/pack200
        /usr/bin/orbd
        cobblepot [0]
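One caveat with the 'egrep' filter above: the pattern is a prefix
match and its dots are unescaped, so a path such as "/devices-note"
or "/usr/app1x0" would be excluded too.  A slightly stricter sketch
(the function name is hypothetical, and this differs from the filter
actually used above) anchors each directory name:

```shell
# Hypothetical helper: drop paths under the pseudofs and non-system
# directories, matching whole directory names only.
filter_system_leftovers() {
    grep -E -v '^/(dev|sys|proc|selinux|opt|usr/local|usr/app1\.0)(/|$)'
}
```

It reads paths on stdin, e.g. from
'/bin/find / -uid 500 -print 2>/dev/null'.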
Seeing 'mailq', I would have expected it to have been picked up earlier.
Checking the "RPM" database suggests this file is part of postfix, though
a dump of the postfix rpm info doesn't list it.  To ensure that the file
does indeed exist, we ran 'ls' and saw that 'mailq' is a symlink.
To further illustrate the lack of package ownership, we check out 'orbd'
as well:
        cobblepot [0] /bin/rpm -qf /usr/bin/mailq
        postfix-2.6.6-2.el6.x86_64
        cobblepot [0] /bin/rpm -q --dump postfix | /bin/grep mailq
        /usr/bin/mailq.postfix 31 1289491808 0000000000000000000000000000000000000000000000000000000000000000 0120755 root root 0 0 0 ../../usr/sbin/sendmail.postfix
        /usr/share/man/man1/mailq.postfix.1.gz 48 1289491808 1e18ae3257a41ed8ea3eae07e13b42d2523e385c3eeb60b39be3989c9a46d19b 0100644 root root 0 1 0 X
        cobblepot [0] /bin/ls -ld /usr/bin/mailq
        lrwxrwxrwx. 1 appuser appuser 27 Oct  2 23:14 /usr/bin/mailq -> /etc/alternatives/mta-mailq
        cobblepot [0] /bin/rpm -qf /usr/bin/orbd
        file /usr/bin/orbd is not owned by any package
        cobblepot [1]
Fortunately, the files under "/usr" that we still have left are simply
symlinks.  We can either ignore them or (better) reset their ownership
too, as seen below.  We can also take care of our user home directories
and mail spools while we're at it:
        cobblepot [0] for i in `/bin/grep '^/usr' /tmp/appuser-leftovers` ; do
        > if [ -L ${i} ]; then echo "${i} is a link" ; else echo "** ${i} not a link" ;
        > fi ; done
        /usr/bin/mailq is a link
        /usr/bin/javaws is a link
        /usr/bin/servertool is a link
        /usr/bin/pack200 is a link
        /usr/bin/orbd is a link
        /usr/bin/rmail is a link
        /usr/bin/java is a link
        /usr/bin/unpack200 is a link
        /usr/bin/rmiregistry is a link
        /usr/bin/keytool is a link
        /usr/bin/tnameserv is a link
        /usr/bin/newaliases is a link
        /usr/bin/rmid is a link
        /usr/sbin/sendmail is a link
        cobblepot [0] for i in `/bin/grep '^/usr' /tmp/appuser-leftovers` ; do
        > if [ -L ${i} ]; then echo "/bin/chown -h root:root ${i}" ;
        > /bin/chown -h root:root ${i} ; fi ; done
        /bin/chown -h root:root /usr/bin/mailq
        /bin/chown -h root:root /usr/bin/javaws
        /bin/chown -h root:root /usr/bin/servertool
        /bin/chown -h root:root /usr/bin/pack200
        /bin/chown -h root:root /usr/bin/orbd
        /bin/chown -h root:root /usr/bin/rmail
        /bin/chown -h root:root /usr/bin/java
        /bin/chown -h root:root /usr/bin/unpack200
        /bin/chown -h root:root /usr/bin/rmiregistry
        /bin/chown -h root:root /usr/bin/keytool
        /bin/chown -h root:root /usr/bin/tnameserv
        /bin/chown -h root:root /usr/bin/newaliases
        /bin/chown -h root:root /usr/bin/rmid
        /bin/chown -h root:root /usr/sbin/sendmail
        cobblepot [0] /bin/chown -R winston:appuser /home/winston
        cobblepot [0] /bin/chown winston:mail /var/spool/mail/winston
        cobblepot [0]
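The symlink check above relies on 'for i in `cat ...`', which splits
on any whitespace.  A small sketch of the same test that reads the
list line by line instead (the function name is hypothetical):

```shell
# Hypothetical helper: print each path listed in file $1 that is a symlink.
list_symlinks() {
    while IFS= read -r path; do
        if [ -L "$path" ]; then
            printf '%s\n' "$path"
        fi
    done < "$1"
}
```

Its output could be reviewed first and then handed to
'chown -h root:root' one path at a time, as in the loop above.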
The last part of this is somewhat painful as it requires knowledge
of the files under "/var".  Since the files under "/var" are typically
log files, lock files, mail spools, etc., their ownership is less
uniform.  Without another host as a reference point, figuring
out how these files should be owned is quite tedious.  On "cobblepot",
there are about 36 files or directory structures left under "/var" that
we need to account for.  After verifying the ownership on a separate host
(adler), I came up with the following list of 'chown' commands:
        cobblepot [0] /bin/chown abrt:abrt /var/cache/abrt/
        cobblepot [0] /bin/chown -R root:root /var/cache/fontconfig
        cobblepot [0] /bin/chown -R haldaemon:haldaemon /var/cache/hald
        cobblepot [0] /bin/chown -R root:man /var/cache/man
        cobblepot [0] /bin/chown -R rpc:rpc /var/cache/rpcbind
        cobblepot [0] /bin/chown -R root:root /var/cache/yum
        cobblepot [0] /bin/chown -R root:root /var/lost+found/
        cobblepot [0] /bin/chown -R rpcuser:rpcuser /var/lib/nfs/statd
        cobblepot [0] /bin/chown -R root:slocate /var/lib/mlocate/
        cobblepot [0] /bin/chown -R root:root /var/lib/alternatives
        cobblepot [0] /bin/chown root:root /var/lib/libvirt/network/default.xml
        cobblepot [0] /bin/chown root:root /var/lib/libvirt/libvirt-guests
        cobblepot [0] /bin/chown -R qemu:qemu /var/lib/libvirt/qemu
        cobblepot [0] /bin/chown root:root /var/lib/rpm/.rpm.lock
        cobblepot [0] /bin/chown root:root /var/lib/authconfig/last/
        cobblepot [0] /bin/chown postfix:postfix /var/lib/postfix/master.lock
        cobblepot [0] /bin/chown root:root /var/lib/dbus/machine-id
        cobblepot [0] /bin/chown -R root:root /var/lib/cas
        cobblepot [0] /bin/chown root:root /var/lib/random-seed
        cobblepot [0] /bin/chown -R ntp:ntp /var/lib/ntp
        cobblepot [0] /bin/chown -R root:root /var/lib/readahead
        cobblepot [0] /bin/chown -R root:root /var/lib/dnsmasq
        cobblepot [0] /bin/chown -R root:root /var/lib/yum
        cobblepot [0] /bin/chown -R root:root /var/log
        cobblepot [0] /bin/chown ntp:ntp /var/log/ntpstats/
        cobblepot [0] /bin/chown root:utmp /var/log/btmp
        cobblepot [0] /bin/chown root:utmp /var/log/wtmp
        cobblepot [0] /bin/chown root:root /var/spool/plymouth/boot.log
        cobblepot [0] /bin/chown -R abrt:abrt /var/spool/abrt-upload
        cobblepot [0] /bin/chown -R abrt:abrt /var/spool/abrt
        cobblepot [0] /bin/chown -R daemon:daemon /var/spool/at
        cobblepot [0] /bin/chown -R postfix:postfix /var/spool/postfix
        cobblepot [0] /bin/chown postfix:postdrop /var/spool/postfix/public/
        cobblepot [0] /bin/chown postfix:postdrop /var/spool/postfix/maildrop/
        cobblepot [0] /bin/chown root:mail /var/spool/mail/
        cobblepot [0] /bin/chown rpc:mail /var/spool/mail/rpc
        cobblepot [0]
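When a reference host is available, a list like the one above does
not have to be assembled by hand.  As a sketch only, assuming GNU
'stat' and a file of leftover paths copied to the reference host, the
hypothetical function below prints one 'chown' command per path using
the ownership found there:

```shell
# Hypothetical helper, meant to run on the reference host: for each path
# listed in file $1, print a chown command carrying that host's ownership.
# Assumes paths contain no whitespace.
reference_chown_commands() {
    while IFS= read -r path; do
        # Skip anything that does not exist on the reference host.
        [ -e "$path" ] || [ -L "$path" ] || continue
        stat -c 'chown -h %U:%G %n' -- "$path"
    done < "$1"
}
```

The commands are non-recursive and use '-h' so that symlinks are
changed themselves rather than followed; the output is meant to be
read through before it is run on the damaged host.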
With those out of the way, let's see how many files we still have left:
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(opt|usr/local|usr/app1.0)' | /usr/bin/wc -l
        11762
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(dev|sys|proc|selinux|opt|usr/local|usr/app1.0)' | /usr/bin/wc -l
        69
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(dev|sys|proc|selinux|opt|usr/local|usr/app1.0)' | /usr/bin/less
        cobblepot [0] /sbin/init 6
Not including our pseudofs and non-system directories / files, we have 69
left.  A check of those should only leave files under "/tmp", "/var/run",
and "/var/lock", which should be transient anyway.  Now to finish our
work, we reboot the system with 'init' as seen above.  After the reboot,
another check excluding non-system directories shows only 10 files left:
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(opt|usr/local|usr/app1.0)' | /usr/bin/wc -l
        10
        cobblepot [0] /bin/find / -uid 500 -print 2>/dev/null |
        > /bin/egrep -v '^/(opt|usr/local|usr/app1.0)'
        /a
        /.kshpr
        /tmp/yum.log
        /tmp/svc-onl
        /home/appuser
        /home/appuser/.bash_logout
        /home/appuser/.bashrc
        /home/appuser/.kshrc
        /home/appuser/.bash_profile
        /var/spool/mail/appuser
We are now successfully finished with recovering our system and at least
our OS will function sanely again.  Of note, this doesn't account for any
files for which the "RPM" database has no information, as we saw earlier.
Since this was one of those "I wonder if" scenarios for me, I'm fine
with that.  However, while it is possible to at least restore the OS files
to proper ownerships, one should have recent backups to recover from.

see also:
    File Integrity Checks via Package DB
    Fixing an Overly Eager chmod in Linux
