How to shrink root filesystem without booting a livecd

I find myself needing to rearrange a system’s partitions to move data previously under the root filesystem into dedicated mount points. The volumes are all in LVM, so this is relatively easy: create new volumes, move data into them, shrink the root filesystem, then mount the new volumes at the appropriate points.

The issue is step 3, shrinking the root filesystem. The filesystems involved are ext4, so online resizing is supported; however, a mounted ext4 filesystem can only be grown. Shrinking the filesystem requires unmounting it, which of course is not possible for the root filesystem in normal operation.

Answers around the Web seem to revolve around booting a LiveCD or other rescue media, doing the shrink operation, then booting back into the installed system. However, the system in question is remote, and I have access only via SSH. I can reboot, but booting a rescue disc and doing operations from the console is not possible.

How can I unmount the root filesystem while maintaining remote shell access?

Asked By: Tom Hunt

||

In solving this issue, the information provided at https://www.ivarch.com/blogs/oss/2007/01/resize-a-live-root-fs-a-howto.shtml was pivotal. However, that guide is for a very old version of RHEL, and much of its information is obsolete.

The instructions below are crafted to work with CentOS 7, but they should be easily enough transferable to any distro that runs systemd. All commands are run as root.

  1. Ensure the system is in a stable state

    Make sure no one else is using it and nothing else important is going on. It’s probably a good idea to stop service-providing units like httpd or ftpd, just to ensure external connections don’t disrupt things in the middle.

     systemctl stop httpd
     systemctl stop nfs-server
     # and so on....
    

    Make sure you have lsof installed (lsof -v), and that fuser (fuser -V) is installed too (Debian/Ubuntu package: psmisc).

  2. Unmount all unused filesystems

     umount -a
    

    This will print a number of ‘Target is busy’ warnings, for the root volume itself and for various temporary/system FSs. These can be ignored for the moment. What’s important is that no on-disk filesystems remain mounted, except the root filesystem itself. Verify this:

     # mount alone provides the info, but column makes it possible to read
     mount | column -t
    

    If you see any on-disk filesystems still mounted, then something is still running that shouldn’t be. Check what it is using fuser:

     # if necessary:
     yum install psmisc
     # then:
     fuser -vm <mountpoint>
     systemctl stop <whatever>
     umount -a
     # repeat as required...
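
The verification in step 2 can also be scripted. Below is a minimal sketch, not part of the original procedure: it assumes the mount table is supplied in /proc/mounts format and that "on-disk" means the source field starts with /dev/ (so it would not catch network filesystems). The function name ondisk_mounts is hypothetical.

```shell
# Hypothetical helper: print the mount points of block-device-backed
# filesystems other than the root filesystem.
# Reads a mount table in /proc/mounts format on stdin.
# Assumption: "on-disk" means the source field starts with /dev/.
ondisk_mounts() {
    awk '$1 ~ /^\/dev\// && $2 != "/" { print $2 }'
}

# Usage on a live system (empty output means only the root remains):
#   ondisk_mounts < /proc/mounts
```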
    
  3. Make the temporary root
    Note: if /tmp is a directory on /, we will not be able to unmount / later in this procedure if we use /tmp/tmproot. Thus it may be necessary to use an alternative mountpoint such as /tmproot instead.

     mkdir /tmp/tmproot
     mount -t tmpfs none /tmp/tmproot
     mkdir /tmp/tmproot/{proc,sys,dev,run,usr,var,tmp,oldroot}
     cp -ax /{bin,etc,mnt,sbin,lib,lib64} /tmp/tmproot/
     cp -ax /usr/{bin,sbin,lib,lib64} /tmp/tmproot/usr/
     cp -ax /var/{account,empty,lib,local,lock,nis,opt,preserve,run,spool,tmp,yp} /tmp/tmproot/var/
    

    This creates a very minimal root system, which breaks (among other things) manpage viewing (no /usr/share), user-level customizations (no /root or /home) and so forth. This is intentional, as it constitutes encouragement not to stay in such a jury-rigged root system any longer than necessary.

    At this point you should also ensure that all the necessary software is installed, as it will also assuredly break the package manager. Glance through all the steps, and make sure you have the necessary executables.
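
One way to check for the necessary executables is to look for them inside the temporary root itself. The following is a rough sketch under stated assumptions: the function name missing_in_root and the tool list in the usage comment are illustrative, and it searches only the binary directories that step 3 copies.

```shell
# Hypothetical helper: report which commands are missing from a root tree.
# Usage: missing_in_root ROOT CMD...
# Searches only the binary directories that step 3 copies into the new root.
missing_in_root() {
    root=$1
    shift
    for cmd in "$@"; do
        found=no
        for dir in bin sbin usr/bin usr/sbin; do
            if [ -x "$root/$dir/$cmd" ]; then
                found=yes
                break
            fi
        done
        # Print the name of every command that was not found anywhere.
        [ "$found" = yes ] || echo "$cmd"
    done
}

# Example (the tool list is an assumption; adjust it to what you plan to run):
#   missing_in_root /tmp/tmproot fuser lsof resize2fs e2fsck lvreduce systemctl
```

Empty output means every listed command was found; anything printed should be installed on the real root, and the copy repeated, before pivoting.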

  4. Pivot into the root

     mount --make-rprivate / # necessary for pivot_root to work
     pivot_root /tmp/tmproot /tmp/tmproot/oldroot
     for i in dev proc sys run; do mount --move /oldroot/$i /$i; done
    

    systemd causes mounts to allow subtree sharing by default (as with mount --make-shared), and this causes pivot_root to fail. Hence, we turn this off globally with mount --make-rprivate /. System and temporary filesystems are moved wholesale into the new root. This is necessary to make it work at all; the sockets for communication with systemd, among other things, live in /run, and so there’s no way to make running processes close it.
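
Whether the root mount is currently shared can be read from /proc/self/mountinfo, where a shared mount carries a shared:N tag among its optional fields. The sketch below is an illustration only; the function name root_propagation is hypothetical, and it reports just "shared" or "not-shared" without distinguishing private from slave mounts.

```shell
# Hypothetical helper: report whether the mount at "/" is shared.
# Reads mountinfo-format lines (as in /proc/self/mountinfo) on stdin.
root_propagation() {
    awk '$5 == "/" {
        state = "not-shared"
        # Optional fields run from field 7 up to the "-" separator.
        for (i = 7; i <= NF && $i != "-"; i++)
            if ($i ~ /^shared:/) state = "shared"
        # If several mounts are stacked on "/", keep the last one listed.
        result = state
    }
    END { if (result != "") print result }'
}

# Usage on a live system:
#   root_propagation < /proc/self/mountinfo
```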

  5. Ensure remote access survived the changeover

     systemctl restart sshd
     systemctl status sshd
    

    After restarting sshd, ensure that you can get in, by opening another terminal and connecting to the machine again via ssh. If you can’t, fix the problem before moving on.

    Once you’ve verified you can connect in again, exit the shell you’re currently using and reconnect. This allows the remaining forked sshd to exit and ensures the new one isn’t holding /oldroot.

  6. Close everything still using the old root

     fuser -vm /oldroot
    

    This will print a list of processes still holding onto the old root directory. On my system, it looked like this:

                  USER        PID ACCESS COMMAND
     /oldroot:    root     kernel mount /oldroot
                  root          1 ...e. systemd
                  root        549 ...e. systemd-journal
                  root        563 ...e. lvmetad
                  root        581 f..e. systemd-udevd
                  root        700 F..e. auditd
                  root        723 ...e. NetworkManager
                  root        727 ...e. irqbalance
                  root        730 F..e. tuned
                  root        736 ...e. smartd
                  root        737 F..e. rsyslogd
                  root        741 ...e. abrtd
                  chrony      742 ...e. chronyd
                  root        743 ...e. abrt-watch-log
                  libstoragemgmt    745 ...e. lsmd
                  root        746 ...e. systemd-logind
                  dbus        747 ...e. dbus-daemon
                  root        753 ..ce. atd
                  root        754 ...e. crond
                  root        770 ...e. agetty
                  polkitd     782 ...e. polkitd
                  root       1682 F.ce. master
                  postfix    1714 ..ce. qmgr
                  postfix   12658 ..ce. pickup
    

    You need to deal with each one of these processes before you can unmount /oldroot. The brute-force approach is simply kill $PID for each, but this can break things. To do it more softly:

     systemctl | grep running
    

    This creates a list of running services. You should be able to correlate this with the list of processes holding /oldroot, then issue systemctl restart for each of them. Some services will refuse to come up in the temporary root and enter a failed state; these don’t really matter for the moment.

    If the root volume you want to resize is an LVM volume, you may also need to restart some other running services, even if they do not show up in the list created by fuser -vm /oldroot. Otherwise, you might be unable to resize the volume in step 7 because of this error:

     fsadm: Cannot proceed with mounted filesystem "/oldroot"
    

    You can try systemctl restart systemd-udevd and if that fails, you can find the leftover mounts with grep system /proc/*/mounts | column -t

    Look for processes that say mounts:none and try restarting these:

     PATH                      BIN                        FSTYPE
     /proc/16395/mounts:tmpfs  /run/systemd/timesync      tmpfs
     /proc/16395/mounts:none   /var/lib/systemd/timesync  tmpfs
     /proc/18485/mounts:tmpfs  /run/systemd/inhibit       tmpfs
     /proc/18485/mounts:tmpfs  /run/systemd/seats         tmpfs
     /proc/18485/mounts:tmpfs  /run/systemd/sessions      tmpfs
     /proc/18485/mounts:tmpfs  /run/systemd/shutdown      tmpfs
     /proc/18485/mounts:tmpfs  /run/systemd/users         tmpfs
     /proc/18485/mounts:none   /var/lib/systemd/linger    tmpfs
    

    Some processes can’t be dealt with via simple systemctl restart. For me these included auditd (which doesn’t like to be killed via systemctl, and so just wanted a kill -15). These can be dealt with individually.

    The last process you’ll find, usually, is systemd itself. For this, run systemctl daemon-reexec.

    Once you’re done, the table should look like this:

                  USER        PID ACCESS COMMAND
     /oldroot:    root     kernel mount /oldroot
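
Correlating the PIDs from fuser with service names by eye can be tedious. As a sketch, the service a process belongs to can usually be read from /proc/PID/cgroup, whose lines end in the unit's cgroup path. The function name unit_from_cgroup is hypothetical, and the loop in the usage comment assumes that fuser without -v prints bare PIDs on stdout.

```shell
# Hypothetical helper: extract the systemd service name from the contents
# of /proc/PID/cgroup (read on stdin). Prints nothing for processes that
# are not inside a .service cgroup (for example, login sessions).
unit_from_cgroup() {
    awk -F/ '$NF ~ /\.service$/ { unit = $NF }
             END { if (unit != "") print unit }'
}

# Usage (assumes fuser prints bare PIDs on stdout when -v is omitted):
#   for pid in $(fuser -m /oldroot 2>/dev/null); do
#       unit_from_cgroup < "/proc/$pid/cgroup"
#   done | sort -u
```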
    
  7. Unmount the old root

     umount /oldroot
    

    At this point, you can carry out whatever manipulations you require. The original question needed a simple resize2fs invocation, but you can do whatever you want here; one other use case is transferring the root filesystem from a simple partition to LVM/RAID/whatever.
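
For the original use case (ext4 on an LVM logical volume), the shrink might look like the sequence below. This is a sketch, not the exact commands used: the function name shrink_plan and the device /dev/vg0/root are hypothetical, and the function only prints the commands so they can be reviewed before being run by hand.

```shell
# Hypothetical helper: print (not run) a conservative shrink sequence for
# an ext4 filesystem on an LVM logical volume.
# Usage: shrink_plan DEVICE SIZE    e.g. shrink_plan /dev/vg0/root 20G
shrink_plan() {
    dev=$1
    size=$2
    # resize2fs asks for a forced fsck before it will shrink a filesystem.
    echo "e2fsck -f $dev"
    # Shrink the filesystem first, then the volume underneath it.
    echo "resize2fs $dev $size"
    echo "lvreduce -L $size $dev"
    # With no size argument, resize2fs grows the filesystem to fill the device.
    echo "resize2fs $dev"
}
```

The order matters: reducing the logical volume before the filesystem would cut off the end of the filesystem.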

  8. Pivot the root back

     mount <blockdev> /oldroot
     mount --make-rprivate / # again
     pivot_root /oldroot /oldroot/tmp/tmproot
     for i in dev proc sys run; do mount --move /tmp/tmproot/$i /$i; done
    

    This is a straightforward reversal of step 4.

  9. Dispose of the temporary root

    Repeat steps 5 and 6, except using /tmp/tmproot in place of /oldroot. Then:

     umount /tmp/tmproot
     rmdir /tmp/tmproot
    

    Since it’s a tmpfs, at this point the temporary root dissolves into the ether, never to be seen again.

  10. Put things back in their places

    Mount filesystems again:

    mount -a
    

    At this point, you should also update /etc/fstab and grub.cfg in accordance with any adjustments you made during step 7.

    Restart any failed services:

    systemctl | grep failed
    systemctl restart <whatever>
    

    Allow shared subtrees again:

    mount --make-rshared /
    

    Start the stopped service units – you can use this single command:

    systemctl isolate default.target
    

And you’re done.

Many thanks to Andrew Wood, who worked out this evolution on RHEL4, and steve, who provided me the link to the former.

Answered By: Tom Hunt

If you are sure of what you are doing, and are not experimenting, you can hook into the initrd, which is a fast, non-interactive way to do it.

Here is how to do it on a Debian-based system.

See the code: https://github.com/szepeviktor/debian-server-tools/blob/master/debian-setup/debian-resizefs.sh

There is another example: https://github.com/szepeviktor/debian-server-tools/blob/master/debian-setup/debian-convert-ext3-ext4.sh

Answered By: Szépe Viktor

What I did on a DigitalOcean VPS (having 25G /dev/vda1 mounted as root) :

  1. using fdisk, delete the first (25G) partition
  2. create a 20G partition (the first sector must be same as the original one)
  3. create a new partition (obviously with size 5G), it will be /dev/vda2
  4. write partition table and exit fdisk.
  5. create an ext4 filesystem on /dev/vda2
  6. note the UUIDs of both partitions (ls -l /dev/disk/by-uuid)
  7. replace the /dev/vda1 UUID with the /dev/vda2 UUID in /boot/grub/grub.cfg and /etc/fstab
  8. mount /dev/vda2 to /mnt
  9. copy everything (except dev,proc,run,sys,mnt) from / to /mnt
  10. run update-grub
  11. reboot and pray

The above worked well after I was told that resize2fs is not supported.
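
The UUID substitution in step 7 can be sketched as a small filter. The function name replace_uuid is hypothetical, and the usage comment assumes blkid is available to read the UUIDs; the result is written to a new file so it can be inspected before it replaces the original.

```shell
# Hypothetical helper: rewrite every occurrence of one UUID with another.
# Reads a config file on stdin and writes the result to stdout.
# UUIDs contain only hex digits and hyphens, so they are safe in a sed pattern.
replace_uuid() {
    old=$1
    new=$2
    sed "s/$old/$new/g"
}

# Usage:
#   old=$(blkid -s UUID -o value /dev/vda1)
#   new=$(blkid -s UUID -o value /dev/vda2)
#   replace_uuid "$old" "$new" < /etc/fstab > /etc/fstab.new
```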

Answered By: Zsombor