umount: device is busy. Why?
When running umount /path I get:
umount: /path: device is busy.
The filesystem is huge, so lsof +D /path is not a realistic option.
lsof /path, lsof +f -- /path, and fuser /path all return nothing. fuser -v /path gives:
USER PID ACCESS COMMAND
/path: root kernel mount /path
which is normal for all unused mounted file systems.
umount -l and umount -f are not good enough for my situation.
How do I figure out why the kernel thinks this filesystem is busy?
Instead of using lsof to crawl through the file system, just use the total list of open files and grep it. I find this returns much faster, although it's less accurate. It should get the job done.
lsof | grep '/path'
It seems the cause of my issue was that the nfs-kernel-server was exporting the directory. The nfs-kernel-server probably operates behind the normal open-file accounting and thus is not listed by lsof and fuser. When I stopped the nfs-kernel-server I could umount the directory.
I have made a page with examples of all solutions so far here: http://oletange.blogspot.com/2012/04/umount-device-is-busy-why.html
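If you suspect the same cause, a rough way to check is to list the current exports and then either unexport the path or stop the NFS server. Note that nfs-kernel-server is the Debian/Ubuntu service name; other distributions may call it nfs-server:
# List what is currently exported; if the mount shows up here, NFS is holding it
exportfs -v
# Temporarily unexport everything ...
exportfs -ua
# ... or stop the NFS server entirely (service name varies by distribution)
systemctl stop nfs-kernel-server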
For fuser to report on the PIDs holding a mount open, you have to use -m:
fuser -m /path
To add to BruceCran's comment above, the cause of my manifestation of this problem just now was a stale loopback mount. I'd already checked the output of fuser -vm <mountpoint> / lsof +D <mountpoint>, mount and cat /proc/mounts, checked whether some old nfs-kernel-server was running, turned off quotas, attempted (but failed) a umount -f <mountpoint>, and all but resigned myself to abandoning 924 days' uptime before finally checking the output of losetup and finding two stale configured-but-not-mounted loopbacks:
parsley:/mnt# cat /proc/mounts
rootfs / rootfs rw 0 0
none /sys sysfs rw,nosuid,nodev,noexec 0 0
none /proc proc rw,nosuid,nodev,noexec 0 0
udev /dev tmpfs rw,size=10240k,mode=755 0 0
/dev/mapper/stuff-root / ext3 rw,errors=remount-ro,data=ordered 0 0
tmpfs /lib/init/rw tmpfs rw,nosuid,mode=755 0 0
usbfs /proc/bus/usb usbfs rw,nosuid,nodev,noexec 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,gid=5,mode=620 0 0
fusectl /sys/fs/fuse/connections fusectl rw 0 0
/dev/dm-2 /mnt/big ext3 rw,errors=remount-ro,data=ordered,jqfmt=vfsv0,usrjquota=aquota.user 0 0
then
parsley:/mnt# fuser -vm /mnt/big/
parsley:/mnt# lsof +D big
parsley:/mnt# umount -f /mnt/big/
umount2: Device or resource busy
umount: /mnt/big: device is busy
umount2: Device or resource busy
umount: /mnt/big: device is busy
parsley:/mnt# losetup -a
/dev/loop0: [fd02]:59 (/mnt/big/dot-dropbox.ext2)
/dev/loop1: [fd02]:59 (/mnt/big/dot-dropbox.ext2)
parsley:/mnt# losetup -d /dev/loop0
parsley:/mnt# losetup -d /dev/loop1
parsley:/mnt# losetup -a
parsley:/mnt# umount big/
parsley:/mnt#
A Gentoo forum post also lists swapfiles as a potential culprit; although swapping to files is probably pretty rare these days, it can't hurt to check the output of cat /proc/swaps. I'm not sure whether quotas could ever prevent an unmount; I was clutching at straws.
I had this issue, and it turned out that there were active screen sessions in the background I didn’t know about. I connected to the other active screen session and its shell wasn’t even currently sitting in the mounted directory. Killing those other shell sessions fixed the issue for me.
Just thought I’d share my resolution.
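If you want to rule out stray screen sessions before hunting further, the following should list and, if needed, kill them (the session name placeholder below is whatever screen -ls reports; tmux has equivalent commands):
# List all screen sessions for the current user
screen -ls
# Kill a specific session found above
screen -S <session-name> -X quit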
For me, the offending process was a daemon running in a chroot. Because it was in a chroot, lsof and fuser wouldn't find it.
If you suspect you have something left running in a chroot, sudo ls -l /proc/*/root | grep chroot will find the culprit (replace "chroot" with the path to the chroot).
lsof and fuser didn't give me anything either.
After a process of renaming all possible directories to .old and rebooting the system every time after I made changes, I found one particular directory (relating to postfix) that was responsible.
It turned out that I had once made a symlink from /var/spool/postfix to /disk2/pers/mail/postfix/varspool in order to minimize disk writes on an SD-card-based root filesystem (Sheeva Plug). With this symlink, even after stopping the postfix and dovecot services (neither ps aux nor netstat -tuanp showed anything related), I was not able to unmount /disk2/pers.
When I removed the symlink and updated the postfix and dovecot config files to point directly to the new dirs on /disk2/pers/, I was able to successfully stop the services and unmount the directory.
Next time I will look more closely at the output of:
ls -lR /var | grep ^l | grep disk2
The above command recursively lists all symbolic links in a directory tree (here starting at /var) and keeps only those whose targets point at a specific mount point (here disk2).
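If your find supports -lname (GNU find does), you can also search directly for symlinks whose targets live under the mount point; the paths /var and /disk2 here are just the ones from my setup:
# Recursively find symlinks under /var that point somewhere below /disk2
find /var -type l -lname '/disk2/*' 2>/dev/null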
Today the problem was an open socket (specifically tmux):
mount /mnt/disk
export TMPDIR=/mnt/disk
tmux
<<detach>>
umount /mnt/disk
umount: /mnt/disk: device is busy.
lsof | grep /mnt/disk
tmux 20885 root 6u unix 0xffff880022346300 0t0 3215201 /mnt/disk/tmux-0/default
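The fix in a case like this is to close the socket, either by shutting the tmux server down (which kills its sessions, so only if you can afford that) or by pointing TMPDIR back off the mount before starting new sessions:
# Stop the tmux server so the socket under /mnt/disk is released
tmux kill-server
# Keep future sockets off the mount
export TMPDIR=/tmp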
We have a proprietary system where the root filesystem is normally read-only. Occasionally, when files have to be copied over, it is remounted read-write:
mount -oremount,rw /
And then remounted back:
mount -oremount,ro /
This time however, mount kept giving the mount: / is busy error. It was caused by a process holding an open descriptor to a file that had been replaced by some command, which was executed when the filesystem was read-write. The important line from lsof -- / output happens to be (names have been changed):
replicate 1719 admin DEL REG 8,5 204394 /opt/gns/lib/replicate/modules/md_es.so
Notice the DEL in the output. Simply restarting the process holding on to the deleted file resolved the issue.
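A quick way to look specifically for deleted-but-still-open files, rather than grepping the whole lsof output for DEL, is the +L1 option, which lists open files with a link count below one:
# Show open files on / that have already been unlinked
lsof +L1 /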
I had a couple of bind and overlay mounts under my mount that were blocking me. Check the tab completion for the mount point you want to unmount. I suspect it was the overlay mount in particular, but it could have been the binds too.
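To see every mount nested under the one you are trying to release (bind mounts, overlays, anything else), findmnt can print the whole subtree; /path here stands for your mount point:
# List the mount point plus all mounts below it
findmnt --submounts /path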
Open files
Processes with open files are the usual culprits. Display them:
lsof +f -- <mountpoint or device>
There is an advantage to using /dev/<device> rather than /mountpoint: a mountpoint will disappear after an umount -l, or it may be hidden by an overlaid mount.
fuser can also be used, but to my mind lsof has a more useful output. However, fuser is useful when it comes to killing the processes causing your dramas so you can get on with your life.
List files on <mountpoint> (see caveat above):
fuser -vmM <mountpoint>
Interactively kill only processes with files open for writing:
fuser -vmMkiw <mountpoint>
After remounting read-only (mount -o remount,ro <mountpoint>), it is safe(r) to kill all remaining processes:
fuser -vmMk <mountpoint>
Mountpoints
The culprit can be the kernel itself: another filesystem mounted on the filesystem you are trying to umount will cause grief. Check with:
mount | grep <mountpoint>/
For loopback mounts, also check the output of:
losetup -la
Anonymous inodes (Linux)
Anonymous inodes can be created by:
- Temporary files (open with O_TMPFILE)
- inotify watches
- [eventfd]
- [eventpoll]
- [timerfd]
These are the most elusive type of pokemon, and appear in lsof's TYPE column as a_inode (which is undocumented in the lsof man page).
They won't appear in lsof +f -- /dev/<device>, so you'll need to:
lsof | grep a_inode
For killing processes holding anonymous inodes, see: List current inotify watches (pathname, PID).
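If you would rather not grep all of lsof's output, the anonymous inodes can also be spotted straight from /proc, since the fd symlinks of the holding processes point at targets like anon_inode:inotify. A rough sketch (needs root to see every process):
# Find file descriptors that are inotify anonymous inodes; the path tells you the PID
find /proc/[0-9]*/fd -lname 'anon_inode:inotify' 2>/dev/null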
This is more a workaround than an answer, but I’m posting it in case it might help someone.
In my case I was trying to modify the LVM, as I wanted to make the /var partition bigger, so I needed to umount it. I tried everything commented and answered in this post (thanks everyone, and especially @ole-tange for gathering them) and still got the "device is busy" error.
I tried killing most of the processes in the order specified in runlevel 0 too, just in case the order was relevant in my case, but that didn't help either. So what I did was create a custom runlevel (combining the output of chkconfig into new chkconfig --level commands) that would be very similar to 1 (single-user mode) but with network capabilities (with ssh, network and xinetd).
As I was using Red Hat, runlevel 4 is marked as "unused/user defined", so I used that one and ran:
init 4
In my case this was OK, as I needed to reboot the server anyway, but that will probably be the case for anyone tweaking the disks.
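For the record, on a chkconfig-based system the custom runlevel can be assembled roughly like this; the service names are only examples, pick the ones you actually need:
# See which services are enabled in which runlevels
chkconfig --list
# Enable just the services you want in the otherwise unused runlevel 4
chkconfig --level 4 network on
chkconfig --level 4 sshd on
# Anything you don't want in runlevel 4 gets turned off the same way
chkconfig --level 4 httpd off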
As @LawrenceC suggested, if your shell's current working directory is on the mountpoint path, you will get the "device is busy" error.
ubuntu@myip:~/efs$ sudo umount ~/efs
umount.nfs4: /home/ubuntu/efs: device is busy
cd to a location other than the mountpoint to resolve the error.
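To see whether it is a shell's working directory (yours or somebody else's) pinning the mount, the cwd links in /proc give it away; the path is the one from the example above, and you may need root to see every process:
# List processes whose current directory is on the mount point
ls -l /proc/[0-9]*/cwd 2>/dev/null | grep /home/ubuntu/efs
# Then move your own shell off the mount and retry
cd / && sudo umount /home/ubuntu/efs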
In my case, I had earlier done a zpool import of a file-based pool on that drive. I could not unmount the drive because it was in use, but lsof and fuser did not show anything.
The solution was to do sudo zpool export mypool and then unmount.
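If you are not sure which pool is sitting on the device, zpool status shows each imported pool's underlying devices; mypool is just the example name from above:
# Show imported pools and the devices/files backing them
zpool status
# Release the pool so the underlying filesystem can be unmounted
sudo zpool export mypool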
In my case it was Docker that was holding the file. I am running ZFS, and after lazily unmounting the volume, I got an error message:
mount: /tank: /dev/zd112 already mounted or mount point busy.
The investigation
fuser /dev/zd112
Cannot stat file /proc/32130/fd/4: Stale file handle
Cannot stat file /proc/32130/fd/5: Stale file handle
Cannot stat file /proc/32130/fd/6: Stale file handle
Cannot stat file /proc/32130/fd/7: Stale file handle
Cannot stat file /proc/32130/fd/11: Stale file handle
Running ps aux | grep 32130 gives:
999 32130 4.1 1.2 10784420 1241840 ? Ssl Jun20 10055:41 mysqld
This is not a mysqld running in the same userspace, but one running within a container.
Running docker ps | grep mysql yields:
3c6c907a28db mysql:5.7 "docker-entrypoint.s…" 15 months ago Up 5 months 33060/tcp, 0.0.0.0:3336->3306/tcp zabbix_db_1_5e72b12e41d2
Next you can run:
docker stop 3c6c907a28db
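To map a host PID such as 32130 back to the container that owns it, you can ask Docker for every container's init PID; a sketch using docker inspect's Go-template output:
# Print "<host PID> <container name>" for every running container
docker ps -q | xargs docker inspect --format '{{.State.Pid}} {{.Name}}'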