Why does mount happen over an existing directory?

An existing directory is needed as a mount point.

$ ls
$ sudo mount /dev/sdb2 ./datadisk
mount: mount point ./datadisk does not exist
$ mkdir datadisk
$ sudo mount /dev/sdb2 ./datadisk
$

I find this confusing because the mount hides whatever the directory already contains. The mount point directory has two possible sets of contents, and they may get switched unexpectedly (for a user who is not performing the mount).

Why doesn’t mount happen into a newly created directory? This is how graphical operating systems display removable media. It would be clear whether the directory is mounted (exists) or not mounted (does not exist). I am pretty sure there is a good reason, but I haven’t been able to discover it yet.

Asked By: Melebius

||

Mounting to an existing directory makes a call to mount practically atomic: it either succeeds or fails, at least from the user’s perspective. If mount had to create the mountpoint itself, it would have two points of failure, making it impossible to guarantee a clean rollback. Imagine the following scenario:

  1. mount successfully creates the mountpoint
  2. mount tries to mount a new file system to that directory, but fails
  3. mount tries to remove the mountpoint, but fails

The system ends up with a side effect of a failed mount.

Here’s another one:

  1. umount successfully unmounts a file system
  2. umount tries to remove the mountpoint, but fails

Now, should umount return success or failure?
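
The first scenario above can be sketched as a small shell function. This is only an illustration, not how mount(8) works: the name try_mount_creating_dir and its exit codes are hypothetical, and the mount command is passed in as an argument so the sketch can be exercised without root (for example with true or false).

```shell
#!/bin/sh
# Hypothetical helper: try_mount_creating_dir MOUNTPOINT MOUNT_CMD [ARGS...]
# MOUNT_CMD stands in for the real mount invocation.
# Exit codes (invented for this sketch):
#   0  mounted
#   1  could not create the mount point
#   2  mount failed, mount point removed again (clean rollback)
#   3  mount failed and the rollback failed too (stray directory left behind)
try_mount_creating_dir() {
    mp=$1
    shift
    # Step 1: create the mount point (first point of failure).
    mkdir "$mp" || return 1
    # Step 2: attempt the mount (second point of failure).
    if "$@"; then
        return 0
    fi
    # Step 3: roll back; this can fail as well.
    rmdir "$mp" || return 3
    return 2
}
```

Exit code 3 is the awkward case: the caller asked for a mount, got a failure, and still ended up with a changed filesystem.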

Answered By: Dmitry Grigoryev

Another case that can occur:

When you boot, a basic read-only image is loaded as the root directory. Later you want to override it by mounting the real root. You can think of the mount syscall as simply swapping the ro mount point for the rw one.

Now imagine you have a filesystem problem on the root mount point and want to try to repair it. Because mounts overlay one another, you can unmount the filesystem and use the fsck provided in the basic image to fix it.

This feature can also be useful in systems that need strong security, to keep track of changes between a ro partition and a rw one.

Answered By: alexises

This is a case of an implementation detail that has leaked.

In a UNIX system, every directory consists of a list of names mapped to inode numbers. An inode holds metadata which tells the system whether it is a file, directory, special device, named pipe, etc. If it is a file or directory it also tells the system where to find the file or directory contents on disk. Most inodes are files or directories. The -i option to ls will list inode numbers.
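
A quick way to see the name-to-inode mapping is to give one inode two names with a hard link. The sketch below works in a throwaway directory; the file names are made up for the example.

```shell
#!/bin/sh
# Work in a temporary directory so nothing real is touched.
demo=$(mktemp -d)
echo hello > "$demo/original"
# A hard link adds a second directory entry pointing at the same inode.
ln "$demo/original" "$demo/alias"
# Each output line is "<inode number> <name>".
ls -i "$demo"
# Extract the inode number behind each name for comparison.
ino_original=$(ls -i "$demo/original" | awk '{print $1}')
ino_alias=$(ls -i "$demo/alias" | awk '{print $1}')
if [ "$ino_original" = "$ino_alias" ]; then
    echo "both names refer to inode $ino_original"
fi
rm -rf "$demo"
```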

Mounting a filesystem takes a directory inode and sets a flag on the kernel’s in-memory copy to say “actually, when looking for the contents of this directory look at this other filesystem instead” (see slide 10 of this presentation). This is relatively easy as it’s changing a single data item.
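
From user space, one visible consequence is that the device number reported by stat changes when a path crosses into another filesystem. The sketch below uses that as a heuristic; looks_like_mountpoint is a hypothetical helper name, it assumes GNU stat (the -c option), and it will not detect a bind mount within a single filesystem.

```shell
#!/bin/sh
# Hypothetical helper; assumes GNU coreutils stat (-c FORMAT).
# Returns 0 if DIR looks like a mount point, 1 if not, 2 on error.
looks_like_mountpoint() {
    # "%d:%i" prints the device number and the inode number.
    dir_id=$(stat -c '%d:%i' "$1/.") || return 2
    parent_id=$(stat -c '%d:%i' "$1/..") || return 2
    # Different device than the parent: a filesystem boundary was crossed.
    # Same device and same inode: the directory is its own parent, i.e. "/".
    [ "${dir_id%%:*}" != "${parent_id%%:*}" ] || [ "$dir_id" = "$parent_id" ]
}
```

For example, looks_like_mountpoint / succeeds, while a freshly created subdirectory does not qualify. On Linux, the mountpoint(1) tool from util-linux answers the same question.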

Why doesn’t it create a directory entry for you pointing at the new inode instead? There are two ways you could implement that, both of which have disadvantages. One is to physically write a new directory into the filesystem, but that fails if the filesystem is read-only! The other is to add to every directory listing process a list of “extra” things that aren’t really there. This is fiddly and potentially incurs a small performance hit on every file operation.

If you want dynamically-created mount points, the automount system can do this. Special non-disk filesystems can also create directories at will, e.g. proc, sys, devfs and so on.

Edit: see also the answer to What happens when you 'mount over' an existing folder with contents?

Answered By: pjc50

If mount(2) required the creation of a new directory to be the mount point, you couldn’t mount anything under a read-only filesystem. That would be dumb, so we can rule that out.

If mount optionally created a new directory to be the mountpoint, that would be weird. It’s not like mount/unmount happen all the time, so putting extra logic in the kernel to do these two steps with a single system call would not be an important speedup. Just leave it up to user-space to make a mkdir(2) system call if it wants one. Dmitry’s answer points out that having mount(2) do both things would make it non-atomic. And you’d want an extra argument to mount(2) with mode flags like open(2) takes, for O_CREAT, O_EXCL, etc. It would just be silly compared to letting user-space do it.

Or maybe you were asking about having mount(8) (the traditional program that makes mount(2) system calls) do this? That would be possible, but there’s already a perfectly good mkdir(1) for the job, and Unix’s design is all about good small tools that can be combined. If you want a tool that does both, it’s easy to write a shell script to build that tool out of two simpler tools. (Or, as muru commented, udisksctl already does this, so you don’t have to write it.) Also, Linux’s normal mount(8) from util-linux supports mount -o x-mount.mkdir[=mode], using its x- syntax for options meant for userspace rather than options to be passed to the filesystem.
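
As a rough illustration of the x-mount.mkdir option, the wrapper below passes it to mount(8). The function name mount_autocreate and the mode 0750 are arbitrary choices for this sketch; it assumes a util-linux mount and needs to be run as root.

```shell
#!/bin/sh
# Hypothetical wrapper name; requires util-linux mount(8) and root privileges.
mount_autocreate() {
    # x-mount.mkdir=0750 asks mount(8) to create the target directory with
    # mode 0750 if it does not exist yet; the option is handled in userspace
    # and is not passed on to the filesystem driver.
    mount -o x-mount.mkdir=0750 "$@"
}
# Example, using the device and path from the question:
#   mount_autocreate /dev/sdb2 ./datadisk
```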


Now the more interesting question: why does there have to be a directory on the parent filesystem at all?

As pjc50’s answer points out (no relation, even though he has my initials!), having mount points show up in directory listings without a real directory behind them would require an extra check on every readdir().

Having mount points exist as directories in the directory containing them (on the parent FS) is a nice trick. readdir() doesn’t have to notice that it is a mount point at all. That only happens if the mount point is used as a path component. Path resolution of course does have to check the mount table for every directory component of a path.

Answered By: Peter Cordes

I’ve always wondered that too.

A simple wrapper such as:

#!/bin/sh
# Create the directory named by the last argument, then run the real mount.
[ "$#" -gt 0 ] && eval "mkdir -p \"\${$#}\""
/bin/mount "$@"

saved as an executable script named mount in a directory that precedes /bin in your PATH, should take care of this if it bothers you too much.

(Before running the actual mount binary, it creates a directory named after the last argument to mount, if that directory doesn’t already exist.)


Alternatively, if you don’t want failed invocations of the mount wrapper to create directories, you can do:

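#!/bin/sh
set -e
# The last argument is taken to be the mount point.
eval "lastArg=\"\${$#}\""
madeDir=
test -d "$lastArg" || { mkdir "$lastArg"; madeDir=1; }
# On failure, remove the directory only if this script created it,
# and pass mount's exit status through.
/bin/mount "$@" || { status=$?; test -z "$madeDir" || rmdir "$lastArg" || :; exit "$status"; }

Answered By: PSkocik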