[HERE BE DRAGONS]: shiftfs: support some btrfs ioctls

Shiftfs currently only passes through a few ioctl()s to the underlay. These
are ioctl()s that are generally considered safe. Doing it for random
ioctl()s would be a security issue. Permissions for ioctl()s are not
checked before the filesystem gets involved so if we were to override
credentials we e.g. could do a btrfs tree search in the underlay which we
normally wouldn't be allowed to do.
However, the btrfs filesystem allows unprivileged users to perform various
operations through its ioctl() interface. With shiftfs these ioctl() are
currently not working. To not regress users that expect btrfs ioctl()s to
work in unprivileged containers we can create a whitelist of ioctl()s that
we allow to go through to the underlay and for which we also switch
credentials.
The main problem is how we switch credentials. Since permissions checks for
ioctl()s are
done by the actual file system and not by the vfs this would mean that any
additional capable(<cap>)-based checks done by the filesystem would
unconditonally pass after we switch credentials. So to make credential
switching safe we drop *all* capabilities when switching credentials. This
means that only inode-based permission checks will pass.

Btrfs also allows unprivileged users to delete snapshots when the
filesystem is mounted with user_subvol_rm_allowed mount option or if the
the callers is capable(CAP_SYS_ADMIN). The latter should never be the case
with unprivileged users. To make sure we only allow removal of snapshots in
the former case we drop all capabilities (see above) when switching
credentials.

Additonally, btrfs allows the creation of snapshots. To make this work we
need to be (too) clever. When doing snapshots btrfs requires that an fd to
the directory the snapshot is supposed to be created in be passed along.
This fd obviously references a shiftfs file and as such a shiftfs dentry
and inode.  This will cause btrfs to yell EXDEV. To circumnavigate this
problem we need to silently temporarily replace the passed in fd with an fd
that refers to a file that references a btrfs dentry and inode.

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
1 file changed