Commit 5dfbfe7

Merge tag 'fs.idmapped.v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux
Pull fs idmapping updates from Christian Brauner:
 "This contains the work to enable the idmapping infrastructure to
  support idmapped mounts of filesystems mounted with an idmapping.

  In addition this contains various cleanups that avoid repeated
  open-coding of the same functionality and simplify the code in quite
  a few places.

  We also finish the renaming of the mapping helpers we started a few
  kernel releases back and move them to a dedicated header to not
  continue polluting the fs header needlessly with low-level idmapping
  helpers. With this series the fs header only contains idmapping
  helpers that interact with fs objects.

  Currently we only support idmapped mounts for filesystems mounted
  without an idmapping themselves. This was a conscious decision
  mentioned in multiple places (cf. [1]).

  As explained at length in [3] it is perfectly fine to extend support
  for idmapped mounts to filesystems mounted with an idmapping should
  the need arise. The need has been there for some time now (cf. [2]).

  Before we can port any filesystem that is mountable with an idmapping
  to support idmapped mounts in the coming cycles, we need to first
  extend the mapping helpers to account for the filesystem's idmapping.
  This, again, is explained at length in our documentation at [3] and
  also in the individual commit messages, so here's an overview.

  Currently, the low-level mapping helpers implement the remapping
  algorithms described in [3] in a simplified manner as we could rely
  on the fact that all filesystems supporting idmapped mounts are
  mounted without an idmapping.

  In contrast, filesystems mounted with an idmapping are very likely to
  not use an identity mapping and will instead use a non-identity
  mapping. So the translation step from or into the filesystem's
  idmapping in the remapping algorithm cannot be skipped for such
  filesystems.

  Non-idmapped filesystems and filesystems not supporting idmapped
  mounts are unaffected by this change as the remapping algorithms can
  take the same shortcut as before. If the low-level helpers detect
  that they are dealing with an idmapped mount but the underlying
  filesystem is mounted without an idmapping we can rely on the
  previous shortcut and can continue to skip the translation step from
  or into the filesystem's idmapping. And of course, if the low-level
  helpers detect that they are not dealing with an idmapped mount they
  can simply return the relevant id unchanged; no remapping needs to be
  performed at all.

  These checks guarantee that only the minimal amount of work is
  performed. As before, if idmapped mounts aren't used the low-level
  helpers are idempotent and no work is performed at all"

Link: 2ca4dcc ("fs/mount_setattr: tighten permission checks") [1]
Link: containers/podman#10374 [2]
Link: Documentation/filesystems/idmappings.rst [3]
Link: a65e58e ("fs: document and rename fsid helpers") [4]

* tag 'fs.idmapped.v5.17' of git://git.kernel.org/pub/scm/linux/kernel/git/brauner/linux:
  fs: support mapped mounts of mapped filesystems
  fs: add i_user_ns() helper
  fs: port higher-level mapping helpers
  fs: remove unused low-level mapping helpers
  fs: use low-level mapping helpers
  docs: update mapping documentation
  fs: account for filesystem mappings
  fs: tweak fsuidgid_has_mapping()
  fs: move mapping helpers
  fs: add is_idmapped_mnt() helper

2 parents 84bfcc0 + bd30336 commit 5dfbfe7

17 files changed, 356 insertions(+), 231 deletions(-)

Documentation/filesystems/idmappings.rst

-72
@@ -952,75 +952,3 @@ The raw userspace id that is put on disk is ``u1000`` so when the user takes
 their home directory back to their home computer where they are assigned
 ``u1000`` using the initial idmapping and mount the filesystem with the initial
 idmapping they will see all those files owned by ``u1000``.
-
-Shortcircuting
---------------
-
-Currently, the implementation of idmapped mounts enforces that the filesystem
-is mounted with the initial idmapping. The reason is simply that none of the
-filesystems that we targeted were mountable with a non-initial idmapping. But
-that might change soon enough. As we've seen above, thanks to the properties of
-idmappings the translation works for both filesystems mounted with the initial
-idmapping and filesystem with non-initial idmappings.
-
-Based on this current restriction to filesystem mounted with the initial
-idmapping two noticeable shortcuts have been taken:
-
-1. We always stash a reference to the initial user namespace in ``struct
-   vfsmount``. Idmapped mounts are thus mounts that have a non-initial user
-   namespace attached to them.
-
-   In order to support idmapped mounts this needs to be changed. Instead of
-   stashing the initial user namespace the user namespace the filesystem was
-   mounted with must be stashed. An idmapped mount is then any mount that has
-   a different user namespace attached then the filesystem was mounted with.
-   This has no user-visible consequences.
-
-2. The translation algorithms in ``mapped_fs*id()`` and ``i_*id_into_mnt()``
-   are simplified.
-
-   Let's consider ``mapped_fs*id()`` first. This function translates the
-   caller's kernel id into a kernel id in the filesystem's idmapping via
-   a mount's idmapping. The full algorithm is::
-
-    mapped_fsuid(kid):
-      /* Map the kernel id up into a userspace id in the mount's idmapping. */
-      from_kuid(mount-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
-      make_kuid(filesystem-idmapping, uid) = kuid
-
-   We know that the filesystem is always mounted with the initial idmapping as
-   we enforce this in ``mount_setattr()``. So this can be shortened to::
-
-    mapped_fsuid(kid):
-      /* Map the kernel id up into a userspace id in the mount's idmapping. */
-      from_kuid(mount-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the filesystem's idmapping. */
-      KUIDT_INIT(uid) = kuid
-
-   Similarly, for ``i_*id_into_mnt()`` which translated the filesystem's kernel
-   id into a mount's kernel id::
-
-    i_uid_into_mnt(kid):
-      /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
-      from_kuid(filesystem-idmapping, kid) = uid
-
-      /* Map the userspace id down into a kernel id in the mounts's idmapping. */
-      make_kuid(mount-idmapping, uid) = kuid
-
-   Again, we know that the filesystem is always mounted with the initial
-   idmapping as we enforce this in ``mount_setattr()``. So this can be
-   shortened to::
-
-    i_uid_into_mnt(kid):
-      /* Map the kernel id up into a userspace id in the filesystem's idmapping. */
-      __kuid_val(kid) = uid
-
-      /* Map the userspace id down into a kernel id in the mounts's idmapping. */
-      make_kuid(mount-idmapping, uid) = kuid
-
-Handling filesystems mounted with non-initial idmappings requires that the
-translation functions be converted to their full form. They can still be
-shortcircuited on non-idmapped mounts. This has no user-visible consequences.

fs/cachefiles/bind.c

+1 -1
@@ -117,7 +117,7 @@ static int cachefiles_daemon_add_cache(struct cachefiles_cache *cache)
 	root = path.dentry;
 
 	ret = -EINVAL;
-	if (mnt_user_ns(path.mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path.mnt)) {
 		pr_warn("File cache on idmapped mounts not supported");
 		goto error_unsupported;
 	}
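
Why the open-coded comparison against `&init_user_ns` has to go can be shown with a small user-space model. This is a sketch under assumptions: the `sketch_*` types and functions are hypothetical, and the body of `sketch_is_idmapped_mnt()` is inferred from the documentation text removed in this merge ("any mount that has a different user namespace attached then the filesystem was mounted with") together with the `vfs_create_mount()` change in fs/namespace.c. It is not copied from the kernel header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures. */
struct sketch_userns { int unused; };
struct sketch_super_block { const struct sketch_userns *s_user_ns; };
struct sketch_vfsmount {
	const struct sketch_super_block *mnt_sb;
	const struct sketch_userns *mnt_userns;
};

static const struct sketch_userns sketch_init_user_ns = { 0 };

/* Old open-coded test: any non-initial namespace counts as idmapped. */
static bool sketch_old_check(const struct sketch_vfsmount *mnt)
{
	return mnt->mnt_userns != &sketch_init_user_ns;
}

/* Assumed new test: idmapped only if the mount differs from its filesystem. */
static bool sketch_is_idmapped_mnt(const struct sketch_vfsmount *mnt)
{
	assert(mnt != NULL && mnt->mnt_sb != NULL);
	return mnt->mnt_userns != mnt->mnt_sb->s_user_ns;
}
```

For a plain mount of a filesystem that was itself mounted with an idmapping, the old comparison would report an idmapped mount while the helper modeled here would not, which is presumably why callers such as this one were converted first.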

fs/ecryptfs/main.c

+1 -1
@@ -537,7 +537,7 @@ static struct dentry *ecryptfs_mount(struct file_system_type *fs_type, int flags
 		goto out_free;
 	}
 
-	if (mnt_user_ns(path.mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path.mnt)) {
 		rc = -EINVAL;
 		printk(KERN_ERR "Mounting on idmapped mounts currently disallowed\n");
 		goto out_free;

fs/ksmbd/smbacl.c

+3 -16
@@ -9,6 +9,7 @@
 #include <linux/fs.h>
 #include <linux/slab.h>
 #include <linux/string.h>
+#include <linux/mnt_idmapping.h>
 
 #include "smbacl.h"
 #include "smb_common.h"
@@ -274,14 +275,7 @@ static int sid_to_id(struct user_namespace *user_ns,
 		uid_t id;
 
 		id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
-		/*
-		 * Translate raw sid into kuid in the server's user
-		 * namespace.
-		 */
-		uid = make_kuid(&init_user_ns, id);
-
-		/* If this is an idmapped mount, apply the idmapping. */
-		uid = kuid_from_mnt(user_ns, uid);
+		uid = mapped_kuid_user(user_ns, &init_user_ns, KUIDT_INIT(id));
 		if (uid_valid(uid)) {
 			fattr->cf_uid = uid;
 			rc = 0;
@@ -291,14 +285,7 @@ static int sid_to_id(struct user_namespace *user_ns,
 		gid_t id;
 
 		id = le32_to_cpu(psid->sub_auth[psid->num_subauth - 1]);
-		/*
-		 * Translate raw sid into kgid in the server's user
-		 * namespace.
-		 */
-		gid = make_kgid(&init_user_ns, id);
-
-		/* If this is an idmapped mount, apply the idmapping. */
-		gid = kgid_from_mnt(user_ns, gid);
+		gid = mapped_kgid_user(user_ns, &init_user_ns, KGIDT_INIT(id));
 		if (gid_valid(gid)) {
 			fattr->cf_gid = gid;
 			rc = 0;

fs/ksmbd/smbacl.h

+3 -2
@@ -11,6 +11,7 @@
 #include <linux/fs.h>
 #include <linux/namei.h>
 #include <linux/posix_acl.h>
+#include <linux/mnt_idmapping.h>
 
 #include "mgmt/tree_connect.h"
 
@@ -216,7 +217,7 @@ static inline uid_t posix_acl_uid_translate(struct user_namespace *mnt_userns,
 	kuid_t kuid;
 
 	/* If this is an idmapped mount, apply the idmapping. */
-	kuid = kuid_into_mnt(mnt_userns, pace->e_uid);
+	kuid = mapped_kuid_fs(mnt_userns, &init_user_ns, pace->e_uid);
 
 	/* Translate the kuid into a userspace id ksmbd would see. */
 	return from_kuid(&init_user_ns, kuid);
@@ -228,7 +229,7 @@ static inline gid_t posix_acl_gid_translate(struct user_namespace *mnt_userns,
 	kgid_t kgid;
 
 	/* If this is an idmapped mount, apply the idmapping. */
-	kgid = kgid_into_mnt(mnt_userns, pace->e_gid);
+	kgid = mapped_kgid_fs(mnt_userns, &init_user_ns, pace->e_gid);
 
 	/* Translate the kgid into a userspace id ksmbd would see. */
 	return from_kgid(&init_user_ns, kgid);

fs/namespace.c

+39 -14
@@ -31,6 +31,7 @@
 #include <uapi/linux/mount.h>
 #include <linux/fs_context.h>
 #include <linux/shmem_fs.h>
+#include <linux/mnt_idmapping.h>
 
 #include "pnode.h"
 #include "internal.h"
@@ -561,7 +562,7 @@ static void free_vfsmnt(struct mount *mnt)
 	struct user_namespace *mnt_userns;
 
 	mnt_userns = mnt_user_ns(&mnt->mnt);
-	if (mnt_userns != &init_user_ns)
+	if (!initial_idmapping(mnt_userns))
 		put_user_ns(mnt_userns);
 	kfree_const(mnt->mnt_devname);
 #ifdef CONFIG_SMP
@@ -965,6 +966,7 @@ static struct mount *skip_mnt_tree(struct mount *p)
 struct vfsmount *vfs_create_mount(struct fs_context *fc)
 {
 	struct mount *mnt;
+	struct user_namespace *fs_userns;
 
 	if (!fc->root)
 		return ERR_PTR(-EINVAL);
@@ -982,6 +984,10 @@ struct vfsmount *vfs_create_mount(struct fs_context *fc)
 	mnt->mnt_mountpoint = mnt->mnt.mnt_root;
 	mnt->mnt_parent = mnt;
 
+	fs_userns = mnt->mnt.mnt_sb->s_user_ns;
+	if (!initial_idmapping(fs_userns))
+		mnt->mnt.mnt_userns = get_user_ns(fs_userns);
+
 	lock_mount_hash();
 	list_add_tail(&mnt->mnt_instance, &mnt->mnt.mnt_sb->s_mounts);
 	unlock_mount_hash();
@@ -1072,7 +1078,7 @@ static struct mount *clone_mnt(struct mount *old, struct dentry *root,
 
 	atomic_inc(&sb->s_active);
 	mnt->mnt.mnt_userns = mnt_user_ns(&old->mnt);
-	if (mnt->mnt.mnt_userns != &init_user_ns)
+	if (!initial_idmapping(mnt->mnt.mnt_userns))
 		mnt->mnt.mnt_userns = get_user_ns(mnt->mnt.mnt_userns);
 	mnt->mnt.mnt_sb = sb;
 	mnt->mnt.mnt_root = dget(root);
@@ -3927,28 +3933,32 @@ static unsigned int recalc_flags(struct mount_kattr *kattr, struct mount *mnt)
 static int can_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
 {
 	struct vfsmount *m = &mnt->mnt;
+	struct user_namespace *fs_userns = m->mnt_sb->s_user_ns;
 
 	if (!kattr->mnt_userns)
 		return 0;
 
+	/*
+	 * Creating an idmapped mount with the filesystem wide idmapping
+	 * doesn't make sense so block that. We don't allow mushy semantics.
+	 */
+	if (kattr->mnt_userns == fs_userns)
+		return -EINVAL;
+
 	/*
 	 * Once a mount has been idmapped we don't allow it to change its
 	 * mapping. It makes things simpler and callers can just create
 	 * another bind-mount they can idmap if they want to.
 	 */
-	if (mnt_user_ns(m) != &init_user_ns)
+	if (is_idmapped_mnt(m))
 		return -EPERM;
 
 	/* The underlying filesystem doesn't support idmapped mounts yet. */
 	if (!(m->mnt_sb->s_type->fs_flags & FS_ALLOW_IDMAP))
 		return -EINVAL;
 
-	/* Don't yet support filesystem mountable in user namespaces. */
-	if (m->mnt_sb->s_user_ns != &init_user_ns)
-		return -EINVAL;
-
 	/* We're not controlling the superblock. */
-	if (!capable(CAP_SYS_ADMIN))
+	if (!ns_capable(fs_userns, CAP_SYS_ADMIN))
 		return -EPERM;
 
 	/* Mount has already been visible in the filesystem hierarchy. */
@@ -4002,14 +4012,27 @@ static struct mount *mount_setattr_prepare(struct mount_kattr *kattr,
 
 static void do_idmap_mount(const struct mount_kattr *kattr, struct mount *mnt)
 {
-	struct user_namespace *mnt_userns;
+	struct user_namespace *mnt_userns, *old_mnt_userns;
 
 	if (!kattr->mnt_userns)
 		return;
 
+	/*
+	 * We're the only ones able to change the mount's idmapping. So
+	 * mnt->mnt.mnt_userns is stable and we can retrieve it directly.
+	 */
+	old_mnt_userns = mnt->mnt.mnt_userns;
+
 	mnt_userns = get_user_ns(kattr->mnt_userns);
 	/* Pairs with smp_load_acquire() in mnt_user_ns(). */
 	smp_store_release(&mnt->mnt.mnt_userns, mnt_userns);
+
+	/*
+	 * If this is an idmapped filesystem drop the reference we've taken
+	 * in vfs_create_mount() before.
+	 */
+	if (!initial_idmapping(old_mnt_userns))
+		put_user_ns(old_mnt_userns);
 }
 
 static void mount_setattr_commit(struct mount_kattr *kattr,
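
The reference handover in `do_idmap_mount()` can be followed with a simplified user-space model. This is only a sketch: the `sketch_*` names are hypothetical, a plain integer replaces the kernel's atomic reference count, and the release/acquire ordering of the real store is left out. It assumes, as the `vfs_create_mount()` hunk suggests, that a mount of a filesystem with a non-initial idmapping already holds one reference on that filesystem's namespace.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical namespace with a plain (non-atomic) reference count. */
struct sketch_userns { int refcount; };
struct sketch_mount { struct sketch_userns *mnt_userns; };

/* Stand-in for the initial namespace, which is never counted here. */
static struct sketch_userns sketch_init_ns = { 1 };

static void sketch_do_idmap_mount(struct sketch_mount *mnt,
				  struct sketch_userns *new_ns)
{
	struct sketch_userns *old_ns;

	if (!new_ns)		/* no idmapping requested */
		return;

	assert(new_ns->refcount > 0);

	/* Remember what the mount pointed at before the switch. */
	old_ns = mnt->mnt_userns;

	/* Take a reference on the new namespace and publish it. */
	new_ns->refcount++;
	mnt->mnt_userns = new_ns;

	/* Drop the reference taken at mount creation, if there was one. */
	if (old_ns != &sketch_init_ns)
		old_ns->refcount--;
}
```

Without the final step, the reference taken when the mount was created would no longer be reachable once `mnt_userns` is overwritten, so in this model the old namespace would never be released.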
@@ -4133,13 +4156,15 @@ static int build_mount_idmapped(const struct mount_attr *attr, size_t usize,
 	}
 
 	/*
-	 * The init_user_ns is used to indicate that a vfsmount is not idmapped.
-	 * This is simpler than just having to treat NULL as unmapped. Users
-	 * wanting to idmap a mount to init_user_ns can just use a namespace
-	 * with an identity mapping.
+	 * The initial idmapping cannot be used to create an idmapped
+	 * mount. We use the initial idmapping as an indicator of a mount
+	 * that is not idmapped. It can simply be passed into helpers that
+	 * are aware of idmapped mounts as a convenient shortcut. A user
+	 * can just create a dedicated identity mapping to achieve the same
+	 * result.
 	 */
 	mnt_userns = container_of(ns, struct user_namespace, ns);
-	if (mnt_userns == &init_user_ns) {
+	if (initial_idmapping(mnt_userns)) {
 		err = -EPERM;
 		goto out_fput;
 	}

fs/nfsd/export.c

+1 -1
@@ -427,7 +427,7 @@ static int check_export(struct path *path, int *flags, unsigned char *uuid)
 		return -EINVAL;
 	}
 
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		dprintk("exp_export: export of idmapped mounts not yet supported.\n");
 		return -EINVAL;
 	}

fs/open.c

+5 -3
@@ -32,6 +32,7 @@
 #include <linux/ima.h>
 #include <linux/dnotify.h>
 #include <linux/compat.h>
+#include <linux/mnt_idmapping.h>
 
 #include "internal.h"
 
@@ -640,7 +641,7 @@ SYSCALL_DEFINE2(chmod, const char __user *, filename, umode_t, mode)
 
 int chown_common(const struct path *path, uid_t user, gid_t group)
 {
-	struct user_namespace *mnt_userns;
+	struct user_namespace *mnt_userns, *fs_userns;
 	struct inode *inode = path->dentry->d_inode;
 	struct inode *delegated_inode = NULL;
 	int error;
@@ -652,8 +653,9 @@ int chown_common(const struct path *path, uid_t user, gid_t group)
 	gid = make_kgid(current_user_ns(), group);
 
 	mnt_userns = mnt_user_ns(path->mnt);
-	uid = kuid_from_mnt(mnt_userns, uid);
-	gid = kgid_from_mnt(mnt_userns, gid);
+	fs_userns = i_user_ns(inode);
+	uid = mapped_kuid_user(mnt_userns, fs_userns, uid);
+	gid = mapped_kgid_user(mnt_userns, fs_userns, gid);
 
 retry_deleg:
 	newattrs.ia_valid = ATTR_CTIME;

fs/overlayfs/super.c

+1 -1
@@ -873,7 +873,7 @@ static int ovl_mount_dir_noesc(const char *name, struct path *path)
 		pr_err("filesystem on '%s' not supported\n", name);
 		goto out_put;
 	}
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		pr_err("idmapped layers are currently not supported\n");
 		goto out_put;
 	}

fs/posix_acl.c

+11 -6
@@ -23,6 +23,7 @@
 #include <linux/export.h>
 #include <linux/user_namespace.h>
 #include <linux/namei.h>
+#include <linux/mnt_idmapping.h>
 
 static struct posix_acl **acl_by_type(struct inode *inode, int type)
 {
@@ -374,7 +375,9 @@ posix_acl_permission(struct user_namespace *mnt_userns, struct inode *inode,
 					goto check_perm;
 				break;
 			case ACL_USER:
-				uid = kuid_into_mnt(mnt_userns, pa->e_uid);
+				uid = mapped_kuid_fs(mnt_userns,
+						     i_user_ns(inode),
+						     pa->e_uid);
 				if (uid_eq(uid, current_fsuid()))
 					goto mask;
 				break;
@@ -387,7 +390,9 @@ posix_acl_permission(struct user_namespace *mnt_userns, struct inode *inode,
 				}
 				break;
 			case ACL_GROUP:
-				gid = kgid_into_mnt(mnt_userns, pa->e_gid);
+				gid = mapped_kgid_fs(mnt_userns,
+						     i_user_ns(inode),
+						     pa->e_gid);
 				if (in_group_p(gid)) {
 					found = 1;
 					if ((pa->e_perm & want) == want)
@@ -734,17 +739,17 @@ static void posix_acl_fix_xattr_userns(
 		case ACL_USER:
 			uid = make_kuid(from, le32_to_cpu(entry->e_id));
 			if (from_user)
-				uid = kuid_from_mnt(mnt_userns, uid);
+				uid = mapped_kuid_user(mnt_userns, &init_user_ns, uid);
 			else
-				uid = kuid_into_mnt(mnt_userns, uid);
+				uid = mapped_kuid_fs(mnt_userns, &init_user_ns, uid);
 			entry->e_id = cpu_to_le32(from_kuid(to, uid));
 			break;
 		case ACL_GROUP:
 			gid = make_kgid(from, le32_to_cpu(entry->e_id));
 			if (from_user)
-				gid = kgid_from_mnt(mnt_userns, gid);
+				gid = mapped_kgid_user(mnt_userns, &init_user_ns, gid);
 			else
-				gid = kgid_into_mnt(mnt_userns, gid);
+				gid = mapped_kgid_fs(mnt_userns, &init_user_ns, gid);
 			entry->e_id = cpu_to_le32(from_kgid(to, gid));
 			break;
 		default:
