** Tags removed: verification-needed-disco verification-needed-eoan
** Tags added: verification-done-disco verification-done-eoan

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux in Ubuntu.
https://bugs.launchpad.net/bugs/1847744

Title:
  seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE

Status in linux package in Ubuntu:
  Fix Committed
Status in linux source package in Disco:
  Fix Committed
Status in linux source package in Eoan:
  Fix Committed

Bug description:
  SRU Justification

  Impact: Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF
  (cf. [4]), which enables a process (watchee) to retrieve an fd for its
  seccomp filter. This fd can then be handed to another (usually more
  privileged) process (watcher). The watcher will then be able to
  receive seccomp messages about the syscalls performed by the watchee.

  This feature is heavily used in some userspace workloads. For example,
  it is currently used to intercept mknod() syscalls in user namespaces,
  i.e. in containers. The mknod() syscall can be easily filtered based
  on dev_t. This allows us to intercept only a very specific subset of
  mknod() syscalls. Furthermore, mknod() is not possible in user
  namespaces at all, so intercepting and denying syscalls that are not
  in the whitelist by accident is not a big deal. The watchee won't
  notice a difference.

  In contrast to mknod(), many of the other syscalls we intercept (e.g.
  setxattr()) cannot be filtered as easily because they take pointer
  arguments. Additionally, some of them might actually succeed in user
  namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
  currently cannot tell seccomp to continue from a user notifier, we are
  stuck with performing all of these syscalls on behalf of the
  container. This is a huge security liability, since it is extremely
  difficult to correctly assume all of the necessary privileges of the
  calling task such that the syscall can be successfully emulated
  without escaping other additional security restrictions (think missing
  CAP_MKNOD for mknod(), or MS_NODEV on a filesystem etc.).
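
The dev_t-based decision described above for mknod() could look roughly like the following. This is only a sketch of the decision itself, not of any existing watcher: the function name and the allowlist are hypothetical, and the entries are the conventional Linux device numbers for a few harmless character devices.

```python
import os
import stat

# Hypothetical allowlist of (file type, major, minor) triples. The numbers are
# the conventional Linux assignments for these character devices.
ALLOWED_DEVICES = {
    (stat.S_IFCHR, 1, 3),  # /dev/null
    (stat.S_IFCHR, 1, 5),  # /dev/zero
    (stat.S_IFCHR, 1, 7),  # /dev/full
    (stat.S_IFCHR, 1, 8),  # /dev/random
    (stat.S_IFCHR, 1, 9),  # /dev/urandom
}


def should_intercept_mknod(mode: int, dev: int) -> bool:
    """Return True if mknod(path, mode, dev) targets an allowlisted device.

    Both arguments are plain integers, so the decision needs no access to
    the memory of the calling task. That is what makes mknod() easy to
    filter compared to syscalls that take pointer arguments.
    """
    key = (stat.S_IFMT(mode), os.major(dev), os.minor(dev))
    return key in ALLOWED_DEVICES
```

For instance, should_intercept_mknod(stat.S_IFCHR | 0o666, os.makedev(1, 3)) is true, while a block device such as 8:0 is rejected.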
  This can be solved by telling seccomp to resume the syscall.

  Fix: Allow the seccomp notifier to continue a syscall. A positive
  discussion about this feature was triggered by a post to the
  ksummit-discuss mailing list (cf. [3]) and took place during KSummit
  (cf. [1]) and again at the containers/checkpoint-restore
  micro-conference at Linux Plumbers.

  Regression Potential: Limited to seccomp. The patchset also comes with
  proper selftests in addition to the large set of seccomp selftests
  that are already there. This further reduces regression potential.

  Test Case: Compile a kernel with the patch applied and run the
  selftests, or trap a syscall via the notifier fd and set the newly
  introduced flag. The syscall should then have continued.

  Target Kernels: All current LTS kernels.

  Patches:
  https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/seccomp&id=fb3c5386b382d4097476ce9647260fc89b34afdb
  https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git/commit/?h=for-next/seccomp&id=0eebfed2954f152259cae0ad57b91d3ea92968e8

  /* References */
  [1]: https://linuxplumbersconf.org/event/4/contributions/560
  [2]: https://linuxplumbersconf.org/event/4/contributions/477
  [3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns3...@brauner.io
  [4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
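
For the manual test case (trap a syscall via the notifier fd and set the new flag), the reply a watcher sends might be built as in the sketch below. It is an illustration under assumptions, not code from the patchset: the helper names are hypothetical, the struct layout and flag value are those of the upstream UAPI header, and the ioctl number assumes the generic ioctl encoding used on x86 and arm64 (other architectures may differ).

```python
import fcntl
import struct

# Flag value from the upstream UAPI header: (1UL << 0).
SECCOMP_USER_NOTIF_FLAG_CONTINUE = 1 << 0

# Assumption: _IOWR('!', 1, struct seccomp_notif_resp) with the generic
# ioctl encoding (x86, arm64); this number is architecture dependent.
SECCOMP_IOCTL_NOTIF_SEND = 0xC0182101

# struct seccomp_notif_resp { __u64 id; __s64 val; __s32 error; __u32 flags; }
_NOTIF_RESP = struct.Struct("=QqiI")


def build_continue_response(notif_id: int) -> bytes:
    """Pack a reply asking the kernel to resume the trapped syscall.

    val and error are left at 0: the kernel rejects a reply that sets
    the continue flag together with a non-zero val or error.
    """
    return _NOTIF_RESP.pack(notif_id, 0, 0, SECCOMP_USER_NOTIF_FLAG_CONTINUE)


def send_response(notify_fd: int, response: bytes) -> None:
    """Hypothetical helper: send a packed reply on the notifier fd."""
    fcntl.ioctl(notify_fd, SECCOMP_IOCTL_NOTIF_SEND, response)
```

Here notif_id is the id field of the notification the watcher received for the trapped syscall. After send_response() succeeds on a patched kernel, the watchee's syscall should run as if it had not been intercepted.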