[arch-general] user namespaces

Wed Feb 1 07:45:46 UTC 2017

On Wed, 2017-02-01 at 00:21 -0700, Leonid Isaev wrote:
> On Wed, Feb 01, 2017 at 01:20:41AM -0500, Daniel Micay via arch-general wrote:
> > On Wed, 2017-02-01 at 00:18 +0100, sivmu wrote:
> > > Summary:
> > > 
> > > Arch Linux is one of the few distributions, if not the only one, that
> > > still disables or restricts the use of unprivileged user namespaces, a
> > > feature that is used by many applications and containers to provide
> > > secure sandboxing.
> > > There have been requests to turn this feature on since Linux 3.13 (in
> > > 2013) but they are still being denied. While there may have been some
> > > reason for doing so a few years ago, leading many distributions like
> > > Debian and Red Hat to restrict its use to privileged users via a kernel
> > > patch (they never disabled it completely), today Arch seems to be the
> > > only distribution to block this feature. Even conservative distros like
> > > Debian 8 and 9 have this feature fully enabled.
> > 
> > There are still endless unprivileged user namespace vulnerabilities and
> > it's a nearly useless feature. The uid/gid mapping is poorly thought out
> > and immature without the necessary environment (filesystem support,
> > etc.) built around it, but no one really wants it for that reason. They
> > want it because it started pretending that it can offer something that
> > it can't actually deliver safely. There are much better ways to do
> > unprivileged sandboxes with significantly less risk than CLONE_NEWUSER
> > or setuid executables where the user controls the environment. Anything
> > depending on this mechanism instead of properly designed plumbing for it
> > is simply lazy garbage. Lack of a proper layer on top of the kernel
> > providing infrastructure (systemd is so far from that) on desktop/server
> > Linux is not going to be fixed by delegating everything to the kernel
> > even when it massively increases attack surface.
> 
> BTW, why can't one simply create a *privileged* lxc container on a host
> filesystem mounted with nosuid, then create an unprivileged user inside
> that container for browsing / viewing of untrusted pdfs, etc?

Application containers have no use for the user namespace quasi-root, and
no one really needs the half-baked uid/gid mapping feature. There's no real
reason for doing things that way beyond desktop Linux's chronic inability
to do plumbing in userspace: everything gets put in the kernel simply so
that it is universally available, not for technical reasons.

It would make sense to simply have a service that spawns unprivileged
users on demand from a range of uid/gid pairs. That's exactly how this
works on Android for both apps and isolatedProcess services (each gets a
unique uid/gid pair assigned), although Android also layers SELinux and
mount namespaces on top.
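
To make the idea concrete, here is a rough sketch of what the core of such
a service could look like. It is not how Android or any existing daemon
implements this; the names UidPool and spawn_sandboxed, the 100000 start of
the range and the range size are all made up for illustration. It only
shows the allocation of unique uid/gid pairs from a reserved range and
dropping to that pair before exec, with no SELinux or mount namespace
layering.

```python
import subprocess


class UidPool:
    """Hands out unique uid/gid pairs from a reserved range (hypothetical helper)."""

    def __init__(self, first=100000, count=1000):
        # Both defaults are assumptions; a real service would reserve a
        # range that never overlaps with ordinary accounts.
        self.first = first
        self.count = count
        self.in_use = set()

    def acquire(self):
        """Return the lowest free id in the range; it is used as both uid and gid."""
        for candidate in range(self.first, self.first + self.count):
            if candidate not in self.in_use:
                self.in_use.add(candidate)
                return candidate
        raise RuntimeError("uid/gid range exhausted")

    def release(self, ident):
        """Give an id back once the sandboxed process has exited."""
        self.in_use.discard(ident)


def spawn_sandboxed(pool, argv):
    """Start argv under a freshly assigned uid/gid pair.

    The caller must be privileged (able to setuid/setgid), which is the
    point: the privilege lives in one small userspace service.
    Requires Python 3.9+ for the user/group/extra_groups arguments.
    """
    ident = pool.acquire()
    try:
        proc = subprocess.Popen(argv, user=ident, group=ident, extra_groups=[])
    except Exception:
        pool.release(ident)
        raise
    return ident, proc
```

The caller would wait for the process and then call pool.release(ident) so
the pair can be reused. A real service would also have to clean up any
files the id left behind before handing it out again.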

The only real use case for user namespaces is unprivileged, contained use
of OS containers, since those actually need the quasi-root. For
application containers / sandboxes, it's just laziness and bad design.