如何在没有 root 的情况下“监禁”一个进程？

Question

如何在没有 root 的情况下“监禁”一个进程？

Tob*_*ler 27 jails permissions not-root-user

如果我是 root，我可以简单地创建一个虚拟用户/组，相应地设置文件权限并以该用户身份执行该过程。但是我不是，所以有没有办法在没有root的情况下实现这一目标？

Answer 1

更多类似问题，更多答案值得关注：

注意：那里的一些答案指向此处尚未提及的特定解决方案。

实际上，有很多具有不同实现的监狱工具，但其中许多要么设计不安全（如fakeroot, LD_PRELOAD-based），要么不完整（如fakeroot-ng, -basedptrace），或者需要 root（chroot，或plash在fakechroot 中提到警告标签）。

这些只是例子；我想将它们全部并排列出，并指出这 2 个功能（“可以信任？”，“需要 root 才能设置？”），也许在操作系统级虚拟化实现。

一般来说，那里的答案涵盖了所描述的全部可能性范围，甚至更多：

虚拟机/操作系统

（答案提到虚拟机/操作系统）

内核扩展（如 SELinux）

（在此处的评论中提到），

chroot

基于 Chroot 的帮助程序（但是必须是 setUID root，因为chroot需要 root；或者chroot可以在隔离的命名空间中工作——见下文）：

[多说一点关于他们的事！]

已知的基于 chroot 的隔离工具：

散列器，其hsh-run和hsh-shell命令。（Hasher旨在以安全和可重复的方式构建软件。）
schroot在另一个答案中提到
...

跟踪

另一个值得信赖的隔离溶液（除了一个seccomp基于一个）将是完整的系统调用，通过拦截ptrace，如在手册页为解释fakeroot-ng：

与以前的实现不同，fakeroot-ng 使用的技术使被跟踪的进程无法选择是否使用 fakeroot-ng 的“服务”。静态编译程序、直接调用内核和操作自己的地址空间都是可以轻松绕过基于 LD_PRELOAD 对进程的控制的技术，不适用于 fakeroot-ng。从理论上讲，可以以完全控制跟踪过程的方式塑造 fakeroot-ng。

虽然理论上可行，但尚未实现。Fakeroot-ng 确实假设了关于被跟踪进程的某些“行为良好”的假设，并且打破这些假设的进程可能能够，如果不是完全逃脱，那么至少可以绕过一些由 fakeroot 强加给它的“假”环境 -吴。因此，强烈警告您不要使用 fakeroot-ng 作为安全工具。错误报告声称一个进程可以故意（而不是无意中）逃脱假冒？root-ng 的控制将被关闭为“不是错误”或标记为低优先级。

未来可能会重新考虑该政策。但是，就目前而言，您已收到警告。

尽管如此，正如您可以阅读的那样，它fakeroot-ng本身并不是为此目的而设计的。

（顺便说一句，我想知道他们为什么选择seccomp对 Chromium使用基于 -based 的方法而不是ptrace基于 -based 的方法......）

在上面没有提到的工具中，我自己注意到了Geordi，因为我喜欢控制程序是用 Haskell 编写的。

已知的基于 ptrace 的隔离工具：

乔尔迪
根
fakeroot-ng
...（另请参见如何在 Linux 中的用户空间中实现 chroot 的效果（无需成为 root）？）

赛康

实现隔离的一种已知方法是通过Google Chromium 中使用的 seccomp 沙箱方法。但是这种方法假设您编写了一个帮助程序来处理一些（允许的）“被拦截”的文件访问和其他系统调用；而且，当然，努力“拦截”系统调用并将它们重定向到帮助程序（也许，这甚至意味着替换受控进程代码中拦截的系统调用；所以，这听起来不像很简单；如果您有兴趣，最好阅读详细信息，而不仅仅是我的回答）。

更多相关信息（来自维基百科）：

http://en.wikipedia.org/wiki/Seccomp
http://code.google.com/p/seccompsandbox/wiki/overview
LWN 文章：Google 的 Chromium 沙箱，Jake Edge，2009 年 8 月
seccomp-nurse，一个基于 seccomp 的沙盒框架。

（如果有人正在寻找seccompChromium 之外的基于通用的解决方案，那么最后一项似乎很有趣。还有一篇来自“seccomp-nurse”作者的博客文章值得一读：SECCOMP as a Sandboxing solution ？。）

来自“seccomp-nurse”项目的这种方法的说明：

在此处输入图片说明

Linux 的未来可能会出现“灵活”的 seccomp？

There used to appear in 2009 also suggestions to patch the Linux kernel so that there is more flexibility to the seccomp mode--so that "many of the acrobatics that we currently need could be avoided". ("Acrobatics" refers to the complications of writing a helper that has to execute many possibly innocent syscalls on behalf of the jailed process and of substituting the possibly innocent syscalls in the jailed process.) An LWN article wrote to this point:

One suggestion that came out was to add a new "mode" to seccomp. The API was designed with the idea that different applications might have different security requirements; it includes a "mode" value which specifies the restrictions that should be put in place. Only the original mode has ever been implemented, but others can certainly be added. Creating a new mode which allowed the initiating process to specify which system calls would be allowed would make the facility more useful for situations like the Chrome sandbox.

Adam Langley (also of Google) has posted a patch which does just that. The new "mode 2" implementation accepts a bitmask describing which system calls are accessible. If one of those is prctl(), then the sandboxed code can further restrict its own system calls (but it cannot restore access to system calls which have been denied). All told, it looks like a reasonable solution which could make life easier for sandbox developers.

That said, this code may never be merged because the discussion has since moved on to other possibilities.

This "flexible seccomp" would bring the possibilities of Linux closer to providing the desired feature in the OS, without the need to write helpers that complicated.

(A blog posting with basically the same content as this answer: http://geofft.mit.edu/blog/sipb/33.)

namespaces (`unshare`)

Isolating through namespaces (unshare-based solutions) -- not mentioned here -- e.g., unsharing mount-points (combined with FUSE?) could perhaps be a part of a working solution for you wanting to confine filesystem accesses of your untrusted processes.

More on namespaces, now, as their implementation has been completed (this isolation technique is also known under the nme "Linux Containers", or "LXC", isn't it?..):

"One of the overall goals of namespaces is to support the implementation of containers, a tool for lightweight virtualization (as well as other purposes)".

It's even possible to create a new user namespace, so that "a process can have a normal unprivileged user ID outside a user namespace while at the same time having a user ID of 0 inside the namespace. This means that the process has full root privileges for operations inside the user namespace, but is unprivileged for operations outside the namespace".

For real working commands to do this, see the answers at:

and special user-space programming/compiling

But well, of course, the desired "jail" guarantees are implementable by programming in user-space (without additional support for this feature from the OS; maybe that's why this feature hasn't been included in the first place in the design of OSes); with more or less complications.

The mentioned ptrace- or seccomp-based sandboxing can be seen as some variants of implementing the guarantees by writing a sandbox-helper that would control your other processes, which would be treated as "black boxes", arbitrary Unix programs.

Another approach could be to use programming techniques that can care about the effects that must be disallowed. (It must be you who writes the programs then; they are not black boxes anymore.) To mention one, using a pure programming language (which would force you to program without side-effects) like Haskell will simply make all the effects of the program explicit, so the programmer can easily make sure there will be no disallowed effects.

I guess, there are sandboxing facilities available for those programming in some other language, e.g., Java.

Cf. "Sandboxed Haskell" project proposal.
NaCl--not mentioned here--belongs to this group, doesn't it?

在那里的答案中还指出了一些积累有关此主题信息的页面：

Answer 2

Gil*_*il' 8

这是 unix 权限模型的一个基本限制：只有 root 可以委托。

您无需成为 root 用户即可运行虚拟机（并非所有 VM 技术都如此），但这是一个重量级的解决方案。

用户模式 Linux是一个相对轻量级的 Linux-on-Linux 虚拟化解决方案。设置起来并不容易。你需要填充一个根分区（在一个目录）至少需要启动的最小（在几个文件/etc，/sbin/init和它的依赖，登录程序，外壳和实用程序）。

归档时间：	14 年，11 月前
查看次数：	12981 次
最近记录：	6 年，1 月前

如何在没有 root 的情况下“监禁”一个进程？

虚拟机/操作系统

内核扩展（如 SELinux）

chroot

已知的基于 chroot 的隔离工具：

跟踪

已知的基于 ptrace 的隔离工具：

赛康

Linux 的未来可能会出现“灵活”的 seccomp？

namespaces (unshare)

and special user-space programming/compiling

namespaces (`unshare`)