diff options
Diffstat (limited to 'features/seccomp/00-README')
-rw-r--r-- | features/seccomp/00-README | 114 |
1 files changed, 114 insertions, 0 deletions
diff --git a/features/seccomp/00-README b/features/seccomp/00-README new file mode 100644 index 00000000..e14506a3 --- /dev/null +++ b/features/seccomp/00-README @@ -0,0 +1,114 @@ + +This is a backport of the seccomp BPF syscall filtering from v3.5 + +Quoting from: https://lkml.org/lkml/2012/1/11/260 + +--------------- +[RFC,PATCH 0/2] dynamic seccomp policies (using BPF filters) + +The goal of the patchset is straightforward: + + To provide a means of reducing the kernel attack surface. + +In practice, this is done at the primary kernel ABI: system calls. +Achieving this goal will address the needs expressed by many systems +projects: + qemu/kvm, openssh, vsftpd, lxc, and chromium and chromium os (me). + +While system call filtering has been attempted many times, I hope that +this approach shows more promise. It works as described below and in +the patch series. + +A userland task may call prctl(PR_ATTACH_SECCOMP_FILTER) to attach a +BPF program to itself. Once attached, all system calls made by the +task will be evaluated by the BPF program prior to being accepted. +Evaluation is done by executing the BPF program over the struct +user_regs_state for the process. +-------------- + +The content appears in v3.5 from: + +------------ +commit cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b +Merge: 99262a3 ff2bb04 +Author: Linus Torvalds <torvalds@linux-foundation.org> +Date: Mon May 21 20:27:36 2012 -0700 + + Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security + + Pull security subsystem updates from James Morris: + "New notable features: + - The seccomp work from Will Drewry + - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski + - Longer security labels for Smack from Casey Schaufler + - Additional ptrace restriction modes for Yama by Kees Cook" +----------- + +Here, we take Will's linear block of commits from the above merge, which +are all conveniently all marked with "v18" in the changelog, and the +one PR_{GET,SET}_NO_NEW_PRIVS commit from Andy (req'd as a dependency). + +Documentation: +============== + +See added file: Documentation/prctl/seccomp_filter.txt + + +Testing: +======== + +Several samples are added in samples/seccomp -- building is as easy as: + + mkdir ../test + make O=../test defconfig + make O=../test samples/seccomp/ + +The bpf-direct is a sample which grabs writes to STDERR, and redirects +them to STDOUT, with an "[ERR]" prefix. Consider the core of the program: + +--------------------- + syscall(__NR_write, STDOUT_FILENO, + payload("OHAI! WHAT IS YOUR NAME? ")); + bytes = syscall(__NR_read, STDIN_FILENO, buf, sizeof(buf)); + syscall(__NR_write, STDOUT_FILENO, payload("HELLO, ")); + syscall(__NR_write, STDOUT_FILENO, buf, bytes); + syscall(__NR_write, STDERR_FILENO, + payload("Error message going to STDERR\n")); +--------------------- + +Running this core on a non-seccomp kernel, (i.e. by copying the above core +to "foo.c") we can see with redirection, that the sample Error message +goes to STDERR; i.e. + +----------- +~$./foo +OHAI! WHAT IS YOUR NAME? sdfsdf +HELLO, sdfsdf +Error message going to STDERR +~$./foo 2> /dev/null +OHAI! WHAT IS YOUR NAME? sdfs +HELLO, sdfs +~$ +------------ + +Note in the 2nd instance, the error message disappears into /dev/null + +Now consider the seccomp enabled case, using the same redirect: + +------------ +$ ./bpf-direct +OHAI! WHAT IS YOUR NAME? sdfsd +HELLO, sdfsd +[ERR] Error message going to STDERR +$ ./bpf-direct 2>/dev/null +OHAI! WHAT IS YOUR NAME? sdfsdf +HELLO, sdfsdf +[ERR] Error message going to STDERR +$ +------------ + +There are two things to see in the above. + 1) We see the [ERR] prefix that is clearly from the emulator() + function we've installed on the __NR_write syscall, and + 2) Even when we redirect STDERR to /dev/null, we still see the + message, which confirms it was put on STDOUT instead. |