aboutsummaryrefslogtreecommitdiffstats
path: root/features/seccomp/00-README
diff options
context:
space:
mode:
Diffstat (limited to 'features/seccomp/00-README')
-rw-r--r--features/seccomp/00-README114
1 files changed, 114 insertions, 0 deletions
diff --git a/features/seccomp/00-README b/features/seccomp/00-README
new file mode 100644
index 00000000..e14506a3
--- /dev/null
+++ b/features/seccomp/00-README
@@ -0,0 +1,114 @@
+
+This is a backport of the seccomp BPF syscall filtering from v3.5
+
+Quoting from: https://lkml.org/lkml/2012/1/11/260
+
+---------------
+[RFC,PATCH 0/2] dynamic seccomp policies (using BPF filters)
+
+The goal of the patchset is straightforward:
+
+ To provide a means of reducing the kernel attack surface.
+
+In practice, this is done at the primary kernel ABI: system calls.
+Achieving this goal will address the needs expressed by many systems
+projects:
+ qemu/kvm, openssh, vsftpd, lxc, and chromium and chromium os (me).
+
+While system call filtering has been attempted many times, I hope that
+this approach shows more promise. It works as described below and in
+the patch series.
+
+A userland task may call prctl(PR_ATTACH_SECCOMP_FILTER) to attach a
+BPF program to itself. Once attached, all system calls made by the
+task will be evaluated by the BPF program prior to being accepted.
+Evaluation is done by executing the BPF program over the struct
+user_regs_state for the process.
+--------------
+
+The content appears in v3.5 from:
+
+------------
+commit cb60e3e65c1b96a4d6444a7a13dc7dd48bc15a2b
+Merge: 99262a3 ff2bb04
+Author: Linus Torvalds <torvalds@linux-foundation.org>
+Date: Mon May 21 20:27:36 2012 -0700
+
+ Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security
+
+ Pull security subsystem updates from James Morris:
+ "New notable features:
+ - The seccomp work from Will Drewry
+ - PR_{GET,SET}_NO_NEW_PRIVS from Andy Lutomirski
+ - Longer security labels for Smack from Casey Schaufler
+ - Additional ptrace restriction modes for Yama by Kees Cook"
+-----------
+
+Here, we take Will's linear block of commits from the above merge, which
+are all conveniently all marked with "v18" in the changelog, and the
+one PR_{GET,SET}_NO_NEW_PRIVS commit from Andy (req'd as a dependency).
+
+Documentation:
+==============
+
+See added file: Documentation/prctl/seccomp_filter.txt
+
+
+Testing:
+========
+
+Several samples are added in samples/seccomp -- building is as easy as:
+
+ mkdir ../test
+ make O=../test defconfig
+ make O=../test samples/seccomp/
+
+The bpf-direct is a sample which grabs writes to STDERR, and redirects
+them to STDOUT, with an "[ERR]" prefix. Consider the core of the program:
+
+---------------------
+ syscall(__NR_write, STDOUT_FILENO,
+ payload("OHAI! WHAT IS YOUR NAME? "));
+ bytes = syscall(__NR_read, STDIN_FILENO, buf, sizeof(buf));
+ syscall(__NR_write, STDOUT_FILENO, payload("HELLO, "));
+ syscall(__NR_write, STDOUT_FILENO, buf, bytes);
+ syscall(__NR_write, STDERR_FILENO,
+ payload("Error message going to STDERR\n"));
+---------------------
+
+Running this core on a non-seccomp kernel, (i.e. by copying the above core
+to "foo.c") we can see with redirection, that the sample Error message
+goes to STDERR; i.e.
+
+-----------
+~$./foo
+OHAI! WHAT IS YOUR NAME? sdfsdf
+HELLO, sdfsdf
+Error message going to STDERR
+~$./foo 2> /dev/null
+OHAI! WHAT IS YOUR NAME? sdfs
+HELLO, sdfs
+~$
+------------
+
+Note in the 2nd instance, the error message disappears into /dev/null
+
+Now consider the seccomp enabled case, using the same redirect:
+
+------------
+$ ./bpf-direct
+OHAI! WHAT IS YOUR NAME? sdfsd
+HELLO, sdfsd
+[ERR] Error message going to STDERR
+$ ./bpf-direct 2>/dev/null
+OHAI! WHAT IS YOUR NAME? sdfsdf
+HELLO, sdfsdf
+[ERR] Error message going to STDERR
+$
+------------
+
+There are two things to see in the above.
+ 1) We see the [ERR] prefix that is clearly from the emulator()
+ function we've installed on the __NR_write syscall, and
+ 2) Even when we redirect STDERR to /dev/null, we still see the
+ message, which confirms it was put on STDOUT instead.