summaryrefslogtreecommitdiffstats
path: root/security
AgeCommit message (Collapse)Author
2017-09-22apparmor: fix incorrect type assignment when freeing proxiesJohn Johansen
sparse reports poisoning the proxy->label before freeing the struct is resulting in a sparse build warning. ../security/apparmor/label.c:52:30: warning: incorrect type in assignment (different address spaces) ../security/apparmor/label.c:52:30: expected struct aa_label [noderef] <asn:4>*label ../security/apparmor/label.c:52:30: got struct aa_label *<noident> fix with RCU_INIT_POINTER as this is one of those cases where rcu_assign_pointer() is not needed. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: ensure unconfined profiles have dfas initializedJohn Johansen
Generally unconfined has early bailout tests and does not need the dfas initialized, however if an early bailout test is ever missed it will result in an oops. Be defensive and initialize the unconfined profile to have null dfas (no permission) so if an early bailout test is missed we fail closed (no perms granted) instead of oopsing. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: fix race condition in null profile creationJohn Johansen
There is a race when null- profile is being created between the initial lookup/creation of the profile and lock/addition of the profile. This could result in multiple version of a profile being added to the list which need to be removed/replaced. Since these are learning profile their is no affect on mediation. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: move new_null_profile to after profile lookup fns()John Johansen
new_null_profile will need to use some of the profile lookup fns() so move instead of doing forward fn declarations. Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: add base infastructure for socket mediationJohn Johansen
Provide a basic mediation of sockets. This is not a full net mediation but just whether a spcific family of socket can be used by an application, along with setting up some basic infrastructure for network mediation to follow. the user space rule hav the basic form of NETWORK RULE = [ QUALIFIERS ] 'network' [ DOMAIN ] [ TYPE | PROTOCOL ] DOMAIN = ( 'inet' | 'ax25' | 'ipx' | 'appletalk' | 'netrom' | 'bridge' | 'atmpvc' | 'x25' | 'inet6' | 'rose' | 'netbeui' | 'security' | 'key' | 'packet' | 'ash' | 'econet' | 'atmsvc' | 'sna' | 'irda' | 'pppox' | 'wanpipe' | 'bluetooth' | 'netlink' | 'unix' | 'rds' | 'llc' | 'can' | 'tipc' | 'iucv' | 'rxrpc' | 'isdn' | 'phonet' | 'ieee802154' | 'caif' | 'alg' | 'nfc' | 'vsock' | 'mpls' | 'ib' | 'kcm' ) ',' TYPE = ( 'stream' | 'dgram' | 'seqpacket' | 'rdm' | 'raw' | 'packet' ) PROTOCOL = ( 'tcp' | 'udp' | 'icmp' ) eg. network, network inet, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add more debug asserts to apparmorfsJohn Johansen
Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: make policy_unpack able to audit different info messagesJohn Johansen
Switch unpack auditing to using the generic name field in the audit struct and make it so we can start adding new info messages about why an unpack failed. Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add support for absolute root view based labelsJohn Johansen
With apparmor policy virtualization based on policy namespace View's we don't generally want/need absolute root based views, however there are cases like debugging and some secid based conversions where using a root based view is important. Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: cleanup conditional check for label in label_printJohn Johansen
Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add mount mediationJohn Johansen
Add basic mount mediation. That allows controlling based on basic mount parameters. It does not include special mount parameters for apparmor, super block labeling, or any triggers for apparmor namespace parameter modifications on pivot root. default userspace policy rules have the form of MOUNT RULE = ( MOUNT | REMOUNT | UMOUNT ) MOUNT = [ QUALIFIERS ] 'mount' [ MOUNT CONDITIONS ] [ SOURCE FILEGLOB ] [ '->' MOUNTPOINT FILEGLOB ] REMOUNT = [ QUALIFIERS ] 'remount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB UMOUNT = [ QUALIFIERS ] 'umount' [ MOUNT CONDITIONS ] MOUNTPOINT FILEGLOB MOUNT CONDITIONS = [ ( 'fstype' | 'vfstype' ) ( '=' | 'in' ) MOUNT FSTYPE EXPRESSION ] [ 'options' ( '=' | 'in' ) MOUNT FLAGS EXPRESSION ] MOUNT FSTYPE EXPRESSION = ( MOUNT FSTYPE LIST | MOUNT EXPRESSION ) MOUNT FSTYPE LIST = Comma separated list of valid filesystem and virtual filesystem types (eg ext4, debugfs, etc) MOUNT FLAGS EXPRESSION = ( MOUNT FLAGS LIST | MOUNT EXPRESSION ) MOUNT FLAGS LIST = Comma separated list of MOUNT FLAGS. MOUNT FLAGS = ( 'ro' | 'rw' | 'nosuid' | 'suid' | 'nodev' | 'dev' | 'noexec' | 'exec' | 'sync' | 'async' | 'remount' | 'mand' | 'nomand' | 'dirsync' | 'noatime' | 'atime' | 'nodiratime' | 'diratime' | 'bind' | 'rbind' | 'move' | 'verbose' | 'silent' | 'loud' | 'acl' | 'noacl' | 'unbindable' | 'runbindable' | 'private' | 'rprivate' | 'slave' | 'rslave' | 'shared' | 'rshared' | 'relatime' | 'norelatime' | 'iversion' | 'noiversion' | 'strictatime' | 'nouser' | 'user' ) MOUNT EXPRESSION = ( ALPHANUMERIC | AARE ) ... PIVOT ROOT RULE = [ QUALIFIERS ] pivot_root [ oldroot=OLD PUT FILEGLOB ] [ NEW ROOT FILEGLOB ] SOURCE FILEGLOB = FILEGLOB MOUNTPOINT FILEGLOB = FILEGLOB eg. mount, mount /dev/foo, mount options=ro /dev/foo -> /mnt/, mount options in (ro,atime) /dev/foo -> /mnt/, mount options=ro options=atime, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: add the ability to mediate signalsJohn Johansen
Add signal mediation where the signal can be mediated based on the signal, direction, or the label or the peer/target. The signal perms are verified on a cross check to ensure policy consistency in the case of incremental policy load/replacement. The optimization of skipping the cross check when policy is guaranteed to be consistent (single compile unit) remains to be done. policy rules have the form of SIGNAL_RULE = [ QUALIFIERS ] 'signal' [ SIGNAL ACCESS PERMISSIONS ] [ SIGNAL SET ] [ SIGNAL PEER ] SIGNAL ACCESS PERMISSIONS = SIGNAL ACCESS | SIGNAL ACCESS LIST SIGNAL ACCESS LIST = '(' Comma or space separated list of SIGNAL ACCESS ')' SIGNAL ACCESS = ( 'r' | 'w' | 'rw' | 'read' | 'write' | 'send' | 'receive' ) SIGNAL SET = 'set' '=' '(' SIGNAL LIST ')' SIGNAL LIST = Comma or space separated list of SIGNALS SIGNALS = ( 'hup' | 'int' | 'quit' | 'ill' | 'trap' | 'abrt' | 'bus' | 'fpe' | 'kill' | 'usr1' | 'segv' | 'usr2' | 'pipe' | 'alrm' | 'term' | 'stkflt' | 'chld' | 'cont' | 'stop' | 'stp' | 'ttin' | 'ttou' | 'urg' | 'xcpu' | 'xfsz' | 'vtalrm' | 'prof' | 'winch' | 'io' | 'pwr' | 'sys' | 'emt' | 'exists' | 'rtmin+0' ... 'rtmin+32' ) SIGNAL PEER = 'peer' '=' AARE eg. signal, # allow all signals signal send set=(hup, kill) peer=foo, Signed-off-by: John Johansen <john.johansen@canonical.com> Acked-by: Seth Arnold <seth.arnold@canonical.com>
2017-09-22apparmor: Redundant condition: prev_ns. in [label.c:1498]John Johansen
Reported-by: David Binderman <dcb314@hotmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix an error code in aafs_create()Dan Carpenter
We accidentally forgot to set the error code on this path. It means we return NULL instead of an error pointer. I looked through a bunch of callers and I don't think it really causes a big issue, but the documentation says we're supposed to return error pointers here. Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix logical error in verify_header()Christos Gkekas
verify_header() is currently checking whether interface version is less than 5 *and* greater than 7, which always evaluates to false. Instead it should check whether it is less than 5 *or* greater than 7. Signed-off-by: Christos Gkekas <chris.gekas@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-22apparmor: Fix shadowed local variable in unpack_trans_table()Geert Uytterhoeven
with W=2: security/apparmor/policy_unpack.c: In function ‘unpack_trans_table’: security/apparmor/policy_unpack.c:469: warning: declaration of ‘pos’ shadows a previous local security/apparmor/policy_unpack.c:451: warning: shadowed declaration is here Rename the old "pos" to "saved_pos" to fix this. Fixes: 5379a3312024a8be ("apparmor: support v7 transition format compatible with label_parse") Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: John Johansen <john.johansen@canonical.com>
2017-09-14Merge branch 'work.set_fs' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull more set_fs removal from Al Viro: "Christoph's 'use kernel_read and friends rather than open-coding set_fs()' series" * 'work.set_fs' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: fs: unexport vfs_readv and vfs_writev fs: unexport vfs_read and vfs_write fs: unexport __vfs_read/__vfs_write lustre: switch to kernel_write gadget/f_mass_storage: stop messing with the address limit mconsole: switch to kernel_read btrfs: switch write_buf to kernel_write net/9p: switch p9_fd_read to kernel_write mm/nommu: switch do_mmap_private to kernel_read serial2002: switch serial2002_tty_write to kernel_{read/write} fs: make the buf argument to __kernel_write a void pointer fs: fix kernel_write prototype fs: fix kernel_read prototype fs: move kernel_read to fs/read_write.c fs: move kernel_write to fs/read_write.c autofs4: switch autofs4_write to __kernel_write ashmem: switch to ->read_iter
2017-09-12Merge tag 'selinux-pr-20170831' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux Pull selinux updates from Paul Moore: "A relatively quiet period for SELinux, 11 patches with only two/three having any substantive changes. These noteworthy changes include another tweak to the NNP/nosuid handling, per-file labeling for cgroups, and an object class fix for AF_UNIX/SOCK_RAW sockets; the rest of the changes are minor tweaks or administrative updates (Stephen's email update explains the file explosion in the diffstat). Everything passes the selinux-testsuite" [ Also a couple of small patches from the security tree from Tetsuo Handa for Tomoyo and LSM cleanup. The separation of security policy updates wasn't all that clean - Linus ] * tag 'selinux-pr-20170831' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/selinux: selinux: constify nf_hook_ops selinux: allow per-file labeling for cgroupfs lsm_audit: update my email address selinux: update my email address MAINTAINERS: update the NetLabel and Labeled Networking information selinux: use GFP_NOWAIT in the AVC kmem_caches selinux: Generalize support for NNP/nosuid SELinux domain transitions selinux: genheaders should fail if too many permissions are defined selinux: update the selinux info in MAINTAINERS credits: update Paul Moore's info selinux: Assign proper class to PF_UNIX/SOCK_RAW sockets tomoyo: Update URLs in Documentation/admin-guide/LSM/tomoyo.rst LSM: Remove security_task_create() hook.
2017-09-11Merge branch 'for-linus' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace Pull namespace updates from Eric Biederman: "Life has been busy and I have not gotten half as much done this round as I would have liked. I delayed it so that a minor conflict resolution with the mips tree could spend a little time in linux-next before I sent this pull request. This includes two long delayed user namespace changes from Kirill Tkhai. It also includes a very useful change from Serge Hallyn that allows the security capability attribute to be used inside of user namespaces. The practical effect of this is people can now untar tarballs and install rpms in user namespaces. It had been suggested to generalize this and encode some of the namespace information information in the xattr name. Upon close inspection that makes the things that should be hard easy and the things that should be easy more expensive. Then there is my bugfix/cleanup for signal injection that removes the magic encoding of the siginfo union member from the kernel internal si_code. The mips folks reported the case where I had used FPE_FIXME me is impossible so I have remove FPE_FIXME from mips, while at the same time including a return statement in that case to keep gcc from complaining about unitialized variables. I almost finished the work to get make copy_siginfo_to_user a trivial copy to user. The code is available at: git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace.git neuter-copy_siginfo_to_user-v3 But I did not have time/energy to get the code posted and reviewed before the merge window opened. I was able to see that the security excuse for just copying fields that we know are initialized doesn't work in practice there are buggy initializations that don't initialize the proper fields in siginfo. So we still sometimes copy unitialized data to userspace" * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: Introduce v3 namespaced file capabilities mips/signal: In force_fcr31_sig return in the impossible case signal: Remove kernel interal si_code magic fcntl: Don't use ambiguous SIG_POLL si_codes prctl: Allow local CAP_SYS_ADMIN changing exe_file security: Use user_namespace::level to avoid redundant iterations in cap_capable() userns,pidns: Verify the userns for new pid namespaces signal/testing: Don't look for __SI_FAULT in userspace signal/mips: Document a conflict with SI_USER with SIGFPE signal/sparc: Document a conflict with SI_USER with SIGFPE signal/ia64: Document a conflict with SI_USER with SIGFPE signal/alpha: Document a conflict with SI_USER for SIGTRAP
2017-09-07Merge tag 'audit-pr-20170907' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit Pull audit updates from Paul Moore: "A small pull request for audit this time, only four patches and only two with any real code changes. Those two changes are the removal of a pointless SELinux AVC initialization audit event and a fix to improve the audit timestamp overhead. The other two patches are comment cleanup and administrative updates, nothing very exciting. Everything passes our tests" * tag 'audit-pr-20170907' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: audit: update the function comments selinux: remove AVC init audit log message audit: update the audit info in MAINTAINERS audit: Reduce overhead using a coarse clock
2017-09-07Merge tag 'secureexec-v4.14-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull secureexec update from Kees Cook: "This series has the ultimate goal of providing a sane stack rlimit when running set*id processes. To do this, the bprm_secureexec LSM hook is collapsed into the bprm_set_creds hook so the secureexec-ness of an exec can be determined early enough to make decisions about rlimits and the resulting memory layouts. Other logic acting on the secureexec-ness of an exec is similarly consolidated. Capabilities needed some special handling, but the refactoring removed other special handling, so that was a wash" * tag 'secureexec-v4.14-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: exec: Consolidate pdeath_signal clearing exec: Use sane stack rlimit under secureexec exec: Consolidate dumpability logic smack: Remove redundant pdeath_signal clearing exec: Use secureexec for clearing pdeath_signal exec: Use secureexec for setting dumpability LSM: drop bprm_secureexec hook commoncap: Move cap_elevated calculation into bprm_set_creds commoncap: Refactor to remove bprm_secureexec hook smack: Refactor to remove bprm_secureexec hook selinux: Refactor to remove bprm_secureexec hook apparmor: Refactor to remove bprm_secureexec hook binfmt: Introduce secureexec flag exec: Correct comments about "point of no return" exec: Rename bprm->cred_prepared to called_set_creds
2017-09-05selinux: remove AVC init audit log messageRichard Guy Briggs
In the process of normalizing audit log messages, it was noticed that the AVC initialization code registered an audit log KERNEL record that didn't fit the standard format. In the process of attempting to normalize it it was determined that this record was not even necessary. Remove it. Ref: http://marc.info/?l=selinux&m=149614868525826&w=2 See: https://github.com/linux-audit/audit-kernel/issues/48 Signed-off-by: Richard Guy Briggs <rgb@redhat.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Steve Grubb <sgrubb@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-09-04fs: fix kernel_write prototypeChristoph Hellwig
Make the position an in/out argument like all the other read/write helpers and and make the buf argument a void pointer. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-04fs: fix kernel_read prototypeChristoph Hellwig
Use proper ssize_t and size_t types for the return value and count argument, move the offset last and make it an in/out argument like all other read/write helpers, and make the buf argument a void pointer to get rid of lots of casts in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-09-01Introduce v3 namespaced file capabilitiesSerge E. Hallyn
Root in a non-initial user ns cannot be trusted to write a traditional security.capability xattr. If it were allowed to do so, then any unprivileged user on the host could map his own uid to root in a private namespace, write the xattr, and execute the file with privilege on the host. However supporting file capabilities in a user namespace is very desirable. Not doing so means that any programs designed to run with limited privilege must continue to support other methods of gaining and dropping privilege. For instance a program installer must detect whether file capabilities can be assigned, and assign them if so but set setuid-root otherwise. The program in turn must know how to drop partial capabilities, and do so only if setuid-root. This patch introduces v3 of the security.capability xattr. It builds a vfs_ns_cap_data struct by appending a uid_t rootid to struct vfs_cap_data. This is the absolute uid_t (that is, the uid_t in user namespace which mounted the filesystem, usually init_user_ns) of the root id in whose namespaces the file capabilities may take effect. When a task asks to write a v2 security.capability xattr, if it is privileged with respect to the userns which mounted the filesystem, then nothing should change. Otherwise, the kernel will transparently rewrite the xattr as a v3 with the appropriate rootid. This is done during the execution of setxattr() to catch user-space-initiated capability writes. Subsequently, any task executing the file which has the noted kuid as its root uid, or which is in a descendent user_ns of such a user_ns, will run the file with capabilities. Similarly when asking to read file capabilities, a v3 capability will be presented as v2 if it applies to the caller's namespace. If a task writes a v3 security.capability, then it can provide a uid for the xattr so long as the uid is valid in its own user namespace, and it is privileged with CAP_SETFCAP over its namespace. The kernel will translate that rootid to an absolute uid, and write that to disk. After this, a task in the writer's namespace will not be able to use those capabilities (unless rootid was 0), but a task in a namespace where the given uid is root will. Only a single security.capability xattr may exist at a time for a given file. A task may overwrite an existing xattr so long as it is privileged over the inode. Note this is a departure from previous semantics, which required privilege to remove a security.capability xattr. This check can be re-added if deemed useful. This allows a simple setxattr to work, allows tar/untar to work, and allows us to tar in one namespace and untar in another while preserving the capability, without risking leaking privilege into a parent namespace. Example using tar: $ cp /bin/sleep sleepx $ mkdir b1 b2 $ lxc-usernsexec -m b:0:100000:1 -m b:1:$(id -u):1 -- chown 0:0 b1 $ lxc-usernsexec -m b:0:100001:1 -m b:1:$(id -u):1 -- chown 0:0 b2 $ lxc-usernsexec -m b:0:100000:1000 -- tar --xattrs-include=security.capability --xattrs -cf b1/sleepx.tar sleepx $ lxc-usernsexec -m b:0:100001:1000 -- tar --xattrs-include=security.capability --xattrs -C b2 -xf b1/sleepx.tar $ lxc-usernsexec -m b:0:100001:1000 -- getcap b2/sleepx b2/sleepx = cap_sys_admin+ep # /opt/ltp/testcases/bin/getv3xattr b2/sleepx v3 xattr, rootid is 100001 A patch to linux-test-project adding a new set of tests for this functionality is in the nsfscaps branch at github.com/hallyn/ltp Changelog: Nov 02 2016: fix invalid check at refuse_fcap_overwrite() Nov 07 2016: convert rootid from and to fs user_ns (From ebiederm: mar 28 2017) commoncap.c: fix typos - s/v4/v3 get_vfs_caps_from_disk: clarify the fs_ns root access check nsfscaps: change the code split for cap_inode_setxattr() Apr 09 2017: don't return v3 cap for caps owned by current root. return a v2 cap for a true v2 cap in non-init ns Apr 18 2017: . Change the flow of fscap writing to support s_user_ns writing. . Remove refuse_fcap_overwrite(). The value of the previous xattr doesn't matter. Apr 24 2017: . incorporate Eric's incremental diff . move cap_convert_nscap to setxattr and simplify its usage May 8, 2017: . fix leaking dentry refcount in cap_inode_getsecurity Signed-off-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-08-28selinux: constify nf_hook_opsArvind Yadav
nf_hook_ops are not supposed to change at runtime. nf_register_net_hooks and nf_unregister_net_hooks are working with const nf_hook_ops. So mark the non-const nf_hook_ops structs as const. Signed-off-by: Arvind Yadav <arvind.yadav.cs@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-22selinux: allow per-file labeling for cgroupfsAntonio Murdaca
This patch allows genfscon per-file labeling for cgroupfs. For instance, this allows to label the "release_agent" file within each cgroup mount and limit writes to it. Signed-off-by: Antonio Murdaca <amurdaca@redhat.com> [PM: subject line and merge tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-17lsm_audit: update my email addressStephen Smalley
Update my email address since epoch.ncsc.mil no longer exists. MAINTAINERS and CREDITS are already correct. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-17selinux: update my email addressStephen Smalley
Update my email address since epoch.ncsc.mil no longer exists. MAINTAINERS and CREDITS are already correct. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-08selinux: use GFP_NOWAIT in the AVC kmem_cachesMichal Hocko
There is a strange __GFP_NOMEMALLOC usage pattern in SELinux, specifically GFP_ATOMIC | __GFP_NOMEMALLOC which doesn't make much sense. GFP_ATOMIC on its own allows to access memory reserves while __GFP_NOMEMALLOC dictates we cannot use memory reserves. Replace this with the much more sane GFP_NOWAIT in the AVC code as we can tolerate memory allocation failures in that code. Signed-off-by: Michal Hocko <mhocko@kernel.org> Acked-by: Mel Gorman <mgorman@suse.de> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-02selinux: Generalize support for NNP/nosuid SELinux domain transitionsStephen Smalley
As systemd ramps up enabling NNP (NoNewPrivileges) for system services, it is increasingly breaking SELinux domain transitions for those services and their descendants. systemd enables NNP not only for services whose unit files explicitly specify NoNewPrivileges=yes but also for services whose unit files specify any of the following options in combination with running without CAP_SYS_ADMIN (e.g. specifying User= or a CapabilityBoundingSet= without CAP_SYS_ADMIN): SystemCallFilter=, SystemCallArchitectures=, RestrictAddressFamilies=, RestrictNamespaces=, PrivateDevices=, ProtectKernelTunables=, ProtectKernelModules=, MemoryDenyWriteExecute=, or RestrictRealtime= as per the systemd.exec(5) man page. The end result is bad for the security of both SELinux-disabled and SELinux-enabled systems. Packagers have to turn off these options in the unit files to preserve SELinux domain transitions. For users who choose to disable SELinux, this means that they miss out on at least having the systemd-supported protections. For users who keep SELinux enabled, they may still be missing out on some protections because it isn't necessarily guaranteed that the SELinux policy for that service provides the same protections in all cases. commit 7b0d0b40cd78 ("selinux: Permit bounded transitions under NO_NEW_PRIVS or NOSUID.") allowed bounded transitions under NNP in order to support limited usage for sandboxing programs. However, defining typebounds for all of the affected service domains is impractical to implement in policy, since typebounds requires us to ensure that each domain is allowed everything all of its descendant domains are allowed, and this has to be repeated for the entire chain of domain transitions. There is no way to clone all allow rules from descendants to their ancestors in policy currently, and doing so would be undesirable even if it were practical, as it requires leaking permissions to objects and operations into ancestor domains that could weaken their own security in order to allow them to the descendants (e.g. if a descendant requires execmem permission, then so do all of its ancestors; if a descendant requires execute permission to a file, then so do all of its ancestors; if a descendant requires read to a symbolic link or temporary file, then so do all of its ancestors...). SELinux domains are intentionally not hierarchical / bounded in this manner normally, and making them so would undermine their protections and least privilege. We have long had a similar tension with SELinux transitions and nosuid mounts, albeit not as severe. Users often have had to choose between retaining nosuid on a mount and allowing SELinux domain transitions on files within those mounts. This likewise leads to unfortunate tradeoffs in security. Decouple NNP/nosuid from SELinux transitions, so that we don't have to make a choice between them. Introduce a nnp_nosuid_transition policy capability that enables transitions under NNP/nosuid to be based on a permission (nnp_transition for NNP; nosuid_transition for nosuid) between the old and new contexts in addition to the current support for bounded transitions. Domain transitions can then be allowed in policy without requiring the parent to be a strict superset of all of its children. With this change, systemd unit files can be left unmodified from upstream. SELinux-disabled and SELinux-enabled users will benefit from retaining any of the systemd-provided protections. SELinux policy will only need to be adapted to enable the new policy capability and to allow the new permissions between domain pairs as appropriate. NB: Allowing nnp_transition between two contexts opens up the potential for the old context to subvert the new context by installing seccomp filters before the execve. Allowing nosuid_transition between two contexts opens up the potential for a context transition to occur on a file from an untrusted filesystem (e.g. removable media or remote filesystem). Use with care. Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-08-01smack: Remove redundant pdeath_signal clearingKees Cook
This removes the redundant pdeath_signal clearing in Smack: the check in smack_bprm_committing_creds() matches the check in smack_bprm_set_creds() (which used to be in the now-removed smack_bprm_securexec() hook) and since secureexec is now being checked for clearing pdeath_signal, this is redundant to the common exec code. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
2017-08-01LSM: drop bprm_secureexec hookKees Cook
This removes the bprm_secureexec hook since the logic has been folded into the bprm_set_creds hook for all LSMs now. Cc: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01commoncap: Move cap_elevated calculation into bprm_set_credsKees Cook
Instead of a separate function, open-code the cap_elevated test, which lets us entirely remove bprm->cap_effective (to use the local "effective" variable instead), and more accurately examine euid/egid changes via the existing local "is_setid". The following LTP tests were run to validate the changes: # ./runltp -f syscalls -s cap # ./runltp -f securebits # ./runltp -f cap_bounds # ./runltp -f filecaps All kernel selftests for capabilities and exec continue to pass as well. Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: Andy Lutomirski <luto@kernel.org>
2017-08-01commoncap: Refactor to remove bprm_secureexec hookKees Cook
The commoncap implementation of the bprm_secureexec hook is the only LSM that depends on the final call to its bprm_set_creds hook (since it may be called for multiple files, it ignores bprm->called_set_creds). As a result, it cannot safely _clear_ bprm->secureexec since other LSMs may have set it. Instead, remove the bprm_secureexec hook by introducing a new flag to bprm specific to commoncap: cap_elevated. This is similar to cap_effective, but that is used for a specific subset of elevated privileges, and exists solely to track state from bprm_set_creds to bprm_secureexec. As such, it will be removed in the next patch. Here, set the new bprm->cap_elevated flag when setuid/setgid has happened from bprm_fill_uid() or fscapabilities have been prepared. This temporarily moves the bprm_secureexec hook to a static inline. The helper will be removed in the next patch; this makes the step easier to review and bisect, since this does not introduce any changes to inputs nor outputs to the "elevated privileges" calculation. The new flag is merged with the bprm->secureexec flag in setup_new_exec() since this marks the end of any further prepare_binprm() calls. Cc: Andy Lutomirski <luto@kernel.org> Signed-off-by: Kees Cook <keescook@chromium.org> Reviewed-by: Andy Lutomirski <luto@kernel.org> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01smack: Refactor to remove bprm_secureexec hookKees Cook
The Smack bprm_secureexec hook can be merged with the bprm_set_creds hook since it's dealing with the same information, and all of the details are finalized during the first call to the bprm_set_creds hook via prepare_binprm() (subsequent calls due to binfmt_script, etc, are ignored via bprm->called_set_creds). Here, the test can just happen at the end of the bprm_set_creds hook, and the bprm_secureexec hook can be dropped. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: Casey Schaufler <casey@schaufler-ca.com>
2017-08-01selinux: Refactor to remove bprm_secureexec hookKees Cook
The SELinux bprm_secureexec hook can be merged with the bprm_set_creds hook since it's dealing with the same information, and all of the details are finalized during the first call to the bprm_set_creds hook via prepare_binprm() (subsequent calls due to binfmt_script, etc, are ignored via bprm->called_set_creds). Here, the test can just happen at the end of the bprm_set_creds hook, and the bprm_secureexec hook can be dropped. Cc: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: Paul Moore <paul@paul-moore.com> Tested-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Reviewed-by: Andy Lutomirski <luto@kernel.org>
2017-08-01apparmor: Refactor to remove bprm_secureexec hookKees Cook
The AppArmor bprm_secureexec hook can be merged with the bprm_set_creds hook since it's dealing with the same information, and all of the details are finalized during the first call to the bprm_set_creds hook via prepare_binprm() (subsequent calls due to binfmt_script, etc, are ignored via bprm->called_set_creds). Here, all the comments describe how secureexec is actually calculated during bprm_set_creds, so this actually does it, drops the bprm flag that was being used internally by AppArmor, and drops the bprm_secureexec hook. Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Reviewed-by: James Morris <james.l.morris@oracle.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-08-01exec: Rename bprm->cred_prepared to called_set_credsKees Cook
The cred_prepared bprm flag has a misleading name. It has nothing to do with the bprm_prepare_cred hook, and actually tracks if bprm_set_creds has been called. Rename this flag and improve its comment. Cc: David Howells <dhowells@redhat.com> Cc: Stephen Smalley <sds@tycho.nsa.gov> Cc: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Kees Cook <keescook@chromium.org> Acked-by: John Johansen <john.johansen@canonical.com> Acked-by: James Morris <james.l.morris@oracle.com> Acked-by: Paul Moore <paul@paul-moore.com> Acked-by: Serge Hallyn <serge@hallyn.com>
2017-07-31netfilter: nf_hook_ops structs can be constFlorian Westphal
We no longer place these on a list so they can be const. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2017-07-25selinux: Assign proper class to PF_UNIX/SOCK_RAW socketsLuis Ressel
For PF_UNIX, SOCK_RAW is synonymous with SOCK_DGRAM (cf. net/unix/af_unix.c). This is a tad obscure, but libpcap uses it. Signed-off-by: Luis Ressel <aranea@aixah.de> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com>
2017-07-25sync to Linus v4.13-rc2 for subsystem developers to work againstJames Morris
2017-07-21Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/netDavid S. Miller
2017-07-20security: Use user_namespace::level to avoid redundant iterations in ↵Kirill Tkhai
cap_capable() When ns->level is not larger then cred->user_ns->level, then ns can't be cred->user_ns's descendant, and there is no a sense to search in parents. So, break the cycle earlier and skip needless iterations. v2: Change comment on suggested by Andy Lutomirski. Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2017-07-19Merge tag 'gcc-plugins-v4.13-rc2' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux Pull structure randomization updates from Kees Cook: "Now that IPC and other changes have landed, enable manual markings for randstruct plugin, including the task_struct. This is the rest of what was staged in -next for the gcc-plugins, and comes in three patches, largest first: - mark "easy" structs with __randomize_layout - mark task_struct with an optional anonymous struct to isolate the __randomize_layout section - mark structs to opt _out_ of automated marking (which will come later) And, FWIW, this continues to pass allmodconfig (normal and patched to enable gcc-plugins) builds of x86_64, i386, arm64, arm, powerpc, and s390 for me" * tag 'gcc-plugins-v4.13-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux: randstruct: opt-out externally exposed function pointer structs task_struct: Allow randomized layout randstruct: Mark various structs for randomization
2017-07-18xfrm: remove flow cacheFlorian Westphal
After rcu conversions performance degradation in forward tests isn't that noticeable anymore. See next patch for some numbers. A followup patcg could then also remove genid from the policies as we do not cache bundles anymore. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>
2017-07-18LSM: Remove security_task_create() hook.Tetsuo Handa
Since commit a79be238600d1a03 ("selinux: Use task_alloc hook rather than task_create hook") changed to use task_alloc hook, task_create hook is no longer used. Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-07-14KEYS: DH: validate __spare fieldEric Biggers
Syscalls must validate that their reserved arguments are zero and return EINVAL otherwise. Otherwise, it will be impossible to actually use them for anything in the future because existing programs may be passing garbage in. This is standard practice when adding new APIs. Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.l.morris@oracle.com>
2017-07-12include/linux/string.h: add the option of fortified string.h functionsDaniel Micay
This adds support for compiling with a rough equivalent to the glibc _FORTIFY_SOURCE=1 feature, providing compile-time and runtime buffer overflow checks for string.h functions when the compiler determines the size of the source or destination buffer at compile-time. Unlike glibc, it covers buffer reads in addition to writes. GNU C __builtin_*_chk intrinsics are avoided because they would force a much more complex implementation. They aren't designed to detect read overflows and offer no real benefit when using an implementation based on inline checks. Inline checks don't add up to much code size and allow full use of the regular string intrinsics while avoiding the need for a bunch of _chk functions and per-arch assembly to avoid wrapper overhead. This detects various overflows at compile-time in various drivers and some non-x86 core kernel code. There will likely be issues caught in regular use at runtime too. Future improvements left out of initial implementation for simplicity, as it's all quite optional and can be done incrementally: * Some of the fortified string functions (strncpy, strcat), don't yet place a limit on reads from the source based on __builtin_object_size of the source buffer. * Extending coverage to more string functions like strlcat. * It should be possible to optionally use __builtin_object_size(x, 1) for some functions (C strings) to detect intra-object overflows (like glibc's _FORTIFY_SOURCE=2), but for now this takes the conservative approach to avoid likely compatibility issues. * The compile-time checks should be made available via a separate config option which can be enabled by default (or always enabled) once enough time has passed to get the issues it catches fixed. Kees said: "This is great to have. While it was out-of-tree code, it would have blocked at least CVE-2016-3858 from being exploitable (improper size argument to strlcpy()). I've sent a number of fixes for out-of-bounds-reads that this detected upstream already" [arnd@arndb.de: x86: fix fortified memcpy] Link: http://lkml.kernel.org/r/20170627150047.660360-1-arnd@arndb.de [keescook@chromium.org: avoid panic() in favor of BUG()] Link: http://lkml.kernel.org/r/20170626235122.GA25261@beast [keescook@chromium.org: move from -mm, add ARCH_HAS_FORTIFY_SOURCE, tweak Kconfig help] Link: http://lkml.kernel.org/r/20170526095404.20439-1-danielmicay@gmail.com Link: http://lkml.kernel.org/r/1497903987-21002-8-git-send-email-keescook@chromium.org Signed-off-by: Daniel Micay <danielmicay@gmail.com> Signed-off-by: Kees Cook <keescook@chromium.org> Signed-off-by: Arnd Bergmann <arnd@arndb.de> Acked-by: Kees Cook <keescook@chromium.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Daniel Axtens <dja@axtens.net> Cc: Rasmus Villemoes <linux@rasmusvillemoes.dk> Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com> Cc: Chris Metcalf <cmetcalf@ezchip.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-07-05Merge branch 'work.memdup_user' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Pull memdup_user() conversions from Al Viro: "A fairly self-contained series - hunting down open-coded memdup_user() and memdup_user_nul() instances" * 'work.memdup_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: bpf: don't open-code memdup_user() kimage_file_prepare_segments(): don't open-code memdup_user() ethtool: don't open-code memdup_user() do_ip_setsockopt(): don't open-code memdup_user() do_ipv6_setsockopt(): don't open-code memdup_user() irda: don't open-code memdup_user() xfrm_user_policy(): don't open-code memdup_user() ima_write_policy(): don't open-code memdup_user_nul() sel_write_validatetrans(): don't open-code memdup_user_nul()
2017-07-05Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-nextLinus Torvalds
Pull networking updates from David Miller: "Reasonably busy this cycle, but perhaps not as busy as in the 4.12 merge window: 1) Several optimizations for UDP processing under high load from Paolo Abeni. 2) Support pacing internally in TCP when using the sch_fq packet scheduler for this is not practical. From Eric Dumazet. 3) Support mutliple filter chains per qdisc, from Jiri Pirko. 4) Move to 1ms TCP timestamp clock, from Eric Dumazet. 5) Add batch dequeueing to vhost_net, from Jason Wang. 6) Flesh out more completely SCTP checksum offload support, from Davide Caratti. 7) More plumbing of extended netlink ACKs, from David Ahern, Pablo Neira Ayuso, and Matthias Schiffer. 8) Add devlink support to nfp driver, from Simon Horman. 9) Add RTM_F_FIB_MATCH flag to RTM_GETROUTE queries, from Roopa Prabhu. 10) Add stack depth tracking to BPF verifier and use this information in the various eBPF JITs. From Alexei Starovoitov. 11) Support XDP on qed device VFs, from Yuval Mintz. 12) Introduce BPF PROG ID for better introspection of installed BPF programs. From Martin KaFai Lau. 13) Add bpf_set_hash helper for TC bpf programs, from Daniel Borkmann. 14) For loads, allow narrower accesses in bpf verifier checking, from Yonghong Song. 15) Support MIPS in the BPF selftests and samples infrastructure, the MIPS eBPF JIT will be merged in via the MIPS GIT tree. From David Daney. 16) Support kernel based TLS, from Dave Watson and others. 17) Remove completely DST garbage collection, from Wei Wang. 18) Allow installing TCP MD5 rules using prefixes, from Ivan Delalande. 19) Add XDP support to Intel i40e driver, from Björn Töpel 20) Add support for TC flower offload in nfp driver, from Simon Horman, Pieter Jansen van Vuuren, Benjamin LaHaise, Jakub Kicinski, and Bert van Leeuwen. 21) IPSEC offloading support in mlx5, from Ilan Tayari. 22) Add HW PTP support to macb driver, from Rafal Ozieblo. 23) Networking refcount_t conversions, From Elena Reshetova. 24) Add sock_ops support to BPF, from Lawrence Brako. This is useful for tuning the TCP sockopt settings of a group of applications, currently via CGROUPs" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1899 commits) net: phy: dp83867: add workaround for incorrect RX_CTRL pin strap dt-bindings: phy: dp83867: provide a workaround for incorrect RX_CTRL pin strap cxgb4: Support for get_ts_info ethtool method cxgb4: Add PTP Hardware Clock (PHC) support cxgb4: time stamping interface for PTP nfp: default to chained metadata prepend format nfp: remove legacy MAC address lookup nfp: improve order of interfaces in breakout mode net: macb: remove extraneous return when MACB_EXT_DESC is defined bpf: add missing break in for the TCP_BPF_SNDCWND_CLAMP case bpf: fix return in load_bpf_file mpls: fix rtm policy in mpls_getroute net, ax25: convert ax25_cb.refcount from atomic_t to refcount_t net, ax25: convert ax25_route.refcount from atomic_t to refcount_t net, ax25: convert ax25_uid_assoc.refcount from atomic_t to refcount_t net, sctp: convert sctp_ep_common.refcnt from atomic_t to refcount_t net, sctp: convert sctp_transport.refcnt from atomic_t to refcount_t net, sctp: convert sctp_chunk.refcnt from atomic_t to refcount_t net, sctp: convert sctp_datamsg.refcnt from atomic_t to refcount_t net, sctp: convert sctp_auth_bytes.refcnt from atomic_t to refcount_t ...