summaryrefslogtreecommitdiffstats
path: root/security
AgeCommit message (Collapse)Author
2021-03-07smackfs: restrict bytes count in smackfs write functionsSabyrzhan Tasbolatov
commit 7ef4c19d245f3dc233fd4be5acea436edd1d83d8 upstream. syzbot found WARNINGs in several smackfs write operations where bytes count is passed to memdup_user_nul which exceeds GFP MAX_ORDER. Check count size if bigger than PAGE_SIZE. Per smackfs doc, smk_write_net4addr accepts any label or -CIPSO, smk_write_net6addr accepts any label or -DELETE. I couldn't find any general rule for other label lengths except SMK_LABELLEN, SMK_LONGLABEL, SMK_CIPSOMAX which are documented. Let's constrain, in general, smackfs label lengths for PAGE_SIZE. Although fuzzer crashes write to smackfs/netlabel on 0x400000 length. Here is a quick way to reproduce the WARNING: python -c "print('A' * 0x400000)" > /sys/fs/smackfs/netlabel Reported-by: syzbot+a71a442385a0b2815497@syzkaller.appspotmail.com Signed-off-by: Sabyrzhan Tasbolatov <snovitoll@gmail.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04KEYS: trusted: Fix migratable=1 failingJarkko Sakkinen
commit 8da7520c80468c48f981f0b81fc1be6599e3b0ad upstream. Consider the following transcript: $ keyctl add trusted kmk "new 32 blobauth=helloworld keyhandle=80000000 migratable=1" @u add_key: Invalid argument The documentation has the following description: migratable= 0|1 indicating permission to reseal to new PCR values, default 1 (resealing allowed) The consequence is that "migratable=1" should succeed. Fix this by allowing this condition to pass instead of return -EINVAL. [*] Documentation/security/keys/trusted-encrypted.rst Cc: stable@vger.kernel.org Cc: "James E.J. Bottomley" <jejb@linux.ibm.com> Cc: Mimi Zohar <zohar@linux.ibm.com> Cc: David Howells <dhowells@redhat.com> Fixes: d00a1c72f7f4 ("keys: add new trusted key-type") Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-03-04certs: Fix blacklist flag type confusionDavid Howells
[ Upstream commit 4993e1f9479a4161fd7d93e2b8b30b438f00cb0f ] KEY_FLAG_KEEP is not meant to be passed to keyring_alloc() or key_alloc(), as these only take KEY_ALLOC_* flags. KEY_FLAG_KEEP has the same value as KEY_ALLOC_BYPASS_RESTRICTION, but fortunately only key_create_or_update() uses it. LSMs using the key_alloc hook don't check that flag. KEY_FLAG_KEEP is then ignored but fortunately (again) the root user cannot write to the blacklist keyring, so it is not possible to remove a key/hash from it. Fix this by adding a KEY_ALLOC_SET_KEEP flag that tells key_alloc() to set KEY_FLAG_KEEP on the new key. blacklist_init() can then, correctly, pass this to keyring_alloc(). We can also use this in ima_mok_init() rather than setting the flag manually. Note that this doesn't fix an observable bug with the current implementation but it is required to allow addition of new hashes to the blacklist in the future without making it possible for them to be removed. Fixes: 734114f8782f ("KEYS: Add a system blacklist keyring") Reported-by: Mickaël Salaün <mic@linux.microsoft.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Mickaël Salaün <mic@linux.microsoft.com> cc: Mimi Zohar <zohar@linux.vnet.ibm.com> Cc: David Woodhouse <dwmw2@infradead.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04capabilities: Don't allow writing ambiguous v3 file capabilitiesEric W. Biederman
[ Upstream commit 95ebabde382c371572297915b104e55403674e73 ] The v3 file capabilities have a uid field that records the filesystem uid of the root user of the user namespace the file capabilities are valid in. When someone is silly enough to have the same underlying uid as the root uid of multiple nested containers a v3 filesystem capability can be ambiguous. In the spirit of don't do that then, forbid writing a v3 filesystem capability if it is ambiguous. Fixes: 8db6c34f1dbc ("Introduce v3 namespaced file capabilities") Reviewed-by: Andrew G. Morgan <morgan@kernel.org> Reviewed-by: Serge Hallyn <serge@hallyn.com> Signed-off-by: Eric W. Biederman <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04ima: Free IMA measurement buffer after kexec syscallLakshmi Ramasubramanian
[ Upstream commit f31e3386a4e92ba6eda7328cb508462956c94c64 ] IMA allocates kernel virtual memory to carry forward the measurement list, from the current kernel to the next kernel on kexec system call, in ima_add_kexec_buffer() function. This buffer is not freed before completing the kexec system call resulting in memory leak. Add ima_buffer field in "struct kimage" to store the virtual address of the buffer allocated for the IMA measurement list. Free the memory allocated for the IMA measurement list in kimage_file_post_load_cleanup() function. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Reviewed-by: Thiago Jung Bauermann <bauerman@linux.ibm.com> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-03-04ima: Free IMA measurement buffer on errorLakshmi Ramasubramanian
[ Upstream commit 6d14c6517885fa68524238787420511b87d671df ] IMA allocates kernel virtual memory to carry forward the measurement list, from the current kernel to the next kernel on kexec system call, in ima_add_kexec_buffer() function. In error code paths this memory is not freed resulting in memory leak. Free the memory allocated for the IMA measurement list in the error code paths in ima_add_kexec_buffer() function. Signed-off-by: Lakshmi Ramasubramanian <nramas@linux.microsoft.com> Suggested-by: Tyler Hicks <tyhicks@linux.microsoft.com> Fixes: 7b8589cc29e7 ("ima: on soft reboot, save the measurement list") Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-02-23cap: fix conversions on getxattrMiklos Szeredi
[ Upstream commit f2b00be488730522d0fb7a8a5de663febdcefe0a ] If a capability is stored on disk in v2 format cap_inode_getsecurity() will currently return in v2 format unconditionally. This is wrong: v2 cap should be equivalent to a v3 cap with zero rootid, and so the same conversions performed on it. If the rootid cannot be mapped, v3 is returned unconverted. Fix this so that both v2 and v3 return -EOVERFLOW if the rootid (or the owner of the fs user namespace in case of v2) cannot be mapped into the current user namespace. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Acked-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-01-19dump_common_audit_data(): fix racy accesses to ->d_nameAl Viro
commit d36a1dd9f77ae1e72da48f4123ed35627848507d upstream. We are not guaranteed the locking environment that would prevent dentry getting renamed right under us. And it's possible for old long name to be freed after rename, leading to UAF here. Cc: stable@kernel.org # v2.6.2+ Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-01-19ima: Remove __init annotation from ima_pcrread()Roberto Sassu
commit 8b8c704d913b0fe490af370631a4200e26334ec0 upstream. Commit 6cc7c266e5b4 ("ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()") added a call to ima_calc_boot_aggregate() so that the digest can be recalculated for the boot_aggregate measurement entry if the 'd' template field has been requested. For the 'd' field, only SHA1 and MD5 digests are accepted. Given that ima_eventdigest_init() does not have the __init annotation, all functions called should not have it. This patch removes __init from ima_pcrread(). Cc: stable@vger.kernel.org Fixes: 6cc7c266e5b4 ("ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()") Reported-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30ima: Don't modify file descriptor mode on the flyRoberto Sassu
commit 207cdd565dfc95a0a5185263a567817b7ebf5467 upstream. Commit a408e4a86b36b ("ima: open a new file instance if no read permissions") already introduced a second open to measure a file when the original file descriptor does not allow it. However, it didn't remove the existing method of changing the mode of the original file descriptor, which is still necessary if the current process does not have enough privileges to open a new one. Changing the mode isn't really an option, as the filesystem might need to do preliminary steps to make the read possible. Thus, this patch removes the code and keeps the second open as the only option to measure a file when it is unreadable with the original file descriptor. Cc: <stable@vger.kernel.org> # 4.20.x: 0014cc04e8ec0 ima: Set file->f_mode Fixes: 2fe5d6def1672 ("ima: integrity appraisal extension") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-12-30selinux: fix inode_doinit_with_dentry() LABEL_INVALID error handlingPaul Moore
[ Upstream commit 200ea5a2292dc444a818b096ae6a32ba3caa51b9 ] A previous fix, commit 83370b31a915 ("selinux: fix error initialization in inode_doinit_with_dentry()"), changed how failures were handled before a SELinux policy was loaded. Unfortunately that patch was potentially problematic for two reasons: it set the isec->initialized state without holding a lock, and it didn't set the inode's SELinux label to the "default" for the particular filesystem. The later can be a problem if/when a later attempt to revalidate the inode fails and SELinux reverts to the existing inode label. This patch should restore the default inode labeling that existed before the original fix, without affecting the LABEL_INVALID marking such that revalidation will still be attempted in the future. Fixes: 83370b31a915 ("selinux: fix error initialization in inode_doinit_with_dentry()") Reported-by: Sven Schnelle <svens@linux.ibm.com> Tested-by: Sven Schnelle <svens@linux.ibm.com> Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-12-30selinux: fix error initialization in inode_doinit_with_dentry()Tianyue Ren
[ Upstream commit 83370b31a915493231e5b9addc72e4bef69f8d31 ] Mark the inode security label as invalid if we cannot find a dentry so that we will retry later rather than marking it initialized with the unlabeled SID. Fixes: 9287aed2ad1f ("selinux: Convert isec->lock into a spinlock") Signed-off-by: Tianyue Ren <rentianyue@kylinos.cn> [PM: minor comment tweaks] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-11-18selinux: Fix error return code in sel_ib_pkey_sid_slow()Chen Zhou
commit c350f8bea271782e2733419bd2ab9bf4ec2051ef upstream. Fix to return a negative error code from the error handling case instead of 0 in function sel_ib_pkey_sid_slow(), as done elsewhere in this function. Cc: stable@vger.kernel.org Fixes: 409dcf31538a ("selinux: Add a cache for quicker retreival of PKey SIDs") Reported-by: Hulk Robot <hulkci@huawei.com> Signed-off-by: Chen Zhou <chenzhou10@huawei.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-11-05evm: Check size of security.evm before using itRoberto Sassu
commit 455b6c9112eff8d249e32ba165742085678a80a4 upstream. This patch checks the size for the EVM_IMA_XATTR_DIGSIG and EVM_XATTR_PORTABLE_DIGSIG types to ensure that the algorithm is read from the buffer returned by vfs_getxattr_alloc(). Cc: stable@vger.kernel.org # 4.19.x Fixes: 5feeb61183dde ("evm: Allow non-SHA1 digital signatures") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-29ima: Don't ignore errors from crypto_shash_update()Roberto Sassu
commit 60386b854008adc951c470067f90a2d85b5d520f upstream. Errors returned by crypto_shash_update() are not checked in ima_calc_boot_aggregate_tfm() and thus can be overwritten at the next iteration of the loop. This patch adds a check after calling crypto_shash_update() and returns immediately if the result is not zero. Cc: stable@vger.kernel.org Fixes: 3323eec921efd ("integrity: IMA as an integrity service provider") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-10-01selinux: sel_avc_get_stat_idx should increase position indexVasily Averin
[ Upstream commit 8d269a8e2a8f0bca89022f4ec98de460acb90365 ] If seq_file .next function does not change position index, read after some lseek can generate unexpected output. $ dd if=/sys/fs/selinux/avc/cache_stats # usual output lookups hits misses allocations reclaims frees 817223 810034 7189 7189 6992 7037 1934894 1926896 7998 7998 7632 7683 1322812 1317176 5636 5636 5456 5507 1560571 1551548 9023 9023 9056 9115 0+1 records in 0+1 records out 189 bytes copied, 5,1564e-05 s, 3,7 MB/s $# read after lseek to midle of last line $ dd if=/sys/fs/selinux/avc/cache_stats bs=180 skip=1 dd: /sys/fs/selinux/avc/cache_stats: cannot skip to specified offset 056 9115 <<<< end of last line 1560571 1551548 9023 9023 9056 9115 <<< whole last line once again 0+1 records in 0+1 records out 45 bytes copied, 8,7221e-05 s, 516 kB/s $# read after lseek beyond end of of file $ dd if=/sys/fs/selinux/avc/cache_stats bs=1000 skip=1 dd: /sys/fs/selinux/avc/cache_stats: cannot skip to specified offset 1560571 1551548 9023 9023 9056 9115 <<<< generates whole last line 0+1 records in 0+1 records out 36 bytes copied, 9,0934e-05 s, 396 kB/s https://bugzilla.kernel.org/show_bug.cgi?id=206283 Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-10-01selinux: allow labeling before policy is loadedJonathan Lebon
[ Upstream commit 3e3e24b42043eceb97ed834102c2d094dfd7aaa6 ] Currently, the SELinux LSM prevents one from setting the `security.selinux` xattr on an inode without a policy first being loaded. However, this restriction is problematic: it makes it impossible to have newly created files with the correct label before actually loading the policy. This is relevant in distributions like Fedora, where the policy is loaded by systemd shortly after pivoting out of the initrd. In such instances, all files created prior to pivoting will be unlabeled. One then has to relabel them after pivoting, an operation which inherently races with other processes trying to access those same files. Going further, there are use cases for creating the entire root filesystem on first boot from the initrd (e.g. Container Linux supports this today[1], and we'd like to support it in Fedora CoreOS as well[2]). One can imagine doing this in two ways: at the block device level (e.g. laying down a disk image), or at the filesystem level. In the former, labeling can simply be part of the image. But even in the latter scenario, one still really wants to be able to set the right labels when populating the new filesystem. This patch enables this by changing behaviour in the following two ways: 1. allow `setxattr` if we're not initialized 2. don't try to set the in-core inode SID if we're not initialized; instead leave it as `LABEL_INVALID` so that revalidation may be attempted at a later time Note the first hunk of this patch is mostly the same as a previously discussed one[3], though it was part of a larger series which wasn't accepted. [1] https://coreos.com/os/docs/latest/root-filesystem-placement.html [2] https://github.com/coreos/fedora-coreos-tracker/issues/94 [3] https://www.spinics.net/lists/linux-initramfs/msg04593.html Co-developed-by: Victor Kamensky <kamensky@cisco.com> Signed-off-by: Victor Kamensky <kamensky@cisco.com> Signed-off-by: Jonathan Lebon <jlebon@redhat.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19Smack: prevent underflow in smk_set_cipso()Dan Carpenter
[ Upstream commit 42a2df3e829f3c5562090391b33714b2e2e5ad4a ] We have an upper bound on "maplevel" but forgot to check for negative values. Fixes: e114e473771c ("Smack: Simplified Mandatory Access Control Kernel") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-19Smack: fix another vsscanf out of boundsDan Carpenter
[ Upstream commit a6bd4f6d9b07452b0b19842044a6c3ea384b0b88 ] This is similar to commit 84e99e58e8d1 ("Smack: slab-out-of-bounds in vsscanf") where we added a bounds check on "rule". Reported-by: syzbot+a22c6092d003d6fe1122@syzkaller.appspotmail.com Fixes: f7112e6c9abf ("Smack: allow for significantly longer Smack labels v4") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-08-11Smack: fix use-after-free in smk_write_relabel_self()Eric Biggers
commit beb4ee6770a89646659e6a2178538d2b13e2654e upstream. smk_write_relabel_self() frees memory from the task's credentials with no locking, which can easily cause a use-after-free because multiple tasks can share the same credentials structure. Fix this by using prepare_creds() and commit_creds() to correctly modify the task's credentials. Reproducer for "BUG: KASAN: use-after-free in smk_write_relabel_self": #include <fcntl.h> #include <pthread.h> #include <unistd.h> static void *thrproc(void *arg) { int fd = open("/sys/fs/smackfs/relabel-self", O_WRONLY); for (;;) write(fd, "foo", 3); } int main() { pthread_t t; pthread_create(&t, NULL, thrproc, NULL); thrproc(NULL); } Reported-by: syzbot+e6416dabb497a650da40@syzkaller.appspotmail.com Fixes: 38416e53936e ("Smack: limited capability for changing process label") Cc: <stable@vger.kernel.org> # v4.4+ Signed-off-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-07-22apparmor: ensure that dfa state tables have entriesJohn Johansen
commit c27c6bd2c4d6b6bb779f9b722d5607993e1d5e5c upstream. Currently it is possible to specify a state machine table with 0 length, this is not valid as optional tables are specified by not defining the table as present. Further this allows by-passing the base tables range check against the next/check tables. Fixes: d901d6a298dc ("apparmor: dfa split verification of table headers") Reported-by: Mike Salvatore <mike.salvatore@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-30apparmor: don't try to replace stale label in ptraceme checkJann Horn
[ Upstream commit ca3fde5214e1d24f78269b337d3f22afd6bf445e ] begin_current_label_crit_section() must run in sleepable context because when label_is_stale() is true, aa_replace_current_label() runs, which uses prepare_creds(), which can sleep. Until now, the ptraceme access check (which runs with tasklist_lock held) violated this rule. Fixes: b2d09ae449ced ("apparmor: move ptrace checks to using labels") Reported-by: Cyrill Gorcunov <gorcunov@gmail.com> Reported-by: kernel test robot <rong.a.chen@intel.com> Signed-off-by: Jann Horn <jannh@google.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25selinux: fix double freeTom Rix
commit 65de50969a77509452ae590e9449b70a22b923bb upstream. Clang's static analysis tool reports these double free memory errors. security/selinux/ss/services.c:2987:4: warning: Attempt to free released memory [unix.Malloc] kfree(bnames[i]); ^~~~~~~~~~~~~~~~ security/selinux/ss/services.c:2990:2: warning: Attempt to free released memory [unix.Malloc] kfree(bvalues); ^~~~~~~~~~~~~~ So improve the security_get_bools error handling by freeing these variables and setting their return pointers to NULL and the return len to 0 Cc: stable@vger.kernel.org Signed-off-by: Tom Rix <trix@redhat.com> Acked-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-25apparmor: fix nnp subset test for unconfinedJohn Johansen
[ Upstream commit 3ed4aaa94fc07db3cd0c91be95e3e1b9782a2710 ] The subset test is not taking into account the unconfined exception which will cause profile transitions in the stacked confinement case to fail when no_new_privs is applied. This fixes a regression introduced in the fix for https://bugs.launchpad.net/bugs/1839037 BugLink: https://bugs.launchpad.net/bugs/1844186 Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25apparmor: check/put label on apparmor_sk_clone_security()Mauricio Faria de Oliveira
[ Upstream commit 3b646abc5bc6c0df649daea4c2c976bd4d47e4c8 ] Currently apparmor_sk_clone_security() does not check for existing label/peer in the 'new' struct sock; it just overwrites it, if any (with another reference to the label of the source sock.) static void apparmor_sk_clone_security(const struct sock *sk, struct sock *newsk) { struct aa_sk_ctx *ctx = SK_CTX(sk); struct aa_sk_ctx *new = SK_CTX(newsk); new->label = aa_get_label(ctx->label); new->peer = aa_get_label(ctx->peer); } This might leak label references, which might overflow under load. Thus, check for and put labels, to prevent such errors. Note this is similarly done on: static int apparmor_socket_post_create(struct socket *sock, ...) ... if (sock->sk) { struct aa_sk_ctx *ctx = SK_CTX(sock->sk); aa_put_label(ctx->label); ctx->label = aa_get_label(label); } ... Context: ------- The label reference count leak is observed if apparmor_sock_graft() is called previously: this sets the 'ctx->label' field by getting a reference to the current label (later overwritten, without put.) static void apparmor_sock_graft(struct sock *sk, ...) { struct aa_sk_ctx *ctx = SK_CTX(sk); if (!ctx->label) ctx->label = aa_get_current_label(); } And that is the case on crypto/af_alg.c:af_alg_accept(): int af_alg_accept(struct sock *sk, struct socket *newsock, ...) ... struct sock *sk2; ... sk2 = sk_alloc(...); ... security_sock_graft(sk2, newsock); security_sk_clone(sk, sk2); ... Apparently both calls are done on their own right, especially for other LSMs, being introduced in 2010/2014, before apparmor socket mediation in 2017 (see commits [1,2,3,4]). So, it looks OK there! Let's fix the reference leak in apparmor. Test-case: --------- Exercise that code path enough to overflow label reference count. $ cat aa-refcnt-af_alg.c #include <stdio.h> #include <string.h> #include <unistd.h> #include <sys/socket.h> #include <linux/if_alg.h> int main() { int sockfd; struct sockaddr_alg sa; /* Setup the crypto API socket */ sockfd = socket(AF_ALG, SOCK_SEQPACKET, 0); if (sockfd < 0) { perror("socket"); return 1; } memset(&sa, 0, sizeof(sa)); sa.salg_family = AF_ALG; strcpy((char *) sa.salg_type, "rng"); strcpy((char *) sa.salg_name, "stdrng"); if (bind(sockfd, (struct sockaddr *) &sa, sizeof(sa)) < 0) { perror("bind"); return 1; } /* Accept a "connection" and close it; repeat. */ while (!close(accept(sockfd, NULL, 0))); return 0; } $ gcc -o aa-refcnt-af_alg aa-refcnt-af_alg.c $ ./aa-refcnt-af_alg <a few hours later> [ 9928.475953] refcount_t overflow at apparmor_sk_clone_security+0x37/0x70 in aa-refcnt-af_alg[1322], uid/euid: 1000/1000 ... [ 9928.507443] RIP: 0010:apparmor_sk_clone_security+0x37/0x70 ... [ 9928.514286] security_sk_clone+0x33/0x50 [ 9928.514807] af_alg_accept+0x81/0x1c0 [af_alg] [ 9928.516091] alg_accept+0x15/0x20 [af_alg] [ 9928.516682] SYSC_accept4+0xff/0x210 [ 9928.519609] SyS_accept+0x10/0x20 [ 9928.520190] do_syscall_64+0x73/0x130 [ 9928.520808] entry_SYSCALL_64_after_hwframe+0x3d/0xa2 Note that other messages may be seen, not just overflow, depending on the value being incremented by kref_get(); on another run: [ 7273.182666] refcount_t: saturated; leaking memory. ... [ 7273.185789] refcount_t: underflow; use-after-free. Kprobes: ------- Using kprobe events to monitor sk -> sk_security -> label -> count (kref): Original v5.7 (one reference leak every iteration) ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd2 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd4 ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd3 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd5 ... (af_alg_accept+0x0/0x1c0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd4 ... (af_alg_release_parent+0x0/0xd0) label=0xffff8a0f36c25eb0 label_refcnt=0x11fd6 Patched v5.7 (zero reference leak per iteration) ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594 ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594 ... (af_alg_accept+0x0/0x1c0) label=0xffff9ff376c25eb0 label_refcnt=0x593 ... (af_alg_release_parent+0x0/0xd0) label=0xffff9ff376c25eb0 label_refcnt=0x594 Commits: ------- [1] commit 507cad355fc9 ("crypto: af_alg - Make sure sk_security is initialized on accept()ed sockets") [2] commit 4c63f83c2c2e ("crypto: af_alg - properly label AF_ALG socket") [3] commit 2acce6aa9f65 ("Networking") a.k.a ("crypto: af_alg - Avoid sock_graft call warning) [4] commit 56974a6fcfef ("apparmor: add base infastructure for socket mediation") Fixes: 56974a6fcfef ("apparmor: add base infastructure for socket mediation") Reported-by: Brian Moyles <bmoyles@netflix.com> Signed-off-by: Mauricio Faria de Oliveira <mfo@canonical.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-25apparmor: fix introspection of of task mode for unconfined tasksJohn Johansen
[ Upstream commit dd2569fbb053719f7df7ef8fdbb45cf47156a701 ] Fix two issues with introspecting the task mode. 1. If a task is attached to a unconfined profile that is not the ns->unconfined profile then. Mode the mode is always reported as - $ ps -Z LABEL PID TTY TIME CMD unconfined 1287 pts/0 00:00:01 bash test (-) 1892 pts/0 00:00:00 ps instead of the correct value of (unconfined) as shown below $ ps -Z LABEL PID TTY TIME CMD unconfined 2483 pts/0 00:00:01 bash test (unconfined) 3591 pts/0 00:00:00 ps 2. if a task is confined by a stack of profiles that are unconfined the output of label mode is again the incorrect value of (-) like above, instead of (unconfined). This is because the visibile profile count increment is skipped by the special casing of unconfined. Fixes: f1bd904175e8 ("apparmor: add the base fns() for domain labels") Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-22ima: Call ima_calc_boot_aggregate() in ima_eventdigest_init()Roberto Sassu
[ Upstream commit 6cc7c266e5b47d3cd2b5bb7fd3aac4e6bb2dd1d2 ] If the template field 'd' is chosen and the digest to be added to the measurement entry was not calculated with SHA1 or MD5, it is recalculated with SHA1, by using the passed file descriptor. However, this cannot be done for boot_aggregate, because there is no file descriptor. This patch adds a call to ima_calc_boot_aggregate() in ima_eventdigest_init(), so that the digest can be recalculated also for the boot_aggregate entry. Cc: stable@vger.kernel.org # 3.13.x Fixes: 3ce1217d6cd5d ("ima: define template fields library and new helpers") Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-22evm: Fix possible memory leak in evm_calc_hmac_or_hash()Roberto Sassu
commit 0c4395fb2aa77341269ea619c5419ea48171883f upstream. Don't immediately return if the signature is portable and security.ima is not present. Just set error so that memory allocated is freed before returning from evm_calc_hmac_or_hash(). Fixes: 50b977481fce9 ("EVM: Add support for portable signature format") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-22ima: Directly assign the ima_default_policy pointer to ima_rulesRoberto Sassu
commit 067a436b1b0aafa593344fddd711a755a58afb3b upstream. This patch prevents the following oops: [ 10.771813] BUG: kernel NULL pointer dereference, address: 0000000000000 [...] [ 10.779790] RIP: 0010:ima_match_policy+0xf7/0xb80 [...] [ 10.798576] Call Trace: [ 10.798993] ? ima_lsm_policy_change+0x2b0/0x2b0 [ 10.799753] ? inode_init_owner+0x1a0/0x1a0 [ 10.800484] ? _raw_spin_lock+0x7a/0xd0 [ 10.801592] ima_must_appraise.part.0+0xb6/0xf0 [ 10.802313] ? ima_fix_xattr.isra.0+0xd0/0xd0 [ 10.803167] ima_must_appraise+0x4f/0x70 [ 10.804004] ima_post_path_mknod+0x2e/0x80 [ 10.804800] do_mknodat+0x396/0x3c0 It occurs when there is a failure during IMA initialization, and ima_init_policy() is not called. IMA hooks still call ima_match_policy() but ima_rules is NULL. This patch prevents the crash by directly assigning the ima_default_policy pointer to ima_rules when ima_rules is defined. This wouldn't alter the existing behavior, as ima_rules is always set at the end of ima_init_policy(). Cc: stable@vger.kernel.org # 3.7.x Fixes: 07f6a79415d7d ("ima: add appraise action keywords and default rules") Reported-by: Takashi Iwai <tiwai@suse.de> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-22ima: Fix ima digest hash table key calculationKrzysztof Struczynski
commit 1129d31b55d509f15e72dc68e4b5c3a4d7b4da8d upstream. Function hash_long() accepts unsigned long, while currently only one byte is passed from ima_hash_key(), which calculates a key for ima_htable. Given that hashing the digest does not give clear benefits compared to using the digest itself, remove hash_long() and return the modulus calculated on the first two bytes of the digest with the number of slots. Also reduce the depth of the hash table by doubling the number of slots. Cc: stable@vger.kernel.org Fixes: 3323eec921ef ("integrity: IMA as an integrity service provider") Co-developed-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Acked-by: David.Laight@aculab.com (big endian system concerns) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-22Smack: slab-out-of-bounds in vsscanfCasey Schaufler
commit 84e99e58e8d1e26f04c097f4266e431a33987f36 upstream. Add barrier to soob. Return -EOVERFLOW if the buffer is exceeded. Suggested-by: Hillf Danton <hdanton@sina.com> Reported-by: syzbot+bfdd4a2f07be52351350@syzkaller.appspotmail.com Signed-off-by: Casey Schaufler <casey@schaufler-ca.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-06-22mm: add kvfree_sensitive() for freeing sensitive data objectsWaiman Long
[ Upstream commit d4eaa2837851db2bfed572898bfc17f9a9f9151e ] For kvmalloc'ed data object that contains sensitive information like cryptographic keys, we need to make sure that the buffer is always cleared before freeing it. Using memset() alone for buffer clearing may not provide certainty as the compiler may compile it away. To be sure, the special memzero_explicit() has to be used. This patch introduces a new kvfree_sensitive() for freeing those sensitive data objects allocated by kvmalloc(). The relevant places where kvfree_sensitive() can be used are modified to use it. Fixes: 4f0882491a14 ("KEYS: Avoid false positive ENOMEM error on key read") Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Reviewed-by: Eric Biggers <ebiggers@google.com> Acked-by: David Howells <dhowells@redhat.com> Cc: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: James Morris <jmorris@namei.org> Cc: "Serge E. Hallyn" <serge@hallyn.com> Cc: Joe Perches <joe@perches.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: David Rientjes <rientjes@google.com> Cc: Uladzislau Rezki <urezki@gmail.com> Link: http://lkml.kernel.org/r/20200407200318.11711-1-longman@redhat.com Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-07evm: Fix RCU list related warningsMadhuparna Bhowmik
[ Upstream commit 770f60586d2af0590be263f55fd079226313922c ] This patch fixes the following warning and few other instances of traversal of evm_config_xattrnames list: [ 32.848432] ============================= [ 32.848707] WARNING: suspicious RCU usage [ 32.848966] 5.7.0-rc1-00006-ga8d5875ce5f0b #1 Not tainted [ 32.849308] ----------------------------- [ 32.849567] security/integrity/evm/evm_main.c:231 RCU-list traversed in non-reader section!! Since entries are only added to the list and never deleted, use list_for_each_entry_lockless() instead of list_for_each_entry_rcu for traversing the list. Also, add a relevant comment in evm_secfs.c to indicate this fact. Reported-by: kernel test robot <lkp@intel.com> Suggested-by: Paul E. McKenney <paulmck@kernel.org> Signed-off-by: Madhuparna Bhowmik <madhuparnabhowmik10@gmail.com> Acked-by: Paul E. McKenney <paulmck@kernel.org> (RCU viewpoint) Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-06-03exec: Always set cap_ambient in cap_bprm_set_credsEric W. Biederman
[ Upstream commit a4ae32c71fe90794127b32d26d7ad795813b502e ] An invariant of cap_bprm_set_creds is that every field in the new cred structure that cap_bprm_set_creds might set, needs to be set every time to ensure the fields does not get a stale value. The field cap_ambient is not set every time cap_bprm_set_creds is called, which means that if there is a suid or sgid script with an interpreter that has neither the suid nor the sgid bits set the interpreter should be able to accept ambient credentials. Unfortuantely because cap_ambient is not reset to it's original value the interpreter can not accept ambient credentials. Given that the ambient capability set is expected to be controlled by the caller, I don't think this is particularly serious. But it is definitely worth fixing so the code works correctly. I have tested to verify my reading of the code is correct and the interpreter of a sgid can receive ambient capabilities with this change and cannot receive ambient capabilities without this change. Cc: stable@vger.kernel.org Cc: Andy Lutomirski <luto@kernel.org> Fixes: 58319057b784 ("capabilities: ambient capabilities") Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27apparmor: Fix aa_label refcnt leak in policy_updateXiyu Yang
commit c6b39f070722ea9963ffe756bfe94e89218c5e63 upstream. policy_update() invokes begin_current_label_crit_section(), which returns a reference of the updated aa_label object to "label" with increased refcount. When policy_update() returns, "label" becomes invalid, so the refcount should be decreased to keep refcount balanced. The reference counting issue happens in one exception handling path of policy_update(). When aa_may_manage_policy() returns not NULL, the refcnt increased by begin_current_label_crit_section() is not decreased, causing a refcnt leak. Fix this issue by jumping to "end_section" label when aa_may_manage_policy() returns not NULL. Fixes: 5ac8c355ae00 ("apparmor: allow introspecting the loaded policy pre internal transform") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27apparmor: fix potential label refcnt leak in aa_change_profileXiyu Yang
commit a0b845ffa0d91855532b50fc040aeb2d8338dca4 upstream. aa_change_profile() invokes aa_get_current_label(), which returns a reference of the current task's label. According to the comment of aa_get_current_label(), the returned reference must be put with aa_put_label(). However, when the original object pointed by "label" becomes unreachable because aa_change_profile() returns or a new object is assigned to "label", reference count increased by aa_get_current_label() is not decreased, causing a refcnt leak. Fix this by calling aa_put_label() before aa_change_profile() return and dropping unnecessary aa_get_current_label(). Fixes: 9fcf78cca198 ("apparmor: update domain transitions that are subsets of confinement at nnp") Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27apparmor: Fix use-after-free in aa_audit_rule_initNavid Emamdoost
commit c54d481d71c6849e044690d3960aaebc730224cc upstream. In the implementation of aa_audit_rule_init(), when aa_label_parse() fails the allocated memory for rule is released using aa_audit_rule_free(). But after this release, the return statement tries to access the label field of the rule which results in use-after-free. Before releasing the rule, copy errNo and return it after release. Fixes: 52e8c38001d8 ("apparmor: Fix memory leak of rule on error exit path") Signed-off-by: Navid Emamdoost <navid.emamdoost@gmail.com> Signed-off-by: John Johansen <john.johansen@canonical.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-05-27ima: Fix return value of ima_write_policy()Roberto Sassu
[ Upstream commit 2e3a34e9f409ebe83d1af7cd2f49fca7af97dfac ] This patch fixes the return value of ima_write_policy() when a new policy is directly passed to IMA and the current policy requires appraisal of the file containing the policy. Currently, if appraisal is not in ENFORCE mode, ima_write_policy() returns 0 and leads user space applications to an endless loop. Fix this issue by denying the operation regardless of the appraisal mode. Cc: stable@vger.kernel.org # 4.10.x Fixes: 19f8a84713edc ("ima: measure and appraise the IMA policy itself") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27evm: Check also if *tfm is an error pointer in init_desc()Roberto Sassu
[ Upstream commit 53de3b080d5eae31d0de219617155dcc34e7d698 ] This patch avoids a kernel panic due to accessing an error pointer set by crypto_alloc_shash(). It occurs especially when there are many files that require an unsupported algorithm, as it would increase the likelihood of the following race condition: Task A: *tfm = crypto_alloc_shash() <= error pointer Task B: if (*tfm == NULL) <= *tfm is not NULL, use it Task B: rc = crypto_shash_init(desc) <= panic Task A: *tfm = NULL This patch uses the IS_ERR_OR_NULL macro to determine whether or not a new crypto context must be created. Cc: stable@vger.kernel.org Fixes: d46eb3699502b ("evm: crypto hash replaced by shash") Co-developed-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Krzysztof Struczynski <krzysztof.struczynski@huawei.com> Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-27ima: Set file->f_mode instead of file->f_flags in ima_calc_file_hash()Roberto Sassu
[ Upstream commit 0014cc04e8ec077dc482f00c87dfd949cfe2b98f ] Commit a408e4a86b36 ("ima: open a new file instance if no read permissions") tries to create a new file descriptor to calculate a file digest if the file has not been opened with O_RDONLY flag. However, if a new file descriptor cannot be obtained, it sets the FMODE_READ flag to file->f_flags instead of file->f_mode. This patch fixes this issue by replacing f_flags with f_mode as it was before that commit. Cc: stable@vger.kernel.org # 4.20.x Fixes: a408e4a86b36 ("ima: open a new file instance if no read permissions") Signed-off-by: Roberto Sassu <roberto.sassu@huawei.com> Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com> Signed-off-by: Mimi Zohar <zohar@linux.ibm.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-05-06selinux: properly handle multiple messages in selinux_netlink_send()Paul Moore
commit fb73974172ffaaf57a7c42f35424d9aece1a5af6 upstream. Fix the SELinux netlink_send hook to properly handle multiple netlink messages in a single sk_buff; each message is parsed and subject to SELinux access control. Prior to this patch, SELinux only inspected the first message in the sk_buff. Cc: stable@vger.kernel.org Reported-by: Dmitry Vyukov <dvyukov@google.com> Reviewed-by: Stephen Smalley <stephen.smalley.work@gmail.com> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-29KEYS: Avoid false positive ENOMEM error on key readWaiman Long
[ Upstream commit 4f0882491a148059a52480e753b7f07fc550e188 ] By allocating a kernel buffer with a user-supplied buffer length, it is possible that a false positive ENOMEM error may be returned because the user-supplied length is just too large even if the system do have enough memory to hold the actual key data. Moreover, if the buffer length is larger than the maximum amount of memory that can be returned by kmalloc() (2^(MAX_ORDER-1) number of pages), a warning message will also be printed. To reduce this possibility, we set a threshold (PAGE_SIZE) over which we do check the actual key length first before allocating a buffer of the right size to hold it. The threshold is arbitrary, it is just used to trigger a buffer length check. It does not limit the actual key length as long as there is enough memory to satisfy the memory request. To further avoid large buffer allocation failure due to page fragmentation, kvmalloc() is used to allocate the buffer so that vmapped pages can be used when there is not a large enough contiguous set of pages available for allocation. In the extremely unlikely scenario that the key keeps on being changed and made longer (still <= buflen) in between 2 __keyctl_read_key() calls, the __keyctl_read_key() calling loop in keyctl_read_key() may have to be iterated a large number of times, but definitely not infinite. Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-04-23KEYS: Don't write out to userspace while holding key semaphoreWaiman Long
commit d3ec10aa95819bff18a0d936b18884c7816d0914 upstream. A lockdep circular locking dependency report was seen when running a keyutils test: [12537.027242] ====================================================== [12537.059309] WARNING: possible circular locking dependency detected [12537.088148] 4.18.0-147.7.1.el8_1.x86_64+debug #1 Tainted: G OE --------- - - [12537.125253] ------------------------------------------------------ [12537.153189] keyctl/25598 is trying to acquire lock: [12537.175087] 000000007c39f96c (&mm->mmap_sem){++++}, at: __might_fault+0xc4/0x1b0 [12537.208365] [12537.208365] but task is already holding lock: [12537.234507] 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12537.270476] [12537.270476] which lock already depends on the new lock. [12537.270476] [12537.307209] [12537.307209] the existing dependency chain (in reverse order) is: [12537.340754] [12537.340754] -> #3 (&type->lock_class){++++}: [12537.367434] down_write+0x4d/0x110 [12537.385202] __key_link_begin+0x87/0x280 [12537.405232] request_key_and_link+0x483/0xf70 [12537.427221] request_key+0x3c/0x80 [12537.444839] dns_query+0x1db/0x5a5 [dns_resolver] [12537.468445] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.496731] cifs_reconnect+0xe04/0x2500 [cifs] [12537.519418] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.546263] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.573551] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.601045] kthread+0x30c/0x3d0 [12537.617906] ret_from_fork+0x3a/0x50 [12537.636225] [12537.636225] -> #2 (root_key_user.cons_lock){+.+.}: [12537.664525] __mutex_lock+0x105/0x11f0 [12537.683734] request_key_and_link+0x35a/0xf70 [12537.705640] request_key+0x3c/0x80 [12537.723304] dns_query+0x1db/0x5a5 [dns_resolver] [12537.746773] dns_resolve_server_name_to_ip+0x1e1/0x4d0 [cifs] [12537.775607] cifs_reconnect+0xe04/0x2500 [cifs] [12537.798322] cifs_readv_from_socket+0x461/0x690 [cifs] [12537.823369] cifs_read_from_socket+0xa0/0xe0 [cifs] [12537.847262] cifs_demultiplex_thread+0x311/0x2db0 [cifs] [12537.873477] kthread+0x30c/0x3d0 [12537.890281] ret_from_fork+0x3a/0x50 [12537.908649] [12537.908649] -> #1 (&tcp_ses->srv_mutex){+.+.}: [12537.935225] __mutex_lock+0x105/0x11f0 [12537.954450] cifs_call_async+0x102/0x7f0 [cifs] [12537.977250] smb2_async_readv+0x6c3/0xc90 [cifs] [12538.000659] cifs_readpages+0x120a/0x1e50 [cifs] [12538.023920] read_pages+0xf5/0x560 [12538.041583] __do_page_cache_readahead+0x41d/0x4b0 [12538.067047] ondemand_readahead+0x44c/0xc10 [12538.092069] filemap_fault+0xec1/0x1830 [12538.111637] __do_fault+0x82/0x260 [12538.129216] do_fault+0x419/0xfb0 [12538.146390] __handle_mm_fault+0x862/0xdf0 [12538.167408] handle_mm_fault+0x154/0x550 [12538.187401] __do_page_fault+0x42f/0xa60 [12538.207395] do_page_fault+0x38/0x5e0 [12538.225777] page_fault+0x1e/0x30 [12538.243010] [12538.243010] -> #0 (&mm->mmap_sem){++++}: [12538.267875] lock_acquire+0x14c/0x420 [12538.286848] __might_fault+0x119/0x1b0 [12538.306006] keyring_read_iterator+0x7e/0x170 [12538.327936] assoc_array_subtree_iterate+0x97/0x280 [12538.352154] keyring_read+0xe9/0x110 [12538.370558] keyctl_read_key+0x1b9/0x220 [12538.391470] do_syscall_64+0xa5/0x4b0 [12538.410511] entry_SYSCALL_64_after_hwframe+0x6a/0xdf [12538.435535] [12538.435535] other info that might help us debug this: [12538.435535] [12538.472829] Chain exists of: [12538.472829] &mm->mmap_sem --> root_key_user.cons_lock --> &type->lock_class [12538.472829] [12538.524820] Possible unsafe locking scenario: [12538.524820] [12538.551431] CPU0 CPU1 [12538.572654] ---- ---- [12538.595865] lock(&type->lock_class); [12538.613737] lock(root_key_user.cons_lock); [12538.644234] lock(&type->lock_class); [12538.672410] lock(&mm->mmap_sem); [12538.687758] [12538.687758] *** DEADLOCK *** [12538.687758] [12538.714455] 1 lock held by keyctl/25598: [12538.732097] #0: 000000003de5b58d (&type->lock_class){++++}, at: keyctl_read_key+0x15a/0x220 [12538.770573] [12538.770573] stack backtrace: [12538.790136] CPU: 2 PID: 25598 Comm: keyctl Kdump: loaded Tainted: G [12538.844855] Hardware name: HP ProLiant DL360 Gen9/ProLiant DL360 Gen9, BIOS P89 12/27/2015 [12538.881963] Call Trace: [12538.892897] dump_stack+0x9a/0xf0 [12538.907908] print_circular_bug.isra.25.cold.50+0x1bc/0x279 [12538.932891] ? save_trace+0xd6/0x250 [12538.948979] check_prev_add.constprop.32+0xc36/0x14f0 [12538.971643] ? keyring_compare_object+0x104/0x190 [12538.992738] ? check_usage+0x550/0x550 [12539.009845] ? sched_clock+0x5/0x10 [12539.025484] ? sched_clock_cpu+0x18/0x1e0 [12539.043555] __lock_acquire+0x1f12/0x38d0 [12539.061551] ? trace_hardirqs_on+0x10/0x10 [12539.080554] lock_acquire+0x14c/0x420 [12539.100330] ? __might_fault+0xc4/0x1b0 [12539.119079] __might_fault+0x119/0x1b0 [12539.135869] ? __might_fault+0xc4/0x1b0 [12539.153234] keyring_read_iterator+0x7e/0x170 [12539.172787] ? keyring_read+0x110/0x110 [12539.190059] assoc_array_subtree_iterate+0x97/0x280 [12539.211526] keyring_read+0xe9/0x110 [12539.227561] ? keyring_gc_check_iterator+0xc0/0xc0 [12539.249076] keyctl_read_key+0x1b9/0x220 [12539.266660] do_syscall_64+0xa5/0x4b0 [12539.283091] entry_SYSCALL_64_after_hwframe+0x6a/0xdf One way to prevent this deadlock scenario from happening is to not allow writing to userspace while holding the key semaphore. Instead, an internal buffer is allocated for getting the keys out from the read method first before copying them out to userspace without holding the lock. That requires taking out the __user modifier from all the relevant read methods as well as additional changes to not use any userspace write helpers. That is, 1) The put_user() call is replaced by a direct copy. 2) The copy_to_user() call is replaced by memcpy(). 3) All the fault handling code is removed. Compiling on a x86-64 system, the size of the rxrpc_read() function is reduced from 3795 bytes to 2384 bytes with this patch. Fixes: ^1da177e4c3f4 ("Linux-2.6.12-rc2") Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Waiman Long <longman@redhat.com> Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-21keys: Fix proc_keys_next to increase position indexVasily Averin
commit 86d32f9a7c54ad74f4514d7fef7c847883207291 upstream. If seq_file .next function does not change position index, read after some lseek can generate unexpected output: $ dd if=/proc/keys bs=1 # full usual output 0f6bfdf5 I--Q--- 2 perm 3f010000 1000 1000 user 4af2f79ab8848d0a: 740 1fb91b32 I--Q--- 3 perm 1f3f0000 1000 65534 keyring _uid.1000: 2 27589480 I--Q--- 1 perm 0b0b0000 0 0 user invocation_id: 16 2f33ab67 I--Q--- 152 perm 3f030000 0 0 keyring _ses: 2 33f1d8fa I--Q--- 4 perm 3f030000 1000 1000 keyring _ses: 1 3d427fda I--Q--- 2 perm 3f010000 1000 1000 user 69ec44aec7678e5a: 740 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 521+0 records in 521+0 records out 521 bytes copied, 0,00123769 s, 421 kB/s But a read after lseek in middle of last line results in the partial last line and then a repeat of the final line: $ dd if=/proc/keys bs=500 skip=1 dd: /proc/keys: cannot skip to specified offset g _uid_ses.1000: 1 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 0+1 records in 0+1 records out 97 bytes copied, 0,000135035 s, 718 kB/s and a read after lseek beyond end of file results in the last line being shown: $ dd if=/proc/keys bs=1000 skip=1 # read after lseek beyond end of file dd: /proc/keys: cannot skip to specified offset 3ead4096 I--Q--- 1 perm 1f3f0000 1000 65534 keyring _uid_ses.1000: 1 0+1 records in 0+1 records out 76 bytes copied, 0,000119981 s, 633 kB/s See https://bugzilla.kernel.org/show_bug.cgi?id=206283 Fixes: 1f4aace60b0e ("fs/seq_file.c: simplify seq_file iteration code ...") Signed-off-by: Vasily Averin <vvs@virtuozzo.com> Signed-off-by: David Howells <dhowells@redhat.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Cc: stable@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-04-17KEYS: reaching the keys quotas correctlyYang Xu
commit 2e356101e72ab1361821b3af024d64877d9a798d upstream. Currently, when we add a new user key, the calltrace as below: add_key() key_create_or_update() key_alloc() __key_instantiate_and_link generic_key_instantiate key_payload_reserve ...... Since commit a08bf91ce28e ("KEYS: allow reaching the keys quotas exactly"), we can reach max bytes/keys in key_alloc, but we forget to remove this limit when we reserver space for payload in key_payload_reserve. So we can only reach max keys but not max bytes when having delta between plen and type->def_datalen. Remove this limit when instantiating the key, so we can keep consistent with key_alloc. Also, fix the similar problem in keyctl_chown_key(). Fixes: 0b77f5bfb45c ("keys: make the keyring quotas controllable through /proc/sys") Fixes: a08bf91ce28e ("KEYS: allow reaching the keys quotas exactly") Cc: stable@vger.kernel.org # 5.0.x Cc: Eric Biggers <ebiggers@google.com> Signed-off-by: Yang Xu <xuyang2018.jy@cn.fujitsu.com> Reviewed-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Reviewed-by: Eric Biggers <ebiggers@google.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@linux.intel.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-02-24selinux: ensure we cleanup the internal AVC counters on error in avc_update()Jaihind Yadav
[ Upstream commit 030b995ad9ece9fa2d218af4429c1c78c2342096 ] In AVC update we don't call avc_node_kill() when avc_xperms_populate() fails, resulting in the avc->avc_cache.active_nodes counter having a false value. In last patch this changes was missed , so correcting it. Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Signed-off-by: Jaihind Yadav <jaihindyadav@codeaurora.org> Signed-off-by: Ravi Kumar Siddojigari <rsiddoji@codeaurora.org> [PM: merge fuzz, minor description cleanup] Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24selinux: ensure we cleanup the internal AVC counters on error in avc_insert()Paul Moore
[ Upstream commit d8db60cb23e49a92cf8cada3297395c7fa50fdf8 ] Fix avc_insert() to call avc_node_kill() if we've already allocated an AVC node and the code fails to insert the node in the cache. Fixes: fa1aa143ac4a ("selinux: extended permissions for ioctls") Reported-by: rsiddoji@codeaurora.org Suggested-by: Stephen Smalley <sds@tycho.nsa.gov> Acked-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-24selinux: fall back to ref-walk if audit is requiredStephen Smalley
[ Upstream commit 0188d5c025ca8fe756ba3193bd7d150139af5a88 ] commit bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") passed down the rcu flag to the SELinux AVC, but failed to adjust the test in slow_avc_audit() to also return -ECHILD on LSM_AUDIT_DATA_DENTRY. Previously, we only returned -ECHILD if generating an audit record with LSM_AUDIT_DATA_INODE since this was only relevant from inode_permission. Move the handling of MAY_NOT_BLOCK to avc_audit() and its inlined equivalent in selinux_inode_permission() immediately after we determine that audit is required, and always fall back to ref-walk in this case. Fixes: bda0be7ad994 ("security: make inode_follow_link RCU-walk aware") Reported-by: Will Deacon <will@kernel.org> Suggested-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Stephen Smalley <sds@tycho.nsa.gov> Signed-off-by: Paul Moore <paul@paul-moore.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2020-02-05tomoyo: Use atomic_t for statistics counterTetsuo Handa
commit a8772fad0172aeae339144598b809fd8d4823331 upstream. syzbot is reporting that there is a race at tomoyo_stat_update() [1]. Although it is acceptable to fail to track exact number of times policy was updated, convert to atomic_t because this is not a hot path. [1] https://syzkaller.appspot.com/bug?id=a4d7b973972eeed410596e6604580e0133b0fc04 Reported-by: syzbot <syzbot+efea72d4a0a1d03596cd@syzkaller.appspotmail.com> Signed-off-by: Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2020-01-27keys: Timestamp new keysDavid Howells
[ Upstream commit 7c1857bdbdf1e4c541e45eab477ee23ed4333ea4 ] Set the timestamp on new keys rather than leaving it unset. Fixes: 31d5a79d7f3d ("KEYS: Do LRU discard in full keyrings") Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: James Morris <james.morris@microsoft.com> Signed-off-by: Sasha Levin <sashal@kernel.org>