linux-yocto - Yocto Linux Embedded kernel

Age	Commit message (Collapse)	Author
2023-11-20	hwrng: geode - fix accessing registers	Jonas Gorski
	[ Upstream commit 464bd8ec2f06707f3773676a1bd2c64832a3c805 ] When the membase and pci_dev pointer were moved to a new struct in priv, the actual membase users were left untouched, and they started reading out arbitrary memory behind the struct instead of registers. This unfortunately turned the RNG into a constant number generator, depending on the content of what was at that offset. To fix this, update geode_rng_data_{read,present}() to also get the membase via amd_geode_priv, and properly read from the right addresses again. Fixes: 9f6ec8dc574e ("hwrng: geode - Fix PCI device refcount leak") Reported-by: Timur I. Davletshin <timur.davletshin@gmail.com> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217882 Tested-by: Timur I. Davletshin <timur.davletshin@gmail.com> Suggested-by: Jo-Philipp Wich <jo@mein.io> Signed-off-by: Jonas Gorski <jonas.gorski@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-10-10	parisc: sba: Fix compile warning wrt list of SBA devices	Helge Deller
	[ Upstream commit eb3255ee8f6f4691471a28fbf22db5e8901116cd ] Fix this makecheck warning: drivers/parisc/sba_iommu.c:98:19: warning: symbol 'sba_list' was not declared. Should it be static? Signed-off-by: Helge Deller <deller@gmx.de> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23	tpm_tis: Resend command to recover from data transfer errors	Alexander Steffen
	[ Upstream commit 280db21e153d8810ce3b93640c63ae922bcb9e8e ] Similar to the transmission of TPM responses, also the transmission of TPM commands may become corrupted. Instead of aborting when detecting such issues, try resending the command again. Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-09-23	ipmi_si: fix a memleak in try_smi_init()	Yi Yang
	commit 6cf1a126de2992b4efe1c3c4d398f8de4aed6e3f upstream. Kmemleak reported the following leak info in try_smi_init(): unreferenced object 0xffff00018ecf9400 (size 1024): comm "modprobe", pid 2707763, jiffies 4300851415 (age 773.308s) backtrace: [<000000004ca5b312>] __kmalloc+0x4b8/0x7b0 [<00000000953b1072>] try_smi_init+0x148/0x5dc [ipmi_si] [<000000006460d325>] 0xffff800081b10148 [<0000000039206ea5>] do_one_initcall+0x64/0x2a4 [<00000000601399ce>] do_init_module+0x50/0x300 [<000000003c12ba3c>] load_module+0x7a8/0x9e0 [<00000000c246fffe>] __se_sys_init_module+0x104/0x180 [<00000000eea99093>] __arm64_sys_init_module+0x24/0x30 [<0000000021b1ef87>] el0_svc_common.constprop.0+0x94/0x250 [<0000000070f4f8b7>] do_el0_svc+0x48/0xe0 [<000000005a05337f>] el0_svc+0x24/0x3c [<000000005eb248d6>] el0_sync_handler+0x160/0x164 [<0000000030a59039>] el0_sync+0x160/0x180 The problem was that when an error occurred before handlers registration and after allocating `new_smi->si_sm`, the variable wouldn't be freed in the error handling afterwards since `shutdown_smi()` hadn't been registered yet. Fix it by adding a `kfree()` in the error handling path in `try_smi_init()`. Cc: stable@vger.kernel.org # 4.19+ Fixes: 7960f18a5647 ("ipmi_si: Convert over to a shutdown handler") Signed-off-by: Yi Yang <yiyang13@huawei.com> Co-developed-by: GONG, Ruiqi <gongruiqi@huaweicloud.com> Signed-off-by: GONG, Ruiqi <gongruiqi@huaweicloud.com> Message-Id: <20230629123328.2402075-1-gongruiqi@huaweicloud.com> Signed-off-by: Corey Minyard <minyard@acm.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11	tpm_tis: Explicitly check for error code	Alexander Steffen
	commit 513253f8c293c0c8bd46d09d337fc892bf8f9f48 upstream. recv_data either returns the number of received bytes, or a negative value representing an error code. Adding the return value directly to the total number of received bytes therefore looks a little weird, since it might add a negative error code to a sum of bytes. The following check for size < expected usually makes the function return ETIME in that case, so it does not cause too many problems in practice. But to make the code look cleaner and because the caller might still be interested in the original error code, explicitly check for the presence of an error code and pass that through. Cc: stable@vger.kernel.org Fixes: cb5354253af2 ("[PATCH] tpm: spacing cleanups 2") Signed-off-by: Alexander Steffen <Alexander.Steffen@infineon.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11	hwrng: imx-rngc - fix the timeout for init and self check	Martin Kaiser
	commit d744ae7477190967a3ddc289e2cd4ae59e8b1237 upstream. Fix the timeout that is used for the initialisation and for the self test. wait_for_completion_timeout expects a timeout in jiffies, but RNGC_TIMEOUT is in milliseconds. Call msecs_to_jiffies to do the conversion. Cc: stable@vger.kernel.org Fixes: 1d5449445bd0 ("hwrng: mx-rngc - add a driver for Freescale RNGC") Signed-off-by: Martin Kaiser <martin@kaiser.cx> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11	tpm: tpm_vtpm_proxy: fix a race condition in /dev/vtpmx creation	Jarkko Sakkinen
	commit f4032d615f90970d6c3ac1d9c0bce3351eb4445c upstream. /dev/vtpmx is made visible before 'workqueue' is initialized, which can lead to a memory corruption in the worst case scenario. Address this by initializing 'workqueue' as the very first step of the driver initialization. Cc: stable@vger.kernel.org Fixes: 6f99612e2500 ("tpm: Proxy driver for supporting multiple emulated TPMs") Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen@tuni.fi> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-11	hwrng: virtio - Fix race on data_avail and actual data	Herbert Xu
	[ Upstream commit ac52578d6e8d300dd50f790f29a24169b1edd26c ] The virtio rng device kicks off a new entropy request whenever the data available reaches zero. When a new request occurs at the end of a read operation, that is, when the result of that request is only needed by the next reader, then there is a race between the writing of the new data and the next reader. This is because there is no synchronisation whatsoever between the writer and the reader. Fix this by writing data_avail with smp_store_release and reading it with smp_load_acquire when we first enter read. The subsequent reads are safe because they're either protected by the first load acquire, or by the completion mechanism. Also remove the redundant zeroing of data_idx in random_recv_done (data_idx must already be zero at this point) and data_avail in request_entropy (ditto). Reported-by: syzbot+726dc8c62c3536431ceb@syzkaller.appspotmail.com Fixes: f7f510ec1957 ("virtio: An entropy device, as suggested by hpa.") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Acked-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11	hwrng: virtio - always add a pending request	Laurent Vivier
	[ Upstream commit 9a4b612d675b03f7fc9fa1957ca399c8223f3954 ] If we ensure we have already some data available by enqueuing again the buffer once data are exhausted, we can return what we have without waiting for the device answer. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20211028101111.128049-5-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: ac52578d6e8d ("hwrng: virtio - Fix race on data_avail and actual data") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11	hwrng: virtio - don't waste entropy	Laurent Vivier
	[ Upstream commit 5c8e933050044d6dd2a000f9a5756ae73cbe7c44 ] if we don't use all the entropy available in the buffer, keep it and use it later. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20211028101111.128049-4-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: ac52578d6e8d ("hwrng: virtio - Fix race on data_avail and actual data") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11	hwrng: virtio - don't wait on cleanup	Laurent Vivier
	[ Upstream commit 2bb31abdbe55742c89f4dc0cc26fcbc8467364f6 ] When virtio-rng device was dropped by the hwrng core we were forced to wait the buffer to come back from the device to not have remaining ongoing operation that could spoil the buffer. But now, as the buffer is internal to the virtio-rng we can release the waiting loop immediately, the buffer will be retrieve and use when the virtio-rng driver will be selected again. This avoids to hang on an rng_current write command if the virtio-rng device is blocked by a lack of entropy. This allows to select another entropy source if the current one is empty. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20211028101111.128049-3-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: ac52578d6e8d ("hwrng: virtio - Fix race on data_avail and actual data") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-11	hwrng: virtio - add an internal buffer	Laurent Vivier
	[ Upstream commit bf3175bc50a3754dc427e2f5046e17a9fafc8be7 ] hwrng core uses two buffers that can be mixed in the virtio-rng queue. If the buffer is provided with wait=0 it is enqueued in the virtio-rng queue but unused by the caller. On the next call, core provides another buffer but the first one is filled instead and the new one queued. And the caller reads the data from the new one that is not updated, and the data in the first one are lost. To avoid this mix, virtio-rng needs to use its own unique internal buffer at a cost of a data copy to the caller buffer. Signed-off-by: Laurent Vivier <lvivier@redhat.com> Link: https://lore.kernel.org/r/20211028101111.128049-2-lvivier@redhat.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: ac52578d6e8d ("hwrng: virtio - Fix race on data_avail and actual data") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-28	ipmi: move message error checking to avoid deadlock	Tony Camuso
	commit 383035211c79d4d98481a09ad429b31c7dbf22bd upstream. V1->V2: in handle_one_rcv_msg, if data_size > 2, set requeue to zero and goto out instead of calling ipmi_free_msg. Kosuke Tatsukawa <tatsu@ab.jp.nec.com> In the source stack trace below, function set_need_watch tries to take out the same si_lock that was taken earlier by ipmi_thread. ipmi_thread() [drivers/char/ipmi/ipmi_si_intf.c:995] smi_event_handler() [drivers/char/ipmi/ipmi_si_intf.c:765] handle_transaction_done() [drivers/char/ipmi/ipmi_si_intf.c:555] deliver_recv_msg() [drivers/char/ipmi/ipmi_si_intf.c:283] ipmi_smi_msg_received() [drivers/char/ipmi/ipmi_msghandler.c:4503] intf_err_seq() [drivers/char/ipmi/ipmi_msghandler.c:1149] smi_remove_watch() [drivers/char/ipmi/ipmi_msghandler.c:999] set_need_watch() [drivers/char/ipmi/ipmi_si_intf.c:1066] Upstream commit e1891cffd4c4896a899337a243273f0e23c028df adds code to ipmi_smi_msg_received() to call smi_remove_watch() via intf_err_seq() and this seems to be causing the deadlock. commit e1891cffd4c4896a899337a243273f0e23c028df Author: Corey Minyard <cminyard@mvista.com> Date: Wed Oct 24 15:17:04 2018 -0500 ipmi: Make the smi watcher be disabled immediately when not needed The fix is to put all messages in the queue and move the message checking code out of ipmi_smi_msg_received and into handle_one_recv_msg, which processes the message checking after ipmi_thread releases its locks. Additionally,Kosuke Tatsukawa <tatsu@ab.jp.nec.com> reported that handle_new_recv_msgs calls ipmi_free_msg when handle_one_rcv_msg returns zero, so that the call to ipmi_free_msg in handle_one_rcv_msg introduced another panic when "ipmitool sensor list" was run in a loop. He submitted this part of the patch. +free_msg: + requeue = 0; + goto out; Reported by: Osamu Samukawa <osa-samukawa@tg.jp.nec.com> Characterized by: Kosuke Tatsukawa <tatsu@ab.jp.nec.com> Signed-off-by: Tony Camuso <tcamuso@redhat.com> Fixes: e1891cffd4c4 ("ipmi: Make the smi watcher be disabled immediately when not needed") Cc: stable@vger.kernel.org # 5.1 Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-28	ipmi: Make the smi watcher be disabled immediately when not needed	Corey Minyard
	commit e1891cffd4c4896a899337a243273f0e23c028df upstream. The code to tell the lower layer to enable or disable watching for certain things was lazy in disabling, it waited until a timer tick to see if a disable was necessary. Not a really big deal, but it could be improved. Modify the code to enable and disable watching immediately and don't do it from the background timer any more. Signed-off-by: Corey Minyard <cminyard@mvista.com> Tested-by: Kamlakant Patel <kamlakant.patel@cavium.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-30	tpm/tpm_tis: Disable interrupts for more Lenovo devices	Jerry Snitselaar
	commit e7d3e5c4b1dd50a70b31524c3228c62bb41bbab2 upstream. The P360 Tiny suffers from an irq storm issue like the T490s, so add an entry for it to tpm_tis_dmi_table, and force polling. There also previously was a report from the previous attempt to enable interrupts that involved a ThinkPad L490. So an entry is added for it as well. Cc: stable@vger.kernel.org Reported-by: Peter Zijlstra <peterz@infradead.org> # P360 Tiny Closes: https://lore.kernel.org/linux-integrity/20230505130731.GO83892@hirez.programming.kicks-ass.net/ Signed-off-by: Jerry Snitselaar <jsnitsel@redhat.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-05-17	ipmi: fix SSIF not responding under certain cond.	Zhang Yuchen
	[ Upstream commit 6d2555cde2918409b0331560e66f84a0ad4849c6 ] The ipmi communication is not restored after a specific version of BMC is upgraded on our server. The ipmi driver does not respond after printing the following log: ipmi_ssif: Invalid response getting flags: 1c 1 I found that after entering this branch, ssif_info->ssif_state always holds SSIF_GETTING_FLAGS and never return to IDLE. As a result, the driver cannot be loaded, because the driver status is checked during the unload process and must be IDLE in shutdown_ssif(): while (ssif_info->ssif_state != SSIF_IDLE) schedule_timeout(1); The process trigger this problem is: 1. One msg timeout and next msg start send, and call ssif_set_need_watch(). 2. ssif_set_need_watch()->watch_timeout()->start_flag_fetch() change ssif_state to SSIF_GETTING_FLAGS. 3. In msg_done_handler() ssif_state == SSIF_GETTING_FLAGS, if an error message is received, the second branch does not modify the ssif_state. 4. All retry action need IS_SSIF_IDLE() == True. Include retry action in watch_timeout(), msg_done_handler(). Sending msg does not work either. SSIF_IDLE is also checked in start_next_msg(). 5. The only thing that can be triggered in the SSIF driver is watch_timeout(), after destory_user(), this timer will stop too. So, if enter this branch, the ssif_state will remain SSIF_GETTING_FLAGS and can't send msg, no timer started, can't unload. We did a comparative test before and after adding this patch, and the result is effective. Fixes: 259307074bfc ("ipmi: Add SMBus interface driver (SSIF)") Cc: stable@vger.kernel.org Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com> Message-Id: <20230412074907.80046-1-zhangyuchen.lcr@bytedance.com> Signed-off-by: Corey Minyard <minyard@acm.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17	ipmi_ssif: Rename idle state and check	Corey Minyard
	[ Upstream commit 8230831c43a328c2be6d28c65d3f77e14c59986b ] Rename the SSIF_IDLE() to IS_SSIF_IDLE(), since that is more clear, and rename SSIF_NORMAL to SSIF_IDLE, since that's more accurate. Cc: stable@vger.kernel.org Signed-off-by: Corey Minyard <cminyard@mvista.com> Stable-dep-of: 6d2555cde291 ("ipmi: fix SSIF not responding under certain cond.") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17	ipmi: Fix how the lower layers are told to watch for messages	Corey Minyard
	[ Upstream commit c65ea996595005be470fbfa16711deba414fd33b ] The IPMI driver has a mechanism to tell the lower layers it needs to watch for messages, commands, and watchdogs (so it doesn't needlessly poll). However, it needed some extensions, it needed a way to tell what is being waited for so it could set the timeout appropriately. The update to the lower layer was also being done once a second at best because it was done in the main timeout handler. However, if a command is sent and a response message is coming back, it needed to be started immediately. So modify the code to update immediately if it needs to be enabled. Disable is still lazy. Signed-off-by: Corey Minyard <cminyard@mvista.com> Tested-by: Kamlakant Patel <kamlakant.patel@cavium.com> Stable-dep-of: 6d2555cde291 ("ipmi: fix SSIF not responding under certain cond.") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-05-17	ipmi: Fix SSIF flag requests	Corey Minyard
	[ Upstream commit a1466ec5b671651b848df17fc9233ecbb7d35f9f ] Commit 89986496de141 ("ipmi: Turn off all activity on an idle ipmi interface") modified the IPMI code to only request events when the driver had somethine waiting for events. The SSIF code, however, was using the event fetch request to also fetch the flags. Add a timer and the proper handling for the upper layer telling whether flags fetches are required. Reported-by: Kamlakant Patel <Kamlakant.Patel@cavium.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Tested-by: Kamlakant Patel <kamlakant.patel@cavium.com> Stable-dep-of: 6d2555cde291 ("ipmi: fix SSIF not responding under certain cond.") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18	ipmi: fix use after free in _ipmi_destroy_user()	Dan Carpenter
	commit a92ce570c81dc0feaeb12a429b4bc65686d17967 upstream. The intf_free() function frees the "intf" pointer so we cannot dereference it again on the next line. Fixes: cbb79863fc31 ("ipmi: Don't allow device module unload when in use") Signed-off-by: Dan Carpenter <error27@gmail.com> Message-Id: <Y3M8xa1drZv4CToE@kili> Cc: <stable@vger.kernel.org> # 5.5+ Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18	ipmi: fix long wait in unload when IPMI disconnect	Zhang Yuchen
	commit f6f1234d98cce69578bfac79df147a1f6660596c upstream. When fixing the problem mentioned in PATCH1, we also found the following problem: If the IPMI is disconnected and in the sending process, the uninstallation driver will be stuck for a long time. The main problem is that uninstalling the driver waits for curr_msg to be sent or HOSED. After stopping tasklet, the only place to trigger the timeout mechanism is the circular poll in shutdown_smi. The poll function delays 10us and calls smi_event_handler(smi_info,10). Smi_event_handler deducts 10us from kcs->ibf_timeout. But the poll func is followed by schedule_timeout_uninterruptible(1). The time consumed here is not counted in kcs->ibf_timeout. So when 10us is deducted from kcs->ibf_timeout, at least 1 jiffies has actually passed. The waiting time has increased by more than a hundredfold. Now instead of calling poll(). call smi_event_handler() directly and calculate the elapsed time. For verification, you can directly use ebpf to check the kcs-> ibf_timeout for each call to kcs_event() when IPMI is disconnected. Decrement at normal rate before unloading. The decrement rate becomes very slow after unloading. $ bpftrace -e 'kprobe:kcs_event {printf("kcs->ibftimeout : %d\n", *(arg0+584));}' Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com> Message-Id: <20221007092617.87597-3-zhangyuchen.lcr@bytedance.com> Signed-off-by: Corey Minyard <cminyard@mvista.com> Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18	tpm: tpm_tis: Add the missed acpi_put_table() to fix memory leak	Hanjun Guo
	commit db9622f762104459ff87ecdf885cc42c18053fd9 upstream. In check_acpi_tpm2(), we get the TPM2 table just to make sure the table is there, not used after the init, so the acpi_put_table() should be added to release the ACPI memory. Fixes: 4cb586a188d4 ("tpm_tis: Consolidate the platform and acpi probe flow") Cc: stable@vger.kernel.org Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18	tpm: tpm_crb: Add the missed acpi_put_table() to fix memory leak	Hanjun Guo
	commit 37e90c374dd11cf4919c51e847c6d6ced0abc555 upstream. In crb_acpi_add(), we get the TPM2 table to retrieve information like start method, and then assign them to the priv data, so the TPM2 table is not used after the init, should be freed, call acpi_put_table() to fix the memory leak. Fixes: 30fc8d138e91 ("tpm: TPM 2.0 CRB Interface") Cc: stable@vger.kernel.org Signed-off-by: Hanjun Guo <guohanjun@huawei.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-01-18	ipmi: fix memleak when unload ipmi driver	Zhang Yuchen
	[ Upstream commit 36992eb6b9b83f7f9cdc8e74fb5799d7b52e83e9 ] After the IPMI disconnect problem, the memory kept rising and we tried to unload the driver to free the memory. However, only part of the free memory is recovered after the driver is uninstalled. Using ebpf to hook free functions, we find that neither ipmi_user nor ipmi_smi_msg is free, only ipmi_recv_msg is free. We find that the deliver_smi_err_response call in clean_smi_msgs does the destroy processing on each message from the xmit_msg queue without checking the return value and free ipmi_smi_msg. deliver_smi_err_response is called only at this location. Adding the free handling has no effect. To verify, try using ebpf to trace the free function. $ bpftrace -e 'kretprobe:ipmi_alloc_recv_msg {printf("alloc rcv %p\n",retval);} kprobe:free_recv_msg {printf("free recv %p\n", arg0)} kretprobe:ipmi_alloc_smi_msg {printf("alloc smi %p\n", retval);} kprobe:free_smi_msg {printf("free smi %p\n",arg0)}' Signed-off-by: Zhang Yuchen <zhangyuchen.lcr@bytedance.com> Message-Id: <20221007092617.87597-4-zhangyuchen.lcr@bytedance.com> [Fixed the comment above handle_one_recv_msg().] Signed-off-by: Corey Minyard <cminyard@mvista.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18	hwrng: geode - Fix PCI device refcount leak	Xiongfeng Wang
	[ Upstream commit 9f6ec8dc574efb7f4f3d7ee9cd59ae307e78f445 ] for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL. If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. We add a new struct 'amd_geode_priv' to record pointer of the pci_dev and membase, and then add missing pci_dev_put() for the normal and error path. Fixes: ef5d862734b8 ("[PATCH] Add Geode HW RNG driver") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18	hwrng: amd - Fix PCI device refcount leak	Xiongfeng Wang
	[ Upstream commit ecadb5b0111ea19fc7c240bb25d424a94471eb7d ] for_each_pci_dev() is implemented by pci_get_device(). The comment of pci_get_device() says that it will increase the reference count for the returned pci_dev and also decrease the reference count for the input pci_dev @from if it is not NULL. If we break for_each_pci_dev() loop with pdev not NULL, we need to call pci_dev_put() to decrease the reference count. Add the missing pci_dev_put() for the normal and error path. Fixes: 96d63c0297cc ("[PATCH] Add AMD HW RNG driver") Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-01-18	tpm/tpm_crb: Fix error message in __crb_relinquish_locality()	Michael Kelley
	[ Upstream commit f5264068071964b56dc02c9dab3d11574aaca6ff ] The error message in __crb_relinquish_locality() mentions requestAccess instead of Relinquish. Fix it. Fixes: 888d867df441 ("tpm: cmd_ready command can be issued only after granting locality") Signed-off-by: Michael Kelley <mikelley@microsoft.com> Acked-by: Tomas Winkler <tomas.winkler@intel.com> Reviewed-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Jarkko Sakkinen <jarkko@kernel.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26	random: use expired timer rather than wq for mixing fast pool	Jason A. Donenfeld
	commit 748bc4dd9e663f23448d8ad7e58c011a67ea1eca upstream. Previously, the fast pool was dumped into the main pool periodically in the fast pool's hard IRQ handler. This worked fine and there weren't problems with it, until RT came around. Since RT converts spinlocks into sleeping locks, problems cropped up. Rather than switching to raw spinlocks, the RT developers preferred we make the transformation from originally doing: do_some_stuff() spin_lock() do_some_other_stuff() spin_unlock() to doing: do_some_stuff() queue_work_on(some_other_stuff_worker) This is an ordinary pattern done all over the kernel. However, Sherry noticed a 10% performance regression in qperf TCP over a 40gbps InfiniBand card. Quoting her message: > MT27500 Family [ConnectX-3] cards: > Infiniband device 'mlx4_0' port 1 status: > default gid: fe80:0000:0000:0000:0010:e000:0178:9eb1 > base lid: 0x6 > sm lid: 0x1 > state: 4: ACTIVE > phys state: 5: LinkUp > rate: 40 Gb/sec (4X QDR) > link_layer: InfiniBand > > Cards are configured with IP addresses on private subnet for IPoIB > performance testing. > Regression identified in this bug is in TCP latency in this stack as reported > by qperf tcp_lat metric: > > We have one system listen as a qperf server: > [root@yourQperfServer ~]# qperf > > Have the other system connect to qperf server as a client (in this > case, it’s X7 server with Mellanox card): > [root@yourQperfClient ~]# numactl -m0 -N0 qperf 20.20.20.101 -v -uu -ub --time 60 --wait_server 20 -oo msg_size:4K:1024K:*2 tcp_lat Rather than incur the scheduling latency from queue_work_on, we can instead switch to running on the next timer tick, on the same core. This also batches things a bit more -- once per jiffy -- which is okay now that mix_interrupt_randomness() can credit multiple bits at once. Reported-by: Sherry Yang <sherry.yang@oracle.com> Tested-by: Paul Webb <paul.x.webb@oracle.com> Cc: Sherry Yang <sherry.yang@oracle.com> Cc: Phillip Goerl <phillip.goerl@oracle.com> Cc: Jack Vogel <jack.vogel@oracle.com> Cc: Nicky Veitch <nicky.veitch@oracle.com> Cc: Colm Harrington <colm.harrington@oracle.com> Cc: Ramanan Govindarajan <ramanan.govindarajan@oracle.com> Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Cc: Dominik Brodowski <linux@dominikbrodowski.net> Cc: Tejun Heo <tj@kernel.org> Cc: Sultan Alsawaf <sultan@kerneltoast.com> Cc: stable@vger.kernel.org Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26	random: avoid reading two cache lines on irq randomness	Jason A. Donenfeld
	commit 9ee0507e896b45af6d65408c77815800bce30008 upstream. In order to avoid reading and dirtying two cache lines on every IRQ, move the work_struct to the bottom of the fast_pool struct. add_ interrupt_randomness() always touches .pool and .count, which are currently split, because .mix pushes everything down. Instead, move .mix to the bottom, so that .pool and .count are always in the first cache line, since .mix is only accessed when the pool is full. Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26	random: restore O_NONBLOCK support	Jason A. Donenfeld
	commit cd4f24ae9404fd31fc461066e57889be3b68641b upstream. Prior to 5.6, when /dev/random was opened with O_NONBLOCK, it would return -EAGAIN if there was no entropy. When the pools were unified in 5.6, this was lost. The post 5.6 behavior of blocking until the pool is initialized, and ignoring O_NONBLOCK in the process, went unnoticed, with no reports about the regression received for two and a half years. However, eventually this indeed did break somebody's userspace. So we restore the old behavior, by returning -EAGAIN if the pool is not initialized. Unlike the old /dev/random, this can only occur during early boot, after which it never blocks again. In order to make this O_NONBLOCK behavior consistent with other expectations, also respect users reading with preadv2(RWF_NOWAIT) and similar. Fixes: 30c08efec888 ("random: make /dev/random be almost like /dev/urandom") Reported-by: Guozihua <guozihua@huawei.com> Reported-by: Zhongguohua <zhongguohua1@huawei.com> Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Andrew Lutomirski <luto@kernel.org> Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26	random: clamp credited irq bits to maximum mixed	Jason A. Donenfeld
	commit e78a802a7b4febf53f2a92842f494b01062d85a8 upstream. Since the most that's mixed into the pool is sizeof(long)*2, don't credit more than that many bytes of entropy. Fixes: e3e33fc2ea7f ("random: do not use input pool from hard IRQs") Cc: stable@vger.kernel.org Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-29	Revert "Revert "char/random: silence a lockdep splat with printk()""	Jason A. Donenfeld
	In 2019, Sergey fixed a lockdep splat with 15341b1dd409 ("char/random: silence a lockdep splat with printk()"), but that got reverted soon after from 4.19 because back then it apparently caused various problems. But the issue it was fixing is still there, and more generally, many patches turning printk() into printk_deferred() have landed since, making me suspect it's okay to try this out again. This should fix the following deadlock found by the kernel test robot: [ 18.287691] WARNING: possible circular locking dependency detected [ 18.287692] 4.19.248-00165-g3d1f971aa81f #1 Not tainted [ 18.287693] ------------------------------------------------------ [ 18.287712] stop/202 is trying to acquire lock: [ 18.287713] (ptrval) (console_owner){..-.}, at: console_unlock (??:?) [ 18.287717] [ 18.287718] but task is already holding lock: [ 18.287718] (ptrval) (&(&port->lock)->rlock){-...}, at: pty_write (pty.c:?) [ 18.287722] [ 18.287722] which lock already depends on the new lock. [ 18.287723] [ 18.287724] [ 18.287725] the existing dependency chain (in reverse order) is: [ 18.287725] [ 18.287726] -> #2 (&(&port->lock)->rlock){-...}: [ 18.287729] validate_chain+0x84a/0xe00 [ 18.287729] __lock_acquire (lockdep.c:?) [ 18.287730] lock_acquire (??:?) [ 18.287731] _raw_spin_lock_irqsave (??:?) [ 18.287732] tty_port_tty_get (??:?) [ 18.287733] tty_port_default_wakeup (tty_port.c:?) [ 18.287734] tty_port_tty_wakeup (??:?) [ 18.287734] uart_write_wakeup (??:?) [ 18.287735] serial8250_tx_chars (??:?) [ 18.287736] serial8250_handle_irq (??:?) [ 18.287737] serial8250_default_handle_irq (8250_port.c:?) [ 18.287738] serial8250_interrupt (8250_core.c:?) [ 18.287738] __handle_irq_event_percpu (??:?) [ 18.287739] handle_irq_event_percpu (??:?) [ 18.287740] handle_irq_event (??:?) [ 18.287741] handle_edge_irq (??:?) [ 18.287742] handle_irq (??:?) [ 18.287742] do_IRQ (??:?) [ 18.287743] common_interrupt (entry_32.o:?) [ 18.287744] _raw_spin_unlock_irqrestore (??:?) [ 18.287745] uart_write (serial_core.c:?) [ 18.287746] process_output_block (n_tty.c:?) [ 18.287747] n_tty_write (n_tty.c:?) [ 18.287747] tty_write (tty_io.c:?) [ 18.287748] __vfs_write (??:?) [ 18.287749] vfs_write (??:?) [ 18.287750] ksys_write (??:?) [ 18.287750] sys_write (??:?) [ 18.287751] do_fast_syscall_32 (??:?) [ 18.287752] entry_SYSENTER_32 (??:?) [ 18.287752] [ 18.287753] -> #1 (&port_lock_key){-.-.}: [ 18.287756] [ 18.287756] -> #0 (console_owner){..-.}: [ 18.287759] check_prevs_add (lockdep.c:?) [ 18.287760] validate_chain+0x84a/0xe00 [ 18.287761] __lock_acquire (lockdep.c:?) [ 18.287761] lock_acquire (??:?) [ 18.287762] console_unlock (??:?) [ 18.287763] vprintk_emit (??:?) [ 18.287764] vprintk_default (??:?) [ 18.287764] vprintk_func (??:?) [ 18.287765] printk (??:?) [ 18.287766] get_random_u32 (??:?) [ 18.287767] shuffle_freelist (slub.c:?) [ 18.287767] allocate_slab (slub.c:?) [ 18.287768] new_slab (slub.c:?) [ 18.287769] ___slab_alloc+0x6d0/0xb20 [ 18.287770] __slab_alloc+0xd6/0x2e0 [ 18.287770] __kmalloc (??:?) [ 18.287771] tty_buffer_alloc (tty_buffer.c:?) [ 18.287772] __tty_buffer_request_room (tty_buffer.c:?) [ 18.287773] tty_insert_flip_string_fixed_flag (??:?) [ 18.287774] pty_write (pty.c:?) [ 18.287775] process_output_block (n_tty.c:?) [ 18.287776] n_tty_write (n_tty.c:?) [ 18.287777] tty_write (tty_io.c:?) [ 18.287778] __vfs_write (??:?) [ 18.287779] vfs_write (??:?) [ 18.287780] ksys_write (??:?) [ 18.287780] sys_write (??:?) [ 18.287781] do_fast_syscall_32 (??:?) [ 18.287782] entry_SYSENTER_32 (??:?) [ 18.287783] [ 18.287783] other info that might help us debug this: [ 18.287784] [ 18.287785] Chain exists of: [ 18.287785] console_owner --> &port_lock_key --> &(&port->lock)->rlock [ 18.287789] [ 18.287790] Possible unsafe locking scenario: [ 18.287790] [ 18.287791] CPU0 CPU1 [ 18.287792] ---- ---- [ 18.287792] lock(&(&port->lock)->rlock); [ 18.287794] lock(&port_lock_key); [ 18.287814] lock(&(&port->lock)->rlock); [ 18.287815] lock(console_owner); [ 18.287817] [ 18.287818] * DEADLOCK * [ 18.287818] [ 18.287819] 6 locks held by stop/202: [ 18.287820] #0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read (??:?) [ 18.287823] #1: (ptrval) (&tty->atomic_write_lock){+.+.}, at: tty_write_lock (tty_io.c:?) [ 18.287826] #2: (ptrval) (&o_tty->termios_rwsem/1){++++}, at: n_tty_write (n_tty.c:?) [ 18.287830] #3: (ptrval) (&ldata->output_lock){+.+.}, at: process_output_block (n_tty.c:?) [ 18.287834] #4: (ptrval) (&(&port->lock)->rlock){-...}, at: pty_write (pty.c:?) [ 18.287838] #5: (ptrval) (console_lock){+.+.}, at: console_trylock_spinning (printk.c:?) [ 18.287841] [ 18.287842] stack backtrace: [ 18.287843] CPU: 0 PID: 202 Comm: stop Not tainted 4.19.248-00165-g3d1f971aa81f #1 [ 18.287843] Call Trace: [ 18.287844] dump_stack (??:?) [ 18.287845] print_circular_bug.cold+0x78/0x8b [ 18.287846] check_prev_add+0x66a/0xd20 [ 18.287847] check_prevs_add (lockdep.c:?) [ 18.287848] validate_chain+0x84a/0xe00 [ 18.287848] __lock_acquire (lockdep.c:?) [ 18.287849] lock_acquire (??:?) [ 18.287850] ? console_unlock (??:?) [ 18.287851] console_unlock (??:?) [ 18.287851] ? console_unlock (??:?) [ 18.287852] ? native_save_fl (??:?) [ 18.287853] vprintk_emit (??:?) [ 18.287854] vprintk_default (??:?) [ 18.287855] vprintk_func (??:?) [ 18.287855] printk (??:?) [ 18.287856] get_random_u32 (??:?) [ 18.287857] ? shuffle_freelist (slub.c:?) [ 18.287858] shuffle_freelist (slub.c:?) [ 18.287858] ? page_address (??:?) [ 18.287859] allocate_slab (slub.c:?) [ 18.287860] new_slab (slub.c:?) [ 18.287861] ? pvclock_clocksource_read (??:?) [ 18.287862] ___slab_alloc+0x6d0/0xb20 [ 18.287862] ? kvm_sched_clock_read (kvmclock.c:?) [ 18.287863] ? __slab_alloc+0xbc/0x2e0 [ 18.287864] ? native_wbinvd (paravirt.c:?) [ 18.287865] __slab_alloc+0xd6/0x2e0 [ 18.287865] __kmalloc (??:?) [ 18.287866] ? __lock_acquire (lockdep.c:?) [ 18.287867] ? tty_buffer_alloc (tty_buffer.c:?) [ 18.287868] tty_buffer_alloc (tty_buffer.c:?) [ 18.287869] __tty_buffer_request_room (tty_buffer.c:?) [ 18.287869] tty_insert_flip_string_fixed_flag (??:?) [ 18.287870] pty_write (pty.c:?) [ 18.287871] process_output_block (n_tty.c:?) [ 18.287872] n_tty_write (n_tty.c:?) [ 18.287873] ? print_dl_stats (??:?) [ 18.287874] ? n_tty_ioctl (n_tty.c:?) [ 18.287874] tty_write (tty_io.c:?) [ 18.287875] ? n_tty_ioctl (n_tty.c:?) [ 18.287876] ? tty_write_unlock (tty_io.c:?) [ 18.287877] __vfs_write (??:?) [ 18.287877] vfs_write (??:?) [ 18.287878] ? __fget_light (file.c:?) [ 18.287879] ksys_write (??:?) Cc: Sergey Senozhatsky <sergey.senozhatsky.work@gmail.com> Cc: Qian Cai <cai@lca.pw> Cc: Lech Perczak <l.perczak@camlintechnologies.com> Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Cc: Theodore Ts'o <tytso@mit.edu> Cc: Sasha Levin <sashal@kernel.org> Cc: Petr Mladek <pmladek@suse.com> Cc: John Ogness <john.ogness@linutronix.de> Reported-by: kernel test robot <oliver.sang@intel.com> Link: https://lore.kernel.org/lkml/Ytz+lo4zRQYG3JUR@xsang-OptiPlex-9020 Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02	random: quiet urandom warning ratelimit suppression message	Jason A. Donenfeld
	commit c01d4d0a82b71857be7449380338bc53dde2da92 upstream. random.c ratelimits how much it warns about uninitialized urandom reads using __ratelimit(). When the RNG is finally initialized, it prints the number of missed messages due to ratelimiting. It has been this way since that functionality was introduced back in 2018. Recently, cc1e127bfa95 ("random: remove ratelimiting for in-kernel unseeded randomness") put a bit more stress on the urandom ratelimiting, which teased out a bug in the implementation. Specifically, when under pressure, __ratelimit() will print its own message and reset the count back to 0, making the final message at the end less useful. Secondly, it does so as a pr_warn(), which apparently is undesirable for people's CI. Fortunately, __ratelimit() has the RATELIMIT_MSG_ON_RELEASE flag exactly for this purpose, so we set the flag. Fixes: 4e00b339e264 ("random: rate limit unseeded randomness warnings") Cc: stable@vger.kernel.org Reported-by: Jon Hunter <jonathanh@nvidia.com> Reported-by: Ron Economos <re@w6rz.net> Tested-by: Ron Economos <re@w6rz.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-07-02	random: schedule mix_interrupt_randomness() less often	Jason A. Donenfeld
	commit 534d2eaf1970274150596fdd2bf552721e65d6b2 upstream. It used to be that mix_interrupt_randomness() would credit 1 bit each time it ran, and so add_interrupt_randomness() would schedule mix() to run every 64 interrupts, a fairly arbitrary number, but nonetheless considered to be a decent enough conservative estimate. Since e3e33fc2ea7f ("random: do not use input pool from hard IRQs"), mix() is now able to credit multiple bits, depending on the number of calls to add(). This was done for reasons separate from this commit, but it has the nice side effect of enabling this patch to schedule mix() less often. Currently the rules are: a) Credit 1 bit for every 64 calls to add(). b) Schedule mix() once a second that add() is called. c) Schedule mix() once every 64 calls to add(). Rules (a) and (c) no longer need to be coupled. It's still important to have _some_ value in (c), so that we don't "over-saturate" the fast pool, but the once per second we get from rule (b) is a plenty enough baseline. So, by increasing the 64 in rule (c) to something larger, we avoid calling queue_work_on() as frequently during irq storms. This commit changes that 64 in rule (c) to be 1024, which means we schedule mix() 16 times less often. And it does not need to change the 64 in rule (a). Fixes: 58340f8e952b ("random: defer fast pool mixing to worker") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: credit cpu and bootloader seeds by default	Jason A. Donenfeld
	[ Upstream commit 846bb97e131d7938847963cca00657c995b1fce1 ] This commit changes the default Kconfig values of RANDOM_TRUST_CPU and RANDOM_TRUST_BOOTLOADER to be Y by default. It does not change any existing configs or change any kernel behavior. The reason for this is several fold. As background, I recently had an email thread with the kernel maintainers of Fedora/RHEL, Debian, Ubuntu, Gentoo, Arch, NixOS, Alpine, SUSE, and Void as recipients. I noted that some distros trust RDRAND, some trust EFI, and some trust both, and I asked why or why not. There wasn't really much of a "debate" but rather an interesting discussion of what the historical reasons have been for this, and it came up that some distros just missed the introduction of the bootloader Kconfig knob, while another didn't want to enable it until there was a boot time switch to turn it off for more concerned users (which has since been added). The result of the rather uneventful discussion is that every major Linux distro enables these two options by default. While I didn't have really too strong of an opinion going into this thread -- and I mostly wanted to learn what the distros' thinking was one way or another -- ultimately I think their choice was a decent enough one for a default option (which can be disabled at boot time). I'll try to summarize the pros and cons: Pros: - The RNG machinery gets initialized super quickly, and there's no messing around with subsequent blocking behavior. - The bootloader mechanism is used by kexec in order for the prior kernel to initialize the RNG of the next kernel, which increases the entropy available to early boot daemons of the next kernel. - Previous objections related to backdoors centered around Dual_EC_DRBG-like kleptographic systems, in which observing some amount of the output stream enables an adversary holding the right key to determine the entire output stream. This used to be a partially justified concern, because RDRAND output was mixed into the output stream in varying ways, some of which may have lacked pre-image resistance (e.g. XOR or an LFSR). But this is no longer the case. Now, all usage of RDRAND and bootloader seeds go through a cryptographic hash function. This means that the CPU would have to compute a hash pre-image, which is not considered to be feasible (otherwise the hash function would be terribly broken). - More generally, if the CPU is backdoored, the RNG is probably not the realistic vector of choice for an attacker. - These CPU or bootloader seeds are far from being the only source of entropy. Rather, there is generally a pretty huge amount of entropy, not all of which is credited, especially on CPUs that support instructions like RDRAND. In other words, assuming RDRAND outputs all zeros, an attacker would still have to accurately model every single other entropy source also in use. - The RNG now reseeds itself quite rapidly during boot, starting at 2 seconds, then 4, then 8, then 16, and so forth, so that other sources of entropy get used without much delay. - Paranoid users can set random.trust_{cpu,bootloader}=no in the kernel command line, and paranoid system builders can set the Kconfig options to N, so there's no reduction or restriction of optionality. - It's a practical default. - All the distros have it set this way. Microsoft and Apple trust it too. Bandwagon. Cons: - RDRAND could still be backdoored with something like a fixed key or limited space serial number seed or another indexable scheme like that. (However, it's hard to imagine threat models where the CPU is backdoored like this, yet people are still okay making any computations with it or connecting it to networks, etc.) - RDRAND could be defective, rather than backdoored, and produce garbage that is in one way or another insufficient for crypto. - Suggesting a reduction in paranoia, as this commit effectively does, may cause some to question my personal integrity as a "security person". - Bootloader seeds and RDRAND are generally very difficult if not all together impossible to audit. Keep in mind that this doesn't actually change any behavior. This is just a change in the default Kconfig value. The distros already are shipping kernels that set things this way. Ard made an additional argument in [1]: We're at the mercy of firmware and micro-architecture anyway, given that we are also relying on it to ensure that every instruction in the kernel's executable image has been faithfully copied to memory, and that the CPU implements those instructions as documented. So I don't think firmware or ISA bugs related to RNGs deserve special treatment - if they are broken, we should quirk around them like we usually do. So enabling these by default is a step in the right direction IMHO. In [2], Phil pointed out that having this disabled masked a bug that CI otherwise would have caught: A clean 5.15.45 boots cleanly, whereas a downstream kernel shows the static key warning (but it does go on to boot). The significant difference is that our defconfigs set CONFIG_RANDOM_TRUST_BOOTLOADER=y defining that on top of multi_v7_defconfig demonstrates the issue on a clean 5.15.45. Conversely, not setting that option in a downstream kernel build avoids the warning [1] https://lore.kernel.org/lkml/CAMj1kXGi+ieviFjXv9zQBSaGyyzeGW_VpMpTLJK8PJb2QHEQ-w@mail.gmail.com/ [2] https://lore.kernel.org/lkml/c47c42e3-1d56-5859-a6ad-976a1a3381c6@raspberrypi.com/ Cc: Theodore Ts'o <tytso@mit.edu> Reviewed-by: Ard Biesheuvel <ardb@kernel.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-06-25	random: account for arch randomness in bits	Jason A. Donenfeld
	commit 77fc95f8c0dc9e1f8e620ec14d2fb65028fb7adc upstream. Rather than accounting in bytes and multiplying (shifting), we can just account in bits and avoid the shift. The main motivation for this is there are other patches in flux that expand this code a bit, and avoiding the duplication of "* 8" everywhere makes things a bit clearer. Cc: stable@vger.kernel.org Fixes: 12e45a2a6308 ("random: credit architectural init the exact amount") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: mark bootloader randomness code as __init	Jason A. Donenfeld
	commit 39e0f991a62ed5efabd20711a7b6e7da92603170 upstream. add_bootloader_randomness() and the variables it touches are only used during __init and not after, so mark these as __init. At the same time, unexport this, since it's only called by other __init code that's built-in. Cc: stable@vger.kernel.org Fixes: 428826f5358c ("fdt: add support for rng-seed") Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: avoid checking crng_ready() twice in random_init()	Jason A. Donenfeld
	commit 9b29b6b20376ab64e1b043df6301d8a92378e631 upstream. The current flow expands to: if (crng_ready()) ... else if (...) if (!crng_ready()) ... The second crng_ready() call is redundant, but can't so easily be optimized out by the compiler. This commit simplifies that to: if (crng_ready() ... else if (...) ... Fixes: 560181c27b58 ("random: move initialization functions out of hot pages") Cc: stable@vger.kernel.org Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	crypto: drbg - make reseeding from get_random_bytes() synchronous	Nicolai Stange
	commit 074bcd4000e0d812bc253f86fedc40f81ed59ccc upstream. get_random_bytes() usually hasn't full entropy available by the time DRBG instances are first getting seeded from it during boot. Thus, the DRBG implementation registers random_ready_callbacks which would in turn schedule some work for reseeding the DRBGs once get_random_bytes() has sufficient entropy available. For reference, the relevant history around handling DRBG (re)seeding in the context of a not yet fully seeded get_random_bytes() is: commit 16b369a91d0d ("random: Blocking API for accessing nonblocking_pool") commit 4c7879907edd ("crypto: drbg - add async seeding operation") commit 205a525c3342 ("random: Add callback API for random pool readiness") commit 57225e679788 ("crypto: drbg - Use callback API for random readiness") commit c2719503f5e1 ("random: Remove kernel blocking API") However, some time later, the initialization state of get_random_bytes() has been made queryable via rng_is_initialized() introduced with commit 9a47249d444d ("random: Make crng state queryable"). This primitive now allows for streamlining the DRBG reseeding from get_random_bytes() by replacing that aforementioned asynchronous work scheduling from random_ready_callbacks with some simpler, synchronous code in drbg_generate() next to the related logic already present therein. Apart from improving overall code readability, this change will also enable DRBG users to rely on wait_for_random_bytes() for ensuring that the initial seeding has completed, if desired. The previous patches already laid the grounds by making drbg_seed() to record at each DRBG instance whether it was being seeded at a time when rng_is_initialized() still had been false as indicated by ->seeded == DRBG_SEED_STATE_PARTIAL. All that remains to be done now is to make drbg_generate() check for this condition, determine whether rng_is_initialized() has flipped to true in the meanwhile and invoke a reseed from get_random_bytes() if so. Make this move: - rename the former drbg_async_seed() work handler, i.e. the one in charge of reseeding a DRBG instance from get_random_bytes(), to "drbg_seed_from_random()", - change its signature as appropriate, i.e. make it take a struct drbg_state rather than a work_struct and change its return type from "void" to "int" in order to allow for passing error information from e.g. its __drbg_seed() invocation onwards to callers, - make drbg_generate() invoke this drbg_seed_from_random() once it encounters a DRBG instance with ->seeded == DRBG_SEED_STATE_PARTIAL by the time rng_is_initialized() has flipped to true and - prune everything related to the former, random_ready_callback based mechanism. As drbg_seed_from_random() is now getting invoked from drbg_generate() with the ->drbg_mutex being held, it must not attempt to recursively grab it once again. Remove the corresponding mutex operations from what is now drbg_seed_from_random(). Furthermore, as drbg_seed_from_random() can now report errors directly to its caller, there's no need for it to temporarily switch the DRBG's ->seeded state to DRBG_SEED_STATE_UNSEEDED so that a failure of the subsequently invoked __drbg_seed() will get signaled to drbg_generate(). Don't do it then. Signed-off-by: Nicolai Stange <nstange@suse.de> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> [Jason: for stable, undid the modifications for the backport of 5acd3548.] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	Revert "random: use static branch for crng_ready()"	Jason A. Donenfeld
	This reverts upstream commit f5bda35fba615ace70a656d4700423fa6c9bebee from stable. It's not essential and will take some time during 5.19 to work out properly. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: check for signals after page of pool writes	Jason A. Donenfeld
	commit 1ce6c8d68f8ac587f54d0a271ac594d3d51f3efb upstream. get_random_bytes_user() checks for signals after producing a PAGE_SIZE worth of output, just like /dev/zero does. write_pool() is doing basically the same work (actually, slightly more expensive), and so should stop to check for signals in the same way. Let's also name it write_pool_user() to match get_random_bytes_user(), so this won't be misused in the future. Before this patch, massive writes to /dev/urandom would tie up the process for an extremely long time and make it unterminatable. After, it can be successfully interrupted. The following test program can be used to see this works as intended: #include <unistd.h> #include <fcntl.h> #include <signal.h> #include <stdio.h> static unsigned char x[~0U]; static void handle(int) { } int main(int argc, char *argv[]) { pid_t pid = getpid(), child; int fd; signal(SIGUSR1, handle); if (!(child = fork())) { for (;;) kill(pid, SIGUSR1); } fd = open("/dev/urandom", O_WRONLY); pause(); printf("interrupted after writing %zd bytes\n", write(fd, x, sizeof(x))); close(fd); kill(child, SIGTERM); return 0; } Result before: "interrupted after writing 2147479552 bytes" Result after: "interrupted after writing 4096 bytes" Cc: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: wire up fops->splice_{read,write}_iter()	Jens Axboe
	commit 79025e727a846be6fd215ae9cdb654368ac3f9a6 upstream. Now that random/urandom is using {read,write}_iter, we can wire it up to using the generic splice handlers. Fixes: 36e2c7421f02 ("fs: don't allow splice read/write without explicit ops") Signed-off-by: Jens Axboe <axboe@kernel.dk> [Jason: added the splice_write path. Note that sendfile() and such still does not work for read, though it does for write, because of a file type restriction in splice_direct_to_actor(), which I'll address separately.] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: convert to using fops->write_iter()	Jens Axboe
	commit 22b0a222af4df8ee9bb8e07013ab44da9511b047 upstream. Now that the read side has been converted to fix a regression with splice, convert the write side as well to have some symmetry in the interface used (and help deprecate ->write()). Signed-off-by: Jens Axboe <axboe@kernel.dk> [Jason: cleaned up random_ioctl a bit, require full writes in RNDADDENTROPY since it's crediting entropy, simplify control flow of write_pool(), and incorporate suggestions from Al.] Cc: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: move randomize_page() into mm where it belongs	Jason A. Donenfeld
	commit 5ad7dd882e45d7fe432c32e896e2aaa0b21746ea upstream. randomize_page is an mm function. It is documented like one. It contains the history of one. It has the naming convention of one. It looks just like another very similar function in mm, randomize_stack_top(). And it has always been maintained and updated by mm people. There is no need for it to be in random.c. In the "which shape does not look like the other ones" test, pointing to randomize_page() is correct. So move randomize_page() into mm/util.c, right next to the similar randomize_stack_top() function. This commit contains no actual code changes. Cc: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: move initialization functions out of hot pages	Jason A. Donenfeld
	commit 560181c27b582557d633ecb608110075433383af upstream. Much of random.c is devoted to initializing the rng and accounting for when a sufficient amount of entropy has been added. In a perfect world, this would all happen during init, and so we could mark these functions as __init. But in reality, this isn't the case: sometimes the rng only finishes initializing some seconds after system init is finished. For this reason, at the moment, a whole host of functions that are only used relatively close to system init and then never again are intermixed with functions that are used in hot code all the time. This creates more cache misses than necessary. In order to pack the hot code closer together, this commit moves the initialization functions that can't be marked as __init into .text.unlikely by way of the __cold attribute. Of particular note is moving credit_init_bits() into a macro wrapper that inlines the crng_ready() static branch check. This avoids a function call to a nop+ret, and most notably prevents extra entropy arithmetic from being computed in mix_interrupt_randomness(). Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> [ Jason: for stable, made sure the printk_deferred was a pr_notice, because those caused problems on ≤ 4.19 according to commit logs. ] Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: use proper return types on get_random_{int,long}_wait()	Jason A. Donenfeld
	commit 7c3a8a1db5e03d02cc0abb3357a84b8b326dfac3 upstream. Before these were returning signed values, but the API is intended to be used with unsigned values. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: use static branch for crng_ready()	Jason A. Donenfeld
	commit f5bda35fba615ace70a656d4700423fa6c9bebee upstream. Since crng_ready() is only false briefly during initialization and then forever after becomes true, we don't need to evaluate it after, making it a prime candidate for a static branch. One complication, however, is that it changes state in a particular call to credit_init_bits(), which might be made from atomic context, which means we must kick off a workqueue to change the static key. Further complicating things, credit_init_bits() may be called sufficiently early on in system initialization such that system_wq is NULL. Fortunately, there exists the nice function execute_in_process_context(), which will immediately execute the function if !in_interrupt(), and otherwise defer it to a workqueue. During early init, before workqueues are available, in_interrupt() is always false, because interrupts haven't even been enabled yet, which means the function in that case executes immediately. Later on, after workqueues are available, in_interrupt() might be true, but in that case, the work is queued in system_wq and all goes well. Cc: Theodore Ts'o <tytso@mit.edu> Cc: Sultan Alsawaf <sultan@kerneltoast.com> Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: credit architectural init the exact amount	Jason A. Donenfeld
	commit 12e45a2a6308105469968951e6d563e8f4fea187 upstream. RDRAND and RDSEED can fail sometimes, which is fine. We currently initialize the RNG with 512 bits of RDRAND/RDSEED. We only need 256 bits of those to succeed in order to initialize the RNG. Instead of the current "all or nothing" approach, actually credit these contributions the amount that is actually contributed. Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: handle latent entropy and command line from random_init()	Jason A. Donenfeld
	commit 2f14062bb14b0fcfcc21e6dc7d5b5c0d25966164 upstream. Currently, start_kernel() adds latent entropy and the command line to the entropy bool after the RNG has been initialized, deferring when it's actually used by things like stack canaries until the next time the pool is seeded. This surely is not intended. Rather than splitting up which entropy gets added where and when between start_kernel() and random_init(), just do everything in random_init(), which should eliminate these kinds of bugs in the future. While we're at it, rename the awkwardly titled "rand_initialize()" to the more standard "random_init()" nomenclature. Reviewed-by: Dominik Brodowski <linux@dominikbrodowski.net> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-25	random: use proper jiffies comparison macro	Jason A. Donenfeld
	commit 8a5b8a4a4ceb353b4dd5bafd09e2b15751bcdb51 upstream. This expands to exactly the same code that it replaces, but makes things consistent by using the same macro for jiffy comparisons throughout. Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>