aboutsummaryrefslogtreecommitdiffstats
path: root/include
AgeCommit message (Collapse)Author
2024-02-20Merge tag 'v5.4.268' into v5.4/standard/baseBruce Ashfield
This is the 5.4.268 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmWy4hYACgkQONu9yGCS # aT7SVBAAyx1DlSyJWcqzpESH0+VfqyWHxXlKS6Ip5wT0/+t0gglIKkwU/O0FsRXw # pLO24wL0+MuIzgfZZj7wieAOPlGLOonKAvvUHGEMlpfAzyKjmZuW93WLKQlA/Oec # uaT2ooQevRQcgXzbuV1yN/CeCnhbtmiQdcwy6OU5QACfzguQYtDbNGpbVHJEyEIW # khlr+tj1KgRMzh/Sx76RPg4C/hkZBHun3tPcE0lTg+5QZDSkUj5gEdhVOSG2qmSh # Lj9zt/isY3v6Whixel9YoTLr9SukI7ZlKzMrH1kSbGtTW3uZqgqB+7wCi1tWoNE1 # Zwu9/kUe1dU1kfwYW8AA5OwupjBjADVnZZx1cKN3nQZG2J8bSKHwHmuZPx3DGhJ1 # sxlaQ0nGvcEbCKljlIqsHzx2U22YKk939mVz5Y+MZYT5uwWRHI+iH4yRW97putSP # t8tb3uX69Gsl6B+gLu38Mr7kkwyY06xmMnc5dfNCPwh8SxLj3dG7Gft90CNq1JKT # q2cwlMEcDZRlC08kwzD7pRehZ6hYLRlTOv8yhQsQefcfzrtsT18Cec5TI2k72NOe # fbIY8us3Qsr8JVSYuObGqT8LmkX9pkmRozEXgENvwltijEsWULoO2Hs+Z/yD07z8 # RYqtxWxVxFVeHTkrXbbMUTZWhFx5LE+rtxCySpfeFkv0WgRRwa8= # =vkKq # -----END PGP SIGNATURE----- # gpg: Signature made Thu 25 Jan 2024 05:35:02 PM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2024-01-25drm/bridge: Fix typo in post_disable() descriptionDario Binacchi
[ Upstream commit 288b039db225676e0c520c981a1b5a2562d893a3 ] s/singals/signals/ Fixes: 199e4e967af4 ("drm: Extract drm_bridge.h") Signed-off-by: Dario Binacchi <dario.binacchi@amarulasolutions.com> Signed-off-by: Robert Foss <rfoss@kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20231124094253.658064-1-dario.binacchi@amarulasolutions.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25Bluetooth: Fix bogus check for re-auth no supported with non-sspLuiz Augusto von Dentz
[ Upstream commit d03376c185926098cb4d668d6458801eb785c0a5 ] This reverts 19f8def031bfa50c579149b200bfeeb919727b27 "Bluetooth: Fix auth_complete_evt for legacy units" which seems to be working around a bug on a broken controller rather then any limitation imposed by the Bluetooth spec, in fact if there ws not possible to re-auth the command shall fail not succeed. Fixes: 19f8def031bf ("Bluetooth: Fix auth_complete_evt for legacy units") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25crypto: af_alg - Disallow multiple in-flight AIO requestsHerbert Xu
[ Upstream commit 67b164a871af1d736f131fd6fe78a610909f06f3 ] Having multiple in-flight AIO requests results in unpredictable output because they all share the same IV. Fix this by only allowing one request at a time. Fixes: 83094e5e9e49 ("crypto: af_alg - add async support to algif_aead") Fixes: a596999b7ddf ("crypto: algif - change algif_skcipher to be asynchronous") Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25virtio_crypto: Introduce VIRTIO_CRYPTO_NOSPCzhenwei pi
[ Upstream commit 13d640a3e9a3ac7ec694843d3d3b785e85fb8cb8 ] Base on the lastest virtio crypto spec, define VIRTIO_CRYPTO_NOSPC. Reviewed-by: Gonglei <arei.gonglei@huawei.com> Signed-off-by: zhenwei pi <pizhenwei@bytedance.com> Link: https://lore.kernel.org/r/20220302033917.1295334-2-pizhenwei@bytedance.com Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Stable-dep-of: fed93fb62e05 ("crypto: virtio - Handle dataq logic with tasklet") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-18Merge tag 'v5.4.267' into v5.4/standard/baseBruce Ashfield
This is the 5.4.267 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmWlao0ACgkQONu9yGCS # aT55SBAAu/fR/w4uhCqbJ2ygrz+0+kjAEfYCGK66OsfdRdFqANeiUANWHVzG7M4m # uAt2tB7jHFqXk0sStJ/CK5igyH7C5yEVTrU3txzR25bQad2m0R2lsbuveXWxFsrr # leklLO/96H8ao+iZ5yk5nGyB3dYRbw1qQIactYSzCqnTjwfn+uTeok0hFIu6gJKO # 7NYJxtgdWyFTq9o3AqVO6zCjrYRhbdANdzgCp9SZ/E6IiWp8Y9R+pg3n1fhZbUjS # hH/4pTdjLX050I1ikWV//zKG3OEQyV1LWxbky//uj62rq9FM2WWhc7TD1QqiH2Sf # oTY6GlSFFpxF7iM7kFDZTxr5A78Ui/fhGF9y+GQ+CZdqD5c/f8xzpNjSlLD28y0v # pxW9CecwSjv0HiPK/AZ+1vCS1fzZbn9v+MIr29sHrcH1BS6yYWSqzq/zrISGAA+L # kFVVrsGTmQHop9c1/DVx6i2Kdyr9+W/OAS3V3JnDkt6zkU4sqX/lT0BX6zNcxr0b # pAn5e3JxXZGUYug82VvWhaZhESkwBOxS62l0TD5iwnSF9macc2GMWbB0ZnR2jKpy # GxdxZVeZvQ2GYvFdQFHScg+tfmMLX+9WOcRI7J3PpEic8xQwM4Yb+QjN3nxARqtM # qrcZ7BY16q6/8ANO5cfsFR7Om1x769+hxOcoVjdf5WarwuwkvdY= # =0N1d # -----END PGP SIGNATURE----- # gpg: Signature made Mon 15 Jan 2024 12:25:33 PM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2024-01-15ipv6: remove max_size check inline with ipv4Jon Maxwell
commit af6d10345ca76670c1b7c37799f0d5576ccef277 upstream. In ip6_dst_gc() replace: if (entries > gc_thresh) With: if (entries > ops->gc_thresh) Sending Ipv6 packets in a loop via a raw socket triggers an issue where a route is cloned by ip6_rt_cache_alloc() for each packet sent. This quickly consumes the Ipv6 max_size threshold which defaults to 4096 resulting in these warnings: [1] 99.187805] dst_alloc: 7728 callbacks suppressed [2] Route cache is full: consider increasing sysctl net.ipv6.route.max_size. . . [300] Route cache is full: consider increasing sysctl net.ipv6.route.max_size. When this happens the packet is dropped and sendto() gets a network is unreachable error: remaining pkt 200557 errno 101 remaining pkt 196462 errno 101 . . remaining pkt 126821 errno 101 Implement David Aherns suggestion to remove max_size check seeing that Ipv6 has a GC to manage memory usage. Ipv4 already does not check max_size. Here are some memory comparisons for Ipv4 vs Ipv6 with the patch: Test by running 5 instances of a program that sends UDP packets to a raw socket 5000000 times. Compare Ipv4 and Ipv6 performance with a similar program. Ipv4: Before test: MemFree: 29427108 kB Slab: 237612 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 2881 3990 192 42 2 : tunables 0 0 0 During test: MemFree: 29417608 kB Slab: 247712 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 44394 44394 192 42 2 : tunables 0 0 0 After test: MemFree: 29422308 kB Slab: 238104 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 Ipv6 with patch: Errno 101 errors are not observed anymore with the patch. Before test: MemFree: 29422308 kB Slab: 238104 kB ip6_dst_cache 1912 2528 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 During Test: MemFree: 29431516 kB Slab: 240940 kB ip6_dst_cache 11980 12064 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 After Test: MemFree: 29441816 kB Slab: 238132 kB ip6_dst_cache 1902 2432 256 32 2 : tunables 0 0 0 xfrm_dst_cache 0 0 320 25 2 : tunables 0 0 0 ip_dst_cache 3048 4116 192 42 2 : tunables 0 0 0 Tested-by: Andrea Mayer <andrea.mayer@uniroma2.it> Signed-off-by: Jon Maxwell <jmaxwell37@gmail.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20230112012532.311021-1-jmaxwell37@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-15ipv6: make ip6_rt_gc_expire an atomic_tEric Dumazet
commit 9cb7c013420f98fa6fd12fc6a5dc055170c108db upstream. Reads and Writes to ip6_rt_gc_expire always have been racy, as syzbot reported lately [1] There is a possible risk of under-flow, leading to unexpected high value passed to fib6_run_gc(), although I have not observed this in the field. Hosts hitting ip6_dst_gc() very hard are under pretty bad state anyway. [1] BUG: KCSAN: data-race in ip6_dst_gc / ip6_dst_gc read-write to 0xffff888102110744 of 4 bytes by task 13165 on cpu 1: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 read-write to 0xffff888102110744 of 4 bytes by task 11607 on cpu 0: ip6_dst_gc+0x1f3/0x220 net/ipv6/route.c:3311 dst_alloc+0x9b/0x160 net/core/dst.c:86 ip6_dst_alloc net/ipv6/route.c:344 [inline] icmp6_dst_alloc+0xb2/0x360 net/ipv6/route.c:3261 mld_sendpack+0x2b9/0x580 net/ipv6/mcast.c:1807 mld_send_cr net/ipv6/mcast.c:2119 [inline] mld_ifc_work+0x576/0x800 net/ipv6/mcast.c:2651 process_one_work+0x3d3/0x720 kernel/workqueue.c:2289 worker_thread+0x618/0xa70 kernel/workqueue.c:2436 kthread+0x1a9/0x1e0 kernel/kthread.c:376 ret_from_fork+0x1f/0x30 value changed: 0x00000bb3 -> 0x00000ba9 Reported by Kernel Concurrency Sanitizer on: CPU: 0 PID: 11607 Comm: kworker/0:21 Not tainted 5.18.0-rc1-syzkaller-00037-g42e7a03d3bad-dirty #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: mld mld_ifc_work Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: syzbot <syzkaller@googlegroups.com> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20220413181333.649424-1-eric.dumazet@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> [ 5.4: context adjustment in include/net/netns/ipv6.h ] Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-15net/dst: use a smaller percpu_counter batch for dst entries accountingEric Dumazet
commit cf86a086a18095e33e0637cb78cda1fcf5280852 upstream. percpu_counter_add() uses a default batch size which is quite big on platforms with 256 cpus. (2*256 -> 512) This means dst_entries_get_fast() can be off by +/- 2*(nr_cpus^2) (131072 on servers with 256 cpus) Reduce the batch size to something more reasonable, and add logic to ip6_dst_gc() to call dst_entries_get_slow() before calling the _very_ expensive fib6_run_gc() function. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Suraj Jitindar Singh <surajjs@amazon.com> Cc: <stable@vger.kernel.org> # 5.4.x Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-27Merge tag 'v5.4.265' into v5.4/standard/baseBruce Ashfield
This is the 5.4.265 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmWC/R4ACgkQONu9yGCS # aT6XRw/+OE/DwEAaRGgM/gGLayr/n6zJoL7DUhLxkS+wG3beQXFsdigCHRRhTh58 # OCQP6pL6UlbJ8Yge3FtpYqqkR+UyY7c/wsjJI05v7dKUJ742rpFNML9w0Dg9Au8w # k4TsVU01nnr9HC7rY8k8zYZ/DZdULvIX8RNhSOi0CMO2gkdMUFrh/IC0q5JIWKmL # xFmMieGtsr4kl4sP2oUbYihf1Li4oblouBV+70kPViC6XA0YhOSCT0+PfDxp5CuD # sux1srZGY/782zI0O6+ObsYascwgL+wk0oEJRj1vO02tJKKbtEGMJvGO9Mcpto6B # 2YBq40PAhyeKFdt4YzOWCSO7WjvWP7h15U68EY+E6ruy9La+P/dTyhAqsBBTVDEs # PGFIjxc5pnHn72JQ/U3yJoHFM7yW26VEmEGItsd81VermNgqe2scSPSPHIfM0qFU # z2l0PcQkm+SLK2cFDSCBUBaXfx4R2UuWe/QY07K2eN5YCC4mqROajVh4Vqyj1Q8j # PLw/yrt8lOJcDEDMtFq7hcXKMzcb/dYfCZcSfxl6YJeaR4X4ViOkDGVhLEkVeOn5 # K3kyIvPd268rmoy/9jTuDYu6axMhg2eE2dTQqBg8pFwIOgetUwtYcBhyxDtmGZm1 # lNUYmY84BSHZwXuKjNXGgZ5DI0U7nAWis+odR0scHpVKwaC8ta8= # =d0Ht # -----END PGP SIGNATURE----- # gpg: Signature made Wed 20 Dec 2023 09:41:34 AM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2023-12-20asm-generic: qspinlock: fix queued_spin_value_unlocked() implementationLinus Torvalds
[ Upstream commit 125b0bb95dd6bec81b806b997a4ccb026eeecf8f ] We really don't want to do atomic_read() or anything like that, since we already have the value, not the lock. The whole point of this is that we've loaded the lock from memory, and we want to check whether the value we loaded was a locked one or not. The main use of this is the lockref code, which loads both the lock and the reference count in one atomic operation, and then works on that combined value. With the atomic_read(), the compiler would pointlessly spill the value to the stack, in order to then be able to read it back "atomically". This is the qspinlock version of commit c6f4a9002252 ("asm-generic: ticket-lock: Optimize arch_spin_value_unlocked()") which fixed this same bug for ticket locks. Cc: Guo Ren <guoren@kernel.org> Cc: Ingo Molnar <mingo@kernel.org> Cc: Waiman Long <longman@redhat.com> Link: https://lore.kernel.org/all/CAHk-=whNRv0v6kQiV5QO6DJhjH4KEL36vWQ6Re8Csrnh4zbRkQ@mail.gmail.com/ Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-20cred: switch to using atomic_long_tJens Axboe
commit f8fa5d76925991976b3e7076f9d1052515ec1fca upstream. There are multiple ways to grab references to credentials, and the only protection we have against overflowing it is the memory required to do so. With memory sizes only moving in one direction, let's bump the reference count to 64-bit and move it outside the realm of feasibly overflowing. Signed-off-by: Jens Axboe <axboe@kernel.dk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-20net: ipv6: support reporting otherwise unknown prefix flags in RTM_NEWPREFIXMaciej Żenczykowski
[ Upstream commit bd4a816752bab609dd6d65ae021387beb9e2ddbd ] Lorenzo points out that we effectively clear all unknown flags from PIO when copying them to userspace in the netlink RTM_NEWPREFIX notification. We could fix this one at a time as new flags are defined, or in one fell swoop - I choose the latter. We could either define 6 new reserved flags (reserved1..6) and handle them individually (and rename them as new flags are defined), or we could simply copy the entire unmodified byte over - I choose the latter. This unfortunately requires some anonymous union/struct magic, so we add a static assert on the struct size for a little extra safety. Cc: David Ahern <dsahern@kernel.org> Cc: Lorenzo Colitti <lorenzo@google.com> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Maciej Żenczykowski <maze@google.com> Reviewed-by: David Ahern <dsahern@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-19Merge tag 'v5.4.264' into v5.4/standard/baseBruce Ashfield
This is the 5.4.264 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmV5528ACgkQONu9yGCS # aT7+DxAAl/t4oGT1Di8mIhCfqsezTj/SQ6HAMaAFpKjGXdgTBX9QavwTHp35qLpv # xZtibdl611oEDT2B3//cvJu9Qs8sJGlDWxpJF8fkF/NED2EHLRwUAOmhc4fyBCBq # 01GfPRefU9G5E0Nw2g3o07dD0otKQzENh74iAzUr/cyju5TBgS4NC7CljiD6GXP8 # DtCcbz2UMmHF5icyasjw7WoCqKaWUn7KLqU3RyQfmDgnh3z0vSXKs17CFoNk1+R6 # EAkZJkrUsIDO3N6aRkJGvwtbEFDws4S7onpcthckXt6IF/OQ+h8LcHvyTpblVeWH # qVBXmj1QZD+SwfP5qVtM+RwHHTE3GVJI3+MVTInu0vzgEz6lhNPILzWwso+AIhHM # +SBZkx4/pO2LbrSgqn+NYWRcBTn4fXNURPd+zLXNJ1ZSnhamWm7kuFK8B36lz0s8 # CB5ngLld1wg6m2NVPneqAeEcD54msGd18iNZvnxfx3c28sb27RgLqLJQKdSrtFMH # tZj0+3KGkYHzWQmt07iLmg1DPAmZSNzMW7MDJRHlqDCKS2gBKO/LKxWiri2WPh+o # IvgrfGzEJsyYXi5+JcGC9+RkoPcGaous4VA2kWFJSjXiXPrJC4jknQm3mtdMiTf3 # V9DEOMvot9ELGI9RODaNsk59C2RhOuVimm2XWV9n88UcX38reAQ= # =gOna # -----END PGP SIGNATURE----- # gpg: Signature made Wed 13 Dec 2023 12:18:39 PM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2023-12-19Merge tag 'v5.4.263' into v5.4/standard/baseBruce Ashfield
This is the 5.4.263 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVyyWIACgkQONu9yGCS # aT6Y8A//QJPg7pguCawsJGrem3a5dvhi9scNMmfuhKZOKS73JEmt4yudB9IOUjIX # 1c1aBcJo5yYMZq5L9mhXnlgkgqENxE9fI45FtMdwoKiriEQ0w9OBLlfZuKN9lwzC # tyIigaGE5DD3SqL8e/04LNmMPPdolM38lJ368fYaD3T4d7LfwK0qHJFL8dSg4OFQ # VaePViMFgbodjtSXoERNjVLaNtSlQDQytiWHMiQX2uf6CIIRbm+zFHn2Se1mUgh3 # WGT9JfXZ+achPw6OLhSIjwL+7vowhn3eRETq4zGkkNSK+rmB6W7zjPhou4SYsmc+ # FAYXvalmhQWWjlmIyZzO7GIVtgx19VuEYB8h5KLvp6DXQ0h0wCBOGgsfIT4icbgW # wO0R+toWYY3Y79OLRGiMjiL9b60njJYnrm7JrheRD+BIm2jva+Tb7UxhC6QDMfH6 # a8fya8iJDNZWggwpx67JUANdMO8e+2rS4ttNxW0gTZSHhyEjo1HXctKBEmmtXk4s # HGNV5xUniPnzrP8rduNqePG5B6c3wqOHUwj45L4scGmeC0DzW7E8EBgkHfRcU6CG # ik9z5nQeDikREfK7cp8OSFtLaEBWSIX57XwHWDTMVPDGTN8EQ6eI7vTnQH3xOhA8 # VWFfwcU6avROM/ih7eJ+X4JvuDKcAGTPeD6oF3II0MLPK2m7ZmE= # =p/ty # -----END PGP SIGNATURE----- # gpg: Signature made Fri 08 Dec 2023 02:44:34 AM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2023-12-13drop_monitor: Require 'CAP_SYS_ADMIN' when joining "events" groupIdo Schimmel
commit e03781879a0d524ce3126678d50a80484a513c4b upstream. The "NET_DM" generic netlink family notifies drop locations over the "events" multicast group. This is problematic since by default generic netlink allows non-root users to listen to these notifications. Fix by adding a new field to the generic netlink multicast group structure that when set prevents non-root users or root without the 'CAP_SYS_ADMIN' capability (in the user namespace owning the network namespace) from joining the group. Set this field for the "events" group. Use 'CAP_SYS_ADMIN' rather than 'CAP_NET_ADMIN' because of the nature of the information that is shared over this group. Note that the capability check in this case will always be performed against the initial user namespace since the family is not netns aware and only operates in the initial network namespace. A new field is added to the structure rather than using the "flags" field because the existing field uses uAPI flags and it is inappropriate to add a new uAPI flag for an internal kernel check. In net-next we can rework the "flags" field to use internal flags and fold the new field into it. But for now, in order to reduce the amount of changes, add a new field. Since the information can only be consumed by root, mark the control plane operations that start and stop the tracing as root-only using the 'GENL_ADMIN_PERM' flag. Tested using [1]. Before: # capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo After: # capsh -- -c ./dm_repo # capsh --drop=cap_sys_admin -- -c ./dm_repo Failed to join "events" multicast group [1] $ cat dm.c #include <stdio.h> #include <netlink/genl/ctrl.h> #include <netlink/genl/genl.h> #include <netlink/socket.h> int main(int argc, char **argv) { struct nl_sock *sk; int grp, err; sk = nl_socket_alloc(); if (!sk) { fprintf(stderr, "Failed to allocate socket\n"); return -1; } err = genl_connect(sk); if (err) { fprintf(stderr, "Failed to connect socket\n"); return err; } grp = genl_ctrl_resolve_grp(sk, "NET_DM", "events"); if (grp < 0) { fprintf(stderr, "Failed to resolve \"events\" multicast group\n"); return grp; } err = nl_socket_add_memberships(sk, grp, NFNLGRP_NONE); if (err) { fprintf(stderr, "Failed to join \"events\" multicast group\n"); return err; } return 0; } $ gcc -I/usr/include/libnl3 -lnl-3 -lnl-genl-3 -o dm_repo dm.c Fixes: 9a8afc8d3962 ("Network Drop Monitor: Adding drop monitor implementation & Netlink protocol") Reported-by: "The UK's National Cyber Security Centre (NCSC)" <security@ncsc.gov.uk> Signed-off-by: Ido Schimmel <idosch@nvidia.com> Reviewed-by: Jacob Keller <jacob.e.keller@intel.com> Reviewed-by: Jiri Pirko <jiri@nvidia.com> Link: https://lore.kernel.org/r/20231206213102.1824398-3-idosch@nvidia.com Signed-off-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13genetlink: add CAP_NET_ADMIN test for multicast bindIdo Schimmel
This is a partial backport of upstream commit 4d54cc32112d ("mptcp: avoid lock_fast usage in accept path"). It is only a partial backport because the patch in the link below was erroneously squash-merged into upstream commit 4d54cc32112d ("mptcp: avoid lock_fast usage in accept path"). Below is the original patch description from Florian Westphal: " genetlink sets NL_CFG_F_NONROOT_RECV for its netlink socket so anyone can subscribe to multicast messages. rtnetlink doesn't allow this unconditionally, rtnetlink_bind() restricts bind requests to CAP_NET_ADMIN for a few groups. This allows to set GENL_UNS_ADMIN_PERM flag on genl mcast groups to mandate CAP_NET_ADMIN. This will be used by the upcoming mptcp netlink event facility which exposes the token (mptcp connection identifier) to userspace. " Link: https://lore.kernel.org/mptcp/20210213000001.379332-8-mathew.j.martineau@linux.intel.com/ Signed-off-by: Ido Schimmel <idosch@nvidia.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-12-13mmc: core: add helpers mmc_regulator_enable/disable_vqmmcHeiner Kallweit
[ Upstream commit 8d91f3f8ae57e6292142ca89f322e90fa0d6ac02 ] There's a number of drivers (e.g. dw_mmc, meson-gx, mmci, sunxi) using the same mechanism and a private flag vqmmc_enabled to deal with enabling/disabling the vqmmc regulator. Move this to the core and create new helpers mmc_regulator_enable_vqmmc and mmc_regulator_disable_vqmmc. Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com> Acked-by: Martin Blumenstingl <martin.blumenstingl@googlemail.com> Link: https://lore.kernel.org/r/71586432-360f-9b92-17f6-b05a8a971bc2@gmail.com Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Stable-dep-of: 477865af60b2 ("mmc: sdhci-sprd: Fix vqmmc not shutting down after the card was pulled") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13perf/core: Add a new read format to get a number of lost samplesNamhyung Kim
[ Upstream commit 119a784c81270eb88e573174ed2209225d646656 ] Sometimes we want to know an accurate number of samples even if it's lost. Currenlty PERF_RECORD_LOST is generated for a ring-buffer which might be shared with other events. So it's hard to know per-event lost count. Add event->lost_samples field and PERF_FORMAT_LOST to retrieve it from userspace. Original-patch-by: Jiri Olsa <jolsa@redhat.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Link: https://lkml.kernel.org/r/20220616180623.1358843-1-namhyung@kernel.org Stable-dep-of: 382c27f4ed28 ("perf: Fix perf_event_validate_size()") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13of: Add missing 'Return' section in kerneldoc commentsRob Herring
[ Upstream commit 8c8239c2c1fb82f171cb22a707f3bb88a2f22109 ] Many of the DT kerneldoc comments are lacking a 'Return' section. Let's add the section in cases we have a description of return values. There's still some cases where the return values are not documented. Cc: Frank Rowand <frowand.list@gmail.com> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> Signed-off-by: Rob Herring <robh@kernel.org> Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org> Link: https://lore.kernel.org/r/20210325164713.1296407-8-robh@kernel.org Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13of/irq: Make of_msi_map_rid() PCI bus agnosticLorenzo Pieralisi
[ Upstream commit 2bcdd8f2c07f1aa1bfd34fa0dab8e06949e34846 ] There is nothing PCI bus specific in the of_msi_map_rid() implementation other than the requester ID tag for the input ID space. Rename requester ID to a more generic ID so that the translation code can be used by all busses that require input/output ID translations. No functional change intended. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200619082013.13661-11-lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13of/irq: make of_msi_map_get_device_domain() bus agnosticDiana Craciun
[ Upstream commit 6f881aba01109a01a43e4f135673c19190f61133 ] of_msi_map_get_device_domain() is PCI specific but it need not be and can be easily changed to be bus agnostic in order to be used by other busses by adding an IRQ domain bus token as an input parameter. Signed-off-by: Diana Craciun <diana.craciun@oss.nxp.com> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci/msi.c Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Rob Herring <robh+dt@kernel.org> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200619082013.13661-10-lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13of/iommu: Make of_map_rid() PCI agnosticLorenzo Pieralisi
[ Upstream commit 746a71d02b5d15817fcb13c956ba999a87773952 ] There is nothing PCI specific (other than the RID - requester ID) in the of_map_rid() implementation, so the same function can be reused for input/output IDs mapping for other busses just as well. Rename the RID instances/names to a generic "id" tag. No functionality change intended. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Reviewed-by: Rob Herring <robh@kernel.org> Acked-by: Joerg Roedel <jroedel@suse.de> Cc: Rob Herring <robh+dt@kernel.org> Cc: Joerg Roedel <joro@8bytes.org> Cc: Robin Murphy <robin.murphy@arm.com> Cc: Marc Zyngier <maz@kernel.org> Link: https://lore.kernel.org/r/20200619082013.13661-7-lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13ACPI/IORT: Make iort_msi_map_rid() PCI agnosticLorenzo Pieralisi
[ Upstream commit 39c3cf566ceafa7c1ae331a5f26fbb685d670001 ] There is nothing PCI specific in iort_msi_map_rid(). Rename the function using a bus protocol agnostic name, iort_msi_map_id(), and convert current callers to it. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> Cc: Will Deacon <will@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Link: https://lore.kernel.org/r/20200619082013.13661-4-lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13ACPI/IORT: Make iort_get_device_domain IRQ domain agnosticLorenzo Pieralisi
[ Upstream commit d1718a1b7a86743b9c517bf9521695ba909c734f ] iort_get_device_domain() is PCI specific but it need not be, since it can be used to retrieve IRQ domain nexus of any kind by adding an irq_domain_bus_token input to it. Make it PCI agnostic by also renaming the requestor ID input to a more generic ID name. Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> Acked-by: Bjorn Helgaas <bhelgaas@google.com> # pci/msi.c Cc: Will Deacon <will@kernel.org> Cc: Hanjun Guo <guohanjun@huawei.com> Cc: Bjorn Helgaas <bhelgaas@google.com> Cc: Sudeep Holla <sudeep.holla@arm.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net> Link: https://lore.kernel.org/r/20200619082013.13661-3-lorenzo.pieralisi@arm.com Signed-off-by: Catalin Marinas <catalin.marinas@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13of: base: Add of_get_cpu_state_node() to get idle states for a CPU nodeUlf Hansson
[ Upstream commit b9f8c26afc405a4a616e765e949bdd551151e41d ] The CPU's idle state nodes are currently parsed at the common cpuidle DT library, but also when initializing data for specific CPU idle operations, as in the PSCI cpuidle driver case and qcom-spm cpuidle case. To avoid open-coding, let's introduce of_get_cpu_state_node(), which takes the device node for the CPU and the index to the requested idle state node, as in-parameters. In case a corresponding idle state node is found, it returns the node with the refcount incremented for it, else it returns NULL. Moreover, for PSCI there are two options to describe the CPU's idle states [1], either via a flattened description or a hierarchical layout. Hence, let's take both options into account. [1] Documentation/devicetree/bindings/arm/psci.yaml Suggested-by: Sudeep Holla <sudeep.holla@arm.com> Co-developed-by: Lina Iyer <lina.iyer@linaro.org> Signed-off-by: Lina Iyer <lina.iyer@linaro.org> Reviewed-by: Rob Herring <robh@kernel.org> Reviewed-by: Daniel Lezcano <daniel.lezcano@linaro.org> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org> Reviewed-by: Sudeep Holla <sudeep.holla@arm.com> Stable-dep-of: d79972789d17 ("of: dynamic: Fix of_reconfig_get_state_change() return value documentation") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-13hrtimers: Push pending hrtimers away from outgoing CPU earlierThomas Gleixner
[ Upstream commit 5c0930ccaad5a74d74e8b18b648c5eb21ed2fe94 ] 2b8272ff4a70 ("cpu/hotplug: Prevent self deadlock on CPU hot-unplug") solved the straight forward CPU hotplug deadlock vs. the scheduler bandwidth timer. Yu discovered a more involved variant where a task which has a bandwidth timer started on the outgoing CPU holds a lock and then gets throttled. If the lock required by one of the CPU hotplug callbacks the hotplug operation deadlocks because the unthrottling timer event is not handled on the dying CPU and can only be recovered once the control CPU reaches the hotplug state which pulls the pending hrtimers from the dead CPU. Solve this by pushing the hrtimers away from the dying CPU in the dying callbacks. Nothing can queue a hrtimer on the dying CPU at that point because all other CPUs spin in stop_machine() with interrupts disabled and once the operation is finished the CPU is marked offline. Reported-by: Yu Liao <liaoyu15@huawei.com> Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Tested-by: Liu Tie <liutie4@huawei.com> Link: https://lore.kernel.org/r/87a5rphara.ffs@tglx Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-11Revert "rt: drop -stable migrate_disable"Bruce Ashfield
This reverts commit d7ed2aabdf71b82cfe40933d73d9a3d32e4df6e2.
2023-12-11rt: drop -stable migrate_disableBruce Ashfield
The -rt branches already have a migrate disable mechanism, we drop the -stable version to fix build issues. Signed-off-by: Bruce Ashfield <bruce.ashfield@gmail.com>
2023-12-08scsi: core: Introduce the scsi_cmd_to_rq() functionBart Van Assche
[ Upstream commit 51f3a478892873337c54068d1185bcd797000a52 ] The 'request' member of struct scsi_cmnd is superfluous. The struct request and struct scsi_cmnd data structures are adjacent and hence the request pointer can be derived easily from a scsi_cmnd pointer. Introduce a helper function that performs that conversion in a type-safe way. This patch is the first step towards removing the request member from struct scsi_cmnd. Making that change has the following advantages: - This is a performance optimization since adding an offset to a pointer takes less time than dereferencing a pointer. - struct scsi_cmnd becomes smaller. Link: https://lore.kernel.org/r/20210809230355.8186-2-bvanassche@acm.org Cc: Christoph Hellwig <hch@lst.de> Cc: Hannes Reinecke <hare@suse.de> Cc: Ming Lei <ming.lei@redhat.com> Reviewed-by: Hannes Reinecke <hare@suse.de> Signed-off-by: Bart Van Assche <bvanassche@acm.org> Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com> Stable-dep-of: 19597cad64d6 ("scsi: qla2xxx: Fix system crash due to bad pointer access") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08ovl: skip overlayfs superblocks at global syncKonstantin Khlebnikov
[ Upstream commit 32b1924b210a70dcacdf65abd687c5ef86a67541 ] Stacked filesystems like overlayfs has no own writeback, but they have to forward syncfs() requests to backend for keeping data integrity. During global sync() each overlayfs instance calls method ->sync_fs() for backend although it itself is in global list of superblocks too. As a result one syscall sync() could write one superblock several times and send multiple disk barriers. This patch adds flag SB_I_SKIP_SYNC into sb->sb_iflags to avoid that. Reported-by: Dmitry Monakhov <dmtrmonakhov@yandex-team.ru> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru> Reviewed-by: Amir Goldstein <amir73il@gmail.com> Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> Stable-dep-of: b836c4d29f27 ("ima: detect changes to the backing overlay file") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08HID: fix HID device resource race between HID core and debugging supportCharles Yi
[ Upstream commit fc43e9c857b7aa55efba9398419b14d9e35dcc7d ] hid_debug_events_release releases resources bound to the HID device instance. hid_device_release releases the underlying HID device instance potentially before hid_debug_events_release has completed releasing debug resources bound to the same HID device instance. Reference count to prevent the HID device instance from being torn down preemptively when HID debugging support is used. When count reaches zero, release core resources of HID device instance using hiddev_free. The crash: [ 120.728477][ T4396] kernel BUG at lib/list_debug.c:53! [ 120.728505][ T4396] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP [ 120.739806][ T4396] Modules linked in: bcmdhd dhd_static_buf 8822cu pcie_mhi r8168 [ 120.747386][ T4396] CPU: 1 PID: 4396 Comm: hidt_bridge Not tainted 5.10.110 #257 [ 120.754771][ T4396] Hardware name: Rockchip RK3588 EVB4 LP4 V10 Board (DT) [ 120.761643][ T4396] pstate: 60400089 (nZCv daIf +PAN -UAO -TCO BTYPE=--) [ 120.768338][ T4396] pc : __list_del_entry_valid+0x98/0xac [ 120.773730][ T4396] lr : __list_del_entry_valid+0x98/0xac [ 120.779120][ T4396] sp : ffffffc01e62bb60 [ 120.783126][ T4396] x29: ffffffc01e62bb60 x28: ffffff818ce3a200 [ 120.789126][ T4396] x27: 0000000000000009 x26: 0000000000980000 [ 120.795126][ T4396] x25: ffffffc012431000 x24: ffffff802c6d4e00 [ 120.801125][ T4396] x23: ffffff8005c66f00 x22: ffffffc01183b5b8 [ 120.807125][ T4396] x21: ffffff819df2f100 x20: 0000000000000000 [ 120.813124][ T4396] x19: ffffff802c3f0700 x18: ffffffc01d2cd058 [ 120.819124][ T4396] x17: 0000000000000000 x16: 0000000000000000 [ 120.825124][ T4396] x15: 0000000000000004 x14: 0000000000003fff [ 120.831123][ T4396] x13: ffffffc012085588 x12: 0000000000000003 [ 120.837123][ T4396] x11: 00000000ffffbfff x10: 0000000000000003 [ 120.843123][ T4396] x9 : 455103d46b329300 x8 : 455103d46b329300 [ 120.849124][ T4396] x7 : 74707572726f6320 x6 : ffffffc0124b8cb5 [ 120.855124][ T4396] x5 : ffffffffffffffff x4 : 0000000000000000 [ 120.861123][ T4396] x3 : ffffffc011cf4f90 x2 : ffffff81fee7b948 [ 120.867122][ T4396] x1 : ffffffc011cf4f90 x0 : 0000000000000054 [ 120.873122][ T4396] Call trace: [ 120.876259][ T4396] __list_del_entry_valid+0x98/0xac [ 120.881304][ T4396] hid_debug_events_release+0x48/0x12c [ 120.886617][ T4396] full_proxy_release+0x50/0xbc [ 120.891323][ T4396] __fput+0xdc/0x238 [ 120.895075][ T4396] ____fput+0x14/0x24 [ 120.898911][ T4396] task_work_run+0x90/0x148 [ 120.903268][ T4396] do_exit+0x1bc/0x8a4 [ 120.907193][ T4396] do_group_exit+0x8c/0xa4 [ 120.911458][ T4396] get_signal+0x468/0x744 [ 120.915643][ T4396] do_signal+0x84/0x280 [ 120.919650][ T4396] do_notify_resume+0xd0/0x218 [ 120.924262][ T4396] work_pending+0xc/0x3f0 [ Rahul Rameshbabu <sergeantsagara@protonmail.com>: rework changelog ] Fixes: cd667ce24796 ("HID: use debugfs for events/reports dumping") Signed-off-by: Charles Yi <be286@163.com> Signed-off-by: Jiri Kosina <jkosina@suse.cz> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-08HID: core: store the unique system identifier in hid_deviceBenjamin Tissoires
[ Upstream commit 1e839143d674603b0bbbc4c513bca35404967dbc ] This unique identifier is currently used only for ensuring uniqueness in sysfs. However, this could be handful for userspace to refer to a specific hid_device by this id. 2 use cases are in my mind: LEDs (and their naming convention), and HID-BPF. Reviewed-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Signed-off-by: Benjamin Tissoires <benjamin.tissoires@redhat.com> Link: https://lore.kernel.org/r/20220902132938.2409206-9-benjamin.tissoires@redhat.com Stable-dep-of: fc43e9c857b7 ("HID: fix HID device resource race between HID core and debugging support") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-12-03Merge tag 'v5.4.262' into v5.4/standard/baseBruce Ashfield
This is the 5.4.262 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVmGmUACgkQONu9yGCS # aT4V7A//YzFdP4ANGVpZ7tBob7OxpgGgvEu32zCDx51LQ8n2uJRJ8WBWW6kVOBUZ # YyUEXzjPPaS7JRS1O7TpCGYFWrH0ue9c/xzyvUQyyHEBZvZVj0P3O1iHlAk2FWSG # pOTEfW1cFp8vtHwGn82rmIDETu56LMWd+aeVhg6psb2L6ho2LPipCkxN79kbBGSB # DLfD71O2Pb3mw8ZYHVC5KKIlfODLqjq9N6T+3VsG4uQCEMHTVAHjjoIvYFeSi1cR # MqPXS4/3GUyYUDTe2tjYznkSfPbdARfD1aKKPEXLuq1+q6WqvHCAG7nwgtPT/gd9 # JPCxm+9DPN9+YhmEsCJpMSq3pD2eTrD5ZXhYFNc5sOsNw0L4oFRLtrB782snerw+ # ogQ8DED4qATn1+x7jfRD7hwMzHih4nAL7zqy32s8knKHfp1+rOOkXfIohfc9qrUI # svUjb1B+guuGHwFq6YDzxpUxmhdGqOo262cnU4jfH8lxH+w03vyNxxyQn0ZUUe2I # gkvJ5wNpq4QhD/++B/DaCptw0l5AzfjOO+0xlp20xMzn5qW/BS8W26zUXhGeLOAd # MHu+fv9DU0mzs3V1MxRvbBQ5gI9TngRWXJSIBCJx5YhZ8gGIhfrzoIzY+IeF6l3F # idjruirbfujAQv0vQHuz7JmhHrTG+T90slQ/R8pPud73WGz5BMI= # =A+DX # -----END PGP SIGNATURE----- # gpg: Signature made Tue 28 Nov 2023 11:50:45 AM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2023-12-03Merge tag 'v5.4.261' into v5.4/standard/baseBruce Ashfield
This is the 5.4.261 stable release # -----BEGIN PGP SIGNATURE----- # # iQIzBAABCAAdFiEEZH8oZUiU471FcZm+ONu9yGCSaT4FAmVbJywACgkQONu9yGCS # aT6YqQ/9FxwPj4QDAuA9/XXBN2VsBm8Xws1chEJzEIxWftOc4kzQGtdbvoxmoLsd # 6cyyd89oiQyme0qvnpPbGnGKbZo4I100ugS5ba6k+T1gHtVMkcVB8AmvLce/wHip # oqxb340oK5leZxmJs2hm/S/rCEr7RlcB/2ZMGle+Bw3xRone4S9MQ/PXXeKzH1dQ # 9i34Vv8fAgBV/CCJ47QneGAMFj2gtr893us7MOJkyJVIM2tFCciZbclY8DTH5Umr # L7MhX4dCd2Bm1umtidt5f6fpmykHRFHSDb/STHS9Dg5BUSpEhr2H5/hq2spb6492 # 6xwQ/Q6WrP1scaIvSA8jChxed+wSxKTVd7hLXJiRoG03qGx4uD3udEus9+V4xbY/ # lOR9+QuZl7V5tGWYtC9KXUwOo9PkjX5Fi4mXKlEqrL25NTU8KO3AwMxKcGy/MNgn # xzSNM4u5iA+dNIsP0Xe/1QZ92c20p6oFKlllHFXVzH+PWwZN3X6Jm+MyeJb0AMj9 # MaaTa3ZDI5OiK72gBp4KH4ICEenCTFqsAsyyAai2jlzhPSHNWlKoiJbU8uacwiyU # tRctf4QCuZa2EA/u2qIqeWNIsionGGsPABd+hNp+Gvc2QqO4cd8p8diTO0V5aKOz # PvpUY29p3v/qyF8ofIO0GYE0Wa0YQaS7X2f4Pc4NGA1Uk6wLzLc= # =k5NF # -----END PGP SIGNATURE----- # gpg: Signature made Mon 20 Nov 2023 04:30:20 AM EST # gpg: using RSA key 647F28654894E3BD457199BE38DBBDC86092693E # gpg: Can't check signature: No public key
2023-11-28netfilter: nf_tables: fix table flag updatesPablo Neira Ayuso
commit 179d9ba5559a756f4322583388b3213fe4e391b0 upstream. The dormant flag need to be updated from the preparation phase, otherwise, two consecutive requests to dorm a table in the same batch might try to remove the same hooks twice, resulting in the following warning: hook not found, pf 3 num 0 WARNING: CPU: 0 PID: 334 at net/netfilter/core.c:480 __nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480 Modules linked in: CPU: 0 PID: 334 Comm: kworker/u4:5 Not tainted 5.12.0-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 Workqueue: netns cleanup_net RIP: 0010:__nf_unregister_net_hook+0x1eb/0x610 net/netfilter/core.c:480 This patch is a partial revert of 0ce7cf4127f1 ("netfilter: nftables: update table flags from the commit phase") to restore the previous behaviour. However, there is still another problem: A batch containing a series of dorm-wakeup-dorm table and vice-versa also trigger the warning above since hook unregistration happens from the preparation phase, while hook registration occurs from the commit phase. To fix this problem, this patch adds two internal flags to annotate the original dormant flag status which are __NFT_TABLE_F_WAS_DORMANT and __NFT_TABLE_F_WAS_AWAKEN, to restore it from the abort path. The __NFT_TABLE_F_UPDATE bitmask allows to handle the dormant flag update with one single transaction. Reported-by: syzbot+7ad5cd1615f2d89c6e7e@syzkaller.appspotmail.com Fixes: 0ce7cf4127f1 ("netfilter: nftables: update table flags from the commit phase") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nftables: update table flags from the commit phasePablo Neira Ayuso
commit 0ce7cf4127f14078ca598ba9700d813178a59409 upstream. Do not update table flags from the preparation phase. Store the flags update into the transaction, then update the flags from the commit phase. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nf_tables: fix memleak when more than 255 elements expiredPablo Neira Ayuso
commit cf5000a7787cbc10341091d37245a42c119d26c5 upstream. When more than 255 elements expired we're supposed to switch to a new gc container structure. This never happens: u8 type will wrap before reaching the boundary and nft_trans_gc_space() always returns true. This means we recycle the initial gc container structure and lose track of the elements that came before. While at it, don't deref 'gc' after we've passed it to call_rcu. Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Reported-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nf_tables: defer gc run if previous batch is still pendingFlorian Westphal
commit 8e51830e29e12670b4c10df070a4ea4c9593e961 upstream. Don't queue more gc work, else we may queue the same elements multiple times. If an element is flagged as dead, this can mean that either the previous gc request was invalidated/discarded by a transaction or that the previous request is still pending in the system work queue. The latter will happen if the gc interval is set to a very low value, e.g. 1ms, and system work queue is backlogged. The sets refcount is 1 if no previous gc requeusts are queued, so add a helper for this and skip gc run if old requests are pending. Add a helper for this and skip the gc run in this case. Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nf_tables: remove busy mark and gc batch APIPablo Neira Ayuso
commit a2dd0233cbc4d8a0abb5f64487487ffc9265beb5 upstream. Ditch it, it has been replace it by the GC transaction API and it has no clients anymore. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nf_tables: GC transaction API to avoid race with control planePablo Neira Ayuso
commit 5f68718b34a531a556f2f50300ead2862278da26 upstream. The set types rhashtable and rbtree use a GC worker to reclaim memory. >From system work queue, in periodic intervals, a scan of the table is done. The major caveat here is that the nft transaction mutex is not held. This causes a race between control plane and GC when they attempt to delete the same element. We cannot grab the netlink mutex from the work queue, because the control plane has to wait for the GC work queue in case the set is to be removed, so we get following deadlock: cpu 1 cpu2 GC work transaction comes in , lock nft mutex `acquire nft mutex // BLOCKS transaction asks to remove the set set destruction calls cancel_work_sync() cancel_work_sync will now block forever, because it is waiting for the mutex the caller already owns. This patch adds a new API that deals with garbage collection in two steps: 1) Lockless GC of expired elements sets on the NFT_SET_ELEM_DEAD_BIT so they are not visible via lookup. Annotate current GC sequence in the GC transaction. Enqueue GC transaction work as soon as it is full. If ruleset is updated, then GC transaction is aborted and retried later. 2) GC work grabs the mutex. If GC sequence has changed then this GC transaction lost race with control plane, abort it as it contains stale references to objects and let GC try again later. If the ruleset is intact, then this GC transaction deactivates and removes the elements and it uses call_rcu() to destroy elements. Note that no elements are removed from GC lockless path, the _DEAD bit is set and pointers are collected. GC catchall does not remove the elements anymore too. There is a new set->dead flag that is set on to abort the GC transaction to deal with set->ops->destroy() path which removes the remaining elements in the set from commit_release, where no mutex is held. To deal with GC when mutex is held, which allows safe deactivate and removal, add sync GC API which releases the set element object via call_rcu(). This is used by rbtree and pipapo backends which also perform garbage collection from control plane path. Since element removal from sets can happen from control plane and element garbage collection/timeout, it is necessary to keep the set structure alive until all elements have been deactivated and destroyed. We cannot do a cancel_work_sync or flush_work in nft_set_destroy because its called with the transaction mutex held, but the aforementioned async work queue might be blocked on the very mutex that nft_set_destroy() callchain is sitting on. This gives us the choice of ABBA deadlock or UaF. To avoid both, add set->refs refcount_t member. The GC API can then increment the set refcount and release it once the elements have been free'd. Set backends are adapted to use the GC transaction API in a follow up patch entitled: ("netfilter: nf_tables: use gc transaction API in set backends") This is joint work with Florian Westphal. Fixes: cfed7e1b1f8e ("netfilter: nf_tables: add set garbage collection helpers") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28netfilter: nf_tables: drop map element references from preparation phasePablo Neira Ayuso
commit 628bd3e49cba1c066228e23d71a852c23e26da73 upstream. set .destroy callback releases the references to other objects in maps. This is very late and it results in spurious EBUSY errors. Drop refcount from the preparation phase instead, update set backend not to drop reference counter from set .destroy path. Exceptions: NFT_TRANS_PREPARE_ERROR does not require to drop the reference counter because the transaction abort path releases the map references for each element since the set is unbound. The abort path also deals with releasing reference counter for new elements added to unbound sets. Fixes: 591054469b3e ("netfilter: nf_tables: revisit chain/object refcounting from elements") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28tracing: Have trace_event_file have ref countersSteven Rostedt (Google)
commit bb32500fb9b78215e4ef6ee8b4345c5f5d7eafb4 upstream. The following can crash the kernel: # cd /sys/kernel/tracing # echo 'p:sched schedule' > kprobe_events # exec 5>>events/kprobes/sched/enable # > kprobe_events # exec 5>&- The above commands: 1. Change directory to the tracefs directory 2. Create a kprobe event (doesn't matter what one) 3. Open bash file descriptor 5 on the enable file of the kprobe event 4. Delete the kprobe event (removes the files too) 5. Close the bash file descriptor 5 The above causes a crash! BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 877 Comm: bash Not tainted 6.5.0-rc4-test-00008-g2c6b6b1029d4-dirty #186 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:tracing_release_file_tr+0xc/0x50 What happens here is that the kprobe event creates a trace_event_file "file" descriptor that represents the file in tracefs to the event. It maintains state of the event (is it enabled for the given instance?). Opening the "enable" file gets a reference to the event "file" descriptor via the open file descriptor. When the kprobe event is deleted, the file is also deleted from the tracefs system which also frees the event "file" descriptor. But as the tracefs file is still opened by user space, it will not be totally removed until the final dput() is called on it. But this is not true with the event "file" descriptor that is already freed. If the user does a write to or simply closes the file descriptor it will reference the event "file" descriptor that was just freed, causing a use-after-free bug. To solve this, add a ref count to the event "file" descriptor as well as a new flag called "FREED". The "file" will not be freed until the last reference is released. But the FREE flag will be set when the event is removed to prevent any more modifications to that event from happening, even if there's still a reference to the event "file" descriptor. Link: https://lore.kernel.org/linux-trace-kernel/20231031000031.1e705592@gandalf.local.home/ Link: https://lore.kernel.org/linux-trace-kernel/20231031122453.7a48b923@gandalf.local.home Cc: stable@vger.kernel.org Cc: Mark Rutland <mark.rutland@arm.com> Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and filter files") Reported-by: Beau Belgrave <beaub@linux.microsoft.com> Tested-by: Beau Belgrave <beaub@linux.microsoft.com> Reviewed-by: Masami Hiramatsu (Google) <mhiramat@kernel.org> Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-11-28net/mlx5_core: Clean driver version and nameLeon Romanovsky
[ Upstream commit 17a7612b99e66d2539341ab4f888f970c2c7f76d ] Remove exposed driver version as it was done in other drivers, so module version will work correctly by displaying the kernel version for which it is compiled. And move mlx5_core module name to general include, so auxiliary drivers will be able to use it as a basis for a name in their device ID tables. Reviewed-by: Parav Pandit <parav@nvidia.com> Reviewed-by: Roi Dayan <roid@nvidia.com> Signed-off-by: Leon Romanovsky <leonro@nvidia.com> Stable-dep-of: 1b2bd0c0264f ("net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28pwm: Fix double shift bugDan Carpenter
[ Upstream commit d27abbfd4888d79dd24baf50e774631046ac4732 ] These enums are passed to set/test_bit(). The set/test_bit() functions take a bit number instead of a shifted value. Passing a shifted value is a double shift bug like doing BIT(BIT(1)). The double shift bug doesn't cause a problem here because we are only checking 0 and 1 but if the value was 5 or above then it can lead to a buffer overflow. Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org> Reviewed-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de> Reviewed-by: Sam Protsenko <semen.protsenko@linaro.org> Signed-off-by: Thierry Reding <thierry.reding@gmail.com> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28net: annotate data-races around sk->sk_dst_pending_confirmEric Dumazet
[ Upstream commit eb44ad4e635132754bfbcb18103f1dcb7058aedd ] This field can be read or written without socket lock being held. Add annotations to avoid load-store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-28net: annotate data-races around sk->sk_tx_queue_mappingEric Dumazet
[ Upstream commit 0bb4d124d34044179b42a769a0c76f389ae973b6 ] This field can be read or written without socket lock being held. Add annotations to avoid load-store tearing. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20netfilter: nft_redir: use `struct nf_nat_range2` throughout and deduplicate ↵Jeremy Sowden
eval call-backs [ Upstream commit 6f56ad1b92328997e1b1792047099df6f8d7acb5 ] `nf_nat_redirect_ipv4` takes a `struct nf_nat_ipv4_multi_range_compat`, but converts it internally to a `struct nf_nat_range2`. Change the function to take the latter, factor out the code now shared with `nf_nat_redirect_ipv6`, move the conversion to the xt_REDIRECT module, and update the ipv4 range initialization in the nft_redir module. Replace a bare hex constant for 127.0.0.1 with a macro. Remove `WARN_ON`. `nf_nat_setup_info` calls `nf_ct_is_confirmed`: /* Can't setup nat info for confirmed ct. */ if (nf_ct_is_confirmed(ct)) return NF_ACCEPT; This means that `ct` cannot be null or the kernel will crash, and implies that `ctinfo` is `IP_CT_NEW` or `IP_CT_RELATED`. nft_redir has separate ipv4 and ipv6 call-backs which share much of their code, and an inet one switch containing a switch that calls one of the others based on the family of the packet. Merge the ipv4 and ipv6 ones into the inet one in order to get rid of the duplicate code. Const-qualify the `priv` pointer since we don't need to write through it. Assign `priv->flags` to the range instead of OR-ing it in. Set the `NF_NAT_RANGE_PROTO_SPECIFIED` flag once during init, rather than on every eval. Signed-off-by: Jeremy Sowden <jeremy@azazel.net> Signed-off-by: Florian Westphal <fw@strlen.de> Stable-dep-of: 80abbe8a8263 ("netfilter: nat: fix ipv6 nat redirect with mapped and scoped addresses") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20Fix termination state for idr_for_each_entry_ul()NeilBrown
[ Upstream commit e8ae8ad479e2d037daa33756e5e72850a7bd37a9 ] The comment for idr_for_each_entry_ul() states after normal termination @entry is left with the value NULL This is not correct in the case where UINT_MAX has an entry in the idr. In that case @entry will be non-NULL after termination. No current code depends on the documentation being correct, but to save future code we should fix it. Also fix idr_for_each_entry_continue_ul(). While this is not documented as leaving @entry as NULL, the mellanox driver appears to depend on it doing so. So make that explicit in the documentation as well as in the code. Fixes: e33d2b74d805 ("idr: fix overflow case for idr_for_each_entry_ul()") Cc: Matthew Wilcox <willy@infradead.org> Cc: Chris Mi <chrism@mellanox.com> Cc: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: NeilBrown <neilb@suse.de> Signed-off-by: David S. Miller <davem@davemloft.net> Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-11-20inet: shrink struct flowi_commonEric Dumazet
[ Upstream commit 1726483b79a72e0150734d5367e4a0238bf8fcff ] I am looking at syzbot reports triggering kernel stack overflows involving a cascade of ipvlan devices. We can save 8 bytes in struct flowi_common. This patch alone will not fix the issue, but is a start. Fixes: 24ba14406c5c ("route: Add multipath_hash in flowi_common to make user-define hash") Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: wenxu <wenxu@ucloud.cn> Reviewed-by: David Ahern <dsahern@kernel.org> Link: https://lore.kernel.org/r/20231025141037.3448203-1-edumazet@google.com Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: Sasha Levin <sashal@kernel.org>