aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/nouveau
AgeCommit message (Collapse)Author
2024-04-17nouveau: fix function cast warningArnd Bergmann
[ Upstream commit 185fdb4697cc9684a02f2fab0530ecdd0c2f15d4 ] Calling a function through an incompatible pointer type causes breaks kcfi, so clang warns about the assignment: drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadowof.c:73:10: error: cast from 'void (*)(const void *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 73 | .fini = (void(*)(void *))kfree, Avoid this with a trivial wrapper. Fixes: c39f472e9f14 ("drm/nouveau: remove symlinks, move core/ to nvkm/ (no code changes)") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240404160234.2923554-1-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26nouveau: reset the bo resource bus info after an evictionDave Airlie
[ Upstream commit f35c9af45ea7a4b1115b193d84858b14d13517fc ] Later attempts to refault the bo won't happen and the whole GPU does to lunch. I think Christian's refactoring of this code out to the driver broke this not very well tested path. Fixes: 141b15e59175 ("drm/nouveau: move io_reserve_lru handling into the driver v5") Cc: Christian König <christian.koenig@amd.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240311072037.287905-1-airlied@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-26drm/ttm: add ttm_resource_fini v2Christian König
[ Upstream commit de3688e469b08be958914674e8b01cb0cea42388 ] Make sure we call the common cleanup function in all implementations of the resource manager. v2: fix missing case in i915, rudimentary kerneldoc, should be filled in more when we add more functionality Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/20220124122514.1832-2-christian.koenig@amd.com Stable-dep-of: 89709105a609 ("drm/vmwgfx: fix a memleak in vmw_gmrid_man_get_node") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01drm/nouveau/instmem: fix uninitialized_var.cocci warningGuo Zhengkui
[ Upstream commit 2046e733e125fa58ed997f3d26d43543faf82c95 ] Fix following coccicheck warning: drivers/gpu/drm/nouveau/nvkm/subdev/instmem/nv50.c:316:11-12: WARNING this kind of initialization is deprecated. `void *map = map` has the same form of uninitialized_var() macro. I remove the redundant assignement. It has been tested with gcc (Debian 8.3.0-6) 8.3.0. The patch which removed uninitialized_var() is: https://lore.kernel.org/all/20121028102007.GA7547@gmail.com/ And there is very few "/* GCC */" comments in the Linux kernel code now. Signed-off-by: Guo Zhengkui <guozhengkui@vivo.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220228142352.18006-1-guozhengkui@vivo.com Stable-dep-of: 3b1ae9b71c2a ("octeontx2-af: Consider the action set by PF") Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-03-01nouveau: fix function cast warningsArnd Bergmann
[ Upstream commit 0affdba22aca5573f9d989bcb1d71d32a6a03efe ] clang-16 warns about casting between incompatible function types: drivers/gpu/drm/nouveau/nvkm/subdev/bios/shadow.c:161:10: error: cast from 'void (*)(const struct firmware *)' to 'void (*)(void *)' converts to incompatible function type [-Werror,-Wcast-function-type-strict] 161 | .fini = (void(*)(void *))release_firmware, This one was done to use the generic shadow_fw_release() function as a callback for struct nvbios_source. Change it to use the same prototype as the other five instances, with a trivial helper function that actually calls release_firmware. Fixes: 70c0f263cc2e ("drm/nouveau/bios: pull in basic vbios subdev, more to come later") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240213095753.455062-1-arnd@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-02-23nouveau/vmm: don't set addr on the fail path to avoid warningDave Airlie
commit cacea81390fd8c8c85404e5eb2adeb83d87a912e upstream. nvif_vmm_put gets called if addr is set, but if the allocation fails we don't need to call put, otherwise we get a warning like [523232.435671] ------------[ cut here ]------------ [523232.435674] WARNING: CPU: 8 PID: 1505697 at drivers/gpu/drm/nouveau/nvif/vmm.c:68 nvif_vmm_put+0x72/0x80 [nouveau] [523232.435795] Modules linked in: uinput rfcomm snd_seq_dummy snd_hrtimer nf_conntrack_netbios_ns nf_conntrack_broadcast nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink qrtr bnep sunrpc binfmt_misc intel_rapl_msr intel_rapl_common intel_uncore_frequency intel_uncore_frequency_common isst_if_common iwlmvm nfit libnvdimm vfat fat x86_pkg_temp_thermal intel_powerclamp mac80211 snd_soc_avs snd_soc_hda_codec coretemp snd_hda_ext_core snd_soc_core snd_hda_codec_realtek kvm_intel snd_hda_codec_hdmi snd_compress snd_hda_codec_generic ac97_bus snd_pcm_dmaengine snd_hda_intel libarc4 snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec kvm iwlwifi snd_hda_core btusb snd_hwdep btrtl snd_seq btintel irqbypass btbcm rapl snd_seq_device eeepc_wmi btmtk intel_cstate iTCO_wdt cfg80211 snd_pcm asus_wmi bluetooth intel_pmc_bxt iTCO_vendor_support snd_timer ledtrig_audio pktcdvd snd mei_me [523232.435828] sparse_keymap intel_uncore i2c_i801 platform_profile wmi_bmof mei pcspkr ioatdma soundcore i2c_smbus rfkill idma64 dca joydev acpi_tad loop zram nouveau drm_ttm_helper ttm video drm_exec drm_gpuvm gpu_sched crct10dif_pclmul i2c_algo_bit nvme crc32_pclmul crc32c_intel drm_display_helper polyval_clmulni nvme_core polyval_generic e1000e mxm_wmi cec ghash_clmulni_intel r8169 sha512_ssse3 nvme_common wmi pinctrl_sunrisepoint uas usb_storage ip6_tables ip_tables fuse [523232.435849] CPU: 8 PID: 1505697 Comm: gnome-shell Tainted: G W 6.6.0-rc7-nvk-uapi+ #12 [523232.435851] Hardware name: System manufacturer System Product Name/ROG STRIX X299-E GAMING II, BIOS 1301 09/24/2021 [523232.435852] RIP: 0010:nvif_vmm_put+0x72/0x80 [nouveau] [523232.435934] Code: 00 00 48 89 e2 be 02 00 00 00 48 c7 04 24 00 00 00 00 48 89 44 24 08 e8 fc bf ff ff 85 c0 75 0a 48 c7 43 08 00 00 00 00 eb b3 <0f> 0b eb f2 e8 f5 c9 b2 e6 0f 1f 44 00 00 90 90 90 90 90 90 90 90 [523232.435936] RSP: 0018:ffffc900077ffbd8 EFLAGS: 00010282 [523232.435937] RAX: 00000000fffffffe RBX: ffffc900077ffc00 RCX: 0000000000000010 [523232.435938] RDX: 0000000000000010 RSI: ffffc900077ffb38 RDI: ffffc900077ffbd8 [523232.435940] RBP: ffff888e1c4f2140 R08: 0000000000000000 R09: 0000000000000000 [523232.435940] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888503811800 [523232.435941] R13: ffffc900077ffca0 R14: ffff888e1c4f2140 R15: ffff88810317e1e0 [523232.435942] FS: 00007f933a769640(0000) GS:ffff88905fa00000(0000) knlGS:0000000000000000 [523232.435943] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [523232.435944] CR2: 00007f930bef7000 CR3: 00000005d0322001 CR4: 00000000003706e0 [523232.435945] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [523232.435946] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [523232.435964] Call Trace: [523232.435965] <TASK> [523232.435966] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436051] ? __warn+0x81/0x130 [523232.436055] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436138] ? report_bug+0x171/0x1a0 [523232.436142] ? handle_bug+0x3c/0x80 [523232.436144] ? exc_invalid_op+0x17/0x70 [523232.436145] ? asm_exc_invalid_op+0x1a/0x20 [523232.436149] ? nvif_vmm_put+0x72/0x80 [nouveau] [523232.436230] ? nvif_vmm_put+0x64/0x80 [nouveau] [523232.436342] nouveau_vma_del+0x80/0xd0 [nouveau] [523232.436506] nouveau_vma_new+0x1a0/0x210 [nouveau] [523232.436671] nouveau_gem_object_open+0x1d0/0x1f0 [nouveau] [523232.436835] drm_gem_handle_create_tail+0xd1/0x180 [523232.436840] drm_prime_fd_to_handle_ioctl+0x12e/0x200 [523232.436844] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436847] drm_ioctl_kernel+0xd3/0x180 [523232.436849] drm_ioctl+0x26d/0x4b0 [523232.436851] ? __pfx_drm_prime_fd_to_handle_ioctl+0x10/0x10 [523232.436855] nouveau_drm_ioctl+0x5a/0xb0 [nouveau] [523232.437032] __x64_sys_ioctl+0x94/0xd0 [523232.437036] do_syscall_64+0x5d/0x90 [523232.437040] ? syscall_exit_to_user_mode+0x2b/0x40 [523232.437044] ? do_syscall_64+0x6c/0x90 [523232.437046] entry_SYSCALL_64_after_hwframe+0x6e/0xd8 Reported-by: Faith Ekstrand <faith.ekstrand@collabora.com> Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20240117213852.295565-1-airlied@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-01-25drm/nouveau/fence:: fix warning directly dereferencing a rcu pointerAbhinav Singh
[ Upstream commit 5f35a624c1e30b5bae5023b3c256e94e0ad4f806 ] Fix a sparse warning with this message "warning:dereference of noderef expression". In this context it means we are dereferencing a __rcu tagged pointer directly. We should not be directly dereferencing a rcu pointer. To get a normal (non __rcu tagged pointer) from a __rcu tagged pointer we are using the function unrcu_pointer(...). The non __rcu tagged pointer then can be dereferenced just like a normal pointer. I tested with qemu with this command qemu-system-x86_64 \ -m 2G \ -smp 2 \ -kernel bzImage \ -append "console=ttyS0 root=/dev/sda earlyprintk=serial net.ifnames=0" \ -drive file=bullseye.img,format=raw \ -net user,host=10.0.2.10,hostfwd=tcp:127.0.0.1:10021-:22 \ -net nic,model=e1000 \ -enable-kvm \ -nographic \ -pidfile vm.pid \ 2>&1 | tee vm.log with lockdep enabled. Fixes: 0ec5f02f0e2c ("drm/nouveau: prevent stale fence->channel pointers, and protect with rcu") Signed-off-by: Abhinav Singh <singhabhinav9051571833@gmail.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231113191303.3277733-1-singhabhinav9051571833@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2024-01-25nouveau/tu102: flush all pdbs on vmm flushDave Airlie
[ Upstream commit cb9c919364653eeafb49e7ff5cd32f1ad64063ac ] This is a hack around a bug exposed with the GSP code, I'm not sure what is happening exactly, but it appears some of our flushes don't result in proper tlb invalidation for out BAR2 and we get a BAR2 fault from GSP and it all dies. Signed-off-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Danilo Krummrich <dakr@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20231130010852.4034774-1-airlied@gmail.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-08-16drm/nouveau/disp: Revert a NULL check inside nouveau_connector_get_modesKarol Herbst
commit d5712cd22b9cf109fded1b7f178f4c1888c8b84b upstream. The original commit adding that check tried to protect the kenrel against a potential invalid NULL pointer access. However we call nouveau_connector_detect_depth once without a native_mode set on purpose for non LVDS connectors and this broke DP support in a few cases. Cc: Olaf Skibbe <news@kravcenko.com> Cc: Lyude Paul <lyude@redhat.com> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/238 Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/245 Fixes: 20a2ce87fbaf8 ("drm/nouveau/dp: check for NULL nv_connector->native_mode") Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230805101813.2603989-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-08-16drm/nouveau/gr: enable memory loads on helper invocation on all channelsKarol Herbst
commit 1cb9e2ef66d53b020842b18762e30d0eb4384de8 upstream. We have a lurking bug where Fragment Shader Helper Invocations can't load from memory. But this is actually required in OpenGL and is causing random hangs or failures in random shaders. It is unknown how widespread this issue is, but shaders hitting this can end up with infinite loops. We enable those only on all Kepler and newer GPUs where we use our own Firmware. Nvidia's firmware provides a way to set a kernelspace controlled list of mmio registers in the gr space from push buffers via MME macros. v2: drop code for gm200 and newer. Cc: Ben Skeggs <bskeggs@redhat.com> Cc: David Airlie <airlied@gmail.com> Cc: nouveau@lists.freedesktop.org Cc: stable@vger.kernel.org # 4.19+ Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230622152017.2512101-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-06-21drm/nouveau: add nv_encoder pointer check for NULLNatalia Petrova
[ Upstream commit 55b94bb8c42464bad3d2217f6874aa1a85664eac ] Pointer nv_encoder could be dereferenced at nouveau_connector.c in case it's equal to NULL by jumping to goto label. This patch adds a NULL-check to avoid it. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: 3195c5f9784a ("drm/nouveau: set encoder for lvds") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Reviewed-by: Lyude Paul <lyude@redhat.com> [Fixed patch title] Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512103320.82234-1-n.petrova@fintech.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21drm/nouveau/dp: check for NULL nv_connector->native_modeNatalia Petrova
[ Upstream commit 20a2ce87fbaf81e4c3dcb631d738e423959eb320 ] Add checking for NULL before calling nouveau_connector_detect_depth() in nouveau_connector_get_modes() function because nv_connector->native_mode could be dereferenced there since connector pointer passed to nouveau_connector_detect_depth() and the same value of nv_connector->native_mode is used there. Found by Linux Verification Center (linuxtesting.org) with SVACE. Fixes: d4c2c99bdc83 ("drm/nouveau/dp: remove broken display depth function, use the improved one") Signed-off-by: Natalia Petrova <n.petrova@fintech.ru> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230512111526.82408-1-n.petrova@fintech.ru Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21drm/nouveau: don't detect DSM for non-NVIDIA deviceRatchanan Srirattanamet
[ Upstream commit 11d24327c2d7ad7f24fcc44fb00e1fa91ebf6525 ] The call site of nouveau_dsm_pci_probe() uses single set of output variables for all invocations. So, we must not write anything to them unless it's an NVIDIA device. Otherwise, if we are called with another device after the NVIDIA device, we'll clober the result of the NVIDIA device. For example, if the other device doesn't have _PR3 resources, the detection later would miss the presence of power resource support, and the rest of the code will keep using Optimus DSM, breaking power management for that machine. Also, because we're detecting NVIDIA's DSM, it doesn't make sense to run this detection on a non-NVIDIA device anyway. Thus, check at the beginning of the detection code if this is an NVIDIA card, and just return if it isn't. This, together with commit d22915d22ded ("drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETED") developed independently and landed earlier, fixes runtime power management of the NVIDIA card in Lenovo Legion 5-15ARH05. Without this patch, the GPU resumption code will "timeout", sometimes hanging userspace. As a bonus, we'll also stop preventing _PR3 usage from the bridge for unrelated devices, which is always nice, I guess. Fixes: ccfc2d5cdb02 ("drm/nouveau: Use generic helper to check _PR3 presence") Signed-off-by: Ratchanan Srirattanamet <peathot@hotmail.com> Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/79 Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/DM6PR19MB2780805D4BE1E3F9B3AC96D0BC409@DM6PR19MB2780.namprd19.prod.outlook.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-06-21nouveau: fix client work fence deletion raceDave Airlie
commit c8a5d5ea3ba6a18958f8d76430e4cd68eea33943 upstream. This seems to have existed for ever but is now more apparant after commit 9bff18d13473 ("drm/ttm: use per BO cleanup workers") My analysis: two threads are running, one in the irq signalling the fence, in dma_fence_signal_timestamp_locked, it has done the DMA_FENCE_FLAG_SIGNALLED_BIT setting, but hasn't yet reached the callbacks. The second thread in nouveau_cli_work_ready, where it sees the fence is signalled, so then puts the fence, cleanups the object and frees the work item, which contains the callback. Thread one goes again and tries to call the callback and causes the use-after-free. Proposed fix: lock the fence signalled check in nouveau_cli_work_ready, so either the callbacks are done or the memory is freed. Reviewed-by: Karol Herbst <kherbst@redhat.com> Fixes: 11e451e74050 ("drm/nouveau: remove fence wait code from deferred client work handler") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://lore.kernel.org/dri-devel/20230615024008.1600281-1-airlied@gmail.com/ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-04-13drm/nouveau/disp: Support more modes by checking with lower bpcKarol Herbst
commit 7f67aa097e875c87fba024e850cf405342300059 upstream. This allows us to advertise more modes especially on HDR displays. Fixes using 4K@60 modes on my TV and main display both using a HDMI to DP adapter. Also fixes similar issues for users running into this. Cc: stable@vger.kernel.org # 5.10+ Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230330223938.4025569-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2023-03-17drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototypeJiri Slaby (SUSE)
[ Upstream commit 3638a820c5c3b52f327cebb174fd4274bee08aa7 ] gcc-13 warns about mismatching types for enums. That revealed switched arguments of nv50_wndw_new_(): drivers/gpu/drm/nouveau/dispnv50/wndw.c:696:1: error: conflicting types for 'nv50_wndw_new_' due to enum/integer mismatch; have 'int(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *, int, const u32 *, u32, enum nv50_disp_interlock_type, u32, struct nv50_wndw **)' drivers/gpu/drm/nouveau/dispnv50/wndw.h:36:5: note: previous declaration of 'nv50_wndw_new_' with type 'int(const struct nv50_wndw_func *, struct drm_device *, enum drm_plane_type, const char *, int, const u32 *, enum nv50_disp_interlock_type, u32, u32, struct nv50_wndw **)' It can be barely visible, but the declaration says about the parameters in the middle: enum nv50_disp_interlock_type, u32 interlock_data, u32 heads, While the definition states differently: u32 heads, enum nv50_disp_interlock_type interlock_type, u32 interlock_data, Unify/fix the declaration to match the definition. Fixes: 53e0a3e70de6 ("drm/nouveau/kms/nv50-: simplify tracking of channel interlocks") Cc: Martin Liska <mliska@suse.cz> Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: Lyude Paul <lyude@redhat.com> Cc: David Airlie <airlied@gmail.com> Cc: Daniel Vetter <daniel@ffwll.ch> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20221031114229.10289-1-jirislaby@kernel.org Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-03-17drm/nouveau/kms/nv50-: remove unused functionsBen Skeggs
[ Upstream commit 89ed996b888faaf11c69bb4cbc19f21475c9050e ] Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Dave Airlie <airlied@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Stable-dep-of: 3638a820c5c3 ("drm/nouveau/kms/nv50: fix nv50_wndw_new_ prototype") Signed-off-by: Sasha Levin <sashal@kernel.org>
2023-02-22drm/nouveau/devinit/tu102-: wait for GFW_BOOT_PROGRESS == COMPLETEDBen Skeggs
[ Upstream commit d22915d22ded21fd5b24b60d174775789f173997 ] Starting from Turing, the driver is no longer responsible for initiating DEVINIT when required as the GPU started loading a FW image from ROM and executing DEVINIT itself after power-on. However - we apparently still need to wait for it to complete. This should correct some issues with runpm on some systems, where we get control of the HW before it's been fully reinitialised after resume from suspend. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20230130223715.1831509-1-bskeggs@redhat.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26drm/nouveau/nouveau_bo: fix potential memory leak in nouveau_bo_alloc()Jianglei Nie
[ Upstream commit 6dc548745d5b5102e3c53dc5097296ac270b6c69 ] nouveau_bo_alloc() allocates a memory chunk for "nvbo" with kzalloc(). When some error occurs, "nvbo" should be released. But when WARN_ON(pi < 0)) equals true, the function return ERR_PTR without releasing the "nvbo", which will lead to a memory leak. We should release the "nvbo" with kfree() if WARN_ON(pi < 0)) equals true. Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220705094306.2244103-1-niejianglei2021@163.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-10-26drm/nouveau: fix a use-after-free in nouveau_gem_prime_import_sg_table()Jianglei Nie
commit 540dfd188ea2940582841c1c220bd035a7db0e51 upstream. nouveau_bo_init() is backed by ttm_bo_init() and ferries its return code back to the caller. On failures, ttm will call nouveau_bo_del_ttm() and free the memory.Thus, when nouveau_bo_init() returns an error, the gem object has already been released. Then the call to nouveau_bo_ref() will use the freed "nvbo->bo" and lead to a use-after-free bug. We should delete the call to nouveau_bo_ref() to avoid the use-after-free. Signed-off-by: Jianglei Nie <niejianglei2021@163.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 019cbd4a4feb ("drm/nouveau: Initialize GEM object before TTM object") Cc: Thierry Reding <treding@nvidia.com> Cc: <stable@vger.kernel.org> # v5.4+ Link: https://patchwork.freedesktop.org/patch/msgid/20220705132546.2247677-1-niejianglei2021@163.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-10-26drm/nouveau/kms/nv140-: Disable interlacingLyude Paul
commit 8ba9249396bef37cb68be9e8dee7847f1737db9d upstream. As it turns out: while Nvidia does actually have interlacing knobs on their GPU still pretty much no current GPUs since Volta actually support it. Trying interlacing on these GPUs will result in NVDisplay being quite unhappy like so: nouveau 0000:1f:00.0: disp: chid 0 stat 00004802 reason 4 [INVALID_ARG] mthd 2008 data 00000001 code 00080000 nouveau 0000:1f:00.0: disp: chid 0 stat 10005080 reason 5 [INVALID_STATE] mthd 0200 data 00000001 code 00000001 So let's fix this by following the same behavior Nvidia's driver does and disable interlacing entirely. Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org Reviewed-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220816180436.156310-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-31nouveau: explicitly wait on the fence in nouveau_bo_move_m2mfKarol Herbst
commit 6b04ce966a738ecdd9294c9593e48513c0dc90aa upstream. It is a bit unlcear to us why that's helping, but it does and unbreaks suspend/resume on a lot of GPUs without any known drawbacks. Cc: stable@vger.kernel.org # v5.15+ Closes: https://gitlab.freedesktop.org/drm/nouveau/-/issues/156 Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220819200928.401416-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-25drm/nouveau: recognise GA103Karol Herbst
commit c20ee5749a3f688d9bab83a3b09b75587153ff13 upstream. Appears to be ok with general GA10x code. Signed-off-by: Karol Herbst <kherbst@redhat.com> Cc: <stable@vger.kernel.org> # v5.15+ Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220803142745.2679510-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/nouveau/kms: Fix failure path for creating DP connectorsLyude Paul
commit ca0367ca5d9216644b41f86348d6661f8d9e32d8 upstream. It looks like that when we moved nouveau over to using drm_dp_aux_init() and registering it's aux bus during late connector registration, we totally forgot to fix the failure codepath in nouveau_connector_create() - as it still seems to assume that drm_dp_aux_init() can fail (it can't). So, let's fix that and also add a missing check to ensure that we've properly allocated nv_connector->aux.name while we're at it. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: David Airlie <airlied@linux.ie> Fixes: fd43ad9d47e7 ("drm/nouveau/kms/nv50-: Move AUX adapter reg to connector late register/early unregister") Cc: <stable@vger.kernel.org> # v5.14+ Link: https://patchwork.freedesktop.org/patch/msgid/20220526204313.656473-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/nouveau/acpi: Don't print error when we get -EINPROGRESS from pm_runtimeLyude Paul
commit 53c26181950ddc3c8ace3c0939c89e9c4d8deeb9 upstream. Since this isn't actually a failure. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: David Airlie <airlied@linux.ie> Fixes: 79e765ad665d ("drm/nouveau/drm/nouveau: Prevent handling ACPI HPD events too early") Cc: <stable@vger.kernel.org> # v4.19+ Link: https://patchwork.freedesktop.org/patch/msgid/20220714174234.949259-2-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/nouveau: Don't pm_runtime_put_sync(), only pm_runtime_put_autosuspend()Lyude Paul
commit c96cfaf8fc02d4bb70727dfa7ce7841a3cff9be2 upstream. While trying to fix another issue, it occurred to me that I don't actually think there is any situation where we want pm_runtime_put() in nouveau to be synchronous. In fact, this kind of just seems like it would cause issues where we may unexpectedly block a thread we don't expect to be blocked. So, let's only use pm_runtime_put_autosuspend(). Changes since v1: * Use pm_runtime_put_autosuspend(), not pm_runtime_put() Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: David Airlie <airlied@linux.ie> Fixes: 3a6536c51d5d ("drm/nouveau: Intercept ACPI_VIDEO_NOTIFY_PROBE") Cc: Hans de Goede <hdegoede@redhat.com> Cc: <stable@vger.kernel.org> # v4.10+ Link: https://patchwork.freedesktop.org/patch/msgid/20220714174234.949259-3-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-17drm/nouveau: fix another off-by-one in nvbios_addrTimur Tabi
commit c441d28945fb113220d48d6c86ebc0b090a2b677 upstream. This check determines whether a given address is part of image 0 or image 1. Image 1 starts at offset image0_size, so that address should be included. Fixes: 4d4e9907ff572 ("drm/nouveau/bios: guard against out-of-bounds accesses to image") Cc: <stable@vger.kernel.org> # v4.8+ Signed-off-by: Timur Tabi <ttabi@nvidia.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220511163716.3520591-1-ttabi@nvidia.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-08-03nouveau/svm: Fix to migrate all requested pagesAlistair Popple
commit 66cee9097e2b74ff3c8cc040ce5717c521a0c3fa upstream. Users may request that pages from an OpenCL SVM allocation be migrated to the GPU with clEnqueueSVMMigrateMem(). In Nouveau this will call into nouveau_dmem_migrate_vma() to do the migration. If the total range to be migrated exceeds SG_MAX_SINGLE_ALLOC the pages will be migrated in chunks of size SG_MAX_SINGLE_ALLOC. However a typo in updating the starting address means that only the first chunk will get migrated. Fix the calculation so that the entire range will get migrated if possible. Signed-off-by: Alistair Popple <apopple@nvidia.com> Fixes: e3d8b0890469 ("drm/nouveau/svm: map pages after migration") Reviewed-by: Ralph Campbell <rcampbell@nvidia.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220720062745.960701-1-apopple@nvidia.com Cc: <stable@vger.kernel.org> # v5.8+ Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09drm/nouveau/kms/nv50-: atom: fix an incorrect NULL check on list iteratorXiaomeng Tong
commit 6ce4431c7ba7954c4fa6a96ce16ca1b2943e1a83 upstream. The bug is here: return encoder; The list iterator value 'encoder' will *always* be set and non-NULL by drm_for_each_encoder_mask(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element found. Otherwise it will bypass some NULL checks and lead to invalid memory access passing the check. To fix this bug, just return 'encoder' when found, otherwise return NULL. Cc: stable@vger.kernel.org Fixes: 12885ecbfe62d ("drm/nouveau/kms/nvd9-: Add CRC support") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Reviewed-by: Lyude Paul <lyude@redhat.com> [Changed commit title] Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220327073925.11121-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09drm/nouveau/clk: Fix an incorrect NULL check on list iteratorXiaomeng Tong
commit 1c3b2a27def609473ed13b1cd668cb10deab49b4 upstream. The bug is here: if (nvkm_cstate_valid(clk, cstate, max_volt, clk->temp)) return cstate; The list iterator value 'cstate' will *always* be set and non-NULL by list_for_each_entry_from_reverse(), so it is incorrect to assume that the iterator value will be unchanged if the list is empty or no element is found (In fact, it will be a bogus pointer to an invalid structure object containing the HEAD). Also it missed a NULL check at callsite and may lead to invalid memory access after that. To fix this bug, just return 'encoder' when found, otherwise return NULL. And add the NULL check. Cc: stable@vger.kernel.org Fixes: 1f7f3d91ad38a ("drm/nouveau/clk: Respect voltage limits in nvkm_cstate_prog") Signed-off-by: Xiaomeng Tong <xiam0nd.tong@gmail.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220327075824.11806-1-xiam0nd.tong@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-06-09drm/nouveau/subdev/bus: Ratelimit logging for fault errorsLyude Paul
commit 9887bda0c831df0c044d6de147d002e48024fb4a upstream. There's plenty of ways to fudge the GPU when developing on nouveau by mistake, some of which can result in nouveau seriously spamming dmesg with fault errors. This can be somewhat annoying, as it can quickly overrun the message buffer (or your terminal emulator's buffer) and get rid of actually useful feedback from the driver. While working on my new atomic only MST branch, I ran into this issue a couple of times. So, let's fix this by adding nvkm_error_ratelimited(), and using it to ratelimit errors from faults. This should be fine for developers, since it's nearly always only the first few faults that we care about seeing. Plus, you can turn off rate limiting in the kernel if you really need to. Signed-off-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Cc: stable@vger.kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20220429195350.85620-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18drm/nouveau/tegra: Stop using iommu_present()Robin Murphy
commit 87fd2b091fb33871a7f812658a0971e8e26f903f upstream. Even if some IOMMU has registered itself on the platform "bus", that doesn't necessarily mean it provides translation for the device we care about. Replace iommu_present() with a more appropriate check. Signed-off-by: Robin Murphy <robin.murphy@arm.com> Reviewed-by: Lyude Paul <lyude@redhat.com> [added cc for stable] Signed-off-by: Lyude Paul <lyude@redhat.com> Cc: stable@vger.kernel.org # v5.0+ Link: https://patchwork.freedesktop.org/patch/msgid/70d40ea441da3663c2824d54102b471e9a621f8a.1649168494.git.robin.murphy@arm.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-05-18drm/nouveau: Fix a potential theorical leak in nouveau_get_backlight_name()Christophe JAILLET
[ Upstream commit ab244be47a8f111bc82496a8a20c907236e37f95 ] If successful ida_simple_get() calls are not undone when needed, some additional memory may be allocated and wasted. Here, an ID between 0 and MAX_INT is required. If this ID is >=100, it is not taken into account and is wasted. It should be released. Instead of calling ida_simple_remove(), take advantage of the 'max' parameter to require the ID not to be too big. Should it be too big, it is not allocated and don't need to be freed. While at it, use ida_alloc_xxx()/ida_free() instead to ida_simple_get()/ida_simple_remove(). The latter is deprecated and more verbose. Fixes: db1a0ae21461 ("drm/nouveau/bl: Assign different names to interfaces") Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Reviewed-by: Lyude Paul <lyude@redhat.com> [Fixed formatting warning from checkpatch] Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/9ba85bca59df6813dc029e743a836451d5173221.1644386541.git.christophe.jaillet@wanadoo.fr Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-13drm/nouveau/pmu: Add missing callbacks for Tegra devicesKarol Herbst
commit 38d4e5cf5b08798f093374e53c2f4609d5382dd5 upstream. Fixes a crash booting on those platforms with nouveau. Fixes: 4cdd2450bf73 ("drm/nouveau/pmu/gm200-: use alternate falcon reset sequence") Cc: Ben Skeggs <bskeggs@redhat.com> Cc: Karol Herbst <kherbst@redhat.com> Cc: dri-devel@lists.freedesktop.org Cc: nouveau@lists.freedesktop.org Cc: <stable@vger.kernel.org> # v5.17+ Signed-off-by: Karol Herbst <kherbst@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220322124800.2605463-1-kherbst@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08drm/nouveau/acr: Fix undefined behavior in nvkm_acr_hsfw_load_bl()Zhou Qingyang
[ Upstream commit 2343bcdb4747d4f418a4daf2e898b94f86c24a59 ] In nvkm_acr_hsfw_load_bl(), the return value of kmalloc() is directly passed to memcpy(), which could lead to undefined behavior on failure of kmalloc(). Fix this bug by using kmemdup() instead of kmalloc()+memcpy(). This bug was found by a static analyzer. Builds with 'make allyesconfig' show no new warnings, and our static analyzer no longer warns about this code. Fixes: 22dcda45a3d1 ("drm/nouveau/acr: implement new subdev to replace "secure boot"") Signed-off-by: Zhou Qingyang <zhou1615@umn.edu> Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Lyude Paul <lyude@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220124165856.57022-1-zhou1615@umn.edu Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-04-08drm/nouveau/backlight: Just set all backlight types as RAWLyude Paul
commit b21a142fd2055d8276169efcc95b624ff908a341 upstream. Currently we can get a warning on systems with eDP backlights like so: nv_backlight: invalid backlight type WARNING: CPU: 4 PID: 454 at drivers/video/backlight/backlight.c:420 backlight_device_register+0x226/0x250 This happens as a result of us not filling out props.type for the eDP backlight, even though we do it for all other backlight types. Since nothing in our driver uses anything but BACKLIGHT_RAW, let's take the props\.type assignments out of the codepaths for individual backlight types and just set it unconditionally to prevent this from happening again. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support for nouveau") Cc: <stable@vger.kernel.org> # v5.15+ Reviewed-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220204193319.451119-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-04-08drm/nouveau/backlight: Fix LVDS backlight detection on some laptopsLyude Paul
commit 6b0076540faffd47f5a899bf12f3528c4f0e726b upstream. It seems that some laptops will report having both an eDP and LVDS connector, even though only the LVDS connector is actually hooked up. This can lead to issues with backlight registration if the eDP connector ends up getting registered before the LVDS connector, as the backlight device will then be registered to the eDP connector instead of the LVDS connector. So, fix this by only registering the backlight on connectors that are reported as being connected. Signed-off-by: Lyude Paul <lyude@redhat.com> Fixes: 6eca310e8924 ("drm/nouveau/kms/nv50-: Add basic DPCD backlight support for nouveau") Bugzilla: https://gitlab.freedesktop.org/drm/nouveau/-/issues/137 Cc: <stable@vger.kernel.org> # v5.15+ Reviewed-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220204180504.328999-1-lyude@redhat.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-23drm/nouveau/pmu/gm200-: use alternate falcon reset sequenceBen Skeggs
commit 4cdd2450bf739bada353e82d27b00db9af8c3001 upstream. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-02-08drm/nouveau: fix off by one in BIOS boundary checkingNick Lopez
commit 1b777d4d9e383d2744fc9b3a09af6ec1893c8b1a upstream. Bounds checking when parsing init scripts embedded in the BIOS reject access to the last byte. This causes driver initialization to fail on Apple eMac's with GeForce 2 MX GPUs, leaving the system with no working console. This is probably only seen on OpenFirmware machines like PowerPC Macs because the BIOS image provided by OF is only the used parts of the ROM, not a power-of-two blocks read from PCI directly so PCs always have empty bytes at the end that are never accessed. Signed-off-by: Nick Lopez <github@glowingmonkey.org> Fixes: 4d4e9907ff572 ("drm/nouveau/bios: guard against out-of-bounds accesses to image") Cc: <stable@vger.kernel.org> # v4.10+ Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20220122081906.2633061-1-github@glowingmonkey.org Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27drm/nouveau/kms/nv04: use vzalloc for nv04_displayIlia Mirkin
commit bd6e07e72f37f34535bec7eebc807e5fcfe37b43 upstream. The struct is giant, and triggers an order-7 allocation (512K). There is no reason for this to be kmalloc-type memory, so switch to vmalloc. This should help loading nouveau on low-memory and/or long-running systems. Reported-by: Nathan E. Egge <unlord@xiph.org> Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: stable@vger.kernel.org Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2022-01-27drm/nouveau/pmu/gm200-: avoid touching PMU outside of DEVINIT/PREOS/ACRBen Skeggs
[ Upstream commit 1d2271d2fb85e54bfc9630a6c30ac0feb9ffb983 ] There have been reports of the WFI timing out on some boards, and a patch was proposed to just remove it. This stuff is rather fragile, and I believe the WFI might be needed with our FW prior to GM200. However, we probably should not be touching PMU during init on GPUs where we depend on NVIDIA FW, outside of limited circumstances, so this should be a somewhat safer change that achieves the desired result. Reported-by: Diego Viola <diego.viola@gmail.com> Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/10 Signed-off-by: Sasha Levin <sashal@kernel.org>
2022-01-05drm/nouveau: wait for the exclusive fence after the shared ones v2Christian König
commit 67f74302f45d5d862f22ced3297624e50ac352f0 upstream. Always waiting for the exclusive fence resulted on some performance regressions. So try to wait for the shared fences first, then the exclusive fence should always be signaled already. v2: fix incorrectly placed "(", add some comment why we do this. Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Stefan Fritsch <sf@sfritsch.de> Tested-by: Dan Moulding <dmoulding@me.com> Acked-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Christian König <christian.koenig@amd.com> Cc: <stable@vger.kernel.org> Link: https://patchwork.freedesktop.org/patch/msgid/20211209102335.18321-1-christian.koenig@amd.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-12-01drm/nouveau/acr: fix a couple NULL vs IS_ERR() checksDan Carpenter
[ Upstream commit b371fd131fcec59f6165c80778bdc2cd1abd616b ] The nvkm_acr_lsfw_add() function never returns NULL. It returns error pointers on error. Fixes: 22dcda45a3d1 ("drm/nouveau/acr: implement new subdev to replace "secure boot"") Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211118111314.GB1147@kili Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-12-01drm/nouveau: recognise GA106Ben Skeggs
commit 46741e4f593ff1bd0e4a140ab7e566701946484b upstream. I've got HW now, appears to work as expected so far. Signed-off-by: Ben Skeggs <bskeggs@redhat.com> Cc: <stable@vger.kernel.org> # 5.14+ Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20211118030413.2610-1-skeggsb@gmail.com Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/nouveau: clean up all clients on device removalJeremy Cline
commit f55aaf63bde0d0336c3823bb3713bd4a464abbcf upstream. The postclose handler can run after the device has been removed (or the driver has been unbound) since userspace clients are free to hold the file open as long as they want. Because the device removal callback frees the entire nouveau_drm structure, any reference to it in the postclose handler will result in a use-after-free. To reproduce this, one must simply open the device file, unbind the driver (or physically remove the device), and then close the device file. This was found and can be reproduced easily with the IGT core_hotunplug tests. To avoid this, all clients are cleaned up in the device finalization rather than deferring it to the postclose handler, and the postclose handler is protected by a critical section which ensures the drm_dev_unplug() and the postclose handler won't race. This is not an ideal fix, since as I understand the proposed plan for the kernel<->userspace interface for hotplug support, destroying the client before the file is closed will cause problems. However, I believe to properly fix this issue, the lifetime of the nouveau_drm structure needs to be extended to match the drm_device, and this proved to be a rather invasive change. Thus, I've broken this out so the fix can be easily backported. This fixes with the two previous commits CVE-2020-27820 (Karol). Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201125202648.5220-4-jcline@redhat.com Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/nouveau: use drm_dev_unplug() during device removalJeremy Cline
commit aff2299e0d81b26304ccc6a1ec0170e437f38efc upstream. Nouveau does not currently support hot-unplugging, but it still makes sense to switch from drm_dev_unregister() to drm_dev_unplug(). drm_dev_unplug() calls drm_dev_unregister() after marking the device as unplugged, but only after any device critical sections are finished. Since nouveau isn't using drm_dev_enter() and drm_dev_exit(), there are no critical sections so this is nearly functionally equivalent. However, the DRM layer does check to see if the device is unplugged, and if it is returns appropriate error codes. In the future nouveau can add critical sections in order to truly support hot-unplugging. Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201125202648.5220-2-jcline@redhat.com Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/nouveau: Add a dedicated mutex for the clients listJeremy Cline
commit abae9164a421bc4a41a3769f01ebcd1f9d955e0e upstream. Rather than protecting the nouveau_drm clients list with the lock within the "client" nouveau_cli, add a dedicated lock to serialize access to the list. This is both clearer and necessary to avoid lockdep being upset with us when we need to iterate through all the clients in the list and potentially lock their mutex, which is the same class as the lock protecting the entire list. Cc: stable@vger.kernel.org # 5.4+ Signed-off-by: Jeremy Cline <jcline@redhat.com> Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Tested-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20201125202648.5220-3-jcline@redhat.com Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2021-11-25drm/nouveau: hdmigv100.c: fix corrupted HDMI Vendor InfoFrameHans Verkuil
[ Upstream commit 3cc1ae1fa70ab369e4645e38ce335a19438093ad ] gv100_hdmi_ctrl() writes vendor_infoframe.subpack0_high to 0x6f0110, and then overwrites it with 0. Just drop the overwrite with 0, that's clearly a mistake. Because of this issue the HDMI VIC is 0 instead of 1 in the HDMI Vendor InfoFrame when transmitting 4kp30. Signed-off-by: Hans Verkuil <hverkuil-cisco@xs4all.nl> Fixes: 290ffeafcc1a ("drm/nouveau/disp/gv100: initial support") Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/3d3bd0f7-c150-2479-9350-35d394ee772d@xs4all.nl Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/ttm: remove ttm_bo_vm_insert_huge()Jason Gunthorpe
[ Upstream commit 0d979509539ed1df883a30d442177ca7be609565 ] The huge page functionality in TTM does not work safely because PUD and PMD entries do not have a special bit. get_user_pages_fast() considers any page that passed pmd_huge() as usable: if (unlikely(pmd_trans_huge(pmd) || pmd_huge(pmd) || pmd_devmap(pmd))) { And vmf_insert_pfn_pmd_prot() unconditionally sets entry = pmd_mkhuge(pfn_t_pmd(pfn, prot)); eg on x86 the page will be _PAGE_PRESENT | PAGE_PSE. As such gup_huge_pmd() will try to deref a struct page: head = try_grab_compound_head(pmd_page(orig), refs, flags); and thus crash. Thomas further notices that the drivers are not expecting the struct page to be used by anything - in particular the refcount incr above will cause them to malfunction. Thus everything about this is not able to fully work correctly considering GUP_fast. Delete it entirely. It can return someday along with a proper PMD/PUD_SPECIAL bit in the page table itself to gate GUP_fast. Fixes: 314b6580adc5 ("drm/ttm, drm/vmwgfx: Support huge TTM pagefaults") Signed-off-by: Jason Gunthorpe <jgg@nvidia.com> Reviewed-by: Thomas Hellström <thomas.helllstrom@linux.intel.com> Reviewed-by: Christian König <christian.koenig@amd.com> [danvet: Update subject per Thomas' &Christian's review] Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: https://patchwork.freedesktop.org/patch/msgid/0-v2-a44694790652+4ac-ttm_pmd_jgg@nvidia.com Signed-off-by: Sasha Levin <sashal@kernel.org>
2021-11-18drm/nouveau/svm: Fix refcount leak bug and missing check against null bugChenyuan Mi
[ Upstream commit 6bb8c2d51811eb5e6504f49efe3b089d026009d2 ] The reference counting issue happens in one exception handling path of nouveau_svmm_bind(). When cli->svm.svmm is null, the function forgets to decrease the refcount of mm increased by get_task_mm(), causing a refcount leak. Fix this issue by using mmput() to decrease the refcount in the exception handling path. Also, the function forgets to do check against null when get mm by get_task_mm(). Fix this issue by adding null check after get mm by get_task_mm(). Signed-off-by: Chenyuan Mi <cymi20@fudan.edu.cn> Signed-off-by: Xiyu Yang <xiyuyang19@fudan.edu.cn> Signed-off-by: Xin Tan <tanxin.ctf@gmail.com> Fixes: 822cab6150d3 ("drm/nouveau/svm: check for SVM initialized before migrating") Reviewed-by: Lyude Paul <lyude@redhat.com> Reviewed-by: Ben Skeggs <bskeggs@redhat.com> Reviewed-by: Karol Herbst <kherbst@redhat.com> Signed-off-by: Karol Herbst <kherbst@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20210907122633.16665-1-cymi20@fudan.edu.cn Link: https://gitlab.freedesktop.org/drm/nouveau/-/merge_requests/14 Signed-off-by: Sasha Levin <sashal@kernel.org>