aboutsummaryrefslogtreecommitdiffstats
path: root/drivers/gpu/drm/amd/scheduler
AgeCommit message (Collapse)Author
2017-11-02License cleanup: add SPDX GPL-2.0 license identifier to files with no licenseGreg Kroah-Hartman
Many source files in the tree are missing licensing information, which makes it harder for compliance tools to determine the correct license. By default all files without license information are under the default license of the kernel, which is GPL version 2. Update the files which contain no license information with the 'GPL-2.0' SPDX license identifier. The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. This patch is based on work done by Thomas Gleixner and Kate Stewart and Philippe Ombredanne. How this work was done: Patches were generated and checked against linux-4.14-rc6 for a subset of the use cases: - file had no licensing information it it. - file was a */uapi/* one with no licensing information in it, - file was a */uapi/* one with existing licensing information, Further patches will be generated in subsequent months to fix up cases where non-standard license headers were used, and references to license had to be inferred by heuristics based on keywords. The analysis to determine which SPDX License Identifier to be applied to a file was done in a spreadsheet of side by side results from of the output of two independent scanners (ScanCode & Windriver) producing SPDX tag:value files created by Philippe Ombredanne. Philippe prepared the base worksheet, and did an initial spot review of a few 1000 files. The 4.13 kernel was the starting point of the analysis with 60,537 files assessed. Kate Stewart did a file by file comparison of the scanner results in the spreadsheet to determine which SPDX license identifier(s) to be applied to the file. She confirmed any determination that was not immediately clear with lawyers working with the Linux Foundation. Criteria used to select files for SPDX license identifier tagging was: - Files considered eligible had to be source code files. - Make and config files were included as candidates if they contained >5 lines of source - File already had some variant of a license header in it (even if <5 lines). All documentation files were explicitly excluded. The following heuristics were used to determine which SPDX license identifiers to apply. - when both scanners couldn't find any license traces, file was considered to have no license information in it, and the top level COPYING file license applied. For non */uapi/* files that summary was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 11139 and resulted in the first patch in this series. If that file was a */uapi/* path one, it was "GPL-2.0 WITH Linux-syscall-note" otherwise it was "GPL-2.0". Results of that was: SPDX license identifier # files ---------------------------------------------------|------- GPL-2.0 WITH Linux-syscall-note 930 and resulted in the second patch in this series. - if a file had some form of licensing information in it, and was one of the */uapi/* ones, it was denoted with the Linux-syscall-note if any GPL family license was found in the file or had no licensing in it (per prior point). Results summary: SPDX license identifier # files ---------------------------------------------------|------ GPL-2.0 WITH Linux-syscall-note 270 GPL-2.0+ WITH Linux-syscall-note 169 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause) 21 ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause) 17 LGPL-2.1+ WITH Linux-syscall-note 15 GPL-1.0+ WITH Linux-syscall-note 14 ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause) 5 LGPL-2.0+ WITH Linux-syscall-note 4 LGPL-2.1 WITH Linux-syscall-note 3 ((GPL-2.0 WITH Linux-syscall-note) OR MIT) 3 ((GPL-2.0 WITH Linux-syscall-note) AND MIT) 1 and that resulted in the third patch in this series. - when the two scanners agreed on the detected license(s), that became the concluded license(s). - when there was disagreement between the two scanners (one detected a license but the other didn't, or they both detected different licenses) a manual inspection of the file occurred. - In most cases a manual inspection of the information in the file resulted in a clear resolution of the license that should apply (and which scanner probably needed to revisit its heuristics). - When it was not immediately clear, the license identifier was confirmed with lawyers working with the Linux Foundation. - If there was any question as to the appropriate license identifier, the file was flagged for further research and to be revisited later in time. In total, over 70 hours of logged manual review was done on the spreadsheet to determine the SPDX license identifiers to apply to the source files by Kate, Philippe, Thomas and, in some cases, confirmation by lawyers working with the Linux Foundation. Kate also obtained a third independent scan of the 4.13 code base from FOSSology, and compared selected files where the other two scanners disagreed against that SPDX file, to see if there was new insights. The Windriver scanner is based on an older version of FOSSology in part, so they are related. Thomas did random spot checks in about 500 files from the spreadsheets for the uapi headers and agreed with SPDX license identifier in the files he inspected. For the non-uapi files Thomas did random spot checks in about 15000 files. In initial set of patches against 4.14-rc6, 3 files were found to have copy/paste license identifier errors, and have been fixed to reflect the correct identifier. Additionally Philippe spent 10 hours this week doing a detailed manual inspection and review of the 12,461 patched files from the initial patch version early this week with: - a full scancode scan run, collecting the matched texts, detected license ids and scores - reviewing anything where there was a license detected (about 500+ files) to ensure that the applied SPDX license was correct - reviewing anything where there was no detection but the patch license was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied SPDX license was correct This produced a worksheet with 20 files needing minor correction. This worksheet was then exported into 3 different .csv files for the different types of files to be modified. These .csv files were then reviewed by Greg. Thomas wrote a script to parse the csv files and add the proper SPDX tag to the file, in the format that the file expected. This script was further refined by Greg based on the output to detect more types of files automatically and to distinguish between header and source .c files (which need different comment types.) Finally Greg ran the script using the .csv files to generate the patches. Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org> Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com> Reviewed-by: Thomas Gleixner <tglx@linutronix.de> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-10-18Revert "drm/amdgpu: discard commands of killed processes"Alex Deucher
This causes instability in piglit. It's fixed in drm-next with: 515c6faf85970af529953ec137b4b6fcb3272e25 1650c14b459ff9c85767746f1ef795a780653128 214a91e6bfabaa6cbfa692df8732000aab050795 29d253553559dba919315be847f4f2cce29edd42 79867462634836ee5c39a2cdf624719feeb189bd This reverts commit 6af0883ed9770cf9b0a4f224c91481484cd1b025. Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-08-24drm/amdgpu: discard commands of killed processesChristian König
When a process is killed we shouldn't submit all waiting jobs, but instead clean up as fast as possible. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-07-14drm/amd/sched: print sched job id in amd_sched_job traceNicolai Hähnle
This makes it easier to correlate amd_sched_job with with other trace points that don't log the job pointer. v2: don't print the sched_job pointer (Andres) Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Reviewed-by: Andres Rodriguez <andresx7@gmail.com> Reviewed-by: Christian König <christian.koenig@amd.com>
2017-05-24drm/amdgpu/SRIOV:implement guilty job TDR for(V2)Monk Liu
1,TDR will kickout guilty job if it hang exceed the threshold of the given one from kernel paramter "job_hang_limit", that way a bad command stream will not infinitly cause GPU hang. by default this threshold is 1 so a job will be kicked out after it hang. 2,if a job timeout TDR routine will not reset all sched/ring, instead if will only reset on the givn one which is indicated by @job of amdgpu_sriov_gpu_reset, that way we don't need to reset and recover each sched/ring if we already know which job cause GPU hang. 3,unblock sriov_gpu_reset for AI family. V2: 1:put kickout guilty job after sched parked. 2:since parking scheduler prior to kickout already occupies a while, we can do last check on the in question job before doing hw_reset. TODO: 1:when a job is considered as guilty, we should mark some flag in its fence status flag, and let UMD side aware that this fence signaling is not due to job complete but job hang. 2:if gpu reset cause all video memory lost, we need introduce a new policy to implement TDR, like drop all jobs not yet signaled, and all IOCTL on this device will return ERROR DEVICE_LOST. this will be implemented later. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-10drm/amdgpu: fix dependency issueChunming Zhou
The problem is that executing the jobs in the right order doesn't give you the right result because consecutive jobs executed on the same engine are pipelined. In other words job B does it buffer read before job A has written it's result. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-05-10drm/amd: fix init order of sched jobChunming Zhou
Need to increment after the fence check. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-04-28drm/amdgpu: fix NULL pointer errorChunming Zhou
[ 141.420491] BUG: unable to handle kernel NULL pointer dereference at 0000000000000030 [ 141.420532] IP: [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.420563] PGD 20a030067 [ 141.420575] PUD 2088ca067 [ 141.420587] PMD 0 [ 141.420599] Oops: 0000 [#1] SMP [ 141.420612] Modules linked in: amdgpu(OE) ttm(OE) drm_kms_helper(E) drm(E) i2c_algo_bit(E) fb_sys_fops(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) rpcsec_gss_krb5(E) nfsv4(E) nfs(E) fscache(E) eeepc_wmi(E) asus_wmi(E) sparse_keymap(E) snd_hda_codec_realtek(E) video(E) snd_hda_codec_generic(E) snd_hda_codec_hdmi(E) snd_hda_intel(E) joydev(E) snd_hda_codec(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_hda_core(E) snd_hwdep(E) snd_rawmidi(E) snd_pcm(E) kvm(E) irqbypass(E) crct10dif_pclmul(E) snd_seq(E) crc32_pclmul(E) ghash_clmulni_intel(E) snd_seq_device(E) snd_timer(E) aesni_intel(E) aes_x86_64(E) lrw(E) gf128mul(E) glue_helper(E) ablk_helper(E) cryptd(E) snd(E) soundcore(E) serio_raw(E) shpchp(E) i2c_piix4(E) i2c_designware_platform(E) 8250_dw(E) i2c_designware_core(E) mac_hid(E) binfmt_misc(E) [ 141.420948] nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) parport_pc(E) ppdev(E) lp(E) parport(E) autofs4(E) hid_generic(E) usbhid(E) hid(E) psmouse(E) r8169(E) ahci(E) mii(E) libahci(E) wmi(E) [ 141.421042] CPU: 14 PID: 223 Comm: kworker/14:2 Tainted: G OE 4.9.0-custom #4 [ 141.421074] Hardware name: System manufacturer System Product Name/PRIME B350-PLUS, BIOS 0606 04/06/2017 [ 141.421146] Workqueue: events amd_sched_job_timedout [amdgpu] [ 141.421169] task: ffff88020b03ba80 task.stack: ffffc900016f4000 [ 141.421193] RIP: 0010:[<ffffffff81579ee1>] [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.421229] RSP: 0018:ffffc900016f7d30 EFLAGS: 00010202 [ 141.421250] RAX: ffff8801c049fc00 RBX: ffff8801d4d8dc00 RCX: 0000000000000000 [ 141.421278] RDX: 0000000000000001 RSI: ffff8801c049fcc0 RDI: 0000000000000000 [ 141.421307] RBP: ffffc900016f7d48 R08: 0000000000000000 R09: 0000000000000000 [ 141.421334] R10: 00000020ed512a30 R11: 0000000000000001 R12: 0000000000000000 [ 141.421362] R13: ffff880209ba4ba0 R14: ffff880209ba4c58 R15: ffff8801c055cc60 [ 141.421390] FS: 0000000000000000(0000) GS:ffff88021ef80000(0000) knlGS:0000000000000000 [ 141.421421] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 141.421443] CR2: 0000000000000030 CR3: 000000020b554000 CR4: 00000000003406e0 [ 141.421471] Stack: [ 141.421480] ffff8801d4d8dc00 ffff880209ba4c48 ffff880209ba4ba0 ffffc900016f7d78 [ 141.421513] ffffffffa0697920 ffff880209ba0000 0000000000000000 ffff880209ba2770 [ 141.421549] ffff880209ba4b08 ffffc900016f7df0 ffffffffa05ce2ae ffffffffa0509eb7 [ 141.421583] Call Trace: [ 141.421628] [<ffffffffa0697920>] amd_sched_hw_job_reset+0x50/0xb0 [amdgpu] [ 141.421676] [<ffffffffa05ce2ae>] amdgpu_gpu_reset+0x8e/0x690 [amdgpu] [ 141.421712] [<ffffffffa0509eb7>] ? drm_printk+0x97/0xa0 [drm] [ 141.421770] [<ffffffffa0698156>] amdgpu_job_timedout+0x46/0x50 [amdgpu] [ 141.421829] [<ffffffffa0696a07>] amd_sched_job_timedout+0x17/0x20 [amdgpu] [ 141.421859] [<ffffffff81095493>] process_one_work+0x153/0x3f0 [ 141.421884] [<ffffffff81095c5b>] worker_thread+0x12b/0x4b0 [ 141.421907] [<ffffffff81095b30>] ? rescuer_thread+0x350/0x350 [ 141.421931] [<ffffffff8109b423>] kthread+0xd3/0xf0 [ 141.421951] [<ffffffff8109b350>] ? kthread_park+0x60/0x60 [ 141.421975] [<ffffffff817e1ee5>] ret_from_fork+0x25/0x30 [ 141.421996] Code: ac 81 e8 a3 1f b0 ff 48 c7 c0 ea ff ff ff e9 48 ff ff ff 0f 1f 80 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc 53 <48> 8b 7f 30 48 89 f3 e8 73 7c 26 00 48 8b 13 48 39 d3 41 0f 95 [ 141.422156] RIP [<ffffffff81579ee1>] fence_remove_callback+0x11/0x60 [ 141.422183] RSP <ffffc900016f7d30> [ 141.422197] CR2: 0000000000000030 [ 141.433483] ---[ end trace bc0949bf7ddd6d4b ]--- if the job is reset twice, then the parent could be NULL. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amd/sched: revise priority numberChunming Zhou
big number is to high priority. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-29drm/amd/sched: add a unique job id to amd_sched_jobAndres Rodriguez
A unique id is useful for debugging and tracing. Intended to replace pointers in ftrace output. Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Andres Rodriguez <andresx7@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2017-03-02sched/headers: Prepare for new header dependencies before moving code to ↵Ingo Molnar
<uapi/linux/sched/types.h> We are going to move scheduler ABI details to <uapi/linux/sched/types.h>, which will be used from a number of .c files. Create empty placeholder header that maps to <linux/types.h>. Include the new header in the files that are going to need it. Acked-by: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-30Merge tag 'drm-qemu-20161121' of git://git.kraxel.org/linux into drm-nextDave Airlie
drm/virtio: fix busid in a different way, allocate more vbufs. drm/qxl: various bugfixes and cleanups, * tag 'drm-qemu-20161121' of git://git.kraxel.org/linux: (224 commits) drm/virtio: allocate some extra bufs qxl: Allow resolution which are not multiple of 8 qxl: Don't notify userspace when monitors config is unchanged qxl: Remove qxl_bo_init() return value qxl: Call qxl_gem_{init, fini} qxl: Add missing '\n' to qxl_io_log() call qxl: Remove unused prototype qxl: Mark some internal functions as static Revert "drm: virtio: reinstate drm_virtio_set_busid()" drm/virtio: fix busid regression drm: re-export drm_dev_set_unique Linux 4.9-rc5 gp8psk: Fix DVB frontend attach gp8psk: fix gp8psk_usb_in_op() logic dvb-usb: move data_mutex to struct dvb_usb_device iio: maxim_thermocouple: detect invalid storage size in read() aoe: fix crash in page count manipulation lightnvm: invalid offset calculation for lba_shift Kbuild: enable -Wmaybe-uninitialized warnings by default pcmcia: fix return value of soc_pcmcia_regulator_set ...
2016-11-07Backmerge tag 'v4.9-rc4' into drm-nextDave Airlie
Linux 4.9-rc4 This is needed for nouveau development.
2016-10-31drm/amd: fix scheduler fence teardown order v2Christian König
Some fences might be alive even after we have stopped the scheduler leading to warnings about leaked objects from the SLUB allocator. Fix this by allocating/freeing the SLUB allocator from the module init/fini functions just like we do it for hw fences. v2: make variable static, add link to bug Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=97500 Reported-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> (v1) Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: stable@vger.kernel.org
2016-10-25dma-buf: Rename struct fence to dma_fenceChris Wilson
I plan to usurp the short name of struct fence for a core kernel struct, and so I need to rename the specialised fence/timeline for DMA operations to make room. A consensus was reached in https://lists.freedesktop.org/archives/dri-devel/2016-July/113083.html that making clear this fence applies to DMA operations was a good thing. Since then the patch has grown a bit as usage increases, so hopefully it remains a good thing! (v2...: rebase, rerun spatch) v3: Compile on msm, spotted a manual fixup that I broke. v4: Try again for msm, sorry Daniel coccinelle script: @@ @@ - struct fence + struct dma_fence @@ @@ - struct fence_ops + struct dma_fence_ops @@ @@ - struct fence_cb + struct dma_fence_cb @@ @@ - struct fence_array + struct dma_fence_array @@ @@ - enum fence_flag_bits + enum dma_fence_flag_bits @@ @@ ( - fence_init + dma_fence_init | - fence_release + dma_fence_release | - fence_free + dma_fence_free | - fence_get + dma_fence_get | - fence_get_rcu + dma_fence_get_rcu | - fence_put + dma_fence_put | - fence_signal + dma_fence_signal | - fence_signal_locked + dma_fence_signal_locked | - fence_default_wait + dma_fence_default_wait | - fence_add_callback + dma_fence_add_callback | - fence_remove_callback + dma_fence_remove_callback | - fence_enable_sw_signaling + dma_fence_enable_sw_signaling | - fence_is_signaled_locked + dma_fence_is_signaled_locked | - fence_is_signaled + dma_fence_is_signaled | - fence_is_later + dma_fence_is_later | - fence_later + dma_fence_later | - fence_wait_timeout + dma_fence_wait_timeout | - fence_wait_any_timeout + dma_fence_wait_any_timeout | - fence_wait + dma_fence_wait | - fence_context_alloc + dma_fence_context_alloc | - fence_array_create + dma_fence_array_create | - to_fence_array + to_dma_fence_array | - fence_is_array + dma_fence_is_array | - trace_fence_emit + trace_dma_fence_emit | - FENCE_TRACE + DMA_FENCE_TRACE | - FENCE_WARN + DMA_FENCE_WARN | - FENCE_ERR + DMA_FENCE_ERR ) ( ... ) Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Reviewed-by: Gustavo Padovan <gustavo.padovan@collabora.co.uk> Acked-by: Sumit Semwal <sumit.semwal@linaro.org> Acked-by: Christian König <christian.koenig@amd.com> Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> Link: http://patchwork.freedesktop.org/patch/msgid/20161025120045.28839-1-chris@chris-wilson.co.uk
2016-10-24drm/amdgpu: update kernel-doc for some functionsGrazvydas Ignotas
The names were wrong. Reviewed-by: Edward O'Callaghan <funfunctor@folklore1984.net> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-10-24drm/amdgpu: fix sched fence slab teardownGrazvydas Ignotas
To free fences, call_rcu() is used, which calls amd_sched_fence_free() after a grace period. During teardown, there is no guarantee all callbacks have finished, so sched_fence_slab may be destroyed before all fences have been freed. If we are lucky, this results in some slab warnings, if not, we get a crash in one of rcu threads because callback is called after amdgpu has already been unloaded. Fix it with a rcu_barrier(). Fixes: 189e0fb76304 ("drm/amdgpu: RCU protected amd_sched_fence_release") Acked-by: Chunming Zhou <david1.zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Grazvydas Ignotas <notasas@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-08-19drm/amdgpu: fix timeout value check in amd_sched_job_recoveryChristian König
Could be that we don't actually have a timeout set. Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29drm/amd: fix deadlock of job_list_lock V2Chunming Zhou
run_job involves mutex, which could sleep. V2: use list_for_each_entry_safe, since the job might complete while we dropped the lock. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-29drm/amd: reset hw count when reset jobChunming Zhou
Means the hw ring is empty after gpu reset. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: add amd_sched_job_recoveryChunming Zhou
Which is to recover hw jobs when gpu reset. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amd: add amd_sched_hw_job_resetChunming Zhou
amd_sched_hw_job_reset will remove callback from hw fence. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amd: add parent for sched fenceChunming Zhou
Parent of sched fence is hw fence which is to signal sched fence. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove fence parameter from amd_sched_job_initChristian König
We return the fence as part of the job structur anyway, no need to do this twice. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: stop disabling irqs when it isn't neccessaryChristian König
A regular spin_lock/unlock should do here as well. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: block scheduler when gpu resetChunming Zhou
Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: stop trying to schedule() with a spin heldChristian König
Drop the lock before calling cancel_delayed_work_sync(). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96445 Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Tested-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: generalize the scheduler fenceChristian König
Make it two events, one for the job being scheduled and one when it is finished. Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Acked-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: fix and cleanup job destructionChristian König
Remove the job reference counting and just properly destroy it from a work item which blocks on any potential running timeout handler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: move locking into the functions who need itChristian König
Otherwise the locking becomes rather confusing. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: properly abstract scheduler timeout handlingChristian König
The driver shouldn't mess with the scheduler internals. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove use_shed hack in job cleanupChristian König
Remembering the code path in a variable to cleanup differently is usually not a good idea at all. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove duplicated timeout callbackChristian König
No need for double housekeeping here. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: remove begin_job/finish_jobChristian König
Completely pointless and confusing to use a callback to call into the same code file. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-07-07drm/amdgpu: fix coding style in the scheduler v2Christian König
v2: fix even more Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Monk.Liu <monk.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-11drm/amd: cleanup remaining spaces and tabs v2Christian König
This is the result of running the following commands: find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/[ \t]\+$//' {} \; find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/[ \t]\+$//' {} \; find drivers/gpu/drm/amd/ -name "*.h" -exec sed -i 's/ \+\t/\t/' {} \; find drivers/gpu/drm/amd/ -name "*.c" -exec sed -i 's/ \+\t/\t/' {} \; v2: drop changes to DAL and internal headers Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-04drm/amd/scheduler: Mark amdgpu_sched_ops constNils Wallménius
This marks the struct amdgpu_sched_ops const and adjusts amd_sched_init to take a const pointer for the ops param. The ops member of struct amd_gpu_scheduler is also changed to const. Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Nils Wallménius <nils.wallmenius@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: use ref to keep job aliveMonk Liu
this is to fix fatal page fault error that occured if: job is signaled/released after its timeout work is already put to the global queue (in this case the cancel_delayed_work will return false), which will lead to NX-protection error page fault during job_timeout_func. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: rework TDR in scheduler (v2)Monk Liu
Add two callbacks to scheduler to maintain jobs, and invoked for job timeout calculations. Now TDR measures time gap from job is processed by hw. v2: fix typo Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: get rid of incorrect TDRMonk Liu
original time out detect routine is incorrect, cuz it measures the gap from job scheduled, but we should only measure the gap from processed by hw. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: put job to list before doneMonk Liu
the mirror_list will be used for later time out detect feature. This is needed to properly detect a GPU timeout with the scheduler. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: delay job free to when it's finished (v2)Monk Liu
for those jobs submitted through scheduler, do not free it immediately after scheduled, instead free it in global workqueue by its sched fence signaling callback function. v2: call uf's bo_undef after job_run() call job's sync free after job_run() no static inline __amdgpu_job_free() anymore, just use kfree(job) to replace it. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-05-02drm/amdgpu: use sched_job_init to initialize sched_jobMonk Liu
Consolidate job initialization in one place rather than duplicating it in multiple places. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2016-03-16drm/amdgpu: RCU protected amd_sched_fence_releaseChristian König
Fences must be freed RCU protected, otherwise the reservation_object_*_rcu() functions can run into problems. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
2016-02-10drm/amdgpu: drop a dummy wakeup schedulerMonk Liu
since the dependency job is also scheduled by the same scheduler with the job depended on it, no need to call wake up scheduler when the dep is scheduled. Signed-off-by: Monk Liu <Monk.Liu@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-14drm/amdgpu: add entity only when first job comeChunming Zhou
umd somtimes will create a context for every ring, that means some entities wouldn't be used at all. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
2015-12-04drm/amdgpu: fix race condition in amd_sched_entity_push_jobNicolai Hähnle
As soon as we leave the spinlock after the job has been added to the job queue, we can no longer rely on the job's data to be available. I have seen a null-pointer dereference due to sched == NULL in amd_sched_wakeup via amd_sched_entity_push_job and amd_sched_ib_submit_kernel_helper. Since the latter initializes sched_job->sched with the address of the ring scheduler, which is guaranteed to be non-NULL, this race appears to be a likely culprit. Signed-off-by: Nicolai Hähnle <nicolai.haehnle@amd.com> Bugzilla: https://bugs.freedesktop.org/attachment.cgi?bugid=93079 Reviewed-by: Christian König <christian.koenig@amd.com>
2015-12-02drm/amd: abstract kernel rq and normal rq to priority of run queueChunming Zhou
Allows us to set priorities in the scheduler. Signed-off-by: Chunming Zhou <David1.Zhou@amd.com> Reviewed-by: Christian König <christian.koenig@amd.com> Reviewed-by: Junwei Zhang <Jerry.Zhang@amd.com>
2015-11-23drm/amdgpu: move dependency handling out of atomic section v2Christian König
This way the driver isn't limited in the dependency handling callback. v2: remove extra check in amd_sched_entity_pop_job() Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>
2015-11-23drm/amdgpu: optimize scheduler fence handlingChristian König
We only need to wait for jobs to be scheduled when the dependency is from the same scheduler. Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Chunming Zhou <david1.zhou@amd.com>