sched: signal and free remaining fences in amd_sched_entity_fini

Nicolai Hähnle Mon, 09 Oct 2017 07:00:56 -0700

On 09.10.2017 13:12, Christian König wrote:

Nicolai, how hard would it be to handle ENODEV as failure for allcurrently existing contexts?
Impossible? "All currently existing contexts" is not a well-definedconcept when multiple drivers co-exist in the same process.
Ok, let me refine the question: I assume there are resources "shared"between contexts like binary shader code for example which needs to bereuploaded when VRAM is lost.
How hard would it be to handle that correctly?


Okay, that makes more sense :)

With the current interface it's still pretty difficult, but if we couldget a new per-device query ioctl which returns a "VRAM loss counter", itwould be reasonably straight-forward.

And what would be the purpose of this? If it's to support VRAM loss,having a per-context VRAM loss counter would enable each context tosignal ECANCELED separately.
I thought of that on top of the -ENODEV handling.
In other words when we see -ENODEV we call an IOCTL to let the kernelknow we noticed that something is wrong and then reinit all sharedresources in userspace.
All existing context will still see -ECANCELED when we drop theircommand submission, but new contexts would at least not cause a newlockup immediately because their shader binaries are corrupted.

I don't think we need -ENODEV for this. We just need -ECANCELED to bereturned when a submission is rejected due to reset (hang or VRAM loss).

Mesa would keep a shadow of the VRAM loss counter in pipe_screen andpipe_context, and query the kernel's counter when it encounters-ECANCELED. Each context would then know to drop the CS it's built up sofar and restart based on comparing the VRAM loss counter of pipe_screento that of pipe_context, and similarly we could keep a copy of the VRAMloss counter for important buffer objects like shader binaries,descriptors, etc.

This seems more robust to me than relying only on an ENODEV. We'd mostlikely keep some kind of VRAM loss counter in Mesa *anyway* (we don'tmaintain a list of all shaders, for example, and we can't nuke importantper-context across threads), and synthesizing such a counter fromENODEVs is not particularly robust (what if multiple ENODEVs occur forthe same loss event?).

BTW, I still don't like ENODEV. It seems more like the kind of errorcode you'd return with hot-pluggable GPUs where the device canphysically disappear...


Cheers,
Nicolai


Regards,
Christian.

Am 09.10.2017 um 13:04 schrieb Nicolai Hähnle:

On 09.10.2017 12:59, Christian König wrote:

Nicolai, how hard would it be to handle ENODEV as failure for allcurrently existing contexts?

Impossible? "All currently existing contexts" is not a well-definedconcept when multiple drivers co-exist in the same process.

And what would be the purpose of this? If it's to support VRAM loss,having a per-context VRAM loss counter would enable each context tosignal ECANCELED separately.


Cheers,
Nicolai

Monk, would it be ok with you when we return ENODEV only for existingcontext when VRAM is lost and/or we have a strict mode GPU reset?E.g. newly created contexts would continue work as they should.


Regards,
Christian.

Am 09.10.2017 um 12:49 schrieb Nicolai Hähnle:

Hi Monk,

Yes, you're right, we're only using ECANCELED internally. But as aconsequence, Mesa would already handle a kernel error of ECANCELEDon context loss correctly :)


Cheers,
Nicolai

On 09.10.2017 12:35, Liu, Monk wrote:

Hi Christian

You reject some of my patches that returns -ENODEV, with the causethat MESA doesn't do the handling on -ENODEV

But if Nicolai can confirm that MESA do have a handling on-ECANCELED, then we need to overall align our error code, on detailbelow IOCTL can return error code:


Amdgpu_cs_ioctl
Amdgpu_cs_wait_ioctl
Amdgpu_cs_wait_fences_ioctl
Amdgpu_info_ioctl


My patches is:
return -ENODEV on cs_ioctl if the context is detected guilty,

also return -ENODEV on cs_wait|cs_wait_fences if the fence issignaled but with error -ETIME,also return -ENODEV on info_ioctl so UMD can query if gpu resethappened after the process created (because for strict mode weblock process instead of context)



according to Nicolai:

amdgpu_cs_ioctl *can* return -ECANCELED, but to be franklyspeaking, kernel part doesn't have any place with "-ECANCELED" sothis solution on MESA side doesn't align with *current* amdgpu driver,which only return 0 on success or -EINVALID on other error butdefinitely no "-ECANCELED" error code,

so if we talking about community rules we shouldn't let MESA handle-ECANCELED , we should have a unified error code


+ Marek

BR Monk




-----Original Message-----
From: Haehnle, Nicolai
Sent: 2017年10月9日 18:14

To: Koenig, Christian <[email protected]>; Liu, Monk<[email protected]>; Nicolai Hähnle <[email protected]>;[email protected]Subject: Re: [PATCH 5/5] drm/amd/sched: signal and free remainingfences in amd_sched_entity_fini


On 09.10.2017 10:02, Christian König wrote:

For gpu reset patches (already submitted to pub) I would make kernel
return -ENODEV if the waiting fence (in cs_wait or wait_fencesIOCTL)
founded as error, that way UMD would run into robust extension path
and considering the GPU hang occurred,
Well that is only closed source behavior which is completely
irrelevant for upstream development.
As far as I know we haven't pushed the change to return -ENODEVupstream.

FWIW, radeonsi currently expects -ECANCELED on CS submissions andtreats those as context lost. Perhaps we could use the same erroron fences?

That makes more sense to me than -ENODEV.

Cheers,
Nicolai


Regards,
Christian.

Am 09.10.2017 um 08:42 schrieb Liu, Monk:

Christian

It would be really nice to have an error code set on
s_fence->finished before it is signaled, use dma_fence_set_error()
for this.

For gpu reset patches (already submitted to pub) I would make kernel

return -ENODEV if the waiting fence (in cs_wait or wait_fencesIOCTL)

founded as error, that way UMD would run into robust extension path
and considering the GPU hang occurred,

Don't know if this is expected for the case of normal process being

killed or crashed like Nicolai hit ... since there is no gpu hanghit



BR Monk




-----Original Message-----
From: amd-gfx [mailto:[email protected]] On
Behalf Of Christian K?nig
Sent: 2017年9月28日 23:01
To: Nicolai Hähnle <[email protected]>;
[email protected]
Cc: Haehnle, Nicolai <[email protected]>
Subject: Re: [PATCH 5/5] drm/amd/sched: signal and free remaining
fences in amd_sched_entity_fini

Am 28.09.2017 um 16:55 schrieb Nicolai Hähnle:

From: Nicolai Hähnle <[email protected]>

Highly concurrent Piglit runs can trigger a race condition where a
pending SDMA job on a buffer object is never executed because the
corresponding process is killed (perhaps due to a crash). Since the
job's fences were never signaled, the buffer object was effectively
leaked. Worse, the buffer was stuck wherever it happened to be at
the time, possibly in VRAM.

The symptom was user space processes stuck in interruptible waits
with kernel stacks like:

       [<ffffffffbc5e6722>] dma_fence_default_wait+0x112/0x250
       [<ffffffffbc5e6399>] dma_fence_wait_timeout+0x39/0xf0
       [<ffffffffbc5e82d2>]
reservation_object_wait_timeout_rcu+0x1c2/0x300

[<ffffffffc03ce56f>]ttm_bo_cleanup_refs_and_unlock+0xff/0x1a0

[ttm]
       [<ffffffffc03cf1ea>] ttm_mem_evict_first+0xba/0x1a0 [ttm]
       [<ffffffffc03cf611>] ttm_bo_mem_space+0x341/0x4c0 [ttm]
       [<ffffffffc03cfc54>] ttm_bo_validate+0xd4/0x150 [ttm]
       [<ffffffffc03cffbd>] ttm_bo_init_reserved+0x2ed/0x420 [ttm]
       [<ffffffffc042f523>] amdgpu_bo_create_restricted+0x1f3/0x470
[amdgpu]
       [<ffffffffc042f9fa>] amdgpu_bo_create+0xda/0x220 [amdgpu]
       [<ffffffffc04349ea>] amdgpu_gem_object_create+0xaa/0x140
[amdgpu]
       [<ffffffffc0434f97>] amdgpu_gem_create_ioctl+0x97/0x120
[amdgpu]
       [<ffffffffc037ddba>] drm_ioctl+0x1fa/0x480 [drm]
       [<ffffffffc041904f>] amdgpu_drm_ioctl+0x4f/0x90 [amdgpu]
       [<ffffffffbc23db33>] do_vfs_ioctl+0xa3/0x5f0
       [<ffffffffbc23e0f9>] SyS_ioctl+0x79/0x90
       [<ffffffffbc864ffb>] entry_SYSCALL_64_fastpath+0x1e/0xad
       [<ffffffffffffffff>] 0xffffffffffffffff

Signed-off-by: Nicolai Hähnle <[email protected]>
Acked-by: Christian König <[email protected]>
---
    drivers/gpu/drm/amd/scheduler/gpu_scheduler.c | 7 ++++++-
    1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
index 54eb77cffd9b..32a99e980d78 100644
--- a/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
+++ b/drivers/gpu/drm/amd/scheduler/gpu_scheduler.c
@@ -220,22 +220,27 @@ void amd_sched_entity_fini(struct
amd_gpu_scheduler *sched,
amd_sched_entity_is_idle(entity));
        amd_sched_rq_remove_entity(rq, entity);
        if (r) {
            struct amd_sched_job *job;
            /* Park the kernel for a moment to make sure it isn't
processing
             * our enity.
             */
            kthread_park(sched->thread);
            kthread_unpark(sched->thread);
-        while (kfifo_out(&entity->job_queue, &job, sizeof(job)))
+        while (kfifo_out(&entity->job_queue, &job, sizeof(job))) {
+            struct amd_sched_fence *s_fence = job->s_fence;
+            amd_sched_fence_scheduled(s_fence);

It would be really nice to have an error code set on

s_fence->finished before it is signaled, usedma_fence_set_error() for this.


Additional to that it would be nice to note in the subject line that
this is a rather important bug fix.

With that fixed the whole series is Reviewed-by: Christian König
<[email protected]>.

Regards,
Christian.

+ amd_sched_fence_finished(s_fence);
+            dma_fence_put(&s_fence->finished);
                sched->ops->free_job(job);
+        }
        }
        kfifo_free(&entity->job_queue);
    }
    static void amd_sched_entity_wakeup(struct dma_fence *f, struct
dma_fence_cb *cb)
    {
        struct amd_sched_entity *entity =
            container_of(cb, struct amd_sched_entity, cb);
        entity->dependency = NULL;


_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx
_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx


_______________________________________________
amd-gfx mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH 5/5] drm/amd/sched: signal and free remaining fences in amd_sched_entity_fini

Reply via email to