On 25/06/2021 17:02, Boris Brezillon wrote:
> On Fri, 25 Jun 2021 16:55:12 +0100
> Steven Price <[email protected]> wrote:
> 
>> On 25/06/2021 14:33, Boris Brezillon wrote:
>>> This is not yet needed because we let active jobs be killed during by
>>> the reset and we don't really bother making sure they can be restarted.
>>> But once we start adding soft-stop support, controlling when we deal
>>> with the remaining interrrupts and making sure those are handled before
>>> the reset is issued gets tricky if we keep job interrupts active.
>>>
>>> Let's prepare for that and mask+flush job IRQs before issuing a reset.
>>>
>>> Signed-off-by: Boris Brezillon <[email protected]>
>>> ---
>>>  drivers/gpu/drm/panfrost/panfrost_job.c | 21 +++++++++++++++------
>>>  1 file changed, 15 insertions(+), 6 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_job.c 
>>> b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> index 88d34fd781e8..0566e2f7e84a 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_job.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_job.c
>>> @@ -34,6 +34,7 @@ struct panfrost_queue_state {
>>>  struct panfrost_job_slot {
>>>     struct panfrost_queue_state queue[NUM_JOB_SLOTS];
>>>     spinlock_t job_lock;
>>> +   int irq;
>>>  };
>>>  
>>>  static struct panfrost_job *
>>> @@ -400,7 +401,15 @@ static void panfrost_reset(struct panfrost_device 
>>> *pfdev,
>>>     if (bad)
>>>             drm_sched_increase_karma(bad);
>>>  
>>> -   spin_lock(&pfdev->js->job_lock);  
>>
>> I'm not sure it's safe to remove this lock as this protects the
>> pfdev->jobs array: I can't see what would prevent panfrost_job_close()
>> running at the same time without the lock. Am I missing something?
> 
> Ah, you're right, I'll add it back.
> 
>>
>>> +   /* Mask job interrupts and synchronize to make sure we won't be
>>> +    * interrupted during our reset.
>>> +    */
>>> +   job_write(pfdev, JOB_INT_MASK, 0);
>>> +   synchronize_irq(pfdev->js->irq);
>>> +
>>> +   /* Schedulers are stopped and interrupts are masked+flushed, we don't
>>> +    * need to protect the 'evict unfinished jobs' lock with the job_lock.
>>> +    */
>>>     for (i = 0; i < NUM_JOB_SLOTS; i++) {
>>>             if (pfdev->jobs[i]) {
>>>                     pm_runtime_put_noidle(pfdev->dev);
>>> @@ -408,7 +417,6 @@ static void panfrost_reset(struct panfrost_device 
>>> *pfdev,
>>>                     pfdev->jobs[i] = NULL;
>>>             }
>>>     }
>>> -   spin_unlock(&pfdev->js->job_lock);
>>>  
>>>     panfrost_device_reset(pfdev);
>>>  
>>> @@ -504,6 +512,7 @@ static void panfrost_job_handle_irq(struct 
>>> panfrost_device *pfdev, u32 status)
>>>  
>>>                     job = pfdev->jobs[j];
>>>                     /* Only NULL if job timeout occurred */
>>> +                   WARN_ON(!job);  
>>
>> Was this WARN_ON intentional?
> 
> Yes, now that we mask and synchronize the irq in the reset I don't see
> any reason why we would end up with an event but no job to attach this
> even to, but maybe I missed something.
> 

Ok - but I guess the comment above needs updating then! ;) Job timeouts
are still a thing which definitely can happen!

Steve

Reply via email to