On 3/21/24 15:08, Tomi Valkeinen wrote:
> On 21/03/2024 20:01, Sean Anderson wrote:
>> On 3/21/24 13:25, Tomi Valkeinen wrote:
>>> On 21/03/2024 17:52, Sean Anderson wrote:
>>>> On 3/20/24 02:53, Tomi Valkeinen wrote:
>>>>> On 20/03/2024 00:51, Sean Anderson wrote:
>>>>>> Retraining the link can take a while, and might involve waiting for
>>>>>> DPCD reads/writes to complete. This is inappropriate for an IRQ handler.
>>>>>> Just schedule this work for later completion. This is racy, but will be
>>>>>> fixed in the next commit.
>>>>>
>>>>> You should add the locks first, and use them here, rather than first
>>>>> adding a buggy commit and fixing it in the next one.
>>>>
>>>> I didn't think I could add the locks first since I only noticed the IRQ
>>>> was threaded right before sending out this series. So yeah, we could add
>>>> locking, add the workqueue, and then unthread the IRQ.
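Concretely, I'm imagining something like this (rough sketch; dp->lock and
zynqmp_dp_retrain_link() are illustrative, not the current driver code):

	/* All link state is protected by a single mutex, and the HPD work
	 * takes it before touching the link or the DPCD.
	 */
	static void zynqmp_dp_hpd_work_func(struct work_struct *work)
	{
		struct zynqmp_dp *dp = container_of(work, struct zynqmp_dp,
						    hpd_work);

		mutex_lock(&dp->lock);
		/* Safe to sleep here: DPCD reads/writes, retraining, ... */
		zynqmp_dp_retrain_link(dp);
		mutex_unlock(&dp->lock);
	}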
>>>>
>>>>>> Signed-off-by: Sean Anderson <[email protected]>
>>>>>> ---
>>>>>> Actually, on second look this IRQ is threaded. So why do we have a
>>>>>> workqueue for HPD events? Maybe we should make it unthreaded?
>>>>>
>>>>> Indeed, there's not much work being done in the IRQ handler. I don't know
>>>>> why it's threaded.
>>>>>
>>>>> We could move the queued work to be inside the threaded irq handler,
>>>>> but with a quick look, the HPD work has lines like "msleep(100)" (and
>>>>> that's inside a for loop...), which is probably not a good thing to do
>>>>> even in a threaded irq handler.
>>>>>
>>>>> Although I'm not sure if that code is good to have anywhere. Why do we
>>>>> even have such code in the HPD work path... We already got the HPD
>>>>> interrupt. What does "It takes some delay (ex, 100 ~ 500 msec) to get
>>>>> the HPD signal with some monitors" even mean...
>>>>
>>>> The documentation for this bit is
>>>>
>>>> | HPD_STATE 0 ro 0x0 Contains the raw state of the HPD pin on
>>>> | the DisplayPort connector.
>>>>
>>>> So I think the idea is to perform some debouncing.
>>>
>>> Hmm, it just looks a bit odd to me. It can sleep for a second. And the
>>> wording "It takes some delay (ex, 100 ~ 500 msec) to get the HPD signal
>>> with some monitors" makes it sound like some kind of a hack...
>>>
>>> The docs mention debounce once:
>>>
>>> https://docs.amd.com/r/en-US/pg299-v-dp-txss1/Hot-Plug-Detection
>>
>> Are you sure this is the right document? This seems to be documentation for
>> [1]. Is that instantiated as a hard block on the ZynqMP?
>>
>> [1]
>> https://www.xilinx.com/products/intellectual-property/ef-di-displayport.html
>
> You're right, wrong document. The registers and bitfield names I looked at
> just matched, so I didn't think it through...
>
> The right doc says even less:
>
> https://docs.amd.com/r/en-US/ug1085-zynq-ultrascale-trm/Upon-HPD-Assertion
>
>>> But it's not immediately obvious what the SW must do and what's done by the
>>> HW. Debounce is not mentioned later, e.g. in the HPD Event Handling. But if
>>> debounce is needed, wouldn't it be perhaps in a few milliseconds, instead
>>> of hundreds of milliseconds...
>>
>> Well, the DP spec says
>>
>> | If the HPD is the result of a new device being connected, either
>> | directly to the Source device (signaled by a long HPD), –or– downstream
>> | of a Branch device (indicated by incrementing the DFP_COUNT field value
>> | in the DOWN_STREAM_PORT_COUNT register (DPCD 00007h[3:0]) and signaled
>> | by an IRQ_HPD pulse), the Source device shall read the new DisplayID or
>> | legacy EDID that has been made available to it to ensure that content
>> | being transmitted over the link is able to be properly received and
>> | rendered.
>> |
>> | Informative Note: If the HPD signal toggling (or bouncing) is the
>> | result of the Hot Unplug followed by Hot Plug of a
>> | cable-connector assembly, the HPD signal is likely
>> | to remain unstable during the de-bouncing period,
>> | which is in the order of tens of milliseconds. The
>> | Source device may either check the HPD signal’s
>> | stability before initiating an AUX read transaction,
>> | –or– immediately initiate the AUX read transaction
>> | after each HPD rising edge.
>>
>> So a 100 ms delay seems plausible for some monitors.
>
> I read the text above as "it may take tens of milliseconds for HPD to
> stabilize". So polling it for a total of 100ms sounds fine, but we're
> polling it for 1000ms.
>
> And I think checking for stability is fine, but for detect() I think it goes
> overboard: if the cable is disconnected, every detect call spends a second
> checking for HPD, even if we haven't seen any sign of an HPD =).
>
> And if we're checking the HPD stability, wouldn't we, say, poll the HPD for
> some time, and see if it stays the same? At the moment the code proceeds
> right away if HPD is high, but keeps polling if HPD is low.
>
>> That said, maybe we can just skip this and always read the DPCD.
>
> If the HPD is bouncing, is the AUX line also unstable?
>
> I don't mind an HPD stability check, I think it makes sense as (I think)
> the HW doesn't handle de-bouncing here. But I do think it could be much,
> much shorter than it is now, and that it would make sense to observe the
> HPD for a period, instead of just waiting for the HPD to go high.
>
> But this could also be left for later, I don't think it matters in the
> context of this series.
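Agreed. For later, a much shorter stability check along those lines could
just require a few consecutive identical samples, e.g. (sketch only; the
thresholds and the zynqmp_dp_hpd_state() helper are made up):

	static bool zynqmp_dp_hpd_stable(struct zynqmp_dp *dp)
	{
		bool last = zynqmp_dp_hpd_state(dp);
		unsigned int stable = 0, tries = 50;	/* ~150ms worst case */

		/* require ~10ms of unchanged HPD before trusting it */
		while (stable < 5 && tries--) {
			bool cur;

			usleep_range(2000, 3000);
			cur = zynqmp_dp_hpd_state(dp);
			if (cur == last) {
				stable++;
			} else {
				stable = 0;
				last = cur;
			}
		}

		return last;
	}

That way detect() on a disconnected cable returns in ~10ms instead of
polling for a second.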
>
>>> zynqmp_dp_bridge_detect() is used for drm_bridge_funcs.detect(), and if
>>> the cable is not connected, it'll sleep for 1 second (probably more)
>>> before returning not connected. It just doesn't sound correct to me.
>>>
>>> Well, it's not part of this patch as such, but related to the amount of
>>> time we spend in the interrupt handler (and also the detect()).
>>>
>>>>> Would it be possible to clean up the work funcs a bit (I haven't
>>>>> looked a the new work func yet), to remove the worst extra sleeps, and
>>>>> just do all that inside the threaded irq handler?
>>>>
>>>> Probably not, since an HPD IRQ results in link retraining, which can
>>>> take a while.
>>>
>>> But is it any different if you have a workqueue? Isn't a threaded interrupt
>>> handler basically the same thing?
>>>
>>> Probably at least the zynqmp_dp_hpd_work_func() could be done in the
>>> threaded irq just fine, if the insane 1s sleep can be dropped.
>>
>> Anything involving AUX shouldn't be done in an IRQ handler, since
>> zynqmp_dp_aux_transfer will retry for up to 50ms by default.
>
> Perhaps. I'm still not sure if that's a problem. If a threaded irq is
> essentially a workqueue dedicated to this device, and we don't need to
> handle other irqs while the work is being done, then... What's the
> difference between a threaded irq and a workqueue?
>
> Oh, but we do need to handle irqs, we have the vblank irq in there. We don't
> want the vblanks to stop if there's an HPD IRQ.
>
> Btw, looks like zynqmp_dpsub_drm_handle_vblank() can sleep, so we can't move
> to non-threaded irq.
I don't see that. We have

zynqmp_dpsub_drm_handle_vblank
  drm_crtc_handle_vblank
    drm_handle_vblank
      spin_lock_irqsave(...)
      ...
      spin_lock_irqsave(...)
      vblank_disable_fn(...)
        spin_lock_irqsave(...)
        ...
        spin_lock_irqrestore(...)

so no sleeping AFAICT.
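If that holds, one option would be a hard handler for vblank that only
wakes the thread for HPD, masking the HPD sources in the device so that
vblanks keep flowing while the thread runs. Untested sketch; the register
and bit names are approximate, not necessarily what the driver uses:

	static irqreturn_t zynqmp_dp_irq(int irq, void *data)
	{
		struct zynqmp_dp *dp = data;
		u32 status = zynqmp_dp_read(dp, ZYNQMP_DP_INT_STATUS);

		/* ack what we've seen */
		zynqmp_dp_write(dp, ZYNQMP_DP_INT_STATUS, status);

		if (status & ZYNQMP_DP_INT_VBLANK_START)
			zynqmp_dpsub_drm_handle_vblank(dp->dpsub);

		if (status & (ZYNQMP_DP_INT_HPD_EVENT | ZYNQMP_DP_INT_HPD_IRQ)) {
			/* Mask just the HPD sources until the thread
			 * re-enables them, rather than using IRQF_ONESHOT,
			 * so vblank irqs keep firing meanwhile.
			 */
			zynqmp_dp_write(dp, ZYNQMP_DP_INT_DIS,
					ZYNQMP_DP_INT_HPD_EVENT |
					ZYNQMP_DP_INT_HPD_IRQ);
			return IRQ_WAKE_THREAD;
		}

		return IRQ_HANDLED;
	}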
--Sean
>>>>> Do we need to handle interrupts while either delayed work is being done?
>>>>
>>>> Probably not.
>>>>
>>>>> If we do need a delayed work, would just one work be enough which
>>>>> handles both HPD_EVENT and HPD_IRQ, instead of two?
>>>>
>>>> Maybe, but then we need to determine which pending events we need to
>>>> handle. I think since we have only two events it will be easier to just
>>>> have separate workqueues.
>>>
>>> The less concurrency, the better... which is why it would be nice to do
>>> it all in the threaded irq.
>>
>> Yeah, but we can use a mutex for this, which means there is not too much
>> interesting going on.
>
> Ok. Yep, if we get (hopefully) a single mutex with clearly defined fields
> that it protects, I'm ok with workqueues.
>
> I'd still prefer just one workqueue, though...
Yeah, but then we need a spinlock or something to tell the workqueue what it
should do.
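(Though atomic bitops would also do instead of a spinlock; sketch, with
made-up event bits and handler names:)

	#define ZYNQMP_DP_EVENT_HPD_EVENT	0
	#define ZYNQMP_DP_EVENT_HPD_IRQ		1

	static void zynqmp_dp_event_work_func(struct work_struct *work)
	{
		struct zynqmp_dp *dp = container_of(work, struct zynqmp_dp,
						    event_work);

		/* dp->events is an unsigned long used with atomic bitops */
		if (test_and_clear_bit(ZYNQMP_DP_EVENT_HPD_EVENT, &dp->events))
			zynqmp_dp_handle_hpd_event(dp);
		if (test_and_clear_bit(ZYNQMP_DP_EVENT_HPD_IRQ, &dp->events))
			zynqmp_dp_handle_hpd_irq(dp);
	}

	/* and in the irq handler: */
	set_bit(ZYNQMP_DP_EVENT_HPD_EVENT, &dp->events);
	schedule_work(&dp->event_work);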
--Sean