On 05/07/2015 01:56 PM, Peter Hurley wrote:
> On 05/06/2015 04:56 AM, Daniel Vetter wrote:
>> On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote:
>>> On 05/05/2015 11:42 AM, Daniel Vetter wrote:
>>>> On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote:
>>>>> On 05/04/2015 12:52 AM, Mario Kleiner wrote:
>>>>>> On 04/16/2015 03:03 PM, Daniel Vetter wrote:
>>>>>>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:
>>>>>>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote:
>>>>>>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
>>>>>>>>>> Hi Daniel,
>>>>>>>>>>
>>>>>>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
>>>>>>>>>>> This was a bit too much cargo-culted, so lets make it solid:
>>>>>>>>>>> - vblank->count doesn't need to be an atomic, writes are always done
>>>>>>>>>>> under the protection of dev->vblank_time_lock. Switch to an
>>>>>>>>>>> unsigned
>>>>>>>>>>> long instead and update comments. Note that atomic_read is just
>>>>>>>>>>> a
>>>>>>>>>>> normal read of a volatile variable, so no need to audit all the
>>>>>>>>>>> read-side access specifically.
>>>>>>>>>>>
>>>>>>>>>>> - The barriers for the vblank counter seqlock weren't complete: The
>>>>>>>>>>> read-side was missing the first barrier between the counter
>>>>>>>>>>> read and
>>>>>>>>>>> the timestamp read, it only had a barrier between the ts and the
>>>>>>>>>>> counter read. We need both.
>>>>>>>>>>>
>>>>>>>>>>> - Barriers weren't properly documented. Since barriers only work if
>>>>>>>>>>> you have them on boths sides of the transaction it's prudent to
>>>>>>>>>>> reference where the other side is. To avoid duplicating the
>>>>>>>>>>> write-side comment 3 times extract a little store_vblank()
>>>>>>>>>>> helper.
>>>>>>>>>>> In that helper also assert that we do indeed hold
>>>>>>>>>>> dev->vblank_time_lock, since in some cases the lock is acquired
>>>>>>>>>>> a
>>>>>>>>>>> few functions up in the callchain.
>>>>>>>>>>>
>>>>>>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath
>>>>>>>>>>> to
>>>>>>>>>>> the vblank_wait ioctl.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>>>>>>>>>> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>>>>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>>>>>>>> Cc: Michel Dänzer <michel at daenzer.net>
>>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>>>>>>>>>> ---
>>>>>>>>>>> drivers/gpu/drm/drm_irq.c | 92
>>>>>>>>>>> ++++++++++++++++++++++++-----------------------
>>>>>>>>>>> include/drm/drmP.h | 8 +++--
>>>>>>>>>>> 2 files changed, 54 insertions(+), 46 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> index c8a34476570a..23bfbc61a494 100644
>>>>>>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay,
>>>>>>>>>>> drm_vblank_offdelay, int, 0600);
>>>>>>>>>>> module_param_named(timestamp_precision_usec,
>>>>>>>>>>> drm_timestamp_precision, int, 0600);
>>>>>>>>>>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic,
>>>>>>>>>>> int, 0600);
>>>>>>>>>>>
>>>>>>>>>>> +static void store_vblank(struct drm_device *dev, int crtc,
>>>>>>>>>>> + unsigned vblank_count_inc,
>>>>>>>>>>> + struct timeval *t_vblank)
>>>>>>>>>>> +{
>>>>>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
>>>>>>>>>>> + u32 tslot;
>>>>>>>>>>> +
>>>>>>>>>>> + assert_spin_locked(&dev->vblank_time_lock);
>>>>>>>>>>> +
>>>>>>>>>>> + if (t_vblank) {
>>>>>>>>>>> + tslot = vblank->count + vblank_count_inc;
>>>>>>>>>>> + vblanktimestamp(dev, crtc, tslot) = *t_vblank;
>>>>>>>>>>> + }
>>>>>>>>>>> +
>>>>>>>>>>> + /*
>>>>>>>>>>> + * vblank timestamp updates are protected on the write side
>>>>>>>>>>> with
>>>>>>>>>>> + * vblank_time_lock, but on the read side done locklessly
>>>>>>>>>>> using a
>>>>>>>>>>> + * sequence-lock on the vblank counter. Ensure correct
>>>>>>>>>>> ordering using
>>>>>>>>>>> + * memory barrriers. We need the barrier both before and also
>>>>>>>>>>> after the
>>>>>>>>>>> + * counter update to synchronize with the next timestamp write.
>>>>>>>>>>> + * The read-side barriers for this are in
>>>>>>>>>>> drm_vblank_count_and_time.
>>>>>>>>>>> + */
>>>>>>>>>>> + smp_wmb();
>>>>>>>>>>> + vblank->count += vblank_count_inc;
>>>>>>>>>>> + smp_wmb();
>>>>>>>>>>
>>>>>>>>>> The comment and the code are each self-contradictory.
>>>>>>>>>>
>>>>>>>>>> If vblank->count writes are always protected by vblank_time_lock
>>>>>>>>>> (something I
>>>>>>>>>> did not verify but that the comment above asserts), then the
>>>>>>>>>> trailing write
>>>>>>>>>> barrier is not required (and the assertion that it is in the comment
>>>>>>>>>> is incorrect).
>>>>>>>>>>
>>>>>>>>>> A spin unlock operation is always a write barrier.
>>>>>>>>>
>>>>>>>>> Hm yeah. Otoh to me that's bordering on "code too clever for my own
>>>>>>>>> good".
>>>>>>>>> That the spinlock is held I can assure. That no one goes around and
>>>>>>>>> does
>>>>>>>>> multiple vblank updates (because somehow that code raced with the hw
>>>>>>>>> itself) I can't easily assure with a simple assert or something
>>>>>>>>> similar.
>>>>>>>>> It's not the case right now, but that can changes.
>>>>>>>>
>>>>>>>> The algorithm would be broken if multiple updates for the same vblank
>>>>>>>> count were allowed; that's why it checks to see if the vblank count has
>>>>>>>> not advanced before storing a new timestamp.
>>>>>>>>
>>>>>>>> Otherwise, the read side would not be able to determine that the
>>>>>>>> timestamp is valid by double-checking that the vblank count has not
>>>>>>>> changed.
>>>>>>>>
>>>>>>>> And besides, even if the code looped without dropping the spinlock,
>>>>>>>> the correct write order would still be observed because it would still
>>>>>>>> be executing on the same cpu.
>>>>>>>>
>>>>>>>> My objection to the write memory barrier is not about optimization;
>>>>>>>> it's about correct code.
>>>>>>>
>>>>>>> Well diff=0 is not allowed, I guess I could enforce this with some
>>>>>>> WARN_ON. And I still think my point of non-local correctness is solid.
>>>>>>> With the smp_wmb() removed the following still works correctly:
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> But with the smp_wmb(); removed the following would be broken:
>>>>>>>
>>>>>>> spin_lock(vblank_time_lock);
>>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>>> spin_unlock(vblank_time_lock);
>>>>>>>
>>>>>>> because the compiler/cpu is free to reorder the store for vblank->count
>>>>>>> _ahead_ of the store for the timestamp. And that would trick readers
>>>>>>> into
>>>>>>> believing that they have a valid timestamp when they potentially raced.
>>>>>>>
>>>>>>> Now you're correct that right now there's no such thing going on, and
>>>>>>> it's
>>>>>>> unlikely to happen (given the nature of vblank updates). But my point is
>>>>>>> that if we optimize this then the correctness can't be proven locally
>>>>>>> anymore by just looking at store_vblank, but instead you must audit all
>>>>>>> the callers. And leaking locking/barriers like that is too fragile
>>>>>>> design
>>>>>>> for my taste.
>>>>>>>
>>>>>>> But you insist that my approach is broken somehow and dropping the
>>>>>>> smp_wmb
>>>>>>> is needed for correctness. I don't see how that's the case at all.
>>>>>
>>>>> Daniel,
>>>>>
>>>>> I've been really busy this last week; my apologies for not replying
>>>>> promptly.
>>>>>
>>>>>> Fwiw, i spent some time reeducating myself about memory barriers (thanks
>>>>>> for your explanations) and thinking about this, and the last version of
>>>>>> your patch looks good to me. It also makes sense to me to leave that
>>>>>> last smb_wmb() in place to make future use of the helper robust - for
>>>>>> non-local correctness, to avoid having to audit all future callers of
>>>>>> that helper.
>>>>>
>>>>> My concern wrt to unnecessary barriers in this algorithm is that the
>>>>> trailing
>>>>> barrier now appears mandatory, when in fact it is not.
>>>>>
>>>>> Moreover, this algorithm is, in general, fragile and not designed to
>>>>> handle
>>>>> random or poorly-researched changes.
>>>>
>>>> Less fragility is exactly why I want that surplus barrier. But I've run
>>>> out of new ideas for how to explain that ...
>>>>
>>>>> For example, if only the read and store operations are considered, it's
>>>>> obviously
>>>>> unsafe, since a read may unwittingly retrieve an store in progress.
>>>>>
>>>>>
>>>>> CPU 0 | CPU 1
>>>>> |
>>>>> /* vblank->count == 0 */
>>>>> |
>>>>> drm_vblank_count_and_time() | store_vblank(.., inc = 2, ...)
>>>>> |
>>>>> cur_vblank <= LOAD vblank->count |
>>>>> | tslot = vblank->count + 2
>>>>> | /* tslot == 2 */
>>>>> | STORE vblanktime[0]
>>>>
>>>> This line here is wrong, it should be "STORE vblanktime[2]"
>>>>
>>>> The "STORE vblanktime[0]" happened way earlier, before 2 smp_wmb and the
>>>> previous updating of vblank->count.
>>>
>>> &vblanktime[0] == &vblanktime[2]
>>>
>>> That's why I keep trying to explain you actually have to look at and
>>> understand the algorithm before blindly assuming local behavior is
>>> sufficient.
>>
>> Ok now I think I got it, the issue is when the array (which is only 2
>> elements big) wraps around. And that's racy because we don't touch the
>> increment before _and_ after the write side update. But that seems like a
>> bug that's always been there?
>
> I'm not sure if those conditions can actually occur; it's been a long time
> since I analyzed vblank timestamping.
>
>
They shouldn't occur under correct use. Normally one has to wrap any
call to drm_vblank_count() or drm_vblank_count_and_time() into a pair of
drm_vblank_get() -> query -> drm_vblank_put(). Only drm_vblank_get()
will call drm_update_vblank() on a refcount 0 -> 1 transition if vblanks
were previously off, and only that function bumps the count by more than
+1. Iow. the overflow case isn't executed in parallel with queries ->
problem avoided.
Proper _get()->query->put() sequence is given for queueing vblank
events, waiting for target vblank counts or during pageflip completion.
The one exception would be in Chris recently proposed "lockless instant
query" patch where a pure query is done - That patch that triggered
Daniels cleanup patch. I was just about to ok that one, and testing with
my timing tests and under normal use didn't show problems. There
drm_vblank_count_and_time() is used outside a get/put protected path.
I'm not sure if some race there could happen under realistic conditions.
-mario