On 05/06/2015 04:56 AM, Daniel Vetter wrote:
> On Tue, May 05, 2015 at 11:57:42AM -0400, Peter Hurley wrote:
>> On 05/05/2015 11:42 AM, Daniel Vetter wrote:
>>> On Tue, May 05, 2015 at 10:36:24AM -0400, Peter Hurley wrote:
>>>> On 05/04/2015 12:52 AM, Mario Kleiner wrote:
>>>>> On 04/16/2015 03:03 PM, Daniel Vetter wrote:
>>>>>> On Thu, Apr 16, 2015 at 08:30:55AM -0400, Peter Hurley wrote:
>>>>>>> On 04/15/2015 01:31 PM, Daniel Vetter wrote:
>>>>>>>> On Wed, Apr 15, 2015 at 09:00:04AM -0400, Peter Hurley wrote:
>>>>>>>>> Hi Daniel,
>>>>>>>>>
>>>>>>>>> On 04/15/2015 03:17 AM, Daniel Vetter wrote:
>>>>>>>>>> This was a bit too much cargo-culted, so lets make it solid:
>>>>>>>>>> - vblank->count doesn't need to be an atomic, writes are always done
>>>>>>>>>> under the protection of dev->vblank_time_lock. Switch to an
>>>>>>>>>> unsigned
>>>>>>>>>> long instead and update comments. Note that atomic_read is just a
>>>>>>>>>> normal read of a volatile variable, so no need to audit all the
>>>>>>>>>> read-side access specifically.
>>>>>>>>>>
>>>>>>>>>> - The barriers for the vblank counter seqlock weren't complete: The
>>>>>>>>>> read-side was missing the first barrier between the counter read
>>>>>>>>>> and
>>>>>>>>>> the timestamp read, it only had a barrier between the ts and the
>>>>>>>>>> counter read. We need both.
>>>>>>>>>>
>>>>>>>>>> - Barriers weren't properly documented. Since barriers only work if
>>>>>>>>>> you have them on boths sides of the transaction it's prudent to
>>>>>>>>>> reference where the other side is. To avoid duplicating the
>>>>>>>>>> write-side comment 3 times extract a little store_vblank() helper.
>>>>>>>>>> In that helper also assert that we do indeed hold
>>>>>>>>>> dev->vblank_time_lock, since in some cases the lock is acquired a
>>>>>>>>>> few functions up in the callchain.
>>>>>>>>>>
>>>>>>>>>> Spotted while reviewing a patch from Chris Wilson to add a fastpath
>>>>>>>>>> to
>>>>>>>>>> the vblank_wait ioctl.
>>>>>>>>>>
>>>>>>>>>> Cc: Chris Wilson <chris at chris-wilson.co.uk>
>>>>>>>>>> Cc: Mario Kleiner <mario.kleiner.de at gmail.com>
>>>>>>>>>> Cc: Ville Syrjälä <ville.syrjala at linux.intel.com>
>>>>>>>>>> Cc: Michel Dänzer <michel at daenzer.net>
>>>>>>>>>> Signed-off-by: Daniel Vetter <daniel.vetter at intel.com>
>>>>>>>>>> ---
>>>>>>>>>> drivers/gpu/drm/drm_irq.c | 92
>>>>>>>>>> ++++++++++++++++++++++++-----------------------
>>>>>>>>>> include/drm/drmP.h | 8 +++--
>>>>>>>>>> 2 files changed, 54 insertions(+), 46 deletions(-)
>>>>>>>>>>
>>>>>>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>> index c8a34476570a..23bfbc61a494 100644
>>>>>>>>>> --- a/drivers/gpu/drm/drm_irq.c
>>>>>>>>>> +++ b/drivers/gpu/drm/drm_irq.c
>>>>>>>>>> @@ -74,6 +74,33 @@ module_param_named(vblankoffdelay,
>>>>>>>>>> drm_vblank_offdelay, int, 0600);
>>>>>>>>>> module_param_named(timestamp_precision_usec,
>>>>>>>>>> drm_timestamp_precision, int, 0600);
>>>>>>>>>> module_param_named(timestamp_monotonic, drm_timestamp_monotonic,
>>>>>>>>>> int, 0600);
>>>>>>>>>>
>>>>>>>>>> +static void store_vblank(struct drm_device *dev, int crtc,
>>>>>>>>>> + unsigned vblank_count_inc,
>>>>>>>>>> + struct timeval *t_vblank)
>>>>>>>>>> +{
>>>>>>>>>> + struct drm_vblank_crtc *vblank = &dev->vblank[crtc];
>>>>>>>>>> + u32 tslot;
>>>>>>>>>> +
>>>>>>>>>> + assert_spin_locked(&dev->vblank_time_lock);
>>>>>>>>>> +
>>>>>>>>>> + if (t_vblank) {
>>>>>>>>>> + tslot = vblank->count + vblank_count_inc;
>>>>>>>>>> + vblanktimestamp(dev, crtc, tslot) = *t_vblank;
>>>>>>>>>> + }
>>>>>>>>>> +
>>>>>>>>>> + /*
>>>>>>>>>> + * vblank timestamp updates are protected on the write side with
>>>>>>>>>> + * vblank_time_lock, but on the read side done locklessly using
>>>>>>>>>> a
>>>>>>>>>> + * sequence-lock on the vblank counter. Ensure correct ordering
>>>>>>>>>> using
>>>>>>>>>> + * memory barrriers. We need the barrier both before and also
>>>>>>>>>> after the
>>>>>>>>>> + * counter update to synchronize with the next timestamp write.
>>>>>>>>>> + * The read-side barriers for this are in
>>>>>>>>>> drm_vblank_count_and_time.
>>>>>>>>>> + */
>>>>>>>>>> + smp_wmb();
>>>>>>>>>> + vblank->count += vblank_count_inc;
>>>>>>>>>> + smp_wmb();
>>>>>>>>>
>>>>>>>>> The comment and the code are each self-contradictory.
>>>>>>>>>
>>>>>>>>> If vblank->count writes are always protected by vblank_time_lock
>>>>>>>>> (something I
>>>>>>>>> did not verify but that the comment above asserts), then the trailing
>>>>>>>>> write
>>>>>>>>> barrier is not required (and the assertion that it is in the comment
>>>>>>>>> is incorrect).
>>>>>>>>>
>>>>>>>>> A spin unlock operation is always a write barrier.
>>>>>>>>
>>>>>>>> Hm yeah. Otoh to me that's bordering on "code too clever for my own
>>>>>>>> good".
>>>>>>>> That the spinlock is held I can assure. That no one goes around and
>>>>>>>> does
>>>>>>>> multiple vblank updates (because somehow that code raced with the hw
>>>>>>>> itself) I can't easily assure with a simple assert or something
>>>>>>>> similar.
>>>>>>>> It's not the case right now, but that can changes.
>>>>>>>
>>>>>>> The algorithm would be broken if multiple updates for the same vblank
>>>>>>> count were allowed; that's why it checks to see if the vblank count has
>>>>>>> not advanced before storing a new timestamp.
>>>>>>>
>>>>>>> Otherwise, the read side would not be able to determine that the
>>>>>>> timestamp is valid by double-checking that the vblank count has not
>>>>>>> changed.
>>>>>>>
>>>>>>> And besides, even if the code looped without dropping the spinlock,
>>>>>>> the correct write order would still be observed because it would still
>>>>>>> be executing on the same cpu.
>>>>>>>
>>>>>>> My objection to the write memory barrier is not about optimization;
>>>>>>> it's about correct code.
>>>>>>
>>>>>> Well diff=0 is not allowed, I guess I could enforce this with some
>>>>>> WARN_ON. And I still think my point of non-local correctness is solid.
>>>>>> With the smp_wmb() removed the following still works correctly:
>>>>>>
>>>>>> spin_lock(vblank_time_lock);
>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>> spin_unlock(vblank_time_lock);
>>>>>>
>>>>>> spin_lock(vblank_time_lock);
>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>> spin_unlock(vblank_time_lock);
>>>>>>
>>>>>> But with the smp_wmb(); removed the following would be broken:
>>>>>>
>>>>>> spin_lock(vblank_time_lock);
>>>>>> store_vblank(dev, crtc, 1, ts1);
>>>>>> store_vblank(dev, crtc, 1, ts2);
>>>>>> spin_unlock(vblank_time_lock);
>>>>>>
>>>>>> because the compiler/cpu is free to reorder the store for vblank->count
>>>>>> _ahead_ of the store for the timestamp. And that would trick readers into
>>>>>> believing that they have a valid timestamp when they potentially raced.
>>>>>>
>>>>>> Now you're correct that right now there's no such thing going on, and
>>>>>> it's
>>>>>> unlikely to happen (given the nature of vblank updates). But my point is
>>>>>> that if we optimize this then the correctness can't be proven locally
>>>>>> anymore by just looking at store_vblank, but instead you must audit all
>>>>>> the callers. And leaking locking/barriers like that is too fragile design
>>>>>> for my taste.
>>>>>>
>>>>>> But you insist that my approach is broken somehow and dropping the
>>>>>> smp_wmb
>>>>>> is needed for correctness. I don't see how that's the case at all.
>>>>
>>>> Daniel,
>>>>
>>>> I've been really busy this last week; my apologies for not replying
>>>> promptly.
>>>>
>>>>> Fwiw, i spent some time reeducating myself about memory barriers (thanks
>>>>> for your explanations) and thinking about this, and the last version of
>>>>> your patch looks good to me. It also makes sense to me to leave that last
>>>>> smb_wmb() in place to make future use of the helper robust - for
>>>>> non-local correctness, to avoid having to audit all future callers of
>>>>> that helper.
>>>>
>>>> My concern wrt to unnecessary barriers in this algorithm is that the
>>>> trailing
>>>> barrier now appears mandatory, when in fact it is not.
>>>>
>>>> Moreover, this algorithm is, in general, fragile and not designed to handle
>>>> random or poorly-researched changes.
>>>
>>> Less fragility is exactly why I want that surplus barrier. But I've run
>>> out of new ideas for how to explain that ...
>>>
>>>> For example, if only the read and store operations are considered, it's
>>>> obviously
>>>> unsafe, since a read may unwittingly retrieve an store in progress.
>>>>
>>>>
>>>> CPU 0 | CPU 1
>>>> |
>>>> /* vblank->count == 0 */
>>>> |
>>>> drm_vblank_count_and_time() | store_vblank(.., inc = 2, ...)
>>>> |
>>>> cur_vblank <= LOAD vblank->count |
>>>> | tslot = vblank->count + 2
>>>> | /* tslot == 2 */
>>>> | STORE vblanktime[0]
>>>
>>> This line here is wrong, it should be "STORE vblanktime[2]"
>>>
>>> The "STORE vblanktime[0]" happened way earlier, before 2 smp_wmb and the
>>> previous updating of vblank->count.
>>
>> &vblanktime[0] == &vblanktime[2]
>>
>> That's why I keep trying to explain you actually have to look at and
>> understand the algorithm before blindly assuming local behavior is
>> sufficient.
>
> Ok now I think I got it, the issue is when the array (which is only 2
> elements big) wraps around. And that's racy because we don't touch the
> increment before _and_ after the write side update. But that seems like a
> bug that's always been there?
I'm not sure if those conditions can actually occur; it's been a long time
since I analyzed vblank timestamping.