https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417
--- Comment #3 from Federico Kircheis <federico.kircheis at gmail dot com> ---

Thank you for the analysis, explanation and references; I did not think of testing std::lock directly. I'm still unsure whether that means it is a bug in valgrind; unfortunately I do not know of any other third-party tool for doing this kind of verification.

I also noticed that I made an error in my initial report. The second snippet should have been

-----
int main(){
    std::mutex m1;
    std::mutex m2;
    int data1{};
    int data2{};
    auto f1 = std::async(std::launch::async, [&](){
        std::lock_guard lg1{m1};
        std::lock_guard lg2{m2};
        ++data1;
        ++data2;
        return data1;
    });
    auto f2 = std::async(std::launch::async, [&](){
        std::lock_guard lg2{m2};
        std::lock_guard lg1{m1};
        ++data1;
        ++data2;
        return data2;
    });
    return f1.get() + f2.get(); // result should be 3
}
-----

otherwise it does not generate any warning in valgrind, as the lock order is the same in both lambdas.

> Also, I'm not even sure what the helgrind error really means.

From the helgrind manual: https://www.valgrind.org/docs/manual/hg-manual.html

----
Helgrind monitors the order in which threads acquire locks. This allows it to detect potential deadlocks which could arise from the formation of cycles of locks. Detecting such inconsistencies is useful because, whilst actual deadlocks are fairly obvious, potential deadlocks may never be discovered during testing and could later lead to hard-to-diagnose in-service failures.
----

> Just because they were locked in one order at one time doesn't establish an
> invariant that they must always be locked in that order elsewhere in the
> program.

I'm not an expert; I made the report because I tried out valgrind on a bigger program and noticed it also complained about this example. To avoid deadlocks, shouldn't mutexes always be locked in the same order? Otherwise thread1 locks the first mutex, thread2 locks the second, and now both threads are waiting to lock the remaining one.
My guess, without having looked at the implementation of std::lock, is that the algorithm uses try_lock to lock and, if necessary, unlock the mutexes, retrying until it manages to lock both, in order to avoid the deadlock. And I suppose this algorithm somehow also manages to avoid livelocks.

But even if std::lock is correct (and this is a false positive of valgrind), wouldn't sorting by address be an optimization? As far as I understand, if the mutexes are always locked in the same order, one does not need try_lock at all. At the very least it would avoid the dance of both threads repeatedly locking and unlocking, since the first one that arrives manages to lock both. And if someone else locked them somewhere else in a different order, the current algorithm would still apply.

I'm therefore not saying the current algorithm should be dismissed, just asking whether ordering the mutexes before applying the algorithm could have any benefit, apart from the fact that helgrind would not complain anymore, which IMHO would already be a big win. Maybe there are known drawbacks to sorting by address, but since mutexes cannot be moved, the output of the sort is always the same, and I suppose that sorting two pointers should cost very little, especially compared to locking.