https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89417

            Bug ID: 89417
           Summary: helgrind detects a lock order violation inside
                    std::scoped_lock
           Product: gcc
           Version: 8.2.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: libgcc
          Assignee: unassigned at gcc dot gnu.org
          Reporter: federico.kircheis at gmail dot com
  Target Milestone: ---

Created attachment 45776
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45776&action=edit
helgrind log output

Hello,

Im unsure if this is a bug or feature request, depending on who is "wrong"
(g++/valgrind)

I'm using g++ (Debian 8.2.0-14) 8.2.0, and the program is compiled with
following flags: "-std=c++17 -lpthread"

Consider following snippet of code:

----
int main(){
        std::mutex m1;
        std::mutex m2;
        int data1{};
        int data2{};
        auto f1 = std::async(std::launch::async, [&](){
                std::scoped_lock sl{m1, m2};
                ++data1;
                ++data2;
                return data1;
        });
    auto f2 = std::async(std::launch::async, [&](){
                std::scoped_lock sl{m2, m1};
                ++data1;
                ++data2;
                return data2;
    });
    return f1.get() + f2.get(); // result should be 3
}
----

helgrind (valgrind-3.14.0) reports that the lock order is violated: error
message attached)


it prints exactly the same error for

----
int main(){
        std::mutex m1;
        std::mutex m2;
        int data1{};
        int data2{};
        auto f1 = std::async(std::launch::async, [&](){
                std::lock_guard lg1{m1};std::lock_guard lg2{m2};
                ++data1;
                ++data2;
                return data1;
        });
    auto f2 = std::async(std::launch::async, [&](){
                std::lock_guard lg1{m1};std::lock_guard lg2{m2};
                ++data1;
                ++data2;
                return data2;
    });
    return f1.get() + f2.get(); // result should be 3
}
----

In case helgrind is correct, it seems that there are some issues behind
std::scoped_lock, since it was explicitly designed for solving issues with lock
order.

In case helgrind (and possibly for the same reason other tools) is wrong, this
is would be a feature request.


A possible fix (or improvement) would be for `std::scoped_lock` to sort its
arguments by address (since std::mutex are not copyable or movable, and thus
their address should remain constant):

----
auto make_lock(std::mutex& m1_, std::mutex& m2_) {
        const auto mless = std::less<std::mutex*>{};
        return std::scoped_lock{*std::min(&m1_, &m2_, mless), *std::max(&m1_,
&m2_, mless)}; 
}

int main(){
        std::mutex m1;
        std::mutex m2;
        int data1{};
        int data2{};
        auto f1 = std::async(std::launch::async, [&](){
                auto sl = make_lock(m1,m2);
                ++data1;
                ++data2;
                return data1;
        });
    auto f2 = std::async(std::launch::async, [&](){
                auto sl = make_lock(m2,m1);
                ++data1;
                ++data2;
                return data2;
    });
    return f1.get() + f2.get(); // result should be 3
}
----

in this case, helgrind does not generate any warning.
I do not know the internal algorithm of `std::scoped_lock`, so it might be
completely wrong, but sorting by address might also avoid possible live-lock
issues.

Reply via email to