https://bugs.kde.org/show_bug.cgi?id=360557

--- Comment #3 from Philippe Waroquiers <philippe.waroqui...@skynet.be> ---
(In reply to Ari Sundholm from comment #2)
> This is not the case, as, per standard semantics for condition variables,
> pthread_cond_wait re-acquires the mutex before returning to the caller.
Yes, you are correct. I stopped reading the pthread_cond_wait manual too
quickly :(.

So, here is another (desperate?) attempt to explain the helgrind behaviour.

I have modified the program to comment out the destroy and delete of B_cond and
B_lock, so the condition variables and locks are never destroyed or deleted.
Helgrind can then still show which locks were held, instead of reporting
"(and 1 that can't be shown)".

This showed that the read in fact held the A lock and one B_lock,
but not the same B_lock as the write operation.

This leads to the hypothesis that the memory of B might be freed and then
reallocated fast enough (getting another lock) that helgrind effectively
reports some kind of race condition.
Or, asked differently: how does the program ensure that no thread is waiting
on a lock of B while, in parallel, another thread is deleting b?
For sure, the delete &b; is done after the unlock of B_lock,
but I guess that the A lock is what ensures there is no race on B while
inserting or deleting B.
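
To make the question concrete, here is a minimal sketch of one way a program
could guarantee that nobody still uses a B when it is deleted. It is not the
reproducer's code: the classes A and B, the methods write/erase and the
stress helper are all hypothetical names, and std::mutex stands in for the
pthread locks. The idea is to take the B lock while the A lock is still held,
and to acquire and release the B lock once more before the delete.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the reproducer's B: a payload with its own lock.
struct B {
    std::mutex lock;  // plays the role of "B_lock"
    int value = 0;
};

// Hypothetical stand-in for A: a container lock protecting the set of B's.
class A {
public:
    ~A() {
        for (auto& kv : items_) delete kv.second;
    }

    // Create-or-update. The B lock is taken while the A lock is still held,
    // so erase() cannot unlink and free the object in between.
    void write(int key, int v) {
        std::unique_lock<std::mutex> a(a_lock_);
        B*& slot = items_[key];
        if (slot == nullptr) slot = new B;
        B* b = slot;
        std::unique_lock<std::mutex> bl(b->lock);
        a.unlock();  // from here on only the B lock is held
        b->value = v;
    }

    // Unlink under the A lock, then wait out any thread still inside
    // write() by acquiring and releasing the B lock before the delete.
    bool erase(int key) {
        B* b = nullptr;
        {
            std::lock_guard<std::mutex> a(a_lock_);
            auto it = items_.find(key);
            if (it == items_.end()) return false;
            b = it->second;
            items_.erase(it);
        }
        assert(b != nullptr);
        { std::lock_guard<std::mutex> bl(b->lock); }
        delete b;
        return true;
    }

    std::size_t size() {
        std::lock_guard<std::mutex> a(a_lock_);
        return items_.size();
    }

private:
    std::mutex a_lock_;
    std::map<int, B*> items_;
};

// Even-numbered threads write key 0, odd-numbered threads erase it.
void stress(A& a, int threads, int rounds) {
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&a, t, rounds] {
            for (int i = 0; i < rounds; ++i) {
                if (t % 2 == 0) a.write(0, i);
                else a.erase(0);
            }
        });
    }
    for (auto& th : pool) th.join();
}
```

If the reproducer instead drops the A lock before taking the B lock, a thread
could end up waiting on the lock of a B that another thread is about to free,
which is the situation asked about above. Whether that is what actually
happens there is only a guess.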

In any case, here is a little bit of tracing produced by --trace-malloc=yes
(this is the modified version that does not destroy B_cond/B_lock):
--21045-- _ZdlPv(0x4B780E8)
--21045-- _Znwj(16) = 0x4B780E8 <<<<< allocation of a B
--21045-- _Znwj(24) = 0x4B7DB58 <<<<< allocation of new lock for B 
--21045-- _Znwj(48) = 0x4B7DBA0
--21045-- _Znwj(24) = 0x4B7DC00
--21045-- _ZdlPv(0x4B7DC00)
--21045-- _ZdlPv(0x4B780E8)     <<<<< free previous B
--21045-- calloc(18,8) = 0x4B7DC00
--21045-- _Znwj(16) = 0x4B780E8 <<<<< allocation of a B (reallocating just freed block)
--21045-- _Znwj(24) = 0x4B7DCC0 <<<<< allocation of new lock for B
--21045-- _Znwj(48) = 0x4B7DD08
--21045-- _Znwj(24) = 0x4B7DD68
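
The address reuse visible in this trace (0x4B780E8 is freed and handed out
again by the next 16-byte allocation) can be observed with a minimal sketch
like the following. The struct B and the function alloc_free_alloc are
hypothetical names, and whether the two addresses are equal depends entirely
on the allocator, so nothing here is guaranteed.

```cpp
#include <cstdint>
#include <cstdio>

// Hypothetical payload, 16 bytes on common ABIs, like the B in the trace.
struct B {
    int fields[4];
};

struct ReuseResult {
    std::uintptr_t first;   // address returned by the first new
    std::uintptr_t second;  // address returned by the new after the delete
};

// Allocate, free, allocate again, and record both addresses.
// The addresses are saved as integers before each delete, so no freed
// pointer is ever inspected or dereferenced.
ReuseResult alloc_free_alloc() {
    B* p = new B();
    std::uintptr_t first = reinterpret_cast<std::uintptr_t>(p);
    delete p;

    B* q = new B();
    std::uintptr_t second = reinterpret_cast<std::uintptr_t>(q);
    delete q;

    return ReuseResult{first, second};
}

// Print whether the allocator handed the freed block out again.
void report_reuse() {
    ReuseResult r = alloc_free_alloc();
    std::printf("first=%#llx second=%#llx %s\n",
                static_cast<unsigned long long>(r.first),
                static_cast<unsigned long long>(r.second),
                r.first == r.second ? "(block reused)" : "(different block)");
}
```

When a block is reused this quickly, a new B and its new lock can live at the
address an earlier B occupied, which would fit the read holding a different
B_lock than the write.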


When using the helgrind option --free-is-write=yes, helgrind reports a race
condition between a write and a previous write, where the previous write is
the delete/free operation:
==21164== Possible data race during write of size 4 at 0x4B783EC by thread #7
==21164== Locks held: 1, at address 0x4B78698
==21164==    at 0x8049178: A::method1(int) (helgrind_bug_reproducer.orig.cpp:36)
==21164==    by 0x8048E06: thread(void*) (helgrind_bug_reproducer.orig.cpp:137)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164== 
==21164== This conflicts with a previous write of size 4 by thread #11
==21164== Locks held: 1, at address 0x804CBD8
==21164==    at 0x402A868: operator delete(void*) (vg_replace_malloc.c:576)
==21164==    by 0x80492AD: A::method2(int) (helgrind_bug_reproducer.orig.cpp:65)
==21164==    by 0x8048E30: thread(void*) (helgrind_bug_reproducer.orig.cpp:139)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164==  Address 0x4b783ec is 4 bytes inside a block of size 16 alloc'd
==21164==    at 0x40299F6: operator new(unsigned int) (vg_replace_malloc.c:328)
==21164==    by 0x80490D3: A::method1(int) (helgrind_bug_reproducer.orig.cpp:23)
==21164==    by 0x8048E06: thread(void*) (helgrind_bug_reproducer.orig.cpp:137)
==21164==    by 0x402DA76: mythread_wrapper (hg_intercepts.c:389)
==21164==    by 0x405BEFA: start_thread (pthread_create.c:309)
==21164==    by 0x42B2EDD: clone (clone.S:129)
==21164==  Block was alloc'd by thread #9

So, we have 2 indications that the problem might be related to the way the
memory of the B objects is allocated/freed/reallocated.

So this might be a bug in the way helgrind handles allocate/free/reallocate.
It might also be a real race in the program, related to memory management, but
it is not very clear how this could happen given the A lock.

I am also wondering about the limitations helgrind has with condition
variables (see the helgrind user manual).

So, this helgrind error is not very clear to me. It looks like we need a real
expert to look at this :)
