On Tue, Oct 30, 2007 at 10:31:17AM +0000, Andrew Haley wrote:
> Jakub Jelinek writes:
> > On Tue, Oct 30, 2007 at 10:20:34AM +0000, Andrew Haley wrote:
> > > That's what the proposed standard language says, kinda-sorta.  There's
> > > an informal description at
> > > http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2338.html.
> > >
> > > Anyway, we have fixed this bug and are committing it to all open gcc
> > > branches.  Credit to Ian Taylor for writing the patch.
> >
> > To be precise, it was fixed in one of the optimization passes, there
> > are still others (e.g. loop invariant motion).
>
> Ah, thanks for clarifying that.  Do you have a test case?

Sure, e.g. http://gcc.gnu.org/bugzilla/show_bug.cgi?id=31862#c7

Say on x86_64-linux, foo at -O2 on the trunk is:

foo:
        movl    v(%rip), %edx
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L3:
        cmpl    %eax, %edi
        cmovl   %esi, %edx
        addl    $1, %eax
        cmpl    $100, %eax
        jne     .L3
        movl    %edx, v(%rip)
        ret

(note that v is read and written unconditionally), which is a data race:
the program as written only reads or writes variable v when holding
the mutex m.  With -O2 -fno-tree-loop-im we got until yesterday:

foo:
        xorl    %edx, %edx
        .p2align 4,,10
        .p2align 3
.L3:
        movl    v(%rip), %eax
        cmpl    %edx, %edi
        cmovl   %esi, %eax
        addl    $1, %edx
        cmpl    $100, %edx
        movl    %eax, v(%rip)
        jne     .L3
        rep
        ret

which has a data race as well, but with today's trunk (thanks to Ian's
patch) we now finally get:

foo:
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
.L3:
        cmpl    %eax, %edi
        jge     .L2
        movl    %esi, v(%rip)
.L2:
        addl    $1, %eax
        cmpl    $100, %eax
        jne     .L3
        rep
        ret

at -O2 -fno-tree-loop-im, which doesn't have the data race.  Of course,
if the compiler were smart enough it could transform the loop into just

        if (x < 99) v = y;

but it's quite unlikely we'll be able to do that (and it is questionable
whether it would be a worthwhile optimization in the real world).

Anyway, -ftree-loop-im breaks this by adding v_lsm.12 below:

foo (x, y)
{
  int v_lsm.12;
  int i;

<bb 2>:
  v_lsm.12_11 = v;

<bb 3>:
  # v_lsm.12_1 = PHI <v_lsm.12_7(6), v_lsm.12_11(2)>
  # i_12 = PHI <i_5(6), 0(2)>
  if (x_3(D) < i_12) goto <bb 4>; else goto <bb 5>;

<bb 4>:
  v_lsm.12_10 = y_4(D);

<bb 5>:
  # v_lsm.12_7 = PHI <v_lsm.12_1(3), v_lsm.12_10(4)>
  i_5 = i_12 + 1;
  if (i_5 <= 99) goto <bb 6>; else goto <bb 7>;

<bb 6>:
  goto <bb 3>;

<bb 7>:
  # v_lsm.12_15 = PHI <v_lsm.12_7(5)>
  v = v_lsm.12_15;
  return;

}

which should be allowed only if v is not a global variable
(is_global_var) and its address has not been taken.

        Jakub
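
P.S. For anyone without the Bugzilla page at hand, here is a rough C sketch of what the dumps above correspond to.  It is an approximation, not the test case from the PR: the source loop is inferred from the assembly and the GIMPLE, the mutex m is left out, and the names foo_source, foo_store_motion and same_result are made up for illustration.  It only shows that the two forms agree when run single-threaded, which is why the transformation looks harmless to the optimizer.

```c
#include <assert.h>

/* Shared variable; in the program discussed above it is only
   accessed while holding a mutex (omitted in this sketch). */
int v;

/* Assumed shape of the source loop, inferred from the dumps:
   v is stored only when the condition holds. */
void foo_source(int x, int y)
{
    for (int i = 0; i < 100; i++)
        if (x < i)
            v = y;
}

/* Roughly what the v_lsm.12 temporary amounts to: v is loaded
   before the loop and stored back after it, whether or not the
   condition was ever true. */
void foo_store_motion(int x, int y)
{
    int v_lsm = v;              /* unconditional read of v */
    for (int i = 0; i < 100; i++)
        if (x < i)
            v_lsm = y;
    v = v_lsm;                  /* unconditional write of v */
}

/* Returns 1 if both variants leave the same value in v when run
   single-threaded from the same initial value. */
int same_result(int x, int y, int initial)
{
    v = initial;
    foo_source(x, y);
    int a = v;

    v = initial;
    foo_store_motion(x, y);
    int b = v;

    return a == b;
}
```

With x >= 99 foo_source never touches v at all, while foo_store_motion still reads it and writes the old value back.  If another thread updates v between that read and that write, its update is lost, even though the program as written had no race.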