https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92462

Wilco <wilco at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilco at gcc dot gnu.org

--- Comment #1 from Wilco <wilco at gcc dot gnu.org> ---
(In reply to Aleksei Voitylov from comment #0)
> Created attachment 47212 [details]
> reduced testcase from the openjdk sources
> 
> While compiling openjdk sources I bumped into a bug: when -ftree-pre is
> enabled the optimizer hoists out reload of a variable which subsequently
> leads to the infinite loop situation.
> 
> Below is the relevant piece of code and "new_value" is the variable that
> gets hoisted out.
> 
> template<typename T>
> inline T Atomic::CmpxchgByteUsingInt::operator()(T exchange_value,
>         T volatile* dest,
>         T compare_value,
>         atomic_memory_order order) const {
>     printf ("Atomic::CmpxchgByteUsingInt::operator: 1: %d, 2: %d\n",
> exchange_value, compare_value);
>     uint8_t canon_exchange_value = exchange_value;
>     uint8_t canon_compare_value = compare_value;
>     volatile uint32_t* aligned_dest
>         = reinterpret_cast<volatile uint32_t*>(align_down(dest,
> sizeof(uint32_t)));
>     size_t offset = (uintptr_t)dest  - (uintptr_t)aligned_dest;
>     uint32_t cur = *aligned_dest;
>     uint8_t* cur_as_bytes = reinterpret_cast<uint8_t*>(&cur);
>     cur_as_bytes[offset] = canon_compare_value;
>     do {
>         uint32_t new_value = cur;
>         reinterpret_cast<uint8_t*>(&new_value)[offset] =
> canon_exchange_value;
>         printf ("Atomic::CmpxchgByteUsingInt::operator2: 1: %d, 2: %d\n",
> new_value, cur);
>         uint32_t res = cmpxchg(new_value, aligned_dest, cur, order);
>         if (res == cur) break;
>         cur = res;
>     } while (cur_as_bytes[offset] == canon_compare_value);
>     return PrimitiveConversions::cast<T>(cur_as_bytes[offset]);
> }
> 
> $ g++ -O1 -ftree-pre t.cpp
> $ ./a.out
> Atomic::CmpxchgByteUsingInt::operator: 1: 0, 2: 1
> Atomic::CmpxchgByteUsingInt::operator2: 1: 0, 2: 0
> 
> $ g++ -O1 t.cpp
> $ ./a.out
> Atomic::CmpxchgByteUsingInt::operator: 1: 0, 2: 1
> Atomic::CmpxchgByteUsingInt::operator2: 1: 0, 2: 256
> 
> Below is the assembler of the loop for the correct version:
> 
> .L7:
>         ldr     r4, [sp]
>         str     r4, [sp, #4]
>         strb    r7, [r5, #-4]
>         mov     r2, r4
>         ldr     r1, [sp, #4]
>         mov     r0, r6
>         bl      printf
>         cbz     r4, .L6
>         movs    r3, #0
>         str     r3, [sp]
>         ldrb    r3, [r8]        @ zero_extendqisi2
>         cmp     r3, #1
>         beq     .L7
> 
> and for the incorrect one:
> 
> .L7:
>         str     r4, [sp, #4]
>         strb    r8, [r6]
>         mov     r2, r4
>         ldr     r1, [sp, #4]
>         mov     r0, r5
>         bl      printf
>         cbz     r4, .L6
>         movs    r4, #0
>         str     r4, [sp]
>         ldrb    r3, [r7]        @ zero_extendqisi2
>         cmp     r3, #1
>         beq     .L7

There are serious aliasing bugs in the source - GCC is quite correct in
assuming that cur and cur_as_bytes[offset] never alias (obviously) and even
optimize away the cmpxchg (no idea why, that appears wrong).

Even if you fix the aliasing bugs, it won't emulate a byte-oriented cmpxchg
correctly, there are bugs in the logic too.

So I suggest to go back to the drawing board - you can't hack your own atomic
operations and just hope for the best. GCC supports a standard set of atomic
operations for a good reason!

Reply via email to