https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94649

Yongwei Wu <wuyongwei at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wuyongwei at gmail dot com

--- Comment #3 from Yongwei Wu <wuyongwei at gmail dot com> ---
Is there really a valid use case for a non-lock-free version of 128-bit CAS?

I am using it in a lock-free data structure. The GCC-generated code is MUCH
slower than the mutex-based version, which defeats the purpose of going
lock-free: I am talking about a 10x difference. The Clang-generated code is
more than 200x faster in my 8-thread contention test.

To me, the current GCC behaviour is not a missed optimization; it is a
pessimization. I am having a difficult time understanding the rationale for
the current design.
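
For illustration only (this is not code from the report): a minimal C++17
sketch of how one might check at compile time whether a 16-byte atomic is
lock-free. TaggedPtr and tagged_ptr_always_lock_free are hypothetical names,
and the pointer-plus-counter layout is only an assumption about what a
128-bit CAS is typically used for. The result depends on the compiler, its
version, and target flags such as -mcx16 on x86-64.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical 16-byte value: a pointer plus a version counter, a common
// shape for values updated with a double-width CAS.
struct alignas(16) TaggedPtr {
    void*         ptr;
    std::uint64_t tag;
};

static_assert(sizeof(TaggedPtr) == 16, "expected a 128-bit value");

// Compile-time query (C++17): true only if std::atomic<TaggedPtr> is
// guaranteed never to fall back to a lock on this target.
constexpr bool tagged_ptr_always_lock_free()
{
    return std::atomic<TaggedPtr>::is_always_lock_free;
}
```

A lock-free container could guard itself with
static_assert(tagged_ptr_always_lock_free(), "...") so that a build fails
instead of silently using a slower fallback.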
