https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106594
Roger Sayle <roger at nextmovesoftware dot com> changed:
What |Removed |Added
----------------------------------------------------------------------------
Last reconfirmed| |2022-08-12
Status|UNCONFIRMED |NEW
CC| |roger at nextmovesoftware dot com
Ever confirmed|0 |1
--- Comment #3 from Roger Sayle <roger at nextmovesoftware dot com> ---
Ah, interesting. Because index is a char, the tree-level optimizers realize
that the shift by 4 can be an 8-bit shift instead of an int-sized shift.
What's also interesting is that, because (x & 3) << 4 is used, the optimizers
realize that index can never be negative, so when the 8-bit index is
sign-extended in the array memory reference constellation_64qam[index], it is
effectively being zero-extended.
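To illustrate the point about the 8-bit index, here is a minimal C sketch. The helper names (make_index, index_sext, index_zext) are hypothetical and are not taken from the PR's test case; only the (x & 3) << 4 expression comes from the discussion above. It shows that the index is confined to 0, 16, 32 or 48, so widening it as signed or as unsigned gives the same 64-bit value.

```c
#include <stdint.h>

/* Hypothetical helper: builds the 8-bit index as (x & 3) << 4.
   The result is one of 0, 16, 32, 48, so bit 7 is always clear. */
int8_t make_index(uint32_t x)
{
    return (int8_t)((x & 3u) << 4);
}

/* Widen the 8-bit index to 64 bits with sign extension. */
int64_t index_sext(uint32_t x)
{
    return (int64_t)make_index(x);
}

/* Widen the same 8-bit index to 64 bits with zero extension. */
uint64_t index_zext(uint32_t x)
{
    return (uint64_t)(uint8_t)make_index(x);
}
```

Because the sign bit of the 8-bit value is known to be zero, either widening is a valid way to form the array offset.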
I think that the aarch64 backend needs to be taught that in this case (because
of the AND) the zero extension is the same as (can be implemented using) a
sign-extension, i.e. restoring the original code generation. Practically, the
sxtw;ldr[..lsl 2] above can legitimately be optimized to ldr[..sxtw 2] (or
ldr[..uxtw 2] like LLVM) in cases like this.
(sign_extend:DI (and:SI x (const_int 3))) and
(zero_extend:DI (and:SI x (const_int 3))) should (ideally) produce the exact
same code (the more efficient one, if two implementations are possible).
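As a sanity check of the equivalence claimed for the two RTL expressions, the following C sketch models each one at the source level. The function names are made up for illustration, and this only models the arithmetic, not what the backend actually emits. After the AND with 3, bit 31 of the SImode value is clear, so both extensions yield the same DImode bit pattern.

```c
#include <assert.h>
#include <stdint.h>

/* Models (sign_extend:DI (and:SI x (const_int 3))). */
int64_t sext_masked(uint32_t x)
{
    int32_t masked = (int32_t)(x & 3u);  /* value in 0..3, sign bit clear */
    return (int64_t)masked;              /* sign-extending conversion */
}

/* Models (zero_extend:DI (and:SI x (const_int 3))). */
uint64_t zext_masked(uint32_t x)
{
    uint32_t masked = x & 3u;            /* same masked value, unsigned */
    return (uint64_t)masked;             /* zero-extending conversion */
}

/* Returns 1 when both extensions give the same 64-bit pattern. */
int extensions_agree(uint32_t x)
{
    return (uint64_t)sext_masked(x) == zext_masked(x);
}

/* Spot-check a few inputs, including ones with bit 31 set. */
void check_extensions(void)
{
    assert(extensions_agree(0u));
    assert(extensions_agree(3u));
    assert(extensions_agree(0x80000003u));
    assert(extensions_agree(0xFFFFFFFFu));
}
```

Without the mask the two functions would differ for any input with bit 31 set, which is why the AND is what licenses treating the extensions as interchangeable.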