https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63464
Jakub Jelinek <jakub at gcc dot gnu.org> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                CC|                            |uros at gcc dot gnu.org
--- Comment #14 from Jakub Jelinek <jakub at gcc dot gnu.org> ---
Patch committed. There is still a remaining issue: e.g. on the vrp66.c
testcase we often get code like:
_120 = 3314649325744685057 >> _119;
_121 = _120 & 1;
_25 = _121 ^ 1;
_122 = (_Bool) _25;
if (_122 != 0)
when actually
_120 = 3314649325744685057 >> _119;
_121 = _120 & 1;
if (_121 == 0)
would be enough and shorter. And it isn't just a GIMPLE preference; it
actually shows up in the generated code. On vrp66.c on x86_64 at -O2, I'm
seeing 24 btq instructions (0 before the optimization) and 22 shrq
instructions (again, 0 before), where the assembly typically looks like:
movabsq $-9223372032543031257, %rax
movl %edi, %ecx
shrq %cl, %rax
andl $1, %eax
xorq $1, %rax
andl $1, %eax
i.e. pretty much the same nonsense. At least the last andl $1 is redundant:
the first andl ensures %eax is 0 or 1, the xorq turns that into 1 or 0, and
so the second andl is useless. It might be nicer if the code used btq/setc
instead.
So, presumably there is something we want to match-and-simplify, perhaps also
something we want to simplify at the RTL level, and we should check whether
bt+set{,n}c might be beneficial here compared to the shift + and + xor.
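As a concrete target for such a simplification, the two source-level forms involved would look roughly like this (a sketch; the function names are mine, and the comments describe the intended lowering, not guaranteed output):

```c
#include <stdint.h>

/* The plain single-bit test: the form that can be lowered
   to a btq bit test.  */
int bit_set(uint64_t x, unsigned n)
{
    return (x >> n) & 1;
}

/* The negated form from this PR, which ideally would become
   btq + set{,n}c instead of shr + and + xor + and.  */
int bit_clear(uint64_t x, unsigned n)
{
    return (((x >> n) & 1) ^ 1) != 0;
}
```

The two functions are exact complements of each other for every bit position, which is what makes the xor form a candidate for folding into a single bit test with the inverted condition.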