mal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: joern at purestorage dot com
Target Milestone: ---
Created attachment 45945
--> https://gcc.gnu.org/bugzilla/attachment.cgi?id=45945&action=edit
matchlen testcase extracted from lz
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=89670
--- Comment #2 from Jörn Engel ---
The input is 32. Does the "undefined-if-zero" thing give gcc license to remove
code depending on the output? If it does, why is the code only removed when
comparing against 31/32, not when comparing against 30?
--- Comment #4 from Jörn Engel ---
Fair enough. That means the only way to get tzcnt without a conditional is by
using inline asm. Annoying, but something I can work with.
Annoying because for CPUs with BMI1, tzcnt is well-defined and I explic
--- Comment #6 from Jörn Engel ---
True for one, but not the other.
return mask ? __builtin_ctz(mask) : 32;
    1099:  83 f6 ff    xor    $0xffffffff,%esi
    109c:  74 47       je     10e5
109e:
--- Comment #8 from Jörn Engel ---
Updated testcase below fails to remove the branch with my gcc-8.
/*
* usage:
* gcc -std=gnu11 -Wall -Wextra -g -march=core-avx2 -mbmi -fPIC -O3 % &&
./a.out < /dev/zero
*/
#include
#include
#include
#incl
--- Comment #11 from Jörn Engel ---
I stand corrected. Thank you very much!
Out of curiosity, if the only non-broken way to call __builtin_ctz(foo) is via
"foo ? __builtin_ctz(foo) : 32", why isn't the conditional moved into
__builtin_ctz()? I
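
For reference, the guarded form quoted above can be wrapped in a small helper. This is only a sketch: the name ctz32_or_32 is hypothetical and not taken from the attached testcase, and whether the compiler folds the conditional into a single tzcnt depends on the gcc version and on target flags such as -mbmi.

```c
#include <stdint.h>

/* Hypothetical helper (name is an assumption, not from the testcase).
 * __builtin_ctz() has an undefined result for a zero argument, so the
 * zero case is handled explicitly and mapped to 32, the bit width of
 * the operand. */
static inline unsigned ctz32_or_32(uint32_t mask)
{
    return mask ? (unsigned)__builtin_ctz(mask) : 32u;
}
```

Called with 0 this returns 32; called with 8 it returns 3, the index of the lowest set bit.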
--- Comment #13 from Jörn Engel ---
None of those examples convince me. If you or I know that a zero argument is
impossible, but the compiler doesn't know, wouldn't that still be UB? And if
the compiler knows, it can remove the branch either wa
--- Comment #15 from Jörn Engel ---
> int foo (int x) { return __builtin_ctz (x); }
>
> Without -mbmi, gcc emits:
> xorl %eax, %eax
> rep bsfl %edi, %eax
> ret
That example convinces me. Code would be broken w