[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-04-23 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #23 from CVS Commits --- The master branch has been updated by hongtao Liu : https://gcc.gnu.org/g:8311c26757657fe8ffa28ca1539d02d141bb8292 commit r14-182-g8311c26757657fe8ffa28ca1539d02d141bb8292 Author: liuhongt Date: Wed Mar

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-04-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #22 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:87c9bae4e32b54829dce0a93ff735412d5f684f8 commit r14-121-g87c9bae4e32b54829dce0a93ff735412d5f684f8 Author: Jakub Jelinek Date: T

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-04-20 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #21 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:705b0d2b62318b3935214f08a1cf023b1117acb8 commit r14-108-g705b0d2b62318b3935214f08a1cf023b1117acb8 Author: Jakub Jelinek Date: T

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-04-19 Thread cvs-commit at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #20 from CVS Commits --- The master branch has been updated by Jakub Jelinek : https://gcc.gnu.org/g:ade0a1ee5c6707b950ba284adcfed0514866c12d commit r14-65-gade0a1ee5c6707b950ba284adcfed0514866c12d Author: Jakub Jelinek Date: We

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #19 from Hongtao.liu --- Created attachment 54678 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54678&action=edit gcc13-pr109011-3.patch Fix an ICE when gimple_call_lhs (call_stmt) is NULL in vect_recog_ctz_ffs_pattern, recog

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #18 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #16) > Created attachment 54590 [details] > gcc13-pr109011-2.patch > > Here is what I have right now, totally untested and will need further work > so that the two pat

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #17 from Jakub Jelinek --- Testcase for the normal SI -> SI stuff might be something like with e.g. -O3 -mavx512{bw,cd,vl,dq,bitalg,vpopcntdq} -mbmi -mlzcnt options or so (the intent of the last 2 is to make clz/ctz defined at zero i

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #16 from Jakub Jelinek --- Created attachment 54590 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54590&action=edit gcc13-pr109011-2.patch Here is what I have right now, totally untested and will need further work so that the

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #15 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #14) > (In reply to Hongtao.liu from comment #13) > > It looks like ffs is *just* ctz with defined behavior for zero, so we can > > handle it exactly the same as ctz in

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-06 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #14 from Jakub Jelinek --- (In reply to Hongtao.liu from comment #13) > It looks like ffs is *just* ctz with defined behavior for zero, so we can > handle it exactly the same as ctz in the same pattern match((bitsize - .CLZ > ((x - 1
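For readers comparing ffs and ctz as discussed in the quoted comment, the following is a minimal scalar sketch of how the two GCC builtins relate. It is not taken from the patch: `ffs_via_ctz` and `check_ffs_via_ctz` are hypothetical names, and the sketch assumes a 32-bit `unsigned int`. As documented for GCC, `__builtin_ffs` returns one plus the index of the least significant set bit and returns 0 for a zero argument, whereas `__builtin_ctz` is undefined at zero.

```c
#include <assert.h>

/* Hypothetical helper: express ffs in terms of ctz.
   The zero case is handled explicitly because __builtin_ctz (0) is undefined. */
static int ffs_via_ctz(unsigned x)
{
    return x ? __builtin_ctz(x) + 1 : 0;
}

/* Compare against __builtin_ffs for zero and for every single-bit input
   (assumes unsigned int is 32 bits wide). */
static void check_ffs_via_ctz(void)
{
    assert(ffs_via_ctz(0u) == __builtin_ffs(0));
    for (unsigned k = 0; k < 32; ++k) {
        unsigned x = 1u << k;
        assert(ffs_via_ctz(x) == __builtin_ffs((int)x));
    }
}
```

So the two differ by an offset of one on nonzero inputs, in addition to the behavior at zero.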

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #13 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #12) > (In reply to Hongtao.liu from comment #11) > > (In reply to Jakub Jelinek from comment #3) > > > Seems they are vectorizing __builtin_ctz (x) as bitsize - .CLZ (

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-05 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #12 from Jakub Jelinek --- (In reply to Hongtao.liu from comment #11) > (In reply to Jakub Jelinek from comment #3) > > Seems they are vectorizing __builtin_ctz (x) as bitsize - .CLZ ((x - 1) & > > ~x) for CLZ_DEFINED_VALUE_AT_ZERO 2

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #11 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #3) > Seems they are vectorizing __builtin_ctz (x) as bitsize - .CLZ ((x - 1) & > ~x) for CLZ_DEFINED_VALUE_AT_ZERO 2 with value bitsize. > Perhaps we should pattern ma
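The identity quoted above, bitsize - .CLZ ((x - 1) & ~x), can be checked in scalar C. This is only an illustrative sketch for 32-bit values, not the vectorizer code: `clz32_defined` is a hypothetical stand-in for a CLZ that yields bitsize at zero (the plain `__builtin_clz` is undefined there), and `ctz_via_clz_mask` and `check_ctz_via_clz_mask` are likewise made-up names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a CLZ that is defined at zero with value 32. */
static unsigned clz32_defined(uint32_t x)
{
    return x ? (unsigned)__builtin_clz(x) : 32u;
}

/* (x - 1) & ~x keeps exactly the bits below the lowest set bit of x.
   Its leading-zero count is therefore 32 minus the number of trailing
   zeros of x.  For x == 0 the mask is all ones and the result is 32. */
static unsigned ctz_via_clz_mask(uint32_t x)
{
    return 32u - clz32_defined((x - 1u) & ~x);
}

/* Compare with __builtin_ctz on nonzero inputs. */
static void check_ctz_via_clz_mask(void)
{
    for (unsigned k = 0; k < 32; ++k) {
        uint32_t x = UINT32_C(1) << k;
        uint32_t y = x | UINT32_C(0x80000000);
        assert(ctz_via_clz_mask(x) == (unsigned)__builtin_ctz(x));
        assert(ctz_via_clz_mask(y) == (unsigned)__builtin_ctz(y));
    }
}
```

Note that x == 1 produces a zero mask, which is why this form needs CLZ to be defined at zero with value bitsize.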

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 Jakub Jelinek changed: What|Removed |Added Attachment #54584|0 |1 is obsolete|

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #9 from Jakub Jelinek --- Created attachment 54584 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=54584&action=edit gcc13-pr109011.patch Untested patch to just extend the popcount handling to clz, ctz and ffs, though for now o

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 Jakub Jelinek changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Com

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-04 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #7 from Jakub Jelinek --- Also, I wonder why vect_recog_popcount_pattern handles only popcount, can't it handle clz/ctz as well? I mean for void foo (long long *p, long long *q) { for (int i = 0; i < 2048; ++i) p[i] = __builtin

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #6 from Jakub Jelinek --- Oh, and optabs.cc expands ctz using clz as (bitsize-1) - .CLZ(x & -x) which is one fewer operations if andn isn't supported, on the other side is undefined at zero (so could be used for __builtin_ctz but not
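A scalar sketch of the (bitsize-1) - .CLZ(x & -x) form mentioned above, for 32-bit values. This is an illustration rather than the optabs.cc code; `ctz_via_clz_isolate` and `check_ctz_via_clz_isolate` are hypothetical names. The expression x & -x isolates the lowest set bit using a negate and an AND, compared with a subtract, a NOT and an AND for (x - 1) & ~x when no and-not instruction is available.

```c
#include <assert.h>
#include <stdint.h>

/* x & -x isolates the lowest set bit; its position is 31 minus its
   leading-zero count.  Precondition: x != 0, since the isolated value
   would be 0 and __builtin_clz (0) is undefined. */
static unsigned ctz_via_clz_isolate(uint32_t x)
{
    assert(x != 0);
    return 31u - (unsigned)__builtin_clz(x & (0u - x));
}

/* Compare with __builtin_ctz on nonzero inputs. */
static void check_ctz_via_clz_isolate(void)
{
    for (unsigned k = 0; k < 32; ++k) {
        uint32_t x = UINT32_C(1) << k;
        uint32_t y = x | UINT32_C(0x80000000);
        assert(ctz_via_clz_isolate(x) == (unsigned)__builtin_ctz(x));
        assert(ctz_via_clz_isolate(y) == (unsigned)__builtin_ctz(y));
    }
}
```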

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #5 from Jakub Jelinek --- And to answer myself, as x86 has vplzcnt* just for 32-bit and 64-bit elts with -mavx512cd (perhaps -mavx512vl also depending on vecsize), there is also 8-bit and 16-bit element vector popcount (guarded by di

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #4 from Jakub Jelinek --- Hacker's Delight has also a variant for popcount, either .POPCOUNT ((x - 1) & ~x) or bitsize - .POPCOUNT (x | -x), though a question is if there are any targets which have vector popcount and don't have vect
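The two popcount-based variants named above can be tried out in scalar form. The sketch below assumes 32-bit values and uses hypothetical helper names (`ctz_popcount_mask`, `ctz_popcount_fill`, `check_ctz_popcount`); it illustrates the arithmetic only and is not the vectorizer pattern itself.

```c
#include <assert.h>
#include <stdint.h>

/* .POPCOUNT ((x - 1) & ~x): count the bits below the lowest set bit.
   For x == 0 the mask is all ones, so the result is 32. */
static unsigned ctz_popcount_mask(uint32_t x)
{
    return (unsigned)__builtin_popcount((x - 1u) & ~x);
}

/* bitsize - .POPCOUNT (x | -x): x | -x sets the lowest set bit and every
   bit above it, so the bits left clear are the trailing zeros.
   For x == 0 the popcount is 0, so the result is again 32. */
static unsigned ctz_popcount_fill(uint32_t x)
{
    return 32u - (unsigned)__builtin_popcount(x | (0u - x));
}

/* Compare both variants with __builtin_ctz on nonzero inputs. */
static void check_ctz_popcount(void)
{
    for (unsigned k = 0; k < 32; ++k) {
        uint32_t x = UINT32_C(1) << k;
        uint32_t y = x | UINT32_C(0x80000000);
        assert(ctz_popcount_mask(x) == (unsigned)__builtin_ctz(x));
        assert(ctz_popcount_fill(x) == (unsigned)__builtin_ctz(x));
        assert(ctz_popcount_mask(y) == (unsigned)__builtin_ctz(y));
        assert(ctz_popcount_fill(y) == (unsigned)__builtin_ctz(y));
    }
}
```

Both variants return bitsize for a zero input, so neither depends on a special value at zero from the underlying operation.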

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread jakub at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #3

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 Andrew Pinski changed: What|Removed |Added Blocks||53947 Target|

[Bug tree-optimization/109011] missed optimization in presence of __builtin_ctz

2023-03-03 Thread pinskia at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109011 --- Comment #1 from Andrew Pinski --- On aarch64, both get vectorized.