https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118380

Richard Biener <rguenth at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
   Last reconfirmed|                            |2025-01-09
     Ever confirmed|0                           |1
             Status|UNCONFIRMED                 |NEW

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
Well, LLVM likely completely unrolls all the loops while we don't, so
constant propagation from the initializer doesn't happen.  With --param
max-completely-peeled-insns=1000 we produce

test256:
.LFB7779:
        .cfi_startproc
        vmovss  .LC0(%rip), %xmm0
        ret

which is better than clang, which fails to eliminate an empty loop.

I think this works as intended: the unrolling limits are heuristics that
bound code growth and compile time, so they obviously cannot anticipate the
full follow-up optimization.
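
For illustration only, here is a minimal sketch of the kind of pattern being
discussed. It is not the testcase from this report: the function name
sum_broadcast and the scalar "broadcast" loop are assumptions standing in for
the vector code. The point is that the result is a compile-time constant only
once both loops are completely unrolled, so that the stored values can be
propagated into the reduction.

```c
#include <stddef.h>

/* Hypothetical stand-in for the reported code: fill eight lanes with one
   constant (a scalar imitation of a broadcast), then sum them. */
float sum_broadcast(void)
{
    float lanes[8];

    /* "Broadcast": every lane receives the same constant. */
    for (size_t i = 0; i < 8; ++i)
        lanes[i] = 1.5f;

    /* Reduction: this folds to a constant only if the compiler can see
       the value stored in each lane, which in practice requires both
       loops to be completely unrolled first. */
    float acc = 0.0f;
    for (size_t i = 0; i < 8; ++i)
        acc += lanes[i];

    return acc;
}
```

Comparing the assembly from plain -O2 with that from
-O2 --param max-completely-peeled-insns=1000 should show whether the limit is
what prevents folding. Whether a loop this small is affected at all depends
on the GCC version and target.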

The __builtin_ia32_vbroadcastss256 call is of course a blocker; confirmed
for that part.
