On 4/12/25 12:41 AM, Alexandre Oliva wrote:
pr118182-2.c fails on gcc-14 because it lacks the late_combine passes,
particularly the one that runs after register allocation.

Even in the trunk, the predicate broadcast for the add reduction is
expanded and register-allocated as _zvfh, taking up an unneeded scalar
register to hold the constant to be vec_duplicated.

It is the late combine pass after register allocation that substitutes
this unneeded scalar register into the vec_duplicate, resolving to the
_zero or _imm insns.

It's easy enough and more efficient to expand pred_broadcast to the
insns that take the already-duplicated vector constant, when the
operands satisfy the predicates of the _zero or _imm insns.
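
As a rough illustration of the choice described above, the decision at expand time can be modelled as a small dispatch in C. This is only a sketch, not the code in vector.md: the type and function names (broadcast_form, broadcast_src, fits_zero, fits_imm, choose_broadcast_form) are hypothetical, and the two acceptance checks are assumptions standing in for the real predicates of the _zero and _imm insns, which are not reproduced here.

```c
#include <stdbool.h>

/* Hypothetical model: these names are illustrative, not GCC's. */
enum broadcast_form {
  FORM_ZERO,   /* stands in for the _zero insn */
  FORM_IMM,    /* stands in for the _imm insn */
  FORM_SCALAR  /* generic form: scalar register plus vec_duplicate */
};

struct broadcast_src {
  bool is_const;  /* is the broadcast value a compile-time constant? */
  long value;     /* meaningful only when is_const is true */
};

/* Assumption: the _zero form accepts exactly the constant 0. */
static bool fits_zero(struct broadcast_src s)
{
  return s.is_const && s.value == 0;
}

/* Assumption: the _imm form accepts a 5-bit signed immediate. */
static bool fits_imm(struct broadcast_src s)
{
  return s.is_const && s.value >= -16 && s.value <= 15;
}

/* Pick the constant-taking forms first, so that no scalar register is
   needed; fall back to the generic form otherwise. */
static enum broadcast_form choose_broadcast_form(struct broadcast_src s)
{
  if (fits_zero(s))
    return FORM_ZERO;
  if (fits_imm(s))
    return FORM_IMM;
  return FORM_SCALAR;
}
```

In this model, a constant that passes either check never occupies a scalar register, so nothing is left for a later pass to fold back into the vec_duplicate.
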

Regression-tested with gcc-14 x86_64-linux-gnu-hosted crosses to
riscv64-elf and riscv32-elf. Also smoke-tested on trunk, still passing
the pr118182-2.c test with a cross to riscv64-elf. Ok to install?

for gcc/ChangeLog

	PR target/118182
	* config/riscv/vector.md (@pred_broadcast<mode>): Expand to
	_zero and _imm variants without vec_duplicate.

Ah, this case is constants only. So the concerns about crossing units
really don't apply here. It *should* be a win across the board.
OK for the trunk.
jeff