https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116573
Richard Biener <rguenth at gcc dot gnu.org> changed:
What |Removed |Added
----------------------------------------------------------------------------
Ever confirmed|0 |1
Last reconfirmed| |2024-09-06
Status|UNCONFIRMED |NEW
--- Comment #3 from Richard Biener <rguenth at gcc dot gnu.org> ---
So when investigating "future" fallout I've seen similar differences for
gcc.target/riscv/rvv/autovec/binop/vec_sat_u_add-1.c for example with the
GIMPLE difference being that before we used .SELECT_VL but afterwards
there's a MIN_EXPR to compute the length.
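To make the difference concrete, here is a minimal C sketch of a length-controlled loop in which each iteration's length comes from a MIN, as in the MIN_EXPR form. It is only an illustration: the fixed VF of 16, the function names, and the choice of a saturating unsigned 8-bit add (suggested by the test name) are assumptions, not the actual test source or the vectorizer's output. With .SELECT_VL the computation of `len` would instead be left to the target.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the number of 8-bit lanes per vector.  */
enum { VF = 16 };

/* Saturating unsigned 8-bit add (assumed element operation).  */
static uint8_t sat_add_u8(uint8_t x, uint8_t y)
{
    uint8_t sum = (uint8_t)(x + y);
    return sum < x ? UINT8_MAX : sum;   /* wrapped around: clamp to 255 */
}

/* Length-controlled loop: each iteration handles MIN (remaining, VF)
   elements, so no scalar epilogue is needed.  */
static void vec_sat_add_min(uint8_t *out, const uint8_t *a,
                            const uint8_t *b, size_t n)
{
    size_t remaining = n;
    while (remaining > 0) {
        size_t len = remaining < VF ? remaining : VF;   /* the MIN */
        for (size_t i = 0; i < len; i++)                /* one "vector" op */
            out[i] = sat_add_u8(a[i], b[i]);
        out += len;
        a += len;
        b += len;
        remaining -= len;
    }
}
```

With these assumptions, a call with n = 40 runs three iterations with lengths 16, 16 and 8.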
I've tried to read up on the RVV specification, but there doesn't seem to be
good overall operand documentation for vsetvli :( I tried to understand:
.L6:
        mv a4,a3
        bleu a3,a5,.L5 // this is likely the MIN?
        csrr a4,vlenb // save VLEN to a4(?)
.L5:
        vsetvli zero,a4,e8,m1,ta,ma // set VLEN to a4 and store new VLEN to 'zero'(?)
        vle8.v v1,0(a1)
        vle8.v v2,0(a2)
        vsetvli a6,zero,e8,m1,ta,ma // set VLEN to zero?!
        vsaddu.vv v1,v1,v2
        vsetvli zero,a4,e8,m1,ta,ma // set VLEN to a4 again
        vse8.v v1,0(a0)
        add a1,a1,a5
        add a2,a2,a5
        add a0,a0,a5
        mv a4,a3
        sub a3,a3,a5
        bgtu a4,a5,.L6
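For reading the listing above, a rough C model of the operand roles may help. As far as the RVV 1.0 specification goes, `vsetvli rd, rs1, vtypei` sets vl (the number of active elements), not VLEN, which is a hardware constant; `vlenb` reads VLEN in bytes. With rs1 other than x0 the requested length (AVL) comes from rs1, and with rs1 = x0 and rd other than x0 the instruction requests VLMAX. In both cases the new vl is written to rd and discarded when rd is x0. The struct, the function names, and the restriction to LMUL = 1 below are assumptions made for this sketch, and the model always picks min (AVL, VLMAX), which is only one of the choices the specification permits.

```c
#include <stddef.h>

/* Toy model of the state involved; the field names are made up.  */
struct vstate {
    size_t vlenb;   /* VLEN / 8, the value "csrr rd,vlenb" reads */
    size_t vl;      /* current vector length in elements */
};

/* VLMAX = LMUL * VLEN / SEW, with LMUL fixed to 1 (m1) here.  */
static size_t vlmax(const struct vstate *s, size_t sew_bits)
{
    return s->vlenb * 8 / sew_bits;
}

/* vsetvli rd,rs1,...  with rs1 != x0: the AVL is taken from rs1.  */
static size_t vsetvli_avl(struct vstate *s, size_t avl, size_t sew_bits)
{
    size_t max = vlmax(s, sew_bits);
    s->vl = avl < max ? avl : max;   /* simplest permitted choice */
    return s->vl;                    /* goes to rd; dropped if rd is x0 */
}

/* vsetvli rd,x0,...  with rd != x0: request the maximum length.  */
static size_t vsetvli_max(struct vstate *s, size_t sew_bits)
{
    s->vl = vlmax(s, sew_bits);
    return s->vl;
}
```

Read this way, `vsetvli zero,a4,e8,m1,ta,ma` corresponds to `vsetvli_avl (&s, a4, 8)` with the result dropped, and `vsetvli a6,zero,e8,m1,ta,ma` corresponds to `a6 = vsetvli_max (&s, 8)`. The loads and the store would then run on a4 elements while vsaddu.vv runs on all VLMAX elements, which costs three vsetvli per iteration.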
I think the GIMPLE looks straightforward, but the code the backend generates
looks bad; possibly the vsetvli pass is lacking here.
Now, the vectorizer doesn't use .SELECT_VL because
  if (direct_internal_fn_supported_p (IFN_SELECT_VL, iv_type,
                                      OPTIMIZE_FOR_SPEED)
      && LOOP_VINFO_LENS (loop_vinfo).length () == 1
      && LOOP_VINFO_LENS (loop_vinfo)[0].factor == 1 && !slp
      && (!LOOP_VINFO_NITERS_KNOWN_P (loop_vinfo)
          || !LOOP_VINFO_VECT_FACTOR (loop_vinfo).is_constant ()))
    LOOP_VINFO_USING_SELECT_VL_P (loop_vinfo) = true;
Note the !slp: the comment doesn't explain why it is there, but
vectorizable_induction, for example, simply asserts !slp_node when
LOOP_VINFO_USING_SELECT_VL_P is set. I would have expected this to be handled
more like LOOP_VINFO_CAN_USE_PARTIAL_VECTORS_P, that is, disabled when
we cannot handle code generation for a feature.
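The alternative pattern can be sketched in C as follows. This is not GCC code: the struct, its single field, and `analyze_stmt` are hypothetical stand-ins, and where such a check would actually sit in the vectorizer's analysis is left open. The idea is only that analysis clears the feature flag when it meets a case code generation cannot handle, instead of asserting later.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified stand-in for loop_vec_info.  */
struct loop_info {
    bool using_select_vl;   /* models LOOP_VINFO_USING_SELECT_VL_P */
};

/* Analysis-time check for one statement.  If the statement is an SLP
   node and SELECT_VL code generation is not implemented for that case,
   drop the feature rather than fail.  */
static bool analyze_stmt(struct loop_info *info, bool is_slp_node)
{
    if (info->using_select_vl && is_slp_node)
        info->using_select_vl = false;   /* fall back to MIN-based lengths */
    return true;                         /* still vectorizable */
}
```

Under this scheme the loop is still vectorized when an unsupported case shows up; it just uses the MIN-based length instead of .SELECT_VL.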
Simply removing the && !slp fixes the particular testcase above for me.
I'll leave this bug and the fallout to Ju-Zhe Zhong, who added
LOOP_VINFO_USING_SELECT_VL_P support.
Anyway, confirmed.