https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119640

--- Comment #4 from Richard Biener <rguenth at gcc dot gnu.org> ---
The issue is that we have an invariant shift, 1 << mask_nbits, in the
loop.
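
For illustration only, a minimal sketch of a loop containing such an
invariant shift. This is not the testcase from the PR: the function
name apply_mask, its parameters, and the element types are all
hypothetical, and nothing here claims that this exact function
triggers the bug.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example, not the PR's testcase.
   Precondition (assumed): 0 <= mask_nbits < 31.  */
void apply_mask(uint32_t *dst, const uint8_t *src, size_t n, int mask_nbits)
{
    for (size_t i = 0; i < n; ++i) {
        /* 1 << mask_nbits does not depend on i, so the shift is
           loop-invariant even though it is written inside the loop.  */
        dst[i] = (uint32_t)(src[i] & ((1 << mask_nbits) - 1));
    }
}
```

Both operands of the shift are defined outside the loop, so the
vectorizer sees the whole shift as invariant rather than as a
per-element operation.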

vectorizable_shift runs into the was_scalar_shift_arg == true,
scalar_shift_arg == false, incompatible_op1_vectype_p == true case,
in which vectorizable_shift itself handles the required conversions.

But that leaves SLP scheduling with no idea where to schedule the
invariant shift, because it does not know that vectorizable_shift
will later insert code in the preheader.
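
As a source-level analogy only (not what the vectorizer emits),
placing invariant code in the preheader corresponds to computing it
once before the first iteration. The function name apply_mask_hoisted
and its signature are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: the invariant shift hoisted by hand.
   Precondition (assumed): 0 <= mask_nbits < 31.  */
void apply_mask_hoisted(uint32_t *dst, const uint8_t *src, size_t n,
                        int mask_nbits)
{
    /* "Preheader": evaluated once, before the loop is entered.  */
    const uint32_t mask = (uint32_t)((1 << mask_nbits) - 1);

    for (size_t i = 0; i < n; ++i) {
        /* The loop body only uses the precomputed value.  */
        dst[i] = src[i] & mask;
    }
}
```

Anything that consumes the hoisted value has to be placed after it,
which is why the scheduler needs to know where that code ends up.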

Arguably scheduling should at most lift code to the preheader,
consistent with a NULL gsi from vect_init_vector, but that's
difficult, as that inserts on the edge (immediately) while with a gsi
we insert after that.  But now is not the time to mess with this.

Testing a patch.
