https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119224

--- Comment #7 from Robin Dapp <rdapp at gcc dot gnu.org> ---
> So this why you weren't seeing it but I'm confused about the rationale...
> I unpack above to following statements
> 
> 1. -mno-vector-strict-align allows us to unroll - seems ok.
> 2. Otherwise (-mvector-strict-align ?) leads to unaligned access ???
> 3. Or Is this the vector element vs. whole vector thing

At first I thought I wasn't confused but now I am as well ;)

Unrolling the inner loop is not an issue, but the outer one is because of its
unknown stride.  However, we unroll just fine if I set the strides to something
weird like 17 and 23.  Our "preferred" vector alignment is element alignment
anyway so for a char array there is nothing to do.
I think the idea is that with an unknown stride we have an unknown misalignment
of our data refs.  Sometimes misalignment can be handled by peeling off some
iterations but we don't do that for variable strides.
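
To make this concrete, the kind of loop nest in question looks roughly like
the sketch below.  It is only a hypothetical illustration, not the reproducer
from the PR; the function name add_strided and all parameter names are made
up.  The inner loop walks contiguous bytes, while the outer loop advances both
pointers by strides that are unknown at compile time, so the misalignment of
each row's data refs is unknown as well.

```c
#include <stddef.h>

/* Hypothetical example, not the test case from the PR.
   Adds a rows x cols block of bytes from SRC into DST.  */
void
add_strided (unsigned char *dst, const unsigned char *src,
             size_t rows, size_t cols,
             size_t dst_stride, size_t src_stride)
{
  for (size_t i = 0; i < rows; i++)
    {
      /* Inner loop: unit-stride byte accesses, element alignment is 1.  */
      for (size_t j = 0; j < cols; j++)
        dst[j] = (unsigned char) (dst[j] + src[j]);

      /* Outer loop: strides are runtime values, so the start address of
         the next row has an unknown misalignment relative to any vector
         size larger than one byte.  */
      dst += dst_stride;
      src += src_stride;
    }
}
```

Calling it with constant strides such as 17 and 23 corresponds to the "weird
strides" experiment mentioned above.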

But here every alignment is 1 anyway and it can't get any worse than that. 
Therefore we _should_ be able to unroll the loop even without the movmisalign
pattern (=-mno-vector-strict-align).  I believe that indicates a mistake in
riscv_support_vector_misalignment and we could just always return true there
for byte vectors.  Then we would unroll here.  That might be a rush job,
though; I need to think about it some more.

But always unrolling would only exacerbate the scheduling/spilling problem so
we should somehow get this under control first.  With zvl128b in particular the
situation is just unpleasant right now.
