https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97236

--- Comment #5 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Richard Biener from comment #4)
> So what goes wrong is the single-element interleaving code-gen for the
> pointer copy.  We have
> 
> t.c:18:21: note:   Detected single element interleaving picture_7(D)->p[i_18].p_pixels step 16
> 
> but for the store:
> 
> t.c:18:21: missed:   not consecutive access res_8(D)->p[i_18].p_pixels = _1;
> t.c:18:21: note:   using strided accesses
> 
> ...
> 
> t.c:18:21: note:   ==> examining statement: _1 = picture_7(D)->p[i_18].p_pixels;
> t.c:18:21: note:   vect_model_load_cost: aligned.
> t.c:18:21: note:   vect_model_load_cost: inside_cost = 24, prologue_cost = 0 .
> 
> and in get_group_load_store_type we handle it as (V1DI)
> 
>       if (!STMT_VINFO_STRIDED_P (first_stmt_info)
>           && (can_overrun_p || !would_overrun_p)
>           && compare_step_with_zero (vinfo, stmt_info) > 0)
>         {
>           /* First cope with the degenerate case of a single-element
>              vector.  */
>           if (known_eq (TYPE_VECTOR_SUBPARTS (vectype), 1U))
>             *memory_access_type = VMAT_CONTIGUOUS;

So both adding && gap == 0 to this condition and removing the special case
altogether pass bootstrap / regtest on x86_64.

I have no idea why the special case was needed in the first place?
Was the load-lanes code confused?  I think VMAT_ELEMENTWISE for
single-element vectors is a good enough match?  What's the advantage
of VMAT_CONTIGUOUS here?
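
For readers without the testcase at hand, the following is only a rough sketch of the kind of loop the dump lines above seem to describe; it is not the actual t.c from this PR. The struct layout, the names plane, picture, N_PLANES, copy_pixels and check_copy_pixels are all assumptions made for illustration. The layout is chosen so that on an LP64 target each array element is 16 bytes, which would match the "step 16" in the dump. Whether this exact code triggers the miscompile depends on target and flags and has not been verified.

```c
#include <assert.h>
#include <stddef.h>

#define N_PLANES 4  /* hypothetical element count, not from the PR */

/* Hypothetical layout: one pointer plus two ints, so on an LP64 target
   sizeof (struct plane) is 16 and p_pixels is the only member copied.  */
struct plane
{
  unsigned char *p_pixels;
  int i_lines;
  int i_pitch;
};

struct picture
{
  struct plane p[N_PLANES];
};

/* Copy only the pointer member of each element.  Both the load and the
   store touch one element per struct-sized step, so each access group
   has a single element followed by a gap.  */
void
copy_pixels (struct picture *res, const struct picture *picture)
{
  for (int i = 0; i < N_PLANES; i++)
    res->p[i].p_pixels = picture->p[i].p_pixels;
}

/* Self-check: the members in the gap must be left untouched.  A store
   wider than the pointer would overwrite i_lines / i_pitch.  */
void
check_copy_pixels (void)
{
  unsigned char buf[N_PLANES];
  struct picture src, dst;

  for (int i = 0; i < N_PLANES; i++)
    {
      src.p[i].p_pixels = &buf[i];
      src.p[i].i_lines = i;
      src.p[i].i_pitch = i;
      dst.p[i].p_pixels = NULL;
      dst.p[i].i_lines = -1;
      dst.p[i].i_pitch = -1;
    }

  copy_pixels (&dst, &src);

  for (int i = 0; i < N_PLANES; i++)
    {
      assert (dst.p[i].p_pixels == &buf[i]);
      assert (dst.p[i].i_lines == -1);
      assert (dst.p[i].i_pitch == -1);
    }
}
```

The check on i_lines and i_pitch is the relevant part: with a gap between group elements, an access classified as contiguous must not read or write past the single pointer.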
