https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113622

--- Comment #10 from Richard Biener <rguenth at gcc dot gnu.org> ---
(In reply to Jakub Jelinek from comment #8)
> Guess for an rvalue (if even that crashes) we want to expand it to some
> permutation or whole vector shift which moves the indexed elements first and
> then extract it, for lvalue we need to insert it similarly.
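
The rvalue variant Jakub describes can be sketched at the source level with GNU C vector extensions; `extract_via_perm` is a hypothetical name, and `__builtin_shuffle` stands in for whatever variable permutation the target provides (this is a sketch of the intended semantics, not the actual expander code):

```c
typedef float v4sf __attribute__ ((vector_size (16)));
typedef int v4si __attribute__ ((vector_size (16)));

float
extract_via_perm (v4sf v, int i)
{
  /* Move the wanted element to lane 0 with a variable permutation,
     then extract lane 0, which is a constant-index extract.  GCC
     interprets the mask elements modulo the element count.  */
  v4si sel = { i, 0, 0, 0 };   /* only lane 0 of the result matters */
  v4sf moved = __builtin_shuffle (v, sel);
  return moved[0];
}
```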

If we can, we should match this up with .VEC_SET / .VEC_EXTRACT; otherwise
we should go "simple" and spill.
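
The "simple" fallback amounts to spilling the vector to a stack slot and doing a scalar memory access at the variable index. A minimal GNU C sketch of that semantics (`extract_via_spill` is a hypothetical name, and masking the index is an assumption here to keep the access in bounds):

```c
typedef float v4sf __attribute__ ((vector_size (16)));

float
extract_via_spill (v4sf v, int i)
{
  /* Spill the vector to memory, then do a scalar indexed load;
     this is the fallback when no .VEC_EXTRACT pattern handles a
     variable index.  */
  float tmp[4];
  __builtin_memcpy (tmp, &v, sizeof tmp);
  return tmp[i & 3];  /* mask keeps the index in bounds */
}
```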

diff --git a/gcc/gimple-isel.cc b/gcc/gimple-isel.cc
index 7e2392ecd38..e94f292dd38 100644
--- a/gcc/gimple-isel.cc
+++ b/gcc/gimple-isel.cc
@@ -104,7 +104,8 @@ gimple_expand_vec_set_extract_expr (struct function *fun,
       machine_mode outermode = TYPE_MODE (TREE_TYPE (view_op0));
       machine_mode extract_mode = TYPE_MODE (TREE_TYPE (ref));

-      if (auto_var_in_fn_p (view_op0, fun->decl)
+      if ((auto_var_in_fn_p (view_op0, fun->decl)
+          || DECL_HARD_REGISTER (view_op0))
          && !TREE_ADDRESSABLE (view_op0)
          && ((!is_extract && can_vec_set_var_idx_p (outermode))
              || (is_extract

ensures the former (the hard-register case is matched to .VEC_SET /
.VEC_EXTRACT) and fixes the ICE on x86_64 on trunk.  The comment #5
testcase then results in the following loop:

.L3:
        movslq  %eax, %rdx
        vmovaps %zmm2, -56(%rsp)
        vmovaps %zmm0, -120(%rsp)
        vmovss  -120(%rsp,%rdx,4), %xmm4
        vmovss  -56(%rsp,%rdx,4), %xmm3
        vcmpltss        %xmm4, %xmm3, %xmm3
        vpbroadcastd    %eax, %zmm4
        addl    $1, %eax
        vpcmpd  $0, %zmm7, %zmm4, %k1
        vblendvps       %xmm3, %xmm5, %xmm6, %xmm3
        vbroadcastss    %xmm3, %zmm1{%k1}
        cmpl    $8, %eax
        jne     .L3

This isn't optimal of course; for optimal code we need vectorization.  But
we still need to avoid the ICEs since vectorization can be disabled.  That
said, I'm quite sure code using hard registers doesn't do such stupid
things, so I wonder how important it is to avoid "regressing" the
vectorization here.