Hi,

this patch adds simple misalignment checks for gather/scatter
operations.  Previously, we assumed that those perform element accesses
internally, so alignment would not matter.  The riscv vector spec,
however, explicitly states that vector operations are allowed to fault
on element-misaligned accesses.  Reasonable uarchs won't, but...

For gather/scatter we have two paths in the vectorizer:

(1) Regular analysis based on datarefs.  Here we can also create
    strided loads.
(2) Non-affine accesses where each gather index is relative to the
    initial address.
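
For illustration, minimal sketches of the two shapes (hypothetical
loops, not taken from the patch or testsuite):

  /* (1) An access with an analyzable dataref:  the vectorizer can
     describe a[3 * i] as a strided dataref and emit a strided load.  */
  void
  strided (int *restrict dst, int *restrict a, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = a[3 * i];
  }

  /* (2) A non-affine gather:  each address is the loop-invariant base
     plus a runtime index, so there is no dataref to analyze.  */
  void
  gather (int *restrict dst, int *restrict a, int *restrict idx, int n)
  {
    for (int i = 0; i < n; i++)
      dst[i] = a[idx[i]];
  }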

The assumption this patch works from is that once the alignment of the
first scalar access is correct, all others will fall in line, as each
index is a multiple of the first element's size.  For example, with a
4-byte-aligned base and 4-byte elements, base + index * 4 is always
4-byte aligned.

For (1) we have a dataref and can check it for alignment as in other
cases.  For (2) this patch checks the object alignment of BASE and
compares it against the natural alignment of the current vectype's unit.
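
A minimal sketch of that check (hypothetical names, not the exact code
from the hunks below):

  /* BASE is the loop-invariant base address.  get_object_alignment
     returns its known alignment in bits;  compare it against the
     natural alignment of one vector element.  */
  tree elt_type = TREE_TYPE (vectype);
  unsigned int base_align = get_object_alignment (base);
  bool element_aligned = base_align >= TYPE_ALIGN (elt_type);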

The patch also adds a pointer argument to the gather/scatter IFNs that
contains the necessary alignment.  Most of the patch is thus mechanical
in that it merely adjusts indices.
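
Concretely, the argument layout changes like so (a sketch of a consumer;
see vect_describe_gather_scatter_call in the patch):

  /* The alignment pointer is now argument 1 of the gather/scatter
     IFNs, shifting all later arguments by one:
       .GATHER_LOAD (base, align_ptr, offset, scale, ...)  */
  tree base   = gimple_call_arg (call, 0);
  tree align  = gimple_call_arg (call, 1);  /* new */
  tree offset = gimple_call_arg (call, 2);  /* was 1 */
  tree scale  = gimple_call_arg (call, 3);  /* was 2 */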

I tested the riscv version with a custom qemu version that faults on
element-misaligned vector accesses.  With this patch applied, there is
just a single fault left, which is due to PR120782 and which will be
addressed separately.

Is the general approach reasonable or do we need to do something else
entirely?  Bootstrap and regtest on aarch64 went fine.

I couldn't bootstrap/regtest on x86 as my regular cfarm machines
(420-422) are currently down.  Issues are expected, though, as the patch
doesn't touch x86's old-style gathers/scatters at all yet.  I still
wanted to get this initial version out there to get feedback.

I can still split off the two riscv-specific changes, obviously.
Also, I couldn't help but do some tiny refactoring in a few spots :)
This could also go if requested.

I noticed one early-break failure with the changes, where we would give
up on a load_permutation of {0}.  It looks latent and probably
unintended, but I didn't investigate for now and just allowed this
specific permutation.

Regards
Robin

gcc/ChangeLog:

        * config/riscv/riscv.cc (riscv_support_vector_misalignment):
        Always support known aligned types.
        * internal-fn.cc (expand_scatter_store_optab_fn): Change
        argument numbers.
        (expand_gather_load_optab_fn): Ditto.
        (internal_fn_len_index): Ditto.
        (internal_fn_else_index): Ditto.
        (internal_fn_mask_index): Ditto.
        (internal_fn_stored_value_index): Ditto.
        (internal_gather_scatter_fn_supported_p): Ditto.
        * optabs-query.cc (supports_vec_gather_load_p): Ditto.
        * tree-vect-data-refs.cc (vect_describe_gather_scatter_call):
        Handle align_ptr.
        (vect_check_gather_scatter): Compute and set align_ptr.
        * tree-vect-patterns.cc (vect_recog_gather_scatter_pattern):
        Ditto.
        * tree-vect-slp.cc (GATHER_SCATTER_OFFSET): Define.
        (vect_get_and_check_slp_defs): Use it.
        * tree-vect-stmts.cc (vect_truncate_gather_scatter_offset):
        Set align_ptr.
        (get_group_load_store_type): Do not special-case gather/scatter.
        (get_load_store_type): Compute misalignment.
        (vectorizable_store): Remove alignment assert for
        scatter/gather.
        (vectorizable_load): Ditto.
        * tree-vectorizer.h (struct gather_scatter_info): Add align_ptr.

gcc/testsuite/ChangeLog:

        * lib/target-supports.exp: Fix riscv misalign supported check.
---
gcc/config/riscv/riscv.cc             | 24 ++++++--
gcc/internal-fn.cc                    | 21 ++++---
gcc/optabs-query.cc                   |  2 +-
gcc/testsuite/lib/target-supports.exp |  2 +-
gcc/tree-vect-data-refs.cc            | 13 ++++-
gcc/tree-vect-patterns.cc             | 17 +++---
gcc/tree-vect-slp.cc                  | 20 ++++---
gcc/tree-vect-stmts.cc                | 83 ++++++++++++++++++++-------
gcc/tree-vectorizer.h                 |  3 +
9 files changed, 130 insertions(+), 55 deletions(-)

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 8fdc5b21484..02637ee5a5b 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -12069,11 +12069,27 @@ riscv_estimated_poly_value (poly_int64 val,
   target.  */
bool
riscv_support_vector_misalignment (machine_mode mode,
-                                  const_tree type ATTRIBUTE_UNUSED,
+                                  const_tree type,
                                   int misalignment,
-                                  bool is_packed ATTRIBUTE_UNUSED)
-{
-  /* Depend on movmisalign pattern.  */
+                                  bool is_packed)
+{
+  /* IS_PACKED is true if the corresponding scalar element is not naturally
+     aligned.  In that case defer to the default hook which will check
+     if movmisalign is present.  Movmisalign, in turn, depends on
+     TARGET_VECTOR_MISALIGN_SUPPORTED.  */
+  if (is_packed)
+    return default_builtin_support_vector_misalignment (mode, type,
+                                                       misalignment,
+                                                       is_packed);
+
+  /* If we know that misalignment is a multiple of the element size, we're
+     good.  */
+  if (misalignment % TYPE_ALIGN_UNIT (type) == 0)
+    return true;
+
+  /* TODO: misalignment == -1.  Give up?  */
+
+  /* Otherwise fall back to movmisalign again.  */
  return default_builtin_support_vector_misalignment (mode, type, misalignment,
                                                      is_packed);
}
diff --git a/gcc/internal-fn.cc b/gcc/internal-fn.cc
index 7b44fabc408..2f066aea460 100644
--- a/gcc/internal-fn.cc
+++ b/gcc/internal-fn.cc
@@ -3654,8 +3654,8 @@ expand_scatter_store_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
  internal_fn ifn = gimple_call_internal_fn (stmt);
  int rhs_index = internal_fn_stored_value_index (ifn);
  tree base = gimple_call_arg (stmt, 0);
-  tree offset = gimple_call_arg (stmt, 1);
-  tree scale = gimple_call_arg (stmt, 2);
+  tree offset = gimple_call_arg (stmt, 2);
+  tree scale = gimple_call_arg (stmt, 3);
  tree rhs = gimple_call_arg (stmt, rhs_index);

  rtx base_rtx = expand_normal (base);
@@ -3684,8 +3684,8 @@ expand_gather_load_optab_fn (internal_fn, gcall *stmt, direct_optab optab)
{
  tree lhs = gimple_call_lhs (stmt);
  tree base = gimple_call_arg (stmt, 0);
-  tree offset = gimple_call_arg (stmt, 1);
-  tree scale = gimple_call_arg (stmt, 2);
+  tree offset = gimple_call_arg (stmt, 2);
+  tree scale = gimple_call_arg (stmt, 3);

  rtx lhs_rtx = expand_expr (lhs, NULL_RTX, VOIDmode, EXPAND_WRITE);
  rtx base_rtx = expand_normal (base);
@@ -4936,11 +4936,13 @@ internal_fn_len_index (internal_fn fn)
      return 2;

    case IFN_MASK_LEN_SCATTER_STORE:
+      return 6;
+
    case IFN_MASK_LEN_STRIDED_LOAD:
      return 5;

    case IFN_MASK_LEN_GATHER_LOAD:
-      return 6;
+      return 7;

    case IFN_COND_LEN_FMA:
    case IFN_COND_LEN_FMS:
@@ -5044,7 +5046,7 @@ internal_fn_else_index (internal_fn fn)

    case IFN_MASK_GATHER_LOAD:
    case IFN_MASK_LEN_GATHER_LOAD:
-      return 5;
+      return 6;

    default:
      return -1;
@@ -5079,7 +5081,7 @@ internal_fn_mask_index (internal_fn fn)
    case IFN_MASK_SCATTER_STORE:
    case IFN_MASK_LEN_GATHER_LOAD:
    case IFN_MASK_LEN_SCATTER_STORE:
-      return 4;
+      return 5;

    case IFN_VCOND_MASK:
    case IFN_VCOND_MASK_LEN:
@@ -5104,10 +5106,11 @@ internal_fn_stored_value_index (internal_fn fn)

    case IFN_MASK_STORE:
    case IFN_MASK_STORE_LANES:
+      return 3;
    case IFN_SCATTER_STORE:
    case IFN_MASK_SCATTER_STORE:
    case IFN_MASK_LEN_SCATTER_STORE:
-      return 3;
+      return 4;

    case IFN_LEN_STORE:
      return 4;
@@ -5205,7 +5208,7 @@ internal_gather_scatter_fn_supported_p (internal_fn ifn, tree vector_type,
     */
  if (ok && elsvals)
    get_supported_else_vals
-      (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD) + 1, *elsvals);
+      (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD), *elsvals);

  return ok;
}
diff --git a/gcc/optabs-query.cc b/gcc/optabs-query.cc
index f5ca98da818..ac9d7106aee 100644
--- a/gcc/optabs-query.cc
+++ b/gcc/optabs-query.cc
@@ -725,7 +725,7 @@ supports_vec_gather_load_p (machine_mode mode, vec<int> *elsvals)
     */
  if (elsvals && icode != CODE_FOR_nothing)
    get_supported_else_vals
-      (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD) + 1, *elsvals);
+      (icode, internal_fn_else_index (IFN_MASK_GATHER_LOAD), *elsvals);

  return this_fn_optabs->supports_vec_gather_load[mode] > 0;
}
diff --git a/gcc/testsuite/lib/target-supports.exp b/gcc/testsuite/lib/target-supports.exp
index dfffe3adfbd..ab127cb8f8b 100644
--- a/gcc/testsuite/lib/target-supports.exp
+++ b/gcc/testsuite/lib/target-supports.exp
@@ -2428,7 +2428,7 @@ proc check_effective_target_riscv_v_misalign_ok { } {
                = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15};
              asm ("vsetivli zero,7,e8,m1,ta,ma");
              asm ("addi a7,%0,1" : : "r" (a) : "a7" );
-             asm ("vle8.v v8,0(a7)" : : : "v8");
+             asm ("vle16.v v8,0(a7)" : : : "v8");
              return 0; } } "-march=${gcc_march}"] } {
        return 1
    }
diff --git a/gcc/tree-vect-data-refs.cc b/gcc/tree-vect-data-refs.cc
index 036903a948f..087c717b8e9 100644
--- a/gcc/tree-vect-data-refs.cc
+++ b/gcc/tree-vect-data-refs.cc
@@ -4441,10 +4441,11 @@ vect_describe_gather_scatter_call (stmt_vec_info stmt_info,
  info->ifn = gimple_call_internal_fn (call);
  info->decl = NULL_TREE;
  info->base = gimple_call_arg (call, 0);
-  info->offset = gimple_call_arg (call, 1);
+  info->align_ptr = gimple_call_arg (call, 1);
+  info->offset = gimple_call_arg (call, 2);
  info->offset_dt = vect_unknown_def_type;
  info->offset_vectype = NULL_TREE;
-  info->scale = TREE_INT_CST_LOW (gimple_call_arg (call, 2));
+  info->scale = TREE_INT_CST_LOW (gimple_call_arg (call, 3));
  info->element_type = TREE_TYPE (vectype);
  info->memory_type = TREE_TYPE (DR_REF (dr));
}
@@ -4769,6 +4770,14 @@ vect_check_gather_scatter (stmt_vec_info stmt_info, loop_vec_info loop_vinfo,
  info->ifn = ifn;
  info->decl = decl;
  info->base = base;
+
+  /* TODO: Is IS_PACKED necessary/useful here or does get_obj_alignment
+     suffice?  */
+  bool is_packed = not_size_aligned (DR_REF (dr));
+  info->align_ptr = build_int_cst
+    (reference_alias_ptr_type (DR_REF (dr)),
+     is_packed ? 1 : get_object_alignment (DR_REF (dr)));
+
  info->offset = off;
  info->offset_dt = vect_unknown_def_type;
  info->offset_vectype = offset_vectype;
diff --git a/gcc/tree-vect-patterns.cc b/gcc/tree-vect-patterns.cc
index 0f6d6b77ea1..e0035ed845a 100644
--- a/gcc/tree-vect-patterns.cc
+++ b/gcc/tree-vect-patterns.cc
@@ -6042,12 +6042,14 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo,

          tree vec_els
            = vect_get_mask_load_else (elsval, TREE_TYPE (gs_vectype));
-         pattern_stmt = gimple_build_call_internal (gs_info.ifn, 6, base,
+         pattern_stmt = gimple_build_call_internal (gs_info.ifn, 7, base,
+                                                    gs_info.align_ptr,
                                                     offset, scale, zero, mask,
                                                     vec_els);
        }
      else
-       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4, base,
+       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 5, base,
+                                                  gs_info.align_ptr,
                                                   offset, scale, zero);
      tree lhs = gimple_get_lhs (stmt_info->stmt);
      tree load_lhs = vect_recog_temp_ssa_var (TREE_TYPE (lhs), NULL);
@@ -6057,12 +6059,13 @@ vect_recog_gather_scatter_pattern (vec_info *vinfo,
    {
      tree rhs = vect_get_store_rhs (stmt_info);
      if (mask != NULL)
-       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 5,
-                                                  base, offset, scale, rhs,
-                                                  mask);
+       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 6,
+                                                  base, gs_info.align_ptr,
+                                                  offset, scale, rhs, mask);
      else
-       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 4,
-                                                  base, offset, scale, rhs);
+       pattern_stmt = gimple_build_call_internal (gs_info.ifn, 5,
+                                                  base, gs_info.align_ptr,
+                                                  offset, scale, rhs);
    }
  gimple_call_set_nothrow (pattern_stmt, true);

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index dc89da3bf17..b0d417d0309 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -507,6 +507,8 @@ vect_def_types_match (enum vect_def_type dta, enum vect_def_type dtb)
              && (dtb == vect_external_def || dtb == vect_constant_def)));
}

+#define GATHER_SCATTER_OFFSET (-3)
+
static const int cond_expr_maps[3][5] = {
  { 4, -1, -2, 1, 2 },
  { 4, -2, -1, 1, 2 },
@@ -514,17 +516,17 @@ static const int cond_expr_maps[3][5] = {
};
static const int no_arg_map[] = { 0 };
static const int arg0_map[] = { 1, 0 };
-static const int arg1_map[] = { 1, 1 };
+static const int arg1_map[] = { 1, 2 };
static const int arg2_arg3_map[] = { 2, 2, 3 };
-static const int arg1_arg3_map[] = { 2, 1, 3 };
-static const int arg1_arg4_arg5_map[] = { 3, 1, 4, 5 };
-static const int arg1_arg3_arg4_map[] = { 3, 1, 3, 4 };
+static const int arg1_arg3_map[] = { 2, 2, 4 };
+static const int arg1_arg4_arg5_map[] = { 3, 2, 5, 6 };
+static const int arg1_arg3_arg4_map[] = { 3, 2, 4, 5 };
static const int arg3_arg2_map[] = { 2, 3, 2 };
static const int op1_op0_map[] = { 2, 1, 0 };
-static const int off_map[] = { 1, -3 };
-static const int off_op0_map[] = { 2, -3, 0 };
-static const int off_arg2_arg3_map[] = { 3, -3, 2, 3 };
-static const int off_arg3_arg2_map[] = { 3, -3, 3, 2 };
+static const int off_map[] = { 1, GATHER_SCATTER_OFFSET };
+static const int off_op0_map[] = { 2, GATHER_SCATTER_OFFSET, 0 };
+static const int off_arg2_arg3_map[] = { 3, GATHER_SCATTER_OFFSET, 2, 3 };
+static const int off_arg3_arg2_map[] = { 3, GATHER_SCATTER_OFFSET, 3, 2 };
static const int mask_call_maps[6][7] = {
  { 1, 1, },
  { 2, 1, 2, },
@@ -696,7 +698,7 @@ vect_get_and_check_slp_defs (vec_info *vinfo, unsigned char swap,
    {
      oprnd_info = (*oprnds_info)[i];
      int opno = map ? map[i] : int (i);
-      if (opno == -3)
+      if (opno == GATHER_SCATTER_OFFSET)
        {
          gcc_assert (STMT_VINFO_GATHER_SCATTER_P (stmt_info));
          if (!is_a <loop_vec_info> (vinfo)
diff --git a/gcc/tree-vect-stmts.cc b/gcc/tree-vect-stmts.cc
index 02a12ab20c2..3c7861c3fd9 100644
--- a/gcc/tree-vect-stmts.cc
+++ b/gcc/tree-vect-stmts.cc
@@ -1803,6 +1803,8 @@ vect_truncate_gather_scatter_offset (stmt_vec_info stmt_info,
      /* Logically the sum of DR_BASE_ADDRESS, DR_INIT and DR_OFFSET,
         but we don't need to store that here.  */
      gs_info->base = NULL_TREE;
+      gs_info->align_ptr = build_int_cst
+       (reference_alias_ptr_type (DR_REF (dr)), DR_BASE_ALIGNMENT (dr));
      gs_info->element_type = TREE_TYPE (vectype);
      gs_info->offset = fold_convert (offset_type, step);
      gs_info->offset_dt = vect_constant_def;
@@ -2411,8 +2413,7 @@ get_group_load_store_type (vec_info *vinfo, stmt_vec_info stmt_info,
      || *memory_access_type == VMAT_CONTIGUOUS_REVERSE)
    *poffset = neg_ldst_offset;

-  if (*memory_access_type == VMAT_GATHER_SCATTER
-      || *memory_access_type == VMAT_ELEMENTWISE
+  if (*memory_access_type == VMAT_ELEMENTWISE
      || *memory_access_type == VMAT_STRIDED_SLP
      || *memory_access_type == VMAT_INVARIANT)
    {
@@ -2543,9 +2544,36 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
              return false;
            }
        }
-      /* Gather-scatter accesses perform only component accesses, alignment
-        is irrelevant for them.  */
-      *alignment_support_scheme = dr_unaligned_supported;
+
+      /* Gather-scatter accesses normally perform only component accesses so
+        alignment is irrelevant for them.  Targets like riscv do care about
+        scalar alignment in vector accesses, though, so check scalar
+        alignment here.  We determined the alias pointer as well as the base
+        alignment during pattern recognition and can re-use it here.
+
+        As we do not have a dataref we only know the alignment of the
+        base.  For now don't try harder to determine misalignment and just
+        assume it is unknown.  We consider the type packed if its scalar
+        alignment is lower than the natural alignment of a vector
+        element's type.  */
+
+      tree inner_vectype = TREE_TYPE (vectype);
+
+      unsigned HOST_WIDE_INT scalar_align
+       = tree_to_uhwi (gs_info->align_ptr);
+      unsigned HOST_WIDE_INT inner_vectype_sz
+       = tree_to_uhwi (TYPE_SIZE (inner_vectype));
+
+      bool is_misaligned = scalar_align < inner_vectype_sz;
+      bool is_packed = scalar_align > 1 && is_misaligned;
+
+      *misalignment = !is_misaligned ? 0 : inner_vectype_sz - scalar_align;
+
+      if (targetm.vectorize.support_vector_misalignment
+         (TYPE_MODE (vectype), inner_vectype, *misalignment, is_packed))
+       *alignment_support_scheme = dr_unaligned_supported;
+      else
+       *alignment_support_scheme = dr_unaligned_unsupported;
    }
  else if (!get_group_load_store_type (vinfo, stmt_info, vectype, slp_node,
                                       masked_p,
@@ -2586,10 +2614,10 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
                           "alignment. With non-contiguous memory vectorization"
                           " could read out of bounds at %G ",
                           STMT_VINFO_STMT (stmt_info));
-       if (inbounds)
-         LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
-       else
-         return false;
+      if (inbounds)
+       LOOP_VINFO_MUST_USE_PARTIAL_VECTORS_P (loop_vinfo) = true;
+      else
+       return false;
    }

  /* If this DR needs alignment for correctness, we must ensure the target
@@ -2677,7 +2705,9 @@ get_load_store_type (vec_info  *vinfo, stmt_vec_info stmt_info,
         such only the first load in the group is aligned, the rest are not.
         Because of this the permutes may break the alignment requirements that
         have been set, and as such we should for now, reject them.  */
-      if (SLP_TREE_LOAD_PERMUTATION (slp_node).exists ())
+      load_permutation_t lperm = SLP_TREE_LOAD_PERMUTATION (slp_node);
+      if (lperm.exists ()
+         && (lperm.length () > 1 || (lperm.length () && lperm[0] != 0)))
        {
          if (dump_enabled_p ())
            dump_printf_loc (MSG_MISSED_OPTIMIZATION, vect_location,
@@ -9201,7 +9231,8 @@ vectorizable_store (vec_info *vinfo,
                {
                  if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
                    call = gimple_build_call_internal (
-                           IFN_MASK_LEN_SCATTER_STORE, 7, dataref_ptr,
+                           IFN_MASK_LEN_SCATTER_STORE, 8, dataref_ptr,
+                           gs_info.align_ptr,
                            vec_offset, scale, vec_oprnd, final_mask, final_len,
                            bias);
                  else
@@ -9214,11 +9245,14 @@ vectorizable_store (vec_info *vinfo,
                }
              else if (final_mask)
                call = gimple_build_call_internal
-                            (IFN_MASK_SCATTER_STORE, 5, dataref_ptr,
+                            (IFN_MASK_SCATTER_STORE, 6, dataref_ptr,
+                             gs_info.align_ptr,
                              vec_offset, scale, vec_oprnd, final_mask);
              else
-               call = gimple_build_call_internal (IFN_SCATTER_STORE, 4,
-                                                  dataref_ptr, vec_offset,
+               call = gimple_build_call_internal (IFN_SCATTER_STORE, 5,
+                                                  dataref_ptr,
+                                                  gs_info.align_ptr,
+                                                  vec_offset,
                                                   scale, vec_oprnd);
              gimple_call_set_nothrow (call, true);
              vect_finish_stmt_generation (vinfo, stmt_info, call, gsi);
@@ -10869,7 +10903,6 @@ vectorizable_load (vec_info *vinfo,
        vec_num = SLP_TREE_NUMBER_OF_VEC_STMTS (slp_node);
    }

-  gcc_assert (alignment_support_scheme);
  vec_loop_masks *loop_masks
    = (loop_vinfo && LOOP_VINFO_FULLY_MASKED_P (loop_vinfo)
       ? &LOOP_VINFO_MASKS (loop_vinfo)
@@ -10889,10 +10922,12 @@ vectorizable_load (vec_info *vinfo,

  /* Targets with store-lane instructions must not require explicit
     realignment.  vect_supportable_dr_alignment always returns either
-     dr_aligned or dr_unaligned_supported for masked operations.  */
+     dr_aligned or dr_unaligned_supported for (non-length) masked
+     operations.  */
  gcc_assert ((memory_access_type != VMAT_LOAD_STORE_LANES
               && !mask
               && !loop_masks)
+             || memory_access_type == VMAT_GATHER_SCATTER
              || alignment_support_scheme == dr_aligned
              || alignment_support_scheme == dr_unaligned_supported);

@@ -11259,8 +11294,6 @@

  if (memory_access_type == VMAT_GATHER_SCATTER)
    {
-      gcc_assert (alignment_support_scheme == dr_aligned
-                 || alignment_support_scheme == dr_unaligned_supported);
      gcc_assert (!grouped_load && !slp_perm);

      unsigned int inside_cost = 0, prologue_cost = 0;
@@ -11363,7 +11398,8 @@ vectorizable_load (vec_info *vinfo,
                    {
                      if (VECTOR_TYPE_P (TREE_TYPE (vec_offset)))
                        call = gimple_build_call_internal (
-                         IFN_MASK_LEN_GATHER_LOAD, 8, dataref_ptr, vec_offset,
+                         IFN_MASK_LEN_GATHER_LOAD, 9, dataref_ptr,
+                         gs_info.align_ptr, vec_offset,
                          scale, zero, final_mask, vec_els, final_len, bias);
                      else
                        /* Non-vector offset indicates that prefer to take
@@ -11375,13 +11411,16 @@ vectorizable_load (vec_info *vinfo,
                    }
                  else if (final_mask)
                    call = gimple_build_call_internal (IFN_MASK_GATHER_LOAD,
-                                                      6, dataref_ptr,
+                                                      7, dataref_ptr,
+                                                      gs_info.align_ptr,
                                                       vec_offset, scale,
                                                       zero, final_mask,
                                                       vec_els);
                  else
-                   call = gimple_build_call_internal (IFN_GATHER_LOAD, 4,
-                                                      dataref_ptr, vec_offset,
+                   call = gimple_build_call_internal (IFN_GATHER_LOAD, 5,
+                                                      dataref_ptr,
+                                                      gs_info.align_ptr,
+                                                      vec_offset,
                                                       scale, zero);
                  gimple_call_set_nothrow (call, true);
                  new_stmt = call;
diff --git a/gcc/tree-vectorizer.h b/gcc/tree-vectorizer.h
index 32c7e52a46e..42da0fa294b 100644
--- a/gcc/tree-vectorizer.h
+++ b/gcc/tree-vectorizer.h
@@ -1545,6 +1545,9 @@ struct gather_scatter_info {
  /* The loop-invariant base value.  */
  tree base;

+  /* The known alignment of the base, as a constant of pointer type.  */
+  tree align_ptr;
+
  /* The original scalar offset, which is a non-loop-invariant SSA_NAME.  */
  tree offset;

--
2.49.0

