https://gcc.gnu.org/bugzilla/show_bug.cgi?id=37021

Bill Schmidt <wschmidt at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wschmidt at gcc dot gnu.org

--- Comment #20 from Bill Schmidt <wschmidt at gcc dot gnu.org> ---
We still don't vectorize the original code example on Power.  It appears that
this is being disabled because of an alignment issue.  The data references are
being rejected by:

product.f:9:0: note: can't force alignment of ref: REALPART_EXPR <*a.0_24[_50]>

and similar for the other three DRs.  This happens due to this code in
vect_compute_data_ref_alignment:

  if (base_alignment < TYPE_ALIGN (vectype))
    {
      /* Strip an inner MEM_REF to a bare decl if possible.  */
      if (TREE_CODE (base) == MEM_REF
          && integer_zerop (TREE_OPERAND (base, 1))
          && TREE_CODE (TREE_OPERAND (base, 0)) == ADDR_EXPR)
        base = TREE_OPERAND (TREE_OPERAND (base, 0), 0);

      if (!vect_can_force_dr_alignment_p (base, TYPE_ALIGN (vectype)))
        {
          if (dump_enabled_p ())
            {
              dump_printf_loc (MSG_NOTE, vect_location,
                               "can't force alignment of ref: ");
              dump_generic_expr (MSG_NOTE, TDF_SLIM, ref);
              dump_printf (MSG_NOTE, "\n");
            }
          return true;
        }

Here TYPE_ALIGN (vectype) is 128 (Power vectors are normally aligned on a
128-bit boundary), and base_alignment is 64.  a.0 is defined as:

complex(kind=8) [0:D.1831] * restrict a.0;

In both ELFv1 and ELFv2 ABIs for Power, a complex type is defined to have the
same alignment as the underlying type.  So "complex double" has 8-byte (64-bit)
alignment, which is consistent with the base_alignment of 64 seen above.
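
The alignment mismatch can be checked from plain C.  The following is only a
minimal sketch, not GCC code: VECTOR_ALIGN_BITS and both helper names are
hypothetical, and the value 128 is taken from the TYPE_ALIGN (vectype) figure
quoted above.  It relies on the C11 rule that a complex type has the same
alignment requirement as its corresponding real type.

```c
#include <complex.h>
#include <limits.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: alignment, in bits, wanted by a 16-byte vector type.  This
   mirrors the TYPE_ALIGN (vectype) value of 128 reported in the comment.  */
enum { VECTOR_ALIGN_BITS = 128 };

/* C11 requires a complex type to have the alignment of its real type.  */
_Static_assert (alignof (double complex) == alignof (double),
                "complex double must be aligned like double");

/* Hypothetical helper: alignment of "complex double" in bits on the
   target this file is compiled for.  */
static size_t
complex_double_align_bits (void)
{
  return alignof (double complex) * CHAR_BIT;
}

/* Hypothetical helper: nonzero when the element alignment alone cannot
   guarantee that an array of complex doubles is vector-aligned.  */
static int
element_alignment_below_vector (void)
{
  return complex_double_align_bits () < VECTOR_ALIGN_BITS;
}
```

On a target where double is 8-byte aligned, complex_double_align_bits ()
gives 64, so element_alignment_below_vector () is nonzero and the
"base_alignment < TYPE_ALIGN (vectype)" test above fires.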

On earlier versions of Power, rejecting the reference is fine, because
unaligned accesses are expensive prior to POWER8.  With POWER8, though, an
unaligned access will (most of the time) perform as well as an aligned access.
So ideally we would like to teach the vectorizer to allow vectorization here.

It seems like vect_supportable_dr_alignment ought to be considered as part of
the SLP vectorization decision here, rather than just comparing the base
alignment with the vector type alignment.  Adding a check for that allows
things to get a little further, but we still don't vectorize the block.  (I
haven't yet looked into why, but I assume more needs to be done downstream to
handle this case.)
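
To make the proposal concrete, here is a toy model of the two decisions.  It is
a simplified sketch and not the vectorizer's actual code: the enum and both
function names are hypothetical, and the target's answer about unaligned access
is reduced to a single yes/no value.

```c
#include <stdbool.h>

/* Hypothetical stand-in for what a target reports about misaligned
   vector accesses.  */
enum unaligned_support
{
  UNALIGNED_UNSUPPORTED,
  UNALIGNED_SUPPORTED
};

/* Strict check, modeled on the behavior described above: a reference is
   accepted only if its base is already vector-aligned or its alignment
   can be forced.  */
static bool
accept_ref_strict (unsigned base_align_bits, unsigned vec_align_bits,
                   bool can_force_alignment)
{
  if (base_align_bits >= vec_align_bits)
    return true;
  return can_force_alignment;
}

/* Relaxed check, modeling the proposal: fall back to the target's
   unaligned-access support when the strict check fails.  */
static bool
accept_ref_relaxed (unsigned base_align_bits, unsigned vec_align_bits,
                    bool can_force_alignment,
                    enum unaligned_support support)
{
  if (accept_ref_strict (base_align_bits, vec_align_bits,
                         can_force_alignment))
    return true;
  return support == UNALIGNED_SUPPORTED;
}
```

With the values from this report (base alignment 64, vector alignment 128,
alignment not forceable), the strict check rejects the reference, while the
relaxed check accepts it only when unaligned access is reported as supported.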

My understanding of the vectorizer is not yet very deep, so before going too
far down the wrong path, I'd like your opinion on the best approach to fixing
the problem.  Thanks!

Bill
