https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621

--- Comment #9 from rguenther at suse dot de <rguenther at suse dot de> ---
On Mon, 19 Sep 2016, ubizjak at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77621
> 
> --- Comment #8 from Uroš Bizjak <ubizjak at gmail dot com> ---
> (In reply to Richard Biener from comment #7)
> 
> > So it seems V2DFmode is available but discouraged via the above hook when
> > tuning for atom.
> > 
> > Indeed:
> > 
> > static machine_mode
> > ix86_preferred_simd_mode (machine_mode mode)
> > {
> > ...
> >     case DFmode:
> >       if (!TARGET_VECTORIZE_DOUBLE)
> >         return word_mode;
> 
> This part should be OK, the hook documentation says:
> 
>  -- Target Hook: machine_mode TARGET_VECTORIZE_PREFERRED_SIMD_MODE
>           (machine_mode MODE)
>      This hook should return the preferred mode for vectorizing scalar
>      mode MODE.  The default is equal to 'word_mode', because the
>      vectorizer can do some transformations even in absence of
>      specialized SIMD hardware.
> 
> IIUC, word_mode should be returned for unsupported scalar modes.

It says "preferred" -- it only implicitely suggests that maybe
returning word_mode disables vectorization (it does not).

> > but targetm.vector_mode_supported_p happily returns true for V2DFmode.
> 
> Yes, also following the documentation:
> 
>  -- Target Hook: bool TARGET_VECTOR_MODE_SUPPORTED_P (machine_mode MODE)
>      Define this to return nonzero if the port is prepared to handle
>      insns involving vector mode MODE.  At the very least, it must have
>      move patterns for this mode.
> 
> We *do* have V2DF move patterns.

from the testcase it's clear you also have conversion patterns (ok, maybe
not -- eventually vectorization would fail if we didn't ICE -- will
check that).

I believe atom _does_ have full SSE2 support, no?  Using intrinsics
(even those expanding to GCC generic vector extension code) should
end up emitting SSE2 double instructions?

So what you want to tell the vectorizer is to not introduce vectorized
code using V2DFmode.  I still think a better way is to handle this
via costs (like a loop with mostly integer ops but a single FP double
op is probably still profitable to vectorize).

At least if you want the preferred_simd_mode to _disable_ vectorization
for V2DFmode that needs additional changes in the vectorizer code.  And
using word_mode from that hook when constructing the vector type
(using word_mode -- you're only lucky that word_mode is smaller than
DFmode that it fails) will end up with a vector type that has V2DF
mode anyway because stor-layout uses vector_mode_supported_p
to decide whether to use V2DFmode or BLKmode.

> > This means the above is _not_ a good way to achieve what it tries to
> > (make the vectorizer not use V2DFmode).  A more proper way would be to
> > handle this in ix86_add_stmt_cost, increasing the cost for double type
> > vectorization.
> > 
> > Nevertheless the vectorizer shouldn't ICE on this inconsistency, I'll see
> > what it takes to "fix" it on the vectorizer side.
> 
> Please note that we have similar situation with TARGET_PREFER_AVX128, the
> difference is that we still vectorize with narrower vector mode, whereas
> V2DFmode falls back to a word_mode.

Yeah.

Note that preferred_simd_mode was introduced to choose a prefered vector
size for a given mode, not really to disable vectorization for selected
modes.  See it's originally only use:

  /* If no size was supplied use the mode the target prefers.   Otherwise
     lookup a vector mode of the specified size.  */
  if (size == 0)
    simd_mode = targetm.vectorize.preferred_simd_mode (inner_mode);
  else
    simd_mode = mode_for_vector (inner_mode, size / nbytes);

it seems later there was at least one additional use added that follows
your idea:

int
estimate_move_cost (tree type, bool ARG_UNUSED (speed_p))
{
  HOST_WIDE_INT size;

  gcc_assert (!VOID_TYPE_P (type));

  if (TREE_CODE (type) == VECTOR_TYPE)
    {
      machine_mode inner = TYPE_MODE (TREE_TYPE (type));
      machine_mode simd
        = targetm.vectorize.preferred_simd_mode (inner);
      int simd_mode_size = GET_MODE_SIZE (simd);
      return ((GET_MODE_SIZE (TYPE_MODE (type)) + simd_mode_size - 1)
              / simd_mode_size);
    }

not sure why we override TYPE_MODE with preferred_simd_mode.  It's not
that the x86 backend will emit word_mode loads/stores for V2DFmode
loads/stores on i?86 with -mtune=atom?

Reply via email to