https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393
Hongtao.liu <crazylht at gmail dot com> changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com
--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #3)
> (In reply to H.J. Lu from comment #2)
> > (In reply to Richard Biener from comment #1)
> > > It isn't the vectorizer but memmove inline expansion. I'm not sure it's
> > > really a bug, but there isn't a way to disable %ymm use besides disabling
> > > AVX entirely.
> > > HJ?
> >
> > YMM move is generated by loop distribution which doesn't check
> > TARGET_PREFER_AVX128.
>
> I think it's generated by gimple_fold_builtin_memory_op which, since Richard's
> changes, now accepts bigger sizes, up to MOVE_MAX * MOVE_RATIO, and ends up
> picking an integer mode via
>
> scalar_int_mode mode;
> if (int_mode_for_size (ilen * 8, 0).exists (&mode)
>     && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
>     && have_insn_for (SET, mode)
>     /* If the destination pointer is not aligned we must be able
>        to emit an unaligned store.  */
>     && (dest_align >= GET_MODE_ALIGNMENT (mode)
>         || !targetm.slow_unaligned_access (mode, dest_align)
>         || (optab_handler (movmisalign_optab, mode)
>             != CODE_FOR_nothing)))
>
> not sure if there's another way to validate things.
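
For context, here is a minimal user-level sketch of the kind of copy this
folding handles (my own illustration, not the testcase from this PR): with
AVX enabled, a fixed-size 32-byte memcpy presumably satisfies the check
above via a 256-bit integer mode and is then emitted as a %ymm move, even
when 128-bit vectors are preferred.

/* Hypothetical reproducer sketch, not taken from this PR.  */
#include <string.h>

void
copy32 (char *dst, const char *src)
{
  memcpy (dst, src, 32);
}
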
For a single set operation, shouldn't the total size be limited to MOVE_MAX
instead of MOVE_MAX * MOVE_RATIO?
  /* If we can perform the copy efficiently with first doing all loads and
     then all stores inline it that way.  Currently efficiently means that
     we can load all the memory with a single set operation and that the
     total size is less than MOVE_MAX * MOVE_RATIO.  */
  src_align = get_pointer_alignment (src);
  dest_align = get_pointer_alignment (dest);
  if (tree_fits_uhwi_p (len)
      && (compare_tree_int
          (len, (MOVE_MAX
                 * MOVE_RATIO (optimize_function_for_size_p (cfun))))
          <= 0)
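
A sketch of the tightened check I have in mind (my illustration only, not an
actual patch for this PR), replacing only the bound in the fragment above
with what a single set operation can move:

  /* Bound the inlined copy by one set operation, i.e. MOVE_MAX,
     instead of MOVE_MAX * MOVE_RATIO.  */
  if (tree_fits_uhwi_p (len)
      && compare_tree_int (len, MOVE_MAX) <= 0
      /* ... remaining conditions unchanged ... */)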