https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103393
Hongtao.liu <crazylht at gmail dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |crazylht at gmail dot com

--- Comment #5 from Hongtao.liu <crazylht at gmail dot com> ---
(In reply to Richard Biener from comment #3)
> (In reply to H.J. Lu from comment #2)
> > (In reply to Richard Biener from comment #1)
> > > It isn't the vectorizer but memmove inline expansion.  I'm not sure it's
> > > really a bug, but there isn't a way to disable %ymm use besides
> > > disabling AVX entirely.
> > > HJ?
> > 
> > The YMM move is generated by loop distribution, which doesn't check
> > TARGET_PREFER_AVX128.
> 
> I think it's generated by gimple_fold_builtin_memory_op, which since
> Richard's changes now accepts bigger sizes, up to MOVE_MAX * MOVE_RATIO,
> and that ends up picking an integer mode via
> 
>   scalar_int_mode mode;
>   if (int_mode_for_size (ilen * 8, 0).exists (&mode)
>       && GET_MODE_SIZE (mode) * BITS_PER_UNIT == ilen * 8
>       && have_insn_for (SET, mode)
>       /* If the destination pointer is not aligned we must be able
>          to emit an unaligned store.  */
>       && (dest_align >= GET_MODE_ALIGNMENT (mode)
>           || !targetm.slow_unaligned_access (mode, dest_align)
>           || (optab_handler (movmisalign_optab, mode)
>               != CODE_FOR_nothing)))
> 
> Not sure if there's another way to validate things.

For a single set operation, shouldn't the total size be less than MOVE_MAX
instead of MOVE_MAX * MOVE_RATIO?

  /* If we can perform the copy efficiently with first doing all loads and
     then all stores inline it that way.  Currently efficiently means that
     we can load all the memory with a single set operation and that the
     total size is less than MOVE_MAX * MOVE_RATIO.  */
  src_align = get_pointer_alignment (src);
  dest_align = get_pointer_alignment (dest);
  if (tree_fits_uhwi_p (len)
      && (compare_tree_int
            (len, (MOVE_MAX
                   * MOVE_RATIO (optimize_function_for_size_p (cfun))))
          <= 0)
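
For illustration only, a hypothetical sketch of the alternative guard the
question above is suggesting (not a tested patch; it reuses the helpers
already present in gimple_fold_builtin_memory_op and simply drops the
MOVE_RATIO factor) might look like:

  /* Hypothetical variant of the size check: for the path that emits one
     load followed by one store, only accept copies that fit in a single
     native move, i.e. at most MOVE_MAX bytes.  */
  if (tree_fits_uhwi_p (len)
      && compare_tree_int (len, MOVE_MAX) <= 0)

Whether capping this path at MOVE_MAX is the right fix, or whether
MOVE_MAX * MOVE_RATIO is still intended here and only the mode selection
should respect the target's 128-bit preference (TARGET_PREFER_AVX128),
seems to be the open question in this report.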