https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105122

--- Comment #2 from Richard Biener <rguenth at gcc dot gnu.org> ---
Ah, in GCC 11 we had

/* Max number of bytes we can move from memory to memory
   in one reasonably fast instruction.  */
#define MOVE_MAX 16

while in GCC 12 it is now

/* Max number of bytes we can move from memory to memory in one
   reasonably fast instruction, as opposed to MOVE_MAX_PIECES which
   is the number of bytes at a time which we can move efficiently.
   MOVE_MAX_PIECES defaults to MOVE_MAX.  */

#define MOVE_MAX \
  ((TARGET_AVX512F \
    && (ix86_move_max == PVW_AVX512 \
        || ix86_store_max == PVW_AVX512)) \
   ? 64 \
   : ((TARGET_AVX \
       && (ix86_move_max >= PVW_AVX256 \
           || ix86_store_max >= PVW_AVX256)) \
      ? 32 \
      : ((TARGET_SSE2 \
          && TARGET_SSE_UNALIGNED_LOAD_OPTIMAL \
          && TARGET_SSE_UNALIGNED_STORE_OPTIMAL) \
         ? 16 : UNITS_PER_WORD)))

and UNITS_PER_WORD is now 4.  Not sure if that was a conscious decision?
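
To make the fallback easier to see, here is a rough standalone model of
the GCC 12 selection above.  It is only a sketch: the struct, its field
names, the function name and the enum ordering are assumptions made for
illustration, not the real i386.h definitions.

```c
#include <assert.h>

/* Hypothetical stand-in for the vector-width tuning values; only the
   relative order matters for the >= comparisons below.  */
enum pvw { PVW_NONE, PVW_AVX128, PVW_AVX256, PVW_AVX512 };

/* Hypothetical bundle of the flags the macro tests.  */
struct target_model
{
  int avx512f, avx, sse2;
  int unaligned_load_optimal, unaligned_store_optimal;
  enum pvw move_max, store_max;
  int units_per_word;
};

/* Mirrors the nested conditional of the GCC 12 MOVE_MAX quoted above.  */
int
sketch_move_max (const struct target_model *t)
{
  assert (t->units_per_word > 0);
  if (t->avx512f
      && (t->move_max == PVW_AVX512 || t->store_max == PVW_AVX512))
    return 64;
  if (t->avx
      && (t->move_max >= PVW_AVX256 || t->store_max >= PVW_AVX256))
    return 32;
  if (t->sse2 && t->unaligned_load_optimal && t->unaligned_store_optimal)
    return 16;
  /* No usable vector moves: fall back to the word size.  */
  return t->units_per_word;
}
```

With every flag cleared and units_per_word set to 4 (roughly the i?86
-mno-sse situation), the model returns 4 where GCC 11 had a fixed 16.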

I'm not sure we want to "cheat" here.  For memcpy-6.c we have something like

char a[32];
void fold_copy_8 (void)
{
  __builtin_memcpy (a + 3, a, 8);
}

where if we'd try to use 'long long' we'd succeed (since 'a' is properly
aligned).  We'd have to use lang_hooks.types.type_for_mode to not get
too large types but even with that we'll happily use __int128_t on
i?86 with -mno-sse when copying 16 bytes.  The idea of using larger
types than MOVE_MAX was to restrict that to the cases where we do, say,

 __int128_t tem;
 memcpy (&tem, a, 16);

and thus the large type is used in the source already (and is > MOVE_MAX).
Similarly for 'double' on i?86, where we'd use DImode, for the sake
of removing abstraction.  But the gcc.dg/memcpy-6.c testcase should be
about RTL expansion (with the known lack of handling of memmove).
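
For reference, a source-level sketch of what going through a single
'long long' amounts to for the fold_copy_8 case.  This only illustrates
the effect and is not how the folding is implemented; the function name
is made up, and an 8-byte 'unsigned long long' is assumed.

```c
#include <assert.h>
#include <string.h>

char a[32];

/* Assumption for this sketch: 'unsigned long long' is exactly 8 bytes.  */
static_assert (sizeof (unsigned long long) == 8,
               "sketch assumes an 8-byte long long");

/* Copy 8 bytes from a to a + 3 through one 8-byte temporary.  All
   source bytes are read before any destination byte is written, so the
   overlap between the two ranges does not matter.  */
void
fold_copy_8_via_temp (void)
{
  unsigned long long tem;
  memcpy (&tem, a, sizeof tem);
  memcpy (a + 3, &tem, sizeof tem);
}
```

Each memcpy here has non-overlapping operands, so the sequence is well
defined even though source and destination within 'a' overlap.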

Btw, lang_hooks.types.type_for_mode reveals __int128_t via

#if HOST_BITS_PER_WIDE_INT >= 64
  if (mode == TYPE_MODE (intTI_type_node))
    return unsignedp ? unsigned_intTI_type_node : intTI_type_node;
#endif

even though we have

(gdb) p int_n_enabled_p
$2 = {false}
(gdb) p int_n_data[0]
$3 = {bitsize = 128, m = TImode}

of course __int128_t != unsigned __attribute__((mode(TI))), but ...
