On Dec 11, 2023, Richard Biener <richard.guent...@gmail.com> wrote: > you can use .exists (&move_mode) here to ...
Aah, yeah, and that should help avoid the noisy as_a conversions too, that I could replace with require(), or drop altogether. >> + || (GET_MODE_BITSIZE (as_a <scalar_int_mode> (int_move_mode)) >> + != incr * BITS_PER_UNIT)) Unfortunately, here it can't quite be dropped, GET_MODE_SIZE on a machine_mode isn't suitable for the !=, but with .require() we know it's a scalar_int_mode and thus != on its bitsize is fine. > I'll note that int_mode_for_size and smallest_int_mode_for_size > are not semantically equivalent in what they can return. In > particular it seems you are incrementing by iter_incr even when > formerly smallest_int_mode_for_size would have returned a > larger than necessary mode, resulting in at least inefficient > code, copying/comparing pieces multiple times. If we get a mode that isn't exactly the expected size, we go for BLKmode, so it should be fine and efficient. Unless machine modes are not powers of two multiples of BITS_PER_UNIT, then things may get a little weird, not so much because of repeated copying/comparing, but because of inefficiencies in the block copy/compare operations with block sizes that are not a good fit for such hypothetical (?) GCC targets. I guess we can cross that bridge if we ever get to it. > So int_mode_for_size looks more correct. *nod*. IIRC I had it at first (very long ago), and went for the smallest_ when transitioning to the machine_mode type hierarchy revamp. > OK with the above change. Here's what I've regstrapped on x86_64-linux-gnu, and will install shortly. Thanks! smallest_int_mode_for_size may abort when the requested mode is not available. Call int_mode_for_size instead, that signals the unsatisfiable request in a more graceful way. for gcc/ChangeLog PR middle-end/112784 * expr.cc (emit_block_move_via_loop): Call int_mode_for_size for maybe-too-wide sizes. (emit_block_cmp_via_loop): Likewise. for gcc/testsuite/ChangeLog PR middle-end/112784 * gcc.target/i386/avx512cd-inline-stringops-pr112784.c: New. --- gcc/expr.cc | 20 +++++++++----------- .../i386/avx512cd-inline-stringops-pr112784.c | 12 ++++++++++++ 2 files changed, 21 insertions(+), 11 deletions(-) create mode 100644 gcc/testsuite/gcc.target/i386/avx512cd-inline-stringops-pr112784.c diff --git a/gcc/expr.cc b/gcc/expr.cc index 6da51f2aca296..076ba706537aa 100644 --- a/gcc/expr.cc +++ b/gcc/expr.cc @@ -2449,15 +2449,14 @@ emit_block_move_via_loop (rtx x, rtx y, rtx size, } emit_move_insn (iter, iter_init); - scalar_int_mode int_move_mode - = smallest_int_mode_for_size (incr * BITS_PER_UNIT); - if (GET_MODE_BITSIZE (int_move_mode) != incr * BITS_PER_UNIT) + opt_scalar_int_mode int_move_mode + = int_mode_for_size (incr * BITS_PER_UNIT, 1); + if (!int_move_mode.exists (&move_mode) + || GET_MODE_BITSIZE (int_move_mode.require ()) != incr * BITS_PER_UNIT) { move_mode = BLKmode; gcc_checking_assert (can_move_by_pieces (incr, align)); } - else - move_mode = int_move_mode; x_addr = force_operand (XEXP (x, 0), NULL_RTX); y_addr = force_operand (XEXP (y, 0), NULL_RTX); @@ -2701,16 +2700,15 @@ emit_block_cmp_via_loop (rtx x, rtx y, rtx len, tree len_type, rtx target, iter = gen_reg_rtx (iter_mode); emit_move_insn (iter, iter_init); - scalar_int_mode int_cmp_mode - = smallest_int_mode_for_size (incr * BITS_PER_UNIT); - if (GET_MODE_BITSIZE (int_cmp_mode) != incr * BITS_PER_UNIT - || !can_compare_p (NE, int_cmp_mode, ccp_jump)) + opt_scalar_int_mode int_cmp_mode + = int_mode_for_size (incr * BITS_PER_UNIT, 1); + if (!int_cmp_mode.exists (&cmp_mode) + || GET_MODE_BITSIZE (int_cmp_mode.require ()) != incr * BITS_PER_UNIT + || !can_compare_p (NE, cmp_mode, ccp_jump)) { cmp_mode = BLKmode; gcc_checking_assert (incr != 1); } - else - cmp_mode = int_cmp_mode; /* Save the base addresses. */ x_addr = force_operand (XEXP (x, 0), NULL_RTX); diff --git a/gcc/testsuite/gcc.target/i386/avx512cd-inline-stringops-pr112784.c b/gcc/testsuite/gcc.target/i386/avx512cd-inline-stringops-pr112784.c new file mode 100644 index 0000000000000..c81f99c693c24 --- /dev/null +++ b/gcc/testsuite/gcc.target/i386/avx512cd-inline-stringops-pr112784.c @@ -0,0 +1,12 @@ +/* { dg-do compile } */ +/* { dg-options "-mavx512cd -finline-stringops" } */ + +struct S { + int e; +} __attribute__((aligned(128))); + +int main() { + struct S s1; + struct S s2; + int v = __builtin_memcmp(&s1, &s2, sizeof(s1)); +} -- Alexandre Oliva, happy hacker https://FSFLA.org/blogs/lxo/ Free Software Activist GNU Toolchain Engineer More tolerance and less prejudice are key for inclusion and diversity Excluding neuro-others for not behaving ""normal"" is *not* inclusive