[gcc r15-6475] [RISC-V][PR target/115375] Fix expected dump output
https://gcc.gnu.org/g:d369ddca549b5ff7d868b8f5ee139835b1f9382a

commit r15-6475-gd369ddca549b5ff7d868b8f5ee139835b1f9382a
Author: Jeff Law
Date:   Mon Dec 30 23:40:58 2024 -0700

    [RISC-V][PR target/115375] Fix expected dump output

    Several months ago changes were made to the vectorizer which mucked up
    several of the scan tests. All but one of the cases in pr115375 have
    since been fixed.

    The remaining failure seems to be primarily a debugging dump issue --
    we're still selecting the same lmul values. This patch adjusts the dump
    scan appropriately.

            PR target/115375
    gcc/testsuite
            * gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c: Adjust
            expected output.

Diff:
---
 gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
index f045f857cc3f..793d16418bf1 100644
--- a/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
+++ b/gcc/testsuite/gcc.dg/vect/costmodel/riscv/rvv/dynamic-lmul8-9.c
@@ -14,6 +14,6 @@ foo (int64_t *__restrict a, int64_t init, int n)
 /* { dg-final { scan-assembler {e64,m8} } } */
 /* { dg-final { scan-assembler-not {csrr} } } */
 /* { dg-final { scan-tree-dump-not "Preferring smaller LMUL loop because it has unexpected spills" "vect" } } */
-/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 2 "vect" } } */
+/* { dg-final { scan-tree-dump-times "Maximum lmul = 8" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Maximum lmul = 4" 1 "vect" } } */
 /* { dg-final { scan-tree-dump-times "Maximum lmul = 2" 1 "vect" } } */
[gcc r15-6476] [PATCH v2] varasm: Use native_encode_rtx for constant vectors.
https://gcc.gnu.org/g:509df13fbf0b3544cd39a9e0a5de11ce841bb185

commit r15-6476-g509df13fbf0b3544cd39a9e0a5de11ce841bb185
Author: Robin Dapp
Date:   Mon Dec 30 23:47:53 2024 -0700

    [PATCH v2] varasm: Use native_encode_rtx for constant vectors.

    optimize_constant_pool hashes vector masks by native_encode_rtx and
    merges identically hashed values in the constant pool. Afterwards the
    optimized values are written in output_constant_pool_2.

    However, native_encode_rtx and output_constant_pool_2 disagree in their
    encoding of vector masks: native_encode_rtx does not pad with zeroes
    while output_constant_pool_2 implicitly does.

    In RVV's shuffle-evenodd-run.c there are two masks
      (a) "0101" for V4BI
      (b) "01010101" for V8BI
    which have the same representation/encoding ("1010101") in
    native_encode_rtx. output_constant_pool_2 uses "101" for (a) and
    "1010101" for (b).

    Now, optimize_constant_pool might happen to merge both masks using
    (a) as representative. Then, output_constant_pool_2 will output "1010"
    which is only valid for the first mask, as the implicit zero padding
    doesn't agree with (b).
    (b)'s "1010101" works for both masks as a V4BI load will ignore the
    last four padding bits.

    This patch makes output_constant_pool_2 use native_encode_rtx so both
    functions will agree on an encoding and output the correct constant.

            PR target/118036

    gcc/ChangeLog:

            * varasm.cc (output_constant_pool_2): Use native_encode_rtx for
            building the memory image of a const vector mask.

Diff:
---
 gcc/varasm.cc | 37 ++---
 1 file changed, 10 insertions(+), 27 deletions(-)

diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 0068ec2ce4dc..507da629619a 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -4301,34 +4301,17 @@ output_constant_pool_2 (fixed_size_mode mode, rtx x, unsigned int align)
         {
           gcc_assert (GET_CODE (x) == CONST_VECTOR);
 
-          /* Pick the smallest integer mode that contains at least one
-             whole element. Often this is byte_mode and contains more
-             than one element. */
-          unsigned int nelts = GET_MODE_NUNITS (mode);
-          unsigned int elt_bits = GET_MODE_PRECISION (mode) / nelts;
-          unsigned int int_bits = MAX (elt_bits, BITS_PER_UNIT);
-          scalar_int_mode int_mode = int_mode_for_size (int_bits, 0).require ();
-          unsigned int mask = GET_MODE_MASK (GET_MODE_INNER (mode));
-
-          /* We allow GET_MODE_PRECISION (mode) <= GET_MODE_BITSIZE (mode) but
-             only properly handle cases where the difference is less than a
-             byte. */
-          gcc_assert (GET_MODE_BITSIZE (mode) - GET_MODE_PRECISION (mode) <
-                      BITS_PER_UNIT);
-
-          /* Build the constant up one integer at a time. */
-          unsigned int elts_per_int = int_bits / elt_bits;
-          for (unsigned int i = 0; i < nelts; i += elts_per_int)
+          auto_vec buffer;
+          buffer.reserve (GET_MODE_SIZE (mode));
+
+          bool ok = native_encode_rtx (mode, x, buffer, 0, GET_MODE_SIZE (mode));
+          gcc_assert (ok);
+
+          for (unsigned i = 0; i < GET_MODE_SIZE (mode); i++)
             {
-              unsigned HOST_WIDE_INT value = 0;
-              unsigned int limit = MIN (nelts - i, elts_per_int);
-              for (unsigned int j = 0; j < limit; ++j)
-                {
-                  auto elt = INTVAL (CONST_VECTOR_ELT (x, i + j));
-                  value |= (elt & mask) << (j * elt_bits);
-                }
-              output_constant_pool_2 (int_mode, gen_int_mode (value, int_mode),
-                                      i != 0 ? MIN (align, int_bits) : align);
+              unsigned HOST_WIDE_INT value = buffer[i];
+              output_constant_pool_2 (byte_mode, gen_int_mode (value, byte_mode),
+                                      i == 0 ? align : 1);
             }
           break;
         }
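The encoding mismatch described in the varasm commit above can be modelled outside GCC. The following is only a sketch under stated assumptions: `pack_zero_padded` and `pack_pattern_extended` are hypothetical helpers, not GCC functions, and the second one assumes that the padding bits simply continue the element pattern, which is one way a 4-element and an 8-element mask could end up with identical bytes.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model (not GCC code): pack mask elements into bytes, least
// significant bit first, leaving the unused high bits of the last byte zero.
std::vector<std::uint8_t> pack_zero_padded(const std::vector<bool>& mask) {
    std::vector<std::uint8_t> bytes((mask.size() + 7) / 8, 0);
    for (std::size_t i = 0; i < mask.size(); ++i)
        if (mask[i])
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    return bytes;
}

// Hypothetical model (assumption): same packing, but the padding bits repeat
// the element pattern instead of being forced to zero.
std::vector<std::uint8_t> pack_pattern_extended(const std::vector<bool>& mask) {
    std::vector<std::uint8_t> bytes((mask.size() + 7) / 8, 0);
    const std::size_t total_bits = bytes.size() * 8;
    for (std::size_t i = 0; i < total_bits; ++i)
        if (mask[i % mask.size()])
            bytes[i / 8] |= static_cast<std::uint8_t>(1u << (i % 8));
    return bytes;
}
```

With a 4-element mask `{1,0,1,0}` and an 8-element mask `{1,0,1,0,1,0,1,0}`, the pattern-extended packing yields the same byte for both, while the zero-padded packing does not. A pool entry merged on the first encoding but written with the second is then only safe for the shorter mask.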
[gcc r15-6471] Fortran: Implement f_c_string function.
https://gcc.gnu.org/g:efc0981077a70c4de4596f682c4aeade07ec2f17 commit r15-6471-gefc0981077a70c4de4596f682c4aeade07ec2f17 Author: Steven G. Kargl Date: Sun Dec 29 14:19:18 2024 -0800 Fortran: Implement f_c_string function. Fortran 2023 has added the new intrinsic function F_C_STRING to convert fortran strings of default character kind to a null terminated C string. Contributions from Steve Kargl, Harald Anlauf, FX Coudert, Mikael Morin, and Jerry DeLisle. PR fortran/117643 gcc/fortran/ChangeLog: * check.cc (gfc_check_f_c_string): Check arguments of f_c_string(). * gfortran.h (enum gfc_isym_id): New symbol GFC_ISYM_F_C_STRING. * intrinsic.cc (add_functions): Add the ISO C Binding routine f_c_string(). Wrap nearby long line to less than 80 characters. * intrinsic.h (gfc_check_f_c_string): Prototype for gfc_check_f_c_string(). * iso-c-binding.def (NAMED_FUNCTION): Declare for ISO C Binding routine f_c_string(). * primary.cc (gfc_match_rvalue): Fix comment that has been untrue since 2011. Add ISOCBINDING_F_C_STRING to conditional. * trans-intrinsic.cc (conv_trim): Specialized version of trim() for f_c_string(). (gfc_conv_intrinsic_function): Use GFC_ISYM_F_C_STRING to trigger in-lining. gcc/testsuite/ChangeLog: * gfortran.dg/f_c_string1.f90: New test. * gfortran.dg/f_c_string2.f90: New test. Diff: --- gcc/fortran/check.cc | 36 ++ gcc/fortran/gfortran.h| 1 + gcc/fortran/intrinsic.cc | 11 +- gcc/fortran/intrinsic.h | 1 + gcc/fortran/iso-c-binding.def | 3 + gcc/fortran/primary.cc| 5 +- gcc/fortran/trans-intrinsic.cc| 182 +- gcc/testsuite/gfortran.dg/f_c_string1.f90 | 49 gcc/testsuite/gfortran.dg/f_c_string2.f90 | 50 9 files changed, 329 insertions(+), 9 deletions(-) diff --git a/gcc/fortran/check.cc b/gcc/fortran/check.cc index f4fde83e8ab5..08cc88ba7cbd 100644 --- a/gcc/fortran/check.cc +++ b/gcc/fortran/check.cc @@ -1829,6 +1829,42 @@ gfc_check_image_status (gfc_expr *image, gfc_expr *team) } +/* Check the arguments for f_c_string. 
*/ + +bool +gfc_check_f_c_string (gfc_expr *string, gfc_expr *asis) +{ + + if (gfc_invalid_null_arg (string)) +return false; + + if (!scalar_check (string, 0)) +return false; + + if (string->ts.type != BT_CHARACTER + || (string->ts.type == BT_CHARACTER + && (string->ts.kind != gfc_default_character_kind))) +{ + gfc_error ("%qs argument of %qs intrinsic at %L shall have " +"a type of CHARACTER(KIND=C_CHAR)", +gfc_current_intrinsic_arg[0]->name, gfc_current_intrinsic, +&string->where); + return false; +} + + if (asis) +{ + if (!type_check (asis, 1, BT_LOGICAL)) + return false; + + if (!scalar_check (asis, 1)) + return false; +} + + return true; +} + + bool gfc_check_failed_or_stopped_images (gfc_expr *team, gfc_expr *kind) { diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h index 87307c5531e9..a2c8ebc6b3ef 100644 --- a/gcc/fortran/gfortran.h +++ b/gcc/fortran/gfortran.h @@ -508,6 +508,7 @@ enum gfc_isym_id GFC_ISYM_EXP, GFC_ISYM_EXPONENT, GFC_ISYM_EXTENDS_TYPE_OF, + GFC_ISYM_F_C_STRING, GFC_ISYM_FAILED_IMAGES, GFC_ISYM_FDATE, GFC_ISYM_FE_RUNTIME_ERROR, diff --git a/gcc/fortran/intrinsic.cc b/gcc/fortran/intrinsic.cc index a2e241280c38..d4db6abe7b4d 100644 --- a/gcc/fortran/intrinsic.cc +++ b/gcc/fortran/intrinsic.cc @@ -3145,6 +3145,14 @@ add_functions (void) x, BT_UNKNOWN, 0, REQUIRED); make_from_module(); + add_sym_2 ("f_c_string", GFC_ISYM_F_C_STRING, CLASS_TRANSFORMATIONAL, +ACTUAL_NO, +BT_CHARACTER, dc, GFC_STD_F2023, +gfc_check_f_c_string, NULL, NULL, +stg, BT_CHARACTER, dc, REQUIRED, +"asis", BT_CHARACTER, dc, OPTIONAL); + make_from_module(); + add_sym_1 ("c_sizeof", GFC_ISYM_C_SIZEOF, CLASS_INQUIRY, ACTUAL_NO, BT_INTEGER, gfc_index_integer_kind, GFC_STD_F2008, gfc_check_c_sizeof, gfc_simplify_sizeof, NULL, @@ -3301,7 +3309,8 @@ add_functions (void) make_generic ("transpose", GFC_ISYM_TRANSPOSE, GFC_STD_F95); - add_sym_1 ("trim", GFC_ISYM_TRIM, CLASS_TRANSFORMATIONAL, ACTUAL_NO, BT_CHARACTER, dc, GFC_STD_F95, + add_sym_1 ("trim", GFC_ISYM_TRIM, 
CLASS_TRANSFORMATIONAL, ACTUAL_NO, +BT_CHARACTER, dc, GFC_STD_F95, gfc_check_trim, gfc_simplify_trim, gfc_resolve_trim, stg, BT_CHARACTER, dc, REQUIRED); diff --git a/gcc/fortran/intrinsic.h b/gcc/fortran/intrinsic.h index 61d85eedc693..640d1bc15ebd 100644 --- a/gcc/fortran/intrinsic.
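As a rough illustration of what the new intrinsic computes, here is a C++ model of the result value. It is a sketch, not the gfortran implementation: the function name merely mirrors the Fortran intrinsic, and it assumes the Fortran 2023 semantics in which trailing blanks are trimmed unless ASIS is present and true, after which a null character is appended.

```cpp
#include <cstddef>
#include <string>

// Hypothetical C++ model of F_C_STRING (STRING [, ASIS]).
// Assumption: the result is TRIM(STRING)//C_NULL_CHAR by default and
// STRING//C_NULL_CHAR when asis is true.
std::string f_c_string(const std::string& s, bool asis = false) {
    std::string out = s;
    if (!asis) {
        // Drop trailing blanks; an all-blank string becomes empty.
        const std::size_t last = out.find_last_not_of(' ');
        out.erase(last == std::string::npos ? 0 : last + 1);
    }
    out.push_back('\0');  // explicit terminator, counted in the length
    return out;
}
```

Called as `f_c_string("abc  ")`, the model returns four characters (`abc` plus the terminator); with `asis` set to true the two trailing blanks are kept and the result has six.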
[gcc r15-6472] [RISC-V][PR target/106544] Avoid ICEs due to bogus asms
https://gcc.gnu.org/g:07e532a0608640b9e57ae6fc3a0ca83c9afc75a1 commit r15-6472-g07e532a0608640b9e57ae6fc3a0ca83c9afc75a1 Author: Jeff Law Date: Mon Dec 30 13:51:55 2024 -0700 [RISC-V][PR target/106544] Avoid ICEs due to bogus asms This is a fix for a bug Andrew P filed a while back where essentially a poorly crafted asm statement could trigger a ICE during assembly output. Various cases will use INTVAL (op) without verifying the operand is a CONST_INT node first. The usual way to handle this is via output_operand_lossage, which this patch implements. I focused primarily on the CONST_INT cases, there could well be other problems in this space, if so they should get distinct bugs with testcases. Tested in my tester on rv32 and rv64. Waiting for pre-commit testing before moving forward. PR target/106544 gcc/ * config/riscv/riscv.cc (riscv_print_operand): Issue an error for invalid operands rather than invalidly accessing INTVAL of an object that is not a CONST_INT. Fix one error string for 'N'. gcc/testsuite * gcc.target/riscv/pr106544.c: New test. 
Diff: --- gcc/config/riscv/riscv.cc | 169 +- gcc/testsuite/gcc.target/riscv/pr106544.c | 6 ++ 2 files changed, 104 insertions(+), 71 deletions(-) diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc index 1374868eddfb..08754be529e9 100644 --- a/gcc/config/riscv/riscv.cc +++ b/gcc/config/riscv/riscv.cc @@ -7009,62 +7009,77 @@ riscv_print_operand (FILE *file, rtx op, int letter) fputs (GET_RTX_NAME (reverse_condition (code)), file); break; -case 'A': { - const enum memmodel model = memmodel_base (INTVAL (op)); - if (riscv_memmodel_needs_amo_acquire (model) - && riscv_memmodel_needs_amo_release (model)) - fputs (".aqrl", file); - else if (riscv_memmodel_needs_amo_acquire (model)) - fputs (".aq", file); - else if (riscv_memmodel_needs_amo_release (model)) - fputs (".rl", file); +case 'A': + if (!CONST_INT_P (op)) + output_operand_lossage ("invalid operand for '%%%c'", letter); + else + { + const enum memmodel model = memmodel_base (INTVAL (op)); + if (riscv_memmodel_needs_amo_acquire (model) + && riscv_memmodel_needs_amo_release (model)) + fputs (".aqrl", file); + else if (riscv_memmodel_needs_amo_acquire (model)) + fputs (".aq", file); + else if (riscv_memmodel_needs_amo_release (model)) + fputs (".rl", file); + } break; -} -case 'I': { - const enum memmodel model = memmodel_base (INTVAL (op)); - if (TARGET_ZTSO && model != MEMMODEL_SEQ_CST) - /* LR ops only have an annotation for SEQ_CST in the Ztso mapping. */ - break; - else if (model == MEMMODEL_SEQ_CST) - fputs (".aqrl", file); - else if (riscv_memmodel_needs_amo_acquire (model)) - fputs (".aq", file); +case 'I': + if (!CONST_INT_P (op)) + output_operand_lossage ("invalid operand for '%%%c'", letter); + else + { + const enum memmodel model = memmodel_base (INTVAL (op)); + if (TARGET_ZTSO && model != MEMMODEL_SEQ_CST) + /* LR ops only have an annotation for SEQ_CST in the Ztso mapping. 
*/ + break; + else if (model == MEMMODEL_SEQ_CST) + fputs (".aqrl", file); + else if (riscv_memmodel_needs_amo_acquire (model)) + fputs (".aq", file); + } break; -} -case 'J': { - const enum memmodel model = memmodel_base (INTVAL (op)); - if (TARGET_ZTSO && model == MEMMODEL_SEQ_CST) - /* SC ops only have an annotation for SEQ_CST in the Ztso mapping. */ - fputs (".rl", file); - else if (TARGET_ZTSO) - break; - else if (riscv_memmodel_needs_amo_release (model)) - fputs (".rl", file); +case 'J': + if (!CONST_INT_P (op)) + output_operand_lossage ("invalid operand for '%%%c'", letter); + else + { + const enum memmodel model = memmodel_base (INTVAL (op)); + if (TARGET_ZTSO && model == MEMMODEL_SEQ_CST) + /* SC ops only have an annotation for SEQ_CST in the Ztso mapping. */ + fputs (".rl", file); + else if (TARGET_ZTSO) + break; + else if (riscv_memmodel_needs_amo_release (model)) + fputs (".rl", file); + } break; -} case 'L': - { - const char *ntl_hint = NULL; - switch (INTVAL (op)) - { - case 0: - ntl_hint = "ntl.all"; - break; - case 1: - ntl_hint = "ntl.pall"; - break; - case 2: - ntl_hint = "ntl.p1"; - break; - } + if (!CONST_INT_P (op)) + output_operand_lossage ("invalid o
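The guard this patch adds (check that the operand is a constant integer before reading its value, and report an error otherwise) can be sketched in isolation. Everything below is a hypothetical stand-in: `Operand`, `MemModel` and `print_amo_suffix` are illustrative names, and the memory-model encoding is made up for the example rather than taken from GCC.

```cpp
#include <string>

// Hypothetical stand-ins for an rtx operand and a memory-model constant.
enum class OpKind { ConstInt, Reg };
struct Operand { OpKind kind; long value; };  // value is meaningful only for ConstInt
enum MemModel { Relaxed = 0, Acquire = 1, Release = 2, AcqRel = 3 };  // assumed encoding

struct PrintResult { bool ok; std::string text; };

// Mirrors the shape of the patched 'A' case: validate first, then read.
PrintResult print_amo_suffix(const Operand& op, char letter) {
    if (op.kind != OpKind::ConstInt)
        return {false, std::string("invalid operand for '%") + letter + "'"};
    switch (op.value) {
        case AcqRel:  return {true, ".aqrl"};
        case Acquire: return {true, ".aq"};
        case Release: return {true, ".rl"};
        default:      return {true, ""};
    }
}
```

A register operand passed with letter `'A'` yields an error result instead of an out-of-contract read, which is the behaviour the commit message describes for badly formed asm statements.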
[gcc r15-6464] avoid-store-forwarding: fix reg init on load-eliminiation [PR117835]
https://gcc.gnu.org/g:c86e1c54c6f8771d08a8c070717b80607f990f8a

commit r15-6464-gc86e1c54c6f8771d08a8c070717b80607f990f8a
Author: kelefth
Date:   Mon Dec 16 14:36:59 2024 +0100

    avoid-store-forwarding: fix reg init on load-eliminiation [PR117835]

    During the initialization of the base register for the zero-offset
    store, in the case that we are eliminating the load, we used a
    paradoxical subreg assuming that we don't care about the higher bits
    of the register. This led to writing wrong values when we were not
    updating the whole register.

    This patch fixes the issue by zero-extending the value stored in the
    base register instead of using a paradoxical subreg.

    Bootstrapped/regtested on x86 and AArch64.

            PR rtl-optimization/117835
            PR rtl-optimization/117872

    gcc/ChangeLog:

            * avoid-store-forwarding.cc
            (store_forwarding_analyzer::process_store_forwarding):
            Zero-extend the value stored in the base register instead of
            using a paradoxical subreg.

    gcc/testsuite/ChangeLog:

            * gcc.target/i386/pr117835.c: New test.

Diff:
---
 gcc/avoid-store-forwarding.cc            | 6 +-
 gcc/testsuite/gcc.target/i386/pr117835.c | 20
 2 files changed, 21 insertions(+), 5 deletions(-)

diff --git a/gcc/avoid-store-forwarding.cc b/gcc/avoid-store-forwarding.cc
index 1b8c35bc6cb7..fa83e10fedca 100644
--- a/gcc/avoid-store-forwarding.cc
+++ b/gcc/avoid-store-forwarding.cc
@@ -238,11 +238,7 @@ process_store_forwarding (vec &stores, rtx_insn *load_insn,
     {
       start_sequence ();
 
-      /* We can use a paradoxical subreg to force this to a wider mode, as
-         the only use will be inserting the bits (i.e., we don't care about
-         the value of the higher bits). */
-      rtx ext0 = lowpart_subreg (GET_MODE (dest), it->mov_reg,
-                                 GET_MODE (it->mov_reg));
+      rtx ext0 = gen_rtx_ZERO_EXTEND (GET_MODE (dest), it->mov_reg);
       if (ext0)
         {
           rtx_insn *move0 = emit_move_insn (dest, ext0);
diff --git a/gcc/testsuite/gcc.target/i386/pr117835.c b/gcc/testsuite/gcc.target/i386/pr117835.c
new file mode 100644
index ..eac71aac916b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr117835.c
@@ -0,0 +1,20 @@
+/* { dg-do run } */
+/* { dg-options "-O -favoid-store-forwarding -mno-push-args --param=store-forwarding-max-distance=0 -Wno-psabi" } */
+
+typedef __attribute__((__vector_size__ (64))) unsigned short V;
+
+__attribute__((__noipa__)) V
+foo (V v, V)
+{
+  return v;
+}
+
+int main ()
+{
+  V a = (V){3, 5, 0, 8, 9, 3, 5, 1, 3, 4, 2, 5, 5, 0, 5, 3, 61886};
+  V b = (V){6, 80, 15, 2, 2, 1, 1, 3, 5};
+  V x = foo (a, b);
+  for (unsigned i = 0; i < sizeof(x)/sizeof(x[0]); i++)
+    if (x[i] != a[i])
+      __builtin_abort();
+}
\ No newline at end of file
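The difference between the two widening strategies in the store-forwarding fix can be illustrated with plain integers. This is a simplified model, not RTL: `widen_paradoxical` and `widen_zero_extend` are hypothetical names, and the `stale` parameter stands in for whatever the upper bits of the register happened to contain, which a paradoxical subreg leaves undefined.

```cpp
#include <cstdint>

// Model of a paradoxical subreg: only the low 16 bits are defined; the upper
// bits keep whatever the wide register held before (passed in as `stale`).
std::uint64_t widen_paradoxical(std::uint16_t low, std::uint64_t stale) {
    return (stale & ~std::uint64_t{0xFFFF}) | low;
}

// Model of ZERO_EXTEND: the upper bits are defined to be zero.
std::uint64_t widen_zero_extend(std::uint16_t low) {
    return low;
}
```

If later bit insertions do not overwrite every upper bit, the stale bits from the first form survive into the final value, which matches the "wrong values when we were not updating the whole register" symptom in the commit message.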
[gcc r15-6466] aarch64: Macroise simd_type definitions
https://gcc.gnu.org/g:5f5b1a3625b14b8ee3d4967726242e10c4ab8fbb commit r15-6466-g5f5b1a3625b14b8ee3d4967726242e10c4ab8fbb Author: Richard Sandiford Date: Mon Dec 30 12:50:55 2024 + aarch64: Macroise simd_type definitions This patch tries to regularise the definitions of the new pragma simd types. Not all of the new types are currently used, but they will be by later patches. gcc/ * config/aarch64/aarch64-builtins.cc (simd_types): Use one macro invocation for each element type. Diff: --- gcc/config/aarch64/aarch64-builtins.cc | 65 +- 1 file changed, 32 insertions(+), 33 deletions(-) diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index ca1dc5a3e6a7..bad97181cf60 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -1637,41 +1637,40 @@ struct simd_type { }; namespace simd_types { - constexpr simd_type f8 { V8QImode, qualifier_modal_float }; - constexpr simd_type f8q { V16QImode, qualifier_modal_float }; - constexpr simd_type p8 { V8QImode, qualifier_poly }; - constexpr simd_type p8q { V16QImode, qualifier_poly }; - constexpr simd_type s8 { V8QImode, qualifier_none }; - constexpr simd_type s8q { V16QImode, qualifier_none }; - constexpr simd_type u8 { V8QImode, qualifier_unsigned }; - constexpr simd_type u8q { V16QImode, qualifier_unsigned }; - - constexpr simd_type f16 { V4HFmode, qualifier_none }; - constexpr simd_type f16q { V8HFmode, qualifier_none }; - constexpr simd_type f16qx2 { V2x8HFmode, qualifier_none }; - constexpr simd_type p16 { V4HImode, qualifier_poly }; - constexpr simd_type p16q { V8HImode, qualifier_poly }; - constexpr simd_type p16qx2 { V2x8HImode, qualifier_poly }; - constexpr simd_type s16 { V4HImode, qualifier_none }; - constexpr simd_type s16q { V8HImode, qualifier_none }; - constexpr simd_type s16qx2 { V2x8HImode, qualifier_none }; - constexpr simd_type u16 { V4HImode, qualifier_unsigned }; - constexpr simd_type u16q { V8HImode, qualifier_unsigned }; - 
constexpr simd_type u16qx2 { V2x8HImode, qualifier_unsigned }; - - constexpr simd_type bf16 { V4BFmode, qualifier_none }; - constexpr simd_type bf16q { V8BFmode, qualifier_none }; - constexpr simd_type bf16qx2 { V2x8BFmode, qualifier_none }; - - constexpr simd_type f32 { V2SFmode, qualifier_none }; - constexpr simd_type f32q { V4SFmode, qualifier_none }; - constexpr simd_type s32 { V2SImode, qualifier_none }; - constexpr simd_type s32q { V4SImode, qualifier_none }; - - constexpr simd_type f64q { V2DFmode, qualifier_none }; - constexpr simd_type s64q { V2DImode, qualifier_none }; +#define VARIANTS(BASE, D, Q, MODE, QUALIFIERS) \ + constexpr simd_type BASE { V##D##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##x2 { V2x##D##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##x3 { V3x##D##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##x4 { V4x##D##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##q { V##Q##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##qx2 { V2x##Q##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##qx3 { V3x##Q##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##qx4 { V4x##Q##MODE, QUALIFIERS }; \ + constexpr simd_type BASE##_scalar { MODE, QUALIFIERS }; + + VARIANTS (f8, 8, 16, QImode, qualifier_modal_float) + VARIANTS (p8, 8, 16, QImode, qualifier_poly) + VARIANTS (s8, 8, 16, QImode, qualifier_none) + VARIANTS (u8, 8, 16, QImode, qualifier_unsigned) + + VARIANTS (bf16, 4, 8, BFmode, qualifier_none) + VARIANTS (f16, 4, 8, HFmode, qualifier_none) + VARIANTS (p16, 4, 8, HImode, qualifier_poly) + VARIANTS (s16, 4, 8, HImode, qualifier_none) + VARIANTS (u16, 4, 8, HImode, qualifier_unsigned) + + VARIANTS (f32, 2, 4, SFmode, qualifier_none) + VARIANTS (p32, 2, 4, SImode, qualifier_poly) + VARIANTS (s32, 2, 4, SImode, qualifier_none) + VARIANTS (u32, 2, 4, SImode, qualifier_unsigned) + + VARIANTS (f64, 1, 2, DFmode, qualifier_none) + VARIANTS (p64, 1, 2, DImode, qualifier_poly) + VARIANTS (s64, 1, 2, DImode, qualifier_none) + VARIANTS (u64, 1, 2, 
DImode, qualifier_unsigned) constexpr simd_type none { VOIDmode, qualifier_none }; +#undef VARIANTS } }
[gcc r15-6465] Don't include subst attributes in "@" md helpers
https://gcc.gnu.org/g:a7d974136239adf62010f56fc0ad26a88928af46 commit r15-6465-ga7d974136239adf62010f56fc0ad26a88928af46 Author: Richard Sandiford Date: Mon Dec 30 12:50:54 2024 + Don't include subst attributes in "@" md helpers In a later patch, I need to add "@" to a pattern that uses subst attributes. This combination is problematic for two reasons: (1) define_substs are applied and filtered at a later stage than the handling of "@" patterns, so that the handling of "@" patterns doesn't know which subst variants are valid and which will later be dropped. Just adding a "@" therefore triggers a build error due to references to non-existent patterns. (2) Currently, the code will treat a single "@" pattern as contributing to a single set of overloaded functions. These overloaded functions will have an integer argument for every subst attribute. For example, the vczle and vczbe in: "@aarch64_rev" are subst attributes, and so currently we'd try to generate a single set of overloads that take four arguments: one for rev_op, one for the mode, one for vczle, and one for vczbe. The gen_* and maybe_gen_* functions will also have one rtx argument for each operand in the original pattern. This model doesn't really make sense for define_substs, since define_substs are allowed to add extra operands to an instruction. The number of rtx operands to the generators would then be incorrect. I think a more sensible way of handling define_substs would be to apply them first (and thus expand things like and above) and then apply "@". However, that's a relatively invasive change and not suitable for stage 3. This patch instead skips over subst attributes and restricts "@" overload handling to the cases where no define_subst is applied. I looked through all uses of "@" names in target code and there seemed to be only one current use of "@" with define_substs, in x86 vector code. The current behaviour seemed to be unwanted there, and the x86 code was having to work around it. 
gcc/ * read-rtl.cc (md_reader::handle_overloaded_name): Don't add arguments for uses of subst attributes. (apply_iterators): Only add instructions to an overloaded helper if they use the default subst iterator values. * doc/md.texi: Update documentation accordingly. * config/i386/i386-expand.cc (expand_vec_perm_broadcast_1): Update accordingly. Diff: --- gcc/config/i386/i386-expand.cc | 4 ++-- gcc/doc/md.texi| 5 + gcc/read-rtl.cc| 18 ++ 3 files changed, 21 insertions(+), 6 deletions(-) diff --git a/gcc/config/i386/i386-expand.cc b/gcc/config/i386/i386-expand.cc index 6d25477841a9..7f1dcd0937bb 100644 --- a/gcc/config/i386/i386-expand.cc +++ b/gcc/config/i386/i386-expand.cc @@ -23884,7 +23884,7 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d) if (d->testing_p) return true; - rtx (*gen_interleave) (machine_mode, int, rtx, rtx, rtx); + rtx (*gen_interleave) (machine_mode, rtx, rtx, rtx); if (elt >= nelt2) { gen_interleave = gen_vec_interleave_high; @@ -23895,7 +23895,7 @@ expand_vec_perm_broadcast_1 (struct expand_vec_perm_d *d) nelt2 /= 2; dest = gen_reg_rtx (vmode); - emit_insn (gen_interleave (vmode, 1, dest, op0, op0)); + emit_insn (gen_interleave (vmode, dest, op0, op0)); vmode = V4SImode; op0 = gen_lowpart (vmode, dest); diff --git a/gcc/doc/md.texi b/gcc/doc/md.texi index 0457bb56dcd0..82054b226ba9 100644 --- a/gcc/doc/md.texi +++ b/gcc/doc/md.texi @@ -12405,4 +12405,9 @@ output and three inputs). This combination would produce separate each operand count, but it would still produce a single @samp{maybe_code_for_@var{name}} and a single @samp{code_for_@var{name}}. +Currently, these @code{@@} patterns only take into account patterns for +which no @code{define_subst} has been applied (@pxref{Define Subst}). +Any @samp{<@dots{}>} placeholders that refer to subst attributes +(@pxref{Subst Iterators}) are ignored. 
+ @end ifset diff --git a/gcc/read-rtl.cc b/gcc/read-rtl.cc index 195f78bd5e16..49a5306254b0 100644 --- a/gcc/read-rtl.cc +++ b/gcc/read-rtl.cc @@ -814,9 +814,14 @@ md_reader::handle_overloaded_name (rtx original, vec *iterators) pending_underscore_p = false; } - /* Record an argument for ITERATOR. */ - iterators->safe_push (iterator); - tmp_oname.arg_types.safe_push (iterator->group->type); + /* Skip define_subst iterators, since define_substs are allowed to +add new match_operands in their output templates. */ + if (iterator->group != &substs) + { + /* Record an a
[gcc r15-6467] aarch64: Use mf8 instead of f8 in builtin definitions
https://gcc.gnu.org/g:834939a82ea23daaf99c58ea1694079f22eca6f4 commit r15-6467-g834939a82ea23daaf99c58ea1694079f22eca6f4 Author: Richard Sandiford Date: Mon Dec 30 12:50:55 2024 + aarch64: Use mf8 instead of f8 in builtin definitions The intrinsic type suffix for modal floating-point types is _mf8, so it's more convenient if we use that for the simd_types as well. gcc/ * config/aarch64/aarch64-builtins.cc (simd_types::f8): Rename to... (simd_types::mf8): ...this. * config/aarch64/aarch64-simd-pragma-builtins.def: Update accordingly. Diff: --- gcc/config/aarch64/aarch64-builtins.cc | 2 +- .../aarch64/aarch64-simd-pragma-builtins.def | 42 +++--- 2 files changed, 22 insertions(+), 22 deletions(-) diff --git a/gcc/config/aarch64/aarch64-builtins.cc b/gcc/config/aarch64/aarch64-builtins.cc index bad97181cf60..9d1d0260e739 100644 --- a/gcc/config/aarch64/aarch64-builtins.cc +++ b/gcc/config/aarch64/aarch64-builtins.cc @@ -1648,7 +1648,7 @@ namespace simd_types { constexpr simd_type BASE##qx4 { V4x##Q##MODE, QUALIFIERS }; \ constexpr simd_type BASE##_scalar { MODE, QUALIFIERS }; - VARIANTS (f8, 8, 16, QImode, qualifier_modal_float) + VARIANTS (mf8, 8, 16, QImode, qualifier_modal_float) VARIANTS (p8, 8, 16, QImode, qualifier_poly) VARIANTS (s8, 8, 16, QImode, qualifier_none) VARIANTS (u8, 8, 16, QImode, qualifier_unsigned) diff --git a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def index 5dafa7bb6b91..8924262cc53e 100644 --- a/gcc/config/aarch64/aarch64-simd-pragma-builtins.def +++ b/gcc/config/aarch64/aarch64-simd-pragma-builtins.def @@ -91,24 +91,24 @@ #undef ENTRY_VDOT_FPM #define ENTRY_VDOT_FPM(T) \ - ENTRY_TERNARY (vdot_##T##_mf8_fpm, T, T, f8, f8, \ + ENTRY_TERNARY (vdot_##T##_mf8_fpm, T, T, mf8, mf8, \ UNSPEC_FDOT_FP8, FP8) \ - ENTRY_TERNARY (vdotq_##T##_mf8_fpm, T##q, T##q, f8q, f8q,\ + ENTRY_TERNARY (vdotq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q, \ UNSPEC_FDOT_FP8, FP8) \ - ENTRY_TERNARY_LANE 
(vdot_lane_##T##_mf8_fpm, T, T, f8, f8, \ + ENTRY_TERNARY_LANE (vdot_lane_##T##_mf8_fpm, T, T, mf8, mf8, \ UNSPEC_FDOT_LANE_FP8, FP8)\ - ENTRY_TERNARY_LANE (vdot_laneq_##T##_mf8_fpm, T, T, f8, f8q, \ + ENTRY_TERNARY_LANE (vdot_laneq_##T##_mf8_fpm, T, T, mf8, mf8q, \ UNSPEC_FDOT_LANE_FP8, FP8)\ - ENTRY_TERNARY_LANE (vdotq_lane_##T##_mf8_fpm, T##q, T##q, f8q, f8, \ + ENTRY_TERNARY_LANE (vdotq_lane_##T##_mf8_fpm, T##q, T##q, mf8q, mf8, \ UNSPEC_FDOT_LANE_FP8, FP8)\ - ENTRY_TERNARY_LANE (vdotq_laneq_##T##_mf8_fpm, T##q, T##q, f8q, f8q, \ + ENTRY_TERNARY_LANE (vdotq_laneq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q,\ UNSPEC_FDOT_LANE_FP8, FP8) #undef ENTRY_FMA_FPM #define ENTRY_FMA_FPM(N, T, U) \ - ENTRY_TERNARY (N##q_##T##_mf8_fpm, T##q, T##q, f8q, f8q, U, FP8) \ - ENTRY_TERNARY_LANE (N##q_lane_##T##_mf8_fpm, T##q, T##q, f8q, f8, U, FP8) \ - ENTRY_TERNARY_LANE (N##q_laneq_##T##_mf8_fpm, T##q, T##q, f8q, f8q, U, FP8) + ENTRY_TERNARY (N##q_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q, U, FP8) \ + ENTRY_TERNARY_LANE (N##q_lane_##T##_mf8_fpm, T##q, T##q, mf8q, mf8, U, FP8) \ + ENTRY_TERNARY_LANE (N##q_laneq_##T##_mf8_fpm, T##q, T##q, mf8q, mf8q, U, FP8) // faminmax #define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_FAMINMAX) @@ -131,18 +131,18 @@ ENTRY_TERNARY_VLUT16 (u) // fpm conversion #define REQUIRED_EXTENSIONS nonstreaming_only (AARCH64_FL_FP8) -ENTRY_UNARY_VQ_BHF (vcvt1, f8, UNSPEC_F1CVTL_FP8, FP8) -ENTRY_UNARY_VQ_BHF (vcvt1_high, f8q, UNSPEC_F1CVTL2_FP8, FP8) -ENTRY_UNARY_VQ_BHF (vcvt1_low, f8q, UNSPEC_F1CVTL_FP8, FP8) -ENTRY_UNARY_VQ_BHF (vcvt2, f8, UNSPEC_F2CVTL_FP8, FP8) -ENTRY_UNARY_VQ_BHF (vcvt2_high, f8q, UNSPEC_F2CVTL2_FP8, FP8) -ENTRY_UNARY_VQ_BHF (vcvt2_low, f8q, UNSPEC_F2CVTL_FP8, FP8) - -ENTRY_BINARY (vcvt_mf8_f16_fpm, f8, f16, f16, UNSPEC_FCVTN_FP8, FP8) -ENTRY_BINARY (vcvtq_mf8_f16_fpm, f8q, f16q, f16q, UNSPEC_FCVTN_FP8, FP8) -ENTRY_BINARY (vcvt_mf8_f32_fpm, f8, f32q, f32q, UNSPEC_FCVTN_FP8, FP8) - -ENTRY_TERNARY (vcvt_high_mf8_f32_fpm, f8q, f8, f32q, f32q, 
+ENTRY_UNARY_VQ_BHF (vcvt1, mf8, UNSPEC_F1CVTL_FP8, FP8) +ENTRY_UNARY_VQ_BHF (vcvt1_high, mf8q, UNSPEC_F1CVTL2_FP8, FP8) +ENTRY_UNARY_VQ_BHF (vcvt1_low, mf8q, UNSPEC_F1CVTL_FP8, FP8) +ENTRY_UNARY_VQ_BHF (vcvt2, mf8, UNSPEC_F2CVTL_FP8, FP8) +ENTRY_UNARY_VQ_BHF (vcvt2_high, mf8q, UNSPEC_F2CVTL2_FP8, FP8) +ENTRY_UNARY_VQ_BHF (vcvt2_low, mf8q, UNSPEC_F2CVTL_FP8, FP8) + +ENTRY_BINARY (vcvt_mf8_f16_fpm, mf8
[gcc r15-6468] aarch64: Add missing makefile dependency
https://gcc.gnu.org/g:5f40ff8efde2b8b140f170619e99b6df9722f79d

commit r15-6468-g5f40ff8efde2b8b140f170619e99b6df9722f79d
Author: Richard Sandiford
Date:   Mon Dec 30 12:50:56 2024 +

    aarch64: Add missing makefile dependency

    gcc/
            * config/aarch64/t-aarch64 (aarch64-builtins.o): Depend on
            aarch64-simd-pragma-builtins.def.

Diff:
---
 gcc/config/aarch64/t-aarch64 | 1 +
 1 file changed, 1 insertion(+)

diff --git a/gcc/config/aarch64/t-aarch64 b/gcc/config/aarch64/t-aarch64
index dfb159d1da63..3219871e8d72 100644
--- a/gcc/config/aarch64/t-aarch64
+++ b/gcc/config/aarch64/t-aarch64
@@ -55,6 +55,7 @@ aarch64-builtins.o: $(srcdir)/config/aarch64/aarch64-builtins.cc $(CONFIG_H) \
   $(DIAGNOSTIC_CORE_H) $(OPTABS_H) \
   $(srcdir)/config/aarch64/aarch64-simd-builtins.def \
   $(srcdir)/config/aarch64/aarch64-simd-builtin-types.def \
+  $(srcdir)/config/aarch64/aarch64-simd-pragma-builtins.def \
   aarch64-builtin-iterators.h
 	$(COMPILER) -c $(ALL_COMPILERFLAGS) $(ALL_CPPFLAGS) $(INCLUDES) \
 		$(srcdir)/config/aarch64/aarch64-builtins.cc
[gcc r15-6470] [RISC-V][PR target/118122] Fix modes in recently added risc-v pattern
https://gcc.gnu.org/g:64d31343d4676d8ceef9232dcd33824bc2eff330

commit r15-6470-g64d31343d4676d8ceef9232dcd33824bc2eff330
Author: Jeff Law
Date: Mon Dec 30 07:40:07 2024 -0700

[RISC-V][PR target/118122] Fix modes in recently added risc-v pattern

The new pattern to optimize certain code sequences on RISC-V played things a bit fast and loose with modes -- some operands were using the ANYI iterator while the scratch used X and the split codegen used X.

Naturally under the "right" circumstances this would trigger an ICE due to mismatched modes. This patch uses X consistently in that pattern. It also fixes some formatting nits.

Tested in my tester, but waiting on the pre-commit verdict before moving forward.

PR target/118122

gcc/
* config/riscv/riscv.md (lui_constraint_and_to_or): Use X iterator rather than ANYI consistently. Fix formatting.

gcc/testsuite
* gcc.target/riscv/pr118122.c: New test.

Diff:
---
 gcc/config/riscv/riscv.md | 24
 gcc/testsuite/gcc.target/riscv/pr118122.c | 12
 2 files changed, 24 insertions(+), 12 deletions(-)

diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index 6c6155ceeb83..deb156075497 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -861,19 +861,19 @@
 ;; Transform (X & C1) + C2 into (X | ~C1) - (-C2 | ~C1)
 ;; Where C1 is not a LUI operand, but ~C1 is a LUI operand
-(define_insn_and_split "*lui_constraint_and_to_or"
- [(set (match_operand:ANYI 0 "register_operand" "=r")
- (plus:ANYI (and:ANYI (match_operand:ANYI 1 "register_operand" "r")
-(match_operand 2 "const_int_operand"))
-(match_operand 3 "const_int_operand")))
+(define_insn_and_split "*lui_constraint_and_to_or"
+ [(set (match_operand:X 0 "register_operand" "=r")
+ (plus:X (and:X (match_operand:X 1 "register_operand" "r")
+ (match_operand 2 "const_int_operand"))
+ (match_operand 3 "const_int_operand")))
 (clobber (match_scratch:X 4 "=&r"))]
- "LUI_OPERAND (~INTVAL (operands[2]))
- && ((INTVAL (operands[2]) & (-INTVAL (operands[3])))
- == (-INTVAL (operands[3])))
- && riscv_const_insns (operands[3], false)
- && (riscv_const_insns
- (GEN_INT (~INTVAL (operands[2]) | -INTVAL (operands[3])), false)
- <= riscv_const_insns (operands[3], false))"
+ "(LUI_OPERAND (~INTVAL (operands[2]))
+&& ((INTVAL (operands[2]) & (-INTVAL (operands[3])))
+ == (-INTVAL (operands[3])))
+&& riscv_const_insns (operands[3], false)
+&& (riscv_const_insns (GEN_INT (~INTVAL (operands[2])
+ | -INTVAL (operands[3])), false)
+ <= riscv_const_insns (operands[3], false)))"
 "#"
 "&& reload_completed"
 [(set (match_dup 4) (match_dup 5))
diff --git a/gcc/testsuite/gcc.target/riscv/pr118122.c b/gcc/testsuite/gcc.target/riscv/pr118122.c
new file mode 100644
index ..0cdc3bf83b12
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr118122.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O -fno-tree-ter -fno-forward-propagate" } */
+char c;
+
+void
+foo(short s)
+{
+ s += 34231u;
+ c = s;
+}
+
+
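
For readers who want to convince themselves of the rewrite described in the riscv.md comment, (X & C1) + C2 into (X | ~C1) - (-C2 | ~C1), the following is a minimal sketch in plain C. It models only the arithmetic identity and the bit-subset part of the insn condition, (C1 & -C2) == -C2; it does not model LUI_OPERAND or the riscv_const_insns cost comparison. All function names are hypothetical and are not part of GCC, and 64-bit unsigned wraparound is assumed as a stand-in for the X mode on RV64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Original form: (X & C1) + C2, with wraparound arithmetic. */
static uint64_t and_then_add(uint64_t x, uint64_t c1, uint64_t c2)
{
    return (x & c1) + c2;
}

/* Rewritten form: (X | ~C1) - (-C2 | ~C1). */
static uint64_t or_then_sub(uint64_t x, uint64_t c1, uint64_t c2)
{
    return (x | ~c1) - ((0 - c2) | ~c1);
}

/* Bit-subset part of the insn condition: every set bit of -C2
   must also be set in C1, i.e. (C1 & -C2) == -C2. */
static int transform_applies(uint64_t c1, uint64_t c2)
{
    uint64_t neg_c2 = 0 - c2;
    return (c1 & neg_c2) == neg_c2;
}

/* Counts the sample inputs on which the two forms disagree.
   Aborts via assert if the precondition does not hold. */
static int count_mismatches(uint64_t c1, uint64_t c2)
{
    static const uint64_t samples[] = {
        0, 1, 0xfff, 0x12345678, UINT64_MAX
    };
    int bad = 0;

    assert(transform_applies(c1, c2));
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; i++)
        if (and_then_add(samples[i], c1, c2)
            != or_then_sub(samples[i], c1, c2))
            bad++;
    return bad;
}
```

The identity holds because ~C1 shares no bits with X & C1 or with -C2 under the precondition, so both ORs behave like additions of ~C1 and the two ~C1 terms cancel. Illustrative constants are C1 = ~0x12345000 and C2 = -0xfff, where ~C1 has its low 12 bits clear.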
[gcc r15-6473] [PR testsuite/114182] Fix minor testsuite issue when double == float
https://gcc.gnu.org/g:b739efa05d96edbc1468043a630bf29d38a0c30b

commit r15-6473-gb739efa05d96edbc1468043a630bf29d38a0c30b
Author: Jeff Law
Date: Mon Dec 30 16:14:29 2024 -0700

[PR testsuite/114182] Fix minor testsuite issue when double == float

This is a minor testsuite adjustment. attr-complex-method-2.c selects between two scan-tree-dump clauses based on avr, !avr. But what they really should be checking is "large_double", so that it works for avr, h8, rl78 and any other target which makes doubles the same size as floats. attr-complex-method.c should be doing the same thing.

After this change avr passes attr-complex-method.c and the rl78 and h8 ports will pass both tests. Other targets in my tester are unaffected.

PR testsuite/114182

gcc/testsuite/
* gcc.c-torture/compile/attr-complex-method.c: Use "large_double" to select between scan outputs.
* gcc.c-torture/compile/attr-complex-method-2.c: Similarly.

Diff:
---
 gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c | 4 ++--
 gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c | 4 +++-
 2 files changed, 5 insertions(+), 3 deletions(-)

diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
index dc28e2c99c61..19ec1dbdf0c8 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method-2.c
@@ -8,5 +8,5 @@ void do_div (_Complex double *a, _Complex double *b)
 *a = *b / (4.0 - 5.0fi);
 }
-/* { dg-final { scan-tree-dump "__(?:gnu_)?divdc3" "optimized" { target { ! { avr-*-* } } } } } */
-/* { dg-final { scan-tree-dump "__(?:gnu_)?divsc3" "optimized" { target { avr-*-* } } } } */
+/* { dg-final { scan-tree-dump "__(?:gnu_)?divdc3" "optimized" { target { large_double } } } } */
+/* { dg-final { scan-tree-dump "__(?:gnu_)?divsc3" "optimized" { target { ! { large_double } } } } } */
diff --git a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
index 046de7efeb96..239f2f7d36fd 100644
--- a/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
+++ b/gcc/testsuite/gcc.c-torture/compile/attr-complex-method.c
@@ -8,4 +8,6 @@ void do_div (_Complex double *a, _Complex double *b)
 *a = *b / (4.0 - 5.0fi);
 }
-/* { dg-final { scan-tree-dump-not "__divdc3" "optimized" } } */
+/* { dg-final { scan-tree-dump-not "__divdc3" "optimized" { target { large_double } } } } */
+/* { dg-final { scan-tree-dump-not "__divsc3" "optimized" { target { ! { large_double } } } } } */
+
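
As a rough illustration of why the scans above key on "large_double" rather than on a specific target triplet, the sketch below picks the complex-division helper name a scan would refer to from the relative sizes of double and float. It assumes that large_double means double is wider than float, and that a _Complex double division maps to the divdc3 helper in that case and to divsc3 otherwise, as the two dg-final lines suggest. The function name and its parameters are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the libgcc-style helper name that the
   scan directives refer to, given the sizes of double and float.
   Assumption: "large_double" corresponds to double being wider than
   float. */
static const char *complex_div_helper(size_t double_size, size_t float_size)
{
    return double_size > float_size ? "__divdc3" : "__divsc3";
}

/* Same choice for the compiler building this file. */
static const char *complex_div_helper_here(void)
{
    return complex_div_helper(sizeof(double), sizeof(float));
}
```

With 8-byte double and 4-byte float the first branch is taken; on a target where both are 4 bytes, as the commit message describes for avr, h8 and rl78, the second is.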