[committed] match.pd: Don't optimize vector X + (X << C) -> X * (1 + (1 << C)) if there is no mult support [PR99544]
Hi! E.g. on aarch64, the target has V2DImode addition and shift by scalar optabs, but doesn't have V2DImode multiply. The following testcase ICEs because this simplification is done after last lowering, but generally, even if it is done before that, turning it into a multiplication will not be an improvement because that means scalarization, while the former can be done in vectors. It would be nice if we added expansion support for vector multiplication by uniform constants using shifts and additions like we have for scalar multiplication, but that is something that can be done in stage1. Bootstrapped/regtested on aarch64-linux, x86_64-linux and i686-linux, acked by Richi in the PR, committed to trunk. 2021-03-13 Jakub Jelinek PR tree-optimization/99544 * match.pd (X + (X << C) -> X * (1 + (1 << C))): Don't simplify if for vector types multiplication can't be done in type's mode. * gcc.dg/gomp/pr99544.c: New test. --- gcc/match.pd.jj 2021-02-25 10:22:39.740401251 +0100 +++ gcc/match.pd2021-03-12 11:51:08.375897831 +0100 @@ -2788,7 +2788,10 @@ (define_operator_list COND_TERNARY (plus:c @0 (lshift:s @0 INTEGER_CST@1)) (if (ANY_INTEGRAL_TYPE_P (TREE_TYPE (@0)) && tree_fits_uhwi_p (@1) - && tree_to_uhwi (@1) < element_precision (type)) + && tree_to_uhwi (@1) < element_precision (type) + && (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + || optab_handler (smul_optab, +TYPE_MODE (type)) != CODE_FOR_nothing)) (with { tree t = type; if (!TYPE_OVERFLOW_WRAPS (t)) t = unsigned_type_for (t); wide_int w = wi::set_bit_in_zero (tree_to_uhwi (@1), @@ -2804,7 +2807,10 @@ (define_operator_list COND_TERNARY && tree_fits_uhwi_p (@1) && tree_to_uhwi (@1) < element_precision (type) && tree_fits_uhwi_p (@2) - && tree_to_uhwi (@2) < element_precision (type)) + && tree_to_uhwi (@2) < element_precision (type) + && (INTEGRAL_TYPE_P (TREE_TYPE (@0)) + || optab_handler (smul_optab, +TYPE_MODE (type)) != CODE_FOR_nothing)) (with { tree t = type; if (!TYPE_OVERFLOW_WRAPS (t)) t = unsigned_type_for (t); unsigned int prec = element_precision (type); --- gcc/testsuite/gcc.dg/gomp/pr99544.c.jj 2021-03-12 11:48:06.551906424 +0100 +++ gcc/testsuite/gcc.dg/gomp/pr99544.c 2021-03-12 11:49:32.020961796 +0100 @@ -0,0 +1,13 @@ +/* PR tree-optimization/99544 */ +/* { dg-do compile } */ +/* { dg-options "-Os -fopenmp" } */ + +long +foo (long a, long b, long c) +{ + long d, e; + #pragma omp teams distribute parallel for simd firstprivate (a, b, c) lastprivate(e) + for (d = a; d < b; d++) +e = c + d * 5; + return e; +} Jakub
[PATCH] debug: Fix __int128 handling in dwarf2out [PR99562]
Hi! The PR66728 changes broke __int128 handling. It emits wide_int numbers in their minimum unsigned precision rather than in their full precision. The problem is then that e.g. the DW_OP_implicit_value path: int_mode = as_a (mode); loc_result = new_loc_descr (DW_OP_implicit_value, GET_MODE_SIZE (int_mode), 0); loc_result->dw_loc_oprnd2.val_class = dw_val_class_wide_int; loc_result->dw_loc_oprnd2.v.val_wide = ggc_alloc (); *loc_result->dw_loc_oprnd2.v.val_wide = rtx_mode_t (rtl, int_mode); emits invalid DWARF. In particular this patch fixes there multiple occurences of: .byte 0x9e# DW_OP_implicit_value .uleb128 0x10 .quad 0x + .quad 0 .quad .LVL46 # Location list begin address (*.LLST40) .quad .LFE14 # Location list end address (*.LLST40) where we said the value has 16 byte size but then only emitted 8 byte value. My understanding is that most of the places that use val_wide expect the precision they chose (the one of the mode they want etc.), the only exception is the add_const_value_attribute case where it deals with VOIDmode CONST_WIDE_INTs, for that I agree when we don't have a mode we need to fallback to minimum precision (not sure if maximum of min_precision UNSIGNED and SIGNED wouldn't be better, then consumers would know if it is signed or unsigned by looking at the MSB), but that code already computes the precision, just decided to create the wide_int with much larger precision (e.g. 512 bit on x86_64). Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk? 2021-03-13 Jakub Jelinek PR debug/99562 PR debug/66728 * dwarf2out.c (get_full_len): Use get_precision rather than min_precision. (add_const_value_attribute): Make sure add_AT_wide argument has precision prec rather than some very wide one. --- gcc/dwarf2out.c.jj 2021-03-11 14:01:43.385194205 +0100 +++ gcc/dwarf2out.c 2021-03-12 17:34:49.365207265 +0100 @@ -385,13 +385,12 @@ dump_struct_debug (tree type, enum debug #endif /* Get the number of HOST_WIDE_INTs needed to represent the precision - of the number. Some constants have a large uniform precision, so - we get the precision needed for the actual value of the number. */ + of the number. */ static unsigned int get_full_len (const wide_int &op) { - int prec = wi::min_precision (op, UNSIGNED); + int prec = wi::get_precision (op); return ((prec + HOST_BITS_PER_WIDE_INT - 1) / HOST_BITS_PER_WIDE_INT); } @@ -19732,8 +19731,9 @@ add_const_value_attribute (dw_die_ref di { wide_int w1 = rtx_mode_t (rtl, MAX_MODE_INT); unsigned int prec = MIN (wi::min_precision (w1, UNSIGNED), -(unsigned int)CONST_WIDE_INT_NUNITS (rtl) * HOST_BITS_PER_WIDE_INT); - wide_int w = wi::zext (w1, prec); +(unsigned int) CONST_WIDE_INT_NUNITS (rtl) +* HOST_BITS_PER_WIDE_INT); + wide_int w = wide_int::from (w1, prec, UNSIGNED); add_AT_wide (die, DW_AT_const_value, w); } return true; Jakub
Re: [PATCH] PR fortran/99112 - [11 Regression] ICE with runtime diagnostics for SIZE intrinsic function
Hi Harald, I am not sure of the etiquette for this - it looks OK to me :-) Cheers Paul On Fri, 12 Mar 2021 at 21:20, Harald Anlauf via Fortran wrote: > Dear all, > > the addition of runtime checks for the SIZE intrinsic created a regression > that showed up for certain CLASS arguments to procedures. Paul did most of > the work (~ 99%), but asked me to dig into an issue with an inappropriately > selected error message. This actually turned out to be a simple one-liner > on top of Paul's patch. > > Regtested on x86_64-pc-linux-gnu. OK for mainline? > > Thanks, > Harald > > P.S.: I couldn't find a Changelog entry that uses co-authors. Is the > version > below correct? > > > PR fortran/99112 - ICE with runtime diagnostics for SIZE intrinsic function > > Add/fix handling of runtime checks for CLASS arguments with ALLOCATABLE > or POINTER attribute. > > gcc/fortran/ChangeLog: > > * trans-expr.c (gfc_conv_procedure_call): Fix runtime checks for > CLASS arguments. > * trans-intrinsic.c (gfc_conv_intrinsic_size): Likewise. > > gcc/testsuite/ChangeLog: > > * gfortran.dg/pr99112.f90: New test. > > Co-authored-by: Paul Thomas > > -- "If you can't explain it simply, you don't understand it well enough" - Albert Einstein
[PATCH] Fix ICE: in function_and_variable_visibility, at ipa-visibility.c:795 (PR99466)
Hi, This patch fixes an ICE caused by emutls routines generating a weak, non-public symbol for storing the initializer of a weak TLS variable. In get_emutls_init_templ_addr, only declarations that were DECL_ONE_ONLY would get a public initializer symbol, ignoring variables that were declared with __attribute__((weak)). Because DECL_VISIBILITY is also copied to the emutls initializer, a second test is included which checks that the expected visibility is emitted too. Tested on x86_64-apple-darwin10, OK for mainline? The oldest version of gcc I've checked is 7.5.0, and the ICE is present there too. Is this OK for backporting, and if so which versions should it be backported to? Regards, Iain. --- gcc/ChangeLog: PR ipa/99466 * tree-emutls.c (get_emutls_init_templ_addr): Mark initializer of weak TLS declarations as public. gcc/testsuite/ChangeLog: * gcc.dg/tls/pr98607-1.c: New test. * gcc.dg/tls/pr98607-2.c: New test. --- gcc/testsuite/gcc.dg/tls/pr98607-1.c | 8 gcc/testsuite/gcc.dg/tls/pr98607-2.c | 10 ++ gcc/tree-emutls.c| 6 -- 3 files changed, 22 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tls/pr98607-1.c create mode 100644 gcc/testsuite/gcc.dg/tls/pr98607-2.c diff --git a/gcc/testsuite/gcc.dg/tls/pr98607-1.c b/gcc/testsuite/gcc.dg/tls/pr98607-1.c new file mode 100644 index 000..446850e148b --- /dev/null +++ b/gcc/testsuite/gcc.dg/tls/pr98607-1.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-require-weak "" } */ +/* { dg-require-effective-target tls_emulated } */ +/* { dg-add-options tls } */ +__attribute__((weak)) +__thread int tlsvar = 3; +/* { dg-final { scan-assembler ".weak_definition ___emutls_t.tlsvar" { target *-*-darwin* } } } */ +/* { dg-final { scan-assembler-not ".private_extern ___emutls_t.tlsvar" { target *-*-darwin* } } } */ diff --git a/gcc/testsuite/gcc.dg/tls/pr98607-2.c b/gcc/testsuite/gcc.dg/tls/pr98607-2.c new file mode 100644 index 000..86ffaad7f48 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tls/pr98607-2.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-weak "" } */ +/* { dg-require-visibility "" } */ +/* { dg-require-effective-target tls_emulated } */ +/* { dg-add-options tls } */ +__attribute__((weak)) +__attribute__((visibility ("hidden"))) +__thread int tlsvar = 3; +/* { dg-final { scan-assembler ".weak_definition ___emutls_t.tlsvar" { target *-*-darwin* } } } */ +/* { dg-final { scan-assembler ".private_extern ___emutls_t.tlsvar" { target *-*-darwin* } } } */ diff --git a/gcc/tree-emutls.c b/gcc/tree-emutls.c index f1053385944..1c9c5d5aee1 100644 --- a/gcc/tree-emutls.c +++ b/gcc/tree-emutls.c @@ -242,16 +242,18 @@ get_emutls_init_templ_addr (tree decl) DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl); DECL_WEAK (to) = DECL_WEAK (decl); - if (DECL_ONE_ONLY (decl)) + if (DECL_ONE_ONLY (decl) || DECL_WEAK (decl)) { TREE_STATIC (to) = TREE_STATIC (decl); TREE_PUBLIC (to) = TREE_PUBLIC (decl); DECL_VISIBILITY (to) = DECL_VISIBILITY (decl); - make_decl_one_only (to, DECL_ASSEMBLER_NAME (to)); } else TREE_STATIC (to) = 1; + if (DECL_ONE_ONLY (decl)) +make_decl_one_only (to, DECL_ASSEMBLER_NAME (to)); + DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl); DECL_INITIAL (to) = DECL_INITIAL (decl); DECL_INITIAL (decl) = NULL; -- 2.27.0
Re: [PATCH] Fix ICE: in function_and_variable_visibility, at ipa-visibility.c:795 (PR99466)
Hi Iain, Iain Buclaw via Gcc-patches wrote: This patch fixes an ICE caused by emutls routines generating a weak, non-public symbol for storing the initializer of a weak TLS variable. In get_emutls_init_templ_addr, only declarations that were DECL_ONE_ONLY would get a public initializer symbol, ignoring variables that were declared with __attribute__((weak)). Because DECL_VISIBILITY is also copied to the emutls initializer, a second test is included which checks that the expected visibility is emitted too. Tested on x86_64-apple-darwin10, OK for mainline? The oldest version of gcc I've checked is 7.5.0, and the ICE is present there too. Is this OK for backporting, and if so which versions should it be backported to? Regards, Iain. --- gcc/ChangeLog: PR ipa/99466 * tree-emutls.c (get_emutls_init_templ_addr): Mark initializer of weak TLS declarations as public. gcc/testsuite/ChangeLog: * gcc.dg/tls/pr98607-1.c: New test. * gcc.dg/tls/pr98607-2.c: New test. ^^^ s/98607/99466/ ? --- gcc/testsuite/gcc.dg/tls/pr98607-1.c | 8 gcc/testsuite/gcc.dg/tls/pr98607-2.c | 10 ++ gcc/tree-emutls.c| 6 -- 3 files changed, 22 insertions(+), 2 deletions(-) create mode 100644 gcc/testsuite/gcc.dg/tls/pr98607-1.c create mode 100644 gcc/testsuite/gcc.dg/tls/pr98607-2.c diff --git a/gcc/testsuite/gcc.dg/tls/pr98607-1.c b/gcc/testsuite/gcc.dg/tls/pr98607-1.c new file mode 100644 index 000..446850e148b --- /dev/null +++ b/gcc/testsuite/gcc.dg/tls/pr98607-1.c @@ -0,0 +1,8 @@ +/* { dg-do compile } */ +/* { dg-require-weak "" } */ +/* { dg-require-effective-target tls_emulated } */ +/* { dg-add-options tls } */ +__attribute__((weak)) +__thread int tlsvar = 3; +/* { dg-final { scan-assembler ".weak_definition ___emutls_t.tlsvar" { target *-*-darwin* } } } */ +/* { dg-final { scan-assembler-not ".private_extern ___emutls_t.tlsvar" { target *-*-darwin* } } } */ diff --git a/gcc/testsuite/gcc.dg/tls/pr98607-2.c b/gcc/testsuite/gcc.dg/tls/pr98607-2.c new file mode 100644 index 000..86ffaad7f48 --- /dev/null +++ b/gcc/testsuite/gcc.dg/tls/pr98607-2.c @@ -0,0 +1,10 @@ +/* { dg-do compile } */ +/* { dg-require-weak "" } */ +/* { dg-require-visibility "" } */ +/* { dg-require-effective-target tls_emulated } */ +/* { dg-add-options tls } */ +__attribute__((weak)) +__attribute__((visibility ("hidden"))) +__thread int tlsvar = 3; +/* { dg-final { scan-assembler ".weak_definition ___emutls_t.tlsvar" { target *-*-darwin* } } } */ +/* { dg-final { scan-assembler ".private_extern ___emutls_t.tlsvar" { target *-*-darwin* } } } */ diff --git a/gcc/tree-emutls.c b/gcc/tree-emutls.c index f1053385944..1c9c5d5aee1 100644 --- a/gcc/tree-emutls.c +++ b/gcc/tree-emutls.c @@ -242,16 +242,18 @@ get_emutls_init_templ_addr (tree decl) DECL_PRESERVE_P (to) = DECL_PRESERVE_P (decl); DECL_WEAK (to) = DECL_WEAK (decl); - if (DECL_ONE_ONLY (decl)) + if (DECL_ONE_ONLY (decl) || DECL_WEAK (decl)) { TREE_STATIC (to) = TREE_STATIC (decl); TREE_PUBLIC (to) = TREE_PUBLIC (decl); DECL_VISIBILITY (to) = DECL_VISIBILITY (decl); - make_decl_one_only (to, DECL_ASSEMBLER_NAME (to)); } else TREE_STATIC (to) = 1; + if (DECL_ONE_ONLY (decl)) +make_decl_one_only (to, DECL_ASSEMBLER_NAME (to)); + DECL_VISIBILITY_SPECIFIED (to) = DECL_VISIBILITY_SPECIFIED (decl); DECL_INITIAL (to) = DECL_INITIAL (decl); DECL_INITIAL (decl) = NULL; -- 2.27.0
[PATCH v3] x86: Update 'P' operand modifier for -fno-plt
On Fri, Mar 12, 2021 at 8:37 AM Uros Bizjak wrote: > > On Fri, Mar 12, 2021 at 2:20 PM H.J. Lu wrote: > > > > On Thu, Mar 11, 2021 at 11:21 PM Uros Bizjak wrote: > > > > > > On Thu, Mar 11, 2021 at 11:22 PM H.J. Lu wrote: > > > > > > > > Update 'P' operand modifier for -fno-plt to support inline assembly > > > > statements. In 64-bit, we can always load function address with > > > > @GOTPCREL. In 32-bit, we load function address with @GOT only for > > > > non-PIC since PIC register may not be available at call site. > > > > > > > > gcc/ > > > > > > > > PR target/99504 > > > > * config/i386/i386.c (ix86_print_operand): Update 'P' handling > > > > for -fno-plt. > > > > > > > > gcc/testsuite/ > > > > > > > > PR target/99504 > > > > * gcc.target/i386/pr99530-1.c: New test. > > > > * gcc.target/i386/pr99530-2.c: Likewise. > > > > * gcc.target/i386/pr99530-3.c: Likewise. > > > > * gcc.target/i386/pr99530-4.c: Likewise. > > > > * gcc.target/i386/pr99530-5.c: Likewise. > > > > * gcc.target/i386/pr99530-6.c: Likewise. > > > > --- > > > > gcc/config/i386/i386.c| 33 +-- > > > > gcc/testsuite/gcc.target/i386/pr99530-1.c | 11 > > > > gcc/testsuite/gcc.target/i386/pr99530-2.c | 11 > > > > gcc/testsuite/gcc.target/i386/pr99530-3.c | 11 > > > > gcc/testsuite/gcc.target/i386/pr99530-4.c | 11 > > > > gcc/testsuite/gcc.target/i386/pr99530-5.c | 11 > > > > gcc/testsuite/gcc.target/i386/pr99530-6.c | 11 > > > > 7 files changed, 97 insertions(+), 2 deletions(-) > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-1.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-2.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-3.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-4.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-5.c > > > > create mode 100644 gcc/testsuite/gcc.target/i386/pr99530-6.c > > > > > > > > diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c > > > > index 260f87b..8733fcecf65 100644 > > > > --- a/gcc/config/i386/i386.c > > > > +++ b/gcc/config/i386/i386.c > > > > @@ -12701,7 +12701,8 @@ print_reg (rtx x, int code, FILE *file) > > > > y -- print "st(0)" instead of "st" as a register. > > > > d -- print duplicated register operand for AVX instruction. > > > > D -- print condition for SSE cmp instruction. > > > > - P -- if PIC, print an @PLT suffix. > > > > + P -- if PIC, print an @PLT suffix. For -fno-plt, load function > > > > + address from GOT. > > > > p -- print raw symbol name. > > > > X -- don't print any sort of PIC '@' suffix for a symbol. > > > > & -- print some in-use local-dynamic symbol name. > > > > @@ -13445,7 +13446,35 @@ ix86_print_operand (FILE *file, rtx x, int > > > > code) > > > > x = const0_rtx; > > > > } > > > > > > > > - if (code != 'P' && code != 'p') > > > > + if (code == 'P') > > > > + { > > > > + if (current_output_insn == NULL_RTX > > > > + && (TARGET_64BIT || (!flag_pic && HAVE_AS_IX86_GOT32X)) > > > > + && !TARGET_PECOFF > > > > + && !TARGET_MACHO > > > > + && ix86_cmodel != CM_LARGE > > > > + && ix86_cmodel != CM_LARGE_PIC > > > > + && GET_CODE (x) == SYMBOL_REF > > > > + && SYMBOL_REF_FUNCTION_P (x) > > > > + && (!flag_plt > > > > + || (SYMBOL_REF_DECL (x) > > > > + && lookup_attribute ("noplt", > > > > + DECL_ATTRIBUTES > > > > (SYMBOL_REF_DECL (x) > > > > + && !SYMBOL_REF_LOCAL_P (x)) > > > > + { > > > > + /* For inline assembly statement, load function address > > > > +from GOT with 'P' operand modifier to avoid PLT. > > > > +NB: This works only with call or jmp. */ > > > > + const char *xasm; > > > > + if (TARGET_64BIT) > > > > + xasm = "{*%p0@GOTPCREL(%%rip)|[QWORD PTR > > > > %p0@GOTPCREL[rip]]}"; > > > > + else > > > > + xasm = "{*%p0@GOT|[DWORD PTR %p0@GOT]}"; > > > > + output_asm_insn (xasm, &x); > > > > + return; > > > > > > This should be handled in output_pic_addr_const. > > > > > > > call/jmp are special and are handled by ix86_output_call_insn, > > not output_pic_addr_const. > > I see, the call_insn is output using output_asm_insn, which I think is > not appropriate in ix86_print_operand. Probably you should introduce a > new helper function and output a GOTPCREL reloc there. Something like > x86_print_operand with 'A' code, calling output_addr_const and > appending @GOTPCREL. Perhaps some parts of ix86_print_opreands can be > used instead. Done. Here is the updated patch. Tested on
[PATCH] aarch64: Fix up aarch64_simd_clone_compute_vecsize_and_simdlen [PR99542]
Hi! As the patch shows, there are several bugs in aarch64_simd_clone_compute_vecsize_and_simdlen. One is that unlike for function declarations that aren't definitions it completely ignores argument types. Such decls don't have DECL_ARGUMENTS, but we can walk TYPE_ARG_TYPES instead, like the i386 backend does or like the simd cloning code in the middle end does too. Another problem is that it checks types of uniform arguments. That is unnecessary, uniform arguments are passed the way it normally is, it is a scalar argument rather than vector, so there is no reason not to support uniform argument of different size, or long double, structure etc. Fixed thusly, bootstrapped/regtested on aarch64-linux, ok for trunk? 2021-03-13 Jakub Jelinek PR target/99542 * config/aarch64/aarch64.c (aarch64_simd_clone_compute_vecsize_and_simdlen): If not a function definition, walk TYPE_ARG_TYPES list if non-NULL for argument types instead of DECL_ARGUMENTS. Ignore types for uniform arguments. * gcc.dg/gomp/pr99542.c: New test. * gcc.dg/gomp/pr59669-2.c (bar): Don't expect a warning on aarch64. * gcc.dg/gomp/simd-clones-2.c (setArray): Likewise. * g++.dg/vect/simd-clone-7.cc (bar): Likewise. * g++.dg/gomp/declare-simd-1.C (f37): Expect a different warning on aarch64. * gcc.dg/declare-simd.c (fn2): Expect a new warning on aarch64. --- gcc/config/aarch64/aarch64.c.jj 2021-03-12 19:01:48.672296344 +0100 +++ gcc/config/aarch64/aarch64.c2021-03-13 09:16:42.585045538 +0100 @@ -23412,11 +23412,17 @@ aarch64_simd_clone_compute_vecsize_and_s return 0; } - for (t = DECL_ARGUMENTS (node->decl); t; t = DECL_CHAIN (t)) + int i; + tree type_arg_types = TYPE_ARG_TYPES (TREE_TYPE (node->decl)); + bool decl_arg_p = (node->definition || type_arg_types == NULL_TREE); + + for (t = (decl_arg_p ? DECL_ARGUMENTS (node->decl) : type_arg_types), i = 0; + t && t != void_list_node; t = TREE_CHAIN (t), i++) { - arg_type = TREE_TYPE (t); + tree arg_type = decl_arg_p ? TREE_TYPE (t) : TREE_VALUE (t); - if (!currently_supported_simd_type (arg_type, base_type)) + if (clonei->args[i].arg_type != SIMD_CLONE_ARG_TYPE_UNIFORM + && !currently_supported_simd_type (arg_type, base_type)) { if (TYPE_SIZE (arg_type) != TYPE_SIZE (base_type)) warning_at (DECL_SOURCE_LOCATION (node->decl), 0, --- gcc/testsuite/gcc.dg/gomp/pr99542.c.jj 2021-03-13 09:16:42.586045527 +0100 +++ gcc/testsuite/gcc.dg/gomp/pr99542.c 2021-03-13 09:16:42.586045527 +0100 @@ -0,0 +1,17 @@ +/* PR middle-end/89246 */ +/* { dg-do compile { target int128 } } */ +/* { dg-options "-O0 -fopenmp-simd" } */ + +#pragma omp declare simd +extern int foo (__int128 x); /* { dg-warning "GCC does not currently support mixed size types for 'simd' function" "" { target aarch64*-*-* } } */ +/* { dg-warning "unsupported argument type '__int128' for simd" "" { target i?86-*-* x86_64-*-* } .-1 } */ + +#pragma omp declare simd uniform (x) +extern int baz (__int128 x); + +#pragma omp declare simd +int +bar (int x) +{ + return x + foo (0) + baz (0); +} --- gcc/testsuite/gcc.dg/gomp/pr59669-2.c.jj2020-01-12 11:54:37.430398065 +0100 +++ gcc/testsuite/gcc.dg/gomp/pr59669-2.c 2021-03-13 09:35:07.100807877 +0100 @@ -7,4 +7,3 @@ void bar (int *a) { } -/* { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target aarch64*-*-* } .-3 } */ --- gcc/testsuite/gcc.dg/gomp/simd-clones-2.c.jj2020-01-12 11:54:37.431398050 +0100 +++ gcc/testsuite/gcc.dg/gomp/simd-clones-2.c 2021-03-13 09:36:21.143988256 +0100 @@ -15,7 +15,6 @@ float setArray(float *a, float x, int k) return a[k]; } -/* { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target aarch64*-*-* } .-6 } */ /* { dg-final { scan-tree-dump "_ZGVbN4ua32vl_setArray" "optimized" { target i?86-*-* x86_64-*-* } } } */ /* { dg-final { scan-tree-dump "_ZGVbN4vvva32_addit" "optimized" { target i?86-*-* x86_64-*-* } } } */ /* { dg-final { scan-tree-dump "_ZGVbM4vl66u_addit" "optimized" { target i?86-*-* x86_64-*-* } } } */ --- gcc/testsuite/g++.dg/vect/simd-clone-7.cc.jj2020-01-12 11:54:37.280400328 +0100 +++ gcc/testsuite/g++.dg/vect/simd-clone-7.cc 2021-03-13 09:23:58.005214442 +0100 @@ -8,5 +8,3 @@ bar (float x, float *y, int) { return y[0] + y[1] * x; } -// { dg-warning "GCC does not currently support mixed size types for 'simd' functions" "" { target { { aarch64*-*-* } && lp64 } } .-4 } - --- gcc/testsuite/g++.dg/gomp/declare-simd-1.C.jj 2020-01-12 11:54:37.177401882 +0100 +++ gcc/testsuite/g++.dg/gomp/declare-simd-1.C 2021-03-13 09:22:58.029878348 +0100 @@ -287,7 +287,7 @@ struct D int f37 (int a); int e; }; -// { dg-warning "GCC does not currently support simdlen 16 for type 'int'" "" { target aarch64*-*-* } .-3 } +//
Re: [PATCH] avoid assuming gimple_call_alloc_size argument is a call (PR 99489)
On 3/12/21 6:52 AM, Jakub Jelinek wrote: On Tue, Mar 09, 2021 at 03:07:38PM -0700, Martin Sebor via Gcc-patches wrote: The gimple_call_alloc_size() function is documented to "return null when STMT is not a call to a valid allocation function" but the code assumes STMT is a call statement, causing the function to ICE when it isn't. The attached patch changes the function to fulfill its contract and return null also when STMT isn't a call. The fix seems obvious to me but I'll wait some time before committing it in case it's not to someone else. I think the name of the function suggests that it should be called on calls, not random stmts. I wrote the function so I should know how I expected it to be used. I can't say I remember for sure but I imagine I would have declared the argument gcall* rather than gimple* if I had intended it to be called with only gcall statements. Currently the function has 3 callers, two of them already verify is_gimple_call before calling it and only one doesn't, and the stmt will never be NULL. So I'd say it would be better to remove the if (!stmt) return NULL_TREE; from the start of the function and add is_gimple_call (stmt) && in tree-ssa-strlen.c. My preference is to make code more robust, not less, so that if another caller is introduced that doesn't check the argument it doesn't cause another ICE. An alternative might be to change the function to take a gcall* as some of the gimple_call_xxx() APIs do that expect to be called only with GIMPLE call statements, like gimple_call_fn(), but that would force the caller to both do the checking and the conversion from gimple* to gcall*. That also seems less preferable to me. A better variant of the above that would be in line with the GIMPLE API design is to also introduce a gimple* overload/wrapper around gcall* form of the function and convert its gimple* argument to gcall* via a GIMPLE_CHECK()) cast. I'd like to ultimately move the function into gimple.{h,c} so that might be something to consider then. But making such a change now would introduce more churn than is necessary to fix the regression. I've committed the patch as is for now and will plan to revisit the overload idea in stage 1. Martin
Re: [PATCH] handle VLA of zero length arrays and vice versa (PR 99121)
On 3/12/21 6:27 AM, Jakub Jelinek wrote: On Mon, Mar 08, 2021 at 07:37:46PM -0700, Martin Sebor via Gcc-patches wrote: Accesses to zero-length arrays continue to be diagnosed (except for trailing arrays of unknown objects), as are nonempty accesses to empty types. The warning message for (3) remains unchanged, i.e., for the following: struct S { } a[3]; void g (int n) { ((int*)a)[0] = 0; } it's: warning: array subscript 0 is outside array bounds of ‘struct S[3]’ [-Warray-bounds] As I tried to explain several times, this is completely unacceptable to me. We want to warn, I agree with that, but we don't care that we emit completely nonsensical warning? The user will be just confused by that, will (rightly) think it is a bug in the compiler and might not fix the actual bug. Array subscript 0 is not outside of those array bounds. You're being unreasonable. As I explained, I'm open to improving the text of the warning but consider tweaking it outside the scope of this bug. If you don't want a shortcut that for arrays with zero sized elements and for non-array objects with zero size uses a different code path that would just warn "access outside of zero sized object %qD" You're quibbling over a minute difference in the phrasing of a warning for an obscure case that will probably never even come up. But if it did, the text is secondary to issuing it since it detects a bug. So again, I'm open to rewording the warning for GCC 12 but I'm not willing to start rewriting the code in stage 4 just because you don't like how the warning is phrased. Martin (which is what I'd prefer, any non-zero sized access to object of zero size unless it is flexible array member or decl with flexible array member is undefined, whatever offset you use) and want to reuse the current code, at least please change reftype to build_printable_array_type (TREE_TYPE (ref), 0); so that it prints warning: array subscript 0 is outside array bounds of ‘int[0]’ [-Warray-bounds] You'll need to redo the: || !COMPLETE_TYPE_P (reftype) || TREE_CODE (TYPE_SIZE_UNIT (reftype)) != INTEGER_CST) return false; check on the new reftype. It should be done not just where you currently do: nelts = integer_zero_node; eltsize = 1; but also somewhere in: eltsize = 1; tree size = TYPE_SIZE_UNIT (reftype); if (VAR_P (arg)) if (tree initsize = DECL_SIZE_UNIT (arg)) if (tree_int_cst_lt (size, initsize)) size = initsize; arrbounds[1] = wi::to_offset (size); below that for the case where integer_zerop (size) && TYPE_EMPTY_P (reftype). There should be also: struct S { } a; void g (int n) { ((int*)&a)[0] = 0; } testcase that covers that. BTW, what is the reason behind the POINTER_TYPE_P check in: /* The type of the object being referred to. It can be an array, string literal, or a non-array type when the MEM_REF represents a reference/subscript via a pointer to an object that is not an element of an array. Incomplete types are excluded as well because their size is not known. */ reftype = TREE_TYPE (arg); if (POINTER_TYPE_P (reftype) ? It disables the second warning on: __UINTPTR_TYPE__ a; void *b; void g (int n) { ((int*)&a)[4] = 0; ((int*)&b)[4] = 0; } I really don't see what is different on vars with pointer type vs. vars with non-pointer type of the same size for this warning. Jakub