Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Mon, Jun 8, 2020 at 11:13 PM Jeff Law via Gcc-patches wrote:
>
> We're seeing some failures due to the local-alignment pass.  For example on
> sh4:
>
> Tests that now fail, but worked before (16 tests):
>
> gcc.dg/pr48335-1.c (test for excess errors)
> gcc.dg/pr48335-2.c (test for excess errors)
> gcc.dg/pr48335-5.c (test for excess errors)
> gcc.dg/pr48335-6.c (test for excess errors)
> gcc.dg/torture/pr48493.c -O0 (test for excess errors)
> gcc.dg/torture/pr48493.c -O0 (test for excess errors)
> gcc.dg/torture/pr48493.c -O1 (test for excess errors)
> gcc.dg/torture/pr48493.c -O1 (test for excess errors)
> gcc.dg/torture/pr48493.c -O2 (test for excess errors)
> gcc.dg/torture/pr48493.c -O2 (test for excess errors)
> gcc.dg/torture/pr48493.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> gcc.dg/torture/pr48493.c -O2 -flto -fno-use-linker-plugin -flto-partition=none (test for excess errors)
> gcc.dg/torture/pr48493.c -O3 -g (test for excess errors)
> gcc.dg/torture/pr48493.c -O3 -g (test for excess errors)
> gcc.dg/torture/pr48493.c -Os (test for excess errors)
> gcc.dg/torture/pr48493.c -Os (test for excess errors)
>
> Or on x86_64:
>
> during GIMPLE pass: adjust_alignment
> /home/jenkins/linux/arch/x86/boot/string.c: In function ‘simple_strtoull’:
> /home/jenkins/linux/arch/x86/boot/string.c:121:20: internal compiler error: in execute, at adjust-alignment.c:74
>   121 | unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base)
>       |                    ^~~
> 0x79c5b3 execute
>         ../../../../../gcc/gcc/adjust-alignment.c:74
> Please submit a full bug report,
>
> There may be others, I haven't searched the failing targets extensively for
> this issue.
>
> AFAICT the adjust-alignment pass is supposed to increase alignments of locals
> when needed.  It has an assert to ensure that alignments are only increased.
> However, if the object was declared with an alignment attribute that is
> larger than what LOCAL_ALIGNMENT would produce for the object, then the
> adjust-alignment pass trips the assert.
>
> Rather than asserting, just ignoring such objects seems better.  But I'm not
> intimately familiar with the issues here.
>
> Bootstrapped and regression tested on x86_64, also verified this fixes the
> sh4 issue.  All the *-elf targets have also been built/tested with this
> patch with no regressions.
>
> OK for the trunk?

While technically OK, the issue is that it does not solve the x86 problem:
with incoming stack alignment < 8 bytes, x86 does not perform dynamic
re-alignment to align 'long long' variables appropriately.  Currently the
x86 backend thinks it can fix this up by lowering the alignment via
LOCAL_DECL_ALIGNMENT, but this is a latent wrong-code issue because the
larger alignment is present in the IL for a long time (previously until RTL
expansion, now until adjust-alignment) and your patch makes that wrong info
last forever (or alternatively causes dynamic stack alignment, which we do
_not_ want to perform here).

So I prefer to wait for a proper x86 solution before cutting the ICEs.

(did you verify effects on code generation of your patch?)

Thanks,
Richard.

> Jeff
Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin
Hi Alexandre,

Alexandre Oliva wrote:
> When the linker plugin is not available, dump outputs in lto runs are
> different from those outputs.exp tests expect, so skip them for now.

Targets (like Darwin too) that don’t have the linker plugin (presumably) use
fat LTO objects and drive the LTO process using collect2 (Darwin and Solaris
seem to be two at least that do this).

That means that the intermediate objects proceed all the way to .s output -
and thus the ‘final’ pass is run (producing the extra files seen).  You can
mimic this with x86 Linux by appending -ffat-lto-objects to an LTO command
line.

I have an ugly patch that makes this work for Darwin (essentially, by having
two versions of the LTO tests - one for targets with fat LTO and one for
thin), since it seems that one has to apply all the files expected to the
list - [this comment is also on PR95577].

Perhaps, as the author, you can see a more elegant way to implement this -
and I was wondering (as an aside) if the -ffat-lto-objects case should be
tested on targets which default to thin LTO anyway?

cheers
Iain

> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
> for gcc/testsuite/ChangeLog
>
> 	* outputs.exp (skip_lto): Set when missing the linker plugin.
> ---
>  testsuite/gcc.misc-tests/outputs.exp | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git gcc/testsuite/gcc.misc-tests/outputs.exp gcc/testsuite/gcc.misc-tests/outputs.exp
> index 06a32db..a47f8a6 100644
> --- gcc/testsuite/gcc.misc-tests/outputs.exp
> +++ gcc/testsuite/gcc.misc-tests/outputs.exp
> @@ -42,7 +42,8 @@ if ![check_no_compiler_messages gsplitdwarf object {
>  # Check for -flto support.  We explicitly test the result to skip
>  # tests that use -flto.
> -set skip_lto ![check_effective_target_lto]
> +set skip_lto { ![check_effective_target_lto] \
> +	|| ![check_linker_plugin_available] }
>
>  # Prepare additional options to be used for linking.
>  # We do not compile to an executable, because that requires naming an output.
--
Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
Free Software Evangelist            Stallman was right, but he's left :(
GNU Toolchain Engineer              Live long and free, and prosper ethically
[PATCH] libgcov: fix TOPN type casting
The patch fixes tree-prof.exp tests on solaris11 and i686-linux-gnu; the
problem was that sizeof of a pointer is different from sizeof of gcov_type.

I'm going to install it if there are no objections.
Thanks,
Martin

libgcc/ChangeLog:

	PR libgcov/95494
	* libgcov-driver.c (write_top_counters): Cast first to intptr_t
	as sizeof(*) != sizeof(gcov_type).
	* libgcov.h (gcov_counter_set_if_null): Remove.
	(gcov_topn_add_value): Cast first to intptr_t and update
	linked list directly.
---
 libgcc/libgcov-driver.c |  4 ++--
 libgcc/libgcov.h        | 49 +++++++++++++++++++++++++++--------------
 2 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 8348d9f33ec..cbfcae96d19 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -352,8 +352,8 @@ write_top_counters (const struct gcov_ctr_info *ci_ptr,
       gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
       gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
       gcov_write_counter (pair_count);
-      for (struct gcov_kvp *node
-	     = (struct gcov_kvp *)ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+      gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+      for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
 	   node != NULL; node = node->next)
 	{
 	  gcov_write_counter (node->value);
diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
index 1456100815d..5d237a4c730 100644
--- a/libgcc/libgcov.h
+++ b/libgcc/libgcov.h
@@ -401,24 +401,6 @@ gcov_counter_add (gcov_type *counter, gcov_type value,
   *counter += value;
 }

-/* Set NODE to memory location COUNTER and make it with atomic operation
-   if USE_ATOMIC is true.  */
-
-static inline int
-gcov_counter_set_if_null (gcov_type *counter, struct gcov_kvp *node,
-			  int use_atomic ATTRIBUTE_UNUSED)
-{
-#if GCOV_SUPPORTS_ATOMIC
-  if (use_atomic)
-    return !__sync_val_compare_and_swap (counter, NULL, (intptr_t)node);
-  else
-#endif
-    {
-      *counter = (intptr_t)node;
-      return 1;
-    }
-}
-
 /* Add key value pair VALUE:COUNT to a top N COUNTERS.
    When INCREMENT_TOTAL is true, add COUNT to total of the TOP counter.
    If USE_ATOMIC is true, do it in atomic way.  */
@@ -432,7 +414,7 @@ gcov_topn_add_value (gcov_type *counters, gcov_type value, gcov_type count,
   struct gcov_kvp *prev_node = NULL;
   struct gcov_kvp *minimal_node = NULL;
-  struct gcov_kvp *current_node = (struct gcov_kvp *)counters[2];
+  struct gcov_kvp *current_node = (struct gcov_kvp *)(intptr_t)counters[2];

   while (current_node)
     {
@@ -467,10 +449,33 @@ gcov_topn_add_value (gcov_type *counters, gcov_type value, gcov_type count,
   int success = 0;

   if (!counters[2])
-    success = gcov_counter_set_if_null (&counters[2], new_node, use_atomic);
+    {
+#if GCOV_SUPPORTS_ATOMIC
+      if (use_atomic)
+	{
+	  struct gcov_kvp **ptr = (struct gcov_kvp **)(intptr_t)&counters[2];
+	  success = !__sync_val_compare_and_swap (ptr, 0, new_node);
+	}
+      else
+#endif
+	{
+	  counters[2] = (intptr_t)new_node;
+	  success = 1;
+	}
+    }
   else if (prev_node && !prev_node->next)
-    success = gcov_counter_set_if_null ((gcov_type *)&prev_node->next,
-					new_node, use_atomic);
+    {
+#if GCOV_SUPPORTS_ATOMIC
+      if (use_atomic)
+	success = !__sync_val_compare_and_swap (&prev_node->next, 0,
+						new_node);
+      else
+#endif
+	{
+	  prev_node->next = new_node;
+	  success = 1;
+	}
+    }

   /* Increment number of nodes.  */
   if (success)
--
2.26.2
Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)
On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches wrote:
>
> PR 95581 reports an ICE in a call to gimple_call_arg() with
> an out-of-bounds argument number (2).  The attached patch avoids
> the ICE but I'm not sure it's the right fix.  Other than verifying
> the ICE is gone with a powerpc64 cross-compiler I couldn't come up
> with a generic test case to exercise this change.
>
> The call, synthesized by the vectorizer, is
>
>   vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8);
>
> but the function is declared by the powerpc64 back end to take two
> arguments: long and const void*.  Is it correct for the builtin to
> be called with fewer arguments than their type indicates?  (If it
> isn't the patch shouldn't be necessary but some other fix would
> be needed.)

I think the backend declaration is wrong, the function only takes a
void * argument and returns a long.

Richard.

> Martin
Re: avoid line breaks in -fverbose-asm tree exprs
On Tue, Jun 9, 2020 at 1:47 AM Alexandre Oliva wrote:
>
> An asm operand with a "VIEW_CONVERT_EXPR [...] }>" will output the
> definition of the struct as asm code.  Oops.
>
> I hoped setting print_rtx_head would affect print_mem_expr, just
> because it's in print-rtl rather than tree-pretty-print, but it didn't
> help.  I've thus arranged for print_mem_expr to "obey" print_rtx_head
> by enabling TDF_SLIM.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
>
> Another alternative that occurred to me was to get print_rtx_head
> printed by e.g. pp_newline, or otherwise get print-rtl and
> pretty-print infrastructure to both use the same line-leading
> infrastructure.  I think that would be a much bigger undertaking, so I
> figured I'd ask whether that would be desirable, before taking it up.
> Thoughts?
>
>
> for gcc/ChangeLog
>
> 	* final.c (output_asm_operand_names): Set print_rtx_head
> 	around print_mem_expr.
> 	* print-rtl.c (print_mem_expr): Enable TDF_SLIM in dump_flags
> 	when print_rtx_head is nonempty.
> ---
>  final.c     | 3 +++
>  print-rtl.c | 4 ++++
>  2 files changed, 7 insertions(+)
>
> diff --git gcc/final.c gcc/final.c
> index a360196..482801db 100644
> --- gcc/final.c
> +++ gcc/final.c
> @@ -3706,9 +3706,12 @@ output_asm_operand_names (rtx *operands, int *oporder, int nops)
>        wrote = 1;
>        if (expr)
> 	{
> +	  const char *saved_print_rtx_head = print_rtx_head;
> 	  fprintf (asm_out_file, "%s",
> 		   addressp ? "*" : "");
> +	  print_rtx_head = ASM_COMMENT_START;
> 	  print_mem_expr (asm_out_file, expr);
> +	  print_rtx_head = saved_print_rtx_head;
> 	  wrote = 1;
> 	}
>        else if (REG_P (op) && ORIGINAL_REGNO (op)
> diff --git gcc/print-rtl.c gcc/print-rtl.c
> index 611ea07..0563540 100644
> --- gcc/print-rtl.c
> +++ gcc/print-rtl.c
> @@ -182,8 +182,12 @@ rtx_reuse_manager::set_seen_def (int reuse_id)
>  void
>  print_mem_expr (FILE *outfile, const_tree expr)
>  {
> +  dump_flags_t save_dump_flags = dump_flags;
> +  if (*print_rtx_head)
> +    dump_flags |= TDF_SLIM;
>    fputc (' ', outfile);
>    print_generic_expr (outfile, CONST_CAST_TREE (expr), dump_flags);

How about simply unconditionally doing dump_flags | TDF_SLIM here to
have the whole mem_expr on one line.  OK with that change.

Richard.

> +  dump_flags = save_dump_flags;
> }
> #endif
>
>
> --
> Alexandre Oliva, freedom fighter    he/him    https://FSFLA.org/blogs/lxo/
> Free Software Evangelist            Stallman was right, but he's left :(
> GNU Toolchain Engineer              Live long and free, and prosper ethically
[PATCH v2] tsan: Add optional support for distinguishing volatiles
Add support to optionally emit different instrumentation for accesses to
volatile variables.  While the default TSAN runtime likely will never require
this feature, other runtimes for different environments that have subtly
different memory models or assumptions may require distinguishing volatiles.

One such environment is OS kernels, where volatile is still used in various
places and is often declared to be appropriate even in multi-threaded
contexts.  One such example is the Linux kernel, which implements various
synchronization primitives using volatile (READ_ONCE(), WRITE_ONCE()).  Here
the Kernel Concurrency Sanitizer (KCSAN) is a runtime that uses TSAN
instrumentation but otherwise implements a very different approach to race
detection from TSAN: https://github.com/google/ktsan/wiki/KCSAN

Due to recent changes in requirements by the Linux kernel, KCSAN requires
that the compiler supports tsan-distinguish-volatile (among several new
requirements):
https://lore.kernel.org/lkml/20200521142047.169334-7-el...@google.com/

gcc/
	* params.opt: Define --param=tsan-distinguish-volatile=[0,1].
	* sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
	builtin for volatile instrumentation of reads/writes.
	(BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
	* tsan.c (get_memory_access_decl): Add argument indicating
	whether the access is volatile.  If param
	tsan-distinguish-volatile is non-zero and the access is
	volatile, return the volatile instrumentation decl.
	(instrument_expr): Check if access is volatile.

gcc/testsuite/
	* c-c++-common/tsan/volatile.c: New test.
Acked-by: Dmitry Vyukov --- v2: * Add Optimization keyword to -param=tsan-distinguish-volatile= as the parameter can be different per TU. * Add tree-dump check to test. --- gcc/params.opt | 4 ++ gcc/sanitizer.def | 21 +++ gcc/testsuite/c-c++-common/tsan/volatile.c | 67 ++ gcc/tsan.c | 53 +++-- 4 files changed, 128 insertions(+), 17 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/tsan/volatile.c diff --git a/gcc/params.opt b/gcc/params.opt index 4aec480798b..c751416bcad 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best edge is less than this th Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization Set the maximum number of instructions executed in parallel in reassociated tree. If 0, use the target dependent heuristic. +-param=tsan-distinguish-volatile= +Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 1) Param Optimization +Emit special instrumentation for accesses to volatiles. + -param=uninit-control-dep-attempts= Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) IntegerRange(1, 65536) Param Optimization Maximum number of nested calls to search for control dependencies during uninitialized variable analysis. 
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def index 11eb6467eba..a32715ddb92 100644 --- a/gcc/sanitizer.def +++ b/gcc/sanitizer.def @@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, "__tsan_read_range", DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range", BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, "__tsan_volatile_read16", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, "__tsan_volatile_write1", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, "__tsan_volatile_write2", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE4, "__tsan_volatile_write4", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN
[PATCH] gcc-changelog: fix deduction for root ChangeLog
The patch fixes a wrong ChangeLog regex, pushed to master. Martin contrib/ChangeLog: * gcc-changelog/git_commit.py: Fix ChangeLog regex in order to match the top-level ChangeLog. * gcc-changelog/test_email.py: Add test. * gcc-changelog/test_patches.txt: Likewise. --- contrib/gcc-changelog/git_commit.py| 5 +- contrib/gcc-changelog/test_email.py| 5 ++ contrib/gcc-changelog/test_patches.txt | 68 ++ 3 files changed, 76 insertions(+), 2 deletions(-) diff --git a/contrib/gcc-changelog/git_commit.py b/contrib/gcc-changelog/git_commit.py index 069900d7cbf..f85d4c83c63 100755 --- a/contrib/gcc-changelog/git_commit.py +++ b/contrib/gcc-changelog/git_commit.py @@ -150,7 +150,7 @@ misc_files = [ author_line_regex = \ re.compile(r'^(?P\d{4}-\d{2}-\d{2})\ {2}(?P.* <.*>)') additional_author_regex = re.compile(r'^\t(?P\ *)?(?P.* <.*>)') -changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)/ChangeLog:?') +changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?') pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$') dr_regex = re.compile(r'\tDR ([0-9]+)$') star_prefix_regex = re.compile(r'\t\*(?P\ *)(?P.*)') @@ -359,7 +359,8 @@ class GitCommit: % LINE_LIMIT, line)) m = changelog_regex.match(line) if m: -last_entry = ChangeLogEntry(m.group(1), self.top_level_authors, +last_entry = ChangeLogEntry(m.group(1).rstrip('/'), +self.top_level_authors, self.top_level_prs) self.changelog_entries.append(last_entry) elif self.find_changelog_location(line): diff --git a/contrib/gcc-changelog/test_email.py b/contrib/gcc-changelog/test_email.py index 7e8140c915f..04ddad3f100 100755 --- a/contrib/gcc-changelog/test_email.py +++ b/contrib/gcc-changelog/test_email.py @@ -331,3 +331,8 @@ class TestGccChangelog(unittest.TestCase): assert len(email.errors) == 1 msg = 'changed file not mentioned in a ChangeLog' assert email.errors[0].message == msg + +def test_not_deduce(self): +email = self.from_patch_glob('0001-configure.patch') +assert not email.errors +assert 
len(email.changelog_entries) == 2 diff --git a/contrib/gcc-changelog/test_patches.txt b/contrib/gcc-changelog/test_patches.txt index 5d9b62d2637..15fe0df1ccc 100644 --- a/contrib/gcc-changelog/test_patches.txt +++ b/contrib/gcc-changelog/test_patches.txt @@ -3058,4 +3058,72 @@ index 967e5f5f348..95d21b5bf9f 100644 + -- 2.25.4 +=== 0001-configure.patch === +From dbe341cf6a77bb28c5fdf8b32dcb0ff1c2a27348 Mon Sep 17 00:00:00 2001 +From: Martin Liska +Date: Tue, 9 Jun 2020 09:39:36 +0200 +Subject: [PATCH] c++: Fix --disable-bootstrap with older g++. + +Previously I had AX_CXX_COMPILE_STDCXX in the gcc directory configure, which +added -std=c++11 to CXX if needed, but then CXX is overridden from the +toplevel directory, so it didn't have the desired effect. Fixed by moving +the check to the toplevel. Currently it is only used when building GCC +without bootstrapping; other packages that share the toplevel directory +can adjust the condition if they also want to require C++11 support. +ChangeLog: + + * configure.ac: Check AX_CXX_COMPILE_STDCXX if not bootstrapping. + * configure: Regenerate. + +gcc/ChangeLog: + + * aclocal.m4: Remove ax_cxx_compile_stdcxx.m4. + * configure.ac: Remove AX_CXX_COMPILE_STDCXX. + * configure: Regenerate. 
+ +--- + configure| 999 ++- + configure.ac | 6 +- + gcc/aclocal.m4 | 1 - + gcc/configure| 997 +- + gcc/configure.ac | 2 - + 5 files changed, 1004 insertions(+), 1001 deletions(-) + +diff --git a/configure b/configure +index b7897446c70..a0c5aca9e8d 100755 +--- a/configure b/configure +@@ -1 +1,2 @@ + ++ +diff --git a/configure.ac b/configure.ac +index 59bd92a3e53..1a53ed418e4 100644 +--- a/configure.ac b/configure.ac +@@ -1 +1,2 @@ + ++ +diff --git a/gcc/aclocal.m4 b/gcc/aclocal.m4 +index e93c1535063..1737d59d1cb 100644 +--- a/gcc/aclocal.m4 b/gcc/aclocal.m4 +@@ -1 +1,2 @@ + ++ +diff --git a/gcc/configure b/gcc/configure +index 46850710424..629c7c7e153 100755 +--- a/gcc/configure b/gcc/configure +@@ -1 +1,2 @@ + ++ +diff --git a/gcc/configure.ac b/gcc/configure.ac +index 60d83c30771..9e7efd13ecc 100644 +--- a/gcc/configure.ac b/gcc/configure.ac +@@ -1 +1,2 @@ + ++ +-- +2.26.2 -- 2.26.2
Re: [PATCH] libsanitizer: use gnu++14
On 6/8/20 4:53 PM, Martin Liška wrote:
> Hi.
>
> Thank you for the report.  It's caused by the fact that LLVM switched to
> C++14 in 4d474e078ac7.  So I suggest using gnu++14.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> I also verified that abidiff is equal for all libsanitizer shared
> libraries.
>
> I'm going to install the patch if there are no objections.
> Thanks,
> Martin

Installed as 942a384ef9f.

@Andreas: Can you please check the riscv64 build?

Martin
[Ada] Spurious error on instantiations with Taft_Amendment types and tasks
An access type whose designated type is an incomplete or class-wide type may
end up designating a task type or a type with a task component.  It is then
necessary to associate the access type with the master of the tasks that may
be created by an allocator for that access type.  This is done by creating a
renaming declaration that uses a local name to rename the master of an
enclosing construct (subprogram or task body) that is the finalization master
of these tasks.  The local name used in that renaming declaration must be
unique across the compilation unit, because several nested constructs that
are not masters (for example, package instantiations) may contain local
access types with the same name.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ed Schonberg

gcc/ada/

	* exp_ch9.adb (Build_Master_Renaming): Make name in renaming
	declaration unique by adding a numeric suffix, to prevent
	accidental name conflict when several instantiations of a
	package containing an access_to_incomplete type that designate
	tasks appear in the same scope.

--- gcc/ada/exp_ch9.adb
+++ gcc/ada/exp_ch9.adb
@@ -3469,10 +3469,13 @@ package body Exp_Ch9 is
       --  Generate:
       --    M : Master_Id renames _Master;
+      --  and add a numeric suffix to the name to ensure that it is
+      --  unique in case other access types in nested constructs
+      --  are homonyms of this one.

       Master_Id :=
         Make_Defining_Identifier (Loc,
-          New_External_Name (Chars (Ptr_Typ), 'M'));
+          New_External_Name (Chars (Ptr_Typ), 'M', -1));

       Master_Decl :=
         Make_Object_Renaming_Declaration (Loc,
[Ada] Remove bypass for instance bodies from Is_Visible_Component
This removes an obsolete bypass present for long in the predicate Is_Visible_Component, which is responsible for determining whether a component present in a private tagged record type is visible in its extensions. This bypass would essentially always return true in bodies of instances, but there is no real basis to do so. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Eric Botcazou gcc/ada/ * sem_ch3.adb (Is_Visible_Component): Do not special-case bodies of instances.--- gcc/ada/sem_ch3.adb +++ gcc/ada/sem_ch3.adb @@ -18762,39 +18762,6 @@ package body Sem_Ch3 is then return True; - -- In the body of an instantiation, check the visibility of a component - -- in case it has a homograph that is a primitive operation of a private - -- type which was not visible in the generic unit. - - -- Should Is_Prefixed_Call be propagated from template to instance??? - - elsif In_Instance_Body then - if not Is_Tagged_Type (Original_Type) - or else not Is_Private_Type (Original_Type) - then -return True; - - else -declare - Subp_Elmt : Elmt_Id; - -begin - Subp_Elmt := First_Elmt (Primitive_Operations (Original_Type)); - while Present (Subp_Elmt) loop - - -- The component is hidden by a primitive operation - - if Chars (Node (Subp_Elmt)) = Chars (C) then - return False; - end if; - - Next_Elmt (Subp_Elmt); - end loop; - - return True; -end; - end if; - -- If the component has been declared in an ancestor which is currently -- a private type, then it is not visible. The same applies if the -- component's containing type is not in an open scope and the original
[Ada] Ada2020: AI12-0301 Predicates and Default_Value
This AI clarifies that predicate checks apply at object declaration for types that contain components with a Default_Value or Default_Component_Value. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * sem_util.ads, sem_util.adb (Is_Partially_Initialized_Type): Take Default_Value and Default_Component_Value into account. * sem_ch3.adb (Analyze_Object_Declaration): Update comment.--- gcc/ada/sem_ch3.adb +++ gcc/ada/sem_ch3.adb @@ -4366,8 +4366,10 @@ package body Sem_Ch3 is -- We need a predicate check if the type has predicates that are not -- ignored, and if either there is an initializing expression, or for -- default initialization when we have at least one case of an explicit - -- default initial value and then this is not an internal declaration - -- whose initialization comes later (as for an aggregate expansion). + -- default initial value (including via a Default_Value or + -- Default_Component_Value aspect, see AI12-0301) and then this is not + -- an internal declaration whose initialization comes later (as for an + -- aggregate expansion). -- If expression is an aggregate it may be expanded into assignments -- and the declaration itself is marked with No_Initialization, but -- the predicate still applies. --- gcc/ada/sem_util.adb +++ gcc/ada/sem_util.adb @@ -16863,7 +16863,7 @@ package body Sem_Util is is begin if Is_Scalar_Type (Typ) then - return False; + return Has_Default_Aspect (Base_Type (Typ)); elsif Is_Access_Type (Typ) then return Include_Implicit; @@ -16872,8 +16872,9 @@ package body Sem_Util is -- If component type is partially initialized, so is array type - if Is_Partially_Initialized_Type - (Component_Type (Typ), Include_Implicit) + if Has_Default_Aspect (Base_Type (Typ)) + or else Is_Partially_Initialized_Type + (Component_Type (Typ), Include_Implicit) then return True; --- gcc/ada/sem_util.ads +++ gcc/ada/sem_util.ads @@ -1872,7 +1872,8 @@ package Sem_Util is -- Typ is a type entity. 
This function returns true if this type is partly -- initialized, meaning that an object of the type is at least partly -- initialized (in particular in the record case, that at least one - -- component has an initialization expression). Note that initialization + -- component has an initialization expression, including via Default_Value + -- and Default_Component_Value aspects). Note that initialization -- resulting from the use of pragma Normalize_Scalars does not count. -- Include_Implicit controls whether implicit initialization of access -- values to null, and of discriminant values, is counted as making the
[Ada] Refine implementation of AI05-0149 missing conversion checks
We were accepting conversion from anon-access-T'Class to access-all-T which is incorrect. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * sem_type.adb (Covers): Fix implementation of AI05-0149. * sem_res.adb: Fix typo.--- gcc/ada/sem_res.adb +++ gcc/ada/sem_res.adb @@ -2852,7 +2852,7 @@ package body Sem_Res is return; -- Under relaxed RM semantics silently replace occurrences of null --- by System.Address_Null. +-- by System.Null_Address. elsif Null_To_Null_Address_Convert_OK (N, Typ) then Replace_Null_By_Null_Address (N); --- gcc/ada/sem_type.adb +++ gcc/ada/sem_type.adb @@ -1021,15 +1021,17 @@ package body Sem_Type is -- Ada 2012 (AI05-0149): Allow an anonymous access type in the context -- of a named general access type. An implicit conversion will be - -- applied. For the resolution, one designated type must cover the - -- other. + -- applied. For the resolution, the designated types must match if + -- untagged; further, if the designated type is tagged, the designated + -- type of the anonymous access type shall be covered by the designated + -- type of the named access type. elsif Ada_Version >= Ada_2012 and then Ekind (BT1) = E_General_Access_Type and then Ekind (BT2) = E_Anonymous_Access_Type -and then (Covers (Designated_Type (T1), Designated_Type (T2)) -or else - Covers (Designated_Type (T2), Designated_Type (T1))) +and then Covers (Designated_Type (T1), Designated_Type (T2)) +and then (Is_Class_Wide_Type (Designated_Type (T1)) >= + Is_Class_Wide_Type (Designated_Type (T2))) then return True;
[Ada] Ada2020 AI12-0282: Shared variable control aspects in generics
This patch refines the checks and the corresponding error messages on the
legality of instantiations where some formal objects and their corresponding
actuals may carry Atomic, Atomic_Components, Volatile, Volatile_Components,
Independent, or Independent_Components aspect specifications.  The legality
rules require an exact match between formal and actual for the first four
aspects, but not for the last two.  For Atomic_Components and
Volatile_Components we also check whether the actual is of an array type with
the corresponding aspect on its component type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ed Schonberg

gcc/ada/

	* sem_ch12.adb (Check_Shared_Variable_Control_Aspects): Require
	exact match between formal and actual for aspects Atomic,
	Atomic_Components, Volatile, and Volatile_Components.

--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -12398,21 +12398,15 @@ package body Sem_Ch12 is procedure Check_Shared_Variable_Control_Aspects is begin if Ada_Version >= Ada_2020 then -if Is_Atomic (A_Gen_T) and then not Is_Atomic (Act_T) then +if Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T) then Error_Msg_NE - ("actual for& must be an atomic type", Actual, A_Gen_T); + ("actual for& has different Atomic aspect", Actual, A_Gen_T); end if; -if Is_Volatile (A_Gen_T) and then not Is_Volatile (Act_T) then +if Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T) then Error_Msg_NE - ("actual for& must be a Volatile type", Actual, A_Gen_T); -end if; - -if - Is_Independent (A_Gen_T) and then not Is_Independent (Act_T) -then - Error_Msg_NE - ("actual for& must be an Independent type", Actual, A_Gen_T); + ("actual for& has different Volatile aspect", +Actual, A_Gen_T); end if; -- We assume that an array type whose atomic component type
@@ -12420,43 +12414,51 @@ package body Sem_Ch12 is -- aspect Has_Atomic_Components. This is a reasonable inference -- from the intent of AI12-0282, and makes it legal to use an -- actual that does not have the identical aspect as the formal.
+-- Ditto for volatile components. -if Has_Atomic_Components (A_Gen_T) - and then not Has_Atomic_Components (Act_T) -then - if Is_Array_Type (Act_T) - and then Is_Atomic (Component_Type (Act_T)) - then - null; +declare + Actual_Atomic_Comp : constant Boolean := + Has_Atomic_Components (Act_T) + or else (Is_Array_Type (Act_T) +and then Is_Atomic (Component_Type (Act_T))); +begin + if Has_Atomic_Components (A_Gen_T) /= Actual_Atomic_Comp then + Error_Msg_NE +("formal and actual for& must agree on atomic components", + Actual, A_Gen_T); + end if; +end; - else +declare + Actual_Volatile_Comp : constant Boolean := + Has_Volatile_Components (Act_T) + or else (Is_Array_Type (Act_T) + and then Is_Volatile (Component_Type (Act_T))); +begin + if Has_Volatile_Components (A_Gen_T) /= Actual_Volatile_Comp + then Error_Msg_NE -("actual for& must have atomic components", +("actual for& must have volatile components", Actual, A_Gen_T); end if; -end if; +end; -if Has_Independent_Components (A_Gen_T) - and then not Has_Independent_Components (Act_T) +-- The following two aspects do not require exact matching, +-- but only one-way agreement. See RM C.6. + +if Is_Independent (A_Gen_T) and then not Is_Independent (Act_T) then Error_Msg_NE - ("actual for& must have independent components", -Actual, A_Gen_T); + ("actual for& must have Independent aspect specified", + Actual, A_Gen_T); end if; -if Has_Volatile_Components (A_Gen_T) - and then not Has_Volatile_Components (Act_T) +if Has_Independent_Components (A_Gen_T) + and then not Has_Independent_Components (Act_T) then - if Is_Array_Type (Act_T) - and then Is_Volatile (Component_Type (Act_T)) - then - null; - - else - Error_Msg_NE -
[Ada] Annotate Ada.Synchronous_Barriers with SPARK_Mode => Off
Synchronous barriers are currently not supported by GNATprove (i.e. it doesn't recognize how they are initialized). They are specifically detected and rejected as not-in-SPARK, but it is more elegant to annotate the Ada.Synchronous_Barriers runtime library unit with SPARK_Mode => Off. Compilation is not affected by this annotation. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Piotr Trojanek gcc/ada/ * libgnarl/a-synbar.ads, libgnarl/a-synbar.adb, libgnarl/a-synbar__posix.ads, libgnarl/a-synbar__posix.adb (Ada.Synchronous_Barriers): Annotate with SPARK_Mode => Off.--- gcc/ada/libgnarl/a-synbar.adb +++ gcc/ada/libgnarl/a-synbar.adb @@ -33,7 +33,7 @@ -- -- -- -package body Ada.Synchronous_Barriers is +package body Ada.Synchronous_Barriers with SPARK_Mode => Off is protected body Synchronous_Barrier is --- gcc/ada/libgnarl/a-synbar.ads +++ gcc/ada/libgnarl/a-synbar.ads @@ -33,7 +33,7 @@ -- -- -- -package Ada.Synchronous_Barriers is +package Ada.Synchronous_Barriers with SPARK_Mode => Off is pragma Preelaborate (Synchronous_Barriers); subtype Barrier_Limit is Positive range 1 .. Positive'Last; --- gcc/ada/libgnarl/a-synbar__posix.adb +++ gcc/ada/libgnarl/a-synbar__posix.adb @@ -37,7 +37,7 @@ with Interfaces.C; use Interfaces.C; -package body Ada.Synchronous_Barriers is +package body Ada.Synchronous_Barriers with SPARK_Mode => Off is -- POSIX barriers -- --- gcc/ada/libgnarl/a-synbar__posix.ads +++ gcc/ada/libgnarl/a-synbar__posix.ads @@ -39,7 +39,7 @@ with System; private with Ada.Finalization; private with Interfaces.C; -package Ada.Synchronous_Barriers is +package Ada.Synchronous_Barriers with SPARK_Mode => Off is pragma Preelaborate (Synchronous_Barriers); subtype Barrier_Limit is Positive range 1 .. Positive'Last;
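For illustration, the same annotation on a hypothetical user unit (not part of the patch); GNATprove skips the whole package, while the compiler simply ignores the aspect:

```ada
with System;

--  SPARK_Mode => Off excludes the entire unit from proof.
package Hardware_Access with SPARK_Mode => Off is
   procedure Poke (Address : System.Address);
end Hardware_Access;
```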
[Ada] Spurious overlap error on zero-sized arrays with -gnateV
This patch corrects an issue whereby the compiler would incorrectly generate checks for overlapping formals (-gnateV) leading to spurious runtime errors when a zero-sized array actual gets passed alongside another formal declared in the same region. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Justin Squirek gcc/ada/ * exp_attr.adb (Expand_N_Attribute_Reference): Modify expansion of 'Overlaps_Storage to take into account zero sized arrays.--- gcc/ada/exp_attr.adb +++ gcc/ada/exp_attr.adb @@ -4783,27 +4783,31 @@ package body Exp_Attr is when Attribute_Overlaps_Storage => Overlaps_Storage : declare Loc : constant Source_Ptr := Sloc (N); + X : constant Node_Id:= Prefix (N); + Y : constant Node_Id:= First (Expressions (N)); - X : constant Node_Id := Prefix (N); - Y : constant Node_Id := First (Expressions (N)); -- The arguments X_Addr, Y_Addr : Node_Id; - -- the expressions for their integer addresses + + -- The expressions for their integer addresses X_Size, Y_Size : Node_Id; - -- the expressions for their sizes + + -- The expressions for their sizes Cond : Node_Id; begin -- Attribute expands into: - --if X'Address < Y'address then - -- (X'address + X'Size - 1) >= Y'address - --else - -- (Y'address + Y'size - 1) >= X'Address - --end if; + --(if X'Size = 0 or else Y'Size = 0 then + -- False + -- else + -- (if X'Address <= Y'Address then + -- (X'Address + X'Size - 1) >= Y'Address + --else + -- (Y'Address + Y'Size - 1) >= X'Address)) -- with the proper address operations. We convert addresses to -- integer addresses to use predefined arithmetic. 
The size is @@ -4846,29 +4850,62 @@ package body Exp_Attr is Left_Opnd => X_Addr, Right_Opnd => Y_Addr); + -- Perform the rewriting + Rewrite (N, Make_If_Expression (Loc, New_List ( - Cond, - Make_Op_Ge (Loc, - Left_Opnd => - Make_Op_Add (Loc, - Left_Opnd => New_Copy_Tree (X_Addr), - Right_Opnd => - Make_Op_Subtract (Loc, - Left_Opnd => X_Size, - Right_Opnd => Make_Integer_Literal (Loc, 1))), - Right_Opnd => Y_Addr), + -- Generate a check for zero sized things like a null record with + -- size zero or an array with zero length since they have no + -- opportunity of overlapping. + + -- Without this check a zero-sized object can trigger a false + -- runtime result if its compared against another object + -- in its declarative region due to the zero-sized object having + -- the same address. - Make_Op_Ge (Loc, + Make_Or_Else (Loc, Left_Opnd => - Make_Op_Add (Loc, - Left_Opnd => New_Copy_Tree (Y_Addr), - Right_Opnd => - Make_Op_Subtract (Loc, - Left_Opnd => Y_Size, - Right_Opnd => Make_Integer_Literal (Loc, 1))), - Right_Opnd => X_Addr; + Make_Op_Eq (Loc, + Left_Opnd => + Make_Attribute_Reference (Loc, + Attribute_Name => Name_Size, + Prefix => New_Copy_Tree (X)), + Right_Opnd => Make_Integer_Literal (Loc, 0)), + Right_Opnd => + Make_Op_Eq (Loc, + Left_Opnd => + Make_Attribute_Reference (Loc, + Attribute_Name => Name_Size, + Prefix => New_Copy_Tree (Y)), + Right_Opnd => Make_Integer_Literal (Loc, 0))), + + New_Occurrence_Of (Standard_False, Loc), + + -- Non-size zero overlap check + + Make_If_Expression (Loc, New_List ( + Cond, + + Make_Op_Ge (Loc, + Left_Opnd => + Make_Op_Add (Loc, +Left_Opnd => New_Copy_Tree (X_Addr), + Right_Opnd => + Make_Op_Subtract (Loc, + Left_Opnd => X_Size, + Right_Opnd => Make_Integer_Literal (Loc, 1))), + Right_Opnd => Y_Addr), + + Make_Op_Ge (Loc, + Left_Opnd => + Make_Op_Add (Loc, + Left_Opnd => New_Copy_Tree (Y_Addr), + Right_Opnd => + Make_Op_Subtract (Loc, +
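A self-contained sketch of the case being fixed (the declarations here are illustrative, not taken from the patch's test):

```ada
procedure Demo is
   type Arr is array (Positive range <>) of Integer;
   Empty : Arr (1 .. 0);   --  zero-sized: occupies no storage
   N     : Integer := 0;
begin
   --  A zero-sized object may share its address with a neighboring
   --  declaration; with the fix the attribute now yields False here
   --  instead of triggering a spurious overlap failure under -gnateV.
   pragma Assert (not Empty'Overlaps_Storage (N));
end Demo;
```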
[Ada] Crash on exit statement within predicated loop
This patch corrects an issue whereby the compiler would crash if it encountered an exit statement featuring a loop identifier when the referenced loop iterates over a type with a static predicate. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Justin Squirek gcc/ada/ * exp_ch5.adb (Expand_Predicated_Loop): Preserve the original loop identifier within the expansion.--- gcc/ada/exp_ch5.adb +++ gcc/ada/exp_ch5.adb @@ -4920,13 +4920,14 @@ package body Exp_Ch5 is -- mode, the semantic analyzer may disallow one or both forms. procedure Expand_Predicated_Loop (N : Node_Id) is - Loc : constant Source_Ptr := Sloc (N); - Isc : constant Node_Id:= Iteration_Scheme (N); - LPS : constant Node_Id:= Loop_Parameter_Specification (Isc); - Loop_Id : constant Entity_Id := Defining_Identifier (LPS); - Ltype : constant Entity_Id := Etype (Loop_Id); - Stat: constant List_Id:= Static_Discrete_Predicate (Ltype); - Stmts : constant List_Id:= Statements (N); + Orig_Loop_Id : Node_Id:= Empty; + Loc : constant Source_Ptr := Sloc (N); + Isc : constant Node_Id:= Iteration_Scheme (N); + LPS : constant Node_Id:= Loop_Parameter_Specification (Isc); + Loop_Id : constant Entity_Id := Defining_Identifier (LPS); + Ltype: constant Entity_Id := Etype (Loop_Id); + Stat : constant List_Id:= Static_Discrete_Predicate (Ltype); + Stmts: constant List_Id:= Statements (N); begin -- Case of iteration over non-static predicate, should not be possible @@ -5205,7 +5206,13 @@ package body Exp_Ch5 is Alternatives => Alts); Append_To (Stmts, Cstm); --- Rewrite the loop +-- Rewrite the loop preserving the loop identifier in case there +-- are exit statements referencing it. + +if Present (Identifier (N)) then + Orig_Loop_Id := New_Occurrence_Of + (Entity (Identifier (N)), Loc); +end if; Set_Suppress_Assignment_Checks (D); @@ -5217,6 +5224,7 @@ package body Exp_Ch5 is Statements => New_List ( Make_Loop_Statement (Loc, Statements => Stmts, +Identifier => Orig_Loop_Id, End_Label => Empty); Analyze (N);
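The previously crashing pattern, as a minimal sketch: a named loop over a subtype with a static predicate, exited via its identifier (which must survive the case-driven expansion):

```ada
procedure Demo is
   type Day is (Mon, Tue, Wed, Thu, Fri, Sat, Sun);
   subtype Weekend is Day with Static_Predicate => Weekend in Sat | Sun;
begin
   Scan : for D in Weekend loop   --  expanded into a case-driven loop
      exit Scan when D = Sun;     --  the exit references the loop identifier
   end loop Scan;
end Demo;
```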
[Ada] Write_Invocation_Graph_Vertex: include lib item name
Make each invocation graph point to the corresponding library graph. Use this information in Write_Invocation_Graph_Vertex to print out the name of the library item containing this vertex. For procedures that take both the library graph and invocation graph as parameters, remove the library graph parameter, and fetch that information from the invocation graph. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Bob Duff gcc/ada/ * bindo-graphs.adb, bindo-graphs.ads: For each invocation graph, record the corresponding library graph. * bindo-writers.adb (Write_Invocation_Graph_Vertex): Print the lib item name. Remove library graph parameters. * bindo-augmentors.adb, bindo-augmentors.ads, bindo-builders.adb, bindo-diagnostics.adb, bindo-diagnostics.ads, bindo-elaborators.adb: Remove library graph parameters.
[Ada] Add debugging message
When raising Program_Error due to an incorrect Kind transition, print a more informative debugging message. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Bob Duff gcc/ada/ * bindo-graphs.adb (Add_Edge_Kind_Check): Add the Image of the old and new Kinds to the raise Program_Error message.--- gcc/ada/bindo-graphs.adb +++ gcc/ada/bindo-graphs.adb @@ -1824,7 +1824,7 @@ package body Bindo.Graphs is end case; if not OK then -raise Program_Error; +raise Program_Error with Kind'Img & "-->" & Attributes.Kind'Img; end if; end Add_Edge_Kind_Check;
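A sketch of the pattern (the type and names here are illustrative, not the binder's actual declarations): 'Img on the enumeration values makes the failure self-describing.

```ada
type Edge_Kind is (Spec_Before_Body_Edge, With_Edge, Invocation_Edge);

procedure Check (Old_Kind, New_Kind : Edge_Kind; OK : Boolean) is
begin
   if not OK then
      --  e.g. "WITH_EDGE-->INVOCATION_EDGE" instead of a bare raise
      raise Program_Error with Old_Kind'Img & "-->" & New_Kind'Img;
   end if;
end Check;
```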
[Ada] Membership test against a non-excluding subtype
GNAT ignores the null exclusion property of the target access type in a membership test (e.g. Ptr in Null_Excluding_Ptr_Type), this is fixed here. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * exp_ch4.adb (Expand_N_In): Fix handling of null exclusion.--- gcc/ada/exp_ch4.adb +++ gcc/ada/exp_ch4.adb @@ -6468,12 +6468,13 @@ package body Exp_Ch4 is else declare -Typ : Entity_Id:= Etype (Rop); -Is_Acc: constant Boolean := Is_Access_Type (Typ); -Cond : Node_Id := Empty; -New_N : Node_Id; -Obj : Node_Id := Lop; -SCIL_Node : Node_Id; +Typ : Entity_Id:= Etype (Rop); +Is_Acc : constant Boolean := Is_Access_Type (Typ); +Check_Null_Exclusion : Boolean; +Cond : Node_Id := Empty; +New_N: Node_Id; +Obj : Node_Id := Lop; +SCIL_Node: Node_Id; begin Remove_Side_Effects (Obj); @@ -6549,12 +6550,19 @@ package body Exp_Ch4 is -- Here we have a non-scalar type if Is_Acc then + + -- If the null exclusion checks are not compatible, need to + -- perform further checks. In other words, we cannot have + -- Ltyp including null and Typ excluding null. All other cases + -- are OK. 
+ + Check_Null_Exclusion := + Can_Never_Be_Null (Typ) and then not Can_Never_Be_Null (Ltyp); Typ := Designated_Type (Typ); end if; if not Is_Constrained (Typ) then - Rewrite (N, New_Occurrence_Of (Standard_True, Loc)); - Analyze_And_Resolve (N, Restyp); + Cond := New_Occurrence_Of (Standard_True, Loc); -- For the constrained array case, we have to check the subscripts -- for an exact match if the lengths are non-zero (the lengths @@ -6610,19 +6618,6 @@ package body Exp_Ch4 is Build_Attribute_Reference (New_Occurrence_Of (Typ, Loc), Name_Last, J))); end loop; - - if Is_Acc then - Cond := - Make_Or_Else (Loc, - Left_Opnd => - Make_Op_Eq (Loc, - Left_Opnd => Obj, - Right_Opnd => Make_Null (Loc)), - Right_Opnd => Cond); - end if; - - Rewrite (N, Cond); - Analyze_And_Resolve (N, Restyp); end Check_Subscripts; -- These are the cases where constraint checks may be required, @@ -6638,24 +6633,32 @@ package body Exp_Ch4 is if Has_Discriminants (Typ) then Cond := Make_Op_Not (Loc, Right_Opnd => Build_Discriminant_Checks (Obj, Typ)); - - if Is_Acc then - Cond := Make_Or_Else (Loc, - Left_Opnd => - Make_Op_Eq (Loc, - Left_Opnd => Obj, - Right_Opnd => Make_Null (Loc)), - Right_Opnd => Cond); - end if; - else Cond := New_Occurrence_Of (Standard_True, Loc); end if; +end if; - Rewrite (N, Cond); - Analyze_And_Resolve (N, Restyp); +if Is_Acc then + if Check_Null_Exclusion then + Cond := Make_And_Then (Loc, +Left_Opnd => + Make_Op_Ne (Loc, +Left_Opnd => Obj, +Right_Opnd => Make_Null (Loc)), +Right_Opnd => Cond); + else + Cond := Make_Or_Else (Loc, +Left_Opnd => + Make_Op_Eq (Loc, +Left_Opnd => Obj, +Right_Opnd => Make_Null (Loc)), +Right_Opnd => Cond); + end if; end if; +Rewrite (N, Cond); +Analyze_And_Resolve (N, Restyp); + -- Ada 2012 (AI05-0149): Handle membership tests applied to an -- expression of an anonymous access type. This can involve an -- accessibility test and a tagged type membership test in the
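The fixed case, reduced to a self-contained sketch (declarations are illustrative):

```ada
procedure Demo is
   type Int_Ptr is access all Integer;
   subtype Safe_Ptr is not null Int_Ptr;
   P : Int_Ptr := null;
begin
   --  GNAT used to ignore the null exclusion of the tested subtype,
   --  so this membership test could wrongly evaluate to True.
   pragma Assert (not (P in Safe_Ptr));
end Demo;
```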
[Ada] gnatbind: Correct assertions in Add_Edge_Kind_Check
This patch corrects the assertions in Add_Edge_Kind_Check. In particular, a spec-->body edge can come from an Invocation_Edge or a Forced_Edge in case of cycles. Other edges are added in a certain order. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Bob Duff gcc/ada/ * bindo-graphs.ads (Library_Graph_Edge_Kind): Reorder enumerals to reflect the order of adding edges. Clarify comments. * bindo-graphs.adb (Add_Edge_Kind_Check): Correct the assertions. Reorder the "when"s to match the order of adding edges, and therefore the order of enumerals in type Library_Graph_Edge_Kind. Change names to "Old_" and "New_" to clarify what's what. Combine Invocation_Edge into the "<=" test. Fix the "raise Program_Error" message, which was backwards.--- gcc/ada/bindo-graphs.adb +++ gcc/ada/bindo-graphs.adb @@ -1064,9 +1064,9 @@ package body Bindo.Graphs is (G : Library_Graph; Pred : Library_Graph_Vertex_Id; Succ : Library_Graph_Vertex_Id; - Kind : Library_Graph_Edge_Kind); + New_Kind : Library_Graph_Edge_Kind); -- This is called by Add_Edge in the case where there is already a - -- Pred-->Succ edge, to assert that the new Kind is appropriate. Raises + -- Pred-->Succ edge, to assert that the New_Kind is appropriate. Raises -- Program_Error if a bug is detected. The purpose is to prevent bugs -- where calling Add_Edge in different orders produces different output. 
@@ -1781,50 +1781,45 @@ package body Bindo.Graphs is (G : Library_Graph; Pred : Library_Graph_Vertex_Id; Succ : Library_Graph_Vertex_Id; - Kind : Library_Graph_Edge_Kind) + New_Kind : Library_Graph_Edge_Kind) is Old_Edge : constant Library_Graph_Edge_Id := Find_Edge (G, Pred, Succ); - Attributes : constant Library_Graph_Edge_Attributes := - Get_LGE_Attributes (G, Old_Edge); + Old_Kind : constant Library_Graph_Edge_Kind := + Get_LGE_Attributes (G, Old_Edge).Kind; OK : Boolean; begin - case Kind is --- We call Add_Edge with Body_Before_Spec_Edge twice -- once --- for the spec and once for the body, but no other Kind can --- be spec-->body. - -when Body_Before_Spec_Edge => - if True then - -- Disable this part of the assertion for now - OK := True; - else - OK := Attributes.Kind = Body_Before_Spec_Edge; - end if; - --- Spec_Before_Body_Edge comes first - + case New_Kind is when Spec_Before_Body_Edge => OK := False; - --- With clauses and forced edges come after Spec_Before_Body_Edge + -- Spec_Before_Body_Edge comes first, and there is never more + -- than one Spec_Before_Body_Edge for a given unit, so we can't + -- have a preexisting edge in the Spec_Before_Body_Edge case. when With_Edge | Elaborate_Edge | Elaborate_All_Edge - | Forced_Edge => - OK := Attributes.Kind <= Kind; + | Forced_Edge | Invocation_Edge => + OK := Old_Kind <= New_Kind; + -- These edges are created in the order of the enumeration + -- type, and there can be duplicates; hence "<=". --- Invocation_Edge can come after anything, including another --- Invocation_Edge. +when Body_Before_Spec_Edge => + OK := Old_Kind = Body_Before_Spec_Edge + -- We call Add_Edge with Body_Before_Spec_Edge twice -- once + -- for the spec and once for the body. 
-when Invocation_Edge => - OK := True; + or else Old_Kind = Forced_Edge + or else Old_Kind = Invocation_Edge; + -- The old one can be Forced_Edge or Invocation_Edge, which + -- necessarily results in an elaboration cycle (in the static + -- model), but this assertion happens before cycle detection, + -- so we need to allow these cases. when No_Edge => OK := False; end case; if not OK then -raise Program_Error with Kind'Img & "-->" & Attributes.Kind'Img; +raise Program_Error with Old_Kind'Img & "-->" & New_Kind'Img; end if; end Add_Edge_Kind_Check; --- gcc/ada/bindo-graphs.ads +++ gcc/ada/bindo-graphs.ads @@ -702,29 +702,28 @@ package Bindo.Graphs is No_Cycle_Kind); - -- The following type represents the various kinds of library edges. - -- The order is important here, and roughly correspo
[Ada] Disable assertion regarding Body_Before_Spec_Edge
This patch disables a sometimes-failing assertion. The assertion is failing because there is an invocation edge spec-->body, and then we later add a Body_Before_Spec_Edge for the SCC computation. This is a temporary fix; we need to investigate why the spec-->body invocation edge exists. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Bob Duff gcc/ada/ * bindo-graphs.adb (Add_Edge_Kind_Check): Disable failing part of the assertion.--- gcc/ada/bindo-graphs.adb +++ gcc/ada/bindo-graphs.adb @@ -1795,7 +1795,12 @@ package body Bindo.Graphs is -- be spec-->body. when Body_Before_Spec_Edge => - OK := Attributes.Kind = Body_Before_Spec_Edge; + if True then + -- Disable this part of the assertion for now + OK := True; + else + OK := Attributes.Kind = Body_Before_Spec_Edge; + end if; -- Spec_Before_Body_Edge comes first
[Ada] Improve handling of null unbounded strings
This is a useful optimization to avoid contention around the shared empty string. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * libgnat/a-strunb__shared.ads, libgnat/a-strunb__shared.adb (Reference, Unreference): No-op for Empty_Shared_String. Remove unneeded calls to Reference.--- gcc/ada/libgnat/a-strunb__shared.adb +++ gcc/ada/libgnat/a-strunb__shared.adb @@ -73,7 +73,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Left string is empty, return Right string @@ -112,7 +111,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Right is an empty string, return Left string @@ -145,7 +143,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared one if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Left is empty string, return Right string @@ -214,7 +211,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if Left = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Otherwise, allocate new shared string and fill it @@ -244,7 +240,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Otherwise, allocate new shared string and fill it @@ -277,7 +272,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Coefficient is one, just return string itself @@ -506,7 +500,6 @@ package body Ada.Strings.Unbounded is -- Empty string 
requested, return shared empty string if Max_Length = 0 then - Reference (Empty_Shared_String'Access); return Empty_Shared_String'Access; -- Otherwise, allocate requested space (and probably some more room) @@ -701,7 +694,6 @@ package body Ada.Strings.Unbounded is -- Result is an empty string, reuse shared empty string if DL = 0 then -Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Otherwise, allocate new shared string and fill it @@ -743,7 +735,6 @@ package body Ada.Strings.Unbounded is -- Result is empty, reuse shared empty string if DL = 0 then -Reference (Empty_Shared_String'Access); Source.Reference := Empty_Shared_String'Access; Unreference (SR); @@ -801,7 +792,6 @@ package body Ada.Strings.Unbounded is -- effects if a program references an already-finalized object. Object.Reference := Null_Unbounded_String.Reference; - Reference (Object.Reference); Unreference (SR); end if; end Finalize; @@ -862,7 +852,6 @@ package body Ada.Strings.Unbounded is -- Result is empty, reuse shared empty string if Count = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Length of the string is the same as requested, reuse source shared @@ -912,7 +901,6 @@ package body Ada.Strings.Unbounded is -- Result is empty, reuse empty shared string if Count = 0 then - Reference (Empty_Shared_String'Access); Source.Reference := Empty_Shared_String'Access; Unreference (SR); @@ -1090,7 +1078,6 @@ package body Ada.Strings.Unbounded is -- Result is empty, reuse empty shared string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Inserted string is empty, reuse source shared string @@ -1132,7 +1119,6 @@ package body Ada.Strings.Unbounded is -- Result is empty string, reuse empty shared string if DL = 0 then - Reference (Empty_Shared_String'Access); Source.Reference := Empty_Shared_String'Access; Unreference (SR); @@ -1197,7 +1183,6 @@ package body Ada.Strings.Unbounded is -- Result is 
empty string, reuse empty shared string if DL = 0 then - Reference (Empty_Shared_String'Access); DR := Empty_Shared_String'Access; -- Result is same as source string, reuse source shared string @@ -1241,7 +1226,6 @@ package body Ada.Strings.Unbounded is
[Ada] Improve handling of aggregates in Side_Effect_Free
Side_Effect_Free always assumed that an aggregate is not side-effect free, but we can do better by taking advantage of the Compile_Time_Known_Aggregate flag. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * exp_util.adb (Side_Effect_Free): Improve handling of N_Aggregate.--- gcc/ada/exp_util.adb +++ gcc/ada/exp_util.adb @@ -13357,6 +13357,12 @@ package body Exp_Util is => return True; + -- An aggregate is side effect free if all its values are compile + -- time known. + + when N_Aggregate => +return Compile_Time_Known_Aggregate (N); + -- We consider that anything else has side effects. This is a bit -- crude, but we are pretty close for most common cases, and we -- are certainly correct (i.e. we never return True when the
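An example of an aggregate the flag now covers: every component value is known at compile time.

```ada
type Vec is array (1 .. 4) of Integer;
Zero : constant Vec := (others => 0);
--  Compile_Time_Known_Aggregate is True for the aggregate above, so
--  Remove_Side_Effects no longer needs to capture it in a temporary.
```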
[Ada] Implement AI12-0028: Import of variadic C functions
This implements the support for the new C_Variadic_n conventions, n ranging from 0 to 16, in all versions of the language since the AI is a binding interpretation. These new conventions are meant to be used in conjunction with the Import aspect/pragma to call variadic C functions with the special calling conventions attached to them, depending on the ABI. But a warning is issued for C_Variadic_0 since it cannot actually be used. The typical example is calling the printf function of the C library. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Eric Botcazou gcc/ada/ * exp_ch6.adb (Freeze_Subprogram): Deal with convention C_Family. * freeze.adb (Freeze_Profile): Likewise. Add missing guard. * sem_mech.adb (Set_Mechanisms): Likewise. * lib-xref.adb (Output_Import_Export_Info): Ditto for C_Variadic. * repinfo.adb (List_Subprogram_Info): Likewise. * sem_prag.adb (Set_Convention_From_Pragma): Move main checks for Stdcall to... (Process_Convention): ...here. Add checks for C_Variadic. * snames.ads-tmpl: Add Name_C_Variadic_0 .. Name_C_Variadic_16. Use consistent format for subtype declarations. (Convention_Id): Add Convention_C_Variadic_0 .. C_Variadic_16 and move Convention_CPP up. (Convention_C_Family): New subtype of Convention_Id. (Convention_C_Variadic): Likewise. (Foreign_Convention): Use explicit upper bound. Add pragma Inline for Is_Configuration_Pragma_Name, Is_Function_Attribute_Name, Is_Internal_Attribute_Name and Is_Procedure_Attribute_Name. * snames.adb-tmpl (Get_Convention_Id): Deal with Name_Variadic_n. (Get_Convention_Name): Deal with Convention_Variadic_n. * types.h (Convention_Id): New typedef. * xsnamest.adb (Name2): New variable. (Is_Conv): New pattern. (Get_Subt1): Likewise. (Get_Subt2): Likewise. Output subtypes of Convention_Id into the C header file.--- gcc/ada/exp_ch6.adb +++ gcc/ada/exp_ch6.adb @@ -7716,8 +7716,7 @@ package body Exp_Ch6 is -- Build_Inherit_Prims takes care of initializing these slots. 
elsif Is_Imported (Subp) - and then (Convention (Subp) = Convention_CPP - or else Convention (Subp) = Convention_C) + and then Convention (Subp) in Convention_C_Family then null; --- gcc/ada/freeze.adb +++ gcc/ada/freeze.adb @@ -3674,9 +3674,7 @@ package body Freeze is if Warn_On_Export_Import and then Comes_From_Source (E) - and then (Convention (E) = Convention_C - or else -Convention (E) = Convention_CPP) + and then Convention (E) in Convention_C_Family and then (Is_Imported (E) or else Is_Exported (E)) and then Convention (E) /= Convention (Formal) and then not Has_Warnings_Off (E) @@ -3823,9 +3821,8 @@ package body Freeze is -- Check suspicious return type for C function if Warn_On_Export_Import - and then (Convention (E) = Convention_C - or else -Convention (E) = Convention_CPP) + and then Comes_From_Source (E) + and then Convention (E) in Convention_C_Family and then (Is_Imported (E) or else Is_Exported (E)) then -- Check suspicious return of fat C pointer --- gcc/ada/lib-xref.adb +++ gcc/ada/lib-xref.adb @@ -1652,7 +1652,7 @@ package body Lib.Xref is begin -- Generate language name from convention - if Conv = Convention_C then + if Conv = Convention_C or else Conv in Convention_C_Variadic then Language_Name := Name_C; elsif Conv = Convention_CPP then --- gcc/ada/repinfo.adb +++ gcc/ada/repinfo.adb @@ -1935,6 +1935,21 @@ package body Repinfo is when Convention_C => Write_Str ("C"); + when Convention_C_Variadic => +declare + N : Nat := + Convention_Id'Pos (Convention (Ent)) - + Convention_Id'Pos (Convention_C_Variadic_0); +begin + Write_Str ("C_Variadic_"); + if N >= 10 then + Write_Char ('1'); + N := N - 10; + end if; + pragma Assert (N < 10); + Write_Char (Character'Val (Character'Pos ('0') + N)); +end; + when Convention_COBOL => Write_Str ("COBOL"); --- gcc/ada/sem_mech.adb +++ gcc/ada/sem_mech.adb @@ -181,11 +181,10 @@ package body Sem_Mech is -- C -- --- - -- Note: Assembler, C++, Stdcall also use C conventions + -- Note: Assembler and Stdcall also use C 
conventions
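A hedged sketch of the feature in use, importing C's printf: printf has one fixed parameter, so C_Variadic_1 matches its calling convention. (Importing the value-returning C function as an Ada procedure, thereby dropping the int result, is a deliberate simplification here.)

```ada
with Interfaces.C; use Interfaces.C;

procedure Demo is
   procedure Printf (Format : char_array; Value : int)
     with Import, Convention => C_Variadic_1, External_Name => "printf";
begin
   --  To_C appends the terminating NUL by default.
   Printf (To_C ("answer = %d" & ASCII.LF), 42);
end Demo;
```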
[Ada] Expand more others aggregates statically
Now that there is no artificial limit on Convert_To_Positional, we use the same heuristics as in Aggr_Size_OK, to share the code and to generate static aggregates in more cases. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * exp_aggr.adb (Max_Aggregate_Size): New function to factorize code. (Convert_To_Positional, Aggr_Size_OK): Use Max_Aggregate_Size.--- gcc/ada/exp_aggr.adb +++ gcc/ada/exp_aggr.adb @@ -289,19 +289,12 @@ package body Exp_Aggr is -- construct the allocated object on the heap. procedure Convert_To_Positional - (N: Node_Id; - Max_Others_Replicate : Nat := 32; - Handle_Bit_Packed: Boolean := False); + (N : Node_Id; + Handle_Bit_Packed : Boolean := False); -- If possible, convert named notation to positional notation. This -- conversion is possible only in some static cases. If the conversion is -- possible, then N is rewritten with the analyzed converted aggregate. - -- The parameter Max_Others_Replicate controls the maximum number of - -- values corresponding to an others choice that will be converted to - -- positional notation (the default of 32 is the normal limit, and reflects - -- the fact that normally the loop is better than a lot of separate - -- assignments). Note that this limit gets overridden in any case if - -- either of the restrictions No_Elaboration_Code or No_Implicit_Loops is - -- set. The parameter Handle_Bit_Packed is usually set False (since we do + -- The parameter Handle_Bit_Packed is usually set False (since we do -- not expect the back end to handle bit packed arrays, so the normal case -- of conversion is pointless), but in the special case of a call from -- Packed_Array_Aggregate_Handled, we set this parameter to True, since
+ function Max_Aggregate_Size + (Typ : Entity_Id; + Default_Size : Nat := 5000) return Nat; + -- Return the max size for a static aggregate for the given Typ. + -- Return Default_Size if no other special criteria triggers. + function Packed_Array_Aggregate_Handled (N : Node_Id) return Boolean; -- Given an array aggregate, this function handles the case of a packed -- array aggregate with all constant values, where the aggregate can be @@ -429,43 +428,15 @@ package body Exp_Aggr is -- Start of processing for Aggr_Size_OK begin - -- The normal aggregate limit is 50, but we increase this limit to - -- 2**24 (about 16 million) if Restrictions (No_Elaboration_Code) or - -- Restrictions (No_Implicit_Loops) is specified, since in either case - -- we are at risk of declaring the program illegal because of this - -- limit. We also increase the limit when Static_Elaboration_Desired, - -- given that this means that objects are intended to be placed in data - -- memory. - - -- We also increase the limit if the aggregate is for a packed two- - -- dimensional array, because if components are static it is much more - -- efficient to construct a one-dimensional equivalent array with static - -- components. - - -- Conversely, we decrease the maximum size if none of the above - -- requirements apply, and if the aggregate has a single component + -- We bump the maximum size unless the aggregate has a single component -- association, which will be more efficient if implemented with a loop. - -- Finally, we use a small limit in CodePeer mode where we favor loops - -- instead of thousands of single assignments (from large aggregates). 
- - Max_Aggr_Size := 50; - - if CodePeer_Mode then - Max_Aggr_Size := 100; - - elsif Restriction_Active (No_Elaboration_Code) -or else Restriction_Active (No_Implicit_Loops) -or else Is_Two_Dim_Packed_Array (Typ) -or else (Ekind (Current_Scope) = E_Package - and then Static_Elaboration_Desired (Current_Scope)) - then - Max_Aggr_Size := 2 ** 24; - - elsif No (Expressions (N)) + if No (Expressions (N)) and then No (Next (First (Component_Associations (N then - Max_Aggr_Size := 5000; + Max_Aggr_Size := Max_Aggregate_Size (Typ); + else + Max_Aggr_Size := Max_Aggregate_Size (Typ, 500_000); end if; Size := UI_From_Int (Component_Count (Component_Type (Typ))); @@ -4561,11 +4532,11 @@ package body Exp_Aggr is --- procedure Convert_To_Positional - (N: Node_Id; - Max_Others_Replicate : Nat := 32; - Handle_Bit_Packed: Boolean := False) + (N
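An illustration of the conversion these heuristics control: a named "others" aggregate rewritten into a positional, fully static one.

```ada
type Table is array (1 .. 8) of Integer;
T : constant Table := (1 => 5, others => 0);
--  When the aggregate fits within Max_Aggregate_Size, it is converted
--  to the positional form (5, 0, 0, 0, 0, 0, 0, 0) and can be emitted
--  as static data rather than elaborated with a loop.
```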
[Ada] Small cleanup in Einfo unit
This moves around the declarations of functions that are not in the right category, adds pragma Inline for others and fixes alphabetization issues. This also clearly splits the functions subject to pragma Inline from the procedures subject to the pragma, the former being read by XEINFO but not the latter, and adds a comment stating the distinction. No functional changes. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Eric Botcazou gcc/ada/ * einfo.ads (Has_Foreign_Convention): Fix description. (Component_Alignment): Move around. (Has_DIC): Likewise. (Has_Interrupt_Handler): Likewise. (Has_Invariants): Likewise. (Is_Atomic_Or_VFA): Likewise. (Next_Index): Likewise. (Scope_Depth): Likewise. (Init_Component_Size): Likewise. (Init_Component_Location): Likewise. (Init_Size): Likewise. (Inline Pragmas for functions): Add Corresponding_Function, Corresponding_Procedure, Entry_Max_Queue_Lengths_Array, Finalize_Storage_Only, Has_DIC, Has_Invariants, Initialization_Statements, Is_Anonymous_Access_Type, Next_Stored_Discriminant, Address_Clause, Alignment_Clause, Float_Rep, Has_Foreign_Convention, Has_Non_Limited_View, Is_Constant_Object, Is_Discriminal, Is_Finalizer, Is_Null_State, Is_Prival, Is_Protected_Component, Is_Protected_Record_Type, Is_Subprogram_Or_Entry, Is_Task_Record_Type, Size_Clause, Stream_Size_Clause, Type_High_Bound, Type_Low_Bound, Known_*, Unknown_*. (Inline Pragmas for procedures): Add Set_Corresponding_Function, Set_Corresponding_Procedure, Set_Finalize_Storage_Only, Set_Float_Rep, Set_Initialization_Statements, Init_Normalized_First_Bit, Init_Normalized_Position, Init_Normalized_Position_Max. * einfo.adb (Was_Hidden): Move around. (Is_Packed_Array): Likewise. (Model_Emin_Value): Likewise. (Model_Epsilon_Value): Likewise. (Model_Mantissa_Value): Likewise. 
(Model_Small_Value): Likewise.

--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -3615,6 +3615,11 @@ package body Einfo is
          return Flag238 (Id);
       end Warnings_Off_Used_Unreferenced;

+      function Was_Hidden (Id : E) return B is
+      begin
+         return Flag196 (Id);
+      end Was_Hidden;
+
       function Wrapped_Entity (Id : E) return E is
       begin
          pragma Assert (Ekind_In (Id, E_Function, E_Procedure)
@@ -3622,11 +3627,6 @@ package body Einfo is
          return Node27 (Id);
       end Wrapped_Entity;

-      function Was_Hidden (Id : E) return B is
-      begin
-         return Flag196 (Id);
-      end Was_Hidden;
-
    ------------------------------
    -- Classification Functions --
    ------------------------------

@@ -8168,15 +8168,6 @@ package body Einfo is
        Ekind (Id) = E_Abstract_State and then Nkind (Parent (Id)) = N_Null;
    end Is_Null_State;

-   ---------------------
-   -- Is_Packed_Array --
-   ---------------------
-
-   function Is_Packed_Array (Id : E) return B is
-   begin
-      return Is_Array_Type (Id) and then Is_Packed (Id);
-   end Is_Packed_Array;
-
    ----------------------------------
    -- Is_Package_Or_Generic_Package --
    ----------------------------------
@@ -8186,6 +8177,15 @@ package body Einfo is
       return Ekind_In (Id, E_Generic_Package, E_Package);
    end Is_Package_Or_Generic_Package;

+   ---------------------
+   -- Is_Packed_Array --
+   ---------------------
+
+   function Is_Packed_Array (Id : E) return B is
+   begin
+      return Is_Array_Type (Id) and then Is_Packed (Id);
+   end Is_Packed_Array;
+
    ---------------
    -- Is_Prival --
    ---------------
@@ -8404,44 +8404,6 @@ package body Einfo is
       Set_Next_Entity (First, Second);  --  First --> Second
    end Link_Entities;

-   ----------------------
-   -- Model_Emin_Value --
-   ----------------------
-
-   function Model_Emin_Value (Id : E) return Uint is
-   begin
-      return Machine_Emin_Value (Id);
-   end Model_Emin_Value;
-
-   -------------------------
-   -- Model_Epsilon_Value --
-   -------------------------
-
-   function Model_Epsilon_Value (Id : E) return Ureal is
-      Radix : constant Ureal := UR_From_Uint (Machine_Radix_Value (Id));
-   begin
-      return Radix ** (1 - Model_Mantissa_Value (Id));
-   end Model_Epsilon_Value;
-
-   --------------------------
-   -- Model_Mantissa_Value --
-   --------------------------
-
-   function Model_Mantissa_Value (Id : E) return Uint is
-   begin
-      return Machine_Mantissa_Value (Id);
-   end Model_Mantissa_Value;
-
-   -----------------------
-   -- Model_Small_Value --
-   -----------------------
-
-   function Model_Small_Value (Id : E) return Ureal is
-      Radix : constant Ureal := UR_From_Uint (Machine_Radix_Value (Id));
-   begin
-      return Radix ** (Model_Emin_Value (Id) - 1);
-   end Model_Small_Value;
-
    -- Machine_Emax_Value
[Ada] Code clean ups and comments updates
We identify all the places where error messages are issued in the expander and should be issued in semantic analysis instead, for future clean ups. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * exp_ch3.adb, exp_ch4.adb, exp_ch6.adb, exp_ch9.adb, exp_disp.adb, exp_util.adb: Add comments related to errors that should be moved to semantic analysis. Also replace "?" with "??" in warning messages.--- gcc/ada/exp_ch3.adb +++ gcc/ada/exp_ch3.adb @@ -4456,6 +4456,8 @@ package body Exp_Ch3 is procedure Check_Attr (Nam : Name_Id; TSS_Nam : TSS_Name_Type) is begin + -- Move this check to sem??? + if not Stream_Attribute_Available (Etype (Comp), TSS_Nam) then Error_Msg_Name_1 := Nam; Error_Msg_N @@ -9039,6 +9041,8 @@ package body Exp_Ch3 is end loop; pragma Assert (Present (Comp)); + + -- Move this check to sem??? Error_Msg_Node_2 := Comp; Error_Msg_NE ("parent type & with dynamic component & cannot be parent" --- gcc/ada/exp_ch4.adb +++ gcc/ada/exp_ch4.adb @@ -4486,7 +4486,7 @@ package body Exp_Ch4 is and then Nkind (Associated_Node_For_Itype (PtrT)) = N_Object_Declaration) then - Error_Msg_N ("?use of an anonymous access type allocator", N); + Error_Msg_N ("??use of an anonymous access type allocator", N); end if; -- RM E.2.2(17). 
We enforce that the expected type of an allocator --- gcc/ada/exp_ch6.adb +++ gcc/ada/exp_ch6.adb @@ -1960,7 +1960,7 @@ package body Exp_Ch6 is then if Comes_From_Source (N) then Error_Msg_N - ("?atomic actual passed by copy (RM C.6(19))", Actual); + ("??atomic actual passed by copy (RM C.6(19))", Actual); end if; return True; end if; @@ -1971,7 +1971,7 @@ package body Exp_Ch6 is then if Comes_From_Source (N) then Error_Msg_N - ("?volatile actual passed by copy (RM C.6(19))", Actual); + ("??volatile actual passed by copy (RM C.6(19))", Actual); end if; return True; end if; @@ -3921,6 +3921,8 @@ package body Exp_Ch6 is if Present (Ass) and then Is_Class_Wide_Type (Etype (Name (Ass))) then + -- Move the error messages below to sem??? + if Is_Access_Type (Etype (Call_Node)) then if Designated_Type (Etype (Call_Node)) /= Root_Type (Etype (Name (Ass))) @@ -4115,6 +4117,8 @@ package body Exp_Ch6 is Set_Entity (Name (Call_Node), Parent_Subp); + -- Move this check to sem??? + if Is_Abstract_Subprogram (Parent_Subp) and then not In_Instance then --- gcc/ada/exp_ch9.adb +++ gcc/ada/exp_ch9.adb @@ -11168,6 +11168,7 @@ package body Exp_Ch9 is then null; else + -- Move this check to sem??? Error_Msg_NE ( "& is not a time type (RM 9.6(6))", Expression (Delay_Statement (Alt)), Time_Type); --- gcc/ada/exp_disp.adb +++ gcc/ada/exp_disp.adb @@ -3836,6 +3836,7 @@ package body Exp_Disp is -- tagged type, when one of its primitive operations has a type in its -- profile whose full view has not been analyzed yet. More complex cases -- involve composite types that have one private unfrozen subcomponent. + -- Move this check to sem??? procedure Export_DT (Typ : Entity_Id; DT : Entity_Id; Index : Nat := 0); -- Export the dispatch table DT of tagged type Typ. 
Required to generate @@ -8150,6 +8151,7 @@ package body Exp_Disp is -- We exclude Input and Output stream operations because -- Limited_Controlled inherits useless Input and Output stream -- operations from Root_Controlled, which can never be overridden. +-- Move this check to sem??? if not Is_TSS (Prim, TSS_Stream_Input) and then --- gcc/ada/exp_util.adb +++ gcc/ada/exp_util.adb @@ -1296,6 +1296,7 @@ package body Exp_Util is -- of the type. In the case of an inherited condition for an -- overriding operation, both the operation and the function -- are given by primitive wrappers. + -- Move this check to sem??? if Ekind (New_E) = E_Function and then Is_Primitive_Wrapper (New_E) @@ -1326,6 +1327,7 @@ package body Exp_Util is -- Check that there are no calls left to abstract operations if -- the current subprogram is not abstract. +
[Ada] Missing check on private overriding of dispatching primitive
When a dispatching operation overrides an inherited subprogram, it must be subtype conformant with the inherited subprogram. Such check is not performed by the frontend on overriding primitives of private types which are declared before the full type declaration of the private type. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Javier Miranda gcc/ada/ * sem_ch6.adb (New_Overloaded_Entity): Add missing call to check subtype conformance of overriding dispatching primitive. * sem_eval.adb (Subtypes_Statically_Match): Handle derivations of private subtypes. * libgnat/g-exptty.adb, libgnat/g-exptty.ads (Set_Up_Communications): Fix the profile since null-exclusion is missing in the access type formals. * sem_disp.ads (Check_Operation_From_Private_View): Adding documentation.--- gcc/ada/libgnat/g-exptty.adb +++ gcc/ada/libgnat/g-exptty.adb @@ -314,9 +314,9 @@ package body GNAT.Expect.TTY is overriding procedure Set_Up_Communications (Pid: in out TTY_Process_Descriptor; Err_To_Out : Boolean; - Pipe1 : access Pipe_Type; - Pipe2 : access Pipe_Type; - Pipe3 : access Pipe_Type) + Pipe1 : not null access Pipe_Type; + Pipe2 : not null access Pipe_Type; + Pipe3 : not null access Pipe_Type) is pragma Unreferenced (Err_To_Out, Pipe1, Pipe2, Pipe3); --- gcc/ada/libgnat/g-exptty.ads +++ gcc/ada/libgnat/g-exptty.ads @@ -116,9 +116,9 @@ private procedure Set_Up_Communications (Pid: in out TTY_Process_Descriptor; Err_To_Out : Boolean; - Pipe1 : access Pipe_Type; - Pipe2 : access Pipe_Type; - Pipe3 : access Pipe_Type); + Pipe1 : not null access Pipe_Type; + Pipe2 : not null access Pipe_Type; + Pipe3 : not null access Pipe_Type); procedure Set_Up_Parent_Communications (Pid : in out TTY_Process_Descriptor; --- gcc/ada/sem_ch6.adb +++ gcc/ada/sem_ch6.adb @@ -11177,6 +11177,18 @@ package body Sem_Ch6 is Inherit_Subprogram_Contract (E, S); end if; + -- When a dispatching operation overrides an inherited + -- subprogram, it shall be subtype conformant with the + -- inherited 
subprogram (RM 3.9.2 (10.2)). + + if Comes_From_Source (E) +and then Is_Dispatching_Operation (E) +and then Find_Dispatching_Type (S) + = Find_Dispatching_Type (E) + then + Check_Subtype_Conformant (E, S); + end if; + if Comes_From_Source (E) then Check_Overriding_Indicator (E, S, Is_Primitive => False); --- gcc/ada/sem_disp.ads +++ gcc/ada/sem_disp.ads @@ -64,11 +64,11 @@ package Sem_Disp is -- this call actually do??? procedure Check_Operation_From_Private_View (Subp, Old_Subp : Entity_Id); - -- Add Old_Subp to the list of primitive operations of the corresponding - -- tagged type if it is the full view of a private tagged type. The Alias - -- of Old_Subp is adjusted to point to the inherited procedure of the - -- full view because it is always this one which has to be called. - -- What is Subp used for??? + -- No action performed if Subp is not an alias of a dispatching operation. + -- Add Old_Subp (if not already present) to the list of primitives of the + -- tagged type T of Subp if T is the full view of a private tagged type. + -- The Alias of Old_Subp is adjusted to point to the inherited procedure + -- of the full view because it is always this one which has to be called. function Covered_Interface_Op (Prim : Entity_Id) return Entity_Id; -- Returns the interface primitive that Prim covers, when its controlling --- gcc/ada/sem_eval.adb +++ gcc/ada/sem_eval.adb @@ -6092,6 +6092,29 @@ package body Sem_Eval is elsif Has_Discriminants (T1) or else Has_Discriminants (T2) then + -- Handle derivations of private subtypes. For example S1 statically + -- matches the full view of T1 in the following example: + + -- type T1(<>) is new Root with private; + -- subtype S1 is new T1; + -- overriding proc P1 (P : S1); + --private + -- type T1 (D : Disc) is new Root with ... 
+ + if Ekind (T2) = E_Record_Subtype_With_Private + and then not Has_Discriminants (T2) + and then Partial_View_Has_Unknown_Discr (T1) + and then Etype (T2) = T1 + then +return True; + + elsif Ekind (T1) = E_Record_Subtype_With_Private + and then not Has_Discriminants (T1) + and then Partial_View_Has_Unknown_Discr (T2) + and then Etype (T1) = T2 + then +return True; + --
[Ada] Propagate DIC, Invariant and Predicate attributes to views
This change is a step towards propagating more consistently the DIC, Invariant and Predicate attributes between different views of the same type, in particular to the newly built underlying full views. It also cleans up the handling of the base types for the DIC and Invariant attributes, which looks obsolete since everything related to DIC and Invariant is already done on the base types, so doing it again explicitly on them is superfluous. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Eric Botcazou gcc/ada/ * checks.adb (Apply_Predicate_Check): Extend trick used for aggregates to qualified aggregates and object declarations * einfo.ads (Has_Own_DIC): Mention the underlying full view. (Has_Own_Invariants): Likewise. (Has_Predicates): Likewise. * exp_util.adb (Build_DIC_Procedure_Declaration): Do not deal with base types explicitly but with underlying full views. (Build_Invariant_Procedure_Declaration): Likewise. * sem_ch13.adb (Build_Predicate_Functions): Do not deal with the full view manually but call Propagate_Predicate_Attributes to propagate attributes to views. (Build_Predicate_Function_Declaration): Likewise. * sem_ch3.adb (Build_Assertion_Bodies_For_Type): Build bodies for private full views with an underlying full view. (Build_Derived_Private_Type): Small comment tweak. (Complete_Private_Subtype): Call Propagate_Predicate_Attributes. (Process_Full_View): Do not deal with base types explicitly for DIC and Invariant attributes. Deal with underlying full views for them. Call Propagate_Predicate_Attributes and deal with underlying full views for them. * sem_ch7.adb (Preserve_Full_Attributes): Do not cross propagate DIC and Invariant attributes between full type and its base type. Propagate Predicate attributes from the full to the private view. * sem_ch9.adb (Analyze_Protected_Type_Declaration): Likewise. (Analyze_Task_Type_Declaration): Likewise. * sem_util.ads (Get_Views): Remove Full_Base parameter and add UFull_Typ parameter. 
(Propagate_Predicate_Attributes): New procedure. * sem_util.adb (Get_Views): Remove Full_Base parameter and add UFull_Typ parameter. Retrieve the Corresponding_Record_Type from the underlying full view, if any. (Propagate_DIC_Attributes): Remove useless tests. (Propagate_Invariant_Attributes): Likewise. (Propagate_Predicate_Attributes): New procedure.--- gcc/ada/checks.adb +++ gcc/ada/checks.adb @@ -2711,7 +2711,8 @@ package body Checks is Typ : Entity_Id; Fun : Entity_Id := Empty) is - S : Entity_Id; + Par : Node_Id; + S : Entity_Id; begin if Predicate_Checks_Suppressed (Empty) then @@ -2807,6 +2808,11 @@ package body Checks is return; end if; +Par := Parent (N); +if Nkind (Par) = N_Qualified_Expression then + Par := Parent (Par); +end if; + -- For an entity of the type, generate a call to the predicate -- function, unless its type is an actual subtype, which is not -- visible outside of the enclosing subprogram. @@ -2818,24 +2824,36 @@ package body Checks is Make_Predicate_Check (Typ, New_Occurrence_Of (Entity (N), Sloc (N; --- If the expression is not an entity it may have side effects, --- and the following call will create an object declaration for --- it. We disable checks during its analysis, to prevent an --- infinite recursion. - --- If the prefix is an aggregate in an assignment, apply the --- check to the LHS after assignment, rather than create a +-- If the expression is an aggregate in an assignment, apply the +-- check to the LHS after the assignment, rather than create a -- redundant temporary. This is only necessary in rare cases -- of array types (including strings) initialized with an -- aggregate with an "others" clause, either coming from source -- or generated by an Initialize_Scalars pragma. 
-         elsif Nkind (N) = N_Aggregate
-           and then Nkind (Parent (N)) = N_Assignment_Statement
+         elsif Nkind_In (N, N_Aggregate, N_Extension_Aggregate)
+           and then Nkind (Par) = N_Assignment_Statement
          then
-            Insert_Action_After (Parent (N),
+            Insert_Action_After (Par,
               Make_Predicate_Check
-                (Typ, Duplicate_Subexpr (Name (Parent (N))));
+                (Typ, Duplicate_Subexpr (Name (Par))));
+
+         --  Similarly, if the expression is an aggregate in an object
+         --  decl
[Ada] Fix wrong type being used for range check generation
Before this commit, GNAT would use the type of the expression rather than that of the subtype mark in order to decide whether it should generate range checks or not. This means that in some cases, GNAT would decide that no range checks were needed, which was wrong. This issue went mostly unnoticed because the expansion routines use their own logic in order to determine whether range checks should be generated or not. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Ghjuvan Lacambre gcc/ada/ * sem_res.adb (Resolve_Qualified_Expression): Use Subtype_Mark type.--- gcc/ada/sem_res.adb +++ gcc/ada/sem_res.adb @@ -10004,8 +10004,10 @@ package body Sem_Res is -- check may convert an illegal static expression and result in warning -- rather than giving an error (e.g Integer'(Integer'Last + 1)). - if Nkind (N) = N_Qualified_Expression and then Is_Scalar_Type (Typ) then - Apply_Scalar_Range_Check (Expr, Typ); + if Nkind (N) = N_Qualified_Expression +and then Is_Scalar_Type (Target_Typ) + then + Apply_Scalar_Range_Check (Expr, Target_Typ); end if; -- AI12-0100: Once the qualified expression is resolved, check whether
[Ada] Ada2020 AI12-0282: Shared variable control aspects in generics
This patch further refines the checks and the corresponding error messages on the legality of instantiations where some formal types and their corresponding actuals may carry Atomic, Volatile, etc. aspect specifications. The legality rules stated in RM C.6 (12/5) are in fact backward-incompatible because they violate a basic language design rule, namely that limited private formal types impose no constraint on the corresponding actuals. Pending further ARG discussions, the current code reverts the checking for an exact match between specified aspects to formal derived types only. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Ed Schonberg gcc/ada/ * sem_ch12.adb (Check_Shared_Variable_Control_Aspects): Require exact match between formal and actual for aspects Atomic and Volatile only for formal derived types.--- gcc/ada/sem_ch12.adb +++ gcc/ada/sem_ch12.adb @@ -12394,19 +12394,38 @@ package body Sem_Ch12 is -- Ada 2020: Verify that shared variable control aspects (RM C.6) -- that may be specified for the formal are obeyed by the actual. + -- If the formal is a derived type the aspect specifications must match. + -- NOTE: AI12-0282 implies that matching of aspects is required between + -- formal and actual in all cases, but this is too restrictive. + -- In particular it violates a language design rule: a limited private + -- indefinite formal can be matched by any actual. The current code + -- reflects an older and more permissive version of RM C.6 (12/5). 
procedure Check_Shared_Variable_Control_Aspects is begin if Ada_Version >= Ada_2020 then -if Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T) then +if Is_Atomic (A_Gen_T) and then not Is_Atomic (Act_T) then + Error_Msg_NE + ("actual for& must have Atomic aspect", Actual, A_Gen_T); + +elsif Is_Derived_Type (A_Gen_T) + and then Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T) +then Error_Msg_NE ("actual for& has different Atomic aspect", Actual, A_Gen_T); end if; -if Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T) then +if Is_Volatile (A_Gen_T) and then not Is_Volatile (Act_T) then Error_Msg_NE ("actual for& has different Volatile aspect", Actual, A_Gen_T); + +elsif Is_Derived_Type (A_Gen_T) + and then Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T) +then + Error_Msg_NE + ("actual for& has different Volatile aspect", + Actual, A_Gen_T); end if; -- We assume that an array type whose atomic component type
[Ada] Remove kludge for AI05-0087
This is a code clean up as part of removing all calls to Error_Msg* in the expander. Tested on x86_64-pc-linux-gnu, committed on trunk 2020-06-09 Arnaud Charlet gcc/ada/ * exp_ch5.adb (Expand_N_Assignment): Remove kludge for AI05-0087. * sem_ch12.adb (Validate_Derived_Type_Instance): Implement AI05-0087 retroactively since it's a binding interpretation.--- gcc/ada/exp_ch5.adb +++ gcc/ada/exp_ch5.adb @@ -29,7 +29,6 @@ with Checks; use Checks; with Debug;use Debug; with Einfo;use Einfo; with Elists; use Elists; -with Errout; use Errout; with Exp_Aggr; use Exp_Aggr; with Exp_Ch6; use Exp_Ch6; with Exp_Ch7; use Exp_Ch7; @@ -2664,25 +2663,13 @@ package body Exp_Ch5 is and then not Restriction_Active (No_Dispatching_Calls)) then - if Is_Limited_Type (Typ) then - - -- This can happen in an instance when the formal is an - -- extension of a limited interface, and the actual is - -- limited. This is an error according to AI05-0087, but - -- is not caught at the point of instantiation in earlier - -- versions. We also must verify that the limited type does - -- not come from source as corner cases may exist where - -- an assignment was not intended like the pathological case - -- of a raise expression within a return statement. - - -- This is wrong, error messages cannot be issued during - -- expansion, since they would be missed in -gnatc mode ??? - - if Comes_From_Source (N) then - Error_Msg_N - ("assignment not available on limited type", N); - end if; + -- We should normally not encounter any limited type here, + -- except in the corner case where an assignment was not + -- intended like the pathological case of a raise expression + -- within a return statement. + if Is_Limited_Type (Typ) then + pragma Assert (not Comes_From_Source (N)); return; end if; --- gcc/ada/sem_ch12.adb +++ gcc/ada/sem_ch12.adb @@ -13460,17 +13460,8 @@ package body Sem_Ch12 is -- explicitly so. If not declared limited, the actual cannot be -- limited (see AI05-0087). 
- -- Even though this AI is a binding interpretation, we enable the - -- check only in Ada 2012 mode, because this improper construct - -- shows up in user code and in existing B-tests. - - if Is_Limited_Type (Act_T) - and then not Is_Limited_Type (A_Gen_T) - and then Ada_Version >= Ada_2012 - then -if In_Instance then - null; -else + if Is_Limited_Type (Act_T) and then not Is_Limited_Type (A_Gen_T) then +if not In_Instance then Error_Msg_NE ("actual for non-limited & cannot be a limited type", Actual, Gen_T); @@ -13479,30 +13470,25 @@ package body Sem_Ch12 is end if; end if; - -- Don't check Ada_Version here (for now) because AI12-0036 is - -- a binding interpretation; this decision may be reversed if - -- the situation turns out to be similar to that of the preceding - -- Is_Limited_Type test (see preceding comment). + -- Check for AI12-0036 declare Formal_Is_Private_Extension : constant Boolean := Nkind (Parent (A_Gen_T)) = N_Private_Extension_Declaration; Actual_Is_Tagged : constant Boolean := Is_Tagged_Type (Act_T); + begin if Actual_Is_Tagged /= Formal_Is_Private_Extension then - if In_Instance then - null; - else + if not In_Instance then if Actual_Is_Tagged then Error_Msg_NE - ("actual for & cannot be a tagged type", -Actual, Gen_T); + ("actual for & cannot be a tagged type", Actual, Gen_T); else Error_Msg_NE - ("actual for & must be a tagged type", -Actual, Gen_T); + ("actual for & must be a tagged type", Actual, Gen_T); end if; + Abandon_Instantiation (Actual); end if; end if;
Re: [patch] Make memory copy functions scalar storage order barriers
On Mon, Jun 8, 2020 at 10:37 AM Eric Botcazou wrote: > > > OK, but all I was saying is that looking at the pointed-to type of > > the address argument to the memcpy looks fragile to me. For the > > case cited above it would be better to look at the actual > > reference we are looking up and not allowing memcpy handling > > when that reference contains a storage order barrier? Like > > a && !contains_storage_order_barrier_p (vr->operands) on the whole block > > like we do for the assign from constant/SSA value case? > > The memcpy is the storage order barrier though, there is no other, so I'm not > sure what contains_storage_order_barrier_p should be called on. Or do you > mean that an operand should be created for the memcpy with "reverse" set? I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER works and what the constraints are. But if the memcpy is the storage order barrier then what cases are OK? - both source and destination objects are [only] accessed with the same storage order - both source and destination objects are [only] accessed with !TYPE_REVERSE_STORAGE_ORDER "containing" accesses and what determines the "object state" with respect to storage order when we see a memcpy call? I had the impression (from the other existing cases of contains_storage_order_barrier_p () calls) that we derive the "object state" from the downstream access (vr->operands), in this case the memcpy destination access. But see below for maybe the easier case (memcpy folding). > > But the above cited case is the _only_ case we're possibly building an > > assignment of a variable - so it's enough to assert that in the above case > > destvar is not TYPE_REVERSE_STORAGE_ORDER? > > What about > > /* If we can perform the copy efficiently with first doing all loads > and then all stores inline it that way. Currently efficiently > means that we can load all the memory into a single integer > register which is what MOVE_MAX gives us. 
*/ > src_align = get_pointer_alignment (src); > > It's also a scalar copy, so it won't work either. In my understanding of how TYPE_REVERSE_STORAGE_ORDER works, a memcpy, irrespective of its actual "implementation", should not break anything since it transfers bytes. In particular an 8-byte TYPE_REVERSE_STORAGE_ORDER object can be transferred ("memcpy"ed) with (ignoring TBAA) int tema = *(int *)src; *(int *)dst = tema; int temb = *((int *)src + 1); *((int *)dst + 1) = temb; and what byte order the int accesses are performed in is irrelevant as long as read and write use the same byte order. > > The other cases all build an aggregate copy assignment with > > a non-reverse-storage-order char[size] type which should be OK, no? > > Yes, theoretically, but SRA rewrites the access into a scalar access and we're > back to square one (I tried). But what's the issue with this? > , so this can possibly work only if the array type > is built with TYPE_REVERSE_STORAGE_ORDER too, which implies that the source is > in reverse order too. So the only case that could be handled correctly is the > memcpy between two aggregate types with TYPE_REVERSE_STORAGE_ORDER. Why? Is the following an invalid program? void mymemcpy (void *a, void *b) { memcpy (a, b, 8); } void foo() { int b[2] __attribute__(reverse_storage); int a[2]; mymemcpy (a, b); return a[0]; } how can GCC infer, inside mymemcpy, that b is reverse storage order? Thus, if all this is a problem then I think we're papering over the issue by not folding memcpy. 
Thus maybe amend > > > > /* Make sure we are not copying using a floating-point mode or > > a type whose size possibly does not match its precision. */ > > if (FLOAT_MODE_P (TYPE_MODE (desttype)) > > > > || TREE_CODE (desttype) == BOOLEAN_TYPE > > || TREE_CODE (desttype) == ENUMERAL_TYPE) > > > > desttype = bitwise_type_for_mode (TYPE_MODE (desttype)); > > if (FLOAT_MODE_P (TYPE_MODE (srctype) ) > > > > || TREE_CODE (srctype) == BOOLEAN_TYPE > > || TREE_CODE (srctype) == ENUMERAL_TYPE) > > > > srctype = bitwise_type_for_mode (TYPE_MODE (srctype)); > > > > with || TYPE_REVERSE_STORAGE_ORDER (srctype) (same for desttype)? > > We don't have TYPE_REVERSE_STORAGE_ORDER qualified types, the flag is only set > on aggregate types and bitwise_type_for_mode will return an integer type. Exactly. When the original src/desttype turns out to be a RECORD_TYPE (IIRC that can happen) with TYPE_REVERSE_STORAGE_ORDER make sure we're using a bitwise_type_for_mode. Because then the copying w
Re: Fix ICE with typeless storage
On Fri, Jun 5, 2020 at 11:08 AM Jan Hubicka wrote: > > Hi, > this patch fixes an ICE while building Firefox on an assert in > record_component_aliases which checks that the alias set did not change > while peeling off vector and array types. > This is done so we can notice references to pointers. > > If the array is TYPE_TYPELESS_STORAGE then it is not true that the array type > would have the same alias set as its elements, since its alias set is 0, so we > need to watch for this. (And it is safe to miss the pointer here once we add > alias set 0 as a component) OK. > Honza > > * alias.c (record_component_aliases): Watch for typeless storage while > skipping the ARRAY_TYPE wrappers. > > diff --git a/gcc/alias.c b/gcc/alias.c > index 49bd7b37966..2bed5e78c62 100644 > --- a/gcc/alias.c > +++ b/gcc/alias.c > @@ -1273,8 +1273,12 @@ record_component_aliases (tree type, alias_set_type > superset) > { > /* VECTOR_TYPE and ARRAY_TYPE share the alias set with their > element type and that type has to be normalized to void > *, > -too, in the case it is a pointer. */ > - while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t)) > +too, in the case it is a pointer. > +An exception is array with TYPE_TYPELESS_STORAGE which > +has alias set 0. */ > + while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t) > +&& (!AGGREGATE_TYPE_P (t) > +|| !TYPE_TYPELESS_STORAGE (t))) > { > gcc_checking_assert (TYPE_STRUCTURAL_EQUALITY_P (t)); > t = TREE_TYPE (t);
[PATCH/RFC] How to fix PR95440
Hi C++ folks,

There are a number of places in the coroutine codegen where we need to build calls to promise methods. Such methods can, in several cases, also be constexpr or static (or both), which leads to the PR. Now the mechanism I use is this:

1/ look up the method in the relevant scope: tree method = lookup_member (promise, member_id, /*protect=*/1, /*want_type=*/0, tf_warning_or_error); - this returns a BASELINK tree.

2/ In many cases, the calls are to methods with no parms, so, with a dummy object, I do: tree call = build_new_method_call (dummy, method, NULL, NULL_TREE, LOOKUP_NORMAL, NULL, flags); ‘args’ is the 3rd parm.

3/ The header comment on build_new_method_call says: /* Build a call to "INSTANCE.FN (ARGS)". If FN_P is non-NULL, it will be set, upon return, to the function called. ARGS may be NULL. This may change ARGS. */ Which says that the 3rd arg can validly be NULL (as per my example).

4/ However, inside build_new_method_call, ‘args' gets assigned to ‘user_args’, which is used in a lot of places without any guard to see if it’s non-null. One of those places is passing it to add_candidates (), which eventually, in the case of a static method, uses the pointer and ICEs.

So… A/ I’m doing something wrong, and my use of the APIs is violating an expectation (in which case, what’s the right API usage?) B/ It happens that other callers of build_new_method_call () have either had dummy args or not produced the same set of conditions (for my case of static promise methods):

* if I provide an empty dummy ‘args' to build_new_method_call it works
* if I fix the unguarded use of ‘args' in add_candidates (as below) it works

AFAICT, the intent is to avoid every caller of build_new_method_call () having to build a dummy argument list when that list is empty - it would seem irritating to have to add one for every use in coros - but I can. WDYT?
thanks Iain the fix for call.c: diff --git a/gcc/cp/call.c b/gcc/cp/call.c index 3c97b9846e2..dccc1bf596f 100644 --- a/gcc/cp/call.c +++ b/gcc/cp/call.c @@ -5862,7 +5862,7 @@ add_candidates (tree fns, tree first_arg, const vec *args, } /* Don't bother reversing an operator with two identical parameters. */ - else if (args->length () == 2 && (flags & LOOKUP_REVERSED)) + else if (args && args->length () == 2 && (flags & LOOKUP_REVERSED)) { tree parmlist = TYPE_ARG_TYPES (TREE_TYPE (fn)); if (same_type_p (TREE_VALUE (parmlist),
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
On 6/9/20 9:48 AM, Marco Elver wrote: v2: * Add Optimization keyword to -param=tsan-distinguish-volatile= as the parameter can be different per TU. * Add tree-dump check to test. Hello. I support the patch. @Jakub: Can you please approve the patch? Thanks, Martin
[PATCH PR95569] ICE in tmmark:verify_ssa failed
Hi, Commit eb72dc663e9070b281be83a80f6f838a3a878822 introduces an ICE on AArch64. See the discussion on the bug for details. Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95569 A trivial patch fixing the ICE is attached. Bootstrapped and tested on the aarch64 platform. No new regressions witnessed. OK for trunk? 2020-06-09 Qian Chao gcc/: PR tree-optimization/95569 * trans-mem.c (expand_assign_tm): Ensure that rtmp is marked TREE_ADDRESSABLE. * testsuite/gcc.dg/tm/pr51472.c: New test. PR95569-v0.patch
Re: [PATCH v2] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]
Ping? Maybe I could mention that LLVM emits a warning in this case (https://reviews.llvm.org/D28820). Thanks, Christophe On Wed, 3 Jun 2020 at 15:23, Christophe Lyon wrote: > > Ping? > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html > > On Wed, 27 May 2020 at 13:52, Christophe Lyon > wrote: > > > > Ping? > > > > On Thu, 14 May 2020 at 16:57, Christophe Lyon > > wrote: > > > > > > The interrupt attribute does not guarantee that the FP registers are > > > saved, which can result in problems that are difficult to debug. > > > > > > Saving the FP registers and status registers can be a large penalty, > > > so it's probably not desirable to do that all the time. > > > > > > If the handler calls other functions, we'd likely need to save all of > > > them, for lack of knowledge of which registers they actually clobber. > > > > > > This is even more obscure for the end-user when the compiler inserts > > > calls to helper functions such as memcpy (some multilibs do use FP > > > registers to speed it up). > > > > > > In the PR, we discussed adding routines in libgcc to save the FP > > > context and saving only locally-clobbered FP registers, but this seems > > > to be too much work for the purpose, given that in general such > > > handlers try to avoid this kind of penalty. > > > I suspect we would also want new attributes to instruct the compiler > > > that saving the FP context is not needed. > > > > > > In the meantime, emit a warning to suggest re-compiling with > > > -mgeneral-regs-only. Note that this can lead to errors if the code > > > uses floating-point and -mfloat-abi=hard, e.g.: > > > argument of type 'double' not permitted with -mgeneral-regs-only > > > > > > This can be troublesome for the user, but at least this would make > > > him aware of the latent issue. > > > > > > The patch adds several testcases: > > > > > > - pr94743-1-hard.c checks that a warning is emitted when using > > > -mfloat-abi=hard. 
Function IRQ_HDLR_Test can make implicit calls to > > > runtime floating-point routines (or direct use of FP instructions), > > > IRQ_HDLR_Test2 doesn't. We emit a warning in both cases, though. > > > > > > - pr94743-1-softfp.c: same as above with -mfloat-abi=softfp. > > > > > > - pr94743-1-soft.c checks that no warning is emitted when using > > > -mfloat-abi=soft with the same code as above. > > > > > > - pr94743-2.c checks that no warning is emitted when using > > > -mgeneral-regs-only. > > > > > > - pr94743-3.c checks that no warning is emitted when using > > > -mgeneral-regs-only even when using floating-point data. > > > > > > 2020-05-14 Christophe Lyon > > > > > > PR target/94743 > > > gcc/ > > > * config/arm/arm.c (arm_handle_isr_attribute): Warn if > > > -mgeneral-regs-only is not used. > > > > > > gcc/testsuite/ > > > * gcc.misc-tests/arm-isr.c: Add -mgeneral-regs-only. > > > * gcc.target/arm/empty_fiq_handler.c: Add -mgeneral-regs-only. > > > * gcc.target/arm/interrupt-1.c: Add -mgeneral-regs-only. > > > * gcc.target/arm/interrupt-2.c: Add -mgeneral-regs-only. > > > * gcc.target/arm/pr70830.c: Add -mgeneral-regs-only. > > > * gcc.target/arm/pr94743-1-hard.c: New test. > > > * gcc.target/arm/pr94743-1-soft.c: New test. > > > * gcc.target/arm/pr94743-1-softfp.c: New test. > > > * gcc.target/arm/pr94743-2.c: New test. > > > * gcc.target/arm/pr94743-3.c: New test. 
> > > --- > > > gcc/config/arm/arm.c | 5 > > > gcc/testsuite/gcc.misc-tests/arm-isr.c | 2 ++ > > > gcc/testsuite/gcc.target/arm/empty_fiq_handler.c | 1 + > > > gcc/testsuite/gcc.target/arm/interrupt-1.c | 2 +- > > > gcc/testsuite/gcc.target/arm/interrupt-2.c | 2 +- > > > gcc/testsuite/gcc.target/arm/pr70830.c | 2 +- > > > gcc/testsuite/gcc.target/arm/pr94743-1-hard.c| 29 > > > > > > gcc/testsuite/gcc.target/arm/pr94743-1-soft.c| 27 > > > ++ > > > gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c | 29 > > > > > > gcc/testsuite/gcc.target/arm/pr94743-2.c | 22 ++ > > > gcc/testsuite/gcc.target/arm/pr94743-3.c | 23 +++ > > > 11 files changed, 141 insertions(+), 3 deletions(-) > > > create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-hard.c > > > create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-soft.c > > > create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c > > > create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c > > > create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-3.c > > > > > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c > > > index dda8771..c88de3e 100644 > > > --- a/gcc/config/arm/arm.c > > > +++ b/gcc/config/arm/arm.c > > > @@ -7232,6 +7232,11 @@ arm_handle_isr_attribute (tree *node, tree name, > > > tree args, int flags, > > >
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
On Tue, Jun 09, 2020 at 09:48:34AM +0200, Marco Elver wrote: > gcc/ > * params.opt: Define --param=tsan-distinguish-volatile=[0,1]. > * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new > builtin for volatile instrumentation of reads/writes. > (BUILT_IN_TSAN_VOLATILE_READ2): Likewise. > (BUILT_IN_TSAN_VOLATILE_READ4): Likewise. > (BUILT_IN_TSAN_VOLATILE_READ8): Likewise. > (BUILT_IN_TSAN_VOLATILE_READ16): Likewise. > (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise. > (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise. > (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise. > (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise. > (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise. > * tsan.c (get_memory_access_decl): Argument if access is > volatile. If param tsan-distinguish-volatile is non-zero, and > access if volatile, return volatile instrumentation decl. > (instrument_expr): Check if access is volatile. > > gcc/testsuite/ > * c-c++-common/tsan/volatile.c: New test. In general looks ok, just some minor nits. > --- a/gcc/params.opt > +++ b/gcc/params.opt > @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best > edge is less than this th > Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization > Set the maximum number of instructions executed in parallel in reassociated > tree. If 0, use the target dependent heuristic. > > +-param=tsan-distinguish-volatile= > +Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, > 1) Param Optimization Do we need/want Optimization here? Optimization means the option is per-function, but to me whether you want to distinguish volatiles or not seems to be a global decision for the whole project. > +Emit special instrumentation for accesses to volatiles. 
> + > -param=uninit-control-dep-attempts= > Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) > IntegerRange(1, 65536) Param Optimization > Maximum number of nested calls to search for control dependencies during > uninitialized variable analysis. > --- a/gcc/sanitizer.def > +++ b/gcc/sanitizer.def > @@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, > "__tsan_read_range", > DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range", > BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST) > > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, > "__tsan_volatile_read16", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, > "__tsan_volatile_write1", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, > "__tsan_volatile_write2", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE4, > "__tsan_volatile_write4", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE8, > "__tsan_volatile_write8", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > "__tsan_volatile_write16", > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) This last entry is already too long (line limit 80 chars), so should be wrapped like: DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, "__tsan_volatile_write16", BT_FN_VOID_PTR, 
ATTR_NOTHROW_LEAF_LIST) instead. > --- a/gcc/tsan.c > +++ b/gcc/tsan.c > @@ -52,25 +52,41 @@ along with GCC; see the file COPYING3. If not see > void __tsan_read/writeX (void *addr); */ > > static tree > -get_memory_access_decl (bool is_write, unsigned size) > +get_memory_access_decl (bool is_write, unsigned size, bool volatilep) > { >enum built_in_function fcode; > > - if (size <= 1) > -fcode = is_write ? BUILT_IN_TSAN_WRITE1 > - : BUILT_IN_TSAN_READ1; > - else if (size <= 3) > -fcode = is_write ? BUILT_IN_TSAN_WRITE2 > - : BUILT_IN_TSAN_READ2; > - else if (size <= 7) > -fcode = is_write ? BUILT_IN_TSAN_WRITE4 > - : BUILT_IN_TSAN_READ4; > - else if (size <= 15) > -fcode = is_write ? BUILT_IN_TSAN_WRITE8 > - : BUILT_IN_TSAN_READ8; > + if (param_tsan_distinguish_volatile && vo
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
On Tue, 9 Jun 2020 at 11:50, Jakub Jelinek wrote: > > On Tue, Jun 09, 2020 at 09:48:34AM +0200, Marco Elver wrote: > > gcc/ > > * params.opt: Define --param=tsan-distinguish-volatile=[0,1]. > > * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new > > builtin for volatile instrumentation of reads/writes. > > (BUILT_IN_TSAN_VOLATILE_READ2): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ4): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ8): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ16): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise. > > * tsan.c (get_memory_access_decl): Argument if access is > > volatile. If param tsan-distinguish-volatile is non-zero, and > > access if volatile, return volatile instrumentation decl. > > (instrument_expr): Check if access is volatile. > > > > gcc/testsuite/ > > * c-c++-common/tsan/volatile.c: New test. > > In general looks ok, just some minor nits. > > > --- a/gcc/params.opt > > +++ b/gcc/params.opt > > @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best > > edge is less than this th > > Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization > > Set the maximum number of instructions executed in parallel in > > reassociated tree. If 0, use the target dependent heuristic. > > > > +-param=tsan-distinguish-volatile= > > +Common Joined UInteger Var(param_tsan_distinguish_volatile) > > IntegerRange(0, 1) Param Optimization > > Do we need/want Optimization here? Optimization means the option is > per-function, but to me whether you want to distinguish volatiles or not > seems to be a global decision for the whole project. Adding Optimization here was Martin's suggestion. I'm fine either way and just wanted to err on the conservative side. 
Do note that in the kernel, we blacklist some files from instrumentation entirely, which implies leaving '-fsanitize=thread --param=tsan-distinguish-volatile=1' off. Although given that the option is only used with -fsanitize=thread, maybe it doesn't matter. If you strongly feel that Optimization should be removed again, please let me know. > > +Emit special instrumentation for accesses to volatiles. > > + [...] > > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > > "__tsan_volatile_write16", > > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > > This last entry is already too long (line limit 80 chars), so should be > wrapped like: > DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > "__tsan_volatile_write16", BT_FN_VOID_PTR, > ATTR_NOTHROW_LEAF_LIST) > instead. > > > --- a/gcc/tsan.c > > +++ b/gcc/tsan.c > > @@ -52,25 +52,41 @@ along with GCC; see the file COPYING3. If not see > > void __tsan_read/writeX (void *addr); */ > > > > static tree > > -get_memory_access_decl (bool is_write, unsigned size) > > +get_memory_access_decl (bool is_write, unsigned size, bool volatilep) > > { > >enum built_in_function fcode; > > > > - if (size <= 1) > > -fcode = is_write ? BUILT_IN_TSAN_WRITE1 > > - : BUILT_IN_TSAN_READ1; > > - else if (size <= 3) > > -fcode = is_write ? BUILT_IN_TSAN_WRITE2 > > - : BUILT_IN_TSAN_READ2; > > - else if (size <= 7) > > -fcode = is_write ? BUILT_IN_TSAN_WRITE4 > > - : BUILT_IN_TSAN_READ4; > > - else if (size <= 15) > > -fcode = is_write ? BUILT_IN_TSAN_WRITE8 > > - : BUILT_IN_TSAN_READ8; > > + if (param_tsan_distinguish_volatile && volatilep) > > +{ > > + if (size <= 1) > > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE1 > > +: BUILT_IN_TSAN_VOLATILE_READ1; > > + else if (size <= 3) > > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE2 > > +: BUILT_IN_TSAN_VOLATILE_READ2; > > + else if (size <= 7) > > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE4 > > +: BUILT_IN_TSAN_VOLATILE_READ4; > > + else if (size <= 15) > > +fcode = is_write ? 
BUILT_IN_TSAN_VOLATILE_WRITE8 > > +: BUILT_IN_TSAN_VOLATILE_READ8; > > + else > > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE16 > > +: BUILT_IN_TSAN_VOLATILE_READ16; > > +} > >else > > -fcode = is_write ? BUILT_IN_TSAN_WRITE16 > > - : BUILT_IN_TSAN_READ16; > > +{ > > + if (size <= 1) > > +fcode = is_write ? BUILT_IN_TSAN_WRITE1 : BUILT_IN_TSAN_READ1; > > + else if (size <= 3) > > +fcode = is_write ? BUILT_IN_TSAN_WRITE2 : BUILT_IN_TSAN_READ2; > > + else if (size <= 7) > > +fcode = is_write ? BUILT_IN_TSAN_WRITE4 : BUILT_IN_TSAN_READ4; > > + else if (size <= 15) > > +f
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
Just wanted to change this one, and noticed this > > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > > "__tsan_volatile_write16", > > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) is precisely 80 characters. So if I read the style guide right, it's <= 80 chars (and not < 80 chars), right? > This last entry is already too long (line limit 80 chars), so should be > wrapped like: > DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > "__tsan_volatile_write16", BT_FN_VOID_PTR, > ATTR_NOTHROW_LEAF_LIST) > instead. So I think it can stay. Thanks, -- Marco
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
On Tue, Jun 09, 2020 at 12:08:07PM +0200, Marco Elver wrote: > Just wanted to change this one, and noticed this > > > > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, > > > "__tsan_volatile_write16", > > > + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) > > is precisely 80 characters. So if I read the style guide right, it's > <= 80 chars (and not < 80 chars), right? Ah, you're right, sorry. Jakub
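[Editorial note] The size-to-builtin dispatch quoted from get_memory_access_decl above can be sketched as a plain C model. This is purely illustrative (the function name tsan_entry_point is made up); it only mirrors the selection logic in the patch, where access sizes round down to the 1/2/4/8/16-byte runtime entry points and, with --param=tsan-distinguish-volatile=1, a volatile access picks the __tsan_volatile_* variant from sanitizer.def instead:

```c
#include <stdio.h>

/* Illustrative model, not the GCC sources: pick the tsan runtime entry
   point name for an access, following the quoted code in tsan.c.  */
static const char *
tsan_entry_point (int is_write, unsigned size, int volatilep,
                  int distinguish_volatile)
{
  static char buf[64];
  unsigned n;

  /* Sizes are bucketed exactly as in get_memory_access_decl.  */
  if (size <= 1)
    n = 1;
  else if (size <= 3)
    n = 2;
  else if (size <= 7)
    n = 4;
  else if (size <= 15)
    n = 8;
  else
    n = 16;

  snprintf (buf, sizeof buf, "__tsan_%s%s%u",
            (distinguish_volatile && volatilep) ? "volatile_" : "",
            is_write ? "write" : "read", n);
  return buf;
}
```

For example, a volatile 8-byte write with the param enabled maps to "__tsan_volatile_write8", matching the new sanitizer.def entries, while the same access with the param at 0 stays on "__tsan_write8".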
Re: [patch] Make memory copy functions scalar storage order barriers
> I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER > works and what the constraints are. But if the memcpy is the storage order > barrier then what cases are OK? > > - both source and destination objects are [only] accessed with the same > storage order > - both source and destination objects are [only] accessed with > !TYPE_REVERSE_STORAGE_ORDER "containing" accesses > > and what determines the "object state" with respect to storage order > when we see a memcpy call? I had the impression (from the other > existing cases of contains_storage_order_barrier_p () calls) that > we derive the "object state" from the downstream access (vr->operands), > in this case the memcpy destination access. But see below for maybe > the easier case (memcpy folding). All cases are OK if you don't try to replace memcpy with something else. The only possible replacement would be with a VIEW_CONVERT_EXPR, but I guess that it would be quickly folded into a MEM_EXPR and then we're back to square one. > In my understanding of how TYPE_REVERSE_STORAGE_ORDER works > a memcpy, irrespective of its actual "implementation" should not break > anything since it transfers bytes. The point of the discussion is precisely to make it work in all cases. > In particular an 8 byte TYPE_REVERSE_STORAGE_ORDER > object can be transferred ("memcpy"ed) with (ignoring TBAA) > > int tema = *(int *)src; > *(int *)dst = tema; > int temb = *((int *)src + 1); > *((int *)dst + 1) = temb; > > and what byte order the int accesses are performed in is irrelevant > as long as read and write use the same byte order. No, because of the fundamental invariant of the implementation: you may _not_ access the same memory location with different storage orders, this is simply not supported and we document the limitation in the manual. The program does not do that, so it's valid; the program after folding does that, which means that the folding is invalid. 
> > Yes, theoretically, but SRA rewrites the access into a scalar access and > > we're back to square one (I tried). > > But what's the issue with this? Same as above: access to the same location with different storage orders. > Why? Is the following an invalid program? > > void mymemcpy (void *a, void *b) { memcpy (a, b, 8); } > void foo() > { > int b[2] __attribute__(reverse_storage); > int a[2]; > mymemcpy (a, b); > return a[0]; > } > > how can GCC infer, inside mymemcpy, that b is reverse storage order? No, that's supported, again the point of the discussion is precisely to make it work! > Thus, if all this is a problem then I think we're papering over the issue > by not folding memcpy. I don't really see why you're insisting on folding memcpy. It's semantically equivalent to VIEW_CONVERT_EXPR so it needs to be treated the same way. -- Eric Botcazou
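[Editorial note] Richard's byte-transfer argument and Eric's invariant can be made concrete with a small, hedged C model. store_rev32/load_rev32 are illustrative stand-ins for reverse-storage-order accesses (they are not GCC internals): memcpy merely moves bytes, so the value survives the copy only if the destination is read back in the same (reverse) order it was written in, which is exactly why folding the memcpy into native-order scalar loads/stores would break the program:

```c
#include <stdint.h>
#include <string.h>

/* Model a reverse-storage-order store: bytes go out big-endian-style
   regardless of the host's native order.  */
static void
store_rev32 (unsigned char *p, uint32_t v)
{
  p[0] = v >> 24;
  p[1] = v >> 16;
  p[2] = v >> 8;
  p[3] = v;
}

/* Model the matching reverse-storage-order load.  */
static uint32_t
load_rev32 (const unsigned char *p)
{
  return ((uint32_t) p[0] << 24) | ((uint32_t) p[1] << 16)
         | ((uint32_t) p[2] << 8) | (uint32_t) p[3];
}
```

After `store_rev32 (src, 0x12345678); memcpy (dst, src, 4);` the value is recovered by `load_rev32 (dst)`, but in general not by a native `uint32_t` load of dst, mirroring the invariant that one location must always be accessed in one storage order.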
Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare
On 08/06/20 19:20 +0100, Jonathan Wakely wrote: On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote: Hi    Here is the last of my algo patches this time to extend the memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.    To do so I had to return int in implementation details of lexicographical_compare to make sure we can treat a deque iterator range by chunk in a performant way. Tested under Linux x86_64 normal and debug modes.            * include/bits/stl_algobase.h            (__lexicographical_compare_impl): Return int.            (__lexicographical_compare::__lc): Likewise.            (__lexicographical_compare_aux1(_II1, _II1, _II2, _II2)): New. (__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,            _II2, _II2)): Declare.            (__lexicographical_compare_aux1(_II1, _II1,            _Deque_iterator<>, _Deque_iterator<>)): Declare. (__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,            _Deque_iterator<>, _Deque_iterator<>)): Declare.            (__lexicographical_compare_aux): Adapt, call later. Is this meant to say "latter"? That's still not correct grammar though. I think it would be better to name the function it calls explicitly: "Call __lexicographical_compare_aux1."            (__lexicographical_compare_aux(_Safe_iterator<>, _Safe_iterator<>,            _II2, _II2)): Declare.            (__lexicographical_compare_aux(_II1, _II1,            _Safe_iterator<>, _Safe_iterator<>)): Declare.            (__lexicographical_compare_aux(_Safe_iterator<>, _Safe_iterator<>,            _Safe_iterator<>, _Safe_iterator<>)): Declare.            (std::lexicographical_compare): Adapt, call later without __niter_base            usage.            * include/bits/deque.tcc (__lex_cmp_dit): New.            (__lexicographical_compare_aux1): New.            
* include/debug/safe_iterator.tcc            (__lexicographical_compare_aux(const _Safe_iterator<>&,            const _Safe_iterator<>&, _II2, _II2)): New.            (__lexicographical_compare_aux(            const _Safe_iterator<>&, const_Safe_iterator<>&,            const _Safe_iterator<>&, const _Safe_iterator<>&)): New.            * testsuite/25_algorithms/lexicographical_compare/1.cc (test6, test7):            New.            * testsuite/25_algorithms/lexicographical_compare/deque_iterators/1.cc:            New test. Ok to commit ? François diff --git a/libstdc++-v3/include/bits/deque.tcc b/libstdc++-v3/include/bits/deque.tcc index 1d32a1691c7..d7dbe64f3e1 100644 --- a/libstdc++-v3/include/bits/deque.tcc +++ b/libstdc++-v3/include/bits/deque.tcc @@ -1261,6 +1261,98 @@ _GLIBCXX_END_NAMESPACE_CONTAINER return true; } + template _II is the wrong name here, it mean InputIterator. All callers of this function are constrained for random access iterators. This cannot be called with an input iterator. Please use _RAIter. +int +__lex_cmp_dit( + const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __first1, + const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __last1, + _II __first2, _II __last2) +{ + typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> _Iter; + typedef typename _Iter::difference_type difference_type; + + if (__first1._M_node != __last1._M_node) + { + difference_type __len = __last2 - __first2; + difference_type __flen What does "flen" mean? Why not use len2 for last2 - first2, as we do elsewhere? And then use len for min(len1, len2)? + = std::min(__len, __first1._M_last - __first1._M_cur); + if (int __ret = std::__lexicographical_compare_aux1( This call (and the three later in this function) will do overload resolution again on the full set of __lexicographical_compare_aux1 overloads, which includes the ones for deque iterators. But we know that __first1._M_cur and __first1._M_last are not deque iterators. 
__first2 could be a deque iterator, but I'm not sure if we really want one function that has to handle that case anyway. + __first1._M_cur, __first1._M_last, __first2, __first2 + __flen)) + return __ret; + + __first2 += __flen; + __len -= __flen; + __flen = std::min(__len, _Iter::_S_buffer_size()); + for (typename _Iter::_Map_pointer __node = __first1._M_node + 1; + __node != __last1._M_node; + __first2 += __flen, __len -= __flen, + __flen = std::min(__len, _Iter::_S_buffer_size()), + ++__node) + if (int __ret = std::__lexicographical_compare_aux1( +
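[Editorial note] The motivation for making the helpers return int rather than bool can be sketched with a plain C model of chunk-wise comparison (illustrative only; lex_cmp_chunked is a made-up name, not the libstdc++ implementation). Comparing a deque segment by segment only works if each partial comparison can report less/equal/greater and continue on "equal", the way memcmp does, instead of collapsing to a bool immediately:

```c
#include <stddef.h>
#include <string.h>

/* Compare two byte ranges chunk by chunk with memcmp, returning
   <0 / 0 / >0 like memcmp; a zero result for one chunk means
   "undecided so far", so we move on to the next chunk.  */
static int
lex_cmp_chunked (const unsigned char *f1, size_t n1,
                 const unsigned char *f2, size_t n2, size_t chunk)
{
  size_t n = n1 < n2 ? n1 : n2;

  for (size_t i = 0; i < n; i += chunk)
    {
      size_t len = (n - i < chunk) ? n - i : chunk;
      int ret = memcmp (f1 + i, f2 + i, len);
      if (ret != 0)
        return ret;
    }

  /* All common bytes compared equal: the shorter range is less.  */
  return n1 < n2 ? -1 : n1 > n2 ? 1 : 0;
}
```

A lexicographical_compare built on top of this would simply return `lex_cmp_chunked (...) < 0`, which is the shape the int-returning __lexicographical_compare_aux1 overloads give the deque-iterator path.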
OpenACC 'attach'/'detach' has no business affecting user-visible reference counting (was: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts)
Hi Julian! On 2020-06-05T21:31:08+0100, Julian Brown wrote: > On Fri, 5 Jun 2020 13:17:09 +0200 > Thomas Schwinge wrote: >> On 2019-12-17T21:03:47-0800, Julian Brown >> wrote: >> > This part contains the libgomp runtime support for the >> > GOMP_MAP_ATTACH and GOMP_MAP_DETACH mapping kinds >> >> > --- a/libgomp/target.c >> > +++ b/libgomp/target.c >> >> > @@ -1203,6 +1211,32 @@ gomp_map_vars_internal (struct gomp_device_descr >> > *devicep, >> >> > +case GOMP_MAP_ATTACH: >> > + { >> > +cur_node.host_start = (uintptr_t) hostaddrs[i]; >> > +cur_node.host_end = cur_node.host_start + sizeof (void *); >> > +splay_tree_key n = splay_tree_lookup (mem_map, &cur_node); >> > +if (n != NULL) >> > + { >> > +tgt->list[i].key = n; >> > +tgt->list[i].offset = cur_node.host_start - n->host_start; >> > +tgt->list[i].length = n->host_end - n->host_start; >> > +tgt->list[i].copy_from = false; >> > +tgt->list[i].always_copy_from = false; >> > +tgt->list[i].do_detach >> > + = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA); >> > +n->refcount++; >> > + } >> > +else >> > + { >> > +gomp_mutex_unlock (&devicep->lock); >> > +gomp_fatal ("outer struct not mapped for attach"); >> > + } >> > +gomp_attach_pointer (devicep, aq, mem_map, n, >> > + (uintptr_t) hostaddrs[i], sizes[i], >> > + cbufp); >> > +continue; >> > + } >> >> For the OpenACC runtime API 'acc_attach' etc. routines they don't, so >> what's the conceptual reason that for the corresponding OpenACC >> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in >> reference counting ('n->refcount++' above)? I understand OpenACC >> 'attach'/'detach' clauses to be simple "executable clauses", which >> just update some values somewhere (say, like >> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state, >> thus wouldn't appear to need reference counting? 
> > IIUC, n->refcount is not directly the "structural reference count" as > seen at source level, but rather counts the number of target_var_descs > in the lists appended to each target_mem_desc -- and GOMP_MAP_ATTACH > have variable entries in those lists. That may be OK if that's purely an implementation detail that isn't visible to the user, however: > That's not the case for the API > routines. As I had mentioned, the problem is: in contrast to 'acc_attach', an OpenACC 'enter data' directive with 'attach' clause currently uses this same reference-counted code path, and thus such an 'attach' without corresponding 'detach' inhibits unmapping; see 'libgomp.oacc-c-c++-common/mdc-refcount-1.c' in the attached patch "OpenACC 'attach'/'detach' has no business affecting user-visible reference counting". That patch seemed to be the logical next step then, to unify the code paths for 'acc_attach' and 'enter data' directive with 'attach' clause (which have to act in the same way). That's (conceptually) somewhat similar to what you had proposed as part of http://mid.mail-archive.com/b23ea71697f77d8214411a3e1348e9dee496e5a6.1590182783.git.julian@codesourcery.com. (But all these things really need to be discussed individually...) However, that patch regresses 'libgomp.oacc-fortran/deep-copy-6-no_finalize.F90', and also the 'deep-copy-7b2f-2.c', and 'deep-copy-7cf.c' that I'm attaching here. I have not yet made an attempt to understand these regressions. It may be that a Detach Action actually effects an (attached) device pointer being copied back to the host, and then disturbing things -- and if that, then it may be a bug in libgomp, or in the test case. 
;-) Grüße Thomas - Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter >From d99a701387054259419292b95462f3646a00d6d9 Mon Sep 17 00:00:00 2001 From: Thomas Schwinge Date: Mon, 8 Jun 2020 21:35:32 +0200 Subject: [PATCH] OpenACC 'attach'/'detach' has no business affecting user-visible reference counting In particular, an 'attach' without 'detach' must not inhibit unmapping. libgomp/ * oacc-mem.c (goacc_attach_internal): New function, split out of 'acc_attach_async'. (acc_attach, goacc_enter_data_internal): Use it. (goacc_exit_data_internal) : Skip unmapping. * target.c (gomp_map_vars_existing): Assert not 'GOMP_MAP_ATTACH'. (gomp_map_vars_internal) : Assert this is not an 'enter data'. * testsuite/libgomp.oacc-c-c++-common/mdc-refcount-1.c: New file. * testsuite/libgomp.oacc-fortran/mdc-refcount-1-4-1.f90: Adjust. --- libgomp/oacc-mem.c
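[Editorial note] The reference-counting problem Thomas describes can be reduced to a toy C model (purely illustrative; none of these names exist in libgomp). If the 'attach' path takes a reference on the mapping, then an attach without a matching detach leaves the count nonzero and the data can never be unmapped, which is the mdc-refcount-1.c symptom; the proposed unified path, like acc_attach, takes no reference:

```c
/* Toy model of a host->device mapping's lifetime.  */
struct mapping
{
  int refcount;  /* references held on the mapping */
  int mapped;    /* 1 while device memory is mapped */
};

static void
enter_data (struct mapping *m)
{
  m->refcount++;
  m->mapped = 1;
}

static void
exit_data (struct mapping *m)
{
  if (m->refcount > 0 && --m->refcount == 0)
    m->mapped = 0;  /* only unmap once all references are gone */
}

/* Old 'enter data attach' behaviour: also takes a reference.  */
static void
attach_counted (struct mapping *m)
{
  m->refcount++;
}

/* acc_attach / proposed unified behaviour: just updates the device
   pointer, no reference taken.  */
static void
attach_plain (struct mapping *m)
{
  (void) m;
}
```

With attach_counted, an enter/attach/exit sequence leaves the mapping alive forever; with attach_plain, the exit unmaps as the user expects.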
Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles
On Tue, Jun 09, 2020 at 12:01:24PM +0200, Marco Elver wrote: > > Do we need/want Optimization here? Optimization means the option is > > per-function, but to me whether you want to distinguish volatiles or not > > seems to be a global decision for the whole project. > > Adding Optimization here was Martin's suggestion. I'm fine either way > and just wanted to err on the conservative side. We've discussed it with Martin and the end result is that it shouldn't be Optimization, because we don't want to support mixing it. Essentially, it is an ABI decision of the tsan implementation, which can be only one in the whole process, because if somebody tries to mix code that distinguishes them and other that doesn't, one still needs to use an implementation that does distinguish them and will assume that code that was built without --param=tsan-distinguish-volatile=1 is never using volatiles, even when it means we don't know as the information is lost. Note, for LTO that will mean the param needs to be specified on the link line too. Jakub
Re: [PATCH] libgcov: fix TOPN type casting
On 6/9/20 9:33 AM, Martin Liška wrote: I'm going to install it if there are no objections. Pushed as 862b9b225fb. Martin
Re: [stage1][PATCH] Make TOPN counter dynamically allocated.
On 6/8/20 1:07 PM, Martin Liška wrote: Can you please test the following patch candidate: I've just pushed 862b9b225fb which should fix that. Can you please test the current master? Thanks, Martin
[PATCH] gcov: improve --coverage small example
Pushed to master. gcc/ChangeLog: PR gcov-profile/95365 * doc/gcov.texi: Compile and link one example in 2 steps. --- gcc/doc/gcov.texi | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi index 61250c9407e..22e42da2ea6 100644 --- a/gcc/doc/gcov.texi +++ b/gcc/doc/gcov.texi @@ -528,7 +528,8 @@ for each line. For example, if your program is called @file{tmp.cpp}, this is what you see when you use the basic @command{gcov} facility: @smallexample -$ g++ --coverage tmp.cpp +$ g++ --coverage tmp.cpp -c +$ g++ --coverage tmp.o $ a.out $ gcov tmp.cpp -m File 'tmp.cpp' -- 2.26.2
Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.
Hi Richard, The 06/08/2020 16:42, Richard Sandiford wrote: > Tamar Christina writes: > > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > > index > > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c > > 100644 > > --- a/gcc/config/aarch64/aarch64.c > > +++ b/gcc/config/aarch64/aarch64.c > > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, > > int outer, bool speed) > >if (VECTOR_MODE_P (mode)) > > mode = GET_MODE_INNER (mode); > > > > + /* The by element versions of the instruction has the same costs as the > > + normal 3 vector version. So don't add the costs of the duplicate into > > + the costs of the multiply. */ > > + if (GET_CODE (op0) == VEC_DUPLICATE) > > +op0 = XEXP (op0, 0); > > + else if (GET_CODE (op1) == VEC_DUPLICATE) > > +op1 = XEXP (op1, 0); > > + > >/* Integer multiply/fma. */ > >if (GET_MODE_CLASS (mode) == MODE_INT) > > { > > SVE doesn't have duplicating forms, so I think we should put this code > under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”, > and then restrict it to VEC_ADVSIMD modes. > > (SVE FMUL does have an indexed form, but the index is relative to the > start of the associated quadword, so it isn't a VEC_DUPLICATE.) > Done, I have updated the patch. (See attached) > I guess there's a danger that this could underestimate the cost for > integer modes, if the scalar integer input needs to be moved from GPRs. > In that case the cost of a MULT + VEC_DUPLICATE is probably more > accurate, even though it's still one instruction before RA. > > But I guess there's no perfect answer there. The new code will be > right for integer modes in some cases and not in others. Same if > we leave things as they are. But maybe it'd be worth having a comment > to say that we're assuming the best case, i.e. that the duplicated > value is naturally in FPRs? 
> Hmm I haven't added the comment yet since I don't fully understand when the integer case would be misleading. In both cases the cost for the GPR is paid by the MOV no? I'm missing why having the MUL account for it would be better in some cases. For instance for the integer case we used to generate dup v0.4s, w2 mul v2.4s, v2.4s, v0.4s but now do fmov s0, w2 mul v2.4s, v2.4s, v0.s[0] Which is better on older cores such as Cortex-A55 and no different on newer cores such as Cortex-A76 according to the optimization guides. Regards, Tamar > Thanks, > Richard -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 973c65aa4fb348450872036617362aa17310fb20..d2e959c5276d9b801294c722c92762c5674cb244 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11279,7 +11279,20 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, int outer, bool speed) op1 = XEXP (x, 1); if (VECTOR_MODE_P (mode)) -mode = GET_MODE_INNER (mode); +{ + unsigned int vec_flags = aarch64_classify_vector_mode (mode); + mode = GET_MODE_INNER (mode); + if (vec_flags & VEC_ADVSIMD) + { + /* The by element versions of the instruction has the same costs as the + normal 3 vector version. So don't add the costs of the duplicate into + the costs of the multiply. */ + if (GET_CODE (op0) == VEC_DUPLICATE) + op0 = XEXP (op0, 0); + else if (GET_CODE (op1) == VEC_DUPLICATE) + op1 = XEXP (op1, 0); + } +} /* Integer multiply/fma. 
*/ if (GET_MODE_CLASS (mode) == MODE_INT) diff --git a/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c b/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c new file mode 100644 index ..513721cee0c8372781e6daf33bc06e256cab8cb8 --- /dev/null +++ b/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c @@ -0,0 +1,21 @@ +/* { dg-do compile } */ +/* { dg-require-effective-target vect_int } */ +/* { dg-require-effective-target vect_float } */ +/* { dg-options "-Ofast" } */ + +#include <arm_neon.h> + +void s_mult_i (int32_t* restrict res, int32_t* restrict a, int32_t b) +{ +for (int x = 0; x < 16; x++) + res[x] = a[x] * b; +} + +void s_mult_f (float32_t* restrict res, float32_t* restrict a, float32_t b) +{ +for (int x = 0; x < 16; x++) + res[x] = a[x] * b; +} + +/* { dg-final { scan-assembler-times {\s+mul\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]} 4 } } */ +/* { dg-final { scan-assembler-times {\s+fmul\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]} 4 } } */
Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.
Tamar Christina writes: > Hi Richard, > The 06/08/2020 16:42, Richard Sandiford wrote: >> Tamar Christina writes: >> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c >> > index >> > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c >> > 100644 >> > --- a/gcc/config/aarch64/aarch64.c >> > +++ b/gcc/config/aarch64/aarch64.c >> > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, >> > int outer, bool speed) >> >if (VECTOR_MODE_P (mode)) >> > mode = GET_MODE_INNER (mode); >> > >> > + /* The by element versions of the instruction has the same costs as the >> > + normal 3 vector version. So don't add the costs of the duplicate >> > into >> > + the costs of the multiply. */ >> > + if (GET_CODE (op0) == VEC_DUPLICATE) >> > +op0 = XEXP (op0, 0); >> > + else if (GET_CODE (op1) == VEC_DUPLICATE) >> > +op1 = XEXP (op1, 0); >> > + >> >/* Integer multiply/fma. */ >> >if (GET_MODE_CLASS (mode) == MODE_INT) >> > { >> >> SVE doesn't have duplicating forms, so I think we should put this code >> under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”, >> and then restrict it to VEC_ADVSIMD modes. >> >> (SVE FMUL does have an indexed form, but the index is relative to the >> start of the associated quadword, so it isn't a VEC_DUPLICATE.) >> > > Done, I have updated the patch. (See attached) > >> I guess there's a danger that this could underestimate the cost for >> integer modes, if the scalar integer input needs to be moved from GPRs. >> In that case the cost of a MULT + VEC_DUPLICATE is probably more >> accurate, even though it's still one instruction before RA. >> >> But I guess there's no perfect answer there. The new code will be >> right for integer modes in some cases and not in others. Same if >> we leave things as they are. But maybe it'd be worth having a comment >> to say that we're assuming the best case, i.e. that the duplicated >> value is naturally in FPRs? 
>> > > Hmm I haven't added the comment yet since I don't fully understand when the > integer case would be misleading. > > In both cases the cost for the GPR is paid by the MOV no? I'm missing > why having the MUL account for it would be better in some cases. The point was that any MOV isn't exposed until after register allocation, whereas costs are usually applied before then. So before RA: > For instance for the integer case we used to generate > > dup v0.4s, w2 > mul v2.4s, v2.4s, v0.4s ...this was costed as: (set (reg:V4SI R2) (vec_duplicate:V4SI (reg:SI R1))) (set (reg:V4SI R3) (mult:V4SI ...)) and so accurate when R1 naturally ends up in a GPR. > but now do > > fmovs0, w2 > mul v2.4s, v2.4s, v0.s[0] ...and this is costed as: (set (reg:V4SI R3) (mult:V4SI ...)) and so accurate when R1 naturally ends up in an FPR (without needing a reload to put it there). In other words, before RA, the patch is making the optimistic assumption that R1 is already in FPRs and so a separate FMOV won't be needed. Thanks, Richard > Which is better on older cores such Cortex-A55 and no different on newer > cores such as > Cortex-A76 according to the optimization guides. > > Regards, > Tamar > >> Thanks, >> Richard
[PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
Hello. The patch is follow up of the kernel discussion: https://lore.kernel.org/lkml/canpmjnnrz5ovkb6pe7k6gjfogbht_zhypkng9ad+kjndzk7...@mail.gmail.com/ The patch changes inliner in the following way: 1) do not compare caller and callee sanitizer attributes for always_inline functions - that matches Clang behavior 2) do not inline if one function is sanitized and the second one isn't (for all ASAN, UBSAN, TSAN, LSAN sanitizers); that also matches what Clang does Patch can bootstrap on x86_64-linux-gnu and survives regression tests. Ready to be installed? Thanks, Martin gcc/ChangeLog: * cif-code.def (ATTRIBUTE_MISMATCH): Rename to... (SANITIZE_ATTRIBUTE_MISMATCH): ...this. * ipa-inline.c (sanitize_attrs_match_for_inline_p): Handle all sanitizer options. (can_inline_edge_p): Use renamed CIF_* enum value. gcc/testsuite/ChangeLog: * c-c++-common/asan/inline.c: New test. * c-c++-common/tsan/inline.c: New test. * c-c++-common/ubsan/inline.c: New test. --- gcc/cif-code.def | 7 +++--- gcc/ipa-inline.c | 29 ++- gcc/testsuite/c-c++-common/asan/inline.c | 20 gcc/testsuite/c-c++-common/tsan/inline.c | 20 gcc/testsuite/c-c++-common/ubsan/inline.c | 20 5 files changed, 82 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c diff --git a/gcc/cif-code.def b/gcc/cif-code.def index 31c18c6c691..c65b2477203 100644 --- a/gcc/cif-code.def +++ b/gcc/cif-code.def @@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR, DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR, N_("callee refers to comdat-local symbols")) -/* We can't inline because of mismatched caller/callee attributes. */ -DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, - N_("function attribute mismatch")) +/* We can't inline because of mismatched caller/callee + sanitizer attributes. 
*/ +DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, + N_("sanitizer function attribute mismatch")) /* We can't inline because the user requests only static functions but the function has external linkage for live patching purpose. */ diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index f71443feff7..edf4095bcbc 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -264,18 +264,25 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee) if (!caller || !callee) return true; - /* Allow inlining always_inline functions into no_sanitize_address - functions. */ - if (!sanitize_flags_p (SANITIZE_ADDRESS, caller) - && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) + /* Follow clang and allow inlining for always_inline functions. */ + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) return true; - return ((sanitize_flags_p (SANITIZE_ADDRESS, caller) - == sanitize_flags_p (SANITIZE_ADDRESS, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller) - == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller) - == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee))); + const sanitize_code codes[] = +{ + SANITIZE_ADDRESS, + SANITIZE_THREAD, + SANITIZE_LEAK, + SANITIZE_UNDEFINED, + SANITIZE_UNDEFINED_NONDEFAULT +}; + + for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++) +if (sanitize_flags_p (codes[i], caller) + != sanitize_flags_p (codes[i], callee)) + return false; + + return true; } /* Used for flags where it is safe to inline when caller's value is @@ -382,7 +389,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report, /* Don't inline a function with mismatched sanitization attributes. 
*/ else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl)) { - e->inline_failed = CIF_ATTRIBUTE_MISMATCH; + e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH; inlinable = false; } if (!inlinable && report) diff --git a/gcc/testsuite/c-c++-common/asan/inline.c b/gcc/testsuite/c-c++-common/asan/inline.c new file mode 100644 index 000..7c36702cf9e --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/inline.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=address -c -O3 -fdump-tree-optimized" } */ + +int x; + +static inline +__attribute__((no_sanitize("address"))) +void do_not_sanitize(void) +{ + x++; +} + +void +sanitize_this(void) +{ + x++; + do_not_sanitize(); +} + +/* { dg-final { scan-tree-dump-times "Function do_not_sanitize" 1 "optimized" } } */ diff --git a/gcc/testsuite/c-c++-common/tsan/inline.c b/gcc/testsuite/c-
Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
On Tue, Jun 09, 2020 at 01:53:38PM +0200, Martin Liška wrote: > + const sanitize_code codes[] = > +{ > + SANITIZE_ADDRESS, > + SANITIZE_THREAD, > + SANITIZE_LEAK, Why leak? That is really a link time option only, doesn't affect code generation in any way. On the other side, does the SANITIZE_ADDRESS you have in there handle properly also -fsanitize=kernel-address cases? Jakub
Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
On 6/9/20 2:01 PM, Jakub Jelinek wrote: On Tue, Jun 09, 2020 at 01:53:38PM +0200, Martin Liška wrote: + const sanitize_code codes[] = +{ + SANITIZE_ADDRESS, + SANITIZE_THREAD, + SANITIZE_LEAK, Why leak? I was too eager ;) That is really a link time option only, doesn't affect code generation in any way. On the other side, does the SANITIZE_ADDRESS you have in there handle properly also -fsanitize=kernel-address cases? Yes, I've just added a new test-case for it. Martin Jakub >From 799c28526b1fd9e16bef70bcf06264960d25032a Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Tue, 9 Jun 2020 13:03:55 +0200 Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn gcc/ChangeLog: * cif-code.def (ATTRIBUTE_MISMATCH): Rename to... (SANITIZE_ATTRIBUTE_MISMATCH): ...this. * ipa-inline.c (sanitize_attrs_match_for_inline_p): Handle all sanitizer options. (can_inline_edge_p): Use renamed CIF_* enum value. gcc/testsuite/ChangeLog: * c-c++-common/asan/inline.c: New test. * c-c++-common/asan/inline-kernel.c: New test. * c-c++-common/tsan/inline.c: New test. * c-c++-common/ubsan/inline.c: New test. 
--- gcc/cif-code.def | 7 +++-- gcc/ipa-inline.c | 28 +++ .../c-c++-common/asan/inline-kernel.c | 20 + gcc/testsuite/c-c++-common/asan/inline.c | 20 + gcc/testsuite/c-c++-common/tsan/inline.c | 20 + gcc/testsuite/c-c++-common/ubsan/inline.c | 20 + 6 files changed, 101 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/asan/inline-kernel.c create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c diff --git a/gcc/cif-code.def b/gcc/cif-code.def index 31c18c6c691..c65b2477203 100644 --- a/gcc/cif-code.def +++ b/gcc/cif-code.def @@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR, DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR, N_("callee refers to comdat-local symbols")) -/* We can't inline because of mismatched caller/callee attributes. */ -DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, - N_("function attribute mismatch")) +/* We can't inline because of mismatched caller/callee + sanitizer attributes. */ +DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, + N_("sanitizer function attribute mismatch")) /* We can't inline because the user requests only static functions but the function has external linkage for live patching purpose. */ diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index f71443feff7..e934a3e5805 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -264,18 +264,24 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee) if (!caller || !callee) return true; - /* Allow inlining always_inline functions into no_sanitize_address - functions. */ - if (!sanitize_flags_p (SANITIZE_ADDRESS, caller) - && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) + /* Follow clang and allow inlining for always_inline functions. 
*/ + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) return true; - return ((sanitize_flags_p (SANITIZE_ADDRESS, caller) - == sanitize_flags_p (SANITIZE_ADDRESS, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller) - == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller) - == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee))); + const sanitize_code codes[] = +{ + SANITIZE_ADDRESS, + SANITIZE_THREAD, + SANITIZE_UNDEFINED, + SANITIZE_UNDEFINED_NONDEFAULT +}; + + for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++) +if (sanitize_flags_p (codes[i], caller) + != sanitize_flags_p (codes[i], callee)) + return false; + + return true; } /* Used for flags where it is safe to inline when caller's value is @@ -382,7 +388,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report, /* Don't inline a function with mismatched sanitization attributes. */ else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl)) { - e->inline_failed = CIF_ATTRIBUTE_MISMATCH; + e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH; inlinable = false; } if (!inlinable && report) diff --git a/gcc/testsuite/c-c++-common/asan/inline-kernel.c b/gcc/testsuite/c-c++-common/asan/inline-kernel.c new file mode 100644 index 000..4ca739b5d04 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/inline-kernel.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=kernel-address -c -O3 -fdump-tree-optimized" } */ + +int x; + +static inline +__attribute__((no_sanitize("kernel-address"))) +void do_not_sanitize(void) +{ + x++; +} + +v
Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
On Tue, Jun 09, 2020 at 02:09:06PM +0200, Martin Liška wrote: > - return ((sanitize_flags_p (SANITIZE_ADDRESS, caller) > -== sanitize_flags_p (SANITIZE_ADDRESS, callee)) > - && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller) > - == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee)) > - && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller) > - == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee))); > + const sanitize_code codes[] = > +{ > + SANITIZE_ADDRESS, > + SANITIZE_THREAD, > + SANITIZE_UNDEFINED, > + SANITIZE_UNDEFINED_NONDEFAULT > +}; > + > + for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++) > +if (sanitize_flags_p (codes[i], caller) > + != sanitize_flags_p (codes[i], callee)) > + return false; Sorry for not writing everything at once, but are the SANITIZER_POINTER_{COMPARE,SUBTRACT} differences unimportant? Jakub
Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.
Hi Richard, The 06/09/2020 12:44, Richard Sandiford wrote: > Tamar Christina writes: > > Hi Richard, > > The 06/08/2020 16:42, Richard Sandiford wrote: > >> Tamar Christina writes: > >> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c > >> > index > >> > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c > >> > 100644 > >> > --- a/gcc/config/aarch64/aarch64.c > >> > +++ b/gcc/config/aarch64/aarch64.c > >> > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code > >> > code, int outer, bool speed) > >> >if (VECTOR_MODE_P (mode)) > >> > mode = GET_MODE_INNER (mode); > >> > > >> > + /* The by element versions of the instruction has the same costs as > >> > the > >> > + normal 3 vector version. So don't add the costs of the duplicate > >> > into > >> > + the costs of the multiply. */ > >> > + if (GET_CODE (op0) == VEC_DUPLICATE) > >> > +op0 = XEXP (op0, 0); > >> > + else if (GET_CODE (op1) == VEC_DUPLICATE) > >> > +op1 = XEXP (op1, 0); > >> > + > >> >/* Integer multiply/fma. */ > >> >if (GET_MODE_CLASS (mode) == MODE_INT) > >> > { > >> > >> SVE doesn't have duplicating forms, so I think we should put this code > >> under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”, > >> and then restrict it to VEC_ADVSIMD modes. > >> > >> (SVE FMUL does have an indexed form, but the index is relative to the > >> start of the associated quadword, so it isn't a VEC_DUPLICATE.) > >> > > > > Done, I have updated the patch. (See attached) > > > >> I guess there's a danger that this could underestimate the cost for > >> integer modes, if the scalar integer input needs to be moved from GPRs. > >> In that case the cost of a MULT + VEC_DUPLICATE is probably more > >> accurate, even though it's still one instruction before RA. > >> > >> But I guess there's no perfect answer there. The new code will be > >> right for integer modes in some cases and not in others. Same if > >> we leave things as they are. 
But maybe it'd be worth having a comment > >> to say that we're assuming the best case, i.e. that the duplicated > >> value is naturally in FPRs? > >> > > > > Hmm I haven't added the comment yet since I don't fully understand when the > > integer case would be misleading. > > > > In both cases the cost for the GPR is paid by the MOV no? I'm missing > > why having the MUL account for it would be better in some cases. > > The point was that any MOV isn't exposed until after register allocation, > whereas costs are usually applied before then. So before RA: > > > For instance for the integer case we used to generate > > > > dup v0.4s, w2 > > mul v2.4s, v2.4s, v0.4s > > ...this was costed as: > >(set (reg:V4SI R2) (vec_duplicate:V4SI (reg:SI R1))) >(set (reg:V4SI R3) (mult:V4SI ...)) > > and so accurate when R1 naturally ends up in a GPR. > > > but now do > > > > fmovs0, w2 > > mul v2.4s, v2.4s, v0.s[0] > > ...and this is costed as: > >(set (reg:V4SI R3) (mult:V4SI ...)) > > and so accurate when R1 naturally ends up in an FPR (without needing > a reload to put it there). > > In other words, before RA, the patch is making the optimistic assumption > that R1 is already in FPRs and so a separate FMOV won't be needed. > Aargggs... yes that makes sense. Sorry when I looked at the dump before I didn't noticed the order was switched. The SET was for the load of course. :( I have added the comment as suggested, thanks for the explanation. OK for master? Thanks, Tamar > Thanks, > Richard > > > Which is better on older cores such Cortex-A55 and no different on newer > > cores such as > > Cortex-A76 according to the optimization guides. 
> > > > Regards, > > Tamar > > > >> Thanks, > >> Richard -- diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c index 973c65aa4fb348450872036617362aa17310fb20..5a5a9ad44f0945b4d6a869fc2b4e857022659c55 100644 --- a/gcc/config/aarch64/aarch64.c +++ b/gcc/config/aarch64/aarch64.c @@ -11279,7 +11279,22 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, int outer, bool speed) op1 = XEXP (x, 1); if (VECTOR_MODE_P (mode)) -mode = GET_MODE_INNER (mode); +{ + unsigned int vec_flags = aarch64_classify_vector_mode (mode); + mode = GET_MODE_INNER (mode); + if (vec_flags & VEC_ADVSIMD) + { + /* The by element versions of the instruction has the same costs as the + normal 3 vector version. So don't add the costs of the duplicate into + the costs of the multiply. We make an assumption that the value in + the VEC_DUPLICATE is already the FP&SIMD side. This means costing of + a MUL by element pre RA is a bit optimistic. */ + if (GET_CODE (op0) == VEC_DUPLICATE) + op0 = XEXP (op0, 0); + else if (GET_CODE (op1) == VEC_DUPLICATE) + op1 = XEXP (op1, 0); + } +} /* Integer multiply/fma. */ if (GET_MODE_CLA
Re: OpenACC 'attach'/'detach' has no business affecting user-visible reference counting (was: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts)
On Tue, 9 Jun 2020 12:41:21 +0200 Thomas Schwinge wrote: > Hi Julian! > > On 2020-06-05T21:31:08+0100, Julian Brown > wrote: > >> For the OpenACC runtime API 'acc_attach' etc. routines they don't, > >> so what's the conceptual reason that for the corresponding OpenACC > >> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in > >> reference counting ('n->refcount++' above)? I understand OpenACC > >> 'attach'/'detach' clauses to be simple "executable clauses", which > >> just update some values somewhere (say, like > >> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state, > >> thus wouldn't appear to need reference counting? > > > > IIUC, n->refcount is not directly the "structural reference count" > > as seen at source level, but rather counts the number of > > target_var_descs in the lists appended to each target_mem_desc -- > > and GOMP_MAP_ATTACH have variable entries in those lists. > > That may be OK if that's purely an implementation detail that isn't > visible to the user, however: > > > That's not the case for the API > > routines. > > As I had mentioned, the problem is: in contrast to 'acc_attach', an > OpenACC 'enter data' directive with 'attach' clause currently uses > this same reference-counted code path, and thus such an 'attach' > without corresponding 'detach' inhibits unmapping; see > 'libgomp.oacc-c-c++-common/mdc-refcount-1.c' in the attached patch > "OpenACC 'attach'/'detach' has no business affecting user-visible > reference counting". Hmm, right. That's quite a problem from an implementation perspective: the "attach" clause in the target_mem_desc's var list is what triggers the "detach" operation (for structured data lifetimes). Having those references "not count" is quite an ugly wrinkle. I'll think about that some more... > That patch seemed to be the logical next step then, to unify the code > paths for 'acc_attach' and 'enter data' directive with 'attach' clause > (which have to act in the same way). 
That's (conceptually) somewhat > similar to what you had proposed as part of >http://mid.mail-archive.com/b23ea71697f77d8214411a3e1348e9dee496e5a6.1590182783.git.julian@codesourcery.com>. > (But all these things really need to be discussed individually...) > > However, that patch regresses > 'libgomp.oacc-fortran/deep-copy-6-no_finalize.F90', and also the > 'deep-copy-7b2f-2.c', and 'deep-copy-7cf.c' that I'm attaching here. > I have not yet made an attempts to understand these regressions. It > may be that a Detach Action actually effects an (attached) device > pointer being copied back to the host, and then disturbing things -- > and if that, then it may be a bug in libgomp, or in the test case. > ;-) I haven't (even) quite absorbed what you are trying to test with the "no finalize" version of the deep-copy-6.f90 test case... I probably need to go back and re-read the spec. IIRC, my understanding was that copying out a data item that still has multiple attachments would *not* automatically perform a detachment. Thus, attaches & detaches have to balance (at least without "finalize"). But maybe I was wrong about that! Thanks, Julian
Re: drop -aux{dir,base}, revamp -dump{dir,base}
Hi! On 2020-05-26T04:08:44-0300, Alexandre Oliva wrote: > Thanks, here's the combined patch I'm checking in. > > revamp dump and aux output names For BRIG (HSAIL) front end testing, I'm see a lot of failures like: Running [...]/source-gcc/gcc/testsuite/brig.dg/dg.exp ... PASS: brig.dg/test/gimple/variables.hsail (test for excess errors) [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original "__group_base_addr \\+ \\(0 \\+" [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original "__group_base_addr \\+ 0" [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple "[ ]*prog_global = s204;" [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple ".module.mod_global;" [...] That's: spawn -ignore SIGHUP [...]/build-gcc/gcc/xgcc -B[...]/build-gcc/gcc/ [...]/build-gcc/gcc/testsuite/brig/variables.hsail.brig -fdump-tree-gimple -fdump-tree-original -S -o variables.s PASS: brig.dg/test/gimple/variables.hsail (test for excess errors) variables.hsail.brig: dump file does not exist dump file: variables.hsail.brig.original UNRESOLVED: variables.hsail.brig scan-tree-dump original "__group_base_addr \\+ \\(0 \\+" We're trying to scan 'variables.hsail.brig.*', but for input file name 'variables.hsail.brig', we're now creating: $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.* build-gcc/gcc/testsuite/brig/variables.brig.004t.original build-gcc/gcc/testsuite/brig/variables.brig.005t.gimple Before your changes, GCC produced the expected: $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.* build-gcc/gcc/testsuite/brig/variables.hsail.brig.004t.original build-gcc/gcc/testsuite/brig/variables.hsail.brig.005t.gimple Are you able to easily create a patch for that? How/where to adjust: producer-side (GCC driver, or BRIG (HSAIL) front end?), or consumer-side (testsuite: tree scanning machinery, or have to put some '-dumpbase' into all test case files?)? 
Regards, Thomas
Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
On 6/9/20 2:15 PM, Jakub Jelinek wrote: Sorry for not writing everything at once, but are the SANITIZER_POINTER_{COMPARE,SUBTRACT} differences unimportant? They are. I got confused that they are not part of SANITIZE_UNDEFINED or SANITIZE_UNDEFINED_NONDEFAULT. Anyway, good point! Martin >From 5fe0671ad79d14d1c9d0fead1a471875a4416fac Mon Sep 17 00:00:00 2001 From: Martin Liska Date: Tue, 9 Jun 2020 13:03:55 +0200 Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn gcc/ChangeLog: * cif-code.def (ATTRIBUTE_MISMATCH): Rename to... (SANITIZE_ATTRIBUTE_MISMATCH): ...this. * ipa-inline.c (sanitize_attrs_match_for_inline_p): Handle all sanitizer options. (can_inline_edge_p): Use renamed CIF_* enum value. gcc/testsuite/ChangeLog: * c-c++-common/asan/inline.c: New test. * c-c++-common/asan/inline-kernel.c: New test. * c-c++-common/tsan/inline.c: New test. * c-c++-common/ubsan/inline.c: New test. --- gcc/cif-code.def | 7 +++-- gcc/ipa-inline.c | 30 --- .../c-c++-common/asan/inline-kernel.c | 20 + gcc/testsuite/c-c++-common/asan/inline.c | 20 + gcc/testsuite/c-c++-common/tsan/inline.c | 20 + gcc/testsuite/c-c++-common/ubsan/inline.c | 20 + 6 files changed, 103 insertions(+), 14 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/asan/inline-kernel.c create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c diff --git a/gcc/cif-code.def b/gcc/cif-code.def index 31c18c6c691..c65b2477203 100644 --- a/gcc/cif-code.def +++ b/gcc/cif-code.def @@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR, DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR, N_("callee refers to comdat-local symbols")) -/* We can't inline because of mismatched caller/callee attributes. */ -DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, - N_("function attribute mismatch")) +/* We can't inline because of mismatched caller/callee + sanitizer attributes. 
*/ +DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR, + N_("sanitizer function attribute mismatch")) /* We can't inline because the user requests only static functions but the function has external linkage for live patching purpose. */ diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c index f71443feff7..c667de2a97c 100644 --- a/gcc/ipa-inline.c +++ b/gcc/ipa-inline.c @@ -264,18 +264,26 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee) if (!caller || !callee) return true; - /* Allow inlining always_inline functions into no_sanitize_address - functions. */ - if (!sanitize_flags_p (SANITIZE_ADDRESS, caller) - && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) + /* Follow clang and allow inlining for always_inline functions. */ + if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee))) return true; - return ((sanitize_flags_p (SANITIZE_ADDRESS, caller) - == sanitize_flags_p (SANITIZE_ADDRESS, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller) - == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee)) - && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller) - == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee))); + const sanitize_code codes[] = +{ + SANITIZE_ADDRESS, + SANITIZE_THREAD, + SANITIZE_UNDEFINED, + SANITIZE_UNDEFINED_NONDEFAULT, + SANITIZE_POINTER_COMPARE, + SANITIZE_POINTER_SUBTRACT +}; + + for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++) +if (sanitize_flags_p (codes[i], caller) + != sanitize_flags_p (codes[i], callee)) + return false; + + return true; } /* Used for flags where it is safe to inline when caller's value is @@ -382,7 +390,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report, /* Don't inline a function with mismatched sanitization attributes. 
*/ else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl)) { - e->inline_failed = CIF_ATTRIBUTE_MISMATCH; + e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH; inlinable = false; } if (!inlinable && report) diff --git a/gcc/testsuite/c-c++-common/asan/inline-kernel.c b/gcc/testsuite/c-c++-common/asan/inline-kernel.c new file mode 100644 index 000..4ca739b5d04 --- /dev/null +++ b/gcc/testsuite/c-c++-common/asan/inline-kernel.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-fsanitize=kernel-address -c -O3 -fdump-tree-optimized" } */ + +int x; + +static inline +__attribute__((no_sanitize("kernel-address"))) +void do_not_sanitize(void) +{ + x++; +} + +void +sanitize_this(void) +{ + x++; + do_not_sanitize(); +} + +/* { dg-final { scan-tree-dump-times "Function do_not_sanitize" 1 "optimized" } } */ diff --git a/gcc/t
Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
On Tue, Jun 09, 2020 at 02:32:36PM +0200, Martin Liška wrote: > >From 5fe0671ad79d14d1c9d0fead1a471875a4416fac Mon Sep 17 00:00:00 2001 > From: Martin Liska > Date: Tue, 9 Jun 2020 13:03:55 +0200 > Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn > > gcc/ChangeLog: > > * cif-code.def (ATTRIBUTE_MISMATCH): Rename to... > (SANITIZE_ATTRIBUTE_MISMATCH): ...this. > * ipa-inline.c (sanitize_attrs_match_for_inline_p): > Handle all sanitizer options. > (can_inline_edge_p): Use renamed CIF_* enum value. > > gcc/testsuite/ChangeLog: > > * c-c++-common/asan/inline.c: New test. > * c-c++-common/asan/inline-kernel.c: New test. > * c-c++-common/tsan/inline.c: New test. > * c-c++-common/ubsan/inline.c: New test. Ok, thanks. Jakub
[PATCH] gcov-dump: fix --help spacing
Pushed to master. gcc/ChangeLog: * gcov-dump.c (print_usage): Fix spacing for --raw option in --help. --- gcc/gcov-dump.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/gcov-dump.c b/gcc/gcov-dump.c index 90cbd1ace52..ffb71ca4b68 100644 --- a/gcc/gcov-dump.c +++ b/gcc/gcov-dump.c @@ -133,7 +133,7 @@ print_usage (void) printf (" -h, --help Print this help\n"); printf (" -l, --long Dump record contents too\n"); printf (" -p, --positions Dump record positions\n"); - printf (" -r, --raw Print content records in raw format\n"); + printf (" -r, --rawPrint content records in raw format\n"); printf (" -v, --versionPrint version number\n"); printf ("\nFor bug reporting instructions, please see:\n%s.\n", bug_report_url); -- 2.26.2
Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]
> Yes, but the problem is that remap_decl isn't getting called. Right, I can get it to be called by adding a pushdecl to grokdeclarator... > Attaching it to the BIND_EXPR doesn't help walk_tree_1 do the right > thing with the DECL_EXPR. ... but, indeed, this still ICEs. So the key is that the DECL_EXPR_DECL of the copied DECL_EXPR points to the remapped TYPE_DECL before the type is copied? If so, then your original patch is probably the way to go, but the comment would need to be slightly adjusted. In Ada, where we attach the TYPE_DECL to the BIND_EXPR, this will mean that remap_decl is invoked 3 times per TYPE_DECL: first twice from copy_bind_expr and then once again for the DECL_EXPR. But probably no big deal in the end. -- Eric Botcazou
BRIG FE testsuite: Fix all dump-scans (Was: Re: drop -aux{dir, base}, revamp -dump{dir, base})
Hi, On Tue, Jun 09 2020, Thomas Schwinge wrote: > Hi! > > On 2020-05-26T04:08:44-0300, Alexandre Oliva wrote: >> Thanks, here's the combined patch I'm checking in. >> >> revamp dump and aux output names > > For BRIG (HSAIL) front end testing, I'm see a lot of failures like: > > Running [...]/source-gcc/gcc/testsuite/brig.dg/dg.exp ... > PASS: brig.dg/test/gimple/variables.hsail (test for excess errors) > [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original > "__group_base_addr \\+ \\(0 \\+" > [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original > "__group_base_addr \\+ 0" > [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple "[ > ]*prog_global = s204;" > [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple > ".module.mod_global;" > [...] > > That's: > > spawn -ignore SIGHUP [...]/build-gcc/gcc/xgcc -B[...]/build-gcc/gcc/ > [...]/build-gcc/gcc/testsuite/brig/variables.hsail.brig -fdump-tree-gimple > -fdump-tree-original -S -o variables.s > PASS: brig.dg/test/gimple/variables.hsail (test for excess errors) > variables.hsail.brig: dump file does not exist > dump file: variables.hsail.brig.original > UNRESOLVED: variables.hsail.brig scan-tree-dump original > "__group_base_addr \\+ \\(0 \\+" > > We're trying to scan 'variables.hsail.brig.*', but for input file name > 'variables.hsail.brig', we're now creating: > > $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.* > build-gcc/gcc/testsuite/brig/variables.brig.004t.original > build-gcc/gcc/testsuite/brig/variables.brig.005t.gimple > > Before your changes, GCC produced the expected: > > $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.* > build-gcc/gcc/testsuite/brig/variables.hsail.brig.004t.original > build-gcc/gcc/testsuite/brig/variables.hsail.brig.005t.gimple > > Are you able to easily create a patch for that? 
How/where to adjust: > producer-side (GCC driver, or BRIG (HSAIL) front end?), or consumer-side > (testsuite: tree scanning machinery, or have to put some '-dumpbase' into > all test case files?)? I looked into the issue yesterday and decided the simplest fix is probably the following. I am going to use my BRIG maintainer hat to commit the patch in a day or two unless someone thinks it is a bad idea. Tested by running make check-brig on an x86_64-linux. Martin Since Alexandre's revamp of dump files handling in r11-627-g1dedc12d186, BRIG FE has been receiving slightly different -dumpbase (e.g. smoke_test.brig instead of smoke_test.hsail.brig when compiling file smoke_test.hsail.brig) and the testsuite then could not find the generated dump files it wanted to scan. I have not really looked into why that changed, the easiest fix seems to me to remove the hsail part already when generating the binary brig file from the textual HSAIL representation. gcc/testsuite/ChangeLog: 2020-06-09 Martin Jambor * lib/brig.exp (brig_target_compile): Strip hsail extension when gnerating the name of the binary brig file. --- gcc/testsuite/lib/brig.exp | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/gcc/testsuite/lib/brig.exp b/gcc/testsuite/lib/brig.exp index fbfb1da947a..de47f13e42c 100644 --- a/gcc/testsuite/lib/brig.exp +++ b/gcc/testsuite/lib/brig.exp @@ -29,7 +29,7 @@ proc brig_target_compile { source dest type options } { # We cannot assume all inputs are .hsail as the dg machinery # calls this for a some c files to check linker plugin support or # similar. - set brig_source ${tmpdir}/[file tail ${source}].brig + set brig_source ${tmpdir}/[file rootname [file tail ${source}]].brig exec HSAILasm $source -o ${brig_source} set source ${brig_source} # Change the testname the .brig. -- 2.26.2
Re: [PATCH PR95569] ICE in tmmark:verify_ssa failed
On Tue, Jun 9, 2020 at 11:47 AM qianchao wrote:
>
> Hi,
>
> Commit eb72dc663e9070b281be83a80f6f838a3a878822 introduces an ICE on AArch64.
> See the discussion on the bug for details.
> Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95569
>
> Trivial patch fixing the ICE attached.
>
> Bootstrapped and tested on the aarch64 platform. No new regressions witnessed.
> OK for trunk?

OK.

Thanks,
Richard.

> 2020-06-09  Qian Chao
>
> gcc/:
>
> 	PR tree-optimization/95569
> 	* trans-mem.c (expand_assign_tm): Ensure that rtmp is marked
> 	TREE_ADDRESSABLE.
> 	* testsuite/gcc.dg/tm/pr51472.c: New test.
Re: [patch] Make memory copy functions scalar storage order barriers
On Tue, Jun 9, 2020 at 12:19 PM Eric Botcazou wrote:
>
> > I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER
> > works and what the constraints are. But if the memcpy is the storage order
> > barrier then what cases are OK?
> >
> > - both source and destination objects are [only] accessed with the same
> >   storage order
> > - both source and destination objects are [only] accessed with
> >   !TYPE_REVERSE_STORAGE_ORDER "containing" accesses
> >
> > and what determines the "object state" with respect to storage order
> > when we see a memcpy call? I had the impression (from the other
> > existing cases of contains_storage_order_barrier_p () calls) that
> > we derive the "object state" from the downstream access (vr->operands),
> > in this case the memcpy destination access. But see below for maybe
> > the easier case (memcpy folding).
>
> All cases are OK if you don't try to replace memcpy with something else. The
> only possible replacement would be with a VIEW_CONVERT_EXPR, but I guess that
> it would be quickly folded into a MEM_EXPR and then we're back to square one.
>
> > In my understanding of how TYPE_REVERSE_STORAGE_ORDER works,
> > a memcpy, irrespective of its actual "implementation", should not break
> > anything since it transfers bytes.
>
> The point of the discussion is precisely to make it work in all cases.
>
> > In particular an 8-byte TYPE_REVERSE_STORAGE_ORDER
> > object can be transferred ("memcpy"ed) with (ignoring TBAA)
> >
> >   int tema = *(int *)src;
> >   *(int *)dst = tema;
> >   int temb = *((int *)src + 1);
> >   *((int *)dst + 1) = temb;
> >
> > and what byteorder the int accesses are performed in is irrelevant
> > as long as read and write use the same byteorder.
>
> No, because of the fundamental invariant of the implementation: you may _not_
> access the same memory location with different storage orders; this is simply
> not supported and we document the limitation in the manual.
> The program does not do that, so it's valid; the program after folding does
> that, which means that the folding is invalid.
>
> > > Yes, theoretically, but SRA rewrites the access into a scalar access and
> > > we're back to square one (I tried).
> >
> > But what's the issue with this?
>
> Same as above: access to the same location with different storage orders.
>
> > Why? Is the following an invalid program?
> >
> > void mymemcpy (void *a, void *b) { memcpy (a, b, 8); }
> > void foo()
> > {
> >   int b[2] __attribute__(reverse_storage);
> >   int a[2];
> >   mymemcpy (a, b);
> >   return a[0];
> > }
> >
> > how can GCC infer, inside mymemcpy, that b is reverse storage order?
>
> No, that's supported; again, the point of the discussion is precisely to make
> it work!

But GCC does not see the reverse storage order in mymemcpy, so it happily folds the memcpy inside it, inlines the result, and then?

> > Thus, if all this is a problem then I think we're papering over the issue
> > by not folding memcpy.
>
> I don't really see why you're insisting on folding memcpy. It's semantically
> equivalent to VIEW_CONVERT_EXPR so it needs to be treated the same way.

I'm not insisting on folding memcpy. I'm questioning whether you are fixing the actual bug ;)

Richard.

> --
> Eric Botcazou
Re: drop -aux{dir,base}, revamp -dump{dir,base}
Hi! On 2020-05-26T04:08:44-0300, Alexandre Oliva wrote: > Thanks, here's the combined patch I'm checking in. > > revamp dump and aux output names For libgomp offloading testing, I'm seeing a number of failures like: UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test [-PASS:-]{+UNRESOLVED:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-offload-rtl-dump mach "Merging loop .* into " [...] UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test [-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-assembler-times bar.sync 2 [...] 
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 (test for excess errors) PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 execution test PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 output pattern test [-PASS:-]{+UNRESOLVED:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 scan-offload-tree-dump oaccdevlow "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" [...] UNSUPPORTED: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O0 PASS: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 (test for excess errors) PASS: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 execution test [-PASS:-]{+UNRESOLVED:+} libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none -O2 scan-offload-tree-dump slp1 "vector\\(2\\) long long unsigned int" [...] (A number of similar ones skipped.) A word of warning: some of that testing indeed is fragile/ugly ;-) -- but we have to restore it anyway. That means, we shall gladly accept suggestions how to do it better. Consider 'libgomp.oacc-c-c++-common/nvptx-merged-loop.c': /* { dg-options "-foffload=-fdump-rtl-mach" } */ /* { dg-final { scan-offload-rtl-dump "Merging loop .* into " "mach" } } */ Previously, for '-foffload=nvptx-none -foffload=-fdump-rtl-mach -save-temps -o ./nvptx-merged-loop.exe', GCC produced the expected 'nvptx-merged-loop.o.307r.mach'. Now, I find the file at '/tmp/cc6nlTN9.mkoffload-cc4ebUn4.o.307r.mach' (The '-save-temps' comes from 'libgomp/testsuite/lib/libgomp-dg.exp:libgomp-dg-test', by the way.) 
(Yes, that's all a bit ugly, how the 'mkoffload's etc. handle their files...)

Similar for 'libgomp.oacc-c-c++-common/vector-length-128-1.c':

    /* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow" } */
    /* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function \\(1, 1, 128\\)" "oaccdevlow" } } */

Similar for 'libgomp.oacc-c/vec.c':

    /* { dg-additional-options "-std=c99 -ftree-slp-vectorize -foffload=-fdump-tree-slp1" } */
    /* { dg-final { scan-offload-tree-dump "vector\\(2\\) long long unsigned int" "slp1" } } */

(Thus not analyzed these any further.)

Consider 'libgomp.oacc-c-c++-common/pr85381-2.c':

    /* { dg-additional-options "-save-temps" } */
    /* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not host code!), expecting that nvptx offloading code assembly is produced after the host code, and thus overwrites the latter file. (Yes, that's certainly ugly/fragile...)

Previously, for '-foffload=nvptx-none -save-temps -o ./pr85381-2.exe', GCC left the '-foffload=nvptx-none' assembly in the expected 'pr85381-2.s'. Now, I find the file at '/tmp/cc8JsG4H.mkoffload-pr85381-2.s'.

Are you able to easily create/suggest patches for these? (You're prob
[PATCH v3] tsan: Add optional support for distinguishing volatiles
Add support to optionally emit different instrumentation for accesses to volatile variables. While the default TSAN runtime likely will never require this feature, other runtimes for different environments that have subtly different memory models or assumptions may require distinguishing volatiles.

One such environment is OS kernels, where volatile is still used in various places and is often declared to be appropriate even in multi-threaded contexts. One such example is the Linux kernel, which implements various synchronization primitives using volatile (READ_ONCE(), WRITE_ONCE()). Here the Kernel Concurrency Sanitizer (KCSAN) is a runtime that uses TSAN instrumentation but otherwise implements a very different approach to race detection from TSAN: https://github.com/google/ktsan/wiki/KCSAN

Due to recent changes in requirements by the Linux kernel, KCSAN requires that the compiler supports tsan-distinguish-volatile (among several new requirements): https://lore.kernel.org/lkml/20200521142047.169334-7-el...@google.com/

gcc/
	* params.opt: Define --param=tsan-distinguish-volatile=[0,1].
	* sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
	builtin for volatile instrumentation of reads/writes.
	(BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
	(BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
	(BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
	* tsan.c (get_memory_access_decl): Add argument indicating whether
	the access is volatile. If param tsan-distinguish-volatile is
	non-zero, and the access is volatile, return the volatile
	instrumentation decl.
	(instrument_expr): Check if access is volatile.

gcc/testsuite/
	* c-c++-common/tsan/volatile.c: New test.

Acked-by: Dmitry Vyukov
---
v3:
* Remove Optimization from -param=tsan-distinguish-volatile.
* Simplify get_memory_access_decl. * Avoid use of builtin_decl temporary. v2: * Add Optimization keyword to -param=tsan-distinguish-volatile= as the parameter can be different per TU. * Add tree-dump check to test. --- gcc/params.opt | 4 ++ gcc/sanitizer.def | 21 +++ gcc/testsuite/c-c++-common/tsan/volatile.c | 67 ++ gcc/tsan.c | 31 +- 4 files changed, 110 insertions(+), 13 deletions(-) create mode 100644 gcc/testsuite/c-c++-common/tsan/volatile.c diff --git a/gcc/params.opt b/gcc/params.opt index 4aec480798b..9b564bb046c 100644 --- a/gcc/params.opt +++ b/gcc/params.opt @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best edge is less than this th Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization Set the maximum number of instructions executed in parallel in reassociated tree. If 0, use the target dependent heuristic. +-param=tsan-distinguish-volatile= +Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 1) Param +Emit special instrumentation for accesses to volatiles. + -param=uninit-control-dep-attempts= Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) IntegerRange(1, 65536) Param Optimization Maximum number of nested calls to search for control dependencies during uninitialized variable analysis. 
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def index 11eb6467eba..a32715ddb92 100644 --- a/gcc/sanitizer.def +++ b/gcc/sanitizer.def @@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, "__tsan_read_range", DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range", BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, "__tsan_volatile_read16", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, "__tsan_volatile_write1", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, "__tsan_volatile_write2", + BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST) +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATIL
Re: [PATCH v3] tsan: Add optional support for distinguishing volatiles
On Tue, Jun 09, 2020 at 03:15:39PM +0200, Marco Elver wrote:
> gcc/
> 	* params.opt: Define --param=tsan-distinguish-volatile=[0,1].
> 	* sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
> 	builtin for volatile instrumentation of reads/writes.
> 	(BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
> 	(BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
> 	* tsan.c (get_memory_access_decl): Add argument indicating whether
> 	the access is volatile. If param tsan-distinguish-volatile is
> 	non-zero, and the access is volatile, return the volatile
> 	instrumentation decl.
> 	(instrument_expr): Check if access is volatile.
>
> gcc/testsuite/
> 	* c-c++-common/tsan/volatile.c: New test.
>
> Acked-by: Dmitry Vyukov

Ok, thanks.

	Jakub
RE: [PATCH] move permutation validity check
> -Original Message- > From: Gcc-patches On Behalf Of Richard > Biener > Sent: 05 May 2020 14:49 > To: gcc-patches@gcc.gnu.org > Subject: [PATCH] move permutation validity check > > > This delays the SLP permutation check to vectorizable_load and optimizes > permutations only after all SLP instances have been generated and the > vectorization factor is determined. > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > Richard. Hi Richard, This patch (bc484e25) introduced a test failure on arm and aarch64. The test gcc.dg/vect/slp-perm-9.c is now failing with: FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation requires at least three vectors" 1 It seems this assertion isn't run on x86-64 which explains why this was missed. Thanks, Alex > > 2020-05-05 Richard Biener > > * tree-vectorizer.h (vec_info::slp_loads): New. > (vect_optimize_slp): Declare. > * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do > nothing when there are no loads. > (vect_gather_slp_loads): Gather loads into a vector. > (vect_supported_load_permutation_p): Remove. > (vect_analyze_slp_instance): Do not verify permutation > validity here. > (vect_analyze_slp): Optimize permutations of reductions > after all SLP instances have been gathered and gather > all loads. > (vect_optimize_slp): New function split out from > vect_supported_load_permutation_p. Elide some permutations. > (vect_slp_analyze_bb_1): Call vect_optimize_slp. > * tree-vect-loop.c (vect_analyze_loop_2): Likewise. > * tree-vect-stmts.c (vectorizable_load): Check whether > the load can be permuted. When generating code assert we can. > > * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported > SLP permutations becoming builds from scalars. > * gcc.dg/vect/bb-slp-pr78205.c: Likewise. > * gcc.dg/vect/bb-slp-34.c: Likewise. 
> --- > gcc/testsuite/gcc.dg/vect/bb-slp-34.c | 3 +- > gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 7 +- > gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c | 6 +- > gcc/tree-vect-loop.c | 3 + > gcc/tree-vect-slp.c| 262 +++ > -- > gcc/tree-vect-stmts.c | 50 +- > gcc/tree-vectorizer.h | 4 +- > 7 files changed, 154 insertions(+), 181 deletions(-)
Re: [PATCH] move permutation validity check
On Tue, Jun 9, 2020 at 3:22 PM Alex Coplan wrote: > > > -Original Message- > > From: Gcc-patches On Behalf Of Richard > > Biener > > Sent: 05 May 2020 14:49 > > To: gcc-patches@gcc.gnu.org > > Subject: [PATCH] move permutation validity check > > > > > > This delays the SLP permutation check to vectorizable_load and optimizes > > permutations only after all SLP instances have been generated and the > > vectorization factor is determined. > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > > > Richard. > > Hi Richard, > > This patch (bc484e25) introduced a test failure on arm and aarch64. The > test gcc.dg/vect/slp-perm-9.c is now failing with: > > FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation > requires at least three vectors" 1 > > It seems this assertion isn't run on x86-64 which explains why this was > missed. Please open a bugreport if there's none already. Thanks, Richard. > > Thanks, > Alex > > > > > 2020-05-05 Richard Biener > > > > * tree-vectorizer.h (vec_info::slp_loads): New. > > (vect_optimize_slp): Declare. > > * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do > > nothing when there are no loads. > > (vect_gather_slp_loads): Gather loads into a vector. > > (vect_supported_load_permutation_p): Remove. > > (vect_analyze_slp_instance): Do not verify permutation > > validity here. > > (vect_analyze_slp): Optimize permutations of reductions > > after all SLP instances have been gathered and gather > > all loads. > > (vect_optimize_slp): New function split out from > > vect_supported_load_permutation_p. Elide some permutations. > > (vect_slp_analyze_bb_1): Call vect_optimize_slp. > > * tree-vect-loop.c (vect_analyze_loop_2): Likewise. > > * tree-vect-stmts.c (vectorizable_load): Check whether > > the load can be permuted. When generating code assert we can. > > > > * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported > > SLP permutations becoming builds from scalars. 
> > * gcc.dg/vect/bb-slp-pr78205.c: Likewise. > > * gcc.dg/vect/bb-slp-34.c: Likewise. > > --- > > gcc/testsuite/gcc.dg/vect/bb-slp-34.c | 3 +- > > gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 7 +- > > gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c | 6 +- > > gcc/tree-vect-loop.c | 3 + > > gcc/tree-vect-slp.c| 262 +++ > > -- > > gcc/tree-vect-stmts.c | 50 +- > > gcc/tree-vectorizer.h | 4 +- > > 7 files changed, 154 insertions(+), 181 deletions(-)
RE: [PATCH] move permutation validity check
> -Original Message- > From: Richard Biener > Sent: 09 June 2020 14:24 > To: Alex Coplan > Cc: Richard Biener ; gcc-patches@gcc.gnu.org > Subject: Re: [PATCH] move permutation validity check > > On Tue, Jun 9, 2020 at 3:22 PM Alex Coplan wrote: > > > > > -Original Message- > > > From: Gcc-patches On Behalf Of Richard > > > Biener > > > Sent: 05 May 2020 14:49 > > > To: gcc-patches@gcc.gnu.org > > > Subject: [PATCH] move permutation validity check > > > > > > > > > This delays the SLP permutation check to vectorizable_load and optimizes > > > permutations only after all SLP instances have been generated and the > > > vectorization factor is determined. > > > > > > Bootstrap and regtest running on x86_64-unknown-linux-gnu. > > > > > > Richard. > > > > Hi Richard, > > > > This patch (bc484e25) introduced a test failure on arm and aarch64. The > > test gcc.dg/vect/slp-perm-9.c is now failing with: > > > > FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation > requires at least three vectors" 1 > > > > It seems this assertion isn't run on x86-64 which explains why this was > > missed. > > Please open a bugreport if there's none already. > > Thanks, > Richard. Ah, it appears there already is one: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95056 Apologies, I should have checked Bugzilla first. Thanks, Alex > > > > > Thanks, > > Alex > > > > > > > > 2020-05-05 Richard Biener > > > > > > * tree-vectorizer.h (vec_info::slp_loads): New. > > > (vect_optimize_slp): Declare. > > > * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Do > > > nothing when there are no loads. > > > (vect_gather_slp_loads): Gather loads into a vector. > > > (vect_supported_load_permutation_p): Remove. > > > (vect_analyze_slp_instance): Do not verify permutation > > > validity here. > > > (vect_analyze_slp): Optimize permutations of reductions > > > after all SLP instances have been gathered and gather > > > all loads. 
> > > (vect_optimize_slp): New function split out from > > > vect_supported_load_permutation_p. Elide some permutations. > > > (vect_slp_analyze_bb_1): Call vect_optimize_slp. > > > * tree-vect-loop.c (vect_analyze_loop_2): Likewise. > > > * tree-vect-stmts.c (vectorizable_load): Check whether > > > the load can be permuted. When generating code assert we can. > > > > > > * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported > > > SLP permutations becoming builds from scalars. > > > * gcc.dg/vect/bb-slp-pr78205.c: Likewise. > > > * gcc.dg/vect/bb-slp-34.c: Likewise. > > > --- > > > gcc/testsuite/gcc.dg/vect/bb-slp-34.c | 3 +- > > > gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c | 7 +- > > > gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c | 6 +- > > > gcc/tree-vect-loop.c | 3 + > > > gcc/tree-vect-slp.c| 262 +++ > > > -- > > > gcc/tree-vect-stmts.c | 50 +- > > > gcc/tree-vectorizer.h | 4 +- > > > 7 files changed, 154 insertions(+), 181 deletions(-)
Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.
On Mon, Jun 8, 2020 at 1:04 PM Martin Liška wrote: > > Hello. > > Thank you for the approval. There's the patch that defines 4 new > DEF_INTERNAL_OPTAB_FN. > > Patch can bootstrap on x86_64-linux-gnu and survives regression tests. > It also builds on ppc64le-linux-gnu. > > Ready to be installed? The ChangeLog refers to DEF_INTERNAL_OPTAB_CAN_FAIL which is no longer there. Can you put the isel pass to a separate file please? So this is a first step towards sanitizing VEC_COND_EXPR. There were followups mentioned, namely a) enforcing that VEC_COND_EXPR constraint everywhere, b) isel vector comparisons at the same time since expansion has a vec_cond fallback There's + /* ??? If we do not cleanup EH then we will ICE in +verification. But in reality we have created wrong-code +as we did not properly transition EH info and edges to +the piecewise computations. */ + if (maybe_clean_eh_stmt (gsi_stmt (gsi)) + && gimple_purge_dead_eh_edges (bb)) + cfg_changed = true; which of course is bad. It's the comparison that can throw and I guess current RTL expansion manages to cope by find_many_sub_bbs and friends. But we need to get this correct on GIMPLE here. Note I find it odd this only triggers during ISEL - it should trigger during the lowering step which splits the comparison from the VEC_COND_EXPR. An appropriate fix at lowering time would be to insert the VEC_COND_EXPR w/o the condition on the normal outgoing edge and keep the comparison in place of the original VEC_COND_EXPR, moving EH info from the VEC_COND_EXPR to the comparison. I think we need to fix that before merging. Thanks, Richard. > Thanks, > Martin
Re: [PATCH v3] tsan: Add optional support for distinguishing volatiles
On Tue, 9 Jun 2020 at 15:22, Jakub Jelinek wrote: > > On Tue, Jun 09, 2020 at 03:15:39PM +0200, Marco Elver wrote: > > gcc/ > > * params.opt: Define --param=tsan-distinguish-volatile=[0,1]. > > * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new > > builtin for volatile instrumentation of reads/writes. > > (BUILT_IN_TSAN_VOLATILE_READ2): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ4): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ8): Likewise. > > (BUILT_IN_TSAN_VOLATILE_READ16): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise. > > (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise. > > * tsan.c (get_memory_access_decl): Argument if access is > > volatile. If param tsan-distinguish-volatile is non-zero, and > > access if volatile, return volatile instrumentation decl. > > (instrument_expr): Check if access is volatile. > > > > gcc/testsuite/ > > * c-c++-common/tsan/volatile.c: New test. > > > > Acked-by: Dmitry Vyukov > > Ok, thanks. I think one of you has to commit the patch, as we don't have access to the GCC git repository. Thank you! -- Marco
[PATCH] Remove dead code
This removes dead code that was left over from the reduction vectorization refactoring last year. Bootstrapped / tested on x86_64-unknown-linux-gnu, applied. Richard. 2020-06-09 Richard Biener * tree-vect-stmts.c (vect_transform_stmt): Remove dead code. --- gcc/tree-vect-stmts.c | 43 --- 1 file changed, 43 deletions(-) diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c index b24b0fe4304..61adf7d7fa4 100644 --- a/gcc/tree-vect-stmts.c +++ b/gcc/tree-vect-stmts.c @@ -11186,12 +11186,6 @@ vect_transform_stmt (vec_info *vinfo, gcc_assert (slp_node || !PURE_SLP_STMT (stmt_info)); stmt_vec_info old_vec_stmt_info = STMT_VINFO_VEC_STMT (stmt_info); - loop_vec_info loop_vinfo = dyn_cast (vinfo); - bool nested_p = (loop_vinfo - && nested_in_vect_loop_p - (LOOP_VINFO_LOOP (loop_vinfo), stmt_info)); - - gimple *stmt = stmt_info->stmt; switch (STMT_VINFO_TYPE (stmt_info)) { case type_demotion_vec_info_type: @@ -11266,13 +11260,11 @@ vect_transform_stmt (vec_info *vinfo, case call_vec_info_type: done = vectorizable_call (vinfo, stmt_info, gsi, &vec_stmt, slp_node, NULL); - stmt = gsi_stmt (*gsi); break; case call_simd_clone_vec_info_type: done = vectorizable_simd_clone_call (vinfo, stmt_info, gsi, &vec_stmt, slp_node, NULL); - stmt = gsi_stmt (*gsi); break; case reduc_vec_info_type: @@ -11310,41 +11302,6 @@ vect_transform_stmt (vec_info *vinfo, gcc_assert (!vec_stmt && STMT_VINFO_VEC_STMT (stmt_info) == old_vec_stmt_info); - /* Handle inner-loop stmts whose DEF is used in the loop-nest that - is being vectorized, but outside the immediately enclosing loop. 
*/ - if (vec_stmt - && nested_p - && STMT_VINFO_TYPE (stmt_info) != reduc_vec_info_type - && (STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_outer - || STMT_VINFO_RELEVANT (stmt_info) == - vect_used_in_outer_by_reduction)) -{ - class loop *innerloop = LOOP_VINFO_LOOP (loop_vinfo)->inner; - imm_use_iterator imm_iter; - use_operand_p use_p; - tree scalar_dest; - - if (dump_enabled_p ()) -dump_printf_loc (MSG_NOTE, vect_location, - "Record the vdef for outer-loop vectorization.\n"); - - /* Find the relevant loop-exit phi-node, and reord the vec_stmt there -(to be used when vectorizing outer-loop stmts that use the DEF of -STMT). */ - if (gimple_code (stmt) == GIMPLE_PHI) -scalar_dest = PHI_RESULT (stmt); - else -scalar_dest = gimple_get_lhs (stmt); - - FOR_EACH_IMM_USE_FAST (use_p, imm_iter, scalar_dest) - if (!flow_bb_inside_loop_p (innerloop, gimple_bb (USE_STMT (use_p - { - stmt_vec_info exit_phi_info - = vinfo->lookup_stmt (USE_STMT (use_p)); - STMT_VINFO_VEC_STMT (exit_phi_info) = vec_stmt; - } -} - if (vec_stmt) STMT_VINFO_VEC_STMT (stmt_info) = vec_stmt; -- 2.25.1
Re: [Patch][RFC] openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
It turned out that this patch fails with LTO and partitions, causing failures at runtime such as

  libgomp: Duplicate node

via libgomp/splay-tree.c's splay_tree_insert. In the test case, the problem occurred for functions - namely main._omp_fn.* on the host.

If the code is run in LTO context, the filtering-out should have already happened via the stream-out/stream-in, and hence no additional check is needed for omp_finish_file.

OK?

Tobias

PS: The streaming-in is done via:
input_offload_tables (/* do_force_output = */ !flag_ltrans);

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter

openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)

gcc/ChangeLog:

	* omp-offload.c (add_decls_addresses_to_decl_constructor,
	omp_finish_file): With in_lto_p, stream out all offload-table
	items even if the symtab_node does not exist.

diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 4e44cfc9d0a..32c2485abd4 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -126,7 +126,7 @@ add_decls_addresses_to_decl_constructor (vec *v_decls,
 	 && lookup_attribute ("omp declare target link", DECL_ATTRIBUTES (it));
       /* See also omp_finish_file and output_offload_tables in lto-cgraph.c.  */
-      if (!symtab_node::get (it))
+      if (!in_lto_p && !symtab_node::get (it))
 	continue;
       tree size = NULL_TREE;
@@ -382,14 +382,14 @@ omp_finish_file (void)
 	  tree it = (*offload_funcs)[i];
 	  /* See also add_decls_addresses_to_decl_constructor
 	     and output_offload_tables in lto-cgraph.c.  */
-	  if (!symtab_node::get (it))
+	  if (!in_lto_p && !symtab_node::get (it))
 	    continue;
 	  targetm.record_offload_symbol (it);
 	}
       for (unsigned i = 0; i < num_vars; i++)
 	{
 	  tree it = (*offload_vars)[i];
-	  if (!symtab_node::get (it))
+	  if (!in_lto_p && !symtab_node::get (it))
 	    continue;
 #ifdef ACCEL_COMPILER
 	  if (DECL_HAS_VALUE_EXPR_P (it)
Re: [Patch][RFC] openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
On Tue, Jun 09, 2020 at 04:02:19PM +0200, Tobias Burnus wrote:
> It turned out that this patch fails with LTO and partitions,
> causing failures at runtime such as
>   libgomp: Duplicate node
> via libgomp/splay-tree.c's splay_tree_insert.
>
> In the test case, the problem occurred for functions - namely
> main._omp_fn.* on the host.
> If the code is run in LTO context, the filtering-out should
> have already happened via the stream-out/stream-in and hence no
> additional check is needed for omp_finish_file.
>
> OK?

Was this caught in the testsuite, or do you have some short testcase that could be used in the testsuite?

> Tobias
>
> PS: The streaming-in is done via:
> input_offload_tables (/* do_force_output = */ !flag_ltrans);
>
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung,
> Alexander Walter
>
> openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
>
> gcc/ChangeLog:
>
> 	* omp-offload.c (add_decls_addresses_to_decl_constructor,
> 	omp_finish_file): With in_lto_p, stream out all offload-table
> 	items even if the symtab_node does not exist.

Ok with or without the testcase.

	Jakub
[RFC][vect] BB SLP reduction prototype
Hi,

So this is my rework of the code you sent me. I have not included the 'permute' code you included, as I can't figure out what it is meant to be doing. Maybe something to look at later. I have also included three tests that show it working for some simple cases, and even a nested one.

Unfortunately it will not handle other simple cases where reassoc doesn't put the reduction in the form of:

  sum0 = a + b;
  sum1 = c + sum0;
  ...

For instance, a testcase I have been looking at is:

unsigned int
u32_single_abs_sum (unsigned int * a, unsigned int * b)
{
  unsigned int sub0 = a[0] - b[0];
  unsigned int sub1 = a[1] - b[1];
  unsigned int sub2 = a[2] - b[2];
  unsigned int sub3 = a[3] - b[3];
  unsigned int sum = sub0 + sub1;
  sum += sub2;
  sum += sub3;
  return sum;
}

Unfortunately, the code that reaches slp will look like:

  _1 = *a_10(D);
  _2 = *b_11(D);
  _3 = MEM[(unsigned int *)a_10(D) + 4B];
  _4 = MEM[(unsigned int *)b_11(D) + 4B];
  _5 = MEM[(unsigned int *)a_10(D) + 8B];
  _6 = MEM[(unsigned int *)b_11(D) + 8B];
  _7 = MEM[(unsigned int *)a_10(D) + 12B];
  _8 = MEM[(unsigned int *)b_11(D) + 12B];
  _28 = _1 - _2;
  _29 = _3 + _28;
  _30 = _29 - _4;
  _31 = _5 + _30;
  _32 = _31 - _6;
  _33 = _7 + _32;
  sum_18 = _33 - _8;
  return sum_18;

which doesn't have the format expected as I described above... I am wondering how to teach it to support this. Maybe starting with your suggestion of making plus_expr and minus_expr have the same hash, so it groups all these statements together, might be a start, but you'd still need to 'rebalance' the tree somehow. I need to give this a bit more thought, but I wanted to share what I have so far. The code is severely lacking in comments for now, btw...
Cheers, Andre diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c new file mode 100644 index ..66b53ff9bb1e77414e7493c07ab87d46f4d33651 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c @@ -0,0 +1,66 @@ +/* { dg-require-effective-target vect_int } */ +#include +#include "tree-vect.h" +extern int abs (int); + +#define ABS4(N)\ + sum += abs (a[N]); \ + sum += abs (a[N+1]); \ + sum += abs (a[N+2]); \ + sum += abs (a[N+3]); + +#define ABS8(N) \ + ABS4(N)\ + ABS4(N+4) + +#define ABS16(N) \ + ABS8(N)\ + ABS8(N+8) + +__attribute__ ((noipa)) unsigned char +u8_single_abs_sum (signed char * a) +{ + unsigned char sum = 0; + ABS16(0) + return sum; +} + +__attribute__ ((noipa)) unsigned short +u16_single_abs_sum (signed short * a) +{ + unsigned short sum = 0; + ABS8(0) + return sum; +} + +__attribute__ ((noipa)) unsigned int +u32_single_abs_sum (signed int * a) +{ + unsigned int sum = 0; + ABS4(0) + return sum; +} + +signed char u8[16] = {0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, -13, + -14, -15}; +signed short u16[8] = {0, 1, 2, 3, 4, -5, -6, -7}; +signed int u32[4] = {-10, -20, 30, 40}; + + +int main (void) +{ + check_vect (); + + if (u8_single_abs_sum (&(u8[0])) != 120) +abort (); + + if (u16_single_abs_sum (&(u16[0])) != 28) +abort (); + + if (u32_single_abs_sum (&(u32[0])) != 100) +abort (); + + return 0; +} + +/* { dg-final { scan-tree-dump-times "basic block vectorized" 3 "slp2" } } */ diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c new file mode 100644 index ..298a22cfef687f6634d61bf808a41942c3ce4a85 --- /dev/null +++ b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c @@ -0,0 +1,82 @@ +/* { dg-require-effective-target vect_int } */ +#include +#include "tree-vect.h" +extern int abs (int); + +#define ABS4(N)\ + sum += abs (a[N]); \ + sum += abs (a[N+1]); \ + sum += abs (a[N+2]); \ + sum += abs (a[N+3]); + +#define ABS8(N) \ + ABS4(N)\ + ABS4(N+4) + +#define 
ABS16(N) \ + ABS8(N)\ + ABS8(N+8) + +__attribute__ ((noipa)) unsigned char +u8_double_abs_sum (signed char * a) +{ + unsigned char sum = 0; + ABS16(0) + ABS16(16) + return sum; +} + +__attribute__ ((noipa)) unsigned short +u16_double_abs_sum (signed short * a) +{ + unsigned short sum = 0; + ABS16(0) + return sum; +} + +__attribute__ ((noipa)) unsigned int +u32_double_abs_sum (signed int * a) +{ + unsigned int sum = 0; + ABS8(0) + return sum; +} + +__attribute__ ((noipa)) unsigned int +u32_triple_abs_sum (signed int * a) +{ + unsigned int sum = 0; + ABS8(0) + ABS4(8) + return sum; +} + +signed char u8[32] = {0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, -13, + -14, -15, 0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, -13, + -14, -30}; + +signed short u16[16] = {0, 1, 2, 3, 4, -5, -6, -7, 10, 20, 30, 40, -10, -20, + -30, -40}; +signed int u32[16] = {-10, -20, 30, 40, 100, 200, -300, -500, -600, -700, 1000, +2000}; + +i
Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin
Hi Alexandre, > When the linker plugin is not available, dump outputs in lto runs are > different from those outputs.exp tests expect, so skip them for now. > > Regstrapped on x86_64-linux-gnu. Ok to install? > > > for gcc/testsuite/ChangeLog > > * outputs.exp (skip_lto): Set when missing the linker plugin. > --- > testsuite/gcc.misc-tests/outputs.exp |3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git gcc/testsuite/gcc.misc-tests/outputs.exp > gcc/testsuite/gcc.misc-tests/outputs.exp > index 06a32db..a47f8a6 100644 > --- gcc/testsuite/gcc.misc-tests/outputs.exp > +++ gcc/testsuite/gcc.misc-tests/outputs.exp > @@ -42,7 +42,8 @@ if ![check_no_compiler_messages gsplitdwarf object { > > # Check for -flto support. We explicitly test the result to skip > # tests that use -flto. > -set skip_lto ![check_effective_target_lto] > +set skip_lto { ![check_effective_target_lto] \ > +|| ![check_linker_plugin_available] } this is wrong unfortunately: braces are the Tcl equivalent of single quotes so you're setting skip_lto to the string inside. Instead you need something like set skip_lto [expr ![check_effective_target_lto] \ || ![check_linker_plugin_available]] While this worked for me on i386-pc-solaris2.11 (both with ld, thus without the lto plugin, and with gld and the plugin), I wonder if there's not a better way than just skipping the lto tests, i.e. account for the difference in expected files between the cases with and without the plugin. Your call in the end (perhaps as a followup): everything is better than the current bunch of failures. Rainer -- - Rainer Orth, Center for Biotechnology, Bielefeld University
Re: [PATCH] rs6000/testsuite: Allow xxperm* instead of only vperm*
On Tue, 2020-06-09 at 01:00 +, Segher Boessenkool wrote: > Some testcases failed (esp. with --with-cpu=power9) after my change > to > prefer xxperm over vperm when all else is equal. Fix that. (This > also > tightens the relevant REs somewhat). > > Tested on way too many configurations. Committed to trunk. > Hi, A post-commit review ... (since I authored many of the fold-vec-* tests, I had to chime in) this looks good to me, appreciate the effort. Thanks :-) -Will > > Segher > > > 2020-06-09 Segher Boessenkool > > gcc/testsuite/ > * gcc.target/powerpc/fold-vec-perm-char.c: Allow both > vperm/vpermr and > xxperm/xxpermr. > * gcc.target/powerpc/fold-vec-perm-double.c: Ditto. > * gcc.target/powerpc/fold-vec-perm-float.c: Ditto. > * gcc.target/powerpc/fold-vec-perm-int.c: Ditto. > * gcc.target/powerpc/fold-vec-perm-longlong.c: Ditto. > * gcc.target/powerpc/fold-vec-perm-pixel.c: Ditto. > * gcc.target/powerpc/fold-vec-perm-short.c: Ditto. > * gcc.target/powerpc/lvsl-lvsr.c: Ditto. > * gcc.target/powerpc/vec-mult-char-2.c: Ditto. > * gcc.target/powerpc/vsx-vector-6.p9.c: Also allow xxpermr. 
> > --- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c | 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c | 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c| 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c | 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c | 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c| 2 +- > gcc/testsuite/gcc.target/powerpc/fold-vec-perm-short.c| 2 +- > gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c | 2 +- > gcc/testsuite/gcc.target/powerpc/vec-mult-char-2.c| 2 +- > gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p9.c| 2 +- > 10 files changed, 10 insertions(+), 10 deletions(-) > > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c > index d907eae..56a89f3 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c > @@ -28,4 +28,4 @@ testuc (vector unsigned char vuc2, vector unsigned > char vuc3, >return vec_perm (vuc2, vuc3, vuc); > } > > -/* { dg-final { scan-assembler-times "vperm" 3 } } */ > +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c > index 7ceca9e..c982bc2 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c > @@ -14,4 +14,4 @@ testd (vector double vd2, vector double vd3, vector > unsigned char vuc) >return vec_perm (vd2, vd3, vuc); > } > > -/* { dg-final { scan-assembler-times "vperm" 1 } } */ > +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 1 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c > index c9cfb0d..64b8ac7 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c > +++ 
b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c > @@ -13,4 +13,4 @@ testf (vector float vf2, vector float vf3, vector > unsigned char vuc) >return vec_perm (vf2, vf3, vuc); > } > > -/* { dg-final { scan-assembler-times "vperm" 1 } } */ > +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 1 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c > index a2fdc26..a6dd595 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c > @@ -28,4 +28,4 @@ testui (vector unsigned int vui2, vector unsigned > int vui3, >return vec_perm (vui2, vui3, vuc); > } > > -/* { dg-final { scan-assembler-times "vperm" 3 } } */ > +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm- > longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm- > longlong.c > index 1333d88..3cc757d 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c > @@ -29,4 +29,4 @@ testul (vector unsigned long long vul2, vector > unsigned long long vul3, >return vec_perm (vul2, vul3, vuc); > } > > -/* { dg-final { scan-assembler-times "vperm" 3 } } */ > +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */ > diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c > b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c > index 0d3cb0a..54fccd2 100644 > --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c > +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c > @@ -13,4 +1
Re: [RFC][vect] BB SLP reduction prototype
The 'you' here is Richi, which Richi is probably aware of, but maybe not the rest of the list :') On 09/06/2020 15:29, Andre Vieira (lists) wrote: Hi, So this is my rework of the code you sent me. I have not included the 'permute' code you included as I can't figure out what it is meant to be doing. Maybe something to look at later. I have also included three tests that show it working for some simple cases and even a nested one. Unfortunately it will not handle other simple cases where reassoc doesn't put the reduction in the form of: sum0 = a + b; sum1 = c + sum0; ... For instance a testcase I have been looking at is: unsigned int u32_single_abs_sum (unsigned int * a, unsigned int * b) { unsigned int sub0 = a[0] - b[0]; unsigned int sub1 = a[1] - b[1]; unsigned int sub2 = a[2] - b[2]; unsigned int sub3 = a[3] - b[3]; unsigned int sum = sub0 + sub1; sum += sub2; sum += sub3; return sum; } Unfortunately, the code that reaches slp will look like: _1 = *a_10(D); _2 = *b_11(D); _3 = MEM[(unsigned int *)a_10(D) + 4B]; _4 = MEM[(unsigned int *)b_11(D) + 4B]; _5 = MEM[(unsigned int *)a_10(D) + 8B]; _6 = MEM[(unsigned int *)b_11(D) + 8B]; _7 = MEM[(unsigned int *)a_10(D) + 12B]; _8 = MEM[(unsigned int *)b_11(D) + 12B]; _28 = _1 - _2; _29 = _3 + _28; _30 = _29 - _4; _31 = _5 + _30; _32 = _31 - _6; _33 = _7 + _32; sum_18 = _33 - _8; return sum_18; Which doesn't have the format expected as I described above... I am wondering how to teach it to support this. Maybe starting with your suggestion of making plus_expr and minus_expr have the same hash, so it groups all these statements together might be a start, but you'd still need to 'rebalance' the tree somehow. I need to give this a bit more thought, but I wanted to share what I have so far. The code is severely lacking in comments for now btw... Cheers, Andre
Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Tue, 2020-06-09 at 09:17 +0200, Richard Biener wrote: > On Mon, Jun 8, 2020 at 11:13 PM Jeff Law via Gcc-patches > wrote: > > > > We're seeing some failures due to the local-alignment pass. For example on > > sh4: > > > > Tests that now fail, but worked before (16 tests): > > > > gcc.dg/pr48335-1.c (test for excess errors) > > gcc.dg/pr48335-2.c (test for excess errors) > > gcc.dg/pr48335-5.c (test for excess errors) > > gcc.dg/pr48335-6.c (test for excess errors) > > gcc.dg/torture/pr48493.c -O0 (test for excess errors) > > gcc.dg/torture/pr48493.c -O0 (test for excess errors) > > gcc.dg/torture/pr48493.c -O1 (test for excess errors) > > gcc.dg/torture/pr48493.c -O1 (test for excess errors) > > gcc.dg/torture/pr48493.c -O2 (test for excess errors) > > gcc.dg/torture/pr48493.c -O2 (test for excess errors) > > gcc.dg/torture/pr48493.c -O2 -flto -fno-use-linker-plugin -flto- > > partition=none (test for excess errors) > > gcc.dg/torture/pr48493.c -O2 -flto -fno-use-linker-plugin -flto- > > partition=none (test for excess errors) > > gcc.dg/torture/pr48493.c -O3 -g (test for excess errors) > > gcc.dg/torture/pr48493.c -O3 -g (test for excess errors) > > gcc.dg/torture/pr48493.c -Os (test for excess errors) > > gcc.dg/torture/pr48493.c -Os (test for excess errors) > > > > Or on x86_64: > > > > during GIMPLE pass: adjust_alignment > > /home/jenkins/linux/arch/x86/boot/string.c: In function ‘simple_strtoull’: > > /home/jenkins/linux/arch/x86/boot/string.c:121:20: internal compiler error: > > in > > execute, at adjust-alignment.c:74 > > 121 | unsigned long long simple_strtoull(const char *cp, char **endp, > > unsigned > > int base) > > |^~~ > > 0x79c5b3 execute > > ../../../../../gcc/gcc/adjust-alignment.c:74 > > Please submit a full bug report, > > > > There may be others, I haven't searched the failing targets extensively for > > this > > issue. > > > > AFAICT the adjust-alignment pass is supposed to increase alignments of > > locals > > when needed. 
It has an assert to ensure that alignments are only increased. > > > > However, if the object was declared with an alignment attribute that is > > larger > > than what LOCAL_ALIGNMENT would produce for the object, then the > > adjust-alignment > > pass trips the assert. > > > > Rather than asserting, just ignoring such objects seems better. But I'm not > > intimately familiar with the issues here. > > > > Bootstrapped and regression tested on x86_64, also verified this fixes the > > sh4 > > issue. All the *-elf targets have also built/tested with this patch with no > > regressions. > > > > OK for the trunk? > > While technically OK the issue is that it does not solve the x86 issue > which with incoming stack alignment < 8 bytes does not perform > dynamic re-alignment to align 'long long' variables appropriately. > Currently the x86 backend thinks it can fixup by lowering alignment > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue > because the larger alignment is present in the IL for a long (previously > RTL expansion, now adjust-aligment) time and your patch makes that > wrong info last forever (or alternatively cause dynamic stack alignment > which we do _not_ want to perform here). I've never looked at the dynamic realignment stuff -- is there a good reason why we don't dynamically realign when we've got a local with a requested alignment? Isn't that a huge oversight in the whole concept of dynamic realignment? > > So I prefer to wait for a proper x86 solution before cutting the ICEs. > (did you verify effects on code generation of your patch?) I don't mind waiting. And no, I didn't look at the codegen at all. jeff
Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote: > > While technically OK the issue is that it does not solve the x86 issue > > which with incoming stack alignment < 8 bytes does not perform > > dynamic re-alignment to align 'long long' variables appropriately. > > Currently the x86 backend thinks it can fixup by lowering alignment > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue > > because the larger alignment is present in the IL for a long (previously > > RTL expansion, now adjust-aligment) time and your patch makes that > > wrong info last forever (or alternatively cause dynamic stack alignment > > which we do _not_ want to perform here). > I've never looked at the dynamic realignment stuff -- is there a good reason > why > we don't dynamically realign when we've got a local with a requested > alignment? > Isn't that a huge oversight in the whole concept of dynamic realignment? It is quite expensive and long long/double uses are pervasive, so we don't want to realign just because of that. If we do dynamic realignment for other reasons, there is no reason not to align the long long/double properly. Jakub
Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote: > On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote: > > > While technically OK the issue is that it does not solve the x86 issue > > > which with incoming stack alignment < 8 bytes does not perform > > > dynamic re-alignment to align 'long long' variables appropriately. > > > Currently the x86 backend thinks it can fixup by lowering alignment > > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue > > > because the larger alignment is present in the IL for a long (previously > > > RTL expansion, now adjust-aligment) time and your patch makes that > > > wrong info last forever (or alternatively cause dynamic stack alignment > > > which we do _not_ want to perform here). > > I've never looked at the dynamic realignment stuff -- is there a good > > reason why > > we don't dynamically realign when we've got a local with a requested > > alignment? > > Isn't that a huge oversight in the whole concept of dynamic realignment? > > It is quite expensive and long long/double uses are pervasive, so we don't > want to realign just because of that. If we do dynamic realignment for > other reasons, there is no reason not to align the long long/double > properly. Right, but if there's an explicit alignment from the user, don't we need to honor that? Jeff
Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Tue, Jun 09, 2020 at 09:18:25AM -0600, Jeff Law wrote: > On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote: > > On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote: > > > > While technically OK the issue is that it does not solve the x86 issue > > > > which with incoming stack alignment < 8 bytes does not perform > > > > dynamic re-alignment to align 'long long' variables appropriately. > > > > Currently the x86 backend thinks it can fixup by lowering alignment > > > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue > > > > because the larger alignment is present in the IL for a long (previously > > > > RTL expansion, now adjust-aligment) time and your patch makes that > > > > wrong info last forever (or alternatively cause dynamic stack alignment > > > > which we do _not_ want to perform here). > > > I've never looked at the dynamic realignment stuff -- is there a good > > > reason why > > > we don't dynamically realign when we've got a local with a requested > > > alignment? > > > Isn't that a huge oversight in the whole concept of dynamic realignment? > > > > It is quite expensive and long long/double uses are pervasive, so we don't > > want to realign just because of that. If we do dynamic realignment for > > other reasons, there is no reason not to align the long long/double > Right, but if there's an explicit alignment from the user, don't we need to > honor > that? Sure. Do we ignore that? For user alignment we have DECL_USER_ALIGN bit... Jakub
Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue
On Tue, 2020-06-09 at 17:26 +0200, Jakub Jelinek wrote: > On Tue, Jun 09, 2020 at 09:18:25AM -0600, Jeff Law wrote: > > On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote: > > > On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote: > > > > > While technically OK the issue is that it does not solve the x86 issue > > > > > which with incoming stack alignment < 8 bytes does not perform > > > > > dynamic re-alignment to align 'long long' variables appropriately. > > > > > Currently the x86 backend thinks it can fixup by lowering alignment > > > > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue > > > > > because the larger alignment is present in the IL for a long > > > > > (previously > > > > > RTL expansion, now adjust-aligment) time and your patch makes that > > > > > wrong info last forever (or alternatively cause dynamic stack > > > > > alignment > > > > > which we do _not_ want to perform here). > > > > I've never looked at the dynamic realignment stuff -- is there a good > > > > reason why > > > > we don't dynamically realign when we've got a local with a requested > > > > alignment? > > > > Isn't that a huge oversight in the whole concept of dynamic realignment? > > > > > > It is quite expensive and long long/double uses are pervasive, so we don't > > > want to realign just because of that. If we do dynamic realignment for > > > other reasons, there is no reason not to align the long long/double > > Right, but if there's an explicit alignment from the user, don't we need to > > honor > > that? > > Sure. Do we ignore that? For user alignment we have DECL_USER_ALIGN bit... I suspect that's what's going on with the kernel build failure. Jeff
Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)
On 6/9/20 1:43 AM, Richard Biener wrote: On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches wrote: PR 95581 reports an ICE in a call to gimple_call_arg() with an out-of-bounds argument number (2). The attached patch avoids the ICE but I'm not sure it's the right fix. Other than verifying the ICE is gone with a powerpc64 cross-compiler I couldn't come up with a generic test case to exercise this change. The call, synthesized by the vectorizer, is vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8); but the function is declared by the powerpc64 back end to take two arguments: long and const void*. Is it correct for the builtin to be called with fewer arguments than their type indicates? (If it isn't the patch shouldn't be necessary but some other fix would be needed.) I think the backend declaration is wrong, the function only takes a void * argument and returns a long. Thanks. In his comment on the bug Segher (CC'd) points to the internals manual that documents the function: https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/target.def;h=07059a87caf7cc0237d583124b476ee45ea41ed5;hb=HEAD#l1744 (By the way, thanks for the pointer!) If I read it right, the function f in the TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD documentation is __builtin_altivec_mask_for_load. Although the manual isn't unequivocal about this, it does suggest the address addr the function is given as an argument should be the first (and presumably only) argument. This matches the call in the IL where the first argument is a pointer, but not the function's signature. I think the middle end needs to be able to rely on built-in function types when processing calls: i.e., the types and numbers of actual arguments in the calls need to match those of the built-in function type. Otherwise it would have to validate every call against the function signature and avoid treating it as a built-in if it doesn't match. There are parts of the middle end that do that for library built-ins (because they can be declared in a program with mismatched signatures) but as we have seen it's error-prone. I don't think it would be helpful to try to extend this approach to internal built-ins. So I agree that the real problem is the declaration of the built-in. I have no issue with also committing the proposed patch in the meantime, until the back end is fixed, if it helps. But I'll leave that decision to the middle end maintainers. Martin
Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)
On Tue, Jun 9, 2020 at 5:51 PM Martin Sebor wrote: > > On 6/9/20 1:43 AM, Richard Biener wrote: > > On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches > > wrote: > >> > >> PR 95581 reports an ICE in a call to gimple_call_arg() with > >> an out-of-bounds argument number (2). The attached patch avoids > >> the ICE but I'm not sure it's the right fix. Other than verifying > >> the ICE is gone with a powerpc64 cross-compiler I couldn't come up > >> with a generic test cases to exercise this change. > >> > >> The call, synthesized by the vectorizer, is > >> > >> vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8); > >> > >> but the function is declared by the powerpc64 back end to take two > >> arguments: long and const void*. Is it correct for the builtin to > >> be called with fewer arguments than their type indicates? (If it > >> isn't the patch shouldn't be necessary but some other fix would > >> be needed.) > > > > I think the backend declaration is wrong, the function only takes > > a void * argument and returns a long. > > Thanks. In his comment on the bug Segher (CC'd) points to > the internals manual that documents the function: > > https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/target.def;h=07059a87caf7cc0237d583124b476ee45ea41ed5;hb=HEAD#l1744 > > (By the way, thanks for the pointer!) > > If I read it right, ihe function f in > the TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD documentation is > __builtin_altivec_mask_for_load. Although the manual isn't > unequivocal about this but it does suggest the address addr > the function is given as an argument should be the first > (and presumably only) argument. This matches the call in > the IL where the first argument is a pointer, but not > the function's signature. > > I think the middle end needs to be able to rely on built-in > function types when processing calls: i.e., the types and > numbers of actual arguments in the calls need to match those > of the built-in function type. 
Otherwise it would have to > validate every call against the function signature and avoid > treating it as a built-in if it doesn't match. There are > parts of the middle end that do that for library built-ins > (because they can be declared in a program with mismatched > signatures) but as we have seen it's error-prone. I don't > think it would be helpful to try to extend this approach to > internal built-ins. > > So I agree that the real problem is the declaration of > the built-in. > > I have no issue with also committing the proposed patch > in the meantime, until the back end is fixed, if it helps. > But I'll leave that decision to the middle end maintainers. The backend needs to be fixed. Richard. > Martin
[pushed] c++: Tweak predeclare_vla.
We only need to predeclare a VLA type if it's wrapped in a pointer type; otherwise gimplify_type_sizes will handle it. Tested x86_64-pc-linux-gnu, applying to trunk. gcc/cp/ChangeLog: PR c++/95552 * cp-gimplify.c (predeclare_vla): Only predeclare a VLA if it's wrapped in a pointer type. --- gcc/cp/cp-gimplify.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c index e8fbc300fda..0e949e29c5c 100644 --- a/gcc/cp/cp-gimplify.c +++ b/gcc/cp/cp-gimplify.c @@ -1205,7 +1205,8 @@ predeclare_vla (tree expr) return expr; vla = TREE_TYPE (vla); } - if (TYPE_NAME (vla) || !variably_modified_type_p (vla, NULL_TREE)) + if (vla == type || TYPE_NAME (vla) + || !variably_modified_type_p (vla, NULL_TREE)) return expr; tree decl = build_decl (input_location, TYPE_DECL, NULL_TREE, vla); base-commit: 009668e31f4ee910eae874b24afb8eb6adf65fae -- 2.18.1
Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]
On 6/9/20 8:41 AM, Eric Botcazou wrote: Yes, but the problem is that remap_decl isn't getting called. Right, I can get it to be called by adding a pushdecl to grokdeclarator... Attaching it to the BIND_EXPR doesn't help walk_tree_1 do the right thing with the DECL_EXPR. ... but, indeed, this still ICEs. So the key is that the DECL_EXPR_DECL of the copied DECL_EXPR points to the remapped TYPE_DECL before the type is copied? If so, then your original patch is probably the way to go, but the comment would need to be slightly adjusted. Like this? In Ada, where we attach the TYPE_DECL to the BIND_EXPR, this will mean that remap_decl is invoked 3 times per TYPE_DECL: first twice from copy_bind_expr and then once again for the DECL_EXPR. But probably no big deal in the end. Yes, we want to remap every occurrence of the decl so they all get replaced. Jason commit 7b7c1b07dc32cb3cb6dc9b97d516a7240c825cf9 Author: Jason Merrill Date: Fri Jun 5 16:36:27 2020 -0400 tree-inline: Fix VLA handling [PR95552] The problem in this testcase comes from cloning the constructor into complete and base variants. When we clone the body the first time, walk_tree_1 calls copy_tree_body_r on the type of the artificial TYPE_DECL we made for the VLA type without calling it on the decl itself, so we overwrite the type of the TYPE_DECL without copying the decl first. This has been broken since we started inserting a TYPE_DECL for anonymous VLAs in r7-457. This patch fixes walk_tree_1 to call the function on the TYPE_DECL, as we do for other decls of a DECL_EXPR. gcc/ChangeLog: PR c++/95552 * tree.c (walk_tree_1): Call func on the TYPE_DECL of a DECL_EXPR. gcc/testsuite/ChangeLog: PR c++/95552 * g++.dg/ext/vla23.C: New test. diff --git a/gcc/testsuite/g++.dg/ext/vla23.C b/gcc/testsuite/g++.dg/ext/vla23.C new file mode 100644 index 000..317a824b2f3 --- /dev/null +++ b/gcc/testsuite/g++.dg/ext/vla23.C @@ -0,0 +1,14 @@ +// PR c++/95552 +// Test for VLA and cloned constructor. 
+// { dg-additional-options -Wno-vla } +// { dg-require-effective-target alloca } + +struct VB { }; +struct ViewDom: virtual VB +{ + ViewDom(int i) { char (*a)[i]; } +}; +void element( ) +{ + ViewDom a(2); +} diff --git a/gcc/tree.c b/gcc/tree.c index 7197b4720ce..805f669a945 100644 --- a/gcc/tree.c +++ b/gcc/tree.c @@ -12212,6 +12212,12 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data, Note that DECLs get walked as part of processing the BIND_EXPR. */ if (TREE_CODE (DECL_EXPR_DECL (*tp)) == TYPE_DECL) { + /* Call the function for the decl so e.g. copy_tree_body_r can + replace it with the remapped one. */ + result = (*func) (&DECL_EXPR_DECL (*tp), &walk_subtrees, data); + if (result || !walk_subtrees) + return result; + tree *type_p = &TREE_TYPE (DECL_EXPR_DECL (*tp)); if (TREE_CODE (*type_p) == ERROR_MARK) return NULL_TREE;
Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare
On 09/06/20 00:02 +0100, Jonathan Wakely wrote: On 08/06/20 22:08 +0100, Jonathan Wakely wrote: On 08/06/20 19:20 +0100, Jonathan Wakely wrote: On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote: Hi    Here is the last of my algo patches this time to extend the memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.    To do so I had to return int in implementation details of lexicographical_compare to make sure we can treat a deque iterator range by chunk in a performant way. Here's a simpler version, which doesn't alter anything for the existing code (i.e. all iterators that aren't deque iterators) and also fixes the infinite loop bug. This seems simpler and less invasive. Please take a look. I've actually tested it in debug mode now, and ... diff --git a/libstdc++-v3/include/debug/safe_iterator.tcc b/libstdc++-v3/include/debug/safe_iterator.tcc index 888ac803ae5..db6a301655f 100644 --- a/libstdc++-v3/include/debug/safe_iterator.tcc +++ b/libstdc++-v3/include/debug/safe_iterator.tcc @@ -470,6 +470,80 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return __equal_aux1(__first1, __last1, __first2); } + template +int This should return bool here. +__lexicographical_compare_aux( + const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __first1, + const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __last1, + _II2 __first2, _II2 __last2) +{ + typename ::__gnu_debug::_Distance_traits<_Ite1>::__type __dist1; + __glibcxx_check_valid_range2(__first1, __last1, __dist1); + __glibcxx_check_valid_range(__first2, __last2); + + if (__dist1.second > ::__gnu_debug::__dp_equality) + return std::__lexicographical_compare_aux(__first1.base(), + __last1.base(), + __first2, __last2); + return std::__lexicographical_compare_aux1(__first1, __last1, +__first2, __last2); +} + + template +int And here. 
+__lexicographical_compare_aux( + _II1 __first1, _II1 __last1, + const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __first2, + const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __last2) +{ + __glibcxx_check_valid_range(__first1, __last1); + typename ::__gnu_debug::_Distance_traits<_II1>::__type __dist2; + __glibcxx_check_valid_range2(__first2, __last2, __dist2); + + if (__dist2.second > ::__gnu_debug::__dp_equality) + return std::__lexicographical_compare_aux(__first1, __last1, + __first2.base(), + __last2.base()); + return std::__lexicographical_compare_aux1(__first1, __last1, +__first2, __last2); +} + + template +int And here. And I've also enhanced the tests so they catch the bug I created where the __lexicographical_compare_aux1 overload taking two ranges of deque iterators was still trying to return a three-way result, rather than bool. Corrected patch attached. But, the 25_algorithms/lexicographical_compare/deque_iterators/1.cc test doesn't actually test the new code properly, because it only uses deque which means memcmp isn't even used. We need better tests.
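The bool-vs-int distinction flagged in the review above comes from how the memcmp optimization works: internally a three-way memcmp result is computed over the common prefix, but the public-facing std::lexicographical_compare must collapse that to a bool, with ties broken by range length. A minimal sketch of the idea for unsigned char ranges (the function name is ours, purely illustrative — this is not the libstdc++ implementation):

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Reduce lexicographical comparison of two unsigned-char ranges to one
// memcmp over the common prefix, then break ties by length.
bool lex_compare_memcmp(const unsigned char* f1, const unsigned char* l1,
                        const unsigned char* f2, const unsigned char* l2)
{
  const std::size_t n1 = l1 - f1;
  const std::size_t n2 = l2 - f2;
  const std::size_t n  = std::min(n1, n2);
  // memcmp yields a three-way result; convert to the bool that
  // std::lexicographical_compare must return.
  const int c = n ? std::memcmp(f1, f2, n) : 0;
  return c < 0 || (c == 0 && n1 < n2);
}
```

Returning the raw three-way `c` from an overload that callers expect to be bool is exactly the bug the enhanced tests were written to catch.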
Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare
On 09/06/20 17:11 +0100, Jonathan Wakely wrote: On 09/06/20 00:02 +0100, Jonathan Wakely wrote: On 08/06/20 22:08 +0100, Jonathan Wakely wrote: On 08/06/20 19:20 +0100, Jonathan Wakely wrote: On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote: Hi    Here is the last of my algo patches this time to extend the memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.    To do so I had to return int in implementation details of lexicographical_compare to make sure we can treat a deque iterator range by chunk in a performant way. Here's a simpler version, which doesn't alter anything for the existing code (i.e. all iterators that aren't deque iterators) and also fixes the infinite loop bug. This seems simpler and less invasive. Please take a look. I've actually tested it in debug mode now, and ... diff --git a/libstdc++-v3/include/debug/safe_iterator.tcc b/libstdc++-v3/include/debug/safe_iterator.tcc index 888ac803ae5..db6a301655f 100644 --- a/libstdc++-v3/include/debug/safe_iterator.tcc +++ b/libstdc++-v3/include/debug/safe_iterator.tcc @@ -470,6 +470,80 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION return __equal_aux1(__first1, __last1, __first2); } + template +int This should return bool here. +__lexicographical_compare_aux( + const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __first1, + const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __last1, + _II2 __first2, _II2 __last2) +{ + typename ::__gnu_debug::_Distance_traits<_Ite1>::__type __dist1; + __glibcxx_check_valid_range2(__first1, __last1, __dist1); + __glibcxx_check_valid_range(__first2, __last2); + + if (__dist1.second > ::__gnu_debug::__dp_equality) + return std::__lexicographical_compare_aux(__first1.base(), + __last1.base(), + __first2, __last2); + return std::__lexicographical_compare_aux1(__first1, __last1, +__first2, __last2); +} + + template +int And here. 
+__lexicographical_compare_aux( + _II1 __first1, _II1 __last1, + const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __first2, + const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __last2) +{ + __glibcxx_check_valid_range(__first1, __last1); + typename ::__gnu_debug::_Distance_traits<_II1>::__type __dist2; + __glibcxx_check_valid_range2(__first2, __last2, __dist2); + + if (__dist2.second > ::__gnu_debug::__dp_equality) + return std::__lexicographical_compare_aux(__first1, __last1, + __first2.base(), + __last2.base()); + return std::__lexicographical_compare_aux1(__first1, __last1, +__first2, __last2); +} + + template +int And here. And I've also enhanced the tests so they catch the bug I created where the __lexicographical_compare_aux1 overload taking two ranges of deque iterators was still trying to return a three-way result, rather than bool. Corrected patch attached. *Really* attached this time. But, the 25_algorithms/lexicographical_compare/deque_iterators/1.cc test doesn't actually test the new code properly, because it only uses deque which means memcmp isn't even used. We need better tests. commit 4d501ba840194a28799cb83bcd8aaecd0eec5304 Author: François Dumont Date: Tue Jun 9 00:01:51 2020 +0100 libstdc++: Extend memcmp optimization in std::lexicographical_compare Make the memcmp optimization work for std::deque iterators and safe iterators. Co-authored-by: Jonathan Wakely libstdc++-v3/ChangeLog: 2020-06-08 François Dumont Jonathan Wakely * include/bits/deque.tcc (__lex_cmp_dit): New. (__lexicographical_compare_aux1): Define overloads for deque iterators. * include/bits/stl_algobase.h (__lexicographical_compare::__3way): New static member function. (__lexicographical_compare::__3way): Likewise. (__lexicographical_compare::__lc): Use __3way. (__lexicographical_compare_aux): Rename to __lexicographical_compare_aux1 and declare overloads for deque iterators.
(__lexicographical_compare_aux): Define new forwarding function that calls __lexicographical_compare_aux1 and declare new overloads for safe iterators. (lexicographical_compare): Do not use __niter_base on parameters. * include/debug/safe_iterator.tcc (__lexicographical_compare_aux): Define overloads for safe iterators. * testsuite/25_algorithms/lexicographical_compare/1.cc: Add checks with random access iterators.
[PATCH] Remove dead code
This removes dead code left over from the reduction vectorization refactoring last year. Bootstrapped on x86_64-unknown-linux-gnu, testing in progress. Richard. 2020-06-09 Richard Biener * tree-vect-loop.c (vectorizable_induction): Remove dead code. --- gcc/tree-vect-loop.c | 42 -- 1 file changed, 42 deletions(-) diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c index f2c52ae1909..5329982e4c9 100644 --- a/gcc/tree-vect-loop.c +++ b/gcc/tree-vect-loop.c @@ -7373,11 +7373,6 @@ vectorizable_induction (loop_vec_info loop_vinfo, unsigned i; tree expr; gimple_seq stmts; - imm_use_iterator imm_iter; - use_operand_p use_p; - gimple *exit_phi; - edge latch_e; - tree loop_arg; gimple_stmt_iterator si; gphi *phi = dyn_cast (stmt_info->stmt); @@ -7485,9 +7480,6 @@ vectorizable_induction (loop_vec_info loop_vinfo, if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "transform induction phi.\n"); - latch_e = loop_latch_edge (iv_loop); - loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e); - step_expr = STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_info); gcc_assert (step_expr != NULL_TREE); tree step_vectype = get_same_sized_vectype (TREE_TYPE (step_expr), vectype); @@ -7872,40 +7864,6 @@ vectorizable_induction (loop_vec_info loop_vinfo, } } - if (nested_in_vect_loop) -{ - /* Find the loop-closed exit-phi of the induction, and record - the final vector of induction results: */ - exit_phi = NULL; - FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg) -{ - gimple *use_stmt = USE_STMT (use_p); - if (is_gimple_debug (use_stmt)) - continue; - - if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (use_stmt))) - { - exit_phi = use_stmt; - break; - } -} - if (exit_phi) - { - stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (exit_phi); - /* FORNOW. Currently not supporting the case that an inner-loop induction -is not used in the outer-loop (i.e. only outside the outer-loop). 
*/ - gcc_assert (STMT_VINFO_RELEVANT_P (stmt_vinfo) - && !STMT_VINFO_LIVE_P (stmt_vinfo)); - - STMT_VINFO_VEC_STMT (stmt_vinfo) = new_stmt_info; - if (dump_enabled_p ()) - dump_printf_loc (MSG_NOTE, vect_location, -"vector of inductions after inner-loop:%G", -new_stmt); - } -} - - if (dump_enabled_p ()) dump_printf_loc (MSG_NOTE, vect_location, "transform induction: created def-use cycle: %G%G", -- 2.25.1
Re: [PATCH] libsanitizer: use gnu++14
On Tue, Jun 9, 2020 at 10:09 AM Martin Liška wrote: > > On 6/8/20 4:53 PM, Martin Liška wrote: > > Hi. > > > > Thank you for the report. It's caused by the fact that LLVM switched in > > 4d474e078ac7 > > to C++14, so I suggest using gnu++14. > > > > The patch bootstraps on x86_64-linux-gnu and survives regression tests. > > I also verified that abidiff is equal for all libsanitizer shared libraries. > > I'm going to install the patch if there are no objections. > > > > Thanks, > > Martin > > Installed as 942a384ef9f. > @Andreas: Can you please check the riscv64 build? Note we need to document (and configure-test?) the build requirement for non-bootstrap and asan/ubsan bootstraps. For now we only document the requirement of a C++11 host compiler. Also, I'm not sure whether using -std=gnu++1y would allow more released compilers to build the code? For example, GCC 4.8.5 knows -std=gnu++1y but not -std=gnu++14, and GCC 4.8.3 is required for bootstrap anyway. Richard. > Martin
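The host-compiler requirement being discussed bites on constructs like the following, which are C++14 features (this is a generic illustration, not code from libsanitizer): both need at least -std=c++1y / -std=gnu++1y, which is the only spelling GCC 4.8 understands.

```cpp
// Return type deduction for ordinary functions is a C++14 feature.
auto twice(int x) { return 2 * x; }

// Generic lambdas (auto parameters) are also C++14.
auto add = [](auto a, auto b) { return a + b; };
```

A host compiler limited to -std=gnu++11 rejects both, which is why the requirement needs documenting (and ideally a configure check).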
PING^1 [PATCH 0/2] x86: Add cmpmemsi for -minline-all-stringops
On Sun, May 31, 2020 at 4:10 PM H.J. Lu wrote: > > We used to expand memcmp to "repz cmpsb" via cmpstrnsi. It was changed > by > > commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f > Author: Nick Clifton > Date: Fri Aug 12 16:26:11 2011 + > > builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern. > > * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi > pattern. > * doc/md.texi (cmpstrn): Note that the comparison stops if both > fetched bytes are zero. > (cmpstr): Likewise. > (cmpmem): Note that the comparison does not stop if both of the > fetched bytes are zero. > > Duplicate the cmpstrn pattern for cmpmem and expand cmpmem to "repz cmpsb" > for -minline-all-stringops. The only difference is that the length > argument of cmpmem is guaranteed to be less than or equal to lengths of > 2 memory areas. Since cmpmem expander may pass the actual string length > directly to cmpstrnqi patterns. Pass a copy of the string length to > cmpstrnqi patterns to avoid changing the actual string length by cmpstrnqi > patterns. > > Tested on Linux/x86 and Linux/x86-64. OK for master? > > H.J. 
Lu (2): > x86: Pass a copy of the string length to cmpstrnqi > x86: Add cmpmemsi for -minline-all-stringops > > gcc/config/i386/i386-expand.c | 84 ++ > gcc/config/i386/i386-protos.h | 1 + > gcc/config/i386/i386.md | 80 - > gcc/testsuite/gcc.target/i386/pr95151-1.c | 17 +++ > gcc/testsuite/gcc.target/i386/pr95151-2.c | 10 ++ > gcc/testsuite/gcc.target/i386/pr95151-3.c | 18 +++ > gcc/testsuite/gcc.target/i386/pr95151-4.c | 11 ++ > gcc/testsuite/gcc.target/i386/pr95443-1.c | 130 ++ > gcc/testsuite/gcc.target/i386/pr95443-2.c | 79 + > 9 files changed, 371 insertions(+), 59 deletions(-) > create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-2.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-3.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-4.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr95443-1.c > create mode 100644 gcc/testsuite/gcc.target/i386/pr95443-2.c > PING: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546920.html https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html -- H.J.
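The md.texi note quoted in the commit above is the whole reason cmpmem and cmpstrn are separate named patterns: a strncmp-style comparison may stop when both fetched bytes are zero, while a memcmp-style comparison must examine every byte of the given length. A small sketch of the observable difference (the helper's name is ours, purely for illustration):

```cpp
#include <cstddef>
#include <cstring>

// Returns true when a strncmp-style compare calls the buffers equal but a
// memcmp-style compare does not — i.e. when a shared embedded NUL hides a
// later difference from strncmp.
bool nul_hides_difference(const char* a, const char* b, std::size_t n)
{
  bool str_eq = std::strncmp(a, b, n) == 0; // stops at a shared NUL
  bool mem_eq = std::memcmp(a, b, n) == 0;  // always compares n bytes
  return str_eq && !mem_eq;
}
```

This is why expanding memcmp via the cmpstrn pattern was dropped in 2011, and why reinstating "repz cmpsb" for memcmp needs a dedicated cmpmem expander rather than reusing cmpstrnsi directly.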
PING^2: [PATCH] x86: Add UNSPECV_PATCHABLE_AREA
On Fri, May 22, 2020 at 4:22 AM H.J. Lu wrote: > > On Sat, May 2, 2020 at 4:55 AM H.J. Lu wrote: > > > > Currently patchable area is at the wrong place. It is placed immediately > > after function label, before both .cfi_startproc and ENDBR. This patch > > adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and > > changes ENDBR insertion pass to also insert patchable area instruction. > > TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing > > patchable area before .cfi_startproc and ENDBR. > > > > OK for master? > > > > Thanks. > > > > H.J. > > --- > > gcc/ > > > > PR target/93492 > > * config/i386/i386-features.c (rest_of_insert_endbranch): > > Renamed to ... > > (rest_of_insert_endbr_and_patchable_area): Change return type > > to void. Add need_endbr and patchable_area_size arguments. > > Don't call timevar_push nor timevar_pop. Replace > > endbr_queued_at_entrance with insn_queued_at_entrance. Insert > > UNSPECV_PATCHABLE_AREA for patchable area. > > (pass_data_insert_endbranch): Renamed to ... > > (pass_data_insert_endbr_and_patchable_area): This. Change > > pass name to endbr_and_patchable_area. > > (pass_insert_endbranch): Renamed to ... > > (pass_insert_endbr_and_patchable_area): This. Add need_endbr > > and patchable_area_size;. > > (pass_insert_endbr_and_patchable_area::gate): Set and check > > need_endbr and patchable_area_size. > > (pass_insert_endbr_and_patchable_area::execute): Call > > timevar_push and timevar_pop. Pass need_endbr and > > patchable_area_size to rest_of_insert_endbr_and_patchable_area. > > (make_pass_insert_endbranch): Renamed to ... > > (make_pass_insert_endbr_and_patchable_area): This. > > * config/i386/i386-passes.def: Replace pass_insert_endbranch > > with pass_insert_endbr_and_patchable_area. > > * config/i386/i386-protos.h (ix86_output_patchable_area): New. > > (make_pass_insert_endbranch): Renamed to ... > > (make_pass_insert_endbr_and_patchable_area): This. 
> > * config/i386/i386.c (ix86_asm_output_function_label): Set > > function_label_emitted to true. > > (ix86_print_patchable_function_entry): New function. > > (ix86_output_patchable_area): Likewise. > > (x86_function_profiler): Replace endbr_queued_at_entrance with > > insn_queued_at_entrance. Generate ENDBR only for TYPE_ENDBR. > > Call ix86_output_patchable_area to generate patchable area if > > needed. > > (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New. > > * i386.h (queued_insn_type): New. > > (machine_function): Add function_label_emitted. Replace > > endbr_queued_at_entrance with insn_queued_at_entrance. > > * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New. > > (patchable_area): New. > > > > gcc/testsuite/ > > > > PR target/93492 > > * gcc.target/i386/pr93492-1.c: New test. > > * gcc.target/i386/pr93492-2.c: Likewise. > > * gcc.target/i386/pr93492-3.c: Likewise. > > * gcc.target/i386/pr93492-4.c: Likewise. > > * gcc.target/i386/pr93492-5.c: Likewise. > > PING: > > https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545021.html PING. -- H.J.
PING^1: V5 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386
On Tue, May 26, 2020 at 6:27 AM Martin Liška wrote: > > On 5/26/20 1:59 PM, H.J. Lu wrote: > > On Tue, May 26, 2020 at 2:30 AM Martin Liška wrote: > >> > >> On 5/25/20 7:42 PM, H.J. Lu wrote: > >>> Here is the updated patch. OK for master? > >> > >> Thank you for the updated patch. > >> > >> I have still few nits: > >> > >> 1) I would make all the: > >> > >>> + has_sse3 = has_feature (FEATURE_SSE3); > >> > >> a macro. The local variable seems to superfluous. > > > > Done. > > Thanks! > > > > >> 2) can we automatically deduce option name: > >> > >>> + ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, "-mrdpid") > >>> + ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, "-mrdrnd") > >> > >> I mean "-m" + "rdrnd" == "-mrdrnd" ? > > > > The new option field serves 2 purposes: > > > > 1. Not all features have a corresponding command-line option > > > > ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL) > > > > for (i = 0; i < ARRAY_SIZE (isa_names_table); i++) > > if (isa_names_table[i].option) > > > > 2. Some feature has a different name in the command-line option. > > > >ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, "-mfxsr") > > I noticed that, one can theoretically use "" for an option that does not > have a flag. And NULL for these which have option equal to "-m" + name. > Anyway, that's a nit. > > I support the patch! > Martin > > > > > Here is the updated patch. OK for master? > > > > Thanks. > > > PING: https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html -- H.J.
Ping: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]
May I please ping this patch? https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545426.html Thanks! -Lewis On Fri, May 08, 2020 at 03:35:25PM -0400, Lewis Hyatt wrote: > On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote: > > On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote: > > > Hello- > > > > > > Here is the second patch that I mentioned when I submitted the other > > > related > > > patch (which is awaiting review): > > > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. > > > > Sorry about that; I'm v. busy with analyzer bugs right now. > > > > > This second patch > > > is based on top of the first one and it closes out PR49973 and > > > PR86904 by > > > adding the new option -fdiagnostics-column-unit=[display|byte]. This > > > allows > > > to specify whether columns are output as simple byte counts (the > > > current > > > behavior), or as display columns including handling multibyte > > > characters and > > > tabs. The patch makes display columns the new default. Additionally, > > > a > > > second new option -fdiagnostics-column-origin is added, which allows > > > to make > > > the column 0-based (or N-based for any N) instead of 1-based. The > > > default > > > remains at 1-based as it is now. > > > > > > A number of testcases were explicitly testing for the old behavior, > > > so I > > > have updated them to test for the new behavior instead, since the > > > column > > > number adjusted for tabs is more natural to test for, and matches > > > what > > > editors typically show (give or take 1 for the origin convention). > > > > > > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It > > > failed > > > after this patch, although it doesn't test for any column numbers. > > > The > > > answer turned out to be, this test checks for identical error text on > > > two > > > different lines. 
When the column units are changed to display > > > columns, then > > > the column of the second error happens to match the line of the first > > > one. dejagnu then misinterprets the second error as if it matched the > > > location of the first one (it doesn't distinguish whether it checks > > > for the > > > line number or the column number in the output). I added a comment to > > > the > > > test explaining the situation; since adding the comment has the side > > > effect > > > of making the first line number no longer match the second column > > > number, it > > > also makes the test pass again. > > > > > > It wasn't quite clear to me whether this change was appropriate for > > > GCC 10 > > > or not at this point. We discussed it a couple months ago here: > > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way, > > > I hope > > > it isn't a problem that I submitted the patch for review now, whether > > > it > > > will end up in 10 or 11. Please let me know what's normally expected? > > > Thanks! > > > > Thanks Lewis. > > > > This patch looks very promising, but should wait until gcc 11; we're > > trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug- > > fixing, so I don't want to add any more diagnostics changes). > > > > Hi Dave- > > Well GCC 10 was released for a whole day so I thought I would bug you with > this > patch again now :). To summarize, I previously sent this in two separate > parts. > > Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html > Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html > > Part 1 added the support for converting tabs to spaces when outputting > diagnostics. Part 2 added the new options -fdiagnostics-column-unit and > -fdiagnostics-column-origin to control whether the column number is printed > in display or byte units. Together they resolve both PR49973 and PR86904. 
> > You provided me with feedback on part 2, which is quoted below with some > notes interspersed. The new version of the patch incorporates all of your > suggestions. Part 1 has not changed other than some trivial rebasing > conflicts. The two patches touch nearly disjoint sets of files and are > logically linked together, so I thought it would be simpler if I just sent > one combined patch now. If you prefer them to be separated as before, please > let me know and I can send them that way as well. > > Bootstrap and reg tests were done on x86-64 Linux for all languages. Tests > look good: > > type, before, after > FAIL 96 96 > PASS 474637 475097 > UNSUPPORTED 11607 11607 > UNTESTED 195 195 > XFAIL 1816 1816 > XPASS 36 36 > > > > > > gcc/ChangeLog: > > > > > > 2020-01-31 Lewis Hyatt > > > > > > > Please reference the PRs here > > > > [...] > > > > > gcc/testsuite/ChangeLog: > > > > > > 2020-01-31 Lewis Hyatt > > > > Likewise here. > > > > [...] > > > > Done. > > > > diff --git a/gcc/common.opt b/gcc/common.opt > > > index 630c380bd6a..657985450c2 100644 > > > --- a/gcc/common.opt > > > +++ b/gcc/common.opt > > > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(a
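The display-column vs byte-column distinction the patch introduces can be sketched as follows. This is an illustrative model only, not GCC's implementation: it assumes a fixed tab stop of 8, uses 0-based columns (the origin offset of -fdiagnostics-column-origin would be added on top), and ignores the multibyte/wide-character handling the real patch also covers.

```cpp
#include <cstddef>
#include <string>

// Convert a 0-based byte offset into a source line to a 0-based display
// column, expanding each tab to the next tab stop.
std::size_t display_column(const std::string& line, std::size_t byte_col,
                           std::size_t tab_width = 8)
{
  std::size_t col = 0;
  for (std::size_t i = 0; i < byte_col && i < line.size(); ++i)
    {
      if (line[i] == '\t')
        col += tab_width - (col % tab_width); // advance to next tab stop
      else
        ++col;
    }
  return col;
}
```

For a line starting with a tab, byte column 1 maps to display column 8 — the kind of discrepancy that made byte-based diagnostics disagree with what editors show.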
[committed] d: Merge upstream dmd 13d67c575.
Hi, This patch merges the D front-end implementation with upstream dmd 13d67c575. Contains many small API changes, most contained within the front-end, but some spill out into the codegen. Bootstrapped and regression tested on x86_64-linux-gnu, and committed to mainline. Regards Iain. gcc/d/ChangeLog: * dmd/MERGE: Merge upstream dmd 13d67c575. * d-builtins.cc (build_frontend_type): Update call to TypeVector::create. * d-frontend.cc (Global::_init): Move setting of errorLimit to ... * d-lang.cc (d_init_options): ... here. Update for new field location of errorLimit. (d_post_options): Likewise. * d-port.cc (Port::readwordLE): Update signature. (Port::readwordBE): Likewise. (Port::readlongLE): Likewise. (Port::readlongBE): Likewise. * decl.cc (get_symbol_decl): Update for new field types. --- gcc/d/d-builtins.cc | 2 +- gcc/d/d-frontend.cc | 1 - gcc/d/d-lang.cc | 3 +- gcc/d/d-port.cc | 16 +- gcc/d/decl.cc | 8 +- gcc/d/dmd/MERGE | 2 +- gcc/d/dmd/access.c | 2 + gcc/d/dmd/aggregate.h | 1 + gcc/d/dmd/ast_node.h| 20 ++ gcc/d/dmd/cond.h| 5 +- gcc/d/dmd/ctfeexpr.c| 4 +- gcc/d/dmd/dclass.c | 10 +- gcc/d/dmd/declaration.h | 2 +- gcc/d/dmd/dmangle.c | 8 +- gcc/d/dmd/dsymbol.h | 6 +- gcc/d/dmd/expression.c | 518 gcc/d/dmd/expression.h | 109 - gcc/d/dmd/globals.h | 5 +- gcc/d/dmd/init.h| 6 +- gcc/d/dmd/mtype.c | 104 +++- gcc/d/dmd/mtype.h | 32 ++- gcc/d/dmd/root/port.h | 8 +- gcc/d/dmd/statement.h | 5 +- gcc/d/dmd/template.h| 4 +- gcc/d/dmd/utf.c | 19 +- 25 files changed, 843 insertions(+), 57 deletions(-) create mode 100644 gcc/d/dmd/ast_node.h diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc index 33221ea3229..91e3173e670 100644 --- a/gcc/d/d-builtins.cc +++ b/gcc/d/d-builtins.cc @@ -211,7 +211,7 @@ build_frontend_type (tree type) if (dtype->nextOf ()->isTypeBasic () == NULL) break; - dtype = (TypeVector::create (Loc (), dtype))->addMod (mod); + dtype = (TypeVector::create (dtype))->addMod (mod); builtin_converted_decls.safe_push (builtin_data (dtype, type)); return dtype; 
} diff --git a/gcc/d/d-frontend.cc b/gcc/d/d-frontend.cc index 90cf74a0f61..5415d471ef4 100644 --- a/gcc/d/d-frontend.cc +++ b/gcc/d/d-frontend.cc @@ -55,7 +55,6 @@ Global::_init (void) ; this->stdmsg = stderr; - this->errorLimit = flag_max_errors; } /* Start gagging. Return the current number of gagged errors. */ diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc index 2bc0def02c3..badd67f5a8f 100644 --- a/gcc/d/d-lang.cc +++ b/gcc/d/d-lang.cc @@ -293,6 +293,7 @@ d_init_options (unsigned int, cl_decoded_option *decoded_options) global.params.hdrStripPlainFunctions = true; global.params.betterC = false; global.params.allInst = false; + global.params.errorLimit = flag_max_errors; /* Default extern(C++) mangling to C++14. */ global.params.cplusplus = CppStdRevisionCpp14; @@ -793,7 +794,7 @@ d_post_options (const char ** fn) /* Make -fmax-errors visible to frontend's diagnostic machinery. */ if (global_options_set.x_flag_max_errors) -global.errorLimit = flag_max_errors; +global.params.errorLimit = flag_max_errors; if (flag_excess_precision == EXCESS_PRECISION_DEFAULT) flag_excess_precision = EXCESS_PRECISION_STANDARD; diff --git a/gcc/d/d-port.cc b/gcc/d/d-port.cc index f129fd8843d..d49bb1ba8f3 100644 --- a/gcc/d/d-port.cc +++ b/gcc/d/d-port.cc @@ -97,9 +97,9 @@ Port::isFloat64LiteralOutOfRange (const char *buffer) /* Fetch a little-endian 16-bit value from BUFFER. */ unsigned -Port::readwordLE (void *buffer) +Port::readwordLE (const void *buffer) { - unsigned char *p = (unsigned char*) buffer; + const unsigned char *p = (const unsigned char*) buffer; return ((unsigned) p[1] << 8) | (unsigned) p[0]; } @@ -107,9 +107,9 @@ Port::readwordLE (void *buffer) /* Fetch a big-endian 16-bit value from BUFFER. 
*/ unsigned -Port::readwordBE (void *buffer) +Port::readwordBE (const void *buffer) { - unsigned char *p = (unsigned char*) buffer; + const unsigned char *p = (const unsigned char*) buffer; return ((unsigned) p[0] << 8) | (unsigned) p[1]; } @@ -117,9 +117,9 @@ Port::readwordBE (void *buffer) /* Fetch a little-endian 32-bit value from BUFFER. */ unsigned -Port::readlongLE (void *buffer) +Port::readlongLE (const void *buffer) { - unsigned char *p = (unsigned char*) buffer; + const unsigned char *p = (const unsigned char*) buffer; return (((unsigned) p[3] << 24) | ((unsigned) p[2] << 16) @@ -130,9 +130,9 @@ Port::readlongLE (void *buffer) /* Fetch a big-endian 32-bit value from BUFFER. */ unsigned -Port::readlongBE (void *buffer) +Port::readlongBE (const void *
Re: [PATCH] fortran/95509 - fix spellcheck-operator.f90 regression
Hi Bernhard, >> For Fortran identifiers, that would be really good. Would you do that? > Is it really needed? We do not enter internal helper names (class names, anything artificial, anything leading underscore or leading uppercase) to the suggestions proposals so should never actually recommend any of those? You're quite right, this is not needed. Thanks for pointing this out :-) Regards Thomas
Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]
> Like this? Fine with me, thanks! -- Eric Botcazou