Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Richard Biener via Gcc-patches
On Mon, Jun 8, 2020 at 11:13 PM Jeff Law via Gcc-patches
 wrote:
>
>
> We're seeing some failures due to the local-alignment pass.  For example on 
> sh4:
>
> Tests that now fail, but worked before (16 tests):
>
> gcc.dg/pr48335-1.c (test for excess errors)
> gcc.dg/pr48335-2.c (test for excess errors)
> gcc.dg/pr48335-5.c (test for excess errors)
> gcc.dg/pr48335-6.c (test for excess errors)
> gcc.dg/torture/pr48493.c   -O0  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O0  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O1  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O1  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O2  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O2  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O2 -flto -fno-use-linker-plugin -flto-
> partition=none  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O3 -g  (test for excess errors)
> gcc.dg/torture/pr48493.c   -O3 -g  (test for excess errors)
> gcc.dg/torture/pr48493.c   -Os  (test for excess errors)
> gcc.dg/torture/pr48493.c   -Os  (test for excess errors)
>
> Or on x86_64:
>
> during GIMPLE pass: adjust_alignment
> /home/jenkins/linux/arch/x86/boot/string.c: In function ‘simple_strtoull’:
> /home/jenkins/linux/arch/x86/boot/string.c:121:20: internal compiler error: in
> execute, at adjust-alignment.c:74
>   121 | unsigned long long simple_strtoull(const char *cp, char **endp, 
> unsigned
> int base)
>   |^~~
> 0x79c5b3 execute
> ../../../../../gcc/gcc/adjust-alignment.c:74
> Please submit a full bug report,
>
> There may be others, I haven't searched the failing targets extensively for 
> this
> issue.
>
> AFAICT the adjust-alignment pass is supposed to increase alignments of locals
> when needed.  It has an assert to ensure that alignments are only increased.
>
> However, if the object was declared with an alignment attribute that is larger
> than what LOCAL_ALIGNMENT would produce for the object, then the 
> adjust-alignment
> pass trips the assert.
>
> Rather than asserting, just ignoring such objects seems better.  But I'm not
> intimately familiar with the issues here.
>
> Bootstrapped and regression tested on x86_64, also verified this fixes the sh4
> issue.  All the *-elf targets have also built/tested with this patch with no
> regressions.
>
> OK for the trunk?

While technically OK the issue is that it does not solve the x86 issue
which with incoming stack alignment < 8 bytes does not perform
dynamic re-alignment to align 'long long' variables appropriately.
Currently the x86 backend thinks it can fixup by lowering alignment
via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
because the larger alignment is present in the IL for a long (previously
RTL expansion, now adjust-aligment) time and your patch makes that
wrong info last forever (or alternatively cause dynamic stack alignment
which we do _not_ want to perform here).

So I prefer to wait for a proper x86 solution before cutting the ICEs.
(did you verify effects on code generation of your patch?)

Thanks,
Richard.

>
> Jeff
>
>


Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin

2020-06-09 Thread Iain Sandoe via Gcc-patches

Hi Alexandre

Alexandre Oliva  wrote:



When the linker plugin is not available, dump outputs in lto runs are
different from those outputs.exp tests expect, so skip them for now.


Targets (like Darwin too) that don’t have the linker plugin (presumably)
use fat LTO objects and drive the LTO process using collect2 (Darwin
and Solaris seem to be two at least that do this)

That means that the intermediate objects proceed all the way to .s
output - and thus the ‘final’ pass is run (producing the extra files seen).

You can mimic this with x86 Linux by appending -ffat-lto-objects to an
LTO command line.

I have an ugly patch that makes this work for Darwin (essentially, by
having two versions of the LTO tests - one for targets with fat LTO
and one for thin) since it seems that one has to apply all the files
expected to the list - [this comment is also on PR95577].

Perhaps, as the author, you can see a more elegant way to implement
this - and I was wondering (as an aside) if the -ffat-lto-objects case should
be tested on targets which default to thin LTO anyway?

cheers
Iain


Regstrapped on x86_64-linux-gnu.  Ok to install?


for  gcc/testsuite/ChangeLog

* outputs.exp (skip_lto): Set when missing the linker plugin.
---
testsuite/gcc.misc-tests/outputs.exp |3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git gcc/testsuite/gcc.misc-tests/outputs.exp  
gcc/testsuite/gcc.misc-tests/outputs.exp

index 06a32db..a47f8a6 100644
--- gcc/testsuite/gcc.misc-tests/outputs.exp
+++ gcc/testsuite/gcc.misc-tests/outputs.exp
@@ -42,7 +42,8 @@ if ![check_no_compiler_messages gsplitdwarf object {

# Check for -flto support.  We explicitly test the result to skip
# tests that use -flto.
-set skip_lto ![check_effective_target_lto]
+set skip_lto { ![check_effective_target_lto] \
+  || ![check_linker_plugin_available] }

# Prepare additional options to be used for linking.
# We do not compile to an executable, because that requires naming an  
output.



--
Alexandre Oliva, freedom fighterhe/himhttps://FSFLA.org/blogs/lxo/
Free Software Evangelist  Stallman was right, but he's left :(
GNU Toolchain Engineer   Live long and free, and prosper ethically





[PATCH] libgcov: fix TOPN type casting

2020-06-09 Thread Martin Liška

The patch fixes tree-prof.exp tests on solaris11 and i686-linux-gnu,
problem was that sizeof of a pointer is different from sizeof gcov_type.

I'm going to install it if there are no objections.
Thanks,
Martin

libgcc/ChangeLog:

PR libgcov/95494
* libgcov-driver.c (write_top_counters): Cast first to
intptr_t as sizeof(*) != sizeof(gcov_type).
* libgcov.h (gcov_counter_set_if_null): Remove.
(gcov_topn_add_value): Cast first to intptr_t and update
linked list directly.
---
 libgcc/libgcov-driver.c |  4 ++--
 libgcc/libgcov.h| 49 +++--
 2 files changed, 29 insertions(+), 24 deletions(-)

diff --git a/libgcc/libgcov-driver.c b/libgcc/libgcov-driver.c
index 8348d9f33ec..cbfcae96d19 100644
--- a/libgcc/libgcov-driver.c
+++ b/libgcc/libgcov-driver.c
@@ -352,8 +352,8 @@ write_top_counters (const struct gcov_ctr_info *ci_ptr,
   gcov_type pair_count = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 1];
   gcov_write_counter (ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i]);
   gcov_write_counter (pair_count);
-  for (struct gcov_kvp *node
-  = (struct gcov_kvp *)ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+  gcov_type start = ci_ptr->values[GCOV_TOPN_MEM_COUNTERS * i + 2];
+  for (struct gcov_kvp *node = (struct gcov_kvp *)(intptr_t)start;
   node != NULL; node = node->next)
{
  gcov_write_counter (node->value);
diff --git a/libgcc/libgcov.h b/libgcc/libgcov.h
index 1456100815d..5d237a4c730 100644
--- a/libgcc/libgcov.h
+++ b/libgcc/libgcov.h
@@ -401,24 +401,6 @@ gcov_counter_add (gcov_type *counter, gcov_type value,
 *counter += value;
 }
 
-/* Set NODE to memory location COUNTER and make it with atomic operation

-   if USE_ATOMIC is true.  */
-
-static inline int
-gcov_counter_set_if_null (gcov_type *counter, struct gcov_kvp *node,
- int use_atomic ATTRIBUTE_UNUSED)
-{
-#if GCOV_SUPPORTS_ATOMIC
-  if (use_atomic)
-return !__sync_val_compare_and_swap (counter, NULL, (intptr_t)node);
-  else
-#endif
-{
-  *counter = (intptr_t)node;
-  return 1;
-}
-}
-
 /* Add key value pair VALUE:COUNT to a top N COUNTERS.  When INCREMENT_TOTAL
is true, add COUNT to total of the TOP counter.  If USE_ATOMIC is true,
do it in atomic way.  */
@@ -432,7 +414,7 @@ gcov_topn_add_value (gcov_type *counters, gcov_type value, 
gcov_type count,
 
   struct gcov_kvp *prev_node = NULL;

   struct gcov_kvp *minimal_node = NULL;
-  struct gcov_kvp *current_node  = (struct gcov_kvp *)counters[2];
+  struct gcov_kvp *current_node  = (struct gcov_kvp *)(intptr_t)counters[2];
 
   while (current_node)

 {
@@ -467,10 +449,33 @@ gcov_topn_add_value (gcov_type *counters, gcov_type 
value, gcov_type count,
 
   int success = 0;

   if (!counters[2])
-   success = gcov_counter_set_if_null (&counters[2], new_node, use_atomic);
+   {
+#if GCOV_SUPPORTS_ATOMIC
+ if (use_atomic)
+   {
+ struct gcov_kvp **ptr = (struct gcov_kvp 
**)(intptr_t)&counters[2];
+ success = !__sync_val_compare_and_swap (ptr, 0, new_node);
+   }
+ else
+#endif
+   {
+ counters[2] = (intptr_t)new_node;
+ success = 1;
+   }
+   }
   else if (prev_node && !prev_node->next)
-   success = gcov_counter_set_if_null ((gcov_type *)&prev_node->next,
-   new_node, use_atomic);
+   {
+#if GCOV_SUPPORTS_ATOMIC
+ if (use_atomic)
+   success = !__sync_val_compare_and_swap (&prev_node->next, 0,
+   new_node);
+ else
+#endif
+   {
+ prev_node->next = new_node;
+ success = 1;
+   }
+   }
 
   /* Increment number of nodes.  */

   if (success)
--
2.26.2



Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)

2020-06-09 Thread Richard Biener via Gcc-patches
On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches
 wrote:
>
> PR 95581 reports an ICE in a call to gimple_call_arg() with
> an out-of-bounds argument number (2).  The attached patch avoids
> the ICE but I'm not sure it's the right fix.  Other than verifying
> the ICE is gone with a powerpc64 cross-compiler I couldn't come up
> with a generic test cases to exercise this change.
>
> The call, synthesized by the vectorizer, is
>
>vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8);
>
> but the function is declared by the powerpc64 back end to take two
> arguments: long and const void*.  Is it correct for the builtin to
> be called with fewer arguments than their type indicates?  (If it
> isn't the patch shouldn't be necessary but some other fix would
> be needed.)

I think the backend declaration is wrong, the function only takes
a void * argument and returns a long.

Richard.

>
> Martin


Re: avoid line breaks in -fverbose-asm tree exprs

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 1:47 AM Alexandre Oliva  wrote:
>
>
> An asm operand with a "VIEW_CONVERT_EXPR   [...]
> }>" will output the definition of the struct as asm code.  Oops.
>
> I hoped setting print_rtx_head would affect print_mem_expr, just
> because it's in print-rtl rather than tree-pretty-print, but it didn't
> help.  I've thus arranged for print_mem_expr to "obey" print_rtx_head
> by enabling TDF_SLIM.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
>
> Another alternative that occurred to me was to get print_rtx_head
> printed by e.g. pp_newline, or otherwise get print-rtl and
> pretty-print infrastructure to both use the same line-leading
> infrastructure.  I think that would be a much bigger undertaking, so I
> figured I'd ask whether that would be desirable, before taking it up.
> Thoughts?
>
>
> for  gcc/ChangeLog
>
> * final.c (output_asm_operand_names): Set print_rtx_head
> around print_mem_expr.
> * print-rtl.c (print_mem_expr): Enable TDF_SLIM in dump_flags
> when print_rtx_head is nonempty.
> ---
>  final.c |3 +++
>  print-rtl.c |4 
>  2 files changed, 7 insertions(+)
>
> diff --git gcc/final.c gcc/final.c
> index a360196..482801db 100644
> --- gcc/final.c
> +++ gcc/final.c
> @@ -3706,9 +3706,12 @@ output_asm_operand_names (rtx *operands, int *oporder, 
> int nops)
>wrote = 1;
>if (expr)
> {
> + const char *saved_print_rtx_head = print_rtx_head;
>   fprintf (asm_out_file, "%s",
>addressp ? "*" : "");
> + print_rtx_head = ASM_COMMENT_START;
>   print_mem_expr (asm_out_file, expr);
> + print_rtx_head = saved_print_rtx_head;
>   wrote = 1;
> }
>else if (REG_P (op) && ORIGINAL_REGNO (op)
> diff --git gcc/print-rtl.c gcc/print-rtl.c
> index 611ea07..0563540 100644
> --- gcc/print-rtl.c
> +++ gcc/print-rtl.c
> @@ -182,8 +182,12 @@ rtx_reuse_manager::set_seen_def (int reuse_id)
>  void
>  print_mem_expr (FILE *outfile, const_tree expr)
>  {
> +  dump_flags_t save_dump_flags = dump_flags;
> +  if (*print_rtx_head)
> +dump_flags |= TDF_SLIM;
>fputc (' ', outfile);
>print_generic_expr (outfile, CONST_CAST_TREE (expr), dump_flags);

How about simply unconditionally doing dump_flags | TDF_SLIM here
to have the whole mem_expr on one line.

OK with that change.

Richard.

> +  dump_flags = save_dump_flags;
>  }
>  #endif
>
>
> --
> Alexandre Oliva, freedom fighterhe/himhttps://FSFLA.org/blogs/lxo/
> Free Software Evangelist  Stallman was right, but he's left :(
> GNU Toolchain Engineer   Live long and free, and prosper ethically


[PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Marco Elver via Gcc-patches
Add support to optionally emit different instrumentation for accesses to
volatile variables. While the default TSAN runtime likely will never
require this feature, other runtimes for different environments that
have subtly different memory models or assumptions may require
distinguishing volatiles.

One such environment are OS kernels, where volatile is still used in
various places, and often declare volatile to be
appropriate even in multi-threaded contexts. One such example is the
Linux kernel, which implements various synchronization primitives using
volatile (READ_ONCE(), WRITE_ONCE()).

Here the Kernel Concurrency Sanitizer (KCSAN), is a runtime that uses
TSAN instrumentation but otherwise implements a very different approach
to race detection from TSAN:

https://github.com/google/ktsan/wiki/KCSAN

Due to recent changes in requirements by the Linux kernel, KCSAN
requires that the compiler supports tsan-distinguish-volatile (among
several new requirements):

https://lore.kernel.org/lkml/20200521142047.169334-7-el...@google.com/

gcc/
* params.opt: Define --param=tsan-distinguish-volatile=[0,1].
* sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
builtin for volatile instrumentation of reads/writes.
(BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
* tsan.c (get_memory_access_decl): Argument if access is
volatile. If param tsan-distinguish-volatile is non-zero, and
access if volatile, return volatile instrumentation decl.
(instrument_expr): Check if access is volatile.

gcc/testsuite/
* c-c++-common/tsan/volatile.c: New test.

Acked-by: Dmitry Vyukov 
---
v2:
* Add Optimization keyword to -param=tsan-distinguish-volatile= as the
  parameter can be different per TU.
* Add tree-dump check to test.
---
 gcc/params.opt |  4 ++
 gcc/sanitizer.def  | 21 +++
 gcc/testsuite/c-c++-common/tsan/volatile.c | 67 ++
 gcc/tsan.c | 53 +++--
 4 files changed, 128 insertions(+), 17 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/tsan/volatile.c

diff --git a/gcc/params.opt b/gcc/params.opt
index 4aec480798b..c751416bcad 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best 
edge is less than this th
 Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization
 Set the maximum number of instructions executed in parallel in reassociated 
tree.  If 0, use the target dependent heuristic.
 
+-param=tsan-distinguish-volatile=
+Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 1) 
Param Optimization
+Emit special instrumentation for accesses to volatiles.
+
 -param=uninit-control-dep-attempts=
 Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) 
IntegerRange(1, 65536) Param Optimization
 Maximum number of nested calls to search for control dependencies during 
uninitialized variable analysis.
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 11eb6467eba..a32715ddb92 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, 
"__tsan_read_range",
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range",
  BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, "__tsan_volatile_read16",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, "__tsan_volatile_write1",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, "__tsan_volatile_write2",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE4, "__tsan_volatile_write4",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN

[PATCH] gcc-changelog: fix deduction for root ChangeLog

2020-06-09 Thread Martin Liška

The patch fixes a wrong ChangeLog regex, pushed to master.

Martin

contrib/ChangeLog:

* gcc-changelog/git_commit.py: Fix ChangeLog regex in order to
match the top-level ChangeLog.
* gcc-changelog/test_email.py: Add test.
* gcc-changelog/test_patches.txt: Likewise.
---
 contrib/gcc-changelog/git_commit.py|  5 +-
 contrib/gcc-changelog/test_email.py|  5 ++
 contrib/gcc-changelog/test_patches.txt | 68 ++
 3 files changed, 76 insertions(+), 2 deletions(-)

diff --git a/contrib/gcc-changelog/git_commit.py 
b/contrib/gcc-changelog/git_commit.py
index 069900d7cbf..f85d4c83c63 100755
--- a/contrib/gcc-changelog/git_commit.py
+++ b/contrib/gcc-changelog/git_commit.py
@@ -150,7 +150,7 @@ misc_files = [
 author_line_regex = \
 re.compile(r'^(?P\d{4}-\d{2}-\d{2})\ {2}(?P.*  <.*>)')
 additional_author_regex = re.compile(r'^\t(?P\ *)?(?P.*  <.*>)')
-changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)/ChangeLog:?')
+changelog_regex = re.compile(r'^(?:[fF]or +)?([a-z0-9+-/]*)ChangeLog:?')
 pr_regex = re.compile(r'\tPR (?P[a-z+-]+\/)?([0-9]+)$')
 dr_regex = re.compile(r'\tDR ([0-9]+)$')
 star_prefix_regex = re.compile(r'\t\*(?P\ *)(?P.*)')
@@ -359,7 +359,8 @@ class GitCommit:
  % LINE_LIMIT, line))
 m = changelog_regex.match(line)
 if m:
-last_entry = ChangeLogEntry(m.group(1), self.top_level_authors,
+last_entry = ChangeLogEntry(m.group(1).rstrip('/'),
+self.top_level_authors,
 self.top_level_prs)
 self.changelog_entries.append(last_entry)
 elif self.find_changelog_location(line):
diff --git a/contrib/gcc-changelog/test_email.py 
b/contrib/gcc-changelog/test_email.py
index 7e8140c915f..04ddad3f100 100755
--- a/contrib/gcc-changelog/test_email.py
+++ b/contrib/gcc-changelog/test_email.py
@@ -331,3 +331,8 @@ class TestGccChangelog(unittest.TestCase):
 assert len(email.errors) == 1
 msg = 'changed file not mentioned in a ChangeLog'
 assert email.errors[0].message == msg
+
+def test_not_deduce(self):
+email = self.from_patch_glob('0001-configure.patch')
+assert not email.errors
+assert len(email.changelog_entries) == 2
diff --git a/contrib/gcc-changelog/test_patches.txt 
b/contrib/gcc-changelog/test_patches.txt
index 5d9b62d2637..15fe0df1ccc 100644
--- a/contrib/gcc-changelog/test_patches.txt
+++ b/contrib/gcc-changelog/test_patches.txt
@@ -3058,4 +3058,72 @@ index 967e5f5f348..95d21b5bf9f 100644
 +
 --
 2.25.4
+=== 0001-configure.patch ===
+From dbe341cf6a77bb28c5fdf8b32dcb0ff1c2a27348 Mon Sep 17 00:00:00 2001
+From: Martin Liska 
+Date: Tue, 9 Jun 2020 09:39:36 +0200
+Subject: [PATCH] c++: Fix --disable-bootstrap with older g++.
+
+Previously I had AX_CXX_COMPILE_STDCXX in the gcc directory configure, which
+added -std=c++11 to CXX if needed, but then CXX is overridden from the
+toplevel directory, so it didn't have the desired effect.  Fixed by moving
+the check to the toplevel.  Currently it is only used when building GCC
+without bootstrapping; other packages that share the toplevel directory
+can adjust the condition if they also want to require C++11 support.
 
+ChangeLog:

+
+   * configure.ac: Check AX_CXX_COMPILE_STDCXX if not bootstrapping.
+   * configure: Regenerate.
+
+gcc/ChangeLog:
+
+   * aclocal.m4: Remove ax_cxx_compile_stdcxx.m4.
+   * configure.ac: Remove AX_CXX_COMPILE_STDCXX.
+   * configure: Regenerate.
+
+---
+ configure| 999 ++-
+ configure.ac |   6 +-
+ gcc/aclocal.m4   |   1 -
+ gcc/configure| 997 +-
+ gcc/configure.ac |   2 -
+ 5 files changed, 1004 insertions(+), 1001 deletions(-)
+
+diff --git a/configure b/configure
+index b7897446c70..a0c5aca9e8d 100755
+--- a/configure
 b/configure
+@@ -1 +1,2 @@
+
++
+diff --git a/configure.ac b/configure.ac
+index 59bd92a3e53..1a53ed418e4 100644
+--- a/configure.ac
 b/configure.ac
+@@ -1 +1,2 @@
+
++
+diff --git a/gcc/aclocal.m4 b/gcc/aclocal.m4
+index e93c1535063..1737d59d1cb 100644
+--- a/gcc/aclocal.m4
 b/gcc/aclocal.m4
+@@ -1 +1,2 @@
+
++
+diff --git a/gcc/configure b/gcc/configure
+index 46850710424..629c7c7e153 100755
+--- a/gcc/configure
 b/gcc/configure
+@@ -1 +1,2 @@
+
++
+diff --git a/gcc/configure.ac b/gcc/configure.ac
+index 60d83c30771..9e7efd13ecc 100644
+--- a/gcc/configure.ac
 b/gcc/configure.ac
+@@ -1 +1,2 @@
+
++
+--
+2.26.2
--
2.26.2



Re: [PATCH] libsanitizer: use gnu++14

2020-06-09 Thread Martin Liška

On 6/8/20 4:53 PM, Martin Liška wrote:

Hi.

Thank you for the report. It's caused by fact that LLVM switch in 4d474e078ac7
to c++14. So that I suggest to use gnu++14.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
I also verified that abidiff is equal for all libsanitizer shared libraries.
I'm going to install the patch if there are no objections.

Thanks,
Martin


Installed as 942a384ef9f.
@Andreas: Can you please check the riscv64 build?

Martin


[Ada] Spurious error on instantiations with Taft_Amendment types and tasks

2020-06-09 Thread Pierre-Marie de Rodat
An access type whose designated type is an incomplete or class-wide type
may end up designating a task type or a type with a task component. It
is then necessary to associate the access type with the Master of the
tasks that may be generated on an allocator for that access type. This
is done by creating a renaming declaration that uses a local name to
rename the master of an enclosing construct (subprogram or task body)
that is the finalization master of these tasks, the local nanme used in
that renaming declaration must be unique across the compilation unit,
because several nested constructs that are not masters (for example,
package instantiastions) may contain local access types with the same
name.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ed Schonberg  

gcc/ada/

* exp_ch9.adb (Build_Master_Renaming): Make name in renaming
declaration unique by adding a numeric suffix, to prevent
accidental name conflict when several instantiations of a
package containing an access_to_incomplete type that designate
tasks appear in the same scope.--- gcc/ada/exp_ch9.adb
+++ gcc/ada/exp_ch9.adb
@@ -3469,10 +3469,13 @@ package body Exp_Ch9 is
 
   --  Generate:
   --M : Master_Id renames _Master;
+  --  and add a numeric suffix to the name to ensure that it is
+  --  unique in case other access types in nested constructs
+  --  are homonyms of this one.
 
   Master_Id :=
 Make_Defining_Identifier (Loc,
-  New_External_Name (Chars (Ptr_Typ), 'M'));
+  New_External_Name (Chars (Ptr_Typ), 'M', -1));
 
   Master_Decl :=
 Make_Object_Renaming_Declaration (Loc,



[Ada] Remove bypass for instance bodies from Is_Visible_Component

2020-06-09 Thread Pierre-Marie de Rodat
This removes an obsolete bypass present for long in the predicate
Is_Visible_Component, which is responsible for determining whether
a component present in a private tagged record type is visible in
its extensions.  This bypass would essentially always return true
in bodies of instances, but there is no real basis to do so.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Eric Botcazou  

gcc/ada/

* sem_ch3.adb (Is_Visible_Component): Do not special-case
bodies of instances.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -18762,39 +18762,6 @@ package body Sem_Ch3 is
   then
  return True;
 
-  --  In the body of an instantiation, check the visibility of a component
-  --  in case it has a homograph that is a primitive operation of a private
-  --  type which was not visible in the generic unit.
-
-  --  Should Is_Prefixed_Call be propagated from template to instance???
-
-  elsif In_Instance_Body then
- if not Is_Tagged_Type (Original_Type)
-   or else not Is_Private_Type (Original_Type)
- then
-return True;
-
- else
-declare
-   Subp_Elmt : Elmt_Id;
-
-begin
-   Subp_Elmt := First_Elmt (Primitive_Operations (Original_Type));
-   while Present (Subp_Elmt) loop
-
-  --  The component is hidden by a primitive operation
-
-  if Chars (Node (Subp_Elmt)) = Chars (C) then
- return False;
-  end if;
-
-  Next_Elmt (Subp_Elmt);
-   end loop;
-
-   return True;
-end;
- end if;
-
   --  If the component has been declared in an ancestor which is currently
   --  a private type, then it is not visible. The same applies if the
   --  component's containing type is not in an open scope and the original



[Ada] Ada2020: AI12-0301 Predicates and Default_Value

2020-06-09 Thread Pierre-Marie de Rodat
This AI clarifies that predicate checks apply at object declaration for
types that contain components with a Default_Value or
Default_Component_Value.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* sem_util.ads, sem_util.adb (Is_Partially_Initialized_Type):
Take Default_Value and Default_Component_Value into account.
* sem_ch3.adb (Analyze_Object_Declaration): Update comment.--- gcc/ada/sem_ch3.adb
+++ gcc/ada/sem_ch3.adb
@@ -4366,8 +4366,10 @@ package body Sem_Ch3 is
   --  We need a predicate check if the type has predicates that are not
   --  ignored, and if either there is an initializing expression, or for
   --  default initialization when we have at least one case of an explicit
-  --  default initial value and then this is not an internal declaration
-  --  whose initialization comes later (as for an aggregate expansion).
+  --  default initial value (including via a Default_Value or
+  --  Default_Component_Value aspect, see AI12-0301) and then this is not
+  --  an internal declaration whose initialization comes later (as for an
+  --  aggregate expansion).
   --  If expression is an aggregate it may be expanded into assignments
   --  and the declaration itself is marked with No_Initialization, but
   --  the predicate still applies.

--- gcc/ada/sem_util.adb
+++ gcc/ada/sem_util.adb
@@ -16863,7 +16863,7 @@ package body Sem_Util is
is
begin
   if Is_Scalar_Type (Typ) then
- return False;
+ return Has_Default_Aspect (Base_Type (Typ));
 
   elsif Is_Access_Type (Typ) then
  return Include_Implicit;
@@ -16872,8 +16872,9 @@ package body Sem_Util is
 
  --  If component type is partially initialized, so is array type
 
- if Is_Partially_Initialized_Type
-  (Component_Type (Typ), Include_Implicit)
+ if Has_Default_Aspect (Base_Type (Typ))
+   or else Is_Partially_Initialized_Type
+ (Component_Type (Typ), Include_Implicit)
  then
 return True;
 

--- gcc/ada/sem_util.ads
+++ gcc/ada/sem_util.ads
@@ -1872,7 +1872,8 @@ package Sem_Util is
--  Typ is a type entity. This function returns true if this type is partly
--  initialized, meaning that an object of the type is at least partly
--  initialized (in particular in the record case, that at least one
-   --  component has an initialization expression). Note that initialization
+   --  component has an initialization expression, including via Default_Value
+   --  and Default_Component_Value aspects). Note that initialization
--  resulting from the use of pragma Normalize_Scalars does not count.
--  Include_Implicit controls whether implicit initialization of access
--  values to null, and of discriminant values, is counted as making the



[Ada] Refine implementation of AI05-0149 missing conversion checks

2020-06-09 Thread Pierre-Marie de Rodat
We were accepting conversion from anon-access-T'Class to access-all-T
which is incorrect.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* sem_type.adb (Covers): Fix implementation of AI05-0149.
* sem_res.adb: Fix typo.--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -2852,7 +2852,7 @@ package body Sem_Res is
return;
 
 --  Under relaxed RM semantics silently replace occurrences of null
---  by System.Address_Null.
+--  by System.Null_Address.
 
 elsif Null_To_Null_Address_Convert_OK (N, Typ) then
Replace_Null_By_Null_Address (N);

--- gcc/ada/sem_type.adb
+++ gcc/ada/sem_type.adb
@@ -1021,15 +1021,17 @@ package body Sem_Type is
 
   --  Ada 2012 (AI05-0149): Allow an anonymous access type in the context
   --  of a named general access type. An implicit conversion will be
-  --  applied. For the resolution, one designated type must cover the
-  --  other.
+  --  applied. For the resolution, the designated types must match if
+  --  untagged; further, if the designated type is tagged, the designated
+  --  type of the anonymous access type shall be covered by the designated
+  --  type of the named access type.
 
   elsif Ada_Version >= Ada_2012
 and then Ekind (BT1) = E_General_Access_Type
 and then Ekind (BT2) = E_Anonymous_Access_Type
-and then (Covers (Designated_Type (T1), Designated_Type (T2))
-or else
-  Covers (Designated_Type (T2), Designated_Type (T1)))
+and then Covers (Designated_Type (T1), Designated_Type (T2))
+and then (Is_Class_Wide_Type (Designated_Type (T1)) >=
+  Is_Class_Wide_Type (Designated_Type (T2)))
   then
  return True;
 



[Ada] Ada2020 AI12-0282: Shared variable control aspects in generics

2020-06-09 Thread Pierre-Marie de Rodat
This patch refines the checks and the corresponding error messsages on
the legality of instantiations where some formal objects and their
corresponding actuals may carry Atomic, Atonic_Components,  Volatile,
Volatile_Components, Independent, or Independent_Components aspect
specifications. The legality rules require exact match between formal
and actual for the first four aspects, but not for the last two.  For
Atomic_Components and Volatile_Components we also check whether the
actual is of an array type with the corresponding: aspect on its
component type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ed Schonberg  

gcc/ada/

* sem_ch12.adb (Check_Shared_Variable_Control_Aspects): Require
exact match between formal and actual for aspects Atomic,
Atomic_Component, Volatile, and Volatile_Components.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -12398,21 +12398,15 @@ package body Sem_Ch12 is
   procedure Check_Shared_Variable_Control_Aspects is
   begin
  if Ada_Version >= Ada_2020 then
-if Is_Atomic (A_Gen_T) and then not Is_Atomic (Act_T) then
+if Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T) then
Error_Msg_NE
-  ("actual for& must be an atomic type", Actual, A_Gen_T);
+  ("actual for& has different Atomic aspect", Actual, A_Gen_T);
 end if;
 
-if Is_Volatile (A_Gen_T) and then not Is_Volatile (Act_T) then
+if Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T) then
Error_Msg_NE
-  ("actual for& must be a Volatile type", Actual, A_Gen_T);
-end if;
-
-if
-  Is_Independent (A_Gen_T) and then not Is_Independent (Act_T)
-then
-   Error_Msg_NE
- ("actual for& must be an Independent type", Actual, A_Gen_T);
+  ("actual for& has different Volatile aspect",
+Actual, A_Gen_T);
 end if;
 
 --  We assume that an array type whose atomic component type
@@ -12420,43 +12414,51 @@ package body Sem_Ch12 is
 --  aspect Has_Atomic_Components. This is a reasonable inference
 --  from the intent of AI12-0282, and makes it legal to use an
 --  actual that does not have the identical aspect as the formal.
+--  Ditto for volatile components.
 
-if Has_Atomic_Components (A_Gen_T)
-   and then not Has_Atomic_Components (Act_T)
-then
-   if Is_Array_Type (Act_T)
- and then Is_Atomic (Component_Type (Act_T))
-   then
-  null;
+declare
+   Actual_Atomic_Comp : constant Boolean :=
+   Has_Atomic_Components (Act_T)
+  or else (Is_Array_Type (Act_T)
+and then Is_Atomic (Component_Type (Act_T)));
+begin
+   if Has_Atomic_Components (A_Gen_T) /= Actual_Atomic_Comp then
+  Error_Msg_NE
+("formal and actual for& must agree on atomic components",
+   Actual, A_Gen_T);
+   end if;
+end;
 
-   else
+declare
+   Actual_Volatile_Comp : constant Boolean :=
+ Has_Volatile_Components (Act_T)
+   or else (Is_Array_Type (Act_T)
+ and then Is_Volatile (Component_Type (Act_T)));
+begin
+   if Has_Volatile_Components (A_Gen_T) /= Actual_Volatile_Comp
+   then
   Error_Msg_NE
-("actual for& must have atomic components",
+("actual for& must have volatile components",
Actual, A_Gen_T);
end if;
-end if;
+end;
 
-if Has_Independent_Components (A_Gen_T)
-   and then not Has_Independent_Components (Act_T)
+--  The following two aspects do not require exact matching,
+--  but only one-way agreement. See RM C.6.
+
+if Is_Independent (A_Gen_T) and then not Is_Independent (Act_T)
 then
Error_Msg_NE
- ("actual for& must have independent components",
-Actual, A_Gen_T);
+ ("actual for& must have Independent aspect specified",
+ Actual, A_Gen_T);
 end if;
 
-if Has_Volatile_Components (A_Gen_T)
-   and then not Has_Volatile_Components (Act_T)
+if Has_Independent_Components (A_Gen_T)
+  and then not Has_Independent_Components (Act_T)
 then
-   if Is_Array_Type (Act_T)
- and then Is_Volatile (Component_Type (Act_T))
-   then
-  null;
-
-   else
-  Error_Msg_NE
-   

[Ada] Annotate Ada.Synchronous_Barriers with SPARK_Mode => Off

2020-06-09 Thread Pierre-Marie de Rodat
Synchronous barriers are currently not supported by GNATprove (i.e. it
doesn't recognize that they are initialized by synchronous). They are
specifically detected and rejected as not-in-SPARK, but it is more
elegant to annotate the Ada.Synchronous runtime library unit with
SPARK_Mode => Off.

Compilation is not affected by this annotation.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Piotr Trojanek  

gcc/ada/

* libgnarl/a-synbar.ads, libgnarl/a-synbar.adb,
libgnarl/a-synbar__posix.ads, libgnarl/a-synbar__posix.adb
(Ada.Synchronous_Barriers): Annotate with SPARK_Mode => Off.--- gcc/ada/libgnarl/a-synbar.adb
+++ gcc/ada/libgnarl/a-synbar.adb
@@ -33,7 +33,7 @@
 --  --
 --
 
-package body Ada.Synchronous_Barriers is
+package body Ada.Synchronous_Barriers with SPARK_Mode => Off is
 
protected body Synchronous_Barrier is
 

--- gcc/ada/libgnarl/a-synbar.ads
+++ gcc/ada/libgnarl/a-synbar.ads
@@ -33,7 +33,7 @@
 --  --
 --
 
-package Ada.Synchronous_Barriers is
+package Ada.Synchronous_Barriers with SPARK_Mode => Off is
pragma Preelaborate (Synchronous_Barriers);
 
subtype Barrier_Limit is Positive range 1 .. Positive'Last;

--- gcc/ada/libgnarl/a-synbar__posix.adb
+++ gcc/ada/libgnarl/a-synbar__posix.adb
@@ -37,7 +37,7 @@
 
 with Interfaces.C; use Interfaces.C;
 
-package body Ada.Synchronous_Barriers is
+package body Ada.Synchronous_Barriers with SPARK_Mode => Off is
 

-- POSIX barriers --

--- gcc/ada/libgnarl/a-synbar__posix.ads
+++ gcc/ada/libgnarl/a-synbar__posix.ads
@@ -39,7 +39,7 @@ with System;
 private with Ada.Finalization;
 private with Interfaces.C;
 
-package Ada.Synchronous_Barriers is
+package Ada.Synchronous_Barriers with SPARK_Mode => Off is
pragma Preelaborate (Synchronous_Barriers);
 
subtype Barrier_Limit is Positive range 1 .. Positive'Last;



[Ada] Spurious overlap error on zero-sized arrays with -gnateV

2020-06-09 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby the compiler would incorrectly
generate checks for overlapping formals (-gnateV) leading to spurious
runtime errors when a zero-sized array actual gets passed alongside
another formal declared in the same region.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Justin Squirek  

gcc/ada/

* exp_attr.adb (Expand_N_Attribute_Reference): Modify expansion
of 'Overlaps_Storage to take into account zero sized arrays.--- gcc/ada/exp_attr.adb
+++ gcc/ada/exp_attr.adb
@@ -4783,27 +4783,31 @@ package body Exp_Attr is
 
   when Attribute_Overlaps_Storage => Overlaps_Storage : declare
  Loc : constant Source_Ptr := Sloc (N);
+ X   : constant Node_Id:= Prefix (N);
+ Y   : constant Node_Id:= First (Expressions (N));
 
- X   : constant Node_Id := Prefix (N);
- Y   : constant Node_Id := First (Expressions (N));
  --  The arguments
 
  X_Addr, Y_Addr : Node_Id;
- --  the expressions for their integer addresses
+
+ --  The expressions for their integer addresses
 
  X_Size, Y_Size : Node_Id;
- --  the expressions for their sizes
+
+ --  The expressions for their sizes
 
  Cond : Node_Id;
 
   begin
  --  Attribute expands into:
 
- --if X'Address < Y'address then
- --  (X'address + X'Size - 1) >= Y'address
- --else
- --  (Y'address + Y'size - 1) >= X'Address
- --end if;
+ --(if X'Size = 0 or else Y'Size = 0 then
+ --   False
+ -- else
+ --   (if X'Address <= Y'Address then
+ -- (X'Address + X'Size - 1) >= Y'Address
+ --else
+ -- (Y'Address + Y'Size - 1) >= X'Address))
 
  --  with the proper address operations. We convert addresses to
  --  integer addresses to use predefined arithmetic. The size is
@@ -4846,29 +4850,62 @@ package body Exp_Attr is
   Left_Opnd  => X_Addr,
   Right_Opnd => Y_Addr);
 
+ --  Perform the rewriting
+
  Rewrite (N,
Make_If_Expression (Loc, New_List (
- Cond,
 
- Make_Op_Ge (Loc,
-   Left_Opnd   =>
- Make_Op_Add (Loc,
-   Left_Opnd  => New_Copy_Tree (X_Addr),
-   Right_Opnd =>
- Make_Op_Subtract (Loc,
-   Left_Opnd  => X_Size,
-   Right_Opnd => Make_Integer_Literal (Loc, 1))),
-   Right_Opnd => Y_Addr),
+ --  Generate a check for zero sized things like a null record with
+ --  size zero or an array with zero length since they have no
+ --  opportunity of overlapping.
+
+ --  Without this check a zero-sized object can trigger a false
+ --  runtime result if its compared against another object
+ --  in its declarative region due to the zero-sized object having
+ --  the same address.
 
- Make_Op_Ge (Loc,
+ Make_Or_Else (Loc,
Left_Opnd  =>
- Make_Op_Add (Loc,
-   Left_Opnd  => New_Copy_Tree (Y_Addr),
-   Right_Opnd =>
- Make_Op_Subtract (Loc,
-   Left_Opnd  => Y_Size,
-   Right_Opnd => Make_Integer_Literal (Loc, 1))),
-   Right_Opnd => X_Addr;
+ Make_Op_Eq (Loc,
+   Left_Opnd  =>
+ Make_Attribute_Reference (Loc,
+   Attribute_Name => Name_Size,
+   Prefix => New_Copy_Tree (X)),
+   Right_Opnd => Make_Integer_Literal (Loc, 0)),
+   Right_Opnd =>
+ Make_Op_Eq (Loc,
+   Left_Opnd  =>
+ Make_Attribute_Reference (Loc,
+   Attribute_Name => Name_Size,
+   Prefix => New_Copy_Tree (Y)),
+   Right_Opnd => Make_Integer_Literal (Loc, 0))),
+
+ New_Occurrence_Of (Standard_False, Loc),
+
+ --  Non-size zero overlap check
+
+ Make_If_Expression (Loc, New_List (
+   Cond,
+
+   Make_Op_Ge (Loc,
+ Left_Opnd   =>
+   Make_Op_Add (Loc,
+Left_Opnd  => New_Copy_Tree (X_Addr),
+ Right_Opnd =>
+   Make_Op_Subtract (Loc,
+ Left_Opnd  => X_Size,
+ Right_Opnd => Make_Integer_Literal (Loc, 1))),
+ Right_Opnd => Y_Addr),
+
+   Make_Op_Ge (Loc,
+ Left_Opnd   =>
+   Make_Op_Add (Loc,
+ Left_Opnd  => New_Copy_Tree (Y_Addr),
+ Right_Opnd =>
+   Make_Op_Subtract (Loc,
+ 

[Ada] Crash on exit statement within predicated loop

2020-06-09 Thread Pierre-Marie de Rodat
This patch corrects an issue whereby the compiler would crash if it
encountered an exit statement featuring a loop identifier where such
referenced loop iterates through a predicated type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Justin Squirek  

gcc/ada/

* exp_ch5.adb (Expand_Predicated_Loop): Perserve the original
loop identifier within the expansion.--- gcc/ada/exp_ch5.adb
+++ gcc/ada/exp_ch5.adb
@@ -4920,13 +4920,14 @@ package body Exp_Ch5 is
--  mode, the semantic analyzer may disallow one or both forms.
 
procedure Expand_Predicated_Loop (N : Node_Id) is
-  Loc : constant Source_Ptr := Sloc (N);
-  Isc : constant Node_Id:= Iteration_Scheme (N);
-  LPS : constant Node_Id:= Loop_Parameter_Specification (Isc);
-  Loop_Id : constant Entity_Id  := Defining_Identifier (LPS);
-  Ltype   : constant Entity_Id  := Etype (Loop_Id);
-  Stat: constant List_Id:= Static_Discrete_Predicate (Ltype);
-  Stmts   : constant List_Id:= Statements (N);
+  Orig_Loop_Id :  Node_Id:= Empty;
+  Loc  : constant Source_Ptr := Sloc (N);
+  Isc  : constant Node_Id:= Iteration_Scheme (N);
+  LPS  : constant Node_Id:= Loop_Parameter_Specification (Isc);
+  Loop_Id  : constant Entity_Id  := Defining_Identifier (LPS);
+  Ltype: constant Entity_Id  := Etype (Loop_Id);
+  Stat : constant List_Id:= Static_Discrete_Predicate (Ltype);
+  Stmts: constant List_Id:= Statements (N);
 
begin
   --  Case of iteration over non-static predicate, should not be possible
@@ -5205,7 +5206,13 @@ package body Exp_Ch5 is
 Alternatives => Alts);
 Append_To (Stmts, Cstm);
 
---  Rewrite the loop
+--  Rewrite the loop preserving the loop identifier in case there
+--  are exit statements referencing it.
+
+if Present (Identifier (N)) then
+   Orig_Loop_Id := New_Occurrence_Of
+ (Entity (Identifier (N)), Loc);
+end if;
 
 Set_Suppress_Assignment_Checks (D);
 
@@ -5217,6 +5224,7 @@ package body Exp_Ch5 is
 Statements => New_List (
   Make_Loop_Statement (Loc,
 Statements => Stmts,
+Identifier => Orig_Loop_Id,
 End_Label  => Empty);
 
 Analyze (N);



[Ada] Write_Invocation_Graph_Vertex: include lib item name

2020-06-09 Thread Pierre-Marie de Rodat
Make each invocation graph point to the corresponding library graph. Use
this information in Write_Invocation_Graph_Vertex to print out the name
of the library item containing this vertex.

For procedures that take both the library graph and invocation graph as
parameters, remove the library graph parameter, and fetch that
information from the invocation graph.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Bob Duff  

gcc/ada/

* bindo-graphs.adb, bindo-graphs.ads: For each invocation graph,
record the corresponding library graph.
* bindo-writers.adb (Write_Invocation_Graph_Vertex): Print the
lib item name. Remove library graph parameters.
* bindo-augmentors.adb, bindo-augmentors.ads,
bindo-builders.adb, bindo-diagnostics.adb,
bindo-diagnostics.ads, bindo-elaborators.adb: Remove library
graph parameters.

patch.diff.gz
Description: application/gzip


[Ada] Add debugging message

2020-06-09 Thread Pierre-Marie de Rodat
When raising Program_Error due to an incorrect Kind transition, print a
more informative debugging message.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Bob Duff  

gcc/ada/

* bindo-graphs.adb (Add_Edge_Kind_Check): Add the Image of the
old and new Kinds to the raise Program_Error message.--- gcc/ada/bindo-graphs.adb
+++ gcc/ada/bindo-graphs.adb
@@ -1824,7 +1824,7 @@ package body Bindo.Graphs is
  end case;
 
  if not OK then
-raise Program_Error;
+raise Program_Error with Kind'Img & "-->" & Attributes.Kind'Img;
  end if;
   end Add_Edge_Kind_Check;
 



[Ada] Membership test against a non-excluding subtype

2020-06-09 Thread Pierre-Marie de Rodat
GNAT ignores the null exclusion property of the target access type in a
membership test (e.g. Ptr in Null_Excluding_Ptr_Type), this is fixed
here.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* exp_ch4.adb (Expand_N_In): Fix handling of null exclusion.--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -6468,12 +6468,13 @@ package body Exp_Ch4 is
 
   else
  declare
-Typ   : Entity_Id:= Etype (Rop);
-Is_Acc: constant Boolean := Is_Access_Type (Typ);
-Cond  : Node_Id  := Empty;
-New_N : Node_Id;
-Obj   : Node_Id  := Lop;
-SCIL_Node : Node_Id;
+Typ  : Entity_Id:= Etype (Rop);
+Is_Acc   : constant Boolean := Is_Access_Type (Typ);
+Check_Null_Exclusion : Boolean;
+Cond : Node_Id  := Empty;
+New_N: Node_Id;
+Obj  : Node_Id  := Lop;
+SCIL_Node: Node_Id;
 
  begin
 Remove_Side_Effects (Obj);
@@ -6549,12 +6550,19 @@ package body Exp_Ch4 is
 --  Here we have a non-scalar type
 
 if Is_Acc then
+
+   --  If the null exclusion checks are not compatible, need to
+   --  perform further checks. In other words, we cannot have
+   --  Ltyp including null and Typ excluding null. All other cases
+   --  are OK.
+
+   Check_Null_Exclusion :=
+ Can_Never_Be_Null (Typ) and then not Can_Never_Be_Null (Ltyp);
Typ := Designated_Type (Typ);
 end if;
 
 if not Is_Constrained (Typ) then
-   Rewrite (N, New_Occurrence_Of (Standard_True, Loc));
-   Analyze_And_Resolve (N, Restyp);
+   Cond := New_Occurrence_Of (Standard_True, Loc);
 
 --  For the constrained array case, we have to check the subscripts
 --  for an exact match if the lengths are non-zero (the lengths
@@ -6610,19 +6618,6 @@ package body Exp_Ch4 is
Build_Attribute_Reference
  (New_Occurrence_Of (Typ, Loc), Name_Last, J)));
   end loop;
-
-  if Is_Acc then
- Cond :=
-   Make_Or_Else (Loc,
- Left_Opnd  =>
-   Make_Op_Eq (Loc,
- Left_Opnd  => Obj,
- Right_Opnd => Make_Null (Loc)),
- Right_Opnd => Cond);
-  end if;
-
-  Rewrite (N, Cond);
-  Analyze_And_Resolve (N, Restyp);
end Check_Subscripts;
 
 --  These are the cases where constraint checks may be required,
@@ -6638,24 +6633,32 @@ package body Exp_Ch4 is
if Has_Discriminants (Typ) then
   Cond := Make_Op_Not (Loc,
 Right_Opnd => Build_Discriminant_Checks (Obj, Typ));
-
-  if Is_Acc then
- Cond := Make_Or_Else (Loc,
-   Left_Opnd  =>
- Make_Op_Eq (Loc,
-   Left_Opnd  => Obj,
-   Right_Opnd => Make_Null (Loc)),
-   Right_Opnd => Cond);
-  end if;
-
else
   Cond := New_Occurrence_Of (Standard_True, Loc);
end if;
+end if;
 
-   Rewrite (N, Cond);
-   Analyze_And_Resolve (N, Restyp);
+if Is_Acc then
+   if Check_Null_Exclusion then
+  Cond := Make_And_Then (Loc,
+Left_Opnd  =>
+  Make_Op_Ne (Loc,
+Left_Opnd  => Obj,
+Right_Opnd => Make_Null (Loc)),
+Right_Opnd => Cond);
+   else
+  Cond := Make_Or_Else (Loc,
+Left_Opnd  =>
+  Make_Op_Eq (Loc,
+Left_Opnd  => Obj,
+Right_Opnd => Make_Null (Loc)),
+Right_Opnd => Cond);
+   end if;
 end if;
 
+Rewrite (N, Cond);
+Analyze_And_Resolve (N, Restyp);
+
 --  Ada 2012 (AI05-0149): Handle membership tests applied to an
 --  expression of an anonymous access type. This can involve an
 --  accessibility test and a tagged type membership test in the



[Ada] gnatbind: Correct assertions in Add_Edge_Kind_Check

2020-06-09 Thread Pierre-Marie de Rodat
This patch corrects the assertions in Add_Edge_Kind_Check.  In
particular, a spec-->body edge can come from an Invocation_Edge or a
Forced_Edge in case of cycles.  Other edges are added in a certain
order.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Bob Duff  

gcc/ada/

* bindo-graphs.ads (Library_Graph_Edge_Kind): Reorder enumerals
to reflect the order of adding edges. Clarify comments.
* bindo-graphs.adb (Add_Edge_Kind_Check): Correct the
assertions.  Reorder the "when"s to match the order of adding
edges, and therefore the order of enumerals in type
Library_Graph_Edge_Kind.  Change names to "Old_" and "New_" to
clarify what's what.  Combine Invocation_Edge into the "<="
test.  Fix the "raise Program_Error" message, which was
backwards.--- gcc/ada/bindo-graphs.adb
+++ gcc/ada/bindo-graphs.adb
@@ -1064,9 +1064,9 @@ package body Bindo.Graphs is
 (G  : Library_Graph;
  Pred   : Library_Graph_Vertex_Id;
  Succ   : Library_Graph_Vertex_Id;
- Kind   : Library_Graph_Edge_Kind);
+ New_Kind   : Library_Graph_Edge_Kind);
   --  This is called by Add_Edge in the case where there is already a
-  --  Pred-->Succ edge, to assert that the new Kind is appropriate. Raises
+  --  Pred-->Succ edge, to assert that the New_Kind is appropriate. Raises
   --  Program_Error if a bug is detected. The purpose is to prevent bugs
   --  where calling Add_Edge in different orders produces different output.
 
@@ -1781,50 +1781,45 @@ package body Bindo.Graphs is
 (G  : Library_Graph;
  Pred   : Library_Graph_Vertex_Id;
  Succ   : Library_Graph_Vertex_Id;
- Kind   : Library_Graph_Edge_Kind)
+ New_Kind   : Library_Graph_Edge_Kind)
   is
  Old_Edge : constant Library_Graph_Edge_Id :=
Find_Edge (G, Pred, Succ);
- Attributes : constant Library_Graph_Edge_Attributes :=
-   Get_LGE_Attributes (G, Old_Edge);
+ Old_Kind : constant Library_Graph_Edge_Kind :=
+   Get_LGE_Attributes (G, Old_Edge).Kind;
  OK : Boolean;
   begin
- case Kind is
---  We call Add_Edge with Body_Before_Spec_Edge twice -- once
---  for  the spec and once for the body, but no other Kind can
---  be spec-->body.
-
-when Body_Before_Spec_Edge =>
-   if True then
-  --  Disable this part of the assertion for now
-  OK := True;
-   else
-  OK := Attributes.Kind = Body_Before_Spec_Edge;
-   end if;
-
---  Spec_Before_Body_Edge comes first
-
+ case New_Kind is
 when Spec_Before_Body_Edge =>
OK := False;
-
---  With clauses and forced edges come after Spec_Before_Body_Edge
+   --  Spec_Before_Body_Edge comes first, and there is never more
+   --  than one Spec_Before_Body_Edge for a given unit, so we can't
+   --  have a preexisting edge in the Spec_Before_Body_Edge case.
 
 when With_Edge | Elaborate_Edge | Elaborate_All_Edge
-  | Forced_Edge =>
-   OK := Attributes.Kind <= Kind;
+  | Forced_Edge | Invocation_Edge =>
+   OK := Old_Kind <= New_Kind;
+   --  These edges are created in the order of the enumeration
+   --  type, and there can be duplicates; hence "<=".
 
---  Invocation_Edge can come after anything, including another
---  Invocation_Edge.
+when Body_Before_Spec_Edge =>
+   OK := Old_Kind = Body_Before_Spec_Edge
+   --  We call Add_Edge with Body_Before_Spec_Edge twice -- once
+   --  for the spec and once for the body.
 
-when Invocation_Edge =>
-   OK := True;
+ or else Old_Kind = Forced_Edge
+ or else Old_Kind = Invocation_Edge;
+   --  The old one can be Forced_Edge or Invocation_Edge, which
+   --  necessarily results in an elaboration cycle (in the static
+   --  model), but this assertion happens before cycle detection,
+   --  so we need to allow these cases.
 
 when No_Edge =>
OK := False;
  end case;
 
  if not OK then
-raise Program_Error with Kind'Img & "-->" & Attributes.Kind'Img;
+raise Program_Error with Old_Kind'Img & "-->" & New_Kind'Img;
  end if;
   end Add_Edge_Kind_Check;
 

--- gcc/ada/bindo-graphs.ads
+++ gcc/ada/bindo-graphs.ads
@@ -702,29 +702,28 @@ package Bindo.Graphs is
 
  No_Cycle_Kind);
 
-  --  The following type represents the various kinds of library edges.
-  --  The order is important here, and roughly correspo

[Ada] Disable assertion regarding Body_Before_Spec_Edge

2020-06-09 Thread Pierre-Marie de Rodat
This patch disables a sometimes-failing assertion.  The assertion is
failing because there is an invocation edge spec-->body, and then we
later add a Body_Before_Spec_Edge for the SCC computation. This is a
temporary fix; we need to investigate why the spec-->body invocation
edge exists.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Bob Duff  

gcc/ada/

* bindo-graphs.adb (Add_Edge_Kind_Check): Disable failing part
of the assertion.--- gcc/ada/bindo-graphs.adb
+++ gcc/ada/bindo-graphs.adb
@@ -1795,7 +1795,12 @@ package body Bindo.Graphs is
 --  be spec-->body.
 
 when Body_Before_Spec_Edge =>
-   OK := Attributes.Kind = Body_Before_Spec_Edge;
+   if True then
+  --  Disable this part of the assertion for now
+  OK := True;
+   else
+  OK := Attributes.Kind = Body_Before_Spec_Edge;
+   end if;
 
 --  Spec_Before_Body_Edge comes first
 



[Ada] Improve handling of null unbounded strings

2020-06-09 Thread Pierre-Marie de Rodat
This is a useful optimization to avoid contention around the shared
empty string.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* libgnat/a-strunb__shared.ads, libgnat/a-strunb__shared.adb
(Reference, Unreference): No-op for Empty_Shared_String.
Remove unneeded calls to Reference.--- gcc/ada/libgnat/a-strunb__shared.adb
+++ gcc/ada/libgnat/a-strunb__shared.adb
@@ -73,7 +73,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared empty string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Left string is empty, return Right string
@@ -112,7 +111,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared empty string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Right is an empty string, return Left string
@@ -145,7 +143,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared one
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Left is empty string, return Right string
@@ -214,7 +211,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared empty string
 
   if Left = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Otherwise, allocate new shared string and fill it
@@ -244,7 +240,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared empty string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Otherwise, allocate new shared string and fill it
@@ -277,7 +272,6 @@ package body Ada.Strings.Unbounded is
   --  Result is an empty string, reuse shared empty string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Coefficient is one, just return string itself
@@ -506,7 +500,6 @@ package body Ada.Strings.Unbounded is
   --  Empty string requested, return shared empty string
 
   if Max_Length = 0 then
- Reference (Empty_Shared_String'Access);
  return Empty_Shared_String'Access;
 
   --  Otherwise, allocate requested space (and probably some more room)
@@ -701,7 +694,6 @@ package body Ada.Strings.Unbounded is
  --  Result is an empty string, reuse shared empty string
 
  if DL = 0 then
-Reference (Empty_Shared_String'Access);
 DR := Empty_Shared_String'Access;
 
  --  Otherwise, allocate new shared string and fill it
@@ -743,7 +735,6 @@ package body Ada.Strings.Unbounded is
  --  Result is empty, reuse shared empty string
 
  if DL = 0 then
-Reference (Empty_Shared_String'Access);
 Source.Reference := Empty_Shared_String'Access;
 Unreference (SR);
 
@@ -801,7 +792,6 @@ package body Ada.Strings.Unbounded is
  --  effects if a program references an already-finalized object.
 
  Object.Reference := Null_Unbounded_String.Reference;
- Reference (Object.Reference);
  Unreference (SR);
   end if;
end Finalize;
@@ -862,7 +852,6 @@ package body Ada.Strings.Unbounded is
   --  Result is empty, reuse shared empty string
 
   if Count = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Length of the string is the same as requested, reuse source shared
@@ -912,7 +901,6 @@ package body Ada.Strings.Unbounded is
   --  Result is empty, reuse empty shared string
 
   if Count = 0 then
- Reference (Empty_Shared_String'Access);
  Source.Reference := Empty_Shared_String'Access;
  Unreference (SR);
 
@@ -1090,7 +1078,6 @@ package body Ada.Strings.Unbounded is
   --  Result is empty, reuse empty shared string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Inserted string is empty, reuse source shared string
@@ -1132,7 +1119,6 @@ package body Ada.Strings.Unbounded is
   --  Result is empty string, reuse empty shared string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  Source.Reference := Empty_Shared_String'Access;
  Unreference (SR);
 
@@ -1197,7 +1183,6 @@ package body Ada.Strings.Unbounded is
   --  Result is empty string, reuse empty shared string
 
   if DL = 0 then
- Reference (Empty_Shared_String'Access);
  DR := Empty_Shared_String'Access;
 
   --  Result is same as source string, reuse source shared string
@@ -1241,7 +1226,6 @@ package body Ada.Strings.Unbounded is
   

[Ada] Improve handling of aggregates in Side_Effect_Free

2020-06-09 Thread Pierre-Marie de Rodat
Side_Effect_Free was always assuming an aggregate isn't side effects
free, but we can do better by taking advantage of the
Compile_Time_Known_Aggregate flag.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* exp_util.adb (Side_Effect_Free): Improve handling of
N_Aggregate.--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -13357,6 +13357,12 @@ package body Exp_Util is
  =>
 return True;
 
+ --  An aggregate is side effect free if all its values are compile
+ --  time known.
+
+ when N_Aggregate =>
+return Compile_Time_Known_Aggregate (N);
+
  --  We consider that anything else has side effects. This is a bit
  --  crude, but we are pretty close for most common cases, and we
  --  are certainly correct (i.e. we never return True when the



[Ada] Implement AI12-0028: Import of variadic C functions

2020-06-09 Thread Pierre-Marie de Rodat
This implements the support for the new C_Variadic_n conventions,
n ranging from 0 to 16, in all versions of the language since the AI is
a binding interpretation.

These new conventions are meant to be used in conjunction with the
Import aspect/pragma to call variadic C functions with the special
calling conventions attached to them, depending on the ABI.  But a
warning is issued for C_Variadic_0 since it cannot actually be used.

The typical example is calling the printf function of the C library.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Eric Botcazou  

gcc/ada/

* exp_ch6.adb (Freeze_Subprogram): Deal with convention C_Family.
* freeze.adb (Freeze_Profile): Likewise.  Add missing guard.
* sem_mech.adb (Set_Mechanisms): Likewise.
* lib-xref.adb (Output_Import_Export_Info): Ditto for C_Variadic.
* repinfo.adb (List_Subprogram_Info): Likewise.
* sem_prag.adb (Set_Convention_From_Pragma): Move main checks for
Stdcall to...
(Process_Convention): ...here.  Add checks for C_Variadic.
* snames.ads-tmpl: Add Name_C_Variadic_0 .. Name_C_Variadic_16.
Use consistent format for subtype declarations.
(Convention_Id): Add Convention_C_Variadic_0 .. C_Variadic_16
and move Convention_CPP up.
(Convention_C_Family): New subtype of Convention_Id.
(Convention_C_Variadic): Likewise.
(Foreign_Convention): Use explicit upper bound.
Add pragma Inline for Is_Configuration_Pragma_Name,
Is_Function_Attribute_Name, Is_Internal_Attribute_Name
and Is_Procedure_Attribute_Name.
* snames.adb-tmpl (Get_Convention_Id): Deal with Name_Variadic_n.
(Get_Convention_Name): Deal with Convention_Variadic_n.
* types.h (Convention_Id): New typedef.
* xsnamest.adb (Name2): New variable.
(Is_Conv): New pattern.
(Get_Subt1): Likewise.
(Get_Subt2): Likewise.
Output subtypes of Convention_Id into the C header file.--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -7716,8 +7716,7 @@ package body Exp_Ch6 is
 --  Build_Inherit_Prims takes care of initializing these slots.
 
 elsif Is_Imported (Subp)
-   and then (Convention (Subp) = Convention_CPP
-   or else Convention (Subp) = Convention_C)
+   and then Convention (Subp) in Convention_C_Family
 then
null;
 

--- gcc/ada/freeze.adb
+++ gcc/ada/freeze.adb
@@ -3674,9 +3674,7 @@ package body Freeze is
 
 if Warn_On_Export_Import
   and then Comes_From_Source (E)
-  and then (Convention (E) = Convention_C
-  or else
-Convention (E) = Convention_CPP)
+  and then Convention (E) in Convention_C_Family
   and then (Is_Imported (E) or else Is_Exported (E))
   and then Convention (E) /= Convention (Formal)
   and then not Has_Warnings_Off (E)
@@ -3823,9 +3821,8 @@ package body Freeze is
 --  Check suspicious return type for C function
 
 if Warn_On_Export_Import
-  and then (Convention (E) = Convention_C
-  or else
-Convention (E) = Convention_CPP)
+  and then Comes_From_Source (E)
+  and then Convention (E) in Convention_C_Family
   and then (Is_Imported (E) or else Is_Exported (E))
 then
--  Check suspicious return of fat C pointer

--- gcc/ada/lib-xref.adb
+++ gcc/ada/lib-xref.adb
@@ -1652,7 +1652,7 @@ package body Lib.Xref is
   begin
  --  Generate language name from convention
 
- if Conv  = Convention_C then
+ if Conv = Convention_C or else Conv in Convention_C_Variadic then
 Language_Name := Name_C;
 
  elsif Conv = Convention_CPP then

--- gcc/ada/repinfo.adb
+++ gcc/ada/repinfo.adb
@@ -1935,6 +1935,21 @@ package body Repinfo is
  when Convention_C =>
 Write_Str ("C");
 
+ when Convention_C_Variadic =>
+declare
+   N : Nat :=
+ Convention_Id'Pos (Convention (Ent)) -
+   Convention_Id'Pos (Convention_C_Variadic_0);
+begin
+   Write_Str ("C_Variadic_");
+   if N >= 10 then
+  Write_Char ('1');
+  N := N - 10;
+   end if;
+   pragma Assert (N < 10);
+   Write_Char (Character'Val (Character'Pos ('0') + N));
+end;
+
  when Convention_COBOL =>
 Write_Str ("COBOL");
 

--- gcc/ada/sem_mech.adb
+++ gcc/ada/sem_mech.adb
@@ -181,11 +181,10 @@ package body Sem_Mech is
-- C --
---
 
-   --  Note: Assembler, C++, Stdcall also use C conventions
+   --  Note: Assembler and Stdcall also use C conve

[Ada] Expand more others aggregates statically

2020-06-09 Thread Pierre-Marie de Rodat
Now that there is no artifical limit on Convert_To_Positional, we use
the same heuristics than in Aggr_Size_OK, to share the code and to
generate static aggregates in more cases.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* exp_aggr.adb (Max_Aggregate_Size): New function to factorize
code.
(Convert_To_Positional, Aggr_Size_OK): Use Max_Aggregate_Size.--- gcc/ada/exp_aggr.adb
+++ gcc/ada/exp_aggr.adb
@@ -289,19 +289,12 @@ package body Exp_Aggr is
--  construct the allocated object on the heap.
 
procedure Convert_To_Positional
- (N: Node_Id;
-  Max_Others_Replicate : Nat := 32;
-  Handle_Bit_Packed: Boolean := False);
+ (N : Node_Id;
+  Handle_Bit_Packed : Boolean := False);
--  If possible, convert named notation to positional notation. This
--  conversion is possible only in some static cases. If the conversion is
--  possible, then N is rewritten with the analyzed converted aggregate.
-   --  The parameter Max_Others_Replicate controls the maximum number of
-   --  values corresponding to an others choice that will be converted to
-   --  positional notation (the default of 32 is the normal limit, and reflects
-   --  the fact that normally the loop is better than a lot of separate
-   --  assignments). Note that this limit gets overridden in any case if
-   --  either of the restrictions No_Elaboration_Code or No_Implicit_Loops is
-   --  set. The parameter Handle_Bit_Packed is usually set False (since we do
+   --  The parameter Handle_Bit_Packed is usually set False (since we do
--  not expect the back end to handle bit packed arrays, so the normal case
--  of conversion is pointless), but in the special case of a call from
--  Packed_Array_Aggregate_Handled, we set this parameter to True, since
@@ -320,6 +313,12 @@ package body Exp_Aggr is
--  an array that is suitable for this optimization: it returns True if Typ
--  is a two dimensional bit packed array with component size 1, 2, or 4.
 
+   function Max_Aggregate_Size
+ (Typ  : Entity_Id;
+  Default_Size : Nat := 5000) return Nat;
+   --  Return the max size for a static aggregate for the given Typ.
+   --  Return Default_Size if no other special criteria triggers.
+
function Packed_Array_Aggregate_Handled (N : Node_Id) return Boolean;
--  Given an array aggregate, this function handles the case of a packed
--  array aggregate with all constant values, where the aggregate can be
@@ -429,43 +428,15 @@ package body Exp_Aggr is
--  Start of processing for Aggr_Size_OK
 
begin
-  --  The normal aggregate limit is 50, but we increase this limit to
-  --  2**24 (about 16 million) if Restrictions (No_Elaboration_Code) or
-  --  Restrictions (No_Implicit_Loops) is specified, since in either case
-  --  we are at risk of declaring the program illegal because of this
-  --  limit. We also increase the limit when Static_Elaboration_Desired,
-  --  given that this means that objects are intended to be placed in data
-  --  memory.
-
-  --  We also increase the limit if the aggregate is for a packed two-
-  --  dimensional array, because if components are static it is much more
-  --  efficient to construct a one-dimensional equivalent array with static
-  --  components.
-
-  --  Conversely, we decrease the maximum size if none of the above
-  --  requirements apply, and if the aggregate has a single component
+  --  We bump the maximum size unless the aggregate has a single component
   --  association, which will be more efficient if implemented with a loop.
 
-  --  Finally, we use a small limit in CodePeer mode where we favor loops
-  --  instead of thousands of single assignments (from large aggregates).
-
-  Max_Aggr_Size := 50;
-
-  if CodePeer_Mode then
- Max_Aggr_Size := 100;
-
-  elsif Restriction_Active (No_Elaboration_Code)
-or else Restriction_Active (No_Implicit_Loops)
-or else Is_Two_Dim_Packed_Array (Typ)
-or else (Ekind (Current_Scope) = E_Package
-   and then Static_Elaboration_Desired (Current_Scope))
-  then
- Max_Aggr_Size := 2 ** 24;
-
-  elsif No (Expressions (N))
+  if No (Expressions (N))
 and then No (Next (First (Component_Associations (N
   then
- Max_Aggr_Size := 5000;
+ Max_Aggr_Size := Max_Aggregate_Size (Typ);
+  else
+ Max_Aggr_Size := Max_Aggregate_Size (Typ, 500_000);
   end if;
 
   Size := UI_From_Int (Component_Count (Component_Type (Typ)));
@@ -4561,11 +4532,11 @@ package body Exp_Aggr is
---
 
procedure Convert_To_Positional
- (N: Node_Id;
-  Max_Others_Replicate : Nat := 32;
-  Handle_Bit_Packed: Boolean := False)
+ (N  

[Ada] Small cleanup in Einfo unit

2020-06-09 Thread Pierre-Marie de Rodat
This moves around the declarations of functions that are not in the
right category, adds pragma Inline for others and fixes alphabetization
issues.

This also clearly splits the functions subject to pragma Inline from the
procedures subject to the pragma, the former being read by XEINFO but
not the latter, and adds a comment stating the distinction.

No functional changes.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Eric Botcazou  

gcc/ada/

* einfo.ads (Has_Foreign_Convention): Fix description.
(Component_Alignment): Move around.
(Has_DIC): Likewise.
(Has_Interrupt_Handler): Likewise.
(Has_Invariants): Likewise.
(Is_Atomic_Or_VFA): Likewise.
(Next_Index): Likewise.
(Scope_Depth): Likewise.
(Init_Component_Size): Likewise.
(Init_Component_Location): Likewise.
(Init_Size): Likewise.
(Inline Pragmas for functions): Add Corresponding_Function,
Corresponding_Procedure, Entry_Max_Queue_Lengths_Array,
Finalize_Storage_Only, Has_DIC, Has_Invariants,
Initialization_Statements, Is_Anonymous_Access_Type,
Next_Stored_Discriminant, Address_Clause, Alignment_Clause,
Float_Rep, Has_Foreign_Convention, Has_Non_Limited_View,
Is_Constant_Object, Is_Discriminal, Is_Finalizer, Is_Null_State,
Is_Prival, Is_Protected_Component, Is_Protected_Record_Type,
Is_Subprogram_Or_Entry, Is_Task_Record_Type, Size_Clause,
Stream_Size_Clause, Type_High_Bound, Type_Low_Bound, Known_*,
Unknown_*.
(Inline Pragmas for procedures): Add Set_Corresponding_Function,
Set_Corresponding_Procedure, Set_Finalize_Storage_Only,
Set_Float_Rep, Set_Initialization_Statements,
Init_Normalized_First_Bit, Init_Normalized_Position,
Init_Normalized_Position_Max.
* einfo.adb (Was_Hidden): Move around.
(Is_Packed_Array): Likewise.
(Model_Emin_Value): Likewise.
(Model_Epsilon_Value): Likewise.
(Model_Mantissa_Value): Likewise.
(Model_Small_Value): Likewise.--- gcc/ada/einfo.adb
+++ gcc/ada/einfo.adb
@@ -3615,6 +3615,11 @@ package body Einfo is
   return Flag238 (Id);
end Warnings_Off_Used_Unreferenced;
 
+   function Was_Hidden (Id : E) return B is
+   begin
+  return Flag196 (Id);
+   end Was_Hidden;
+
function Wrapped_Entity (Id : E) return E is
begin
   pragma Assert (Ekind_In (Id, E_Function, E_Procedure)
@@ -3622,11 +3627,6 @@ package body Einfo is
   return Node27 (Id);
end Wrapped_Entity;
 
-   function Was_Hidden (Id : E) return B is
-   begin
-  return Flag196 (Id);
-   end Was_Hidden;
-
--
-- Classification Functions --
--
@@ -8168,15 +8168,6 @@ package body Einfo is
 Ekind (Id) = E_Abstract_State and then Nkind (Parent (Id)) = N_Null;
end Is_Null_State;
 
-   -
-   -- Is_Packed_Array --
-   -
-
-   function Is_Packed_Array (Id : E) return B is
-   begin
-  return Is_Array_Type (Id) and then Is_Packed (Id);
-   end Is_Packed_Array;
-
---
-- Is_Package_Or_Generic_Package --
---
@@ -8186,6 +8177,15 @@ package body Einfo is
   return Ekind_In (Id, E_Generic_Package, E_Package);
end Is_Package_Or_Generic_Package;
 
+   -
+   -- Is_Packed_Array --
+   -
+
+   function Is_Packed_Array (Id : E) return B is
+   begin
+  return Is_Array_Type (Id) and then Is_Packed (Id);
+   end Is_Packed_Array;
+
---
-- Is_Prival --
---
@@ -8404,44 +8404,6 @@ package body Einfo is
   Set_Next_Entity (First, Second); --  First --> Second
end Link_Entities;
 
-   --
-   -- Model_Emin_Value --
-   --
-
-   function Model_Emin_Value (Id : E) return Uint is
-   begin
-  return Machine_Emin_Value (Id);
-   end Model_Emin_Value;
-
-   -
-   -- Model_Epsilon_Value --
-   -
-
-   function Model_Epsilon_Value (Id : E) return Ureal is
-  Radix : constant Ureal := UR_From_Uint (Machine_Radix_Value (Id));
-   begin
-  return Radix ** (1 - Model_Mantissa_Value (Id));
-   end Model_Epsilon_Value;
-
-   --
-   -- Model_Mantissa_Value --
-   --
-
-   function Model_Mantissa_Value (Id : E) return Uint is
-   begin
-  return Machine_Mantissa_Value (Id);
-   end Model_Mantissa_Value;
-
-   ---
-   -- Model_Small_Value --
-   ---
-
-   function Model_Small_Value (Id : E) return Ureal is
-  Radix : constant Ureal := UR_From_Uint (Machine_Radix_Value (Id));
-   begin
-  return Radix ** (Model_Emin_Value (Id) - 1);
-   end Model_Small_Value;
-

-- Machine_Emax_Value 

[Ada] Code clean ups and comments updates

2020-06-09 Thread Pierre-Marie de Rodat
We identify all the places where error messages are issued in the
expander and should be issued in semantic analysis instead, for future
clean ups.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* exp_ch3.adb, exp_ch4.adb, exp_ch6.adb, exp_ch9.adb,
exp_disp.adb, exp_util.adb: Add comments related to errors that
should be moved to semantic analysis. Also replace "?" with "??"
in warning messages.--- gcc/ada/exp_ch3.adb
+++ gcc/ada/exp_ch3.adb
@@ -4456,6 +4456,8 @@ package body Exp_Ch3 is
 
   procedure Check_Attr (Nam : Name_Id; TSS_Nam : TSS_Name_Type) is
   begin
+ --  Move this check to sem???
+
  if not Stream_Attribute_Available (Etype (Comp), TSS_Nam) then
 Error_Msg_Name_1 := Nam;
 Error_Msg_N
@@ -9039,6 +9041,8 @@ package body Exp_Ch3 is
   end loop;
 
   pragma Assert (Present (Comp));
+
+  --  Move this check to sem???
   Error_Msg_Node_2 := Comp;
   Error_Msg_NE
 ("parent type & with dynamic component & cannot be parent"

--- gcc/ada/exp_ch4.adb
+++ gcc/ada/exp_ch4.adb
@@ -4486,7 +4486,7 @@ package body Exp_Ch4 is
and then Nkind (Associated_Node_For_Itype (PtrT)) =
   N_Object_Declaration)
   then
- Error_Msg_N ("?use of an anonymous access type allocator", N);
+ Error_Msg_N ("??use of an anonymous access type allocator", N);
   end if;
 
   --  RM E.2.2(17). We enforce that the expected type of an allocator

--- gcc/ada/exp_ch6.adb
+++ gcc/ada/exp_ch6.adb
@@ -1960,7 +1960,7 @@ package body Exp_Ch6 is
  then
 if Comes_From_Source (N) then
Error_Msg_N
- ("?atomic actual passed by copy (RM C.6(19))", Actual);
+ ("??atomic actual passed by copy (RM C.6(19))", Actual);
 end if;
 return True;
  end if;
@@ -1971,7 +1971,7 @@ package body Exp_Ch6 is
  then
 if Comes_From_Source (N) then
Error_Msg_N
- ("?volatile actual passed by copy (RM C.6(19))", Actual);
+ ("??volatile actual passed by copy (RM C.6(19))", Actual);
 end if;
 return True;
  end if;
@@ -3921,6 +3921,8 @@ package body Exp_Ch6 is
 if Present (Ass)
   and then Is_Class_Wide_Type (Etype (Name (Ass)))
 then
+   --  Move the error messages below to sem???
+
if Is_Access_Type (Etype (Call_Node)) then
   if Designated_Type (Etype (Call_Node)) /=
 Root_Type (Etype (Name (Ass)))
@@ -4115,6 +4117,8 @@ package body Exp_Ch6 is
 
  Set_Entity (Name (Call_Node), Parent_Subp);
 
+ --  Move this check to sem???
+
  if Is_Abstract_Subprogram (Parent_Subp)
and then not In_Instance
  then

--- gcc/ada/exp_ch9.adb
+++ gcc/ada/exp_ch9.adb
@@ -11168,6 +11168,7 @@ package body Exp_Ch9 is
then
   null;
else
+  --  Move this check to sem???
   Error_Msg_NE (
 "& is not a time type (RM 9.6(6))",
Expression (Delay_Statement (Alt)), Time_Type);

--- gcc/ada/exp_disp.adb
+++ gcc/ada/exp_disp.adb
@@ -3836,6 +3836,7 @@ package body Exp_Disp is
   --  tagged type, when one of its primitive operations has a type in its
   --  profile whose full view has not been analyzed yet. More complex cases
   --  involve composite types that have one private unfrozen subcomponent.
+  --  Move this check to sem???
 
   procedure Export_DT (Typ : Entity_Id; DT : Entity_Id; Index : Nat := 0);
   --  Export the dispatch table DT of tagged type Typ. Required to generate
@@ -8150,6 +8151,7 @@ package body Exp_Disp is
 --  We exclude Input and Output stream operations because
 --  Limited_Controlled inherits useless Input and Output stream
 --  operations from Root_Controlled, which can never be overridden.
+--  Move this check to sem???
 
 if not Is_TSS (Prim, TSS_Stream_Input)
  and then

--- gcc/ada/exp_util.adb
+++ gcc/ada/exp_util.adb
@@ -1296,6 +1296,7 @@ package body Exp_Util is
--  of the type. In the case of an inherited condition for an
--  overriding operation, both the operation and the function
--  are given by primitive wrappers.
+   --  Move this check to sem???
 
if Ekind (New_E) = E_Function
  and then Is_Primitive_Wrapper (New_E)
@@ -1326,6 +1327,7 @@ package body Exp_Util is
 
 --  Check that there are no calls left to abstract operations if
 --  the current subprogram is not abstract.
+ 

[Ada] Missing check on private overriding of dispatching primitive

2020-06-09 Thread Pierre-Marie de Rodat
When a dispatching operation overrides an inherited subprogram, it must
be subtype conformant with the inherited subprogram.  Such check is not
performed by the frontend on overriding primitives of private types
which are declared before the full type declaration of the private type.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Javier Miranda  

gcc/ada/

* sem_ch6.adb (New_Overloaded_Entity): Add missing call to check
subtype conformance of overriding dispatching primitive.
* sem_eval.adb (Subtypes_Statically_Match): Handle derivations
of private subtypes.
* libgnat/g-exptty.adb, libgnat/g-exptty.ads
(Set_Up_Communications): Fix the profile since null-exclusion is
missing in the access type formals.
* sem_disp.ads (Check_Operation_From_Private_View): Adding
documentation.--- gcc/ada/libgnat/g-exptty.adb
+++ gcc/ada/libgnat/g-exptty.adb
@@ -314,9 +314,9 @@ package body GNAT.Expect.TTY is
overriding procedure Set_Up_Communications
  (Pid: in out TTY_Process_Descriptor;
   Err_To_Out : Boolean;
-  Pipe1  : access Pipe_Type;
-  Pipe2  : access Pipe_Type;
-  Pipe3  : access Pipe_Type)
+  Pipe1  : not null access Pipe_Type;
+  Pipe2  : not null access Pipe_Type;
+  Pipe3  : not null access Pipe_Type)
is
   pragma Unreferenced (Err_To_Out, Pipe1, Pipe2, Pipe3);
 

--- gcc/ada/libgnat/g-exptty.ads
+++ gcc/ada/libgnat/g-exptty.ads
@@ -116,9 +116,9 @@ private
procedure Set_Up_Communications
  (Pid: in out TTY_Process_Descriptor;
   Err_To_Out : Boolean;
-  Pipe1  : access Pipe_Type;
-  Pipe2  : access Pipe_Type;
-  Pipe3  : access Pipe_Type);
+  Pipe1  : not null access Pipe_Type;
+  Pipe2  : not null access Pipe_Type;
+  Pipe3  : not null access Pipe_Type);
 
procedure Set_Up_Parent_Communications
  (Pid   : in out TTY_Process_Descriptor;

--- gcc/ada/sem_ch6.adb
+++ gcc/ada/sem_ch6.adb
@@ -11177,6 +11177,18 @@ package body Sem_Ch6 is
  Inherit_Subprogram_Contract (E, S);
   end if;
 
+  --  When a dispatching operation overrides an inherited
+  --  subprogram, it shall be subtype conformant with the
+  --  inherited subprogram (RM 3.9.2 (10.2)).
+
+  if Comes_From_Source (E)
+and then Is_Dispatching_Operation (E)
+and then Find_Dispatching_Type (S)
+   = Find_Dispatching_Type (E)
+  then
+ Check_Subtype_Conformant (E, S);
+  end if;
+
   if Comes_From_Source (E) then
  Check_Overriding_Indicator (E, S, Is_Primitive => False);
 

--- gcc/ada/sem_disp.ads
+++ gcc/ada/sem_disp.ads
@@ -64,11 +64,11 @@ package Sem_Disp is
--  this call actually do???
 
procedure Check_Operation_From_Private_View (Subp, Old_Subp : Entity_Id);
-   --  Add Old_Subp to the list of primitive operations of the corresponding
-   --  tagged type if it is the full view of a private tagged type. The Alias
-   --  of Old_Subp is adjusted to point to the inherited procedure of the
-   --  full view because it is always this one which has to be called.
-   --  What is Subp used for???
+   --  No action performed if Subp is not an alias of a dispatching operation.
+   --  Add Old_Subp (if not already present) to the list of primitives of the
+   --  tagged type T of Subp if T is the full view of a private tagged type.
+   --  The Alias of Old_Subp is adjusted to point to the inherited procedure
+   --  of the full view because it is always this one which has to be called.
 
function Covered_Interface_Op (Prim : Entity_Id) return Entity_Id;
--  Returns the interface primitive that Prim covers, when its controlling

--- gcc/ada/sem_eval.adb
+++ gcc/ada/sem_eval.adb
@@ -6092,6 +6092,29 @@ package body Sem_Eval is
 
   elsif Has_Discriminants (T1) or else Has_Discriminants (T2) then
 
+ --  Handle derivations of private subtypes. For example S1 statically
+ --  matches the full view of T1 in the following example:
+
+ --  type T1(<>) is new Root with private;
+ --  subtype S1 is new T1;
+ --  overriding proc P1 (P : S1);
+ --private
+ --  type T1 (D : Disc) is new Root with ...
+
+ if Ekind (T2) = E_Record_Subtype_With_Private
+   and then not Has_Discriminants (T2)
+   and then Partial_View_Has_Unknown_Discr (T1)
+   and then Etype (T2) = T1
+ then
+return True;
+
+ elsif Ekind (T1) = E_Record_Subtype_With_Private
+   and then not Has_Discriminants (T1)
+   and then Partial_View_Has_Unknown_Discr (T2)
+   and then Etype (T1) = T2
+ then
+return True;
+
  -- 

[Ada] Propagate DIC, Invariant and Predicate attributes to views

2020-06-09 Thread Pierre-Marie de Rodat
This change is a step towards propagating more consistently the DIC,
Invariant and Predicate attributes between different views of the
same type, in particular to the newly built underlying full views.

It also cleans up the handling of the base types for the DIC and
Invariant attributes, which looks obsolete since everything related
to DIC and Invariant is already done on the base types, so doing it
again explicitly on them is superfluous.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Eric Botcazou  

gcc/ada/

* checks.adb (Apply_Predicate_Check): Extend trick used for
aggregates to qualified aggregates and object declarations
* einfo.ads (Has_Own_DIC): Mention the underlying full view.
(Has_Own_Invariants): Likewise.
(Has_Predicates): Likewise.
* exp_util.adb (Build_DIC_Procedure_Declaration): Do not deal
with base types explicitly but with underlying full views.
(Build_Invariant_Procedure_Declaration): Likewise.
* sem_ch13.adb (Build_Predicate_Functions): Do not deal with
the full view manually but call Propagate_Predicate_Attributes
to propagate attributes to views.
(Build_Predicate_Function_Declaration): Likewise.
* sem_ch3.adb (Build_Assertion_Bodies_For_Type): Build bodies
for private full views with an underlying full view.
(Build_Derived_Private_Type): Small comment tweak.
(Complete_Private_Subtype): Call Propagate_Predicate_Attributes.
(Process_Full_View): Do not deal with base types explicitly for
DIC and Invariant attributes.  Deal with underlying full views
for them.  Call Propagate_Predicate_Attributes and deal with
underlying full views for them.
* sem_ch7.adb (Preserve_Full_Attributes): Do not cross propagate
DIC and Invariant attributes between full type and its base type.
Propagate Predicate attributes from the full to the private view.
* sem_ch9.adb (Analyze_Protected_Type_Declaration): Likewise.
(Analyze_Task_Type_Declaration): Likewise.
* sem_util.ads (Get_Views): Remove Full_Base parameter and add
UFull_Typ parameter.
(Propagate_Predicate_Attributes): New procedure.
* sem_util.adb (Get_Views): Remove Full_Base parameter and add
UFull_Typ parameter.  Retrieve the Corresponding_Record_Type
from the underlying full view, if any.
(Propagate_DIC_Attributes): Remove useless tests.
(Propagate_Invariant_Attributes): Likewise.
(Propagate_Predicate_Attributes): New procedure.--- gcc/ada/checks.adb
+++ gcc/ada/checks.adb
@@ -2711,7 +2711,8 @@ package body Checks is
   Typ : Entity_Id;
   Fun : Entity_Id := Empty)
is
-  S : Entity_Id;
+  Par : Node_Id;
+  S   : Entity_Id;
 
begin
   if Predicate_Checks_Suppressed (Empty) then
@@ -2807,6 +2808,11 @@ package body Checks is
return;
 end if;
 
+Par := Parent (N);
+if Nkind (Par) = N_Qualified_Expression then
+   Par := Parent (Par);
+end if;
+
 --  For an entity of the type, generate a call to the predicate
 --  function, unless its type is an actual subtype, which is not
 --  visible outside of the enclosing subprogram.
@@ -2818,24 +2824,36 @@ package body Checks is
  Make_Predicate_Check
(Typ, New_Occurrence_Of (Entity (N), Sloc (N;
 
---  If the expression is not an entity it may have side effects,
---  and the following call will create an object declaration for
---  it. We disable checks during its analysis, to prevent an
---  infinite recursion.
-
---  If the prefix is an aggregate in an assignment, apply the
---  check to the LHS after assignment, rather than create a
+--  If the expression is an aggregate in an assignment, apply the
+--  check to the LHS after the assignment, rather than create a
 --  redundant temporary. This is only necessary in rare cases
 --  of array types (including strings) initialized with an
 --  aggregate with an "others" clause, either coming from source
 --  or generated by an Initialize_Scalars pragma.
 
-elsif Nkind (N) = N_Aggregate
-  and then Nkind (Parent (N)) = N_Assignment_Statement
+elsif Nkind_In (N, N_Aggregate, N_Extension_Aggregate)
+  and then Nkind (Par) = N_Assignment_Statement
 then
-   Insert_Action_After (Parent (N),
+   Insert_Action_After (Par,
  Make_Predicate_Check
-   (Typ, Duplicate_Subexpr (Name (Parent (N);
+   (Typ, Duplicate_Subexpr (Name (Par;
+
+--  Similarly, if the expression is an aggregate in an object
+--  decl

[Ada] Fix wrong type being used for range check generation

2020-06-09 Thread Pierre-Marie de Rodat
Before this commit, GNAT would use the type of the expression rather
than that of the subtype mark in order to decide whether it should
generate range checks or not. This means that in some cases, GNAT would
decide that no range checks were needed, which was wrong. This issue
went mostly unnoticed because the expansion routines use their own logic
in order to determine whether range checks should be generated or not.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ghjuvan Lacambre  

gcc/ada/

* sem_res.adb (Resolve_Qualified_Expression): Use Subtype_Mark
type.--- gcc/ada/sem_res.adb
+++ gcc/ada/sem_res.adb
@@ -10004,8 +10004,10 @@ package body Sem_Res is
   --  check may convert an illegal static expression and result in warning
   --  rather than giving an error (e.g Integer'(Integer'Last + 1)).
 
-  if Nkind (N) = N_Qualified_Expression and then Is_Scalar_Type (Typ) then
- Apply_Scalar_Range_Check (Expr, Typ);
+  if Nkind (N) = N_Qualified_Expression
+and then Is_Scalar_Type (Target_Typ)
+  then
+ Apply_Scalar_Range_Check (Expr, Target_Typ);
   end if;
 
   --  AI12-0100: Once the qualified expression is resolved, check whether



[Ada] Ada2020 AI12-0282: Shared variable control aspects in generics

2020-06-09 Thread Pierre-Marie de Rodat
This patch further refines the checks and the corresponding error
messsages on the legality of instantiations where some formal types and
their corresponding actuals may carry Atomic,  Volatile, etc.  aspect
specifications. The legality rules stated in RN 6.1 (12/5) are in fact
backward-incompatible because they violate a basic language design rule,
namely that limited private formal types impose no constraint on the
corresponding formals. Pending further ARG discussions the current code
reverts the checking for an exact match between specified aspects to
fornal derived types only.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Ed Schonberg  

gcc/ada/

* sem_ch12.adb (Check_Shared_Variable_Control_Aspects): Require
exact match between formal and actual for aspects Atomic and
Volatile only for formal derived types.--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -12394,19 +12394,38 @@ package body Sem_Ch12 is
 
   --  Ada 2020: Verify that shared variable control aspects (RM C.6)
   --  that may be specified for the formal are obeyed by the actual.
+  --  If the fornal is a derived type the aspect specifications must match.
+  --  NOTE: AI12-0282 implies that matching of aspects is required between
+  --  formal and actual in all cases, but this is too restrictive.
+  --  In particular it violates a language design rule: a limited private
+  --  indefinite formal can be matched by any actual. The current code
+  --  reflects an older and more permissve version of RM C.6 (12/5).
 
   procedure Check_Shared_Variable_Control_Aspects is
   begin
  if Ada_Version >= Ada_2020 then
-if Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T) then
+if Is_Atomic (A_Gen_T) and then not Is_Atomic (Act_T) then
+   Error_Msg_NE
+  ("actual for& must have Atomic aspect", Actual, A_Gen_T);
+
+elsif Is_Derived_Type (A_Gen_T)
+  and then Is_Atomic (A_Gen_T) /= Is_Atomic (Act_T)
+then
Error_Msg_NE
   ("actual for& has different Atomic aspect", Actual, A_Gen_T);
 end if;
 
-if Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T) then
+if Is_Volatile (A_Gen_T) and then not Is_Volatile (Act_T) then
Error_Msg_NE
   ("actual for& has different Volatile aspect",
 Actual, A_Gen_T);
+
+elsif Is_Derived_Type (A_Gen_T)
+  and then Is_Volatile (A_Gen_T) /= Is_Volatile (Act_T)
+then
+   Error_Msg_NE
+  ("actual for& has different Volatile aspect",
+ Actual, A_Gen_T);
 end if;
 
 --  We assume that an array type whose atomic component type



[Ada] Remove kludge for AI05-0087

2020-06-09 Thread Pierre-Marie de Rodat
This is a code clean up as part of removing all calls to Error_Msg* in
the expander.

Tested on x86_64-pc-linux-gnu, committed on trunk

2020-06-09  Arnaud Charlet  

gcc/ada/

* exp_ch5.adb (Expand_N_Assignment): Remove kludge for
AI05-0087.
* sem_ch12.adb (Validate_Derived_Type_Instance): Implement
AI05-0087 retroactively since it's a binding interpretation.--- gcc/ada/exp_ch5.adb
+++ gcc/ada/exp_ch5.adb
@@ -29,7 +29,6 @@ with Checks;   use Checks;
 with Debug;use Debug;
 with Einfo;use Einfo;
 with Elists;   use Elists;
-with Errout;   use Errout;
 with Exp_Aggr; use Exp_Aggr;
 with Exp_Ch6;  use Exp_Ch6;
 with Exp_Ch7;  use Exp_Ch7;
@@ -2664,25 +2663,13 @@ package body Exp_Ch5 is
  and then
not Restriction_Active (No_Dispatching_Calls))
 then
-   if Is_Limited_Type (Typ) then
-
-  --  This can happen in an instance when the formal is an
-  --  extension of a limited interface, and the actual is
-  --  limited. This is an error according to AI05-0087, but
-  --  is not caught at the point of instantiation in earlier
-  --  versions. We also must verify that the limited type does
-  --  not come from source as corner cases may exist where
-  --  an assignment was not intended like the pathological case
-  --  of a raise expression within a return statement.
-
-  --  This is wrong, error messages cannot be issued during
-  --  expansion, since they would be missed in -gnatc mode ???
-
-  if Comes_From_Source (N) then
- Error_Msg_N
-   ("assignment not available on limited type", N);
-  end if;
+   --  We should normally not encounter any limited type here,
+   --  except in the corner case where an assignment was not
+   --  intended like the pathological case of a raise expression
+   --  within a return statement.
 
+   if Is_Limited_Type (Typ) then
+  pragma Assert (not Comes_From_Source (N));
   return;
end if;
 

--- gcc/ada/sem_ch12.adb
+++ gcc/ada/sem_ch12.adb
@@ -13460,17 +13460,8 @@ package body Sem_Ch12 is
  --  explicitly so. If not declared limited, the actual cannot be
  --  limited (see AI05-0087).
 
- --  Even though this AI is a binding interpretation, we enable the
- --  check only in Ada 2012 mode, because this improper construct
- --  shows up in user code and in existing B-tests.
-
- if Is_Limited_Type (Act_T)
-   and then not Is_Limited_Type (A_Gen_T)
-   and then Ada_Version >= Ada_2012
- then
-if In_Instance then
-   null;
-else
+ if Is_Limited_Type (Act_T) and then not Is_Limited_Type (A_Gen_T) then
+if not In_Instance then
Error_Msg_NE
  ("actual for non-limited & cannot be a limited type",
   Actual, Gen_T);
@@ -13479,30 +13470,25 @@ package body Sem_Ch12 is
 end if;
  end if;
 
- --  Don't check Ada_Version here (for now) because AI12-0036 is
- --  a binding interpretation; this decision may be reversed if
- --  the situation turns out to be similar to that of the preceding
- --  Is_Limited_Type test (see preceding comment).
+ --  Check for AI12-0036
 
  declare
 Formal_Is_Private_Extension : constant Boolean :=
   Nkind (Parent (A_Gen_T)) = N_Private_Extension_Declaration;
 
 Actual_Is_Tagged : constant Boolean := Is_Tagged_Type (Act_T);
+
  begin
 if Actual_Is_Tagged /= Formal_Is_Private_Extension then
-   if In_Instance then
-  null;
-   else
+   if not In_Instance then
   if Actual_Is_Tagged then
  Error_Msg_NE
-   ("actual for & cannot be a tagged type",
-Actual, Gen_T);
+   ("actual for & cannot be a tagged type", Actual, Gen_T);
   else
  Error_Msg_NE
-   ("actual for & must be a tagged type",
-Actual, Gen_T);
+   ("actual for & must be a tagged type", Actual, Gen_T);
   end if;
+
   Abandon_Instantiation (Actual);
end if;
 end if;



Re: [patch] Make memory copy functions scalar storage order barriers

2020-06-09 Thread Richard Biener via Gcc-patches
On Mon, Jun 8, 2020 at 10:37 AM Eric Botcazou  wrote:
>
> > OK, but all I was saying is that looking at the pointed-to type of
> > the address argument to the memcpy looks fragile to me.  For the
> > case cited above it would be better to look at the actual
> > reference we are looking up and not allowing memcpy handling
> > when that reference contains a storage order barrier?  Like
> > a && !contains_storage_order_barrier_p (vr->operands) on the whole block
> > like we do for the assign from constant/SSA value case?
>
> The memcpy is the storage order barrier though, there is no other, so I'm not
> sure what contains_storage_order_barrier_p should be called on.  Or do you
> mean that an operand should be created for the memcpy with "reverse" set?

I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER
works and what the constraints are.  But if the memcpy is the storage order
barrier then what cases are OK?

 - both source and destination objects are [only] accessed with the same
   storage order
 - both source and destination objects are [only] accessed with
   !TYPE_REVERSE_STORAGE_ORDER "containing" accesses

and what determines the "object state" with respect to storage order
when we see a memcpy call?  I had the impression (from the other
existing cases of contains_storage_order_barrier_p () calls) that
we derive the "object state" from the downstream access (vr->operands),
in this case the memcpy destination access.  But see below for maybe
the easier case (memcpy folding).

> > But the above cited case is the _only_ case we're possibly building an
> > assignment of a variable - so it's enough to assert that in the above case
> > destvar is not TYPE_REVERSE_STORAGE_ORDER?
>
> What about
>
>   /* If we can perform the copy efficiently with first doing all loads
>  and then all stores inline it that way.  Currently efficiently
>  means that we can load all the memory into a single integer
>  register which is what MOVE_MAX gives us.  */
>   src_align = get_pointer_alignment (src);
>
> It's also a scalar copy, so it won't work either.

In my understanding of how TYPE_REVERSE_STORAGE_ORDER works
a memcpy, irrespective of its actual "implementation" should not break
anything since it transfers bytes.  In particular a 8 byte
TYPE_REVERSE_STORAGE_ORDER
object can be transfered ("memcpy"ed) with (ignoring TBAA)

  int tema = *(int *)src;
  *(int *)dst = tema;
  int temb = *((int *)src + 1);
  *((int *)dst + 1) = temb;

and what byteorder the int accesses are performed in is irrelevant
as long as read and write use the same byteorder.

> > The other cases all build an aggregate copy assignment with
> > a non-reverse-storage-order char[size] type which should be OK, no?
>
> Yes, theoritically, but SRA rewrites the access into a scalar access and we're
> back to square one (I tried).

But what's the issue with this?

> , so this can possibly work only if the array type
> is built with TYPE_REVERSE_STORAGE_ORDER too, which implies that the source is
> in reverse order too.  So the only case that could be handled correctly is the
> memcpy between two aggregates types with TYPE_REVERSE_STORAGE_ORDER.

Why?  Is the following an invalid program?

void mymemcpy (void *a, void *b) { memcpy (a, b, 8); }
void foo()
{
  int b[2] __attribute__(reverse_storage);
  int a[2];
  mymemcpy (a, b);
  return a[0];
}

how can GCC infer, inside mymemcpy that b is reverse storage order?

Thus, if all this is a problem then I think we're papering over the issue
by not folding memcpy.

> > Note even for the above we probably would be fine if we'd made sure
> > to use a !TYPE_REVERSE_STORAGE_ORDER "qualified" type
> > for the access.  So the memcpy happens via a native order
> > load plus a native order store, exactly what memcpy does (in fact
> > the actual order does not matter unless the source and destination
> > order "agree")?  Thus maybe amend
> >
> >   /* Make sure we are not copying using a floating-point mode or
> >  a type whose size possibly does not match its precision.  */
> >   if (FLOAT_MODE_P (TYPE_MODE (desttype))
> >
> >   || TREE_CODE (desttype) == BOOLEAN_TYPE
> >   || TREE_CODE (desttype) == ENUMERAL_TYPE)
> >
> > desttype = bitwise_type_for_mode (TYPE_MODE (desttype));
> >   if (FLOAT_MODE_P (TYPE_MODE (srctype) )
> >
> >   || TREE_CODE (srctype) == BOOLEAN_TYPE
> >   || TREE_CODE (srctype) == ENUMERAL_TYPE)
> >
> > srctype = bitwise_type_for_mode (TYPE_MODE (srctype));
> >
> > with || TYPE_REVERSE_STORAGE_ORDER (srctype) (same for desttype)?
>
> We don't have TYPE_REVERSE_STORAGE_ORDER qualified types, the flag is only set
> on aggregate types and bitwise_type_for_mode will return an integer type.

Exactly.  When the original src/desttype turns out to be a RECORD_TYPE (IIRC
that can happen) with TYPE_REVERSE_STORAGE_ORDER make sure we're
using a bitwise_type_for_mode.  Because then the copying w

Re: Fix ICE with typeless storage

2020-06-09 Thread Richard Biener via Gcc-patches
On Fri, Jun 5, 2020 at 11:08 AM Jan Hubicka  wrote:
>
> Hi,
> this patch fixes ICE while bulding Firefox on assert in
> record_component_aliases which checks that the alias set did not change
> while peeling off vector and array types.
> This is done so we can translate notice references to pointers.
>
> If array is TYPE_TYPELESS_STORAGE then it is not true that array type
> woud have same alias set as its elements since its alias set is 0, so we
> need to watch for this. (And it is safe to miss pointer here once we add
> alias set 0 as a component)

OK.

> Honza
>
> * alias.c (record_component_aliases): Watch for typeless storage while
> skipping the ARRAY_TREE wrappers.
>
> diff --git a/gcc/alias.c b/gcc/alias.c
> index 49bd7b37966..2bed5e78c62 100644
> --- a/gcc/alias.c
> +++ b/gcc/alias.c
> @@ -1273,8 +1273,12 @@ record_component_aliases (tree type, alias_set_type 
> superset)
> {
>   /* VECTOR_TYPE and ARRAY_TYPE share the alias set with their
>  element type and that type has to be normalized to void 
> *,
> -too, in the case it is a pointer. */
> - while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t))
> +too, in the case it is a pointer.
> +An exception is array with TYPE_TYPELESS_STORAGE which
> +has alias set 0.  */
> + while (!canonical_type_used_p (t) && !POINTER_TYPE_P (t)
> +&& (!AGGREGATE_TYPE_P (t)
> +|| !TYPE_TYPELESS_STORAGE (t)))
> {
>   gcc_checking_assert (TYPE_STRUCTURAL_EQUALITY_P (t));
>   t = TREE_TYPE (t);


[PATCH/RFC] How to fix PR95440

2020-06-09 Thread Iain Sandoe

Hi C++ folks,

There are a number of places in the coroutine codegen where we need to  
build calls to promise methods.


Such methods can, in several cases, also be constexpr or static (or both)  
which leads to the PR,


Now the mechanism I use is this;

1/ lookup the method in the relevant scope

  tree method
= lookup_member (promise, member_id,  /*protect=*/1, /*want_type=*/0, 
tf_warning_or_error);

- this returns a BASELINK tree..

2/ In many cases, the calls are to methods with no parms.

so, with a dummy object, I do:

tree call
  = build_new_method_call (dummy, method, NULL, NULL_TREE, LOOKUP_NORMAL, NULL, 
flags);

‘args’ is the 3rd parm.

3/ The header comment on build_new_method_call says:

/* Build a call to "INSTANCE.FN (ARGS)".  If FN_P is non-NULL, it will
   be set, upon return, to the function called.  ARGS may be NULL.
   This may change ARGS.  */

Which says that the 3rd arg can validly be NULL (as per my example).

4/ however, inside build_new_method_call  ‘args' gets assigned to  
‘user_args’ which is used in a lot of places without any guard to see if  
it’s non-null.   One of those places is passing it to add_candidates (),  
which eventually in the case of a static method uses the pointer and ICEs.


=

So…

A/  I’m doing something wrong, and my use of APIs is violating an  
expectation (in which case what’s the right API usage?)


B/ It happens to be other callers to build_new_method_call() have either  
had dummy args or not produced the same set of conditions


(for my case of static promise methods)

 * if I provide an empty dummy ‘args' to the  build_new_method_call it works
 * if I fix the unguarded use of ‘args' in add_candidates (as below) it works,

AFAICT, the intent is to avoid every caller for build_new_method_call()  
having to build a dummy argument list when that list is empty - it would  
seem irritating to have to add one for every use in coros - but I can.


WDYT?
thanks
Iain



the fix for call.c:

diff --git a/gcc/cp/call.c b/gcc/cp/call.c
index 3c97b9846e2..dccc1bf596f 100644
--- a/gcc/cp/call.c
+++ b/gcc/cp/call.c
@@ -5862,7 +5862,7 @@ add_candidates (tree fns, tree first_arg, const  
vec *args,

}

   /* Don't bother reversing an operator with two identical parameters.  */
-  else if (args->length () == 2 && (flags & LOOKUP_REVERSED))
+  else if (args && args->length () == 2 && (flags & LOOKUP_REVERSED))
{
  tree parmlist = TYPE_ARG_TYPES (TREE_TYPE (fn));
  if (same_type_p (TREE_VALUE (parmlist),






Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Martin Liška

On 6/9/20 9:48 AM, Marco Elver wrote:

v2:
* Add Optimization keyword to -param=tsan-distinguish-volatile= as the
   parameter can be different per TU.
* Add tree-dump check to test.


Hello.

I support the patch.
@Jakub: Can you please approve the patch?

Thanks,
Martin


[PATCH PR95569] ICE in tmmark:verify_ssa failed

2020-06-09 Thread qianchao
Hi

Commit eb72dc663e9070b281be83a80f6f838a3a878822 introduces a ICE on AArch64. 
See the discussion on the bug for details.
Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95569

Trivial patch fixing the ICE attached.

Bootstrap and tested on aarch64 platform. No new regression witnessed. ok for 
trunk?

2020-06-09  Qian Chao  

gcc/:

PR tree-optimization/95569
* trans-mem.c (expand_assign_tm): Ensure that rtmp is marked 
TREE_ADDRESSABLE.
* testsuite/gcc.dg/tm/pr51472.c: New test.


PR95569-v0.patch
Description: PR95569-v0.patch


Re: [PATCH v2] arm: Warn if IRQ handler is not compiled with -mgeneral-regs-only [PR target/94743]

2020-06-09 Thread Christophe Lyon via Gcc-patches
Ping?

Maybe I could mention that LLVM emits a warning in this case
(https://reviews.llvm.org/D28820).

Thanks,

Christophe


On Wed, 3 Jun 2020 at 15:23, Christophe Lyon  wrote:
>
> Ping?
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545747.html
>
> On Wed, 27 May 2020 at 13:52, Christophe Lyon
>  wrote:
> >
> > Ping?
> >
> > On Thu, 14 May 2020 at 16:57, Christophe Lyon
> >  wrote:
> > >
> > > The interrupt attribute does not guarantee that the FP registers are
> > > saved, which can result in problems difficult to debug.
> > >
> > > Saving the FP registers and status registers can be a large penalty,
> > > so it's probably not desirable to do that all the time.
> > >
> > > If the handler calls other functions, we'd likely need to save all of
> > > them, for lack of knowledge of which registers they actually clobber.
> > >
> > > This is even more obscure for the end-user when the compiler inserts
> > > calls to helper functions such as memcpy (some multilibs do use FP
> > > registers to speed it up).
> > >
> > > In the PR, we discussed adding routines in libgcc to save the FP
> > > context and saving only locally-clobbered FP registers, but this seems
> > > to be too much work for the purpose, given that in general such
> > > handlers try to avoid this kind of penalty.
> > > I suspect we would also want new attributes to instruct the compiler
> > > that saving the FP context is not needed.
> > >
> > > In the mean time, emit a warning to suggest re-compiling with
> > > -mgeneral-regs-only. Note that this can lead to errors if the code
> > > uses floating-point and -mfloat-abi=hard, eg:
> > > argument of type 'double' not permitted with -mgeneral-regs-only
> > >
> > > This can be troublesome for the user, but at least this would make
> > > him aware of the latent issue.
> > >
> > > The patch adds several testcases:
> > >
> > > - pr94734-1-hard.c checks that a warning is emitted when using
> > >   -mfloat-abi=hard. Function IRQ_HDLR_Test can make implicit calls to
> > >   runtime floating-point routines (or direct use of FP instructions),
> > >   IRQ_HDLR_Test2 doesn't. We emit a warning in both cases, though.
> > >
> > > - pr94734-1-softfp.c: same as above wih -mfloat-abi=softfp.
> > >
> > > - pr94734-1-soft.c checks that no warning is emitted when using
> > >   -mfloat-abi=soft when the same code as above.
> > >
> > > - pr94734-2.c checks that no warning is emitted when using
> > >   -mgeneral-regs-only.
> > >
> > > - pr94734-3.c checks that no warning is emitted when using
> > >   -mgeneral-regs-only even using float-point data.
> > >
> > > 2020-05-14  Christophe Lyon  
> > >
> > > PR target/94743
> > > gcc/
> > > * config/arm/arm.c (arm_handle_isr_attribute): Warn if
> > > -mgeneral-regs-only is not used.
> > >
> > > gcc/testsuite/
> > > * gcc.misc-tests/arm-isr.c: Add -mgeneral-regs-only.
> > > * gcc.target/arm/empty_fiq_handler.c: Add -mgeneral-regs-only.
> > > * gcc.target/arm/interrupt-1.c: Add -mgeneral-regs-only.
> > > * gcc.target/arm/interrupt-2.c: Add -mgeneral-regs-only.
> > > * gcc.target/arm/pr70830.c: Add -mgeneral-regs-only.
> > > * gcc.target/arm/pr94743-1-hard.c: New test.
> > > * gcc.target/arm/pr94743-1-soft.c: New test.
> > > * gcc.target/arm/pr94743-1-softfp.c: New test.
> > > * gcc.target/arm/pr94743-2.c: New test.
> > > * gcc.target/arm/pr94743-3.c: New test.
> > > ---
> > >  gcc/config/arm/arm.c |  5 
> > >  gcc/testsuite/gcc.misc-tests/arm-isr.c   |  2 ++
> > >  gcc/testsuite/gcc.target/arm/empty_fiq_handler.c |  1 +
> > >  gcc/testsuite/gcc.target/arm/interrupt-1.c   |  2 +-
> > >  gcc/testsuite/gcc.target/arm/interrupt-2.c   |  2 +-
> > >  gcc/testsuite/gcc.target/arm/pr70830.c   |  2 +-
> > >  gcc/testsuite/gcc.target/arm/pr94743-1-hard.c| 29 
> > > 
> > >  gcc/testsuite/gcc.target/arm/pr94743-1-soft.c| 27 
> > > ++
> > >  gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c  | 29 
> > > 
> > >  gcc/testsuite/gcc.target/arm/pr94743-2.c | 22 ++
> > >  gcc/testsuite/gcc.target/arm/pr94743-3.c | 23 +++
> > >  11 files changed, 141 insertions(+), 3 deletions(-)
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-hard.c
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-soft.c
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-1-softfp.c
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-2.c
> > >  create mode 100644 gcc/testsuite/gcc.target/arm/pr94743-3.c
> > >
> > > diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
> > > index dda8771..c88de3e 100644
> > > --- a/gcc/config/arm/arm.c
> > > +++ b/gcc/config/arm/arm.c
> > > @@ -7232,6 +7232,11 @@ arm_handle_isr_attribute (tree *node, tree name, 
> > > tree args, int flags,
> > >

Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 09:48:34AM +0200, Marco Elver wrote:
> gcc/
>   * params.opt: Define --param=tsan-distinguish-volatile=[0,1].
>   * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
>   builtin for volatile instrumentation of reads/writes.
>   (BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
>   * tsan.c (get_memory_access_decl): Argument if access is
>   volatile. If param tsan-distinguish-volatile is non-zero, and
>   access if volatile, return volatile instrumentation decl.
>   (instrument_expr): Check if access is volatile.
> 
> gcc/testsuite/
>   * c-c++-common/tsan/volatile.c: New test.

In general looks ok, just some minor nits.

> --- a/gcc/params.opt
> +++ b/gcc/params.opt
> @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best 
> edge is less than this th
>  Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization
>  Set the maximum number of instructions executed in parallel in reassociated 
> tree.  If 0, use the target dependent heuristic.
>  
> +-param=tsan-distinguish-volatile=
> +Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 
> 1) Param Optimization

Do we need/want Optimization here?  Optimization means the option is
per-function, but to me whether you want to distinguish volatiles or not
seems to be a global decision for the whole project.

> +Emit special instrumentation for accesses to volatiles.
> +
>  -param=uninit-control-dep-attempts=
>  Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) 
> IntegerRange(1, 65536) Param Optimization
>  Maximum number of nested calls to search for control dependencies during 
> uninitialized variable analysis.
> --- a/gcc/sanitizer.def
> +++ b/gcc/sanitizer.def
> @@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, 
> "__tsan_read_range",
>  DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range",
> BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
>  
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, 
> "__tsan_volatile_read16",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, 
> "__tsan_volatile_write1",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, 
> "__tsan_volatile_write2",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE4, 
> "__tsan_volatile_write4",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE8, 
> "__tsan_volatile_write8",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, 
> "__tsan_volatile_write16",
> +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)

This last entry is already too long (line limit 80 chars), so should be
wrapped like:
DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16,
  "__tsan_volatile_write16", BT_FN_VOID_PTR,
  ATTR_NOTHROW_LEAF_LIST)
instead.

> --- a/gcc/tsan.c
> +++ b/gcc/tsan.c
> @@ -52,25 +52,41 @@ along with GCC; see the file COPYING3.  If not see
> void __tsan_read/writeX (void *addr);  */
>  
>  static tree
> -get_memory_access_decl (bool is_write, unsigned size)
> +get_memory_access_decl (bool is_write, unsigned size, bool volatilep)
>  {
>enum built_in_function fcode;
>  
> -  if (size <= 1)
> -fcode = is_write ? BUILT_IN_TSAN_WRITE1
> -  : BUILT_IN_TSAN_READ1;
> -  else if (size <= 3)
> -fcode = is_write ? BUILT_IN_TSAN_WRITE2
> -  : BUILT_IN_TSAN_READ2;
> -  else if (size <= 7)
> -fcode = is_write ? BUILT_IN_TSAN_WRITE4
> -  : BUILT_IN_TSAN_READ4;
> -  else if (size <= 15)
> -fcode = is_write ? BUILT_IN_TSAN_WRITE8
> -  : BUILT_IN_TSAN_READ8;
> +  if (param_tsan_distinguish_volatile && vo

Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Marco Elver via Gcc-patches
On Tue, 9 Jun 2020 at 11:50, Jakub Jelinek  wrote:
>
> On Tue, Jun 09, 2020 at 09:48:34AM +0200, Marco Elver wrote:
> > gcc/
> >   * params.opt: Define --param=tsan-distinguish-volatile=[0,1].
> >   * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
> >   builtin for volatile instrumentation of reads/writes.
> >   (BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
> >   * tsan.c (get_memory_access_decl): Argument if access is
> >   volatile. If param tsan-distinguish-volatile is non-zero, and
> >   access if volatile, return volatile instrumentation decl.
> >   (instrument_expr): Check if access is volatile.
> >
> > gcc/testsuite/
> >   * c-c++-common/tsan/volatile.c: New test.
>
> In general looks ok, just some minor nits.
>
> > --- a/gcc/params.opt
> > +++ b/gcc/params.opt
> > @@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best 
> > edge is less than this th
> >  Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization
> >  Set the maximum number of instructions executed in parallel in 
> > reassociated tree.  If 0, use the target dependent heuristic.
> >
> > +-param=tsan-distinguish-volatile=
> > +Common Joined UInteger Var(param_tsan_distinguish_volatile) 
> > IntegerRange(0, 1) Param Optimization
>
> Do we need/want Optimization here?  Optimization means the option is
> per-function, but to me whether you want to distinguish volatiles or not
> seems to be a global decision for the whole project.

Adding Optimization here was Martin's suggestion. I'm fine either way
and just wanted to err on the conservative side.

Do note that in the kernel, we blacklist some files from
instrumentation entirely, which implies leaving '-fsanitize=thread
--param=tsan-distinguish-volatile=1' off. Although given that the
option is only used with -fsanitize=thread, maybe it doesn't matter.

If you strongly feel that Optimization should be removed again, please
let me know.

> > +Emit special instrumentation for accesses to volatiles.
> > +
[...]
> > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, 
> > "__tsan_volatile_write16",
> > +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
>
> This last entry is already too long (line limit 80 chars), so should be
> wrapped like:
> DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16,
>   "__tsan_volatile_write16", BT_FN_VOID_PTR,
>   ATTR_NOTHROW_LEAF_LIST)
> instead.
>
> > --- a/gcc/tsan.c
> > +++ b/gcc/tsan.c
> > @@ -52,25 +52,41 @@ along with GCC; see the file COPYING3.  If not see
> > void __tsan_read/writeX (void *addr);  */
> >
> >  static tree
> > -get_memory_access_decl (bool is_write, unsigned size)
> > +get_memory_access_decl (bool is_write, unsigned size, bool volatilep)
> >  {
> >enum built_in_function fcode;
> >
> > -  if (size <= 1)
> > -fcode = is_write ? BUILT_IN_TSAN_WRITE1
> > -  : BUILT_IN_TSAN_READ1;
> > -  else if (size <= 3)
> > -fcode = is_write ? BUILT_IN_TSAN_WRITE2
> > -  : BUILT_IN_TSAN_READ2;
> > -  else if (size <= 7)
> > -fcode = is_write ? BUILT_IN_TSAN_WRITE4
> > -  : BUILT_IN_TSAN_READ4;
> > -  else if (size <= 15)
> > -fcode = is_write ? BUILT_IN_TSAN_WRITE8
> > -  : BUILT_IN_TSAN_READ8;
> > +  if (param_tsan_distinguish_volatile && volatilep)
> > +{
> > +  if (size <= 1)
> > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE1
> > +: BUILT_IN_TSAN_VOLATILE_READ1;
> > +  else if (size <= 3)
> > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE2
> > +: BUILT_IN_TSAN_VOLATILE_READ2;
> > +  else if (size <= 7)
> > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE4
> > +: BUILT_IN_TSAN_VOLATILE_READ4;
> > +  else if (size <= 15)
> > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE8
> > +: BUILT_IN_TSAN_VOLATILE_READ8;
> > +  else
> > +fcode = is_write ? BUILT_IN_TSAN_VOLATILE_WRITE16
> > +: BUILT_IN_TSAN_VOLATILE_READ16;
> > +}
> >else
> > -fcode = is_write ? BUILT_IN_TSAN_WRITE16
> > -  : BUILT_IN_TSAN_READ16;
> > +{
> > +  if (size <= 1)
> > +fcode = is_write ? BUILT_IN_TSAN_WRITE1 : BUILT_IN_TSAN_READ1;
> > +  else if (size <= 3)
> > +fcode = is_write ? BUILT_IN_TSAN_WRITE2 : BUILT_IN_TSAN_READ2;
> > +  else if (size <= 7)
> > +fcode = is_write ? BUILT_IN_TSAN_WRITE4 : BUILT_IN_TSAN_READ4;
> > +  else if (size <= 15)
> > +f

Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Marco Elver via Gcc-patches
Just wanted to change this one, and noticed this

> > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, 
> > "__tsan_volatile_write16",
> > +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)

is precisely 80 characters. So if I read the style guide right, it's
<= 80 chars (and not < 80 chars), right?

> This last entry is already too long (line limit 80 chars), so should be
> wrapped like:
> DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16,
>   "__tsan_volatile_write16", BT_FN_VOID_PTR,
>   ATTR_NOTHROW_LEAF_LIST)
> instead.

So I think it can stay.

Thanks,
-- Marco


Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 12:08:07PM +0200, Marco Elver wrote:
> Just wanted to change this one, and noticed this
> 
> > > +DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE16, 
> > > "__tsan_volatile_write16",
> > > +   BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
> 
> is precisely 80 characters. So if I read the style guide right, it's
> <= 80 chars (and not < 80 chars), right?

Ah, you're right, sorry.

Jakub



Re: [patch] Make memory copy functions scalar storage order barriers

2020-06-09 Thread Eric Botcazou
> I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER
> works and what the constraints are.  But if the memcpy is the storage order
> barrier then what cases are OK?
> 
>  - both source and destination objects are [only] accessed with the same
>storage order
>  - both source and destination objects are [only] accessed with
>!TYPE_REVERSE_STORAGE_ORDER "containing" accesses
> 
> and what determines the "object state" with respect to storage order
> when we see a memcpy call?  I had the impression (from the other
> existing cases of contains_storage_order_barrier_p () calls) that
> we derive the "object state" from the downstream access (vr->operands),
> in this case the memcpy destination access.  But see below for maybe
> the easier case (memcpy folding).

All cases are OK if you don't try to replace memcpy with something else.  The 
only possible replacement would be with a VIEW_CONVERT_EXPR, but I guess that 
it would be quickly folded into a MEM_EXPR and then we're back to square one.

> In my understanding of how TYPE_REVERSE_STORAGE_ORDER works
> a memcpy, irrespective of its actual "implementation" should not break
> anything since it transfers bytes.

The point of the discussion is precisely to make it work in all cases.

> In particular a 8 byte TYPE_REVERSE_STORAGE_ORDER
> object can be transfered ("memcpy"ed) with (ignoring TBAA)
> 
>   int tema = *(int *)src;
>   *(int *)dst = tema;
>   int temb = *((int *)src + 1);
>   *((int *)dst + 1) = temb;
> 
> and what byteorder the int accesses are performed in is irrelevant
> as long as read and write use the same byteorder.

No, because of the fundamental invariant of the implementation: you may _not_ 
access the same memory location with different storage orders, this is simply 
not supported and we document the limitation in the manual.  The program does 
not do that, so it's valid; the program after folding does that, which means 
that the folding is invalid.

> > Yes, theoritically, but SRA rewrites the access into a scalar access and
> > we're back to square one (I tried).
> 
> But what's the issue with this?

Same as above: access to the same location with different storage orders.

> Why?  Is the following an invalid program?
> 
> void mymemcpy (void *a, void *b) { memcpy (a, b, 8); }
> void foo()
> {
>   int b[2] __attribute__(reverse_storage);
>   int a[2];
>   mymemcpy (a, b);
>   return a[0];
> }
> 
> how can GCC infer, inside mymemcpy that b is reverse storage order?

No, that's supported, again the point of the discussion is precisely to make 
it work!

> Thus, if all this is a problem then I think we're papering over the issue
> by not folding memcpy.

I don't really see why you're insisting on folding memcpy.  It's semantically 
equivalent to VIEW_CONVERT_EXPR so it needs to be treated the same way.

-- 
Eric Botcazou


Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare

2020-06-09 Thread Jonathan Wakely via Gcc-patches

On 08/06/20 19:20 +0100, Jonathan Wakely wrote:

On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote:

Hi

    Here is the last of my algo patches this time to extend the 
memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG mode.


    To do so I had to return int in implementation details of 
lexicographical_compare to make sure we can treat a deque iterator 
range by chunk in a performant way.


Tested under Linux x86_64 normal and debug modes.

            * include/bits/stl_algobase.h
            (__lexicographical_compare_impl): Return int.
            (__lexicographical_compare::__lc): Likewise.
            (__lexicographical_compare_aux1(_II1, _II1, _II2, 
_II2)): New.
(__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
            _II2, _II2)): Declare.
            (__lexicographical_compare_aux1(_II1, _II1,
            _Deque_iterator<>, _Deque_iterator<>)): Declare.
(__lexicographical_compare_aux1(_Deque_iterator<>, _Deque_iterator<>,
            _Deque_iterator<>, _Deque_iterator<>)): Declare.
            (__lexicographical_compare_aux): Adapt, call later.


Is this meant to say "latter"? That's still not correct grammar
though. I think it would be better to name the function it calls
explicitly: "Call __lexicographical_compare_aux1."

            (__lexicographical_compare_aux(_Safe_iterator<>, 
_Safe_iterator<>,

            _II2, _II2)): Declare.
            (__lexicographical_compare_aux(_II1, _II1,
            _Safe_iterator<>, _Safe_iterator<>)): Declare.
            (__lexicographical_compare_aux(_Safe_iterator<>, 
_Safe_iterator<>,

            _Safe_iterator<>, _Safe_iterator<>)): Declare.
            (std::lexicographical_compare): Adapt, call later 
without __niter_base

            usage.
            * include/bits/deque.tcc (__lex_cmp_dit): New.
            (__lexicographical_compare_aux1): New.
            * include/debug/safe_iterator.tcc
            (__lexicographical_compare_aux(const _Safe_iterator<>&,
            const _Safe_iterator<>&, _II2, _II2)): New.
            (__lexicographical_compare_aux(
            const _Safe_iterator<>&, const_Safe_iterator<>&,
            const _Safe_iterator<>&, const _Safe_iterator<>&)): New.
            * testsuite/25_algorithms/lexicographical_compare/1.cc 
(test6, test7):

            New.
            * 
testsuite/25_algorithms/lexicographical_compare/deque_iterators/1.cc:

            New test.

Ok to commit ?

François





diff --git a/libstdc++-v3/include/bits/deque.tcc 
b/libstdc++-v3/include/bits/deque.tcc
index 1d32a1691c7..d7dbe64f3e1 100644
--- a/libstdc++-v3/include/bits/deque.tcc
+++ b/libstdc++-v3/include/bits/deque.tcc
@@ -1261,6 +1261,98 @@ _GLIBCXX_END_NAMESPACE_CONTAINER
 return true;
   }

+  template


_II is the wrong name here, it mean InputIterator. All callers of this
function are constrained for random access iterators. This cannot be
called with an input iterator. Please use _RAIter.


+int
+__lex_cmp_dit(
+   const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __first1,
+   const _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr>& __last1,
+   _II __first2, _II __last2)
+{
+  typedef _GLIBCXX_STD_C::_Deque_iterator<_Tp, _Ref, _Ptr> _Iter;
+  typedef typename _Iter::difference_type difference_type;
+
+  if (__first1._M_node != __last1._M_node)
+   {
+ difference_type __len = __last2 - __first2;
+ difference_type __flen


What does "flen" mean? Why not use len2 for last2 - first2, as we do
elsewhere? And then use len for min(len1, len2)?



+   = std::min(__len, __first1._M_last - __first1._M_cur);
+ if (int __ret = std::__lexicographical_compare_aux1(


This call (and the three later in this function) will do overload
resolution again on the full set of __lexicographical_compare_aux1
overloads, which includes the ones for deque iterators. But we know
that __first1._M_cur and __first1._M_last are not deque iterators.

__first2 could be a deque iterator, but I'm not sure if we really want
one function that has to handle that case anyway.


+ __first1._M_cur, __first1._M_last, __first2, __first2 + __flen))
+   return __ret;
+
+ __first2 += __flen;
+ __len -= __flen;
+ __flen = std::min(__len, _Iter::_S_buffer_size());
+ for (typename _Iter::_Map_pointer __node = __first1._M_node + 1;
+  __node != __last1._M_node;
+  __first2 += __flen, __len -= __flen,
+  __flen = std::min(__len, _Iter::_S_buffer_size()),
+  ++__node)
+   if (int __ret = std::__lexicographical_compare_aux1(
+  

OpenACC 'attach'/'detach' has no business affecting user-visible reference counting (was: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts)

2020-06-09 Thread Thomas Schwinge
Hi Julian!

On 2020-06-05T21:31:08+0100, Julian Brown  wrote:
> On Fri, 5 Jun 2020 13:17:09 +0200
> Thomas Schwinge  wrote:
>> On 2019-12-17T21:03:47-0800, Julian Brown 
>> wrote:
>> > This part contains the libgomp runtime support for the
>> > GOMP_MAP_ATTACH and GOMP_MAP_DETACH mapping kinds
>>
>> > --- a/libgomp/target.c
>> > +++ b/libgomp/target.c
>>
>> > @@ -1203,6 +1211,32 @@ gomp_map_vars_internal (struct gomp_device_descr 
>> > *devicep,
>>
>> > +case GOMP_MAP_ATTACH:
>> > +  {
>> > +cur_node.host_start = (uintptr_t) hostaddrs[i];
>> > +cur_node.host_end = cur_node.host_start + sizeof (void *);
>> > +splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
>> > +if (n != NULL)
>> > +  {
>> > +tgt->list[i].key = n;
>> > +tgt->list[i].offset = cur_node.host_start - n->host_start;
>> > +tgt->list[i].length = n->host_end - n->host_start;
>> > +tgt->list[i].copy_from = false;
>> > +tgt->list[i].always_copy_from = false;
>> > +tgt->list[i].do_detach
>> > +  = (pragma_kind != GOMP_MAP_VARS_OPENACC_ENTER_DATA);
>> > +n->refcount++;
>> > +  }
>> > +else
>> > +  {
>> > +gomp_mutex_unlock (&devicep->lock);
>> > +gomp_fatal ("outer struct not mapped for attach");
>> > +  }
>> > +gomp_attach_pointer (devicep, aq, mem_map, n,
>> > + (uintptr_t) hostaddrs[i], sizes[i],
>> > + cbufp);
>> > +continue;
>> > +  }
>>
>> For the OpenACC runtime API 'acc_attach' etc. routines they don't, so
>> what's the conceptual reason that for the corresponding OpenACC
>> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in
>> reference counting ('n->refcount++' above)?  I understand OpenACC
>> 'attach'/'detach' clauses to be simple "executable clauses", which
>> just update some values somewhere (say, like
>> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state,
>> thus wouldn't appear to need reference counting?
>
> IIUC, n->refcount is not directly the "structural reference count" as
> seen at source level, but rather counts the number of target_var_descs
> in the lists appended to each target_mem_desc -- and GOMP_MAP_ATTACH
> have variable entries in those lists.

That may be OK if that's purely an implementation detail that isn't
visible to the user, however:

> That's not the case for the API
> routines.

As I had mentioned, the problem is: in contrast to 'acc_attach', an
OpenACC 'enter data' directive with 'attach' clause currently uses this
same reference-counted code path, and thus such an 'attach' without
corresponding 'detach' inhibits unmapping; see
'libgomp.oacc-c-c++-common/mdc-refcount-1.c' in the attached patch
"OpenACC 'attach'/'detach' has no business affecting user-visible
reference counting".

That patch seemed to be the logical next step then, to unify the code
paths for 'acc_attach' and 'enter data' directive with 'attach' clause
(which have to act in the same way).  That's (conceptually) somewhat
similar to what you had proposed as part of
http://mid.mail-archive.com/b23ea71697f77d8214411a3e1348e9dee496e5a6.1590182783.git.julian@codesourcery.com>.
(But all these things really need to be discussed individually...)

However, that patch regresses
'libgomp.oacc-fortran/deep-copy-6-no_finalize.F90', and also the
'deep-copy-7b2f-2.c', and 'deep-copy-7cf.c' that I'm attaching here.  I
have not yet made an attempts to understand these regressions.  It may be
that a Detach Action actually effects an (attached) device pointer being
copied back to the host, and then disturbing things -- and if that, then
it may be a bug in libgomp, or in the test case.  ;-)


Grüße
 Thomas


-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
>From d99a701387054259419292b95462f3646a00d6d9 Mon Sep 17 00:00:00 2001
From: Thomas Schwinge 
Date: Mon, 8 Jun 2020 21:35:32 +0200
Subject: [PATCH] OpenACC 'attach'/'detach' has no business affecting
 user-visible reference counting

In particular, an 'attach' without 'detach' must not inhibit unmapping.

	libgomp/
	* oacc-mem.c (goacc_attach_internal): New function, split out of
	'acc_attach_async'.
	(acc_attach, goacc_enter_data_internal): Use it.
	(goacc_exit_data_internal) : Skip unmapping.
	* target.c (gomp_map_vars_existing): Assert not 'GOMP_MAP_ATTACH'.
	(gomp_map_vars_internal) : Assert this
	is not an 'enter data'.
	* testsuite/libgomp.oacc-c-c++-common/mdc-refcount-1.c: New file.
	* testsuite/libgomp.oacc-fortran/mdc-refcount-1-4-1.f90: Adjust.
---
 libgomp/oacc-mem.c 

Re: [PATCH v2] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 12:01:24PM +0200, Marco Elver wrote:
> > Do we need/want Optimization here?  Optimization means the option is
> > per-function, but to me whether you want to distinguish volatiles or not
> > seems to be a global decision for the whole project.
> 
> Adding Optimization here was Martin's suggestion. I'm fine either way
> and just wanted to err on the conservative side.

We've discussed it with Martin and the end result is that it shouldn't be
Optimization, because we don't want to support mixing it.
Essentially, it is an ABI decision of the tsan implementation, which can be
only one in the whole process, because if somebody tries to mix
code that distinguishes them and other that doesn't, one still needs
to use an implementation that does distinguish them and will assume that
code that was built without --param=tsan-distinguish-volatile=1
is never using volatiles, even when it means we don't know as the
information is lost.

Note, for LTO that will mean the param needs to be specified on the link
line too.

Jakub



Re: [PATCH] libgcov: fix TOPN type casting

2020-06-09 Thread Martin Liška

On 6/9/20 9:33 AM, Martin Liška wrote:

I'm going to install it if there are no objections.


Pushed as 862b9b225fb.

Martin


Re: [stage1][PATCH] Make TOPN counter dynamically allocated.

2020-06-09 Thread Martin Liška

On 6/8/20 1:07 PM, Martin Liška wrote:

Can you please test the following patch candidate:


I've just pushed 862b9b225fb which should fix that.
Can you please test the current master?

Thanks,
Martin


[PATCH] gcov: improve --coverage small example

2020-06-09 Thread Martin Liška

Pushed to master.

gcc/ChangeLog:

PR gcov-profile/95365
* doc/gcov.texi: Compile and link one example in 2 steps.
---
 gcc/doc/gcov.texi | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/doc/gcov.texi b/gcc/doc/gcov.texi
index 61250c9407e..22e42da2ea6 100644
--- a/gcc/doc/gcov.texi
+++ b/gcc/doc/gcov.texi
@@ -528,7 +528,8 @@ for each line.  For example, if your program is called 
@file{tmp.cpp}, this
 is what you see when you use the basic @command{gcov} facility:
 
 @smallexample

-$ g++ --coverage tmp.cpp
+$ g++ --coverage tmp.cpp -c
+$ g++ --coverage tmp.o
 $ a.out
 $ gcov tmp.cpp -m
 File 'tmp.cpp'
--
2.26.2



Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.

2020-06-09 Thread Tamar Christina
Hi Richard,


The 06/08/2020 16:42, Richard Sandiford wrote:
> Tamar Christina  writes:
> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> > index 
> > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c
> >  100644
> > --- a/gcc/config/aarch64/aarch64.c
> > +++ b/gcc/config/aarch64/aarch64.c
> > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, 
> > int outer, bool speed)
> >if (VECTOR_MODE_P (mode))
> >  mode = GET_MODE_INNER (mode);
> > 
> > +  /* The by element versions of the instruction has the same costs as the
> > + normal 3 vector version.  So don't add the costs of the duplicate into
> > + the costs of the multiply.  */
> > +  if (GET_CODE (op0) == VEC_DUPLICATE)
> > +op0 = XEXP (op0, 0);
> > +  else if (GET_CODE (op1) == VEC_DUPLICATE)
> > +op1 = XEXP (op1, 0);
> > +
> >/* Integer multiply/fma.  */
> >if (GET_MODE_CLASS (mode) == MODE_INT)
> >  {
> 
> SVE doesn't have duplicating forms, so I think we should put this code
> under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”,
> and then restrict it to VEC_ADVSIMD modes.
> 
> (SVE FMUL does have an indexed form, but the index is relative to the
> start of the associated quadword, so it isn't a VEC_DUPLICATE.)
>

Done, I have updated the patch. (See attached)
 
> I guess there's a danger that this could underestimate the cost for
> integer modes, if the scalar integer input needs to be moved from GPRs.
> In that case the cost of a MULT + VEC_DUPLICATE is probably more
> accurate, even though it's still one instruction before RA.
> 
> But I guess there's no perfect answer there.  The new code will be
> right for integer modes in some cases and not in others.  Same if
> we leave things as they are.  But maybe it'd be worth having a comment
> to say that we're assuming the best case, i.e. that the duplicated
> value is naturally in FPRs?
> 

Hmm I haven't added the comment yet since I don't fully understand when the
integer case would be misleading.

In both cases the cost for the GPR is paid by the MOV no? I'm missing why 
having the MUL
account for it would be better in some cases.

For instance for the integer case we used to generate

dup v0.4s, w2
mul v2.4s, v2.4s, v0.4s

but now do

fmovs0, w2
mul v2.4s, v2.4s, v0.s[0]

Which is better on older cores such Cortex-A55 and no different on newer cores 
such as
Cortex-A76 according to the optimization guides.

Regards,
Tamar

> Thanks,
> Richard

-- 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 973c65aa4fb348450872036617362aa17310fb20..d2e959c5276d9b801294c722c92762c5674cb244 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11279,7 +11279,20 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, int outer, bool speed)
   op1 = XEXP (x, 1);
 
   if (VECTOR_MODE_P (mode))
-mode = GET_MODE_INNER (mode);
+{
+  unsigned int vec_flags = aarch64_classify_vector_mode (mode);
+  mode = GET_MODE_INNER (mode);
+  if (vec_flags & VEC_ADVSIMD)
+	{
+	  /* The by element versions of the instruction has the same costs as the
+	 normal 3 vector version.  So don't add the costs of the duplicate into
+	 the costs of the multiply.  */
+	  if (GET_CODE (op0) == VEC_DUPLICATE)
+	op0 = XEXP (op0, 0);
+	  else if (GET_CODE (op1) == VEC_DUPLICATE)
+	op1 = XEXP (op1, 0);
+	}
+}
 
   /* Integer multiply/fma.  */
   if (GET_MODE_CLASS (mode) == MODE_INT)
diff --git a/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c b/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c
new file mode 100644
index ..513721cee0c8372781e6daf33bc06e256cab8cb8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/asimd-mull-elem.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_int } */
+/* { dg-require-effective-target vect_float } */
+/* { dg-options "-Ofast" } */
+
+#include 
+
+void s_mult_i (int32_t* restrict res, int32_t* restrict a, int32_t b)
+{
+for (int x = 0; x < 16; x++)
+  res[x] = a[x] * b;
+}
+
+void s_mult_f (float32_t* restrict res, float32_t* restrict a, float32_t b)
+{
+for (int x = 0; x < 16; x++)
+  res[x] = a[x] * b;
+}
+
+/* { dg-final { scan-assembler-times {\s+mul\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]} 4 } } */
+/* { dg-final { scan-assembler-times {\s+fmul\tv[0-9]+\.4s, v[0-9]+\.4s, v[0-9]+\.s\[0\]} 4 } } */



Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.

2020-06-09 Thread Richard Sandiford
Tamar Christina  writes:
> Hi Richard,
> The 06/08/2020 16:42, Richard Sandiford wrote:
>> Tamar Christina  writes:
>> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
>> > index 
>> > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c
>> >  100644
>> > --- a/gcc/config/aarch64/aarch64.c
>> > +++ b/gcc/config/aarch64/aarch64.c
>> > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, 
>> > int outer, bool speed)
>> >if (VECTOR_MODE_P (mode))
>> >  mode = GET_MODE_INNER (mode);
>> > 
>> > +  /* The by element versions of the instruction has the same costs as the
>> > + normal 3 vector version.  So don't add the costs of the duplicate 
>> > into
>> > + the costs of the multiply.  */
>> > +  if (GET_CODE (op0) == VEC_DUPLICATE)
>> > +op0 = XEXP (op0, 0);
>> > +  else if (GET_CODE (op1) == VEC_DUPLICATE)
>> > +op1 = XEXP (op1, 0);
>> > +
>> >/* Integer multiply/fma.  */
>> >if (GET_MODE_CLASS (mode) == MODE_INT)
>> >  {
>> 
>> SVE doesn't have duplicating forms, so I think we should put this code
>> under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”,
>> and then restrict it to VEC_ADVSIMD modes.
>> 
>> (SVE FMUL does have an indexed form, but the index is relative to the
>> start of the associated quadword, so it isn't a VEC_DUPLICATE.)
>>
>
> Done, I have updated the patch. (See attached)
>  
>> I guess there's a danger that this could underestimate the cost for
>> integer modes, if the scalar integer input needs to be moved from GPRs.
>> In that case the cost of a MULT + VEC_DUPLICATE is probably more
>> accurate, even though it's still one instruction before RA.
>> 
>> But I guess there's no perfect answer there.  The new code will be
>> right for integer modes in some cases and not in others.  Same if
>> we leave things as they are.  But maybe it'd be worth having a comment
>> to say that we're assuming the best case, i.e. that the duplicated
>> value is naturally in FPRs?
>> 
>
> Hmm I haven't added the comment yet since I don't fully understand when the
> integer case would be misleading.
>
> In both cases the cost for the GPR is paid by the MOV no? I'm missing
> why having the MUL account for it would be better in some cases.

The point was that any MOV isn't exposed until after register allocation,
whereas costs are usually applied before then.  So before RA:

> For instance for the integer case we used to generate
>
> dup v0.4s, w2
> mul v2.4s, v2.4s, v0.4s

...this was costed as:

   (set (reg:V4SI R2) (vec_duplicate:V4SI (reg:SI R1)))
   (set (reg:V4SI R3) (mult:V4SI ...))

and so accurate when R1 naturally ends up in a GPR.

> but now do
>
> fmovs0, w2
> mul v2.4s, v2.4s, v0.s[0]

...and this is costed as:

   (set (reg:V4SI R3) (mult:V4SI ...))

and so accurate when R1 naturally ends up in an FPR (without needing
a reload to put it there).

In other words, before RA, the patch is making the optimistic assumption
that R1 is already in FPRs and so a separate FMOV won't be needed.

Thanks,
Richard

> Which is better on older cores such Cortex-A55 and no different on newer 
> cores such as
> Cortex-A76 according to the optimization guides.
>
> Regards,
> Tamar
>
>> Thanks,
>> Richard


[PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Martin Liška

Hello.

The patch is follow up of the kernel discussion:
https://lore.kernel.org/lkml/canpmjnnrz5ovkb6pe7k6gjfogbht_zhypkng9ad+kjndzk7...@mail.gmail.com/

The patch changes inliner in the following way:
1) do not compare caller and callee sanitizer attributes
   for always_inline functions - that matches Clang behavior
2) do not inline if one function is sanitized and the second
   one isn't (for all ASAN, UBSAN, TSAN, LSAN sanitizers);
   that also matches what Clang does

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

* cif-code.def (ATTRIBUTE_MISMATCH): Rename to...
(SANITIZE_ATTRIBUTE_MISMATCH): ...this.
* ipa-inline.c (sanitize_attrs_match_for_inline_p):
Handle all sanitizer options.
(can_inline_edge_p): Use renamed CIF_* enum value.

gcc/testsuite/ChangeLog:

* c-c++-common/asan/inline.c: New test.
* c-c++-common/tsan/inline.c: New test.
* c-c++-common/ubsan/inline.c: New test.
---
 gcc/cif-code.def  |  7 +++---
 gcc/ipa-inline.c  | 29 ++-
 gcc/testsuite/c-c++-common/asan/inline.c  | 20 
 gcc/testsuite/c-c++-common/tsan/inline.c  | 20 
 gcc/testsuite/c-c++-common/ubsan/inline.c | 20 
 5 files changed, 82 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c

diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index 31c18c6c691..c65b2477203 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.def
@@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR,
 DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR,
   N_("callee refers to comdat-local symbols"))
 
-/* We can't inline because of mismatched caller/callee attributes.  */

-DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
-  N_("function attribute mismatch"))
+/* We can't inline because of mismatched caller/callee
+   sanitizer attributes.  */
+DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
+  N_("sanitizer function attribute mismatch"))
 
 /* We can't inline because the user requests only static functions

but the function has external linkage for live patching purpose.  */
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index f71443feff7..edf4095bcbc 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -264,18 +264,25 @@ sanitize_attrs_match_for_inline_p (const_tree caller, 
const_tree callee)
   if (!caller || !callee)
 return true;
 
-  /* Allow inlining always_inline functions into no_sanitize_address

- functions.  */
-  if (!sanitize_flags_p (SANITIZE_ADDRESS, caller)
-  && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
+  /* Follow clang and allow inlining for always_inline functions.  */
+  if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
 return true;
 
-  return ((sanitize_flags_p (SANITIZE_ADDRESS, caller)

-  == sanitize_flags_p (SANITIZE_ADDRESS, callee))
- && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller)
- == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee))
- && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller)
- == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee)));
+  const sanitize_code codes[] =
+{
+  SANITIZE_ADDRESS,
+  SANITIZE_THREAD,
+  SANITIZE_LEAK,
+  SANITIZE_UNDEFINED,
+  SANITIZE_UNDEFINED_NONDEFAULT
+};
+
+  for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++)
+if (sanitize_flags_p (codes[i], caller)
+   != sanitize_flags_p (codes[i], callee))
+  return false;
+
+  return true;
 }
 
 /* Used for flags where it is safe to inline when caller's value is

@@ -382,7 +389,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report,
   /* Don't inline a function with mismatched sanitization attributes. */
   else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl))
 {
-  e->inline_failed = CIF_ATTRIBUTE_MISMATCH;
+  e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH;
   inlinable = false;
 }
   if (!inlinable && report)
diff --git a/gcc/testsuite/c-c++-common/asan/inline.c 
b/gcc/testsuite/c-c++-common/asan/inline.c
new file mode 100644
index 000..7c36702cf9e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/inline.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=address -c -O3 -fdump-tree-optimized" } */
+
+int x;
+
+static inline
+__attribute__((no_sanitize("address")))
+void do_not_sanitize(void)
+{
+  x++;
+}
+
+void
+sanitize_this(void)
+{
+  x++;
+  do_not_sanitize();
+}
+
+/* { dg-final { scan-tree-dump-times "Function do_not_sanitize" 1 "optimized" 
} } */
diff --git a/gcc/testsuite/c-c++-common/tsan/inline.c 
b/gcc/testsuite/c-

Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 01:53:38PM +0200, Martin Liška wrote:
> +  const sanitize_code codes[] =
> +{
> +  SANITIZE_ADDRESS,
> +  SANITIZE_THREAD,
> +  SANITIZE_LEAK,

Why leak?  That is really a link time option only, doesn't affect code
generation in any way.

On the other side, does the SANITIZE_ADDRESS you have in there
handle properly also -fsanitize=kernel-address cases?

Jakub



Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Martin Liška

On 6/9/20 2:01 PM, Jakub Jelinek wrote:

On Tue, Jun 09, 2020 at 01:53:38PM +0200, Martin Liška wrote:

+  const sanitize_code codes[] =
+{
+  SANITIZE_ADDRESS,
+  SANITIZE_THREAD,
+  SANITIZE_LEAK,


Why leak?


I was too eager ;)


That is really a link time option only, doesn't affect code
generation in any way.

On the other side, does the SANITIZE_ADDRESS you have in there
handle properly also -fsanitize=kernel-address cases?


Yes, I've just added a new test-case for it.

Martin



Jakub



>From 799c28526b1fd9e16bef70bcf06264960d25032a Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 9 Jun 2020 13:03:55 +0200
Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

gcc/ChangeLog:

	* cif-code.def (ATTRIBUTE_MISMATCH): Rename to...
	(SANITIZE_ATTRIBUTE_MISMATCH): ...this.
	* ipa-inline.c (sanitize_attrs_match_for_inline_p):
	Handle all sanitizer options.
	(can_inline_edge_p): Use renamed CIF_* enum value.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/inline.c: New test.
	* c-c++-common/asan/inline-kernel.c: New test.
	* c-c++-common/tsan/inline.c: New test.
	* c-c++-common/ubsan/inline.c: New test.
---
 gcc/cif-code.def  |  7 +++--
 gcc/ipa-inline.c  | 28 +++
 .../c-c++-common/asan/inline-kernel.c | 20 +
 gcc/testsuite/c-c++-common/asan/inline.c  | 20 +
 gcc/testsuite/c-c++-common/tsan/inline.c  | 20 +
 gcc/testsuite/c-c++-common/ubsan/inline.c | 20 +
 6 files changed, 101 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/inline-kernel.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c

diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index 31c18c6c691..c65b2477203 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.def
@@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR,
 DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR,
 	   N_("callee refers to comdat-local symbols"))
 
-/* We can't inline because of mismatched caller/callee attributes.  */
-DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
-	   N_("function attribute mismatch"))
+/* We can't inline because of mismatched caller/callee
+   sanitizer attributes.  */
+DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
+	   N_("sanitizer function attribute mismatch"))
 
 /* We can't inline because the user requests only static functions
but the function has external linkage for live patching purpose.  */
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index f71443feff7..e934a3e5805 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -264,18 +264,24 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee)
   if (!caller || !callee)
 return true;
 
-  /* Allow inlining always_inline functions into no_sanitize_address
- functions.  */
-  if (!sanitize_flags_p (SANITIZE_ADDRESS, caller)
-  && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
+  /* Follow clang and allow inlining for always_inline functions.  */
+  if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
 return true;
 
-  return ((sanitize_flags_p (SANITIZE_ADDRESS, caller)
-	   == sanitize_flags_p (SANITIZE_ADDRESS, callee))
-	  && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller)
-	  == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee))
-	  && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller)
-	  == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee)));
+  const sanitize_code codes[] =
+{
+  SANITIZE_ADDRESS,
+  SANITIZE_THREAD,
+  SANITIZE_UNDEFINED,
+  SANITIZE_UNDEFINED_NONDEFAULT
+};
+
+  for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++)
+if (sanitize_flags_p (codes[i], caller)
+	!= sanitize_flags_p (codes[i], callee))
+  return false;
+
+  return true;
 }
 
 /* Used for flags where it is safe to inline when caller's value is
@@ -382,7 +388,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report,
   /* Don't inline a function with mismatched sanitization attributes. */
   else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl))
 {
-  e->inline_failed = CIF_ATTRIBUTE_MISMATCH;
+  e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH;
   inlinable = false;
 }
   if (!inlinable && report)
diff --git a/gcc/testsuite/c-c++-common/asan/inline-kernel.c b/gcc/testsuite/c-c++-common/asan/inline-kernel.c
new file mode 100644
index 000..4ca739b5d04
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/inline-kernel.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kernel-address -c -O3 -fdump-tree-optimized" } */
+
+int x;
+
+static inline
+__attribute__((no_sanitize("kernel-address")))
+void do_not_sanitize(void)
+{
+  x++;
+}
+
+v

Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 02:09:06PM +0200, Martin Liška wrote:
> -  return ((sanitize_flags_p (SANITIZE_ADDRESS, caller)
> -== sanitize_flags_p (SANITIZE_ADDRESS, callee))
> -   && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller)
> -   == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee))
> -   && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller)
> -   == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee)));
> +  const sanitize_code codes[] =
> +{
> +  SANITIZE_ADDRESS,
> +  SANITIZE_THREAD,
> +  SANITIZE_UNDEFINED,
> +  SANITIZE_UNDEFINED_NONDEFAULT
> +};
> +
> +  for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++)
> +if (sanitize_flags_p (codes[i], caller)
> + != sanitize_flags_p (codes[i], callee))
> +  return false;

Sorry for not writing everything at once, but are the
SANITIZER_POINTER_{COMPARE,SUBTRACT} differences unimportant?

Jakub



Re: [PATCH] AArch64: Adjust costing of by element MUL to be the same as SAME3 MUL.

2020-06-09 Thread Tamar Christina
Hi Richard,

The 06/09/2020 12:44, Richard Sandiford wrote:
> Tamar Christina  writes:
> > Hi Richard,
> > The 06/08/2020 16:42, Richard Sandiford wrote:
> >> Tamar Christina  writes:
> >> > diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> >> > index 
> >> > 97da60762390db81df9cffaf316b909cd1609130..9cc8da338125afa01bc9fb645f4112d2d7ef548c
> >> >  100644
> >> > --- a/gcc/config/aarch64/aarch64.c
> >> > +++ b/gcc/config/aarch64/aarch64.c
> >> > @@ -11279,6 +11279,14 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code 
> >> > code, int outer, bool speed)
> >> >if (VECTOR_MODE_P (mode))
> >> >  mode = GET_MODE_INNER (mode);
> >> > 
> >> > +  /* The by element versions of the instruction has the same costs as 
> >> > the
> >> > + normal 3 vector version.  So don't add the costs of the duplicate 
> >> > into
> >> > + the costs of the multiply.  */
> >> > +  if (GET_CODE (op0) == VEC_DUPLICATE)
> >> > +op0 = XEXP (op0, 0);
> >> > +  else if (GET_CODE (op1) == VEC_DUPLICATE)
> >> > +op1 = XEXP (op1, 0);
> >> > +
> >> >/* Integer multiply/fma.  */
> >> >if (GET_MODE_CLASS (mode) == MODE_INT)
> >> >  {
> >> 
> >> SVE doesn't have duplicating forms, so I think we should put this code
> >> under the “if (VECTOR_MODE_P (mode))” condition, before changing “mode”,
> >> and then restrict it to VEC_ADVSIMD modes.
> >> 
> >> (SVE FMUL does have an indexed form, but the index is relative to the
> >> start of the associated quadword, so it isn't a VEC_DUPLICATE.)
> >>
> >
> > Done, I have updated the patch. (See attached)
> >  
> >> I guess there's a danger that this could underestimate the cost for
> >> integer modes, if the scalar integer input needs to be moved from GPRs.
> >> In that case the cost of a MULT + VEC_DUPLICATE is probably more
> >> accurate, even though it's still one instruction before RA.
> >> 
> >> But I guess there's no perfect answer there.  The new code will be
> >> right for integer modes in some cases and not in others.  Same if
> >> we leave things as they are.  But maybe it'd be worth having a comment
> >> to say that we're assuming the best case, i.e. that the duplicated
> >> value is naturally in FPRs?
> >> 
> >
> > Hmm I haven't added the comment yet since I don't fully understand when the
> > integer case would be misleading.
> >
> > In both cases the cost for the GPR is paid by the MOV no? I'm missing
> > why having the MUL account for it would be better in some cases.
> 
> The point was that any MOV isn't exposed until after register allocation,
> whereas costs are usually applied before then.  So before RA:
> 
> > For instance for the integer case we used to generate
> >
> > dup v0.4s, w2
> > mul v2.4s, v2.4s, v0.4s
> 
> ...this was costed as:
> 
>(set (reg:V4SI R2) (vec_duplicate:V4SI (reg:SI R1)))
>(set (reg:V4SI R3) (mult:V4SI ...))
> 
> and so accurate when R1 naturally ends up in a GPR.
> 
> > but now do
> >
> > fmovs0, w2
> > mul v2.4s, v2.4s, v0.s[0]
> 
> ...and this is costed as:
> 
>(set (reg:V4SI R3) (mult:V4SI ...))
> 
> and so accurate when R1 naturally ends up in an FPR (without needing
> a reload to put it there).
> 
> In other words, before RA, the patch is making the optimistic assumption
> that R1 is already in FPRs and so a separate FMOV won't be needed.
> 

Aargggs... yes that makes sense. Sorry when I looked at the dump before I 
didn't noticed the order was switched.
The SET was for the load of course. :(

I have added the comment as suggested, thanks for the explanation.

OK for master?

Thanks,
Tamar

> Thanks,
> Richard
> 
> > Which is better on older cores such Cortex-A55 and no different on newer 
> > cores such as
> > Cortex-A76 according to the optimization guides.
> >
> > Regards,
> > Tamar
> >
> >> Thanks,
> >> Richard

-- 
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 973c65aa4fb348450872036617362aa17310fb20..5a5a9ad44f0945b4d6a869fc2b4e857022659c55 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -11279,7 +11279,22 @@ aarch64_rtx_mult_cost (rtx x, enum rtx_code code, int outer, bool speed)
   op1 = XEXP (x, 1);
 
   if (VECTOR_MODE_P (mode))
-mode = GET_MODE_INNER (mode);
+{
+  unsigned int vec_flags = aarch64_classify_vector_mode (mode);
+  mode = GET_MODE_INNER (mode);
+  if (vec_flags & VEC_ADVSIMD)
+	{
+	  /* The by element versions of the instruction has the same costs as the
+	 normal 3 vector version.  So don't add the costs of the duplicate into
+	 the costs of the multiply.  We make an assumption that the value in
+	 the VEC_DUPLICATE is already the FP&SIMD side.  This means costing of
+	 a MUL by element pre RA is a bit optimistic.  */
+	  if (GET_CODE (op0) == VEC_DUPLICATE)
+	op0 = XEXP (op0, 0);
+	  else if (GET_CODE (op1) == VEC_DUPLICATE)
+	op1 = XEXP (op1, 0);
+	}
+}
 
   /* Integer multiply/fma.  */
   if (GET_MODE_CLA

Re: OpenACC 'attach'/'detach' has no business affecting user-visible reference counting (was: [PATCH 07/13] OpenACC 2.6 deep copy: libgomp parts)

2020-06-09 Thread Julian Brown
On Tue, 9 Jun 2020 12:41:21 +0200
Thomas Schwinge  wrote:

> Hi Julian!
> 
> On 2020-06-05T21:31:08+0100, Julian Brown 
> wrote:
> >> For the OpenACC runtime API 'acc_attach' etc. routines they don't,
> >> so what's the conceptual reason that for the corresponding OpenACC
> >> directive variants, 'GOMP_MAP_ATTACH' etc. here participate in
> >> reference counting ('n->refcount++' above)?  I understand OpenACC
> >> 'attach'/'detach' clauses to be simple "executable clauses", which
> >> just update some values somewhere (say, like
> >> 'GOMP_MAP_ALWAYS_POINTER'), but they don't alter any mapping state,
> >> thus wouldn't appear to need reference counting?  
> >
> > IIUC, n->refcount is not directly the "structural reference count"
> > as seen at source level, but rather counts the number of
> > target_var_descs in the lists appended to each target_mem_desc --
> > and GOMP_MAP_ATTACH have variable entries in those lists.  
> 
> That may be OK if that's purely an implementation detail that isn't
> visible to the user, however:
> 
> > That's not the case for the API
> > routines.  
> 
> As I had mentioned, the problem is: in contrast to 'acc_attach', an
> OpenACC 'enter data' directive with 'attach' clause currently uses
> this same reference-counted code path, and thus such an 'attach'
> without corresponding 'detach' inhibits unmapping; see
> 'libgomp.oacc-c-c++-common/mdc-refcount-1.c' in the attached patch
> "OpenACC 'attach'/'detach' has no business affecting user-visible
> reference counting".

Hmm, right. That's quite a problem from an implementation perspective:
the "attach" clause in the target_mem_desc's var list is what triggers
the "detach" operation (for structured data lifetimes). Having those
references "not count" is quite an ugly wrinkle.

I'll think about that some more...

> That patch seemed to be the logical next step then, to unify the code
> paths for 'acc_attach' and 'enter data' directive with 'attach' clause
> (which have to act in the same way).  That's (conceptually) somewhat
> similar to what you had proposed as part of
> http://mid.mail-archive.com/b23ea71697f77d8214411a3e1348e9dee496e5a6.1590182783.git.julian@codesourcery.com>.
> (But all these things really need to be discussed individually...)
> 
> However, that patch regresses
> 'libgomp.oacc-fortran/deep-copy-6-no_finalize.F90', and also the
> 'deep-copy-7b2f-2.c', and 'deep-copy-7cf.c' that I'm attaching here.
> I have not yet made an attempts to understand these regressions.  It
> may be that a Detach Action actually effects an (attached) device
> pointer being copied back to the host, and then disturbing things --
> and if that, then it may be a bug in libgomp, or in the test case.
> ;-)

I haven't (even) quite absorbed what you are trying to test with the "no
finalize" version of the deep-copy-6.f90 test case... I probably need
to go back and re-read the spec. IIRC, my understanding was that
copying out a data item that still has multiple attachments would *not*
automatically perform a detachment. Thus, attaches & detaches have to
balance (at least without "finalize"). But maybe I was wrong about that!

Thanks,

Julian


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-09 Thread Thomas Schwinge
Hi!

On 2020-05-26T04:08:44-0300, Alexandre Oliva  wrote:
> Thanks, here's the combined patch I'm checking in.
>
> revamp dump and aux output names

For BRIG (HSAIL) front end testing, I'm see a lot of failures like:

Running [...]/source-gcc/gcc/testsuite/brig.dg/dg.exp ...
PASS: brig.dg/test/gimple/variables.hsail (test for excess errors)
[-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original 
"__group_base_addr \\+ \\(0 \\+"
[-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original 
"__group_base_addr \\+ 0"
[-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple "[ 
]*prog_global = s204;"
[-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple 
".module.mod_global;"
[...]

That's:

spawn -ignore SIGHUP [...]/build-gcc/gcc/xgcc -B[...]/build-gcc/gcc/ 
[...]/build-gcc/gcc/testsuite/brig/variables.hsail.brig -fdump-tree-gimple 
-fdump-tree-original -S -o variables.s
PASS: brig.dg/test/gimple/variables.hsail (test for excess errors)
variables.hsail.brig: dump file does not exist
dump file: variables.hsail.brig.original
UNRESOLVED: variables.hsail.brig scan-tree-dump original "__group_base_addr 
\\+ \\(0 \\+"

We're trying to scan 'variables.hsail.brig.*', but for input file name
'variables.hsail.brig', we're now creating:

$ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.*
build-gcc/gcc/testsuite/brig/variables.brig.004t.original
build-gcc/gcc/testsuite/brig/variables.brig.005t.gimple

Before your changes, GCC produced the expected:

$ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.*
build-gcc/gcc/testsuite/brig/variables.hsail.brig.004t.original
build-gcc/gcc/testsuite/brig/variables.hsail.brig.005t.gimple

Are you able to easily create a patch for that?  How/where to adjust:
producer-side (GCC driver, or BRIG (HSAIL) front end?), or consumer-side
(testsuite: tree scanning machinery, or have to put some '-dumpbase' into
all test case files?)?


Grüße
 Thomas
-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter


Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Martin Liška

On 6/9/20 2:15 PM, Jakub Jelinek wrote:

Sorry for not writing everything at once, but are the
SANITIZER_POINTER_{COMPARE,SUBTRACT} differences unimportant?


They are. I got confused that they are not
part of SANITIZE_UNDEFINED or SANITIZE_UNDEFINED_NONDEFAULT.

Anyway, good point!
Martin

>From 5fe0671ad79d14d1c9d0fead1a471875a4416fac Mon Sep 17 00:00:00 2001
From: Martin Liska 
Date: Tue, 9 Jun 2020 13:03:55 +0200
Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

gcc/ChangeLog:

	* cif-code.def (ATTRIBUTE_MISMATCH): Rename to...
	(SANITIZE_ATTRIBUTE_MISMATCH): ...this.
	* ipa-inline.c (sanitize_attrs_match_for_inline_p):
	Handle all sanitizer options.
	(can_inline_edge_p): Use renamed CIF_* enum value.

gcc/testsuite/ChangeLog:

	* c-c++-common/asan/inline.c: New test.
	* c-c++-common/asan/inline-kernel.c: New test.
	* c-c++-common/tsan/inline.c: New test.
	* c-c++-common/ubsan/inline.c: New test.
---
 gcc/cif-code.def  |  7 +++--
 gcc/ipa-inline.c  | 30 ---
 .../c-c++-common/asan/inline-kernel.c | 20 +
 gcc/testsuite/c-c++-common/asan/inline.c  | 20 +
 gcc/testsuite/c-c++-common/tsan/inline.c  | 20 +
 gcc/testsuite/c-c++-common/ubsan/inline.c | 20 +
 6 files changed, 103 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/asan/inline-kernel.c
 create mode 100644 gcc/testsuite/c-c++-common/asan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/tsan/inline.c
 create mode 100644 gcc/testsuite/c-c++-common/ubsan/inline.c

diff --git a/gcc/cif-code.def b/gcc/cif-code.def
index 31c18c6c691..c65b2477203 100644
--- a/gcc/cif-code.def
+++ b/gcc/cif-code.def
@@ -128,9 +128,10 @@ DEFCIFCODE(OPTIMIZATION_MISMATCH, CIF_FINAL_ERROR,
 DEFCIFCODE(USES_COMDAT_LOCAL, CIF_FINAL_ERROR,
 	   N_("callee refers to comdat-local symbols"))
 
-/* We can't inline because of mismatched caller/callee attributes.  */
-DEFCIFCODE(ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
-	   N_("function attribute mismatch"))
+/* We can't inline because of mismatched caller/callee
+   sanitizer attributes.  */
+DEFCIFCODE(SANITIZE_ATTRIBUTE_MISMATCH, CIF_FINAL_ERROR,
+	   N_("sanitizer function attribute mismatch"))
 
 /* We can't inline because the user requests only static functions
but the function has external linkage for live patching purpose.  */
diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index f71443feff7..c667de2a97c 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -264,18 +264,26 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee)
   if (!caller || !callee)
 return true;
 
-  /* Allow inlining always_inline functions into no_sanitize_address
- functions.  */
-  if (!sanitize_flags_p (SANITIZE_ADDRESS, caller)
-  && lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
+  /* Follow clang and allow inlining for always_inline functions.  */
+  if (lookup_attribute ("always_inline", DECL_ATTRIBUTES (callee)))
 return true;
 
-  return ((sanitize_flags_p (SANITIZE_ADDRESS, caller)
-	   == sanitize_flags_p (SANITIZE_ADDRESS, callee))
-	  && (sanitize_flags_p (SANITIZE_POINTER_COMPARE, caller)
-	  == sanitize_flags_p (SANITIZE_POINTER_COMPARE, callee))
-	  && (sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, caller)
-	  == sanitize_flags_p (SANITIZE_POINTER_SUBTRACT, callee)));
+  const sanitize_code codes[] =
+{
+  SANITIZE_ADDRESS,
+  SANITIZE_THREAD,
+  SANITIZE_UNDEFINED,
+  SANITIZE_UNDEFINED_NONDEFAULT,
+  SANITIZE_POINTER_COMPARE,
+  SANITIZE_POINTER_SUBTRACT
+};
+
+  for (unsigned i = 0; i < sizeof (codes) / sizeof (codes[0]); i++)
+if (sanitize_flags_p (codes[i], caller)
+	!= sanitize_flags_p (codes[i], callee))
+  return false;
+
+  return true;
 }
 
 /* Used for flags where it is safe to inline when caller's value is
@@ -382,7 +390,7 @@ can_inline_edge_p (struct cgraph_edge *e, bool report,
   /* Don't inline a function with mismatched sanitization attributes. */
   else if (!sanitize_attrs_match_for_inline_p (caller->decl, callee->decl))
 {
-  e->inline_failed = CIF_ATTRIBUTE_MISMATCH;
+  e->inline_failed = CIF_SANITIZE_ATTRIBUTE_MISMATCH;
   inlinable = false;
 }
   if (!inlinable && report)
diff --git a/gcc/testsuite/c-c++-common/asan/inline-kernel.c b/gcc/testsuite/c-c++-common/asan/inline-kernel.c
new file mode 100644
index 000..4ca739b5d04
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/asan/inline-kernel.c
@@ -0,0 +1,20 @@
+/* { dg-do compile } */
+/* { dg-options "-fsanitize=kernel-address -c -O3 -fdump-tree-optimized" } */
+
+int x;
+
+static inline
+__attribute__((no_sanitize("kernel-address")))
+void do_not_sanitize(void)
+{
+  x++;
+}
+
+void
+sanitize_this(void)
+{
+  x++;
+  do_not_sanitize();
+}
+
+/* { dg-final { scan-tree-dump-times "Function do_not_sanitize" 1 "optimized" } } */
diff --git a/gcc/t

Re: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 02:32:36PM +0200, Martin Liška wrote:
> >From 5fe0671ad79d14d1c9d0fead1a471875a4416fac Mon Sep 17 00:00:00 2001
> From: Martin Liska 
> Date: Tue, 9 Jun 2020 13:03:55 +0200
> Subject: [PATCH] sanitizer: do not inline no-sanitize into sanitizer fn
> 
> gcc/ChangeLog:
> 
>   * cif-code.def (ATTRIBUTE_MISMATCH): Rename to...
>   (SANITIZE_ATTRIBUTE_MISMATCH): ...this.
>   * ipa-inline.c (sanitize_attrs_match_for_inline_p):
>   Handle all sanitizer options.
>   (can_inline_edge_p): Use renamed CIF_* enum value.
> 
> gcc/testsuite/ChangeLog:
> 
>   * c-c++-common/asan/inline.c: New test.
>   * c-c++-common/asan/inline-kernel.c: New test.
>   * c-c++-common/tsan/inline.c: New test.
>   * c-c++-common/ubsan/inline.c: New test.

Ok, thanks.

Jakub



[PATCH] gcov-dump: fix --help spacing

2020-06-09 Thread Martin Liška

Pushed to master.

gcc/ChangeLog:

* gcov-dump.c (print_usage): Fix spacing for --raw option
in --help.
---
 gcc/gcov-dump.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/gcov-dump.c b/gcc/gcov-dump.c
index 90cbd1ace52..ffb71ca4b68 100644
--- a/gcc/gcov-dump.c
+++ b/gcc/gcov-dump.c
@@ -133,7 +133,7 @@ print_usage (void)
   printf ("  -h, --help   Print this help\n");
   printf ("  -l, --long   Dump record contents too\n");
   printf ("  -p, --positions  Dump record positions\n");
-  printf ("  -r, --raw  Print content records in raw format\n");
+  printf ("  -r, --rawPrint content records in raw format\n");
   printf ("  -v, --versionPrint version number\n");
   printf ("\nFor bug reporting instructions, please see:\n%s.\n",
   bug_report_url);
--
2.26.2



Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]

2020-06-09 Thread Eric Botcazou
> Yes, but the problem is that remap_decl isn't getting called.

Right, I can get it to be called by adding a pushdecl to grokdeclarator...

> Attaching it to the BIND_EXPR doesn't help walk_tree_1 do the right
> thing with the DECL_EXPR.

... but, indeed, this still ICEs.  So the key is that the DECL_EXPR_DECL of 
the copied DECL_EXPR points to the remapped TYPE_DECL before the type is 
copied?  If so, then your original patch is probably the way to go, but the 
comment would need to be slightly adjusted.

In Ada, where we attach the TYPE_DECL to the BIND_EXPR, this will mean that
remap_decl is invoked 3 times per TYPE_DECL: first twice from copy_bind_expr 
and then once again for the DECL_EXPR.  But probably no big deal in the end.

-- 
Eric Botcazou


BRIG FE testsuite: Fix all dump-scans (Was: Re: drop -aux{dir, base}, revamp -dump{dir, base})

2020-06-09 Thread Martin Jambor
Hi,

On Tue, Jun 09 2020, Thomas Schwinge wrote:
> Hi!
>
> On 2020-05-26T04:08:44-0300, Alexandre Oliva  wrote:
>> Thanks, here's the combined patch I'm checking in.
>>
>> revamp dump and aux output names
>
> For BRIG (HSAIL) front end testing, I'm see a lot of failures like:
>
> Running [...]/source-gcc/gcc/testsuite/brig.dg/dg.exp ...
> PASS: brig.dg/test/gimple/variables.hsail (test for excess errors)
> [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original 
> "__group_base_addr \\+ \\(0 \\+"
> [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump original 
> "__group_base_addr \\+ 0"
> [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple "[ 
> ]*prog_global = s204;"
> [-PASS:-]{+UNRESOLVED:+} variables.hsail.brig scan-tree-dump gimple 
> ".module.mod_global;"
> [...]
>
> That's:
>
> spawn -ignore SIGHUP [...]/build-gcc/gcc/xgcc -B[...]/build-gcc/gcc/ 
> [...]/build-gcc/gcc/testsuite/brig/variables.hsail.brig -fdump-tree-gimple 
> -fdump-tree-original -S -o variables.s
> PASS: brig.dg/test/gimple/variables.hsail (test for excess errors)
> variables.hsail.brig: dump file does not exist
> dump file: variables.hsail.brig.original
> UNRESOLVED: variables.hsail.brig scan-tree-dump original 
> "__group_base_addr \\+ \\(0 \\+"
>
> We're trying to scan 'variables.hsail.brig.*', but for input file name
> 'variables.hsail.brig', we're now creating:
>
> $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.*
> build-gcc/gcc/testsuite/brig/variables.brig.004t.original
> build-gcc/gcc/testsuite/brig/variables.brig.005t.gimple
>
> Before your changes, GCC produced the expected:
>
> $ ls -1 build-gcc/gcc/testsuite/brig/variables.*???t.*
> build-gcc/gcc/testsuite/brig/variables.hsail.brig.004t.original
> build-gcc/gcc/testsuite/brig/variables.hsail.brig.005t.gimple
>
> Are you able to easily create a patch for that?  How/where to adjust:
> producer-side (GCC driver, or BRIG (HSAIL) front end?), or consumer-side
> (testsuite: tree scanning machinery, or have to put some '-dumpbase' into
> all test case files?)?


I looked into the issue yesterday and decided the simplest fix is
probably the following.  I am going to use my BRIG maintainer hat to
commit the patch in a day or two unless someone thinks it is a bad idea.
Tested by running make check-brig on an x86_64-linux.

Martin



Since Alexandre's revamp of dump files handling in
r11-627-g1dedc12d186, BRIG FE has been receiving slightly different
-dumpbase (e.g. smoke_test.brig instead of smoke_test.hsail.brig when
compiling file smoke_test.hsail.brig) and the testsuite then could not
find the generated dump files it wanted to scan.  I have not really
looked into why that changed, the easiest fix seems to me to remove
the hsail part already when generating the binary brig file from the
textual HSAIL representation.

gcc/testsuite/ChangeLog:

2020-06-09  Martin Jambor  

* lib/brig.exp (brig_target_compile): Strip hsail extension when
gnerating the name of the binary brig file.
---
 gcc/testsuite/lib/brig.exp | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/testsuite/lib/brig.exp b/gcc/testsuite/lib/brig.exp
index fbfb1da947a..de47f13e42c 100644
--- a/gcc/testsuite/lib/brig.exp
+++ b/gcc/testsuite/lib/brig.exp
@@ -29,7 +29,7 @@ proc brig_target_compile { source dest type options } {
# We cannot assume all inputs are .hsail as the dg machinery
# calls this for a some c files to check linker plugin support or
# similar.
-   set brig_source ${tmpdir}/[file tail ${source}].brig
+   set brig_source ${tmpdir}/[file rootname [file tail ${source}]].brig
exec HSAILasm $source -o ${brig_source}
set source ${brig_source}
# Change the testname the .brig.
-- 
2.26.2



Re: [PATCH PR95569] ICE in tmmark:verify_ssa failed

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 11:47 AM qianchao  wrote:
>
> Hi
>
> Commit eb72dc663e9070b281be83a80f6f838a3a878822 introduces a ICE on AArch64. 
> See the discussion on the bug for details.
> Bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95569
>
> Trivial patch fixing the ICE attached.
>
> Bootstrap and tested on aarch64 platform. No new regression witnessed. ok for 
> trunk?

OK.

Thanks,
Richard.

> 2020-06-09  Qian Chao  
>
> gcc/:
>
> PR tree-optimization/95569
> * trans-mem.c (expand_assign_tm): Ensure that rtmp is marked 
> TREE_ADDRESSABLE.
> * testsuite/gcc.dg/tm/pr51472.c: New test.


Re: [patch] Make memory copy functions scalar storage order barriers

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 12:19 PM Eric Botcazou  wrote:
>
> > I'm probably completely misunderstanding how TYPE_REVERSE_STORAGE_ORDER
> > works and what the constraints are.  But if the memcpy is the storage order
> > barrier then what cases are OK?
> >
> >  - both source and destination objects are [only] accessed with the same
> >storage order
> >  - both source and destination objects are [only] accessed with
> >!TYPE_REVERSE_STORAGE_ORDER "containing" accesses
> >
> > and what determines the "object state" with respect to storage order
> > when we see a memcpy call?  I had the impression (from the other
> > existing cases of contains_storage_order_barrier_p () calls) that
> > we derive the "object state" from the downstream access (vr->operands),
> > in this case the memcpy destination access.  But see below for maybe
> > the easier case (memcpy folding).
>
> All cases are OK if you don't try to replace memcpy with something else.  The
> only possible replacement would be with a VIEW_CONVERT_EXPR, but I guess that
> it would be quickly folded into a MEM_EXPR and then we're back to square one.
>
> > In my understanding of how TYPE_REVERSE_STORAGE_ORDER works
> > a memcpy, irrespective of its actual "implementation" should not break
> > anything since it transfers bytes.
>
> The point of the discussion is precisely to make it work in all cases.
>
> > In particular a 8 byte TYPE_REVERSE_STORAGE_ORDER
> > object can be transfered ("memcpy"ed) with (ignoring TBAA)
> >
> >   int tema = *(int *)src;
> >   *(int *)dst = tema;
> >   int temb = *((int *)src + 1);
> >   *((int *)dst + 1) = temb;
> >
> > and what byteorder the int accesses are performed in is irrelevant
> > as long as read and write use the same byteorder.
>
> No, because of the fundamental invariant of the implementation: you may _not_
> access the same memory location with different storage orders, this is simply
> not supported and we document the limitation in the manual.  The program does
> not do that, so it's valid; the program after folding does that, which means
> that the folding is invalid.
>
> > > Yes, theoritically, but SRA rewrites the access into a scalar access and
> > > we're back to square one (I tried).
> >
> > But what's the issue with this?
>
> Same as above: access to the same location with different storage orders.
>
> > Why?  Is the following an invalid program?
> >
> > void mymemcpy (void *a, void *b) { memcpy (a, b, 8); }
> > void foo()
> > {
> >   int b[2] __attribute__(reverse_storage);
> >   int a[2];
> >   mymemcpy (a, b);
> >   return a[0];
> > }
> >
> > how can GCC infer, inside mymemcpy that b is reverse storage order?
>
> No, that's supported, again the point of the discussion is precisely to make
> it work!

But GCC does not see the reverse storage order in mymemcpy so
it happily folds the memcpy inside it, inlines the result and then?

> > Thus, if all this is a problem then I think we're papering over the issue
> > by not folding memcpy.
>
> I don't really see why you're insisting on folding memcpy.  It's semantically
> equivalent to VIEW_CONVERT_EXPR so it needs to be treated the same way.

I'm not insisting on folding memcpy.  I'm questioning you are fixing
the actual bug ;)

Richard.

> --
> Eric Botcazou


Re: drop -aux{dir,base}, revamp -dump{dir,base}

2020-06-09 Thread Thomas Schwinge
Hi!

On 2020-05-26T04:08:44-0300, Alexandre Oliva  wrote:
> Thanks, here's the combined patch I'm checking in.
>
> revamp dump and aux output names

For libgomp offloading testing, I'm seeing a number of failures like:

UNSUPPORTED: 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test 
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
execution test
[-PASS:-]{+UNRESOLVED:+} 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/nvptx-merged-loop.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   
scan-offload-rtl-dump mach "Merging loop .* into "

[...]

UNSUPPORTED: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  (test 
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2  
execution test
[-PASS:-]{+FAIL:+} libgomp.oacc-c/../libgomp.oacc-c-c++-common/pr85381-2.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   
scan-assembler-times bar.sync 2

[...]

PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  (test 
for excess errors)
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  
execution test
PASS: libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0  output 
pattern test
[-PASS:-]{+UNRESOLVED:+} 
libgomp.oacc-c/../libgomp.oacc-c-c++-common/vector-length-128-1.c 
-DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0   
scan-offload-tree-dump oaccdevlow "__attribute__\\(\\(oacc function \\(1, 1, 
128\\)"

[...]

UNSUPPORTED: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O0
PASS: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 
-foffload=nvptx-none  -O2  (test for excess errors)
PASS: libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 -DACC_MEM_SHARED=0 
-foffload=nvptx-none  -O2  execution test
[-PASS:-]{+UNRESOLVED:+} libgomp.oacc-c/vec.c -DACC_DEVICE_TYPE_nvidia=1 
-DACC_MEM_SHARED=0 -foffload=nvptx-none  -O2   scan-offload-tree-dump slp1 
"vector\\(2\\) long long unsigned int"

[...]

(A number of similar ones skipped.)


A word of warning: some of that testing indeed is fragile/ugly ;-) -- but
we have to restore it anyway.  That means, we shall gladly accept
suggestions how to do it better.


Consider 'libgomp.oacc-c-c++-common/nvptx-merged-loop.c':

/* { dg-options "-foffload=-fdump-rtl-mach" } */

/* { dg-final { scan-offload-rtl-dump "Merging loop .* into " "mach" } } */

Previously, for '-foffload=nvptx-none -foffload=-fdump-rtl-mach
-save-temps -o ./nvptx-merged-loop.exe', GCC produced the expected
'nvptx-merged-loop.o.307r.mach'.

Now, I find the file at '/tmp/cc6nlTN9.mkoffload-cc4ebUn4.o.307r.mach'

(The '-save-temps' comes from
'libgomp/testsuite/lib/libgomp-dg.exp:libgomp-dg-test', by the way.)

(Yes, that's all a bit ugly, how the 'mkoffload's etc. handle their
files...)

Similar for 'libgomp.oacc-c-c++-common/vector-length-128-1.c':

/* { dg-additional-options "-foffload=-fdump-tree-oaccdevlow" } */

/* { dg-final { scan-offload-tree-dump "__attribute__\\(\\(oacc function 
\\(1, 1, 128\\)" "oaccdevlow" } } */

Similar for 'libgomp.oacc-c/vec.c':

/* { dg-additional-options "-std=c99 -ftree-slp-vectorize 
-foffload=-fdump-tree-slp1" } */

/* { dg-final { scan-offload-tree-dump "vector\\(2\\) long long unsigned 
int" "slp1" } } */

(Thus not analyed these any further.)


Consider 'libgomp.oacc-c-c++-common/pr85381-2.c':

/* { dg-additional-options "-save-temps" } */

/* { dg-final { scan-assembler-times "bar.sync" 2 } } */

This expects to scan the PTX offloading compilation assembler code (not
host code!), expecting that nvptx offloading code assembly is produced
after the host code, and thus overwrites the latter file.  (Yes, that's
certainly ugly/fragile...)

Previously, for '-foffload=nvptx-none -save-temps -o ./pr85381-2.exe',
GCC left the '-foffload=nvptx-none' assembly in the expected
'pr85381-2.s'.

Now, I find the file at '/tmp/cc8JsG4H.mkoffload-pr85381-2.s'.


Are you able to easily create/suggest patches for these?  (You're
prob

[PATCH v3] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Marco Elver via Gcc-patches
Add support to optionally emit different instrumentation for accesses to
volatile variables. While the default TSAN runtime likely will never
require this feature, other runtimes for different environments that
have subtly different memory models or assumptions may require
distinguishing volatiles.

One such environment are OS kernels, where volatile is still used in
various places, and often declare volatile to be appropriate even in
multi-threaded contexts. One such example is the Linux kernel, which
implements various synchronization primitives using volatile
(READ_ONCE(), WRITE_ONCE()).

Here the Kernel Concurrency Sanitizer (KCSAN), is a runtime that uses
TSAN instrumentation but otherwise implements a very different approach
to race detection from TSAN:

https://github.com/google/ktsan/wiki/KCSAN

Due to recent changes in requirements by the Linux kernel, KCSAN
requires that the compiler supports tsan-distinguish-volatile (among
several new requirements):

https://lore.kernel.org/lkml/20200521142047.169334-7-el...@google.com/

gcc/
* params.opt: Define --param=tsan-distinguish-volatile=[0,1].
* sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
builtin for volatile instrumentation of reads/writes.
(BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
(BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
(BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
* tsan.c (get_memory_access_decl): Argument if access is
volatile. If param tsan-distinguish-volatile is non-zero, and
access if volatile, return volatile instrumentation decl.
(instrument_expr): Check if access is volatile.

gcc/testsuite/
* c-c++-common/tsan/volatile.c: New test.

Acked-by: Dmitry Vyukov 
---
v3:
* Remove Optimization from -param=tsan-distinguish-volatile.
* Simplify get_memory_access_decl.
* Avoid use of builtin_decl temporary.

v2:
* Add Optimization keyword to -param=tsan-distinguish-volatile= as the
  parameter can be different per TU.
* Add tree-dump check to test.
---
 gcc/params.opt |  4 ++
 gcc/sanitizer.def  | 21 +++
 gcc/testsuite/c-c++-common/tsan/volatile.c | 67 ++
 gcc/tsan.c | 31 +-
 4 files changed, 110 insertions(+), 13 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/tsan/volatile.c

diff --git a/gcc/params.opt b/gcc/params.opt
index 4aec480798b..9b564bb046c 100644
--- a/gcc/params.opt
+++ b/gcc/params.opt
@@ -908,6 +908,10 @@ Stop reverse growth if the reverse probability of best 
edge is less than this th
 Common Joined UInteger Var(param_tree_reassoc_width) Param Optimization
 Set the maximum number of instructions executed in parallel in reassociated 
tree.  If 0, use the target dependent heuristic.
 
+-param=tsan-distinguish-volatile=
+Common Joined UInteger Var(param_tsan_distinguish_volatile) IntegerRange(0, 1) 
Param
+Emit special instrumentation for accesses to volatiles.
+
 -param=uninit-control-dep-attempts=
 Common Joined UInteger Var(param_uninit_control_dep_attempts) Init(1000) 
IntegerRange(1, 65536) Param Optimization
 Maximum number of nested calls to search for control dependencies during 
uninitialized variable analysis.
diff --git a/gcc/sanitizer.def b/gcc/sanitizer.def
index 11eb6467eba..a32715ddb92 100644
--- a/gcc/sanitizer.def
+++ b/gcc/sanitizer.def
@@ -214,6 +214,27 @@ DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_READ_RANGE, 
"__tsan_read_range",
 DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_WRITE_RANGE, "__tsan_write_range",
  BT_FN_VOID_PTR_PTRMODE, ATTR_NOTHROW_LEAF_LIST)
 
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ1, "__tsan_volatile_read1",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ2, "__tsan_volatile_read2",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ4, "__tsan_volatile_read4",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ8, "__tsan_volatile_read8",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_READ16, "__tsan_volatile_read16",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE1, "__tsan_volatile_write1",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATILE_WRITE2, "__tsan_volatile_write2",
+ BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
+DEF_SANITIZER_BUILTIN(BUILT_IN_TSAN_VOLATIL

Re: [PATCH v3] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 03:15:39PM +0200, Marco Elver wrote:
> gcc/
>   * params.opt: Define --param=tsan-distinguish-volatile=[0,1].
>   * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
>   builtin for volatile instrumentation of reads/writes.
>   (BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
>   (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
>   * tsan.c (get_memory_access_decl): Argument if access is
>   volatile. If param tsan-distinguish-volatile is non-zero, and
>   access if volatile, return volatile instrumentation decl.
>   (instrument_expr): Check if access is volatile.
> 
> gcc/testsuite/
>   * c-c++-common/tsan/volatile.c: New test.
> 
> Acked-by: Dmitry Vyukov 

Ok, thanks.

Jakub



RE: [PATCH] move permutation validity check

2020-06-09 Thread Alex Coplan
> -Original Message-
> From: Gcc-patches  On Behalf Of Richard
> Biener
> Sent: 05 May 2020 14:49
> To: gcc-patches@gcc.gnu.org
> Subject: [PATCH] move permutation validity check
> 
> 
> This delays the SLP permutation check to vectorizable_load and optimizes
> permutations only after all SLP instances have been generated and the
> vectorization factor is determined.
> 
> Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> 
> Richard.

Hi Richard,

This patch (bc484e25) introduced a test failure on arm and aarch64. The
test gcc.dg/vect/slp-perm-9.c is now failing with:

FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation requires 
at least three vectors" 1

It seems this assertion isn't run on x86-64 which explains why this was
missed.

Thanks,
Alex

> 
> 2020-05-05  Richard Biener  
> 
>   * tree-vectorizer.h (vec_info::slp_loads): New.
>   (vect_optimize_slp): Declare.
>   * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts):  Do
>   nothing when there are no loads.
>   (vect_gather_slp_loads): Gather loads into a vector.
>   (vect_supported_load_permutation_p): Remove.
>   (vect_analyze_slp_instance): Do not verify permutation
>   validity here.
>   (vect_analyze_slp): Optimize permutations of reductions
>   after all SLP instances have been gathered and gather
>   all loads.
>   (vect_optimize_slp): New function split out from
>   vect_supported_load_permutation_p.  Elide some permutations.
>   (vect_slp_analyze_bb_1): Call vect_optimize_slp.
>   * tree-vect-loop.c (vect_analyze_loop_2): Likewise.
>   * tree-vect-stmts.c (vectorizable_load): Check whether
>   the load can be permuted.  When generating code assert we can.
> 
>   * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported
>   SLP permutations becoming builds from scalars.
>   * gcc.dg/vect/bb-slp-pr78205.c: Likewise.
>   * gcc.dg/vect/bb-slp-34.c: Likewise.
> ---
>  gcc/testsuite/gcc.dg/vect/bb-slp-34.c  |   3 +-
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c |   7 +-
>  gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c |   6 +-
>  gcc/tree-vect-loop.c   |   3 +
>  gcc/tree-vect-slp.c| 262 +++
> --
>  gcc/tree-vect-stmts.c  |  50 +-
>  gcc/tree-vectorizer.h  |   4 +-
>  7 files changed, 154 insertions(+), 181 deletions(-)


Re: [PATCH] move permutation validity check

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 3:22 PM Alex Coplan  wrote:
>
> > -Original Message-
> > From: Gcc-patches  On Behalf Of Richard
> > Biener
> > Sent: 05 May 2020 14:49
> > To: gcc-patches@gcc.gnu.org
> > Subject: [PATCH] move permutation validity check
> >
> >
> > This delays the SLP permutation check to vectorizable_load and optimizes
> > permutations only after all SLP instances have been generated and the
> > vectorization factor is determined.
> >
> > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> >
> > Richard.
>
> Hi Richard,
>
> This patch (bc484e25) introduced a test failure on arm and aarch64. The
> test gcc.dg/vect/slp-perm-9.c is now failing with:
>
> FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation 
> requires at least three vectors" 1
>
> It seems this assertion isn't run on x86-64 which explains why this was
> missed.

Please open a bugreport if there's none already.

Thanks,
Richard.

>
> Thanks,
> Alex
>
> >
> > 2020-05-05  Richard Biener  
> >
> >   * tree-vectorizer.h (vec_info::slp_loads): New.
> >   (vect_optimize_slp): Declare.
> >   * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts):  Do
> >   nothing when there are no loads.
> >   (vect_gather_slp_loads): Gather loads into a vector.
> >   (vect_supported_load_permutation_p): Remove.
> >   (vect_analyze_slp_instance): Do not verify permutation
> >   validity here.
> >   (vect_analyze_slp): Optimize permutations of reductions
> >   after all SLP instances have been gathered and gather
> >   all loads.
> >   (vect_optimize_slp): New function split out from
> >   vect_supported_load_permutation_p.  Elide some permutations.
> >   (vect_slp_analyze_bb_1): Call vect_optimize_slp.
> >   * tree-vect-loop.c (vect_analyze_loop_2): Likewise.
> >   * tree-vect-stmts.c (vectorizable_load): Check whether
> >   the load can be permuted.  When generating code assert we can.
> >
> >   * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported
> >   SLP permutations becoming builds from scalars.
> >   * gcc.dg/vect/bb-slp-pr78205.c: Likewise.
> >   * gcc.dg/vect/bb-slp-34.c: Likewise.
> > ---
> >  gcc/testsuite/gcc.dg/vect/bb-slp-34.c  |   3 +-
> >  gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c |   7 +-
> >  gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c |   6 +-
> >  gcc/tree-vect-loop.c   |   3 +
> >  gcc/tree-vect-slp.c| 262 +++
> > --
> >  gcc/tree-vect-stmts.c  |  50 +-
> >  gcc/tree-vectorizer.h  |   4 +-
> >  7 files changed, 154 insertions(+), 181 deletions(-)


RE: [PATCH] move permutation validity check

2020-06-09 Thread Alex Coplan
> -Original Message-
> From: Richard Biener 
> Sent: 09 June 2020 14:24
> To: Alex Coplan 
> Cc: Richard Biener ; gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH] move permutation validity check
> 
> On Tue, Jun 9, 2020 at 3:22 PM Alex Coplan  wrote:
> >
> > > -Original Message-
> > > From: Gcc-patches  On Behalf Of Richard
> > > Biener
> > > Sent: 05 May 2020 14:49
> > > To: gcc-patches@gcc.gnu.org
> > > Subject: [PATCH] move permutation validity check
> > >
> > >
> > > This delays the SLP permutation check to vectorizable_load and optimizes
> > > permutations only after all SLP instances have been generated and the
> > > vectorization factor is determined.
> > >
> > > Bootstrap and regtest running on x86_64-unknown-linux-gnu.
> > >
> > > Richard.
> >
> > Hi Richard,
> >
> > This patch (bc484e25) introduced a test failure on arm and aarch64. The
> > test gcc.dg/vect/slp-perm-9.c is now failing with:
> >
> > FAIL: gcc.dg/vect/slp-perm-9.c scan-tree-dump-times vect "permutation
> requires at least three vectors" 1
> >
> > It seems this assertion isn't run on x86-64 which explains why this was
> > missed.
> 
> Please open a bugreport if there's none already.
> 
> Thanks,
> Richard.

Ah, it appears there already is one: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95056

Apologies, I should have checked Bugzilla first.

Thanks,
Alex

> 
> >
> > Thanks,
> > Alex
> >
> > >
> > > 2020-05-05  Richard Biener  
> > >
> > >   * tree-vectorizer.h (vec_info::slp_loads): New.
> > >   (vect_optimize_slp): Declare.
> > >   * tree-vect-slp.c (vect_attempt_slp_rearrange_stmts):  Do
> > >   nothing when there are no loads.
> > >   (vect_gather_slp_loads): Gather loads into a vector.
> > >   (vect_supported_load_permutation_p): Remove.
> > >   (vect_analyze_slp_instance): Do not verify permutation
> > >   validity here.
> > >   (vect_analyze_slp): Optimize permutations of reductions
> > >   after all SLP instances have been gathered and gather
> > >   all loads.
> > >   (vect_optimize_slp): New function split out from
> > >   vect_supported_load_permutation_p.  Elide some permutations.
> > >   (vect_slp_analyze_bb_1): Call vect_optimize_slp.
> > >   * tree-vect-loop.c (vect_analyze_loop_2): Likewise.
> > >   * tree-vect-stmts.c (vectorizable_load): Check whether
> > >   the load can be permuted.  When generating code assert we can.
> > >
> > >   * gcc.dg/vect/bb-slp-pr68892.c: Adjust for not supported
> > >   SLP permutations becoming builds from scalars.
> > >   * gcc.dg/vect/bb-slp-pr78205.c: Likewise.
> > >   * gcc.dg/vect/bb-slp-34.c: Likewise.
> > > ---
> > >  gcc/testsuite/gcc.dg/vect/bb-slp-34.c  |   3 +-
> > >  gcc/testsuite/gcc.dg/vect/bb-slp-pr68892.c |   7 +-
> > >  gcc/testsuite/gcc.dg/vect/bb-slp-pr78205.c |   6 +-
> > >  gcc/tree-vect-loop.c   |   3 +
> > >  gcc/tree-vect-slp.c| 262 +++
> > > --
> > >  gcc/tree-vect-stmts.c  |  50 +-
> > >  gcc/tree-vectorizer.h  |   4 +-
> > >  7 files changed, 154 insertions(+), 181 deletions(-)



Re: [stage1][PATCH] Lower VEC_COND_EXPR into internal functions.

2020-06-09 Thread Richard Biener via Gcc-patches
On Mon, Jun 8, 2020 at 1:04 PM Martin Liška  wrote:
>
> Hello.
>
> Thank you for the approval. There's the patch that defines 4 new 
> DEF_INTERNAL_OPTAB_FN.
>
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> It also builds on ppc64le-linux-gnu.
>
> Ready to be installed?

The ChangeLog refers to DEF_INTERNAL_OPTAB_CAN_FAIL which is no longer there.

Can you put the isel pass to a separate file please?

So this is a first step towards sanitizing VEC_COND_EXPR.  There were followups
mentioned, namely a) enforcing that VEC_COND_EXPR constraint everywhere,
b) isel vector comparisons at the same time since expansion has a
vec_cond fallback

There's

+ /* ???  If we do not cleanup EH then we will ICE in
+verification.  But in reality we have created wrong-code
+as we did not properly transition EH info and edges to
+the piecewise computations.  */
+ if (maybe_clean_eh_stmt (gsi_stmt (gsi))
+ && gimple_purge_dead_eh_edges (bb))
+   cfg_changed = true;

which of course is bad.  It's the comparison that can throw and I guess current
RTL expansion manages to cope by find_many_sub_bbs and friends.  But we
need to get this correct on GIMPLE here.  Note I find it odd this only triggers
during ISEL - it should trigger during the lowering step which splits
the comparison
from the VEC_COND_EXPR.  An appropriate fix at lowering time would be to
insert the VEC_COND_EXPR w/o the condition on the normal outgoing edge
and keep the comparison in place of the original VEC_COND_EXPR,
moving EH info from the VEC_COND_EXPR to the comparison.

I think we need to fix that before merging.

Thanks,
Richard.

> Thanks,
> Martin


Re: [PATCH v3] tsan: Add optional support for distinguishing volatiles

2020-06-09 Thread Marco Elver via Gcc-patches
On Tue, 9 Jun 2020 at 15:22, Jakub Jelinek  wrote:
>
> On Tue, Jun 09, 2020 at 03:15:39PM +0200, Marco Elver wrote:
> > gcc/
> >   * params.opt: Define --param=tsan-distinguish-volatile=[0,1].
> >   * sanitizer.def (BUILT_IN_TSAN_VOLATILE_READ1): Define new
> >   builtin for volatile instrumentation of reads/writes.
> >   (BUILT_IN_TSAN_VOLATILE_READ2): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ4): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ8): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_READ16): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE1): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE2): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE4): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE8): Likewise.
> >   (BUILT_IN_TSAN_VOLATILE_WRITE16): Likewise.
> >   * tsan.c (get_memory_access_decl): Argument if access is
> >   volatile. If param tsan-distinguish-volatile is non-zero, and
> >   access if volatile, return volatile instrumentation decl.
> >   (instrument_expr): Check if access is volatile.
> >
> > gcc/testsuite/
> >   * c-c++-common/tsan/volatile.c: New test.
> >
> > Acked-by: Dmitry Vyukov 
>
> Ok, thanks.

I think one of you has to commit the patch, as we don't have access to
the GCC git repository.

Thank you!

-- Marco


[PATCH] Remove dead code

2020-06-09 Thread Richard Biener
This removes dead code that was left over from the reduction
vectorization refactoring last year.

Bootstrapped / tested on x86_64-unknown-linux-gnu, applied.

Richard.

2020-06-09  Richard Biener  

* tree-vect-stmts.c (vect_transform_stmt): Remove dead code.
---
 gcc/tree-vect-stmts.c | 43 ---
 1 file changed, 43 deletions(-)

diff --git a/gcc/tree-vect-stmts.c b/gcc/tree-vect-stmts.c
index b24b0fe4304..61adf7d7fa4 100644
--- a/gcc/tree-vect-stmts.c
+++ b/gcc/tree-vect-stmts.c
@@ -11186,12 +11186,6 @@ vect_transform_stmt (vec_info *vinfo,
   gcc_assert (slp_node || !PURE_SLP_STMT (stmt_info));
   stmt_vec_info old_vec_stmt_info = STMT_VINFO_VEC_STMT (stmt_info);
 
-  loop_vec_info loop_vinfo = dyn_cast  (vinfo);
-  bool nested_p = (loop_vinfo
-  && nested_in_vect_loop_p
-   (LOOP_VINFO_LOOP (loop_vinfo), stmt_info));
-
-  gimple *stmt = stmt_info->stmt;
   switch (STMT_VINFO_TYPE (stmt_info))
 {
 case type_demotion_vec_info_type:
@@ -11266,13 +11260,11 @@ vect_transform_stmt (vec_info *vinfo,
 case call_vec_info_type:
   done = vectorizable_call (vinfo, stmt_info,
gsi, &vec_stmt, slp_node, NULL);
-  stmt = gsi_stmt (*gsi);
   break;
 
 case call_simd_clone_vec_info_type:
   done = vectorizable_simd_clone_call (vinfo, stmt_info, gsi, &vec_stmt,
   slp_node, NULL);
-  stmt = gsi_stmt (*gsi);
   break;
 
 case reduc_vec_info_type:
@@ -11310,41 +11302,6 @@ vect_transform_stmt (vec_info *vinfo,
 gcc_assert (!vec_stmt
&& STMT_VINFO_VEC_STMT (stmt_info) == old_vec_stmt_info);
 
-  /* Handle inner-loop stmts whose DEF is used in the loop-nest that
- is being vectorized, but outside the immediately enclosing loop.  */
-  if (vec_stmt
-  && nested_p
-  && STMT_VINFO_TYPE (stmt_info) != reduc_vec_info_type
-  && (STMT_VINFO_RELEVANT (stmt_info) == vect_used_in_outer
-  || STMT_VINFO_RELEVANT (stmt_info) ==
-   vect_used_in_outer_by_reduction))
-{
-  class loop *innerloop = LOOP_VINFO_LOOP (loop_vinfo)->inner;
-  imm_use_iterator imm_iter;
-  use_operand_p use_p;
-  tree scalar_dest;
-
-  if (dump_enabled_p ())
-dump_printf_loc (MSG_NOTE, vect_location,
- "Record the vdef for outer-loop vectorization.\n");
-
-  /* Find the relevant loop-exit phi-node, and reord the vec_stmt there
-(to be used when vectorizing outer-loop stmts that use the DEF of
-STMT).  */
-  if (gimple_code (stmt) == GIMPLE_PHI)
-scalar_dest = PHI_RESULT (stmt);
-  else
-scalar_dest = gimple_get_lhs (stmt);
-
-  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, scalar_dest)
-   if (!flow_bb_inside_loop_p (innerloop, gimple_bb (USE_STMT (use_p
- {
-   stmt_vec_info exit_phi_info
- = vinfo->lookup_stmt (USE_STMT (use_p));
-   STMT_VINFO_VEC_STMT (exit_phi_info) = vec_stmt;
- }
-}
-
   if (vec_stmt)
 STMT_VINFO_VEC_STMT (stmt_info) = vec_stmt;
 
-- 
2.25.1


Re: [Patch][RFC] openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)

2020-06-09 Thread Tobias Burnus

It turned out that this patch fails with LTO and partitions,
causing fails at runtime such as
  libgomp: Duplicate node
via libgomp/splay-tree.c's splay_tree_insert.

In the test case, the problem occurred for functions - namely
main._omp_fn.* on the host.
If the code is run in LTO context, the filtering-out should
have already happen via the stream-out/stream-in and hence no
additional check is needed for omp_finish_file.

OK?

Tobias

PS: The streaming-in is done via:
  input_offload_tables (/* do_force_output = */ !flag_ltrans);

-
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander 
Walter
openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)

gcc/ChangeLog:

	* omp-offload.c (add_decls_addresses_to_decl_constructor,
	omp_finish_file): With in_lto_p, stream out all offload-table
	items even if the symtab_node does not exist.

diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 4e44cfc9d0a..32c2485abd4 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -126,7 +126,7 @@ add_decls_addresses_to_decl_constructor (vec *v_decls,
 	  && lookup_attribute ("omp declare target link", DECL_ATTRIBUTES (it));
 
   /* See also omp_finish_file and output_offload_tables in lto-cgraph.c.  */
-  if (!symtab_node::get (it))
+  if (!in_lto_p && !symtab_node::get (it))
 	continue;
 
   tree size = NULL_TREE;
@@ -382,14 +382,14 @@ omp_finish_file (void)
 	  tree it = (*offload_funcs)[i];
 	  /* See also add_decls_addresses_to_decl_constructor
 	 and output_offload_tables in lto-cgraph.c.  */
-	  if (!symtab_node::get (it))
+	  if (!in_lto_p && !symtab_node::get (it))
 	continue;
 	  targetm.record_offload_symbol (it);
 	}
   for (unsigned i = 0; i < num_vars; i++)
 	{
 	  tree it = (*offload_vars)[i];
-	  if (!symtab_node::get (it))
+	  if (!in_lto_p && !symtab_node::get (it))
 	continue;
 #ifdef ACCEL_COMPILER
 	  if (DECL_HAS_VALUE_EXPR_P (it)


Re: [Patch][RFC] openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 04:02:19PM +0200, Tobias Burnus wrote:
> It turned out that this patch fails with LTO and partitions,
> causing fails at runtime such as
>   libgomp: Duplicate node
> via libgomp/splay-tree.c's splay_tree_insert.
> 
> In the test case, the problem occurred for functions - namely
> main._omp_fn.* on the host.
> If the code is run in LTO context, the filtering-out should
> have already happen via the stream-out/stream-in and hence no
> additional check is needed for omp_finish_file.
> 
> OK?

Was this caught in the testsuite, or do you have some short testcase
that could be used in the testsuite?
> 
> Tobias
> 
> PS: The streaming-in is done via:
>   input_offload_tables (/* do_force_output = */ !flag_ltrans);
> 
> -
> Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / 
> Germany
> Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, 
> Alexander Walter

> openmp: ensure variables in offload table are streamed out (PRs 94848 + 95551)
> 
> gcc/ChangeLog:
> 
>   * omp-offload.c (add_decls_addresses_to_decl_constructor,
>   omp_finish_file): With in_lto_p, stream out all offload-table
>   items even if the symtab_node does not exist.

Ok with or without the testcase.

Jakub



[RFC][vect] BB SLP reduction prototype

2020-06-09 Thread Andre Vieira (lists)

Hi,

So this is my rework of the code you sent me, I have not included the 
'permute' code you included as I can't figure out what it is meant to be 
doing. Maybe something to look at later.


I have also included three tests that show it working for some simple 
cases and even a nested one.


Unfortunately it will not handle other simple cases where reassoc 
doesn't put the reduction in the form of :

sum0 = a + b;
sum1 = c + sum0;
...

For instance a testcase I have been looking at is:
unsigned int u32_single_abs_sum (unsigned int * a, unsigned int * b)
{
  unsigned int sub0 = a[0] - b[0];
  unsigned int sub1 = a[1] - b[1];
  unsigned int sub2 = a[2] - b[2];
  unsigned int sub3 = a[3] - b[3];
  unsigned int sum = sub0 + sub1;
  sum += sub2;
  sum += sub3;
  return sum;
}

Unfortunately, the code that reaches slp will look like:
  _1 = *a_10(D);
  _2 = *b_11(D);
  _3 = MEM[(unsigned int *)a_10(D) + 4B];
  _4 = MEM[(unsigned int *)b_11(D) + 4B];
  _5 = MEM[(unsigned int *)a_10(D) + 8B];
  _6 = MEM[(unsigned int *)b_11(D) + 8B];
  _7 = MEM[(unsigned int *)a_10(D) + 12B];
  _8 = MEM[(unsigned int *)b_11(D) + 12B];
  _28 = _1 - _2;
  _29 = _3 + _28;
  _30 = _29 - _4;
  _31 = _5 + _30;
  _32 = _31 - _6;
  _33 = _7 + _32;
  sum_18 = _33 - _8;
  return sum_18;

Which doesn't have the format expected as I described above... I am 
wondering how to teach it to support this. Maybe starting with your 
suggestion of making plus_expr and minus_expr have the same hash, so it 
groups all these statements together might be a start, but you'd still 
need to 'rebalance' the tree somehow I need to give this a bit more 
thought but I wanted to share what I have so far.


The code is severely lacking in comments for now btw...

Cheers,
Andre

diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c
new file mode 100644
index 
..66b53ff9bb1e77414e7493c07ab87d46f4d33651
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-1.c
@@ -0,0 +1,66 @@
+/* { dg-require-effective-target vect_int } */
+#include 
+#include "tree-vect.h"
+extern int abs (int);
+
+#define ABS4(N)\
+  sum += abs (a[N]);   \
+  sum += abs (a[N+1]); \
+  sum += abs (a[N+2]); \
+  sum += abs (a[N+3]);
+
+#define ABS8(N)  \
+  ABS4(N)\
+  ABS4(N+4)
+
+#define ABS16(N)  \
+  ABS8(N)\
+  ABS8(N+8)
+
+__attribute__ ((noipa)) unsigned char
+u8_single_abs_sum (signed char * a)
+{
+  unsigned char sum = 0;
+  ABS16(0)
+  return sum;
+}
+
+__attribute__ ((noipa)) unsigned short
+u16_single_abs_sum (signed short * a)
+{
+  unsigned short sum = 0;
+  ABS8(0)
+  return sum;
+}
+
+__attribute__ ((noipa)) unsigned int
+u32_single_abs_sum (signed int * a)
+{
+  unsigned int sum = 0;
+  ABS4(0)
+  return sum;
+}
+
+signed char u8[16] = {0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, -13,
+   -14, -15};
+signed short u16[8] = {0, 1, 2, 3, 4, -5, -6, -7};
+signed int u32[4] = {-10, -20, 30, 40};
+
+
+int main (void)
+{
+  check_vect ();
+
+  if (u8_single_abs_sum (&(u8[0])) != 120)
+abort ();
+
+  if (u16_single_abs_sum (&(u16[0])) != 28)
+abort ();
+
+  if (u32_single_abs_sum (&(u32[0])) != 100)
+abort ();
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-times "basic block vectorized" 3 "slp2" } } */
diff --git a/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c 
b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c
new file mode 100644
index 
..298a22cfef687f6634d61bf808a41942c3ce4a85
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/bb-slp-reduc-2.c
@@ -0,0 +1,82 @@
+/* { dg-require-effective-target vect_int } */
+#include 
+#include "tree-vect.h"
+extern int abs (int);
+
+#define ABS4(N)\
+  sum += abs (a[N]);   \
+  sum += abs (a[N+1]); \
+  sum += abs (a[N+2]); \
+  sum += abs (a[N+3]);
+
+#define ABS8(N)  \
+  ABS4(N)\
+  ABS4(N+4)
+
+#define ABS16(N)  \
+  ABS8(N)\
+  ABS8(N+8)
+
+__attribute__ ((noipa)) unsigned char
+u8_double_abs_sum (signed char * a)
+{
+  unsigned char sum = 0;
+  ABS16(0)
+  ABS16(16)
+  return sum;
+}
+
+__attribute__ ((noipa)) unsigned short
+u16_double_abs_sum (signed short * a)
+{
+  unsigned short sum = 0;
+  ABS16(0)
+  return sum;
+}
+
+__attribute__ ((noipa)) unsigned int
+u32_double_abs_sum (signed int * a)
+{
+  unsigned int sum = 0;
+  ABS8(0)
+  return sum;
+}
+
+__attribute__ ((noipa)) unsigned int
+u32_triple_abs_sum (signed int * a)
+{
+  unsigned int sum = 0;
+  ABS8(0)
+  ABS4(8)
+  return sum;
+}
+
+signed char u8[32] = {0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, -13,
+ -14, -15, 0, 1, 2, 3, 4, 5, 6, -7, -8, -9, -10, -11, -12, 
-13,
+ -14, -30};
+
+signed short u16[16] = {0, 1, 2, 3, 4, -5, -6, -7, 10, 20, 30, 40, -10, -20,
+  -30, -40};
+signed int u32[16] = {-10, -20, 30, 40, 100, 200, -300, -500, -600, -700, 1000,
+2000};
+
+i

Re: [PR95416] outputs.exp: skip lto tests when not using linker plugin

2020-06-09 Thread Rainer Orth
Hi Alexandre,

> When the linker plugin is not available, dump outputs in lto runs are
> different from those outputs.exp tests expect, so skip them for now.
>
> Regstrapped on x86_64-linux-gnu.  Ok to install?
>
>
> for  gcc/testsuite/ChangeLog
>
>   * outputs.exp (skip_lto): Set when missing the linker plugin.
> ---
>  testsuite/gcc.misc-tests/outputs.exp |3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git gcc/testsuite/gcc.misc-tests/outputs.exp 
> gcc/testsuite/gcc.misc-tests/outputs.exp
> index 06a32db..a47f8a6 100644
> --- gcc/testsuite/gcc.misc-tests/outputs.exp
> +++ gcc/testsuite/gcc.misc-tests/outputs.exp
> @@ -42,7 +42,8 @@ if ![check_no_compiler_messages gsplitdwarf object {
>  
>  # Check for -flto support.  We explicitly test the result to skip
>  # tests that use -flto.
> -set skip_lto ![check_effective_target_lto]
> +set skip_lto { ![check_effective_target_lto] \
> +|| ![check_linker_plugin_available] }

this is wrong unfortunately: braces are the Tcl equivalent of single
quotes so you're setting skip_lto to the string inside.  Instead you
need something like

set skip_lto [expr ![check_effective_target_lto] \
   || ![check_linker_plugin_available]]

While this worked for me on i386-pc-solaris2.11 (both with ld, thus
without the lto plugin, and with gld and the plugin), I wonder if
there's not a better way than just skipping the lto tests, i.e. account
for the difference in expected files between the cases with and without
the plugin.

Your call in the end (perhaps as a followup): everything is better than
the current bunch of failures.

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Re: [PATCH] rs6000/testsuite: Allow xxperm* instead of only vperm*

2020-06-09 Thread will schmidt via Gcc-patches
On Tue, 2020-06-09 at 01:00 +, Segher Boessenkool wrote:
> Some testcases failed (esp. with --with-cpu=power9) after my change
> to
> prefer xxperm over vperm when all else is equal.  Fix that.  (This
> also
> tightens the relevant REs somewhat).
> 
> Tested on way too many configurations.  Committed to trunk.
> 

Hi, 

A post-commit review .. 

(since I authored many of the fold-vec-* tests, i had to chime in)

this looks good to me, appreciate the effort. 

Thanks :-) 
-Will


> 
> Segher
> 
> 
> 2020-06-09  Segher Boessenkool  
> 
> gcc/testsuite/
>   * gcc.target/powerpc/fold-vec-perm-char.c: Allow both
> vperm/vpermr and
>   xxperm/xxpermr.
>   * gcc.target/powerpc/fold-vec-perm-double.c: Ditto.
>   * gcc.target/powerpc/fold-vec-perm-float.c: Ditto.
>   * gcc.target/powerpc/fold-vec-perm-int.c: Ditto.
>   * gcc.target/powerpc/fold-vec-perm-longlong.c: Ditto.
>   * gcc.target/powerpc/fold-vec-perm-pixel.c: Ditto.
>   * gcc.target/powerpc/fold-vec-perm-short.c: Ditto.
>   * gcc.target/powerpc/lvsl-lvsr.c: Ditto.
>   * gcc.target/powerpc/vec-mult-char-2.c: Ditto.
>   * gcc.target/powerpc/vsx-vector-6.p9.c: Also allow xxpermr.
> 
> ---
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c   | 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c | 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/fold-vec-perm-short.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/lvsl-lvsr.c  | 2 +-
>  gcc/testsuite/gcc.target/powerpc/vec-mult-char-2.c| 2 +-
>  gcc/testsuite/gcc.target/powerpc/vsx-vector-6.p9.c| 2 +-
>  10 files changed, 10 insertions(+), 10 deletions(-)
> 
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c
> index d907eae..56a89f3 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-char.c
> @@ -28,4 +28,4 @@ testuc (vector unsigned char vuc2, vector unsigned
> char vuc3,
>return vec_perm (vuc2, vuc3, vuc);
>  }
> 
> -/* { dg-final { scan-assembler-times "vperm" 3 } } */
> +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c
> index 7ceca9e..c982bc2 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-double.c
> @@ -14,4 +14,4 @@ testd (vector double vd2, vector double vd3, vector
> unsigned char vuc)
>return vec_perm (vd2, vd3, vuc);
>  }
> 
> -/* { dg-final { scan-assembler-times "vperm" 1 } } */
> +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c
> index c9cfb0d..64b8ac7 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-float.c
> @@ -13,4 +13,4 @@ testf (vector float vf2, vector float vf3, vector
> unsigned char vuc)
>return vec_perm (vf2, vf3, vuc);
>  }
> 
> -/* { dg-final { scan-assembler-times "vperm" 1 } } */
> +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 1 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c
> index a2fdc26..a6dd595 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-int.c
> @@ -28,4 +28,4 @@ testui (vector unsigned int vui2, vector unsigned
> int vui3,
>return vec_perm (vui2, vui3, vuc);
>  }
> 
> -/* { dg-final { scan-assembler-times "vperm" 3 } } */
> +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-
> longlong.c b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-
> longlong.c
> index 1333d88..3cc757d 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-longlong.c
> @@ -29,4 +29,4 @@ testul (vector unsigned long long vul2, vector
> unsigned long long vul3,
>return vec_perm (vul2, vul3, vuc);
>  }
> 
> -/* { dg-final { scan-assembler-times "vperm" 3 } } */
> +/* { dg-final { scan-assembler-times {\m(?:v|xx)permr?\M} 3 } } */
> diff --git a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c
> b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c
> index 0d3cb0a..54fccd2 100644
> --- a/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c
> +++ b/gcc/testsuite/gcc.target/powerpc/fold-vec-perm-pixel.c
> @@ -13,4 +1

Re: [RFC][vect] BB SLP reduction prototype

2020-06-09 Thread Andre Vieira (lists)
The 'you' here is Richi, which Richi is probably aware but maybe not the 
rest of the list :')


On 09/06/2020 15:29, Andre Vieira (lists) wrote:

Hi,

So this is my rework of the code you sent me, I have not included the 
'permute' code you included as I can't figure out what it is meant to 
be doing. Maybe something to look at later.


I have also included three tests that show it working for some simple 
cases and even a nested one.


Unfortunately it will not handle other simple cases where reassoc 
doesn't put the reduction in the form of :

sum0 = a + b;
sum1 = c + sum0;
...

For instance a testcase I have been looking at is:
unsigned int u32_single_abs_sum (unsigned int * a, unsigned int * b)
{
  unsigned int sub0 = a[0] - b[0];
  unsigned int sub1 = a[1] - b[1];
  unsigned int sub2 = a[2] - b[2];
  unsigned int sub3 = a[3] - b[3];
  unsigned int sum = sub0 + sub1;
  sum += sub2;
  sum += sub3;
  return sum;
}

Unfortunately, the code that reaches slp will look like:
  _1 = *a_10(D);
  _2 = *b_11(D);
  _3 = MEM[(unsigned int *)a_10(D) + 4B];
  _4 = MEM[(unsigned int *)b_11(D) + 4B];
  _5 = MEM[(unsigned int *)a_10(D) + 8B];
  _6 = MEM[(unsigned int *)b_11(D) + 8B];
  _7 = MEM[(unsigned int *)a_10(D) + 12B];
  _8 = MEM[(unsigned int *)b_11(D) + 12B];
  _28 = _1 - _2;
  _29 = _3 + _28;
  _30 = _29 - _4;
  _31 = _5 + _30;
  _32 = _31 - _6;
  _33 = _7 + _32;
  sum_18 = _33 - _8;
  return sum_18;

Which doesn't have the format expected as I described above... I am 
wondering how to teach it to support this. Maybe starting with your 
suggestion of making plus_expr and minus_expr have the same hash, so 
it groups all these statements together might be a start, but you'd 
still need to 'rebalance' the tree somehow I need to give this a 
bit more thought but I wanted to share what I have so far.


The code is severely lacking in comments for now btw...

Cheers,
Andre



Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Jeff Law via Gcc-patches
On Tue, 2020-06-09 at 09:17 +0200, Richard Biener wrote:
> On Mon, Jun 8, 2020 at 11:13 PM Jeff Law via Gcc-patches
>  wrote:
> > 
> > We're seeing some failures due to the local-alignment pass.  For example on 
> > sh4:
> > 
> > Tests that now fail, but worked before (16 tests):
> > 
> > gcc.dg/pr48335-1.c (test for excess errors)
> > gcc.dg/pr48335-2.c (test for excess errors)
> > gcc.dg/pr48335-5.c (test for excess errors)
> > gcc.dg/pr48335-6.c (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O0  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O0  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O1  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O1  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O2  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O2  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O2 -flto -fno-use-linker-plugin -flto-
> > partition=none  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O2 -flto -fno-use-linker-plugin -flto-
> > partition=none  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O3 -g  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -O3 -g  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -Os  (test for excess errors)
> > gcc.dg/torture/pr48493.c   -Os  (test for excess errors)
> > 
> > Or on x86_64:
> > 
> > during GIMPLE pass: adjust_alignment
> > /home/jenkins/linux/arch/x86/boot/string.c: In function ‘simple_strtoull’:
> > /home/jenkins/linux/arch/x86/boot/string.c:121:20: internal compiler error: 
> > in
> > execute, at adjust-alignment.c:74
> >   121 | unsigned long long simple_strtoull(const char *cp, char **endp, 
> > unsigned
> > int base)
> >   |^~~
> > 0x79c5b3 execute
> > ../../../../../gcc/gcc/adjust-alignment.c:74
> > Please submit a full bug report,
> > 
> > There may be others, I haven't searched the failing targets extensively for 
> > this
> > issue.
> > 
> > AFAICT the adjust-alignment pass is supposed to increase alignments of 
> > locals
> > when needed.  It has an assert to ensure that alignments are only increased.
> > 
> > However, if the object was declared with an alignment attribute that is 
> > larger
> > than what LOCAL_ALIGNMENT would produce for the object, then the 
> > adjust-alignment
> > pass trips the assert.
> > 
> > Rather than asserting, just ignoring such objects seems better.  But I'm not
> > intimately familiar with the issues here.
> > 
> > Bootstrapped and regression tested on x86_64, also verified this fixes the 
> > sh4
> > issue.  All the *-elf targets have also built/tested with this patch with no
> > regressions.
> > 
> > OK for the trunk?
> 
> While technically OK the issue is that it does not solve the x86 issue
> which with incoming stack alignment < 8 bytes does not perform
> dynamic re-alignment to align 'long long' variables appropriately.
> Currently the x86 backend thinks it can fixup by lowering alignment
> via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
> because the larger alignment is present in the IL for a long (previously
> RTL expansion, now adjust-aligment) time and your patch makes that
> wrong info last forever (or alternatively cause dynamic stack alignment
> which we do _not_ want to perform here).
I've never looked at the dynamic realignment stuff -- is there a good reason why
we don't dynamically realign when we've got a local with a requested alignment? 
Isn't that a huge oversight in the whole concept of dynamic realignment?


> 
> So I prefer to wait for a proper x86 solution before cutting the ICEs.
> (did you verify effects on code generation of your patch?)
I don't mind waiting.  And no, I didn't look at the codegen at all.

jeff



Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote:
> > While technically OK the issue is that it does not solve the x86 issue
> > which with incoming stack alignment < 8 bytes does not perform
> > dynamic re-alignment to align 'long long' variables appropriately.
> > Currently the x86 backend thinks it can fixup by lowering alignment
> > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
> > because the larger alignment is present in the IL for a long (previously
> > RTL expansion, now adjust-aligment) time and your patch makes that
> > wrong info last forever (or alternatively cause dynamic stack alignment
> > which we do _not_ want to perform here).
> I've never looked at the dynamic realignment stuff -- is there a good reason 
> why
> we don't dynamically realign when we've got a local with a requested 
> alignment? 
> Isn't that a huge oversight in the whole concept of dynamic realignment?

It is quite expensive and long long/double uses are pervasive, so we don't
want to realign just because of that.  If we do dynamic realignment for
other reasons, there is no reason not to align the long long/double
properly.

Jakub



Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Jeff Law via Gcc-patches
On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote:
> On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote:
> > > While technically OK the issue is that it does not solve the x86 issue
> > > which with incoming stack alignment < 8 bytes does not perform
> > > dynamic re-alignment to align 'long long' variables appropriately.
> > > Currently the x86 backend thinks it can fixup by lowering alignment
> > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
> > > because the larger alignment is present in the IL for a long (previously
> > > RTL expansion, now adjust-aligment) time and your patch makes that
> > > wrong info last forever (or alternatively cause dynamic stack alignment
> > > which we do _not_ want to perform here).
> > I've never looked at the dynamic realignment stuff -- is there a good 
> > reason why
> > we don't dynamically realign when we've got a local with a requested 
> > alignment? 
> > Isn't that a huge oversight in the whole concept of dynamic realignment?
> 
> It is quite expensive and long long/double uses are pervasive, so we don't
> want to realign just because of that.  If we do dynamic realignment for
> other reasons, there is no reason not to align the long long/double
Right, but if there's an explicit alignment from the user, don't we need to 
honor
that?

Jeff



Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Jakub Jelinek via Gcc-patches
On Tue, Jun 09, 2020 at 09:18:25AM -0600, Jeff Law wrote:
> On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote:
> > On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote:
> > > > While technically OK the issue is that it does not solve the x86 issue
> > > > which with incoming stack alignment < 8 bytes does not perform
> > > > dynamic re-alignment to align 'long long' variables appropriately.
> > > > Currently the x86 backend thinks it can fixup by lowering alignment
> > > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
> > > > because the larger alignment is present in the IL for a long (previously
> > > > RTL expansion, now adjust-aligment) time and your patch makes that
> > > > wrong info last forever (or alternatively cause dynamic stack alignment
> > > > which we do _not_ want to perform here).
> > > I've never looked at the dynamic realignment stuff -- is there a good 
> > > reason why
> > > we don't dynamically realign when we've got a local with a requested 
> > > alignment? 
> > > Isn't that a huge oversight in the whole concept of dynamic realignment?
> > 
> > It is quite expensive and long long/double uses are pervasive, so we don't
> > want to realign just because of that.  If we do dynamic realignment for
> > other reasons, there is no reason not to align the long long/double
> Right, but if there's an explicit alignment from the user, don't we need to 
> honor
> that?

Sure.  Do we ignore that?  For user alignment we have DECL_USER_ALIGN bit...

Jakub



Re: [RFA] Fix various regressions and kernel build failure due to adjust-alignment issue

2020-06-09 Thread Jeff Law via Gcc-patches
On Tue, 2020-06-09 at 17:26 +0200, Jakub Jelinek wrote:
> On Tue, Jun 09, 2020 at 09:18:25AM -0600, Jeff Law wrote:
> > On Tue, 2020-06-09 at 16:59 +0200, Jakub Jelinek wrote:
> > > On Tue, Jun 09, 2020 at 08:54:47AM -0600, Jeff Law via Gcc-patches wrote:
> > > > > While technically OK the issue is that it does not solve the x86 issue
> > > > > which with incoming stack alignment < 8 bytes does not perform
> > > > > dynamic re-alignment to align 'long long' variables appropriately.
> > > > > Currently the x86 backend thinks it can fixup by lowering alignment
> > > > > via LOCAL_DECL_ALIGNMENT but this is a latent wrong-code issue
> > > > > because the larger alignment is present in the IL for a long 
> > > > > (previously
> > > > > RTL expansion, now adjust-aligment) time and your patch makes that
> > > > > wrong info last forever (or alternatively cause dynamic stack 
> > > > > alignment
> > > > > which we do _not_ want to perform here).
> > > > I've never looked at the dynamic realignment stuff -- is there a good 
> > > > reason why
> > > > we don't dynamically realign when we've got a local with a requested 
> > > > alignment? 
> > > > Isn't that a huge oversight in the whole concept of dynamic realignment?
> > > 
> > > It is quite expensive and long long/double uses are pervasive, so we don't
> > > want to realign just because of that.  If we do dynamic realignment for
> > > other reasons, there is no reason not to align the long long/double
> > Right, but if there's an explicit alignment from the user, don't we need to 
> > honor
> > that?
> 
> Sure.  Do we ignore that?  For user alignment we have DECL_USER_ALIGN bit...
I suspect that's what's going on with the kernel build failure.  

Jeff



Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)

2020-06-09 Thread Martin Sebor via Gcc-patches

On 6/9/20 1:43 AM, Richard Biener wrote:

On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches
 wrote:


PR 95581 reports an ICE in a call to gimple_call_arg() with
an out-of-bounds argument number (2).  The attached patch avoids
the ICE but I'm not sure it's the right fix.  Other than verifying
the ICE is gone with a powerpc64 cross-compiler I couldn't come up
with a generic test cases to exercise this change.

The call, synthesized by the vectorizer, is

vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8);

but the function is declared by the powerpc64 back end to take two
arguments: long and const void*.  Is it correct for the builtin to
be called with fewer arguments than their type indicates?  (If it
isn't the patch shouldn't be necessary but some other fix would
be needed.)


I think the backend declaration is wrong, the function only takes
a void * argument and returns a long.


Thanks.  In his comment on the bug Segher (CC'd) points to
the internals manual that documents the function:

https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/target.def;h=07059a87caf7cc0237d583124b476ee45ea41ed5;hb=HEAD#l1744

(By the way, thanks for the pointer!)

If I read it right, ihe function f in
the TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD documentation is
__builtin_altivec_mask_for_load.  Although the manual isn't
unequivocal about this but it does suggest the address addr
the function is given as an argument should be the first
(and presumably only) argument.  This matches the call in
the IL where the first argument is a pointer, but not
the function's signature.

I think the middle end needs to be able to rely on built-in
function types when processing calls: i.e., the types and
numbers of actual arguments in the calls need to match those
of the built-in function type.  Otherwise it would have to
validate every call against the function signature and avoid
treating it as a built-in if it doesn't match.  There are
parts of the middle end that do that for library built-ins
(because they can be declared in a program with mismatched
signatures) but as we have seen it's error-prone.  I don't
think it would be helpful to try to extend this approach to
internal built-ins.

So I agree that the real problem is the declaration of
the built-in.

I have no issue with also committing the proposed patch
in the meantime, until the back end is fixed, if it helps.
But I'll leave that decision to the middle end maintainers.

Martin


Re: [PATCH] guard against calls with fewer arguments than the function type implies (PR 95581)

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 5:51 PM Martin Sebor  wrote:
>
> On 6/9/20 1:43 AM, Richard Biener wrote:
> > On Mon, Jun 8, 2020 at 11:59 PM Martin Sebor via Gcc-patches
> >  wrote:
> >>
> >> PR 95581 reports an ICE in a call to gimple_call_arg() with
> >> an out-of-bounds argument number (2).  The attached patch avoids
> >> the ICE but I'm not sure it's the right fix.  Other than verifying
> >> the ICE is gone with a powerpc64 cross-compiler I couldn't come up
> >> with a generic test cases to exercise this change.
> >>
> >> The call, synthesized by the vectorizer, is
> >>
> >> vect__5.6_24 = __builtin_altivec_mask_for_load (vectp_a.5_8);
> >>
> >> but the function is declared by the powerpc64 back end to take two
> >> arguments: long and const void*.  Is it correct for the builtin to
> >> be called with fewer arguments than their type indicates?  (If it
> >> isn't the patch shouldn't be necessary but some other fix would
> >> be needed.)
> >
> > I think the backend declaration is wrong, the function only takes
> > a void * argument and returns a long.
>
> Thanks.  In his comment on the bug Segher (CC'd) points to
> the internals manual that documents the function:
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=gcc/target.def;h=07059a87caf7cc0237d583124b476ee45ea41ed5;hb=HEAD#l1744
>
> (By the way, thanks for the pointer!)
>
> If I read it right, ihe function f in
> the TARGET_VECTORIZE_BUILTIN_MASK_FOR_LOAD documentation is
> __builtin_altivec_mask_for_load.  Although the manual isn't
> unequivocal about this but it does suggest the address addr
> the function is given as an argument should be the first
> (and presumably only) argument.  This matches the call in
> the IL where the first argument is a pointer, but not
> the function's signature.
>
> I think the middle end needs to be able to rely on built-in
> function types when processing calls: i.e., the types and
> numbers of actual arguments in the calls need to match those
> of the built-in function type.  Otherwise it would have to
> validate every call against the function signature and avoid
> treating it as a built-in if it doesn't match.  There are
> parts of the middle end that do that for library built-ins
> (because they can be declared in a program with mismatched
> signatures) but as we have seen it's error-prone.  I don't
> think it would be helpful to try to extend this approach to
> internal built-ins.
>
> So I agree that the real problem is the declaration of
> the built-in.
>
> I have no issue with also committing the proposed patch
> in the meantime, until the back end is fixed, if it helps.
> But I'll leave that decision to the middle end maintainers.

The backend needs to be fixed.

Richard.

> Martin


[pushed] c++: Tweak predeclare_vla.

2020-06-09 Thread Jason Merrill via Gcc-patches
We only need to predeclare a VLA type if it's wrapped in a pointer type;
otherwise gimplify_type_sizes will handle it.

Tested x86_64-pc-linux-gnu, applying to trunk.

gcc/cp/ChangeLog:

PR c++/95552
* cp-gimplify.c (predeclare_vla): Only predeclare a VLA if it's
wrapped in a pointer type.
---
 gcc/cp/cp-gimplify.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/gcc/cp/cp-gimplify.c b/gcc/cp/cp-gimplify.c
index e8fbc300fda..0e949e29c5c 100644
--- a/gcc/cp/cp-gimplify.c
+++ b/gcc/cp/cp-gimplify.c
@@ -1205,7 +1205,8 @@ predeclare_vla (tree expr)
return expr;
   vla = TREE_TYPE (vla);
 }
-  if (TYPE_NAME (vla) || !variably_modified_type_p (vla, NULL_TREE))
+  if (vla == type || TYPE_NAME (vla)
+  || !variably_modified_type_p (vla, NULL_TREE))
 return expr;
 
   tree decl = build_decl (input_location, TYPE_DECL, NULL_TREE, vla);

base-commit: 009668e31f4ee910eae874b24afb8eb6adf65fae
-- 
2.18.1



Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]

2020-06-09 Thread Jason Merrill via Gcc-patches

On 6/9/20 8:41 AM, Eric Botcazou wrote:

Yes, but the problem is that remap_decl isn't getting called.


Right, I can get it to be called by adding a pushdecl to grokdeclarator...


Attaching it to the BIND_EXPR doesn't help walk_tree_1 do the right
thing with the DECL_EXPR.


... but, indeed, this still ICEs.  So the key is that the DECL_EXPR_DECL of
the copied DECL_EXPR points to the remapped TYPE_DECL before the type is
copied?  If so, then your original patch is probably the way to go, but the
comment would need to be slightly adjusted.


Like this?


In Ada, where we attach the TYPE_DECL to the BIND_EXPR, this will mean that
remap_decl is invoked 3 times per TYPE_DECL: first twice from copy_bind_expr
and then once again for the DECL_EXPR.  But probably no big deal in the end.


Yes, we want to remap every occurrence of the decl so they all get replaced.

Jason

commit 7b7c1b07dc32cb3cb6dc9b97d516a7240c825cf9
Author: Jason Merrill 
Date:   Fri Jun 5 16:36:27 2020 -0400

tree-inline: Fix VLA handling [PR95552]

The problem in this testcase comes from cloning the constructor into
complete and base variants.  When we clone the body the first time,
walk_tree_1 calls copy_tree_body_r on the type of the artificial TYPE_DECL
we made for the VLA type without calling it on the decl itself, so we
overwrite the type of the TYPE_DECL without copying the decl first.

This has been broken since we started inserting a TYPE_DECL for anonymous
VLAs in r7-457.

This patch fixes walk_tree_1 to call the function on the TYPE_DECL, as we do
for other decls of a DECL_EXPR.

gcc/ChangeLog:

PR c++/95552
* tree.c (walk_tree_1): Call func on the TYPE_DECL of a DECL_EXPR.

gcc/testsuite/ChangeLog:

PR c++/95552
* g++.dg/ext/vla23.C: New test.

diff --git a/gcc/testsuite/g++.dg/ext/vla23.C b/gcc/testsuite/g++.dg/ext/vla23.C
new file mode 100644
index 000..317a824b2f3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/ext/vla23.C
@@ -0,0 +1,14 @@
+// PR c++/95552
+// Test for VLA and cloned constructor.
+// { dg-additional-options -Wno-vla }
+// { dg-require-effective-target alloca }
+
+struct VB { };
+struct ViewDom: virtual VB
+{
+  ViewDom(int i) { char (*a)[i]; }
+};
+void element( )
+{
+  ViewDom a(2);
+}
diff --git a/gcc/tree.c b/gcc/tree.c
index 7197b4720ce..805f669a945 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -12212,6 +12212,12 @@ walk_tree_1 (tree *tp, walk_tree_fn func, void *data,
 	 Note that DECLs get walked as part of processing the BIND_EXPR.  */
   if (TREE_CODE (DECL_EXPR_DECL (*tp)) == TYPE_DECL)
 	{
+	  /* Call the function for the decl so e.g. copy_tree_body_r can
+	 replace it with the remapped one.  */
+	  result = (*func) (&DECL_EXPR_DECL (*tp), &walk_subtrees, data);
+	  if (result || !walk_subtrees)
+	return result;
+
 	  tree *type_p = &TREE_TYPE (DECL_EXPR_DECL (*tp));
 	  if (TREE_CODE (*type_p) == ERROR_MARK)
 	return NULL_TREE;


Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare

2020-06-09 Thread Jonathan Wakely via Gcc-patches

On 09/06/20 00:02 +0100, Jonathan Wakely wrote:

On 08/06/20 22:08 +0100, Jonathan Wakely wrote:

On 08/06/20 19:20 +0100, Jonathan Wakely wrote:

On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote:

Hi

    Here is the last of my algo patches this time to extend the 
memcmp optimization to std::deque iterators and _GLIBCXX_DEBUG 
mode.


    To do so I had to return int in implementation details of 
lexicographical_compare to make sure we can treat a deque 
iterator range by chunk in a performant way.


Here's a simpler version, which doesn't alter anything for the
existing code (i.e. all iterators that aren't deque iterators) and
also fixes the infinite loop bug.

This seems simpler and less invasive.

Please take a look.


I've actually tested it in debug mode now, and ...


diff --git a/libstdc++-v3/include/debug/safe_iterator.tcc 
b/libstdc++-v3/include/debug/safe_iterator.tcc
index 888ac803ae5..db6a301655f 100644
--- a/libstdc++-v3/include/debug/safe_iterator.tcc
+++ b/libstdc++-v3/include/debug/safe_iterator.tcc
@@ -470,6 +470,80 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
 return __equal_aux1(__first1, __last1, __first2);
   }

+  template
+int


This should return bool here.


+__lexicographical_compare_aux(
+   const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __first1,
+   const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __last1,
+   _II2 __first2, _II2 __last2)
+{
+  typename ::__gnu_debug::_Distance_traits<_Ite1>::__type __dist1;
+  __glibcxx_check_valid_range2(__first1, __last1, __dist1);
+  __glibcxx_check_valid_range(__first2, __last2);
+
+  if (__dist1.second > ::__gnu_debug::__dp_equality)
+   return std::__lexicographical_compare_aux(__first1.base(),
+ __last1.base(),
+ __first2, __last2);
+  return std::__lexicographical_compare_aux1(__first1, __last1,
+__first2, __last2);
+}
+
+  template
+int


And here.


+__lexicographical_compare_aux(
+   _II1 __first1, _II1 __last1,
+   const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __first2,
+   const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __last2)
+{
+  __glibcxx_check_valid_range(__first1, __last1);
+  typename ::__gnu_debug::_Distance_traits<_II1>::__type __dist2;
+  __glibcxx_check_valid_range2(__first2, __last2, __dist2);
+
+  if (__dist2.second > ::__gnu_debug::__dp_equality)
+   return std::__lexicographical_compare_aux(__first1, __last1,
+ __first2.base(),
+ __last2.base());
+  return std::__lexicographical_compare_aux1(__first1, __last1,
+__first2, __last2);
+}
+
+  template
+int


And here.


And I've also enhanced the tests so they catch the bug I created where
the __lexicographical_compare_aux1 overload taking two ranges of deque
iterators was still trying to return a three-way result, rather than
bool.

Corrected patch attached.

But, the 25_algorithms/lexicographical_compare/deque_iterators/1.cc
test doesn't actually test the new code properly, because it only uses
deque which means memcmp isn't even used. We need better tests.




Re: libstdc++: Extend memcmp optimization in std::lexicographical_compare

2020-06-09 Thread Jonathan Wakely via Gcc-patches

On 09/06/20 17:11 +0100, Jonathan Wakely wrote:

On 09/06/20 00:02 +0100, Jonathan Wakely wrote:

On 08/06/20 22:08 +0100, Jonathan Wakely wrote:

On 08/06/20 19:20 +0100, Jonathan Wakely wrote:

On 05/06/20 22:24 +0200, François Dumont via Libstdc++ wrote:

Hi

    Here is the last of my algo patches this time to extend 
the memcmp optimization to std::deque iterators and 
_GLIBCXX_DEBUG mode.


    To do so I had to return int in implementation details of 
lexicographical_compare to make sure we can treat a deque 
iterator range by chunk in a performant way.


Here's a simpler version, which doesn't alter anything for the
existing code (i.e. all iterators that aren't deque iterators) and
also fixes the infinite loop bug.

This seems simpler and less invasive.

Please take a look.


I've actually tested it in debug mode now, and ...


diff --git a/libstdc++-v3/include/debug/safe_iterator.tcc 
b/libstdc++-v3/include/debug/safe_iterator.tcc
index 888ac803ae5..db6a301655f 100644
--- a/libstdc++-v3/include/debug/safe_iterator.tcc
+++ b/libstdc++-v3/include/debug/safe_iterator.tcc
@@ -470,6 +470,80 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
return __equal_aux1(__first1, __last1, __first2);
  }

+  template
+int


This should return bool here.


+__lexicographical_compare_aux(
+   const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __first1,
+   const ::__gnu_debug::_Safe_iterator<_Ite1, _Seq1, _Cat1>& __last1,
+   _II2 __first2, _II2 __last2)
+{
+  typename ::__gnu_debug::_Distance_traits<_Ite1>::__type __dist1;
+  __glibcxx_check_valid_range2(__first1, __last1, __dist1);
+  __glibcxx_check_valid_range(__first2, __last2);
+
+  if (__dist1.second > ::__gnu_debug::__dp_equality)
+   return std::__lexicographical_compare_aux(__first1.base(),
+ __last1.base(),
+ __first2, __last2);
+  return std::__lexicographical_compare_aux1(__first1, __last1,
+__first2, __last2);
+}
+
+  template
+int


And here.


+__lexicographical_compare_aux(
+   _II1 __first1, _II1 __last1,
+   const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __first2,
+   const ::__gnu_debug::_Safe_iterator<_Ite2, _Seq2, _Cat2>& __last2)
+{
+  __glibcxx_check_valid_range(__first1, __last1);
+  typename ::__gnu_debug::_Distance_traits<_II1>::__type __dist2;
+  __glibcxx_check_valid_range2(__first2, __last2, __dist2);
+
+  if (__dist2.second > ::__gnu_debug::__dp_equality)
+   return std::__lexicographical_compare_aux(__first1, __last1,
+ __first2.base(),
+ __last2.base());
+  return std::__lexicographical_compare_aux1(__first1, __last1,
+__first2, __last2);
+}
+
+  template
+int


And here.


And I've also enhanced the tests so they catch the bug I created where
the __lexicographical_compare_aux1 overload taking two ranges of deque
iterators was still trying to return a three-way result, rather than
bool.

Corrected patch attached.


*Really* attached this time.



But, the 25_algorithms/lexicographical_compare/deque_iterators/1.cc
test doesn't actually test the new code properly, because it only uses
deque which means memcmp isn't even used. We need better tests.





commit 4d501ba840194a28799cb83bcd8aaecd0eec5304
Author: Fran??ois Dumont 
Date:   Tue Jun 9 00:01:51 2020 +0100

libstdc++: Extend memcmp optimization in std::lexicographical_compare

Make the memcmp optimization work for std::deque iterators and safe
iterators.

Co-authored-by: Jonathan Wakely  

libstdc++-v3/ChangeLog:

2020-06-08  Fran??ois Dumont  
Jonathan Wakely  

* include/bits/deque.tcc (__lex_cmp_dit): New.
(__lexicographical_compare_aux1): Define overloads for deque
iterators.
* include/bits/stl_algobase.h (__lexicographical_compare::__3way):
New static member function.
(__lexicographical_compare::__3way): Likewise.
(__lexicographical_compare::__lc): Use __3way.
(__lexicographical_compare_aux): Rename to
__lexicographical_compare_aux1 and declare overloads for deque
iterators.
(__lexicographical_compare_aux): Define new forwarding function
that calls __lexicographical_compare_aux1 and declare new overloads
for safe iterators.
(lexicographical_compare): Do not use __niter_base on
parameters.
* include/debug/safe_iterator.tcc
(__lexicographical_compare_aux): Define overloads for safe
iterators.
* testsuite/25_algorithms/lexicographical_compare/1.cc: Add
checks with random access i

[PATCH] Remove dead code

2020-06-09 Thread Richard Biener
This removes dead code left over from the reduction vectorization
refactoring last year.

Bootstrapped on x86_64-unknown-linux-gnu, testing in progress.

Richard.

2020-06-09  Richard Biener  

* tree-vect-loop.c (vectorizable_induction): Remove dead code.
---
 gcc/tree-vect-loop.c | 42 --
 1 file changed, 42 deletions(-)

diff --git a/gcc/tree-vect-loop.c b/gcc/tree-vect-loop.c
index f2c52ae1909..5329982e4c9 100644
--- a/gcc/tree-vect-loop.c
+++ b/gcc/tree-vect-loop.c
@@ -7373,11 +7373,6 @@ vectorizable_induction (loop_vec_info loop_vinfo,
   unsigned i;
   tree expr;
   gimple_seq stmts;
-  imm_use_iterator imm_iter;
-  use_operand_p use_p;
-  gimple *exit_phi;
-  edge latch_e;
-  tree loop_arg;
   gimple_stmt_iterator si;
 
   gphi *phi = dyn_cast  (stmt_info->stmt);
@@ -7485,9 +7480,6 @@ vectorizable_induction (loop_vec_info loop_vinfo,
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location, "transform induction phi.\n");
 
-  latch_e = loop_latch_edge (iv_loop);
-  loop_arg = PHI_ARG_DEF_FROM_EDGE (phi, latch_e);
-
   step_expr = STMT_VINFO_LOOP_PHI_EVOLUTION_PART (stmt_info);
   gcc_assert (step_expr != NULL_TREE);
   tree step_vectype = get_same_sized_vectype (TREE_TYPE (step_expr), vectype);
@@ -7872,40 +7864,6 @@ vectorizable_induction (loop_vec_info loop_vinfo,
}
 }
 
-  if (nested_in_vect_loop)
-{
-  /* Find the loop-closed exit-phi of the induction, and record
- the final vector of induction results:  */
-  exit_phi = NULL;
-  FOR_EACH_IMM_USE_FAST (use_p, imm_iter, loop_arg)
-{
- gimple *use_stmt = USE_STMT (use_p);
- if (is_gimple_debug (use_stmt))
-   continue;
-
- if (!flow_bb_inside_loop_p (iv_loop, gimple_bb (use_stmt)))
-   {
- exit_phi = use_stmt;
- break;
-   }
-}
-  if (exit_phi)
-   {
- stmt_vec_info stmt_vinfo = loop_vinfo->lookup_stmt (exit_phi);
- /* FORNOW. Currently not supporting the case that an inner-loop 
induction
-is not used in the outer-loop (i.e. only outside the outer-loop).  
*/
- gcc_assert (STMT_VINFO_RELEVANT_P (stmt_vinfo)
- && !STMT_VINFO_LIVE_P (stmt_vinfo));
-
- STMT_VINFO_VEC_STMT (stmt_vinfo) = new_stmt_info;
- if (dump_enabled_p ())
-   dump_printf_loc (MSG_NOTE, vect_location,
-"vector of inductions after inner-loop:%G",
-new_stmt);
-   }
-}
-
-
   if (dump_enabled_p ())
 dump_printf_loc (MSG_NOTE, vect_location,
 "transform induction: created def-use cycle: %G%G",
-- 
2.25.1


Re: [PATCH] libsanitizer: use gnu++14

2020-06-09 Thread Richard Biener via Gcc-patches
On Tue, Jun 9, 2020 at 10:09 AM Martin Liška  wrote:
>
> On 6/8/20 4:53 PM, Martin Liška wrote:
> > Hi.
> >
> > Thank you for the report. It's caused by fact that LLVM switch in 
> > 4d474e078ac7
> > to c++14. So that I suggest to use gnu++14.
> >
> > Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> > I also verified that abidiff is equal for all libsanitizer shared libraries.
> > I'm going to install the patch if there are no objections.
> >
> > Thanks,
> > Martin
>
> Installed as 942a384ef9f.
> @Andreas: Can you please check the riscv64 build?

Note we need to document (and configure test?) the build
requirement for non-bootstrap and asan/ubsan bootstraps.

For now we only document the requirement of a C++11
host compiler.  Also not sure whether using -std=gnu++1y
would allow more released compilers to build the code?
For example GCC 4.8.5 knows -std=gnu++1y but not
-std=gnu++14 and GCC 4.8.3 is required for bootstrap anyway.

Richard.

> Martin


PING^1 [PATCH 0/2] x86: Add cmpmemsi for -minline-all-stringops

2020-06-09 Thread H.J. Lu via Gcc-patches
On Sun, May 31, 2020 at 4:10 PM H.J. Lu  wrote:
>
> We used to expand memcmp to "repz cmpsb" via cmpstrnsi.  It was changed
> by
>
> commit 9b0f6f5e511ca512e4faeabc81d2fd3abad9b02f
> Author: Nick Clifton 
> Date:   Fri Aug 12 16:26:11 2011 +
>
> builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi pattern.
>
> * builtins.c (expand_builtin_memcmp): Do not use cmpstrnsi
> pattern.
> * doc/md.texi (cmpstrn): Note that the comparison stops if both
> fetched bytes are zero.
> (cmpstr): Likewise.
> (cmpmem): Note that the comparison does not stop if both of the
> fetched bytes are zero.
>
> Duplicate the cmpstrn pattern for cmpmem and expand cmpmem to "repz cmpsb"
> for -minline-all-stringops.  The only difference is that the length
> argument of cmpmem is guaranteed to be less than or equal to lengths of
> 2 memory areas.  Since cmpmem expander may pass the actual string length
> directly to cmpstrnqi patterns.  Pass a copy of the string length to
> cmpstrnqi patterns to avoid changing the actual string length by cmpstrnqi
> patterns.
>
> Tested on Linux/x86 and Linux/x86-64.  OK for master?
>
> H.J. Lu (2):
>   x86: Pass a copy of the string length to cmpstrnqi
>   x86: Add cmpmemsi for -minline-all-stringops
>
>  gcc/config/i386/i386-expand.c |  84 ++
>  gcc/config/i386/i386-protos.h |   1 +
>  gcc/config/i386/i386.md   |  80 -
>  gcc/testsuite/gcc.target/i386/pr95151-1.c |  17 +++
>  gcc/testsuite/gcc.target/i386/pr95151-2.c |  10 ++
>  gcc/testsuite/gcc.target/i386/pr95151-3.c |  18 +++
>  gcc/testsuite/gcc.target/i386/pr95151-4.c |  11 ++
>  gcc/testsuite/gcc.target/i386/pr95443-1.c | 130 ++
>  gcc/testsuite/gcc.target/i386/pr95443-2.c |  79 +
>  9 files changed, 371 insertions(+), 59 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-2.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-3.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95151-4.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95443-1.c
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr95443-2.c
>

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546920.html
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546921.html


-- 
H.J.


PING^2: [PATCH] x86: Add UNSPECV_PATCHABLE_AREA

2020-06-09 Thread H.J. Lu via Gcc-patches
On Fri, May 22, 2020 at 4:22 AM H.J. Lu  wrote:
>
> On Sat, May 2, 2020 at 4:55 AM H.J. Lu  wrote:
> >
> > Currently patchable area is at the wrong place.  It is placed immediately
> > after function label, before both .cfi_startproc and ENDBR.  This patch
> > adds UNSPECV_PATCHABLE_AREA for pseudo patchable area instruction and
> > changes ENDBR insertion pass to also insert patchable area instruction.
> > TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY is defined to avoid placing
> > patchable area before .cfi_startproc and ENDBR.
> >
> > OK for master?
> >
> > Thanks.
> >
> > H.J.
> > ---
> > gcc/
> >
> > PR target/93492
> > * config/i386/i386-features.c (rest_of_insert_endbranch):
> > Renamed to ...
> > (rest_of_insert_endbr_and_patchable_area): Change return type
> > to void. Add need_endbr and patchable_area_size arguments.
> > Don't call timevar_push nor timevar_pop.  Replace
> > endbr_queued_at_entrance with insn_queued_at_entrance.  Insert
> > UNSPECV_PATCHABLE_AREA for patchable area.
> > (pass_data_insert_endbranch): Renamed to ...
> > (pass_data_insert_endbr_and_patchable_area): This.  Change
> > pass name to endbr_and_patchable_area.
> > (pass_insert_endbranch): Renamed to ...
> > (pass_insert_endbr_and_patchable_area): This.  Add need_endbr
> > and patchable_area_size;.
> > (pass_insert_endbr_and_patchable_area::gate): Set and check
> > need_endbr and patchable_area_size.
> > (pass_insert_endbr_and_patchable_area::execute): Call
> > timevar_push and timevar_pop.  Pass need_endbr and
> > patchable_area_size to rest_of_insert_endbr_and_patchable_area.
> > (make_pass_insert_endbranch): Renamed to ...
> > (make_pass_insert_endbr_and_patchable_area): This.
> > * config/i386/i386-passes.def: Replace pass_insert_endbranch
> > with pass_insert_endbr_and_patchable_area.
> > * config/i386/i386-protos.h (ix86_output_patchable_area): New.
> > (make_pass_insert_endbranch): Renamed to ...
> > (make_pass_insert_endbr_and_patchable_area): This.
> > * config/i386/i386.c (ix86_asm_output_function_label): Set
> > function_label_emitted to true.
> > (ix86_print_patchable_function_entry): New function.
> > (ix86_output_patchable_area): Likewise.
> > (x86_function_profiler): Replace endbr_queued_at_entrance with
> > insn_queued_at_entrance.  Generate ENDBR only for TYPE_ENDBR.
> > Call ix86_output_patchable_area to generate patchable area if
> > needed.
> > (TARGET_ASM_PRINT_PATCHABLE_FUNCTION_ENTRY): New.
> > * i386.h (queued_insn_type): New.
> > (machine_function): Add function_label_emitted.  Replace
> > endbr_queued_at_entrance with insn_queued_at_entrance.
> > * config/i386/i386.md (UNSPECV_PATCHABLE_AREA): New.
> > (patchable_area): New.
> >
> > gcc/testsuite/
> >
> > PR target/93492
> > * gcc.target/i386/pr93492-1.c: New test.
> > * gcc.target/i386/pr93492-2.c: Likewise.
> > * gcc.target/i386/pr93492-3.c: Likewise.
> > * gcc.target/i386/pr93492-4.c: Likewise.
> > * gcc.target/i386/pr93492-5.c: Likewise.
>
> PING:
>
> https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545021.html

PING.

-- 
H.J.


PING^1: V5 [PATCH] x86: Move cpuinfo.h from libgcc to common/config/i386

2020-06-09 Thread H.J. Lu via Gcc-patches
On Tue, May 26, 2020 at 6:27 AM Martin Liška  wrote:
>
> On 5/26/20 1:59 PM, H.J. Lu wrote:
> > On Tue, May 26, 2020 at 2:30 AM Martin Liška  wrote:
> >>
> >> On 5/25/20 7:42 PM, H.J. Lu wrote:
> >>> Here is the updated patch.  OK for master?
> >>
> >> Thank you for the updated patch.
> >>
> >> I have still few nits:
> >>
> >> 1) I would make all the:
> >>
> >>> +  has_sse3 = has_feature (FEATURE_SSE3);
> >>
> >> a macro. The local variable seems to superfluous.
> >
> > Done.
>
> Thanks!
>
> >
> >> 2) can we automatically deduce option name:
> >>
> >>> +  ISA_NAMES_TABLE_ENTRY("rdpid", FEATURE_RDPID, P_ZERO, "-mrdpid")
> >>> +  ISA_NAMES_TABLE_ENTRY("rdrnd", FEATURE_RDRND, P_ZERO, "-mrdrnd")
> >>
> >> I mean "-m" + "rdrnd" == "-mrdrnd" ?
> >
> > The new option field serves 2 purposes:
> >
> > 1. Not all features have a corresponding command-line option
> >
> > ISA_NAMES_TABLE_ENTRY("cmov", FEATURE_CMOV, P_ZERO, NULL)
> >
> >   for (i = 0; i < ARRAY_SIZE (isa_names_table); i++)
> >  if (isa_names_table[i].option)
> >
> > 2. Some feature has a different name in the command-line option.
> >
> >ISA_NAMES_TABLE_ENTRY("fxsave", FEATURE_FXSAVE, P_ZERO, "-mfxsr")
>
> I noticed that, one can theoretically use "" for an option that does not
> have a flag. And NULL for these which have option equal to "-m" + name.
> Anyway, that's a nit.
>
> I support the patch!
> Martin
>
> >
> > Here is the updated patch.   OK for master?
> >
> > Thanks.
> >
>

PING:

https://gcc.gnu.org/pipermail/gcc-patches/2020-May/546522.html

-- 
H.J.


Ping: [PATCH] diagnostics: Add options to control the column units [PR49973] [PR86904]

2020-06-09 Thread Lewis Hyatt via Gcc-patches
May I please ping this patch?
https://gcc.gnu.org/pipermail/gcc-patches/2020-May/545426.html

Thanks!

-Lewis

On Fri, May 08, 2020 at 03:35:25PM -0400, Lewis Hyatt wrote:
> On Fri, Jan 31, 2020 at 03:31:59PM -0500, David Malcolm wrote:
> > On Fri, 2020-01-31 at 14:31 -0500, Lewis Hyatt wrote:
> > > Hello-
> > > 
> > > Here is the second patch that I mentioned when I submitted the other
> > > related
> > > patch (which is awaiting review):
> > > https://gcc.gnu.org/ml/gcc-patches/2020-01/msg01626.html. 
> > 
> > Sorry about that; I'm v. busy with analyzer bugs right now.
> > 
> > > This second patch
> > > is based on top of the first one and it closes out PR49973 and
> > > PR86904 by
> > > adding the new option -fdiagnostics-column-unit=[display|byte]. This
> > > allows
> > > to specify whether columns are output as simple byte counts (the
> > > current
> > > behavior), or as display columns including handling multibyte
> > > characters and
> > > tabs. The patch makes display columns the new default. Additionally,
> > > a
> > > second new option -fdiagnostics-column-origin is added, which allows
> > > to make
> > > the column 0-based (or N-based for any N) instead of 1-based. The
> > > default
> > > remains at 1-based as it is now.
> > > 
> > > A number of testcases were explicitly testing for the old behavior,
> > > so I
> > > have updated them to test for the new behavior instead, since the
> > > column
> > > number adjusted for tabs is more natural to test for, and matches
> > > what
> > > editors typically show (give or take 1 for the origin convention).
> > > 
> > > One other testcase (go.dg/arrayclear.go) was a bit of an oddity. It
> > > failed
> > > after this patch, although it doesn't test for any column numbers.
> > > The
> > > answer turned out to be, this test checks for identical error text on
> > > two
> > > different lines. When the column units are changed to display
> > > columns, then
> > > the column of the second error happens to match the line of the first
> > > one. dejagnu then misinterprets the second error as if it matched the
> > > location of the first one (it doesn't distinguish whether it checks
> > > for the
> > > line number or the column number in the output). I added a comment to
> > > the
> > > test explaining the situation; since adding the comment has the side
> > > effect
> > > of making the first line number no longer match the second column
> > > number, it
> > > also makes the test pass again.
> > > 
> > > It wasn't quite clear to me whether this change was appropriate for
> > > GCC 10
> > > or not at this point. We discussed it a couple months ago here:
> > > https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02171.html. Either way,
> > > I hope
> > > it isn't a problem that I submitted the patch for review now, whether
> > > it
> > > will end up in 10 or 11. Please let me know what's normally expected?
> > > Thanks!
> > 
> > Thanks Lewis.
> > 
> > This patch looks very promising, but should wait until gcc 11; we're
> > trying to stabilize gcc 10 right now (I'm knee-deep in analyzer bug-
> > fixing, so I don't want to add any more diagnostics changes).
> >
> 
> Hi Dave-
> 
> Well GCC 10 was released for a whole day so I thought I would bug you with 
> this
> patch again now :). To summarize, I previously sent this in two separate 
> parts.
> 
> Part 1: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg01626.html
> Part 2: https://gcc.gnu.org/legacy-ml/gcc-patches/2020-01/msg02108.html
> 
> Part 1 added the support for converting tabs to spaces when outputting
> diagnostics. Part 2 added the new options -fdiagnostics-column-unit and
> -fdiagnostics-column-origin to control whether the column number is printed
> in display or byte units. Together they resolve both PR49973 and PR86904.
> 
> You provided me with feedback on part 2, which is quoted below with some
> notes interspersed. The new version of the patch incorporates all of your
> suggestions. Part 1 has not changed other than some trivial rebasing
> conflicts. The two patches touch nearly disjoint sets of files and are
> logically linked together, so I thought it would be simpler if I just sent
> one combined patch now. If you prefer them to be separated as before, please
> let me know and I can send them that way as well.
> 
> Bootstrap and reg tests were done on x86-64 Linux for all languages.  Tests
> look good:
> 
> type, before, after
> FAIL 96 96
> PASS 474637 475097
> UNSUPPORTED 11607 11607
> UNTESTED 195 195
> XFAIL 1816 1816
> XPASS 36 36
> 
> > 
> > > gcc/ChangeLog:
> > > 
> > > 2020-01-31  Lewis Hyatt  
> > >
> > 
> > Please reference the PRs here
> > 
> > [...]
> > 
> > > gcc/testsuite/ChangeLog:
> > > 
> > > 2020-01-31  Lewis Hyatt  
> > 
> > Likewise here.
> > 
> > [...]
> >
> 
> Done.
> 
> > > diff --git a/gcc/common.opt b/gcc/common.opt
> > > index 630c380bd6a..657985450c2 100644
> > > --- a/gcc/common.opt
> > > +++ b/gcc/common.opt
> > > @@ -1309,6 +1309,14 @@ Enum(diagnostic_url_rule) String(a

[committed] d: Merge upstream dmd 13d67c575.

2020-06-09 Thread Iain Buclaw via Gcc-patches
Hi,

This patch merges the D front-end implementation with upstream dmd
13d67c575.  Contains many small API changes, most contained within the
front-end, but some spill out into the codegen.

Bootstrapped and regression tested on x86_64-linux-gnu, and committed to
mainline.

Regards
Iain.


gcc/d/ChangeLog:

* dmd/MERGE: Merge upstream dmd 13d67c575.
* d-builtins.cc (build_frontend_type): Update call to
TypeVector::create.
* d-frontend.cc (Global::_init): Move setting of errorLimit to ...
* d-lang.cc (d_init_options): ... here.  Update for new field
location of errorLimit.
(d_post_options): Likewise.
* d-port.cc (Port::readwordLE): Update signature.
(Port::readwordBE): Likewise.
(Port::readlongLE): Likewise.
(Port::readlongBE): Likewise.
* decl.cc (get_symbol_decl): Update for new field types.
---
 gcc/d/d-builtins.cc |   2 +-
 gcc/d/d-frontend.cc |   1 -
 gcc/d/d-lang.cc |   3 +-
 gcc/d/d-port.cc |  16 +-
 gcc/d/decl.cc   |   8 +-
 gcc/d/dmd/MERGE |   2 +-
 gcc/d/dmd/access.c  |   2 +
 gcc/d/dmd/aggregate.h   |   1 +
 gcc/d/dmd/ast_node.h|  20 ++
 gcc/d/dmd/cond.h|   5 +-
 gcc/d/dmd/ctfeexpr.c|   4 +-
 gcc/d/dmd/dclass.c  |  10 +-
 gcc/d/dmd/declaration.h |   2 +-
 gcc/d/dmd/dmangle.c |   8 +-
 gcc/d/dmd/dsymbol.h |   6 +-
 gcc/d/dmd/expression.c  | 518 
 gcc/d/dmd/expression.h  | 109 -
 gcc/d/dmd/globals.h |   5 +-
 gcc/d/dmd/init.h|   6 +-
 gcc/d/dmd/mtype.c   | 104 +++-
 gcc/d/dmd/mtype.h   |  32 ++-
 gcc/d/dmd/root/port.h   |   8 +-
 gcc/d/dmd/statement.h   |   5 +-
 gcc/d/dmd/template.h|   4 +-
 gcc/d/dmd/utf.c |  19 +-
 25 files changed, 843 insertions(+), 57 deletions(-)
 create mode 100644 gcc/d/dmd/ast_node.h

diff --git a/gcc/d/d-builtins.cc b/gcc/d/d-builtins.cc
index 33221ea3229..91e3173e670 100644
--- a/gcc/d/d-builtins.cc
+++ b/gcc/d/d-builtins.cc
@@ -211,7 +211,7 @@ build_frontend_type (tree type)
   if (dtype->nextOf ()->isTypeBasic () == NULL)
break;
 
-  dtype = (TypeVector::create (Loc (), dtype))->addMod (mod);
+  dtype = (TypeVector::create (dtype))->addMod (mod);
   builtin_converted_decls.safe_push (builtin_data (dtype, type));
   return dtype;
 }
diff --git a/gcc/d/d-frontend.cc b/gcc/d/d-frontend.cc
index 90cf74a0f61..5415d471ef4 100644
--- a/gcc/d/d-frontend.cc
+++ b/gcc/d/d-frontend.cc
@@ -55,7 +55,6 @@ Global::_init (void)
 ;
 
   this->stdmsg = stderr;
-  this->errorLimit = flag_max_errors;
 }
 
 /* Start gagging. Return the current number of gagged errors.  */
diff --git a/gcc/d/d-lang.cc b/gcc/d/d-lang.cc
index 2bc0def02c3..badd67f5a8f 100644
--- a/gcc/d/d-lang.cc
+++ b/gcc/d/d-lang.cc
@@ -293,6 +293,7 @@ d_init_options (unsigned int, cl_decoded_option 
*decoded_options)
   global.params.hdrStripPlainFunctions = true;
   global.params.betterC = false;
   global.params.allInst = false;
+  global.params.errorLimit = flag_max_errors;
 
   /* Default extern(C++) mangling to C++14.  */
   global.params.cplusplus = CppStdRevisionCpp14;
@@ -793,7 +794,7 @@ d_post_options (const char ** fn)
 
   /* Make -fmax-errors visible to frontend's diagnostic machinery.  */
   if (global_options_set.x_flag_max_errors)
-global.errorLimit = flag_max_errors;
+global.params.errorLimit = flag_max_errors;
 
   if (flag_excess_precision == EXCESS_PRECISION_DEFAULT)
 flag_excess_precision = EXCESS_PRECISION_STANDARD;
diff --git a/gcc/d/d-port.cc b/gcc/d/d-port.cc
index f129fd8843d..d49bb1ba8f3 100644
--- a/gcc/d/d-port.cc
+++ b/gcc/d/d-port.cc
@@ -97,9 +97,9 @@ Port::isFloat64LiteralOutOfRange (const char *buffer)
 /* Fetch a little-endian 16-bit value from BUFFER.  */
 
 unsigned
-Port::readwordLE (void *buffer)
+Port::readwordLE (const void *buffer)
 {
-  unsigned char *p = (unsigned char*) buffer;
+  const unsigned char *p = (const unsigned char*) buffer;
 
   return ((unsigned) p[1] << 8) | (unsigned) p[0];
 }
@@ -107,9 +107,9 @@ Port::readwordLE (void *buffer)
 /* Fetch a big-endian 16-bit value from BUFFER.  */
 
 unsigned
-Port::readwordBE (void *buffer)
+Port::readwordBE (const void *buffer)
 {
-  unsigned char *p = (unsigned char*) buffer;
+  const unsigned char *p = (const unsigned char*) buffer;
 
   return ((unsigned) p[0] << 8) | (unsigned) p[1];
 }
@@ -117,9 +117,9 @@ Port::readwordBE (void *buffer)
 /* Fetch a little-endian 32-bit value from BUFFER.  */
 
 unsigned
-Port::readlongLE (void *buffer)
+Port::readlongLE (const void *buffer)
 {
-  unsigned char *p = (unsigned char*) buffer;
+  const unsigned char *p = (const unsigned char*) buffer;
 
   return (((unsigned) p[3] << 24)
  | ((unsigned) p[2] << 16)
@@ -130,9 +130,9 @@ Port::readlongLE (void *buffer)
 /* Fetch a big-endian 32-bit value from BUFFER.  */
 
 unsigned
-Port::readlongBE (void *buffer)
+Port::readlongBE (const void *

Re: [PATCH] fortran/95509 - fix spellcheck-operator.f90 regression

2020-06-09 Thread Thomas Koenig via Gcc-patches

Hi Bernhard,


For Fortran identifiers, that would be really good.  Would you do that?

Is it really needed?

We do not enter internal helper names (class names, anything
artificial, anything leading underscore or leading uppercase) to the
suggestions proposals so should never actually recommend any of those?


You're quite right, this is not needed.

Thanks for pointing this out :-)

Regards

Thomas


Re: [PATCH RFA] tree-inline: Fix VLA handling [PR95552]

2020-06-09 Thread Eric Botcazou
> Like this?

Fine with me, thanks!

-- 
Eric Botcazou


  1   2   >