Re: [PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-18 Thread Segher Boessenkool
On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote:
> Hi:
>   This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> power of 2 and D mod C == 0.
>   bootstrap and make check is ok.

Why would this be a good idea?  It is not reducing the number of
operators or similar?


Segher


Re: [Patch, Fortran] PR92896 [10 Regression] Fix - Prevent character conversion in array constructor

2019-12-18 Thread Mark Eggleston



On 17/12/2019 17:47, Steve Kargl wrote:

On Tue, Dec 17, 2019 at 05:28:05PM +, Mark Eggleston wrote:

On 17/12/2019 17:06, Steve Kargl wrote:

On Tue, Dec 17, 2019 at 03:41:41PM +, Mark Eggleston wrote:

gcc/fortran/ChangeLog

       Mark Eggleston  

       PR fortran/92896
       * array.c (walk_array_constructor): Replace call to cfg_convert_type

s/cfg_convert_type/gfc_convert_type


       with call to gfc_convert_type_warn with new argument set to true.
       (check_element_type): Replace call to cfg_convert_type with call to
       gfc_convert_type_warn with new argument set to true.
       * gfortran.h: Add argument "array" to gfc_convert_type_warn default
       value set to false.

Do all current uses of gfc_convert_type_warn need to be updated
to account for the new parameter?  That is, doesn't this introduce
a mismatch in the prototype and existing code?

I used a default value so all existing calls remain as they are and
default to false. So no mismatch.


% cat a.h
#ifndef _STDBOOL_H_
#include <stdbool.h>
#endif
float foo(int, float, bool tmp = false);
% cat a.c
#include "a.h"
void
bar(float x)
{
   int n;
   n = 1;
   x = foo(n, x);
}
% /usr/home/sgk/work/x/bin/gcc -Wall -c a.c
In file included from a.c:2:
a.h:1:32: error: expected ';', ',' or ')' before '=' token
 1 | float foo(int, float, bool tmp = false);
   |^
a.c: In function 'bar':
a.c:8:7: warning: implicit declaration of function 'foo' 
[-Wimplicit-function-declaration]
 8 |   x = foo(n, x);
   |   ^~~

That's compiled with the C compiler.  As I understand it, the Fortran FE
is written in C++ and compiled using the C++ compiler, even though the
file extensions are .c.
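
A minimal sketch of the point, assuming the header is compiled as C++: the declaration that the C compiler rejected above is valid C++, bool needs no header, and existing two-argument calls keep working.  The body of foo is invented purely for illustration and is not gfc_convert_type_warn.

```cpp
// Hypothetical stand-in for a function that gains a defaulted parameter.
// In C++, bool is a builtin type and default arguments are allowed.
float foo(int n, float x, bool tmp = false) {
  // Illustrative body only: the flag selects between two computations.
  return tmp ? x * static_cast<float>(n) : x + static_cast<float>(n);
}

// An existing caller that passes two arguments still compiles;
// tmp silently takes its default value, false.
float bar(float x) {
  int n = 1;
  return foo(n, x);
}
```

Callers that need the new behaviour pass the third argument explicitly, e.g. foo(n, x, true).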


regards,

Mark

--
https://www.codethink.co.uk/privacy.html



Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao
On 2019-12-17 18:47 +0100, Jan Hubicka wrote:
> > Would it be equivalent to:
> > 1) output foo_v2 local
> > 2) producing static alias with local name (.L1)
> > 3) do .symver .L1,foo@@@VERS_2
> > That is somewhat more systematic and would not lead to false
> > visibilities.
> 
> I spent some time playing with this.  In order to
> 1) be able to handle foo_v2 according to the resolution info
>    (so it behaves like a regular symbol and can be called directly,
>    localized and optimized)
> 2) get intended objdump -T relocations
> 3) not pollute global symbol tables
> 
> I ended up with the following codegen:
> 
>   .type   foo_v2, @function
> foo_v2:
> .LFB1:
>   .cfi_startproc
>   movl$2, %eax
>   ret
>   .cfi_endproc
> .LFE1:
>   .size   foo_v2, .-foo_v2
>   .globl  .LSYMVER0
>   .set.LSYMVER0,foo_v2
>   .symver .LSYMVER0, foo@@@VERS_2
> 
> This uses the @@@ symver variant of gas, which seems to have the odd
> semantics of requiring a global symbol name, which it then takes away
> and produces foo@@VERS_2.
> 
> So the nm output of the ltrans unit is:
>  T foo_v1
> 0010 t foo_v2
>  T foo@VERS_1
> 0010 T foo@@VERS_2
> 
> So the difference to your patch is that foo_v2 is static which enables
> normal optimizations.
> 
> Since additional symbol alias is produced this would also make it
> possible to attach multiple symver attributes with @@ string.
> 
> Does something like this make sense to you? Modulo the obvious buffer
> overflow issue?
> Honza

Unfortunately, I got an ICE with my testcase with the patch applied to trunk.

lto1: internal compiler error: tree check: expected tree that contains ‘decl
minimal’ structure, have ‘identifier_node’ in do_assemble_symver, at
varasm.c:5986
0x6fa648 tree_contains_struct_check_failed(tree_node const*,
tree_node_structure_enum, char const*, int, char const*)
../../gcc/gcc/tree.c:9859
0x71466e contains_struct_check(tree_node*, tree_node_structure_enum, char
const*, int, char const*)
../../gcc/gcc/tree.h:3387
0x71466e do_assemble_symver(tree_node*, tree_node*)
../../gcc/gcc/varasm.c:5986
0x89e409 cgraph_node::assemble_thunks_and_aliases()
../../gcc/gcc/cgraphunit.c:2225
0x89e698 cgraph_node::expand()
../../gcc/gcc/cgraphunit.c:2351
0x89f62f expand_all_functions
../../gcc/gcc/cgraphunit.c:2456
0x89f62f symbol_table::compile()
../../gcc/gcc/cgraphunit.c:2806
0x7fb589 lto_main()
../../gcc/gcc/lto/lto.c:658
Please submit a full bug report,
with preprocessed source if appropriate.
Please include the complete backtrace with any bug report.
See  for instructions.
lto-wrapper: fatal error: /home/xry111/gcc-test/bin/gcc returned 1 exit status
compilation terminated.
/usr/bin/ld: error: lto-wrapper failed
collect2: error: ld returned 1 exit status
make: *** [Makefile:4: obj/test.so] Error 1

The change to lto/lto-common.c makes sense.  I tried it instead of my change to
cgraph.h and everything is OK.  I'll investigate the change to varasm.c a
little.
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



Re: [Patch, Fortran] PR92896 [10 Regression] Fix - Prevent character conversion in array constructor

2019-12-18 Thread Mark Eggleston



On 17/12/2019 18:04, Janne Blomqvist wrote:

On Tue, Dec 17, 2019 at 7:47 PM Steve Kargl
 wrote:

On Tue, Dec 17, 2019 at 05:28:05PM +, Mark Eggleston wrote:

On 17/12/2019 17:06, Steve Kargl wrote:

On Tue, Dec 17, 2019 at 03:41:41PM +, Mark Eggleston wrote:

gcc/fortran/ChangeLog

   Mark Eggleston  

   PR fortran/92896
   * array.c (walk_array_constructor): Replace call to cfg_convert_type

s/cfg_convert_type/gfc_convert_type


   with call to gfc_convert_type_warn with new argument set to true.
   (check_element_type): Replace call to cfg_convert_type with call to
   gfc_convert_type_warn with new argument set to true.
   * gfortran.h: Add argument "array" to gfc_convert_type_warn default
   value set to false.

Do all current uses of gfc_convert_type_warn need to be updated
to account for the new parameter?  That is, doesn't this introduce
a mismatch in the prototype and existing code?

I used a default value so all existing calls remain as they are and
default to false. So no mismatch.


% cat a.h
#ifndef _STDBOOL_H_
#include <stdbool.h>
#endif
float foo(int, float, bool tmp = false);
% cat a.c
#include "a.h"
void
bar(float x)
{
   int n;
   n = 1;
   x = foo(n, x);
}
% /usr/home/sgk/work/x/bin/gcc -Wall -c a.c
In file included from a.c:2:
a.h:1:32: error: expected ';', ',' or ')' before '=' token
 1 | float foo(int, float, bool tmp = false);
   |^
a.c: In function 'bar':
a.c:8:7: warning: implicit declaration of function 'foo' 
[-Wimplicit-function-declaration]
 8 |   x = foo(n, x);
   |   ^~~

Well, frontends are nowadays C++, so

a) No need to include stdbool.h, bool is a builtin type.

b) optional arguments are a thing (they are also used elsewhere in the
Fortran frontend).

It is a bit confusing that the Fortran FE source files have the .c 
extension implying C when they are C++ and are compiled using C++.


After that aside back to the original question, OK to commit?

regards,

Mark

--
https://www.codethink.co.uk/privacy.html



Re: [PATCH] Some compute_objsize/gimple_call_alloc_size/maybe_warn_overflow cleanups (PR tree-optimization/92868)

2019-12-18 Thread Jakub Jelinek
On Tue, Dec 17, 2019 at 09:53:43AM -0700, Martin Sebor wrote:
> I appreciate a cleanup but I don't have the impression this patch
> does clean anything up.  Because of all the formatting changes and
> no tests the effect of the changes isn't as clear as it should be.
> (I wish you would resist the urge to reformat existing code on this
> scale while also making changes with an observable effect in
> the same diff.)  But thanks to the detailed explanation above
> I think I can safely say that the builtins.c changes are not in
> line with what I would like to see.

As written, there were two real changes in gimple_call_alloc_size,
one in maybe_warn_overflow and the rest formatting fixes (which I really
can't ignore, e.g. semicolon after if (...) { ... }; ?).
From the above, it seems you are talking about just one change in
gimple_call_alloc_size (the setting of rng1[0] to 0 on overflow).

So, let's talk first about the first real change in gimple_call_alloc_size.
Without it, you get garbage on testcase like:
/* { dg-do compile { target lp64 } } */
/* { dg-options "-O2 -Wall" } */

__attribute__((noipa, alloc_size (1)))
char *
foo (int a)
{
  return (char *) __builtin_malloc (a);
}

void
bar (char *q)
{
  char *p = foo (-13);
  if (!p)
    return;
  __builtin_memcpy (p, q, (__SIZE_TYPE__) 0x10002ULL);
}

alloc_size_test.c: In function ‘bar’:
alloc_size_test.c:17:3: warning: ‘__builtin_memcpy’ writing 4294967298 bytes 
into a region of size 4294967283 [-Wstringop-overflow=]
   17 |   __builtin_memcpy (p, q, (__SIZE_TYPE__) 0x10002ULL);
  |   ^~~
alloc_size_test.c:14:13: note: at offset 0 to an object with size 4294967283 
allocated by ‘foo’ here
   14 |   char *p = foo (-13);
  | ^
alloc_size_test.c:14:13: warning: argument 1 value ‘-13’ is negative 
[-Walloc-size-larger-than=]
alloc_size_test.c:6:1: note: in a call to allocation function ‘foo’ declared 
here
6 | foo (int a)
  | ^~~

The first warning is wrong: there is no object with size 4294967283 aka
-13U.  The allocation will almost certainly fail, and then there is nothing
wrong in the testcase; if it didn't fail, it would need to allocate
-13UL aka 18446744073709551603 bytes.  The reason is that
gimple_call_alloc_size leaves bogus rng1[0] and rng1[1] of -13 with
precision 32, which the caller then extends to siz_prec, but with UNSIGNED
extension, on the reasonable assumption that it is given size (UNSIGNED)
wide_ints.
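
The two numbers above can be reproduced with plain integer conversions.  This is only a sketch with fixed-width integers standing in for wide_int; the function names are made up for illustration.

```cpp
#include <cstdint>

// Treat the 32-bit pattern as unsigned and widen it: the UNSIGNED
// extension that yields the bogus size in the first warning.
std::uint64_t zero_extended(std::int32_t v) {
  return static_cast<std::uint64_t>(static_cast<std::uint32_t>(v));
}

// Widen with sign extension first, then reinterpret as a 64-bit size:
// what a real conversion of -13 to a 64-bit size_t would give.
std::uint64_t sign_extended(std::int32_t v) {
  return static_cast<std::uint64_t>(static_cast<std::int64_t>(v));
}
```

For v = -13 the first function gives 4294967283 and the second gives 18446744073709551603.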

As for the second hunk, there can still be several cases, e.g. both the low
bound and the upper bound could overflow, such as on calloc (SIZE_MAX / 2 + 2,
SIZE_MAX / 2 + 2).  In this case, the code would produce a range (let's assume
ilp32 in this case) of [0x40010001, 0x].  What the caller
will do with such a thing is hard to predict, given the UNSIGNED siz_prec
conversion.  If you don't want the function to return the real range of possible
values, but something else, it should be documented what it is and not be called
a range.  For the reasons you stated, perhaps for warnings (but never for code
generation!) it could be useful to have both range and likely range, where
the latter would be on the assumption that no overflow happens and so e.g.
for [64, SIZE_MAX] * [64, SIZE_MAX] it could return range of [0, SIZE_MAX]
and likely range of [64 * 64, SIZE_MAX].
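
One way to picture the range / likely range split is sketched below.  It is not code from builtins.c: the types and names are hypothetical, std::uint64_t stands in for a 64-bit size_t, and "likely" is modelled as a product that saturates at the maximum instead of wrapping.

```cpp
#include <cstdint>

struct Range { std::uint64_t lo, hi; };
struct Product { Range range; Range likely; };

constexpr std::uint64_t kSizeMax = UINT64_MAX;

// True if a * b does not fit in 64 bits.
bool mul_overflows(std::uint64_t a, std::uint64_t b) {
  return a != 0 && b > kSizeMax / a;
}

// Product that clamps to kSizeMax instead of wrapping around.
std::uint64_t saturating_mul(std::uint64_t a, std::uint64_t b) {
  return mul_overflows(a, b) ? kSizeMax : a * b;
}

Product multiply_ranges(Range a, Range b) {
  Product p;
  // Likely range: assume no overflow happens.
  p.likely = Range{saturating_mul(a.lo, b.lo), saturating_mul(a.hi, b.hi)};
  // Real range: once any product can wrap, every value is possible.
  if (mul_overflows(a.hi, b.hi))
    p.range = Range{0, kSizeMax};
  else
    p.range = Range{a.lo * b.lo, a.hi * b.hi};
  return p;
}
```

With both inputs equal to [64, kSizeMax] this yields range [0, kSizeMax] and likely range [4096, kSizeMax], matching the example above.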

And for the real change in maybe_warn_overflow, you can just check in the
debugger or with gprof when that if (early out) will ever trigger; I'd
bet very rarely.
char *
foo (void)
{
  char *p = __builtin_calloc (32, 32);
  __builtin_memset (p, ' ', 32 * 32);
  return p;
}
gdb --args ./cc1 -quiet -O2 -Wall test.c
b tree-ssa-strlen.c:2051
r
p destsize
$1 = 
p len
$2 = 
p debug_generic_stmt (destsize)
1024
p debug_generic_stmt (len)
1024

Jakub



Re: 'find_group_last' (was: [PATCH] OpenACC reference count overhaul)

2019-12-18 Thread Thomas Schwinge
Hi Julian!

Thanks for walking me through this.

On 2019-12-14T00:19:04+, Julian Brown  wrote:
> On Fri, 13 Dec 2019 16:25:25 +0100
> Thomas Schwinge  wrote:
>> On 2019-10-29T12:15:01+, Julian Brown 
>> wrote:
>> >  static int
>> > -find_pointer (int pos, size_t mapnum, unsigned short *kinds)
>> > +find_group_last (int pos, size_t mapnum, unsigned short *kinds)
>> >  {
>> > -  if (pos + 1 >= mapnum)
>> > -return 0;
>> > +  unsigned char kind0 = kinds[pos] & 0xff;
>> > +  int first_pos = pos, last_pos = pos;
>> >  
>> > -  unsigned char kind = kinds[pos+1] & 0xff;
>> > -
>> > -  if (kind == GOMP_MAP_TO_PSET)
>> > -return 3;
>> > -  else if (kind == GOMP_MAP_POINTER)
>> > -return 2;
>> > +  if (kind0 == GOMP_MAP_TO_PSET)
>> > +{
>> > +  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == 
>> > GOMP_MAP_POINTER)
>> > +  last_pos = ++pos;
>> > +  /* We expect at least one GOMP_MAP_POINTER after a 
>> > GOMP_MAP_TO_PSET.  */
>> > +  assert (last_pos > first_pos);
>> > +}
>> > +  else
>> > +{
>> > +  /* GOMP_MAP_ALWAYS_POINTER can only appear directly after some other
>> > +   mapping.  */
>> > +  if (pos + 1 < mapnum
>> > +&& (kinds[pos + 1] & 0xff) == GOMP_MAP_ALWAYS_POINTER)
>> > +  return pos + 1;
>> > +
>> > +  /* We can have one or several GOMP_MAP_POINTER mappings after a 
>> > to/from
>> > +   (etc.) mapping.  */
>> > +  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == 
>> > GOMP_MAP_POINTER)
>> > +  last_pos = ++pos;
>> > +}
>> >  
>> > -  return 0;
>> > +  return last_pos;
>> >  }  

Given:

program test
  implicit none

  integer, parameter :: n = 64
  integer :: a(n)

  call test_array(a)

contains
  subroutine test_array(a)
    implicit none

    integer :: a(n)

    !$acc enter data copyin(a)

    !$acc exit data delete(a)
  end subroutine test_array
end program test

..., we get a 'GOMP_MAP_TO' followed by a 'GOMP_MAP_POINTER'.  That got
us 'find_pointer () == 2', and now we get 'find_group_last (i) == i + 1'
(so, the same).
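
To make the grouping concrete, here is a stand-alone sketch that mirrors the quoted 'find_group_last' logic.  The kind constants are placeholders chosen for illustration, not the real 'GOMP_MAP_*' values, and a std::vector replaces the (kinds, mapnum) pair.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Placeholder kind values; the real GOMP_MAP_* constants are not
// reproduced here.
constexpr unsigned short kMapTo = 1;
constexpr unsigned short kMapPointer = 2;
constexpr unsigned short kMapToPset = 3;
constexpr unsigned short kMapAlwaysPointer = 4;

// Return the index of the last mapping in the group that starts at pos.
int find_group_last(int pos, const std::vector<unsigned short>& kinds) {
  const std::size_t mapnum = kinds.size();
  const unsigned kind0 = kinds[pos] & 0xff;
  const int first_pos = pos;
  int last_pos = pos;

  // True if a mapping follows pos and has the given kind.
  auto next_is = [&](unsigned kind) {
    return static_cast<std::size_t>(pos) + 1 < mapnum
           && (kinds[pos + 1] & 0xffu) == kind;
  };

  if (kind0 == kMapToPset) {
    while (next_is(kMapPointer))
      last_pos = ++pos;
    // At least one pointer mapping is expected after a TO_PSET.
    assert(last_pos > first_pos);
  } else {
    // An ALWAYS_POINTER directly after another mapping closes the group.
    if (next_is(kMapAlwaysPointer))
      return pos + 1;
    // Otherwise absorb any run of trailing pointer mappings.
    while (next_is(kMapPointer))
      last_pos = ++pos;
  }
  return last_pos;
}
```

For the sequence {kMapTo, kMapPointer} the call find_group_last(0, ...) returns 1, i.e. 'i + 1' as stated above.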

> In a previous iteration of the refcount overhaul patch, we had the
> "magic" code fragment:
>
>> +  for (int j = 0; j < 2; j++)  
>> +gomp_map_vars_async (acc_dev, aq,
>> + (j == 0 || pointer == 2) ? 1 : 2,
>> + &hostaddrs[i + j], NULL,
>> + &sizes[i + j], &kinds[i + j], true,
>> + GOMP_MAP_VARS_OPENACC_ENTER_DATA);  

> The "pointer == 2" case (i.e. with a GOMP_MAP_TO and a
> GOMP_MAP_POINTER)

So, that's the example given above.

> will also handle the mappings separately in both the
> earlier patch iteration

ACK, given the "previous iteration" code presented above.

> and this one.

NACK?  Given 'find_group_last (i) == i + 1', that means that
'GOMP_MAP_TO' and 'GOMP_MAP_POINTER' get mapped as one group?

On the other hand, it still does match the current 'find_pointer'
behavior?

But what should the behavior here be: 'GOMP_MAP_TO', 'GOMP_MAP_POINTER'
each separate, or as one group?

Confusing stuff.  :-|


Grüße
 Thomas




Re: [PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-18 Thread Hongtao Liu
On Wed, Dec 18, 2019 at 4:26 PM Segher Boessenkool
 wrote:
>
> On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote:
> > Hi:
> >   This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > power of 2 and D mod C == 0.
> >   bootstrap and make check is ok.
>
> Why would this be a good idea?  It is not reducing the number of
> operators or similar?
>
It helps VN, so that fre will delete redundant load.
>
> Segher



-- 
BR,
Hongtao


Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Jan Hubicka
Hi,
sorry I forgot to include cgraph and varpool changes in the patch.

Index: varpool.c
===
--- varpool.c   (revision 279467)
+++ varpool.c   (working copy)
@@ -539,8 +539,7 @@ varpool_node::assemble_aliases (void)
 {
   varpool_node *alias = dyn_cast <varpool_node *> (ref->referring);
   if (alias->symver)
-   do_assemble_symver (alias->decl,
-   DECL_ASSEMBLER_NAME (decl));
+   do_assemble_symver (alias->decl, decl);
   else if (!alias->transparent_alias)
do_assemble_alias (alias->decl,
   DECL_ASSEMBLER_NAME (decl));
Index: cgraphunit.c
===
--- cgraphunit.c(revision 279467)
+++ cgraphunit.c(working copy)
@@ -,8 +,7 @@ cgraph_node::assemble_thunks_and_aliases
 of buffering it in same alias pairs.  */
  TREE_ASM_WRITTEN (decl) = 1;
  if (alias->symver)
-   do_assemble_symver (alias->decl,
-   DECL_ASSEMBLER_NAME (decl));
+   do_assemble_symver (alias->decl, decl);
  else
do_assemble_alias (alias->decl,
   DECL_ASSEMBLER_NAME (decl));
Index: varasm.c
===
--- varasm.c(revision 279467)
+++ varasm.c(working copy)
@@ -5970,9 +5970,47 @@ do_assemble_symver (tree decl, tree targ
   ultimate_transparent_alias_target (&id);
   ultimate_transparent_alias_target (&target);
 #ifdef ASM_OUTPUT_SYMVER_DIRECTIVE
-  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
-  IDENTIFIER_POINTER (target),
-  IDENTIFIER_POINTER (id));
+  if (TREE_PUBLIC (target) && DECL_VISIBILITY (target) == VISIBILITY_DEFAULT)
+ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
+IDENTIFIER_POINTER
+  (DECL_ASSEMBLER_NAME (target)),
+IDENTIFIER_POINTER (id));
+  else
+{
+  int nameend;
+  for (nameend = 0; IDENTIFIER_POINTER (id)[nameend] != '@'; nameend++)
+   ;
+  if (IDENTIFIER_POINTER (id)[nameend + 1] != '@'
+ || IDENTIFIER_POINTER (id)[nameend + 2] == '@')
+   {
+ sorry_at (DECL_SOURCE_LOCATION (target),
+   "can not produce % of a symbol that is "
+   "not exported with default visibility");
+ return;
+   }
+  tree tmpdecl = copy_node (decl);
+  char buf[256];
+  static int symver_labelno;
+  targetm.asm_out.generate_internal_label (buf,
+  "LSYMVER", symver_labelno++);
+  SET_DECL_ASSEMBLER_NAME (tmpdecl, get_identifier (buf));
+  globalize_decl (tmpdecl);
+#ifdef ASM_OUTPUT_DEF_FROM_DECLS
+  ASM_OUTPUT_DEF_FROM_DECLS (asm_out_file, tmpdecl,
+DECL_ASSEMBLER_NAME (target));
+#else
+  ASM_OUTPUT_DEF (asm_out_file,
+ IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (tmpdecl)),
+ IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
+#endif
+  memcpy (buf, IDENTIFIER_POINTER (id), nameend + 2);
+  buf[nameend + 2] = '@';
+  strcpy (buf + nameend + 3, IDENTIFIER_POINTER (id) + nameend + 2);
+  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
+  IDENTIFIER_POINTER
+(DECL_ASSEMBLER_NAME (tmpdecl)),
+  buf);
+}
 #else
   error ("symver is only supported on ELF platforms");
 #endif
Index: lto/lto-common.c
===
--- lto/lto-common.c(revision 279467)
+++ lto/lto-common.c(working copy)
@@ -2818,6 +2818,10 @@ read_cgraph_and_symbols (unsigned nfiles
   IDENTIFIER_POINTER
 (DECL_ASSEMBLER_NAME (snode->decl)));
  }
+   /* Symbol versions are always used externally, but linker does not
+  report that correctly.  */
+   else if (snode->symver && *res == LDPR_PREVAILING_DEF_IRONLY)
+ snode->resolution = LDPR_PREVAILING_DEF_IRONLY_EXP;
else
  snode->resolution = *res;
   }
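
The "@@" to "@@@" rewrite in the varasm.c hunk can be sketched without the fixed 256-byte buffer.  This is an illustration only, assuming C++17 for std::optional; the function name is hypothetical and std::string replaces the memcpy/strcpy sequence.

```cpp
#include <optional>
#include <string>

// Turn "name@@VERSION" into "name@@@VERSION".  Any other form (no '@',
// a single '@', or already "@@@") is rejected, mirroring the check that
// leads to sorry_at in the patch.
std::optional<std::string> add_third_at(const std::string& id) {
  const std::string::size_type at = id.find('@');
  if (at == std::string::npos)
    return std::nullopt;
  // The character after the first '@' must be a second '@' ...
  if (at + 1 >= id.size() || id[at + 1] != '@')
    return std::nullopt;
  // ... and the one after that must not be a third '@'.
  if (at + 2 < id.size() && id[at + 2] == '@')
    return std::nullopt;
  std::string out = id;
  out.insert(at + 2, 1, '@');  // the string grows as needed
  return out;
}
```

For example, "foo@@VERS_2" becomes "foo@@@VERS_2", while "foo@VERS_1" is rejected.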


[Ada] Small tweak to pragma Warnings (On)

2019-12-18 Thread Eric Botcazou
This changes pragma Warnings (On) from reenabling all the warnings previously 
silenced to reenabling only those warnings which were enabled when the latest 
pragma Warnings (Off) was processed (excluding the front-end warnings which 
are handled separately by the front-end).
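
A small model of the new behaviour, not the gigi code itself: the class, its members and the warning names used with it are hypothetical, and a set of names stands in for the diagnostic context.

```cpp
#include <set>
#include <string>
#include <vector>

// Toy model of a diagnostic state with push/pop semantics.
class WarningState {
 public:
  explicit WarningState(std::set<std::string> enabled)
      : enabled_(std::move(enabled)) {}

  // pragma Warnings (Off): save the current state, then silence everything.
  void warnings_off() {
    saved_.push_back(enabled_);
    enabled_.clear();
  }

  // pragma Warnings (On): restore the saved state rather than enabling
  // every warning.
  void warnings_on() {
    if (!saved_.empty()) {
      enabled_ = saved_.back();
      saved_.pop_back();
    }
  }

  bool is_enabled(const std::string& name) const {
    return enabled_.count(name) != 0;
  }

 private:
  std::set<std::string> enabled_;
  std::vector<std::set<std::string>> saved_;
};

// Helper: is `name` enabled after an Off/On pair, given the initial set?
bool enabled_after_off_then_on(const std::set<std::string>& initially,
                               const std::string& name) {
  WarningState state(initially);
  state.warnings_off();
  state.warnings_on();
  return state.is_enabled(name);
}
```

A warning that was enabled before the Off/On pair is enabled again afterwards; one that was not enabled stays off.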

Tested on x86_64-suse-linux, applied on the mainline.


2019-12-18  Eric Botcazou  

* gcc-interface/trans.c (Pragma_to_gnu) : Push a
diagnostics state for pragma Warnings (Off) before turning off all
the warnings and only pop it for pragma Warnings (On).


2019-12-18  Eric Botcazou  

* gnat.dg/warn32.adb: New test.

-- 
Eric Botcazou

Index: gcc-interface/trans.c
===
--- gcc-interface/trans.c	(revision 338441)
+++ gcc-interface/trans.c	(revision 338442)
@@ -1975,7 +1975,21 @@ Pragma_to_gnu (Node_Id gnat_node)
 	gnat_expr = Expression (Next (gnat_temp));
 	  }
 	else
-	  gnat_expr = Empty;
+	  {
+		gnat_expr = Empty;
+
+		/* For pragma Warnings (Off), we save the current state...  */
+		if (kind == DK_IGNORED)
+		  diagnostic_push_diagnostics (global_dc, location);
+
+		/* ...so that, for pragma Warnings (On), we do not enable all
+		   the warnings but just restore the previous state.  */
+		else
+		  {
+		diagnostic_pop_diagnostics (global_dc, location);
+		break;
+		  }
+	  }
 
 	imply = false;
 	  }
--  { dg-do compile }
--  { dg-options "-O -gnatn -Winline -cargs --param max-inline-insns-single=50 -margs" }

with Ada.Containers.Vectors;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;
with Ada.Text_IO;

procedure Warn32 is
  type Selected_Block_T is record
    Contents  : Unbounded_String;
    File_Name : Unbounded_String;
  end record;

  pragma Warnings (Off, "-Winline");
  package Selected_Block_List is
    new Ada.Containers.Vectors (Natural, Selected_Block_T);
begin
  Ada.Text_Io.Put_Line ("Hello World!");
end;


Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao
On 2019-12-18 10:26 +0100, Jan Hubicka wrote:
> Hi,
> sorry I forgot to include cgraph and varpool changes in the patch.
> 
> Index: varpool.c
> ===
> --- varpool.c (revision 279467)
> +++ varpool.c (working copy)
> @@ -539,8 +539,7 @@ varpool_node::assemble_aliases (void)
>  {
>varpool_node *alias = dyn_cast <varpool_node *> (ref->referring);
>if (alias->symver)
> - do_assemble_symver (alias->decl,
> - DECL_ASSEMBLER_NAME (decl));
> + do_assemble_symver (alias->decl, decl);
>else if (!alias->transparent_alias)
>   do_assemble_alias (alias->decl,
>  DECL_ASSEMBLER_NAME (decl));
> Index: cgraphunit.c
> ===
> --- cgraphunit.c  (revision 279467)
> +++ cgraphunit.c  (working copy)
> @@ -,8 +,7 @@ cgraph_node::assemble_thunks_and_aliases
>of buffering it in same alias pairs.  */
> TREE_ASM_WRITTEN (decl) = 1;
> if (alias->symver)
> - do_assemble_symver (alias->decl,
> - DECL_ASSEMBLER_NAME (decl));
> + do_assemble_symver (alias->decl, decl);
> else
>   do_assemble_alias (alias->decl,
>  DECL_ASSEMBLER_NAME (decl));
> Index: varasm.c
> ===
> --- varasm.c  (revision 279467)
> +++ varasm.c  (working copy)
> @@ -5970,9 +5970,47 @@ do_assemble_symver (tree decl, tree targ
>ultimate_transparent_alias_target (&id);
>ultimate_transparent_alias_target (&target);

ICE here.

lto1: internal compiler error: tree check: expected identifier_node, have
function_decl in ultimate_transparent_alias_target, at varasm.c:1308
0x6f9cfe tree_check_failed(tree_node const*, char const*, int, char const*, ...)
../../gcc/gcc/tree.c:9685
0x714541 tree_check(tree_node*, char const*, int, char const*, tree_code)
../../gcc/gcc/tree.h:3273
0x714541 ultimate_transparent_alias_target
../../gcc/gcc/varasm.c:1308
0x714541 do_assemble_symver(tree_node*, tree_node*)
../../gcc/gcc/varasm.c:5971

>  #ifdef ASM_OUTPUT_SYMVER_DIRECTIVE
> -  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> -IDENTIFIER_POINTER (target),
> -IDENTIFIER_POINTER (id));
> +  if (TREE_PUBLIC (target) && DECL_VISIBILITY (target) == VISIBILITY_DEFAULT)
> +ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> +  IDENTIFIER_POINTER
> +(DECL_ASSEMBLER_NAME (target)),
> +  IDENTIFIER_POINTER (id));
> +  else
> +{
> +  int nameend;
> +  for (nameend = 0; IDENTIFIER_POINTER (id)[nameend] != '@'; nameend++)
> + ;
> +  if (IDENTIFIER_POINTER (id)[nameend + 1] != '@'
> +   || IDENTIFIER_POINTER (id)[nameend + 2] == '@')
> + {
> +   sorry_at (DECL_SOURCE_LOCATION (target),
> + "can not produce % of a symbol that is "
> + "not exported with default visibility");
> +   return;

I think this does not make sense.  Some library authors may export "foo@VER_1"
but not "foo_v1" to ensure the programmers using the library upgrade their code
to use new "correct" ABI, instead of an old one.   This error makes it
impossible.

(Try to comment out "foo_v1" in version.map, in the testcase.)

> + }
> +  tree tmpdecl = copy_node (decl);
> +  char buf[256];
> +  static int symver_labelno;
> +  targetm.asm_out.generate_internal_label (buf,
> +"LSYMVER", symver_labelno++);
> +  SET_DECL_ASSEMBLER_NAME (tmpdecl, get_identifier (buf));
> +  globalize_decl (tmpdecl);
> +#ifdef ASM_OUTPUT_DEF_FROM_DECLS
> +  ASM_OUTPUT_DEF_FROM_DECLS (asm_out_file, tmpdecl,
> +  DECL_ASSEMBLER_NAME (target));
> +#else
> +  ASM_OUTPUT_DEF (asm_out_file,
> +   IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (tmpdecl)),
> +   IDENTIFIER_POINTER (DECL_ASSEMBLER_NAME (target)));
> +#endif
> +  memcpy (buf, IDENTIFIER_POINTER (id), nameend + 2);
> +  buf[nameend + 2] = '@';
> +  strcpy (buf + nameend + 3, IDENTIFIER_POINTER (id) + nameend + 2);

We can't replace a single "@" with "@@@".  So I think producing .LSYMVERx is not
an option for "old" versions like "foo@VER_1".

> +  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> +IDENTIFIER_POINTER
> +  (DECL_ASSEMBLER_NAME (tmpdecl)),
> +buf);
> +}
>  #else
>error ("symver is only supported on ELF platforms");
>  #endif
> Index: lto/lto-common.c
> ===
> --- lto/lto-common.c  (revision 279467)
> +++ lto/lto-common.c  (working copy)
> @@ -2818,6 +2818,1

Re: [PATCH] Fix redundant load missed by fre [tree-optimization 92980]

2019-12-18 Thread Andrew Pinski
On Wed, Dec 18, 2019 at 1:18 AM Hongtao Liu  wrote:
>
> On Wed, Dec 18, 2019 at 4:26 PM Segher Boessenkool
>  wrote:
> >
> > On Wed, Dec 18, 2019 at 10:37:11AM +0800, Hongtao Liu wrote:
> > > Hi:
> > >   This patch is to simplify A * C + (-D) -> (A - D/C) * C when C is a
> > > power of 2 and D mod C == 0.
> > >   bootstrap and make check is ok.
> >
> > Why would this be a good idea?  It is not reducing the number of
> > operators or similar?
> >
> It helps VN, so that fre will delete redundant load.

It is basically doing a factoring and undoing an optimization that was
done in the front-end (see pointer_int_sum in c-common.c).
But I think the optimization in the front-end should be removed.  It
dates from 1992, a time when GCC did not do anything on the tree level
and there was no GCSE (PRE) and the CSE was limited.
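
The identity behind the factoring can be checked on small numbers.  The sketch below only illustrates the arithmetic; the function names are made up, and 64-bit integers are used so the sample values cannot overflow.

```cpp
#include <cstdint>

// True when c is a positive power of two.
bool is_power_of_two(std::int64_t c) {
  return c > 0 && (c & (c - 1)) == 0;
}

// Precondition from the patch description: C is a power of 2 and
// D mod C == 0.
bool can_factor(std::int64_t c, std::int64_t d) {
  return is_power_of_two(c) && d % c == 0;
}

// A * C + (-D), the form the patch starts from.
std::int64_t unfactored(std::int64_t a, std::int64_t c, std::int64_t d) {
  return a * c + (-d);
}

// (A - D/C) * C, the form the patch produces.
std::int64_t factored(std::int64_t a, std::int64_t c, std::int64_t d) {
  return (a - d / c) * c;
}
```

With A = 5, C = 4, D = 8 both forms give 12.  With D = 6 the precondition fails and the forms differ (14 versus 16), which is why D mod C == 0 is required.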

Thanks,
Andrew Pinski


> >
> > Segher
>
>
>
> --
> BR,
> Hongtao


Pass ipa-bit-cp info to tree-ssa-ccp

2019-12-18 Thread Jan Hubicka
Hi,
while hunting the streaming bug of ipa-bit-cp which exchanged value and
mask while streaming to ltrans I noticed that this bug had almost no
effect because we almost always throw away the relevant info.

This patch makes tree-ssa-ccp use the results of ipa-bit-cp so the value
is actually used.  It also tests ipa-param-manipulation infrastructure
(implemented by Martin Jambor) for fixing the issue with aggregate
propagation being ignored in many cases.
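
For readers unfamiliar with value/mask pairs, here is a rough sketch of the idea.  It assumes the convention that a set mask bit means "this bit is unknown" and a clear mask bit means "this bit equals the corresponding bit of value"; the struct and function names are hypothetical and plain 64-bit integers replace widest_int.

```cpp
#include <cstdint>

// Known-bits information about a parameter (assumed convention:
// mask bit 1 = unknown, mask bit 0 = bit is known and stored in value).
struct KnownBits {
  std::uint64_t value;
  std::uint64_t mask;
};

// Could the parameter hold x, given what is known about its bits?
bool may_equal(KnownBits b, std::uint64_t x) {
  return ((x ^ b.value) & ~b.mask) == 0;
}

// Combine information from two call sites: a bit stays known only if
// both sides know it and agree on its value.
KnownBits meet(KnownBits a, KnownBits b) {
  const std::uint64_t mask = a.mask | b.mask | (a.value ^ b.value);
  return KnownBits{a.value & ~mask, mask};
}
```

For instance, KnownBits{0, ~7ULL} describes a value whose low three bits are known to be zero, such as an 8-byte aligned pointer: may_equal accepts 16 and rejects 12.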

Bootstrapped/regtested x86_64-linux, plan to commit it later today if
there are no complaints.

Honza

* ipa-param-manipulation.h (get_original_index): Declare.
* ipa-param-manipulation.c (ipa_param_adjustments::get_original_index):
New member function.
* ipa-prop.c (ipcp_get_parm_bits): New function.
* ipa-prop.h (ipcp_get_parm_bits): Declare.
* tree-ssa-ccp.c: Include cgraph.h, alloc-pool.h, symbol-summary.h,
ipa-utils.h and ipa-prop.h
(get_default_value): Use ipcp_get_parm_bits.

* gcc.dg/ipa/ipa-bit-cp.c: New testcase.
* gcc.dg/ipa/ipa-bit-cp-1.c: New testcase.
* gcc.dg/ipa/ipa-bit-cp-2.c: New testcase.
Index: ipa-param-manipulation.h
===
--- ipa-param-manipulation.h(revision 279467)
+++ ipa-param-manipulation.h(working copy)
@@ -258,6 +258,9 @@ public:
   void get_surviving_params (vec *surviving_params);
   /* Fill a vector with new indices of surviving original parameters.  */
   void get_updated_indices (vec *new_indices);
+  /* Return the original index for the given new parameter index.  Return a
+ negative number if not available.  */
+  int get_original_index (int newidx);
 
   void dump (FILE *f);
   void debug ();
Index: ipa-param-manipulation.c
===
--- ipa-param-manipulation.c(revision 279467)
+++ ipa-param-manipulation.c(working copy)
@@ -324,6 +324,18 @@ ipa_param_adjustments::get_updated_indic
 }
 }
 
+/* Return the original index for the given new parameter index.  Return a
+   negative number if not available.  */
+
+int
+ipa_param_adjustments::get_original_index (int newidx)
+{
+  const ipa_adjusted_param *adj = &(*m_adj_params)[newidx];
+  if (adj->op != IPA_PARAM_OP_COPY)
+return -1;
+  return adj->base_index;
+}
+
 /* Return true if the first parameter (assuming there was one) survives the
transformation intact and remains the first one.  */
 
Index: ipa-prop.c
===
--- ipa-prop.c  (revision 279467)
+++ ipa-prop.c  (working copy)
@@ -5480,6 +5480,43 @@ ipcp_modif_dom_walker::before_dom_childr
   return NULL;
 }
 
+/* Return true if we have recorded VALUE and MASK about PARM.
+   Set VALUE and MASk accordingly.  */
+
+bool
+ipcp_get_parm_bits (tree parm, tree *value, widest_int *mask)
+{
+  cgraph_node *cnode = cgraph_node::get (current_function_decl);
+  ipcp_transformation *ts = ipcp_get_transformation_summary (cnode);
+  if (!ts || vec_safe_length (ts->bits) == 0)
+return false;
+
+  int i = 0;
+  for (tree p = DECL_ARGUMENTS (current_function_decl);
+   p != parm; p = DECL_CHAIN (p))
+{
+  i++;
+  /* Ignore static chain.  */
+  if (!p)
+   return false;
+}
+
+  if (cnode->clone.param_adjustments)
+{
+  i = cnode->clone.param_adjustments->get_original_index (i);
+  if (i < 0)
+   return false;
+}
+
+  vec &bits = *ts->bits;
+  if (!bits[i])
+return false;
+  *mask = bits[i]->mask;
+  *value = wide_int_to_tree (TREE_TYPE (parm), bits[i]->value);
+  return true;
+}
+
+
 /* Update bits info of formal parameters as described in
ipcp_transformation.  */
 
Index: ipa-prop.h
===
--- ipa-prop.h  (revision 279467)
+++ ipa-prop.h  (working copy)
@@ -1041,6 +1041,7 @@ ipa_agg_value_set ipa_agg_value_set_from
 void ipa_dump_param (FILE *, class ipa_node_params *info, int i);
 void ipa_release_body_info (struct ipa_func_body_info *);
 tree ipa_get_callee_param_type (struct cgraph_edge *e, int i);
+bool ipcp_get_parm_bits (tree, tree *, widest_int *);
 
 /* From tree-sra.c:  */
 tree build_ref_for_offset (location_t, tree, poly_int64, bool, tree,
Index: tree-ssa-ccp.c
===
--- tree-ssa-ccp.c  (revision 279467)
+++ tree-ssa-ccp.c  (working copy)
@@ -146,6 +146,11 @@ along with GCC; see the file COPYING3.
 #include "stringpool.h"
 #include "attribs.h"
 #include "tree-vector-builder.h"
+#include "cgraph.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
+#include "ipa-utils.h"
+#include "ipa-prop.h"
 
 /* Possible lattice values.  */
 typedef enum
@@ -292,11 +297,26 @@ get_default_value (tree var)
  if (flag_tree_bit_ccp)
{
  wide_int nonzero_bits = get_nonzero_bits (var);
- if (nonzero_bits != -1)
+  

Re: [PATCH] Avoid suspicious -Wduplicate-branches warning in lto-wrapper.c (PR lto/92972)

2019-12-18 Thread Richard Biener
On December 17, 2019 9:43:26 PM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>big ? "-fno-pie" : "-fno-pie" doesn't make much sense, either we want
>to
>use big ? "-fno-PIE" : "-fno-pie", but as both mean the same thing, I
>think
>just using "-fno-pie" is good enough.  + a few formatting nits and one
>comment typo.
>
>Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Ok. 

Richard. 
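
The simplification in the quoted patch boils down to the following, shown as a stand-alone sketch.  The function name is hypothetical and a returned string replaces the assignment to canonical_option[0].

```cpp
#include <string>

// Pick the canonical PIE option.  When PIE is disabled, "big" no longer
// matters, so no conditional with two identical arms is needed.
std::string canonical_pie_option(bool enabled, bool big) {
  if (enabled)
    return big ? "-fPIE" : "-fpie";
  return "-fno-pie";
}
```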

>2019-12-17  Jakub Jelinek  
>
>   PR lto/92972
>   * lto-wrapper.c (merge_and_complain): Use just "-fno-pie" instead of
>   big ? "-fno-pie" : "-fno-pie".  Formatting fixes.  Fix comment typo.
>
>--- gcc/lto-wrapper.c.jj   2019-09-11 13:36:14.057264373 +0200
>+++ gcc/lto-wrapper.c  2019-12-17 12:28:36.135056568 +0100
>@@ -408,7 +408,7 @@ merge_and_complain (struct cl_decoded_op
>   /* Merge PIC options:
>   -fPIC + -fpic = -fpic
>   -fPIC + -fno-pic = -fno-pic
>-  -fpic/-fPIC + nothin = nothing.  
>+  -fpic/-fPIC + nothing = nothing.
>It is a common mistake to mix few -fPIC compiled objects into otherwise
>  non-PIC code.  We do not want to build everything with PIC then.
> 
>@@ -438,9 +438,10 @@ merge_and_complain (struct cl_decoded_op
>  && pie_option->opt_index == OPT_fPIE;
>   (*decoded_options)[j].opt_index = big ? OPT_fPIE : OPT_fpie;
>   if (pie_option->value)
>-(*decoded_options)[j].canonical_option[0] = big ? "-fPIE" :
>"-fpie";
>+(*decoded_options)[j].canonical_option[0]
>+  = big ? "-fPIE" : "-fpie";
>   else
>-(*decoded_options)[j].canonical_option[0] = big ?
>"-fno-pie" : "-fno-pie";
>+(*decoded_options)[j].canonical_option[0] = "-fno-pie";
>   (*decoded_options)[j].value = pie_option->value;
>   j++;
> }
>@@ -482,7 +483,7 @@ merge_and_complain (struct cl_decoded_op
> {
>   (*decoded_options)[j].opt_index = OPT_fpie;
>   (*decoded_options)[j].canonical_option[0]
>-   = pic_option->value ? "-fpie" : "-fno-pie";
>+= pic_option->value ? "-fpie" : "-fno-pie";
> }
>   else if (!pic_option->value)
> (*decoded_options)[j].canonical_option[0] = "-fno-pie";
>
>   Jakub



Re: [modulo-sched][PATCH] Fix PR92591

2019-12-18 Thread Richard Biener
On December 17, 2019 8:40:59 PM GMT+01:00, Roman Zhuykov  
wrote:
>Hello.
>
>> As pointed out in the PR
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92591#c1, the test can
>be
>> fixed by DFA-checking more adjacent row sequences in the partial
>> schedule.
>> I've found that on powerpc64 gcc.c-torture/execute/pr61682.c test
>> catches same issue with -Os -fmodulo-sched-allow-regmoves with some
>> non-zero sms-dfa-history parameter values, so I added that test using
>> #include as second test into the patch.
>> 
>> Minor separate patch about modulo-sched parameters is also attached.
>> If no objection, I'll commit this two patches into trunk tomorrow
>> together with my PR90001 fix.
>> 
>> Trunk and 8/9 branches successfully regstrapped on x64, and
>> cross-compiler check-gcc tested on ppc, ppc64, arm, aarch64, ia64 and
>> s390. Certainly a lot of testing was also done with the default
>> sms-dfa-history value changed to something other than zero.
>
>I think this should be backported into the 9 and 8 branches, because the
>second example gives an ICE there.  But I'm not sure about backporting the
>sms-dfa-history upper bound limitation (<=16) into params.def in the
>branches.  Compile time may grow dramatically for huge values like 1000,
>so we have to limit it.  Is it ok to limit the parameter, or is it
>better to implement some "history=MIN(history, 16)" logic in
>modulo-sched.c?
>
>I see that sometimes parameter limitation is backported, examples are:
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80663
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79576
>
>While at it, maybe you have some thoughts about selected value of 16.  
>Maximum reasonable value for sms-dfa-history param seems to be max 
>latency between two insns on target platform (calculated by dep_cost 
>function in haifa-sched.c).
>
>I'm posting full backport patch here, it suits 8/9 branches. Jakub and 
>Richard, is it OK ?

Ok with me. 

Richard. 
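
For reference, the alternative raised in the quoted question (clamping inside modulo-sched.c instead of limiting the parameter in params.def) would amount to something like the sketch below. SMS_DFA_HISTORY_MAX and clamp_dfa_history are names made up for illustration, and treating negative input as zero is an assumption; only the bound of 16 comes from the patch.

```c
/* Hypothetical names; only the bound of 16 is taken from the patch.  */
#define SMS_DFA_HISTORY_MAX 16

/* Return HISTORY limited to the range 0..SMS_DFA_HISTORY_MAX.  Mapping
   negative values to 0 is an assumption of this sketch.  */
static int
clamp_dfa_history (int history)
{
  if (history < 0)
    return 0;
  return history < SMS_DFA_HISTORY_MAX ? history : SMS_DFA_HISTORY_MAX;
}
```

The params.def approach rejects out-of-range values up front, whereas a clamp like this one would silently accept them.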

>Roman
>
>Backport from mainline
>gcc/ChangeLog:
>
>2019-12-17  Roman Zhuykov  
>
>   * modulo-sched.c (ps_add_node_check_conflicts): Improve checking
>   for history > 0 case.
>   * params.def (sms-dfa-history): Limit to 16.
>
>gcc/testsuite/ChangeLog:
>
>2019-12-17  Roman Zhuykov  
>
>   * gcc.dg/pr92951-1.c: New test.
>   * gcc.dg/pr92951-2.c: New test.
>
>
>diff --git a/gcc/modulo-sched.c b/gcc/modulo-sched.c
>--- a/gcc/modulo-sched.c
>+++ b/gcc/modulo-sched.c
>@@ -3209,7 +3209,7 @@ ps_add_node_check_conflicts (partial_schedule_ptr
>
>ps, int n,
>int c, sbitmap must_precede,
>sbitmap must_follow)
>  {
>-  int has_conflicts = 0;
>+  int i, first, amount, has_conflicts = 0;
>ps_insn_ptr ps_i;
>
>/* First add the node to the PS, if this succeeds check for
>@@ -3217,23 +3217,32 @@ ps_add_node_check_conflicts 
>(partial_schedule_ptr ps, int n,
>   if (! (ps_i = add_node_to_ps (ps, n, c, must_precede, must_follow)))
>  return NULL; /* Failed to insert the node at the given cycle.  */
>
>-  has_conflicts = ps_has_conflicts (ps, c, c)
>-|| (ps->history > 0
>-&& ps_has_conflicts (ps,
>- c - ps->history,
>- c + ps->history));
>-
>-  /* Try different issue slots to find one that the given node can be
>- scheduled in without conflicts.  */
>-  while (has_conflicts)
>+  while (1)
>  {
>+  has_conflicts = ps_has_conflicts (ps, c, c);
>+  if (ps->history > 0 && !has_conflicts)
>+  {
>+/* Check all 2h+1 intervals, starting from c-2h..c up to c..2h,
>+   but not more than ii intervals.  */
>+first = c - ps->history;
>+amount = 2 * ps->history + 1;
>+if (amount > ps->ii)
>+  amount = ps->ii;
>+for (i = first; i < first + amount; i++)
>+  {
>+has_conflicts = ps_has_conflicts (ps,
>+  i - ps->history,
>+  i + ps->history);
>+if (has_conflicts)
>+  break;
>+  }
>+  }
>+  if (!has_conflicts)
>+  break;
>+  /* Try different issue slots to find one that the given node can
>
>be
>+   scheduled in without conflicts.  */
>if (! ps_insn_advance_column (ps, ps_i, must_follow))
>   break;
>-  has_conflicts = ps_has_conflicts (ps, c, c)
>-|| (ps->history > 0
>-&& ps_has_conflicts (ps,
>- c - ps->history,
>- c + ps->history));
>  }
>
>if (has_conflicts)
>diff --git a/gcc/testsuite/gcc.dg/pr92951-1.c 
>b/gcc/testsuite/gcc.dg/pr92951-1.c
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/pr92951-1.c
>@@ -0,0 +1,11 @@
>+/* PR rtl-optimization/92591 */
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fmodulo-sched -fweb -fno-dce -fno-ivopts 
>-fno-sched-pr
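
As an illustration of which cycle ranges the new loop in ps_add_node_check_conflicts examines, the sketch below reproduces only the index arithmetic from the patch. history_windows is a hypothetical stand-alone function, it does not call ps_has_conflicts, and the guard against a non-positive ii is an addition of this sketch.

```c
/* Fill FROM[] and TO[] with the cycle ranges checked for a node placed
   at cycle C, with history H and initiation interval II, following the
   arithmetic of the loop in the patch.  Returns the number of ranges
   written; both arrays need room for at least 2 * H + 1 entries.  */
static int
history_windows (int c, int h, int ii, int *from, int *to)
{
  int first = c - h;
  int amount = 2 * h + 1;
  if (amount > ii)
    amount = ii;
  if (amount < 0)
    amount = 0;   /* Guard added for this sketch only.  */
  for (int k = 0; k < amount; k++)
    {
      int i = first + k;
      from[k] = i - h;
      to[k] = i + h;
    }
  return amount;
}
```

For c = 10 and h = 2 with a large ii this yields five ranges, from 6..10 up to 10..14; a small ii cuts the list short.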

Re: [PATCH 1/4] Add verification of SRA accesses

2019-12-18 Thread Richard Biener
On December 17, 2019 1:40:47 PM GMT+01:00, Martin Jambor  
wrote:
>Hi,
>
>because the follow-up patches perform some non-trivial operations on
>SRA patches, I wrote myself a verifier.  And sure enough, it has
>spotted two issues, one of which is fixed in this patch too - we did
>not correctly set the parent link when creating artificial accesses
>for propagation across assignments.  The second one is the (not)
>setting of reverse flag when creating accesses for total scalarization
>but since the following patch removes the offending function, this
>patch does not fix it.
>
>Bootstrapped and tested on x86_64, I consider this a pre-requisite for
>the followup patches (and the parent link fix really is).

OK. 

Thanks 
Richard. 

>Thanks,
>
>Martin
>
>2019-12-10  Martin Jambor  
>
>   * tree-sra.c (verify_sra_access_forest): New function.
>   (verify_all_sra_access_forests): Likewise.
>   (create_artificial_child_access): Set parent.
>   (analyze_all_variable_accesses): Call the verifier.
>---
> gcc/tree-sra.c | 86 ++
> 1 file changed, 86 insertions(+)
>
>diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
>index 87c156f2f54..e077a811da9 100644
>--- a/gcc/tree-sra.c
>+++ b/gcc/tree-sra.c
>@@ -2321,6 +2321,88 @@ build_access_trees (struct access *access)
>   return true;
> }
> 
>+/* Traverse the access forest where ROOT is the first root and verify
>that
>+   various important invariants hold true.  */
>+
>+DEBUG_FUNCTION void
>+verify_sra_access_forest (struct access *root)
>+{
>+  struct access *access = root;
>+  tree first_base = root->base;
>+  gcc_assert (DECL_P (first_base));
>+  do
>+{
>+  gcc_assert (access->base == first_base);
>+  if (access->parent)
>+  gcc_assert (access->offset >= access->parent->offset
>+  && access->size <= access->parent->size);
>+  if (access->next_sibling)
>+  gcc_assert (access->next_sibling->offset
>+  >= access->offset + access->size);
>+
>+  poly_int64 poffset, psize, pmax_size;
>+  bool reverse;
>+  tree base = get_ref_base_and_extent (access->expr, &poffset,
>&psize,
>+ &pmax_size, &reverse);
>+  HOST_WIDE_INT offset, size, max_size;
>+  if (!poffset.is_constant (&offset)
>+|| !psize.is_constant (&size)
>+|| !pmax_size.is_constant (&max_size))
>+  gcc_unreachable ();
>+  gcc_assert (base == first_base);
>+  gcc_assert (offset == access->offset);
>+  gcc_assert (access->grp_unscalarizable_region
>+|| size == max_size);
>+  gcc_assert (max_size == access->size);
>+  gcc_assert (reverse == access->reverse);
>+
>+  if (access->first_child)
>+  {
>+gcc_assert (access->first_child->parent == access);
>+access = access->first_child;
>+  }
>+  else if (access->next_sibling)
>+  {
>+gcc_assert (access->next_sibling->parent == access->parent);
>+access = access->next_sibling;
>+  }
>+  else
>+  {
>+while (access->parent && !access->next_sibling)
>+  access = access->parent;
>+if (access->next_sibling)
>+  access = access->next_sibling;
>+else
>+  {
>+gcc_assert (access == root);
>+root = root->next_grp;
>+access = root;
>+  }
>+  }
>+}
>+  while (access);
>+}
>+
>+/* Verify access forests of all candidates with accesses by calling
>+   verify_access_forest on each on them.  */
>+
>+DEBUG_FUNCTION void
>+verify_all_sra_access_forests (void)
>+{
>+  bitmap_iterator bi;
>+  unsigned i;
>+  EXECUTE_IF_SET_IN_BITMAP (candidate_bitmap, 0, i, bi)
>+{
>+  tree var = candidate (i);
>+  struct access *access = get_first_repr_for_decl (var);
>+  if (access)
>+  {
>+gcc_assert (access->base == var);
>+verify_sra_access_forest (access);
>+  }
>+}
>+}
>+
>/* Return true if expr contains some ARRAY_REFs into a variable bounded
>array.  */
> 
>@@ -2566,6 +2648,7 @@ create_artificial_child_access (struct access
>*parent, struct access *model,
>   access->offset = new_offset;
>   access->size = model->size;
>   access->type = model->type;
>+  access->parent = parent;
>   access->grp_write = set_grp_write;
>   access->grp_read = false;
>   access->reverse = model->reverse;
>@@ -2850,6 +2933,9 @@ analyze_all_variable_accesses (void)
> 
>   propagate_all_subaccesses ();
> 
>+  if (flag_checking)
>+verify_all_sra_access_forests ();
>+
>   bitmap_copy (tmp, candidate_bitmap);
>   EXECUTE_IF_SET_IN_BITMAP (tmp, 0, i, bi)
> {
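
The traversal used by the verifier (pre-order, no recursion, climbing back through parent links) can be tried out on a reduced data structure. In the sketch below, struct node is a simplified stand-in for struct access with only the fields the walk needs, verify_tree handles a single tree rather than a chain of groups, and the containment check is slightly stricter than the assertion in the patch. All of these are assumptions of the sketch, not properties of tree-sra.c.

```c
#include <stddef.h>

/* Simplified stand-in for struct access; hypothetical.  */
struct node
{
  long offset, size;
  struct node *parent, *first_child, *next_sibling;
};

/* Walk the tree rooted at ROOT in pre-order without recursion.  Return
   the number of nodes visited, or -1 if a node sticks out of its
   parent, a sibling overlaps its predecessor, or a parent link is
   wrong.  */
static int
verify_tree (struct node *root)
{
  int visited = 0;
  struct node *n = root;
  while (n)
    {
      visited++;
      if (n->parent
          && (n->offset < n->parent->offset
              || n->offset + n->size
                 > n->parent->offset + n->parent->size))
        return -1;
      if (n->next_sibling
          && (n->next_sibling->offset < n->offset + n->size
              || n->next_sibling->parent != n->parent))
        return -1;
      if (n->first_child)
        {
          if (n->first_child->parent != n)
            return -1;
          n = n->first_child;
        }
      else
        {
          /* Climb until a node with an unvisited sibling is found.  */
          while (n != root && !n->next_sibling)
            n = n->parent;
          n = (n == root) ? NULL : n->next_sibling;
        }
    }
  return visited;
}
```

A missing parent link, the bug fixed in create_artificial_child_access, is caught when the walk is about to descend into the child.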



Re: [PATCH 2/2] libada: Respect `--enable-version-specific-runtime-libs'

2019-12-18 Thread Eric Botcazou
>   gcc/ada/
>   * gcc-interface/Makefile.in (ADA_RTL_DSO_DIR): New variable.
>   (install-gnatlib): Use it in place of ADA_RTL_OBJ_DIR for shared
>   library installation.
> 
>   libada/
>   * Makefile.in (toolexecdir, toolexeclibdir): New variables.
>   (LIBADA_FLAGS_TO_PASS): Add `toolexeclibdir'.
>   * configure.ac: Add `--enable-version-specific-runtime-libs'.
>   Update version-specific `toolexecdir' and `toolexeclibdir' from
>   ADA_RTL_OBJ_DIR from gcc/ada/gcc-interface/Makefile.in.
>   * configure: Regenerate.

This breaks with --disable-libada because $(toolexeclibdir) is not set.

-- 
Eric Botcazou


Re: [PATCH 2/2] libada: Respect `--enable-version-specific-runtime-libs'

2019-12-18 Thread Maciej W. Rozycki
On Wed, 18 Dec 2019, Eric Botcazou wrote:

> > gcc/ada/
> > * gcc-interface/Makefile.in (ADA_RTL_DSO_DIR): New variable.
> > (install-gnatlib): Use it in place of ADA_RTL_OBJ_DIR for shared
> > library installation.
> > 
> > libada/
> > * Makefile.in (toolexecdir, toolexeclibdir): New variables.
> > (LIBADA_FLAGS_TO_PASS): Add `toolexeclibdir'.
> > * configure.ac: Add `--enable-version-specific-runtime-libs'.
> > Update version-specific `toolexecdir' and `toolexeclibdir' from
> > ADA_RTL_OBJ_DIR from gcc/ada/gcc-interface/Makefile.in.
> > * configure: Regenerate.
> 
> This breaks with --disable-libada because $(toolexeclibdir) is not set.

 Sorry about the breakage, I'll look into it right away.

  Maciej


Re: [PATCH 3/4] Also propagate SRA accesses from LHS to RHS (PR 92706)

2019-12-18 Thread Richard Biener
On December 17, 2019 1:43:15 PM GMT+01:00, Martin Jambor  
wrote:
>Hi,
>
>the previous patch unfortunately does not fix the first testcase in PR
>92706 and since I am afraid it might be the important one, I also
>focused on that.  The issue here is again total scalarization accesses
>clashing with those representing accesses in the IL - on another
>aggregate but here the sides are switched.  Whereas in the previous
>case the total scalarization accesses prevented propagation along
>assignments, here we have the user accesses on the LHS, so even though
>we do not create anything there, we totally scalarize the RHS and
>again end up with assignments with different scalarizations leading to
>bad code.
>
>So we clearly need to propagate information about accesses from RHSs
>to LHSs too, which the patch below does.  Because the intent is only
>preventing bad total scalarization - which the last patch now performs
>late enough - and do not care about grp_write flag and so forth, the
>propagation is a bit simpler and so I did not try to unify all of the
>code for both directions.

But can we really propagate the directions independently?  Lacc to racc
propagation would induce accesses to different racc to lacc branches of the
access tree of the parent, no?  So in full generality the access links form an
undirected graph where you perform propagation in both directions of edges (and
you'd have to consider cycles).  'Linked parts' of the graph then need to have
the same (or at least a compatible) scalarization, and there would be the
possibility to compute the optimal 'conflict border' where, to fix the conflict,
we'd keep one node in the graph unscalarized.

The way you did it might be sufficient in practice of course and we should 
probably go with that for now?

Richard. 
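
The notion of 'linked parts' can be made concrete with a small union-find sketch: each access is a node, each assignment link is an undirected edge, and nodes that end up in the same set would need a compatible scalarization. This is not how tree-sra.c is organised; the array-based representation and the names uf_init, uf_find and uf_union are assumptions made purely for illustration.

```c
#define MAX_NODES 64

static int uf_parent[MAX_NODES];

/* Put each of the first N nodes (N <= MAX_NODES) in its own set.  */
static void
uf_init (int n)
{
  for (int i = 0; i < n; i++)
    uf_parent[i] = i;
}

/* Return the representative of the set containing X.  */
static int
uf_find (int x)
{
  while (uf_parent[x] != x)
    {
      uf_parent[x] = uf_parent[uf_parent[x]];  /* Path halving.  */
      x = uf_parent[x];
    }
  return x;
}

/* Record an undirected link between A and B.  */
static void
uf_union (int a, int b)
{
  a = uf_find (a);
  b = uf_find (b);
  if (a != b)
    uf_parent[b] = a;
}
```

Cycles in the link graph are harmless here: a link between two nodes that are already in the same set changes nothing.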

>I still think that even with this patch the total scalarization has to
>follow the declared type of the aggregate and cannot be done using
>integers of the biggest suitable power, at least in early SRA, because
>these propagations of course do not work interprocedurally but
>inlining can and does eventually bring accesses from two functions
>together which could (and IMHO would) lead to same problems.
>
>Bootstrapped and LTO-bootstrapped and tested on an x86_64-linux and
>bootstrapped and tested it on aarch64 and i686 (except that on i686
>the testcase will need to be skipped because __int128_t is not
>available there).  I expect that review will lead to requests to
>change things but as far as I am concerned, this is ready for trunk
>too.
>
>Thanks,
>
>Martin
>
>2019-12-11  Martin Jambor  
>
>   PR tree-optimization/92706
>   * tree-sra.c (struct access): Fields first_link, last_link,
>   next_queued and grp_queued renamed to first_rhs_link, last_rhs_link,
>   next_rhs_queued and grp_rhs_queued respectively, new fields
>   first_lhs_link, last_lhs_link, next_lhs_queued and grp_lhs_queued.
>   (struct assign_link): Field next renamed to next_rhs, new field
>   next_lhs.  Updated comment.
>   (work_queue_head): Renamed to rhs_work_queue_head.
>   (lhs_work_queue_head): New variable.
>   (add_link_to_lhs): New function.
>   (relink_to_new_repr): Also relink LHS lists.
>   (add_access_to_work_queue): Renamed to add_access_to_rhs_work_queue.
>   (add_access_to_lhs_work_queue): New function.
>   (pop_access_from_work_queue): Renamed to
>   pop_access_from_rhs_work_queue.
>   (pop_access_from_lhs_work_queue): New function.
>   (build_accesses_from_assign): Also add links to LHS lists and to LHS
>   work_queue.
>   (child_would_conflict_in_lacc): Renamed to
>   child_would_conflict_in_acc.  Adjusted parameter names.
>   (create_artificial_child_access): New parameter set_grp_read, use it.
>   (subtree_mark_written_and_enqueue): Renamed to
>   subtree_mark_written_and_rhs_enqueue.
>   (propagate_subaccesses_across_link): Renamed to
>   propagate_subaccesses_from_rhs.
>   (propagate_subaccesses_from_lhs): New function.
>   (propagate_all_subaccesses): Also propagate subaccesses from LHSs to
>   RHSs.
>
>   testsuite/
>   * gcc.dg/tree-ssa/pr92706-1.c: New test.
>---
> gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c |  17 ++
> gcc/tree-sra.c| 316 --
> 2 files changed, 253 insertions(+), 80 deletions(-)
> create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
>
>diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
>b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
>new file mode 100644
>index 000..c36d103798e
>--- /dev/null
>+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr92706-1.c
>@@ -0,0 +1,17 @@
>+/* { dg-do compile } */
>+/* { dg-options "-O2 -fdump-tree-esra-details" } */
>+
>+struct S { int i[4]; } __attribute__((aligned(128)));
>+typedef __int128_t my_int128 __attribute__((may_alias));
>+__int128_t load (void *p)
>+{
>+  struct S v;
>+  __builtin_memcpy (&v, p, sizeof (struct S));
>

Re: [PATCH][AArch64] Fixup core tunings

2019-12-18 Thread Kyrill Tkachov

Hi Wilco,

On 12/17/19 4:03 PM, Wilco Dijkstra wrote:

Hi Richard,

> This changelog entry is inadequate.  It's also not in the correct style.
>
> It should say what has changed, not just that it has changed.

Sure, but there is often no useful space for that. We should auto generate
changelogs if they are deemed useful. I find the commit message a lot more
useful in general. Here is the updated version:


Several tuning settings in cores.def are not consistent.
Set the tuning for Cortex-A76AE and Cortex-A77 to neoversen1 so
it is the same as for Cortex-A76 and Neoverse N1.
Set the tuning for Neoverse E1 to cortexa73 so it's the same as for
Cortex-A65. Set the scheduler for Cortex-A65 and Cortex-A65AE to
cortexa53.

Bootstrap OK, OK for commit?



Ok.

Thanks,

Kyrill




ChangeLog:
2019-12-17  Wilco Dijkstra  

* config/aarch64/aarch64-cores.def:
("cortex-a76ae"): Use neoversen1 tuning.
("cortex-a77"): Likewise.
("cortex-a65"): Use cortexa53 scheduler.
("cortex-a65ae"): Likewise.
("neoverse-e1"): Use cortexa73 tuning.
--

diff --git a/gcc/config/aarch64/aarch64-cores.def 
b/gcc/config/aarch64/aarch64-cores.def
index 
053c6390e747cb9c818fe29a9b22990143b260ad..d170253c6eddca87f8b9f4f7fcc4692695ef83fb 
100644

--- a/gcc/config/aarch64/aarch64-cores.def
+++ b/gcc/config/aarch64/aarch64-cores.def
@@ -101,13 +101,13 @@ AARCH64_CORE("thunderx2t99", thunderx2t99,  
thunderx2t99, 8_1A,  AARCH64_FL_FOR
 AARCH64_CORE("cortex-a55",  cortexa55, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD, cortexa53, 0x41, 0xd05, -1)
 AARCH64_CORE("cortex-a75",  cortexa75, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD, cortexa73, 0x41, 0xd0a, -1)
 AARCH64_CORE("cortex-a76",  cortexa76, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD, neoversen1, 0x41, 0xd0b, -1)
-AARCH64_CORE("cortex-a76ae",  cortexa76ae, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa72, 0x41, 0xd0e, -1)
-AARCH64_CORE("cortex-a77",  cortexa77, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa72, 0x41, 0xd0d, -1)
-AARCH64_CORE("cortex-a65",  cortexa65, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd06, -1)
-AARCH64_CORE("cortex-a65ae",  cortexa65ae, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd43, -1)
+AARCH64_CORE("cortex-a76ae",  cortexa76ae, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, neoversen1, 0x41, 0xd0e, -1)
+AARCH64_CORE("cortex-a77",  cortexa77, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, neoversen1, 0x41, 0xd0d, -1)
+AARCH64_CORE("cortex-a65",  cortexa65, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd06, -1)
+AARCH64_CORE("cortex-a65ae",  cortexa65ae, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd43, -1)
 AARCH64_CORE("ares",  ares, cortexa57, 8_2A, AARCH64_FL_FOR_ARCH8_2 | 
AARCH64_FL_F16 | AARCH64_FL_RCPC | AARCH64_FL_DOTPROD | 
AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
 AARCH64_CORE("neoverse-n1",  neoversen1, cortexa57, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_PROFILE, neoversen1, 0x41, 0xd0c, -1)
-AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa53, 0x41, 0xd4a, -1)
+AARCH64_CORE("neoverse-e1",  neoversee1, cortexa53, 8_2A,  
AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD | AARCH64_FL_SSBS, cortexa73, 0x41, 0xd4a, -1)


 /* HiSilicon ('H') cores. */
 AARCH64_CORE("tsv110",  tsv110, tsv110, 8_2A, AARCH64_FL_FOR_ARCH8_2 
| AARCH64_FL_CRYPTO | AARCH64_FL_F16 | AARCH64_FL_AES | 
AARCH64_FL_SHA2, tsv110,   0x48, 0xd01, -1)
@@ -127,6 +127,6 @@ AARCH64_CORE("cortex-a73.cortex-a53", 
cortexa73cortexa53, cortexa53, 8A,  AARCH

 /* ARM DynamIQ big.LITTLE configurations.  */

 AARCH64_CORE("cortex-a75.cortex-a55", cortexa75cortexa55, cortexa53, 
8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD, cortexa73, 0x41, AARCH64_BIG_LITTLE (0xd0a, 
0xd05), -1)
-AARCH64_CORE("cortex-a76.cortex-a55", cortexa76cortexa55, cortexa53, 
8_2A, AARCH64_FL_FOR_ARCH8_2 | AARCH64_FL_F16 | AARCH64_FL_RCPC | 
AARCH64_FL_DOTPROD, cortexa72, 0x41, AARCH64_BIG_LITTLE (0xd0b, 
0xd05), -1)
+AARCH64_CORE("cortex-a76.cortex-a55", c

*ping* / Re: [LTO] PR 86416 – improve lto1 diagnostic if a mode does not exist (esp. for offloading targets)

2019-12-18 Thread Tobias Burnus

*ping*

On 12/13/19 3:34 PM, Tobias Burnus wrote:
As long as one compiles for a single target, the message is unlikely 
to appear.


However, when compiling for offloading, the modes supported on the target
'host' and on the target 'device' can be different. In particular,
'long double' (when larger than double) and '__float128' might not be
available on the device.

This gives currently errors like the following (see PR, comment 0):


lto1: fatal error: unsupported mode TF
compilation terminated.
mkoffload: fatal error:
x86_64-pc-linux-gnu-accel-nvptx-none-gcc returned 1 exit status 


While the device target is hidden in
'x86_64-pc-linux-gnu-accel-nvptx-none-gcc', it might make more sense
to add it more prominently. Additionally, an average user does
not understand what 'TF' or 'XF' means.

Solutions:
(A) Add the target to the output
(B) Add a better description for the error

I did both in the attached patch, giving:
lto1: fatal error: nvptx-none - 80-bit floating-point numbers 
unsupported (mode 'XF')
lto1: fatal error: nvptx-none - 128-bit floating-point numbers 
unsupported (mode 'TF')


* (A) should be fine, I think.

* But I am not 100% happy with (B). I think as interim solution,
it is acceptable as XF/TF are well defined and probably the most
common problem. — Alternatively, one could only commit (A) or
solve it more properly (how?).

* If, e.g., 'long long' or 'integer(kind=16)' are used, only a
generic message is printed.  A message such as "'__float128' not
supported" or "'real(kind=10)' not supported" is more helpful and
supporting all modes and not cherry picking those two would be
useful as well.

The question is how to pass this information to lto-streamer-in.c;
it is available as TYPE_NAME – and, with debugging turned on, this
also gets passed on, but is also not readily available in
lto_input_mode_table. – Suggestions?


Built on x86-64-gnu-linux and tested without offloading and with nvptx
offloading.

Tobias



[committed, amdgcn] Fix vect/pr65947-8.c testcase for amdgcn

2019-12-18 Thread Andrew Stubbs
This patch fixes a test failure caused by GCN's ability to vectorize 
mixed-mode operations not typically available on other platforms.


Basically, the testcase attempts to compare a vector of char against a 
scalar int. GCN can do this just fine, so the loop vectorizes, but the 
pass conditions expect that it will not.


I fixed it by special-casing GCN. There might be a more general way,
but apparently this does happen for other architectures (?)


Andrew
Fix vect/pr65947-8.c testcase for amdgcn.

2019-12-18  Andrew Stubbs  

	gcc/testsuite/
	* gcc.dg/vect/pr65947-8.c: Change pass conditions for amdgcn.

diff --git a/gcc/testsuite/gcc.dg/vect/pr65947-8.c b/gcc/testsuite/gcc.dg/vect/pr65947-8.c
index f0f1ac29699..a2a940daf1a 100644
--- a/gcc/testsuite/gcc.dg/vect/pr65947-8.c
+++ b/gcc/testsuite/gcc.dg/vect/pr65947-8.c
@@ -7,7 +7,7 @@ extern void abort (void) __attribute__ ((noreturn));
 #define N 27
 
 /* Condition reduction with multiple types in the comparison.  Will fail to
-   vectorize.  */
+   vectorize on architectures requiring matching vector sizes.  */
 
 int
 condition_reduction (char *a, int min_v)
@@ -41,5 +41,6 @@ main (void)
   return 0;
 }
 
-/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" } } */
-/* { dg-final { scan-tree-dump "multiple types in double reduction or condition reduction" "vect" } } */
+/* { dg-final { scan-tree-dump-not "LOOP VECTORIZED" "vect" { target { ! amdgcn*-*-* } } } } */
+/* { dg-final { scan-tree-dump "LOOP VECTORIZED" "vect" { target amdgcn*-*-* } } } */
+/* { dg-final { scan-tree-dump "multiple types in double reduction or condition reduction" "vect" { target { ! amdgcn*-*-* } } } } */
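
For readers without the testsuite at hand, the kind of loop at issue looks roughly like the sketch below: a condition reduction whose comparison mixes char elements with an int bound. This is an illustration written for this note, not a copy of pr65947-8.c; the NOT_FOUND sentinel in particular is an arbitrary choice.

```c
#define N 27
#define NOT_FOUND 1000   /* Arbitrary sentinel outside the char range.  */

/* Return the last element of A that is below MIN_V, or NOT_FOUND when
   there is none.  The elements are char while the bound is int, so a
   vectorized comparison has to mix element widths.  */
static int
condition_reduction (const char *a, int min_v)
{
  int last = NOT_FOUND;
  for (int i = 0; i < N; i++)
    if (a[i] < min_v)
      last = a[i];
  return last;
}
```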


Re: [Patch] Add OpenACC 2.6's no_create

2019-12-18 Thread Tobias Burnus

Hi Thomas,

@Thomas (and, possibly, Julian & Jakub): Please glance quickly at the
gomp_map_vars_internal change.


libgomp/target.c's gomp_map_vars_internal: it now uses the normal code 
path in the upper loop, except that one directly bails out when the 
'key' has not been found (skipping the adjacent MAP_POINTER as well). 
The 'case' in the second loop is only reached, if tgt[i]->key == NULL 
(i.e. if not present) and one can unconditionally skip here. — This 
seems to be cleaner and should avoid some confusions :-)


GOMP_MAP_POINTER, following MAP_IF_PRESENT: I am not sure about this. 
The testsuite digests both mapping and skipping the map pointer. It 
looks a tad cleaner to avoid mapping the pointer (if the var is not 
present) – saving also few bytes and cpu cycles. On the down side, it 
adds an order dependence assumption, namely assuming that the 
MAP_POINTER after 'no_create'/MAP_IF_PRESENT always belongs to 
no_create. – [This patch follows the original patch and skips the 
map_pointer.]
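
The order dependence described above can be shown with a reduced model of the first loop. Everything in the sketch below is hypothetical: the map_kind codes are placeholders rather than the real GOMP_MAP_* values, and select_mappings only records which entries would be mapped instead of doing any mapping.

```c
#include <stddef.h>

/* Placeholder kind codes; not the real GOMP_MAP_* constants.  */
enum map_kind { MAP_OTHER, MAP_IF_PRESENT, MAP_POINTER };

/* Set MAPPED[i] to 1 for every entry that would be mapped.  PRESENT[i]
   says whether entry i was found on the device.  An if-present entry
   that is absent is skipped, together with a pointer entry directly
   after it.  Returns the number of entries left to map.  */
static size_t
select_mappings (const enum map_kind *kinds, const int *present,
                 size_t n, int *mapped)
{
  size_t count = 0;
  for (size_t i = 0; i < n; i++)
    mapped[i] = 0;
  for (size_t i = 0; i < n; i++)
    {
      if (kinds[i] == MAP_IF_PRESENT && !present[i])
        {
          /* Assumes the following pointer belongs to this entry.  */
          if (i + 1 < n && kinds[i + 1] == MAP_POINTER)
            ++i;
          continue;
        }
      mapped[i] = 1;
      count++;
    }
  return count;
}
```

The comment inside the loop marks the assumption in question: a pointer entry is dropped purely because of its position after an absent if-present entry.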


Otherwise, except for added acc_is_present calls to no_create-3.c to 
check that no_create does not cause mapping and applying your/Thomas's 
patches, it matches my previous version, which was OK'ed.  Hence, I 
intend to commit it tomorrow, unless there are further comments.


Cheers,

Tobias

On 12/17/19 8:11 PM, Tobias Burnus wrote:

Hi Thomas,

I am reasonably comfortable with the current patch (regarding your 
TODOs) – see attachment. It is the previous patch plus your changes 
plus one additional condition (see below) in target.c's first 
GOMP_MAP_IF_PRESENT handling.


I intend to re-test it tomorrow and then commit it, unless some other 
issues or comments come up.  See a bunch of comments below.


Cheers,

Tobias

On 12/3/19 4:16 PM, Thomas Schwinge wrote:

So that's specifically what you fixed above
(See previous reply in this email. Now added an acc_is_present check. 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00156.html)
Another thing: I've added just another little bit of testsuite 
coverage, and another thing broke. See "TODO" in attached incremental 
patch. […]
Files included, the other issue was XFAILed by you (and hence passed). 
A fix for that issue is: 
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01135.html — and a 
completely separate issue. (That patch is small, very localized and 
orthogonal to this patch.)

The incremental Fortran test case changes have been done in a rush; not
sure if they make much sense, or should see some further work applied to
them.


I think one can do more, but they are fine. I am not 100% sure how to 
read the following:


  ! The no_create clause is meant for partially shared-memory 
machines.  This
  ! test is written to work on non-shared-memory machines, though this 
is not

  ! necessarily a useful way to use the no_create clause in practice.
  !$acc parallel !no_create (var)

First, why is 'no_create(var)' now commented out? – For this code, it 
should really work both ways, independent of whether the commented-out 
version boils down to 'copy' (currently) or 'present' (with my other 
patch, linked above).


With these items considered/addressed as you feel comfortable, this 
is OK

for trunk.



My TODO items:

--- libgomp/target.c
+++ libgomp/target.c
@@ -671,6 +671,7 @@ gomp_map_vars_internal (struct gomp_device_descr 
*devicep,

  }
    else if ((kind & typemask) == GOMP_MAP_IF_PRESENT)
  {
+  //TODO TS is confused.  Handling this here, will inhibit 
'gomp_map_vars_existing' being used a bit further below.

    tgt->list[i].key = NULL;
    tgt->list[i].offset = 0;
    has_firstprivate = true;


True – but should it? The only effect seems to be that it bumps the 
ref count. (Should it or shouldn't it?) In any case, if the data is not 
present, it will fail in this section.


However, I think the following is missing before 'continue' – even 
though testing did not hit it:


  /* Handle the attach/pointer clause next to it later, together with
 GOMP_MAP_IF_PRESENT as the data might be not available. */
  if (i + 1 < mapnum
  && ((typemask & get_kind (short_mapkind, kinds, i + 1))
  == GOMP_MAP_POINTER))
    ++i;

@@ -908,6 +910,7 @@ gomp_map_vars_internal (struct gomp_device_descr 
*devicep,

    splay_tree_key n = splay_tree_lookup (mem_map, &cur_node);
    if (n != NULL)
  {
+  //TODO TS is confused.  Due to the way the handling of 
'GOMP_MAP_NO_ALLOC' is done in the first loop, we're here re-doing 
'gomp_map_vars_existing'?

    tgt->list[i].key = n;
    tgt->list[i].offset = cur_node.host_start - 
n->host_start;

    tgt->list[i].length = n->host_end - n->host_start;
Essentially, yes – except that we know here that the variable does 
exist – in the block above, it also works, but only if the variable 
has been mapped at some point.
@@ -917,6 +920,7 @@ gomp_map_vars_internal (struct gomp_device_descr 
*devicep,

  }
 

Re: [LTO] PR 86416 – improve lto1 diagnostic if a mode does not exist (esp. for offloading targets)

2019-12-18 Thread Jakub Jelinek
On Fri, Dec 13, 2019 at 03:34:56PM +0100, Tobias Burnus wrote:
> --- a/gcc/lto-streamer-in.c
> +++ b/gcc/lto-streamer-in.c
> @@ -1700,7 +1700,19 @@ lto_input_mode_table (struct lto_file_decl_data 
> *file_data)
>   }
> /* FALLTHRU */
>   default:
> -   fatal_error (UNKNOWN_LOCATION, "unsupported mode %qs", mname);
> +   /* For offloading-target compilions, this is a user-facing
> +  message.  See also target.def and machmode.def.  */
> +   if (strcmp (mname, "XF") == 0)
> + fatal_error (UNKNOWN_LOCATION,
> +  "%s - 80-bit floating-point numbers unsupported "
> +  "(mode %qs)", TARGET_MACHINE, mname);
> +   else if (strcmp (mname, "TF") == 0)
> + fatal_error (UNKNOWN_LOCATION,
> +  "%s - 128-bit floating-point numbers unsupported "
> +  "(mode %qs)", TARGET_MACHINE, mname);

I don't like the above, it looks like a gross hack for two modes that are
often used on x86_64, but as soon as you e.g. use powerpc64 KFmode instead
or many others, you'll be in trouble.
I'd say let lto_output_mode_table stream some string with a description of
the mode (perhaps for offloading streaming only), whether that would be by
looking e.g. through standard lang specific types, finding if they have such
mode and printing the name of the type, or trying to describe the mode in
words somehow, and then print that string during lto_input_mode_table.

Jakub
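
To illustrate the "describe the mode in words" option, the sketch below builds a message from a mode's properties rather than from a list of known names, so that a mode like KF is covered without a special case. The enum, the function describe_mode and the exact wording are all hypothetical; nothing here reflects what lto_output_mode_table actually streams.

```c
#include <stdio.h>

/* Hypothetical classification; not GCC's enum mode_class.  */
enum mode_class_sketch { CLASS_INT, CLASS_FLOAT, CLASS_OTHER };

/* Write a human-readable description of a mode into BUF (of size LEN),
   derived from its class CLS and size BITS.  MNAME is the internal mode
   name, kept for reference.  Returns snprintf's result.  */
static int
describe_mode (char *buf, size_t len, enum mode_class_sketch cls,
               unsigned bits, const char *mname)
{
  const char *what = (cls == CLASS_FLOAT ? "floating-point"
                      : cls == CLASS_INT ? "integer"
                      : "unclassified");
  return snprintf (buf, len, "%u-bit %s mode '%s'", bits, what, mname);
}
```

A string produced this way on the host side could then be printed as is when the mode turns out to be missing on the offload target.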



[C++ PATCH] PR c++/12333 - X::~X() with implicit this->.

2019-12-18 Thread Jason Merrill
this->X::~X() is handled by finish_class_member_access_expr and its
lookup_destructor subroutine; let's use it in cp_parser_lookup_name for the
case where this-> is implicit.

I tried replacing the other destructor code here with just the call to
lookup_destructor, but that regressed handling of naming the destructor
outside a non-static member function.

Tested x86_64-pc-linux-gnu, applying to trunk.

* parser.c (cp_parser_lookup_name): Use lookup_destructor.
* typeck.c (lookup_destructor): No longer static.
---
 gcc/cp/cp-tree.h|  1 +
 gcc/cp/parser.c |  5 +
 gcc/cp/typeck.c |  3 +--
 gcc/testsuite/g++.dg/lookup/dtor1.C | 13 +
 gcc/testsuite/g++.dg/parse/dtor3.C  |  8 
 5 files changed, 24 insertions(+), 6 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lookup/dtor1.C

diff --git a/gcc/cp/cp-tree.h b/gcc/cp/cp-tree.h
index b47698e1d0c..c35ed9abe08 100644
--- a/gcc/cp/cp-tree.h
+++ b/gcc/cp/cp-tree.h
@@ -7500,6 +7500,7 @@ extern tree build_class_member_access_expr  (cp_expr, 
tree, tree, bool,
 tsubst_flags_t);
 extern tree finish_class_member_access_expr (cp_expr, tree, bool,
 tsubst_flags_t);
+extern tree lookup_destructor  (tree, tree, tree, 
tsubst_flags_t);
 extern tree build_x_indirect_ref   (location_t, tree,
 ref_operator,
 tsubst_flags_t);
diff --git a/gcc/cp/parser.c b/gcc/cp/parser.c
index f61089934df..de792834050 100644
--- a/gcc/cp/parser.c
+++ b/gcc/cp/parser.c
@@ -27939,6 +27939,11 @@ cp_parser_lookup_name (cp_parser *parser, tree name,
   if (!type || !CLASS_TYPE_P (type))
return error_mark_node;
 
+  /* In a non-static member function, check implicit this->.  */
+  if (current_class_ref)
+   return lookup_destructor (current_class_ref, parser->scope, name,
+ tf_warning_or_error);
+
   if (CLASSTYPE_LAZY_DESTRUCTOR (type))
lazily_declare_fn (sfk_destructor, type);
 
diff --git a/gcc/cp/typeck.c b/gcc/cp/typeck.c
index d3814585e3f..669ca83cccf 100644
--- a/gcc/cp/typeck.c
+++ b/gcc/cp/typeck.c
@@ -59,7 +59,6 @@ static tree get_delta_difference (tree, tree, bool, bool, 
tsubst_flags_t);
 static void casts_away_constness_r (tree *, tree *, tsubst_flags_t);
 static bool casts_away_constness (tree, tree, tsubst_flags_t);
 static bool maybe_warn_about_returning_address_of_local (tree);
-static tree lookup_destructor (tree, tree, tree, tsubst_flags_t);
 static void error_args_num (location_t, tree, bool);
 static int convert_arguments (tree, vec **, tree, int,
   tsubst_flags_t);
@@ -2696,7 +2695,7 @@ build_class_member_access_expr (cp_expr object, tree 
member,
 /* Return the destructor denoted by OBJECT.SCOPE::DTOR_NAME, or, if
SCOPE is NULL, by OBJECT.DTOR_NAME, where DTOR_NAME is ~type.  */
 
-static tree
+tree
 lookup_destructor (tree object, tree scope, tree dtor_name,
   tsubst_flags_t complain)
 {
diff --git a/gcc/testsuite/g++.dg/lookup/dtor1.C 
b/gcc/testsuite/g++.dg/lookup/dtor1.C
new file mode 100644
index 000..29122876401
--- /dev/null
+++ b/gcc/testsuite/g++.dg/lookup/dtor1.C
@@ -0,0 +1,13 @@
+// PR c++/12333
+
+struct A { };
+
+struct X { 
+  void f () {
+X::~X ();
+this->~X();
+~X();  // { dg-error "" "unary ~" }
+A::~A ();  // { dg-error "" }
+X::~A ();  // { dg-error "" }
+  }
+};
diff --git a/gcc/testsuite/g++.dg/parse/dtor3.C 
b/gcc/testsuite/g++.dg/parse/dtor3.C
index 3041ae4a568..6121bed7e8b 100644
--- a/gcc/testsuite/g++.dg/parse/dtor3.C
+++ b/gcc/testsuite/g++.dg/parse/dtor3.C
@@ -4,13 +4,13 @@
 //  destructor call.
 
 struct Y { 
-  ~Y() {}  // { dg-bogus "note" "implemented DR272" { xfail *-*-* } }  
+  ~Y() {}  // { dg-bogus "note" "implemented DR272" }  
 };
 
 struct X : Y { 
-  ~X() {}  // { dg-bogus "note" "implemented DR272" { xfail *-*-* } }  
+  ~X() {}  // { dg-bogus "note" "implemented DR272" }  
   void f() { 
-X::~X();   // { dg-bogus "" "implemented DR272" { xfail *-*-* } }  
-Y::~Y();   // { dg-bogus "" "implemented DR272" { xfail *-*-* } }  
+X::~X();   // { dg-bogus "" "implemented DR272" }  
+Y::~Y();   // { dg-bogus "" "implemented DR272" }  
   } 
 };

base-commit: 5de3f0a1c7019f57b1972c112e0fb876f4df7ec8
-- 
2.18.1



Re: [C++ Patch] Improve throw, sizeof, and alignof locations & more

2019-12-18 Thread Jason Merrill

On 12/16/19 6:06 PM, Paolo Carlini wrote:

Hi,

another batch of work. Primarily, more of the idea of moving up the 
construction of the compound location thus passing it to the 
cxx_sizeof_or_alignof* and build_throw functions to obtain better 
locations for all the diagnostics issued by the latter. During the work 
a few mildly interesting nits (eg, for sizeof and alignof we want to set 
whenever possible the location inside the cxx_* functions and that 
allows to avoid set_location in cp_parser_unary_expression, but 
minimally we have to pass it to the cp_expr constructor, otherwise 
plugin testcases badly fail; noticed that in cxx_sizeof_expr and 
cxx_alignof_expr having a single cp_expr_loc_or_input_loc on top 
actually implies better locations (4 testcases) because below, before 
the diagnostic calls, we have STRIP_ANY_LOCATION_WRAPPER uses; in fact I 
changed those cp_expr_loc_or_input_loc to cp_expr_loc_or_loc because for 
expressions we still want to try fetching the locations but we do have a 
meaningful fallback in the location argument of the function; in 
build_throw, an error should be an inform, because in such cases we 
already issued an error (tested in ctor1.C)) but nothing major. The 
below includes a few other minor changes, like two additional uses of 
DECL_SOURCE_LOCATION, cp_expr_loc_or_input_loc, removal of an unused 
function.


OK.

Jason



Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Jan Hubicka
> ICE here.
> 
> lto1: internal compiler error: tree check: expected identifier_node, have
> function_decl in ultimate_transparent_alias_target, at varasm.c:1308
> 0x6f9cfe tree_check_failed(tree_node const*, char const*, int, char const*, 
> ...)
>   ../../gcc/gcc/tree.c:9685
> 0x714541 tree_check(tree_node*, char const*, int, char const*, tree_code)
>   ../../gcc/gcc/tree.h:3273
> 0x714541 ultimate_transparent_alias_target
>   ../../gcc/gcc/varasm.c:1308
> 0x714541 do_assemble_symver(tree_node*, tree_node*)
>   ../../gcc/gcc/varasm.c:5971

Interesting that it works for me, but indeed we can remove that call
since there is no way to do weakref of symbol version.
> 
> >  #ifdef ASM_OUTPUT_SYMVER_DIRECTIVE
> > -  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> > -  IDENTIFIER_POINTER (target),
> > -  IDENTIFIER_POINTER (id));
> > +  if (TREE_PUBLIC (target) && DECL_VISIBILITY (target) == 
> > VISIBILITY_DEFAULT)
> > +ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> > +IDENTIFIER_POINTER
> > +  (DECL_ASSEMBLER_NAME (target)),
> > +IDENTIFIER_POINTER (id));
> > +  else
> > +{
> > +  int nameend;
> > +  for (nameend = 0; IDENTIFIER_POINTER (id)[nameend] != '@'; nameend++)
> > +   ;
> > +  if (IDENTIFIER_POINTER (id)[nameend + 1] != '@'
> > + || IDENTIFIER_POINTER (id)[nameend + 2] == '@')
> > +   {
> > + sorry_at (DECL_SOURCE_LOCATION (target),
> > +   "can not produce % of a symbol that is "
> > +   "not exported with default visibility");
> > + return;
> 
> I think this does not make sense.  Some library authors may export "foo@VER_1"
> but not "foo_v1" to ensure the programmers using the library upgrade their 
> code
> to use new "correct" ABI, instead of an old one.   This error makes it
> impossible.
> 
> (Try to comment out "foo_v1" in version.map, in the testcase.)

The problem here is that we lie to the compiler (by pretending that
foo_v2 is exported from DSO while it is not) and force user to do the
same.

We support two ways to hide symbol - either at compile time via
attribute((visibility("hidden"))) or at link-time via map file.  The
first produces better code because compiler can do more optimizations
knowing that the symbol can not be interposed.

Generally we want users to use visibility attribute or -fvisibility since
it leads to better code. However now we tell users to use
attribute((symver("..."))) to produce symbol version, but at the same
time not use attribute((visibility("hidden"))).

> > +  memcpy (buf, IDENTIFIER_POINTER (id), nameend + 2);
> > +  buf[nameend + 2] = '@';
> > +  strcpy (buf + nameend + 3, IDENTIFIER_POINTER (id) + nameend + 2);
> 
> We can't replace a single "@" with "@@@".  So I think producing .LSYMVERx is 
> not
> an option for "old" versions like "foo@VER_1".

I wonder why gas implements the .symver this way in the first place. Does
the linker really need the global symbol foo_v1 to produce the
version (in addition to foo@VER_1 that is in symbol table as well)?
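
As an aside, the buffer manipulation in the quoted hunk (memcpy, one extra '@', strcpy) can be sketched as standalone code. This is only an illustration under assumptions: to_triple_at is a hypothetical helper, it uses std::string instead of the raw buffer in the patch, and it returns an empty string where the patch calls sorry_at.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper mirroring the quoted memcpy/strcpy sequence.
// Turns "name@@VERSION" into "name@@@VERSION"; returns "" for any other
// form (for example the single-'@' "old version" case discussed above).
std::string to_triple_at(const std::string& id) {
  const std::string::size_type nameend = id.find('@');
  if (nameend == std::string::npos)
    return "";
  const bool second_at = nameend + 1 < id.size() && id[nameend + 1] == '@';
  const bool third_at = nameend + 2 < id.size() && id[nameend + 2] == '@';
  if (!second_at || third_at)
    return "";
  std::string out = id;
  out.insert(nameend + 2, 1, '@');  // same effect as buf[nameend + 2] = '@'
  return out;
}
```

With this sketch, "foo@@VERS_2" becomes "foo@@@VERS_2", while "foo@VER_1" is rejected, which matches the objection that a single "@" cannot be rewritten this way.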

> > +   /* Symbol versions are always used externally, but linker does not
> > +  report that correctly.  */
> > +   else if (snode->symver && *res == LDPR_PREVAILING_DEF_IRONLY)
> > + snode->resolution = LDPR_PREVAILING_DEF_IRONLY_EXP;
> 
> This is absolutely correct.

Good, I will go ahead with filing a binutils PR on the wrong LDPR
value and apply the hack.
> 
> > else
> >   snode->resolution = *res;
> >}
> 
> I still believe we should consider symver targets to be externally visible in
> cgraph_externally_visible_p.  There is a comment saying "if linker counts on 
> us,
> we must preserve the function".  That's true in our case.
> 
> And, I think
> 
> .globl  .LSYMVER0
> .set    .LSYMVER0, foo_v2
> .symver .LSYMVER0, foo@@VERS_2
I produce
  .symver .LSYMVER0, foo@@@VERS_2

> 
> is exactly same as
> 
> .globl  foo_v2
> .symver foo_v2, foo@@VERS_2
> 
> except there is an unnecessary ".LSYMVER0".  Adding ".globl foo_v2" or ".globl
> foo_v1" won't cause them to be "global" in the final DSO because the linker 
> will
> hide them according to the version script.

The difference is that in first case compiler can fully control foo_v2
symbol (with LTO it will turn it into static symbol, it will inline
calls to it and do other things), while in the second case we need to
treat foo_v2 specially.
> 
> So if it's safe we can force a ".globl foo_v2" before ".symver foo_v2,
> foo@@VERS_2".  But I can't prove it's safe so I think it's better to consider
> this case in cgraph_externally_visible_p.

It sort of makes things work, but for example it will prevent gcc from
inlining calls to foo_v2.  I think we will still need to do something
about -fvisibility=hidden.

It is sad that we do not have a way to produce a symbol version without a
corresponding symbol of global visibility.  If we had we could 

Re: [PATCH, GCC/ARM, 1/2] Add support for ASRL(reg) and LSLL(reg) instructions for Armv8.1-M Mainline

2019-12-18 Thread Mihail Ionescu



Hi Kyrill,

On 12/11/2019 05:50 PM, Kyrill Tkachov wrote:

Hi Mihail,

On 11/14/19 1:54 PM, Mihail Ionescu wrote:

Hi,

This patch adds the new scalar shift instructions for Armv8.1-M
Mainline to the arm backend.
This patch is adding the following instructions:

ASRL (reg)
LSLL (reg)



Sorry for the delay, very busy time for GCC development :(




ChangeLog entries are as follows:

*** gcc/ChangeLog ***


2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * config/arm/arm.h (TARGET_MVE): New macro for MVE support.



I don't see this hunk in the patch... There's a lot of v8.1-M-related 
patches in flight. Is it defined elsewhere?


Thanks for having a look at this.
Yes, I forgot to remove that bit from the ChangeLog and mention that the 
patch depends on the Armv8.1-M MVE CLI -- 
https://gcc.gnu.org/ml/gcc-patches/2019-11/msg00641.htm which introduces 
the required TARGET_* macros needed. I've updated the ChangeLog to 
reflect that:


*** gcc/ChangeLog ***


2019-12-18  Mihail-Calin Ionescu  
2019-12-18  Sudakshina Das  

* config/arm/arm.md (ashldi3): Generate thumb2_lsll for TARGET_HAVE_MVE.
(ashrdi3): Generate thumb2_asrl for TARGET_HAVE_MVE.
* config/arm/arm.c (arm_hard_regno_mode_ok): Allocate even odd
register pairs for doubleword quantities for ARMv8.1M-Mainline.
* config/arm/thumb2.md (thumb2_asrl): New.
(thumb2_lsll): Likewise.




    * config/arm/arm.md (ashldi3): Generate thumb2_lsll for 
TARGET_MVE.

    (ashrdi3): Generate thumb2_asrl for TARGET_MVE.
    * config/arm/arm.c (arm_hard_regno_mode_ok): Allocate even odd
    register pairs for doubleword quantities for ARMv8.1M-Mainline.
    * config/arm/thumb2.md (thumb2_asrl): New.
    (thumb2_lsll): Likewise.

*** gcc/testsuite/ChangeLog ***

2019-11-14  Mihail-Calin Ionescu 
2019-11-14  Sudakshina Das  

    * gcc.target/arm/armv8_1m-shift-reg_1.c: New test.

Testsuite shows no regression when run for arm-none-eabi targets.

Is this ok for trunk?

Thanks
Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
be51df7d14738bc1addeab8ac5a3806778106bce..bf788087a30343269b30cf7054ec29212ad9c572 
100644

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -24454,14 +24454,15 @@ arm_hard_regno_mode_ok (unsigned int regno, 
machine_mode mode)


   /* We allow almost any value to be stored in the general registers.
  Restrict doubleword quantities to even register pairs in ARM state
- so that we can use ldrd.  Do not allow very large Neon structure
- opaque modes in general registers; they would use too many.  */
+ so that we can use ldrd and Armv8.1-M Mainline instructions.
+ Do not allow very large Neon structure  opaque modes in general
+ registers; they would use too many.  */



This comment now reads:

"Restrict doubleword quantities to even register pairs in ARM state
  so that we can use ldrd and Armv8.1-M Mainline instructions."

Armv8.1-M Mainline is not ARM mode though, so please clarify this 
comment further.


Looks ok to me otherwise (I may even have merged this with the second 
patch, but I'm not complaining about keeping it simple :) )


Thanks,

Kyrill



I've now updated the comment to read:
"Restrict doubleword quantities to even register pairs in ARM state
so that we can use ldrd. The same restriction applies for MVE."
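
To make the effect of the changed condition concrete, here is a rough standalone sketch of the core-register check. It is not the GCC function: core_reg_mode_ok and all of its parameters are hypothetical stand-ins for arm_hard_regno_mode_ok and the TARGET_THUMB2, TARGET_HAVE_MVE and TARGET_LDRD macros, and the word count assumes 4-byte registers.

```cpp
#include <cassert>

// Hypothetical stand-in for the core-register part of the check.
// All parameters are assumptions replacing GCC's target macros.
bool core_reg_mode_ok(unsigned regno, unsigned mode_size_bytes,
                      bool thumb2, bool have_mve, bool have_ldrd) {
  const unsigned words = (mode_size_bytes + 3) / 4;  // registers needed
  if (words > 4)
    return false;                 // too large for the general registers
  if (thumb2 && !have_mve)
    return true;                  // Thumb-2 without MVE: any register
  // Otherwise doubleword (or larger) values must start on an even register.
  return !(have_ldrd && mode_size_bytes > 4 && (regno & 1u) != 0);
}
```

Under these assumptions, an 8-byte value starting in r1 is accepted for plain Thumb-2 but rejected once MVE is available, while r2 is accepted in both cases.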


Regards,
Mihail




   if (regno <= LAST_ARM_REGNUM)
 {
   if (ARM_NUM_REGS (mode) > 4)
 return false;

-  if (TARGET_THUMB2)
+  if (TARGET_THUMB2 && !TARGET_HAVE_MVE)
 return true;

   return !(TARGET_LDRD && GET_MODE_SIZE (mode) > 4 && (regno & 1) 
!= 0);

diff --git a/gcc/config/arm/arm.md b/gcc/config/arm/arm.md
index 
a91a4b941c3f9d2c3d443f9f4639069ae953fb3b..b735f858a6a5c94d02a6765c1b349cdcb5e77ee3 
100644

--- a/gcc/config/arm/arm.md
+++ b/gcc/config/arm/arm.md
@@ -3503,6 +3503,22 @@
    (match_operand:SI 2 "reg_or_int_operand")))]
   "TARGET_32BIT"
   "
+  if (TARGET_HAVE_MVE)
+    {
+  if (!reg_or_int_operand (operands[2], SImode))
+    operands[2] = force_reg (SImode, operands[2]);
+
+  /* Armv8.1-M Mainline double shifts are not expanded.  */
+  if (REG_P (operands[2]))
+   {
+ if (!reg_overlap_mentioned_p(operands[0], operands[1]))
+   emit_insn (gen_movdi (operands[0], operands[1]));
+
+ emit_insn (gen_thumb2_lsll (operands[0], operands[2]));
+ DONE;
+   }
+    }
+
   arm_emit_coreregs_64bit_shift (ASHIFT, operands[0], operands[1],
  operands[2], gen_reg_rtx (SImode),
  gen_reg_rtx (SImode));
@@ -3530,6 +3546,16 @@
  (match_operand:SI 2 "reg_or_int_operand")))]
   "TARGET_32BIT"
   "
+  /* Armv8.1-M Mainline double shifts are not expanded.  */
+  if (TARGET_HAVE_MVE && REG_P (operands[2]))
+ 

[Ping][GCC][PATCH][ARM]Add ACLE intrinsics for dot product (vusdot - vector, vdot - by element) for AArch32 AdvSIMD ARMv8.6 Extension

2019-12-18 Thread Stam Markianos-Wright


On 12/13/19 10:22 AM, Stam Markianos-Wright wrote:
> Hi all,
> 
> This patch adds the ARMv8.6 Extension ACLE intrinsics for dot product
> operations (vector/by element) to the ARM back-end.
> 
> These are:
> usdot (vector), dot (by element).
> 
> The functions are optional from ARMv8.2-a as -march=armv8.2-a+i8mm and
> for ARM they remain optional as of ARMv8.6-a.
> 
> The functions are declared in arm_neon.h, RTL patterns are defined to
> generate assembler and tests are added to verify and perform adequate 
> checks.
> 
> Regression testing on arm-none-eabi passed successfully.
> 
> This patch depends on:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02195.html
> 
> for ARM CLI updates, and on:
> 
> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html
> 
> for testsuite effective_target update.
> 
> Ok for trunk?

Ping :)

> 
> Cheers,
> Stam
> 
> 
> ACLE documents are at https://developer.arm.com/docs/101028/latest
> ISA documents are at https://developer.arm.com/docs/ddi0596/latest
> 
> PS. I don't have commit rights, so if someone could commit on my behalf,
> that would be great :)
> 
> 
> gcc/ChangeLog:
> 
> 2019-11-28  Stam Markianos-Wright  
> 
>  * config/arm/arm-builtins.c (enum arm_type_qualifiers):
>  (USTERNOP_QUALIFIERS): New define.
>  (USMAC_LANE_QUADTUP_QUALIFIERS): New define.
>  (SUMAC_LANE_QUADTUP_QUALIFIERS): New define.
>  (arm_expand_builtin_args):
>      Add case ARG_BUILTIN_LANE_QUADTUP_INDEX.
>  (arm_expand_builtin_1): Add qualifier_lane_quadtup_index.
>  * config/arm/arm_neon.h (vusdot_s32): New.
>  (vusdot_lane_s32): New.
>  (vusdotq_lane_s32): New.
>  (vsudot_lane_s32): New.
>  (vsudotq_lane_s32): New.
>  * config/arm/arm_neon_builtins.def
>      (usdot,usdot_lane,sudot_lane): New.
>  * config/arm/iterators.md (DOTPROD_I8MM): New.
>      (sup, opsuffix): Add .
>     * config/arm/neon.md (neon_usdot, dot_lane: New.
>  * config/arm/unspecs.md (UNSPEC_DOT_US, UNSPEC_DOT_SU): New.
> 
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-12-12  Stam Markianos-Wright  
> 
>  * gcc.target/arm/simd/vdot-compile-2-1.c: New test.
>  * gcc.target/arm/simd/vdot-compile-2-2.c: New test.
>  * gcc.target/arm/simd/vdot-compile-2-3.c: New test.
>  * gcc.target/arm/simd/vdot-compile-2-4.c: New test.
> 
> 


Re: [PATCH, GCC/ARM, 4/10] Clear GPR with CLRM

2019-12-18 Thread Mihail Ionescu

Hi Kyrill,

On 12/17/2019 10:26 AM, Kyrill Tkachov wrote:

Hi Mihail,

On 12/16/19 6:29 PM, Mihail Ionescu wrote:

Hi Kyrill,

On 11/12/2019 09:55 AM, Kyrill Tkachov wrote:

Hi Mihail,

On 10/23/19 10:26 AM, Mihail Ionescu wrote:

[PATCH, GCC/ARM, 4/10] Clear GPR with CLRM

Hi,

=== Context ===

This patch is part of a patch series to add support for Armv8.1-M
Mainline Security Extensions architecture. Its purpose is to improve
code density of functions with the cmse_nonsecure_entry attribute and
when calling function with the cmse_nonsecure_call attribute by using
CLRM to do all the general purpose registers clearing as well as
clearing the APSR register.

=== Patch description ===

This patch adds a new pattern for the CLRM instruction and guards the
current clearing code in output_return_instruction() and thumb_exit()
on Armv8.1-M Mainline instructions not being present.
cmse_clear_registers () is then modified to use the new CLRM instruction
when targeting Armv8.1-M Mainline while keeping Armv8-M register
clearing code for VFP registers.

For the CLRM instruction, which does not mandate APSR in the register
list, checking whether it is the right volatile unspec or a clearing
register is done in clear_operation_p.

Note that load/store multiple were deemed sufficiently different in
terms of RTX structure compared to the CLRM pattern for a different
function to be used to validate the match_parallel.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * config/arm/arm-protos.h (clear_operation_p): Declare.
    * config/arm/arm.c (clear_operation_p): New function.
    (cmse_clear_registers): Generate clear_multiple instruction 
pattern if

    targeting Armv8.1-M Mainline or successor.
    (output_return_instruction): Only output APSR register 
clearing if

    Armv8.1-M Mainline instructions not available.
    (thumb_exit): Likewise.
    * config/arm/predicates.md (clear_multiple_operation): New 
predicate.

    * config/arm/thumb2.md (clear_apsr): New define_insn.
    (clear_multiple): Likewise.
    * config/arm/unspecs.md (VUNSPEC_CLRM_APSR): New volatile 
unspec.


*** gcc/testsuite/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * gcc.target/arm/cmse/bitfield-1.c: Add check for CLRM.
    * gcc.target/arm/cmse/bitfield-2.c: Likewise.
    * gcc.target/arm/cmse/bitfield-3.c: Likewise.
    * gcc.target/arm/cmse/struct-1.c: Likewise.
    * gcc.target/arm/cmse/cmse-14.c: Likewise.
    * gcc.target/arm/cmse/cmse-1.c: Likewise.  Restrict checks 
for Armv8-M

    GPR clearing when CLRM is not available.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-9.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: 
Likewise.

    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-5.c: likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-7.c: likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-8.c: likewise.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-5.c: 
Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-7.c: 
Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-8.c: 
Likewise.

    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/union-1.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/union-2.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm-protos.h b/gcc/config/arm/arm-protos.h
index 
f995974f9bb89ab3c7ff0888c394b0dfaf7da60c..1a948d2c97526ad7e67e8d4a610ac74cfdb1

Re: [PATCH, GCC/ARM, 8/10] Do lazy store & load inline when calling nscall function

2019-12-18 Thread Mihail Ionescu

Hi Kyrill,

On 11/12/2019 10:22 AM, Kyrill Tkachov wrote:

Hi Mihail,

On 10/23/19 3:24 PM, Mihail Ionescu wrote:
[PATCH, GCC/ARM, 8/10] Do lazy store & load inline when calling nscall function


Hi,

=== Context ===

This patch is part of a patch series to add support for Armv8.1-M
Mainline Security Extensions architecture. Its purpose is to generate
lazy store and load instruction inline when calling a function with the
cmse_nonsecure_call attribute with the soft or softfp floating-point
ABI.

=== Patch description ===

This patch adds two new patterns for the VLSTM and VLLDM instructions.
cmse_nonsecure_call_inline_register_clear is then modified to
generate VLSTM and VLLDM respectively before and after calls to
functions with the cmse_nonsecure_call attribute in order to have lazy
saving, clearing and restoring of VFP registers. Since these
instructions do not do writeback of the base register, the stack is adjusted
prior to the lazy store and after the lazy load with appropriate frame
debug notes to describe the effect on the CFA register.

As with CLRM, VSCCLRM and VSTR/VLDR, the instruction is modeled as an
unspecified operation to the memory pointed to by the base register.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * config/arm/arm.c (arm_add_cfa_adjust_cfa_note): Declare early.
    (cmse_nonsecure_call_inline_register_clear): Define new 
lazy_fpclear

    variable as true when floating-point ABI is not hard.  Replace
    check against TARGET_HARD_FLOAT_ABI by checks against 
lazy_fpclear.

    Generate VLSTM and VLLDM instruction respectively before and
    after a function call to cmse_nonsecure_call function.
    * config/arm/unspecs.md (VUNSPEC_VLSTM): Define unspec.
    (VUNSPEC_VLLDM): Likewise.
    * config/arm/vfp.md (lazy_store_multiple_insn): New define_insn.
    (lazy_load_multiple_insn): Likewise.

*** gcc/testsuite/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-13.c: Add check 
for VLSTM and

    VLLDM.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/soft/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/softfp-sp/cmse-8.c: Likewise.

Testing: bootstrapped on arm-linux-gnueabihf and testsuite shows no
regression.

Is this ok for trunk?

Best regards,

Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 
bcc86d50a10f11d9672258442089a0aa5c450b2f..b10f996c023e830ca24ff83fcbab335caf85d4cb 
100644

--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -186,6 +186,7 @@ static int arm_register_move_cost (machine_mode, 
reg_class_t, reg_class_t);

 static int arm_memory_move_cost (machine_mode, reg_class_t, bool);
 static void emit_constant_insn (rtx cond, rtx pattern);
 static rtx_insn *emit_set_insn (rtx, rtx);
+static void arm_add_cfa_adjust_cfa_note (rtx, int, rtx, rtx);
 static rtx emit_multi_reg_push (unsigned long, unsigned long);
 static void arm_emit_multi_reg_pop (unsigned long);
 static int vfp_emit_fstmd (int, int);
@@ -17830,6 +17831,9 @@ cmse_nonsecure_call_inline_register_clear (void)
   FOR_BB_INSNS (bb, insn)
 {
   bool clear_callee_saved = TARGET_HAVE_FPCTX_CMSE;
+ /* frame = VFP regs + FPSCR + VPR.  */
+ unsigned lazy_store_stack_frame_size =
+   (LAST_VFP_REGNUM - FIRST_VFP_REGNUM + 1 + 2) * 
UNITS_PER_WORD;

   unsigned long callee_saved_mask =
 ((1 << (LAST_HI_REGNUM + 1)) - 1)
 & ~((1 << (LAST_ARG_REGNUM + 1)) - 1);
@@ -17847,7 +17851,7 @@ cmse_nonsecure_call_inline_register_clear (void)
   CUMULATIVE_ARGS args_so_far_v;
   cumulative_args_t args_so_far;
   tree arg_type, fntype;
- bool first_param = true;
+ bool first_param = true, lazy_fpclear = !TARGET_HARD_FLOAT_ABI;
   function_args_iterator args_iter;
   uint32_t padding_bits_to_clear[4] = {0U, 0U, 0U, 0U};

@@ -17881,7 +17885,7 @@ cmse_nonsecure_call_inline_register_clear (void)
  -mfloat-abi=hard.  For -mfloat-abi=softfp we will be 
using the
  lazy store and loads which clear both caller- and 
callee-saved

  registers.  */
- if (TARGET_HARD_FLOAT_ABI)
+ if (!lazy_fpclear)
 {
   auto_sbitmap float_bitmap (maxregno + 1);

@@ -17965,8 +17969,23 @@ cmse_nonsecure_call_inline_register_clear (void)
  disabled for pop

Re: [PATCH, GCC/ARM, 9/10] Call nscall function with blxns

2019-12-18 Thread Mihail Ionescu

Hi,

On 11/12/2019 10:23 AM, Kyrill Tkachov wrote:


On 10/23/19 10:26 AM, Mihail Ionescu wrote:

[PATCH, GCC/ARM, 9/10] Call nscall function with blxns

Hi,

=== Context ===

This patch is part of a patch series to add support for Armv8.1-M
Mainline Security Extensions architecture. Its purpose is to call
functions with the cmse_nonsecure_call attribute directly using blxns
with no undue restriction on the register used for that.

=== Patch description ===

This change to use BLXNS to call a nonsecure function from secure
directly (not using a libcall) is made in 2 steps:
- change nonsecure_call patterns to use blxns instead of calling
  __gnu_cmse_nonsecure_call
- loosen requirement for function address to allow any register when
  doing BLXNS.

The former is a straightforward check over whether instructions added in
Armv8.1-M Mainline are available while the latter consists in making the
nonsecure call pattern accept any register by using match_operand and
changing the nonsecure_call_internal expander to not force r4 when
targeting Armv8.1-M Mainline.

The tricky bit is actually in the test update, specifically how to check
that register lists for CLRM have all registers except for the one
holding parameters (already done) and the one holding the address used
by BLXNS. This is achieved with 3 scan-assembler directives.

1) The first one lists all registers that can appear in CLRM but make
   each of them optional.
   Property guaranteed: no wrong register is cleared and none appears
   twice in the register list.
2) The second directive checks that the CLRM is made of a fixed number
   of the right registers to be cleared. The number used is the number
   of registers that could contain a secret minus one (used to hold the
   address of the function to call).
   Property guaranteed: register list has the right number of registers
   Cumulated property guaranteed: only registers with a potential secret
   are cleared and they are all listed but one
3) The last directive checks that we cannot find a CLRM with a register
   in it that also appears in BLXNS. This is checked via the use of a
   back-reference on any of the allowed registers in CLRM, the
   back-reference enforcing that whatever register matches in CLRM must be
   the same in the BLXNS.
   Property guaranteed: register used for BLXNS is different from
   registers cleared in CLRM.

Some more care needs to happen for the gcc.target/arm/cmse/cmse-1.c
testcase due to there being two CLRM generated. To ensure the third
directive matches the right CLRM to the BLXNS, a negative lookahead is
used between the CLRM register list and the BLXNS. The way negative
lookahead works is by matching the *position* where a given regular
expression does not match. In this case, since it comes after the CLRM
register list it is requesting that what comes after the register list
does not have a CLRM again followed by BLXNS. This guarantees that the
.*blxns after only matches a blxns without another CLRM before.
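
The back-reference and negative-lookahead idea behind the third directive can be tried outside DejaGnu with a small sketch. This is an illustration only: it uses std::regex (ECMAScript syntax) rather than the Tcl regular expressions of the actual scan-assembler directives, the pattern is written for this example and is not copied from the testsuite, and clrm_clears_blxns_register is a hypothetical name.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Returns true when a register listed in the CLRM that immediately precedes
// a BLXNS is also the BLXNS operand, i.e. when the property is violated.
bool clrm_clears_blxns_register(const std::string& asm_text) {
  static const std::regex pattern(
      R"(clrm\s*\{[^}]*\b(r\d+)\b[^}]*\})"   // CLRM list; capture one register
      R"((?!(?:.|\n)*clrm(?:.|\n)*blxns))"   // reject if another CLRM precedes the BLXNS
      R"((?:.|\n)*?blxns\s+\1\b)");          // BLXNS using the captured register
  return std::regex_search(asm_text, pattern);
}
```

The lookahead is what keeps an earlier, unrelated CLRM from being paired with the BLXNS: a CLRM only counts if no further CLRM followed by a BLXNS appears after its register list.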

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * config/arm/arm.md (nonsecure_call_internal): Do not force 
memory

    address in r4 when targeting Armv8.1-M Mainline.
    (nonsecure_call_value_internal): Likewise.
    * config/arm/thumb2.md (nonsecure_call_reg_thumb2): Make 
memory address

    a register match_operand again.  Emit BLXNS when targeting
    Armv8.1-M Mainline.
    (nonsecure_call_value_reg_thumb2): Likewise.

*** gcc/testsuite/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu 
2019-10-23  Thomas Preud'homme 

    * gcc.target/arm/cmse/cmse-1.c: Add check for BLXNS when 
instructions
    introduced in Armv8.1-M Mainline Security Extensions are 
available and
    restrict checks for libcall to __gnu_cmse_nonsecure_call to 
Armv8-M
    targets only.  Adapt CLRM check to verify register used for 
BLXNS is

    not in the CLRM register list.
    * gcc.target/arm/cmse/cmse-14.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Likewise and 
adapt
    check for LSB clearing bit to be using the same register as 
BLXNS when

    targeting Armv8.1-M Mainline.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-9.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: 
Likewise.

    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse

Re: 'find_group_last' (was: [PATCH] OpenACC reference count overhaul)

2019-12-18 Thread Julian Brown
On Wed, 18 Dec 2019 10:18:14 +0100
Thomas Schwinge  wrote:

> Hi Julian!
> 
> Thanks for walking me through this.
> 
> On 2019-12-14T00:19:04+, Julian Brown 
> wrote:
> > On Fri, 13 Dec 2019 16:25:25 +0100
> > Thomas Schwinge  wrote:  
> >> On 2019-10-29T12:15:01+, Julian Brown 
> >> wrote:  
> >> >  static int
> >> > -find_pointer (int pos, size_t mapnum, unsigned short *kinds)
> >> > +find_group_last (int pos, size_t mapnum, unsigned short *kinds)
> >> >  {
> >> > -  if (pos + 1 >= mapnum)
> >> > -return 0;
> >> > +  unsigned char kind0 = kinds[pos] & 0xff;
> >> > +  int first_pos = pos, last_pos = pos;
> >> >  
> >> > -  unsigned char kind = kinds[pos+1] & 0xff;
> >> > -
> >> > -  if (kind == GOMP_MAP_TO_PSET)
> >> > -return 3;
> >> > -  else if (kind == GOMP_MAP_POINTER)
> >> > -return 2;
> >> > +  if (kind0 == GOMP_MAP_TO_PSET)
> >> > +{
> >> > +  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) ==
> >> > GOMP_MAP_POINTER)
> >> > +last_pos = ++pos;
> >> > +  /* We expect at least one GOMP_MAP_POINTER after a
> >> > GOMP_MAP_TO_PSET.  */
> >> > +  assert (last_pos > first_pos);
> >> > +}
> >> > +  else
> >> > +{
> >> > +  /* GOMP_MAP_ALWAYS_POINTER can only appear directly after
> >> > some other
> >> > + mapping.  */
> >> > +  if (pos + 1 < mapnum
> >> > +  && (kinds[pos + 1] & 0xff) == GOMP_MAP_ALWAYS_POINTER)
> >> > +return pos + 1;
> >> > +
> >> > +  /* We can have one or several GOMP_MAP_POINTER mappings
> >> > after a to/from
> >> > + (etc.) mapping.  */
> >> > +  while (pos + 1 < mapnum && (kinds[pos + 1] & 0xff) ==
> >> > GOMP_MAP_POINTER)
> >> > +last_pos = ++pos;
> >> > +}
> >> >  
> >> > -  return 0;
> >> > +  return last_pos;
> >> >  }
> 
> Given:
> 
> program test
>   implicit none
> 
>   integer, parameter :: n = 64
>   integer :: a(n)
> 
>   call test_array(a)
> 
> contains
>   subroutine test_array(a)
> implicit none
> 
> integer :: a(n)
> 
> !$acc enter data copyin(a)
> 
> !$acc exit data delete(a)
>   end subroutine test_array
> end program test
> 
> ..., we get a 'GOMP_MAP_TO' followed by a 'GOMP_MAP_POINTER'.  That
> got us 'find_pointer () == 2', and now we get 'find_group_last (i) ==
> i + 1' (so, the same).
> 
> > In a previous iteration of the refcount overhaul patch, we had the
> > "magic" code fragment:
> >  
> >> +for (int j = 0; j < 2; j++)  
> >> +  gomp_map_vars_async (acc_dev, aq,
> >> +   (j == 0 || pointer == 2) ?
> >> 1 : 2,
> >> +   &hostaddrs[i + j], NULL,
> >> +   &sizes[i + j], &kinds[i +
> >> j], true,
> >> +
> >> GOMP_MAP_VARS_OPENACC_ENTER_DATA);
> 
> > The "pointer == 2" case (i.e. with a GOMP_MAP_TO and a
> > GOMP_MAP_POINTER)  
> 
> So, that's the example given above.
> 
> > will also handle the mappings separately in both the
> > earlier patch iteration  
> 
> ACK, given the "previous iteration" code presented above.
> 
> > and this one.  
> 
> NACK?  Given 'find_group_last (i) == i + 1', that means that
> 'GOMP_MAP_TO' and 'GOMP_MAP_POINTER' get mapped as one group?
> 
> On the other hand, it still does match the current 'find_pointer'
> behavior?
> 
> But what should the behavior here be: 'GOMP_MAP_TO',
> 'GOMP_MAP_POINTER' each separate, or as one group?
> 
> Confusing stuff.  :-|

Hmm.

I think that GOMP_MAP_POINTER is only intended to be used after some
other mapping (TO/TOFROM/TO_PSET/etc.). In the follow-up patch
supporting deep copy, this code is extended and refactored a little
more:

https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01256.html

One of the changes made there is to disallow GOMP_MAP{,_ALWAYS}_POINTER
from appearing by itself. By my reading, that must be the case for
GOMP_MAP_ALWAYS_POINTER because it has a hard-wired dependency on the
previous mapping. GOMP_MAP_POINTER is slightly more questionable: at
least according to the comment in gomp-constants.h, these are "an
internal only map kind, used for pointer based array sections" -- so
it's a little surprising they now reach the libgomp runtime at all.
Maybe it was a mistake?

The GOMP_MAP_ATTACH mapping (as in the example upthread) is different --
that one *can* appear by itself. Perhaps the difference (wrt. reference
counting here) is that GOMP_MAP_POINTER refers to the same
target_mem_desc as the previous (grouped-together) mapping, but
GOMP_MAP_ATTACH does not (rather, referring to the location of the
*pointer* to the data of a previous mapping, rather than the data
itself).

For GOMP_MAP_TO_PSET, a subsequent GOMP_MAP_POINTER will refer to the
pointer set itself. So, same thing, and it's not problematic to group
the mappings together.
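
To make the grouping rules above concrete, here is a minimal standalone sketch of the find_group_last logic quoted upthread.  It is only an illustration: the SK_MAP_* values are placeholders invented for this sketch (they do not match gomp-constants.h), and sketch_next_is is a made-up helper, not something in libgomp.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder map kinds for this sketch only; the numeric values are
   invented and do not match gomp-constants.h.  */
enum sketch_map_kind
{
  SK_MAP_TO = 1,
  SK_MAP_TO_PSET = 2,
  SK_MAP_POINTER = 3,
  SK_MAP_ALWAYS_POINTER = 4
};

/* Return nonzero if the entry after POS exists and has kind KIND.  */
static int
sketch_next_is (int pos, size_t mapnum, const unsigned short *kinds,
		unsigned char kind)
{
  return (size_t) pos + 1 < mapnum && (kinds[pos + 1] & 0xff) == kind;
}

/* Return the index of the last mapping in the group starting at POS.  */
static int
sketch_find_group_last (int pos, size_t mapnum, const unsigned short *kinds)
{
  unsigned char kind0 = kinds[pos] & 0xff;
  int first_pos = pos, last_pos = pos;

  if (kind0 == SK_MAP_TO_PSET)
    {
      while (sketch_next_is (pos, mapnum, kinds, SK_MAP_POINTER))
	last_pos = ++pos;
      /* At least one POINTER is expected after a TO_PSET.  */
      assert (last_pos > first_pos);
    }
  else
    {
      /* ALWAYS_POINTER can only appear directly after another mapping.  */
      if (sketch_next_is (pos, mapnum, kinds, SK_MAP_ALWAYS_POINTER))
	return pos + 1;

      /* Zero or more POINTER mappings may follow a to/from (etc.)
	 mapping.  */
      while (sketch_next_is (pos, mapnum, kinds, SK_MAP_POINTER))
	last_pos = ++pos;
    }

  return last_pos;
}
```

With a TO followed by a single POINTER, the sketch returns i + 1, which is the 'find_group_last (i) == i + 1' case discussed above.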

Anyway: thinking about it some more, I don't think any of the ways
these types of mappings get grouped together should really be causing
refco

Re: [PATCH] V10 patch #4, Add new prefixed/non-prefixed memory constraints

2019-12-18 Thread Segher Boessenkool
Hi!

On Tue, Dec 17, 2019 at 07:38:51PM -0500, Michael Meissner wrote:
> On Tue, Dec 17, 2019 at 05:35:24PM -0600, Segher Boessenkool wrote:
> > And what is with the INSN_FORM_PCREL_EXTERNAL?
> 
> INSN_FORM_PCREL_EXTERNAL says that the operand is a reference to an external
> symbol.  It cannot appear in an actual memory insns in normal usage, but it
> needs to be handled several places:

Sure.  Both prefixed_memory and non_prefixed_memory should test something
like memory_operand, not just whether it is a MEM.

But *both* of them, that's the point, and using some more generic hook.


Segher


Re: [patch] Use simple LRA algorithm at -O0

2019-12-18 Thread Vladimir Makarov



On 2019-12-17 1:02 p.m., Eric Botcazou wrote:

Hi,

LRA is getting measurably slower since GCC 8, at least on x86, and things are
worsening since GCC 9.  While this might be legitimate when optimization is
enabled, it's a pure waste of cycles at -O0 so the attached patch switches LRA
over to using the simple algorithm when optimization is disabled.  The effect
on code size is tiny (typically 0.2% on x86).

Tested on x86_64-suse-linux, OK for the mainline?

Eric, thank you for reporting this issue and providing the patch.
The simple LRA algorithms switch off hard register splitting, so there might
be a slightly bigger chance of the "can find reload register" error
occurring (e.g. when -O0 -fschedule-insns is used). But this error is still
not solved in the general case, and in my experience the chance of this error
is even bigger for optimized modes than for -O0 with simple LRA algorithms.


That said, I believe the patch is OK for the trunk.


2019-12-17  Eric Botcazou  

* ira.c (ira): Use simple LRA algorithm when not optimizing.





Re: [PATCH] IPA-CP: Remove bogus static keyword (PR 92971)

2019-12-18 Thread Martin Jambor
Hi,

On Tue, Dec 17 2019, Jakub Jelinek wrote:
> On Tue, Dec 17, 2019 at 01:50:32PM +0100, Martin Jambor wrote:
>> Hi,
>> 
>> as reported in PR 92971, IPA-CP's
>> cgraph_edge_brings_all_agg_vals_for_node defines one local variable with
>> the static keyword which is a clear mistake, probably a cut'n'paste
>> error when I originally wrote the code.
>> 
>> I'll commit the following as obvious after a round of bootstrap and
>> testing.  Early next year, I'll also commit it to all opened release
>> branches.
>
> Is that what you want to do though?
> Because when it is an automatic variable (shouldn't it be auto_vec, btw),
> then the first use of it doesn't make much sense:
>   values = intersect_aggregates_with_edge (cs, i, values);
> because it will be always (cs, i, vNULL).  So maybe the var should live
> across the iterations or live in the caller that should pass a pointer (or
> reference) to it?
> With the patch, there will be leaks too, because the values vector is only
> released if the function returns false and is not released otherwise.

the leak is indeed a problem, thanks for spotting it.  But apart from
that, I really wanted to pass vNULL to intersect_aggregates_with_edge,
and the patch below does it explicitly to make it clear, because while
the function can also do intersecting, its actual task here is to carry
over aggregate constants from all types of callers (clones and
non-clones) and all kinds of supported jump functions.
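
As a side illustration of why the stray static mattered: a block-scope static is initialized once and keeps its value across loop iterations and across calls, whereas an automatic variable starts fresh on every iteration.  A minimal standalone sketch in plain C follows; the function names are made up for illustration and have nothing to do with ipa-cp.c.

```c
#include <assert.h>

/* Uses a block-scope static: SEEN is initialized exactly once and
   accumulates across iterations and across calls.  */
static int
count_with_static (int n)
{
  int last = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      static int seen = 0;
      seen++;
      last = seen;
    }
  return last;
}

/* Uses an automatic variable: SEEN is re-initialized on every
   iteration, so nothing carries over.  */
static int
count_with_automatic (int n)
{
  int last = 0;

  assert (n >= 0);
  for (int i = 0; i < n; i++)
    {
      int seen = 0;
      seen++;
      last = seen;
    }
  return last;
}
```

Calling count_with_static twice gives a different answer the second time, which is the kind of hidden state the patch removes.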

That might be an overkill because the main goal of
cgraph_edge_brings_all_agg_vals_for_node is to add the last edge within
SCCs of nodes propagating the same constant to each other and I could
not quickly come up with a testcase where the caller would be a
non-clone (and the function something other than pass-through, but I
suspect that could actually happen) but since the code is written we
might as well use it.  The function is written to bail out early before
actual value comparing and that is why the code is rarely executed, in
fact I found out that it is not covered by our testsuite (see
https://users.suse.com/~mliska/lcov/gcc/ipa-cp.c.gcov.html) and so the
patch also adds a testcase which does execute it.

The way vectors are passed around by value rather than by reference is
how I wrote this stuff shortly after conversion from our C VEC_ headers,
which were used in the same way.  I agree that a lot of code in
ipa-cp would benefit from transitioning to auto_vecs but that is
something for the next stage 1.

The patch has been bootstrapped and LTO-profile-bootstrapped on
x86-64-linux.  OK for trunk?

Thanks,

Martin


2019-12-17  Martin Jambor  

PR ipa/92971
* ipa-cp.c (cgraph_edge_brings_all_agg_vals_for_node): Fix
  definition of values, release memory on exit.

testsuite/
* gcc.dg/ipa/ipcp-agg-12.c: New test.
---
 gcc/ipa-cp.c   |  4 +-
 gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c | 53 ++
 2 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c

diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
index 1a80ccbde2d..243b064ee2c 100644
--- a/gcc/ipa-cp.c
+++ b/gcc/ipa-cp.c
@@ -5117,7 +5117,6 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
cgraph_edge *cs,
 
   for (i = 0; i < count; i++)
 {
-  static vec values = vNULL;
   class ipcp_param_lattices *plats;
   bool interesting = false;
   for (struct ipa_agg_replacement_value *av = aggval; av; av = av->next)
@@ -5133,7 +5132,7 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
cgraph_edge *cs,
   if (plats->aggs_bottom)
return false;
 
-  values = intersect_aggregates_with_edge (cs, i, values);
+  vec values = intersect_aggregates_with_edge (cs, i, 
vNULL);
   if (!values.exists ())
return false;
 
@@ -5157,6 +5156,7 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
cgraph_edge *cs,
return false;
  }
  }
+  values.release ();
 }
   return true;
 }
diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c 
b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
new file mode 100644
index 000..5c57913803e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
@@ -0,0 +1,53 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-ipa-sra -fdump-ipa-cp-details 
--param=ipa-cp-eval-threshold=2"  } */
+
+struct S
+{
+  int a, b, c;
+};
+
+int __attribute__((noinline)) foo (int i, struct S s);
+int __attribute__((noinline)) bar (int i, struct S s);
+int __attribute__((noinline)) baz (int i, struct S s);
+
+
+int __attribute__((noinline))
+bar (int i, struct S s)
+{
+  return baz (i, s);
+}
+
+int __attribute__((noinline))
+baz (int i, struct S s)
+{
+  return foo (i, s);
+}
+
+int __attribute__((noinline))
+foo (int i, struct S s)
+{
+  if (i == 2)
+return 0;
+  else
+return s.b * s.b + bar (i - 1, s);
+}
+
+volatile int g;
+
+void entry (void)
+{
+  struct S s;
+  s.b = 4;
+  g = bar (g, s)

Re: [PATCH][GCC][arm] Add CLI and multilib support for Armv8.1-M Mainline MVE extensions

2019-12-18 Thread Kyrill Tkachov

Hi Mihail,

On 11/8/19 4:52 PM, Mihail Ionescu wrote:

Hi,

This patch adds CLI and multilib support for Armv8.1-M MVE to the Arm
backend.
Two new options are added for v8.1-m.main: "+mve" for integer MVE
instructions only and "+mve.fp" for both integer and
single-precision/half-precision floating-point MVE.
The patch also maps the Armv8.1-M multilib variants to the
corresponding v8-M ones.
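
For readers skimming the thread, a rough standalone sketch of how the two options relate as feature sets.  This is an illustration only: the SK_* names and bit values are invented here, and the real implementation uses the ISA bitmap in arm-cpus.in and arm.h shown in the diff further down, not a plain bitmask.

```c
/* Hypothetical feature bits, invented for this sketch.  SK_BIT_DSP
   stands in for the armv7em feature and SK_BIT_FP for FPv5 + fp16.  */
enum sketch_isa_bit
{
  SK_BIT_DSP = 1 << 0,
  SK_BIT_FP = 1 << 1,
  SK_BIT_MVE = 1 << 2,
  SK_BIT_MVE_FLOAT = 1 << 3
};

/* "+mve" adds integer MVE (plus DSP); "+mve.fp" adds the same and
   additionally the floating-point bits.  */
#define SK_OPT_MVE    (SK_BIT_MVE | SK_BIT_DSP)
#define SK_OPT_MVE_FP (SK_BIT_MVE | SK_BIT_DSP | SK_BIT_FP | SK_BIT_MVE_FLOAT)

/* Analogue of TARGET_HAVE_MVE for the sketch bitmask.  */
static int
sketch_have_mve (unsigned isa)
{
  return (isa & SK_BIT_MVE) != 0;
}

/* Analogue of TARGET_HAVE_MVE_FLOAT for the sketch bitmask.  */
static int
sketch_have_mve_float (unsigned isa)
{
  return (isa & SK_BIT_MVE_FLOAT) != 0;
}
```

The point of the sketch is only that "+mve.fp" is a superset of "+mve": the integer test is true for both, the float test only for the former.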




gcc/ChangeLog:

2019-11-08  Mihail Ionescu  
2019-11-08  Andre Vieira 

    * config/arm/arm-cpus.in (mve, mve_float): New features.
    (dsp, mve, mve.fp): New options.
    * config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT): Define.

    * config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M.


gcc/testsuite/ChangeLog:

2019-11-08  Mihail Ionescu  
2019-11-08  Andre Vieira 

    * testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.


Is this ok for trunk?



This is ok, but please document the new options in invoke.texi.

Thanks,

Kyrill




Best regards,

Mihail


### Attachment also inlined for ease of reply ###



diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 59aad8f62ee5186cc87d3cefaf40ba2ce049012d..c2f016c75e2d8dd06890295321232bef61cbd234 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -194,6 +194,10 @@ define feature sb
 # v8-A architectures, added by default from v8.5-A
 define feature predres

+# M-profile Vector Extension feature bits
+define feature mve
+define feature mve_float
+
 # Feature groups.  Conventionally all (or mostly) upper case.
 # ALL_FPU lists all the feature bits associated with the floating-point
 # unit; these will all be removed if the floating-point unit is disabled
@@ -654,9 +658,12 @@ begin arch armv8.1-m.main
  base 8M_MAIN
  isa ARMv8_1m_main
 # fp => FPv5-sp-d16; fp.dp => FPv5-d16
+ option dsp add armv7em
  option fp add FPv5 fp16
  option fp.dp add FPv5 FP_DBL fp16
  option nofp remove ALL_FP
+ option mve add mve armv7em
+ option mve.fp add mve FPv5 fp16 mve_float armv7em
 end arch armv8.1-m.main

 begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 64c292f2862514fb600a4faeaddfeacb2b69180b..9ec38c6af1b84fc92e20e30e8f07ce5360a966c1 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -310,6 +310,12 @@ emission of floating point pcs attributes.  */
    instructions (most are floating-point related).  */
 #define TARGET_HAVE_FPCXT_CMSE  (arm_arch8_1m_main)

+#define TARGET_HAVE_MVE (bitmap_bit_p (arm_active_target.isa, \
+  isa_bit_mve))
+
+#define TARGET_HAVE_MVE_FLOAT (bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_mve_float))
+
 /* Nonzero if integer division instructions supported.  */
 #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
  || (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index 807e69eaf78625f422e2d7ef5936c5c80c5b9073..62e27fd284b21524896430176d64ff5b08c6e0ef 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -54,7 +54,7 @@ MULTILIB_REQUIRED += 
mthumb/march=armv8-m.main+fp.dp/mfloat-abi=softfp

 # Arch Matches
 MULTILIB_MATCHES    += march?armv6s-m=march?armv6-m

-# Map all v8-m.main+dsp FP variants down the the variant without DSP.
+# Map all v8-m.main+dsp FP variants down to the variant without DSP.
 MULTILIB_MATCHES    += march?armv8-m.main=march?armv8-m.main+dsp \
    $(foreach FP, +fp +fp.dp, \
march?armv8-m.main$(FP)=march?armv8-m.main+dsp$(FP))
@@ -66,3 +66,18 @@ MULTILIB_MATCHES += 
march?armv7e-m+fp=march?armv7e-m+fpv5
 MULTILIB_REUSE  += $(foreach ARCH, armv6s-m armv7-m armv7e-m 
armv8-m\.base armv8-m\.main, \

mthumb/march.$(ARCH)/mfloat-abi.soft=mthumb/march.$(ARCH)/mfloat-abi.softfp)

+# Map v8.1-M to v8-M.
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main+dsp
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main+mve
+
+v8_1m_sp_variants = +fp +dsp+fp +mve.fp
+v8_1m_dp_variants = +fp.dp +dsp+fp.dp +fp.dp+mve +fp.dp+mve.fp
+
+# Map all v8.1-m.main FP sp variants down to v8-m.
+MULTILIB_MATCHES += $(foreach FP, $(v8_1m_sp_variants), \
+ march?armv8-m.main+fp=march?armv8.1-m.main$(FP))
+
+# Map all v8.1-m.main FP dp variants down to v8-m.
+MULTILIB_MATCHES += $(foreach FP, $(v8_1m_dp_variants), \
+ march?armv8-m.main+fp.dp=march?armv8.1-m.main$(FP))
diff --git a/gcc/testsuite/gcc.target/arm/multilib.exp 
b/gcc/testsuite/gcc.target/arm/multilib.exp
index dcea829965eb15e372401e6389df5a1403393ecb..63cca118da2578253740fcd95421eae9ddf219bd 100644
--- a/gcc/testsuite/gcc.target/arm/multilib.exp
+++ b/gcc/testsuite/gcc.target/arm/multilib.exp
@@ -775,6 +775,27 @@ if {[multilib_config "rmprofile"] } {
 {-march=armv8-r+fp.sp -mfpu=auto -mfloat-abi=hard} 
"thumb/v7-r+fp.sp/ha

Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-18 Thread Segher Boessenkool
(Whoops, I missed replying to this one.  Sorry.)

On Tue, Dec 10, 2019 at 12:27:11PM -0600, Peter Bergner wrote:
> On 12/4/19 5:03 PM, Segher Boessenkool wrote:
> > On Wed, Dec 04, 2019 at 03:53:30PM -0600, Peter Bergner wrote:
> >> Right.  I'll come up with a patch and hopefully Iain and David can test
> >> on Darwin and AIX and I can test on Linux with --enable-decimal-float
> >> and --disable-decimal-float.  That should cover the major subtargets and
> >> if it works there, I'd expect it to work on the minor subtargets too.
> 
> Ok, how about the patch below?  If Iain and David could test this on Darwin
> and AIX respectively, that would be great.  I'm currently testing this on
> powerpc64le-linux, with and without --disable-decimal-float.
> 
> The pr92661.c test case is the DFP test case you wanted added to make sure
> we do not ICE, even when DFP is disabled.  The dfp-[dt]d*.c changes are
> due to me seeing them being run (and FAILing) on my --disable-decimal-float
> runs.  Clearly, they shouldn't be run when DFP is disabled.
> 
> All of the powerpc/dfp/* tests had powerpc*-*-* target tests, but that is
> covered by the dfp.exp target tests, so I removed them along with the
> now unneeded dg-skip-if AIX tests.  If dg-do compile is the default, do
> we want to just remove that whole line?
> 
> How is this looking?


> --- gcc/testsuite/gcc.target/powerpc/pr92661.c(nonexistent)
> +++ gcc/testsuite/gcc.target/powerpc/pr92661.c(working copy)
> @@ -0,0 +1,19 @@
> +/* { dg-do compile { target { powerpc*-*-* } } } */
> +/* { dg-options "-w -O2 -mdejagnu-cpu=power9" } */

You don't need that target clause in gcc.target/powerpc (and dg-do compile
is the default, but having it explicit is also fine of course).

> --- gcc/testsuite/gcc.target/powerpc/dfp-dd.c (revision 278980)
> +++ gcc/testsuite/gcc.target/powerpc/dfp-dd.c (working copy)
> @@ -1,6 +1,7 @@
>  /* Test generation of DFP instructions for POWER6.  */
>  /* Origin: Janis Johnson  */
> -/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target dfp_hw } */

You can remove powerpc_fprs now because it became redundant?  Cool.

But dfp_hw is the wrong condition for a dg-do compile test.


Nice cleanups!  Please fix that dfp_hw thing, and then, okay for trunk.
Thanks!


Segher


Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao
On 2019-12-18 14:19 +0100, Jan Hubicka wrote:
> > ICE here.
> > 
> > lto1: internal compiler error: tree check: expected identifier_node, have
> > function_decl in ultimate_transparent_alias_target, at varasm.c:1308
> > 0x6f9cfe tree_check_failed(tree_node const*, char const*, int, char const*,
> > ...)
> > ../../gcc/gcc/tree.c:9685
> > 0x714541 tree_check(tree_node*, char const*, int, char const*, tree_code)
> > ../../gcc/gcc/tree.h:3273
> > 0x714541 ultimate_transparent_alias_target
> > ../../gcc/gcc/varasm.c:1308
> > 0x714541 do_assemble_symver(tree_node*, tree_node*)
> > ../../gcc/gcc/varasm.c:5971
> 
> Interesting that it works for me, but indeed we can remove that call
> since there is no way to do weakref of symbol version.
> > >  #ifdef ASM_OUTPUT_SYMVER_DIRECTIVE
> > > -  ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> > > -IDENTIFIER_POINTER (target),
> > > -IDENTIFIER_POINTER (id));
> > > +  if (TREE_PUBLIC (target) && DECL_VISIBILITY (target) ==
> > > VISIBILITY_DEFAULT)
> > > +ASM_OUTPUT_SYMVER_DIRECTIVE (asm_out_file,
> > > +  IDENTIFIER_POINTER
> > > +(DECL_ASSEMBLER_NAME (target)),
> > > +  IDENTIFIER_POINTER (id));
> > > +  else
> > > +{
> > > +  int nameend;
> > > +  for (nameend = 0; IDENTIFIER_POINTER (id)[nameend] != '@';
> > > nameend++)
> > > + ;
> > > +  if (IDENTIFIER_POINTER (id)[nameend + 1] != '@'
> > > +   || IDENTIFIER_POINTER (id)[nameend + 2] == '@')
> > > + {
> > > +   sorry_at (DECL_SOURCE_LOCATION (target),
> > > + "can not produce % of a symbol that is "
> > > + "not exported with default visibility");
> > > +   return;
> > 
> > I think this does not make sense.  Some library authors may export "foo@VER_
> > 1"
> > but not "foo_v1" to ensure the programmers using the library upgrade their
> > code
> > to use new "correct" ABI, instead of an old one.   This error makes it
> > impossible.
> > 
> > (Try to comment out "foo_v1" in version.map, in the testcase.)
> 
> The problem here is that we lie to the compiler (by pretending that
> foo_v2 is exported from DSO while it is not) and force user to do the
> same.
> 
> We support two ways to hide symbol - either at compile time via
> attribute((visibility("hidden"))) or at link-time via map file.  The
> first produces better code because compiler can do more optimizations
> knowing that the symbol can not be interposed.
> 
> Generally we want users to use visibility attribute or -fvisibility since
> it leads to better code. However now we tell users to use
> attribute((symver("..."))) to produce symbol version, but at the same
> time not use attribute((visibility("hidden"))).

Could a symver symbol be interposed?  I'll do some test to see.

> > > +  memcpy (buf, IDENTIFIER_POINTER (id), nameend + 2);
> > > +  buf[nameend + 2] = '@';
> > > +  strcpy (buf + nameend + 3, IDENTIFIER_POINTER (id) + nameend + 2);
> > 
> > We can't replace a single "@" with "@@@".  So I think producing .LSYMVERx is
> > not
> > an option for "old" versions like "foo@VER_1".
> 
> I wonder why gas implements the .symver this way in the first place. Does
> the linker really need the global symbol foo_v1 to produce the
> version (in addition to foo@VER_1 that is in symbol table as well)?

I don't think the global symbol foo_v1 is necessary.  But I can't find a way
telling gas to make foo@VER_1 global and foo_v1 local.
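
To spell out the '@' handling being discussed, here is a small standalone sketch.  It assumes only what the quoted hunk shows: a version string is either "name@VERSION" (old, non-default) or "name@@VERSION" (default), and only the latter can be rewritten to the "@@@" form.  The sketch_* function names are made up and this is not the varasm.c code.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 for "name@@VERSION" (default version), 0 for
   "name@VERSION" (non-default version), and -1 for anything else.  */
static int
sketch_symver_is_default (const char *ver)
{
  const char *at = strchr (ver, '@');

  if (at == NULL || at == ver)
    return -1;
  if (at[1] != '@')
    return at[1] != '\0' ? 0 : -1;
  if (at[2] == '@' || at[2] == '\0')
    return -1;
  return 1;
}

/* Rewrite "name@@VERSION" into "name@@@VERSION" in BUF.  Return 0 on
   success, -1 if VER is not a default version or BUF is too small.  */
static int
sketch_symver_triple_at (const char *ver, char *buf, size_t buflen)
{
  size_t nameend;

  if (sketch_symver_is_default (ver) != 1)
    return -1;
  /* One extra '@' plus the terminating NUL.  */
  if (strlen (ver) + 2 > buflen)
    return -1;

  nameend = (size_t) (strchr (ver, '@') - ver);
  memcpy (buf, ver, nameend + 2);
  buf[nameend + 2] = '@';
  strcpy (buf + nameend + 3, ver + nameend + 2);
  return 0;
}
```

The second function refuses "foo@VER_1", which is the case I am worried about above: there is no triple-'@' spelling for a non-default version.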

> > > + /* Symbol versions are always used externally, but linker does not
> > > +report that correctly.  */
> > > + else if (snode->symver && *res == LDPR_PREVAILING_DEF_IRONLY)
> > > +   snode->resolution = LDPR_PREVAILING_DEF_IRONLY_EXP;
> > 
> > This is absolutely correct.
> 
> Good, I will go ahead with filling in binutils PR on the wrong LDPR
> value and apply the hack.
> > >   else
> > > snode->resolution = *res;
> > >}
> > 
> > I still believe we should consider symver targets to be externally visible
> > in
> > cgraph_externally_visible_p.  There is a comment saying "if linker counts on
> > us,
> > we must preserve the function".  That's true in our case.
> > 
> > And, I think
> > 
> > .globl  .LSYMVER0
> > .set.LSYMVER0, foo_v2
> > .symver .LSYMVER0, foo@@VERS_2
> I produce
>   .symver .LSYMVER0, foo@@@VERS_2
> 
> > is exactly same as
> > 
> > .globl  foo_v2
> > .symver foo_v2, foo@@VERS_2
> > 
> > except there is an unnecessary ".LSYMVER0".  Adding ".globl foo_v2" or
> > ".globl
> > foo_v1" won't cause them to be "global" in the final DSO because the linker
> > will
> > hide them according to the version script.
> 
> The difference is that in first case compiler can fully control foo_v2
> symbol (with LTO it will turn it into static symbol, it will inline
> calls to it and do other things), while in the second case we need to
> treat foo_v2 specially.
> > So if it's safe we can force a ".globl foo_v2" before ".symver foo_v2,
>

Re: [PATCH] Fix symver attribute with LTO

2019-12-18 Thread Xi Ruoyao
On 2019-12-18 14:19 +0100, Jan Hubicka wrote:
> The problem here is that we lie to the compiler (by pretending that
> foo_v2 is exported from DSO while it is not) and force user to do the
> same.
> 
> We support two ways to hide symbol - either at compile time via
> attribute((visibility("hidden"))) or at link-time via map file.  The
> first produces better code because compiler can do more optimizations
> knowing that the symbol can not be interposed.

I just got your point: if the library calls foo_v2 it won't be interposed.  If
it expects a call to be interposable it should call foo() [foo@@VER_2] instead of
foo_v2().

But it seems there is no way we can do this [even with traditional
__asm__(".symver foo, foo@@VER_2")].  For this purpose we should either:

1. Change GAS (introducing some new syntax like '' or '.symver_export')

or

2. Add some mangled symbol name in GCC (like ".LSYMVERx" or
"foo_v2.symver_export").
-- 
Xi Ruoyao 
School of Aerospace Science and Technology, Xidian University



[PATCH] analyzer: remove __analyzer builtins

2019-12-18 Thread David Malcolm
On Fri, 2019-12-13 at 13:31 -0500, David Malcolm wrote:
> On Fri, 2019-12-13 at 19:27 +0100, Jakub Jelinek wrote:
> > On Fri, Dec 13, 2019 at 01:11:05PM -0500, David Malcolm wrote:
> > > gcc/ChangeLog:
> > >   * builtins.def (BUILT_IN_ANALYZER_BREAK): New builtin.
> > >   (BUILT_IN_ANALYZER_DUMP): New builtin.
> > >   (BUILT_IN_ANALYZER_DUMP_EXPLODED_NODES): New builtin.
> > >   (BUILT_IN_ANALYZER_DUMP_NUM_HEAP_REGIONS): New builtin.
> > >   (BUILT_IN_ANALYZER_DUMP_PATH): New builtin.
> > >   (BUILT_IN_ANALYZER_DUMP_REGION_MODEL): New builtin.
> > >   (BUILT_IN_ANALYZER_EVAL): New builtin.
> > 
> > Is it a good idea to add further builtins without __builtin_
> > prefix (unless required for interoperability etc.)?
> 
> I think I can do all of these with just string matching on the fndecl
> names; would that be preferable to having them as builtins?

The patch that added __analyzer_* builtins to builtins.def isn't needed:
the functions are only used during DejaGnu testing, and only for
comparison by name during compile-only tests - they're never actually
defined.

This patch eliminates the builtins in favor of a header file in the
DejaGnu testsuite.
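
A rough sketch of what "comparison by name" could look like, in plain C.  This is an assumption-laden illustration: sketch_is_analyzer_special_name is a made-up helper, not the function the analyzer actually uses, and the only names taken from this thread are __analyzer_eval and __analyzer_dump_region_model.

```c
#include <stddef.h>
#include <string.h>

/* Return nonzero if NAME looks like one of the special functions used
   for debugging the analyzer, judged purely by its "__analyzer_"
   prefix.  */
static int
sketch_is_analyzer_special_name (const char *name)
{
  static const char prefix[] = "__analyzer_";

  if (name == NULL)
    return 0;
  return strncmp (name, prefix, sizeof prefix - 1) == 0;
}
```

Since the functions are never defined and only matter in compile-only tests, a plain declaration in a testsuite header plus a name check like this is all that is needed.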

Jakub: do you prefer this approach? (eliminating the builtins in favor of
"magic" function names used only in the analyzer DejaGnu tests)

Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu; 
pushed to branch dmalcolm/analyzer on the GCC git mirror.

Dave


gcc/ChangeLog:
* builtins.def: Delete the analyzer builtins.
* doc/analyzer.texi (Builtins for Debugging the Analyzer): Rename
to...
(Special Functions for Debugging the Analyzer): ...this.  Add a
leading paragraph.  Document __analyzer_dump_region_model and
__analyzer_eval.

gcc/testsuite/ChangeLog:
* gcc.dg/analyzer/abort.c: Include "analyzer-decls.h".
* gcc.dg/analyzer/analyzer-decls.h: New header.
* gcc.dg/analyzer/conditionals-2.c: Include "analyzer-decls.h".
* gcc.dg/analyzer/conditionals-3.c: Likewise.
* gcc.dg/analyzer/conditionals-notrans.c: Likewise.
* gcc.dg/analyzer/conditionals-trans.c: Likewise.
* gcc.dg/analyzer/data-model-1.c: Likewise.
* gcc.dg/analyzer/data-model-16.c: Likewise.
* gcc.dg/analyzer/data-model-18.c: Likewise.
* gcc.dg/analyzer/data-model-5d.c: Likewise.
* gcc.dg/analyzer/data-model-6.c: Likewise.
* gcc.dg/analyzer/data-model-7.c: Likewise.
* gcc.dg/analyzer/data-model-8.c: Likewise.
* gcc.dg/analyzer/data-model-9.c: Likewise.
* gcc.dg/analyzer/equivalence.c: Likewise.
* gcc.dg/analyzer/function-ptr-2.c: Likewise.
* gcc.dg/analyzer/loop-2.c: Likewise.
* gcc.dg/analyzer/loop-2a.c: Likewise.
* gcc.dg/analyzer/loop-4.c: Likewise.
* gcc.dg/analyzer/loop.c: Likewise.
* gcc.dg/analyzer/malloc-paths-10.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-1a.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-1b.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-2.c: Likewise.
* gcc.dg/analyzer/malloc-vs-local-3.c: Likewise.
* gcc.dg/analyzer/operations.c: Likewise.
* gcc.dg/analyzer/params-2.c: Likewise.
* gcc.dg/analyzer/params.c: Likewise.
* gcc.dg/analyzer/paths-1.c: Likewise.
* gcc.dg/analyzer/paths-1a.c: Likewise.
* gcc.dg/analyzer/paths-2.c: Likewise.
* gcc.dg/analyzer/paths-3.c: Likewise.
* gcc.dg/analyzer/paths-4.c: Likewise.
* gcc.dg/analyzer/paths-5.c: Likewise.
* gcc.dg/analyzer/paths-6.c: Likewise.
* gcc.dg/analyzer/paths-7.c: Likewise.
* gcc.dg/analyzer/setjmp-2.c: Likewise.
* gcc.dg/analyzer/setjmp-3.c: Likewise.
* gcc.dg/analyzer/setjmp-4.c: Likewise.
* gcc.dg/analyzer/setjmp-5.c: Likewise.
* gcc.dg/analyzer/setjmp-8.c: Likewise.
* gcc.dg/analyzer/setjmp-9.c: Likewise.
* gcc.dg/analyzer/switch.c: Likewise.
* gcc.dg/analyzer/zlib-1.c: Likewise.
* gcc.dg/analyzer/zlib-5.c: Likewise.
---
 gcc/builtins.def  | 33 -
 gcc/doc/analyzer.texi | 19 +-
 gcc/testsuite/gcc.dg/analyzer/abort.c |  1 +
 .../gcc.dg/analyzer/analyzer-decls.h  | 36 +++
 .../gcc.dg/analyzer/conditionals-2.c  |  1 +
 .../gcc.dg/analyzer/conditionals-3.c  |  2 ++
 .../gcc.dg/analyzer/conditionals-notrans.c|  1 +
 .../gcc.dg/analyzer/conditionals-trans.c  |  1 +
 gcc/testsuite/gcc.dg/analyzer/data-model-1.c  |  1 +
 gcc/testsuite/gcc.dg/analyzer/data-model-16.c |  2 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-18.c |  2 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-5d.c |  1 +
 gcc/testsuite/gcc.dg/analyzer/data-model-6.c  |  1 +
 gcc/testsuite/gcc.dg/analyzer/data-model-7.c  |  1 +
 gcc/testsuite/gcc.dg/analyzer/data-model-8.c  |  2 ++
 gcc/testsuite/gcc.dg/analyzer/data-model-9.c  |

Re: [PATCH] analyzer: remove __analyzer builtins

2019-12-18 Thread Jakub Jelinek
On Wed, Dec 18, 2019 at 09:36:55AM -0500, David Malcolm wrote:
> This patch eliminates the builtins in favor of a header file in the
> DejaGnu testsuite.
> 
> Jakub: do you prefer this approach? (eliminating the builtins in favor of
> "magic" function names used only in the analyzer DejaGnu tests)

Yeah, certainly.  Thanks.

Jakub



Re: [PATCH] IPA-CP: Remove bogus static keyword (PR 92971)

2019-12-18 Thread Jan Hubicka
> 
> the leak is indeed a problem, thanks for spotting it.  But apart from
> that, I really wanted to pass vNULL to intersect_aggregates_with_edge,
> and the patch below does it explicitly to make it clear, because while
> the function can also do intersecting, its actual task here is to carry
> over aggregate constants from all types of callers (clones and
> non-clones) and all kinds of supported jump functions.
> 
> That might be an overkill because the main goal of
> cgraph_edge_brings_all_agg_vals_for_node is to add the last edge within
> SCCs of nodes propagating the same constant to each other and I could
> not quickly come up with a testcase where the caller would be a
> non-clone (and the function something other than pass-through, but I
> suspect that could actually happen) but since the code is written we
> might as well use it.  The function is written to bail out early before
> actual value comparing and that is why the code is rarely executed, in
> fact I found out that it is not covered by our testsuite (see
> https://users.suse.com/~mliska/lcov/gcc/ipa-cp.c.gcov.html) and so the
> patch also adds a testcase which does execute it.
> 
> The way vectors are passed around by value rather than by reference is
> how I wrote this stuff shortly after conversion from our C VEC_ headers,
> which were used in the same way.  I agree that a lot of code in
> ipa-cp would benefit from transitioning to auto_vecs but that is
> something for the next stage 1.
> 
> The patch has been bootstrapped and LTO-profile-bootstrapped on
> x86-64-linux.  OK for trunk?
> 
> Thanks,
> 
> Martin
> 
> 
> 2019-12-17  Martin Jambor  
> 
>   PR ipa/92971
>   * ipa-cp.c (cgraph_edge_brings_all_agg_vals_for_node): Fix
>   definition of values, release memory on exit.
> 
>   testsuite/
>   * gcc.dg/ipa/ipcp-agg-12.c: New test.
> ---
>  gcc/ipa-cp.c   |  4 +-
>  gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c | 53 ++
>  2 files changed, 55 insertions(+), 2 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
> 
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 1a80ccbde2d..243b064ee2c 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -5117,7 +5117,6 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
> cgraph_edge *cs,
>  
>for (i = 0; i < count; i++)
>  {
> -  static vec values = vNULL;
>class ipcp_param_lattices *plats;
>bool interesting = false;
>for (struct ipa_agg_replacement_value *av = aggval; av; av = av->next)
> @@ -5133,7 +5132,7 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
> cgraph_edge *cs,
>if (plats->aggs_bottom)
>   return false;
>  
> -  values = intersect_aggregates_with_edge (cs, i, values);
> +  vec values = intersect_aggregates_with_edge (cs, i, 
> vNULL);
>if (!values.exists ())
>   return false;
>  
> @@ -5157,6 +5156,7 @@ cgraph_edge_brings_all_agg_vals_for_node (struct 
> cgraph_edge *cs,
>   return false;
> }
> }
> +  values.release ();
Generally it seems to me that things would be more readable and leak-safe if
we used auto_vecs and passed them as function arguments to be filled in.

But since the same constructs are used elsewhere in ipa-cp/prop, the patch is
OK.

Honza
>  }
>return true;
>  }
> diff --git a/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c 
> b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
> new file mode 100644
> index 000..5c57913803e
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/ipa/ipcp-agg-12.c
> @@ -0,0 +1,53 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O3 -fno-ipa-sra -fdump-ipa-cp-details 
> --param=ipa-cp-eval-threshold=2"  } */
> +
> +struct S
> +{
> +  int a, b, c;
> +};
> +
> +int __attribute__((noinline)) foo (int i, struct S s);
> +int __attribute__((noinline)) bar (int i, struct S s);
> +int __attribute__((noinline)) baz (int i, struct S s);
> +
> +
> +int __attribute__((noinline))
> +bar (int i, struct S s)
> +{
> +  return baz (i, s);
> +}
> +
> +int __attribute__((noinline))
> +baz (int i, struct S s)
> +{
> +  return foo (i, s);
> +}
> +
> +int __attribute__((noinline))
> +foo (int i, struct S s)
> +{
> +  if (i == 2)
> +return 0;
> +  else
> +return s.b * s.b + bar (i - 1, s);
> +}
> +
> +volatile int g;
> +
> +void entry (void)
> +{
> +  struct S s;
> +  s.b = 4;
> +  g = bar (g, s);
> +}
> +
> +
> +void entry2 (void)
> +{
> +  struct S s;
> +  s.b = 6;
> +  g = baz (g, s);
> +}
> +
> +
> +/* { dg-final { scan-ipa-dump-times "adding an extra caller" 2 "cp" } } */
> -- 
> 2.24.0
> 
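
On the auto_vec remark in the review above: the following is a minimal sketch, not GCC code, of why a scoped owner is more leak-safe than a manual release () call when a loop body has early returns. The names `scoped_values`, `live_buffers` and `all_values_present` are hypothetical and exist only for this illustration; the real auto_vec in vec.h has a different interface.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical counter of buffers currently allocated; it exists only so
// that a leak would be observable in this sketch.
static int live_buffers = 0;

// Hypothetical scoped owner (not GCC's auto_vec): it frees its buffer when
// it goes out of scope, including on an early return.
class scoped_values
{
public:
  explicit scoped_values (std::size_t n)
    : m_data (new int[n] ()), m_len (n)
  {
    ++live_buffers;
  }

  ~scoped_values ()
  {
    assert (live_buffers > 0);
    delete[] m_data;
    --live_buffers;
  }

  int &operator[] (std::size_t i) { return m_data[i]; }
  std::size_t length () const { return m_len; }

private:
  // Copying is disabled the C++98 way: declared but never defined.
  scoped_values (const scoped_values &);
  scoped_values &operator= (const scoped_values &);

  int *m_data;
  std::size_t m_len;
};

// Returns false as soon as one iteration sees a zero entry.  The buffer of
// that iteration is still released, because the destructor runs on return.
static bool
all_values_present (int count)
{
  for (int i = 0; i < count; i++)
    {
      scoped_values values (4);
      for (std::size_t j = 0; j < values.length (); j++)
        values[j] = (i == 2) ? 0 : 1;   // iteration 2 simulates a miss
      for (std::size_t j = 0; j < values.length (); j++)
        if (values[j] == 0)
          return false;                 // early return, no explicit release
    }
  return true;
}
```

With a manually released buffer, every `return false` inside the loop would need its own release call; here the destructor covers all exits.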


Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread Jan Hubicka
> The size_info of ipa_size_summary are created by r277424.  It should be
> duplicated for cloned nodes, otherwise self_size and estimated_self_stack_size
> would be 0, making the large-function-insns and large-function-growth
> params work inaccurately during ipa-inline.
> 
> gcc/ChangeLog:
> 
>   2019-12-18  Luo Xiong Hu  
> 
>   * ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Copy
>   ipa_size_summary for cloned nodes.
> ---
>  gcc/ipa-fnsummary.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index a46b1445765..9a01be1708b 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -868,7 +868,12 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
>   }
>  }
>if (!dst->inlined_to)
> +  {
> +class ipa_size_summary *src_size = ipa_size_summaries->get_create (src);
> +class ipa_size_summary *dst_size = ipa_size_summaries->get_create (dst);
> +*dst_size = *src_size;
>  ipa_update_overall_fn_summary (dst);
> +  }

Thanks for spotting this!  It is quite a bad bug.
The summaries are supposed to be copied by the duplicate method.  However, it
seems that the default duplicate implementation doesn't do the copy (I
wonder why) and moreover the copy constructor is broken: it does not copy
the stack use correctly.  I think we are fine with the default copy
constructor, as follows.

Does it fix your testcase?

Index: ipa-fnsummary.h
===
--- ipa-fnsummary.h (revision 279523)
+++ ipa-fnsummary.h (working copy)
@@ -99,11 +99,6 @@ public:
   : estimated_self_stack_size (0), self_size (0), size (0)
   {
   }
-  /* Copy constructor.  */
-  ipa_size_summary (const ipa_size_summary &s)
-  : estimated_self_stack_size (0), self_size (s.self_size), size (s.size)
-  {
-  }
 };
 
 /* Function inlining information.  */
@@ -226,18 +221,20 @@ extern GTY(()) fast_function_summary 
+  public fast_function_summary 
 {
 public:
   ipa_size_summary_t (symbol_table *symtab):
-fast_function_summary  (symtab) {}
+fast_function_summary  (symtab)
+  {
+disable_insertion_hook ();
+  }
 
-  static ipa_size_summary_t *create_ggc (symbol_table *symtab)
+  virtual void duplicate (cgraph_node *, cgraph_node *,
+ ipa_size_summary *src_data,
+ ipa_size_summary *dst_data)
   {
-class ipa_size_summary_t *summary = new (ggc_alloc  ())
-  ipa_size_summary_t (symtab);
-summary->disable_insertion_hook ();
-return summary;
+*dst_data = *src_data;
   }
 };
 extern fast_function_summary 
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 279523)
+++ ipa-fnsummary.c (working copy)
@@ -672,8 +672,7 @@ static void
 ipa_fn_summary_alloc (void)
 {
   gcc_checking_assert (!ipa_fn_summaries);
-  ipa_size_summaries = new fast_function_summary 
-(symtab);
+  ipa_size_summaries = new ipa_size_summary_t (symtab);
   ipa_fn_summaries = ipa_fn_summary_t::create_ggc (symtab);
   ipa_call_summaries = new ipa_call_summary_t (symtab);
 }
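
To illustrate the copy-constructor problem described above, here is a minimal standalone sketch. The two structs are hypothetical stand-ins, not the real ipa_size_summary; the first carries a copy constructor shaped like the one the patch removes, the second relies on the compiler-generated copy.

```cpp
#include <cassert>

// Hypothetical stand-in whose copy constructor mirrors the removed one:
// the stack size is reset to 0 instead of being copied.
struct size_summary_buggy
{
  int estimated_self_stack_size;
  int self_size;
  int size;

  size_summary_buggy ()
    : estimated_self_stack_size (0), self_size (0), size (0)
  {
  }

  size_summary_buggy (const size_summary_buggy &s)
    : estimated_self_stack_size (0), self_size (s.self_size), size (s.size)
  {
  }
};

// Hypothetical stand-in with no user-provided copy constructor, so the
// implicit member-wise copy is used.
struct size_summary_fixed
{
  int estimated_self_stack_size;
  int self_size;
  int size;

  size_summary_fixed ()
    : estimated_self_stack_size (0), self_size (0), size (0)
  {
  }
};

// Copies a summary of type T and reports the stack size seen in the copy.
template <typename T>
static int
stack_size_after_copy (int stack_size)
{
  T src;
  src.estimated_self_stack_size = stack_size;
  T dst (src);
  return dst.estimated_self_stack_size;
}
```

Under these assumptions, `stack_size_after_copy<size_summary_buggy> (64)` loses the value while the `size_summary_fixed` variant keeps it, which is the behaviour the default copy is expected to restore.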


Re: [PATCH] rs6000: Fix 2 for PR92661, Do not define builtins that overload disabled builtins

2019-12-18 Thread Peter Bergner
On 12/18/19 8:15 AM, Segher Boessenkool wrote:
>> +/* { dg-do compile { target { powerpc*-*-* } } } */
>> +/* { dg-options "-w -O2 -mdejagnu-cpu=power9" } */
> 
> You don't need that target clause in gcc.target/powerpc (and dg-do compile
> is the default, but having it explicit is also fine of course).

I think leaving the bare dg-do compile (ie, no target) is nice,
for newbies who don't know that dg-do compile is the default.



>> --- gcc/testsuite/gcc.target/powerpc/dfp-dd.c(revision 278980)
>> +++ gcc/testsuite/gcc.target/powerpc/dfp-dd.c(working copy)
>> @@ -1,6 +1,7 @@
>>  /* Test generation of DFP instructions for POWER6.  */
>>  /* Origin: Janis Johnson  */
>> -/* { dg-do compile { target { powerpc*-*-linux* && powerpc_fprs } } } */
>> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
>> +/* { dg-require-effective-target dfp_hw } */
> 
> You can remove powerpc_fprs now because it became redundant?  Cool.

Right, hard dfp support requires we have hard float support.


> But dfp_hw is the wrong condition for a dg-do compile test.

Ok, yes.  Looking closer, that dfp_hw is a runtime test and not
what we want.  I'll change this to using "hard_dfp" which is a
compile time test.



> Nice cleanups!  Please fix that dfp_hw thing, and then, okay for trunk,
> Thanks!

Will do, thanks.  I'll commit this after making these changes and
rerunning the updated test cases.

Peter



[patch][avr] PR92606: Disable -fipa-icf-variables because it generates wrong code.

2019-12-18 Thread Georg-Johann Lay
Hi, this patch turns off -fipa-icf-variables because it generates wrong 
code, as in PR92606.  As there is no target hook that could decide 
whether such optimizations are obsolete, disable such optimizations 
altogether until PR92932 (target hook to disable such optimizations 
depending on object attributes and address-space) is available.


Ok to apply?

Johann


Work around PR ipa/92932 by disabling -fipa-icf-variables until
PR92932 has been solved.

PR ipa/92932
PR target/92606
* common/config/avr/avr-common.c (avr_option_optimization_table)
<-fipa-icf-variables>: Disable.
Index: common/config/avr/avr-common.c
===
--- common/config/avr/avr-common.c	(revision 279522)
+++ common/config/avr/avr-common.c	(working copy)
@@ -38,6 +38,14 @@ static const struct default_options avr_
 { OPT_LEVELS_ALL, OPT_fcaller_saves, NULL, 0 },
 { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_mgas_isr_prologues, NULL, 1 },
 { OPT_LEVELS_1_PLUS, OPT_mmain_is_OS_task, NULL, 1 },
+	// FIXME: IPA incorrectly identifies variables in .progmem.data (accessed
+	// via LPM) with variables in .rodata (accessed via LD, LDD, LDS) like
+	// in PR92606.  As there is no target hook to disable such optimizations
+	// depending on target attributes and / or address-spaces of the involved
+	// objects (filed as PR92932), ditch such malicious optimizations now until
+	// PR92932 is implemented and we can use that target hook to solve PR92606
+	// properly.
+{ OPT_LEVELS_ALL, OPT_fipa_icf_variables, NULL, 0 },
 { OPT_LEVELS_NONE, 0, NULL, 0 }
   };
 


[patch][avr] New option -nodevicespecs to omit -specs=... in self specs.

2019-12-18 Thread Georg-Johann Lay
Hi, currently device support in avr-gcc is accomplished by injecting a 
specs file by means of -specs=... in the driver self specs.


This patch adds a new avr driver option to omit the addition of the 
respective -specs option, to give the user more freedom.


Ok to apply?

Johann

* config/avr/avr.opt (-nodevicespecs): New driver option.
* config/avr/driver-avr.c (avr_devicespecs_file): Only issue
"-specs=device-specs/..." if that option is not set.
* doc/invoke.texi (AVR Options) <-nodevicespecs>: Document.
Index: config/avr/avr.opt
===
--- config/avr/avr.opt	(revision 279522)
+++ config/avr/avr.opt	(working copy)
@@ -118,3 +118,7 @@ Assume that all data in static storage c
 nodevicelib
 Driver Target Report RejectNegative
 Do not link against the device-specific library lib.a.
+
+nodevicespecs
+Driver Target Report RejectNegative
+Do not use the device-specific specs file device-specs/specs-.
Index: config/avr/driver-avr.c
===
--- config/avr/driver-avr.c	(revision 279522)
+++ config/avr/driver-avr.c	(working copy)
@@ -26,8 +26,8 @@ along with GCC; see the file COPYING3.
 #include "diagnostic.h"
 #include "tm.h"
 
-// Remove -nodevicelib from the command line if not needed
-#define X_NODEVLIB "%

Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread Jan Hubicka
> The size_info of ipa_size_summary are created by r277424.  It should be
> duplicated for cloned nodes, otherwise self_size and estimated_self_stack_size
> would be 0, making the large-function-insns and large-function-growth
> params work inaccurately during ipa-inline.
> 
> gcc/ChangeLog:
> 
>   2019-12-18  Luo Xiong Hu  
> 
>   * ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Copy
>   ipa_size_summary for cloned nodes.
> ---
>  gcc/ipa-fnsummary.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
> index a46b1445765..9a01be1708b 100644
> --- a/gcc/ipa-fnsummary.c
> +++ b/gcc/ipa-fnsummary.c
> @@ -868,7 +868,12 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
>   }
>  }
>if (!dst->inlined_to)
> +  {
> +class ipa_size_summary *src_size = ipa_size_summaries->get_create (src);
> +class ipa_size_summary *dst_size = ipa_size_summaries->get_create (dst);

This is intended to happen by the default duplicate method of
ipa_size_summaries via the copy constructor.  It seems there is a stupid
pasto and the copy constructor is unused, since the default duplicate
implementation does nothing (I wonder why).

I am testing the attached patch.  Does this help? 

Index: ipa-fnsummary.h
===
--- ipa-fnsummary.h (revision 279523)
+++ ipa-fnsummary.h (working copy)
@@ -99,11 +99,6 @@ public:
   : estimated_self_stack_size (0), self_size (0), size (0)
   {
   }
-  /* Copy constructor.  */
-  ipa_size_summary (const ipa_size_summary &s)
-  : estimated_self_stack_size (0), self_size (s.self_size), size (s.size)
-  {
-  }
 };
 
 /* Function inlining information.  */
@@ -226,18 +221,20 @@ extern GTY(()) fast_function_summary 
+  public fast_function_summary 
 {
 public:
   ipa_size_summary_t (symbol_table *symtab):
-fast_function_summary  (symtab) {}
+fast_function_summary  (symtab)
+  {
+disable_insertion_hook ();
+  }
 
-  static ipa_size_summary_t *create_ggc (symbol_table *symtab)
+  virtual void duplicate (cgraph_node *, cgraph_node *,
+ ipa_size_summary *src_data,
+ ipa_size_summary *dst_data)
   {
-class ipa_size_summary_t *summary = new (ggc_alloc  ())
-  ipa_size_summary_t (symtab);
-summary->disable_insertion_hook ();
-return summary;
+*dst_data = *src_data;
   }
 };
 extern fast_function_summary 
Index: ipa-fnsummary.c
===
--- ipa-fnsummary.c (revision 279523)
+++ ipa-fnsummary.c (working copy)
@@ -672,8 +672,7 @@ static void
 ipa_fn_summary_alloc (void)
 {
   gcc_checking_assert (!ipa_fn_summaries);
-  ipa_size_summaries = new fast_function_summary 
-(symtab);
+  ipa_size_summaries = new ipa_size_summary_t (symtab);
   ipa_fn_summaries = ipa_fn_summary_t::create_ggc (symtab);
   ipa_call_summaries = new ipa_call_summary_t (symtab);
 }


Re: [PATCH] Handle aggregate pass-through for self-recursive call (PR ipa/92794)

2019-12-18 Thread Martin Jambor
Hi,

On Tue, Dec 17 2019, Feng Xue OS wrote:
> If argument for a self-recursive call is a simple pass-through, the call
> edge is also considered as source of any value originated from
> non-recursive call to the function. Scalar pass-through and full aggregate
> pass-through due to pointer pass-through have also been handled.
> But we missed another kind of pass-through like below case,  partial
> aggregate pass-through. This patch is meant to fix the problem which
> caused ICE as in 92794.
>
>   void foo(struct T *val_ptr)
>   {
> struct T new_val;
> new_val.field = val_ptr->field;
> foo (&temp);
> ...
>   }
>
> Bootstrapped/regtested on x86_64-linux and aarch64-linux.
>
> 2019-12-17  Feng Xue  
>
> PR ipa/92794
> * ipa-cp.c (self_recursive_agg_pass_through_p): New function.
> (intersect_with_plats): Use error_mark_node as place holder
> when aggregate jump function is simple pass-through for
> self-recursive call.
> (intersect_with_agg_replacements): Likewise.
> (intersect_aggregates_with_edge): Likewise.
> (find_aggregate_values_for_callers_subset): Likewise.
>
> Thanks,
> Feng
> From 42ba553ebf80eadb62619c5570a4b406f8c90c49 Mon Sep 17 00:00:00 2001
> From: Feng Xue 
> Date: Mon, 16 Dec 2019 20:33:36 +0800
> Subject: [PATCH] Handle aggregate simple pass-through for self-recursive call
>
> ---
>  gcc/ipa-cp.c   | 97 +-
>  gcc/testsuite/gcc.dg/ipa/pr92794.c | 30 +
>  2 files changed, 111 insertions(+), 16 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92794.c
>
> diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> index 1a80ccbde2d..0e17fedd649 100644
> --- a/gcc/ipa-cp.c
> +++ b/gcc/ipa-cp.c
> @@ -4564,6 +4564,23 @@ self_recursive_pass_through_p (cgraph_edge *cs, 
> ipa_jump_func *jfunc, int i)
>return false;
>  }
>  
> +/* Return true, if JFUNC, which describes a part of an aggregate represented
> +   or pointed to by the i-th parameter of call CS, is a simple no-operation
> +   pass-through function to itself.  */
> +
> +static bool
> +self_recursive_agg_pass_through_p (cgraph_edge *cs, ipa_agg_jf_item *jfunc,
> +int i)
> +{
> +  if (cs->caller == cs->callee->function_symbol ()

I don't know if self-recursive calls can be interposed at all, if yes
you need to add the availability check like we have in
self_recursive_pass_through_p (if not, we should probably remove it
there).

Apart from that, I believe the patch is probably the best thing we can
do to deal with these interesting situations.

Thanks for looking into the bug,

Martin


> +  && jfunc->jftype == IPA_JF_LOAD_AGG
> +  && jfunc->offset == jfunc->value.load_agg.offset
> +  && jfunc->value.pass_through.operation == NOP_EXPR
> +  && jfunc->value.pass_through.formal_id == i)
> +return true;
> +  return false;
> +}
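
For readers unfamiliar with the term, the following is a minimal runnable sketch of a partial aggregate pass-through in a self-recursive call, modelled on the snippet quoted in the patch description. It assumes the quoted `foo (&temp)` was meant to pass the freshly built `new_val`. The `depth` member and the `call_foo` helper are hypothetical additions so that the recursion terminates and can be exercised.

```cpp
#include <cassert>

struct T
{
  int field;   // copied unchanged into every recursive call
  int depth;   // hypothetical counter, present only to end the recursion
};

static int
foo (const T *val_ptr)
{
  T new_val;
  new_val.field = val_ptr->field;       // partial aggregate pass-through
  new_val.depth = val_ptr->depth - 1;   // this member is not passed through
  if (new_val.depth <= 0)
    return new_val.field;
  return foo (&new_val);                // self-recursive call on a new aggregate
}

// Hypothetical non-recursive entry point supplying the initial value.
static int
call_foo (int field, int depth)
{
  T t;
  t.field = field;
  t.depth = depth;
  return foo (&t);
}
```

Only `field` reaches the recursive call unchanged, so whatever value the non-recursive caller supplies for it is the value seen at every recursion level; that is the property the patch description refers to.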


Re: [PATCH 05/49] vec.h: add auto_delete_vec

2019-12-18 Thread David Malcolm
On Wed, 2019-12-04 at 09:29 -0700, Martin Sebor wrote:
> On 11/15/19 6:22 PM, David Malcolm wrote:
> > This patch adds a class auto_delete_vec, a subclass of auto_vec
> > 
> > that deletes all of its elements on destruction; it's used in many
> > places later in the kit.
> > 
> > This is a crude way for a vec to "own" the objects it points to
> > and clean up automatically (essentially a workaround for not being
> > able
> > to use unique_ptr, due to C++98).
> > 
> > gcc/ChangeLog:
> > * vec.c (class selftest::count_dtor): New class.
> > (selftest::test_auto_delete_vec): New test.
> > (selftest::vec_c_tests): Call it.
> > * vec.h (class auto_delete_vec): New class template.
> > (auto_delete_vec::~auto_delete_vec): New dtor.
> 
> Because of slicing, unless preventing the elements from being
> deleted in the class dtor is meant to be a feature, it seems that
> using a wrapper class rather than public derivation from auto_vec
> might be a safer solution.
> 
> It might be worth mentioning in a comment that the class isn't
> safe to copy or assign (each copy would wind up delete the same
> pointers), in addition to making its copy ctor and copy assignment
> operator inaccessible or deleted.
> 
> Martin

In the version of the patch in the v4 kit:
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01035.html
I added:
  private:
 DISABLE_COPY_AND_ASSIGN(auto_delete_vec);
to the class.

Does that satisfy your concerns about slicing? (and, indeed, about
copying and assigning)

Thanks
Dave
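
As an illustration of the two points under discussion (a wrapper instead of public derivation, and disabled copying), here is a minimal sketch. `owning_ptr_vec` and `counted` are hypothetical names, and std::vector stands in for GCC's vec; this is not the auto_delete_vec from the patch kit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical element type that counts destructor calls.
struct counted
{
  static int dtor_count;
  ~counted () { ++dtor_count; }
};
int counted::dtor_count = 0;

// Hypothetical sketch: wraps (rather than derives from) a vector of
// pointers and deletes every element on destruction.
template <typename T>
class owning_ptr_vec
{
public:
  owning_ptr_vec () {}

  ~owning_ptr_vec ()
  {
    for (std::size_t i = 0; i < m_elts.size (); i++)
      delete m_elts[i];
  }

  void safe_push (T *elt) { m_elts.push_back (elt); }
  std::size_t length () const { return m_elts.size (); }

private:
  // C++98-style way to disable copy and assignment: declared, not defined.
  // A copy would otherwise delete the same pointers twice.
  owning_ptr_vec (const owning_ptr_vec &);
  owning_ptr_vec &operator= (const owning_ptr_vec &);

  std::vector<T *> m_elts;
};

// Pushes N heap-allocated elements, lets the container go out of scope,
// and returns how many destructors ran.
static int
destroyed_after_scope (int n)
{
  counted::dtor_count = 0;
  {
    owning_ptr_vec<counted> v;
    for (int i = 0; i < n; i++)
      v.safe_push (new counted ());
    assert (v.length () == static_cast<std::size_t> (n));
  }
  return counted::dtor_count;
}
```

Because the wrapper holds the vector as a private member, it cannot be sliced to the underlying vector type, and the private copy operations make accidental copies a compile-time error.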



Re: [PATCH] Handle aggregate pass-through for self-recursive call (PR ipa/92794)

2019-12-18 Thread Jan Hubicka
> Hi,
> 
> On Tue, Dec 17 2019, Feng Xue OS wrote:
> > If argument for a self-recursive call is a simple pass-through, the call
> > edge is also considered as source of any value originated from
> > non-recursive call to the function. Scalar pass-through and full aggregate
> > pass-through due to pointer pass-through have also been handled.
> > But we missed another kind of pass-through like below case,  partial
> > aggregate pass-through. This patch is meant to fix the problem which
> > caused ICE as in 92794.
> >
> >   void foo(struct T *val_ptr)
> >   {
> > struct T new_val;
> > new_val.field = val_ptr->field;
> > foo (&temp);
> > ...
> >   }
> >
> > Bootstrapped/regtested on x86_64-linux and aarch64-linux.
> >
> > 2019-12-17  Feng Xue  
> >
> > PR ipa/92794
> > * ipa-cp.c (self_recursive_agg_pass_through_p): New function.
> > (intersect_with_plats): Use error_mark_node as place holder
> > when aggregate jump function is simple pass-through for
> > self-recursive call.
> > (intersect_with_agg_replacements): Likewise.
> > (intersect_aggregates_with_edge): Likewise.
> > (find_aggregate_values_for_callers_subset): Likewise.
> >
> > Thanks,
> > Feng
> > From 42ba553ebf80eadb62619c5570a4b406f8c90c49 Mon Sep 17 00:00:00 2001
> > From: Feng Xue 
> > Date: Mon, 16 Dec 2019 20:33:36 +0800
> > Subject: [PATCH] Handle aggregate simple pass-through for self-recursive 
> > call
> >
> > ---
> >  gcc/ipa-cp.c   | 97 +-
> >  gcc/testsuite/gcc.dg/ipa/pr92794.c | 30 +
> >  2 files changed, 111 insertions(+), 16 deletions(-)
> >  create mode 100644 gcc/testsuite/gcc.dg/ipa/pr92794.c
> >
> > diff --git a/gcc/ipa-cp.c b/gcc/ipa-cp.c
> > index 1a80ccbde2d..0e17fedd649 100644
> > --- a/gcc/ipa-cp.c
> > +++ b/gcc/ipa-cp.c
> > @@ -4564,6 +4564,23 @@ self_recursive_pass_through_p (cgraph_edge *cs, 
> > ipa_jump_func *jfunc, int i)
> >return false;
> >  }
> >  
> > +/* Return true, if JFUNC, which describes a part of an aggregate 
> > represented
> > +   or pointed to by the i-th parameter of call CS, is a simple no-operation
> > +   pass-through function to itself.  */
> > +
> > +static bool
> > +self_recursive_agg_pass_through_p (cgraph_edge *cs, ipa_agg_jf_item *jfunc,
> > +  int i)
> > +{
> > +  if (cs->caller == cs->callee->function_symbol ()
> 
> I don't know if self-recursive calls can be interposed at all, if yes
> you need to add the availability check like we have in
> self_recursive_pass_through_p (if not, we should probably remove it
> there).

Yes, self-recursion can be interposed if you have an alias and enter the
recursive loop by a different symbol name, then recurse.
We have an optional ref argument in ultimate_alias_target and friends where
you can specify the symbol in which the reference appears, and then the
predicate knows how to verify this (odd) condition.

There is cgraph_edge::recursive_p, which can make a mistake in the positive
direction in the case of interposition.  We probably want to distinguish
these cases and have a parameter for that...

Honza
> 
> Apart from that, I believe the patch is probably the best thing we can
> do to deal with these interesting situations.
> 
> Thanks for looking into the bug,
> 
> Martin
> 
> 
> > +  && jfunc->jftype == IPA_JF_LOAD_AGG
> > +  && jfunc->offset == jfunc->value.load_agg.offset
> > +  && jfunc->value.pass_through.operation == NOP_EXPR
> > +  && jfunc->value.pass_through.formal_id == i)
> > +return true;
> > +  return false;
> > +}


[Patch, fortran] PR70853 - ICE on pointing to null, in gfc_add_block_to_block, at fortran/trans.c:1599

2019-12-18 Thread Harald Anlauf
The patch is self-explanatory and practically obvious: pointer bounds
remapping to NULL is not allowed, thus we shall reject it.  I hope the
error message is fine.  If somebody prefers a formulation as in the
standard ("data target", also used by the Intel compiler), please
speak now.

Regtested on x86_64-pc-linux-gnu.

OK for trunk?

Thanks,
Harald

Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c(Revision 279405)
+++ gcc/fortran/trans-expr.c(Arbeitskopie)
@@ -9218,6 +9218,13 @@ gfc_trans_pointer_assignment (gfc_expr * expr1, gf
  break;
   rank_remap = (remap && remap->u.ar.end[0]);

+  if (remap && expr2->expr_type == EXPR_NULL)
+   {
+ gfc_error ("If bounds remapping is specified at %L, "
+"the pointer target shall not be NULL", &expr1->where);
+ return NULL_TREE;
+   }
+
   gfc_init_se (&lse, NULL);
   if (remap)
lse.descriptor_only = 1;


Index: gcc/testsuite/gfortran.dg/pr70853.f90
===
--- gcc/testsuite/gfortran.dg/pr70853.f90   (nicht existent)
+++ gcc/testsuite/gfortran.dg/pr70853.f90   (Arbeitskopie)
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/70853
+! Contributed by Gerhard Steinmetz
+program p
+   real, pointer :: z(:)
+   z(1:2) => null() ! { dg-error "pointer target shall not be NULL" }
+   z(2:1) => null() ! { dg-error "pointer target shall not be NULL" }
+end


2019-12-18  Harald Anlauf  

PR fortran/92898
* trans-expr.c (gfc_trans_pointer_assignment): Reject bounds
remapping if pointer target is NULL().


2019-12-18  Harald Anlauf  

PR fortran/70853
* gfortran.dg/pr70853.f90: New test.


Re: [Patch, fortran] PR70853 - ICE on pointing to null, in gfc_add_block_to_block, at fortran/trans.c:1599

2019-12-18 Thread Tobias Burnus

LGTM. Thanks for the patch!

Tobias

PS: I assume, your patch also fixes the following test case, which also 
ICEs in gfc_trans_pointer_assignment:

integer, pointer, contiguous :: x(:)
nullify(x(1:1))
end

On 12/18/19 5:07 PM, Harald Anlauf wrote:

The patch is self-explanatory and practically obvious: pointer bounds
remapping to NULL is not allowed, thus we shall reject it.  I hope the
error message is fine.  If somebody prefers a formulation as in the
standard ("data target", also used by the Intel compiler), please
speak now.

Regtested on x86_64-pc-linux-gnu.

OK for trunk?

Thanks,
Harald

Index: gcc/fortran/trans-expr.c
===
--- gcc/fortran/trans-expr.c(Revision 279405)
+++ gcc/fortran/trans-expr.c(Arbeitskopie)
@@ -9218,6 +9218,13 @@ gfc_trans_pointer_assignment (gfc_expr * expr1, gf
   break;
rank_remap = (remap && remap->u.ar.end[0]);

+  if (remap && expr2->expr_type == EXPR_NULL)
+   {
+ gfc_error ("If bounds remapping is specified at %L, "
+"the pointer target shall not be NULL", &expr1->where);
+ return NULL_TREE;
+   }
+
gfc_init_se (&lse, NULL);
if (remap)
 lse.descriptor_only = 1;


Index: gcc/testsuite/gfortran.dg/pr70853.f90
===
--- gcc/testsuite/gfortran.dg/pr70853.f90   (nicht existent)
+++ gcc/testsuite/gfortran.dg/pr70853.f90   (Arbeitskopie)
@@ -0,0 +1,8 @@
+! { dg-do compile }
+! PR fortran/70853
+! Contributed by Gerhard Steinmetz
+program p
+   real, pointer :: z(:)
+   z(1:2) => null() ! { dg-error "pointer target shall not be NULL" }
+   z(2:1) => null() ! { dg-error "pointer target shall not be NULL" }
+end


2019-12-18  Harald Anlauf  

 PR fortran/92898
 * trans-expr.c (gfc_trans_pointer_assignment): Reject bounds
 remapping if pointer target is NULL().


2019-12-18  Harald Anlauf  

 PR fortran/70853
 * gfortran.dg/pr70853.f90: New test.


[GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [1/2]

2019-12-18 Thread Stam Markianos-Wright
Hi all,

This patch adds Bfloat type support to the AArch64 back-end.
It also adds a new machine_mode (BFmode) for this type and accompanying Vector 
modes V4BFmode and V8BFmode.

The second patch in this series uses existing target hooks to restrict type use.

Regression testing on aarch64-none-elf passed successfully.

This patch depends on:

https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00857.html

for test suite effective_target update.

Ok for trunk?

Cheers,
Stam


ACLE documents are at https://developer.arm.com/docs/101028/latest
ISA documents are at https://developer.arm.com/docs/ddi0596/latest

Details on ARM Bfloat can be found here:
https://community.arm.com/developer/ip-products/processors/b/ml-ip-blog/posts/bfloat16-processing-for-neural-networks-on-armv8_2d00_a
 


PS. I don't have commit rights, so if someone could commit on my behalf,
that would be great :)



gcc/ChangeLog:

2019-12-16  Stam Markianos-Wright  

* config.gcc: Add arm_bf16.h.
* config/aarch64/aarch64-builtins.c
 (aarch64_simd_builtin_std_type): Add BFmode.
 (aarch64_init_simd_builtin_types): Add element types for vector types.
(aarch64_init_bf16_types): New function.
(aarch64_general_init_builtins): Add aarch64_init_bf16_types function call.
* config/aarch64/aarch64-modes.def: Add BFmode and vector modes.
* config/aarch64/aarch64-simd-builtin-types.def:
* config/aarch64/aarch64-simd.md: Add BF types to NEON move patterns.
* config/aarch64/aarch64.c (aarch64_classify_vector_mode): Add BF modes.
(aarch64_gimplify_va_arg_expr): Add BFmode.
* config/aarch64/aarch64.h (AARCH64_VALID_SIMD_DREG_MODE): Add V4BF.
(AARCH64_VALID_SIMD_QREG_MODE): Add V8BF.
* config/aarch64/aarch64.md: New enabled_for_bfmode_scalar,
  enabled_for_bfmode_vector attributes. Add BFmode to movhf pattern.
* config/aarch64/arm_bf16.h: New file.
* config/aarch64/arm_neon.h: Add arm_bf16.h and Bfloat vector types.
* config/aarch64/iterators.md
  (HFBF, GPF_TF_F16_MOV, VDMOV, VQMOV, VALL_F16MOV): New.



gcc/testsuite/ChangeLog:

2019-12-16  Stam Markianos-Wright  

* gcc.target/aarch64/bfloat16_compile.c: New test.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 9802f436e06..b49c110ccaf 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -315,7 +315,7 @@ m32c*-*-*)
 ;;
 aarch64*-*-*)
 	cpu_type=aarch64
-	extra_headers="arm_fp16.h arm_neon.h arm_acle.h arm_sve.h"
+	extra_headers="arm_fp16.h arm_neon.h arm_bf16.h arm_acle.h arm_sve.h"
 	c_target_objs="aarch64-c.o"
 	cxx_target_objs="aarch64-c.o"
 	d_target_objs="aarch64-d.o"
diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index c35a1b1f029..3ba2f12166f 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -68,6 +68,9 @@
 #define hi_UP	 E_HImode
 #define hf_UP	 E_HFmode
 #define qi_UP	 E_QImode
+#define bf_UP    E_BFmode
+#define v4bf_UP  E_V4BFmode
+#define v8bf_UP  E_V8BFmode
 #define UP(X) X##_UP
 
 #define SIMD_MAX_BUILTIN_ARGS 5
@@ -568,6 +571,10 @@ static tree aarch64_simd_intXI_type_node = NULL_TREE;
 tree aarch64_fp16_type_node = NULL_TREE;
 tree aarch64_fp16_ptr_type_node = NULL_TREE;
 
+/* Back-end node type for brain float (bfloat) types.  */
+tree aarch64_bf16_type_node = NULL_TREE;
+tree aarch64_bf16_ptr_type_node = NULL_TREE;
+
 /* Wrapper around add_builtin_function.  NAME is the name of the built-in
function, TYPE is the function type, and CODE is the function subcode
(relative to AARCH64_BUILTIN_GENERAL).  */
@@ -659,6 +666,8 @@ aarch64_simd_builtin_std_type (machine_mode mode,
   return float_type_node;
 case E_DFmode:
   return double_type_node;
+case E_BFmode:
+  return aarch64_bf16_type_node;
 default:
   gcc_unreachable ();
 }
@@ -750,6 +759,11 @@ aarch64_init_simd_builtin_types (void)
   aarch64_simd_types[Float64x1_t].eltype = double_type_node;
   aarch64_simd_types[Float64x2_t].eltype = double_type_node;
 
+
+/* Init Bfloat vector types with underlying uint types.  */
+  aarch64_simd_types[Bfloat16x4_t].eltype = aarch64_bf16_type_node;
+  aarch64_simd_types[Bfloat16x8_t].eltype = aarch64_bf16_type_node;
+
   for (i = 0; i < nelts; i++)
 {
   tree eltype = aarch64_simd_types[i].eltype;
@@ -1059,6 +1073,19 @@ aarch64_init_fp16_types (void)
   aarch64_fp16_ptr_type_node = build_pointer_type (aarch64_fp16_type_node);
 }
 
+/* Initialize the backend REAL_TYPE type supporting bfloat types.  */
+static void
+aarch64_init_bf16_types (void)
+{
+  aarch64_bf16_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (aarch64_bf16_type_node) = 16;
+  SET_TYPE_MODE (aarch64_bf16_type_node, BFmode);
+  layout_type (aarch64_bf16_type_node);
+
+  (*lang_hooks.types.register_builtin_type) (aarch64_bf16_type_node, "__bf16");
+  aarch64_bf16_ptr_type_node = build_pointer_type (aarch64_bf16_type_node);
+}
+
 /* P

[GCC][PATCH][Aarch64] Add Bfloat16_t scalar type, vector types and machine modes to Aarch64 back-end [2/2]

2019-12-18 Thread Stam Markianos-Wright
Hi all,

This patch is part 2 of Bfloat16_t enablement in the Aarch64 back-end.

This new type is constrained using target hooks TARGET_INVALID_CONVERSION, 
TARGET_INVALID_UNARY_OP, TARGET_INVALID_BINARY_OP so that it may only be used 
through ACLE intrinsics (will be provided in later patches).

Regression testing on aarch64-none-elf passed successfully.

Ok for trunk?

Cheers,
Stam


ACLE documents are at https://developer.arm.com/docs/101028/latest
ISA documents are at https://developer.arm.com/docs/ddi0596/latest

Details on ARM Bfloat can be found here:
https://community.arm.com/developer/ip-products/processors/b/ml-ip-blog/posts/bfloat16-processing-for-neural-networks-on-armv8_2d00_a
 


PS. I don't have commit rights, so if someone could commit on my behalf,
that would be great :)


gcc/ChangeLog:

2019-12-16  Stam Markianos-Wright  

* config/aarch64/aarch64.c
(aarch64_invalid_conversion): New function for target hook.
(aarch64_invalid_unary_op): Likewise.
(aarch64_invalid_binary_op): Likewise.
(TARGET_INVALID_CONVERSION): Add back-end define for target hook.
(TARGET_INVALID_UNARY_OP): Likewise.
(TARGET_INVALID_BINARY_OP): Likewise.


gcc/testsuite/ChangeLog:

2019-12-16  Stam Markianos-Wright  

* gcc.target/aarch64/bfloat16_scalar_typecheck.c: New test.
* gcc.target/aarch64/bfloat16_vector_typecheck1.c: New test.
* gcc.target/aarch64/bfloat16_vector_typecheck2.c: New test.

diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index f57469b6e23..f40f6432fd4 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -21661,6 +21661,68 @@ aarch64_stack_protect_guard (void)
   return NULL_TREE;
 }
 
+/* Return the diagnostic message string if conversion from FROMTYPE to
+   TOTYPE is not allowed, NULL otherwise.  */
+
+static const char *
+aarch64_invalid_conversion (const_tree fromtype, const_tree totype)
+{
+  static char templ[100];
+  if ((GET_MODE_INNER (TYPE_MODE (fromtype)) == BFmode
+   || GET_MODE_INNER (TYPE_MODE (totype)) == BFmode)
+   && TYPE_MODE (fromtype) != TYPE_MODE (totype))
+  {
+snprintf (templ, sizeof (templ), \
+  "incompatible types when assigning to type '%s' from type '%s'",
+  IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (totype))),
+  IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (fromtype))));
+return N_(templ);
+  }
+  /* Conversion allowed.  */
+  return NULL;
+}
+
+/* Return the diagnostic message string if the unary operation OP is
+   not permitted on TYPE, NULL otherwise.  */
+
+static const char *
+aarch64_invalid_unary_op (int op, const_tree type)
+{
+  static char templ[100];
+  /* Reject all single-operand operations on BFmode except for &.  */
+  if (GET_MODE_INNER (TYPE_MODE (type)) == BFmode && op != ADDR_EXPR)
+  {
+snprintf (templ, sizeof (templ),
+  "operation not permitted on type '%s'",
+  IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type))));
+return N_(templ);
+  }
+  /* Operation allowed.  */
+  return NULL;
+}
+
+/* Return the diagnostic message string if the binary operation OP is
+   not permitted on TYPE1 and TYPE2, NULL otherwise.  */
+
+static const char *
+aarch64_invalid_binary_op (int op ATTRIBUTE_UNUSED, const_tree type1,
+			   const_tree type2)
+{
+  static char templ[100];
+  /* Reject all 2-operand operations on BFmode.  */
+  if (GET_MODE_INNER (TYPE_MODE (type1)) == BFmode
+  || GET_MODE_INNER (TYPE_MODE (type2)) == BFmode)
+  {
+snprintf (templ, sizeof (templ), \
+  "operation not permitted on types '%s', '%s'",
+  IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type1))),
+  IDENTIFIER_POINTER (DECL_NAME (TYPE_NAME (type2))));
+return N_(templ);
+  }
+  /* Operation allowed.  */
+  return NULL;
+}
+
 /* Implement TARGET_ASM_FILE_END for AArch64.  This adds the AArch64 GNU NOTE
section at the end if needed.  */
 #define GNU_PROPERTY_AARCH64_FEATURE_1_AND	0xc000
@@ -21911,6 +21973,15 @@ aarch64_libgcc_floating_mode_supported_p
 #undef TARGET_MANGLE_TYPE
 #define TARGET_MANGLE_TYPE aarch64_mangle_type
 
+#undef TARGET_INVALID_CONVERSION
+#define TARGET_INVALID_CONVERSION aarch64_invalid_conversion
+
+#undef TARGET_INVALID_UNARY_OP
+#define TARGET_INVALID_UNARY_OP aarch64_invalid_unary_op
+
+#undef TARGET_INVALID_BINARY_OP
+#define TARGET_INVALID_BINARY_OP aarch64_invalid_binary_op
+
 #undef TARGET_VERIFY_TYPE_CONTEXT
 #define TARGET_VERIFY_TYPE_CONTEXT aarch64_verify_type_context
 
diff --git a/gcc/testsuite/gcc.target/aarch64/bfloat16_scalar_typecheck.c b/gcc/testsuite/gcc.target/aarch64/bfloat16_scalar_typecheck.c
new file mode 100644
index 000..6f6a6af9587
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/bfloat16_scalar_typecheck.c
@@ -0,0 +1,83 @@
+/* { dg-do compile { target { aarch64*-*-* } } } */
+/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } } */
+/* { dg-options "-march=armv8.2-a+i8mm" } */
+
+#include 
+
+bfloat16_t glob;
+float i

Re: [LTO] PR 86416 – improve lto1 diagnostic if a mode does not exist (esp. for offloading targets)

2019-12-18 Thread Tobias Burnus

Hi Jakub,

thanks for the pointers; I was also not happy about part (B).

The mode gets written in lto-streamer-out.c's lto_write_mode_table, 
which is already called under 'if (lto_stream_offload_p)'.  Looking at 
the places where the name gets constructed, I realized that the required 
information is already written to the mode table: the precision (for 
floats, to distinguish between 80-bit and 128-bit floats) and the size.


One now has a readable error message for float/complex/decimal-float and 
int(eger) modes, and the generic one for the rest; as vectors get 
converted to their base type if not found, this gives a readable message 
for all common cases.
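
To make the idea concrete, here is a minimal standalone sketch of choosing
the message from the mode class and precision.  It is not the patch itself:
the enum, the function name and the plain '...' quoting are assumptions made
for illustration (the real code uses GCC's mode classes, fatal_error and %qs).

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-ins for GCC's mode classes; the real ones come from
   mode-classes.def.  */
enum sketch_mode_class
{
  SKETCH_MODE_INT,
  SKETCH_MODE_FLOAT,
  SKETCH_MODE_DECIMAL_FLOAT,
  SKETCH_MODE_COMPLEX_FLOAT,
  SKETCH_MODE_OTHER
};

/* Format a diagnostic for an unsupported mode into BUF (SIZE bytes).
   The common classes get a message that mentions the precision; everything
   else falls back to the generic text.  Returns what snprintf returns.  */
int
sketch_unsupported_mode_msg (char *buf, size_t size, const char *target,
                             enum sketch_mode_class mclass, unsigned prec,
                             const char *mname)
{
  const char *kind = NULL;

  assert (buf != NULL && target != NULL && mname != NULL);

  switch (mclass)
    {
    case SKETCH_MODE_FLOAT:
      kind = "-bit-precision floating-point numbers";
      break;
    case SKETCH_MODE_DECIMAL_FLOAT:
      kind = "-bit-precision decimal floating-point numbers";
      break;
    case SKETCH_MODE_COMPLEX_FLOAT:
      kind = "-bit-precision complex floating-point numbers";
      break;
    case SKETCH_MODE_INT:
      kind = "-bit integer numbers";
      break;
    default:
      break;
    }

  if (kind == NULL)
    return snprintf (buf, size, "%s - unsupported mode '%s'", target, mname);
  return snprintf (buf, size, "%s - %u%s unsupported (mode '%s')",
                   target, prec, kind, mname);
}
```

Calling it with a float class, precision 80 and mode name "XF" yields the
precision-based text; any class outside the four common ones yields the
generic text.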


Hence, one can write a much cleaner patch which is about as long as the 
hack but not hackish :-)


Bootstrapped + regtested on x86-64-gnu-linux w/o offloading and with 
nvptx offloading.

OK?

Thanks,

Tobias

2019-12-18  Tobias Burnus  

	PR middle-end/86416
	*  Makefile.in (CFLAGS-lto-streamer-in.o): Pass target_noncanonical on.
	* lto-streamer-in.c (lto_input_mode_table): Improve unsupported-mode
	diagnostic.

	PR middle-end/86416
	* testsuite/libgomp.c/pr86416-1.c: New.
	* testsuite/libgomp.c/pr86416-2.c: New.

 gcc/Makefile.in |  2 ++
 gcc/lto-streamer-in.c   | 26 +-
 libgomp/testsuite/libgomp.c/pr86416-1.c | 22 ++
 libgomp/testsuite/libgomp.c/pr86416-2.c | 22 ++
 4 files changed, 71 insertions(+), 1 deletion(-)

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 6b857bd75de..657488d416b 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -2244,6 +2244,8 @@ version.o: $(REVISION) $(DATESTAMP) $(BASEVER) $(DEVPHASE)
 # lto-compress.o needs $(ZLIBINC) added to the include flags.
 CFLAGS-lto-compress.o += $(ZLIBINC)
 
+CFLAGS-lto-streamer-in.o += -DTARGET_MACHINE=\"$(target_noncanonical)\"
+
 bversion.h: s-bversion; @true
 s-bversion: BASE-VER
 	echo "#define BUILDING_GCC_MAJOR `echo $(BASEVER_c) | sed -e 's/^\([0-9]*\).*$$/\1/'`" > bversion.h
diff --git a/gcc/lto-streamer-in.c b/gcc/lto-streamer-in.c
index 675e1a7a153..f49f38df647 100644
--- a/gcc/lto-streamer-in.c
+++ b/gcc/lto-streamer-in.c
@@ -1698,7 +1698,31 @@ lto_input_mode_table (struct lto_file_decl_data *file_data)
 		}
 	  /* FALLTHRU */
 	default:
-	  fatal_error (UNKNOWN_LOCATION, "unsupported mode %qs", mname);
+	  /* This is only used for offloading-target compilations and
+		 is a user-facing error.  Give a better error message for
+		 the common modes; see also mode-classes.def.   */
+	  if (mclass == MODE_FLOAT)
+		fatal_error (UNKNOWN_LOCATION,
+			 "%s - %u-bit-precision floating-point numbers "
+			 "unsupported (mode %qs)", TARGET_MACHINE,
+			 prec.to_constant (), mname);
+	  else if (mclass == MODE_DECIMAL_FLOAT)
+		fatal_error (UNKNOWN_LOCATION,
+			 "%s - %u-bit-precision decimal floating-point "
+			 "numbers unsupported (mode %qs)", TARGET_MACHINE,
+			 prec.to_constant (), mname);
+	  else if (mclass == MODE_COMPLEX_FLOAT)
+		fatal_error (UNKNOWN_LOCATION,
+			 "%s - %u-bit-precision complex floating-point "
+			 "numbers unsupported (mode %qs)", TARGET_MACHINE,
+			 prec.to_constant (), mname);
+	  else if (mclass == MODE_INT)
+		fatal_error (UNKNOWN_LOCATION,
+			 "%s - %u-bit integer numbers unsupported (mode "
+			 "%qs)", TARGET_MACHINE, prec.to_constant (), mname);
+	  else
+		fatal_error (UNKNOWN_LOCATION, "%s - unsupported mode %qs",
+			 TARGET_MACHINE, mname);
 	  break;
 	}
 	}
diff --git a/libgomp/testsuite/libgomp.c/pr86416-1.c b/libgomp/testsuite/libgomp.c/pr86416-1.c
new file mode 100644
index 000..4ab523d2310
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr86416-1.c
@@ -0,0 +1,22 @@
+/* { dg-do link } */
+/* { dg-require-effective-target large_long_double } */
+
+/* PR middle-end/86416  */
+/* { dg-error "bit-precision floating-point numbers unsupported .mode '.F'." "" { target offload_device } 0 }  */
+/* { dg-excess-errors "Follow-up errors from mkoffload and lto-wrapper" { target offload_device } }  */
+
+#include   /* For abort. */
+
+long double foo (long double x)
+{
+  #pragma omp target map(tofrom:x)
+x *= 2.0;
+  return x;
+}
+
+int main()
+{
+  long double v = foo (10.0q) - 20.0q;
+  if (v > 1.0e-5 || v < -1.0e-5) abort();
+  return 0;
+}
diff --git a/libgomp/testsuite/libgomp.c/pr86416-2.c b/libgomp/testsuite/libgomp.c/pr86416-2.c
new file mode 100644
index 000..f104da78029
--- /dev/null
+++ b/libgomp/testsuite/libgomp.c/pr86416-2.c
@@ -0,0 +1,22 @@
+/* { dg-do link { target __float128 } } */
+/* { dg-add-options __float128 } */
+
+/* PR middle-end/86416  */
+/* { dg-error "bit-precision floating-point numbers unsupported .mode '.F'." "" { target offload_device } 0 }  */
+/* { dg-excess-errors "Follow-up errors from mkoffload and lto-wrapper" { target offload_device } }  */
+
+#include   /* For abort. */
+
+__float128 foo(__float12

Re: [Patch, fortran] PR70853 - ICE on pointing to null, in gfc_add_block_to_block, at fortran/trans.c:1599

2019-12-18 Thread Harald Anlauf
On 12/18/19 17:17, Tobias Burnus wrote:
> LGTM. Thanks for the patch!

Thanks, committed as r279527.

> Tobias
>
> PS: I assume, your patch also fixes the following test case, which also
> ICEs in gfc_trans_pointer_assignment:
> integer, pointer, contiguous :: x(:)
> nullify(x(1:1))
> end

Well, that depends on your interpretation of "fix".  The ICE is now
replaced by a somewhat incorrect error message:

x.f90:2:8:

2 | nullify(x(1:1))
  |1
Error: If bounds remapping is specified at (1), the pointer target shall
not be NULL

For a better error message, we'd need to know that we come here from
a NULLIFY statement.  Can you file a PR?

Thanks,
Harald

> On 12/18/19 5:07 PM, Harald Anlauf wrote:
>> The patch is self-explanatory and practically obvious: pointer bounds
>> remapping to NULL is not allowed, thus we shall reject it.  I hope the
>> error message is fine.  If somebody prefers a formulation as in the
>> standard ("data target", also used by the Intel compiler), please
>> speak now.
>>
>> Regtested on x86_64-pc-linux-gnu.
>>
>> OK for trunk?
>>
>> Thanks,
>> Harald
>>
>> Index: gcc/fortran/trans-expr.c
>> ===
>> --- gcc/fortran/trans-expr.c(Revision 279405)
>> +++ gcc/fortran/trans-expr.c(Arbeitskopie)
>> @@ -9218,6 +9218,13 @@ gfc_trans_pointer_assignment (gfc_expr * expr1, gf
>>break;
>> rank_remap = (remap && remap->u.ar.end[0]);
>>
>> +  if (remap && expr2->expr_type == EXPR_NULL)
>> +   {
>> + gfc_error ("If bounds remapping is specified at %L, "
>> +"the pointer target shall not be NULL",
>> &expr1->where);
>> + return NULL_TREE;
>> +   }
>> +
>> gfc_init_se (&lse, NULL);
>> if (remap)
>>  lse.descriptor_only = 1;
>>
>>
>> Index: gcc/testsuite/gfortran.dg/pr70853.f90
>> ===
>> --- gcc/testsuite/gfortran.dg/pr70853.f90   (nicht existent)
>> +++ gcc/testsuite/gfortran.dg/pr70853.f90   (Arbeitskopie)
>> @@ -0,0 +1,8 @@
>> +! { dg-do compile }
>> +! PR fortran/70853
>> +! Contributed by Gerhard Steinmetz
>> +program p
>> +   real, pointer :: z(:)
>> +   z(1:2) => null() ! { dg-error "pointer target shall not be NULL" }
>> +   z(2:1) => null() ! { dg-error "pointer target shall not be NULL" }
>> +end
>>
>>
>> 2019-12-18  Harald Anlauf  
>>
>>  PR fortran/92898
>>  * trans-expr.c (gfc_trans_pointer_assignment): Reject bounds
>>  remapping if pointer target is NULL().
>>
>>
>> 2019-12-18  Harald Anlauf  
>>
>>  PR fortran/70853
>>  * gfortran.dg/pr70853.f90: New test.
>



Re: [LTO] PR 86416 – improve lto1 diagnostic if a mode does not exist (esp. for offloading targets)

2019-12-18 Thread Jakub Jelinek
On Wed, Dec 18, 2019 at 05:39:51PM +0100, Tobias Burnus wrote:
> Hence, one can write a much cleaner patch which is about as long as the hack
> but not hackish :-)
> 
> Bootstrapped + regtested on x86-64-gnu-linux w/o offloading and with nvptx
> offloading.
> OK?

LGTM.

> 2019-12-18  Tobias Burnus  
> 
>   PR middle-end/86416
>   *  Makefile.in (CFLAGS-lto-streamer-in.o): Pass target_noncanonical on.
>   * lto-streamer-in.c (lto_input_mode_table): Improve unsupported-mode
>   diagnostic.
> 
>   PR middle-end/86416
>   * testsuite/libgomp.c/pr86416-1.c: New.
>   * testsuite/libgomp.c/pr86416-2.c: New.

Jakub



Re: [GCC][testsuite][ARM][AArch64] Add ARM v8.6 effective target checks to target-supports.exp

2019-12-18 Thread Richard Sandiford
Stam Markianos-Wright  writes:
> On 12/13/19 11:15 AM, Richard Sandiford wrote:
>> Stam Markianos-Wright  writes:
>>> Hi all,
>>>
>>> This small patch adds support for the ARM v8.6 extensions +bf16 and
>>> +i8mm to the testsuite. This will be tested through other upcoming
>>> patches, which is why we are not providing any explicit tests here.
>>>
>>> Ok for trunk?
>>>
>>> Also I don't have commit rights, so if someone could commit on my
>>> behalf, that would be great :)
>>>
>>> The functionality here depends on CLI patches:
>>> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02415.html
>>> https://gcc.gnu.org/ml/gcc-patches/2019-11/msg02195.html
>>>
>>> but this patch applies cleanly without them, too.
>>>
>>> Cheers,
>>> Stam
>>>
>>>
>>> gcc/testsuite/ChangeLog:
>>>
>>> 2019-12-11  Stam Markianos-Wright  
>>>
>>> * lib/target-supports.exp
>>> (check_effective_target_arm_v8_2a_i8mm_ok_nocache): New.
>>> (check_effective_target_arm_v8_2a_i8mm_ok): New.
>>> (add_options_for_arm_v8_2a_i8mm): New.
>>> (check_effective_target_arm_v8_2a_bf16_neon_ok_nocache): New.
>>> (check_effective_target_arm_v8_2a_bf16_neon_ok): New.
>>> (add_options_for_arm_v8_2a_bf16_neon): New.
>> 
>> The new effective-target keywords need to be documented in
>> doc/sourcebuild.texi.
>
> Added in new diff :)
>
>> 
>> LGTM otherwise.  For:
>> 
>>> diff --git a/gcc/testsuite/lib/target-supports.exp 
>>> b/gcc/testsuite/lib/target-supports.exp
>>> index 5b4cc02f921..36fb63e9929 100644
>>> --- a/gcc/testsuite/lib/target-supports.exp
>>> +++ b/gcc/testsuite/lib/target-supports.exp
>>> @@ -4781,6 +4781,49 @@ proc add_options_for_arm_v8_2a_dotprod_neon { flags 
>>> } {
>>>   return "$flags $et_arm_v8_2a_dotprod_neon_flags"
>>>   }
>>>   
>>> +# Return 1 if the target supports ARMv8.2+i8mm Adv.SIMD Dot Product
>>> +# instructions, 0 otherwise.  The test is valid for ARM and for AArch64.
>>> +# Record the command line options needed.
>>> +
>>> +proc check_effective_target_arm_v8_2a_i8mm_ok_nocache { } {
>>> +global et_arm_v8_2a_i8mm_flags
>>> +set et_arm_v8_2a_i8mm_flags ""
>>> +
>>> +if { ![istarget arm*-*-*] && ![istarget aarch64*-*-*] } {
>>> +return 0;
>>> +}
>>> +
>>> +# Iterate through sets of options to find the compiler flags that
>>> +# need to be added to the -march option.
>>> +foreach flags {"" "-mfloat-abi=hard -mfpu=neon-fp-armv8" 
>>> "-mfloat-abi=softfp -mfpu=neon-fp-armv8" } {
>>> +if { [check_no_compiler_messages_nocache \
>>> +  arm_v8_2a_i8mm_ok object {
>>> +#include 
>>> +#if !defined (__ARM_FEATURE_MATMUL_INT8)
>>> +#error "__ARM_FEATURE_MATMUL_INT8 not defined"
>>> +#endif
>>> +} "$flags -march=armv8.2-a+i8mm"] } {
>>> +set et_arm_v8_2a_i8mm_flags "$flags -march=armv8.2-a+i8mm"
>>> +return 1
>>> +}
>>> +}
>> 
>> I wondered whether it would be better to add no options if testing
>> with something that already supports i8mm (e.g. -march=armv8.6).
>> That would mean trying:
>> 
>>"" "-march=armv8.2-a+i8mm" "-march=armv8.2-a+i8mm -mfloat-abi..." ...
>> 
>> instead.  But there are arguments both ways, and the above follows
>> existing style, so OK.
>
> Not quite sure if I'm understanding this right, but I think that's what 
> the "" option in foreach flags{} is for?
>
> i.e. currently what I'm seeing is:
>
> +/* { dg-require-effective-target arm_v8_2a_i8mm_ok } */
> +/* { dg-add-options arm_v8_2a_i8mm }  */
>
> will pull through the first option that compiles to object file with no 
> errors (check_no_compiler_messages_nocache arm_v8_2a_i8mm_ok object).
>
> So in a lot of cases it should just be fine for "" and only pull in 
> -march=armv8.2-a+i8mm.
>
> I think that's right? Lmk if I'm not reading it properly!

Yeah, that's right, but it's also the "problem".  The point was that
some people will run the tests with options like -march=armv8.6-a that
already support these instructions, e.g. using

  --target_board unix/-march=armv8.6-a

on a native box.  The code above will then override that -march option
with -march=armv8.2-a+i8mm even though the original option was OK.
The tests won't then actually get run with -march=armv8.6-a as the
dominant option, despite being Armv8.6 tests. :-)

The alternative above would have tried with no options at all and only
added -march= if that failed.  But like I say, the current version
follows existing practice and so is fine too.
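
A rough sketch of that alternative ordering, in C rather than Tcl so that it
is self-contained.  Everything here is hypothetical: the probe merely looks
for an -march string that includes i8mm instead of running the compiler, and
none of these names exist in target-supports.exp.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical probe: says whether FLAGS give access to the i8mm
   instructions.  A real harness would compile a test file; this stand-in
   only checks for an architecture string known to include i8mm.  */
int
sketch_flags_enable_i8mm (const char *flags)
{
  return strstr (flags, "+i8mm") != NULL
         || strstr (flags, "-march=armv8.6-a") != NULL;
}

/* Return the extra options to add on top of BASE (the options the tester
   already passed), trying "no extra options" first so that a suitable
   -march from the tester is never overridden.  Returns NULL if no candidate
   works.  */
const char *
sketch_pick_i8mm_options (const char *base)
{
  static const char *const candidates[] = {
    "",
    "-march=armv8.2-a+i8mm",
    "-march=armv8.2-a+i8mm -mfloat-abi=hard -mfpu=neon-fp-armv8",
    "-march=armv8.2-a+i8mm -mfloat-abi=softfp -mfpu=neon-fp-armv8"
  };
  char combined[256];
  size_t i;

  for (i = 0; i < sizeof candidates / sizeof candidates[0]; i++)
    {
      snprintf (combined, sizeof combined, "%s %s", base, candidates[i]);
      if (sketch_flags_enable_i8mm (combined))
        return candidates[i];
    }
  return NULL;
}
```

With a base of -march=armv8.6-a the first (empty) candidate already passes,
so nothing is added; with an empty base the sketch falls through to
-march=armv8.2-a+i8mm.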

> diff --git a/gcc/doc/sourcebuild.texi b/gcc/doc/sourcebuild.texi
> index 85573a49a2b..73408d12cbe 100644
> --- a/gcc/doc/sourcebuild.texi
> +++ b/gcc/doc/sourcebuild.texi
> @@ -1877,6 +1877,18 @@ ARM target supports extensions to generate the 
> @code{VFMAL} and @code{VFMLS}
>  half-precision floating-point instructions available from ARMv8.2-A and
>  onwards.  Some multilibs may be incompatible with these options.
>  
> +@item arm_v8_2a_bf16_neon_ok
> +@anchor{arm_v8_2a_bf16_neon_o

Re: [PATCH, GCC/ARM, 9/10] Call nscall function with blxns

2019-12-18 Thread Kyrill Tkachov



On 12/18/19 1:38 PM, Mihail Ionescu wrote:

Hi,

On 11/12/2019 10:23 AM, Kyrill Tkachov wrote:


On 10/23/19 10:26 AM, Mihail Ionescu wrote:

[PATCH, GCC/ARM, 9/10] Call nscall function with blxns

Hi,

=== Context ===

This patch is part of a patch series to add support for Armv8.1-M
Mainline Security Extensions architecture. Its purpose is to call
functions with the cmse_nonsecure_call attribute directly using blxns
with no undue restriction on the register used for that.

=== Patch description ===

This change to use BLXNS to call a nonsecure function from secure
directly (not using a libcall) is made in 2 steps:
- change nonsecure_call patterns to use blxns instead of calling
  __gnu_cmse_nonsecure_call
- loosen requirement for function address to allow any register when
  doing BLXNS.

The former is a straightforward check over whether instructions added in
Armv8.1-M Mainline are available while the latter consists in making the
nonsecure call pattern accept any register by using match_operand and
changing the nonsecure_call_internal expander to not force r4 when
targeting Armv8.1-M Mainline.

The tricky bit is actually in the test update, specifically how to check
that register lists for CLRM have all registers except for the one
holding parameters (already done) and the one holding the address used
by BLXNS. This is achieved with 3 scan-assembler directives.

1) The first one lists all registers that can appear in CLRM but makes
   each of them optional.
   Property guaranteed: no wrong register is cleared and none appears
   twice in the register list.
2) The second directive checks that the CLRM is made of a fixed number
   of the right registers to be cleared. The number used is the number
   of registers that could contain a secret minus one (used to hold the
   address of the function to call).
   Property guaranteed: register list has the right number of registers.
   Cumulated property guaranteed: only registers with a potential secret
   are cleared and they are all listed but one.
3) The last directive checks that we cannot find a CLRM with a register
   in it that also appears in BLXNS. This is checked via the use of a
   back-reference on any of the allowed registers in CLRM, the
   back-reference enforcing that whatever register matches in CLRM must
   be the same in the BLXNS.
   Property guaranteed: register used for BLXNS is different from
   registers cleared in CLRM.

Some more care needs to happen for the gcc.target/arm/cmse/cmse-1.c
testcase due to there being two CLRM generated. To ensure the third
directive matches the right CLRM to the BLXNS, a negative lookahead is
used between the CLRM register list and the BLXNS. The way negative
lookahead works is by matching the *position* where a given regular
expression does not match. In this case, since it comes after the CLRM
register list it is requesting that what comes after the register list
does not have a CLRM again followed by BLXNS. This guarantees that the
.*blxns after only matches a blxns without another CLRM before.

ChangeLog entries are as follows:

*** gcc/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu
2019-10-23  Thomas Preud'homme

    * config/arm/arm.md (nonsecure_call_internal): Do not force memory
    address in r4 when targeting Armv8.1-M Mainline.
    (nonsecure_call_value_internal): Likewise.
    * config/arm/thumb2.md (nonsecure_call_reg_thumb2): Make memory address
    a register match_operand again.  Emit BLXNS when targeting
    Armv8.1-M Mainline.
    (nonsecure_call_value_reg_thumb2): Likewise.

*** gcc/testsuite/ChangeLog ***

2019-10-23  Mihail-Calin Ionescu
2019-10-23  Thomas Preud'homme

    * gcc.target/arm/cmse/cmse-1.c: Add check for BLXNS when instructions
    introduced in Armv8.1-M Mainline Security Extensions are available and
    restrict checks for libcall to __gnu_cmse_nonsecure_call to Armv8-M
    targets only.  Adapt CLRM check to verify register used for BLXNS is
    not in the CLRM register list.
    * gcc.target/arm/cmse/cmse-14.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-4.c: Likewise and adapt
    check for LSB clearing bit to be using the same register as BLXNS when
    targeting Armv8.1-M Mainline.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-5.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-6.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-9.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/bitfield-and-union.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-13.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-7.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard-sp/cmse-8.c: Likewise.
    * gcc.target/arm/cmse/mainline/8_1m/hard/cmse-13.c: Likewise

Re: [PATCH][GCC][arm] Add CLI and multilib support for Armv8.1-M Mainline MVE extensions

2019-12-18 Thread Mihail Ionescu

Hi Kyrill,

On 12/18/2019 02:13 PM, Kyrill Tkachov wrote:

Hi Mihail,

On 11/8/19 4:52 PM, Mihail Ionescu wrote:

Hi,

This patch adds CLI and multilib support for Armv8.1-M MVE to the Arm
backend.
Two new options are added for v8.1-m.main: "+mve" for integer MVE
instructions only and "+mve.fp" for both integer and
single-precision/half-precision floating-point MVE.
The patch also maps the Armv8.1-M multilib variants to the
corresponding v8-M ones.



gcc/ChangeLog:

2019-11-08  Mihail Ionescu
2019-11-08  Andre Vieira

    * config/arm/arm-cpus.in (mve, mve_float): New features.
    (dsp, mve, mve.fp): New options.
    * config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT): Define.
    * config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M.


gcc/testsuite/ChangeLog:

2019-11-08  Mihail Ionescu  
2019-11-08  Andre Vieira 

    * testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.


Is this ok for trunk?



This is ok, but please document the new options in invoke.texi.



Here it is with the updated invoke.texi and ChangeLog.


gcc/ChangeLog:

2019-12-18  Mihail Ionescu  
2019-12-18  Andre Vieira  

* config/arm/arm-cpus.in (mve, mve_float): New features.
(dsp, mve, mve.fp): New options.
* config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT): Define.
* config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M.
* doc/invoke.texi: Document the armv8.1-m mve and dsp options.


gcc/testsuite/ChangeLog:

2019-12-18  Mihail Ionescu  
2019-12-18  Andre Vieira  

* testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.


Thanks,
Mihail


Thanks,

Kyrill




Best regards,

Mihail


### Attachment also inlined for ease of reply 
###



diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
index 59aad8f62ee5186cc87d3cefaf40ba2ce049012d..c2f016c75e2d8dd06890295321232bef61cbd234 100644
--- a/gcc/config/arm/arm-cpus.in
+++ b/gcc/config/arm/arm-cpus.in
@@ -194,6 +194,10 @@ define feature sb
 # v8-A architectures, added by default from v8.5-A
 define feature predres

+# M-profile Vector Extension feature bits
+define feature mve
+define feature mve_float
+
 # Feature groups.  Conventionally all (or mostly) upper case.
 # ALL_FPU lists all the feature bits associated with the floating-point
 # unit; these will all be removed if the floating-point unit is disabled
@@ -654,9 +658,12 @@ begin arch armv8.1-m.main
  base 8M_MAIN
  isa ARMv8_1m_main
 # fp => FPv5-sp-d16; fp.dp => FPv5-d16
+ option dsp add armv7em
  option fp add FPv5 fp16
  option fp.dp add FPv5 FP_DBL fp16
  option nofp remove ALL_FP
+ option mve add mve armv7em
+ option mve.fp add mve FPv5 fp16 mve_float armv7em
 end arch armv8.1-m.main

 begin arch iwmmxt
diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
index 64c292f2862514fb600a4faeaddfeacb2b69180b..9ec38c6af1b84fc92e20e30e8f07ce5360a966c1 100644
--- a/gcc/config/arm/arm.h
+++ b/gcc/config/arm/arm.h
@@ -310,6 +310,12 @@ emission of floating point pcs attributes.  */
    instructions (most are floating-point related).  */
 #define TARGET_HAVE_FPCXT_CMSE  (arm_arch8_1m_main)

+#define TARGET_HAVE_MVE (bitmap_bit_p (arm_active_target.isa, \
+  isa_bit_mve))
+
+#define TARGET_HAVE_MVE_FLOAT (bitmap_bit_p (arm_active_target.isa, \
+ isa_bit_mve_float))
+
 /* Nonzero if integer division instructions supported.  */
 #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
  || (TARGET_THUMB && arm_arch_thumb_hwdiv))
diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
index 807e69eaf78625f422e2d7ef5936c5c80c5b9073..62e27fd284b21524896430176d64ff5b08c6e0ef 100644
--- a/gcc/config/arm/t-rmprofile
+++ b/gcc/config/arm/t-rmprofile
@@ -54,7 +54,7 @@ MULTILIB_REQUIRED += mthumb/march=armv8-m.main+fp.dp/mfloat-abi=softfp
 # Arch Matches
 MULTILIB_MATCHES    += march?armv6s-m=march?armv6-m

-# Map all v8-m.main+dsp FP variants down the the variant without DSP.
+# Map all v8-m.main+dsp FP variants down to the variant without DSP.
 MULTILIB_MATCHES    += march?armv8-m.main=march?armv8-m.main+dsp \
    $(foreach FP, +fp +fp.dp, \
march?armv8-m.main$(FP)=march?armv8-m.main+dsp$(FP))
@@ -66,3 +66,18 @@ MULTILIB_MATCHES += march?armv7e-m+fp=march?armv7e-m+fpv5
 MULTILIB_REUSE  += $(foreach ARCH, armv6s-m armv7-m armv7e-m armv8-m\.base armv8-m\.main, \
    mthumb/march.$(ARCH)/mfloat-abi.soft=mthumb/march.$(ARCH)/mfloat-abi.softfp)

+# Map v8.1-M to v8-M.
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main+dsp
+MULTILIB_MATCHES   += march?armv8-m.main=march?armv8.1-m.main+mve
+
+v8_1m_sp_variants = +fp +dsp+fp +mve.fp
+v8_1m_dp_variants = +fp.dp +dsp+fp.dp +fp.dp+mve +fp.dp+mve.fp
+
+# Map all v8.1-m.main FP sp variants down to v8-m.
+MULTILIB_MATCHES += $(foreach 

Re: [PATCH][GCC][arm] Add CLI and multilib support for Armv8.1-M Mainline MVE extensions

2019-12-18 Thread Kyrill Tkachov



On 12/18/19 5:00 PM, Mihail Ionescu wrote:

Hi Kyrill,

On 12/18/2019 02:13 PM, Kyrill Tkachov wrote:
> Hi Mihail,
>
> On 11/8/19 4:52 PM, Mihail Ionescu wrote:
>> Hi,
>>
>> This patch adds CLI and multilib support for Armv8.1-M MVE to the Arm
>> backend.
>> Two new options are added for v8.1-m.main: "+mve" for integer MVE
>> instructions only
>> and "+mve.fp" for both integer and single-precision/half-precision
>> floating-point MVE.
>> The patch also maps the Armv8.1-M multilib variants to the
>> corresponding v8-M ones.
>>
>>
>>
>> gcc/ChangeLog:
>>
>> 2019-11-08  Mihail Ionescu 
>> 2019-11-08  Andre Vieira 
>>
>>     * config/arm/arm-cpus.in (mve, mve_float): New features.
>>     (dsp, mve, mve.fp): New options.
>>     * config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT):
>> Define.
>>     * config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M.
>>
>>
>> gcc/testsuite/ChangeLog:
>>
>> 2019-11-08  Mihail Ionescu 
>> 2019-11-08  Andre Vieira 
>>
>>     * testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.
>>
>>
>> Is this ok for trunk?
>
>
> This is ok, but please document the new options in invoke.texi.
>

Here it is with the updated invoke.texi and ChangeLog.



Thanks, looks great to me.

Kyrill



gcc/ChangeLog:

2019-12-18  Mihail Ionescu
2019-12-18  Andre Vieira

    * config/arm/arm-cpus.in (mve, mve_float): New features.
    (dsp, mve, mve.fp): New options.
    * config/arm/arm.h (TARGET_HAVE_MVE, TARGET_HAVE_MVE_FLOAT): Define.
    * config/arm/t-rmprofile: Map v8.1-M multilibs to v8-M.
    * doc/invoke.texi: Document the armv8.1-m mve and dsp options.


gcc/testsuite/ChangeLog:

2019-12-18  Mihail Ionescu  
2019-12-18  Andre Vieira 

    * testsuite/gcc.target/arm/multilib.exp: Add v8.1-M entries.


Thanks,
Mihail

> Thanks,
>
> Kyrill
>
>
>>
>> Best regards,
>>
>> Mihail
>>
>>
>> ### Attachment also inlined for ease of reply
>> ###
>>
>>
>> diff --git a/gcc/config/arm/arm-cpus.in b/gcc/config/arm/arm-cpus.in
>> index 59aad8f62ee5186cc87d3cefaf40ba2ce049012d..c2f016c75e2d8dd06890295321232bef61cbd234 100644
>> --- a/gcc/config/arm/arm-cpus.in
>> +++ b/gcc/config/arm/arm-cpus.in
>> @@ -194,6 +194,10 @@ define feature sb
>>  # v8-A architectures, added by default from v8.5-A
>>  define feature predres
>>
>> +# M-profile Vector Extension feature bits
>> +define feature mve
>> +define feature mve_float
>> +
>>  # Feature groups.  Conventionally all (or mostly) upper case.
>>  # ALL_FPU lists all the feature bits associated with the floating-point
>>  # unit; these will all be removed if the floating-point unit is disabled
>> @@ -654,9 +658,12 @@ begin arch armv8.1-m.main
>>   base 8M_MAIN
>>   isa ARMv8_1m_main
>>  # fp => FPv5-sp-d16; fp.dp => FPv5-d16
>> + option dsp add armv7em
>>   option fp add FPv5 fp16
>>   option fp.dp add FPv5 FP_DBL fp16
>>   option nofp remove ALL_FP
>> + option mve add mve armv7em
>> + option mve.fp add mve FPv5 fp16 mve_float armv7em
>>  end arch armv8.1-m.main
>>
>>  begin arch iwmmxt
>> diff --git a/gcc/config/arm/arm.h b/gcc/config/arm/arm.h
>> index 64c292f2862514fb600a4faeaddfeacb2b69180b..9ec38c6af1b84fc92e20e30e8f07ce5360a966c1 100644
>> --- a/gcc/config/arm/arm.h
>> +++ b/gcc/config/arm/arm.h
>> @@ -310,6 +310,12 @@ emission of floating point pcs attributes.  */
>>     instructions (most are floating-point related).  */
>>  #define TARGET_HAVE_FPCXT_CMSE (arm_arch8_1m_main)
>>
>> +#define TARGET_HAVE_MVE (bitmap_bit_p (arm_active_target.isa, \
>> + isa_bit_mve))
>> +
>> +#define TARGET_HAVE_MVE_FLOAT (bitmap_bit_p (arm_active_target.isa, \
>> + isa_bit_mve_float))
>> +
>>  /* Nonzero if integer division instructions supported.  */
>>  #define TARGET_IDIV ((TARGET_ARM && arm_arch_arm_hwdiv) \
>>   || (TARGET_THUMB && arm_arch_thumb_hwdiv))
>> diff --git a/gcc/config/arm/t-rmprofile b/gcc/config/arm/t-rmprofile
>> index 807e69eaf78625f422e2d7ef5936c5c80c5b9073..62e27fd284b21524896430176d64ff5b08c6e0ef 100644
>> --- a/gcc/config/arm/t-rmprofile
>> +++ b/gcc/config/arm/t-rmprofile
>> @@ -54,7 +54,7 @@ MULTILIB_REQUIRED +=
>> mthumb/march=armv8-m.main+fp.dp/mfloat-abi=softfp
>>  # Arch Matches
>>  MULTILIB_MATCHES    += march?armv6s-m=march?armv6-m
>>
>> -# Map all v8-m.main+dsp FP variants down the the variant without DSP.
>> +# Map all v8-m.main+dsp FP variants down to the variant without DSP.
>>  MULTILIB_MATCHES    += march?armv8-m.main=march?armv8-m.main+dsp \
>>     $(foreach FP, +fp +fp.dp, \
>> march?armv8-m.main$(FP)=march?armv8-m.main+dsp$(FP))
>> @@ -66,3 +66,18 @@ MULTILIB_MATCHES +=
>> march?armv7e-m+fp=march?armv7e-m+fpv5
>>  MULTILIB_REUSE  += $(foreach ARCH, armv6s-m armv7-m armv7e-m
>> armv8-m\.base armv8-m\.main, \
>> mthumb/march.$(ARCH)/mfloat-abi.soft=mthumb/march.$(ARCH)/mfloat-abi.softfp)
>>
>>
>> +# Map v8.1-M to v8-M.
>> +MULTILIB_MATCHES   

Re: [PATCH, OpenACC, libgomp, v6, stage1] Async-rework update

2019-12-18 Thread Thomas Schwinge
Hi!

On 2019-05-13T21:33:20+0800, Chung-Lin Tang  wrote:
> committed

(... in r271128.)

As obvious, see attached "Make 'libgomp/target.c:gomp_unmap_tgt' 'static'
again"; committed to trunk in r279529.
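
For readers less familiar with the distinction, a minimal sketch of 'static'
versus hidden visibility follows.  The names are made up and this is not
libgomp code; it assumes GCC's visibility attribute.

```c
/* Hidden visibility: callable from other translation units of the same
   shared object, but not exported from it.  Needs a prototype that the
   other files can see.  */
__attribute__ ((visibility ("hidden"))) int sketch_shared_counter (void);

/* Internal linkage: only code in this file can call it, so no prototype
   in a header is needed and the compiler can see every caller.  */
static int
sketch_bump (int *counter)
{
  return ++*counter;
}

int
sketch_shared_counter (void)
{
  static int counter;
  return sketch_bump (&counter);
}
```

A function that is only ever called from its own file, like sketch_bump
here, is a candidate for 'static'; the hidden attribute is only needed when
other files of the library call it.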


Grüße
 Thomas


From 60272bbbd67100b5fd864bfa8a9495b249778a66 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:00:28 +
Subject: [PATCH] Make 'libgomp/target.c:gomp_unmap_tgt' 'static' again

This got changed to 'attribute_hidden' in r271128, but it's not actually used
outside of 'libgomp/target.c'.

	libgomp/
	* target.c (gomp_unmap_tgt): Make it 'static'.
	* libgomp.h (gomp_unmap_tgt): Remove.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279529 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 5 +
 libgomp/libgomp.h | 1 -
 libgomp/target.c  | 2 +-
 3 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 3c834175a29..5bd1c648ffe 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,3 +1,8 @@
+2019-12-18  Thomas Schwinge  
+
+	* target.c (gomp_unmap_tgt): Make it 'static'.
+	* libgomp.h (gomp_unmap_tgt): Remove.
+
 2019-12-18  Tobias Burnus  
 
 	PR middle-end/86416
diff --git a/libgomp/libgomp.h b/libgomp/libgomp.h
index 36dcca28353..038e356ab0b 100644
--- a/libgomp/libgomp.h
+++ b/libgomp/libgomp.h
@@ -1157,7 +1157,6 @@ extern struct target_mem_desc *gomp_map_vars_async (struct gomp_device_descr *,
 		size_t, void **, void **,
 		size_t *, void *, bool,
 		enum gomp_map_vars_kind);
-extern void gomp_unmap_tgt (struct target_mem_desc *);
 extern void gomp_unmap_vars (struct target_mem_desc *, bool);
 extern void gomp_unmap_vars_async (struct target_mem_desc *, bool,
    struct goacc_asyncqueue *);
diff --git a/libgomp/target.c b/libgomp/target.c
index 82ed38c01ec..41cf6a3d7d2 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1105,7 +1105,7 @@ gomp_map_vars_async (struct gomp_device_descr *devicep,
  sizes, kinds, short_mapkind, pragma_kind);
 }
 
-attribute_hidden void
+static void
 gomp_unmap_tgt (struct target_mem_desc *tgt)
 {
   /* Deallocate on target the tgt->tgt_start .. tgt->tgt_end region.  */
-- 
2.17.1



signature.asc
Description: PGP signature


[PR92848] [OpenACC] Use 'GOMP_MAP_VARS_ENTER_DATA' for dynamic data lifetimes

2019-12-18 Thread Thomas Schwinge
Hi!

I haven't researched when this broke, but to fix PR92848 "[OpenACC]
Memory leak for simple 'acc_create', 'acc_delete' sequence", see attached
"[PR92848] [OpenACC] Use 'GOMP_MAP_VARS_ENTER_DATA' for dynamic data
lifetimes"; committed to trunk in r279530.


Grüße
 Thomas


From 4b1057f6d9f6b4dbccf2e6a413a5ed233e65181f Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:00:39 +
Subject: [PATCH] [PR92848] [OpenACC] Use 'GOMP_MAP_VARS_ENTER_DATA' for
 dynamic data lifetimes

	libgomp/
	PR libgomp/92848
	* oacc-mem.c (acc_map_data, present_create_copy)
	(goacc_insert_pointer): Use 'GOMP_MAP_VARS_ENTER_DATA'.
	(acc_unmap_data, delete_copyout, goacc_remove_pointer): Adjust.
	* testsuite/libgomp.oacc-c-c++-common/lib-50.c: Remove.
	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-a.c: New file
	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-p.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-a.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-p.c: Likewise.
	* testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c:
	Remove "XFAIL"s.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279530 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog |  12 +
 libgomp/oacc-mem.c|  88 +++--
 .../libgomp.oacc-c-c++-common/lib-50.c|  30 --
 .../libgomp.oacc-c-c++-common/pr92848-1-d-a.c |   7 +
 .../libgomp.oacc-c-c++-common/pr92848-1-d-p.c |   7 +
 .../libgomp.oacc-c-c++-common/pr92848-1-r-a.c |   7 +
 .../libgomp.oacc-c-c++-common/pr92848-1-r-p.c | 321 ++
 .../subset-subarray-mappings-1-r-p.c  |  16 -
 8 files changed, 410 insertions(+), 78 deletions(-)
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-50.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-a.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-p.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-a.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-p.c

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 5bd1c648ffe..d9aba5bee18 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,17 @@
 2019-12-18  Thomas Schwinge  
 
+	PR libgomp/92848
+	* oacc-mem.c (acc_map_data, present_create_copy)
+	(goacc_insert_pointer): Use 'GOMP_MAP_VARS_ENTER_DATA'.
+	(acc_unmap_data, delete_copyout, goacc_remove_pointer): Adjust.
+	* testsuite/libgomp.oacc-c-c++-common/lib-50.c: Remove.
+	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-a.c: New file
+	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-d-p.c: Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-a.c: Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/pr92848-1-r-p.c: Likewise.
+	* testsuite/libgomp.oacc-c-c++-common/subset-subarray-mappings-1-r-p.c:
+	Remove "XFAIL"s.
+
 	* target.c (gomp_unmap_tgt): Make it 'static'.
 	* libgomp.h (gomp_unmap_tgt): Remove.
 
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 196b7e2a520..54427982341 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -403,7 +403,8 @@ acc_map_data (void *h, void *d, size_t s)
   gomp_mutex_unlock (&acc_dev->lock);
 
   tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, &devaddrs, &sizes,
-			   &kinds, true, GOMP_MAP_VARS_OPENACC);
+			   &kinds, true, GOMP_MAP_VARS_ENTER_DATA);
+  assert (tgt);
   splay_tree_key n = tgt->list[0].key;
   assert (n->refcount == 1);
   assert (n->dynamic_refcount == 0);
@@ -468,23 +469,21 @@ acc_unmap_data (void *h)
 		  (void *) h, (int) host_size);
 }
 
-  /* Mark for removal.  */
-  n->refcount = 1;
-
   t = n->tgt;
 
-  if (t->refcount == 2)
+  if (t->refcount == 1)
 {
   /* This is the last reference, so pull the descriptor off the
- chain. This avoids gomp_unmap_vars via gomp_unmap_tgt from
+ chain.  This prevents 'gomp_unmap_tgt' via 'gomp_remove_var' from
  freeing the device memory. */
   t->tgt_end = 0;
   t->to_free = 0;
 }
 
-  gomp_mutex_unlock (&acc_dev->lock);
+  bool is_tgt_unmapped = gomp_remove_var (acc_dev, n);
+  assert (is_tgt_unmapped);
 
-  gomp_unmap_vars (t, true);
+  gomp_mutex_unlock (&acc_dev->lock);
 
   if (profiling_p)
 {
@@ -572,7 +571,8 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
   goacc_aq aq = get_goacc_asyncqueue (async);
 
   tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
- &kinds, true, GOMP_MAP_VARS_OPENACC);
+ &kinds, true, GOMP_MAP_VARS_ENTER_DATA);
+  assert (tgt);
   n = tgt->list[0].key;
   assert (n->refcount == 1);
   assert (n->dynamic_refcount == 0);
@@ -727,7 +727,18 @@ delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 			  + (uintptr_t) h - n->host_start);
 	  gomp_copy_dev2host (acc_dev, aq, h, d, s);
 	}
-  gomp_remove_var_async (acc_dev, n, aq);
+
+  i

Re: [OpenACC] Elaborate/simplify 'exit data' 'finalize' handling (was: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior)

2019-12-18 Thread Thomas Schwinge
Hi!

On 2019-12-13T23:34:15+, Julian Brown  wrote:
> On Fri, 13 Dec 2019 15:13:53 +0100
> Thomas Schwinge  wrote:
>> Julian, Tobias, regarding the following OpenACC 'exit data' 'finalize'
>> handling:
>> 
>> On 2018-05-25T13:01:58-0700, Cesar Philippidis
>>  wrote:
>> > [...]
>> 
>> ... does the attached patch "[OpenACC] Elaborate/simplify 'exit data'
>> 'finalize' handling" (with "No functional changes") match your
>> understanding of what's going on?
>
> Your patch looks OK to me, FWIW.

Thanks for the review.

See attached "[OpenACC] Elaborate/simplify 'exit data' 'finalize'
handling"; committed to trunk in r279531.


> As you mentioned at some point though,
> it might be good to get rid of this style of finalize handling,
> replacing it with a flag passed to GOACC_exit_data

Actually, as recently discussed in a different context, I'm now steering
into the opposite direction: make all that explicit in the mapping kinds.
The reason is: while it's true that currently the OpenACC 'finalize'
clause applies to all data clauses on the directive, that's not the only
way: it might at some point become a flag for each individual clause
('copyout(finalize: [...])', or something like that) -- like OpenMP
already does, as far as I remember, so we need to support that
per-mapping kind anyway.
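
For illustration only, here is a minimal standalone sketch of how 'finalize'
is encoded in the mapping kinds today, modeled on the
'gimplify_omp_target_update' change in the attached patch.  The 'sketch_*'
names and the enum values are made up for this example; the real code works on
'OMP_CLAUSE_MAP' trees and the 'GOMP_MAP_*' constants.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the relevant 'GOMP_MAP_*' kinds; the values are arbitrary
   and chosen only for this sketch.  */
enum sketch_map_kind
{
  SKETCH_MAP_FROM,
  SKETCH_MAP_FORCE_FROM,
  SKETCH_MAP_RELEASE,
  SKETCH_MAP_DELETE,
  SKETCH_MAP_POINTER
};

/* Rewrite the mapping kinds of one 'exit data' directive that carries a
   'finalize' clause: 'from' becomes 'force from', 'release' becomes
   'delete'.  Pointer-like kinds must follow one of those clauses and are
   left alone.  */
void
sketch_finalize_clauses (enum sketch_map_kind *kinds, size_t n)
{
  int have_clause = 0;
  for (size_t i = 0; i < n; ++i)
    switch (kinds[i])
      {
      case SKETCH_MAP_FROM:
	kinds[i] = SKETCH_MAP_FORCE_FROM;
	have_clause = 1;
	break;
      case SKETCH_MAP_RELEASE:
	kinds[i] = SKETCH_MAP_DELETE;
	have_clause = 1;
	break;
      case SKETCH_MAP_POINTER:
	/* Handled together with the preceding clause; nothing to do.  */
	assert (have_clause);
	break;
      default:
	assert (0 && "unexpected mapping kind");
      }
}
```

With the information carried per mapping kind like this, a per-clause
'finalize' would presumably only mean rewriting the marked clauses instead of
all of them.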


> -- presuming that at
> the same time, we separate out the needlessly-dual-purpose
> GOACC_enter_exit_data API entry point into "enter" and "exit" halves.

Indeed -- while that one's not a problem, it's still a bit "uh".  But,
for the sake of backwards compatibility..., it'll stay this way until we
do any other breaking changes.


Grüße
 Thomas


From a4af910c186bab748351f00b9a652c3167fa8da6 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:00:51 +
Subject: [PATCH] [OpenACC] Elaborate/simplify 'exit data' 'finalize' handling

No functional changes.

	gcc/
	* gimplify.c (gimplify_omp_target_update): Elaborate 'exit data'
	'finalize' handling.
	gcc/testsuite/
	* c-c++-common/goacc/finalize-1.c: Extend.
	* gfortran.dg/goacc/finalize-1.f: Likewise.
	libgomp/
	* oacc-mem.c (GOACC_enter_exit_data): Simplify 'exit data'
	'finalize' handling.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279531 138bc75d-0d04-0410-961f-82ee72b054a4
---
 gcc/ChangeLog |  5 
 gcc/gimplify.c| 23 +++
 gcc/testsuite/ChangeLog   |  5 
 gcc/testsuite/c-c++-common/goacc/finalize-1.c | 11 -
 gcc/testsuite/gfortran.dg/goacc/finalize-1.f  | 10 
 libgomp/ChangeLog |  3 +++
 libgomp/oacc-mem.c| 14 +++
 7 files changed, 49 insertions(+), 22 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 642faea1e44..be8dfa3fecf 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,8 @@
+2019-12-18  Thomas Schwinge  
+
+	* gimplify.c (gimplify_omp_target_update): Elaborate 'exit data'
+	'finalize' handling.
+
 2019-12-18  Tobias Burnus  
 
 	PR middle-end/86416
diff --git a/gcc/gimplify.c b/gcc/gimplify.c
index 9073680cb31..60a80cb8098 100644
--- a/gcc/gimplify.c
+++ b/gcc/gimplify.c
@@ -12738,27 +12738,30 @@ gimplify_omp_target_update (tree *expr_p, gimple_seq *pre_p)
 	   && omp_find_clause (OMP_STANDALONE_CLAUSES (expr),
 			   OMP_CLAUSE_FINALIZE))
 {
-  /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote that "finalize"
-	 semantics apply to all mappings of this OpenACC directive.  */
-  bool finalize_marked = false;
+  /* Use GOMP_MAP_DELETE/GOMP_MAP_FORCE_FROM to denote "finalize"
+	 semantics.  */
+  bool have_clause = false;
   for (tree c = OMP_STANDALONE_CLAUSES (expr); c; c = OMP_CLAUSE_CHAIN (c))
 	if (OMP_CLAUSE_CODE (c) == OMP_CLAUSE_MAP)
 	  switch (OMP_CLAUSE_MAP_KIND (c))
 	{
 	case GOMP_MAP_FROM:
 	  OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_FORCE_FROM);
-	  finalize_marked = true;
+	  have_clause = true;
 	  break;
 	case GOMP_MAP_RELEASE:
 	  OMP_CLAUSE_SET_MAP_KIND (c, GOMP_MAP_DELETE);
-	  finalize_marked = true;
+	  have_clause = true;
 	  break;
-	default:
-	  /* Check consistency: libgomp relies on the very first data
-		 mapping clause being marked, so make sure we did that before
-		 any other mapping clauses.  */
-	  gcc_assert (finalize_marked);
+	case GOMP_MAP_POINTER:
+	case GOMP_MAP_TO_PSET:
+	  /* TODO PR92929: we may see these here, but they'll always follow
+		 one of the clauses above, and will be handled by libgomp as
+		 one group, so no handling required here.  */
+	  gcc_assert (have_clause);
 	  break;
+	default:
+	  gcc_unreachable ();
 	}
 }
   stmt = gimple_build_omp_target (NULL, kind, OMP_STANDALONE_CLAUSES (expr));
diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 942448061cb..f1bf3452243 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@

Re: [OpenACC] Update OpenACC data clause semantics to the 2.5 behavior

2019-12-18 Thread Thomas Schwinge
Hi!

On 2018-05-25T13:01:58-0700, Cesar Philippidis  wrote:
> This patch updates GCC to support OpenACC 2.5's data clause semantics.

Per  "OpenACC 2.5: 'acc_delete' etc. on
non-present data is a no-op", which this patch didn't address.

I wanted to delay fixing this until I got the intended OpenACC 2.6 ff.
semantics clarified with the OpenACC Technical Committee, but it turned
out that fixing this now would be useful for other reasons, so see
attached "[PR92726, PR92970, PR92984] [OpenACC] Clarify 'acc_delete'
etc. for 'NULL'-in, non-present data, or size zero"; committed to trunk
in r279532.

More C/C++ and also Fortran test cases (that exercise all the different
code paths that we have in 'libgomp/oacc-mem.c:GOACC_enter_exit_data',
related to 'find_pointer' handling etc.) shall then follow later (no
hurry with that).
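
To make the "non-present data is a no-op" behavior concrete, here is a toy
model; it is a sketch under stated assumptions, not the libgomp code.  The
'sketch_*' names and the flat array are hypothetical; the real implementation
looks up a splay tree via 'lookup_host' while holding the device lock.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one host-to-device mapping.  */
struct sketch_mapping
{
  const void *host;
  size_t size;
  int present;
};

/* Return the present mapping that starts at H and holds at least S bytes,
   or NULL if there is none.  */
struct sketch_mapping *
sketch_lookup_host (struct sketch_mapping *maps, size_t n,
		    const void *h, size_t s)
{
  assert (maps != NULL || n == 0);
  for (size_t i = 0; i < n; ++i)
    if (maps[i].present && maps[i].host == h && maps[i].size >= s)
      return &maps[i];
  return NULL;
}

/* Model of the "delete" path: return 1 if a mapping was removed, 0 if the
   call was a no-op because the data was not present.  */
int
sketch_delete (struct sketch_mapping *maps, size_t n,
	       const void *h, size_t s)
{
  struct sketch_mapping *m = sketch_lookup_host (maps, n, h, s);
  if (m == NULL)
    return 0;  /* Not present: silently do nothing, no fatal error.  */
  m->present = 0;
  return 1;
}
```

In this model, deleting the same data twice is harmless: the second call
finds nothing and returns.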


Grüße
 Thomas


From f7b1686558c2515511917aaeb74269b7e85ae09b Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:01:11 +
Subject: [PATCH] [PR92726, PR92970, PR92984] [OpenACC] Clarify 'acc_delete'
 etc. for 'NULL'-in, non-present data, or size zero

PR92970 "OpenACC 2.5: 'acc_delete' etc. on non-present data is a no-op" is an
actual bug fix, and the other ones are fall-out, currently undefined behavior.

	libgomp/
	PR libgomp/92726
	PR libgomp/92970
	PR libgomp/92984
	* oacc-mem.c (delete_copyout): No-op behavior if 'lookup_host'
	fails.
	(GOACC_enter_exit_data): Simplify accordingly.
	* testsuite/libgomp.oacc-c-c++-common/pr92970-1.c: New file,
	subsuming...
	* testsuite/libgomp.oacc-c-c++-common/lib-17.c: ... this file...
	* testsuite/libgomp.oacc-c-c++-common/lib-18.c: ..., and this
	file.
	* testsuite/libgomp.oacc-c-c++-common/pr92984-1.c: New file,
	subsuming...
	* testsuite/libgomp.oacc-c-c++-common/lib-21.c: ... this file...
	* testsuite/libgomp.oacc-c-c++-common/lib-29.c: ..., and this
	file.
	* testsuite/libgomp.oacc-c-c++-common/pr92726-1.c: New file,
	subsuming...
	* testsuite/libgomp.oacc-c-c++-common/lib-28.c: ... this file.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279532 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog |  20 
 libgomp/oacc-mem.c|  28 ++---
 .../libgomp.oacc-c-c++-common/lib-17.c|  38 ---
 .../libgomp.oacc-c-c++-common/lib-18.c|  38 ---
 .../libgomp.oacc-c-c++-common/lib-21.c|  35 --
 .../libgomp.oacc-c-c++-common/lib-28.c|  32 --
 .../libgomp.oacc-c-c++-common/lib-29.c|  32 --
 .../libgomp.oacc-c-c++-common/pr92726-1.c |  26 +
 .../libgomp.oacc-c-c++-common/pr92970-1.c |  33 ++
 .../libgomp.oacc-c-c++-common/pr92984-1.c | 100 ++
 10 files changed, 190 insertions(+), 192 deletions(-)
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-17.c
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-18.c
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-21.c
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-28.c
 delete mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/lib-29.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92726-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92970-1.c
 create mode 100644 libgomp/testsuite/libgomp.oacc-c-c++-common/pr92984-1.c

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index c4283fdfe1d..871a1537c77 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,25 @@
 2019-12-18  Thomas Schwinge  
 
+	PR libgomp/92726
+	PR libgomp/92970
+	PR libgomp/92984
+	* oacc-mem.c (delete_copyout): No-op behavior if 'lookup_host'
+	fails.
+	(GOACC_enter_exit_data): Simplify accordingly.
+	* testsuite/libgomp.oacc-c-c++-common/pr92970-1.c: New file,
+	subsuming...
+	* testsuite/libgomp.oacc-c-c++-common/lib-17.c: ... this file...
+	* testsuite/libgomp.oacc-c-c++-common/lib-18.c: ..., and this
+	file.
+	* testsuite/libgomp.oacc-c-c++-common/pr92984-1.c: New file,
+	subsuming...
+	* testsuite/libgomp.oacc-c-c++-common/lib-21.c: ... this file...
+	* testsuite/libgomp.oacc-c-c++-common/lib-29.c: ..., and this
+	file.
+	* testsuite/libgomp.oacc-c-c++-common/pr92726-1.c: New file,
+	subsuming...
+	* testsuite/libgomp.oacc-c-c++-common/lib-28.c: ... this file.
+
 	* oacc-mem.c (GOACC_enter_exit_data): Simplify 'exit data'
 	'finalize' handling.
 
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index b21d83c37d8..32bf3656029 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -659,7 +659,9 @@ acc_pcopyin (void *h, size_t s)
 static void
 delete_copyout (unsigned f, void *h, size_t s, int async, const char *libfnname)
 {
-  splay_tree_key n;
+  /* No need to call lazy open, as the data must already have been
+ mapped.  */
+
   struct goacc_thread *thr = goacc_thread ();
   struct gomp_device_descr *acc_dev = thr->dev;
 
@@ -677,16 +679,10 @@ delete_copyout (unsigned f, void *h, size_t s, 

Re: [RFC PATCH] Coalesce host to device transfers in libgomp

2019-12-18 Thread Thomas Schwinge
Hi!

On 2017-10-25T13:38:50+0200, Jakub Jelinek  wrote:
> --- libgomp/target.c.jj   2017-10-24 12:07:03.763759657 +0200
> +++ libgomp/target.c  2017-10-25 13:17:31.608975390 +0200

> +/* Return true for mapping kinds which need to copy data from the
> +   host to device for regions that weren't previously mapped.  */
> +
> +static inline bool
> +gomp_to_device_kind_p (int kind)
> +{
> +  switch (kind)
> +{
> +case GOMP_MAP_ALLOC:
> +case GOMP_MAP_FROM:
> +case GOMP_MAP_FORCE_ALLOC:
> +case GOMP_MAP_ALWAYS_FROM:
> +  return false;
> +default:
> +  return true;
> +}
> +}

Poor 'GOMP_MAP_FORCE_FROM'...  ;'-|

See attached "[OpenACC] In 'libgomp/target.c:gomp_to_device_kind_p',
handle 'GOMP_MAP_FORCE_FROM' like 'GOMP_MAP_FROM'"; committed to trunk in
r279533.
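
As a standalone illustration of the predicate after the fix (a sketch only:
the 'sketch_*' enum is a made-up stand-in for the 'GOMP_MAP_*' constants, with
arbitrary values):

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for a few 'GOMP_MAP_*' kinds; values are arbitrary.  */
enum sketch_kind
{
  SKETCH_MAP_ALLOC,
  SKETCH_MAP_TO,
  SKETCH_MAP_FROM,
  SKETCH_MAP_TOFROM,
  SKETCH_MAP_FORCE_ALLOC,
  SKETCH_MAP_FORCE_FROM,
  SKETCH_MAP_ALWAYS_FROM
};

/* Return true for kinds that need a host-to-device copy when the region
   was not previously mapped.  Pure allocations and device-to-host kinds
   do not, and that now includes the 'force from' kind.  */
bool
sketch_to_device_kind_p (enum sketch_kind kind)
{
  switch (kind)
    {
    case SKETCH_MAP_ALLOC:
    case SKETCH_MAP_FROM:
    case SKETCH_MAP_FORCE_ALLOC:
    case SKETCH_MAP_FORCE_FROM:
    case SKETCH_MAP_ALWAYS_FROM:
      return false;
    default:
      return true;
    }
}
```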


Grüße
 Thomas


From 74bb6382e2be4c478e2f58daa3cdf1c42b6c2480 Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:01:22 +
Subject: [PATCH] [OpenACC] In 'libgomp/target.c:gomp_to_device_kind_p', handle
 'GOMP_MAP_FORCE_FROM' like 'GOMP_MAP_FROM'

Fix oversight from r254194 "Coalesce host to device transfers in libgomp".

	libgomp/
	* target.c (gomp_to_device_kind_p): Handle 'GOMP_MAP_FORCE_FROM'
	like 'GOMP_MAP_FROM'.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279533 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog | 3 +++
 libgomp/target.c  | 1 +
 2 files changed, 4 insertions(+)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 871a1537c77..472519c7e3e 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2019-12-18  Thomas Schwinge  
 
+	* target.c (gomp_to_device_kind_p): Handle 'GOMP_MAP_FORCE_FROM'
+	like 'GOMP_MAP_FROM'.
+
 	PR libgomp/92726
 	PR libgomp/92970
 	PR libgomp/92984
diff --git a/libgomp/target.c b/libgomp/target.c
index 41cf6a3d7d2..a3cdb34bd51 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -279,6 +279,7 @@ gomp_to_device_kind_p (int kind)
 case GOMP_MAP_ALLOC:
 case GOMP_MAP_FROM:
 case GOMP_MAP_FORCE_ALLOC:
+case GOMP_MAP_FORCE_FROM:
 case GOMP_MAP_ALWAYS_FROM:
   return false;
 default:
-- 
2.17.1





Re: [RFC] Offloading Support in libgomp

2019-12-18 Thread Thomas Schwinge
Hi!

On 2019-12-07T15:22:33+0100, I wrote:
> [...] propose the attached patch
> adding a safeguard [...]

See attached "Assert in 'libgomp/target.c:gomp_unmap_vars_internal' that
we're not unmapping 'tgt' while it's still in use"; committed to trunk in
r279534.
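
The idea behind the safeguard can be sketched in isolation as follows.  This
is a simplified model with hypothetical 'sketch_*' types; in particular, the
plain reference counting here only approximates what 'gomp_remove_var' and
'gomp_unmap_vars_internal' do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for 'struct target_mem_desc' and a splay tree
   key.  */
struct sketch_tgt { int refcount; };
struct sketch_key { struct sketch_tgt *tgt; };

/* Drop K's reference on its descriptor; return true if that descriptor
   got unmapped, that is, its reference count reached zero.  */
bool
sketch_remove_var (struct sketch_key *k)
{
  return --k->tgt->refcount == 0;
}

/* Unmap all KEYS belonging to the region described by TGT.  TGT holds one
   reference of its own, which is dropped only after the loop.  Return true
   if TGT itself got unmapped at the end.  */
bool
sketch_unmap_vars (struct sketch_tgt *tgt, struct sketch_key **keys,
		   size_t n)
{
  for (size_t i = 0; i < n; ++i)
    {
      struct sketch_tgt *k_tgt = keys[i]->tgt;
      bool is_tgt_unmapped = sketch_remove_var (keys[i]);
      /* Some other descriptor may go away here, but TGT must not: we are
	 still iterating over its keys and use it after the loop.  */
      assert (!is_tgt_unmapped || k_tgt != tgt);
    }
  return --tgt->refcount == 0;
}
```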


Grüße
 Thomas


From 7c82035afd9b018956fca3f670b2564ec6f0f7ca Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:01:33 +
Subject: [PATCH] Assert in 'libgomp/target.c:gomp_unmap_vars_internal' that
 we're not unmapping 'tgt' while it's still in use

	libgomp/
	* target.c (gomp_unmap_vars_internal): Add a safeguard to
	'gomp_remove_var'.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279534 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog |  3 +++
 libgomp/target.c  | 10 +-
 2 files changed, 12 insertions(+), 1 deletion(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 472519c7e3e..541a2c7610c 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2019-12-18  Thomas Schwinge  
 
+	* target.c (gomp_unmap_vars_internal): Add a safeguard to
+	'gomp_remove_var'.
+
 	* target.c (gomp_to_device_kind_p): Handle 'GOMP_MAP_FORCE_FROM'
 	like 'GOMP_MAP_FROM'.
 
diff --git a/libgomp/target.c b/libgomp/target.c
index a3cdb34bd51..67cd80a3c35 100644
--- a/libgomp/target.c
+++ b/libgomp/target.c
@@ -1225,7 +1225,15 @@ gomp_unmap_vars_internal (struct target_mem_desc *tgt, bool do_copyfrom,
   + tgt->list[i].offset),
 			tgt->list[i].length);
   if (do_unmap)
-	gomp_remove_var (devicep, k);
+	{
+	  struct target_mem_desc *k_tgt = k->tgt;
+	  bool is_tgt_unmapped = gomp_remove_var (devicep, k);
+	  /* It would be bad if TGT got unmapped while we're still iterating
+	 over its LIST_COUNT, and also expect to use it in the following
+	 code.  */
+	  assert (!is_tgt_unmapped
+		  || k_tgt != tgt);
+	}
 }
 
   if (aq)
-- 
2.17.1





Re: [PATCH][ARM][GCC][1/x]: MVE ACLE intrinsics framework patch.

2019-12-18 Thread Kyrill Tkachov



On 11/14/19 7:12 PM, Srinath Parvathaneni wrote:

Hello,

This patch creates the required framework for MVE ACLE intrinsics.

The following changes are done in this patch to support MVE ACLE 
intrinsics.


Header file arm_mve.h is added to source code, which contains the 
definitions of MVE ACLE intrinsics
and different data types used in MVE. Machine description file mve.md 
is also added which contains the

RTL patterns defined for MVE.

A new register "p0" is added which is used by MVE predicated 
patterns. A new register class "VPR_REG"

is added and its contents are defined in REG_CLASS_CONTENTS.

The vec-common.md file is modified to support the standard move 
patterns. The prefix of neon functions

which are also used by MVE is changed from "neon_" to "simd_".
eg: neon_immediate_valid_for_move changed to 
simd_immediate_valid_for_move.


In the patch standard patterns mve_move, mve_store and move_load for 
MVE are added and neon.md and vfp.md

files are modified to support these common patterns.

Please refer to Arm reference manual [1] for more details.

[1] 
https://static.docs.arm.com/ddi0553/bh/DDI0553B_h_armv8m_arm.pdf?_ga=2.102521798.659307368.1572453718-1501600630.1548848914


Regression tested on arm-none-eabi and found no regressions.

Ok for trunk?


Ok.

Thanks,

Kyrill



Thanks,
Srinath

gcc/ChangeLog:

2019-11-11  Andre Vieira 
    Mihail Ionescu  
    Srinath Parvathaneni 

    * config.gcc (arm_mve.h): Add header file.
    * config/arm/aout.h (p0): Add new register name.
    * config/arm-builtins.c (ARM_BUILTIN_SIMD_LANE_CHECK): Define.
    (ARM_BUILTIN_NEON_LANE_CHECK): Remove.
    (arm_init_simd_builtin_types): Add TARGET_HAVE_MVE check.
    (arm_init_neon_builtins): Move a check to arm_init_builtins 
function.
    (arm_init_builtins): Move a check from arm_init_neon_builtins 
function.

    (mve_dereference_pointer): Add new function.
    (arm_expand_builtin_args): Add TARGET_HAVE_MVE check.
    (arm_expand_neon_builtin): Move a check to arm_expand_builtin 
function.
    (arm_expand_builtin): Move a check from 
arm_expand_neon_builtin function.

    * config/arm/arm-c.c (arm_cpu_builtins): Define macros for MVE.
    * config/arm/arm-modes.def (INT_MODE): Add three new integer 
modes.
    * config/arm/arm-protos.h (neon_immediate_valid_for_move): 
Rename function.
    (simd_immediate_valid_for_move): Rename 
neon_immediate_valid_for_move function.
    * config/arm/arm.c 
(arm_options_perform_arch_sanity_checks): Enable mve isa bit.

    (use_return_insn): Add TARGET_HAVE_MVE check.
    (aapcs_vfp_allocate): Add TARGET_HAVE_MVE check.
    (aapcs_vfp_allocate_return_reg): Add TARGET_HAVE_MVE check.
    (thumb2_legitimate_address_p): Add TARGET_HAVE_MVE check.
    (arm_rtx_costs_internal): Add TARGET_HAVE_MVE check.
    (neon_valid_immediate): Rename to simd_valid_immediate.
    (simd_valid_immediate): Rename from neon_valid_immediate.
    (neon_immediate_valid_for_move): Rename to 
simd_immediate_valid_for_move.
    (simd_immediate_valid_for_move): Rename from 
neon_immediate_valid_for_move.
    (neon_immediate_valid_for_logic): Modify call to 
neon_valid_immediate function.
    (neon_make_constant): Modify call to neon_valid_immediate 
function.

    (neon_vector_mem_operand): Add TARGET_HAVE_MVE check.
    (output_move_neon): Add TARGET_HAVE_MVE check.
    (arm_compute_frame_layout): Add TARGET_HAVE_MVE check.
    (arm_save_coproc_regs): Add TARGET_HAVE_MVE check.
    (arm_print_operand): Add case 'E' to print memory operands.
    (arm_print_operand_address): Add TARGET_HAVE_MVE check.
    (arm_hard_regno_mode_ok): Add TARGET_HAVE_MVE check.
    (arm_modes_tieable_p): Add TARGET_HAVE_MVE check.
    (arm_regno_class): Add VPR_REGNUM check.
    (arm_expand_epilogue_apcs_frame): Add TARGET_HAVE_MVE check.
    (arm_expand_epilogue): Add TARGET_HAVE_MVE check.
    (arm_vector_mode_supported_p): Add TARGET_HAVE_MVE check for 
MVE vector modes.

    (arm_array_mode_supported_p): Add TARGET_HAVE_MVE check.
    (arm_conditional_register_usage): For TARGET_HAVE_MVE enable 
VPR register.
    * config/arm/arm.h (IS_VPR_REGNUM): Macro to check for VPR 
register.

    (FIRST_PSEUDO_REGISTER): Modify.
    (VALID_MVE_MODE): Define.
    (VALID_MVE_SI_MODE): Define.
    (VALID_MVE_SF_MODE): Define.
    (VALID_MVE_STRUCT_MODE): Define.
    (REG_ALLOC_ORDER): Add VPR_REGNUM entry.
    (enum reg_class): Add VPR_REG entry.
    (REG_CLASS_NAMES): Add VPR_REG entry.
    * config/arm/arm.md (VPR_REGNUM): Define.
    (arm_movsf_soft_insn): Add TARGET_HAVE_MVE check to not allow MVE.
    (vfp_pop_multiple_with_writeback): Add TARGET_HAVE_MVE check 
to allow writeback.

    (include "mve.md"): Include mve.md file.
    * config/arm/arm_mve.h: New file.
    * config/arm/c

Re: [PATCH] OpenACC reference count overhaul

2019-12-18 Thread Thomas Schwinge
Hi!

On 2019-12-11T18:22:00+0100, I wrote:
> On 2019-10-29T12:15:01+, Julian Brown  wrote:
>> I've removed the special-case handling
>> of pointers in the enter/exit data code, and combined the
>> gomp_acc_remove_pointer code (which now iterated over mappings
>> one-at-a-time anyway) with the loop iterating over mappings in the
>> new goacc_exit_data_internal function. It was a bit nonsensical to have
>> the "exit data" code split over two files, as before.
>
> Yes, I like that very much, and we shall tackle that next intermediate
> step

> One thing:
>
>> libgomp/
>
>> * oacc-parallel.c (find_pointer): Remove function.
>> (find_group_last, goacc_enter_data_internal,
>> goacc_exit_data_internal): New functions.
>> (GOACC_enter_exit_data): Use goacc_enter_data_internal and
>> goacc_exit_data_internal helper functions.
>
> It makes much sense to move all that into 'libgomp/oacc-mem.c', and as a
> preparational step, see attached "[OpenACC] Consolidate
> 'GOACC_enter_exit_data' and its helper functions in
> 'libgomp/oacc-mem.c'", committed to trunk in r279233.

Working incrementally towards the goal of unifying all that mapping
handling code, I did some refactoring ("No functional changes"): see the
attached "[OpenACC] Refactor 'present_create_copy' into
'goacc_enter_data'", "[OpenACC] Refactor 'delete_copyout' into
'goacc_exit_data'", "[OpenACC] Refactor 'GOACC_enter_exit_data' to call
'goacc_enter_data', 'goacc_exit_data'", "[OpenACC] Refactor
'goacc_remove_pointer' interface", "[OpenACC] Refactor 'goacc_enter_data'
so that it can be called from 'goacc_insert_pointer', "not present"
case", "[OpenACC] Refactor 'goacc_enter_data' so that it can be called
from 'goacc_insert_pointer', "present" case, and simplify"; committed to
trunk in r279535, r279536, r279537, r279538, r279539, r279540.
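
As a rough illustration of the first of these refactorings: the old
'present_create_copy' derived the map kind from a flags word, whereas
'goacc_enter_data' takes the map kind directly.  The sketch below models only
that old derivation; the 'SKETCH_*' names are stand-ins invented for this
example, not the libgomp definitions.

```c
#include <assert.h>

/* Stand-ins for the removed 'FLAG_*' macros.  */
#define SKETCH_FLAG_PRESENT (1 << 0)
#define SKETCH_FLAG_CREATE (1 << 1)
#define SKETCH_FLAG_COPY (1 << 2)

/* Stand-in for the two map kinds involved; values are arbitrary.  */
enum sketch_kind { SKETCH_MAP_ALLOC, SKETCH_MAP_TO };

/* Old style: every caller set the "present" and "create" flags, so the
   only real information left in F was whether to copy.  */
enum sketch_kind
sketch_kind_from_flags (unsigned f)
{
  assert ((f & SKETCH_FLAG_PRESENT) && (f & SKETCH_FLAG_CREATE));
  return (f & SKETCH_FLAG_COPY) ? SKETCH_MAP_TO : SKETCH_MAP_ALLOC;
}
```

Passing the kind itself removes this translation step from the callee.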


Grüße
 Thomas


From ab6f9acf81772264a8564f834b9c5d1b5b70213e Mon Sep 17 00:00:00 2001
From: tschwinge 
Date: Wed, 18 Dec 2019 17:01:51 +
Subject: [PATCH 1/6] [OpenACC] Refactor 'present_create_copy' into
 'goacc_enter_data'

Every caller passes in 'FLAG_PRESENT', 'FLAG_CREATE'.  Change the remaining
'FLAG_COPY' into the usual map kind.

No functional changes.

	libgomp/
	* oacc-mem.c (present_create_copy): Refactor into...
	(goacc_enter_data): ... this.  Adjust all users.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@279535 138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  |  3 +++
 libgomp/oacc-mem.c | 37 ++---
 2 files changed, 13 insertions(+), 27 deletions(-)

diff --git a/libgomp/ChangeLog b/libgomp/ChangeLog
index 541a2c7610c..a5d6b51df5f 100644
--- a/libgomp/ChangeLog
+++ b/libgomp/ChangeLog
@@ -1,5 +1,8 @@
 2019-12-18  Thomas Schwinge  
 
+	* oacc-mem.c (present_create_copy): Refactor into...
+	(goacc_enter_data): ... this.  Adjust all users.
+
 	* target.c (gomp_unmap_vars_internal): Add a safeguard to
 	'gomp_remove_var'.
 
diff --git a/libgomp/oacc-mem.c b/libgomp/oacc-mem.c
index 32bf3656029..68b78b3f42f 100644
--- a/libgomp/oacc-mem.c
+++ b/libgomp/oacc-mem.c
@@ -492,12 +492,13 @@ acc_unmap_data (void *h)
 }
 }
 
-#define FLAG_PRESENT (1 << 0)
-#define FLAG_CREATE (1 << 1)
-#define FLAG_COPY (1 << 2)
+
+/* Enter a dynamic mapping.
+
+   Return the device pointer.  */
 
 static void *
-present_create_copy (unsigned f, void *h, size_t s, int async)
+goacc_enter_data (void *h, size_t s, unsigned short kind, int async)
 {
   void *d;
   splay_tree_key n;
@@ -530,12 +531,6 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
   /* Present. */
   d = (void *) (n->tgt->tgt_start + n->tgt_offset + h - n->host_start);
 
-  if (!(f & FLAG_PRESENT))
-{
-	  gomp_mutex_unlock (&acc_dev->lock);
-  gomp_fatal ("[%p,+%d] already mapped to [%p,+%d]",
-	  (void *)h, (int)s, (void *)d, (int)s);
-	}
   if ((h + s) > (void *)n->host_end)
 	{
 	  gomp_mutex_unlock (&acc_dev->lock);
@@ -549,29 +544,18 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 
   gomp_mutex_unlock (&acc_dev->lock);
 }
-  else if (!(f & FLAG_CREATE))
-{
-  gomp_mutex_unlock (&acc_dev->lock);
-  gomp_fatal ("[%p,+%d] not mapped", (void *)h, (int)s);
-}
   else
 {
   struct target_mem_desc *tgt;
   size_t mapnum = 1;
-  unsigned short kinds;
   void *hostaddrs = h;
 
-  if (f & FLAG_COPY)
-	kinds = GOMP_MAP_TO;
-  else
-	kinds = GOMP_MAP_ALLOC;
-
   gomp_mutex_unlock (&acc_dev->lock);
 
   goacc_aq aq = get_goacc_asyncqueue (async);
 
   tgt = gomp_map_vars_async (acc_dev, aq, mapnum, &hostaddrs, NULL, &s,
- &kinds, true, GOMP_MAP_VARS_ENTER_DATA);
+ &kind, true, GOMP_MAP_VARS_ENTER_DATA);
   assert (tgt);
   n = tgt->list[0].key;
   assert (n->refcount == 1);
@@ -593,13 +577,13 @@ present_create_copy (unsigned f, void *h, size_t s, int async)
 void *
 acc_create (void *h, size_

[committed] Drop unused member from cpp_string_location_reader (PR preprocessor/92982)

2019-12-18 Thread David Malcolm
Successfully bootstrapped & regrtested on x86_64-pc-linux-gnu.
Committed to trunk as r279541.

libcpp/ChangeLog:
PR preprocessor/92982
* charset.c
(cpp_string_location_reader::cpp_string_location_reader): Delete
initialization of m_line_table.
* include/cpplib.h (cpp_string_location_reader::m_line_table):
Delete unused member.
---
 libcpp/charset.c| 1 -
 libcpp/include/cpplib.h | 1 -
 2 files changed, 2 deletions(-)

diff --git a/libcpp/charset.c b/libcpp/charset.c
index 956d2dad5c8..0476b58611b 100644
--- a/libcpp/charset.c
+++ b/libcpp/charset.c
@@ -2237,7 +2237,6 @@ _cpp_default_encoding (void)
 cpp_string_location_reader::
 cpp_string_location_reader (location_t src_loc,
line_maps *line_table)
-: m_line_table (line_table)
 {
   src_loc = get_range_from_loc (line_table, src_loc).m_start;
 
diff --git a/libcpp/include/cpplib.h b/libcpp/include/cpplib.h
index e199aecfa48..1349871dc38 100644
--- a/libcpp/include/cpplib.h
+++ b/libcpp/include/cpplib.h
@@ -912,7 +912,6 @@ class cpp_string_location_reader {
  private:
   location_t m_loc;
   int m_offset_per_column;
-  line_maps *m_line_table;
 };
 
 /* A class for storing the source ranges of all of the characters within
-- 
2.21.0



Re: [PATCH 00/13] OpenACC 2.6 manual deep copy support

2019-12-18 Thread Thomas Schwinge
Hi!

On 2019-12-17T22:02:25-0800, Julian Brown  wrote:
> This patch series provides support for OpenACC 2.6's manual deep copy
> (attach/detach) feature.

Thanks.


There is high pressure to get this functionality into GCC 10, but
remaining time is short, given upcoming winter holidays, and GCC
development stage 3 end.  The big "OpenACC reference count overhaul" is a
prerequisite for the actual "OpenACC 2.6 manual deep copy support".
Integrating these changes into GCC trunk in incremental pieces has taken
a considerable amount of time, due to having to research a lot of the
existing GCC implementation as well as intended semantics.  While we made
good progress, it's not complete yet.  I very much would like to continue
working on this in an incremental fashion; however, due to shortage of time,
this is not possible.  Under protest I thus now rubber-stamp approve all
the patches posted here (to the extent I'm able to), without further
review now, and I'm planning to next year then do post-commit review, and
revisions as required.


> Many of these patches have been submitted
> previously, but this series has been rebased and the large deep-copy
> part proper has been split into several pieces for ease of review.

Again: at least as far as I'm concerned, "ease of review" doesn't mean to
artificially split a patch into several pieces per component or
directories/files touched (I don't need separate patches for
'libgomp.oacc-c-c++-common/', and then 'libgomp.oacc-fortran/'), but
instead per self-contained functional change, incrementally.


Grüße
 Thomas




[RFC][C++ PATCH] Don't mangle attributes that have a space in their name

2019-12-18 Thread Richard Sandiford
The SVE port needs to maintain a different type identity for
GNU vectors and "SVE vectors" even during LTO, since the types
use different ABIs.  The easiest way of doing that seemed to be
to use type attributes.  However, these type attributes shouldn't
be user-facing; they're just a convenient way of representing the
types internally in GCC.

There are already several internal-only attributes, such as "fn spec"
and "omp declare simd".  They're distinguished from normal user-facing
attributes by having a space in their name, which means that it isn't
possible to write them directly in C or C++.
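
As a minimal sketch of that naming convention (the function name below is
hypothetical; the patch itself performs the 'strchr' test inline on
'IDENTIFIER_POINTER (name)'):

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Return true if NAME looks like an internal-only attribute name, that is,
   if it contains a space and so cannot be spelled in C or C++ source.  */
bool
sketch_internal_only_attribute_p (const char *name)
{
  assert (name != NULL);
  return strchr (name, ' ') != NULL;
}
```

For example, this returns true for "fn spec" and "omp declare simd", and
false for "abi_tag".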

Taking the same approach mostly works well for SVE.  The only snag
I've hit so far is that the new attribute needs to (and only exists to)
affect type identity.  This means that it would normally get included
in mangled names, to distinguish it from types without the attribute.

However, the SVE ABI specifies a separate mangling for SVE vector types,
rather than using an attribute mangling + a normal vector mangling.
So we need some way of suppressing the attribute mangling for this case.

There are currently no other target-independent or target-specific
internal-only attributes that affect type identity, so this patch goes
for the simplest fix of skipping mangling for attributes whose names
contain a space.  Other options I thought about were:

(1) Also make sure that targetm.mangle_type returns nonnull.

(2) Check directly for the target-specific name.

(3) Add a new target hook.

(4) Add new information to attribute_spec.  This would be very invasive
at this stage, but maybe we should consider replacing all the boolean
fields with flags?  That should make the tables slightly easier to
read and would make adding new flags much simpler in future.

What do you think?  Do any of these sound OK, or is there a better
way of doing it?

Tested on aarch64-linux-gnu and x86_64-linux-gnu.

Thanks,
Richard


2019-12-18  Richard Sandiford  

gcc/cp/
* mangle.c (write_CV_qualifiers_for_type): Don't mangle attributes
that contain a space.

Index: gcc/cp/mangle.c
===
--- gcc/cp/mangle.c 2019-11-29 13:04:13.654674379 +
+++ gcc/cp/mangle.c 2019-12-18 18:20:05.831412127 +
@@ -2377,6 +2377,11 @@ write_CV_qualifiers_for_type (const tree
  tree name = get_attribute_name (a);
  const attribute_spec *as = lookup_attribute_spec (name);
  if (as && as->affects_type_identity
+ /* Skip internal-only attributes, which are distinguished from
+others by having a space.  At present, all internal-only
+attributes that affect type identity are target-specific
+and are handled by targetm.mangle_type instead.  */
+ && !strchr (IDENTIFIER_POINTER (name), ' ')
  && !is_attribute_p ("transaction_safe", name)
  && !is_attribute_p ("abi_tag", name))
vec.safe_push (a);


Re: [Patch, Fortran] PR92896 [10 Regression] Fix - Prevent character conversion in array constructor

2019-12-18 Thread Steve Kargl
On Wed, Dec 18, 2019 at 08:47:31AM +, Mark Eggleston wrote:
>
> It is a bit confusing that the Fortran FE source files have the .c 
> extension implying C when they are C++ and are compiled using C++.

It is not puzzling at all.  gfortran was added to GCC some 15
years ago.  gfortran was originally written in C and is mostly
still written in C.

When asking people, who know Fortran, to get involved in gfortran
development, the #1 excuse why they don't is "I don't know C".
Now, throw C++ on top of C, and it is even harder to find new hands.
C++ creep into gfortran is, IMNSHO, a bad thing.

As to the patch, unless someone else objects, I suppose it's ok.

-- 
Steve


Re: [Patch] Add OpenACC 2.6's no_create

2019-12-18 Thread Thomas Schwinge
Hi Tobias!

On 2019-12-18T13:36:29+0100, Tobias Burnus  wrote:
> libgomp/target.c's gomp_map_vars_internal: it now uses the normal code 
> path in the upper loop, except that one directly bails out when the 
> 'key' has not been found (skipping the adjacent MAP_POINTER as well). 
> The 'case' in the second loop is only reached, if tgt[i]->key == NULL 
> (i.e. if not present) and one can unconditionally skip here. — This 
> seems to be cleaner and should avoid some confusions :-)

Oh, great!  It seems that you managed to decipher what my brain (or was
it my gut feeling?) told me to write down in these TODO comments that I
had added.  ;-)

I have not reviewed the details yet, but from the structure, your changes
look good, and if it works, all the better.


I note you're building up a "dangerous" ;-) level of understanding of OMP
internals!  :-)


> GOMP_MAP_POINTER, following MAP_IF_PRESENT: I am not sure about this. 

So, what does a 'GOMP_MAP_POINTER' following a non-present
'GOMP_MAP_IF_PRESENT' mean -- is this 'GOMP_MAP_POINTER' operation
actually a no-op then, given that in the non-present case we'll just use
the host pointer?  But if it is a no-op, should we then just let the
mapping code execute these 'GOMP_MAP_POINTER' operations, instead of
adding special-case code to skip them?

Are there any interactions with the OpenACC 2.6 manual deep copy
implementation maybe?

> The testsuite digests both mapping and skipping the map pointer. It 
> looks a tad cleaner to avoid mapping the pointer (if the var is not 
> present) – saving also few bytes and cpu cycles. On the down side, it 
> adds an order dependence assumption, namely assuming that the 
> MAP_POINTER after 'no_create'/MAP_IF_PRESENT always belongs to 
> no_create. – [This patch follows the original patch and skips the 
> map_pointer.]

Per his OpenACC 2.6 manual deep copy work, Julian has indeed established
that a 'GOMP_MAP_POINTER' is "only expected after some other mapping";
see "case GOMP_MAP_POINTER" in
<http://mid.mail-archive.com/65540b92dff74db1f15af930f87f7096d03e7efe.1576648001.git.julian@codesourcery.com>,
for example.

See also the "unfinished notes on pointer mapping kinds" that Julian
created.

The question then is, is it (a) correct (also per the OpenACC 2.6 manual
deep copy requirements) to skip these 'GOMP_MAP_POINTER' after
'GOMP_MAP_IF_PRESENT', and (b) only 'GOMP_MAP_POINTER' or also other
"variants", and/or (c) not do that skipping?

(For avoidance of doubt: this is fine to resolve later, given that it may
depend on the pending OpenACC 2.6 manual deep copy, and doesn't seem to
cause any issues at present.)


> Otherwise, except for added acc_is_present calls to no_create-3.c to 
> check that no_create does not cause mapping and applying your/Thomas's 
> patches, it matches my previous version, which was OK'ed. — Hence, I 
> intend to commit it tomorrow, unless there are further comments.

ACK.


> On 12/17/19 8:11 PM, Tobias Burnus wrote:
>> On 12/3/19 4:16 PM, Thomas Schwinge wrote:
>>> Another thing: I've added just another little bit of testsuite 
>>> coverage, and another thing broke. See "TODO" in attached incremental 
>>> patch. […]
>> Files included, the other issue was XFAILed by you (and hence passed). 
>> A fix for that issue is: 
>> https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01135.html — and a 
>> completely separate issue. (That patch is small, very localized and 
>> orthogonal to this patch.)

ACK, that's for later.


>>> The incremental Fortran test case changes have been done in a rush; not
>>> sure if they make much sense, or should see some further work applied to
>>> them.
>>
>> I think one can do more, but they are fine. I am not 100% sure how to 
>> read the following:
>>
>>   ! The no_create clause is meant for partially shared-memory 
>> machines.  This
>>   ! test is written to work on non-shared-memory machines, though this 
>> is not
>>   ! necessarily a useful way to use the no_create clause in practice.

(We inherited that from somebody else.  I too didn't quickly understand
that.)

>>   !$acc parallel !no_create (var)
>>
>> First, why is 'no_create(var)' now commented? – For this code, it 
>> should really work both ways and independent whether commented boils 
>> down to 'copy' (currently) or 'present' (with my other patch, linked 
>> above).

If I remember correctly (remember: "done in a rush"), I think that was my
rationale: we should get kind-of an implicit 'no_create' here.


..., and then, learned something new this evening:

>  .../testsuite/libgomp.oacc-fortran/no_create-1.f90 | 39 ++
>  .../testsuite/libgomp.oacc-fortran/no_create-2.f90 | 90 ++
>  .../testsuite/libgomp.oacc-fortran/no_create-3.F90 | 39 ++

> --- /dev/null
> +++ b/libgomp/testsuite/libgomp.oacc-fortran/no_create-3.F90

Why is this upper-case '.F90' when

Re: [PATCH] V10 patch #5, Fix codegen bug with vector extracts using a variable offset & PC-relative address

2019-12-18 Thread Michael Meissner
On Tue, Dec 17, 2019 at 12:02:46PM -0600, Segher Boessenkool wrote:
> >  ;; Variable V2DI/V2DF extract
> >  (define_insn_and_split "vsx_extract__var"
> > -  [(set (match_operand: 0 "gpc_reg_operand" "=v,wa,r")
> > -   (unspec: [(match_operand:VSX_D 1 "input_operand" "v,m,m")
> > -(match_operand:DI 2 "gpc_reg_operand" "r,r,r")]
> > -   UNSPEC_VSX_EXTRACT))
> > -   (clobber (match_scratch:DI 3 "=r,&b,&b"))
> > -   (clobber (match_scratch:V2DI 4 "=&v,X,X"))]
> > +  [(set (match_operand: 0 "gpc_reg_operand" "=v,wa,r,wa,r")
> > +   (unspec:
> > +[(match_operand:VSX_D 1 "input_operand" "v,em,em,ep,ep")
> > + (match_operand:DI 2 "gpc_reg_operand" "r,r,r,r,r")]
> > +UNSPEC_VSX_EXTRACT))
> > +   (clobber (match_scratch:DI 3 "=r,&b,&b,&b,&b"))
> > +   (clobber (match_scratch:V2DI 4 "=&v,X,X,X,X"))
> > +   (clobber (match_scratch:DI 5 "=X,X,X,&b,&b"))]
> >"VECTOR_MEM_VSX_P (mode) && TARGET_DIRECT_MOVE_64BIT"
> >"#"
> >"&& reload_completed"
> >[(const_int 0)]
> >  {
> >rs6000_split_vec_extract_var (operands[0], operands[1], operands[2],
> > -   operands[3], operands[4]);
> > +   operands[3], operands[4], operands[5]);
> 
> This writes to operands[2], which does not match its constraint.
> 
> Same in the other splitters.

Right.  Good catch.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797


[PATCH] PowerPC, Rename SIGNED_BIT_OFFSET_P to SIGNED_INTEGER_BIT_P

2019-12-18 Thread Michael Meissner
In the patch:
https://gcc.gnu.org/ml/gcc-patches/2019-12/msg01201.html

Segher Boessenkool asked me to submit a patch to rename the macros used to see
if a number is a valid signed 16 or 34-bit value:

> Please follow up with a patch to not call random numbers "OFFSET".

This patch does this, renaming:

SIGNED_34BIT_OFFSET_P   -> SIGNED_INTEGER_34BIT_P
SIGNED_16BIT_OFFSET_P   -> SIGNED_INTEGER_16BIT_P

I did not change the secondary macros (SIGNED_34BIT_OFFSET_EXTRA_P and
SIGNED_16BIT_OFFSET_EXTRA_P), since those are exclusively used for offset
calculations.  But I can if you prefer it that way.
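
As a rough standalone sketch (not the rs6000.h macros themselves; the function names below are made up for illustration, and N is assumed to be between 1 and 63), the N-bit signed range check amounts to the following, and for N == 16 it should agree with the open-coded unsigned comparison:

```cpp
#include <cstdint>

// Standalone sketch of an N-bit signed range check.  The name is
// illustrative only; N is assumed to satisfy 1 <= N <= 63.
constexpr bool
signed_integer_nbit_p (std::int64_t value, unsigned n)
{
  return value >= -(std::int64_t {1} << (n - 1))
         && value <= (std::int64_t {1} << (n - 1)) - 1;
}

// The open-coded unsigned form of the 16-bit test, for comparison.
constexpr bool
open_coded_16bit_p (std::int64_t value)
{
  return static_cast<std::uint64_t> (value) + 0x8000 < 0x10000;
}

static_assert (signed_integer_nbit_p (32767, 16), "largest 16-bit value");
static_assert (!signed_integer_nbit_p (32768, 16), "one past the top");
static_assert (signed_integer_nbit_p (-32768, 16), "smallest 16-bit value");
```

A single parameterized check keeps the 16-bit and 34-bit variants from drifting apart.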

I also converted one use in num_insns_constant_gpr to use the macro (it had
been in previous patches, but I dropped it in the last patch just to get the
minimal change in).

I've bootstrapped compilers with these patches and there was no regression in
the test suite.  Can I check this into the trunk?

Some of the remaining patches in the V10 series will need to be modified as
well.  I will submit those patches (after I rework the vector extract stuff) in
a new series.

2019-12-17   Michael Meissner  

* config/rs6000/predicates.md (cint34_operand): Use
SIGNED_INTEGER_34BIT_P macro.
* config/rs6000/rs6000.c (num_insns_constant_gpr): Use the
SIGNED_INTEGER_16BIT_P and SIGNED_INTEGER_34BIT_P macros.
(address_to_insn_form): Use the SIGNED_INTEGER_16BIT_P and
SIGNED_INTEGER_34BIT_P macros.
* config/rs6000/rs6000.h (SIGNED_INTEGER_NBIT_P): New macro.
(SIGNED_INTEGER_16BIT_P): Rename SIGNED_16BIT_OFFSET_P to be
SIGNED_INTEGER_16BIT_P.
(SIGNED_INTEGER_34BIT_P): Rename SIGNED_34BIT_OFFSET_P to be
SIGNED_INTEGER_34BIT_P.

Index: gcc/config/rs6000/predicates.md
===
--- gcc/config/rs6000/predicates.md (revision 279478)
+++ gcc/config/rs6000/predicates.md (working copy)
@@ -309,7 +309,7 @@ (define_predicate "cint34_operand"
   if (!TARGET_PREFIXED_ADDR)
 return 0;
 
-  return SIGNED_34BIT_OFFSET_P (INTVAL (op));
+  return SIGNED_INTEGER_34BIT_P (INTVAL (op));
 })
 
 ;; Return 1 if op is a register that is not special.
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 279478)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -5557,7 +5557,7 @@ static int
 num_insns_constant_gpr (HOST_WIDE_INT value)
 {
   /* signed constant loadable with addi */
-  if (((unsigned HOST_WIDE_INT) value + 0x8000) < 0x10000)
+  if (SIGNED_INTEGER_16BIT_P (value))
 return 1;
 
   /* constant loadable with addis */
@@ -5566,7 +5566,7 @@ num_insns_constant_gpr (HOST_WIDE_INT va
 return 1;
 
   /* PADDI can support up to 34 bit signed integers.  */
-  else if (TARGET_PREFIXED_ADDR && SIGNED_34BIT_OFFSET_P (value))
+  else if (TARGET_PREFIXED_ADDR && SIGNED_INTEGER_34BIT_P (value))
 return 1;
 
   else if (TARGET_POWERPC64)
@@ -24770,7 +24770,7 @@ address_to_insn_form (rtx addr,
 return INSN_FORM_BAD;
 
   HOST_WIDE_INT offset = INTVAL (op1);
-  if (!SIGNED_34BIT_OFFSET_P (offset))
+  if (!SIGNED_INTEGER_34BIT_P (offset))
 return INSN_FORM_BAD;
 
   /* Check for local and external PC-relative addresses.  Labels are always
@@ -24789,7 +24789,7 @@ address_to_insn_form (rtx addr,
 return INSN_FORM_BAD;
 
   /* Large offsets must be prefixed.  */
-  if (!SIGNED_16BIT_OFFSET_P (offset))
+  if (!SIGNED_INTEGER_16BIT_P (offset))
 {
   if (TARGET_PREFIXED_ADDR)
return INSN_FORM_PREFIXED_NUMERIC;
Index: gcc/config/rs6000/rs6000.h
===
--- gcc/config/rs6000/rs6000.h  (revision 279478)
+++ gcc/config/rs6000/rs6000.h  (working copy)
@@ -2529,18 +2529,16 @@ typedef struct GTY(()) machine_function
 #pragma GCC poison TARGET_FLOAT128 OPTION_MASK_FLOAT128 MASK_FLOAT128
 #endif
 
-/* Whether a given VALUE is a valid 16 or 34-bit signed offset.  */
-#define SIGNED_16BIT_OFFSET_P(VALUE)   \
+/* Whether a given VALUE is a valid 16 or 34-bit signed integer.  */
+#define SIGNED_INTEGER_NBIT_P(VALUE, N) \
   IN_RANGE ((VALUE),   \
-   -(HOST_WIDE_INT_1 << 15),   \
-   (HOST_WIDE_INT_1 << 15) - 1)
+   -(HOST_WIDE_INT_1 << ((N)-1)),  \
+   (HOST_WIDE_INT_1 << ((N)-1)) - 1)
 
-#define SIGNED_34BIT_OFFSET_P(VALUE)   \
-  IN_RANGE ((VALUE),   \
-   -(HOST_WIDE_INT_1 << 33),   \
-   (HOST_WIDE_INT_1 << 33) - 1)
+#define SIGNED_INTEGER_16BIT_P(VALUE)  SIGNED_INTEGER_NBIT_P (VALUE, 16)
+#define SIGNED_INTEGER_34BIT_P(VALUE)  SIGNED_INTEGER_NBIT_P (VALUE, 34)
 
-/* Like SIGNED_16BIT_OFFSE

Re: [PATCH 10/13] OpenACC 2.6 deep copy: Fortran front-end parts

2019-12-18 Thread Tobias Burnus

On 12/18/19 7:04 AM, Julian Brown wrote:


This part contains the Fortran front-end support for parsing the new
attach and detach clauses, as well as derived-type members on other
data-movement clauses (copyin, copyout, etc.).
I browsed the patch and it looks mostly fine to me. However, I do have 
comments related to the array refs.

@@ -3890,9 +3922,6 @@ check_array_not_assumed (gfc_symbol *sym, locus loc, const char *name)
  static void
  resolve_oacc_data_clauses (gfc_symbol *sym, locus loc, const char *name)
  {
+ /* Disallow duplicate bare variable references and multiple
+subarrays of the same array here, but allow multiple components of
+the same (e.g. derived-type) variable.  For the latter, duplicate
+components are detected elsewhere.  */

Do we have a test case for "the latter"?

@@ -4470,23 +4514,43 @@ resolve_omp_clauses (gfc_code *code, gfc_omp_clauses *omp_clauses,
+   gfc_ref *array_ref = NULL;
+   bool resolved = false;
if (n->expr)
  {
-   if (!gfc_resolve_expr (n->expr)
+   array_ref = n->expr->ref;
+   resolved = gfc_resolve_expr (n->expr);
+
+   /* Look through component refs to find last array
+  reference.  */
+   if (openacc)
+ while (resolved
I would move the "resolved" into the "if" condition as it doesn't change 
in the while loop.

+&& array_ref
+&& (array_ref->type == REF_COMPONENT
+|| (array_ref->type == REF_ARRAY
+&& array_ref->next
+&& (array_ref->next->type
+== REF_COMPONENT))))
+   array_ref = array_ref->next;
+ }
+   if (array_ref
+   || (n->expr
+   && (!resolved || n->expr->expr_type != EXPR_VARIABLE)))
+ {
+   if (!resolved
|| n->expr->expr_type != EXPR_VARIABLE
-   || n->expr->ref == NULL
-   || n->expr->ref->next
-   || n->expr->ref->type != REF_ARRAY)
+   || array_ref->next
+   || array_ref->type != REF_ARRAY)
  gfc_error ("%qs in %s clause at %L is not a proper "
 "array section", n->sym->name, name,
 &n->where);
-   else if (n->expr->ref->u.ar.codimen)
+   else if (array_ref->u.ar.codimen)
  gfc_error ("Coarrays not supported in %s clause at %L",
 name, &n->where);


First, I believe the error message is wrong – coarrays are permitted but 
only if their local data is accessed; this check tests whether a 
coindex is present, i.e. whether the variable is accessed on a remote 
process ("image"). Hence, the error should say something like "Entry 
shall not be coindexed in %s clause at %L".


Secondly, a coarray can exist at different places, e.g.

type t
  integer :: i
end type t
type t2
  integer, allocatable :: i[:]
  type(t), allocatable :: x[:]
end type t2
type(t), allocatable :: A[:], B[:]
type(t) :: D[*]
type(t2) :: C
!$acc data copy(D[2]%i, A[4], B[4]%i, C%i[2], C%x[4]%i)

Here, C is not a coarray. But all those list items in the clause are
coindexed – but your new check will only detect those where the ultimate
component is coindexed. The quickest check for this is "gfc_is_coindexed 
(expr)".
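
To make the point about the ultimate component concrete, here is a heavily simplified sketch.  The types and names are hypothetical stand-ins, not the front end's gfc_ref, and the real gfc_is_coindexed may well do more than this; the only idea shown is walking the whole reference chain rather than inspecting just the last entry.

```cpp
// Hypothetical, heavily simplified stand-ins for the front end's
// reference chain; the real structure carries far more state.
enum ref_kind { REF_KIND_ARRAY, REF_KIND_COMPONENT };

struct ref_node
{
  ref_kind kind;
  bool coindexed;        // True when this reference carries a coindex.
  const ref_node *next;
};

// Return true if any reference in the chain is coindexed, not just
// the last one.
static bool
any_ref_coindexed (const ref_node *ref)
{
  for (; ref; ref = ref->next)
    if (ref->kind == REF_KIND_ARRAY && ref->coindexed)
      return true;
  return false;
}
```

For a chain modelling C%x[4]%i the last reference is a plain component, so a check that looks only at the end would miss the coindex in the middle.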

Thirdly, I am not sure whether the following will work with your code:
type t
  integer :: i(5), j(17), k
end type t
type(t) :: x(10)
!$acc data copy (x(:)%k, x(:)%j(3))

This data is strided; I don't quickly see whether that's rejected. (I also
didn't check whether it is valid, but I think it is not.)


+/* Transparently dereference VAR if it is a pointer, reference, etc.
+   according to Fortran semantics.  */
+
+tree
+gfc_auto_dereference_var (gfc_symbol *sym, tree var, bool descriptor_only_p,
+ bool is_classarray)


I have to admit that 'transparently dereference' and 'auto' puzzle me, 
naming/description wise, and I don't have a good solution; but I like 
the 'Dereference the expression, where needed' wording more.



+  /* Dereference the expression, where needed.  */
+  se->expr = gfc_auto_dereference_var (sym, se->expr, se->descriptor_only,
+  is_classarray);



+++ b/gcc/fortran/trans-openmp.c
+  if (element)
+{
+  gfc_conv_expr_reference (&se, n->expr);
+  gfc_add_block_to_block (block, &se.pre);
+  ptr = se.expr;
+  OMP_CLAUSE_SIZE (node)
+   = TYPE_SIZE_UNIT (TREE_TYPE (ptr));


This fits nicely on a single line; I think the ';' is in c

[PATCH] Partially fix libgomp/testsuite/libgomp.c/pr86416-*.c

2019-12-18 Thread Jakub Jelinek
Hi!

As mentioned in the PR, I believe we should use long double float suffixes
in the test testing long double and Q suffixes in the test that tests
__float128, both because PowerPC doesn't allow mixing them and because only
the latter test is guarded on float128 support.
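
As a small standalone illustration of why the suffix matters (a sketch only; within_tolerance is a made-up helper, and the GNU Q suffix is deliberately left out so that this compiles without extensions), the suffix alone determines the type of a floating constant:

```cpp
#include <type_traits>

// An unsuffixed floating constant has type double; the L suffix gives
// long double.  (The Q suffix for __float128 is a GNU extension and is
// deliberately not used here.)
static_assert (std::is_same<decltype (1.0e-5), double>::value,
               "unsuffixed constant is double");
static_assert (std::is_same<decltype (1.0e-5L), long double>::value,
               "L-suffixed constant is long double");

// Hypothetical check written with consistently suffixed constants.
static bool
within_tolerance (long double v)
{
  return !(v > 1.0e-5L || v < -1.0e-5L);
}
```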

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2019-12-18  Jakub Jelinek  

PR middle-end/86416
* testsuite/libgomp.c/pr86416-1.c (main): Use L suffixes rather than
q or none.
* testsuite/libgomp.c/pr86416-2.c (main): Use Q suffixes rather than
L or none.

--- libgomp/testsuite/libgomp.c/pr86416-1.c.jj  2019-12-18 21:25:02.856131826 +0100
+++ libgomp/testsuite/libgomp.c/pr86416-1.c 2019-12-18 21:28:06.275349386 +0100
@@ -16,7 +16,7 @@ long double foo (long double x)
 
 int main()
 {
-  long double v = foo (10.0q) - 20.0q;
-  if (v > 1.0e-5 || v < -1.0e-5) abort();
+  long double v = foo (10.0L) - 20.0L;
+  if (v > 1.0e-5L || v < -1.0e-5L) abort();
   return 0;
 }
--- libgomp/testsuite/libgomp.c/pr86416-2.c.jj  2019-12-18 21:25:02.855131842 +0100
+++ libgomp/testsuite/libgomp.c/pr86416-2.c 2019-12-18 21:28:41.708811864 +0100
@@ -16,7 +16,7 @@ __float128 foo(__float128 y)
 
 int main()
 {
-  __float128 v = foo (5.0L) - 20.0L;
-  if (v > 1.0e-5 || v < -1.0e-5) abort();
+  __float128 v = foo (5.0Q) - 20.0Q;
+  if (v > 1.0e-5Q || v < -1.0e-5Q) abort();
   return 0;
 }

Jakub



[committed] Avoid some Fortran FE opts in OpenMP atomics (PR fortran/92977)

2019-12-18 Thread Jakub Jelinek
Hi!

Similarly to EXEC_OMP_WORKSHARE, EXEC_OMP_ATOMIC also has very tight rules
about what can and can't appear in the block, enforced through parsing and
resolving, so e.g. inserting EXEC_BLOCK there leads to ICEs.  In theory, one
could add such a BLOCK around the atomic rather than inside of it, but the
code isn't prepared to be able to do that and furthermore there is still
risk of breaking the EXEC_OMP_ATOMIC expectations.
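
The general shape of such a guard, sketched on a made-up miniature statement tree (none of these types or names come from the Fortran front end; they are assumptions for illustration), is a flag that is saved on entry to each statement, set when an atomic is entered, and restored on exit:

```cpp
#include <vector>

// Hypothetical miniature statement tree; not the front end's structure.
enum stmt_kind { STMT_PLAIN, STMT_ATOMIC };

struct stmt
{
  stmt_kind kind;
  std::vector<const stmt *> children;
};

static bool in_atomic = false;   // True while walking inside an atomic.
static int optimized = 0;        // Number of statements we chose to rewrite.

static void
walk (const stmt &s)
{
  bool saved_in_atomic = in_atomic;   // Save the flag on entry...
  if (s.kind == STMT_ATOMIC)
    in_atomic = true;
  if (!in_atomic)
    ++optimized;                      // Only rewrite outside atomics.
  for (const stmt *child : s.children)
    walk (*child);
  in_atomic = saved_in_atomic;        // ...and restore it on exit.
}
```

Saving and restoring around each statement means siblings that follow the atomic are optimized as usual.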

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
committed to trunk so far.

2019-12-19  Jakub Jelinek  

PR fortran/92977
* frontend-passes.c (in_omp_atomic): New variable.
(cfe_expr_0, matmul_to_var_expr, matmul_temp_args,
inline_matmul_assign, call_external_blas): Don't optimize in
EXEC_OMP_ATOMIC.
(optimize_namespace): Clear in_omp_atomic.
(gfc_code_walker): Set in_omp_atomic for EXEC_OMP_ATOMIC, save/restore
it around.

* gfortran.dg/gomp/pr92977.f90: New test.

--- gcc/fortran/frontend-passes.c.jj    2019-11-09 18:08:47.866374726 +0100
+++ gcc/fortran/frontend-passes.c   2019-12-18 14:45:23.996493420 +0100
@@ -92,6 +92,10 @@ static int forall_level;
 
 static bool in_omp_workshare;
 
+/* Keep track of whether we are within an OMP atomic.  */
+
+static bool in_omp_atomic;
+
 /* Keep track of whether we are within a WHERE statement.  */
 
 static bool in_where;
@@ -913,9 +917,9 @@ cfe_expr_0 (gfc_expr **e, int *walk_subt
   gfc_expr *newvar;
   gfc_expr **ei, **ej;
 
-  /* Don't do this optimization within OMP workshare or ASSOC lists.  */
+  /* Don't do this optimization within OMP workshare/atomic or ASSOC lists.  */
 
-  if (in_omp_workshare || in_assoc_list)
+  if (in_omp_workshare || in_omp_atomic || in_assoc_list)
 {
   *walk_subtrees = 0;
   return 0;
@@ -1464,6 +1468,7 @@ optimize_namespace (gfc_namespace *ns)
   iterator_level = 0;
   in_assoc_list = false;
   in_omp_workshare = false;
+  in_omp_atomic = false;
 
   if (flag_frontend_optimize)
 {
@@ -2818,7 +2823,7 @@ matmul_to_var_expr (gfc_expr **ep, int *
 return 0;
 
   if (forall_level > 0 || iterator_level > 0 || in_omp_workshare
-  || in_where || in_assoc_list)
+  || in_omp_atomic || in_where || in_assoc_list)
 return 0;
 
   /* Check if this is already in the form c = matmul(a,b).  */
@@ -2880,7 +2885,7 @@ matmul_temp_args (gfc_code **c, int *wal
 return 0;
 
   if (forall_level > 0 || iterator_level > 0 || in_omp_workshare
-  || in_where)
+  || in_omp_atomic || in_where)
 return 0;
 
   /* This has some duplication with inline_matmul_assign.  This
@@ -3848,7 +3853,7 @@ inline_matmul_assign (gfc_code **c, int
   /* For now don't do anything in OpenMP workshare, it confuses
  its translation, which expects only the allowed statements in there.
  We should figure out how to parallelize this eventually.  */
-  if (in_omp_workshare)
+  if (in_omp_workshare || in_omp_atomic)
 return 0;
 
   expr1 = co->expr1;
@@ -4385,7 +4390,7 @@ call_external_blas (gfc_code **c, int *w
   /* For now don't do anything in OpenMP workshare, it confuses
  its translation, which expects only the allowed statements in there. */
 
-  if (in_omp_workshare)
+  if (in_omp_workshare || in_omp_atomic)
 return 0;
 
   expr1 = co->expr1;
@@ -5047,6 +5052,7 @@ gfc_code_walker (gfc_code **c, walk_code
  gfc_code *co;
  gfc_association_list *alist;
  bool saved_in_omp_workshare;
+ bool saved_in_omp_atomic;
  bool saved_in_where;
 
  /* There might be statement insertions before the current code,
@@ -5054,6 +5060,7 @@ gfc_code_walker (gfc_code **c, walk_code
 
  co = *c;
  saved_in_omp_workshare = in_omp_workshare;
+ saved_in_omp_atomic = in_omp_atomic;
  saved_in_where = in_where;
 
  switch (co->op)
@@ -5251,6 +5258,10 @@ gfc_code_walker (gfc_code **c, walk_code
  WALK_SUBEXPR (co->ext.dt->extra_comma);
  break;
 
+   case EXEC_OMP_ATOMIC:
+ in_omp_atomic = true;
+ break;
+
case EXEC_OMP_PARALLEL:
case EXEC_OMP_PARALLEL_DO:
case EXEC_OMP_PARALLEL_DO_SIMD:
@@ -5368,6 +5379,7 @@ gfc_code_walker (gfc_code **c, walk_code
select_level --;
 
  in_omp_workshare = saved_in_omp_workshare;
+ in_omp_atomic = saved_in_omp_atomic;
  in_where = saved_in_where;
}
 }
--- gcc/testsuite/gfortran.dg/gomp/pr92977.f90.jj   2019-12-18 15:16:14.657486591 +0100
+++ gcc/testsuite/gfortran.dg/gomp/pr92977.f90  2019-12-18 15:16:08.310582750 +0100
@@ -0,0 +1,15 @@
+! PR fortran/92977
+! { dg-do compile }
+! { dg-additional-options "-O2" }
+
+program pr92977
+  integer :: n = 1
+  integer :: a
+!$omp atomic write
+  a = f(n) - f(n)
+contains
+  integer function f(x)
+integer, intent(in) :: x
+f = x
+  end
+end

Jakub



[C++ PATCH] Don't ignore side-effects on decltype(nullptr) typed args passed to ... (PR c++/92992)

2019-12-18 Thread Jakub Jelinek
Hi!

While looking at PR92666, I've spotted a wrong-code issue where we ignore
any side-effects on arguments passed to ellipsis if they have
decltype(nullptr) type.
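
At the source level, the intended behaviour is roughly that of a comma expression: evaluate the argument for its side effects, then yield a null pointer.  The sketch below is only an analogy with made-up names, not what the front end emits:

```cpp
#include <cstddef>

static int calls = 0;

// Returns a null pointer constant, but has a visible side effect.
static std::nullptr_t
bump ()
{
  ++calls;
  return nullptr;
}

// Evaluate the argument for its side effects, then yield a null pointer:
// the source-level analogue of wrapping the argument in a compound
// expression rather than replacing it outright.
static void *
as_ellipsis_arg ()
{
  return (bump (), static_cast<void *> (nullptr));
}
```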

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk and release branches?

2019-12-19  Jakub Jelinek  

PR c++/92992
* call.c (convert_arg_to_ellipsis): For decltype(nullptr) arguments
that have side-effects use cp_build_compound_expr.

* g++.dg/cpp0x/nullptr45.C: New test.

--- gcc/cp/call.c.jj    2019-12-17 10:19:51.013282361 +0100
+++ gcc/cp/call.c   2019-12-18 18:23:01.441357443 +0100
@@ -7822,7 +7822,12 @@ convert_arg_to_ellipsis (tree arg, tsubs
   arg = convert_to_real_nofold (double_type_node, arg);
 }
   else if (NULLPTR_TYPE_P (arg_type))
-arg = null_pointer_node;
+{
+  if (TREE_SIDE_EFFECTS (arg))
+   arg = cp_build_compound_expr (arg, null_pointer_node, complain);
+  else
+   arg = null_pointer_node;
+}
   else if (INTEGRAL_OR_ENUMERATION_TYPE_P (arg_type))
 {
   if (SCOPED_ENUM_P (arg_type))
--- gcc/testsuite/g++.dg/cpp0x/nullptr45.C.jj   2019-12-18 18:37:48.537933751 +0100
+++ gcc/testsuite/g++.dg/cpp0x/nullptr45.C  2019-12-18 18:37:17.290406672 +0100
@@ -0,0 +1,24 @@
+// PR c++/92992
+// { dg-do run { target c++11 } }
+
+int a;
+
+void
+bar (int, ...)
+{
+}
+
+decltype (nullptr)
+baz ()
+{
+  a++;
+  return nullptr;
+}
+
+int
+main ()
+{
+  bar (0, baz ());
+  if (a != 1)
+__builtin_abort ();
+}

Jakub



[C++ PATCH] Fix -Wunused-but-set-* false positives in arg passing to ... (PR c++/92666)

2019-12-18 Thread Jakub Jelinek
Hi!

convert_arg_to_ellipsis used to call decay_conversion for all types
(which calls mark_rvalue_use), but it doesn't anymore in GCC 10,
and while for INTEGRAL_OR_ENUMERATION_TYPE_P args it calls
cp_perform_integral_promotions which does that too and for aggregate
args keeps calling decay_conversion, for floating point or decltype(nullptr)
args there is nothing that would shut up -Wunused-but-set-* warning
(and I guess equally also handle lambda captures).

Fixed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok for
trunk?

2019-12-19  Jakub Jelinek  

PR c++/92666
* call.c (convert_arg_to_ellipsis): For floating point or
decltype(nullptr) arguments call mark_rvalue_use.

* g++.dg/warn/Wunused-var-36.C: New test.

--- gcc/cp/call.c.jj    2019-12-17 10:19:51.013282361 +0100
+++ gcc/cp/call.c   2019-12-18 18:23:01.441357443 +0100
@@ -7819,10 +7819,12 @@ convert_arg_to_ellipsis (tree arg, tsubs
"implicit conversion from %qH to %qI when passing "
"argument to function",
arg_type, double_type_node);
+  arg = mark_rvalue_use (arg);
   arg = convert_to_real_nofold (double_type_node, arg);
 }
   else if (NULLPTR_TYPE_P (arg_type))
 {
+  arg = mark_rvalue_use (arg);
   if (TREE_SIDE_EFFECTS (arg))
arg = cp_build_compound_expr (arg, null_pointer_node, complain);
   else
--- gcc/testsuite/g++.dg/warn/Wunused-var-36.C.jj   2019-12-18 18:05:50.804946325 +0100
+++ gcc/testsuite/g++.dg/warn/Wunused-var-36.C  2019-12-18 18:40:19.130654606 +0100
@@ -0,0 +1,25 @@
+// PR c++/92666
+// { dg-do compile }
+// { dg-options "-Wunused-but-set-variable" }
+
+int bar (int, ...);
+#if __cplusplus >= 201103L
+enum class E : int { F = 0, G = 1 };
+#endif
+struct S { int s; };
+
+void
+foo ()
+{
+  float r = 1.0f;  // { dg-bogus "set but not used" }
+  int i = 2;   // { dg-bogus "set but not used" }
+#if __cplusplus >= 201103L
+  decltype(nullptr) n = nullptr;   // { dg-bogus "set but not used" }
+  E e = E::F;  // { dg-bogus "set but not used" }
+#else
+  void *n = (void *) 0;
+  int e = 4;
+#endif
+  S s = { 3 }; // { dg-bogus "set but not used" }
+  bar (0, r, i, n, e, s);
+}

Jakub



[committed] Add diagnostic_metadata and CWE support

2019-12-18 Thread David Malcolm
This patch adds support for associating a diagnostic message with an
optional diagnostic_metadata object, so that plugins can add extra data
to their diagnostics (e.g. mapping a diagnostic to a taxonomy or coding
standard such as from CERT or MISRA).

Currently this only supports associating a CWE identifier with a
diagnostic (which is what I'm using for the warnings in the analyzer
patch kit), but adding a diagnostic_metadata class allows for future
growth in this area without an explosion of further "warning_at"
overloads for all of the different kinds of custom data that a plugin
might want to add.
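
To give a feel for the shape of this, here is a minimal standalone sketch.  The class and function names are hypothetical stand-ins rather than the contents of diagnostic-metadata.h, and the bracketed output format is an assumption made for illustration:

```cpp
#include <string>

// Hypothetical stand-in for a metadata object; only a CWE id is
// modelled, with 0 meaning "none".
class metadata_sketch
{
public:
  metadata_sketch () : m_cwe (0) {}
  void add_cwe (int cwe) { m_cwe = cwe; }
  int get_cwe () const { return m_cwe; }

private:
  int m_cwe;
};

// Render the bracketed tag a diagnostic might carry, or an empty string
// when no CWE was attached.
static std::string
format_cwe (const metadata_sketch &m)
{
  if (m.get_cwe () == 0)
    return std::string ();
  return "[CWE-" + std::to_string (m.get_cwe ()) + "]";
}
```

Bundling the extra data in one object means a new kind of metadata only needs a new field and accessor, not another overload of every reporting function.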

This version of the patch renames the overly-general
-fdiagnostics-show-metadata to -fdiagnostics-show-cwe and adds test
coverage for it via a plugin.

It also adds a note to the documentation that no GCC diagnostics
currently use this; it's a feature for plugins (and, at some point,
I hope, the analyzer).

Successfully bootstrapped & regtested on x86_64-pc-linux-gnu.

Committed to trunk as r279556.

I've also pushed an incremental version of this patch relative to
  https://gcc.gnu.org/ml/gcc-patches/2019-12/msg00998.html
to dmalcolm/analyzer on the GCC git mirror, which also successfully
bootstrapped & regtested on x86_64-pc-linux-gnu.

Dave

gcc/ChangeLog:
* common.opt (fdiagnostics-show-cwe): Add.
* diagnostic-core.h (class diagnostic_metadata): New forward decl.
(warning_at): Add overload taking a const diagnostic_metadata &.
(emit_diagnostic_valist): Add overload taking a
const diagnostic_metadata *.
* diagnostic-format-json.cc: Include "diagnostic-metadata.h".
(json_from_metadata): New function.
(json_end_diagnostic): Call it to add "metadata" child for
diagnostics with metadata.
(diagnostic_output_format_init): Clear context->show_cwe.
* diagnostic-metadata.h: New file.
* diagnostic.c: Include "diagnostic-metadata.h".
(diagnostic_impl): Add const diagnostic_metadata * param.
(diagnostic_n_impl): Likewise.
(diagnostic_initialize): Initialize context->show_cwe.
(diagnostic_set_info_translated): Initialize diagnostic->metadata.
(get_cwe_url): New function.
(print_any_cwe): New function.
(diagnostic_report_diagnostic): Call print_any_cwe if the
diagnostic has non-NULL metadata.
(emit_diagnostic): Pass NULL as the metadata in the call to
diagnostic_impl.
(emit_diagnostic_valist): Likewise.
(emit_diagnostic_valist): New overload taking a
const diagnostic_metadata *.
(inform): Pass NULL as the metadata in the call to
diagnostic_impl.
(inform_n): Likewise for diagnostic_n_impl.
(warning): Likewise.
(warning_at): Likewise.  Add overload that takes a
const diagnostic_metadata &.
(warning_n): Pass NULL as the metadata in the call to
diagnostic_n_impl.
(pedwarn): Likewise for diagnostic_impl.
(permerror): Likewise.
(error): Likewise.
(error_n): Likewise.
(error_at): Likewise.
(sorry): Likewise.
(sorry_at): Likewise.
(fatal_error): Likewise.
(internal_error): Likewise.
(internal_error_no_backtrace): Likewise.
* diagnostic.h (diagnostic_info::metadata): New field.
(diagnostic_context::show_cwe): New field.
* doc/invoke.texi (-fno-diagnostics-show-cwe): New option.
* opts.c (common_handle_option): Handle OPT_fdiagnostics_show_cwe.
* toplev.c (general_init): Initialize global_dc->show_cwe.

gcc/testsuite/ChangeLog:
* gcc.dg/plugin/diagnostic-test-metadata.c: New test.
* gcc.dg/plugin/diagnostic_plugin_test_metadata.c: New test plugin.
* gcc.dg/plugin/plugin.exp (plugin_test_list): Add them.
---
 gcc/common.opt|   4 +
 gcc/diagnostic-core.h |  10 ++
 gcc/diagnostic-format-json.cc |  24 +++
 gcc/diagnostic-metadata.h |  42 ++
 gcc/diagnostic.c  | 142 ++
 gcc/diagnostic.h  |   8 +
 gcc/doc/invoke.texi   |  10 ++
 gcc/opts.c|   4 +
 .../gcc.dg/plugin/diagnostic-test-metadata.c  |   9 ++
 .../plugin/diagnostic_plugin_test_metadata.c  | 140 +
 gcc/testsuite/gcc.dg/plugin/plugin.exp|   1 +
 gcc/toplev.c  |   2 +
 12 files changed, 365 insertions(+), 31 deletions(-)
 create mode 100644 gcc/diagnostic-metadata.h
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic-test-metadata.c
 create mode 100644 gcc/testsuite/gcc.dg/plugin/diagnostic_plugin_test_metadata.c

diff --git a/gcc/common.opt b/gcc/common.opt
index b4dc31c7490..058da8af877 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1334,6 +1334,10 @@ fdiagnostics-show-option
 Commo

[C++ PATCH] PR c++/91165 follow-on tweak

2019-12-18 Thread Jason Merrill
I talked in the PR about possibly stripping the location from the args in
the hash table, since if we use the cache the locations would be wrong, but
didn't actually do anything about that.  Then I noticed that there's already
unshare_expr_without_location...
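
The reasoning above can be illustrated with a small standalone sketch. It is not GCC code: `Expr`, `copy_without_location` and `CallCache` are hypothetical stand-ins for a tree node, for the effect of unshare_expr_without_location, and for the constexpr call table. The point is only that an entry reused from a different call site should not carry the location of the first call.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for a tree node: a value plus the source
// location of the expression that produced it.
struct Expr {
  int value;
  std::string location;  // e.g. "a.cc:10"; empty means "no location"
};

// Models the effect of an unshare-without-location copy: a private
// copy of the expression whose location has been dropped.
Expr copy_without_location(const Expr &e) { return Expr{e.value, ""}; }

// A memoization table keyed only by argument value, loosely modelling
// a cache of call results keyed by the bound arguments.
struct CallCache {
  std::map<int, Expr> entries;

  // Return the cached argument for this value, inserting a
  // location-free copy the first time the value is seen.
  const Expr &lookup_or_insert(const Expr &arg) {
    auto it = entries.find(arg.value);
    if (it == entries.end())
      it = entries.emplace(arg.value, copy_without_location(arg)).first;
    return it->second;
  }
};
```

With this shape, a second lookup of the same value from another call site gets the shared entry, and that entry does not claim to come from the first call site.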

Tested x86_64-pc-linux-gnu, applying to trunk.

* constexpr.c (cxx_eval_call_expression): Use
unshare_expr_without_location.
---
 gcc/cp/constexpr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 87d78d26728..b95da0f8342 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -2079,7 +2079,7 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
  /* Unshare args going into the hash table to separate them
 from the caller's context, for better GC and to avoid
 problems with verify_gimple.  */
- arg = unshare_expr (arg);
+ arg = unshare_expr_without_location (arg);
  TREE_VEC_ELT (bound, i) = arg;
}
  /* Don't share a CONSTRUCTOR that might be changed.  This is not

base-commit: 05df605885dee7bf66bcafeed961aac9827bdb27
-- 
2.18.1



Re: [PATCH] [RFC] ipa: duplicate ipa_size_summary for cloned nodes

2019-12-18 Thread luoxhu
On 2019/12/18 23:48, Jan Hubicka wrote:
>> The size_info of ipa_size_summary was introduced by r277424.  It should be
>> duplicated for cloned nodes; otherwise self_size and
>> estimated_self_stack_size would be 0, making the params
>> large-function-insns and large-function-growth work inaccurately
>> during ipa-inline.
>>
>> gcc/ChangeLog:
>>
>>  2019-12-18  Luo Xiong Hu  
>>
>>  * ipa-fnsummary.c (ipa_fn_summary_t::duplicate): Copy
>>  ipa_size_summary for cloned nodes.
>> ---
>>   gcc/ipa-fnsummary.c | 5 +
>>   1 file changed, 5 insertions(+)
>>
>> diff --git a/gcc/ipa-fnsummary.c b/gcc/ipa-fnsummary.c
>> index a46b1445765..9a01be1708b 100644
>> --- a/gcc/ipa-fnsummary.c
>> +++ b/gcc/ipa-fnsummary.c
>> @@ -868,7 +868,12 @@ ipa_fn_summary_t::duplicate (cgraph_node *src,
>>  }
>>   }
>> if (!dst->inlined_to)
>> +  {
>> +class ipa_size_summary *src_size = ipa_size_summaries->get_create (src);
>> +class ipa_size_summary *dst_size = ipa_size_summaries->get_create (dst);
> 
> This is intended to happen by the default duplicate method of
> ipa_size_summaries via the copy constructor. It seems there is a stupid
> copy-paste error and the copy constructor is unused since the default
> duplicate implementation does nothing (I wonder why).
> 
> I am testing the attached patch.  Does this help?

Yes, it works.  Thanks for the refinement.  The default duplicate
implementation is in symbol-summary.h (template class
function_summary_base::duplicate).  I tried to call duplicate in it, but
that causes a lot of errors because many classes don't implement the
virtual duplicate function.  Please commit your patch once testing
passes :)

Xiong Hu
> 
> Index: ipa-fnsummary.h
> ===
> --- ipa-fnsummary.h   (revision 279523)
> +++ ipa-fnsummary.h   (working copy)
> @@ -99,11 +99,6 @@ public:
> : estimated_self_stack_size (0), self_size (0), size (0)
> {
> }
> -  /* Copy constructor.  */
> -  ipa_size_summary (const ipa_size_summary &s)
> -  : estimated_self_stack_size (0), self_size (s.self_size), size (s.size)
> -  {
> -  }
>   };
>   
>   /* Function inlining information.  */
> @@ -226,18 +221,20 @@ extern GTY(()) fast_function_summary  *ipa_fn_summaries;
>   
>   class ipa_size_summary_t:
> -  public fast_function_summary 
> +  public fast_function_summary 
>   {
>   public:
> ipa_size_summary_t (symbol_table *symtab):
> -fast_function_summary  (symtab) {}
> +fast_function_summary  (symtab)
> +  {
> +disable_insertion_hook ();
> +  }
>   
> -  static ipa_size_summary_t *create_ggc (symbol_table *symtab)
> +  virtual void duplicate (cgraph_node *, cgraph_node *,
> +   ipa_size_summary *src_data,
> +   ipa_size_summary *dst_data)
> {
> -class ipa_size_summary_t *summary = new (ggc_alloc  ())
> -  ipa_size_summary_t (symtab);
> -summary->disable_insertion_hook ();
> -return summary;
> +*dst_data = *src_data;
> }
>   };
>   extern fast_function_summary 
> Index: ipa-fnsummary.c
> ===
> --- ipa-fnsummary.c   (revision 279523)
> +++ ipa-fnsummary.c   (working copy)
> @@ -672,8 +672,7 @@ static void
>   ipa_fn_summary_alloc (void)
>   {
> gcc_checking_assert (!ipa_fn_summaries);
> -  ipa_size_summaries = new fast_function_summary  va_heap>
> -  (symtab);
> +  ipa_size_summaries = new ipa_size_summary_t (symtab);
> ipa_fn_summaries = ipa_fn_summary_t::create_ggc (symtab);
> ipa_call_summaries = new ipa_call_summary_t (symtab);
>   }
> 
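
As a rough standalone illustration of the behaviour discussed in this thread (not the GCC implementation): a per-node summary table whose default duplication hook does nothing leaves a clone with zeroed fields, while an override that assigns the whole structure carries them over. `SizeSummary`, `SummaryTable`, `CopyingSummaryTable` and the integer node ids are hypothetical simplifications of ipa_size_summary, the summary template and cgraph_node.

```cpp
#include <cassert>
#include <map>

// Simplified stand-in for ipa_size_summary.
struct SizeSummary {
  int estimated_self_stack_size = 0;
  int self_size = 0;
  int size = 0;
};

// Hypothetical miniature of a per-node summary table with a virtual
// duplication hook; nodes are identified by plain integers here.
class SummaryTable {
public:
  virtual ~SummaryTable() = default;

  // Return the summary for NODE, creating a zeroed one if needed.
  SizeSummary *get_create(int node) { return &m_data[node]; }

  // Called when node SRC is cloned into node DST.
  void on_clone(int src, int dst) {
    duplicate(src, dst, get_create(src), get_create(dst));
  }

protected:
  // The default hook does nothing, so DST keeps its zeroed fields.
  virtual void duplicate(int, int, SizeSummary *, SizeSummary *) {}

private:
  std::map<int, SizeSummary> m_data;
};

// Table that overrides the hook to copy every field to the clone.
class CopyingSummaryTable : public SummaryTable {
protected:
  void duplicate(int, int, SizeSummary *src, SizeSummary *dst) override {
    *dst = *src;
  }
};
```

Assigning the whole structure avoids listing fields one by one, which is where a hand-written copy constructor can silently reset a member such as estimated_self_stack_size.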



Re: [PATCH 12/49] Add diagnostic paths

2019-12-18 Thread David Malcolm
On Sat, 2019-12-07 at 07:45 -0700, Jeff Law wrote:
> On Fri, 2019-11-15 at 20:22 -0500, David Malcolm wrote:
> > This patch adds support for associating a "diagnostic_path" with a
> > diagnostic: a sequence of events predicted by the compiler that leads
> > to the problem occurring, with their locations in the user's source,
> > text descriptions, and stack information (for handling
> > interprocedural paths).
> > 
> > For example, the following (hypothetical) error has a 3-event
> > intraprocedural path:
> > 
> > test.c: In function 'demo':
> > test.c:29:5: error: passing NULL as argument 1 to 'PyList_Append' which
> >   requires a non-NULL parameter
> >29 | PyList_Append(list, item);
> >   | ^
> >   'demo': events 1-3
> >  |
> >  |   25 |   list = PyList_New(0);
> >  |  |  ^
> >  |  |  |
> >  |  |  (1) when 'PyList_New' fails, returning NULL
> >  |   26 |
> >  |   27 |   for (i = 0; i < count; i++) {
> >  |  |   ~~~
> >  |  |   |
> >  |  |   (2) when 'i < count'
> >  |   28 | item = PyLong_FromLong(random());
> >  |   29 | PyList_Append(list, item);
> >  |  | ~
> >  |  | |
> >  |  | (3) when calling 'PyList_Append', passing NULL from (1) as argument 1
> >  |
> > 
> > The patch adds a new "%@" format code for printing event IDs, so that
> > in the above, the description of event (3) mentions event (1), showing
> > the user where the bogus NULL value comes from (the event IDs are
> > colorized to draw the user's attention to them).
> > 
> > There is a separation between data vs presentation: the above shows
> > how the diagnostic-printing code has consolidated the path into a
> > single run of events, since all the events are near each other and
> > within the same function; more complicated examples (such as
> > interprocedural paths) might be printed as multiple runs of events.
> > 
> > Examples of how interprocedural paths are printed can be seen in the
> > test suite (which uses a plugin to exercise the code without relying
> > on specific warnings using this functionality).
> > 
> > Other output formats include
> > - JSON,
> > - printing each event as a separate "note", and
> > - to not emit paths.
> > 
> > (I have a separate script that can generate HTML from the JSON, but
> > HTML is not my speciality; help from a web front-end expert to make it
> > look good would be appreciated).
> > 
> > gcc/ChangeLog:
> > * Makefile.in (OBJS): Add tree-diagnostic-path.o.
> > * common.opt (fdiagnostics-path-format=): New option.
> > (diagnostic_path_format): New enum.
> > (fdiagnostics-show-path-depths): New option.
> > * coretypes.h (diagnostic_event_id_t): New forward decl.
> > * diagnostic-color.c (color_dict): Add "path".
> > * diagnostic-event-id.h: New file.
> > * diagnostic-format-json.cc (json_from_expanded_location): Make
> > non-static.
> > (json_end_diagnostic): Call context->make_json_for_path if it
> > exists and the diagnostic has a path.
> > (diagnostic_output_format_init): Clear context->print_path.
> > * diagnostic-path.h: New file.
> > * diagnostic-show-locus.c (colorizer::set_range): Special-case
> > when printing a run of events in a diagnostic_path so that they
> > all get the same color.
> > (layout::m_diagnostic_path_p): New field.
> > (layout::layout): Initialize it.
> > (layout::print_any_labels): Don't colorize the label text for an
> > event in a diagnostic_path.
> > (gcc_rich_location::add_location_if_nearby): Add
> > "restrict_to_current_line_spans" and "label" params.  Pass the
> > former to layout.maybe_add_location_range; pass the latter
> > when calling add_range.
> > * diagnostic.c: Include "diagnostic-path.h".
> > (diagnostic_initialize): Initialize context->path_format and
> > context->show_path_depths.
> > (diagnostic_show_any_path): New function.
> > (diagnostic_path::interprocedural_p): New function.
> > (diagnostic_report_diagnostic): Call diagnostic_show_any_path.
> > (simple_diagnostic_path::num_events): New function.
> > (simple_diagnostic_path::get_event): New function.
> > (simple_diagnostic_path::add_event): New function.
> > (simple_diagnostic_event::simple_diagnostic_event): New ctor.
> > (simple_diagnostic_event::~simple_diagnostic_event): New dtor.
> > (debug): New overload taking a diagnostic_path *.
> > * diagnostic.def (DK_DIAGNOSTIC_PATH): New.
> > * diagnostic.h (enum diagnostic_path_format): New enum.
> > (json::value): New forward decl.
> > (diagnostic_context::path_format): New field.
> > (diagnostic_context::show_path_depths): New field.
> > (diagnostic_context::print_path): New callback field.
> > (diag

Re: [PATCH] Handle aggregate pass-through for self-recursive call (PR ipa/92794)

2019-12-18 Thread Feng Xue OS


>> +static bool
>> +self_recursive_agg_pass_through_p (cgraph_edge *cs, ipa_agg_jf_item *jfunc,
>> +int i)
>> +{
>> +  if (cs->caller == cs->callee->function_symbol ()

> I don't know if self-recursive calls can be interposed at all, if yes
> you need to add the availability check like we have in
> self_recursive_pass_through_p (if not, we should probably remove it
> there).

Added.  

Feng

Re: gccgo branch updated

2019-12-18 Thread Ian Lance Taylor
I've updated the gccgo branch to revision 279561 of trunk.

Ian


[patch, fortran] Fix PR 91541, ICE on valid for INDEX

2019-12-18 Thread Thomas Koenig

Hello world,

the attached patch fixes an ICE on valid code for INDEX (see test case).
The problem was that the KIND argument was still present during
scalarization, which caused the ICE.

The fix is to remove the KIND argument, and the best place
to do this is in resolution.  I did try to do this in
gfc_conv_intrinsic_index_scan_verify, but it is too late by then.

Removing the KIND argument required changing the call signature
of gfc_resolve_index_func, which in turn required the rest of
the changes (including the one in trans-decl.c - I am not convinced
that what we are doing there is right, but for this bug fix, I
left the functionality as is).
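
To make the "remove the KIND argument, and only once" idea concrete, here is a minimal standalone sketch. It is not the gfortran implementation and may not match how the patch guards against repetition: `Arg`, `make_list`, `length` and `drop_fourth_arg` are hypothetical names, with `Arg` standing in for an actual-argument list node. Removal is keyed on the list still having a fourth node, so a second call is a no-op.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for one node of an actual-argument list.
struct Arg {
  std::string name;
  std::unique_ptr<Arg> next;
};

// Build a singly linked argument list from keyword names.
std::unique_ptr<Arg> make_list(const std::vector<std::string> &names) {
  std::unique_ptr<Arg> head;
  for (auto it = names.rbegin(); it != names.rend(); ++it) {
    auto node = std::make_unique<Arg>();
    node->name = *it;
    node->next = std::move(head);
    head = std::move(node);
  }
  return head;
}

// Count the nodes in the list.
std::size_t length(const Arg *head) {
  std::size_t n = 0;
  for (const Arg *a = head; a; a = a->next.get())
    ++n;
  return n;
}

// Drop the fourth node (the KIND slot in this model) if it exists.
// Returns true if a node was removed; once only three nodes remain,
// further calls leave the list unchanged.
bool drop_fourth_arg(Arg *head) {
  Arg *third = head;
  for (int i = 0; i < 2 && third; ++i)
    third = third->next.get();
  if (!third || !third->next)
    return false;
  third->next.reset();
  return true;
}
```

Making the removal idempotent matters here because, in this model, nothing prevents the resolution step from being reached more than once for the same expression.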

Regression-tested. OK for trunk?

Regards

Thomas

2019-12-19  Thomas Koenig  

PR fortran/91541
* intrinsic.c (add_sym_4ind): New function.
(add_functions): Use it for INDEX.
(resolve_intrinsic): Also call f1m for INDEX.
* intrinsic.h (gfc_resolve_index_func): Adjust prototype to
take a gfc_arglist instead of individual arguments.
* iresolve.c (gfc_resolve_index_func): Adjust arguments.
Remove KIND argument if present, and make sure this is
not done twice.
* trans-decl.c: Include "intrinsic.h".
(gfc_get_extern_function_decl): Special case for resolving INDEX.

2019-12-19  Thomas Koenig  

PR fortran/91541
* gfortran.dg/index_3.f90: New test.
Index: intrinsic.c
===
--- intrinsic.c	(revision 279405)
+++ intrinsic.c	(working copy)
@@ -851,7 +851,40 @@ add_sym_4 (const char *name, gfc_isym_id id, enum
 	   (void *) 0);
 }
 
+/* Add a symbol to the function list where the function takes 4
+   arguments and resolution may need to change the number or
+   arrangement of arguments. This is the case for INDEX, which needs
+   its KIND argument removed.  */
 
+static void
+add_sym_4ind (const char *name, gfc_isym_id id, enum klass cl, int actual_ok,
+	  bt type, int kind, int standard,
+	  bool (*check) (gfc_expr *, gfc_expr *, gfc_expr *, gfc_expr *),
+	  gfc_expr *(*simplify) (gfc_expr *, gfc_expr *, gfc_expr *,
+ gfc_expr *),
+	  void (*resolve) (gfc_expr *, gfc_actual_arglist *),
+	  const char *a1, bt type1, int kind1, int optional1,
+	  const char *a2, bt type2, int kind2, int optional2,
+	  const char *a3, bt type3, int kind3, int optional3,
+	  const char *a4, bt type4, int kind4, int optional4 )
+{
+  gfc_check_f cf;
+  gfc_simplify_f sf;
+  gfc_resolve_f rf;
+
+  cf.f4 = check;
+  sf.f4 = simplify;
+  rf.f1m = resolve;
+
+  add_sym (name, id, cl, actual_ok, type, kind, standard, cf, sf, rf,
+	   a1, type1, kind1, optional1, INTENT_IN,
+	   a2, type2, kind2, optional2, INTENT_IN,
+	   a3, type3, kind3, optional3, INTENT_IN,
+	   a4, type4, kind4, optional4, INTENT_IN,
+	   (void *) 0);
+}
+
+
 /* Add a symbol to the subroutine list where the subroutine takes
4 arguments.  */
 
@@ -2153,11 +2186,11 @@ add_functions (void)
 
   /* The resolution function for INDEX is called gfc_resolve_index_func
  because the name gfc_resolve_index is already used in resolve.c.  */
-  add_sym_4 ("index", GFC_ISYM_INDEX, CLASS_ELEMENTAL, ACTUAL_YES,
-	 BT_INTEGER, di, GFC_STD_F77,
-	 gfc_check_index, gfc_simplify_index, gfc_resolve_index_func,
-	 stg, BT_CHARACTER, dc, REQUIRED, ssg, BT_CHARACTER, dc, REQUIRED,
-	 bck, BT_LOGICAL, dl, OPTIONAL, kind, BT_INTEGER, di, OPTIONAL);
+  add_sym_4ind ("index", GFC_ISYM_INDEX, CLASS_ELEMENTAL, ACTUAL_YES,
+		BT_INTEGER, di, GFC_STD_F77,
+		gfc_check_index, gfc_simplify_index, gfc_resolve_index_func,
+		stg, BT_CHARACTER, dc, REQUIRED, ssg, BT_CHARACTER, dc, REQUIRED,
+		bck, BT_LOGICAL, dl, OPTIONAL, kind, BT_INTEGER, di, OPTIONAL);
 
   make_generic ("index", GFC_ISYM_INDEX, GFC_STD_F77);
 
@@ -4434,9 +4467,10 @@ resolve_intrinsic (gfc_intrinsic_sym *specific, gf
 
   arg = e->value.function.actual;
 
-  /* Special case hacks for MIN and MAX.  */
+  /* Special case hacks for MIN, MAX and INDEX.  */
   if (specific->resolve.f1m == gfc_resolve_max
-  || specific->resolve.f1m == gfc_resolve_min)
+  || specific->resolve.f1m == gfc_resolve_min
+  || specific->resolve.f1m == gfc_resolve_index_func)
 {
   (*specific->resolve.f1m) (e, arg);
   return;
Index: intrinsic.h
===
--- intrinsic.h	(revision 279405)
+++ intrinsic.h	(working copy)
@@ -517,8 +517,7 @@ void gfc_resolve_ibits (gfc_expr *, gfc_expr *, gf
 void gfc_resolve_ibset (gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_image_index (gfc_expr *, gfc_expr *, gfc_expr *);
 void gfc_resolve_image_status (gfc_expr *, gfc_expr *, gfc_expr *);
-void gfc_resolve_index_func (gfc_expr *, gfc_expr *, gfc_expr *, gfc_expr *,
-			 gfc_expr *);
+void gfc_resolve_index_func (gfc_expr *, gfc_actual_arglist *);
 void gfc_resolve_ierrno (gfc_expr *);
 void gfc_resolve_ieor (g

Re: [patch] Guard aarch64/aapcs64 tests using abitest.S by check_weak_available

2019-12-18 Thread Olivier Hainque


> On 16 Dec 2019, at 14:54, Richard Sandiford  wrote:

>> We have local patches adding
>> 
>>  dg-require-effective-target fpic
>> 
>> directives to these.
>> 
>> Is that the correct thing to do ?
> 
> Yeah.  Adding that to tests that use -fpic or -fPIC is OK/preapproved.
> 
> Personally, I don't think people can be expected to remember to use
> this whenever they add a new -fpic or -fPIC test, so it's probably
> going to be a constant fight to get clean results without PIC support.
> 
> Maybe we should have a programmatic fix.  E.g. we could override
> dg-options in gcc-dg.exp and make it do the equivalent of:
> 
>  { dg-require-effective-target fpic }
> 
> whenever -fpic or -fPIC is used.  We override it in a few test harnesses
> already (e.g. mips.exp, which does something more complicated) so it
> wouldn't be entirely new ground.

I see. I'll probably go for the simple approach first, which indeed seems
to be commonly used.

Thanks for your feedback on this, Richard.

Best Regards,

Olivier