Re: [wwwdocs] Revamp formatting markup of bugs/management.html

2018-09-10 Thread Gerald Pfeifer
On Sun, 2 Sep 2018, Gerald Pfeifer wrote:
> ...making this page HTML 5 compliant.

And with this little follow-up this page uses CSS only, removing
the sole warning the validator still showed after my change last week.

Applied.

Gerald

Index: bugs/management.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/bugs/management.html,v
retrieving revision 1.39
diff -u -r1.39 management.html
--- bugs/management.html2 Sep 2018 17:30:48 -   1.39
+++ bugs/management.html9 Sep 2018 22:00:53 -
@@ -86,7 +86,7 @@
 The following two fields describe how serious a bug is from a user's
 perspective (Severity) and what Priority we assign to it in fixing it:
 
-
+
 
 
 Severity


[wwwdocs] projects/h8300-abi.html formatting revamp

2018-09-10 Thread Gerald Pfeifer
Use CSS instead of presentational markup for tables, and properly
mark up table headings instead of treating those like regular rows.

Committed.

Gerald

Index: projects/h8300-abi.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/h8300-abi.html,v
retrieving revision 1.10
diff -u -r1.10 h8300-abi.html
--- projects/h8300-abi.html 2 Sep 2018 16:59:49 -   1.10
+++ projects/h8300-abi.html 10 Sep 2018 08:26:15 -
@@ -66,7 +66,7 @@
 downward.  For example, if 2 bytes of data, 0x1234, are to be pushed
 onto the stack on H8/300H, 4 bytes are pushed like so:
 
-
+
 sp + 30x34
 sp + 20x12
 sp + 1padding (unknown value)
@@ -156,14 +156,14 @@
 Immediately after the prologue is setup, the stack frame layout is
 as follows:
 
-
+
 
-  Address
-  Description
-  Size on H8/300
-  Size on H8/300H and H8S(Normal Mode)
-  Size on H8/300H and H8S(Advanced Mode)
-  Pointed to by
+  Address
+  Description
+  Size on H8/300
+  Size on H8/300H and H8S(Normal Mode)
+  Size on H8/300H and H8S(Advanced Mode)
+  Pointed to by
 
 
   High
@@ -225,14 +225,14 @@
 Currently, the stack frame layout, subject to change, is as
 follows:
 
-
+
 
-  Address
-  Description
-  Size on H8/300
-  Size on H8/300H and H8S(Normal Mode)
-  Size on H8/300H and H8S(Advanced Mode)
-  Pointed to by
+  Address
+  Description
+  Size on H8/300
+  Size on H8/300H and H8S(Normal Mode)
+  Size on H8/300H and H8S(Advanced Mode)
+  Pointed to by
 
 
   High


[wwwdocs] Add some missing table cells to projects/cxx-status.html

2018-09-10 Thread Gerald Pfeifer
Committed.

Gerald

Index: projects/cxx-status.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx-status.html,v
retrieving revision 1.58
diff -u -r1.58 cxx-status.html
--- projects/cxx-status.html1 Sep 2018 23:42:09 -   1.58
+++ projects/cxx-status.html10 Sep 2018 08:35:20 -
@@ -577,13 +577,15 @@
Coroutines 
   http://wg21.link/n4649";>N4649
No 
-   
+  
+  
 
 
Modules 
   http://wg21.link/n4720";>N4720
https://gcc.gnu.org/wiki/cxx-modules";>In progress 
-   
+  
+  
 
   
 


Re: C++ PATCH to tidy up build_vtbl_ref

2018-09-10 Thread Jason Merrill
OK.

On Mon, Sep 10, 2018 at 12:32 AM, Marek Polacek  wrote:
> The wrapper for build_vtbl_ref_1 doesn't seem to do anything useful.
>
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
>
> 2018-09-09  Marek Polacek  
>
> * class.c (build_vtbl_ref): Remove.
> (build_vtbl_ref_1): Rename to build_vtbl_ref.
> (build_vfn_ref): Call build_vtbl_ref instead of build_vtbl_ref_1.
>
> diff --git gcc/cp/class.c gcc/cp/class.c
> index e11173d2e59..e950a7423f7 100644
> --- gcc/cp/class.c
> +++ gcc/cp/class.c
> @@ -133,7 +133,6 @@ static void maybe_warn_about_overly_private_class (tree);
>  static void add_implicitly_declared_members (tree, tree*, int, int);
>  static tree fixed_type_or_null (tree, int *, int *);
>  static tree build_simple_base_path (tree expr, tree binfo);
> -static tree build_vtbl_ref_1 (tree, tree);
>  static void build_vtbl_initializer (tree, tree, tree, tree, int *,
> vec **);
>  static bool check_bitfield_decl (tree);
> @@ -699,8 +698,8 @@ build_vfield_ref (tree datum, tree type)
> cases for INSTANCE which we take care of here, mainly to avoid
> creating extra tree nodes when we don't have to.  */
>
> -static tree
> -build_vtbl_ref_1 (tree instance, tree idx)
> +tree
> +build_vtbl_ref (tree instance, tree idx)
>  {
>tree aref;
>tree vtbl = NULL_TREE;
> @@ -730,14 +729,6 @@ build_vtbl_ref_1 (tree instance, tree idx)
>return aref;
>  }
>
> -tree
> -build_vtbl_ref (tree instance, tree idx)
> -{
> -  tree aref = build_vtbl_ref_1 (instance, idx);
> -
> -  return aref;
> -}
> -
>  /* Given a stable object pointer INSTANCE_PTR, return an expression which
> yields a function pointer corresponding to vtable element INDEX.  */
>
> @@ -746,8 +737,7 @@ build_vfn_ref (tree instance_ptr, tree idx)
>  {
>tree aref;
>
> -  aref = build_vtbl_ref_1 (cp_build_fold_indirect_ref (instance_ptr),
> -   idx);
> +  aref = build_vtbl_ref (cp_build_fold_indirect_ref (instance_ptr), idx);
>
>/* When using function descriptors, the address of the
>   vtable entry is treated as a function pointer.  */


Re: [PATCH v2] combine: perform jump threading at the end

2018-09-10 Thread Ilya Leoshkevich



> On 06.09.2018 at 20:11, Jeff Law wrote:
> 
> On 09/05/2018 06:11 AM, Richard Biener wrote:
>> On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich  wrote:
>>> 
>>> gcc/ChangeLog:
>>> 
>>> 2018-09-05  Ilya Leoshkevich  
>>> 
>>>PR target/80080
>>>* combine.c (rest_of_handle_combine): Perform jump threading.
>>> 
>>> gcc/testsuite/ChangeLog:
>>> 
>>> 2018-09-05  Ilya Leoshkevich  
>>> 
>>>PR target/80080
>>>* gcc.target/s390/pr80080-4.c: New test.
>>> ---
>>> gcc/combine.c | 10 --
>>> gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 
>>> 2 files changed, 24 insertions(+), 2 deletions(-)
>>> create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c
>>> 
>>> diff --git a/gcc/combine.c b/gcc/combine.c
>>> index a2649b6d5a1..818b4c5b77d 100644
>>> --- a/gcc/combine.c
>>> +++ b/gcc/combine.c
>>> @@ -14960,10 +14960,16 @@ rest_of_handle_combine (void)
>>>free_dominance_info (CDI_DOMINATORS);
>>>   timevar_push (TV_JUMP);
>>>   rebuild_jump_labels (get_insns ());
>>> -  cleanup_cfg (0);
>>> -  timevar_pop (TV_JUMP);
>>> }
>>> 
>>> +  /* Combining insns can change basic blocks in a way that they end up
>>> + containing a single jump_insn. This creates an opportunity to improve 
>>> code
>>> + with jump threading.  */
>>> +  cleanup_cfg (CLEANUP_THREADING);
>>> +
>>> +  if (rebuild_jump_labels_after_combine)
>>> +timevar_pop (TV_JUMP);
>> 
>> cleanup_cfg pushes its own timevar so it doesn't make sense to try covering 
>> it
>> with TV_JUMP.  And rebuild_jump_labels immediately pushes TV_REBUILD_JUMP.
>> 
>> So I suggest to remove the timevar_push/pop of TV_JUMP here.
>> 
>> No comment in general about the change, maybe we can detect transforms that
>> make jump-threading viable and conditionalize that properly?  Note the only
>> setter of CLEANUP_THREADING guards it with flag_thread_jumps so maybe better
>> do it above as well (avoids cost at -O0 for example).
> The sad thing is I thought we'd killed the RTL jump threading code eons ago.
Do you mean RTL jump threading is deprecated and/or we should rather rely
on something else to achieve the same results?
> 
> The RTL jump threading code tries to prove that the target block has no
> side effects and that we can statically determine the true/false
> condition for the conditional branch at the end of the block.
> 
> This is (of course) much easier to do when the target block has no insns
> other than the conditional branch.  So perhaps only do this when the
> target block has just the conditional?
Sounds reasonable, I implemented this in the new patch.  The only thing
is that combine leaves (note NOTE_INSN_DELETED) around, so I needed to
also account for those.  I used side_effects_p for that.
> 
> Hard to know if that'd work here since RTL wasn't posted.
I will post the RTL with the updated patch shortly.
> 
> Jeff



[PATCH v3] combine: perform jump threading at the end

2018-09-10 Thread Ilya Leoshkevich
Consider the following RTL:

(code_label 11 10 26 4 2 (nil) [1 uses])
(note 26 11 12 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 12 26 15 4 (set (reg:SI 65)
(if_then_else:SI (eq (reg:CCZ 33 %cc)
(const_int 0 [0]))
(const_int 1 [0x1])
(const_int 0 [0]))) "pr80080-4.c":9 1674 {*movsicc})
(insn 15 12 16 4 (parallel [
(set (reg:CCZ 33 %cc)
(compare:CCZ (reg:SI 65)
(const_int 0 [0])))
(clobber (scratch:SI))
]) "pr80080-4.c":9 1216 {*tstsi_cconly_extimm})
(jump_insn 16 15 17 4 (set (pc)
(if_then_else (ne (reg:CCZ 33 %cc)
(const_int 0 [0]))
(label_ref:DI 23)
(pc))) "pr80080-4.c":9 1897 {*cjump_64})

Combine simplifies this into:

(code_label 11 10 26 4 2 (nil) [1 uses])
(note 26 11 12 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(note 12 26 15 4 NOTE_INSN_DELETED)
(note 15 12 16 4 NOTE_INSN_DELETED)
(jump_insn 16 15 17 4 (set (pc)
(if_then_else (eq (reg:CCZ 33 %cc)
(const_int 0 [0]))
(label_ref:DI 23)
(pc))) "pr80080-4.c":9 1897 {*cjump_64})

opening up the possibility to perform jump threading.  Since this
happens infrequently, perform jump threading only when there is a
changed basic block whose sole side effect is a trailing jump.

Also remove redundant usage of TV_JUMP, because rebuild_jump_labels ()
and cleanup_cfg () already have their own timevars.
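
The "sole side effect is a trailing jump" condition can be illustrated
outside of GCC with a toy model.  This is only a sketch, not the combine.c
code: toy_insn, toy_kind and toy_is_single_jump_bb are hypothetical names,
and the has_side_effects flag merely stands in for whatever side_effects_p
would report on a real insn pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for insn kinds; these are not GCC types.  */
enum toy_kind { TOY_NOTE, TOY_INSN, TOY_JUMP };

struct toy_insn
{
  enum toy_kind kind;
  bool has_side_effects;   /* assumption: stand-in for side_effects_p */
};

/* Return true iff the last element of BB is a jump and no earlier
   non-note element has side effects.  Notes (such as the leftovers of
   deleted insns) are skipped.  */
bool
toy_is_single_jump_bb (const struct toy_insn *bb, size_t n)
{
  assert (bb != NULL || n == 0);

  if (n == 0 || bb[n - 1].kind != TOY_JUMP)
    return false;

  for (size_t i = 0; i + 1 < n; i++)
    if (bb[i].kind != TOY_NOTE && bb[i].has_side_effects)
      return false;

  return true;
}
```

A block modelled as two notes followed by a jump, like the post-combine RTL
above, passes this toy check; a block that still has a side-effecting insn
before the jump does not.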

gcc/ChangeLog:

2018-09-05  Ilya Leoshkevich  

PR target/80080
* combine.c (is_single_jump_bb): New function.
(struct combine_summary): New struct.
(combine_instructions): Instead of returning
new_direct_jump_p, fill struct combine_summary. In addition
to the existing new_direct_jump_p, it contains a new
new_single_jump_p field, which controls whether or not
jump threading should be performed after combine.
(rest_of_handle_combine): Perform jump threading if there is
a possibility that it would be profitable.  Remove redundant
usage of TV_JUMP.

gcc/testsuite/ChangeLog:

2018-09-05  Ilya Leoshkevich  

PR target/80080
* gcc.target/s390/pr80080-4.c: New test.
---
 gcc/combine.c | 89 +++
 gcc/testsuite/gcc.target/s390/pr80080-4.c | 16 
 2 files changed, 89 insertions(+), 16 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/s390/pr80080-4.c

diff --git a/gcc/combine.c b/gcc/combine.c
index a2649b6d5a1..65f5d7d092b 100644
--- a/gcc/combine.c
+++ b/gcc/combine.c
@@ -1139,13 +1139,42 @@ insn_a_feeds_b (rtx_insn *a, rtx_insn *b)
   return false;
 }
 
+/* Return true iff the only side effect of BB is its trailing jump_insn.  */
+
+static bool
+is_single_jump_bb (basic_block bb)
+{
+  rtx_insn *end = BB_END (bb);
+  rtx_insn *insn;
+
+  if (!JUMP_P (end))
+return false;
+
+  for (insn = BB_HEAD (bb); insn != end; insn = NEXT_INSN (insn))
+if (INSN_P (insn) && side_effects_p (PATTERN (insn)))
+  return false;
+  return true;
+}
+
+/* Summary of changes performed by the combiner.  */
+struct combine_summary {
+  /* True if the combiner has turned an indirect jump instruction into a direct
+ jump.  */
+  bool new_direct_jump_p;
+
+  /* True if the combiner changed at least one basic block in a way that it
+ ended up containing a single jump_insn.  */
+  bool new_single_jump_p;
+};
+
 /* Main entry point for combiner.  F is the first insn of the function.
NREGS is the first unused pseudo-reg number.
 
-   Return nonzero if the combiner has turned an indirect jump
-   instruction into a direct jump.  */
-static int
-combine_instructions (rtx_insn *f, unsigned int nregs)
+   If performed changes satisfy certain criteria, set the corresponding fields
+   of SUMMARY to true.  */
+static void
+combine_instructions (rtx_insn *f, unsigned int nregs,
+ struct combine_summary *summary)
 {
   rtx_insn *insn, *next;
   rtx_insn *prev;
@@ -1158,7 +1187,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
   for (first = f; first && !NONDEBUG_INSN_P (first); )
 first = NEXT_INSN (first);
   if (!first)
-return 0;
+return;
 
   combine_attempts = 0;
   combine_merges = 0;
@@ -1251,6 +1280,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
   FOR_EACH_BB_FN (this_basic_block, cfun)
 {
   rtx_insn *last_combined_insn = NULL;
+  bool bb_changed = false;
 
   /* Ignore instruction combination in basic blocks that are going to
 be removed as unreachable anyway.  See PR82386.  */
@@ -1302,6 +1332,7 @@ combine_instructions (rtx_insn *f, unsigned int nregs)
 last_combined_insn)) != 0)
  {
statistics_counter_event (cfun, "two-insn combine", 1);
+   bb_changed = true;
goto retry;
  }
 
@@ -1323,6 +1354,7 @@ combine_instructions (rtx_insn *f, unsign

Re: [PATCH][4/4][v2] RPO-style value-numbering for FRE/PRE

2018-09-10 Thread Martin Liška
On 09/05/2018 09:48 AM, Richard Biener wrote:
> On Wed, 5 Sep 2018, Gerald Pfeifer wrote:
> 
>> On Tue, 4 Sep 2018, Jeff Law wrote:
>>>> On the other hand, this ICE has been consistent across a week of
>>>> daily builds now.
>>> An FYI, My i686 builds have been running OK.  But given what you've
>>> described this could well be an uninitialized read, dangling pointer,
>>> out of bounds write or some similar kind of bug.
>>
>> I did binary search now, and am afraid it's really that patch, Richard:
>>
>> Revision 263874 appears just fine; 263875 breaks as per my original 
>> message.
> 
> Sure - but without a way to reproduce locally investigation is really
> hard...  So I'm concentrating on the bugs the rev caused that I can
> reproduce and thus fix ;)
> 
> Richard.
> 

Hi.

I can reproduce that locally in a KVM machine running FreeBSD test 10.4-RELEASE.
I used gcc version 6.4.0 (FreeBSD Ports Collection) to build the stage1 compiler
and I can see segfaults happening.

The issue is that neither valgrind nor gdb works on the system.
Valgrind has 2 PRs reported:
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224878
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228973

With gdb I can't see any line information:

Starting program: /tmp/gcc2/objdir/gcc/cc1 -quiet -I . -I . -I ../.././gcc -I 
../../../libgcc -I ../../../libgcc/. -I ../../../libgcc/../gcc -I 
../../../libgcc/../include -iprefix 
/tmp/gcc2/objdir/gcc/../lib/gcc/i586-unknown-freebsd10.4/9.0.0/ -isystem 
/tmp/gcc2/objdir/./gcc/include -isystem /tmp/gcc2/objdir/./gcc/include-fixed 
-MD subtf3.d -MF subtf3.dep -MP -MT subtf3.o -D IN_GCC -D IN_LIBGCC2 -D 
HAVE_CC_TLS -D HIDE_EXPORTS -isystem 
/usr/local/i586-unknown-freebsd10.4/include -isystem 
/usr/local/i586-unknown-freebsd10.4/sys-include -isystem ./include 
../../../libgcc/soft-fp/subtf3.c -quiet -dumpbase subtf3.c -mtune=pentium 
-march=pentium -auxbase-strip subtf3.o -g -g -g -O2 -O2 -O2 -Wextra -Wall 
-Wno-narrowing -Wwrite-strings -Wcast-qual -Wstrict-prototypes 
-Wold-style-definition -Wno-missing-prototypes -Wno-type-limits 
-fbuilding-libgcc -fno-stack-protector -fpic -fvisibility=hidden -o 
/tmp//ccJog3Wn.s
(no debugging symbols found)...(no debugging symbols found)...(no debugging 
symbols found)...(no debugging symbols found)...(no debugging symbols 
found)...(no debugging symbols found)...(no debugging symbols found)...
Program received signal SIGSEGV, Segmentation fault.
0x08510556 in ?? ()
(gdb) bt
#0  0x08510556 in ?? ()
#1  0x2ad8aea0 in ?? ()
#2  0x000d in ?? ()
#3  0xbfbfe078 in ?? ()
#4  0x0001 in ?? ()
#5  0xbfbfe3fc in ?? ()
#6  0x in ?? ()
(gdb) c
Continuing.
during GIMPLE pass: pre
../../../libgcc/soft-fp/subtf3.c: In function '__subtf3':
../../../libgcc/soft-fp/subtf3.c:35:1: internal compiler error: Segmentation 
fault
35 | __subtf3 (TFtype a, TFtype b)
   | ^~~~
libbacktrace could not find executable to open
Please submit a full bug report,
with preprocessed source if appropriate.
See  for instructions.

It would be very handy to have at least one of these tools working.
Gerald, can you please help us with that?

Martin


Re: [RFH] split {generic,gimple}-match.c files

2018-09-10 Thread Martin Liška
On 09/04/2018 05:07 PM, Martin Liška wrote:
> - in order to achieve real speed up we need to split also other generated 
> (and also dwarf2out.c, i386.c, ..) files:
> here I'm most concerned about insn-recog.c, which can't be split the same way 
> without ending up with a single huge SCC component.

About the insn-recog.c file: all functions are static, and an SCC-based
split ends up with all functions in one component.  To split the call graph,
some functions need to be promoted to extern; then a split would be possible.
To do that we'll probably need to teach the splitter to partition based on
the minimal number of edges to be removed.

Should I take inspiration from lto_balanced_map, or is there some simple
algorithm I can start with?
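
To show why a single cycle is enough to collapse everything into one
component, here is a minimal sketch of Tarjan's SCC algorithm over a tiny
call graph.  It is a toy under stated assumptions, not code from any GCC
splitter: struct callgraph, count_sccs and the fixed MAX_FUNCS bound are
hypothetical, and the graph is a plain adjacency matrix.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FUNCS 16

/* Hypothetical toy call graph: calls[a][b] is true if function A
   calls function B.  */
struct callgraph
{
  int n;
  bool calls[MAX_FUNCS][MAX_FUNCS];
};

struct scc_state
{
  const struct callgraph *g;
  int index[MAX_FUNCS];      /* discovery order, -1 if unvisited */
  int lowlink[MAX_FUNCS];
  bool on_stack[MAX_FUNCS];
  int stack[MAX_FUNCS];
  int sp;
  int next_index;
  int n_components;
  int *component;            /* output: component id per function */
};

static void
scc_visit (struct scc_state *s, int v)
{
  s->index[v] = s->lowlink[v] = s->next_index++;
  s->stack[s->sp++] = v;
  s->on_stack[v] = true;

  for (int w = 0; w < s->g->n; w++)
    {
      if (!s->g->calls[v][w])
        continue;
      if (s->index[w] < 0)
        {
          scc_visit (s, w);
          if (s->lowlink[w] < s->lowlink[v])
            s->lowlink[v] = s->lowlink[w];
        }
      else if (s->on_stack[w] && s->index[w] < s->lowlink[v])
        s->lowlink[v] = s->index[w];
    }

  /* V is the root of a component: pop the component off the stack.  */
  if (s->lowlink[v] == s->index[v])
    {
      int w;
      do
        {
          w = s->stack[--s->sp];
          s->on_stack[w] = false;
          s->component[w] = s->n_components;
        }
      while (w != v);
      s->n_components++;
    }
}

/* Fill COMPONENT with a component id per function and return the
   number of strongly connected components.  */
int
count_sccs (const struct callgraph *g, int component[])
{
  struct scc_state s = { .g = g, .component = component };

  assert (g->n >= 0 && g->n <= MAX_FUNCS);
  for (int v = 0; v < g->n; v++)
    s.index[v] = -1;
  for (int v = 0; v < g->n; v++)
    if (s.index[v] < 0)
      scc_visit (&s, v);
  return s.n_components;
}
```

With calls 0 -> 1 -> 2 -> 0 and 2 -> 3, functions 0, 1 and 2 end up in one
component and 3 in another.  Dropping the single edge 2 -> 0 yields four
components, which is the kind of effect an edge-removal based partitioning
would be looking for.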

Martin


Re: [PATCH][4/4][v2] RPO-style value-numbering for FRE/PRE

2018-09-10 Thread Gerald Pfeifer
On Mon, 10 Sep 2018, Martin Liška wrote:
> I can reproduce that locally in a KVM machine running FreeBSD test 
> 10.4-RELEASE. I used gcc version 6.4.0 (FreeBSD Ports Collection) to 
> build stage1 compiler and I can see Segfaults happening.

Great, thanks for helping look into this, Martin!

> Issue is that neither valgrind nor gdb work on the system.
> Valgrind has 2 PRs reported:
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224878
> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228973
> 
> For the gdb I can't see any lines associated:

Which gdb are you using?  (`which gdb` and/or `gdb --version`?)

If, as I assume, it is the old one in the system (GDB 6.1.1) it
may be worth a try installing the one from the ports/packages.
Something like `pkg add gdb`, which should bring in GDB 8.1.1.

> Would be very handy to have at least one of these tools working.
> Gerald can you please help us with that?

I believe the GDB route is the more promising one (also having looked
at the two PRs you found re valgrind on FreeBSD/i386).  Question is,
is it more of a GDB issue, or perhaps debugging information missing?

Let me know what you find, I may be able to ask others, closer to the
FreeBSD toolchain for help.

Gerald

Re: [PATCH v2] combine: perform jump threading at the end

2018-09-10 Thread Ilya Leoshkevich



> On 07.09.2018 at 00:39, Segher Boessenkool wrote:
> 
> On Thu, Sep 06, 2018 at 12:11:09PM -0600, Jeff Law wrote:
>> On 09/05/2018 06:11 AM, Richard Biener wrote:
>>> On Wed, Sep 5, 2018 at 2:01 PM Ilya Leoshkevich  wrote:
>>>> +  /* Combining insns can change basic blocks in a way that they end up
>>>> + containing a single jump_insn. This creates an opportunity to improve code
>>>> + with jump threading.  */
>>>> +  cleanup_cfg (CLEANUP_THREADING);
> 
> Please show an example of when this happens.  For almost all code it does
> not happen, so please don't do it always.
This improves the code for the following example from
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80080:

extern void bar(int *mem);

void foo5(int *mem)
{
  int oldval = 0;
  __atomic_compare_exchange_n (mem, (void *) &oldval, 1,
   1, __ATOMIC_ACQUIRE, __ATOMIC_RELAXED);
  if (oldval != 0)
bar (mem);
}

I posted the corresponding RTL here:
https://gcc.gnu.org/ml/gcc-patches/2018-09/msg00495.html
> 
> Does it improve code at all?  There is a reason we do not run the expensive
> cfg cleanups after every pass: they are expensive.  They are only done in
> some strategically chosen places.
> 
Performance-wise, the net win is insignificant (checked with SPEC
CPU2006), but nevertheless the generated code contains fewer redundant
jumps.  I intended this to be a small unintrusive improvement, so it
would definitely not be good if it increased compile times.  I will
check how this affects build times of gcc master and SPEC CPU2006
benchmarks.
> 
> Segher
> 



Re: [PATCH]: Allow TARGET_SCHED_ADJUST_PRIORITY hook to reduce priority

2018-09-10 Thread Andreas Schwab
On Sep 06 2018, Jeff Law  wrote:

> On 09/03/2018 08:32 AM, John David Anglin wrote:
>> The documentation for TARGET_SCHED_ADJUST_PRIORITY indicates that the
>> hook can
>> reduce the priority of INSN to execute it later.  The hppa hook only
>> reduces the priority
>> and it has been this way for years.  However, the assert in
>> sel_target_adjust_priority()
>> prevents reduction of the priority.
>> 
>> The attached change revises the assert to allow the priority to be
>> reduced to zero.
>> 
>> This fixes PR rtl-optimization/85458.
>> 
>> Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and
>> hppa64-hp-hpux11.11.
>> 
>> I must admit that this happens so infrequently that I have to wonder if
>> the hook provides
>> any benefit on hppa.  It was supposed to keep addil instructions close
>> to the following instruction
>> to reduce pressure on register %r1.
>> 
>> Okay?
>> 
>> Dave
>> 
>> -- 
>> John David Anglin  dave.ang...@bell.net
>> 
>> 
>> sel-sched.c.d
>> 
>> 
>> 2018-09-03  John David Anglin  
>> 
>>  PR rtl-optimization/85458
>>  * sel-sched.c (sel_target_adjust_priority): Allow backend adjust
>>  priority hook to reduce the priority of EXPR.
> OK.

That breaks ia64.

during RTL pass: mach
/usr/local/gcc/test/gcc/testsuite/gcc.c-torture/compile/20010102-1.c: In 
function '_obstack_newchunk':
/usr/local/gcc/test/gcc/testsuite/gcc.c-torture/compile/20010102-1.c:101:1: 
internal compiler error: in sel_target_adjust_priority, at sel-sched.c:
0x410bb68f sel_target_adjust_priority
../../gcc/sel-sched.c:
0x410bb68f fill_vec_av_set
../../gcc/sel-sched.c:3727
0x410bd45f fill_ready_list
../../gcc/sel-sched.c:4028
0x410bd45f find_best_expr
../../gcc/sel-sched.c:4388
0x410bd45f fill_insns
../../gcc/sel-sched.c:5549
0x410c29cf schedule_on_fences
../../gcc/sel-sched.c:7366
0x410c29cf sel_sched_region_2
../../gcc/sel-sched.c:7504
0x410c510f sel_sched_region_1
../../gcc/sel-sched.c:7546
0x410c700f sel_sched_region(int)
../../gcc/sel-sched.c:7647
0x410c9def run_selective_scheduling()
../../gcc/sel-sched.c:7733
0x419e473f ia64_reorg
../../gcc/config/ia64/ia64.c:9857
0x410314cf execute
../../gcc/reorg.c:3984

Andreas.

-- 
Andreas Schwab, SUSE Labs, sch...@suse.de
GPG Key fingerprint = 0196 BAD8 1CE9 1970 F4BE  1748 E4D4 88E3 0EEA B9D7
"And now for something completely different."


Re: [PATCH][4/4][v2] RPO-style value-numbering for FRE/PRE

2018-09-10 Thread Martin Liška
On 09/10/2018 02:19 PM, Gerald Pfeifer wrote:
> On Mon, 10 Sep 2018, Martin Liška wrote:
>> I can reproduce that locally in a KVM machine running FreeBSD test 
>> 10.4-RELEASE. I used gcc version 6.4.0 (FreeBSD Ports Collection) to 
>> build stage1 compiler and I can see Segfaults happening.
> 
> Great, thanks for helping look into this, Martin!
> 
>> Issue is that neither valgrind nor gdb work on the system.
>> Valgrind has 2 PRs reported:
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=224878
>> https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=228973
>>
>> For the gdb I can't see any lines associated:
> 
> Which gdb are you using?  (`which gdb` and/or `gdb --version`?)
> 
> If, as I assume, it is the old one in the system (GDB 6.1.1) it
> may be worth a try installing the one from the ports/packages.
> Something like `pkg add gdb`, which should bring in GDB 8.1.1.

Works for me! I needed to add --wrapper gdb81,--args. Now I see a nice
backtrace.

> 
>> Would be very handy to have at least one of these tools working.
>> Gerald can you please help us with that?
> 
> I believe the GDB route is the more promising one (also having looked
> at the two PRs you found re valgrind on FreeBSD/i386).  Question is,
> is it more of a GDB issue, or perhaps debugging information missing?
> 
> Let me know what you find, I may be able to ask others, closer to the
> FreeBSD toolchain for help.

Sure, I'll debug that more now.

Martin

> 
> Gerald
> 



Do not stream TYPE_VALUES to ltrans units

2018-09-10 Thread Jan Hubicka
Hi,
TYPE_VALUES are currently only used to output warnings on ODR mismatched enums.
I think those warnings are useful and thus we want to stream them to WPA, but
there is no need to stream them further to ltrans units.

Bootstrapped/regtested x86_64-linux, OK?

* tree-streamer-out.c (write_ts_type_non_common_tree_pointers):
Do not stream TYPE_VALUES to ltrans units.
* lto-streamer-out.c (DFS::DFS_write_tree_body): Likewise.
Index: tree-streamer-out.c
===
--- tree-streamer-out.c (revision 264180)
+++ tree-streamer-out.c (working copy)
@@ -700,7 +700,9 @@ write_ts_type_non_common_tree_pointers (
bool ref_p)
 {
   if (TREE_CODE (expr) == ENUMERAL_TYPE)
-stream_write_tree (ob, TYPE_VALUES (expr), ref_p);
+/* At WPA time we do not need to stream type values; those are only needed
+   to output ODR warnings.  */
+stream_write_tree (ob, flag_wpa ? NULL : TYPE_VALUES (expr), ref_p);
   else if (TREE_CODE (expr) == ARRAY_TYPE)
 stream_write_tree (ob, TYPE_DOMAIN (expr), ref_p);
   else if (RECORD_OR_UNION_TYPE_P (expr))
Index: lto-streamer-out.c
===
--- lto-streamer-out.c  (revision 264180)
+++ lto-streamer-out.c  (working copy)
@@ -864,7 +992,9 @@ DFS::DFS_write_tree_body (struct output_
 
   if (CODE_CONTAINS_STRUCT (code, TS_TYPE_NON_COMMON))
 {
-  if (TREE_CODE (expr) == ENUMERAL_TYPE)
+  /* At WPA time we do not need to stream type values; those are only 
needed
+ to output ODR warnings.  */
+  if (TREE_CODE (expr) == ENUMERAL_TYPE && !flag_wpa)
DFS_follow_tree_edge (TYPE_VALUES (expr));
   else if (TREE_CODE (expr) == ARRAY_TYPE)
DFS_follow_tree_edge (TYPE_DOMAIN (expr));


Re: [PATCH][OBVIOUS] Close file on return from verify-intermediate

2018-09-10 Thread Martin Liška
On 09/05/2018 03:29 PM, Joey Ye wrote:
> This is a fix to an obvious issue in gcov.exp, where proc verify-intermediate 
> returns without closing the open file.
> 
> This is a possible fix for PR85871. gcov-8.C differs from other gcov testcases 
> in that it invokes verify-intermediate. Not closing an open file may quietly 
> result in random failures.
> 
> It is only a possible fix as I failed to reproduce the PR85871 random failure 
> on my local machine despite several days of continuous testing. So I cannot 
> verify if this patch fixes the regression either.
> 
> To verify, https://gcc.gnu.org/ml/gcc-testresults/ needs to be watched to see 
> whether the gcov-8 regression disappears completely one month after this patch 
> is committed to trunk.
> 
> Tested with make check with no new regressions.
> 
> OK to trunk?
> 
> testsuite/ChangeLog:
> 2018-09-05  Joey Ye  
> 
>     * lib/gcov.exp (verify-intermediate): Add missing close.
> 

Hi.

Thanks for the fix, it's obvious. Please install the patch.

Note that gcov-8.C is built multiple times with different -std=* options:

PASS: g++.dg/gcov/gcov-8.C  -std=gnu++98 (test for excess errors)
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++98 execution test
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++98  gcov
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++11 (test for excess errors)
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++11 execution test
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++11  gcov
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++14 (test for excess errors)
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++14 execution test
PASS: g++.dg/gcov/gcov-8.C  -std=gnu++14  gcov

That can cause the collisions seen in the PR.

Martin


Re: [PATCH] PR86844: Fix for store merging

2018-09-10 Thread Andreas Krebbel
On 20.08.2018 16:30, Jeff Law wrote:
> On 08/18/2018 03:20 AM, Eric Botcazou wrote:
>>> Eric, didn't your patches explicitly handle this case of a non-constant
>>> in between?
>>
>> Only if there is no overlap at all, otherwise you cannot do things simply.
>>
>>> Can you have a look / review here?
>>
>> Jakub is probably more qualified to give a definitive opinion, as he wrote 
>> check_no_overlap and the bug is orthogonal to my patches since it is present 
>> in 8.x; in any case, all transformations are supposed to be covered by the 
>> testsuite.
> FYI. Jakub is on PTO through the end of this week and will probably be
> buried when he returns.

Jakub, could you please have a look whether that's the right fix?

https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00474.html

Andreas



Re: [PATCH]: Allow TARGET_SCHED_ADJUST_PRIORITY hook to reduce priority

2018-09-10 Thread John David Anglin

On 2018-09-10 8:35 AM, Andreas Schwab wrote:

On Sep 06 2018, Jeff Law  wrote:


On 09/03/2018 08:32 AM, John David Anglin wrote:

The documentation for TARGET_SCHED_ADJUST_PRIORITY indicates that the
hook can
reduce the priority of INSN to execute it later.  The hppa hook only
reduces the priority
and it has been this way for years.  However, the assert in
sel_target_adjust_priority()
prevents reduction of the priority.

The attached change revises the assert to allow the priority to be
reduced to zero.

This fixes PR rtl-optimization/85458.

Tested on hppa-unknown-linux-gnu, hppa2.0w-hp-hpux11.11 and
hppa64-hp-hpux11.11.

I must admit that this happens so infrequently that I have to wonder if
the hook provides
any benefit on hppa.  It was supposed to keep addil instructions close
to the following instruction
to reduce pressure on register %r1.

Okay?

Dave

--
John David Anglin  dave.ang...@bell.net


sel-sched.c.d


2018-09-03  John David Anglin  

PR rtl-optimization/85458
* sel-sched.c (sel_target_adjust_priority): Allow backend adjust
priority hook to reduce the priority of EXPR.

OK.

That breaks ia64.

during RTL pass: mach
/usr/local/gcc/test/gcc/testsuite/gcc.c-torture/compile/20010102-1.c: In 
function '_obstack_newchunk':
/usr/local/gcc/test/gcc/testsuite/gcc.c-torture/compile/20010102-1.c:101:1: 
internal compiler error: in sel_target_adjust_priority, at sel-sched.c:
0x410bb68f sel_target_adjust_priority
../../gcc/sel-sched.c:
0x410bb68f fill_vec_av_set
../../gcc/sel-sched.c:3727
0x410bd45f fill_ready_list
../../gcc/sel-sched.c:4028
0x410bd45f find_best_expr
../../gcc/sel-sched.c:4388
0x410bd45f fill_insns
../../gcc/sel-sched.c:5549
0x410c29cf schedule_on_fences
../../gcc/sel-sched.c:7366
0x410c29cf sel_sched_region_2
../../gcc/sel-sched.c:7504
0x410c510f sel_sched_region_1
../../gcc/sel-sched.c:7546
0x410c700f sel_sched_region(int)
../../gcc/sel-sched.c:7647
0x410c9def run_selective_scheduling()
../../gcc/sel-sched.c:7733
0x419e473f ia64_reorg
../../gcc/config/ia64/ia64.c:9857
0x410314cf execute
../../gcc/reorg.c:3984
It looks like negative priorities occur on ia64.  If that's reasonable, 
then the assert should be removed.

On the other hand, maybe there is a bug in setting the expression priority.
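
To make the two cases concrete, here is a toy model of the check.  The
actual sel-sched.c change is not shown in this thread, so this is only a
sketch under assumptions: toy_priority_hook, the two toy hooks and
toy_adjust_priority are hypothetical names, and the only thing taken from
the discussion is that the revised assert allows a reduction down to zero.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a target's adjust-priority hook.  */
typedef int (*toy_priority_hook) (int priority);

/* Toy hook that only lowers the priority and never goes below zero.  */
int
toy_hook_clamped (int priority)
{
  return priority > 0 ? priority - 1 : 0;
}

/* Toy hook that can return a negative priority.  */
int
toy_hook_unclamped (int priority)
{
  return priority - 10;
}

/* Apply HOOK to PRIORITY and report in *IN_RANGE whether the result
   passes a "not below zero" check, i.e. the condition the revised
   assert is described as enforcing.  */
int
toy_adjust_priority (toy_priority_hook hook, int priority, bool *in_range)
{
  assert (hook != NULL && in_range != NULL);

  int adjusted = hook (priority);
  *in_range = adjusted >= 0;
  return adjusted;
}
```

A hook of the first kind always passes the check; one of the second kind
trips it as soon as the result goes negative, which would match the ICE in
the backtrace above if negative priorities really do occur on ia64.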

Dave

--
John David Anglin  dave.ang...@bell.net



Re: [PATCH] Add a dwarf unit type to represent 24 bit values.

2018-09-10 Thread Jason Merrill
On Mon, Aug 27, 2018 at 8:20 PM, John Darrington
 wrote:
> * include/dwarf2.h (enum dwarf_unit_type) [DE_EH_PE_udata3]: New 
> member.

This is a new macro, not a member of dwarf_unit_type.

What's the rationale?  Do you have a separate patch that uses this new macro?

Jason


Re: [PATCH] Add a dwarf unit type to represent 24 bit values.

2018-09-10 Thread John Darrington
On Mon, Sep 10, 2018 at 03:36:26PM +0100, Jason Merrill wrote:
 On Mon, Aug 27, 2018 at 8:20 PM, John Darrington
  wrote:
 > * include/dwarf2.h (enum dwarf_unit_type) [DE_EH_PE_udata3]: New 
member.
 
 
 What's the rationale?  Do you have a separate patch that uses this new 
macro?
 
Yes.  There is an upcoming patch for GDB.  See 
https://sourceware.org/ml/gdb-patches/2018-08/msg00731.html

J'


[PATCH, OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions

2018-09-10 Thread Chung-Lin Tang


This patch adds *_async versions of several OpenACC runtime library API
functions, which allow a function to be executed asynchronously on a
particular async stream, an addition to the standard as of 2.5.
Specifically, these functions:

acc_copyin_async
acc_copyout_async
acc_copyout_finalize_async
acc_create_async
acc_delete_async
acc_delete_finalize_async
acc_memcpy_from_device_async
acc_memcpy_to_device_async
acc_update_device_async
acc_update_self_async

which take an additional 'int async' argument compared with the non-async
versions.

libgomp tested with offloading with no regressions, is this okay for trunk?

Thanks,
Chung-Lin

2018-09-10  Chung-Lin Tang  

libgomp/
* oacc-mem.c (memcpy_tofrom_device): New function, combined from
acc_memcpy_to/from_device functions, now with async parameter.
(acc_memcpy_to_device): Modify to use memcpy_tofrom_device.
(acc_memcpy_from_device): Likewise.
(acc_memcpy_to_device_async): New API function.
(acc_memcpy_from_device_async): Likewise.
(present_create_copy): Add async parameter and async setting/unsetting.
(acc_create): Adjust present_create_copy call.
(acc_copyin): Likewise.
(acc_present_or_create): Likewise.
(acc_present_or_copyin): Likewise.
(acc_create_async): New API function.
(acc_copyin_async): New API function.
(delete_copyout): Add async parameter and async setting/unsetting.
(acc_delete): Adjust delete_copyout call.
(acc_copyout): Likewise.
(acc_delete_async): New API function.
(acc_copyout_async): Likewise.
(update_dev_host): Add async parameter and async setting/unsetting.
(acc_update_device): Adjust update_dev_host call.
(acc_update_self): Likewise.
(acc_update_device_async): New API function.
(acc_update_self_async): Likewise.
* openacc.h (acc_copyin_async): Declare new API function.
(acc_create_async): Likewise.
(acc_copyout_async): Likewise.
(acc_delete_async): Likewise.
(acc_update_device_async): Likewise.
(acc_update_self_async): Likewise.
(acc_memcpy_to_device_async): Likewise.
(acc_memcpy_from_device_async): Likewise.
* openacc_lib.h (acc_copyin_async_32_h): New subroutine.
(acc_copyin_async_64_h): New subroutine.
(acc_copyin_async_array_h): New subroutine.
(acc_create_async_32_h): New subroutine.
(acc_create_async_64_h): New subroutine.
(acc_create_async_array_h): New subroutine.
(acc_copyout_async_32_h): New subroutine.
(acc_copyout_async_64_h): New subroutine.
(acc_copyout_async_array_h): New subroutine.
(acc_delete_async_32_h): New subroutine.
(acc_delete_async_64_h): New subroutine.
(acc_delete_async_array_h): New subroutine.
(acc_update_device_async_32_h): New subroutine.
(acc_update_device_async_64_h): New subroutine.
(acc_update_device_async_array_h): New subroutine.
(acc_update_self_async_32_h): New subroutine.
(acc_update_self_async_64_h): New subroutine.
(acc_update_self_async_array_h): New subroutine.
* openacc.f90 (acc_copyin_async_32_h): New subroutine.
(acc_copyin_async_64_h): New subroutine.
(acc_copyin_async_array_h): New subroutine.
(acc_create_async_32_h): New subroutine.
(acc_create_async_64_h): New subroutine.
(acc_create_async_array_h): New subroutine.
(acc_copyout_async_32_h): New subroutine.
(acc_copyout_async_64_h): New subroutine.
(acc_copyout_async_array_h): New subroutine.
(acc_delete_async_32_h): New subroutine.
(acc_delete_async_64_h): New subroutine.
(acc_delete_async_array_h): New subroutine.
(acc_update_device_async_32_h): New subroutine.
(acc_update_device_async_64_h): New subroutine.
(acc_update_device_async_array_h): New subroutine.
(acc_update_self_async_32_h): New subroutine.
(acc_update_self_async_64_h): New subroutine.
(acc_update_self_async_array_h): New subroutine.
* libgomp.map (OACC_2.5): Add acc_copyin_async*, acc_copyout_async*,
acc_copyout_finalize_async*, acc_create_async*, acc_delete_async*,
acc_delete_finalize_async*, acc_memcpy_from_device_async*,
acc_memcpy_to_device_async*, acc_update_device_async*, and
acc_update_self_async* entries.
* testsuite/libgomp.oacc-c-c++-common/lib-94.c: New test.
* testsuite/libgomp.oacc-c-c++-common/lib-95.c: New test.
* testsuite/libgomp.oacc-fortran/lib-16.f90: New test.
Index: libgomp/libgomp.map
===
--- libgomp/libgomp.map (revision 264192)
+++ libgomp/libgomp.map (working copy)
@@ -388,14 +388,48 @@ OACC_2.0.1 {
 
 OACC_2.5 {
   global:
+   acc_copyin_async;
+   acc_copyin_async

Re: [PATCH] Add a dwarf unit type to represent 24 bit values.

2018-09-10 Thread Jason Merrill
On Mon, Sep 10, 2018 at 3:42 PM, John Darrington
 wrote:
> On Mon, Sep 10, 2018 at 03:36:26PM +0100, Jason Merrill wrote:
>  On Mon, Aug 27, 2018 at 8:20 PM, John Darrington
>   wrote:
>  > * include/dwarf2.h (enum dwarf_unit_type) [DE_EH_PE_udata3]: 
> New member.
>
>
>  What's the rationale?  Do you have a separate patch that uses this new 
> macro?
>
> Yes.  There is an upcoming patch for GDB.  See
> https://sourceware.org/ml/gdb-patches/2018-08/msg00731.html

This looks like support for reading fixed 3-byte values from the
exception handling unwind information.  Do you expect this information
to ever need to store 3-byte values?  The offsets in the unwind info
don't need to correspond exactly to target word sizes, and if you use
an assembler that supports it (such as GNU as), the table will use
variable-length encoding anyway.
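
To make the contrast concrete, here is a minimal sketch of the two ways of
reading a value.  It assumes the variable-length encoding meant above is
unsigned LEB128 and that a fixed 3-byte value would be stored least-significant
byte first; read_udata3_le and read_uleb128 are hypothetical helper names, not
functions from GDB or GCC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read a fixed 3-byte unsigned value stored
   least-significant byte first.  */
uint32_t
read_udata3_le (const unsigned char *p)
{
  assert (p != NULL);
  return (uint32_t) p[0] | ((uint32_t) p[1] << 8) | ((uint32_t) p[2] << 16);
}

/* Hypothetical helper: decode an unsigned LEB128 value.  Each byte
   carries seven payload bits, low group first; the high bit is set on
   every byte except the last.  The number of bytes consumed is stored
   in *LEN.  */
uint64_t
read_uleb128 (const unsigned char *p, size_t *len)
{
  uint64_t result = 0;
  unsigned shift = 0;
  size_t i = 0;
  unsigned char byte;

  assert (p != NULL && len != NULL);
  do
    {
      byte = p[i++];
      if (shift < 64)
        result |= (uint64_t) (byte & 0x7f) << shift;
      shift += 7;
    }
  while (byte & 0x80);

  *len = i;
  return result;
}
```

With LEB128 the reader never needs to know a fixed width in advance, which is
why a dedicated 3-byte format may be unnecessary for tables emitted this way.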

Jason


Re: [PATCH, OpenACC 2.5, libgomp] Add *_async versions of runtime library API functions

2018-09-10 Thread Cesar Philippidis
On 09/10/2018 08:04 AM, Chung-Lin Tang wrote:

>  GOACC_2.0 {
> Index: libgomp/oacc-mem.c
> ===
> --- libgomp/oacc-mem.c(revision 264192)
> +++ libgomp/oacc-mem.c(working copy)
> @@ -153,8 +153,9 @@ acc_free (void *d)
>  gomp_fatal ("error in freeing device memory in %s", __FUNCTION__);
>  }
>  
> -void
> -acc_memcpy_to_device (void *d, void *h, size_t s)
> +static void
> +memcpy_tofrom_device (bool from, void *d, void *h, size_t s, int async,
> +   const char *libfnname)

This showed up oddly in the diff, but memcpy_tofrom_device is a new
internal function that's not part of the public API. It's nice that you
were able to merge the to/from functions together. I think this is safe
in terms of backwards compatibility.
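
The shape of that merge can be sketched in isolation.  This is a simplified
model, not libgomp code: it only covers the shared-memory case, all sketch_*
names are hypothetical, and a plain global stands in for the device's
async-queue selection.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the synchronous queue id.  */
enum { SKETCH_ASYNC_SYNC = -2 };

/* Hypothetical stand-in for the queue currently selected on the device.  */
int sketch_current_queue = SKETCH_ASYNC_SYNC;

/* One internal helper serves both directions; FROM selects which
   pointer is the source.  */
void
sketch_copy (bool from, void *d, void *h, size_t s, int async)
{
  assert (d != NULL && h != NULL);

  /* Select the requested queue only for genuinely asynchronous calls.  */
  if (async > SKETCH_ASYNC_SYNC)
    sketch_current_queue = async;

  if (from)
    memmove (h, d, s);
  else
    memmove (d, h, s);

  /* Restore the synchronous default so later calls are unaffected.  */
  if (async > SKETCH_ASYNC_SYNC)
    sketch_current_queue = SKETCH_ASYNC_SYNC;
}

/* The public entry points become thin wrappers with unchanged
   signatures, which is why existing callers are not affected.  */
void
sketch_copy_to (void *d, void *h, size_t s)
{
  sketch_copy (false, d, h, s, SKETCH_ASYNC_SYNC);
}

void
sketch_copy_from_async (void *h, void *d, size_t s, int async)
{
  sketch_copy (true, d, h, s, async);
}
```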

>  {
>/* No need to call lazy open here, as the device pointer must have
>   been obtained from a routine that did that.  */
> @@ -164,31 +165,49 @@ acc_free (void *d)
>  
>if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
>  {
> -  memmove (d, h, s);
> +  if (from)
> + memmove (h, d, s);
> +  else
> + memmove (d, h, s);
>return;
>  }
>  
> -  if (!thr->dev->host2dev_func (thr->dev->target_id, d, h, s))
> -gomp_fatal ("error in %s", __FUNCTION__);
> +  if (async > acc_async_sync)
> +thr->dev->openacc.async_set_async_func (async);
> +
> +  bool ret = (from
> +   ? thr->dev->dev2host_func (thr->dev->target_id, h, d, s)
> +   : thr->dev->host2dev_func (thr->dev->target_id, d, h, s));
> +
> +  if (async > acc_async_sync)
> +thr->dev->openacc.async_set_async_func (acc_async_sync);
> +
> +  if (!ret)
> +gomp_fatal ("error in %s", libfnname);
>  }
>  
>  void
> -acc_memcpy_from_device (void *h, void *d, size_t s)
> +acc_memcpy_to_device (void *d, void *h, size_t s)
>  {
> -  /* No need to call lazy open here, as the device pointer must have
> - been obtained from a routine that did that.  */
> -  struct goacc_thread *thr = goacc_thread ();
> +  memcpy_tofrom_device (false, d, h, s, acc_async_sync, __FUNCTION__);
> +}
>  
> -  assert (thr && thr->dev);
> +void
> +acc_memcpy_to_device_async (void *d, void *h, size_t s, int async)
> +{
> +  memcpy_tofrom_device (false, d, h, s, async, __FUNCTION__);
> +}
>  
> -  if (thr->dev->capabilities & GOMP_OFFLOAD_CAP_SHARED_MEM)
> -{
> -  memmove (h, d, s);
> -  return;
> -}
> +void
> +acc_memcpy_from_device (void *h, void *d, size_t s)
> +{
> +  memcpy_tofrom_device (true, d, h, s, acc_async_sync, __FUNCTION__);
> +}
>  
> -  if (!thr->dev->dev2host_func (thr->dev->target_id, h, d, s))
> -gomp_fatal ("error in %s", __FUNCTION__);
> +void
> +acc_memcpy_from_device_async (void *h, void *d, size_t s, int async)
> +{
> +  memcpy_tofrom_device (true, d, h, s, async, __FUNCTION__);
>  }
>  
>  /* Return the device pointer that corresponds to host data H.  Or NULL
> @@ -428,7 +447,7 @@ acc_unmap_data (void *h)
>  #define FLAG_COPY (1 << 2)
>  
>  static void *
> -present_create_copy (unsigned f, void *h, size_t s)
> +present_create_copy (unsigned f, void *h, size_t s, int async)

Likewise, this is another internal function, so it shouldn't break anything.

>  {
>void *d;
>splay_tree_key n;
> @@ -490,11 +509,17 @@ static void *
>  
>gomp_mutex_unlock (&acc_dev->lock);
>  
> +  if (async > acc_async_sync)
> + acc_dev->openacc.async_set_async_func (async);
> +
>tgt = gomp_map_vars (acc_dev, mapnum, &hostaddrs, NULL, &s, &kinds, 
> true,
>  GOMP_MAP_VARS_OPENACC);
>/* Initialize dynamic refcount.  */
>tgt->list[0].key->dynamic_refcount = 1;
>  
> +  if (async > acc_async_sync)
> + acc_dev->openacc.async_set_async_func (acc_async_sync);
> +
>gomp_mutex_lock (&acc_dev->lock);
>  
>d = tgt->to_free;
> @@ -510,19 +535,32 @@ static void *
>  void *
>  acc_create (void *h, size_t s)
>  {
> -  return present_create_copy (FLAG_PRESENT | FLAG_CREATE, h, s);
> +  return present_create_copy (FLAG_PRESENT | FLAG_CREATE, h, s, 
> acc_async_sync);
>  }
>  
> +void
> +acc_create_async (void *h, size_t s, int async)
> +{
> +  present_create_copy (FLAG_PRESENT | FLAG_CREATE, h, s, async);
> +}
> +
>  void *
>  acc_copyin (void *h, size_t s)
>  {
> -  return present_create_copy (FLAG_PRESENT | FLAG_CREATE | FLAG_COPY, h, s);
> +  return present_create_copy (FLAG_PRESENT | FLAG_CREATE | FLAG_COPY, h, s,
> +   acc_async_sync);
>  }
>  
> +void
> +acc_copyin_async (void *h, size_t s, int async)
> +{
> +  present_create_copy (FLAG_PRESENT | FLAG_CREATE | FLAG_COPY, h, s, async);
> +}
> +
>  void *
>  acc_present_or_create (void *h, size_t s)
>  {
> -  return present_create_copy (FLAG_PRESENT | FLAG_CREATE, h, s);
> +  return present_create_copy (FLAG_PRESENT | FLAG_CREATE, h, s, 
> acc_async_sync);
>  }
>  
>  /* acc_pcreate is acc_present_or_create by a different na

Re: [PATCH, OpenACC] C++ reference mapping (PR middle-end/86336)

2018-09-10 Thread Jason Merrill
On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown  wrote:
> This patch (by Cesar) changes the way C++ references are mapped in
> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase.
>
> Post-patch, references are mapped like this (from the omplower dump):
>
> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer assign, bias: 
> 0])
>
> Tested with offloading to NVPTX and bootstrapped. OK for trunk?
>
> Thanks,
>
> Julian
>
> ChangeLog
>
> 2018-09-09  Cesar Philippidis  
> Julian Brown  
>
> PR middle-end/86336
>
> (gimplify_adjust_omp_clauses_1): Update handling of mapping of C++
> references.

How is reference handling specified differently between OpenMP and
OpenACC?  It seems strange for them to differ.

In any case, you shouldn't need to check lang_GNU_CXX since we're
already calling the langhook.

Jason


Re: [PATCH, OpenACC] C++ reference mapping (PR middle-end/86336)

2018-09-10 Thread Cesar Philippidis
On 09/10/2018 10:37 AM, Jason Merrill wrote:
> On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown  wrote:
>> This patch (by Cesar) changes the way C++ references are mapped in
>> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase.
>>
>> Post-patch, references are mapped like this (from the omplower dump):
>>
>> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer assign, bias: 
>> 0])
>>
>> Tested with offloading to NVPTX and bootstrapped. OK for trunk?
>>
>> Thanks,
>>
>> Julian
>>
>> ChangeLog
>>
>> 2018-09-09  Cesar Philippidis  
>> Julian Brown  
>>
>> PR middle-end/86336
>>
>> (gimplify_adjust_omp_clauses_1): Update handling of mapping of C++
>> references.
> 
> How is reference handling specified differently between OpenMP and
> OpenACC?  It seems strange for them to differ.

Both OpenACC and OpenMP privatize mapped array pointers on the
accelerator for subarrays in the same way. However, for pointers without
subarrays, OpenMP treats them as zero-length arrays, whereas OpenACC
treats them as ordinary scalars so that the pointer target will not get
remapped on the accelerator (which is odd because there's a deviceptr
clause for that). Scalars in C++ are special, because references must be
treated like an array of length one, for lack of better terminology.

> In any case, you shouldn't need to check lang_GNU_CXX since we're
> already calling the langhook.

Julian, can you look into this? I'm traveling tomorrow.

Cesar


Re: [PATCH] PR86844: Fix for store merging

2018-09-10 Thread Jakub Jelinek
On Mon, Sep 10, 2018 at 04:05:26PM +0200, Andreas Krebbel wrote:
> On 20.08.2018 16:30, Jeff Law wrote:
> > On 08/18/2018 03:20 AM, Eric Botcazou wrote:
> >>> Eric, didn't your patches explicitly handle this case of a non-constant
> >>> in between?
> >>
> >> Only if there is no overlap at all, otherwise you cannot do things simply.
> >>
> >>> Can you have a look / review here?
> >>
> >> Jakub is probably more qualified to give a definitive opinion, as he wrote 
> >> check_no_overlap and the bug is orthogonal to my patches since it is 
> >> present 
> >> in 8.x; in any case, all transformations are supposed to be covered by the 
> >> testsuite.
> > FYI. Jakub is on PTO through the end of this week and will probably be
> > buried when he returns.
> 
> Jakub, could you please have a look whether that's the right fix?
> 
> https://gcc.gnu.org/ml/gcc-patches/2018-08/msg00474.html

It is a fix, but not optimal.
We have essentially:
 MEM[(int *)p_28] = 0;
 MEM[(char *)p_28 + 3B] = 1;
 MEM[(char *)p_28 + 1B] = 2;
 MEM[(char *)p_28 + 2B] = MEM[(char *)p_28 + 6B];
It is useful to merge the first 3 stores into:
 MEM[(int *)p_28] = 0x01000200; // or 0x00020001; depending on endianness
 MEM[(char *)p_28 + 2B] = MEM[(char *)p_28 + 6B];
rather than punt, and just ignore the non-INTEGER_CST store (i.e. make sure
it isn't merged with anything else).  If you don't mind, I'll take this
PR over and handle it tomorrow.
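
The two merged constants can be checked by hand with a small sketch.  It
applies the three constant stores to a 4-byte image and then reads that image
in either byte order; the helper names are hypothetical and this is not code
from the store-merging pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Apply the three constant stores, in program order, to a 4-byte image.  */
void
merged_bytes (unsigned char b[4])
{
  assert (b != NULL);
  b[0] = b[1] = b[2] = b[3] = 0;	/* MEM[(int *)p] = 0  */
  b[3] = 1;				/* MEM[(char *)p + 3B] = 1  */
  b[1] = 2;				/* MEM[(char *)p + 1B] = 2  */
}

/* Interpret the image as a 32-bit value on a little-endian target.  */
uint32_t
as_little_endian (const unsigned char b[4])
{
  return (uint32_t) b[0] | ((uint32_t) b[1] << 8)
	 | ((uint32_t) b[2] << 16) | ((uint32_t) b[3] << 24);
}

/* Interpret the image as a 32-bit value on a big-endian target.  */
uint32_t
as_big_endian (const unsigned char b[4])
{
  return ((uint32_t) b[0] << 24) | ((uint32_t) b[1] << 16)
	 | ((uint32_t) b[2] << 8) | (uint32_t) b[3];
}
```

The image is { 0, 2, 0, 1 }, which reads as 0x01000200 little-endian and
0x00020001 big-endian.  The fourth store, to byte 2, has a non-constant
source and must still be emitted separately after the merged one.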

Slightly tweaked testcase:
__attribute__((noipa)) void
foo (int *p)
{
  *p = 0;
  *((char *)p + 3) = 1;
  *((char *)p + 1) = 2;
  *((char *)p + 2) = *((char *)p + 6);
}

int
main ()
{
  int a[2] = { -1, 0 };
  if (sizeof (int) != 4)
    return 0;
  ((char *)a)[6] = 3;
  foo (a);
  if (((char *)a)[0] != 0 || ((char *)a)[1] != 2
      || ((char *)a)[2] != 3 || ((char *)a)[3] != 1)
    __builtin_abort ();
}

Jakub


Re: [PATCH, OpenACC] C++ reference mapping (PR middle-end/86336)

2018-09-10 Thread Julian Brown
On Mon, 10 Sep 2018 10:52:47 -0700
Cesar Philippidis  wrote:

> On 09/10/2018 10:37 AM, Jason Merrill wrote:
> > On Mon, Sep 10, 2018 at 4:05 AM, Julian Brown
> >  wrote:  
> >> This patch (by Cesar) changes the way C++ references are mapped in
> >> OpenACC regions, fixing an ICE in the non-scalar-data.C testcase.
> >>
> >> Post-patch, references are mapped like this (from the omplower
> >> dump):
> >>
> >> map(force_present:*x [len: 4]) map(firstprivate ref:x [pointer
> >> assign, bias: 0])
> >>
> >> Tested with offloading to NVPTX and bootstrapped. OK for trunk?
> >>
> >> Thanks,
> >>
> >> Julian
> >>
> >> ChangeLog
> >>
> >> 2018-09-09  Cesar Philippidis  
> >> Julian Brown  
> >>
> >> PR middle-end/86336
> >>
> >> (gimplify_adjust_omp_clauses_1): Update handling of
> >> mapping of C++ references.  
> > 
> > How is reference handling specified differently between OpenMP and
> > OpenACC?  It seems strange for them to differ.  
> 
> Both OpenACC and OpenMP privatize mapped array pointers on the
> accelerator for subarrays in the same way. However, for pointers
> without subarrays, OpenMP treats them as zero-length arrays, whereas
> OpenACC treats them as ordinary scalars so that the pointer target
> will not get remapped on the accelerator (which is odd because
> there's a deviceptr clause for that). Scalars in C++ are special,
> because references must be treated like an array of length one, for lack
> of a better terminology.

I think it's more accurate to say that OpenACC says nothing about C++
references at all, nor about how unadorned pointers are mapped in
copy/copyin/copyout clauses. So arguably we get to choose whatever we
want, preferably based on the principle of least surprise. (ICE'ing
definitely counts as a surprise!)

As noted in a previous email, PGI seems to treat pointers to
aggregates specially, mapping them as ptr[0:1], but it's unclear if the
same is true for pointers to scalars with their compiler. Neither
behaviour seems to be standard-mandated, but this patch extends the
idea to references to scalars nonetheless.

> > In any case, you shouldn't need to check lang_GNU_CXX since we're
> > already calling the langhook.  
> 
> Julian, can you look into this? I'm traveling tomorrow.

Yes, I'll continue to look at this patch.

Thanks,

Julian


[PATCH] [ARC]: core3 features are default for core4

2018-09-10 Thread Vineet Gupta
   * config/arc/arc.c: object attributes for core4 not reflected correctly
   * config/arc/arc.h: Don't restrict DBNZ to core3 (core4 includes core3)

Signed-off-by: Vineet Gupta 
---
 gcc/ChangeLog| 7 +++
 gcc/config/arc/arc.c | 2 +-
 gcc/config/arc/arc.h | 2 +-
 3 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 6dbe8147b3ec..3a022d156445 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,10 @@
+2018-09-10  Vineet Gupta 
+
+   * config/arc/arc.c: object attributes for core4 not reflected
+   correctly
+   * config/arc/arc.h: Don't restrict DBNZ to core3 (core4 includes
+   core3)
+
 2018-09-09  Uros Bizjak  
 
* config/i386/i386.md (float partial SSE register stall splitter): Move
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index c186e02e0f18..0171e8a7c615 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -5181,7 +5181,7 @@ static void arc_file_start (void)
   TARGET_OPTFPE ? 1 : 0);
   if (TARGET_V2)
 asm_fprintf (asm_out_file, "\t.arc_attribute Tag_ARC_CPU_variation, %d\n",
-arc_tune == ARC_TUNE_CORE_3 ? 3 : 2);
+arc_tune < ARC_TUNE_CORE_3 ? 2 : (arc_tune == ARC_TUNE_CORE_3 ? 3 : 4) );
 }
 
 /* Implement `TARGET_ASM_FILE_END'.  */
diff --git a/gcc/config/arc/arc.h b/gcc/config/arc/arc.h
index de09b6b2f09e..4d38f9ec174f 100644
--- a/gcc/config/arc/arc.h
+++ b/gcc/config/arc/arc.h
@@ -1636,6 +1636,6 @@ enum
 #define TARGET_FPX_QUARK(TARGET_EM && TARGET_SPFP  \
 && (arc_fpu_build == FPX_QK))
 /* DBNZ support is available for ARCv2 core3 cpus.  */
-#define TARGET_DBNZ (TARGET_V2 && (arc_tune == ARC_TUNE_CORE_3))
+#define TARGET_DBNZ (TARGET_V2 && (arc_tune >= ARC_TUNE_CORE_3))
 
 #endif /* GCC_ARC_H */
-- 
2.7.4
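
The effect of the two changed expressions can be sketched on their own.  The
enum below is hypothetical: it only assumes, as the patch does, that tuning
values older than core3 compare lower and that core4 compares higher.  The
real ARC_TUNE_* values live in the ARC backend and may be laid out
differently.

```c
#include <assert.h>

/* Hypothetical ordering of tuning values, for illustration only.  */
enum sketch_tune
{
  SKETCH_TUNE_OLDER,
  SKETCH_TUNE_CORE_3,
  SKETCH_TUNE_CORE_4
};

/* Mirrors the new Tag_ARC_CPU_variation expression: 2 for anything
   older than core3, 3 for core3, 4 for anything newer.  */
int
sketch_cpu_variation (enum sketch_tune tune)
{
  assert (tune <= SKETCH_TUNE_CORE_4);
  return tune < SKETCH_TUNE_CORE_3 ? 2 : (tune == SKETCH_TUNE_CORE_3 ? 3 : 4);
}

/* Mirrors the new TARGET_DBNZ test: core3 and everything after it.  */
int
sketch_has_dbnz (int target_v2, enum sketch_tune tune)
{
  return target_v2 && tune >= SKETCH_TUNE_CORE_3;
}
```

Before the patch, the equality tests would have reported variation 2 and no
DBNZ support for a core4 tuning.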



[PATCH, i386]: Use only memory_operand input operands in x87/SSE constant load splitter

2018-09-10 Thread Uros Bizjak
Hello!

Currently, the x87/SSE constant load splitter converts both memory loads and
register copies to supported immediate loads (xorps reg,reg, fld0,
fld1, ...). However, it is cheaper to copy the value from a register
than to rematerialize the constant. Also, the compiler distinguishes between
SFmode, DFmode and XFmode x87 loads, and currently produces several
separate fld1 insns for loads in different modes. The patch prevents
this situation and leaves the float_extends from SFmode loads (emitted by
compress_float_constant) in place; these are later converted to either
no-ops or plain x87 register moves.

2018-09-10  Uros Bizjak  

* config/i386/i386.md (x87/SSE constant load splitter): Use
memory_operand instead of nonimmediate_operand for input operand
predicate.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.

Index: config/i386/i386.md
===
--- config/i386/i386.md (revision 264185)
+++ config/i386/i386.md (working copy)
@@ -3833,7 +3833,7 @@

 (define_split
   [(set (match_operand 0 "any_fp_register_operand")
-   (match_operand 1 "nonimmediate_operand"))]
+   (match_operand 1 "memory_operand"))]
   "reload_completed
&& (GET_MODE (operands[0]) == TFmode
|| GET_MODE (operands[0]) == XFmode
@@ -3845,7 +3845,7 @@

 (define_split
   [(set (match_operand 0 "any_fp_register_operand")
-   (float_extend (match_operand 1 "nonimmediate_operand")))]
+   (float_extend (match_operand 1 "memory_operand")))]
   "reload_completed
&& (GET_MODE (operands[0]) == TFmode
|| GET_MODE (operands[0]) == XFmode


[wwwdocs] Move gcc-3.4/criteria.html to HTML 5

2018-09-10 Thread Gerald Pfeifer
...and remove "DRAFT" from its title on the way, ahem.

Applied.

Gerald

Index: gcc-3.4/criteria.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/gcc-3.4/criteria.html,v
retrieving revision 1.9
diff -u -r1.9 criteria.html
--- gcc-3.4/criteria.html   2 Sep 2018 21:30:48 -   1.9
+++ gcc-3.4/criteria.html   10 Sep 2018 19:01:00 -
@@ -2,13 +2,13 @@
 
 
 
-DRAFT: GCC 3.4 Release Criteria
+GCC 3.4 Release Criteria
 https://gcc.gnu.org/gcc.css"; />
 
 
 
 
-DRAFT: GCC 3.4 Release Criteria
+GCC 3.4 Release Criteria
 
 This page provides the release criteria for GCC 3.4.  GCC 3.4 will
 not be released until these criteria have been met.  This page
@@ -64,7 +64,7 @@
 systems and the most popular microprocessors.  Of course, where
 possible, the release will support other targets as well.
 
-
+
 Primary Evaluation Platforms
 Chip OS  
   Triplet
@@ -123,7 +123,7 @@
 team, will make reasonable efforts to assist these volunteers by
 answering questions and reviewing patches as time permits.
 
-
+
 Secondary Evaluation Platforms
 Chip OS
   Triplet
@@ -200,7 +200,7 @@
 to general information about a package and a source URL.  Versions
 shown here are used for GCC 3.4 integration testing.
 
-
+
 Integration Tests
 Name
 Language
@@ -313,7 +313,7 @@
 Therefore, we will use the following benchmarks for measuring code
 quality:
 
-
+
 Name
 Language
 Source URL
@@ -357,7 +357,7 @@
 In order to measure compile-time performance, we will use the
 following unit tests:
 
-
+
 Name
 Language
 Source


[wwwdocs] Use plain

2018-09-10 Thread Gerald Pfeifer
Per the HTML 5 validator used by w3.org.

Applied.

Gerald

Index: projects/cxx0x.html
===
RCS file: /cvs/gcc/wwwdocs/htdocs/projects/cxx0x.html,v
retrieving revision 1.76
diff -u -r1.76 cxx0x.html
--- projects/cxx0x.html 1 Sep 2018 23:42:09 -   1.76
+++ projects/cxx0x.html 10 Sep 2018 19:04:48 -
@@ -3,7 +3,7 @@

 
 
-
+