[PATCH][ARC][committed] Update tmac tests.

2019-06-06 Thread Claudiu Zissulescu
Fix order of dg-directives such that tests are executed only when
there is no command line cpu option given.

gcc/testsuite/
-xx-xx  Claudiu Zissulescu  

* gcc.target/arc/tmac-1.c: Reorder dg-directives.
* gcc.target/arc/tmac-2.c: Likewise.
---
 gcc/testsuite/ChangeLog   | 5 +
 gcc/testsuite/gcc.target/arc/tmac-1.c | 3 +--
 gcc/testsuite/gcc.target/arc/tmac-2.c | 2 +-
 3 files changed, 7 insertions(+), 3 deletions(-)
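
Note (illustrative sketch, not part of the patch): the header below shows
the directive order the patch establishes, with dg-do first and dg-skip-if
second. The three directives are copied from the diff; the function body is
hypothetical and is not the real content of tmac-1.c.

```c
/* { dg-do compile } */
/* { dg-skip-if "" { ! { clmcpu } } } */
/* { dg-options "-O2 -mcpu=archs -mmpy-option=8" } */

/* Hypothetical test body: a signed multiply-accumulate that a
   MAC-capable target could map to a single instruction.  */
long long
mac_signed (long long acc, int a, int b)
{
  return acc + (long long) a * b;
}
```

Because the directives live in comments, the file is still plain C and can
be compiled on any host to check the body itself.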

diff --git a/gcc/testsuite/ChangeLog b/gcc/testsuite/ChangeLog
index 18a480fcdab..62262461834 100644
--- a/gcc/testsuite/ChangeLog
+++ b/gcc/testsuite/ChangeLog
@@ -1,3 +1,8 @@
+2019-06-06  Claudiu Zissulescu  
+
+   * gcc.target/arc/tmac-1.c: Reorder dg-directives.
+   * gcc.target/arc/tmac-2.c: Likewise.
+
 2019-06-05  Martin Sebor  
 
PR c/90737
diff --git a/gcc/testsuite/gcc.target/arc/tmac-1.c b/gcc/testsuite/gcc.target/arc/tmac-1.c
index e59df5f6b59..3fcabf5fff2 100644
--- a/gcc/testsuite/gcc.target/arc/tmac-1.c
+++ b/gcc/testsuite/gcc.target/arc/tmac-1.c
@@ -1,5 +1,5 @@
-/* { dg-skip-if "" { ! { clmcpu } } } */
 /* { dg-do compile } */
+/* { dg-skip-if "" { ! { clmcpu } } } */
 /* { dg-options "-O2 -mcpu=archs -mmpy-option=8" } */
 
 /* Test MAC operation for MPY_OPTION = 8.  */
@@ -9,4 +9,3 @@
 /* { dg-final { scan-assembler "macdu" } } */
 /* { dg-final { scan-assembler "mpyd " } } */
 /* { dg-final { scan-assembler "mpydu" } } */
-
diff --git a/gcc/testsuite/gcc.target/arc/tmac-2.c b/gcc/testsuite/gcc.target/arc/tmac-2.c
index f0136bac3e6..ee1339a2f23 100644
--- a/gcc/testsuite/gcc.target/arc/tmac-2.c
+++ b/gcc/testsuite/gcc.target/arc/tmac-2.c
@@ -1,5 +1,5 @@
-/* { dg-skip-if "" { ! { clmcpu } } } */
 /* { dg-do compile } */
+/* { dg-skip-if "" { ! { clmcpu } } } */
 /* { dg-options "-O2 -mcpu=archs -mmpy-option=7" } */
 
 /* Test MAC operation for MPY_OPTION = 7.  */
-- 
2.21.0



Re: undefined behavior in value_range::equiv_add()?

2019-06-06 Thread Richard Biener
On Wed, Jun 5, 2019 at 11:12 PM Jeff Law  wrote:
>
> On 6/4/19 9:04 AM, Richard Biener wrote:
> > On Tue, Jun 4, 2019 at 3:40 PM Jeff Law  wrote:
> >>
> >> On 6/4/19 5:23 AM, Richard Biener wrote:
> >>> On Tue, Jun 4, 2019 at 12:30 AM Jeff Law  wrote:
> 
>  On 6/3/19 7:13 AM, Aldy Hernandez wrote:
> > On 5/31/19 5:00 AM, Richard Biener wrote:
> >> On Fri, May 31, 2019 at 2:27 AM Jeff Law 
> >> wrote:
> >>>
> >>> On 5/29/19 10:20 AM, Aldy Hernandez wrote:
>  On 5/29/19 12:12 PM, Jeff Law wrote:
> > On 5/29/19 9:58 AM, Aldy Hernandez wrote:
> >> On 5/29/19 9:24 AM, Richard Biener wrote:
> >>> On Wed, May 29, 2019 at 2:18 PM Aldy Hernandez
> >>>  wrote:
> 
>  As per the API, and the original documentation
>  to value_range, VR_UNDEFINED and VR_VARYING
>  should never have equivalences. However,
>  equiv_add is tacking on equivalences blindly,
>  and there are various regressions that happen
>  if I fix this oversight.
> 
>  void
>  value_range::equiv_add (const_tree var,
>                          const value_range *var_vr,
>                          bitmap_obstack *obstack)
>  {
>    if (!m_equiv)
>      m_equiv = BITMAP_ALLOC (obstack);
>    unsigned ver = SSA_NAME_VERSION (var);
>    bitmap_set_bit (m_equiv, ver);
>    if (var_vr && var_vr->m_equiv)
>      bitmap_ior_into (m_equiv, var_vr->m_equiv);
>  }
> 
>  Is this a bug in the documentation / API, or is
>  equiv_add incorrect and we should fix the
>  fall-out elsewhere?
> >>>
> >>> I think this must have crept in during the
> >>> classification. If you go back to say GCC 7 you
> >>> shouldn't see value-ranges with UNDEFINED/VARYING
> >>> state in the lattice that have equivalences.
> >>>
> >>> It may not be easy to avoid with the new classy
> >>> interface but we're certainly not tacking on them
> >>> "blindly".  At least we're not supposed to.  As
> >>> usual the intermediate state might be "broken"
> >>> but intermediateness is not sth the new class
> >>> "likes".
> >>
> >> It looks like extract_range_from_stmt (by virtue
> >> of vrp_visit_assignment_or_call and then
> >> extract_range_from_ssa_name) returns one of these
> >> intermediate ranges.  It would seem to me that an
> >> outward looking API method like
> >> vr_values::extract_range_from_stmt shouldn't be
> >> returning inconsistent ranges.  Or are there no
> >> guarantees for value_ranges from within all of
> >> vr_values?
> > ISTM that if we have an implementation constraint
> > that says a VR_VARYING or VR_UNDEFINED range can't
> > have equivalences, then we need to honor that at the
> > minimum for anything returned by an external API.
> > Returning an inconsistent state is bad.  I'd even
> > state that we should try damn hard to avoid it in
> > internal APIs as well.
> 
>  Agreed * 2.
> 
> >
> >>
> >> Perhaps I should give a little background.  As part
> >> of your value_range_base re-factoring last year,
> >> you mentioned that you didn't split out intersect
> >> like you did union because of time or oversight.
> >> I have code to implement intersect (attached), for
> >> which I've noticed that I must leave equivalences
> >> intact, even when transitioning to VR_UNDEFINED:
> >>
> >> [from the attached patch]
> >> +  /* If THIS is varying we want to pick up equivalences from OTHER.
> >> +     Just special-case this here rather than trying to fixup after the
> >> +     fact.  */
> >> +  if (this->varying_p ())
> >> +    this->deep_copy (other);
> >> +  else if (this->undefined_p ())
> >> +    /* ?? Leave any equivalences already present in an undefined.
> >> +       This is technically not allowed, but we may get an in-flight
> >> +       value_range in an intermediate state.  */
> > Where/when does this happen?
> 
>  The above snippet is not currently in mainline.  It's
>  in the patch I'm proposing to clean up intersect.  It's
>  just that while cleaning up intersect I noticed that if
>  we keep to the value_range API, we end up clobbering an
>  equivalence to a VR_UNDEFINED that we depend upon further
>  up the call chain.
> 
>  The reason it doesn't happen in mainline is because
>  intersect_helper bails early on an undefined, thus
>  leaving the problematic equivalence intact.
> 
>  You can see it in mainline though, with the following
>  testcase:
> >
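
Note (illustrative sketch, not code from the thread): the invariant under
discussion is that VR_UNDEFINED and VR_VARYING ranges never carry
equivalences. The toy model below shows what a guarded equiv_add could look
like. All names (toy_range, toy_equiv_add, toy_range_consistent_p) are
hypothetical, and a 64-bit mask stands in for GCC's bitmap. The thread
reports that enforcing this in GCC itself exposed regressions, so this only
illustrates the invariant and is not a proposed fix.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-ins for GCC's value range kinds.  */
enum vr_kind { VR_UNDEFINED, VR_RANGE, VR_ANTI_RANGE, VR_VARYING };

struct toy_range
{
  enum vr_kind kind;
  uint64_t equiv;   /* Bit N set means SSA version N is equivalent.  */
};

/* True if R satisfies the documented invariant: UNDEFINED and VARYING
   ranges carry no equivalences.  */
static bool
toy_range_consistent_p (const struct toy_range *r)
{
  if (r->kind == VR_UNDEFINED || r->kind == VR_VARYING)
    return r->equiv == 0;
  return true;
}

/* Record SSA version VER, plus the equivalences of OTHER if non-NULL,
   in R.  Nothing is recorded when R's kind may not have equivalences.
   Returns true if the equivalences were recorded.  */
static bool
toy_equiv_add (struct toy_range *r, unsigned ver,
               const struct toy_range *other)
{
  assert (ver < 64);
  if (r->kind == VR_UNDEFINED || r->kind == VR_VARYING)
    return false;
  r->equiv |= UINT64_C (1) << ver;
  if (other)
    r->equiv |= other->equiv;
  return true;
}
```

With this guard a VARYING range stays consistent no matter how often
toy_equiv_add is called on it, which is the property the callers in the
thread currently cannot rely on.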

[PATCH] [ARC] Fix PR89838

2019-06-06 Thread Claudiu Zissulescu
Hi Andrew,

This is a proposed fix for Bugzilla issue PR89838. It also needs to be
backported to the gcc9 and, eventually, gcc8 branches.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc.c (arc_symbol_binds_local_p): New function.
(arc_legitimize_pic_address): Simplify and cleanup the function.
(SYMBOLIC_CONST): Remove.
(prepare_pic_move): Likewise.
(prepare_move_operands): Handle complex mov cases here.
(arc_legitimize_address_0): Remove call to
arc_legitimize_pic_address.
(arc_legitimize_address): Remove call to
arc_legitimize_tls_address.
* config/arc/arc.md (movqi_insn): Allow Cm3 match.
(movhi_insn): Likewise.

gcc/testsuite/
-xx-xx  Claudiu Zissulescu  

* gcc.target/arc/pr89838.c: New file.
---
 gcc/config/arc/arc.c   | 215 +++--
 gcc/config/arc/arc.md  |   8 +-
 gcc/testsuite/gcc.target/arc/pr89838.c |  16 ++
 3 files changed, 77 insertions(+), 162 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/pr89838.c
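
Note (illustrative sketch, not part of the patch): the new helper
arc_symbol_binds_local_p prefers the target hook's answer when the symbol
has a decl and falls back to the symbol's own "local" flag otherwise. The
toy model below mirrors that decision with plain structs. Every name
prefixed with toy_ is hypothetical. Mapping a locally binding symbol to a
PC-relative GOTOFF reference and everything else to a GOT load follows the
removed code and the retained comment; it is an assumption about the
rewritten function, whose full body is not shown in this excerpt.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a tree decl and a SYMBOL_REF rtx.  */
struct toy_decl
{
  bool binds_local;             /* Models targetm.binds_local_p (decl).  */
};

struct toy_symbol
{
  const struct toy_decl *decl;  /* NULL when no decl is attached.  */
  bool local_flag;              /* Models SYMBOL_REF_LOCAL_P.  */
};

enum toy_pic_ref { TOY_GOTOFFPC, TOY_GOT };

/* Same shape as the helper in the patch: ask the decl if there is one,
   otherwise trust the flag on the symbol.  */
static bool
toy_symbol_binds_local_p (const struct toy_symbol *s)
{
  assert (s != NULL);
  return s->decl ? s->decl->binds_local : s->local_flag;
}

/* Assumed use: local symbols are reached PC-relative, others via GOT.  */
static enum toy_pic_ref
toy_pic_ref_kind (const struct toy_symbol *s)
{
  return toy_symbol_binds_local_p (s) ? TOY_GOTOFFPC : TOY_GOT;
}
```

The interesting case is a symbol whose flag says "local" while its decl
does not bind locally: the decl wins, so the reference goes through the GOT.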

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index 20dfae66370..a5ee5c49a8f 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -5986,130 +5986,47 @@ arc_legitimize_tls_address (rtx addr, enum tls_model model)
 }
 }
 
-/* Legitimize a pic address reference in ORIG.
-   The return value is the legitimated address.
-   If OLDX is non-zero, it is the target to assign the address to first.  */
+/* Return true if SYMBOL_REF X binds locally.  */
 
-static rtx
-arc_legitimize_pic_address (rtx orig, rtx oldx)
+static bool
+arc_symbol_binds_local_p (const_rtx x)
 {
-  rtx addr = orig;
-  rtx pat = orig;
-  rtx base;
+  return (SYMBOL_REF_DECL (x)
+ ? targetm.binds_local_p (SYMBOL_REF_DECL (x))
+ : SYMBOL_REF_LOCAL_P (x));
+}
+
+/* Legitimize a pic address reference in ORIG.  The return value is
+   the legitimated address.  */
 
-  if (oldx == orig)
-oldx = NULL;
+static rtx
+arc_legitimize_pic_address (rtx addr)
+{
+  if (!flag_pic)
+return addr;
 
-  if (GET_CODE (addr) == LABEL_REF)
-; /* Do nothing.  */
-  else if (GET_CODE (addr) == SYMBOL_REF)
+  switch (GET_CODE (addr))
 {
-  enum tls_model model = SYMBOL_REF_TLS_MODEL (addr);
-  if (model != 0)
-   return arc_legitimize_tls_address (addr, model);
-  else if (!flag_pic)
-   return orig;
-  else if (CONSTANT_POOL_ADDRESS_P (addr) || SYMBOL_REF_LOCAL_P (addr))
-   return arc_unspec_offset (addr, ARC_UNSPEC_GOTOFFPC);
+case SYMBOL_REF:
+  /* TLS symbols are handled in different place.  */
+  if (SYMBOL_REF_TLS_MODEL (addr))
+   return addr;
 
   /* This symbol must be referenced via a load from the Global
 Offset Table (@GOTPC).  */
-  pat = arc_unspec_offset (addr, ARC_UNSPEC_GOT);
-  pat = gen_const_mem (Pmode, pat);
-
-  if (oldx == NULL)
-   oldx = gen_reg_rtx (Pmode);
-
-  emit_move_insn (oldx, pat);
-  pat = oldx;
-}
-  else
-{
-  if (GET_CODE (addr) == CONST)
-   {
- addr = XEXP (addr, 0);
- if (GET_CODE (addr) == UNSPEC)
-   {
- /* Check that the unspec is one of the ones we generate?  */
- return orig;
-   }
- /* fwprop is placing in the REG_EQUIV notes constant pic
-unspecs expressions.  Then, loop may use these notes for
-optimizations resulting in complex patterns that are not
-supported by the current implementation. The following
-two if-cases are simplifying the complex patters to
-simpler ones.  */
- else if (GET_CODE (addr) == MINUS)
-   {
- rtx op0 = XEXP (addr, 0);
- rtx op1 = XEXP (addr, 1);
- gcc_assert (oldx);
- gcc_assert (GET_CODE (op1) == UNSPEC);
-
- emit_move_insn (oldx,
- gen_rtx_CONST (SImode,
-arc_legitimize_pic_address (op1,
-   NULL_RTX)));
- emit_insn (gen_rtx_SET (oldx, gen_rtx_MINUS (SImode, op0, oldx)));
- return oldx;
-
-   }
- else if (GET_CODE (addr) != PLUS)
-   {
- rtx tmp = XEXP (addr, 0);
- enum rtx_code code = GET_CODE (addr);
-
- /* It only works for UNARY operations.  */
- gcc_assert (UNARY_P (addr));
- gcc_assert (GET_CODE (tmp) == UNSPEC);
- gcc_assert (oldx);
-
- emit_move_insn
-   (oldx,
-gen_rtx_CONST (SImode,
-   arc_legitimize_pic_address (tmp,
-   NULL_RTX)));
-
- emit_insn (gen_rtx_SET (oldx,
- gen_rtx_fmt_ee (code, SImode,
-   

Re: [PATCH] Simplify mult where both arguments are 0 or 1 (PR tree-optimization/87954).

2019-06-06 Thread Richard Biener
On Wed, Jun 5, 2019 at 3:35 PM Martin Liška  wrote:
>
> On 6/5/19 3:04 PM, Richard Biener wrote:
> > On Wed, Jun 5, 2019 at 2:09 PM Martin Liška  wrote:
> >>
> >> On 6/5/19 1:13 PM, Richard Biener wrote:
> >>> On Wed, Jun 5, 2019 at 12:56 PM Martin Liška  wrote:
> 
>  Hi.
> 
>  I'm suggesting one multiplication simplification pattern.
> 
>  Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
>  Ready to be installed?
> >>>
> >>> +  (if (INTEGRAL_TYPE_P (type)
> >>> +   && wi::eq_p (get_nonzero_bits (@1), wi::one (TYPE_PRECISION (type)))
> >>> +   && wi::eq_p (get_nonzero_bits (@2), wi::one (TYPE_PRECISION (type
> >>>
> >>>   && wi::eq_p (wi::bit_or (get_nonzero_bits (@1), get_nonzero_bits (@2)), 1))
> >>>
> >>> (I think literal 1 still works)?
> >>
> >> Yep, I can confirm that.
> >>
> >>> How does it behave for signed/unsigned 1-bit
> >>> bitfields?  A gimple testcase may be necessary to see.
> >>
> >> Can we really have a mult that will have a bitfield type?
> >
> > As said you probably need a GIMPLE testcase to avoid
> > promoting to int.  Oh, and that doesn't work yet because
> > we cannot "parse" bit-precision types for temporaries.
> >
> > struct X { int a : 1; int b : 1; };
> >
> > int foo (struct X *p)
> > {
> >   return p->a;
> > }
> >
> > produces
> >
> > int __GIMPLE (ssa)
> > foo (struct X * p)
> > {
> >   int D_1913;
> >_1;
> > ...
> >
> > we have similar issues with dumping of vector types but
> > there at least one can use a typedef and manual editing.
> > For bit-precision types we need to invent a "C" extension
> > (thus also for vectors).
> >
> > Anyway...
>
> I see, I'm sending an updated version of the patch I've just been testing.
> It addresses Richard Sandiford's note.
>
> May I install it after testing?

Yes.

Thanks,
Richard.

> >
> >> $ cat gcc/testsuite/gcc.dg/pr87954-2.c
> >> #define __GFP_DMA 1u
> >> #define __GFP_RECLAIM 0x10u
> >>
> >> struct bt
> >> {
> >>   unsigned int v:1;
> >> };
> >>
> >> unsigned int
> >> imul(unsigned int flags)
> >> {
> >>   struct bt is_dma, is_rec;
> >>
> >>   is_dma.v = !!(flags & __GFP_DMA);
> >>   is_rec.v = !!(flags & __GFP_RECLAIM);
> >>
> >>   return is_rec.v * !is_dma.v;
> >> }
> >>
> >> $ ./xgcc -B. /home/marxin/Programming/gcc/gcc/testsuite/gcc.dg/pr87954-2.c -fdump-tree-optimized=/dev/stdout -O2
> >>
> >> ;; Function imul (imul, funcdef_no=0, decl_uid=1909, cgraph_uid=1, symbol_order=0)
> >>
> >> imul (unsigned int flags)
> >> {
> >>   struct bt is_dma;
> >>   _Bool _1;
> >>   unsigned int _2;
> >>   _Bool _3;
> >>   unsigned char _4;
> >>   _Bool _6;
> >>   unsigned int _9;
> >>_11;
> >>   unsigned char _14;
> >>
> >>[local count: 1073741824]:
> >>   _1 = (_Bool) flags_7(D);
> >>   _2 = flags_7(D) & 16;
> >>   _3 = _2 != 0;
> >>   is_dma.v = _1;
> >>   _4 = BIT_FIELD_REF ;
> >>   _14 = ~_4;
> >>   _6 = (_Bool) _14;
> >>   _11 = _3 & _6;
> >>   _9 = (unsigned int) _11;
> >>   is_dma ={v} {CLOBBER};
> >>   return _9;
> >> }
> >>
> >>>
> >>> Does this mean we want to turn plus into bit_ior when
> >>> get_nonzero_bits() & get_nonzero_bits() is zero?
> >>
> >> That's quite interesting transformation, I'll add it as a follow up patch.
> >
> > I was just curious - maybe we should do the reverse instead?
> > For mult vs. bit-and I think the latter will be "faster" (well, probably not
> > even that...).  But for plus vs or?
>
> Hmm, expected speed up will be probably very small.
>
> Martin
>
> >
> >
> >>>
> >>> X * [0, 1] -> X & sign-extend-from-bit-1 also works I guess, but multiplication
> >>> looks more canonical.
> >> Ok here.
> >>
> >> Martin
> >>
> >>>
> >>> Thanks,
> >>> Richard.
> >>>
>  Thanks,
>  Martin
> 
>  gcc/ChangeLog:
> 
>  2019-06-05  Martin Liska  
> 
>  PR tree-optimization/87954
>  * match.pd: Simplify mult where both arguments are 0 or 1.
> 
>  gcc/testsuite/ChangeLog:
> 
>  2019-06-05  Martin Liska  
> 
>  PR tree-optimization/87954
>  * gcc.dg/pr87954.c: New test.
>  ---
>   gcc/match.pd   |  8 
>   gcc/testsuite/gcc.dg/pr87954.c | 21 +
>   2 files changed, 29 insertions(+)
>   create mode 100644 gcc/testsuite/gcc.dg/pr87954.c
> 
> 
> >>
>
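
Note (illustrative sketch, not code from the patch): the two identities
discussed in this thread can be checked in plain C. If the possibly-set bits
of both operands are confined to bit 0, each operand is 0 or 1 and the
product equals the bitwise AND. If the possibly-set bits of two operands are
disjoint, the sum equals the bitwise OR. The masks nz1 and nz2 model what
get_nonzero_bits returns; all function names are hypothetical, and the
assumption that the pattern rewrites the multiplication as an AND is taken
from the "mult vs. bit-and" remark in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* The check suggested in the review: the union of possibly-set bits is
   exactly bit 0.  Two operands that are both known to be zero give 0
   here and do not match; that case folds to a constant anyway.  */
static bool
both_zero_or_one_p (uint32_t nz1, uint32_t nz2)
{
  return (nz1 | nz2) == 1;
}

/* For operands restricted to {0, 1}, a * b equals a & b.  */
static uint32_t
mult_as_and (uint32_t a, uint32_t b)
{
  assert (a <= 1 && b <= 1);
  return a & b;
}

/* No bit can be set in both operands.  */
static bool
disjoint_bits_p (uint32_t nz1, uint32_t nz2)
{
  return (nz1 & nz2) == 0;
}

/* With disjoint bits no carry can occur, so a + b equals a | b.  */
static uint32_t
plus_as_ior (uint32_t a, uint32_t b)
{
  assert ((a & b) == 0);
  return a | b;
}
```

Checking all four combinations of 0 and 1 covers the multiplication case
exhaustively; the addition case holds for any pair that passes
disjoint_bits_p.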


[PATCH] [ARC][DOC] Add documentation naked, ilink and firq

2019-06-06 Thread Claudiu Zissulescu
Hi Sandra, Andrew

Please find a small patch on the documentation which adds info about naked, 
ilink and firq function attributes.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* doc/extend.texi (ARC Function Attributes): Update info.
---
 gcc/doc/extend.texi | 14 +-
 1 file changed, 13 insertions(+), 1 deletion(-)
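
Note (illustrative sketch, not part of the patch): the documented
attributes could be used as shown below. The attribute names and the
"ilink" and "firq" arguments come from the documentation text in the diff.
The wrapper macros, the handler names, the counter, and the "j_s [blink]"
return sequence are assumptions made for the example. The macros expand to
nothing on a non-ARC host so the sketch still compiles there.

```c
#if defined (__arc__)
# define ARC_NAKED      __attribute__ ((naked))
# define ARC_IRQ(kind)  __attribute__ ((interrupt (kind)))
#else
/* Fallback so the sketch also builds on a non-ARC host.  */
# define ARC_NAKED
# define ARC_IRQ(kind)
#endif

static volatile unsigned tick_count;

/* ARCv2 regular interrupt handler.  */
void ARC_IRQ ("ilink")
timer_isr (void)
{
  tick_count++;
}

/* ARCv2 fast interrupt handler.  */
void ARC_IRQ ("firq")
fast_isr (void)
{
  tick_count += 2;
}

/* Naked function: no prologue or epilogue is generated, so the body
   uses only basic asm and must return by itself.  */
void ARC_NAKED
bare_return (void)
{
#if defined (__arc__)
  __asm__ ("j_s [blink]");   /* Assumed ARC return sequence.  */
#endif
}
```

For ARCv1 targets the same pattern would use "ilink1" or "ilink2" as the
interrupt argument, as stated in the updated text.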

diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e8563fd0803..c2e675afa0f 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -4139,7 +4139,8 @@ void f () __attribute__ ((interrupt ("ilink1")));
 @end smallexample
 
 Permissible values for this parameter are: @w{@code{ilink1}} and
-@w{@code{ilink2}}.
+@w{@code{ilink2}} for ARCv1 architecture, and @w{@code{ilink}} and
+@w{@code{firq}} for ARCv2 architecture.
 
 @item long_call
 @itemx medium_call
@@ -4182,7 +4183,17 @@ This attribute allows one to mark secure-code functions that are
 callable from normal mode.  The location of the secure call function
 into the @code{sjli} table needs to be passed as argument.
 
+@item naked
+@cindex @code{naked} function attribute, ARC
+This attribute allows the compiler to construct the requisite function
+declaration, while allowing the body of the function to be assembly
+code.  The specified function will not have prologue/epilogue sequences
+generated by the compiler.  Only basic @code{asm} statements can safely
+be included in naked functions (@pxref{Basic Asm}).  While using
+extended @code{asm} or a mixture of basic @code{asm} and C code may
+appear to work, they cannot be depended upon to work reliably and are
+not supported.
+
 @end table
 
 @node ARM Function Attributes
-- 
2.21.0



Re: [0/3] Improve debug info for addressable vars

2019-06-06 Thread Richard Biener
On Wed, Jun 5, 2019 at 4:30 PM Richard Sandiford
 wrote:
>
> Richard Biener  writes:
> > On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
> >  wrote:
> >>
> >> Taking the address of a variable stops us doing var-tracking on it,
> >> so that we just use the DECL_RTL instead.  This can easily cause wrong
> >> debug info for regions of code that would have had correct debug info
> >> if the variable weren't addressable.  E.g.:
> >>
> >> {
> >>   int base;
> >>   get_start (&base);
> >>   x[i1] = base;
> >>   base += 1; // No need to store this
> >>   x[i2] = base; // ...so the debug info for "base" is wrong here
> >> }
> >>
> >> or (the motivating example):
> >>
> >> {
> >>   int base;
> >>   get_start (&base);
> >>   for (int i = 0; i < n; ++i)
> >> {
> >>   x[i] = base;
> >>   base += y[i]; // Can apply LSM here, so the debug info for "base"
> >> // in the loop is wrong
> >> }
> >>   consume (&base);
> >> }
> >>
> >> This patch series lets us use the DECL_RTL location for some parts of a
> >> variable's lifetime and debug-bind locations for other parts:
> >>
> >> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
> >>holds until overridden.
> >>
> >> 2) RTL does the same thing using:
> >>
> >>  (var_location VAR (decl_rtl_ref VAR))
> >>
> >>where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
> >>by reference rather than by value.
> >>
> >>We can't just use "(var_location VAR (mem X))" for this, because
> >>that would bind VAR to the value that (mem X) has at that exact point.
> >>VAR would therefore get reset by any possible change to (mem X),
> >>whereas here we want it to track (possibly indirect) updates instead.
> >>
> >> 3) The gimplifier decides which variables should get the new treatment
> >>and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
> >>Clobbers continue to mark the end of VAR's lifetime.
> >>
> >> 4) Stores to VAR implicitly reestablish the link between VAR and its
> >>DECL_RTL.  This is simpler (and IMO more robust) than inserting an
> >>explicit "VAR s=> VAR" at every write.
> >>
> >> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
> >>falling back to a "VAR => NULL" reset if that fails.
> >>
> >> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
> >> and patch 3 uses it for LSM.
> >
> > So I wonder how it handles
> >
> > void __attribute__((noinline)) foo(int *p) { *p = 42; }
> > int x;
> > int main()
> > {
> >   int base = 1;
> >   foo (&base);
> >   base = 2;
> >   *(x ? &x : &base) = 1; // (*)
> >   return 0;
> > }
> >
> > here we DSE the base = 2 store leaving a
> >
> > # DEBUG base = 2
> >
> > stmt?  But there's an indirect store that also stores
> > to base - what will the debug info say at/after (*)?  Will
> > it claim that base is 2?  At least I do not see that
> > the connection with bases DECL_RTL is re-established?
>
> Yeah, true.
>
> > There's a clobber of base before return 0 so you eventually
> > have to add some dummy stmt you can print base after
> > the indirect store.
> >
> > That said, doesn't "aliasing" create another source of wrong-debug
> > with your approach that might be even worse?
>
> Not sure about even worse, but maybe different.  In the example above
> the patches fix the debug info after "base = 2" but break it after the
> following statement.
>
> But there's no real need for the compiler to store to base in (*) either.

Indeed, partial dead code elimination could sink the store into both arms
and then remove the store in one of them.

> We could end up with "if (...) x = 1;" instead.  So AFAICT there's no
> guarantee that we'll get correct debug info at the return statement even
> as things stand.
>
> For memory variables, I think we're always at the mercy of dead stores
> being optimised away, and the patch isn't trying to fix that.

Hmm, but you _do_ insert the debug stmts when we remove stores...

>  Since
> both writes to base are dead in the above, I don't think we can guarantee
> correct debug info without compromising optimisation for the sake of
> debuggability.  (FWIW, I have a WIP patch to add an option for that,
> hope to post an RFC soon.)
>
> I can't think of a case in which the patches introduce wrong debug
> info for code that isn't dead.

All the recent slew of wrong-debug bugs were exactly for dead code...
I don't think that the difference of dead store vs dead SSA assignment
is different from the user perception.  Now - I think that var-tracking
could fix things up here once it sees aliasing stores.  The important
difference to before is that now it suddenly deals with variables that
are aliased while before that could not happen (besides spills/reloads).
Can you quickly see what it would take to take this into account?
For my testcase the debugger should say "optimized out" because
we can't tell whether the store will actually touch 'base', so referring
to its memory location

[PATCH 2/2] [ARC] Update RTX costs.

2019-06-06 Thread Claudiu Zissulescu
Update RTX costs to better reflect the ARC architecture.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc.c (arc_rtx_costs): Update costs.

gcc/testsuite/
-xx-xx  Claudiu Zissulescu  

* gcc.target/arc/jumptable.c: Update test.
---
 gcc/config/arc/arc.c | 69 +---
 gcc/testsuite/gcc.target/arc/jumptable.c |  2 +-
 2 files changed, 40 insertions(+), 31 deletions(-)
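
Note (illustrative sketch, not part of the patch): two of the cost rules
visible in the diff can be modeled in isolation. A symbolic constant costs
one instruction when optimizing for speed and four when optimizing for
size, and an integer constant that needs no long immediate is free only
when optimizing for size. COSTS_N_INSNS uses the same scaling as GCC's
rtl.h. The toy_ functions and fits_u6 are hypothetical, and only the
unsigned 6-bit case of the "no long immediate" test is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same scaling GCC uses for COSTS_N_INSNS.  */
#define COSTS_N_INSNS(N) ((N) * 4)

/* Unsigned 6-bit immediate, modeled on ARC's UNSIGNED_INT6.  */
static bool
fits_u6 (long value)
{
  return value >= 0 && value < 64;
}

/* Cost of CONST, LABEL_REF or SYMBOL_REF after the patch.  */
static int
toy_symbol_cost (bool speed)
{
  return speed ? COSTS_N_INSNS (1) : COSTS_N_INSNS (4);
}

/* Cost of an integer constant.  Sets *TOTAL and returns true only for
   the case shown in the diff (no long immediate, optimizing for size).
   The remaining cases are outside the excerpt and are not modeled.  */
static bool
toy_const_int_cost (long value, bool speed, int *total)
{
  assert (total != NULL);
  bool nolimm = fits_u6 (value);
  if (nolimm && !speed)
    {
      *total = 0;
      return true;
    }
  return false;
}
```

The size-versus-speed split is the design point: a long immediate costs
code size, so it is penalized only when size is what is being optimized.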

diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index f398c4a0086..20dfae66370 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -5569,41 +5569,39 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
 case CONST_INT:
   {
bool nolimm = false; /* Can we do without long immediate?  */
-   bool fast = false; /* Is the result available immediately?  */
-   bool condexec = false; /* Does this allow conditiobnal execution?  */
-   bool compact = false; /* Is a 16 bit opcode available?  */
-   /* CONDEXEC also implies that we can have an unconditional
-  3-address operation.  */
 
-   nolimm = compact = condexec = false;
+   nolimm = false;
if (UNSIGNED_INT6 (INTVAL (x)))
- nolimm = condexec = compact = true;
+ nolimm = true;
else
  {
-   if (SMALL_INT (INTVAL (x)))
- nolimm = fast = true;
switch (outer_code)
  {
  case AND: /* bclr, bmsk, ext[bw] */
if (satisfies_constraint_Ccp (x) /* bclr */
|| satisfies_constraint_C1p (x) /* bmsk */)
- nolimm = fast = condexec = compact = true;
+ nolimm = true;
break;
  case IOR: /* bset */
if (satisfies_constraint_C0p (x)) /* bset */
- nolimm = fast = condexec = compact = true;
+ nolimm = true;
break;
  case XOR:
if (satisfies_constraint_C0p (x)) /* bxor */
- nolimm = fast = condexec = true;
+ nolimm = true;
break;
+ case SET:
+   if (UNSIGNED_INT8 (INTVAL (x)))
+ nolimm = true;
+   if (satisfies_constraint_Chi (x))
+ nolimm = true;
+   if (satisfies_constraint_Clo (x))
+ nolimm = true;
  default:
break;
  }
  }
-   /* FIXME: Add target options to attach a small cost if
-  condexec / compact is not true.  */
-   if (nolimm)
+   if (nolimm && !speed)
  {
*total = 0;
return true;
@@ -5616,7 +5614,7 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
 case CONST:
 case LABEL_REF:
 case SYMBOL_REF:
-  *total = COSTS_N_INSNS (1);
+  *total = speed ? COSTS_N_INSNS (1) : COSTS_N_INSNS (4);
   return true;
 
 case CONST_DOUBLE:
@@ -5642,16 +5640,10 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
 case LSHIFTRT:
   if (TARGET_BARREL_SHIFTER)
{
- /* If we want to shift a constant, we need a LIMM.  */
- /* ??? when the optimizers want to know if a constant should be
-hoisted, they ask for the cost of the constant.  OUTER_CODE is
-insufficient context for shifts since we don't know which operand
-we are looking at.  */
  if (CONSTANT_P (XEXP (x, 0)))
{
- *total += (COSTS_N_INSNS (2)
-+ rtx_cost (XEXP (x, 1), mode, (enum rtx_code) code,
-0, speed));
+ *total += rtx_cost (XEXP (x, 1), mode, (enum rtx_code) code,
+ 0, speed);
  return true;
}
  *total = COSTS_N_INSNS (1);
@@ -5671,7 +5663,13 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
 
 case DIV:
 case UDIV:
-  if (speed)
+  if (GET_MODE_CLASS (mode) == MODE_FLOAT
+ && (TARGET_FP_SP_SQRT || TARGET_FP_DP_SQRT))
+   *total = COSTS_N_INSNS(1);
+  else if (GET_MODE_CLASS (mode) == MODE_INT
+  && TARGET_DIVREM)
+   *total = COSTS_N_INSNS(1);
+  else if (speed)
*total = COSTS_N_INSNS(30);
   else
*total = COSTS_N_INSNS(1);
@@ -5684,19 +5682,28 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
*total= arc_multcost;
   /* We do not want synth_mult sequences when optimizing
 for size.  */
-  else if (TARGET_MUL64_SET || TARGET_ARC700_MPY)
+  else if (TARGET_ANY_MPY)
*total = COSTS_N_INSNS (1);
   else
*total = COSTS_N_INSNS (2);
   return false;
+
 case PLUS:
+  if (outer_code == MEM && CONST_INT_P (XEXP (x, 1))
+ && RTX_OK_FOR_OFFSET_P (mode, XEXP (x, 1)))
+   {
+ *total = 0;
+ return true;
+   }
+
   if ((GET_CODE (XEXP (x, 0)) == ASH

[PATCH 1/2] [ARC] Improve code gen when compiling for size

2019-06-06 Thread Claudiu Zissulescu
When optimizing for size, try to avoid long immediates by
employing alternative (short) instructions.

Ok to apply?
Claudiu

gcc/
-xx-xx  Claudiu Zissulescu  

* config/arc/arc-protos.h (arc_check_ior_const): Declare.
(arc_split_ior): Likewise.
(arc_check_mov_const): Likewise.
(arc_split_mov_const): Likewise.
* config/arc/arc.c (arc_print_operand): Fix 'z' letter.
(arc_rtx_costs): Replace check Crr with Cax constraint.
(prepare_move_operands): Cleanup, remove unused code.
(arc_split_ior): New function.
(arc_check_ior_const): Likewise.
(arc_split_mov_const): Likewise.
(arc_check_mov_const): Likewise.
* config/arc/arc.md (movsi_insn): Restructure it, and convert it
in define_insn_and_split pattern.
(iorsi3): Likewise.
(mulsi3_v2): Add new matching variant.
(andsi3_i): Cleanup pattern.
(rotrsi3_cnt1): Update pattern.
(rotrsi3_cnt8): New pattern.
(ashlsi2_cnt8): Likewise.
(ashlsi2_cnt16): Likewise.
* config/arc/constraints.md (C0p): Update constraint.
(Crr): Remove it.
(C0x): New pattern.
(Cax): New pattern.

testsuite/
-xx-xx  Claudiu Zissulescu  

* gcc.target/arc/and-cnst-size.c: New test.
* gcc.target/arc/mov-cnst-size.c: Likewise.
* gcc.target/arc/or-cnst-size.c: Likewise.
* gcc.target/arc/store-merge-1.c: Update test.
* gcc.target/arc/arc700-stld-hazard.c: Likewise.
* gcc.target/arc/cmem-1.c: Likewise.
* gcc.target/arc/cmem-2.c: Likewise.
* gcc.target/arc/cmem-3.c: Likewise.
* gcc.target/arc/cmem-4.c: Likewise.
* gcc.target/arc/cmem-5.c: Likewise.
* gcc.target/arc/cmem-6.c: Likewise.
* gcc.target/arc/loop-4.c: Likewise.
* gcc.target/arc/movh_cl-1.c: Likewise.
* gcc.target/arc/sdata-3.c: Likewise.
---
 gcc/config/arc/arc-protos.h   |   4 +
 gcc/config/arc/arc.c  | 222 +++---
 gcc/config/arc/arc.md | 215 ++---
 gcc/config/arc/constraints.md |  22 +-
 gcc/testsuite/gcc.target/arc/and-cnst-size.c  |  16 ++
 .../gcc.target/arc/arc700-stld-hazard.c   |   4 +-
 gcc/testsuite/gcc.target/arc/cmem-1.c |   6 +-
 gcc/testsuite/gcc.target/arc/cmem-2.c |   6 +-
 gcc/testsuite/gcc.target/arc/cmem-3.c |   6 +-
 gcc/testsuite/gcc.target/arc/cmem-4.c |   6 +-
 gcc/testsuite/gcc.target/arc/cmem-5.c |   6 +-
 gcc/testsuite/gcc.target/arc/cmem-6.c |   6 +-
 gcc/testsuite/gcc.target/arc/loop-4.c |   3 +-
 gcc/testsuite/gcc.target/arc/mov-cnst-size.c  |  42 
 gcc/testsuite/gcc.target/arc/movh_cl-1.c  |   2 +-
 gcc/testsuite/gcc.target/arc/or-cnst-size.c   |  16 ++
 gcc/testsuite/gcc.target/arc/sdata-3.c|  18 +-
 gcc/testsuite/gcc.target/arc/store-merge-1.c  |   2 +-
 18 files changed, 449 insertions(+), 153 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/arc/and-cnst-size.c
 create mode 100644 gcc/testsuite/gcc.target/arc/mov-cnst-size.c
 create mode 100644 gcc/testsuite/gcc.target/arc/or-cnst-size.c
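
Note (illustrative sketch, not part of the patch): the 'z' operand fix
masks the constant before taking its base-2 logarithm. The model below
shows why that matters when a 32-bit constant is held sign-extended in a
64-bit host integer: without the mask, a value with only bit 31 set is not
a power of two. The function names are hypothetical, and the 0xffffffff
mask is an assumption, since the mask literal is elided in the diff above.

```c
#include <stdint.h>

/* Returns log2 of X if X is a power of two, otherwise -1.  This is the
   same contract as GCC's exact_log2.  */
static int
toy_exact_log2 (uint64_t x)
{
  if (x == 0 || (x & (x - 1)) != 0)
    return -1;
  int n = 0;
  while (x >>= 1)
    n++;
  return n;
}

/* Bit index of a single-bit 32-bit constant stored in a 64-bit host
   integer.  The mask drops the sign-extension bits first.  */
static int
toy_bit_index_32 (int64_t value)
{
  return toy_exact_log2 ((uint64_t) value & UINT64_C (0xffffffff));
}
```

Passing INT32_MIN (bit 31 set, sign-extended to 64 bits) distinguishes the
two variants: the unmasked lookup fails while the masked one yields 31.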

diff --git a/gcc/config/arc/arc-protos.h b/gcc/config/arc/arc-protos.h
index ac0de6b2874..f501bc30ee7 100644
--- a/gcc/config/arc/arc-protos.h
+++ b/gcc/config/arc/arc-protos.h
@@ -48,6 +48,10 @@ extern bool arc_is_uncached_mem_p (rtx);
 extern bool gen_operands_ldd_std (rtx *operands, bool load, bool commute);
 extern bool arc_check_multi (rtx, bool);
 extern void arc_adjust_reg_alloc_order (void);
+extern bool arc_check_ior_const (HOST_WIDE_INT );
+extern void arc_split_ior (rtx *);
+extern bool arc_check_mov_const (HOST_WIDE_INT );
+extern bool arc_split_mov_const (rtx *);
 #endif /* RTX_CODE */
 
 extern unsigned int arc_compute_frame_size (int);
diff --git a/gcc/config/arc/arc.c b/gcc/config/arc/arc.c
index b49f2539408..f398c4a0086 100644
--- a/gcc/config/arc/arc.c
+++ b/gcc/config/arc/arc.c
@@ -4232,7 +4232,7 @@ arc_print_operand (FILE *file, rtx x, int code)
 
 case 'z':
   if (GET_CODE (x) == CONST_INT)
-   fprintf (file, "%d",exact_log2(INTVAL (x)) );
+   fprintf (file, "%d",exact_log2 (INTVAL (x) & 0x));
   else
output_operand_lossage ("invalid operand to %%z code");
 
@@ -5597,9 +5597,6 @@ arc_rtx_costs (rtx x, machine_mode mode, int outer_code,
if (satisfies_constraint_C0p (x)) /* bxor */
  nolimm = fast = condexec = true;
break;
- case SET:
-   if (satisfies_constraint_Crr (x)) /* ror b,u6 */
- nolimm = true;
  default:
break;
  }
@@ -9088,31 +9085,6 @@ prepare_move_operands (rtx *operands, machine_mode mode)
  MEM_COPY_ATTRIBUTES (pat, operands[0]);
  operands[0] = pat;
}
-  if (!cse_not_expected)
-   {
- rtx pat = XEX

Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-06 Thread Richard Biener
On Wed, Jun 5, 2019 at 10:03 PM Eric Botcazou  wrote:
>
> > This issue exists, not just for targets that can have their
> > MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
> > for the 'bit-container' targets where it *can't* be set higher.
> >
> > Let's please DTRT and correct the code here in the middle-end,
> > so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
> >  typedef int v4si __attribute__ ((vector_size (1 << 29)));
> > (all listed targets happen to have Pmode == SImode)
> >
> > So, considering that: ok to commit?
>
> You'd need to audit the effects on other targets though.  Are we sure that we
> want to do bitsizetype calculations in a larger mode on very embedded targets?

I didn't actually write it down but originally wanted - what about adding
a way for the target to specify what type to use for bitsize_type?
We do have SIZETYPE so say that if BITSIZETYPE is defined then
use that (otherwise fall back to the existing mechanism)?  There may
not be a C type that maps to DImode for cris and it's not that
I like those C type names very much (probably a way to make the
macros independent of the chosen multilib?), so eventually a
BITSIZEMODE or BITSIZE_LARGER_THAN_MAX_FIXED_MODE_SIZE
macro?  That said, if BITSIZETYPE would work I'd prefer that just
for consistency.

Richard.

> --
> Eric Botcazou
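
Note (illustrative sketch, not code from the thread): the testcase line
quoted above asks for a vector of 1 << 29 bytes, whose size in bits is
2^32. The model below shows that this value does not fit a 32-bit bit-size
type but does fit a 64-bit one. The 32-bit width is an assumption chosen
for illustration, and the function names are hypothetical.

```c
#include <stdint.h>

#define BITS_PER_UNIT 8

/* Size in bits computed in a 32-bit type: wraps modulo 2^32.  */
static uint32_t
size_in_bits_32 (uint32_t bytes)
{
  return bytes * BITS_PER_UNIT;
}

/* Size in bits computed in a 64-bit type: no wrap for 32-bit inputs.  */
static uint64_t
size_in_bits_64 (uint32_t bytes)
{
  return (uint64_t) bytes * BITS_PER_UNIT;
}
```

For 1 << 29 bytes the 32-bit computation yields 0, which is the kind of
silently wrong size that motivates a wider bit-size type.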


[PATCH] Add warn_unused_result for malloc-like functions (PR tree-optimization/78902).

2019-06-06 Thread Martin Liška
Hi.

The patch adds the warn_unused_result attribute to malloc-like functions.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
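
Note (illustrative sketch, not part of the patch): the same attribute can
be attached to a user-defined allocator. The wrapper dup_string below is
hypothetical; it combines the malloc and warn_unused_result attributes so
that a call whose result is discarded draws a warning from the compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical malloc-like helper: returns a fresh copy of S, or NULL
   if the allocation fails.  Ignoring the result would leak memory, so
   callers are warned when they discard it.  */
__attribute__ ((malloc, warn_unused_result))
char *
dup_string (const char *s)
{
  assert (s != NULL);
  size_t n = strlen (s) + 1;
  char *copy = malloc (n);
  if (copy)
    memcpy (copy, s, n);
  return copy;
}
```

A statement such as "dup_string ("abc");" on its own would then be
diagnosed, while assigning the result to a variable is accepted silently.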

gcc/ChangeLog:

2019-06-06  Martin Liska  

PR tree-optimization/78902
* builtin-attrs.def (ATTR_WARN_UNUSED_RESULT): New.
(ATTR_MALLOC_NOTHROW_LEAF_LIST): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
(ATTR_ALLOC_SIZE_2_NOTHROW_LIST): Remove.
(ATTR_MALLOC_SIZE_1_NOTHROW_LEAF_LIST): Remove.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LIST): New.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
(ATTR_ALLOCA_SIZE_1_NOTHROW_LEAF_LIST): Remove.
(ATTR_ALLOCA_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_SIZE_1_2_NOTHROW_LEAF_LIST):  Remove.
(ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST):
New.
(ATTR_ALLOC_SIZE_2_NOTHROW_LEAF_LIST): Remove.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_NOTHROW_NONNULL): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
(ATTR_MALLOC_NOTHROW_NONNULL_LEAF): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
* builtins.def (BUILT_IN_ALIGNED_ALLOC): Change to use
warn_unused_result attribute.
(BUILT_IN_STRDUP): Likewise.
(BUILT_IN_STRNDUP): Likewise.
(BUILT_IN_ALLOCA): Likewise.
(BUILT_IN_CALLOC): Likewise.
(BUILT_IN_MALLOC): Likewise.
(BUILT_IN_REALLOC): Likewise.

gcc/testsuite/ChangeLog:

2019-06-06  Martin Liska  

PR tree-optimization/78902
* c-c++-common/asan/alloca_loop_unpoisoning.c: Use result
of __builtin_alloca.
* c-c++-common/asan/pr88619.c: Likewise.
* g++.dg/overload/using2.C: Likewise for malloc.
* gcc.dg/attr-alloc_size-5.c: Add new dg-warning.
* gcc.dg/nonnull-3.c: Use result of __builtin_strdup.
* gcc.dg/pr43643.c: Likewise.
* gcc.dg/pr59717.c: Likewise for calloc.
* gcc.dg/torture/pr71816.c: Likewise.
* gcc.dg/tree-ssa/pr78886.c: Likewise.
* gcc.dg/tree-ssa/pr79697.c: Likewise.
* gcc.dg/pr78902.c: New test.
---
 gcc/builtin-attrs.def | 37 ---
 gcc/builtins.def  | 14 +++
 .../asan/alloca_loop_unpoisoning.c|  2 +-
 gcc/testsuite/c-c++-common/asan/pr88619.c |  2 +-
 gcc/testsuite/g++.dg/overload/using2.C|  2 +-
 gcc/testsuite/gcc.dg/attr-alloc_size-5.c  |  2 +-
 gcc/testsuite/gcc.dg/nonnull-3.c  |  4 +-
 gcc/testsuite/gcc.dg/pr43643.c|  6 +--
 gcc/testsuite/gcc.dg/pr59717.c|  8 ++--
 gcc/testsuite/gcc.dg/pr78902.c| 14 +++
 gcc/testsuite/gcc.dg/torture/pr71816.c|  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr78886.c   |  2 +-
 gcc/testsuite/gcc.dg/tree-ssa/pr79697.c   |  6 +--
 13 files changed, 62 insertions(+), 39 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/pr78902.c


diff --git a/gcc/builtin-attrs.def b/gcc/builtin-attrs.def
index 204141f6c53..39d1395f42a 100644
--- a/gcc/builtin-attrs.def
+++ b/gcc/builtin-attrs.def
@@ -118,6 +118,7 @@ DEF_ATTR_IDENT (ATTR_TM_REGPARM, "*tm regparm")
 DEF_ATTR_IDENT (ATTR_TM_TMPURE, "transaction_pure")
 DEF_ATTR_IDENT (ATTR_RETURNS_TWICE, "returns_twice")
 DEF_ATTR_IDENT (ATTR_RETURNS_NONNULL, "returns_nonnull")
+DEF_ATTR_IDENT (ATTR_WARN_UNUSED_RESULT, "warn_unused_result")
 
 DEF_ATTR_TREE_LIST (ATTR_NOVOPS_LIST, ATTR_NOVOPS, ATTR_NULL, ATTR_NULL)
 
@@ -157,8 +158,10 @@ DEF_ATTR_TREE_LIST (ATTR_CONST_NORETURN_NOTHROW_LEAF_COLD_LIST, ATTR_COLD,\
 			ATTR_NULL, ATTR_CONST_NORETURN_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_MALLOC_NOTHROW_LIST, ATTR_MALLOC,	\
 			ATTR_NULL, ATTR_NOTHROW_LIST)
-DEF_ATTR_TREE_LIST (ATTR_MALLOC_NOTHROW_LEAF_LIST, ATTR_MALLOC,	\
+DEF_ATTR_TREE_LIST (ATTR_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST, ATTR_WARN_UNUSED_RESULT,	\
 			ATTR_NULL, ATTR_NOTHROW_LEAF_LIST)
+DEF_ATTR_TREE_LIST (ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST, ATTR_MALLOC,	\
+			ATTR_NULL, ATTR_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST)
 DEF_ATTR_TREE_LIST (ATTR_SENTINEL_NOTHROW_LIST, ATTR_SENTINEL,	\
 			ATTR_NULL, ATTR_NOTHROW_LIST)
 DEF_ATTR_TREE_LIST (ATTR_SENTINEL_NOTHROW_LEAF_LIST, ATTR_SENTINEL,	\
@@ -170,24 +173,26 @@ DEF_ATTR_TREE_LIST (ATTR_COLD_CONST_NORETURN_NOTHROW_LEAF_LIST, ATTR_CONST,\
with _SIZE_1, or second argument with _SIZE_2, specifies the size
of the allocated object.  */
 DEF_ATTR_TREE_LIST (ATTR_MALLOC_SIZE_1_NOTHROW_LIST, ATTR_ALLOC_SIZE,	\
-			ATTR_LIST_1, ATTR_MALLO

Re: Negative arguments in OpenMP 'num_threads' clause etc.

2019-06-06 Thread Thomas Schwinge
Hi Jakub!

On Wed, 29 May 2019 16:52:45 +0200, Jakub Jelinek  wrote:
> On Wed, May 29, 2019 at 04:42:14PM +0200, Thomas Schwinge wrote:
> > On Tue, 09 Apr 2019 17:51:46 +0200, I wrote:
> > > On Tue, 29 Nov 2016 17:47:08 -0800, Cesar Philippidis wrote:
> > > > One notable difference between the trunk and gomp4 implementation of the
> > > > tile clause is that gomp4 errors on negative value tile arguments,
> > > > whereas trunk issues warnings.

> > > > Is there a reason why the fortran FE
> > > > generally emits a warning, on say num_threads(-5), instead of an error?
> > > 
> > > Same for the C/C++ front ends, which I'm looking into first.
> > > 
> > > Jakub, is the reason that even if the user is clearly doing something
> > > "strange" there, the compiler doesn't have a problem to continue
> > > compilation for 'num_threads(-5)', so it just emits a warning, but for
> > > example for 'collapse(-5)' it has to stop with an error, because it can't
> > > continue compilation in that case?  Or, is there a different reason for
> > > the many 'warning_at ([...], "[...] must be positive"' (C front end, for
> > > example), instead of using 'error_at' for these?
> 
> collapse has a constant expression argument and if the value is negative (or
> 0), then parsing doesn't make sense, so that case is clearly something where
> an error is in order.  num_threads is an example of where the standard is
> not completely clear if it is or is not ok to reject compilation as opposed
> to just UB at runtime if that happens and no problem if that construct is
> never encountered at runtime.

Thanks, so that matches my understanding, and we shall thus retract the
OpenACC "update gfortran's tile clause error handling" patch, that got
posted several times in several variations.
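
The distinction Jakub describes can be sketched as a small decision
function.  This is only an illustration under stated assumptions: the enum,
the function name and its parameters are hypothetical and do not correspond
to any front-end code.

```c
#include <stdbool.h>

enum diag { DIAG_NONE, DIAG_WARNING, DIAG_ERROR };

/* Hypothetical helper.  NEEDED_FOR_PARSING is true for clauses such as
   collapse, whose constant argument determines how the following loops are
   parsed.  VALUE_KNOWN is true when the argument value is known at compile
   time, as it always is for collapse.  */
enum diag
diagnose_clause_arg (bool needed_for_parsing, bool value_known, long value)
{
  /* Nothing to say about positive or unknown values.  */
  if (!value_known || value > 0)
    return DIAG_NONE;

  /* The compiler cannot continue if parsing depends on the value;
     otherwise the problem only matters if the construct is reached at
     runtime, so a warning is enough.  */
  return needed_for_parsing ? DIAG_ERROR : DIAG_WARNING;
}
```

Under this model 'collapse(-5)' yields an error and 'num_threads(-5)' a
warning.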


Later, we shall audit all front-end clause handling to make sure that
this is done consistently.


Grüße
 Thomas




Re: [PATCH 0/4] Store multiple values for single value profilers

2019-06-06 Thread Martin Liška
On 6/5/19 3:49 PM, Richard Biener wrote:
> On Tue, Jun 4, 2019 at 10:44 AM Martin Liska  wrote:
>>
>> Hi.
>>
>> It's becoming more common for a training run to happen in a parallel
>> environment.  That can lead to non-reproducible builds, caused by a
>> different order of merging of .gcda files.  So I'm suggesting to store up
>> to 4 values for HIST_TYPE_SINGLE_VALUE and HIST_TYPE_INDIR_CALL on disk.
>> If the capacity is exceeded, the whole counter is marked as unstable
>> (not reproducible).
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
>>
>> Ready to be installed?
> 
> Thanks for working on this, I hope Honza can review and approve it. 

Yes, he'll do it soon.

> Does this
> solve the issue of profiledbootstrap results being not reproducible?  (if you
> fix genchecksum to not generate different checksums)

Hopefully, but it needs to be tested.

> 
> I suppose this would also apply to a GCC 9 tree?

Yes, it applies smoothly. Would you like to see it backported to 9.2?

Martin

> 
> Thanks,
> Richard.
> 
>> Thanks,
>> Martin
>>
>> marxin (4):
>>   Remove indirect call top N counter type.
>>   Implement N disk counters for single value and indirect call counters.
>>   Dump histograms only if present.
>>   Update a bit dump format.
>>
>>  gcc/doc/invoke.texi   |   3 -
>>  gcc/gcov-counter.def  |   3 -
>>  gcc/gcov-io.h |   9 +-
>>  gcc/ipa-profile.c |  13 ++-
>>  gcc/params.def|   8 --
>>  gcc/profile.c |   1 -
>>  gcc/tree-profile.c|  23 +---
>>  gcc/value-prof.c  | 224 --
>>  gcc/value-prof.h  |   4 +-
>>  libgcc/Makefile.in|  10 +-
>>  libgcc/libgcov-driver.c   |  80 --
>>  libgcc/libgcov-merge.c| 139 +--
>>  libgcc/libgcov-profiler.c | 176 ++
>>  libgcc/libgcov-util.c |  19 
>>  libgcc/libgcov.h  |  12 +-
>>  15 files changed, 179 insertions(+), 545 deletions(-)
>>
>> --
>> 2.21.0
>>
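
The scheme from the quoted cover letter can be sketched as follows.  This is
a minimal model under stated assumptions, not the libgcov implementation:
the struct layout and all function names are hypothetical, and only the
capacity of 4 and the "unstable" marking come from the text above.

```c
#include <stdbool.h>
#include <stddef.h>

#define SLOTS 4			/* Capacity named in the cover letter.  */

/* Hypothetical on-disk counter: up to SLOTS (value, count) pairs.  */
struct value_counter
{
  bool unstable;		/* Set once more than SLOTS values were seen.  */
  size_t used;
  long value[SLOTS];
  long count[SLOTS];
};

/* Record COUNT occurrences of VALUE in C.  */
void
counter_add (struct value_counter *c, long value, long count)
{
  if (c->unstable)
    return;
  for (size_t i = 0; i < c->used; i++)
    if (c->value[i] == value)
      {
	c->count[i] += count;
	return;
      }
  if (c->used == SLOTS)
    {
      /* Capacity exceeded: the whole counter becomes unstable.  */
      c->unstable = true;
      return;
    }
  c->value[c->used] = value;
  c->count[c->used] = count;
  c->used++;
}

/* Merge SRC into DST.  The set of values and their summed counts do not
   depend on the order in which counters are merged.  */
void
counter_merge (struct value_counter *dst, const struct value_counter *src)
{
  if (src->unstable)
    {
      dst->unstable = true;
      return;
    }
  for (size_t i = 0; i < src->used; i++)
    counter_add (dst, src->value[i], src->count[i]);
}

/* Return the count for VALUE, 0 if absent, or -1 if C is unstable.  */
long
counter_lookup (const struct value_counter *c, long value)
{
  if (c->unstable)
    return -1;
  for (size_t i = 0; i < c->used; i++)
    if (c->value[i] == value)
      return c->count[i];
  return 0;
}
```

Because a merge either sums counts per value or marks the counter unstable,
merging the same set of files in a different order gives the same lookup
results, which is the property the series is after.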



Re: [PR 89330] Avoid adding dead speculative edges to inlining heap

2019-06-06 Thread Martin Liška
Hi.

This is a rebased version of the patch that Martin J. wrote.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin

From 17290dd6dc4412aee6c6484844f9edb149129d36 Mon Sep 17 00:00:00 2001
From: Martin Jambor 
Date: Wed, 5 Jun 2019 16:11:44 +0200
Subject: [PATCH] Avoid adding dead speculative edges to inlining heap

2019-02-15  Martin Jambor  

	PR ipa/89330
	* ipa-inline.c (can_inline_edge_p): Move most of the checks...
	(call_not_inlinable_p): ...this new function.
	* ipa-inline.h (call_not_inlinable_p): Declare.
	* ipa-prop.c: Include ipa-inline.h
	(try_make_edge_direct_virtual_call): Create speculative edges only
	if there is any chance of inlining them.

	testsuite/
	* g++.dg/lto/pr89330_[01].C: New test.
---
 gcc/ipa-inline.c | 119 +--
 gcc/ipa-inline.h |   4 +-
 gcc/ipa-prop.c   |   8 +-
 gcc/testsuite/g++.dg/lto/pr89330_0.C |  50 +++
 gcc/testsuite/g++.dg/lto/pr89330_1.C |  36 
 5 files changed, 151 insertions(+), 66 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/lto/pr89330_0.C
 create mode 100644 gcc/testsuite/g++.dg/lto/pr89330_1.C

diff --git a/gcc/ipa-inline.c b/gcc/ipa-inline.c
index 745bdf3002a..ba8155f006e 100644
--- a/gcc/ipa-inline.c
+++ b/gcc/ipa-inline.c
@@ -299,95 +299,87 @@ sanitize_attrs_match_for_inline_p (const_tree caller, const_tree callee)
   (opts_for_fn (caller->decl)->x_##flag		\
!= opts_for_fn (callee->decl)->x_##flag)
 
-/* Decide if we can inline the edge and possibly update
-   inline_failed reason.  
-   We check whether inlining is possible at all and whether
-   caller growth limits allow doing so.  
-
-   if REPORT is true, output reason to the dump file. */
+/* Return CIF_OK if a call from CALLER to CALLEE is or would be inlineable.
+   Otherwise, return the reason why it cannot.  EARLY should be set when
+   deciding about early inlining.  */
 
-static bool
-can_inline_edge_p (struct cgraph_edge *e, bool report,
-		   bool early = false)
+enum cgraph_inline_failed_t
+call_not_inlinable_p (cgraph_node *caller, cgraph_node *callee,
+		  bool early)
 {
-  gcc_checking_assert (e->inline_failed);
-
-  if (cgraph_inline_failed_type (e->inline_failed) == CIF_FINAL_ERROR)
-{
-  if (report)
-report_inline_failed_reason (e);
-  return false;
-}
-
-  bool inlinable = true;
   enum availability avail;
-  cgraph_node *caller = e->caller->global.inlined_to
-		? e->caller->global.inlined_to : e->caller;
-  cgraph_node *callee = e->callee->ultimate_alias_target (&avail, caller);
+  caller = caller->global.inlined_to ? caller->global.inlined_to : caller;
+  callee = callee->ultimate_alias_target (&avail, caller);
 
   if (!callee->definition)
-{
-  e->inline_failed = CIF_BODY_NOT_AVAILABLE;
-  inlinable = false;
-}
+return CIF_BODY_NOT_AVAILABLE;
   if (!early && (!opt_for_fn (callee->decl, optimize)
 		 || !opt_for_fn (caller->decl, optimize)))
-{
-  e->inline_failed = CIF_FUNCTION_NOT_OPTIMIZED;
-  inlinable = false;
-}
+return CIF_FUNCTION_NOT_OPTIMIZED;
   else if (callee->calls_comdat_local)
-{
-  e->inline_failed = CIF_USES_COMDAT_LOCAL;
-  inlinable = false;
-}
+return CIF_USES_COMDAT_LOCAL;
   else if (avail <= AVAIL_INTERPOSABLE)
-{
-  e->inline_failed = CIF_OVERWRITABLE;
-  inlinable = false;
-}
-  /* All edges with call_stmt_cannot_inline_p should have inline_failed
- initialized to one of FINAL_ERROR reasons.  */
-  else if (e->call_stmt_cannot_inline_p)
-gcc_unreachable ();
+return CIF_OVERWRITABLE;
   /* Don't inline if the functions have different EH personalities.  */
   else if (DECL_FUNCTION_PERSONALITY (caller->decl)
 	   && DECL_FUNCTION_PERSONALITY (callee->decl)
 	   && (DECL_FUNCTION_PERSONALITY (caller->decl)
 	   != DECL_FUNCTION_PERSONALITY (callee->decl)))
-{
-  e->inline_failed = CIF_EH_PERSONALITY;
-  inlinable = false;
-}
+return CIF_EH_PERSONALITY;
   /* TM pure functions should not be inlined into non-TM_pure
  functions.  */
   else if (is_tm_pure (callee->decl) && !is_tm_pure (caller->decl))
-{
-  e->inline_failed = CIF_UNSPECIFIED;
-  inlinable = false;
-}
+return CIF_UNSPECIFIED;
   /* Check compatibility of target optimization options.  */
   else if (!targetm.target_option.can_inline_p (caller->decl,
 		callee->decl))
-{
-  e->inline_failed = CIF_TARGET_OPTION_MISMATCH;
-  inlinable = false;
-}
+return CIF_TARGET_OPTION_MISMATCH;
   else if (ipa_fn_summaries->get (callee) == NULL
 	   || !ipa_fn_summaries->get (callee)->inlinable)
-{
-  e->inline_failed = CIF_FUNCTION_NOT_INLINABLE;
-  inlinable = false;
-}
+return CIF_FUNCTION_NOT_INLINABLE;
   /* Don't inline a function with mismatched sanitization attributes. */
   else if (!sanitize_attrs_match_for_inline_p (caller->decl, calle

[PATCH] Release cgraph_{node,edge} via ggc_free.

2019-06-06 Thread Martin Liška
Hi.

This is a follow-up patch that releases memory for removed
cgraph edges and nodes.
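
Since freed nodes and edges are no longer kept on a free list, the ChangeLog
below records that their summary IDs are stored for reuse instead.  A
minimal sketch of that recycling idea, assuming a fixed-size stack in place
of GCC's vec and using hypothetical names throughout:

```c
#include <stddef.h>

#define MAX_RELEASED 64		/* Arbitrary bound for this sketch.  */

/* Hypothetical ID pool: IDs released by freed objects are reused before a
   fresh one is handed out.  */
struct id_pool
{
  int max_id;			/* Next never-used ID.  */
  int released[MAX_RELEASED];	/* Stack of released IDs.  */
  size_t n_released;
};

/* Return a released ID if one is available, otherwise a new one.  */
int
id_assign (struct id_pool *p)
{
  if (p->n_released > 0)
    return p->released[--p->n_released];
  return p->max_id++;
}

/* Remember ID for reuse; -1 means the object never had an ID.  */
void
id_release (struct id_pool *p, int id)
{
  if (id != -1 && p->n_released < MAX_RELEASED)
    p->released[p->n_released++] = id;
}
```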

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.

Ready to be installed?
Thanks,
Martin
From 121138ee973b63bfdf7be04ab28113f479bc91b0 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Fri, 15 Feb 2019 14:19:59 +0100
Subject: [PATCH] Release cgraph_{node,edge} via ggc_free.

gcc/ChangeLog:

2019-02-19  Martin Liska  

	* cgraph.c (symbol_table::create_edge): Always allocate
	a cgraph_edge.
	(symbol_table::free_edge): Store summary_id to
	edge_released_summary_ids if != -1;
	* cgraph.h (NEXT_FREE_NODE): Remove.
	(SET_NEXT_FREE_NODE): Likewise.
	(NEXT_FREE_EDGE): Likewise.
	(symbol_table::release_symbol): Store summary_id to
	cgraph_released_summary_ids if != -1;
	(symbol_table::allocate_cgraph_symbol): Always allocate
	a cgraph_node.
---
 gcc/cgraph.c | 26 +++
 gcc/cgraph.h | 58 ++--
 2 files changed, 31 insertions(+), 53 deletions(-)

diff --git a/gcc/cgraph.c b/gcc/cgraph.c
index 28019aba434..59dc70339f0 100644
--- a/gcc/cgraph.c
+++ b/gcc/cgraph.c
@@ -846,17 +846,8 @@ symbol_table::create_edge (cgraph_node *caller, cgraph_node *callee,
   gcc_assert (is_gimple_call (call_stmt));
 }
 
-  if (free_edges)
-{
-  edge = free_edges;
-  free_edges = NEXT_FREE_EDGE (edge);
-}
-  else
-{
-  edge = ggc_alloc<cgraph_edge> ();
-  edge->m_summary_id = -1;
-}
-
+  edge = ggc_alloc<cgraph_edge> ();
+  edge->m_summary_id = -1;
   edges_count++;
 
   gcc_assert (++edges_max_uid != 0);
@@ -1013,16 +1004,13 @@ cgraph_edge::remove_caller (void)
 void
 symbol_table::free_edge (cgraph_edge *e)
 {
+  edges_count--;
+  if (e->m_summary_id != -1)
+edge_released_summary_ids.safe_push (e->m_summary_id);
+
   if (e->indirect_info)
 ggc_free (e->indirect_info);
-
-  /* Clear out the edge so we do not dangle pointers.  */
-  int summary_id = e->m_summary_id;
-  memset (e, 0, sizeof (*e));
-  e->m_summary_id = summary_id;
-  NEXT_FREE_EDGE (e) = free_edges;
-  free_edges = e;
-  edges_count--;
+  ggc_free (e);
 }
 
 /* Remove the edge in the cgraph.  */
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index 18839a4a5ec..82ec5966a9b 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -2018,12 +2018,6 @@ is_a_helper <varpool_node *>::test (symtab_node *p)
   return p && p->type == SYMTAB_VARIABLE;
 }
 
-/* Macros to access the next item in the list of free cgraph nodes and
-   edges. */
-#define NEXT_FREE_NODE(NODE) dyn_cast<cgraph_node *> ((NODE)->next)
-#define SET_NEXT_FREE_NODE(NODE,NODE2) ((NODE))->next = NODE2
-#define NEXT_FREE_EDGE(EDGE) (EDGE)->prev_caller
-
 typedef void (*cgraph_edge_hook)(cgraph_edge *, void *);
 typedef void (*cgraph_node_hook)(cgraph_node *, void *);
 typedef void (*varpool_node_hook)(varpool_node *, void *);
@@ -2079,7 +2073,8 @@ public:
   friend class cgraph_edge;
 
   symbol_table (): cgraph_max_uid (1), cgraph_max_summary_id (0),
-  edges_max_uid (1), edges_max_summary_id (0)
+  edges_max_uid (1), edges_max_summary_id (0),
+  cgraph_released_summary_ids (), edge_released_summary_ids ()
   {
   }
 
@@ -2285,14 +2280,22 @@ public:
   /* Assign a new summary ID for the callgraph NODE.  */
   inline int assign_summary_id (cgraph_node *node)
   {
-node->m_summary_id = cgraph_max_summary_id++;
+if (!cgraph_released_summary_ids.is_empty ())
+  node->m_summary_id = cgraph_released_summary_ids.pop ();
+else
+  node->m_summary_id = cgraph_max_summary_id++;
+
 return node->m_summary_id;
   }
 
   /* Assign a new summary ID for the callgraph EDGE.  */
   inline int assign_summary_id (cgraph_edge *edge)
   {
-edge->m_summary_id = edges_max_summary_id++;
+if (!edge_released_summary_ids.is_empty ())
+  edge->m_summary_id = edge_released_summary_ids.pop ();
+else
+  edge->m_summary_id = edges_max_summary_id++;
+
 return edge->m_summary_id;
   }
 
@@ -2308,14 +2311,15 @@ public:
   int edges_max_uid;
   int edges_max_summary_id;
 
+  /* Vector of released summary IDs for cgraph nodes.  */
+  vec<int> GTY ((skip)) cgraph_released_summary_ids;
+
+  /* Vector of released summary IDs for cgraph edges.  */
+  vec<int> GTY ((skip)) edge_released_summary_ids;
+
   symtab_node* GTY(()) nodes;
   asm_node* GTY(()) asmnodes;
   asm_node* GTY(()) asm_last_node;
-  cgraph_node* GTY(()) free_nodes;
-
-  /* Head of a linked list of unused (freed) call graph edges.
- Do not GTY((delete)) this list so UIDs gets reliably recycled.  */
-  cgraph_edge * GTY(()) free_edges;
 
   /* The order index of the next symtab node to be created.  This is
  used so that we can sort the cgraph nodes in order by when we saw
@@ -2675,15 +2679,9 @@ inline void
 symbol_table::release_symbol (cgraph_node *node)
 {
   cgraph_count--;
-
-  /* Clear out the node to NULL all pointers and add the node to the free
- list.  */
-  int summary_id = node->m_summary_id;
-  memset (node, 0, sizeof (*node));
-  node->type = SYMTAB_FUNCTION;
-  node->m_summary_id = summa

Re: [PATCH] Find constant definition for by-ref argument using dominance information (PR ipa/90401)

2019-06-06 Thread Feng Xue OS
> I don't think your PHI handling works correctly.  If you have

>    agg.part1 = 0;
>    if ()
>      agg.part2 = 1;
>    call (agg);

> then you seem to end up above for agg.part2 because you push that onto the
> worklist below.

It is correct.

o. The worklist is used to collect all non-dom virtual operands.  agg.part2 is
collected into the worklist; after it is taken from the worklist and processed,
we find the next dom virtual operand, agg.part1.
[the statement "dom_vuse = vuse"].

o. Execution transfers to the start of the main loop, which processes dom
virtual operands and does overlap analysis between agg.part1 and agg.part2.
[the statement if (!clobber_by_agg_contents_list_p())]

>> +                }
>> +              append.safe_push (gimple_vuse (stmt));
>> +            }
>> +          else
>> +            {
>> +              for (unsigned i = 0; i < gimple_phi_num_args (stmt); ++i)
>> +                {
>> +                  tree phi_arg = gimple_phi_arg_def (stmt, i);
>> +
>> +                  if (SSA_NAME_IS_DEFAULT_DEF (phi_arg))
>> +                    goto finish;
>> +
>> +                  append.safe_push (phi_arg);

> Not sure why you first push to append and then move parts to the
> worklist below - it seems to only complicate code.

It is just a code trick. I do not want to duplicate the code below for both
normal virtual SSA names and PHI virtual SSA names.

>> +            {
>> +              if (visited.add (vuse = append[i]))
>> +                continue;
>> +
>> +              if (SSA_NAME_IS_DEFAULT_DEF (vuse)
>> +                  || strictly_dominated_by_ssa_p (call_bb, vuse))
>> +                {
>> +                  /* Found a new dom virtual operand, stop going further until
>> +                     all pending non-dom virtual operands are processed. */
>> +                  gcc_assert (!dom_vuse);
>> +                  dom_vuse = vuse;
>> +                }
>> +              else
>> +                worklist.safe_push (vuse);
>> +            }


> What you want to do is (probably) use get_continuation_for_phi
> which for a virtual PHI node and a reference computes a
> virtual operand you can continue your processing with.  Thus sth
> like

I looked through the function. Yes, it does nearly the same thing as this patch
requires.  But there is a minor difference: get_continuation_for_phi() does not
translate virtual operands across a back edge.  So, if we rewrite the code with
that function, we cannot handle the following case.

/* assume no overlap between agg.part1 and agg.part2 */

__attribute__((pure)) call();
agg.part1 = 0;
for (...)
  {
call(agg);
agg.part2 = 1;
  }

Not sure whether this restriction is there to prevent revisiting the starting
virtual SSA name, or is due to some other consideration.
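
For reference, the loop case above can be written as a compilable test along
these lines.  This is only a sketch: the struct layout, the body of call and
the name loop_case are assumptions added for illustration, and it makes no
claim about what the patch actually propagates.

```c
struct agg
{
  int part1;
  int part2;
};

/* Pure: reads memory but has no side effects.  Only part1 is read, so the
   store to part2 inside the loop does not overlap with what is used.  */
__attribute__ ((pure, noinline)) int
call (const struct agg *a)
{
  return a->part1 + 1;
}

/* At every call, agg.part1 still holds the constant 0 stored before the
   loop, although the virtual operand reaching the call on the back edge
   comes from the store to agg.part2.  */
int
loop_case (int n)
{
  struct agg agg;
  int sum = 0;

  agg.part1 = 0;
  for (int i = 0; i < n; i++)
    {
      sum += call (&agg);
      agg.part2 = 1;
    }
  return sum;
}
```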

Thanks for comments.

Feng

[PATCH][arm] Implement usadv16qi and ssadv16qi standard names

2019-06-06 Thread Przemyslaw Wirkus
Hi all,

This patch implements the usadv16qi and ssadv16qi standard names for arm.

The V16QImode variant is important as it is the most commonly used pattern:
reducing vectors of bytes into an int.
The midend expects the optab to compute the absolute differences of operands 1
and 2 and reduce them while widening along the way up to SImode. So the inputs
are V16QImode and the output is V4SImode.

I've based my solution on the current AArch64 implementation of the usadv16qi
and ssadv16qi standard names (r260437). This solution emits the following
sequence of instructions:

VABDL.u8    tmp, op1, op2   # op1, op2 lowpart
VABAL.u8    tmp, op1, op2   # op1, op2 highpart
VPADAL.u16  op3, tmp
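
A scalar model of what this three-instruction sequence computes may help
when reading the assembly below.  It is a sketch under assumptions: the
function names are hypothetical, and the lane layout simply follows the
low-half, high-half and pairwise-add steps listed above.

```c
#include <stdint.h>

/* Hypothetical scalar model of the unsigned sequence for one pair of
   16-byte vectors.  OP3 holds four 32-bit accumulator lanes.  */
void
sad16_accumulate (const uint8_t op1[16], const uint8_t op2[16],
		  uint32_t op3[4])
{
  uint16_t tmp[8];

  for (int i = 0; i < 8; i++)
    {
      /* VABDL.u8: widening absolute difference of the low halves.  */
      int lo = op1[i] - op2[i];
      /* VABAL.u8: the same for the high halves, accumulated into tmp.  */
      int hi = op1[i + 8] - op2[i + 8];
      tmp[i] = (uint16_t) ((lo < 0 ? -lo : lo) + (hi < 0 ? -hi : hi));
    }

  /* VPADAL.u16: add adjacent 16-bit pairs and accumulate into 32 bits.  */
  for (int j = 0; j < 4; j++)
    op3[j] += (uint32_t) tmp[2 * j] + tmp[2 * j + 1];
}

/* Final reduction of the four lanes, done after the loop.  */
uint32_t
sad16_total (const uint32_t op3[4])
{
  return op3[0] + op3[1] + op3[2] + op3[3];
}
```

Each 16-bit lane holds at most 255 + 255, so the intermediate cannot
overflow before it is widened to 32 bits.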

So, for the code:

$ arm-none-linux-gnueabihf-gcc -S -O3 -march=armv8-a+simd -mfpu=auto -mfloat-abi=hard usadv16qi.c -dp

#define N 1024
unsigned char pix1[N];
unsigned char pix2[N];

int
foo (void)
{
  int i_sum = 0;
  int i;
  for (i = 0; i < N; i++)
    i_sum += __builtin_abs (pix1[i] - pix2[i]);
  return i_sum;
}

we now generate on arm:
foo:
        movw    r3, #:lower16:pix2      @ 57    [c=4 l=4]  *arm_movsi_vfp/3
        movt    r3, #:upper16:pix2      @ 58    [c=4 l=4]  *arm_movt/0
        vmov.i32        q9, #0  @ v4si  @ 3     [c=4 l=4]  *neon_movv4si/2
        movw    r2, #:lower16:pix1      @ 59    [c=4 l=4]  *arm_movsi_vfp/3
        movt    r2, #:upper16:pix1      @ 60    [c=4 l=4]  *arm_movt/0
        add     r1, r3, #1024   @ 8     [c=4 l=4]  *arm_addsi3/4
.L2:
        vld1.8  {q11}, [r3]!    @ 11    [c=8 l=4]  *movmisalignv16qi_neon_load
        vld1.8  {q10}, [r2]!    @ 10    [c=8 l=4]  *movmisalignv16qi_neon_load
        cmp     r1, r3  @ 21    [c=4 l=4]  *arm_cmpsi_insn/2
        vabdl.u8        q8, d20, d22    @ 12    [c=8 l=4]  neon_vabdluv8qi
        vabal.u8        q8, d21, d23    @ 15    [c=88 l=4]  neon_vabaluv8qi
        vpadal.u16      q9, q8  @ 16    [c=8 l=4]  neon_vpadaluv8hi
        bne     .L2     @ 22    [c=16 l=4]  arm_cond_branch
        vadd.i32        d18, d18, d19   @ 24    [c=120 l=4]  quad_halves_plusv4si
        vpadd.i32       d18, d18, d18   @ 25    [c=8 l=4]  neon_vpadd_internalv2si
        vmov.32 r0, d18[0]      @ 30    [c=12 l=4]  vec_extractv2sisi/1

instead of:
foo:
        movw    r3, #:lower16:pix1
        movt    r3, #:upper16:pix1
        vmov.i32        q9, #0  @ v4si
        movw    r2, #:lower16:pix2
        movt    r2, #:upper16:pix2
        add     r1, r3, #1024
.L2:
        vld1.8  {q8}, [r3]!
        vld1.8  {q11}, [r2]!
        vmovl.u8        q10, d16
        cmp     r1, r3
        vmovl.u8        q8, d17
        vmovl.u8        q12, d22
        vmovl.u8        q11, d23
        vsub.i16        q10, q10, q12
        vsub.i16        q8, q8, q11
        vabs.s16        q10, q10
        vabs.s16        q8, q8
        vaddw.s16       q9, q9, d20
        vaddw.s16       q9, q9, d21
        vaddw.s16       q9, q9, d16
        vaddw.s16       q9, q9, d17
        bne     .L2
        vadd.i32        d18, d18, d19
        vpadd.i32       d18, d18, d18
        vmov.32 r0, d18[0]

Bootstrapped and tested on arm-none-linux-gnueabihf.

Ok for trunk?

Thanks,
Przemyslaw

2019-06-06 Przemyslaw Wirkus 

* config/arm/iterators.md (VABAL): New int iterator.
* config/arm/neon.md (<sup>sadv16qi): New define_expand.
* config/arm/unspecs.md ("unspec"): Define UNSPEC_VABAL_S,
UNSPEC_VABAL_U values.

2019-06-06 Przemyslaw Wirkus 

* gcc.target/arm/ssadv16qi.c: New test.
* gcc.target/arm/usadv16qi.c: Likewise.

Re: [PATCH][arm] Implement usadv16qi and ssadv16qi standard names

2019-06-06 Thread Przemyslaw Wirkus

2019-05-29 Przemyslaw Wirkus 

* config/arm/iterators.md (VABAL): New int iterator.
* config/arm/neon.md (<sup>sadv16qi): New define_expand.
* config/arm/unspecs.md ("unspec"): Define UNSPEC_VABAL_S,
UNSPEC_VABAL_U values.

2019-05-29 Przemyslaw Wirkus 

* gcc.target/arm/ssadv16qi.c: New test.
* gcc.target/arm/usadv16qi.c: Likewise.

diff --git a/gcc/config/arm/iterators.md b/gcc/config/arm/iterators.md
index eb07c5b90c1b1905d35d7b480bdbe7d7a45ab7ba..2462b8c87ea7dbe60ba50d22b1e494bb4fe905c2 100644
--- a/gcc/config/arm/iterators.md
+++ b/gcc/config/arm/iterators.md
@@ -341,6 +341,8 @@
 
 (define_int_iterator VSUBHN [UNSPEC_VSUBHN UNSPEC_VRSUBHN])
 
+(define_int_iterator VABAL [UNSPEC_VABAL_S UNSPEC_VABAL_U])
+
 (define_int_iterator VABD [UNSPEC_VABD_S UNSPEC_VABD_U])
 
 (define_int_iterator VABDL [UNSPEC_VABDL_S UNSPEC_VABDL_U])
@@ -834,6 +836,7 @@
   (UNSPEC_VSUBW_S "s") (UNSPEC_VSUBW_U "u")
   (UNSPEC_VHSUB_S "s") (UNSPEC_VHSUB_U "u")
   (UNSPEC_VQSUB_S "s") (UNSPEC_VQSUB_U "u")
+  (UNSPEC_VABAL_S "s") (UNSPEC_VABAL_U "u")
   (UNSPEC_VABD_S "s") (UNSPEC_VABD_U "u")
   (UNSPEC_VABDL_S "s") (UNSPEC_VABDL_U "u")
   (UNSPEC_VMAX "s") (UNSPEC_VMAX_U "u")
diff --git a/gcc/config/arm/neon.md b/gcc/config/arm/neon.md
index de9ae43849038b3cf75feceec36429d5c40c63f2..51ed11abc519ea9d4f9e31751ac6d26a3d1ae5cd 100644
--- a/gcc/config/arm/neon.md
+++ b/gcc/config/arm/neon.md
@@ -3255,6 +3255,32 @@
   [(set_attr "type" "neon_arith_acc")]
 )
 
+(define_expand "<sup>sadv16qi"
+  [(use (match_operand:V4SI 0 "register_operand"))
+   (unspec:V16QI [(use (match_operand:V16QI 1 "register_operand"))
+  (use (match_operand:V16QI 2 "register_operand"))] VABAL)
+   (use (match_operand:V4SI 3 "register_operand"))]
+  "TARGET_NEON"
+  {
+rtx reduc = gen_reg_rtx (V8HImode);
+

Re: [PATCH 0/4] Store multiple values for single value profilers

2019-06-06 Thread Richard Biener
On Thu, Jun 6, 2019 at 10:23 AM Martin Liška  wrote:
>
> On 6/5/19 3:49 PM, Richard Biener wrote:
> > On Tue, Jun 4, 2019 at 10:44 AM Martin Liska  wrote:
> >>
> >> Hi.
> >>
> >> It's becoming more common for a training run to happen in a parallel
> >> environment.  That can lead to non-reproducible builds, caused by a
> >> different order of merging of .gcda files.  So I'm suggesting to store up
> >> to 4 values for HIST_TYPE_SINGLE_VALUE and HIST_TYPE_INDIR_CALL on disk.
> >> If the capacity is exceeded, the whole counter is marked as unstable
> >> (not reproducible).
> >>
> >> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> >>
> >> Ready to be installed?
> >
> > Thanks for working on this, I hope Honza can review and approve it.
>
> Yes, he'll do it soon.
>
> > > Does this
> > > solve the issue of profiledbootstrap results being not reproducible?  (if you
> > > fix genchecksum to not generate different checksums)
>
> Hopefully, but it needs to be tested.
>
> >
> > I suppose this would also apply to a GCC 9 tree?
>
> Yes, it applies smoothly. Would you like to see it backported to 9.2?

No, but eventually into our package.

Richard.

> Martin
>
> >
> > Thanks,
> > Richard.
> >
> >> Thanks,
> >> Martin
> >>
> >> marxin (4):
> >>   Remove indirect call top N counter type.
> >>   Implement N disk counters for single value and indirect call counters.
> >>   Dump histograms only if present.
> >>   Update a bit dump format.
> >>
> >>  gcc/doc/invoke.texi   |   3 -
> >>  gcc/gcov-counter.def  |   3 -
> >>  gcc/gcov-io.h |   9 +-
> >>  gcc/ipa-profile.c |  13 ++-
> >>  gcc/params.def|   8 --
> >>  gcc/profile.c |   1 -
> >>  gcc/tree-profile.c|  23 +---
> >>  gcc/value-prof.c  | 224 --
> >>  gcc/value-prof.h  |   4 +-
> >>  libgcc/Makefile.in|  10 +-
> >>  libgcc/libgcov-driver.c   |  80 --
> >>  libgcc/libgcov-merge.c| 139 +--
> >>  libgcc/libgcov-profiler.c | 176 ++
> >>  libgcc/libgcov-util.c |  19 
> >>  libgcc/libgcov.h  |  12 +-
> >>  15 files changed, 179 insertions(+), 545 deletions(-)
> >>
> >> --
> >> 2.21.0
> >>
>


Re: [PATCH] Find constant definition for by-ref argument using dominance information (PR ipa/90401)

2019-06-06 Thread Richard Biener
On Thu, Jun 6, 2019 at 10:54 AM Feng Xue OS  wrote:
>
> > I don't think your PHI handling works correctly.  If you have
>
> >agg.part1 = 0;
> >if ()
> >  agg.part2 = 1;
> >call (agg);
>
> > then you seem to end up above for agg.part2 because you push that onto the
> > worklist below.
>
> It is correct.
>
> o. The worklist is used to collect all non-dom virtual operands.  agg.part2 is
> collected into the worklist; after it is taken from the worklist and processed,
> we find the next dom virtual operand, agg.part1.
> [the statement "dom_vuse = vuse"].
>
> o. Execution transfers to the start of the main loop, which processes dom
> virtual operands and does overlap analysis between agg.part1 and agg.part2.
> [the statement if (!clobber_by_agg_contents_list_p())]
>
> >> +}
> >> +  append.safe_push (gimple_vuse (stmt));
> >> +}
> >> +  else
> >> +{
> >> +  for (unsigned i = 0; i < gimple_phi_num_args (stmt); ++i)
> >> +{
> >> +  tree phi_arg = gimple_phi_arg_def (stmt, i);
> >> +
> >> +  if (SSA_NAME_IS_DEFAULT_DEF (phi_arg))
> >> +goto finish;
> >> +
> >> +  append.safe_push (phi_arg);
>
> > Not sure why you first push to append and then move parts to the
> > worklist below - it seems to only complicate code.
>
> It is just a code trick. I do not want to duplicate the code below for both
> normal virtual SSA names and PHI virtual SSA names.
>
> >> +{
> >> +  if (visited.add (vuse = append[i]))
> >> +continue;
> >> +
> >> +  if (SSA_NAME_IS_DEFAULT_DEF (vuse)
> >> +  || strictly_dominated_by_ssa_p (call_bb, vuse))
> >> +{
> >> +  /* Found a new dom virtual operand, stop going further until
> >> + all pending non-dom virtual operands are processed.  */
> >> +  gcc_assert (!dom_vuse);
> >> +  dom_vuse = vuse;
> >> +}
> >> +  else
> >> +worklist.safe_push (vuse);
> >> +}
>
>
> > What you want to do is (probably) use get_continuation_for_phi
> > which for a virtual PHI node and a reference computes a
> > virtual operand you can continue your processing with.  Thus sth
> > like
>
> I looked through the function. Yes, it does nearly the same thing as this patch
> requires.  But there is a minor difference: get_continuation_for_phi() does not
> translate virtual operands across a back edge.  So, if we rewrite the code with
> that function, we cannot handle the following case.
>
> /* assume no overlap between agg.part1 and agg.part2 */
>
> __attribute__((pure)) call();
> agg.part1 = 0;
> for (...)
>   {
> call(agg);
> agg.part2 = 1;
>   }
>
> Not sure whether this restriction is there to prevent revisiting the start
> virtual SSA or is due to some other consideration.

It certainly can handle this situation, you do not need the "translate" hook.

Richard.

>
> Thanks for comments.
>
> Feng


Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Richard Sandiford
Kugan Vivekanandarajah  writes:
> Hi Richard,
>
> Thanks for the review. Attached is the latest patch.
>
> For testcases like cond_arith_1.c, gcc ICEs in fwprop with the patch. I
> am limiting fwprop in cases like this. Is there a better fix for this?
> index cf2c9de..2c99285 100644
> --- a/gcc/fwprop.c
> +++ b/gcc/fwprop.c
> @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
> rtx_insn *def_insn, rtx def_set)
>else
>  mode = GET_MODE (*loc);
>
> +  /* TODO. We can't get the mode for
> + (set (reg:VNx16BI 109)
> +  (unspec:VNx16BI [
> +(reg:SI 131)
> +(reg:SI 106)
> +   ] UNSPEC_WHILE_LO))
> + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
> +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
> +return false;
>new_rtx = propagate_rtx (*loc, mode, reg, src,
>   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));

What specifically goes wrong?  The unspec above isn't that unusual --
many unspecs have different modes from their inputs.

Thanks,
Richard


Re: [PATCH] fix more -Wformat-diag issues

2019-06-06 Thread Jakub Jelinek
On Wed, May 22, 2019 at 10:34:00AM -0600, Martin Sebor wrote:
> gcc/ChangeLog:
> 
>   * config/i386/i386-features.c (ix86_get_function_versions_dispatcher):
>   Adjust quoting and hyphenation.
>   * convert.c (convert_to_real_1): Same.
>   * gcc.c (driver_wrong_lang_callback): Same.
>   (driver::handle_unrecognized_options): Same.
>   * gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
>   * opts-common.c (cmdline_handle_error): Same.
>   (read_cmdline_option): Same.
>   * opts-global.c (complain_wrong_lang): Same.
>   (print_ignored_options): Same.
>   (handle_common_deferred_options): Same.
>   * pretty-print.h: Same.
>   * print-rtl.c (debug_bb_n_slim): Same.
>   * sched-rgn.c (make_pass_sched_fusion): Same.
>   * tree-cfg.c (verify_gimple_assign_unary): Same.
>   (verify_gimple_label): Same.
>   * tree-ssa-operands.c (verify_ssa_operands): Same.
>   * varasm.c (do_assemble_alias): Same.
>   (assemble_alias): Same.
> 
>   * diagnostic-core.h (GCC_DIAG_STYLE): Adjust.
>(GCC_DIAG_RAW_STYLE): New macro.
> 
>   * cfghooks.c: Disable -Wformat-diags.
>   * cfgloop.c: Same.
>   * cfgrtl.c: Same.
>   * cgraph.c: Same.
>   * diagnostic-show-locus.c: Same.
>   * diagnostic.c: Same.
>   * gimple-pretty-print.c: Same.
>   * graph.c: Same.
>   * symtab.c: Same.
>   * tree-eh.c Same.
>   * tree-pretty-print.c: Same.
>   * tree-ssa.c: Same.
> 
>   * configure: Regenerate.
>   * configure.ac (ACX_PROG_CXX_WARNING_OPTS): Add -Wno-error=format-diag.
>(ACX_PROG_CC_WARNING_OPTS): Same.

Changes for the same change shouldn't be separated by empty newlines in the
ChangeLog.  Furthermore, you've managed to commit only the first part (until
varasm.c) and not the rest.

> diff --git a/gcc/configure b/gcc/configure
> index 4a3d5eefcb8..c9062cca9d6 100755
> --- a/gcc/configure
> +++ b/gcc/configure
> @@ -6797,7 +6797,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
>  
>  c_loose_warn=
>  save_CFLAGS="$CFLAGS"
> -for real_option in -Wstrict-prototypes -Wmissing-prototypes; do
> +for real_option in -Wstrict-prototypes 
> -Wmissing-prototypes-Wno-error=format-diag; do
># Do the check with the no- prefix removed since gcc silently
># accepts any -Wno-* option on purpose
>case $real_option in

The above was probably regenerated before you added a space:

> diff --git a/gcc/configure.ac b/gcc/configure.ac
> index 35982fdc9ed..cbc0c25fa2b 100644
> --- a/gcc/configure.ac
> +++ b/gcc/configure.ac
> @@ -483,10 +483,11 @@ AS_IF([test $enable_build_format_warnings = no],
>[wf_opt=-Wno-format],[wf_opt=])
>  ACX_PROG_CXX_WARNING_OPTS(
>   m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings ],
> -[-Wcast-qual $wf_opt])), [loose_warn])
> +[-Wcast-qual -Wno-error=format-diag $wf_opt])),
> +[loose_warn])
>  ACX_PROG_CC_WARNING_OPTS(
> - m4_quote(m4_do([-Wstrict-prototypes -Wmissing-prototypes])),
> - [c_loose_warn])
> + m4_quote(m4_do([-Wstrict-prototypes -Wmissing-prototypes ],

^--HERE
I've committed the following to fix that up as obvious:

2019-06-06  Jakub Jelinek  

* configure: Regenerate.

--- gcc/configure   (revision 271993)
+++ gcc/configure   (revision 271994)
@@ -6797,7 +6797,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
 
 c_loose_warn=
 save_CFLAGS="$CFLAGS"
-for real_option in -Wstrict-prototypes 
-Wmissing-prototypes-Wno-error=format-diag; do
+for real_option in -Wstrict-prototypes -Wmissing-prototypes 
-Wno-error=format-diag; do
   # Do the check with the no- prefix removed since gcc silently
   # accepts any -Wno-* option on purpose
   case $real_option in


Jakub


Re: [0/3] Improve debug info for addressable vars

2019-06-06 Thread Richard Sandiford
Richard Biener  writes:
> On Wed, Jun 5, 2019 at 4:30 PM Richard Sandiford
>  wrote:
>>
>> Richard Biener  writes:
>> > On Sat, Jun 1, 2019 at 5:49 PM Richard Sandiford
>> >  wrote:
>> >>
>> >> Taking the address of a variable stops us doing var-tracking on it,
>> >> so that we just use the DECL_RTL instead.  This can easily cause wrong
>> >> debug info for regions of code that would have had correct debug info
>> >> if the variable weren't addressable.  E.g.:
>> >>
>> >> {
>> >>   int base;
>> >>   get_start (&base);
>> >>   x[i1] = base;
>> >>   base += 1; // No need to store this
>> >>   x[i2] = base; // ...so the debug info for "base" is wrong here
>> >> }
>> >>
>> >> or (the motivating example):
>> >>
>> >> {
>> >>   int base;
>> >>   get_start (&base);
>> >>   for (int i = 0; i < n; ++i)
>> >> {
>> >>   x[i] = base;
>> >>   base += y[i]; // Can apply LSM here, so the debug info for "base"
>> >> // in the loop is wrong
>> >> }
>> >>   consume (&base);
>> >> }
>> >>
>> >> This patch series lets us use the DECL_RTL location for some parts of a
>> >> variable's lifetime and debug-bind locations for other parts:
>> >>
>> >> 1) Gimple uses "VAR s=> VAR" to bind VAR to its DECL_RTL.  The binding
>> >>holds until overridden.
>> >>
>> >> 2) RTL does the same thing using:
>> >>
>> >>  (var_location VAR (decl_rtl_ref VAR))
>> >>
>> >>where DECL_RTL_REF is a new rtx code that captures the DECL_RTL
>> >>by reference rather than by value.
>> >>
>> >>We can't just use "(var_location VAR (mem X))" for this, because
>> >>that would bind VAR to the value that (mem X) has at that exact point.
>> >>VAR would therefore get reset by any possible change to (mem X),
>> >>whereas here we want it to track (possibly indirect) updates instead.
>> >>
>> >> 3) The gimplifier decides which variables should get the new treatment
>> >>and emits "VAR s=> VAR" to mark the start of VAR's lifetime.
>> >>Clobbers continue to mark the end of VAR's lifetime.
>> >>
>> >> 4) Stores to VAR implicitly reestablish the link between VAR and its
>> >>DECL_RTL.  This is simpler (and IMO more robust) than inserting an
>> >>explicit "VAR s=> VAR" at every write.
>> >>
>> >> 5) gsi_remove tries to insert "VAR => X" in place of a deleted "VAR = X",
>> >>falling back to a "VAR => NULL" reset if that fails.
>> >>
>> >> Patch 1 handles the new rtl code, patch 2 adds the gimple framework,
>> >> and patch 3 uses it for LSM.
>> >
>> > So I wonder how it handles
>> >
>> > void __attribute__((noinline)) foo(int *p) { *p = 42; }
>> > int x;
>> > int main()
>> > {
>> >   int base = 1;
>> >   foo (&base);
>> >   base = 2;
>> >   *(x ? &x : &base) = 1; // (*)
>> >   return 0;
>> > }
>> >
>> > here we DSE the base = 2 store leaving a
>> >
>> > # DEBUG base = 2
>> >
>> > stmt?  But there's an indirect store that also stores
>> > to base - what will the debug info say at/after (*)?  Will
>> > it claim that base is 2?  At least I do not see that
>> > the connection with bases DECL_RTL is re-established?
>>
>> Yeah, true.
>>
>> > There's a clobber of base before return 0 so you eventually
>> > have to add some dummy stmt you can print base after
>> > the indirect store.
>> >
>> > That said, doesn't "aliasing" create another source of wrong-debug
>> > with your approach that might be even worse?
>>
>> Not sure about even worse, but maybe different.  In the example above
>> the patches fix the debug info after "base = 2" but break it after the
>> following statement.
>>
>> But there's no real need for the compiler to store to base in (*) either.
>
> Indeed, partial dead code elimination could sink the store into both arms
> and then remove the store in one of them.
>
>> We could end up with "if (...) x = 1;" instead.  So AFAICT there's no
>> guarantee that we'll get correct debug info at the return statement even
>> as things stand.
>>
>> For memory variables, I think we're always at the mercy of dead stores
>> being optimised away, and the patch isn't trying to fix that.
>
> Hmm, but you _do_ insert the debug stmts when we remove stores...

Right, but that's for examples like the ones in patch 2, where we have:

   base += 1;
   ... use base ...;

and base doesn't need to be stored back to memory after the addition.
The difference is that here the user didn't write dead code (because
the result of the addition is used), whereas in your example they did.
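
To make the patch 2 situation concrete, here is a small self-contained
sketch.  It is not taken from the patches, and get_start and fill are
made-up names.  The address of "base" is taken and the result of the
addition is used, but the store of the updated value back to base's
memory slot is not needed:

```cpp
// Hypothetical stand-in for an external function that initialises *p.
static void get_start(int* p) { *p = 10; }

// "base" is addressable because &base is passed to get_start, so without
// the series its debug location is its memory slot (DECL_RTL) throughout.
void fill(int* x, int i1, int i2)
{
  int base;
  get_start(&base);
  x[i1] = base;
  base += 1;     // result is used below, but need not be stored to memory
  x[i2] = base;  // a debugger reading the memory slot here may see the old value
}
```

The values written to x are the same either way; what can differ is
where a debugger finds "base" after the addition.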

I.e. the reason for replacing a dead "VAR = X" with "# DEBUG VAR => X"
is that we should only be deleting the store if:

(a) all later uses of "VAR" can be/have been replaced by "X" or
(b) the calculation of "X" was never needed in the first place

(a) is something we should handle as well as we can, and I think
inserting the debug statement is the right thing to do there.
(b) seems like a lost cause.  E.g. for:

{
  int base;
  ...
  if (...)
x = &base;
  *x = 1; // [A]
  *x = 2; // [B]
}

I don't think we're ever going to be able to mak

[PATCH V7] Remove empty loop with assumed finiteness (PR tree-optimization/89713)

2019-06-06 Thread Feng Xue OS
diff --git a/gcc/ChangeLog b/gcc/ChangeLog
index 37aab79..87cc125 100644
--- a/gcc/ChangeLog
+++ b/gcc/ChangeLog
@@ -1,3 +1,16 @@
+2019-06-04  Feng Xue  
+
+   PR tree-optimization/89713
+   * doc/invoke.texi (-ffinite-loops): Document new option.
+   * common.opt (-ffinite-loops): New option.
+   * tree-ssa-dce.c (mark_stmt_if_obviously_necessary): Mark
+   IFN_GOACC_LOOP calls as necessary.
+   * tree-ssa-loop-niter.c (finite_loop_p): Assume loop with an exit
+   is finite.
+   * omp-offload.c (oacc_xform_loop): Skip lowering if return value of
+   IFN_GOACC_LOOP call is not used.
+   * opts.c (default_options_table): Enable -ffinite-loops at -O2+.
+
 2019-06-04  Alan Modra  
 
PR target/90689
diff --git a/gcc/common.opt b/gcc/common.opt
index 0e72fd0..8b0e6ad 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1437,6 +1437,10 @@ ffinite-math-only
 Common Report Var(flag_finite_math_only) Optimization SetByCombined
 Assume no NaNs or infinities are generated.
 
+ffinite-loops
+Common Report Var(flag_finite_loops) Optimization
+Assume that loops with an exit will terminate and not loop indefinitely.
+
 ffixed-
 Common Joined RejectNegative Var(common_deferred_options) Defer
 -ffixed- Mark  as being unavailable to the compiler.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 91c9bb8..0fe4c52 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -412,6 +412,7 @@ Objective-C and Objective-C++ Dialects}.
 -fdevirtualize-at-ltrans  -fdse @gol
 -fearly-inlining  -fipa-sra  -fexpensive-optimizations  -ffat-lto-objects @gol
 -ffast-math  -ffinite-math-only  -ffloat-store  -fexcess-precision=@var{style} 
@gol
+-ffinite-loops @gol
 -fforward-propagate  -ffp-contract=@var{style}  -ffunction-sections @gol
 -fgcse  -fgcse-after-reload  -fgcse-las  -fgcse-lm  -fgraphite-identity @gol
 -fgcse-sm  -fhoist-adjacent-loads  -fif-conversion @gol
@@ -8282,6 +8283,7 @@ also turns on the following optimization flags:
 -fdelete-null-pointer-checks @gol
 -fdevirtualize  -fdevirtualize-speculatively @gol
 -fexpensive-optimizations @gol
+-ffinite-loops @gol 
 -fgcse  -fgcse-lm  @gol
 -fhoist-adjacent-loads @gol
 -finline-small-functions @gol
@@ -9503,6 +9505,15 @@ that may set @code{errno} but are otherwise free of side 
effects.  This flag is
 enabled by default at @option{-O2} and higher if @option{-Os} is not also
 specified.
 
+@item -ffinite-loops
+@opindex ffinite-loops
+@opindex fno-finite-loops
+Assume that a loop with an exit will eventually take the exit and not loop
+indefinitely.  This allows the compiler to remove loops that otherwise have
+no side-effects, not considering eventual endless looping as such.
+
+This option is enabled by default at @option{-O2}.
+
 @item -ftree-dominator-opts
 @opindex ftree-dominator-opts
 Perform a variety of simple scalar cleanups (constant/copy
diff --git a/gcc/omp-offload.c b/gcc/omp-offload.c
index 97ae47b..369122f 100644
--- a/gcc/omp-offload.c
+++ b/gcc/omp-offload.c
@@ -300,7 +300,7 @@ oacc_xform_loop (gcall *call)
   tree chunk_size = NULL_TREE;
   unsigned mask = (unsigned) TREE_INT_CST_LOW (gimple_call_arg (call, 5));
   tree lhs = gimple_call_lhs (call);
-  tree type = TREE_TYPE (lhs);
+  tree type = NULL_TREE;
   tree diff_type = TREE_TYPE (range);
   tree r = NULL_TREE;
   gimple_seq seq = NULL;
@@ -308,6 +308,15 @@ oacc_xform_loop (gcall *call)
   unsigned outer_mask = mask & (~mask + 1); // Outermost partitioning
   unsigned inner_mask = mask & ~outer_mask; // Inner partitioning (if any)
 
+  /* Skip lowering if return value of IFN_GOACC_LOOP call is not used. */
+  if (!lhs)
+{
+  gsi_replace_with_seq (&gsi, seq, true);
+  return;
+}
+
+  type = TREE_TYPE (lhs);
+ 
 #ifdef ACCEL_COMPILER
   chunk_size = gimple_call_arg (call, 4);
   if (integer_minus_onep (chunk_size)  /* Force static allocation.  */
diff --git a/gcc/opts.c b/gcc/opts.c
index 64f94ac..b38bfb1 100644
--- a/gcc/opts.c
+++ b/gcc/opts.c
@@ -494,6 +494,7 @@ static const struct default_options default_options_table[] 
=
 { OPT_LEVELS_2_PLUS, OPT_fdevirtualize, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fdevirtualize_speculatively, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fexpensive_optimizations, NULL, 1 },
+{ OPT_LEVELS_2_PLUS, OPT_ffinite_loops, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fgcse, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_fhoist_adjacent_loads, NULL, 1 },
 { OPT_LEVELS_2_PLUS, OPT_findirect_inlining, NULL, 1 },
diff --git a/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C 
b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
new file mode 100644
index 000..6b1e879
--- /dev/null
+++ b/gcc/testsuite/g++.dg/tree-ssa/empty-loop.C
@@ -0,0 +1,33 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -fdump-tree-cddce2 -ffinite-loops" } */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+using namespace std;
+
+int foo (vector &v, list &l, set &s, map 
&m)
+{
+  for (vector::iterator it = v.begin (); it != v.e

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-06 Thread Richard Earnshaw (lists)
On 05/06/2019 19:04, Jason Merrill wrote:
> On 6/3/19 6:33 PM, Joseph Myers wrote:
>> On Sun, 2 Jun 2019, Segher Boessenkool wrote:
>>
> Git has an identity (well, two) _per commit_, and there is no way
> you can
> reconstruct people's preferred name and email address (at any point
> in time,
> for every commit separately) correctly.  IMO it is much better to
> not even
> try.  We already *have* enough info for anyone to trivially look up
> who wrote
> what, and what might be that person's email address at the time.  But
> pretending that is more than a guess is just wrong.

 I think not doing a best-effort identification (name+email) is just as
>>>
>>> And I think guessing is not a "best effort", but just wrong.
>>
>> It's 100% accurate about the identity of the person who was the committer
>> (modulo the one username from the gcc2 period where it was clear who the
>> author of the commits by that username was, and so that went in the
>> author
>> map, but not clear that was the same as the committer, who did not commit
>> patches for any other author).  So it's as accurate as any case where
>> someone committing natively in git for someone else failed to use
>> --author
>> (and if the CVS/SVN commit included a ChangeLog entry, we have credit
>> given from there via the "changelogs" feature).
>>
>> I think failing to credit (by name and email address) the person implied
>> by the commit metadata, in the absence of positive evidence (such as a
>> ChangeLog entry) for the change being authored by someone else, is just
>> wrong, in the same way it's wrong not to use --author when committing for
>> someone else in git.
> 
> It's wrong, but it's not importantly wrong.  If we're doing a
> reposurgeon conversion, this adjustment makes sense.  If we're starting
> from the git-svn mirror, it doesn't justify breaking everyone's copies
> by rewriting branches.  And the bird in the hand looks more and more
> appealing as time goes by.
> 
>> Where a person used different names over time, there's no generally
>> applicable rule for whether they'd prefer the latest version or the
>> version used at the time to be used in reference to past commits, and I
>> think using the most current version known is most appropriate, in the
>> absence of a ChangeLog entry added in the commit, unless they've
>> specified
>> a preference for some other rule for which commits get what name.
>> Likewise for email addresses.
> 
> For email addresses, I think that using @gcc.gnu.org would be the best
> approach for people that have such accounts, rather than an employer
> address from an arbitrary point in time.

Or @gnu.org for accounts that pre-date the switch to EGCS and CVS.

> 
> Jason



Re: [AArch64] [SVE] PR88837 - Poor vector construction code in VL-specific mode

2019-06-06 Thread Szabolcs Nagy
On 03/06/2019 08:26, Prathamesh Kulkarni wrote:
> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
> @@ -0,0 +1,32 @@
> +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
> +/* { dg-options "-O2 -fno-schedule-insns -msve-vector-bits=256 --save-temps" 
> } */
> +
> +/* Case 5.2: Interleaved elements and constants.  */ 
> +
> +#include 
> +
> +typedef int32_t vnx4si __attribute__((vector_size (32)));
> +
> +__attribute__((noipa))
> +vnx4si foo(int a, int b, int c, int d)
> +{
> +  return (vnx4si) { a, 1, b, 2, c, 3, d, 4 }; 
> +}
> +
> +/*
> +foo:
> +.LFB0:
> +.cfi_startproc
> +ptrue   p0.s, vl8
> +mov     z0.s, w3
> +adrp    x3, .LANCHOR0
> +insr    z0.s, w2
> +add     x3, x3, :lo12:.LANCHOR0
> +insr    z0.s, w1
> +ld1w    z1.s, p0/z, [x3]
> +insr    z0.s, w0
> +zip1    z0.s, z0.s, z1.s
> +ret
> +*/
> +
> +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w3\n\tadrp\t(x[0-9]+), 
> \.LANCHOR0\n\tinsr\t\1, w2\n\tadd\t\2, \2, :lo12:\.LANCHOR0\n\tinsr\t\1, 
> w1\n\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[\2\]\n\tinsr\t\1, w0\n\tzip1\t\1, \1, 
> \3} } } */

this fails with tiny model when i'm testing aarch64-none-elf

$ make check-c 'RUNTESTFLAGS=--target_board=aarch64-elf-qemu{-mcmodel=tiny} 
aarch64-sve.exp=init_8.c'
...
FAIL: gcc.target/aarch64/sve/init_8.c -march=armv8.2-a+sve  scan-assembler 
\\tmov\\t(z[0-9]+\\.s), w3\\n\\tadrp\\t(x[0-9]+),
\\.LANCHOR0\\n\\tinsr\\t\\1, w2\\n\\tadd\\t\\2, \\2, 
:lo12:\\.LANCHOR0\\n\\tinsr\\t\\1, w1\\n\\tld1w\\t(z[0-9]+\\.s), p[0-9]+/z,
\\[\\2\\]\\n\\tinsr\\t\\1, w0\\n\\tzip1\\t\\1, \\1, \\3

i think you need conditional scan asm for { target aarch64_small }
and { target aarch64_tiny } or just skip the test for tiny, but
even then matching exact register name and instruction scheduling
seems fragile.

tiny code:

.arch armv8.2-a+crc+sve
.file   "init_8.c"
.text
.align  2
.p2align 3,,7
.global foo
.type   foo, %function
foo:
ptrue   p0.s, vl8
adr     x4, .LC0
mov     z0.s, w3
ld1w    z1.s, p0/z, [x4]
insr    z0.s, w2
insr    z0.s, w1
insr    z0.s, w0
zip1    z0.s, z0.s, z1.s
st1w    z0.s, p0, [x8]
ret
.size   foo, .-foo
.align  4
.LC0:
.word   1
.word   2
.word   3
.word   4
.word   3
.word   4
.word   3
.word   4


Re: [AArch64] [SVE] PR88837 - Poor vector construction code in VL-specific mode

2019-06-06 Thread Richard Sandiford
Szabolcs Nagy  writes:
> On 03/06/2019 08:26, Prathamesh Kulkarni wrote:
>> +++ b/gcc/testsuite/gcc.target/aarch64/sve/init_8.c
>> @@ -0,0 +1,32 @@
>> +/* { dg-do assemble { target aarch64_asm_sve_ok } } */
>> +/* { dg-options "-O2 -fno-schedule-insns -msve-vector-bits=256 
>> --save-temps" } */
>> +
>> +/* Case 5.2: Interleaved elements and constants.  */ 
>> +
>> +#include 
>> +
>> +typedef int32_t vnx4si __attribute__((vector_size (32)));
>> +
>> +__attribute__((noipa))
>> +vnx4si foo(int a, int b, int c, int d)
>> +{
>> +  return (vnx4si) { a, 1, b, 2, c, 3, d, 4 }; 
>> +}
>> +
>> +/*
>> +foo:
>> +.LFB0:
>> +.cfi_startproc
>> +ptrue   p0.s, vl8
>> +mov     z0.s, w3
>> +adrp    x3, .LANCHOR0
>> +insr    z0.s, w2
>> +add     x3, x3, :lo12:.LANCHOR0
>> +insr    z0.s, w1
>> +ld1w    z1.s, p0/z, [x3]
>> +insr    z0.s, w0
>> +zip1    z0.s, z0.s, z1.s
>> +ret
>> +*/
>> +
>> +/* { dg-final { scan-assembler {\tmov\t(z[0-9]+\.s), w3\n\tadrp\t(x[0-9]+), 
>> \.LANCHOR0\n\tinsr\t\1, w2\n\tadd\t\2, \2, :lo12:\.LANCHOR0\n\tinsr\t\1, 
>> w1\n\tld1w\t(z[0-9]+\.s), p[0-9]+/z, \[\2\]\n\tinsr\t\1, w0\n\tzip1\t\1, \1, 
>> \3} } } */
>
> this fails with tiny model when i'm testing aarch64-none-elf
>
> $ make check-c 'RUNTESTFLAGS=--target_board=aarch64-elf-qemu{-mcmodel=tiny} 
> aarch64-sve.exp=init_8.c'
> ...
> FAIL: gcc.target/aarch64/sve/init_8.c -march=armv8.2-a+sve  scan-assembler 
> \\tmov\\t(z[0-9]+\\.s), w3\\n\\tadrp\\t(x[0-9]+),
> \\.LANCHOR0\\n\\tinsr\\t\\1, w2\\n\\tadd\\t\\2, \\2, 
> :lo12:\\.LANCHOR0\\n\\tinsr\\t\\1, w1\\n\\tld1w\\t(z[0-9]+\\.s), p[0-9]+/z,
> \\[\\2\\]\\n\\tinsr\\t\\1, w0\\n\\tzip1\\t\\1, \\1, \\3
>
> i think you need conditional scan asm for { target aarch64_small }
> and { target aarch64_tiny } or just skip the test for tiny,

Maybe we should remove the address calculation and replace the ld1w
address with \[[^]]*\].  All that really matters for this test is that
the vector is loaded from memory.
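
As a rough illustration of that relaxed match (a sketch only: the real
test uses Tcl regexps via scan-assembler, and has_ld1w_from_memory is a
made-up helper), the same idea can be tried out with std::regex.  Note
that the ECMAScript grammar used by std::regex treats a "]" directly
after "[^" as closing the class, so the Tcl spelling \[[^]]*\] has to
be written \[[^\]]*\] here:

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if the assembly text contains a predicated
// ld1w load from any bracketed address, whatever the base register is.
bool has_ld1w_from_memory(const std::string& asm_text)
{
  // ECMAScript spelling of the suggested pattern; only the load itself
  // is matched, not the address calculation that precedes it.
  static const std::regex pattern(
      R"(ld1w\t(z[0-9]+\.s), p[0-9]+/z, \[[^\]]*\])");
  return std::regex_search(asm_text, pattern);
}
```

With this, both the small-model load from [x3] and the tiny-model load
from [x4] match, while the st1w store does not.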

> but even then matching exact register name and instruction scheduling
> seems fragile.

The only hard-coded register names are the parameters, which are
guaranteed by the ABI.  Testing for those should be fine.

The dg-options pass -fno-schedule-insns, but I guess they should
also pass -fno-schedule-insns2.  Or maybe just use -O instead.
We can always revisit this later if even that isn't enough to make
the order stable.

Richard


Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Kugan Vivekanandarajah
Hi Richard,

On Thu, 6 Jun 2019 at 19:35, Richard Sandiford
 wrote:
>
> Kugan Vivekanandarajah  writes:
> > Hi Richard,
> >
> > Thanks for the review. Attached is the latest patch.
> >
> > For testcases like cond_arith_1.c, gcc ICEs in fwprop with the patch. I
> > am limiting fwprop in cases like this. Is there a better fix for this?
> > index cf2c9de..2c99285 100644
> > --- a/gcc/fwprop.c
> > +++ b/gcc/fwprop.c
> > @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
> > rtx_insn *def_insn, rtx def_set)
> >else
> >  mode = GET_MODE (*loc);
> >
> > +  /* TODO. We can't get the mode for
> > + (set (reg:VNx16BI 109)
> > +  (unspec:VNx16BI [
> > +(reg:SI 131)
> > +(reg:SI 106)
> > +   ] UNSPEC_WHILE_LO))
> > + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
> > +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
> > +return false;
> >new_rtx = propagate_rtx (*loc, mode, reg, src,
> >   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
>
> What specifically goes wrong?  The unspec above isn't that unusual --
> many unspecs have different modes from their inputs.

cond_arith_1.c:38:1: internal compiler error: in paradoxical_subreg_p,
at rtl.h:3130
0x135f1d3 paradoxical_subreg_p(machine_mode, machine_mode)
../../88838/gcc/rtl.h:3130
0x135f1d3 propagate_rtx
../../88838/gcc/fwprop.c:683
0x135f4a3 forward_propagate_and_simplify
../../88838/gcc/fwprop.c:1371
0x135f4a3 forward_propagate_into
../../88838/gcc/fwprop.c:1430
0x135fdcb fwprop
../../88838/gcc/fwprop.c:1519
0x135fdcb execute
../../88838/gcc/fwprop.c:1550
Please submit a full bug report,
with preprocessed source if appropriate.


in forward_propagate_and_simplify

use_set:
(set (reg:VNx16BI 96 [ loop_mask_52 ])
(unspec:VNx16BI [
(reg:SI 92 [ _3 ])
(reg:SI 95 [ niters.36 ])
] UNSPEC_WHILE_LO))

reg:
(reg:SI 92 [ _3 ])

*loc:
(unspec:VNx16BI [
(reg:SI 92 [ _3 ])
(reg:SI 95 [ niters.36 ])
] UNSPEC_WHILE_LO)

src:
(subreg:SI (reg:DI 136 [ ivtmp_101 ]) 0)

use_insn:
(insn 87 86 88 4 (parallel [
(set (reg:VNx16BI 96 [ loop_mask_52 ])
(unspec:VNx16BI [
(reg:SI 92 [ _3 ])
(reg:SI 95 [ niters.36 ])
] UNSPEC_WHILE_LO))
(clobber (reg:CC 66 cc))
]) 4255 {while_ultsivnx16bi}
 (expr_list:REG_UNUSED (reg:CC 66 cc)
(nil)))

I think we calculate the mode to be VNx16BI, which is wrong?  Because of
that, the !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (new_rtx)))
check in propagate_rtx ICEs.

Thanks,
Kugan

>
> Thanks,
> Richard


Re: Review Hashtable extract node API

2019-06-06 Thread Jonathan Wakely

On 05/06/19 20:18 +0100, Jonathan Wakely wrote:

On 05/06/19 17:43 +0100, Jonathan Wakely wrote:

On 05/06/19 17:22 +0100, Jonathan Wakely wrote:

On 04/06/19 19:19 +0200, François Dumont wrote:

@@ -669,18 +670,19 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __node_base*
   _M_get_previous_node(size_type __bkt, __node_base* __n);

-  // Insert node with hash code __code, in bucket bkt if no rehash (assumes
-  // no element with its key already present). Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with key __k and hash code __code, in bucket __bkt if no
+  // rehash (assumes no element with its key already present).
+  template
iterator
-  _M_insert_unique_node(size_type __bkt, __hash_code __code,
-   __node_type* __n, size_type __n_elt = 1);
+   _M_insert_unique_node(const key_type& __k, size_type __bkt,
+ __hash_code __code, const _NodeAccessor&,
+ size_type __n_elt = 1);

-  // Insert node with hash code __code. Take ownership of the node,
-  // deallocate it on exception.
+  // Insert node with hash code __code.
+  template
iterator
-  _M_insert_multi_node(__node_type* __hint,
-  __hash_code __code, __node_type* __n);
+   _M_insert_multi_node(__node_type* __hint, __hash_code __code,
+const _NodeAccessor& __node_accessor);


It looks like most times you call these functions you pass an
identical lambda expression, but each of those lambda expressions will
create a unique type. That means you create different instantiations
of the function templates even though they do exactly the same thing.

That's just generating multiple copies of identical code. Passing in a
function object to provide the node pointer doesn't really seem
necessary anyway, so if it results in larger executables it's really
not desirable.
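
A small standalone sketch of that effect follows.  It is not the
hashtable code; instantiation_id and the two helper functions are
made-up names.  Each instantiation of the function template gets its
own static local, so comparing addresses shows whether two calls share
an instantiation:

```cpp
#include <type_traits>

// Each distinct instantiation of this template has its own "tag" object,
// so the returned address identifies the instantiation.
template<typename Callable>
const void* instantiation_id(const Callable&)
{
  static char tag;
  return &tag;
}

// Two textually identical lambda expressions still have different closure
// types, so they select different instantiations.
inline bool identical_lambdas_share_instantiation()
{
  auto a = [] { return 42; };
  auto b = [] { return 42; };
  static_assert(!std::is_same<decltype(a), decltype(b)>::value,
                "each lambda expression has its own closure type");
  return instantiation_id(a) == instantiation_id(b);
}

// Passing a plain (non-closure) type instead gives one instantiation.
inline bool same_pointer_type_shares_instantiation()
{
  int (*pa)() = [] { return 42; };
  int (*pb)() = [] { return 42; };
  return instantiation_id(pa) == instantiation_id(pb);
}
```

The first helper returns false and the second returns true, which is
the reason passing a raw __node_type* avoids the duplicate
instantiations.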


Also I didn't really like the name NodeAccessor. It's not an accessor,
because it performs ownership transfer. Invoking __node_accessor()
returns a __node_type* by releasing it from the previous owner (by
setting the owner's pointer member to null).

Passing a const reference to something called NodeAccessor does not
make it clear that it performs a mutating operation like that! If the
_M_insert_unique_node and _M_insert_multi_node functions did the
__node_accessor() call *before* rehashing, and rehashing threw an
exception, then they would leak. So it's important that the
__node_accessor() call happens at the right time, and so it's important
to name it well.

In my suggested patch the naming isn't misleading, because we just
pass a raw __node_type* and have a new comment saying:

   // Takes ownership of __n if insertion succeeds, throws otherwise.

The function doesn't have a callable with non-local effects that
modifies an object outside the function. Because the caller sets the
previous owner's pointer to null there's no danger of it happening at
the wrong time; it can only happen after the function has returned and
ownership transfer has completed.
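
The ownership rule can be sketched outside the hashtable like this.
This is only an illustration under assumptions: Node, ScopedNode, Table
and insert_value are made-up names loosely modelled on the idea of
_Scoped_node, not the code in the patch, and live_nodes exists only to
make leaks observable:

```cpp
#include <stdexcept>

struct Node { int value; };   // hypothetical node type

static int live_nodes = 0;    // counts nodes that have not been freed

// Hypothetical RAII guard: owns the node until ptr is set to null.
struct ScopedNode
{
  explicit ScopedNode(int v) : ptr(new Node{v}) { ++live_nodes; }
  ~ScopedNode() { if (ptr) { delete ptr; --live_nodes; } }
  ScopedNode(const ScopedNode&) = delete;
  ScopedNode& operator=(const ScopedNode&) = delete;
  Node* ptr;
};

struct Table
{
  Node* slot = nullptr;
  bool fail_rehash = false;   // simulates rehashing throwing

  // Takes ownership of n if insertion succeeds, throws otherwise.
  Node* insert_unique_node(Node* n)
  {
    if (fail_rehash)
      throw std::runtime_error("rehash failed");
    slot = n;
    return slot;
  }

  ~Table() { if (slot) { delete slot; --live_nodes; } }
};

inline Node* insert_value(Table& t, int v)
{
  ScopedNode guard(v);
  Node* pos = t.insert_unique_node(guard.ptr);  // may throw; guard still owns
  guard.ptr = nullptr;  // release only after the insertion has succeeded
  return pos;
}
```

If insert_unique_node throws, the guard's destructor frees the node, so
nothing leaks; on success the table is the sole owner.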


As a further evolution that simplifies some uses of _Scoped_node we
could give it a constructor that allocates a node and constructs an
element, as in the attached patch.


Of course all this code is completely wrong, because it uses raw
pointers not the allocator's pointer type. But that's a much bigger
problem that needs to be solved separately.




[PATCH] Fix PR90574

2019-06-06 Thread Richard Biener


The following fixes debugging experience (and coverage) for cases
where CFG construction "optimizes" the CFG by squashing labels
into the same basic-block, defeating the regular mechanism of
dropping labels that are not reachable as done by CFG cleanup.

Writing coverage testcases is easy enough here; guality, IIRC,
cannot test whether we stop at exactly a given line.  After the
patch, gdb with a breakpoint set on line 4 stops at
line 10 (stepping goes from 2 to 10 directly).

Bootstrap & regtest running on x86_64-unknown-linux-gnu.

The patch has no bad effects on code generation when optimizing,
we just produce more "garbage" CFG upfront to leave optimizing
the CFG to machinery that knows how to do it correctly.  It
does have code-generation effects when not optimizing where
for the first testcase instead of

nop
.L2:
cmpl    $1, -4(%rbp)
jne .L3

we now emit

cmpl    $0, -4(%rbp)
.L3:
cmpl    $1, -4(%rbp)
jne .L4

and CFG cleanup done after RTL expansion elides the jump
but not the compare.

Any objections?

Thanks,
Richard.

2019-06-06  Richard Biener  

PR debug/90574
* tree-cfg.c (stmt_starts_bb_p): Split blocks at labels
that appear after user labels.

* gcc.misc-tests/gcov-pr90574-1.c: New testcase.
* gcc.misc-tests/gcov-pr90574-2.c: Likewise.

Index: gcc/tree-cfg.c
===
--- gcc/tree-cfg.c  (revision 271990)
+++ gcc/tree-cfg.c  (working copy)
@@ -2722,10 +2722,10 @@ stmt_starts_bb_p (gimple *stmt, gimple *
  || FORCED_LABEL (gimple_label_label (label_stmt)))
return true;
 
-  if (prev_stmt && gimple_code (prev_stmt) == GIMPLE_LABEL)
+  if (glabel *plabel = safe_dyn_cast  (prev_stmt))
{
- if (DECL_NONLOCAL (gimple_label_label (
-  as_a  (prev_stmt
+ if (DECL_NONLOCAL (gimple_label_label (plabel))
+ || !DECL_ARTIFICIAL (gimple_label_label (plabel)))
return true;
 
  cfg_stats.num_merged_labels++;

Index: gcc/testsuite/gcc.misc-tests/gcov-pr90574-1.c
===
--- gcc/testsuite/gcc.misc-tests/gcov-pr90574-1.c   (nonexistent)
+++ gcc/testsuite/gcc.misc-tests/gcov-pr90574-1.c   (working copy)
@@ -0,0 +1,20 @@
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+
+int main(int argc, char **argv)
+{
+  if (argc == 0)
+{
+  int *ptr;
+label:  /* count(#) */
+   {
+   }
+}
+  if (argc == 1)
+{
+  __builtin_printf("hello\n");
+}
+  return 0;
+}
+
+/* { dg-final { run-gcov gcov-pr90574-1.c } } */
Index: gcc/testsuite/gcc.misc-tests/gcov-pr90574-2.c
===
--- gcc/testsuite/gcc.misc-tests/gcov-pr90574-2.c   (nonexistent)
+++ gcc/testsuite/gcc.misc-tests/gcov-pr90574-2.c   (working copy)
@@ -0,0 +1,15 @@
+/* { dg-options "-fprofile-arcs -ftest-coverage" } */
+/* { dg-do run { target native } } */
+
+int main(int argc, char **argv)
+{
+  switch (argc)
+{
+case 0:
+  foo: /* count(#) */
+case 1:;
+}
+  return 0;
+}
+
+/* { dg-final { run-gcov gcov-pr90574-2.c } } */


Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Richard Sandiford
Kugan Vivekanandarajah  writes:
> Hi Richard,
>
> On Thu, 6 Jun 2019 at 19:35, Richard Sandiford
>  wrote:
>>
>> Kugan Vivekanandarajah  writes:
>> > Hi Richard,
>> >
>> > Thanks for the review. Attached is the latest patch.
>> >
>> > For testcases like cond_arith_1.c, gcc ICEs in fwprop with the patch. I
>> > am limiting fwprop in cases like this. Is there a better fix for this?
>> > index cf2c9de..2c99285 100644
>> > --- a/gcc/fwprop.c
>> > +++ b/gcc/fwprop.c
>> > @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
>> > rtx_insn *def_insn, rtx def_set)
>> >else
>> >  mode = GET_MODE (*loc);
>> >
>> > +  /* TODO. We can't get the mode for
>> > + (set (reg:VNx16BI 109)
>> > +  (unspec:VNx16BI [
>> > +(reg:SI 131)
>> > +(reg:SI 106)
>> > +   ] UNSPEC_WHILE_LO))
>> > + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
>> > +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
>> > +return false;
>> >new_rtx = propagate_rtx (*loc, mode, reg, src,
>> >   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
>>
>> What specifically goes wrong?  The unspec above isn't that unusual --
>> many unspecs have different modes from their inputs.
>
> cond_arith_1.c:38:1: internal compiler error: in paradoxical_subreg_p,
> at rtl.h:3130
> 0x135f1d3 paradoxical_subreg_p(machine_mode, machine_mode)
> ../../88838/gcc/rtl.h:3130
> 0x135f1d3 propagate_rtx
> ../../88838/gcc/fwprop.c:683
> 0x135f4a3 forward_propagate_and_simplify
> ../../88838/gcc/fwprop.c:1371
> 0x135f4a3 forward_propagate_into
> ../../88838/gcc/fwprop.c:1430
> 0x135fdcb fwprop
> ../../88838/gcc/fwprop.c:1519
> 0x135fdcb execute
> ../../88838/gcc/fwprop.c:1550
> Please submit a full bug report,
> with preprocessed source if appropriate.
>
>
> in forward_propagate_and_simplify
>
> use_set:
> (set (reg:VNx16BI 96 [ loop_mask_52 ])
> (unspec:VNx16BI [
> (reg:SI 92 [ _3 ])
> (reg:SI 95 [ niters.36 ])
> ] UNSPEC_WHILE_LO))
>
> reg:
> (reg:SI 92 [ _3 ])
>
> *loc:
> (unspec:VNx16BI [
> (reg:SI 92 [ _3 ])
> (reg:SI 95 [ niters.36 ])
> ] UNSPEC_WHILE_LO)
>
> src:
> (subreg:SI (reg:DI 136 [ ivtmp_101 ]) 0)
>
> use_insn:
> (insn 87 86 88 4 (parallel [
> (set (reg:VNx16BI 96 [ loop_mask_52 ])
> (unspec:VNx16BI [
> (reg:SI 92 [ _3 ])
> (reg:SI 95 [ niters.36 ])
> ] UNSPEC_WHILE_LO))
> (clobber (reg:CC 66 cc))
> ]) 4255 {while_ultsivnx16bi}
>  (expr_list:REG_UNUSED (reg:CC 66 cc)
> (nil)))
>
> I think we calculate the mode to be VNx16BI, which is wrong?
> Because of that, in propagate_rtx, !paradoxical_subreg_p (mode,
> GET_MODE (SUBREG_REG (new_rtx))) ICEs.

Looks like something I hit on the ACLE branch, but didn't have a
non-ACLE reproducer for (see 065881acf0de35ff7818c1fc92769e1c106e1028).

Does the attached work?  The current call is wrong because "mode"
is the mode of "x", not the mode of "new_rtx".

Thanks,
Richard


2019-06-06  Richard Sandiford  

gcc/
* fwprop.c (propagate_rtx): Fix call to paradoxical_subreg_p.

Index: gcc/fwprop.c
===
--- gcc/fwprop.c2019-03-08 18:14:25.333011645 +
+++ gcc/fwprop.c2019-06-06 13:04:34.423476690 +0100
@@ -680,7 +680,7 @@ propagate_rtx (rtx x, machine_mode mode,
   || CONSTANT_P (new_rtx)
   || (GET_CODE (new_rtx) == SUBREG
  && REG_P (SUBREG_REG (new_rtx))
- && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (new_rtx)))))
+ && !paradoxical_subreg_p (new_rtx)))
 flags |= PR_CAN_APPEAR;
   if (!varying_mem_p (new_rtx))
 flags |= PR_HANDLE_MEM;
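
To illustrate why asking the question about the wrong mode misfires, here
is a minimal toy model. It is not GCC code: ToyMode, ToySubreg and
toy_paradoxical_p are hypothetical stand-ins, modes are reduced to a fixed
bit width, and the 128-bit "use" mode is an arbitrary assumption standing
in for the mode of "x".

```cpp
// Toy model only; none of these names exist in GCC.
struct ToyMode
{
  const char *name;
  unsigned bits;
};

// A subreg has its own (outer) mode and wraps a register in an inner mode.
struct ToySubreg
{
  ToyMode outer;
  ToyMode inner;
};

// Two-mode form: paradoxical means the outer mode is wider than the inner.
constexpr bool
toy_paradoxical_p (ToyMode outer, ToyMode inner)
{
  return outer.bits > inner.bits;
}

// One-argument form: both modes come from the subreg itself, so the
// caller cannot pass a mode that belongs to some other expression.
constexpr bool
toy_paradoxical_p (const ToySubreg &s)
{
  return toy_paradoxical_p (s.outer, s.inner);
}

// Mirrors the shape of (subreg:SI (reg:DI ...) 0): a 32-bit view of a
// 64-bit register, which is not paradoxical.
constexpr ToySubreg toy_src = { { "SI", 32 }, { "DI", 64 } };

// Assumed mode of the expression being propagated into (hypothetical).
constexpr ToyMode toy_use_mode = { "wide", 128 };

static_assert (!toy_paradoxical_p (toy_src),
               "asking about the subreg itself");
static_assert (toy_paradoxical_p (toy_use_mode, toy_src.inner),
               "an unrelated outer mode gives the opposite answer");
```

In the toy model the wrong mode only flips the answer. In the reported
failure the backtrace shows the ICE being raised inside the two-mode
paradoxical_subreg_p itself.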


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-06 Thread Uros Bizjak
On Thu, Jun 6, 2019 at 7:54 AM Hongtao Liu  wrote:
>
> Hi Uros and all:
>   This patch enables support for AVX512_VP2INTERSECT, which will
> be in Willow Cove. AVX512_VP2INTERSECT has two instructions:
> VP2INTERSECTD and VP2INTERSECTQ. For more details, please refer to
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
>
>   Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
>
> Changelog:
>
> gcc/
> +2019-06-06  Hongtao Liu  
> + H.J. Lu  
> + Olga Makhotina  
> +
> + * common/config/i386/i386-common.c
> + (OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
> + OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
> + (OPTION_MASK_ISA2_AVX512F_UNSET): Add
> + OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
> + (ix86_handle_option): Handle -mavx512vp2intersect.
> + * config/i386/avx512vp2intersectintrin.h: New.
> + * config/i386/avx512vp2intersectvlintrin.h: New.
> + * config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
> + * config/i386/driver-i386.c (host_detect_local_cpu): Detect
> + AVX512VP2INTERSECT.
> + * config/i386/i386-builtin-types.def: Add new types.
> + * config/i386/i386-builtin.def: Add new builtins.
> + * config/i386/i386-builtins.c: (enum processor_features): Add
> + F_AVX512VP2INTERSECT.
> + (static const _isa_names_table isa_names_table): Ditto.
> + * config/i386/i386-c.c (ix86_target_macros_internal): Define
> + __AVX512VP2INTERSECT__.
> + * config/i386/i386-expand.c (ix86_expand_builtin): Expand
> + IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
> + IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
> + IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
> + * config/i386/i386-modes.def (P2QI, P2HI): New modes.
> + * config/i386/i386-options.c (ix86_target_string): Add
> + -mavx512vp2intersect.
> + (ix86_option_override_internal): Handle AVX512VP2INTERSECT.
> + * config/i386/i386.c (ix86_hard_regno_nregs): Allocate two regs for
> + P2HImode and P2QImode.
> + (ix86_hard_regno_mode_ok): Register pair only starts at even hardreg
> + number for P2QImode and P2HImode.
> + * config/i386/i386.h (TARGET_AVX512VP2INTERSECT,
> + TARGET_AVX512VP2INTERSECT_P): New.
> + (PTA_AVX512VP2INTERSECT): Ditto.
> + * config/i386/i386.opt: Add -mavx512vp2intersect.
> + * config/i386/immintrin.h: Include avx512vp2intersectintrin.h and
> + avx512vp2intersectvlintrin.h.
> + * config/i386/sse.md (define_c_enum "unspec"): Add UNSPEC_VP2INTERSECT.
> + (define_mode_iterator VI48_AVX512VP2VL): New.
> + (avx512vp2intersect_2intersect,
> + avx512vp2intersect_2intersectv16si): New define_insn patterns.
> + (*vec_extractp2hi, *vec_extractp2qi): New define_insn_and_split
> + patterns.
> + * config.gcc: Add avx512vp2intersectvlintrin.h and
> + avx512vp2intersectintrin.h to extra_headers.
> + * doc/invoke.texi: Document -mavx512vp2intersect.
> +
>
> gcc/testsuite/
> +2019-06-06  Hongtao Liu  
> + Olga Makhotina  
> +
> + * gcc.target/i386/avx512-check.h: Handle bit_AVX512VP2INTERSECT.
> + * gcc.target/i386/avx512vp2intersect-2intersect-1a.c: New test.
> + * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Likewise.
> + * gcc.target/i386/avx512vp2intersect-2intersectvl-1a.c: Likewise.
> + * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Likewise.
> + * gcc.target/i386/sse-12.c: Add -mavx512vp2intersect.
> + * gcc.target/i386/sse-13.c: Likewise.
> + * gcc.target/i386/sse-14.c: Likewise.
> + * gcc.target/i386/sse-22.c: Likewise.
> + * gcc.target/i386/sse-23.c: Likewise.
> + * g++.dg/other/i386-2.C: Likewise.
> + * g++.dg/other/i386-3.C: Likewise.
> +

+case OPT_mavx512vp2intersect:
+  if (value)
+{
+  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VP2INTERSECT_SET;
+  opts->x_ix86_isa_flags2_explicit |=
OPTION_MASK_ISA_AVX512VP2INTERSECT_SET;
+  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
+  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
+}

some space/tab mixup here.

+(define_mode_iterator VI48_AVX512VP2VL
+  [V8DI
+  (V4DI  "TARGET_AVX512VL") (V2DI  "TARGET_AVX512VL")
+  (V8SI "TARGET_AVX512VL") (V4SI  "TARGET_AVX512VL")])

also here (or maybe a vertical alignment issue).

+  op2 = copy_to_reg (op2);
+  op3 = copy_to_reg (op3);

The predicate says that this one can be a memory operand as well. I
suggest you use

if (!insn_data[icode].operand[X].predicate (opX, modeX))
  opX = copy_to_mode_reg (modeX, opX);

This would also handle a possible VOIDmode vector 0 operand.

+
+  op4 = gen_reg_rtx (mode4);
+  emit_insn (GEN_FCN (icode) (op4, op2, op3));
+  mode0 = GET_MODE_INNER (GET_MODE (op4));
+  pat = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, GEN_INT (0)));
+  pat2 = gen_rtx_VEC_SELECT (mode0, op4, pat);
+  emit_move_insn (gen_rtx_MEM (mode0, op0), pat2);
+  pat = gen_rtx_PARALLEL (VOIDmode, gen_rtvec (1, GEN_INT (1)));
+  pat2 = gen_rtx_VEC_SELECT (mode0, op4, pat);
+  emit_move_insn 

Re: [PATCH 4/4] Update a bit dump format.

2019-06-06 Thread Jan Hubicka
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * value-prof.c (dump_histogram_value): Change dump format.
>   (gimple_mod_subtract_transform): Remove legacy comment.

OK,
Honza


Re: [PATCH 3/4] Dump histograms only if present.

2019-06-06 Thread Jan Hubicka
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * value-prof.c (dump_histogram_value): Print histogram values
>   only if present.

What is the point of having a histogram value when there are no counters?

Honza
> ---
>  gcc/value-prof.c | 72 +++-
>  1 file changed, 28 insertions(+), 44 deletions(-)
> 

> diff --git a/gcc/value-prof.c b/gcc/value-prof.c
> index e893ca084c9..25b957d0c0a 100644
> --- a/gcc/value-prof.c
> +++ b/gcc/value-prof.c
> @@ -228,42 +228,38 @@ dump_histogram_value (FILE *dump_file, histogram_value 
> hist)
>switch (hist->type)
>  {
>  case HIST_TYPE_INTERVAL:
> -  fprintf (dump_file, "Interval counter range %d -- %d",
> -hist->hdata.intvl.int_start,
> -(hist->hdata.intvl.int_start
> - + hist->hdata.intvl.steps - 1));
>if (hist->hvalue.counters)
>   {
> -unsigned int i;
> -fprintf (dump_file, " [");
> -   for (i = 0; i < hist->hdata.intvl.steps; i++)
> -  fprintf (dump_file, " %d:%" PRId64,
> -   hist->hdata.intvl.int_start + i,
> -   (int64_t) hist->hvalue.counters[i]);
> -fprintf (dump_file, " ] outside range:%" PRId64,
> - (int64_t) hist->hvalue.counters[i]);
> +   fprintf (dump_file, "Interval counter range %d -- %d",
> +hist->hdata.intvl.int_start,
> +(hist->hdata.intvl.int_start
> + + hist->hdata.intvl.steps - 1));
> +
> +   unsigned int i;
> +   fprintf (dump_file, " [");
> +   for (i = 0; i < hist->hdata.intvl.steps; i++)
> + fprintf (dump_file, " %d:%" PRId64,
> +  hist->hdata.intvl.int_start + i,
> +  (int64_t) hist->hvalue.counters[i]);
> +   fprintf (dump_file, " ] outside range:%" PRId64 ".\n",
> +(int64_t) hist->hvalue.counters[i]);
>   }
> -  fprintf (dump_file, ".\n");
>break;
>  
>  case HIST_TYPE_POW2:
> -  fprintf (dump_file, "Pow2 counter ");
>if (hist->hvalue.counters)
> - {
> -fprintf (dump_file, "pow2:%" PRId64
> - " nonpow2:%" PRId64,
> - (int64_t) hist->hvalue.counters[1],
> - (int64_t) hist->hvalue.counters[0]);
> - }
> -  fprintf (dump_file, ".\n");
> + fprintf (dump_file, "Pow2 counter pow2:%" PRId64
> +  " nonpow2:%" PRId64 ".\n",
> +  (int64_t) hist->hvalue.counters[1],
> +  (int64_t) hist->hvalue.counters[0]);
>break;
>  
>  case HIST_TYPE_SINGLE_VALUE:
>  case HIST_TYPE_INDIR_CALL:
> -  fprintf (dump_file, (hist->type == HIST_TYPE_SINGLE_VALUE
> -? "Single values " : "Indirect call "));
>if (hist->hvalue.counters)
>   {
> +   fprintf (dump_file, (hist->type == HIST_TYPE_SINGLE_VALUE
> +? "Single values " : "Indirect call "));
> for (unsigned i = 0; i < GCOV_DISK_SINGLE_VALUES; i++)
>   {
> fprintf (dump_file, "[%" PRId64 ":%" PRId64 "]",
> @@ -272,40 +268,28 @@ dump_histogram_value (FILE *dump_file, histogram_value 
> hist)
> if (i != GCOV_DISK_SINGLE_VALUES - 1)
>   fprintf (dump_file, ", ");
>   }
> +   fprintf (dump_file, ".\n");
>   }
> -  fprintf (dump_file, ".\n");
>break;
>  
>  case HIST_TYPE_AVERAGE:
> -  fprintf (dump_file, "Average value ");
>if (hist->hvalue.counters)
> - {
> -fprintf (dump_file, "sum:%" PRId64
> - " times:%" PRId64,
> - (int64_t) hist->hvalue.counters[0],
> - (int64_t) hist->hvalue.counters[1]);
> - }
> -  fprintf (dump_file, ".\n");
> + fprintf (dump_file, "Average value sum:%" PRId64
> +  " times:%" PRId64 ".\n",
> +  (int64_t) hist->hvalue.counters[0],
> +  (int64_t) hist->hvalue.counters[1]);
>break;
>  
>  case HIST_TYPE_IOR:
> -  fprintf (dump_file, "IOR value ");
>if (hist->hvalue.counters)
> - {
> -fprintf (dump_file, "ior:%" PRId64,
> - (int64_t) hist->hvalue.counters[0]);
> - }
> -  fprintf (dump_file, ".\n");
> + fprintf (dump_file, "IOR value ior:%" PRId64 ".\n",
> +  (int64_t) hist->hvalue.counters[0]);
>break;
>  
>  case HIST_TYPE_TIME_PROFILE:
> -  fprintf (dump_file, "Time profile ");
>if (hist->hvalue.counters)
> -  {
> -fprintf (dump_file, "time:%" PRId64,
> - (int64_t) hist->hvalue.counters[0]);
> -  }
> -  fprintf (dump_file, ".\n");
> + fprintf (dump_file, "Time profile time:%" PRId64 ".\n",
> +  (int64_t) hist->hvalue.counters[0]);
>break;
>  case HIST_TYPE_MAX:
>gcc_unreachable ();



[PATCH] Refactor SFINAE constraints on std::tuple constructors

2019-06-06 Thread Jonathan Wakely

Replace the _TC class template with the better-named _TupleConstraints
one, which provides a different set of member functions. The new members
do not distinguish construction from lvalues and rvalues, but expect
the caller to do that by providing different template arguments. Within
the std::tuple primary template and std::tuple<T1, T2> partial
specialization the _TupleConstraints members are used via new alias
templates like _ImplicitCtor and _ExplicitCtor which makes the
constructor constraints less verbose and repetitive. For example, where
we previously had:

template<typename... _UElements, typename
  enable_if<
    _TMC<_UElements...>::template
      _MoveConstructibleTuple<_UElements...>()
    && _TMC<_UElements...>::template
      _ImplicitlyMoveConvertibleTuple<_UElements...>()
    && (sizeof...(_Elements) >= 1),
  bool>::type=true>
  constexpr tuple(_UElements&&... __elements)

We now have:

template<typename... _UElements,
         bool _Valid = __valid_args<_UElements...>(),
         _ImplicitCtor<_Valid, _UElements...> = true>
  constexpr
  tuple(_UElements&&... __elements)

There are two semantic changes as a result of the refactoring:

- The allocator-extended default constructor is now constrained.
- The rewritten constraints fix PR 90700.

* include/std/tuple (_TC): Replace with _TupleConstraints.
(_TupleConstraints): New helper for SFINAE constraints, with more
expressive member functions to reduce duplication when used.
(tuple::_TC2, tuple::_TMC, tuple::_TNTC): Remove.
(tuple::_TCC): Replace dummy type parameter with bool non-type
parameter that can be used to check the pack size.
(tuple::_ImplicitDefaultCtor, tuple::_ExplicitDefaultCtor)
(tuple::_ImplicitCtor, tuple::_ExplicitCtor): New alias templates for
checking constraints in constructors.
(tuple::__valid_args, tuple::_UseOtherCtor, tuple::__use_other_ctor):
New SFINAE helpers.
(tuple::tuple): Use new helpers to reduce repetition in constraints.
(tuple::tuple(allocator_arg_t, const Alloc&)): Constrain.
(tuple<T1, T2>::_TCC, tuple<T1, T2>::_ImplicitDefaultCtor)
(tuple<T1, T2>::_ExplicitDefaultCtor, tuple<T1, T2>::_ImplicitCtor)
(tuple<T1, T2>::_ExplicitCtor): New alias templates for checking
constraints in constructors.
(tuple<T1, T2>::__is_alloc_arg()): New SFINAE helpers.
(tuple<T1, T2>::tuple): Use new helpers to reduce repetition in
constraints.
(tuple<T1, T2>::tuple(allocator_arg_t, const Alloc&)): Constrain.
* testsuite/20_util/tuple/cons/90700.cc: New test.
* testsuite/20_util/tuple/cons/allocators.cc: Add default constructor
to meet new constraint on allocator-extended default constructor.

An earlier version of this was discussed last August, and this patch is
the outcome of that proposal. It gives some nice compilation speedups,
e.g. -ftime-report output before and after:

TOTAL :   3.19  0.80  4.01 325721 kB
TOTAL :   2.32  0.60  2.94 245921 kB

Tested x86_64-linux, committed to trunk.
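
For readers who have not seen the pattern, the following is a minimal
sketch of the same idea on a one-element wrapper. It is not the libstdc++
code: Box, BoxConstraints, ImplicitCtor, ExplicitCtor and valid_arg are
hypothetical names chosen for the example, and the checks are much simpler
than the real tuple constraints. A bool template parameter decides whether
the constraints are evaluated at all, and two alias templates select
between the implicit and the explicit constructor.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical constraint helper.  The bool parameter lets the caller
// switch every check off in one place.
template<bool Enabled, typename T>
struct BoxConstraints
{
  // T is constructible from U and the conversion may be implicit.
  template<typename U>
  static constexpr bool is_implicitly_constructible ()
  {
    return std::is_constructible<T, U>::value
           && std::is_convertible<U, T>::value;
  }

  // T is constructible from U, but only explicitly.
  template<typename U>
  static constexpr bool is_explicitly_constructible ()
  {
    return std::is_constructible<T, U>::value
           && !std::is_convertible<U, T>::value;
  }
};

// Disabled case: no constructor is viable.
template<typename T>
struct BoxConstraints<false, T>
{
  template<typename U>
  static constexpr bool is_implicitly_constructible () { return false; }
  template<typename U>
  static constexpr bool is_explicitly_constructible () { return false; }
};

template<typename T>
class Box
{
  template<bool Valid, typename U>
  using ImplicitCtor = typename std::enable_if<
    BoxConstraints<Valid, T>::template is_implicitly_constructible<U> (),
    bool>::type;

  template<bool Valid, typename U>
  using ExplicitCtor = typename std::enable_if<
    BoxConstraints<Valid, T>::template is_explicitly_constructible<U> (),
    bool>::type;

  // False when U is Box itself, so copies and moves never reach the
  // constructor templates below.
  template<typename U>
  static constexpr bool valid_arg ()
  { return !std::is_same<typename std::decay<U>::type, Box>::value; }

public:
  template<typename U, bool Valid = valid_arg<U> (),
           ImplicitCtor<Valid, U> = true>
  Box (U &&u) : value (std::forward<U> (u)) { }

  template<typename U, bool Valid = valid_arg<U> (),
           ExplicitCtor<Valid, U> = false>
  explicit Box (U &&u) : value (std::forward<U> (u)) { }

  T value;
};

struct OnlyExplicit { explicit OnlyExplicit (int) { } };

static_assert (std::is_convertible<int, Box<long>>::value,
               "implicit constructor selected");
static_assert (std::is_constructible<Box<OnlyExplicit>, int>::value,
               "explicit constructor selected");
static_assert (!std::is_convertible<int, Box<OnlyExplicit>>::value,
               "no implicit conversion through an explicit constructor");
static_assert (!std::is_constructible<Box<long>, std::string>::value,
               "neither constructor is viable");
```

The point of the bool parameter is that the expensive checks are skipped
whenever the cheap precondition already fails, which is one plausible
reason such a refactoring reduces compile time.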


commit 0132493cd663f3470f468cbf3d327d71320f290e
Author: Jonathan Wakely 
Date:   Mon Jun 3 15:20:56 2019 +0100

Refactor SFINAE constraints on std::tuple constructors

Replace the _TC class template with the better-named _TupleConstraints
one, which provides a different set of member functions. The new members
do not distinguish construction from lvalues and rvalues, but expect
the caller to do that by providing different template arguments. Within
the std::tuple primary template and std::tuple<T1, T2> partial
specialization the _TupleConstraints members are used via new alias
templates like _ImplicitCtor and _ExplicitCtor which makes the
constructor constraints less verbose and repetitive. For example, where
we previously had:

 template<typename... _UElements, typename
   enable_if<
     _TMC<_UElements...>::template
       _MoveConstructibleTuple<_UElements...>()
     && _TMC<_UElements...>::template
       _ImplicitlyMoveConvertibleTuple<_UElements...>()
     && (sizeof...(_Elements) >= 1),
   bool>::type=true>
   constexpr tuple(_UElements&&... __elements)

We now have:

 template<typename... _UElements,
          bool _Valid = __valid_args<_UElements...>(),
          _ImplicitCtor<_Valid, _UElements...> = true>
   constexpr
   tuple(_UElements&&... __elements)

There are two semantic changes as a result of the refactoring:

- The allocator-extended default constructor is now constrained.
- The rewritten constraints fix PR 90700.

* include/std/tuple (_TC): Replace with _TupleConstraints.
(_TupleConstraints): New helper for SFINAE constraints, with more
expressive member functions to reduce duplication when used.
(tuple::_TC2, tuple::_TMC, tuple::_TNTC): Remove.
(tuple::_TCC): Replace dummy type parameter with bool non-type
parameter that can be used to check the pack size.
(tuple::_ImplicitDefaultCtor, tu

[PATCH] Fix tests that fail with -std=gnu++98 or -std=gnu++11

2019-06-06 Thread Jonathan Wakely

* testsuite/18_support/set_terminate.cc: Do not run for C++98 mode.
* testsuite/18_support/set_unexpected.cc: Likewise.
* testsuite/20_util/is_nothrow_invocable/value.cc: Test converting to
void.
* testsuite/20_util/is_nothrow_invocable/value_ext.cc: Fix constexpr
function to be valid in C++11.
* testsuite/26_numerics/complex/proj.cc: Do not run for C++98 mode.
* testsuite/experimental/names.cc: Do not run for C++98 mode. Do not
include Library Fundamentals or Networking headers in C++11 mode.
* testsuite/ext/char8_t/atomic-1.cc: Do not run for C++98 mode.

Tested x86_64-linux, committed to trunk.

commit 755ea5a814d88ab49a4514483351ef4bdbfb70de
Author: Jonathan Wakely 
Date:   Mon Jun 3 23:49:42 2019 +0100

Fix tests that fail with -std=gnu++98 or -std=gnu++11

* testsuite/18_support/set_terminate.cc: Do not run for C++98 mode.
* testsuite/18_support/set_unexpected.cc: Likewise.
* testsuite/20_util/is_nothrow_invocable/value.cc: Test converting to
void.
* testsuite/20_util/is_nothrow_invocable/value_ext.cc: Fix constexpr
function to be valid in C++11.
* testsuite/26_numerics/complex/proj.cc: Do not run for C++98 mode.
* testsuite/experimental/names.cc: Do not run for C++98 mode. Do not
include Library Fundamentals or Networking headers in C++11 mode.
* testsuite/ext/char8_t/atomic-1.cc: Do not run for C++98 mode.

diff --git a/libstdc++-v3/testsuite/18_support/set_terminate.cc b/libstdc++-v3/testsuite/18_support/set_terminate.cc
index 632b9f1974a..81f182a7a76 100644
--- a/libstdc++-v3/testsuite/18_support/set_terminate.cc
+++ b/libstdc++-v3/testsuite/18_support/set_terminate.cc
@@ -15,6 +15,7 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
+// { dg-options "-std=gnu++11" }
 // { dg-do run }
 
 #include 
diff --git a/libstdc++-v3/testsuite/18_support/set_unexpected.cc b/libstdc++-v3/testsuite/18_support/set_unexpected.cc
index dac44e616cd..7c3f3d44790 100644
--- a/libstdc++-v3/testsuite/18_support/set_unexpected.cc
+++ b/libstdc++-v3/testsuite/18_support/set_unexpected.cc
@@ -15,7 +15,8 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
-// { dg-do run { target { c++98_only || { c++11_only || c++14_only } } } }
+// { dg-options "-std=gnu++11" }
+// { dg-do run { target { c++11_only || c++14_only } } }
 
 #include 
 #include 
diff --git a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value.cc b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value.cc
index 04d310fff38..c0c6a7dc8ea 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value.cc
@@ -119,9 +119,11 @@ void test01()
   static_assert( ! is_nt_invocable_r< T, F  >(), "call throws");
   static_assert( ! is_nt_invocable_r< NT,F  >(), "call throws");
   static_assert( ! is_nt_invocable_r< Ex,F  >(), "call throws");
+  static_assert( ! is_nt_invocable_r< void,  F  >(), "call throws");
   static_assert( ! is_nt_invocable_r< T, CF >(), "conversion throws");
   static_assert(   is_nt_invocable_r< NT,CF >(), "" );
   static_assert( ! is_nt_invocable_r< Ex,CF >(), "conversion fails");
+  static_assert(   is_nt_invocable_r< void,  CF >(), "");
 
   static_assert( ! is_nt_invocable< F,   int >(), "call throws");
   static_assert(   is_nt_invocable< F&,  int >(), "");
@@ -140,12 +142,14 @@ void test01()
 
   static_assert(   is_nt_invocable_r< char&,  CF,  int >(), "");
   static_assert(   is_nt_invocable_r< char&,  CF&, int >(), "");
+  static_assert(   is_nt_invocable_r< void,   CF&, int >(), "");
 
   static_assert( ! is_nt_invocable_r< T,  CF&, int >(),
   "conversion throws");
   static_assert(   is_nt_invocable_r< NT, CF&, int >(), "");
   static_assert( ! is_nt_invocable_r< Ex, CF&, int >(),
   "conversion fails, would use explicit constructor");
+  static_assert(   is_nt_invocable_r< void,   CF&, int >(), "");
 
   static_assert( ! is_nt_invocable< F, int, int >(),
   "would call private member");
@@ -157,6 +161,7 @@ void test01()
   };
   static_assert( is_nt_invocable< FX >(), "FX::operator() is nothrow" );
   static_assert( is_nt_invocable_r(), "no conversion needed" );
+  static_assert( is_nt_invocable_r(), "" );
 
   struct Y {
 explicit Y(X) noexcept; // not viable for implicit conversions
diff --git a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value_ext.cc b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value_ext.cc
index 3a133ade4de..2c00b1fbe75 100644
--- a/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value_ext.cc
+++ b/libstdc++-v3/testsuite/20_util/is_nothrow_invocable/value_ext.cc
@@ -24,19 +24,24 @@ template
   { return std::__is_nothrow_

Re: [PATCH] PR libstdc++/71579 assert that type traits are not misused with an incomplete type

2019-06-06 Thread Jonathan Wakely

On 31/05/19 11:35 +0100, Jonathan Wakely wrote:

On 31/05/19 08:58 +0300, Antony Polukhin wrote:

On Thu, May 30, 2019, 19:38 Jonathan Wakely  wrote:
<...>


I've attached a relative diff, to be applied on top of yours, with my
suggested tweaks. Do you see any issues with it?

(If you're happy with those tweaks I can go ahead and apply this,
there's no need for an updated patch from you).



Looks good. Please apply!


Here's what I've tested and committed to trunk.

I decided to add a more detailed changelog too.


I'm removing some of these assertions again, because they are either
redundant or wrong.

Tested x86_64-linux, committed to trunk.
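
As background on what a library-side completeness check looks like, here
is a minimal sketch using the common sizeof-based SFINAE idiom. It is an
assumption for illustration only, not the implementation of
__is_complete_or_unbounded, and the names is_complete_sketch, Declared
and Defined are hypothetical.

```cpp
#include <cstddef>
#include <type_traits>

// Viable only when sizeof (T) is well-formed, i.e. T is complete at
// this point in the translation unit.
template<typename T, std::size_t = sizeof (T)>
constexpr std::true_type
is_complete_sketch (T *)
{ return std::true_type (); }

// Fallback chosen when substituting sizeof (T) fails.
constexpr std::false_type
is_complete_sketch (...)
{ return std::false_type (); }

struct Declared;          // declared but never defined here
struct Defined { };

static_assert (decltype (is_complete_sketch
                         (static_cast<Defined *> (nullptr)))::value,
               "Defined is complete");
static_assert (!decltype (is_complete_sketch
                          (static_cast<Declared *> (nullptr)))::value,
               "Declared is incomplete");
```

A check like this gives an answer that depends on where it is first
evaluated, which is one reason to leave the diagnosis to the compiler
builtin when the trait is implemented by one.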


commit 6ffe4cf371688563be1680ffb75cc1160540cf2e
Author: redi 
Date:   Thu Jun 6 12:13:47 2019 +

Remove redundant static assertions in [meta.unary.prop] traits

The type property predicates that are implemented by a compiler builtin
already do the right checks in the compiler. The checks for complete
type or unbounded arrays were wrong for these types anyway.

* include/std/type_traits (is_empty, is_polymorphic, is_final)
(is_abstract, is_aggregate): Remove static_assert.
* testsuite/20_util/is_abstract/incomplete_neg.cc: Check for error
from builtin only.
* testsuite/20_util/is_aggregate/incomplete_neg.cc: Likewise. Add
missing -std=gnu++17 option.
* testsuite/20_util/is_empty/incomplete_neg.cc: New test.
* testsuite/20_util/is_final/incomplete_neg.cc: New test.
* testsuite/20_util/is_polymorphic/incomplete_neg.cc: Check for error
from builtin only.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272000 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/std/type_traits b/libstdc++-v3/include/std/type_traits
index 78a113af415..e53d3c8d535 100644
--- a/libstdc++-v3/include/std/type_traits
+++ b/libstdc++-v3/include/std/type_traits
@@ -746,19 +746,13 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
 struct is_empty
 : public integral_constant<bool, __is_empty(_Tp)>
-{
-  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
-	"template argument must be a complete class or an unbounded array");
-};
+{ };
 
   /// is_polymorphic
   template<typename _Tp>
 struct is_polymorphic
 : public integral_constant<bool, __is_polymorphic(_Tp)>
-{
-  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
-	"template argument must be a complete class or an unbounded array");
-};
+{ };
 
 #if __cplusplus >= 201402L
 #define __cpp_lib_is_final 201402L
@@ -766,20 +760,14 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   template<typename _Tp>
 struct is_final
 : public integral_constant<bool, __is_final(_Tp)>
-{
-  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
-	"template argument must be a complete class or an unbounded array");
-};
+{ };
 #endif
 
   /// is_abstract
   template<typename _Tp>
 struct is_abstract
 : public integral_constant<bool, __is_abstract(_Tp)>
-{
-  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
-	"template argument must be a complete class or an unbounded array");
-};
+{ };
 
   template::value>
@@ -3174,10 +3162,7 @@ template 
   template<typename _Tp>
 struct is_aggregate
 : bool_constant<__is_aggregate(remove_cv_t<_Tp>)>
-{
-  static_assert(std::__is_complete_or_unbounded(__type_identity<_Tp>{}),
-	"template argument must be a complete class or an unbounded array");
-};
+{ };
 
   /// is_aggregate_v
   template<typename _Tp>
diff --git a/libstdc++-v3/testsuite/20_util/is_abstract/incomplete_neg.cc b/libstdc++-v3/testsuite/20_util/is_abstract/incomplete_neg.cc
index 94f4ecd6000..a2a73d01a06 100644
--- a/libstdc++-v3/testsuite/20_util/is_abstract/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_abstract/incomplete_neg.cc
@@ -1,7 +1,5 @@
 // { dg-do compile { target c++11 } }
-// { dg-prune-output "invalid use of incomplete type" }
-// { dg-prune-output "must be a complete" }
-//
+
 // Copyright (C) 2019 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
@@ -19,6 +17,9 @@
 // with this library; see the file COPYING3.  If not see
 // <http://www.gnu.org/licenses/>.
 
+// Expect the compiler builtin to do the completeness check.
+// { dg-error "incomplete type" "" { target *-*-* } 0 }
+
 #include 
 
 class X;
diff --git a/libstdc++-v3/testsuite/20_util/is_aggregate/incomplete_neg.cc b/libstdc++-v3/testsuite/20_util/is_aggregate/incomplete_neg.cc
index 8a3dd551cbb..eff3f64c476 100644
--- a/libstdc++-v3/testsuite/20_util/is_aggregate/incomplete_neg.cc
+++ b/libstdc++-v3/testsuite/20_util/is_aggregate/incomplete_neg.cc
@@ -1,5 +1,6 @@
+// { dg-options "-std=gnu++17" }
 // { dg-do compile { target c++17 } }
-//
+
 // Copyright (C) 2019 Free Software Foundation, Inc.
 //
 // This file is part of the GNU ISO C++ Library.  This library is free
@@ -17,7 +18,8 @@
 // with this library; see the file COPYING3.  If n

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Kugan Vivekanandarajah
Hi Richard,

On Thu, 6 Jun 2019 at 22:07, Richard Sandiford
 wrote:
>
> Kugan Vivekanandarajah  writes:
> > Hi Richard,
> >
> > On Thu, 6 Jun 2019 at 19:35, Richard Sandiford
> >  wrote:
> >>
> >> Kugan Vivekanandarajah  writes:
> >> > Hi Richard,
> >> >
> >> > Thanks for the review. Attached is the latest patch.
> >> >
> >> > For testcase like cond_arith_1.c, with the patch, gcc ICE in fwprop. I
> >> > am limiting fwprop in cases like this. Is there a better fix for this?
> >> > index cf2c9de..2c99285 100644
> >> > --- a/gcc/fwprop.c
> >> > +++ b/gcc/fwprop.c
> >> > @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
> >> > rtx_insn *def_insn, rtx def_set)
> >> >else
> >> >  mode = GET_MODE (*loc);
> >> >
> >> > +  /* TODO. We can't get the mode for
> >> > + (set (reg:VNx16BI 109)
> >> > +  (unspec:VNx16BI [
> >> > +(reg:SI 131)
> >> > +(reg:SI 106)
> >> > +   ] UNSPEC_WHILE_LO))
> >> > + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
> >> > +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
> >> > +return false;
> >> >new_rtx = propagate_rtx (*loc, mode, reg, src,
> >> >   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
> >>
> >> What specifically goes wrong?  The unspec above isn't that unusual --
> >> many unspecs have different modes from their inputs.
> >
> > cond_arith_1.c:38:1: internal compiler error: in paradoxical_subreg_p,
> > at rtl.h:3130
> > 0x135f1d3 paradoxical_subreg_p(machine_mode, machine_mode)
> > ../../88838/gcc/rtl.h:3130
> > 0x135f1d3 propagate_rtx
> > ../../88838/gcc/fwprop.c:683
> > 0x135f4a3 forward_propagate_and_simplify
> > ../../88838/gcc/fwprop.c:1371
> > 0x135f4a3 forward_propagate_into
> > ../../88838/gcc/fwprop.c:1430
> > 0x135fdcb fwprop
> > ../../88838/gcc/fwprop.c:1519
> > 0x135fdcb execute
> > ../../88838/gcc/fwprop.c:1550
> > Please submit a full bug report,
> > with preprocessed source if appropriate.
> >
> >
> > in forward_propagate_and_simplify
> >
> > use_set:
> > (set (reg:VNx16BI 96 [ loop_mask_52 ])
> > (unspec:VNx16BI [
> > (reg:SI 92 [ _3 ])
> > (reg:SI 95 [ niters.36 ])
> > ] UNSPEC_WHILE_LO))
> >
> > reg:
> > (reg:SI 92 [ _3 ])
> >
> > *loc:
> > (unspec:VNx16BI [
> > (reg:SI 92 [ _3 ])
> > (reg:SI 95 [ niters.36 ])
> > ] UNSPEC_WHILE_LO)
> >
> > src:
> > (subreg:SI (reg:DI 136 [ ivtmp_101 ]) 0)
> >
> > use_insn:
> > (insn 87 86 88 4 (parallel [
> > (set (reg:VNx16BI 96 [ loop_mask_52 ])
> > (unspec:VNx16BI [
> > (reg:SI 92 [ _3 ])
> > (reg:SI 95 [ niters.36 ])
> > ] UNSPEC_WHILE_LO))
> > (clobber (reg:CC 66 cc))
> > ]) 4255 {while_ultsivnx16bi}
> >  (expr_list:REG_UNUSED (reg:CC 66 cc)
> > (nil)))
> >
> > I think we calculate the mode to be VNx16BI, which is wrong?
> > Because of that, in propagate_rtx, !paradoxical_subreg_p (mode,
> > GET_MODE (SUBREG_REG (new_rtx))) ICEs.
>
> Looks like something I hit on the ACLE branch, but didn't have a
> non-ACLE reproducer for (see 065881acf0de35ff7818c1fc92769e1c106e1028).
>
> Does the attached work?  The current call is wrong because "mode"
> is the mode of "x", not the mode of "new_rtx".

Yes, the attached patch works for this testcase. Are you planning to
commit it to trunk? I will wait for that.

Thanks,
Kugan
>
> Thanks,
> Richard
>
>
> 2019-06-06  Richard Sandiford  
>
> gcc/
> * fwprop.c (propagate_rtx): Fix call to paradoxical_subreg_p.
>
> Index: gcc/fwprop.c
> ===
> --- gcc/fwprop.c2019-03-08 18:14:25.333011645 +
> +++ gcc/fwprop.c2019-06-06 13:04:34.423476690 +0100
> @@ -680,7 +680,7 @@ propagate_rtx (rtx x, machine_mode mode,
>|| CONSTANT_P (new_rtx)
>|| (GET_CODE (new_rtx) == SUBREG
>   && REG_P (SUBREG_REG (new_rtx))
> - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (new_rtx)))))
> + && !paradoxical_subreg_p (new_rtx)))
>  flags |= PR_CAN_APPEAR;
>if (!varying_mem_p (new_rtx))
>  flags |= PR_HANDLE_MEM;


Re: [PATCH 3/4] Dump histograms only if present.

2019-06-06 Thread Martin Liška
On 6/6/19 2:16 PM, Jan Hubicka wrote:
> What is the point of having histogram value when there are no counters?

Because we first create histograms in branch_prob ->
gimple_find_values_to_profile and later read the profile from the file.
At that point we know which ones are empty, but we don't remove them.

Martin
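
The flow described above can be sketched with a toy model. This is not
the value-prof.c code: ToyHistogram, dump_histograms and demo_dump are
hypothetical names, and an empty vector stands in for a histogram whose
counters were never filled in from the profile.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Toy stand-in for a histogram; not GCC's histogram_value.
struct ToyHistogram
{
  std::string kind;                     // e.g. "Pow2 counter"
  std::vector<std::int64_t> counters;   // empty until a profile is read
};

// Write one line per histogram that has counters and skip the rest.
inline std::string
dump_histograms (const std::vector<ToyHistogram> &hists)
{
  std::ostringstream out;
  for (const ToyHistogram &h : hists)
    {
      if (h.counters.empty ())
        continue;                       // created, but nothing was read
      out << h.kind;
      for (std::int64_t c : h.counters)
        out << ' ' << c;
      out << ".\n";
    }
  return out.str ();
}

// Two histograms are created up front; only one gets counters later.
inline std::string
demo_dump ()
{
  std::vector<ToyHistogram> hists
    = { { "IOR value", {} }, { "Pow2 counter", { 3, 5 } } };
  return dump_histograms (hists);
}

// Small self-check for the sketch above.
inline void
check_demo ()
{
  assert (demo_dump () == "Pow2 counter 3 5.\n");
}
```

The empty histogram stays in the list and is simply skipped when dumping,
which matches the behaviour described in the reply.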


Re: [PATCH] Enable GCC support for AVX512_VP2INTERSECT.

2019-06-06 Thread Uros Bizjak
On Thu, Jun 6, 2019 at 2:12 PM Uros Bizjak  wrote:
>
> On Thu, Jun 6, 2019 at 7:54 AM Hongtao Liu  wrote:
> >
> > Hi Uros and all:
> >   This patch enables support for AVX512_VP2INTERSECT, which will
> > be in Willow Cove. AVX512_VP2INTERSECT has two instructions:
> > VP2INTERSECTD and VP2INTERSECTQ. For more details, please refer to
> > https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> >
> >   Bootstrap is ok, and no regressions for i386/x86-64 testsuite.
> >
> > Changelog:
> >
> > gcc/
> > +2019-06-06  Hongtao Liu  
> > + H.J. Lu  
> > + Olga Makhotina  
> > +
> > + * common/config/i386/i386-common.c
> > + (OPTION_MASK_ISA_AVX512VP2INTERSECT_SET,
> > + OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET): New macros.
> > + (OPTION_MASK_ISA2_AVX512F_UNSET): Add
> > + OPTION_MASK_ISA_AVX512VP2INTERSECT_UNSET.
> > + (ix86_handle_option): Handle -mavx512vp2intersect.
> > + * config/i386/avx512vp2intersectintrin.h: New.
> > + * config/i386/avx512vp2intersectvlintrin.h: New.
> > + * config/i386/cpuid.h (bit_AVX512VP2INTERSECT): New.
> > + * config/i386/driver-i386.c (host_detect_local_cpu): Detect
> > + AVX512VP2INTERSECT.
> > + * config/i386/i386-builtin-types.def: Add new types.
> > + * config/i386/i386-builtin.def: Add new builtins.
> > + * config/i386/i386-builtins.c: (enum processor_features): Add
> > + F_AVX512VP2INTERSECT.
> > + (static const _isa_names_table isa_names_table): Ditto.
> > + * config/i386/i386-c.c (ix86_target_macros_internal): Define
> > + __AVX512VP2INTERSECT__.
> > + * config/i386/i386-expand.c (ix86_expand_builtin): Expand
> > + IX86_BUILTIN_2INTERSECTD512, IX86_BUILTIN_2INTERSECTQ512,
> > + IX86_BUILTIN_2INTERSECTD256, IX86_BUILTIN_2INTERSECTQ256,
> > + IX86_BUILTIN_2INTERSECTD128, IX86_BUILTIN_2INTERSECTQ128.
> > + * config/i386/i386-modes.def (P2QI, P2HI): New modes.
> > + * config/i386/i386-options.c (ix86_target_string): Add
> > + -mavx512vp2intersect.
> > + (ix86_option_override_internal): Handle AVX512VP2INTERSECT.
> > + * config/i386/i386.c (ix86_hard_regno_nregs): Allocate two regs for
> > + P2HImode and P2QImode.
> > + (ix86_hard_regno_mode_ok): Register pair only starts at even hardreg
> > + number for P2QImode and P2HImode.
> > + * config/i386/i386.h (TARGET_AVX512VP2INTERSECT,
> > + TARGET_AVX512VP2INTERSECT_P): New.
> > + (PTA_AVX512VP2INTERSECT): Ditto.
> > + * config/i386/i386.opt: Add -mavx512vp2intersect.
> > + * config/i386/immintrin.h: Include avx512vp2intersectintrin.h and
> > + avx512vp2intersectvlintrin.h.
> > + * config/i386/sse.md (define_c_enum "unspec"): Add UNSPEC_VP2INTERSECT.
> > + (define_mode_iterator VI48_AVX512VP2VL): New.
> > + (avx512vp2intersect_2intersect,
> > + avx512vp2intersect_2intersectv16si): New define_insn patterns.
> > + (*vec_extractp2hi, *vec_extractp2qi): New define_insn_and_split
> > + patterns.
> > + * config.gcc: Add avx512vp2intersectvlintrin.h and
> > + avx512vp2intersectintrin.h to extra_headers.
> > + * doc/invoke.texi: Document -mavx512vp2intersect.
> > +
> >
> > gcc/testsuite/
> > +2019-06-06  Hongtao Liu  
> > + Olga Makhotina  
> > +
> > + * gcc.target/i386/avx512-check.h: Handle bit_AVX512VP2INTERSECT.
> > + * gcc.target/i386/avx512vp2intersect-2intersect-1a.c: New test.
> > + * gcc.target/i386/avx512vp2intersect-2intersect-1b.c: Likewise.
> > + * gcc.target/i386/avx512vp2intersect-2intersectvl-1a.c: Likewise.
> > + * gcc.target/i386/avx512vp2intersect-2intersectvl-1b.c: Likewise.
> > + * gcc.target/i386/sse-12.c: Add -mavx512vp2intersect.
> > + * gcc.target/i386/sse-13.c: Likewise.
> > + * gcc.target/i386/sse-14.c: Likewise.
> > + * gcc.target/i386/sse-22.c: Likewise.
> > + * gcc.target/i386/sse-23.c: Likewise.
> > + * g++.dg/other/i386-2.C: Likewise.
> > + * g++.dg/other/i386-3.C: Likewise.
> > +
>
> +case OPT_mavx512vp2intersect:
> +  if (value)
> +{
> +  opts->x_ix86_isa_flags2 |= OPTION_MASK_ISA_AVX512VP2INTERSECT_SET;
> +  opts->x_ix86_isa_flags2_explicit |=
> OPTION_MASK_ISA_AVX512VP2INTERSECT_SET;
> +  opts->x_ix86_isa_flags |= OPTION_MASK_ISA_AVX512F_SET;
> +  opts->x_ix86_isa_flags_explicit |= OPTION_MASK_ISA_AVX512F_SET;
> +}
>
> some space/tab mixup here.
>
> +(define_mode_iterator VI48_AVX512VP2VL
> +  [V8DI
> +  (V4DI  "TARGET_AVX512VL") (V2DI  "TARGET_AVX512VL")
> +  (V8SI "TARGET_AVX512VL") (V4SI  "TARGET_AVX512VL")])
>
> also here (or maybe a vertical alignment issue).
>
> +  op2 = copy_to_reg (op2);
> +  op3 = copy_to_reg (op3);
>
> The predicate says that this one can be a memory operand as well. I
> suggest you use
>
> if (!insn_data[icode].operand[X].predicate (opX, modeX))
>   opX = copy_to_mode_reg (modeX, opX);
>
> This would also handle a possible VOIDmode vector 0 operand.
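The "check the predicate, copy to a register only if it fails" idiom suggested above can be sketched in isolation. This is a toy model, not GCC code: every toy_* name below is hypothetical and merely stands in for insn_data[].operand[].predicate and copy_to_mode_reg, under the assumption that a predicate is a function that accepts or rejects an operand.

```c
#include <stdbool.h>

/* Toy operand kinds; hypothetical stand-ins for REG, MEM and constants.  */
typedef enum { TOY_REG, TOY_MEM, TOY_CONST } toy_kind;

typedef struct { toy_kind kind; int value; } toy_operand;

/* A predicate accepts or rejects an operand for a given insn slot.  */
typedef bool (*toy_predicate) (toy_operand);

/* Accepts registers only.  */
bool toy_register_operand (toy_operand op)
{
  return op.kind == TOY_REG;
}

/* Accepts registers and memory, but not constants.  */
bool toy_nonimmediate_operand (toy_operand op)
{
  return op.kind == TOY_REG || op.kind == TOY_MEM;
}

/* Hypothetical stand-in for copying an operand into a fresh register.  */
toy_operand toy_copy_to_reg (toy_operand op)
{
  toy_operand reg = { TOY_REG, op.value };
  return reg;
}

/* Copy OP to a register only when PRED rejects it; otherwise keep it
   as it is, so a memory operand survives when the pattern allows one.  */
toy_operand toy_legitimize (toy_operand op, toy_predicate pred)
{
  if (!pred (op))
    op = toy_copy_to_reg (op);
  return op;
}
```

With toy_nonimmediate_operand, a memory operand is left alone and only a constant is forced into a register, whereas an unconditional copy would always lose the memory form.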
>
> +
> +  op4 = gen_reg_rtx (mode4);
> +  emit_insn (GEN_FCN (icode) (op4, op2, op3));
> +  mode0 = GET_MODE_INNER (GET_MODE (op4));
> +  pat = gen_rtx_PARALLEL (VOIDmode, gen

Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Richard Sandiford
Kugan Vivekanandarajah  writes:
> Hi Richard,
>
> On Thu, 6 Jun 2019 at 22:07, Richard Sandiford
>  wrote:
>>
>> Kugan Vivekanandarajah  writes:
>> > Hi Richard,
>> >
>> > On Thu, 6 Jun 2019 at 19:35, Richard Sandiford
>> >  wrote:
>> >>
>> >> Kugan Vivekanandarajah  writes:
>> >> > Hi Richard,
>> >> >
>> >> > Thanks for the review. Attached is the latest patch.
>> >> >
>> >> > For testcases like cond_arith_1.c, GCC ICEs in fwprop with the patch. I
>> >> > am limiting fwprop in cases like this. Is there a better fix for this?
>> >> > index cf2c9de..2c99285 100644
>> >> > --- a/gcc/fwprop.c
>> >> > +++ b/gcc/fwprop.c
>> >> > @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
>> >> > rtx_insn *def_insn, rtx def_set)
>> >> >else
>> >> >  mode = GET_MODE (*loc);
>> >> >
>> >> > +  /* TODO. We can't get the mode for
>> >> > + (set (reg:VNx16BI 109)
>> >> > +  (unspec:VNx16BI [
>> >> > +(reg:SI 131)
>> >> > +(reg:SI 106)
>> >> > +   ] UNSPEC_WHILE_LO))
>> >> > + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
>> >> > +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
>> >> > +return false;
>> >> >new_rtx = propagate_rtx (*loc, mode, reg, src,
>> >> >   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
>> >>
>> >> What specifically goes wrong?  The unspec above isn't that unusual --
>> >> many unspecs have different modes from their inputs.
>> >
>> > cond_arith_1.c:38:1: internal compiler error: in paradoxical_subreg_p,
>> > at rtl.h:3130
>> > 0x135f1d3 paradoxical_subreg_p(machine_mode, machine_mode)
>> > ../../88838/gcc/rtl.h:3130
>> > 0x135f1d3 propagate_rtx
>> > ../../88838/gcc/fwprop.c:683
>> > 0x135f4a3 forward_propagate_and_simplify
>> > ../../88838/gcc/fwprop.c:1371
>> > 0x135f4a3 forward_propagate_into
>> > ../../88838/gcc/fwprop.c:1430
>> > 0x135fdcb fwprop
>> > ../../88838/gcc/fwprop.c:1519
>> > 0x135fdcb execute
>> > ../../88838/gcc/fwprop.c:1550
>> > Please submit a full bug report,
>> > with preprocessed source if appropriate.
>> >
>> >
>> > in forward_propagate_and_simplify
>> >
>> > use_set:
>> > (set (reg:VNx16BI 96 [ loop_mask_52 ])
>> > (unspec:VNx16BI [
>> > (reg:SI 92 [ _3 ])
>> > (reg:SI 95 [ niters.36 ])
>> > ] UNSPEC_WHILE_LO))
>> >
>> > reg:
>> > (reg:SI 92 [ _3 ])
>> >
>> > *loc:
>> > (unspec:VNx16BI [
>> > (reg:SI 92 [ _3 ])
>> > (reg:SI 95 [ niters.36 ])
>> > ] UNSPEC_WHILE_LO)
>> >
>> > src:
>> > (subreg:SI (reg:DI 136 [ ivtmp_101 ]) 0)
>> >
>> > use_insn:
>> > (insn 87 86 88 4 (parallel [
>> > (set (reg:VNx16BI 96 [ loop_mask_52 ])
>> > (unspec:VNx16BI [
>> > (reg:SI 92 [ _3 ])
>> > (reg:SI 95 [ niters.36 ])
>> > ] UNSPEC_WHILE_LO))
>> > (clobber (reg:CC 66 cc))
>> > ]) 4255 {while_ultsivnx16bi}
>> >  (expr_list:REG_UNUSED (reg:CC 66 cc)
>> > (nil)))
>> >
>> > I think we calculate the mode to be VNx16BI, which is wrong?
>> > Because of that, !paradoxical_subreg_p (mode,
>> > GET_MODE (SUBREG_REG (new_rtx))) in propagate_rtx ICEs.
>>
>> Looks like something I hit on the ACLE branch, but didn't have a
>> non-ACLE reproducer for (see 065881acf0de35ff7818c1fc92769e1c106e1028).
>>
>> Does the attached work?  The current call is wrong because "mode"
>> is the mode of "x", not the mode of "new_rtx".
>
> Yes, the attached patch works for this testcase. Are you planning to
> commit it to trunk? I will wait for this.

Needs approval first. :-)

The patch was originally bootstrapped & regtested on aarch64-linux-gnu,
but I'll repeat that for trunk and test x86_64-linux-gnu too.

Richard

>
> Thanks,
> Kugan
>>
>> Thanks,
>> Richard
>>
>>
>> 2019-06-06  Richard Sandiford  
>>
>> gcc/
>> * fwprop.c (propagate_rtx): Fix call to paradoxical_subreg_p.
>>
>> Index: gcc/fwprop.c
>> ===
>> --- gcc/fwprop.c2019-03-08 18:14:25.333011645 +
>> +++ gcc/fwprop.c2019-06-06 13:04:34.423476690 +0100
>> @@ -680,7 +680,7 @@ propagate_rtx (rtx x, machine_mode mode,
>>|| CONSTANT_P (new_rtx)
>>|| (GET_CODE (new_rtx) == SUBREG
>>   && REG_P (SUBREG_REG (new_rtx))
>> - && !paradoxical_subreg_p (mode, GET_MODE (SUBREG_REG (new_rtx)))))
>> + && !paradoxical_subreg_p (new_rtx)))
>>  flags |= PR_CAN_APPEAR;
>>if (!varying_mem_p (new_rtx))
>>  flags |= PR_HANDLE_MEM;
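
The point made in this thread, that the query has to use the subreg's own modes rather than the mode of the surrounding expression, can be illustrated with a toy model. This is only a sketch under simplifying assumptions: modes are reduced to a fixed bit precision (which cannot represent a variable-length mode such as VNx16BI), all toy_* names are hypothetical, and the 128-bit "mode of x" in the usage note is an arbitrary stand-in.

```c
#include <stdbool.h>

/* Toy stand-in for a machine mode: just a bit precision.  */
typedef struct { unsigned precision; } toy_mode;

/* Toy stand-in for (subreg:OUTER (reg:INNER ...)).  */
typedef struct { toy_mode outer; toy_mode inner; } toy_subreg;

/* Two-argument form: the caller supplies both modes, so it can supply
   an outer mode that has nothing to do with the subreg in question.  */
bool toy_paradoxical_modes_p (toy_mode outer, toy_mode inner)
{
  return outer.precision > inner.precision;
}

/* One-argument form: both modes come from the subreg itself.  */
bool toy_paradoxical_subreg_p (toy_subreg s)
{
  return toy_paradoxical_modes_p (s.outer, s.inner);
}
```

For a toy (subreg:SI (reg:DI)) with precisions 32 and 64, toy_paradoxical_subreg_p reports "not paradoxical", while toy_paradoxical_modes_p called with an unrelated 128-bit outer mode reports the opposite. In the real compiler the mismatched call triggered an ICE rather than a wrong answer.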


Re: [ARM/FDPIC v5 00/21] FDPIC ABI for ARM

2019-06-06 Thread Christophe Lyon
Hi,

If this makes review easier, here are the areas covered by the patches:

- patches 1,3,4,7,8,9,10,12,21: target-specific
- patch 2: configure
- patches 5,6,11,13: generic parts, under #if defined(__FDPIC__)
- patches 14-20: testsuite

Christophe

On Tue, 4 Jun 2019 at 14:57, Christophe Lyon  wrote:
>
> Ping?
>
>
> On Thu, 23 May 2019 at 14:46, Christophe Lyon  wrote:
> >
> > Ping?
> >
> > Any feedback other than what I got on patch 03/21 ?
> >
> > Thanks,
> >
> > Christophe
> >
> >
> > On 15/05/2019 14:39, Christophe Lyon wrote:
> > > Hello,
> > >
> > > This patch series implements the GCC contribution of the FDPIC ABI for
> > > ARM targets.
> > >
> > > > This ABI makes it possible to run Linux on ARM MMU-less cores and
> > > > supports shared libraries to reduce the memory footprint.
> > >
> > > > Without an MMU, the relative distance between the text and data segments
> > > > differs from one process to another, hence the need for a dedicated FDPIC
> > > register holding the start address of the data segment. One of the
> > > side effects is that function pointers require two words to be
> > > represented: the address of the code, and the data segment start
> > > address. These two words are designated as "Function Descriptor",
> > > hence the "FD PIC" name.
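
The two-word function descriptor described in the paragraph above can be sketched in plain C. This is an illustration only, not the ABI's actual layout or calling sequence: the toy_* names are hypothetical, and the data segment base is passed as an explicit argument where the real ABI would use the dedicated FDPIC register.

```c
/* Hypothetical two-word descriptor: the address of the code plus the
   start address of the data segment that code must operate on.  */
typedef struct
{
  int (*entry) (const int *data_base, int arg);
  const int *data_base;
} toy_funcdesc;

/* Shared code.  It reaches its "global" through the data base it is
   handed, so the same code works for every process image.  */
int toy_add_global (const int *data_base, int arg)
{
  return data_base[0] + arg;
}

/* Calling through a descriptor: install the data base, then jump to
   the code address.  */
int toy_call (toy_funcdesc fd, int arg)
{
  return fd.entry (fd.data_base, arg);
}
```

Two descriptors that share toy_add_global but point at different data arrays model two processes running the same text against their own data segments.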
> > >
> > > On ARM, the FDPIC register is r9 [1], and the target name is
> > > arm-uclinuxfdpiceabi. Note that arm-uclinux exists, but uses another
> > > ABI and the BFLAT file format; it does not support code sharing.
> > > The -mfdpic option is enabled by default, and -mno-fdpic should be
> > > used to build the Linux kernel.
> > >
> > > This work was developed some time ago by STMicroelectronics, and was
> > > presented during Linaro Connect SFO15 (September 2015). You can watch
> > > the discussion and read the slides [2].
> > > This presentation was related to the toolchain published on github [3],
> > > which is based on binutils-2.22, gcc-4.7, uclibc-0.9.33.2, gdb-7.5.1
> > > and qemu-2.3.0, and for which pre-built binaries are available [3].
> > >
> > > > The ABI itself is described in detail in [1].
> > >
> > > Our Linux kernel patches have been updated and committed by Nicolas
> > > Pitre (Linaro) in July 2017. They are required so that the loader is
> > > able to handle this new file type. Indeed, the ELF files are tagged
> > > with ELFOSABI_ARM_FDPIC. This new tag has been allocated by ARM, as
> > > well as the new relocations involved.
> > >
> > > > The binutils, QEMU and uclibc-ng patch series were merged a few
> > > > months ago. [4][5][6]
> > >
> > > This series provides support for architectures that support ARM and/or
> > > Thumb-2 and has been tested on arm-linux-gnueabi without regression,
> > > as well as arm-uclinuxfdpiceabi, using QEMU. arm-uclinuxfdpiceabi has
> > > a few more failures than arm-linux-gnueabi, but is quite functional.
> > >
> > > I have also booted an STM32 board (stm32f469) which uses a cortex-m4
> > > > with linux-4.20.17 and successfully ran several tools.
> > >
> > > Are the GCC patches OK for inclusion in master?
> > >
> > > Changes between v4 and v5:
> > > - rebased on top of recent gcc-10 master (April 26th, 2019)
> > > - fixed handling of stack-protector combined patterns in FDPIC mode
> > >
> > > Changes between v3 and v4:
> > >
> > > - improved documentation (patch 1)
> > > - emit an error message (sorry) if the target architecture does not
> > >support arm nor thumb-2 modes (patch 4)
> > > - handle Richard's comments on patch 4 (comments, unspec)
> > > - added .align directive (patch 5)
> > > - fixed use of kernel helpers (__kernel_cmpxchg, __kernel_dmb) (patch 6)
> > > - code factorization in patch 7
> > > - typos/internal function name in patch 8
> > > - improved patch 12
> > > - dropped patch 16
> > > - patch 20 introduces arm_arch*_thumb_ok effective targets to help
> > >skip some tests
> > > - I tested patch 2 on xtensa-buildroot-uclinux-uclibc, it adds many
> > >new tests, but a few regressions
> > >(https://gcc.gnu.org/ml/gcc-patches/2018-11/msg00713.html)
> > > > - I compiled and executed several LTP tests to exercise pthreads and signals
> > > - I wrote and executed a simple testcase to change the interaction
> > >with __kernel_cmpxchg (ie. call the kernel helper rather than use an
> > >implementation in libgcc as requested by Richard)
> > >
> > > Changes between v2 and v3:
> > > - added doc entry for -mfdpic new option
> > > - took Kyrill's comments into account (use "Armv7" instead of "7",
> > >code factorization, use preprocessor instead of hard-coding "r9",
> > >remove leftover code for thumb1 support, fixed comments)
> > > - rebase over recent trunk
> > > - patches with changes: 1, 2 (commit message), 3 (rebase), 4, 6, 7, 9,
> > >14 (rebase), 19 (rebase)
> > >
> > > Changes between v1 and v2:
> > > - fix GNU coding style
> > > - exit with an error for pre-Armv7
> > > - use ACLE __ARM_ARCH and remove dead code for pre-Armv4
> > > - remove unsupported attempts of pre-Armv7/thu

Re: [PATCH][MSP430][4/4] Implement 64-bit shifts in assembly code

2019-06-06 Thread Jozef Lawrynowicz
On Wed, 5 Jun 2019 16:35:14 -0600
Jeff Law  wrote:

> On 6/4/19 7:17 AM, Jozef Lawrynowicz wrote:
> > libgcc/ChangeLog
> > 
> > 2019-06-04  Jozef Lawrynowicz  
> > 
> > * config/msp430/slli.S (__mspabi_s): New library function for
> > performing a logical left shift of a 64-bit value.
> > (__mspabi_srall): New library function for
> > performing an arithmetic right shift of a 64-bit value.
> > (__mspabi_srlll): New library function for
> > performing a logical right shift of a 64-bit value.
> > 
> Going to assume your assembly routines are correct :-)
> 
> OK
> jeff

I assume I implemented them correctly based on the clean regtest of
GCC/G++ testsuites. But in case there might be a gap in the coverage
somewhere, how about the attached new torture test to explicitly check 64-bit
shifts work as expected?
Passes for x86_64-linux-gnu and msp430-elf.

Thanks for the reviews.
Jozef
diff --git a/gcc/testsuite/gcc.c-torture/execute/shiftdi-2.c b/gcc/testsuite/gcc.c-torture/execute/shiftdi-2.c
new file mode 100644
index 000..63d1fe90830
--- /dev/null
+++ b/gcc/testsuite/gcc.c-torture/execute/shiftdi-2.c
@@ -0,0 +1,23 @@
+__INT64_TYPE__ a = 568513516876543756;
+__INT64_TYPE__ b = -754324895235774564;
+__UINT64_TYPE__ c = 156789543257562457;
+
+__INT64_TYPE__ expected_a[64] = {568513516876543756, 1137027033753087512, 2274054067506175024, 4548108135012350048, 9096216270024700096, -254311533660151424, -508623067320302848, -1017246134640605696, -2034492269281211392, -4068984538562422784, -8137969077124845568, 2170805919459860480, 4341611838919720960, 8683223677839441920, -1080296718030667776, -2160593436061335552, -4321186872122671104, -8642373744245342208, 1161996585218867200, 2323993170437734400, 4647986340875468800, -9150771391958614016, 145201289792323584, 290402579584647168, 580805159169294336, 1161610318338588672, 2323220636677177344, 4646441273354354688, -9153861527000842240, 139021019707867136, 278042039415734272, 556084078831468544, 1112168157662937088, 2224336315325874176, 4448672630651748352, 8897345261303496704, -652053551102558208, -1304107102205116416, -2608214204410232832, -5216428408820465664, 8013887256068620288, -2418969561572311040, -4837939123144622080, 8770865827420307456, -905012418868936704, -1810024837737873408, -3620049675475746816, -7240099350951493632, 3966545371806564352, 7933090743613128704, -2580562586483294208, -5161125172966588416, 8124493727776374784, -2197756618156802048, -4395513236313604096, -8791026472627208192, 864691128455135232, 1729382256910270464, 3458764513820540928, 6917529027641081856, -4611686018427387904, -9223372036854775808ULL, 0, 0};
+__INT64_TYPE__ expected_b[64] = {-754324895235774564, -377162447617887282, -188581223808943641, -94290611904471821, -47145305952235911, -23572652976117956, -11786326488058978, -5893163244029489, -2946581622014745, -1473290811007373, -736645405503687, -368322702751844, -184161351375922, -92080675687961, -46040337843981, -23020168921991, -11510084460996, -5755042230498, -2877521115249, -1438760557625, -719380278813, -359690139407, -179845069704, -89922534852, -44961267426, -22480633713, -11240316857, -5620158429, -2810079215, -1405039608, -702519804, -351259902, -175629951, -87814976, -43907488, -21953744, -10976872, -5488436, -2744218, -1372109, -686055, -343028, -171514, -85757, -42879, -21440, -10720, -5360, -2680, -1340, -670, -335, -168, -84, -42, -21, -11, -6, -3, -2, -1, -1, -1, -1};
+__UINT64_TYPE__ expected_c[64] = {156789543257562457, 78394771628781228, 39197385814390614, 19598692907195307, 9799346453597653, 4899673226798826, 2449836613399413, 1224918306699706, 612459153349853, 306229576674926, 153114788337463, 76557394168731, 38278697084365, 19139348542182, 9569674271091, 4784837135545, 2392418567772, 1196209283886, 598104641943, 299052320971, 149526160485, 74763080242, 37381540121, 18690770060, 9345385030, 4672692515, 2336346257, 1168173128, 584086564, 292043282, 146021641, 73010820, 36505410, 18252705, 9126352, 4563176, 2281588, 1140794, 570397, 285198, 142599, 71299, 35649, 17824, 8912, 4456, 2228, 1114, 557, 278, 139, 69, 34, 17, 8, 4, 2, 1, 0, 0, 0, 0, 0, 0};
+
+int
+main (void)
+{
+  int i;
+  for (i = 0; i < 64; i++)
+  {
+if ((a << i) != expected_a[i])
+  __builtin_abort ();
+else if ((b >> i) != expected_b[i])
+  __builtin_abort ();
+else if ((c >> i) != expected_c[i])
+  __builtin_abort ();
+  }
+  return 0;
+}
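
For readers who want to see what such a library routine has to do on a 16-bit target, here is a minimal sketch of 64-bit shifts built from four 16-bit words, one bit per iteration. It is not the slli.S implementation from the patch: the word layout, the bit-at-a-time loop and all toy_* names are assumptions made for illustration.

```c
#include <stdint.h>

/* A 64-bit value held as four 16-bit words, least significant first.
   This layout is an assumption made for illustration only.  */
typedef struct { uint16_t w[4]; } toy_u64;

static toy_u64 toy_split (uint64_t v)
{
  toy_u64 r;
  for (int i = 0; i < 4; i++)
    r.w[i] = (uint16_t) (v >> (16 * i));
  return r;
}

static uint64_t toy_join (toy_u64 v)
{
  uint64_t r = 0;
  for (int i = 0; i < 4; i++)
    r |= (uint64_t) v.w[i] << (16 * i);
  return r;
}

/* Logical left shift: carry the top bit of each word into the next
   more significant word, once per bit of COUNT.  */
uint64_t toy_shl64 (uint64_t value, unsigned count)
{
  toy_u64 v = toy_split (value);
  while (count-- > 0)
    {
      unsigned carry = 0;
      for (int i = 0; i < 4; i++)
        {
          unsigned next = v.w[i] >> 15;
          v.w[i] = (uint16_t) ((v.w[i] << 1) | carry);
          carry = next;
        }
    }
  return toy_join (v);
}

/* Right shift; FILL (0 or 1) is the bit shifted into the top.  */
static uint64_t toy_shr64 (uint64_t value, unsigned count, unsigned fill)
{
  toy_u64 v = toy_split (value);
  while (count-- > 0)
    {
      unsigned carry = fill;
      for (int i = 3; i >= 0; i--)
        {
          unsigned next = v.w[i] & 1u;
          v.w[i] = (uint16_t) ((v.w[i] >> 1) | (carry << 15));
          carry = next;
        }
    }
  return toy_join (v);
}

/* Logical right shift: zeros come in at the top.  */
uint64_t toy_lshr64 (uint64_t value, unsigned count)
{
  return toy_shr64 (value, count, 0);
}

/* Arithmetic right shift: copies of the sign bit come in at the top.  */
int64_t toy_ashr64 (int64_t value, unsigned count)
{
  return (int64_t) toy_shr64 ((uint64_t) value, count, value < 0);
}
```

The values a, b and c from the test above can be fed through these helpers and compared against the expected_a, expected_b and expected_c tables as an independent cross-check.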


Re: [PATCH 2/4] Implement N disk counters for single value and indirect call counters.

2019-06-06 Thread Jan Hubicka
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * gcov-io.h (GCOV_DISK_SINGLE_VALUES): New.
>   (GCOV_SINGLE_VALUE_COUNTERS): Likewise.
>   * ipa-profile.c (ipa_profile_generate_summary):
>   Use get_most_common_single_value.
>   * tree-profile.c (gimple_init_gcov_profiler):
>   Instrument with __gcov_one_value_profiler_v2
>   and __gcov_indirect_call_profiler_v4.
>   * value-prof.c (dump_histogram_value):
>   Print all values for HIST_TYPE_SINGLE_VALUE.
>   (stream_in_histogram_value): Set number of
>   counters for HIST_TYPE_SINGLE_VALUE.
>   (get_most_common_single_value): New.
>   (gimple_divmod_fixed_value_transform):
>   Use get_most_common_single_value.
>   (gimple_ic_transform): Likewise.
>   (gimple_stringops_transform): Likewise.
>   (gimple_find_values_to_profile): Set number
>   of counters for HIST_TYPE_SINGLE_VALUE.
>   * value-prof.h (get_most_common_single_value):
>   New.
> 
> libgcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * Makefile.in: Add __gcov_one_value_profiler_v2 and
>   __gcov_indirect_call_profiler_v4.
>   * libgcov-merge.c (__gcov_merge_single): Change
>   function signature.
>   (merge_single_value_set): New.
>   * libgcov-profiler.c (__gcov_one_value_profiler_body):
>   Do not update counters[2].
>   (__gcov_one_value_profiler): Remove.
>   (__gcov_one_value_profiler_atomic): Rename to ...
>   (__gcov_one_value_profiler_v2): ... this.
>   (__gcov_indirect_call_profiler_v3): Rename to ...
>   (__gcov_indirect_call_profiler_v4): ... this.
>   * libgcov.h (__gcov_one_value_profiler): Remove.
>   (__gcov_one_value_profiler_atomic): Remove.
>   (__gcov_indirect_call_profiler_v3): Remove.
>   (__gcov_one_value_profiler_v2): New.
>   (__gcov_indirect_call_profiler_v4): New.
> ---
>  gcc/gcov-io.h |   7 +++
>  gcc/ipa-profile.c |  13 +++--
>  gcc/tree-profile.c|   9 ++-
>  gcc/value-prof.c  | 120 --
>  gcc/value-prof.h  |   2 +
>  libgcc/Makefile.in|   5 +-
>  libgcc/libgcov-merge.c|  77 
>  libgcc/libgcov-profiler.c |  43 +++---
>  libgcc/libgcov.h  |   5 +-
>  9 files changed, 147 insertions(+), 134 deletions(-)
> 

> diff --git a/gcc/tree-profile.c b/gcc/tree-profile.c
> index f2cf4047579..008a1271979 100644
> --- a/gcc/tree-profile.c
> +++ b/gcc/tree-profile.c
> @@ -165,10 +165,9 @@ gimple_init_gcov_profiler (void)
> = build_function_type_list (void_type_node,
> gcov_type_ptr, gcov_type_node,
> NULL_TREE);
> -  fn_name = concat ("__gcov_one_value_profiler", fn_suffix, NULL);
> -  tree_one_value_profiler_fn = build_fn_decl (fn_name,
> -   one_value_profiler_fn_type);
> -  free (CONST_CAST (char *, fn_name));
> +  tree_one_value_profiler_fn
> + = build_fn_decl ("__gcov_one_value_profiler_v2",
> +  one_value_profiler_fn_type);

Why do you no longer need the optional atomic suffix here?
> diff --git a/gcc/value-prof.c b/gcc/value-prof.c
> index 1e14e532070..e893ca084c9 100644
> --- a/gcc/value-prof.c
> +++ b/gcc/value-prof.c
> @@ -259,15 +259,19 @@ dump_histogram_value (FILE *dump_file, histogram_value hist)
>break;
>  
>  case HIST_TYPE_SINGLE_VALUE:
> -  fprintf (dump_file, "Single value ");
> +case HIST_TYPE_INDIR_CALL:
> +  fprintf (dump_file, (hist->type == HIST_TYPE_SINGLE_VALUE
> +? "Single values " : "Indirect call "));
>if (hist->hvalue.counters)
>   {
> -fprintf (dump_file, "value:%" PRId64
> - " match:%" PRId64
> - " wrong:%" PRId64,
> - (int64_t) hist->hvalue.counters[0],
> - (int64_t) hist->hvalue.counters[1],
> - (int64_t) hist->hvalue.counters[2]);
> +   for (unsigned i = 0; i < GCOV_DISK_SINGLE_VALUES; i++)
> + {
> +   fprintf (dump_file, "[%" PRId64 ":%" PRId64 "]",
> +(int64_t) hist->hvalue.counters[2 * i],
> +(int64_t) hist->hvalue.counters[2 * i + 1]);
> +   if (i != GCOV_DISK_SINGLE_VALUES - 1)
> + fprintf (dump_file, ", ");
> + }
Unless there are some compelling reasons, I would still include in the dump what
the meaning of the values is - not everyone is fluent in that ;)
> @@ -758,23 +779,19 @@ gimple_divmod_fixed_value_transform (gimple_stmt_iterator *si)
>if (!histogram)
>  return false;
>  
> +  if (!get_most_common_single_value (histogram, &val, &count))
> +return false;
> +
>value = histogram->hvalue.value;
> -  val = histogram->hvalue.counters[0];
> -  count = histogram->hvalue.counters[1];
> -  all = histogram->hvalue.counters[2];
> +  all = gim

Re: [PATCH 1/4] Remove indirect call top N counter type.

2019-06-06 Thread Jan Hubicka
Hi,
so the only point of removing this is the fact that builds would not be
reproducible with indir-call-topn-profile?
I still kind of think it may be useful to track multiple most common
values, so I would be in favour of keeping it and just updating the documentation
of indir-call-topn-profile to say that it is currently incomplete, does not
lead to reproducible builds and does not handle speculation...

Honza
> 
> gcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * doc/invoke.texi: Remove param.
>   * gcov-counter.def (GCOV_COUNTER_ICALL_TOPNV):
>   Remove.
>   * gcov-io.h (GCOV_ICALL_TOPN_VAL): Likewise.
>   (GCOV_ICALL_TOPN_NCOUNTS): Likewise.
>   * params.def (PARAM_INDIR_CALL_TOPN_PROFILE): Likewise.
>   * profile.c (instrument_values): Remove
>   HIST_TYPE_INDIR_CALL_TOPN.
>   * tree-profile.c (init_ic_make_global_vars):
>   Always build __gcov_indirect_call only.
>   (gimple_init_gcov_profiler): Remove usage
>   of PARAM_INDIR_CALL_TOPN_PROFILE.
>   (gimple_gen_ic_profiler): Likewise.
>   * value-prof.c (dump_histogram_value): Likewise.
>   (stream_in_histogram_value): Likewise.
>   (gimple_indirect_call_to_profile): Likewise.
>   (gimple_find_values_to_profile): Likewise.
>   * value-prof.h (enum hist_type): Likewise.
> 
> libgcc/ChangeLog:
> 
> 2019-06-04  Martin Liska  
> 
>   * Makefile.in: Remove usage of
>   _gcov_merge_icall_topn.
>   * libgcov-driver.c (gcov_sort_n_vals): Remove.
>   (gcov_sort_icall_topn_counter): Likewise.
>   (gcov_sort_topn_counter_arrays): Likewise.
>   (dump_one_gcov): Remove call to gcov_sort_topn_counter_arrays.
>   * libgcov-merge.c (__gcov_merge_icall_topn): Remove.
>   * libgcov-profiler.c (__gcov_topn_value_profiler_body):
>   Likewise.
>   (GCOV_ICALL_COUNTER_CLEAR_THRESHOLD): Remove.
>   (struct indirect_call_tuple): Remove.
>   (__gcov_indirect_call_topn_profiler): Remove.
>   * libgcov-util.c (__gcov_icall_topn_counter_op): Remove.
>   * libgcov.h (gcov_sort_n_vals): Remove.
>   (L_gcov_merge_icall_topn): Likewise.
>   (__gcov_merge_icall_topn): Likewise.
>   (__gcov_indirect_call_topn_profiler): Likewise.
> ---
>  gcc/doc/invoke.texi   |   3 -
>  gcc/gcov-counter.def  |   3 -
>  gcc/gcov-io.h |   6 --
>  gcc/params.def|   8 ---
>  gcc/profile.c |   1 -
>  gcc/tree-profile.c|  14 +---
>  gcc/value-prof.c  |  32 +
>  gcc/value-prof.h  |   2 -
>  libgcc/Makefile.in|   5 +-
>  libgcc/libgcov-driver.c   |  80 ---
>  libgcc/libgcov-merge.c|  62 --
>  libgcc/libgcov-profiler.c | 133 --
>  libgcc/libgcov-util.c |  19 --
>  libgcc/libgcov.h  |   7 --
>  14 files changed, 5 insertions(+), 370 deletions(-)
> 

> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 91c9bb89651..50e50e39413 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -12140,9 +12140,6 @@ will not try to thread through its block.
>  Maximum number of nested calls to search for control dependencies
>  during uninitialized variable analysis.
>  
> -@item indir-call-topn-profile
> -Track top N target addresses in indirect-call profile.
> -
>  @item max-once-peeled-insns
>  The maximum number of insns of a peeled loop that rolls only once.
>  
> diff --git a/gcc/gcov-counter.def b/gcc/gcov-counter.def
> index 3a0e620987a..b0596c8dc6b 100644
> --- a/gcc/gcov-counter.def
> +++ b/gcc/gcov-counter.def
> @@ -49,6 +49,3 @@ DEF_GCOV_COUNTER(GCOV_COUNTER_IOR, "ior", _ior)
>  
>  /* Time profile collecting first run of a function */
>  DEF_GCOV_COUNTER(GCOV_TIME_PROFILER, "time_profiler", _time_profile)
> -
> -/* Top N value tracking for indirect calls.  */
> -DEF_GCOV_COUNTER(GCOV_COUNTER_ICALL_TOPNV, "indirect_call_topn", _icall_topn)
> diff --git a/gcc/gcov-io.h b/gcc/gcov-io.h
> index 9edb2923982..69c9a73dba8 100644
> --- a/gcc/gcov-io.h
> +++ b/gcc/gcov-io.h
> @@ -266,12 +266,6 @@ GCOV_COUNTERS
>  #define GCOV_N_VALUE_COUNTERS \
>(GCOV_LAST_VALUE_COUNTER - GCOV_FIRST_VALUE_COUNTER + 1)
>  
> -/* The number of hottest callees to be tracked.  */
> -#define GCOV_ICALL_TOPN_VAL  2
> -
> -/* The number of counter entries per icall callsite.  */
> -#define GCOV_ICALL_TOPN_NCOUNTS (1 + GCOV_ICALL_TOPN_VAL * 4)
> -
>  /* Convert a counter index to a tag.  */
>  #define GCOV_TAG_FOR_COUNTER(COUNT)  \
>   (GCOV_TAG_COUNTER_BASE + ((gcov_unsigned_t)(COUNT) << 17))
> diff --git a/gcc/params.def b/gcc/params.def
> index 6b7f7eb5bae..b4a4e4a4190 100644
> --- a/gcc/params.def
> +++ b/gcc/params.def
> @@ -992,14 +992,6 @@ DEFPARAM (PARAM_PROFILE_FUNC_INTERNAL_ID,
> "Use internal function id in profile lookup.",
> 0, 0, 1)
>  
> -/* When the parameter is 1, track the most frequent N target
> -   addresses in indirect-call profile. This disab

Re: [PATCH 3/4] Dump histograms only if present.

2019-06-06 Thread Jan Hubicka
> On 6/6/19 2:16 PM, Jan Hubicka wrote:
> > What is the point of having histogram value when there are no counters?
> 
> Because we first create histograms in branch_prob -> gimple_find_values_to_profile and
> later we read profile from file. At that point we know which are empty, but we don't
> remove them.

OK, so it is about the dump not crashing when you do it in the meantime.
That makes sense :)
So the patch is OK.

Honza


Re: Make aliasing_component_refs_p to work harder when same_type_for_tbaa returns -1

2019-06-06 Thread Richard Biener
On Thu, 30 May 2019, Jan Hubicka wrote:

> Hi,
> this patch makes aliasing_component_refs_p not give up at the first
> -1 returned by types_same_for_tbaa_p and instead continue searching for pairs
> of types we know to be the same.  This affects disambiguations as follows:
>
> With patch:
>   refs_may_alias_p: 3013678 disambiguations, 3314059 queries
>   ref_maybe_used_by_call_p: 7112 disambiguations, 3039278 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 636 disambiguations, 15844 queries
>   TBAA oracle: 1417999 disambiguations 2915696 queries
>552182 are in alias set 0
>569795 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>259437 are dependent in the DAG
>116283 are aritificially in conflict with void *
> 
> Without:
> 
> Alias oracle query stats:
>   refs_may_alias_p: 3013194 disambiguations, 3313539 queries
>   ref_maybe_used_by_call_p: 7112 disambiguations, 3038794 queries
>   call_may_clobber_ref_p: 817 disambiguations, 817 queries
>   aliasing_component_ref_p: 152 disambiguations, 14285 queries
>   TBAA oracle: 1417999 disambiguations 2914656 queries
>552182 are in alias set 0
>569275 queries asked about the same object
>0 queries asked about the same alias set
>0 access volatile
>258917 are dependent in the DAG
>116283 are aritificially in conflict with void *
> 
> Basically all coming from disambiguating
>   MEM[(const Element_t[3] &)_116][1];
>   usedGuards.upper_m[2];
> There are a number of similar matches in the testsuite.
> More disambiguations would be possible if we did not allow partial
> overlaps of arrays.
> 
> I had to however plug a problem with alias-2.c testcase (the one about
> overlapping arrays):
> 
> /* We do not want to treat int[3] as an object that cannot overlap
>itself but treat it as arbitrary sub-array of a larger array object.  */
> int ar1(int (*p)[3], int (*q)[3])
> {
>   (*p)[0] = 1;
>   (*q)[1] = 2;
>   return (*p)[0];
> }
> int main()
> {
>   int a[4];
>   if (ar1 ((int (*)[3])&a[1], (int (*)[3])&a[0]) != 2)
> __builtin_abort ();
>   return 0;
> }
> 
> Previously indirect_refs_may_alias_p bypassed the offset+range test because
> it explicitly tests for array types:
> 
>   /* But avoid treating arrays as "objects", instead assume they
>  can overlap by an exact multiple of their element size.  */
>   && TREE_CODE (TREE_TYPE (ptrtype1)) != ARRAY_TYPE)
> return ranges_maybe_overlap_p (offset1, max_size1, offset2, max_size2);
> 
> Later the code continues to aliasing_component_refs_p which used to give up
> comparing int[3] and int with -1 because of:
> 
>   /* ??? In Ada, an lvalue of an unconstrained type can be used to access an
>  object of one of its constrained subtypes, e.g. when a function with an
>  unconstrained parameter passed by reference is called on an object and
>  inlined.  But, even in the case of a fixed size, type and subtypes are
>  not equivalent enough as to share the same TYPE_CANONICAL, since this
>  would mean that conversions between them are useless, whereas they are
>  not (e.g. type and subtypes can have different modes).  So, in the end,
>  they are only guaranteed to have the same alias set.  */
>   if (get_alias_set (type1) == get_alias_set (type2))
> return -1;
> 
> This is more of an accident, and there are cases where we do not trip across
> this -1 and we disambiguate array accesses, which seems unsafe to me.
> 
> With my change aliasing_component_refs_p finds the match of the
> array types (type_same_for_tbaa_p returns 1 with non-LTO because they
> have the same canonical type) and disambiguates based on disjoint access ranges.
> 
> I have thus gone ahead and updated all uses of type_same_for_tbaa_p to special
> case arrays and refer to this testcase (which seems odd and is the only one in
> the testsuite). We can still do useful disambiguation if the array is not a toplevel
> reference or we know that the memory object is not bigger. This is tested by a
> testcase I added and is quite frequent in real-world code.
> 
> I also added a check to give up on VLAs since I cannot convince myself that
> this is safe: I think early inlining VLAs and streaming them may lead to
> the same VLA type having two different sizes at a time, enabling it to
> partially overlap.

OK - sorry for the delay.  The array stuff gets a bit ugly so
we eventually want to do something about that ...

Thanks,
Richard.

>   * gcc.dg/lto/alias-access-path-2.0.c: New testcase.
> 
>   * tree-ssa-alias.c (aliasing_component_refs_p): Do not give up
>   immediately after same_types_for_tbaa_p returns -1 and continue
>   looking for possible exact match; if matching types are arrays
>   watch for partial overlaps.
>   (indirect_ref_may_a

[C++ Patch] Further grokdeclarator locations work

2019-06-06 Thread Paolo Carlini

Hi,

only minor functional changes here - more precise locations for two
error messages - but I think it's a step in the right direction: it
moves the declaration of id_loc way up, near typespec_loc, so that
id_loc is used immediately. Its value is then updated after the loop over
declarator in the middle of the function and used again in the final
part of the function. That also "frees" the simple name loc for other
local uses and allows simplifying the checks, changed rather recently, over
(name && declarator) and (unqualified_id && declarator). Tested
x86_64-linux.


Thanks, Paolo.

//

/cp
2019-06-06  Paolo Carlini  

* decl.c (grokdeclarator): Move further up the declaration of
id_loc, use it immediately, update its value after the loop
over declarator, use it again in the final part of function;
use id_loc in error messages about multiple data types and
conflicting specifiers.

/testsuite
2019-06-06  Paolo Carlini  

* g++.dg/diagnostic/conflicting-specifiers-1.C: New.
* g++.dg/diagnostic/two-or-more-data-types-1.C: Likewise.
Index: cp/decl.c
===
--- cp/decl.c   (revision 271974)
+++ cp/decl.c   (working copy)
@@ -10456,6 +10456,8 @@ grokdeclarator (const cp_declarator *declarator,
   if (typespec_loc == UNKNOWN_LOCATION)
 typespec_loc = input_location;
 
+  location_t id_loc = declarator ? declarator->id_loc : input_location;
+
   /* Look inside a declarator for the name being declared
  and get it as a string, for an error message.  */
   for (id_declarator = declarator;
@@ -10620,8 +10622,7 @@ grokdeclarator (const cp_declarator *declarator,
  D1 ( parameter-declaration-clause) ...  */
   if (funcdef_flag && innermost_code != cdk_function)
 {
-  error_at (declarator->id_loc,
-   "function definition does not declare parameters");
+  error_at (id_loc, "function definition does not declare parameters");
   return error_mark_node;
 }
 
@@ -10629,8 +10630,7 @@ grokdeclarator (const cp_declarator *declarator,
   && innermost_code != cdk_function
   && ! (ctype && !declspecs->any_specifiers_p))
 {
-  error_at (declarator->id_loc,
-   "declaration of %qD as non-function", dname);
+  error_at (id_loc, "declaration of %qD as non-function", dname);
   return error_mark_node;
 }
 
@@ -10639,8 +10639,7 @@ grokdeclarator (const cp_declarator *declarator,
   if (UDLIT_OPER_P (dname)
  && innermost_code != cdk_function)
{
- error_at (declarator->id_loc,
-   "declaration of %qD as non-function", dname);
+ error_at (id_loc, "declaration of %qD as non-function", dname);
  return error_mark_node;
}
 
@@ -10648,14 +10647,12 @@ grokdeclarator (const cp_declarator *declarator,
{
  if (typedef_p)
{
- error_at (declarator->id_loc,
-   "declaration of %qD as %<typedef%>", dname);
+ error_at (id_loc, "declaration of %qD as %<typedef%>", dname);
  return error_mark_node;
}
  else if (decl_context == PARM || decl_context == CATCHPARM)
{
- error_at (declarator->id_loc,
-   "declaration of %qD as parameter", dname);
+ error_at (id_loc, "declaration of %qD as parameter", dname);
  return error_mark_node;
}
}
@@ -10705,13 +10702,13 @@ grokdeclarator (const cp_declarator *declarator,
  issue an error message.  */
   if (declspecs->multiple_types_p)
 {
-  error ("two or more data types in declaration of %qs", name);
+  error_at (id_loc, "two or more data types in declaration of %qs", name);
   return error_mark_node;
 }
 
   if (declspecs->conflicting_specifiers_p)
 {
-  error ("conflicting specifiers in declaration of %qs", name);
+  error_at (id_loc, "conflicting specifiers in declaration of %qs", name);
   return error_mark_node;
 }
 
@@ -11861,6 +11858,8 @@ grokdeclarator (const cp_declarator *declarator,
}
 }
 
+  id_loc = declarator ? declarator->id_loc : input_location;
+
   /* A `constexpr' specifier used in an object declaration declares
  the object as `const'.  */
   if (constexpr_p && innermost_code != cdk_function)
@@ -11884,8 +11883,6 @@ grokdeclarator (const cp_declarator *declarator,
   unqualified_id = dname;
 }
 
-  location_t loc = declarator ? declarator->id_loc : input_location;
-
   /* If TYPE is a FUNCTION_TYPE, but the function name was explicitly
  qualified with a class-name, turn it into a METHOD_TYPE, unless
  we know that the function is static.  We take advantage of this
@@ -11912,7 +11909,7 @@ grokdeclarator (const cp_declarator *declarator,
  friendp = 0;
}
  else
-   permerror (loc, "extra qualification %<%T::%> on memb

Re: [PR 89330] Avoid adding dead speculative edges to inlinig heap

2019-06-06 Thread Jan Hubicka
My understanding is that the problematic situation is when we create a
speculation, insert it into the priority queue and then decide it is useless.
This is done by speculation_useful_p, which checks more than just whether the
function is inlinable (it accepts targets declared as PURE/CONST but also
punts on cold calls).
It also checks inline limits. So won't it break when these things are
not in sync?


Honza

On 2019-06-06 10:46, Martin Liška wrote:

Hi.

This is a rebased version of the patch that Martin J. wrote.

The patch bootstraps on x86_64-linux-gnu and survives regression tests.

Thanks,
Martin




Re: [PATCH 1/4] Remove indirect call top N counter type.

2019-06-06 Thread Martin Liška
On 6/6/19 2:52 PM, Jan Hubicka wrote:
> Hi,
> so the only point of removing this is the fact that builds would be
> not reproducible with indir-call-topn-profile?

That's one reason. But the main reason is that the code is dead and not used.
If you take a look at gcc/ipa-profile.c:189, the code does not support
HIST_TYPE_INDIR_CALL_TOPN.
So it has been broken for a few years and nobody is using it. The last reason is
the implementation of __gcov_merge_icall_topn: it's over-complicated.

Martin

> I still kind of thing it may be useful to track multiple most common
> values, so I would be in favour of keeping it just updating the documentation
> of indir-call-topn-profile that it is currently incomplete and does not
> lead to reproducible builds and does not handle speculation...
> 
> Honza



Re: Make aliasing_component_refs_p to work harder when same_type_for_tbaa returns -1

2019-06-06 Thread Jan Hubicka
> > This is more of an accident, and there are cases where we do not trip across
> > this -1 and we disambiguate array accesses in a way that seems unsafe to me.
> > 
> > With my change aliasing_component_refs_p finds the match of the
> > array types (type_same_for_tbaa_p returns 1 with non-LTO because they
> > have the same canonical type) and disambiguates based on disjoint access ranges.
> > 
> > I have thus gone ahead and updated all uses of type_same_for_tbaa_p to special-case
> > arrays and refer to this testcase (which seems odd and is the only one in the
> > testsuite): we can still do useful disambiguation if the array is not a toplevel
> > reference or we know that the memory object is not bigger. This is tested by a
> > testcase I added and is quite frequent in real-world code.
> > 
> > I also added a check to give up on VLAs since I cannot convince myself that
> > this is safe: I think early inlining VLAs and streaming them may lead to
> > the same VLA type having two different sizes at a time, enabling it to partially
> > overlap.
> 
> OK - sorry for the delay.  The array stuff gets a bit ugly so
> we eventually want to do sth about that ...

Thanks, no problem with the delay - I appreciate that we could discuss these
things carefully as they are by no means obvious :)

I agree that the way support for overlapping arrays is currently done is ugly
and seems incomplete.  I hope to clean it up now and craft more
testcases.  It seems a bit of an odd decision to support the partial overlaps as
done by alias-2.c. I wonder how important it is in practice.
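
As a rough illustration of the base+offset disambiguation discussed above: once
both accesses are known to go through the same array type, the question reduces
to whether two byte ranges are disjoint. The sketch below is not the code in
tree-ssa-alias.c; the struct and function names are made up for this example.

```cpp
#include <cstdint>

// Hypothetical model of a memory access relative to a common base:
// a byte offset and a size in bytes.
struct access_range {
  std::int64_t offset;
  std::int64_t size;
};

// Return true if the half-open byte ranges [offset, offset + size)
// of the two accesses may overlap.
constexpr bool ranges_may_overlap(const access_range &a, const access_range &b) {
  return a.offset < b.offset + b.size && b.offset < a.offset + a.size;
}
```

With 8-byte pointers, array[0] is {0, 8} and array[1] is {8, 8}, so the two
are disjoint; an access at {4, 8} would be a partial overlap with either.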

Honza
> 
> Thanks,
> Richard.
> 
> > * gcc.dg/lto/alias-access-path-2.0.c: New testcase.
> > 
> > * tree-ssa-alias.c (aliasing_component_refs_p): Do not give up
> > immediately after same_types_for_tbaa_p returns -1 and continue
> > looking for possible exact match; if matching types are arrays
> > watch for partial overlaps.
> > (indirect_ref_may_alias_decl_p): Watch for partial array overlaps.
> > (indirect_refs_may_alias_p): Do type based disambiguation first;
> > update comment.
> > Index: testsuite/g++.dg/lto/pr88130_0.C
> > ===
> > --- testsuite/g++.dg/lto/pr88130_0.C(nonexistent)
> > +++ testsuite/g++.dg/lto/pr88130_0.C(working copy)
> > @@ -0,0 +1,28 @@
> > +// { dg-lto-do link }
> > +// // { dg-lto-options { "-O2 -flto" } }
> > +// // { dg-extra-ld-options "-r -nostdlib" }
> > +class a {
> > +public:
> > +  static const long b = 1;
> > +};
> > +struct c {
> > +  enum d { e };
> > +};
> > +class C;
> > +class f {
> > +public:
> > +  f(c::d);
> > +  template <typename g> C operator<=(g);
> > +};
> > +class C {
> > +public:
> > +  template <typename h> void operator!=(h &);
> > +};
> > +void i() {
> > +  f j(c::e);
> > +  try {
> > +j <= 0 != a::b;
> > +  } catch (...) {
> > +  }
> > +}
> > +
> > Index: testsuite/gcc.dg/lto/alias-access-path-2_0.c
> > ===
> > --- testsuite/gcc.dg/lto/alias-access-path-2_0.c(nonexistent)
> > +++ testsuite/gcc.dg/lto/alias-access-path-2_0.c(working copy)
> > @@ -0,0 +1,38 @@
> > +/* { dg-lto-do run } */
> > +/* { dg-lto-options { { -O3 -flto -fno-early-inlining } } } */
> > +
> > +/* In this test the access patch orracle (aliasing_component_refs_p)
> > +   can disambiguage array[0] from array[1] by base+offset but it needs to be
> > +   able to find the common type and not give up by not being able to compare
> > +   types earlier.  */
> > +
> > +typedef int (*fnptr) ();
> > +
> > +__attribute__ ((used))
> > +struct a
> > +{
> > +  void *array[2];
> > +} a, *aptr = &a;
> > +
> > +__attribute__ ((used))
> > +struct b
> > +{
> > + struct a a;
> > +} *bptr;
> > +
> > +static void
> > +inline_me_late (int argc)
> > +{
> > +  if (argc == -1)
> > +bptr->a.array[1] = bptr;
> > +}
> > +
> > +int
> > +main (int argc)
> > +{
> > +  aptr->array[0] = 0;
> > +  inline_me_late (argc);
> > +  if (!__builtin_constant_p (aptr->array[0] == 0))
> > +__builtin_abort ();
> > +  return 0;
> > +}
> > Index: tree-ssa-alias.c
> > ===
> > --- tree-ssa-alias.c(revision 271747)
> > +++ tree-ssa-alias.c(working copy)
> > @@ -850,6 +850,7 @@ aliasing_component_refs_p (tree ref1,
> >tree type1, type2;
> >tree *refp;
> >int same_p1 = 0, same_p2 = 0;
> > +  bool maybe_match = false;
> >  
> >/* Choose bases and base types to search for.  */
> >base1 = ref1;
> > @@ -880,8 +881,14 @@ aliasing_component_refs_p (tree ref1,
> >   if (cmp == 0)
> > {
> >   same_p2 = same_type_for_tbaa (TREE_TYPE (*refp), type1);
> > - if (same_p2 != 0)
> > + if (same_p2 == 1)
> > break;
> > + /* In case we can't decide whether types are same try to
> > +continue looking for the exact match.
> > +Remember however that we possibly saw a match
> > +  

Re: Make aliasing_component_refs_p to work harder when same_type_for_tbaa returns -1

2019-06-06 Thread Richard Biener
On Thu, 6 Jun 2019, Jan Hubicka wrote:

> > > This is more of an accident, and there are cases where we do not trip across
> > > this -1 and we disambiguate array accesses in a way that seems unsafe to me.
> > > 
> > > With my change aliasing_component_refs_p finds the match of the
> > > array types (type_same_for_tbaa_p returns 1 with non-LTO because they
> > > have the same canonical type) and disambiguates based on disjoint access ranges.
> > > 
> > > I have thus gone ahead and updated all uses of type_same_for_tbaa_p to special-case
> > > arrays and refer to this testcase (which seems odd and is the only one in the
> > > testsuite): we can still do useful disambiguation if the array is not a toplevel
> > > reference or we know that the memory object is not bigger. This is tested by a
> > > testcase I added and is quite frequent in real-world code.
> > > 
> > > I also added a check to give up on VLAs since I cannot convince myself that
> > > this is safe: I think early inlining VLAs and streaming them may lead to
> > > the same VLA type having two different sizes at a time, enabling it to partially
> > > overlap.
> > 
> > OK - sorry for the delay.  The array stuff gets a bit ugly so
> > we eventually want to do sth about that ...
> 
> Thanks, no prob with the delay - I apprechiate we could discuss these
> things carefully as they are by no means obvious :)
> 
> I agree that the way overlapping arrays support is done current is ugly
> and seems incomplete.  I hope to clean it up now and craft more
> testcases.  It seems bit odd decision to suport the partial overlaps as
> done by alias-2.c. I wonder how important it is in practice.

Probably not very, likely appears in the context of VLAs or
multi-dimensional arrays.  IIRC at some point I tried to
understand what the C standard says here but I don't remember
the outcome ;)  Then there's also ARRAY_RANGE_REF ...

Richard.

> Honza
> > 
> > Thanks,
> > Richard.
> > 
> > >   * gcc.dg/lto/alias-access-path-2.0.c: New testcase.
> > > 
> > >   * tree-ssa-alias.c (aliasing_component_refs_p): Do not give up
> > >   immediately after same_types_for_tbaa_p returns -1 and continue
> > >   looking for possible exact match; if matching types are arrays
> > >   watch for partial overlaps.
> > >   (indirect_ref_may_alias_decl_p): Watch for partial array overlaps.
> > >   (indirect_refs_may_alias_p): Do type based disambiguation first;
> > >   update comment.
> > > Index: testsuite/g++.dg/lto/pr88130_0.C
> > > ===
> > > --- testsuite/g++.dg/lto/pr88130_0.C  (nonexistent)
> > > +++ testsuite/g++.dg/lto/pr88130_0.C  (working copy)
> > > @@ -0,0 +1,28 @@
> > > +// { dg-lto-do link }
> > > +// // { dg-lto-options { "-O2 -flto" } }
> > > +// // { dg-extra-ld-options "-r -nostdlib" }
> > > +class a {
> > > +public:
> > > +  static const long b = 1;
> > > +};
> > > +struct c {
> > > +  enum d { e };
> > > +};
> > > +class C;
> > > +class f {
> > > +public:
> > > +  f(c::d);
> > > +  template <typename g> C operator<=(g);
> > > +};
> > > +class C {
> > > +public:
> > > +  template <typename h> void operator!=(h &);
> > > +};
> > > +void i() {
> > > +  f j(c::e);
> > > +  try {
> > > +j <= 0 != a::b;
> > > +  } catch (...) {
> > > +  }
> > > +}
> > > +
> > > Index: testsuite/gcc.dg/lto/alias-access-path-2_0.c
> > > ===
> > > --- testsuite/gcc.dg/lto/alias-access-path-2_0.c  (nonexistent)
> > > +++ testsuite/gcc.dg/lto/alias-access-path-2_0.c  (working copy)
> > > @@ -0,0 +1,38 @@
> > > +/* { dg-lto-do run } */
> > > +/* { dg-lto-options { { -O3 -flto -fno-early-inlining } } } */
> > > +
> > > +/* In this test the access patch orracle (aliasing_component_refs_p)
> > > +   can disambiguage array[0] from array[1] by base+offset but it needs to be
> > > +   able to find the common type and not give up by not being able to compare
> > > +   types earlier.  */
> > > +
> > > +typedef int (*fnptr) ();
> > > +
> > > +__attribute__ ((used))
> > > +struct a
> > > +{
> > > +  void *array[2];
> > > +} a, *aptr = &a;
> > > +
> > > +__attribute__ ((used))
> > > +struct b
> > > +{
> > > + struct a a;
> > > +} *bptr;
> > > +
> > > +static void
> > > +inline_me_late (int argc)
> > > +{
> > > +  if (argc == -1)
> > > +bptr->a.array[1] = bptr;
> > > +}
> > > +
> > > +int
> > > +main (int argc)
> > > +{
> > > +  aptr->array[0] = 0;
> > > +  inline_me_late (argc);
> > > +  if (!__builtin_constant_p (aptr->array[0] == 0))
> > > +__builtin_abort ();
> > > +  return 0;
> > > +}
> > > Index: tree-ssa-alias.c
> > > ===
> > > --- tree-ssa-alias.c  (revision 271747)
> > > +++ tree-ssa-alias.c  (working copy)
> > > @@ -850,6 +850,7 @@ aliasing_component_refs_p (tree ref1,
> > >tree type1, type2;
> > >tree *refp;
> > >int same_p1 = 0, same_p2 = 0;
> > > +  bool maybe_match = false

Re: [PATCH 1/4] Remove indirect call top N counter type.

2019-06-06 Thread Jan Hubicka
> On 6/6/19 2:52 PM, Jan Hubicka wrote:
> > Hi,
> > so the only point of removing this is the fact that builds would be
> > not reproducible with indir-call-topn-profile?
> 
> That's one reason. But the main reason is that the code is dead and not used.
> If you take a look at gcc/ipa-profile.c:189, the code is not supporting 
> HIST_TYPE_INDIR_CALL_TOPN.

Yep, I know. It is not that hard to add support for this - we currently
assume that speculative edges have one direct and one indirect call.
We need to add support for multiple indirect calls, which would not
be too hard, but it never got high enough on my TODO list.

Another option is to do the multiple-value speculation the old way, directly in
VPT.

I was thinking that another possible use of TOPN is the division used in
growing hashtable implementations. Those usually have a small set of built-in
prime numbers which they use as the array size, and the modulo always uses one
of those values. This is, however, usually not quite visible to the compiler,
and a TOPN counter could work that out.  So we could do automatically more
or less what htab_mod does.
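
To make the hashtable idea concrete, here is a minimal sketch of the kind of
specialization a top-N value profile could enable: when the divisor is almost
always one of a few known primes, dispatching on it lets the compiler treat
each division as a division by a constant. The prime values and the function
name are made up for illustration; this is neither htab_mod nor anything the
profiler emits today.

```cpp
#include <cstdint>

// Hypothetical table sizes; a real hashtable would have its own prime list.
constexpr std::uint32_t mod_common_primes(std::uint32_t hash, std::uint32_t size) {
  switch (size) {
    // Each case divides by a compile-time constant, which compilers
    // typically lower to a multiplication and shifts instead of a divide.
    case 13:  return hash % 13;
    case 31:  return hash % 31;
    case 61:  return hash % 61;
    case 127: return hash % 127;
    // Uncommon sizes fall back to a generic division.
    default:  return hash % size;
  }
}
```

The result is the same as a plain hash % size for every input; only the code
generated for the common sizes differs.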

But yep, I have no time to work on this currently, so we may just drop
it and see if we will want to recover it later.
Honza

> So it's broken for few years and nobody is using that. Last reason is 
> implementation of:
> __gcov_merge_icall_topn. It's over-complicated.
> 
> Martin
> 
> > I still kind of thing it may be useful to track multiple most common
> > values, so I would be in favour of keeping it just updating the 
> > documentation
> > of indir-call-topn-profile that it is currently incomplete and does not
> > lead to reproducible builds and does not handle speculation...
> > 
> > Honza
> 


Re: [PATCH] Enforce allocator::value_type consistency for containers in C++2a

2019-06-06 Thread Jonathan Wakely

On 03/06/19 14:23 +0100, Jonathan Wakely wrote:

In previous standards it is undefined for a container and its allocator
to have a different value_type. Libstdc++ has traditionally allowed it
as an extension, automatically rebinding the allocator to the
container's value_type. Since GCC 8.1 that extension has been disabled
for C++11 and later when __STRICT_ANSI__ is defined (i.e. for
-std=c++11, -std=c++14, -std=c++17 and -std=c++2a).

Since the acceptance of P1463R1 into the C++2a draft an incorrect
allocator::value_type now requires a diagnostic. This patch implements
that by enabling the static_assert for -std=gnu++2a as well.
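
A minimal sketch of the kind of check described above, using a toy container
rather than the actual libstdc++ headers: the trait and class names below are
made up, and the real assertion in libstdc++ is guarded by different
conditions.

```cpp
#include <memory>
#include <type_traits>

// Hypothetical trait: does the allocator allocate objects of type T?
template <typename T, typename Alloc>
struct allocator_value_type_matches
    : std::is_same<typename std::allocator_traits<Alloc>::value_type, T> {};

// Toy container that rejects a mismatched allocator at compile time
// instead of silently rebinding it to T.
template <typename T, typename Alloc = std::allocator<T>>
class toy_vector {
  static_assert(allocator_value_type_matches<T, Alloc>::value,
                "toy_vector must have the same value_type as its allocator");
};
```

With this in place, toy_vector<int> compiles, while instantiating
toy_vector<int, std::allocator<long>> produces the diagnostic.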

* doc/xml/manual/status_cxx2020.xml: Document P1463R1 status.
* include/bits/forward_list.h [__cplusplus > 201703]: Enable
allocator::value_type assertion for C++2a.
* include/bits/hashtable.h: Likewise.
* include/bits/stl_deque.h: Likewise.
* include/bits/stl_list.h: Likewise.
* include/bits/stl_map.h: Likewise.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_vector.h: Likewise.
* testsuite/23_containers/deque/48101-3_neg.cc: New test.
* testsuite/23_containers/forward_list/48101-3_neg.cc: New test.
* testsuite/23_containers/list/48101-3_neg.cc: New test.
* testsuite/23_containers/map/48101-3_neg.cc: New test.
* testsuite/23_containers/multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/set/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_map/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_set/48101-3_neg.cc: New test.
* testsuite/23_containers/vector/48101-3_neg.cc: New test.


The tests for this extension now fail when run with -std=gnu++2a. This
fixes them. Tested x86_64-linux with various -std options. Committed
to trunk.


commit a0791bda52523601ffd7208d59947460b8853773
Author: Jonathan Wakely 
Date:   Thu Jun 6 14:27:31 2019 +0100

Fix tests that fail in C++2a mode

The GNU extension that allows using the wrong allocator type with a
container is disabled for C++2a mode, because the standard now requires
a diagnostic. Fix the tests that fail when -std=gnu++2a is used.

Also remove some redundant tests that are duplicates of another test
except for a target specifier of c++11. Those tests previously set
-std=gnu++11 explicitly but that was replaced globally with a target
specifier. These tests existed to verify that explicit instantiation
worked for both C++98 and C++11 modes, but now do nothing because both
copies of the test use -std=gnu++14 by default. Instead of duplicating
the test we should be regularly running the whole testsuite with
different -std options.

* testsuite/23_containers/deque/requirements/explicit_instantiation/
1_c++0x.cc: Remove redundant test.
* testsuite/23_containers/deque/requirements/explicit_instantiation/
2.cc: Use target selector instead of preprocessor condition.
* testsuite/23_containers/deque/requirements/explicit_instantiation/
3.cc: Do not run test for C++2a.
* testsuite/23_containers/forward_list/requirements/
explicit_instantiation/3.cc: Likewise.
* testsuite/23_containers/forward_list/requirements/
explicit_instantiation/5.cc: Do not test allocator rebinding extension
for C++2a.
* testsuite/23_containers/list/requirements/explicit_instantiation/
1_c++0x.cc: Remove redundant test.
* testsuite/23_containers/list/requirements/explicit_instantiation/
2.cc: Use target selector instead of preprocessor condition.
* testsuite/23_containers/list/requirements/explicit_instantiation/
3.cc: Do not run test for C++2a.
* testsuite/23_containers/list/requirements/explicit_instantiation/
5.cc: Do not test allocator rebinding extension for C++2a.
* testsuite/23_containers/map/requirements/explicit_instantiation/
1_c++0x.cc: Remove redundant test.
* testsuite/23_containers/map/requirements/explicit_instantiation/
2.cc: Adjust comment.
* testsuite/23_containers/map/requirements/explicit_instantiation/
3.cc: Do not run test for C++2a.
* testsuite/23_containers/map/requirements/explicit_instantiation/
5.cc: Do not test allocator rebinding extension for C++2a.
* testsuite/23_containers/multimap/requirements/explicit_instantiation/
1_c++0x.cc: Remove redun

Re: [PATCH][AArch64] PR tree-optimization/90332: Implement vec_init where N is a vector mode

2019-06-06 Thread Kyrill Tkachov

[resending without HTML]

On 6/3/19 11:48 AM, Kyrill Tkachov wrote:


On 6/3/19 11:32 AM, James Greenhalgh wrote:

On Fri, May 10, 2019 at 10:32:22AM +0100, Kyrill Tkachov wrote:

Hi all,

This patch fixes the failing gcc.dg/vect/slp-reduc-sad-2.c testcase on
aarch64
by implementing a vec_init optab that can handle two half-width vectors
producing a full-width one
by concatenating them.

In the gcc.dg/vect/slp-reduc-sad-2.c case it's a V8QI reg concatenated
with a V8QI const_vector of zeroes.
This can be implemented efficiently using the aarch64_combinez pattern
that just loads a D-register to make
use of the implicit zero-extending semantics of that load.
Otherwise it concatenates the two vector using aarch64_simd_combine.

With this patch I'm seeing the effect from richi's original patch that
added gcc.dg/vect/slp-reduc-sad-2.c on aarch64
and 525.x264_r improves by about 1.5%.

Bootstrapped and tested on aarch64-none-linux-gnu. Also tested on
aarch64_be-none-elf.

Ok for trunk?

I have a question on the patch. Otherwise, this is OK for trunk.


2019-10-05  Kyrylo Tkachov 

      PR tree-optimization/90332
      * config/aarch64/aarch64.c (aarch64_expand_vector_init):
      Handle VALS containing two vectors.
      * config/aarch64/aarch64-simd.md (*aarch64_combinez): Rename
      to...
      (@aarch64_combinez): ... This.
      (*aarch64_combinez_be): Rename to...
      (@aarch64_combinez_be): ... This.
      (vec_init): New define_expand.
      * config/aarch64/iterators.md (Vhalf): Handle V8HF.
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 0c2c17ed8269923723d066b250974ee1ff423d26..52c933cfdac20c5c566c13ae2528f039efda4c46 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -15075,6 +15075,43 @@ aarch64_expand_vector_init (rtx target, rtx vals)

    rtx v0 = XVECEXP (vals, 0, 0);
    bool all_same = true;
+  /* This is a special vec_init where N is not an element mode but a
+ vector mode with half the elements of M.  We expect to find two entries
+ of mode N in VALS and we must put their concatenation into TARGET.  */
+  if (XVECLEN (vals, 0) == 2 && VECTOR_MODE_P (GET_MODE (XVECEXP (vals, 0, 0))))

Should you validate the two vector modes are actually half-size vectors here,
and not something unexpected?



From my reading it can't ever be anything but half-size vectors on 
this path (the optabs are only defined for such modes)


I'll add an assert to that effect, as if that changes in the optab, 
this will blow up.



And this is what I committed (added an assert)

Thanks for the review.

Kyrill



Thanks,

Kyrill



Thanks,
James



+    {
+  rtx lo = XVECEXP (vals, 0, 0);
+  rtx hi = XVECEXP (vals, 0, 1);
+  machine_mode narrow_mode = GET_MODE (lo);
+  gcc_assert (GET_MODE_INNER (narrow_mode) == inner_mode);
+  gcc_assert (narrow_mode == GET_MODE (hi));
+
+  /* When we want to concatenate a half-width vector with zeroes we can
+ use the aarch64_combinez[_be] patterns.  Just make sure that the
+ zeroes are in the right half.  */
+  if (BYTES_BIG_ENDIAN
+  && aarch64_simd_imm_zero (lo, narrow_mode)
+  && general_operand (hi, narrow_mode))
+    emit_insn (gen_aarch64_combinez_be (narrow_mode, target, hi, lo));
+  else if (!BYTES_BIG_ENDIAN
+   && aarch64_simd_imm_zero (hi, narrow_mode)
+   && general_operand (lo, narrow_mode))
+    emit_insn (gen_aarch64_combinez (narrow_mode, target, lo, hi));
+  else
+    {
+  /* Else create the two half-width registers and combine them.  */
+  if (!REG_P (lo))
+    lo = force_reg (GET_MODE (lo), lo);
+  if (!REG_P (hi))
+    hi = force_reg (GET_MODE (hi), hi);
+
+  if (BYTES_BIG_ENDIAN)
+    std::swap (lo, hi);
+  emit_insn (gen_aarch64_simd_combine (narrow_mode, target, lo, hi));
+    }
+ return;
+   }
+
    /* Count the number of variable elements to initialise. */
    for (int i = 0; i < n_elts; ++i)
  {
diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 9faec77a8c58b3722bc5906135bc927010c94011..4e12e8c71fb11c654336b773f85e3e764b85dbf6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -3226,7 +3226,7 @@
 ;; In this insn, operand 1 should be low, and operand 2 the high part of the
 ;; dest vector.
 
-(define_insn "*aarch64_combinez"
+(define_insn "@aarch64_combinez"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 	(vec_concat:
 	  (match_operand:VDC 1 "general_operand" "w,?r,m")
@@ -3240,7 +3240,7 @@
(set_attr "arch" "simd,fp,simd")]
 )
 
-(define_insn "*aarch64_combinez_be"
+(define_insn "@aarch64_combinez_be"
   [(set (match_operand: 0 "register_operand" "=w,w,w")
 (vec_concat:
 	  (match_operand:VDC 2 "aarch64_simd_or_scalar_imm_zero")
@@ -5969,6 +5969,15 @@
   DONE;
 })
 
+(define_expand "vec_init"
+  [(match_operand:VQ_NO2E 0 "register_operand" "")
+   (match_opera

Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-06 Thread Hans-Peter Nilsson
> From: Eric Botcazou 
> Date: Wed, 05 Jun 2019 22:03:04 +0200

> > This issue exists, not just for targets that can have their
> > MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
> > for the 'bit-container' targets where it *can't* be set higher.
> > 
> > Let's please DTRT and correct the code here in the middle-end,
> > so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
> >  typedef int v4si __attribute__ ((vector_size (1 << 29)));
> > (all listed targets happen to have Pmode == SImode)
> > 
> > So, considering that: ok to commit?
> 
> You'd need to audit the effects on other targets though.  Are we sure that we 
> want to do bitsizetype calculations in a larger mode on very embedded targets?

IIUC, the precision of the bitsizetype is used on the host.
(Which is bad for native builds.)  When bitsizetype objects end
up on the target, they use the actual Pmode and not the larger
precision mode.

The only "other targets" affected are the ones I listed, where
it's needed in order to be able to detect near-address-range
overflow, as shown.
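
To make the overflow concrete (a back-of-the-envelope sketch, not GCC code, and
assuming for illustration a bit-size type limited to 32 bits): the vector in
gcc.dg/pr69973.c is 1 << 29 bytes, so its size in bits is 1 << 32, which is one
past what 32 bits can represent. The helper names below are made up.

```cpp
#include <cstdint>

// Size in bits of an object of the given byte size, computed with
// 32 bits of precision; unsigned arithmetic wraps modulo 2^32.
constexpr std::uint32_t bits_in_32(std::uint32_t bytes) {
  return static_cast<std::uint32_t>(bytes * 8u);
}

// The same computation with 64 bits of precision.
constexpr std::uint64_t bits_in_64(std::uint64_t bytes) {
  return bytes * 8u;
}

// Byte size taken from "vector_size (1 << 29)" in the testcase.
constexpr std::uint32_t vector_bytes = 1u << 29;
```

In 32 bits the bit size wraps around to zero, so the overflow cannot be
noticed; with more precision the value 1 << 32 is representable and can be
diagnosed.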

So, it's a question of correctness, not want.

Are you suggesting I need to follow where the precision of
bitsizetype ends up in target code?  If so, can you please do
that for Ada?  You may already have the answer for that.

brgds, H-P


Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-06 Thread Hans-Peter Nilsson
> Date: Thu, 6 Jun 2019 16:04:47 +0200
> From: Hans-Peter Nilsson 

> When bitsizetype objects end
> up on the target, they use the actual Pmode and not the larger
> precision mode.

Oops, a half-way-done email slipped away, this part still needs
to be investigated.

I don't really know where the precision of bitsizetype ends up;
if it's a host-only thing or leaks into target code.  Anyone?

brgds, H-P


Re: Patch: don't cap TYPE_PRECISION of bitsizetype at MAX_FIXED_MODE_SIZE

2019-06-06 Thread Hans-Peter Nilsson
> From: Richard Biener 
> Date: Thu, 6 Jun 2019 09:59:29 +0200
> 
> On Wed, Jun 5, 2019 at 10:03 PM Eric Botcazou  wrote:
> >
> > > This issue exists, not just for targets that can have their
> > > MAX_FIXED_MODE_SIZE more-or-less easily tweaked higher, but also
> > > for the 'bit-container' targets where it *can't* be set higher.
> > >
> > > Let's please DTRT and correct the code here in the middle-end,
> > > so we don't ICE for those targets and this line (gcc.dg/pr69973.c):
> > >  typedef int v4si __attribute__ ((vector_size (1 << 29)));
> > > (all listed targets happen to have Pmode == SImode)
> > >
> > > So, considering that: ok to commit?
> >
> > You'd need to audit the effects on other targets though.  Are we sure that 
> > we
> > want to do bitsizetype calculations in a larger mode on very embedded 
> > targets?
> 
> I didn't actually write it down but originally wanted - what about adding
> a way for the target to specify what type to use for bitsize_type?
> We do have SIZETYPE so say that if BITSIZETYPE is defined then
> use that (otherwise fall back to the existing mechanism)?  There may
> not be a C type that maps to DImode for cris

(Again: 64-bit types work fine for CRIS, it's just the cooked-up
middle-end expressions that shouldn't use 64-bit-entities.
Like, extracting bytes out of an 8-byte vector type.  Not sure if
that actually happens, but MAX_FIXED_MODE_SIZE is used in
situations like that, where the data wasn't originally a 64-bit
scalar.)

> and it's not t

Re: [PATCH] fix more -Wformat-diag issues

2019-06-06 Thread Martin Sebor

On 6/6/19 3:39 AM, Jakub Jelinek wrote:

On Wed, May 22, 2019 at 10:34:00AM -0600, Martin Sebor wrote:

gcc/ChangeLog:

* config/i386/i386-features.c (ix86_get_function_versions_dispatcher):
Adjust quoting and hyphenation.
* convert.c (convert_to_real_1): Same.
* gcc.c (driver_wrong_lang_callback): Same.
(driver::handle_unrecognized_options): Same.
* gimple-ssa-nonnull-compare.c (do_warn_nonnull_compare): Same.
* opts-common.c (cmdline_handle_error): Same.
(read_cmdline_option): Same.
* opts-global.c (complain_wrong_lang): Same.
(print_ignored_options): Same.
(handle_common_deferred_options): Same.
* pretty-print.h: Same.
* print-rtl.c (debug_bb_n_slim): Same.
* sched-rgn.c (make_pass_sched_fusion): Same.
* tree-cfg.c (verify_gimple_assign_unary): Same.
(verify_gimple_label): Same.
* tree-ssa-operands.c (verify_ssa_operands): Same.
* varasm.c (do_assemble_alias): Same.
(assemble_alias): Same.

* diagnostic-core.h (GCC_DIAG_STYLE): Adjust.
 (GCC_DIAG_RAW_STYLE): New macro.

* cfghooks.c: Disable -Wformat-diags.
* cfgloop.c: Same.
* cfgrtl.c: Same.
* cgraph.c: Same.
* diagnostic-show-locus.c: Same.
* diagnostic.c: Same.
* gimple-pretty-print.c: Same.
* graph.c: Same.
* symtab.c: Same.
* tree-eh.c: Same.
* tree-pretty-print.c: Same.
* tree-ssa.c: Same.

* configure: Regenerate.
* configure.ac (ACX_PROG_CXX_WARNING_OPTS): Add -Wno-error=format-diag.
 (ACX_PROG_CC_WARNING_OPTS): Same.


Changes for the same change shouldn't be separated by empty newlines in the
ChangeLog.  Furthermore, you've managed to commit only the first part (until
varasm.c) and not the rest.


I actually managed to do that on purpose.  I just didn't "manage"
to also update the ever important ChangeLog.  There are probably
other mistakes in it.




diff --git a/gcc/configure b/gcc/configure
index 4a3d5eefcb8..c9062cca9d6 100755
--- a/gcc/configure
+++ b/gcc/configure
@@ -6797,7 +6797,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
  
  c_loose_warn=

  save_CFLAGS="$CFLAGS"
-for real_option in -Wstrict-prototypes -Wmissing-prototypes; do
+for real_option in -Wstrict-prototypes 
-Wmissing-prototypes-Wno-error=format-diag; do
# Do the check with the no- prefix removed since gcc silently
# accepts any -Wno-* option on purpose
case $real_option in


The above was probably regenerated before you've added a space:


Yes.




diff --git a/gcc/configure.ac b/gcc/configure.ac
index 35982fdc9ed..cbc0c25fa2b 100644
--- a/gcc/configure.ac
+++ b/gcc/configure.ac
@@ -483,10 +483,11 @@ AS_IF([test $enable_build_format_warnings = no],
[wf_opt=-Wno-format],[wf_opt=])
  ACX_PROG_CXX_WARNING_OPTS(
m4_quote(m4_do([-W -Wall -Wno-narrowing -Wwrite-strings ],
-  [-Wcast-qual $wf_opt])), [loose_warn])
+  [-Wcast-qual -Wno-error=format-diag $wf_opt])),
+  [loose_warn])
  ACX_PROG_CC_WARNING_OPTS(
-   m4_quote(m4_do([-Wstrict-prototypes -Wmissing-prototypes])),
-   [c_loose_warn])
+   m4_quote(m4_do([-Wstrict-prototypes -Wmissing-prototypes ],


 ^--HERE
I've committed the following to fix that up as obvious:


Thank you.

Martin



2019-06-06  Jakub Jelinek  

* configure: Regenerate.

--- gcc/configure   (revision 271993)
+++ gcc/configure   (revision 271994)
@@ -6797,7 +6797,7 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
  
  c_loose_warn=

  save_CFLAGS="$CFLAGS"
-for real_option in -Wstrict-prototypes 
-Wmissing-prototypes-Wno-error=format-diag; do
+for real_option in -Wstrict-prototypes -Wmissing-prototypes 
-Wno-error=format-diag; do
# Do the check with the no- prefix removed since gcc silently
# accepts any -Wno-* option on purpose
case $real_option in


Jakub
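For readers following the configure hunk, here is a minimal sketch of what the comment in the loop describes: an option spelled -Wno-foo is probed as -Wfoo, because gcc silently accepts any -Wno-* option. The helper name option_to_probe is hypothetical and only illustrates the prefix handling; it is not the code autoconf generates.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (an assumption for illustration, not configure code):
// return the option that would actually be tested for a requested option.
// "-Wno-foo" is probed as "-Wfoo"; anything else is probed unchanged.
std::string option_to_probe(const std::string &real_option)
{
  const std::string no_prefix = "-Wno-";
  if (real_option.compare(0, no_prefix.size(), no_prefix) == 0)
    return "-W" + real_option.substr(no_prefix.size());
  return real_option;
}

// Small self-check covering the options from the hunk above.
void check_option_to_probe()
{
  assert(option_to_probe("-Wno-error=format-diag") == "-Werror=format-diag");
  assert(option_to_probe("-Wstrict-prototypes") == "-Wstrict-prototypes");
  // With the space missing, the two options fuse into one word that does
  // not start with -Wno-, so it is passed through as a single bogus option.
  assert(option_to_probe("-Wmissing-prototypes-Wno-error=format-diag")
         == "-Wmissing-prototypes-Wno-error=format-diag");
}
```

The last assertion shows why the missing space mattered: the fused word is treated as one option rather than two.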





[committed, amdgcn] Add -march=gfx906 for Vega20

2019-06-06 Thread Andrew Stubbs
This patch adds a new -march=gfx906 option, and a new multilib to go 
with it.


gfx906 is ISA compatible with gfx900 (at least as far as the compiler 
support goes), but unfortunately the metadata in the object files is not 
compatible, so we need a whole extra multilib (booo!). Or, at least 
that's the easiest solution to use for now.


Andrew
Add -march=gfx906 for AMD GCN.

2019-06-06  Andrew Stubbs  

	gcc/
	* config.gcc (amdgcn-*-*): Allow --with-arch=gfx906.
	* config/gcn/gcn.opt (gpu_type): Add gfx906.
	* config/gcn/t-gcn-hsa (MULTILIB_OPTIONS): Add gfx906 multilib.
	(MULTILIB_DIRNAMES): Rename gcn5 to gfx900.
	Add gfx906.

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 67c3c2c7a42..6b00c387247 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -4127,7 +4127,7 @@ case "${target}" in
 		for which in arch tune; do
 			eval "val=\$with_$which"
 			case ${val} in
-			"" | carrizo | fiji | gfx900 )
+			"" | carrizo | fiji | gfx900 | gfx906 )
 # OK
 ;;
 			*)
diff --git a/gcc/config/gcn/gcn.opt b/gcc/config/gcn/gcn.opt
index 2fd3996edba..bdc878f35ad 100644
--- a/gcc/config/gcn/gcn.opt
+++ b/gcc/config/gcn/gcn.opt
@@ -34,6 +34,9 @@ Enum(gpu_type) String(fiji) Value(PROCESSOR_FIJI)
 EnumValue
 Enum(gpu_type) String(gfx900) Value(PROCESSOR_VEGA)
 
+EnumValue
+Enum(gpu_type) String(gfx906) Value(PROCESSOR_VEGA)
+
 march=
 Target RejectNegative Joined ToLower Enum(gpu_type) Var(gcn_arch) Init(PROCESSOR_CARRIZO)
 Specify the name of the target GPU.
diff --git a/gcc/config/gcn/t-gcn-hsa b/gcc/config/gcn/t-gcn-hsa
index 085ba429c9d..1600a586ac4 100644
--- a/gcc/config/gcn/t-gcn-hsa
+++ b/gcc/config/gcn/t-gcn-hsa
@@ -42,8 +42,8 @@ ALL_HOST_OBJS += gcn-run.o
 gcn-run$(exeext): gcn-run.o
 	+$(LINKER) $(ALL_LINKERFLAGS) $(LDFLAGS) -o $@ $< -ldl
 
-MULTILIB_OPTIONS = march=gfx900
-MULTILIB_DIRNAMES = gcn5
+MULTILIB_OPTIONS = march=gfx900 march=gfx906
+MULTILIB_DIRNAMES = gfx900 gfx906
 
 PASSES_EXTRA += $(srcdir)/config/gcn/gcn-passes.def
 gcn-tree.o: $(srcdir)/config/gcn/gcn-tree.c


Re: [PATCH] Add warn_unused_result for malloc-like functions (PR tree-optimization/78902).

2019-06-06 Thread Martin Sebor

On 6/6/19 2:01 AM, Martin Liška wrote:

Hi.

The patch adds the warn_unused_result attribute to malloc-like functions.

Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
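As a rough illustration of the effect on user code, the sketch below applies the same attribute to a user-defined malloc-like wrapper. The names dup_string and round_trip are made up for the example and are not part of the patch; the patch itself annotates the built-ins listed in the ChangeLog.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical malloc-like wrapper (an assumption for illustration).
// With warn_unused_result, GCC diagnoses callers that drop the result.
__attribute__((malloc, warn_unused_result))
char *dup_string(const char *s)
{
  std::size_t n = std::strlen(s) + 1;
  char *p = static_cast<char *>(std::malloc(n));
  if (p)
    std::memcpy(p, s, n);
  return p;
}

// The result is used and freed here, so no warning is expected.
bool round_trip(const char *s)
{
  assert(s != nullptr);
  char *copy = dup_string(s);
  if (!copy)
    return false;
  bool same = std::strcmp(copy, s) == 0;
  std::free(copy);
  return same;
}

// By contrast, a bare call such as
//   dup_string("leak");
// ignores the returned pointer and is the kind of call the attribute
// is meant to flag.
```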


I like this change (as you know :)  Just one question: should
all allocation functions be also annotated, including the two
variants of __builtin_alloca_with_align and
__builtin_posix_memalign?

As a separate comment, to get the benefit of the attribute in
GCC we might want to also annotate the wrappers in libiberty.h
and perhaps also some (many?) of the functions in tree.h.
(This is just a suggestion to think about independent of your
change.)

Martin



Ready to be installed?
Thanks,
Martin

gcc/ChangeLog:

2019-06-06  Martin Liska  

PR tree-optimization/78902
* builtin-attrs.def (ATTR_WARN_UNUSED_RESULT): New.
(ATTR_MALLOC_NOTHROW_LEAF_LIST): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
(ATTR_ALLOC_SIZE_2_NOTHROW_LIST): Remove.
(ATTR_MALLOC_SIZE_1_NOTHROW_LEAF_LIST): Remove.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LIST): New.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
(ATTR_ALLOCA_SIZE_1_NOTHROW_LEAF_LIST): Remove.
(ATTR_ALLOCA_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_SIZE_1_2_NOTHROW_LEAF_LIST):  Remove.
(ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST):
New.
(ATTR_ALLOC_SIZE_2_NOTHROW_LEAF_LIST): Remove.
(ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): New.
(ATTR_MALLOC_NOTHROW_NONNULL): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
(ATTR_MALLOC_NOTHROW_NONNULL_LEAF): Remove.
(ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
(ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
* builtins.def (BUILT_IN_ALIGNED_ALLOC): Change to use
warn_unused_result attribute.
(BUILT_IN_STRDUP): Likewise.
(BUILT_IN_STRNDUP): Likewise.
(BUILT_IN_ALLOCA): Likewise.
(BUILT_IN_CALLOC): Likewise.
(BUILT_IN_MALLOC): Likewise.
(BUILT_IN_REALLOC): Likewise.

gcc/testsuite/ChangeLog:

2019-06-06  Martin Liska  

PR tree-optimization/78902
* c-c++-common/asan/alloca_loop_unpoisoning.c: Use result
of __builtin_alloca.
* c-c++-common/asan/pr88619.c: Likewise.
* g++.dg/overload/using2.C: Likewise for malloc.
* gcc.dg/attr-alloc_size-5.c: Add new dg-warning.
* gcc.dg/nonnull-3.c: Use result of __builtin_strdup.
* gcc.dg/pr43643.c: Likewise.
* gcc.dg/pr59717.c: Likewise for calloc.
* gcc.dg/torture/pr71816.c: Likewise.
* gcc.dg/tree-ssa/pr78886.c: Likewise.
* gcc.dg/tree-ssa/pr79697.c: Likewise.
* gcc.dg/pr78902.c: New test.
---
  gcc/builtin-attrs.def | 37 ---
  gcc/builtins.def  | 14 +++
  .../asan/alloca_loop_unpoisoning.c|  2 +-
  gcc/testsuite/c-c++-common/asan/pr88619.c |  2 +-
  gcc/testsuite/g++.dg/overload/using2.C|  2 +-
  gcc/testsuite/gcc.dg/attr-alloc_size-5.c  |  2 +-
  gcc/testsuite/gcc.dg/nonnull-3.c  |  4 +-
  gcc/testsuite/gcc.dg/pr43643.c|  6 +--
  gcc/testsuite/gcc.dg/pr59717.c|  8 ++--
  gcc/testsuite/gcc.dg/pr78902.c| 14 +++
  gcc/testsuite/gcc.dg/torture/pr71816.c|  2 +-
  gcc/testsuite/gcc.dg/tree-ssa/pr78886.c   |  2 +-
  gcc/testsuite/gcc.dg/tree-ssa/pr79697.c   |  6 +--
  13 files changed, 62 insertions(+), 39 deletions(-)
  create mode 100644 gcc/testsuite/gcc.dg/pr78902.c






Re: [PATCH] Enforce allocator::value_type consistency for containers in C++2a

2019-06-06 Thread Jonathan Wakely

On 06/06/19 14:36 +0100, Jonathan Wakely wrote:

On 03/06/19 14:23 +0100, Jonathan Wakely wrote:

In previous standards it is undefined for a container and its allocator
to have a different value_type. Libstdc++ has traditionally allowed it
as an extension, automatically rebinding the allocator to the
container's value_type. Since GCC 8.1 that extension has been disabled
for C++11 and later when __STRICT_ANSI__ is defined (i.e. for
-std=c++11, -std=c++14, -std=c++17 and -std=c++2a).

Since the acceptance of P1463R1 into the C++2a draft an incorrect
allocator::value_type now requires a diagnostic. This patch implements
that by enabling the static_assert for -std=gnu++2a as well.
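A minimal sketch of the kind of check involved, assuming a toy container: tiny_container and allocator_matches are hypothetical names, and the assertion message is illustrative rather than the library's wording.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Hypothetical trait (an assumption for illustration): true when the
// allocator's value_type is exactly the container's element type.
template<typename T, typename Alloc>
constexpr bool allocator_matches()
{
  return std::is_same<typename Alloc::value_type, T>::value;
}

// Toy container, not libstdc++ code: a mismatched allocator is rejected
// at compile time instead of being silently rebound.
template<typename T, typename Alloc = std::allocator<T>>
struct tiny_container
{
  static_assert(std::is_same<typename Alloc::value_type, T>::value,
                "allocator value_type must match the element type");
  using value_type = T;
  using allocator_type = Alloc;
};

void check_default_allocator()
{
  tiny_container<int> ok;  // std::allocator<int> matches int
  (void) ok;
  assert((allocator_matches<int, tiny_container<int>::allocator_type>()));
  // tiny_container<int, std::allocator<long>> bad;  // would not compile
}
```

The commented-out line is the case that now needs a diagnostic: the allocator's value_type is long while the element type is int.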

* doc/xml/manual/status_cxx2020.xml: Document P1463R1 status.
* include/bits/forward_list.h [__cplusplus > 201703]: Enable
allocator::value_type assertion for C++2a.
* include/bits/hashtable.h: Likewise.
* include/bits/stl_deque.h: Likewise.
* include/bits/stl_list.h: Likewise.
* include/bits/stl_map.h: Likewise.
* include/bits/stl_multimap.h: Likewise.
* include/bits/stl_multiset.h: Likewise.
* include/bits/stl_set.h: Likewise.
* include/bits/stl_vector.h: Likewise.
* testsuite/23_containers/deque/48101-3_neg.cc: New test.
* testsuite/23_containers/forward_list/48101-3_neg.cc: New test.
* testsuite/23_containers/list/48101-3_neg.cc: New test.
* testsuite/23_containers/map/48101-3_neg.cc: New test.
* testsuite/23_containers/multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/set/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_map/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multimap/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_multiset/48101-3_neg.cc: New test.
* testsuite/23_containers/unordered_set/48101-3_neg.cc: New test.
* testsuite/23_containers/vector/48101-3_neg.cc: New test.


The tests for this extension now fail when run with -std=gnu++2a. This
fixes them. Tested x86_64-linux with various -std options. Committed
to trunk.


I missed a couple more tests that fail with -std=gnu++2a.

Tested x86_64-linux, committed to trunk.


commit 191c471552dfa4784c0721f7813b54eec845bf1a
Author: redi 
Date:   Thu Jun 6 15:34:45 2019 +

Fix more tests that fail in C++2a mode

* testsuite/23_containers/unordered_map/requirements/debug_container.cc:
Do not test allocator rebinding extension for C++2a.
* testsuite/23_containers/unordered_set/allocator/ext_ptr.cc: Change
dg-do directive for C++17 and C++2a.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272009 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/requirements/debug_container.cc b/libstdc++-v3/testsuite/23_containers/unordered_map/requirements/debug_container.cc
index d6afae9c2e9..903802878d7 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_map/requirements/debug_container.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_map/requirements/debug_container.cc
@@ -30,7 +30,7 @@ template class __gnu_debug::unordered_map;
 template class __gnu_debug::unordered_map, equal_to, 
   allocator>>;
-#ifndef __STRICT_ANSI__
+#if !defined __STRICT_ANSI__ && __cplusplus <= 201703L
 template class __gnu_debug::unordered_map, equal_to, 
   allocator>;
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
index 5daa456e440..b7a63c5e393 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_set/allocator/ext_ptr.cc
@@ -15,7 +15,8 @@
 // with this library; see the file COPYING3.  If not see
 // .
 
-// { dg-do run { target c++11 } }
+// { dg-do run { target { c++11_only || c++14_only } } }
+// { dg-do compile { target c++17 } }
 
 #include 
 #include 


Re: [PATCH] Fix tests that fail with -std=gnu++98 or -std=gnu++11

2019-06-06 Thread Jonathan Wakely

On 06/06/19 13:17 +0100, Jonathan Wakely wrote:

* testsuite/18_support/set_terminate.cc: Do not run for C++98 mode.
* testsuite/18_support/set_unexpected.cc: Likewise.
* testsuite/20_util/is_nothrow_invocable/value.cc: Test converting to
void.
* testsuite/20_util/is_nothrow_invocable/value_ext.cc: Fix constexpr
function to be valid in C++11.
* testsuite/26_numerics/complex/proj.cc: Do not run for C++98 mode.
* testsuite/experimental/names.cc: Do not run for C++98 mode. Do not
include Library Fundamentals or Networking headers in C++11 mode.
* testsuite/ext/char8_t/atomic-1.cc: Do not run for C++98 mode.


This should fix the last failures with -std=gnu++98.

Tested x86_64-linux, committed to trunk.


commit 2bc51486854314b620cc92a0a1dff5a9be5cb831
Author: redi 
Date:   Thu Jun 6 15:34:51 2019 +

Fix more failing tests for C++98 mode

* testsuite/23_containers/deque/requirements/dr438/assign_neg.cc: Add
dg-prune-output for different C++98 diagnostic.
* testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc:
Likewise.
* testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc:
Likewise.
* testsuite/23_containers/deque/requirements/dr438/insert_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/assign_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/constructor_1_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/constructor_2_neg.cc:
Likewise.
* testsuite/23_containers/list/requirements/dr438/insert_neg.cc:
Likewise.
* testsuite/23_containers/vector/requirements/dr438/assign_neg.cc:
Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_1_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/
constructor_2_neg.cc: Likewise.
* testsuite/23_containers/vector/requirements/dr438/insert_neg.cc:
Likewise.
* testsuite/libstdc++-prettyprinters/compat.cc: Do not run for C++98.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272010 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
index 0be1e965103..fdb03865e3d 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/assign_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-do compile }
+// { dg-prune-output "cannot convert" }
 // { dg-prune-output "no matching function .*_M_fill_assign" }
 
 #include 
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
index d99bd63abb5..1cb8cf1a7ec 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_1_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-do compile }
+// { dg-prune-output "cannot convert" }
 // { dg-prune-output "no matching function .*_M_fill_initialize" }
 
 #include 
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
index 9962bbfa225..4d3c9b31434 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/constructor_2_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-do compile }
+// { dg-prune-output "cannot convert" }
 // { dg-prune-output "no matching function .*_M_fill_initialize" }
 
 #include 
diff --git a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
index 8051196011b..83ee4492ff3 100644
--- a/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
+++ b/libstdc++-v3/testsuite/23_containers/deque/requirements/dr438/insert_neg.cc
@@ -18,6 +18,7 @@
 // .
 
 // { dg-do compile }
+// { dg-prune-output "cannot convert" }
 // { dg-prune-output "no matching function .*_M_fill_insert" }
 
 #include 
diff --git a/libstdc++-v3/testsuite/23_containers/list/requirements/dr438/assign_neg.cc b/libstdc++-v3/testsuite/23_containers/list/requirements/dr438/assign_neg.cc
index a3da00b03e9..a4dd34d8a6d 100644
--- a/libstdc++-v3/testsuite/23_containers/list/requirements/d

[PATCH] Avoid unnecessary inclusion of header

2019-06-06 Thread Jonathan Wakely

This can greatly reduce the amount of preprocessed code that is included
by other headers, because  depends on  which is huge.

* include/std/array: Do not include .
* include/std/optional: Include  and
 instead of .

Preprocessed line counts for C++17 mode:


Before   25774 32453 31616
After     9925 23194 19062

Tested x86_64-linux, committed to trunk.

Once we have a gcc-10/porting_to.html page I'll note this change
there, because code relying on std::string and std::allocator being
defined by transitive includes will need to include the right headers.
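A short sketch of what "include the right headers" means in practice, assuming C++17: the code names std::string and std::allocator, so it includes the headers that define them instead of relying on another header to pull them in. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>    // std::allocator, included explicitly
#include <optional>
#include <string>    // std::string, included explicitly

// Hypothetical lookup (an assumption for illustration): uses std::string
// next to std::optional without depending on a transitive include.
std::optional<std::string> find_name(int id)
{
  if (id == 1)
    return std::string("gcc");
  return std::nullopt;
}

// Uses std::allocator directly, hence the explicit <memory> include.
std::size_t allocate_and_release(std::size_t n)
{
  std::allocator<int> a;
  int *p = a.allocate(n);
  assert(p != nullptr);
  a.deallocate(p, n);
  return n;
}
```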

commit 9eb6db53fb072e20b1d54b16d8c1c77638c934e5
Author: redi 
Date:   Thu Jun 6 15:34:56 2019 +

Avoid unnecessary inclusion of  header

This can greatly reduce the amount of preprocessed code that is included
by other headers, because  depends on  which is huge.

* include/std/array: Do not include .
* include/std/optional: Include  and
 instead of .
* testsuite/20_util/function_objects/searchers.cc: Include 
for std::isalnum.
* testsuite/20_util/tuple/cons/deduction.cc: Include  for
std::allocator.
* testsuite/23_containers/map/erasure.cc: Include .
* testsuite/23_containers/unordered_map/erasure.cc: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@272011 138bc75d-0d04-0410-961f-82ee72b054a4

diff --git a/libstdc++-v3/include/std/array b/libstdc++-v3/include/std/array
index 02c6f4b4dbe..230e2b0f593 100644
--- a/libstdc++-v3/include/std/array
+++ b/libstdc++-v3/include/std/array
@@ -36,7 +36,7 @@
 #else
 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/libstdc++-v3/include/std/optional b/libstdc++-v3/include/std/optional
index ae825d3e327..79cd6c97019 100644
--- a/libstdc++-v3/include/std/optional
+++ b/libstdc++-v3/include/std/optional
@@ -35,10 +35,10 @@
 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
-#include 
+#include 
 #include 
 #include 
 
diff --git a/libstdc++-v3/testsuite/20_util/function_objects/searchers.cc b/libstdc++-v3/testsuite/20_util/function_objects/searchers.cc
index aae21d28d3a..fc278860f5c 100644
--- a/libstdc++-v3/testsuite/20_util/function_objects/searchers.cc
+++ b/libstdc++-v3/testsuite/20_util/function_objects/searchers.cc
@@ -19,6 +19,7 @@
 
 #include 
 #include 
+#include 
 #ifdef _GLIBCXX_USE_WCHAR_T
 # include 
 #endif
diff --git a/libstdc++-v3/testsuite/20_util/tuple/cons/deduction.cc b/libstdc++-v3/testsuite/20_util/tuple/cons/deduction.cc
index fa91f8fa539..eb3f2f3d6ab 100644
--- a/libstdc++-v3/testsuite/20_util/tuple/cons/deduction.cc
+++ b/libstdc++-v3/testsuite/20_util/tuple/cons/deduction.cc
@@ -19,6 +19,7 @@
 // .
 
 #include 
+#include 
 
 template struct require_same;
 template struct require_same { using type = void; };
diff --git a/libstdc++-v3/testsuite/23_containers/map/erasure.cc b/libstdc++-v3/testsuite/23_containers/map/erasure.cc
index d8a57160865..5b211c3602b 100644
--- a/libstdc++-v3/testsuite/23_containers/map/erasure.cc
+++ b/libstdc++-v3/testsuite/23_containers/map/erasure.cc
@@ -19,6 +19,7 @@
 // .
 
 #include 
+#include 
 #include 
 
 #ifndef __cpp_lib_erase_if
diff --git a/libstdc++-v3/testsuite/23_containers/unordered_map/erasure.cc b/libstdc++-v3/testsuite/23_containers/unordered_map/erasure.cc
index 35190a0d19e..17bb940f00f 100644
--- a/libstdc++-v3/testsuite/23_containers/unordered_map/erasure.cc
+++ b/libstdc++-v3/testsuite/23_containers/unordered_map/erasure.cc
@@ -19,6 +19,7 @@
 // .
 
 #include 
+#include 
 #include 
 
 #ifndef __cpp_lib_erase_if


Re: [PATCH] fix more -Wformat-diag issues

2019-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2019 at 08:45:56AM -0600, Martin Sebor wrote:
> > Changes for the same change shouldn't be separated by empty newlines in the
> > ChangeLog.  Furthermore, you've managed to commit only the first part (until
> > varasm.c) and not the rest.
> 
> I actually managed to do that on purpose.  I just didn't "manage"
> to also update the ever important ChangeLog.  There are probably
> other mistakes in it.

The coding conventions say not to, so have you violated that on purpose?
https://www.gnu.org/prep/standards/html_node/Style-of-Change-Logs.html#Style-of-Change-Logs
"Separate unrelated change log entries with blank lines.
Don’t put blank lines between individual changes of an entry."

I don't see any changes in that patch as unrelated, after all, if they were
unrelated, they ought to be committed separately.

Jakub


Re: Teach same_types_for_tbaa to structurally compare arrays, pointers and vectors

2019-06-06 Thread Martin Jambor
Hi,
(now even including gcc-patches mailing list which we managed to drop
again and Honza whom I forgot to CC the last time)

On Thu, Jun 06 2019, Richard Biener wrote:
> On Tue, 4 Jun 2019, Martin Jambor wrote:
>> >> @@ -1822,9 +1863,19 @@ build_ref_for_model (location_t loc, tree base, 
>> >> HOST_WIDE_INT offset,
>> >> NULL_TREE);
>> >>  }
>> >>else
>> >> -return
>> >> -  build_ref_for_offset (loc, base, offset, model->reverse, 
>> >> model->type,
>> >> - gsi, insert_after);
>> >> +{
>> >> +  tree res;
>> >> +  if (model->grp_same_access_path
>> >> +   && offset <= model->offset
>> >> +   && get_object_alignment (base) >= TYPE_ALIGN (TREE_TYPE (base))
>> >
>> > not sure what this tests - but I think it should be part of the
>> > grp_same_access_path check?
>> >
>> 
>> It checks that base is not under-aligned, I was assuming that I can
>> safely construct COMPONENT_REFs and ARRAY_REFs on a properly aligned
>> base.  I hope that is still safe even of reference copying with
>> unsharing but of course there is more room for surprises.
>> 
>> It cannot be part of grp_same_access_path check because BASE is now
>> something quite different.  For example when optimizing
>> 
>>   s = *p;
>>   v = s.i;
>> 
>> build_ref_for_model can be called to construct reference to load p->i
>> and BASE is *p, as opposed to grp_same_access_path where the path is
>> based on s.
>
> Oh, I see.  Still alignment is ultimately on the base, so
> there shouldn't be any constraints here.  That is, if you
> substitute a base with different alignment the accesses will
> change accordingly and that's independent of whether you
> use a simple MEM_REF or re-build the access path.
>
>> 
>> The patch passes bootstrap and testing on x86_64-linux, please let me
>> know if there is anything else you'd like me to adjust.
>
> Looks good to me.  As said, eventually the alignment check is
> unnecessary.

OK, thank you.

I am going to commit it and then immediately follow up with a patch
removing the test.  The combination has just passed bootstrap and
testing on x86_64-linux.  At least I hope the above is permission
to do so.
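For list readers who have not followed the whole thread, the `s = *p; v = s.i;` case quoted above can be pictured at the source level roughly as follows. This is only a sketch of the intended equivalence, not of the GIMPLE that SRA produces; the struct and function names are made up.

```cpp
#include <cassert>

// Hypothetical aggregate (an assumption for illustration).
struct S { int i; int j; };

// The original shape: copy the whole aggregate, then read one field.
int load_via_copy(const S *p)
{
  S s = *p;
  int v = s.i;
  return v;
}

// What the reconstructed reference amounts to: load p->i directly,
// so the access path is based on *p rather than on the local copy s.
int load_direct(const S *p)
{
  return p->i;
}

void check_loads_agree()
{
  S x{7, 9};
  assert(load_via_copy(&x) == 7);
  assert(load_direct(&x) == load_via_copy(&x));
}
```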

Thanks,

Martin



Subject: [PATCH 2/2] Drop alignment check in build_reconstructed_reference

2019-06-06  Martin Jambor  

* tree-sra.c (build_reconstructed_reference): Drop the alignment
check.
---
 gcc/tree-sra.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index a246a93a48d..074d4964379 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -1817,9 +1817,6 @@ build_reconstructed_reference (location_t, tree base, 
struct access *model)
   expr = TREE_OPERAND (expr, 0);
 }
 
-  if (get_object_alignment (base) < get_object_alignment (expr))
-return NULL;
-
   TREE_OPERAND (prev_expr, 0) = base;
   tree ref = unshare_expr (model->expr);
   TREE_OPERAND (prev_expr, 0) = expr;
-- 
2.21.0



For the reference of people on the mailing list, the first patch was:


Subject: [PATCH 1/2] Make SRA re-construct original memory accesses when easy

2019-06-03  Martin Jambor  

* tree-sra.c (struct access): New field grp_same_access_path.
(dump_access): Dump it.
(build_reconstructed_reference): New function.
(build_ref_for_model): Use it if possible.
(path_comparable_for_same_access): New function.
(same_access_path_p): Likewise.
(sort_and_splice_var_accesses): Set the new flag.
(analyze_access_subtree): Likewise.
(propagate_subaccesses_across_link): Propagate zero value of the new
flag down the access tree.

testsuite/
* gcc.dg/tree-ssa/alias-access-path-1.c: Remove -fno-tree-sra option.
* gcc.dg/tree-ssa/ssa-dse-26.c: Disable FRE.
* testsuite/gnat.dg/opt39.adb: Adjust scan dump.

Addressed Richi's comments
---
 .../gcc.dg/tree-ssa/alias-access-path-1.c |   2 +-
 gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c|   2 +-
 gcc/testsuite/gnat.dg/opt39.adb   |   3 +-
 gcc/tree-sra.c| 135 --
 4 files changed, 131 insertions(+), 11 deletions(-)

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c 
b/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c
index 264f72aff0a..ba90b56fe5c 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/alias-access-path-1.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-fre3 -fno-tree-sra" } */
+/* { dg-options "-O2 -fdump-tree-fre3" } */
 struct foo
 {
   int val;
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c 
b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
index 32d63899b63..836a8092ab9 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/ssa-dse-26.c
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-O2 -fdump-tree-dse1-details -fno-short-enums" } */
+/* { 

Re: [PATCH] fix more -Wformat-diag issues

2019-06-06 Thread Martin Sebor

On 6/6/19 9:42 AM, Jakub Jelinek wrote:

On Thu, Jun 06, 2019 at 08:45:56AM -0600, Martin Sebor wrote:

Changes for the same change shouldn't be separated by empty newlines in the
ChangeLog.  Furthermore, you've managed to commit only the first part (until
varasm.c) and not the rest.


I actually managed to do that on purpose.  I just didn't "manage"
to also update the ever important ChangeLog.  There are probably
other mistakes in it.


The coding conventions say not to, so have you violated that on purpose?


Are you for real?  Yes, I added a blank line on purpose.


https://www.gnu.org/prep/standards/html_node/Style-of-Change-Logs.html#Style-of-Change-Logs
"Separate unrelated change log entries with blank lines.
Don’t put blank lines between individual changes of an entry."


It's pretty pathetic that the only thing you choose to say in
response to my change to better align GCC code with the coding
conventions is to chastise me for trying to make the ChangeLog
entry more readable by adding a blank line to it.


I don't see any changes in that patch as unrelated, after all, if they were
unrelated, they ought to be committed separately.


Thanks for the lecture.


Re: [PATCH] Add warn_unused_result for malloc-like functions (PR tree-optimization/78902).

2019-06-06 Thread Jeff Law
On 6/6/19 9:17 AM, Martin Sebor wrote:
> On 6/6/19 2:01 AM, Martin Liška wrote:
>> Hi.
>>
> >> The patch adds the warn_unused_result attribute to malloc-like
> >> functions.
>>
>> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> I like this change (as you know :)  Just one question: should
> all allocation functions be also annotated, including the two
> variants of __builtin_alloca_with_align and
> __builtin_posix_memalign?
> 
> As a separate comment, to get the benefit of the attribute in
> GCC we might want to also annotate the wrappers in libiberty.h
> and perhaps also some (many?) of the functions in tree.h.
> (This is just a suggestion to think about independent of your
> change.)
I like it conceptually as well -- I think the glibc guys had pretty good
results exposing latent issues when they went through and did the
necessary markup for WUR warnings.

Jeff


Re: Make aliasing_component_refs_p to work harder when same_type_for_tbaa returns -1

2019-06-06 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, 6 Jun 2019, Jan Hubicka wrote:
>
>> > > This is more of an accident and there are cases where we do not trip
>> > > across this -1 and we disambiguate array accesses, which seems unsafe
>> > > to me.
>> > >
>> > > With my change aliasing_component_refs_p finds the match of the
>> > > array types (same_type_for_tbaa returns 1 with non-LTO because they
>> > > have the same canonical type) and disambiguates based on disjoint
>> > > access ranges.
>> > >
>> > > I have thus gone ahead and updated all uses of same_type_for_tbaa to
>> > > special-case arrays and refer to this testcase (which seems odd and is
>> > > the only one in the testsuite): we can still do useful disambiguation
>> > > if the array is not a toplevel reference or we know that the memory
>> > > object is not bigger. This is tested by a testcase I added and is quite
>> > > frequent in real-world code.
>> > >
>> > > I also added a check to give up on VLAs since I cannot convince myself
>> > > that this is safe: I think early inlining VLAs and streaming them may
>> > > lead to the same VLA type having two different sizes at a time, enabling
>> > > it to partially overlap.
>> > 
>> > OK - sorry for the delay.  The array stuff gets a bit ugly so
>> > we eventually want to do sth about that ...
>> 
>> Thanks, no prob with the delay - I appreciate we could discuss these
>> things carefully as they are by no means obvious :)
>> 
>> I agree that the way overlapping arrays support is done currently is ugly
>> and seems incomplete.  I hope to clean it up now and craft more
>> testcases.  It seems a bit of an odd decision to support the partial
>> overlaps as done by alias-2.c. I wonder how important it is in practice.
>
> Probably not very, likely appears in the context of VLAs or
> multi-dimensional arrays.  IIRC at some point I tried to
> understand what the C standard says here but I don't remember
> the outcome ;)  Then there's also ARRAY_RANGE_REF ...

Dropping support for that would also allow cheaper runtime alias checks
in the vectoriser for some cases, via the DDR_COULD_BE_INDEPENDENT_P stuff.

Richard
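
As background for the remark about runtime alias checks, here is a rough sketch of the general shape of such a check: before running a vectorised loop over two equally sized byte ranges, test whether they can overlap at all. This is only an illustration of the idea. It is not GCC's implementation, it says nothing about how DDR_COULD_BE_INDEPENDENT_P is evaluated, and the name ranges_may_overlap is hypothetical.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: true if [a, a + len) and [b, b + len) share at
// least one byte.  Addresses are compared as integers; address
// wrap-around is ignored in this sketch.
bool
ranges_may_overlap (const void *a, const void *b, std::size_t len)
{
  std::uintptr_t pa = reinterpret_cast<std::uintptr_t> (a);
  std::uintptr_t pb = reinterpret_cast<std::uintptr_t> (b);
  return pa < pb + len && pb < pa + len;
}
```

If partial overlaps between two accesses to the same array type did not need to be supported, a check of this kind could in principle be reduced to a cheaper test, which is the saving alluded to above.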


[PATCH] remove trailing spaces from tree-ssa-strlen.c

2019-06-06 Thread Martin Sebor

To avoid trailing whitespace in my commits I have my editor set
to highlight them.  While integrating the strlen/sprintf passes
and making more extensive changes than usual, I keep getting
distracted by the highlighting pointing out trailing spaces that
predate my changes.  Rather than "fixing" this piecemeal as other
changes are made that touch the same lines I'd like to clean this
up in a standalone commit that does nothing else but adjust
the formatting.

If no one has any objections I'll commit this cleanup later today.
The diff is attached.

Martin
gcc/ChangeLog:

	* tree-ssa-strlen.c (adjust_related_strinfos): Avoid trailing article.
	(handle_builtin_malloc): Remove trailing spaces.
	(handle_builtin_memset): Same.
	(handle_builtin_memcmp): Same.
	(compute_string_length): Same.
	(determine_min_objsize): Same.
	(handle_builtin_string_cmp): Same.
	(handle_char_store): Same.  Break up excessively long line.

Index: gcc/tree-ssa-strlen.c
===
--- gcc/tree-ssa-strlen.c	(revision 272003)
+++ gcc/tree-ssa-strlen.c	(working copy)
@@ -891,9 +891,9 @@ adjust_related_strinfos (location_t loc, strinfo *
 	  tree tem;
 
 	  si = unshare_strinfo (si);
-	  /* We shouldn't see delayed lengths here; the caller must have
-	 calculated the old length in order to calculate the
-	 adjustment.  */
+	  /* We shouldn't see delayed lengths here; the caller must
+	 have calculated the old length in order to calculate
+	 the adjustment.  */
 	  gcc_assert (si->nonzero_chars);
 	  tem = fold_convert_loc (loc, TREE_TYPE (si->nonzero_chars), adj);
 	  si->nonzero_chars = fold_build2_loc (loc, PLUS_EXPR,
@@ -2759,7 +2759,7 @@ handle_builtin_malloc (enum built_in_function bcod
 
 /* Handle a call to memset.
After a call to calloc, memset(,0,) is unnecessary.
-   memset(malloc(n),0,n) is calloc(n,1). 
+   memset(malloc(n),0,n) is calloc(n,1).
return true when the call is transfomred, false otherwise.  */
 
 static bool
@@ -2815,7 +2815,7 @@ handle_builtin_memset (gimple_stmt_iterator *gsi)
 
 /* Handle a call to memcmp.  We try to handle small comparisons by
converting them to load and compare, and replacing the call to memcmp
-   with a __builtin_memcmp_eq call where possible. 
+   with a __builtin_memcmp_eq call where possible.
return true when call is transformed, return false otherwise.  */
 
 static bool
@@ -2898,13 +2898,13 @@ handle_builtin_memcmp (gimple_stmt_iterator *gsi)
   return true;
 }
 
-/* Given an index to the strinfo vector, compute the string length for the
-   corresponding string. Return -1 when unknown.  */
- 
-static HOST_WIDE_INT 
+/* Given an index to the strinfo vector, compute the string length
+   for the corresponding string. Return -1 when unknown.  */
+
+static HOST_WIDE_INT
 compute_string_length (int idx)
 {
-  HOST_WIDE_INT string_leni = -1; 
+  HOST_WIDE_INT string_leni = -1;
   gcc_assert (idx != 0);
 
   if (idx < 0)
@@ -2924,9 +2924,9 @@ compute_string_length (int idx)
   return string_leni;
 }
 
-/* Determine the minimum size of the object referenced by DEST expression which
-   must have a pointer type. 
-   Return the minimum size of the object if successful or NULL when the size 
+/* Determine the minimum size of the object referenced by DEST expression
+   which must have a pointer type.
+   Return the minimum size of the object if successful or NULL when the size
cannot be determined.  */
 static tree
 determine_min_objsize (tree dest)
@@ -2936,8 +2936,8 @@ determine_min_objsize (tree dest)
   if (compute_builtin_object_size (dest, 2, &size))
 return build_int_cst (sizetype, size);
 
-  /* Try to determine the size of the object through the RHS of the 
- assign statement.  */
+  /* Try to determine the size of the object through the RHS
+ of the assign statement.  */
   if (TREE_CODE (dest) == SSA_NAME)
 {
   gimple *stmt = SSA_NAME_DEF_STMT (dest);
@@ -2962,13 +2962,13 @@ determine_min_objsize (tree dest)
 
   type = TYPE_MAIN_VARIANT (type);
 
-  /* We cannot determine the size of the array if it's a flexible array, 
+  /* We cannot determine the size of the array if it's a flexible array,
  which is declared at the end of a structure.  */
   if (TREE_CODE (type) == ARRAY_TYPE
   && !array_at_struct_end_p (dest))
 {
   tree size_t = TYPE_SIZE_UNIT (type);
-  if (size_t && TREE_CODE (size_t) == INTEGER_CST 
+  if (size_t && TREE_CODE (size_t) == INTEGER_CST
 	  && !integer_zerop (size_t))
 return size_t;
 }
@@ -2976,7 +2976,7 @@ determine_min_objsize (tree dest)
   return NULL_TREE;
 }
 
-/* Handle a call to strcmp or strncmp. When the result is ONLY used to do 
+/* Handle a call to strcmp or strncmp. When the result is ONLY used to do
equality test against zero:
 
A. When the lengths of both arguments are constant and it's a strcmp:
@@ -2983,12 +2983,12 @@ determine_min_objsize (tree dest)
   * if the lengths are NOT equal

Re: [PATCH][MSP430][4/4] Implement 64-bit shifts in assembly code

2019-06-06 Thread Jeff Law
On 6/6/19 6:42 AM, Jozef Lawrynowicz wrote:
> On Wed, 5 Jun 2019 16:35:14 -0600
> Jeff Law  wrote:
> 
>> On 6/4/19 7:17 AM, Jozef Lawrynowicz wrote:
>>> libgcc/ChangeLog
>>>
>>> 2019-06-04  Jozef Lawrynowicz  
>>>
>>> * config/msp430/slli.S (__mspabi_s): New library function for
>>> performing a logical left shift of a 64-bit value.
>>> (__mspabi_srall): New library function for
>>> performing an arithmetic right shift of a 64-bit value.
>>> (__mspabi_srlll): New library function for
>>> performing a logical right shift of a 64-bit value.
>>>
>> Going to assume your assembly routines are correct :-)
>>
>> OK
>> jeff
> I assume I implemented them correctly based on the clean regtest of
> GCC/G++ testsuites. But in case there might be a gap in the coverage
> somewhere, how about the attached new torture test to explicitly check 64-bit
> shifts work as expected?
> Passes for x86_64-linux-gnu and msp430-elf.
I suspect this needs to be conditional on 64bit integer support (check
either at runtime with sizeof or via dejagnu effective target stuff).
With that fixed this is OK.

jeff
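
A minimal sketch of the runtime variant of that guard follows: the checks are skipped when long long is not 64 bits wide. The function names are hypothetical and this is not the attached torture test. A real test would also need to keep the compiler from folding the shifts at compile time (for example with noinline helpers), which this sketch does not attempt.

```cpp
#include <climits>

// Hypothetical helpers standing in for the three shift kinds.
unsigned long long shl64 (unsigned long long x, unsigned n) { return x << n; }
unsigned long long srl64 (unsigned long long x, unsigned n) { return x >> n; }
// Right-shifting a negative value is implementation-defined before
// C++20; GCC performs an arithmetic shift.
long long sra64 (long long x, unsigned n) { return x >> n; }

// Returns 0 on success or when the target has no 64-bit long long,
// otherwise the number of the failing check.
int
check_shifts64 ()
{
  if (sizeof (long long) * CHAR_BIT != 64)
    return 0;  // Skip: no 64-bit integer support.
  if (shl64 (1ULL, 40) != 0x10000000000ULL)
    return 1;
  if (srl64 (0x8000000000000000ULL, 63) != 1ULL)
    return 2;
  if (sra64 (-256LL, 4) != -16LL)
    return 3;
  return 0;
}
```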


Re: [PATCH] remove trailing spaces from tree-ssa-strlen.c

2019-06-06 Thread Jeff Law
On 6/6/19 11:19 AM, Martin Sebor wrote:
> To avoid trailing whitespace in my commits I have my editor set
> to highlight them.  While integrating the strlen/sprintf passes
> and making more extensive changes than usual, I keep getting
> distracted by the highlighting pointing out trailing spaces that
> predate my changes.  Rather than "fixing" this piecemeal as other
> changes are made that touch the same lines I'd like to clean this
> up in a standalone commit that does nothing else but adjust
> the formatting.
> 
> If no one has any objections I'll commit this cleanup later today.
> The diff is attached.
> 
> Martin
> 
> gcc-tree-ssa-strlen-trailing-space.diff
> 
> gcc/ChangeLog:
> 
>   * tree-ssa-strlen.c (adjust_related_strinfos): Avoid trailing article.
>   (handle_builtin_malloc): Remove trailing spaces.
>   (handle_builtin_memset): Same.
>   (handle_builtin_memcmp): Same.
>   (compute_string_length): Same.
>   (determine_min_objsize): Same.
>   (handle_builtin_string_cmp): Same.
>   (handle_char_store): Same.  Break up excessively long line.
OK.

Though really I'd like to see these caught by and either rejected or
automatically fixed by commit hooks :-)  It's silly to require any
manual intervention for trailing whitespace.

jeff
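
The check such a hook would perform on each added line is small. The following is a sketch under the assumption that the hook sees one line at a time; has_trailing_space is a made-up name and no existing GCC hook is being described.

```cpp
#include <string>

// Hypothetical per-line check: true if LINE ends in a space or tab,
// ignoring any trailing line terminator.
bool
has_trailing_space (const std::string &line)
{
  std::string::size_type end = line.size ();
  while (end > 0 && (line[end - 1] == '\n' || line[end - 1] == '\r'))
    --end;
  return end > 0 && (line[end - 1] == ' ' || line[end - 1] == '\t');
}
```

A rejecting hook would refuse the commit when this returns true for any added line; an auto-fixing hook would instead strip the offending characters.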


Re: undefined behavior in value_range::equiv_add()?

2019-06-06 Thread Jeff Law
On 6/6/19 1:31 AM, Richard Biener wrote:
[ Big snip ]

> 
>> I was primarily concerned with the ones in the evrp simplification
>> engine which are easy to fix in-place.  Looking at gimple-fold.c I agree
>> we do need to address the walking problem.
>>
>> I'm not sure we can drop to varying though -- my first twiddle of this
>> did precisely that, but that'll likely regress vrp47 where we know the
>> resultant range after simplification is just [0,1].
> 
> Of course we do not need to drop the original LHS to VARYING, only
> the defs we didn't already visit.
I was referring to the newly created def.  There's a case (seen in vrp47
IIRC) where the new temporary has a known range [0,1] and we need to
know that for subsequent simplifications.  But if your change passes
regression testing, then we're still picking that up properly.


> 
>> Ideally we wouldn't have the simplifiers creating new names or we'd have
>> a more robust mechanism for identifying when we've created a new name
>> and doing the right thing.  I suspect whatever we do here right now is
>> going to be a bandaid and as long as we keep creating new names in the
>> simplifier we're likely going to be coming back and applying more bandaids.
> 
> Re-visiting the stmts would be possible if we'd split registering ranges
> derived from uses of a stmt from those of defs (IIRC I had incomplete
> patches to do that when trying to derive X != 0 from Y / X because
> otherwise we miscompile stuff).
> 
> Meanwhile I have bootstrapped / tested the following which does the VARYING
> thing.
> 
> Applied to trunk.  I think we need to backport this since this is a latent
> wrong-code issue.  We can see to improve things on the trunk incrementally.
Works for me.

jeff


[PATCH] Update preferred_stack_boundary only when expanding function call

2019-06-06 Thread H.J. Lu
locate_and_pad_parm is called when expanding function call from
initialize_argument_information and when generating function body
from assign_parm_find_entry_rtl:

  /* Remember if the outgoing parameter requires extra alignment on the
     calling function side.  */
  if (crtl->stack_alignment_needed < boundary)
    crtl->stack_alignment_needed = boundary;
  if (crtl->preferred_stack_boundary < boundary)
    crtl->preferred_stack_boundary = boundary;

stack_alignment_needed and preferred_stack_boundary should be updated
only when expanding a function call, not when generating a function body.
Add an argument, outgoing_p, to locate_and_pad_parm to indicate that it
is being called to expand a function call.  Update stack_alignment_needed
and preferred_stack_boundary only when expanding a function call and the
parameter is passed on the stack.

Tested on Linux/x86-64.

OK for trunk?

Thanks.

-- 
H.J.
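
The intended behaviour can be modelled in isolation as follows. This is a simplified stand-in and not the GCC code: stack_info_sketch and note_parm_boundary are hypothetical names, and the struct only mimics the two crtl fields named in the description.

```cpp
// Simplified stand-in for the two crtl fields discussed above.
struct stack_info_sketch
{
  unsigned stack_alignment_needed;
  unsigned preferred_stack_boundary;
};

// Raise the recorded alignments only for an outgoing argument that is
// passed on the stack, mirroring the condition described in the patch.
void
note_parm_boundary (stack_info_sketch &info, unsigned boundary,
		    bool in_regs, bool outgoing_p)
{
  if (outgoing_p && !in_regs)
    {
      if (info.stack_alignment_needed < boundary)
	info.stack_alignment_needed = boundary;
      if (info.preferred_stack_boundary < boundary)
	info.preferred_stack_boundary = boundary;
    }
}
```

In this model an incoming parameter (outgoing_p false) or a register argument leaves both fields untouched, which is the change in behaviour the patch is after.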
From e91e20ad8e10373db2c6d8f99a3da0bbf46c5c34 Mon Sep 17 00:00:00 2001
From: "H.J. Lu" 
Date: Wed, 5 Jun 2019 12:55:19 -0700
Subject: [PATCH] Update preferred_stack_boundary only when expanding function
 call

locate_and_pad_parm is called when expanding function call from
initialize_argument_information and when generating function body
from assign_parm_find_entry_rtl:

  /* Remember if the outgoing parameter requires extra alignment on the
     calling function side.  */
  if (crtl->stack_alignment_needed < boundary)
    crtl->stack_alignment_needed = boundary;
  if (crtl->preferred_stack_boundary < boundary)
    crtl->preferred_stack_boundary = boundary;

stack_alignment_needed and preferred_stack_boundary should be updated
only when expanding function call, not when generating function body.
Add an argument, outgoing_p, to locate_and_pad_parm to indicate for
expanding function call.  Update stack_alignment_needed and
preferred_stack_boundary if the parameter is passed on stack and only
when expanding function call.

Tested on Linux/x86-64.

gcc/

	PR rtl-optimization/90765
	* function.c (assign_parm_find_entry_rtl): Pass false to
	locate_and_pad_parm.
	(locate_and_pad_parm): Add an argument, outgoing_p, to indicate
	for expanding function call.  Update stack_alignment_needed and
	preferred_stack_boundary only if outgoing_p is true and the
	parameter is passed on stack.
	* function.h (locate_and_pad_parm): Add an argument, outgoing_p,
	defaulting to true.

gcc/testsuite/

	PR rtl-optimization/90765
	* gcc.target/i386/pr90765-1.c: New test.
	* gcc.target/i386/pr90765-2.c: Likewise.
---
 gcc/function.c| 21 +
 gcc/function.h|  3 ++-
 gcc/testsuite/gcc.target/i386/pr90765-1.c | 11 +++
 gcc/testsuite/gcc.target/i386/pr90765-2.c | 18 ++
 4 files changed, 44 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90765-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr90765-2.c

diff --git a/gcc/function.c b/gcc/function.c
index e30ee259bec..9b6673f6f0d 100644
--- a/gcc/function.c
+++ b/gcc/function.c
@@ -2601,7 +2601,7 @@ assign_parm_find_entry_rtl (struct assign_parm_data_all *all,
   locate_and_pad_parm (data->promoted_mode, data->passed_type, in_regs,
 		   all->reg_parm_stack_space,
 		   entry_parm ? data->partial : 0, current_function_decl,
-		   &all->stack_args_size, &data->locate);
+		   &all->stack_args_size, &data->locate, false);
 
   /* Update parm_stack_boundary if this parameter is passed in the
  stack.  */
@@ -3954,7 +3954,8 @@ locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
 		 int reg_parm_stack_space, int partial,
 		 tree fndecl ATTRIBUTE_UNUSED,
 		 struct args_size *initial_offset_ptr,
-		 struct locate_and_pad_arg_data *locate)
+		 struct locate_and_pad_arg_data *locate,
+		 bool outgoing_p)
 {
   tree sizetree;
   pad_direction where_pad;
@@ -4021,12 +4022,16 @@ locate_and_pad_parm (machine_mode passed_mode, tree type, int in_regs,
 	}
 }
 
-  /* Remember if the outgoing parameter requires extra alignment on the
- calling function side.  */
-  if (crtl->stack_alignment_needed < boundary)
-crtl->stack_alignment_needed = boundary;
-  if (crtl->preferred_stack_boundary < boundary)
-crtl->preferred_stack_boundary = boundary;
+  if (outgoing_p && !in_regs)
+{
+  /* Remember if the outgoing parameter requires extra alignment on
+	 the calling function side if this parameter is passed in the
+	 stack.  */
+  if (crtl->stack_alignment_needed < boundary)
+	crtl->stack_alignment_needed = boundary;
+  if (crtl->preferred_stack_boundary < boundary)
+	crtl->preferred_stack_boundary = boundary;
+}
 
   if (ARGS_GROW_DOWNWARD)
 {
diff --git a/gcc/function.h b/gcc/function.h
index bfe9919a760..5ad7a33fd39 100644
--- a/gcc/function.h
+++ b/gcc/function.h
@@ -613,7 +613,8 @@ extern bool use_register_for_decl (const_tree);
 extern gimple_seq gimplify_parameters (gimple_seq *);
 extern void locate_a

[PATCH, PPC, Darwin] Ensure unwinder is built with altivec enabled.

2019-06-06 Thread Iain Sandoe
When libgcc is built on Darwin, it is usually built for the earliest potential
target (Darwin8, 10.4).  Builds for that revision default to assuming that the
processor might be G3 (without vector ops), and there is an outlined function
used for save/restore that checks at run time whether the processor is G3 or
G4+.  However, the unwinder itself needs to be built with the assumption of
vector usage so that the relevant outlined functions are called.

Tested on powerpc-darwin9,
applied to mainline.

thanks
Iain

2019-06-06  Iain Sandoe  

* config/rs6000/t-darwin: Ensure that the unwinder is built with
altivec enabled.

diff --git a/libgcc/config/rs6000/t-darwin b/libgcc/config/rs6000/t-darwin
index abb41fc..61da0bd 100644
--- a/libgcc/config/rs6000/t-darwin
+++ b/libgcc/config/rs6000/t-darwin
@@ -20,4 +20,7 @@ LIB2ADD_ST = \
 # earlier OSX versions.
 HOST_LIBGCC2_CFLAGS += -Wa,-force_cpusubtype_ALL -mmacosx-version-min=10.4
 
+unwind-dw2_s.o: HOST_LIBGCC2_CFLAGS += -maltivec
+unwind-dw2.o: HOST_LIBGCC2_CFLAGS += -maltivec
+
 LIB2ADDEH += $(srcdir)/config/rs6000/darwin-fallback.c



Re: [PATCH] remove trailing spaces from tree-ssa-strlen.c

2019-06-06 Thread Segher Boessenkool
On Thu, Jun 06, 2019 at 11:34:31AM -0600, Jeff Law wrote:
> Though really I'd like to see these caught by and either rejected or
> automatically fixed by commit hooks :-)  It's silly to require any
> manual intervention for trailing whitespace.

[ Not fixed please, that can cause much bigger problems than it solves. ]

There is one line of thought that says if people do not have to deal with
these trivial formatting nits, they have more time for getting the more
complex cases right.  Another line of thought says that if people do not
get the trivial cases right, they will never do the harder ones correctly.


Segher


[Darwin, c++, testsuite] Fix alignas4.C for Darwin.

2019-06-06 Thread Iain Sandoe
This test was failing on Darwin because the scan-asm clauses only excluded
Darwin for m64 and the m32 syntax for Darwin is different, as below.

@Rainer, Mike: the tests for non-Darwin are somewhat strange (although I have
not modified them other than to exclude Darwin for 32b).
(a) the tests seem to be x86 only (m64 explicitly and m32 implicitly by using
ia32 as a guard).
(b) they only seem to care if there’s one instance of the alignment statement
but there are two objects?
(just mentioning in passing)

Darwin produces aligned zerofill directives for the objects represented.
We can scan for these using "lp64" and "ilp32" to catch operation on both
X86 and PowerPC ports (the test is for the alignment which is the trailing
value in the zerofill directive, as a power of two).

tested on x86_64-darwin16, 18 and x86_64-linux-gnu (m32, m64)
Applied to mainline,
thanks
Iain
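
To make the "power of two" remark concrete: the trailing value scanned for is the base-2 logarithm of the alignment, so an alignment of 4 appears as 2 and an alignment of 8 as 3. The sketch below shows that mapping next to an over-aligned type. The names over_aligned and align_log2 are illustrative only and do not come from alignas4.C.

```cpp
#include <cstddef>

// Illustrative over-aligned type (not the template from the test).
struct alignas (8) over_aligned { int i; };

// Base-2 logarithm of a power-of-two alignment.
constexpr unsigned
align_log2 (std::size_t align)
{
  return align <= 1 ? 0 : 1 + align_log2 (align / 2);
}

static_assert (alignof (over_aligned) == 8, "alignas takes effect");
static_assert (align_log2 (4) == 2, "alignment 4 is emitted as 2");
static_assert (align_log2 (8) == 3, "alignment 8 is emitted as 3");
```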

gcc/testsuite/ChangeLog:

2019-06-06  Iain Sandoe  

* g++.dg/cpp0x/alignas4.C: Amend test to check for zerofill syntax
on Darwin.

diff --git a/gcc/testsuite/g++.dg/cpp0x/alignas4.C b/gcc/testsuite/g++.dg/cpp0x/alignas4.C
index b66fa65..1ef4870 100644
--- a/gcc/testsuite/g++.dg/cpp0x/alignas4.C
+++ b/gcc/testsuite/g++.dg/cpp0x/alignas4.C
@@ -1,7 +1,13 @@
 // PR c++/59012
 // { dg-do compile { target c++11 } }
 // { dg-final { scan-assembler "align 8" { target { { i?86-*-* x86_64-*-* } && { { ! ia32 } && { ! *-*-darwin* } } } } } }
-// { dg-final { scan-assembler "align 4" { target ia32 } } }
+// { dg-final { scan-assembler "align 4" { target { ia32 && { ! *-*-darwin* } } } } }
+
+// Darwin produces aligned .zerofill directives for these.
+// { dg-final { scan-assembler {zerofill[^\n\r]+_a,4,2} { target { ilp32 && *-*-darwin* } } } }
+// { dg-final { scan-assembler {zerofill[^\n\r]+_a,8,3} { target { lp64 && *-*-darwin* } } } }
+// { dg-final { scan-assembler {zerofill[^\n\r]+_a2,4,2} { target { ilp32 && *-*-darwin* } } } }
+// { dg-final { scan-assembler {zerofill[^\n\r]+_a2,8,3} { target { lp64 && *-*-darwin* } } } }
 
 template 
 struct A



[PATCH] update get_range_strlen description

2019-06-06 Thread Martin Sebor

Hi Jeff,

It looks like the updated comment for get_range_strlen didn't make
it into the strlen fixup commits last December and the function
still has the old description.  (Not surprising given there are
at least two overloads of the same function and things moving
between them.)  I'm going to check in the comment from the patch
here: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02050.html
with an additional sentence describing ELTSIZE.

Let me know if you'd like me to tweak it or if you spot something
else.

Thanks
Martin
gcc/ChangeLog:

	* gimple-fold.c (get_range_strlen): Update comment that didn't
	make it into r267503 or related commits.
	
diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index b3e931744f8..18860bb7c28 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -1672,30 +1672,16 @@ get_range_strlen (tree arg, bitmap *visited,
 }
 }
 
-/* Determine the minimum and maximum value or string length that ARG
-   refers to and store each in the first two elements of MINMAXLEN.
-   For expressions that point to strings of unknown lengths that are
-   character arrays, use the upper bound of the array as the maximum
-   length.  For example, given an expression like 'x ? array : "xyz"'
-   and array declared as 'char array[8]', MINMAXLEN[0] will be set
-   to 0 and MINMAXLEN[1] to 7, the longest string that could be
-   stored in array.
-   Return true if the range of the string lengths has been obtained
-   from the upper bound of an array at the end of a struct.  Such
-   an array may hold a string that's longer than its upper bound
-   due to it being used as a poor-man's flexible array member.
-
-   STRICT is true if it will handle PHIs and COND_EXPRs conservatively
-   and false if PHIs and COND_EXPRs are to be handled optimistically,
-   if we can determine string length minimum and maximum; it will use
-   the minimum from the ones where it can be determined.
-   STRICT false should be only used for warning code.
-   When non-null, clear *NONSTR if ARG refers to a constant array
-   that is known not be nul-terminated.  Otherwise set it to
-   the declaration of the constant non-terminated array.
-
-   ELTSIZE is 1 for normal single byte character strings, and 2 or
-   4 for wide characer strings.  ELTSIZE is by default 1.  */
+/* Try to obtain the range of the lengths of the string(s) referenced
+   by ARG, or the size of the largest array ARG refers to if the range
+   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
+   is the expected size of the string element in bytes: 1 for char and
+   some power of 2 for wide characters.
+   Return true if the range [PDATA->MINLEN, PDATA->MAXLEN] is suitable
+   for optimization.  Returning false means that a nonzero PDATA->MINLEN
+   doesn't reflect the true lower bound of the range  when PDATA->MAXLEN
+   is -1 (in that case, the actual range is indeterminate, i.e.,
+   [0, PTRDIFF_MAX - 2].  */
 
 bool
 get_range_strlen (tree arg, c_strlen_data *pdata, unsigned eltsize)


[PATCH] Fix up g++.dg/cpp*/feat-cxx*.C tests (was: [PATCH] Avoid unnecessary inclusion of header)

2019-06-06 Thread Jakub Jelinek
Hi!

On Thu, Jun 06, 2019 at 04:38:36PM +0100, Jonathan Wakely wrote:
> This can greatly reduce the amount of preprocessed code that is included
> by other headers, because  depends on  which is huge.
> 
>   * include/std/array: Do not include .
>   * include/std/optional: Include  and
>instead of .
> 
> Preprocessed line counts for C++17 mode:
> 
> 
> Before   2577432453 31616
> After 992523194 19062
> 
> Tested x86_64-linux, committed to trunk.
> 
> Once we have a gcc-10/porting_to.html page I'll note this change
> there, because code relying on std::string and std::allocator being
> defined by transitive includes will need to include the right headers.

Not only those, but apparently also ::size_t or std::plus; this broke
FAIL: g++.dg/cpp1y/feat-cxx14.C   (test for excess errors)
FAIL: g++.dg/cpp1z/feat-cxx1z.C  -std=gnu++17 (test for excess errors)
FAIL: g++.dg/cpp1z/pr85569.C  -std=c++17 (test for excess errors)
FAIL: g++.dg/cpp2a/feat-cxx2a.C   (test for excess errors)

Fixed thusly, tested on x86_64-linux with check-c++-all, ok for trunk?

2019-06-06  Jakub Jelinek  

PR testsuite/90772
* g++.dg/cpp1y/feat-cxx14.C: Use std::size_t instead of size_t.
* g++.dg/cpp1z/feat-cxx1z.C: Likewise.
* g++.dg/cpp2a/feat-cxx2a.C: Likewise.
* g++.dg/cpp1z/pr85569.C: Include .

--- gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C.jj  2017-09-12 09:35:46.955698280 +0200
+++ gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C 2019-06-06 22:08:59.853528085 +0200
@@ -303,11 +303,11 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
--- gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C.jj  2019-01-16 09:35:07.360276820 +0100
+++ gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C 2019-06-06 22:09:21.571184644 +0200
@@ -292,12 +292,12 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
 
--- gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C.jj  2019-01-16 09:35:07.800269539 +0100
+++ gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C 2019-06-06 22:09:37.432934738 +0200
@@ -291,12 +291,12 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
 
--- gcc/testsuite/g++.dg/cpp1z/pr85569.C.jj 2018-12-05 09:16:42.870128432 +0100
+++ gcc/testsuite/g++.dg/cpp1z/pr85569.C 2019-06-06 22:15:46.462120520 +0200
@@ -2,6 +2,7 @@
 
 #include 
 #include 
+#include 
 
 #define LIFT_FWD(x) std::forward(x)
 

Jakub
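
The portability point can be shown with a small sketch: code that names std::size_t or std::plus should include the headers that declare them instead of relying on what another standard header happens to pull in. The names array_alias and sum3 are made up for illustration and are not taken from the tests being fixed.

```cpp
#include <array>
#include <cstddef>     // Declares std::size_t explicitly.
#include <functional>  // Declares std::plus explicitly.

// Uses the qualified std::size_t, which <cstddef> guarantees.
template<typename T, std::size_t N>
  using array_alias = std::array<T, N>;

// Sums three elements with std::plus.
int
sum3 (const array_alias<int, 3> &a)
{
  std::plus<int> add;
  return add (add (a[0], a[1]), a[2]);
}
```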


Re: [PATCH] update get_range_strlen description

2019-06-06 Thread Marek Polacek
On Thu, Jun 06, 2019 at 01:32:14PM -0600, Martin Sebor wrote:
> Hi Jeff,
> 
> It looks like the updated comment for get_range_strlen didn't make
> it into the strlen fixup commits last December and the function
> still has the old description.  (Not surprising given there are
> at least two overloads of the same function and things moving
> between them.)  I'm going to check in the comment from the patch
> here: https://gcc.gnu.org/ml/gcc-patches/2018-11/msg02050.html
> with an additional sentence describing ELTSIZE.
> 
> Let me know if you'd like me to tweak it or if you spot something
> else.
> 
> Thanks
> Martin

> gcc/ChangeLog:
> 
>   * gimple-fold.c (get_range_strlen): Update comment that didn't
>   make it into r267503 or related commits.
>   
> diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
> index b3e931744f8..18860bb7c28 100644
> --- a/gcc/gimple-fold.c
> +++ b/gcc/gimple-fold.c
> @@ -1672,30 +1672,16 @@ get_range_strlen (tree arg, bitmap *visited,
>  }
>  }
>  
> -/* Determine the minimum and maximum value or string length that ARG
> -   refers to and store each in the first two elements of MINMAXLEN.
> -   For expressions that point to strings of unknown lengths that are
> -   character arrays, use the upper bound of the array as the maximum
> -   length.  For example, given an expression like 'x ? array : "xyz"'
> -   and array declared as 'char array[8]', MINMAXLEN[0] will be set
> -   to 0 and MINMAXLEN[1] to 7, the longest string that could be
> -   stored in array.
> -   Return true if the range of the string lengths has been obtained
> -   from the upper bound of an array at the end of a struct.  Such
> -   an array may hold a string that's longer than its upper bound
> -   due to it being used as a poor-man's flexible array member.
> -
> -   STRICT is true if it will handle PHIs and COND_EXPRs conservatively
> -   and false if PHIs and COND_EXPRs are to be handled optimistically,
> -   if we can determine string length minimum and maximum; it will use
> -   the minimum from the ones where it can be determined.
> -   STRICT false should be only used for warning code.
> -   When non-null, clear *NONSTR if ARG refers to a constant array
> -   that is known not be nul-terminated.  Otherwise set it to
> -   the declaration of the constant non-terminated array.
> -
> -   ELTSIZE is 1 for normal single byte character strings, and 2 or
> -   4 for wide characer strings.  ELTSIZE is by default 1.  */
> +/* Try to obtain the range of the lengths of the string(s) referenced
> +   by ARG, or the size of the largest array ARG refers to if the range
> +   of lengths cannot be determined, and store all in *PDATA.  ELTSIZE
> +   is the expected size of the string element in bytes: 1 for char and
> +   some power of 2 for wide characters.
> +   Return true if the range [PDATA->MINLEN, PDATA->MAXLEN] is suitable
> +   for optimization.  Returning false means that a nonzero PDATA->MINLEN
> +   doesn't reflect the true lower bound of the range  when PDATA->MAXLEN
   ^^
There are two spaces instead of one.  No need to repost just because of
this, of course.

Marek


Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2019-06-06 Thread François Dumont

Here is what I came up with.


Regarding allocation in print_function I would also prefer to avoid it.
But this patch also aims at creating a backtrace_state object in case of
UB so the alloc is perhaps not so important. I can't use string_view as
I need to modify it to display only a part of it through fsprintf. I
could try to use "%.*s" however. I also haven't considered your remark
about template parameters containing '<' yet.
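
As a sketch of the "%.*s" idea, the precision argument lets a printf-style call emit only the first part of a string without copying or modifying it. The example writes to a buffer with snprintf so that the result can be inspected; the patch would presumably print to a stream instead. The name print_prefix and the choice of stop character are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical helper: format NAME up to, but not including, the first
// occurrence of STOP (or all of NAME if STOP is absent) into OUT.
// Returns what snprintf returns.
int
print_prefix (char *out, std::size_t out_size, const char *name, char stop)
{
  const char *end = std::strchr (name, stop);
  std::size_t len = end ? static_cast<std::size_t> (end - name)
			: std::strlen (name);
  return std::snprintf (out, out_size, "%.*s", static_cast<int> (len), name);
}
```

Because the precision is passed as an int, a length that does not fit in int would need separate handling, which this sketch leaves out.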




+#if defined(_GLIBCXX_DEBUG_BACKTRACE)
+# if !defined(BACKTRACE_SUPPORTED)
+#  if defined(__has_include) && !__has_include()
+#   error No libbacktrace backtrace-supported.h file found.
+#  endif
+#  include 
+# endif
+# if !BACKTRACE_SUPPORTED
+#  error libbacktrace not supported.
+# endif
+# include 
+#else
+# include  // For uintptr_t.


Please use  and std::uintptr_t.


I did so but then realized that to do so I had to be in C++11 mode. I 
used tr1/cstdint in pre-C++11 mode.





+// Extracted from libbacktrace.
+typedef void (*backtrace_error_callback) (void*, const char*, int);
+
+typedef int (*backtrace_full_callback) (void*, uintptr_t, const char*, int,
+    const char*);


These typedefs should use __reserved_names.


+struct backtrace_state;


Although this one can't use a reserved name, unless we're going to
create opaque wrappers around the libbacktrace type. Introducing this
non-reserved name means that defining _GLIBCXX_DEBUG makes the library
non-conforming.

It would be possible to avoid declaring this struct, by making
_M_backtrace_state a void* and creating a wrapper function for
backtrace_create_state, and a weak symbol in the library. I'll have to
think about this more 


My main problem was to be able to respect the ODR even when 
!BACKTRACE_SUPPORTED. To do so I eventually realized that I had to limit
the feature to systems where uintptr_t is available, which I detect thanks
to the _GLIBCXX_USE_C99_STDINT_TR1 macro which is used both in  
and .


If you think it is fine I'll document it.

François

diff --git a/libstdc++-v3/doc/xml/manual/debug_mode.xml b/libstdc++-v3/doc/xml/manual/debug_mode.xml
index 23a5df975a2..680b9d5999d 100644
--- a/libstdc++-v3/doc/xml/manual/debug_mode.xml
+++ b/libstdc++-v3/doc/xml/manual/debug_mode.xml
@@ -162,6 +162,13 @@ which always works correctly.
   GLIBCXX_DEBUG_MESSAGE_LENGTH can be used to request a
   different length.
 
+Starting with GCC 10 libstdc++ is able to use
+  http://www.w3.org/1999/xlink";
+  xlink:href="https://github.com/ianlancetaylor/libbacktrace";>libbacktrace
+  to produce backtraces on error. Use -D_GLIBCXX_DEBUG_BACKTRACE to
+  activate it. Note that if not properly installed or if libbacktrace is not
+  supported, compilation will fail. You'll also have to use
+  -lbacktrace to build your application.
 
 
 Using a Specific Debug Container
diff --git a/libstdc++-v3/doc/xml/manual/using.xml b/libstdc++-v3/doc/xml/manual/using.xml
index d7fbfe9584d..5769722192c 100644
--- a/libstdc++-v3/doc/xml/manual/using.xml
+++ b/libstdc++-v3/doc/xml/manual/using.xml
@@ -1128,6 +1128,16 @@ g++ -Winvalid-pch -I. -include stdc++.h -H -g -O2 hello.cc -o test.exe
 	extensions and libstdc++-specific behavior into errors.
   
 
+_GLIBCXX_DEBUG_BACKTRACE
+
+  
+	Undefined by default. Considered only if _GLIBCXX_DEBUG
+	is defined. When defined, checks for http://www.w3.org/1999/xlink";
+	xlink:href="https://github.com/ianlancetaylor/libbacktrace";>libbacktrace
+	support and use it to display backtraces on
+	debug mode assertions.
+  
+
 _GLIBCXX_PARALLEL
 
   Undefined by default. When defined, compiles user code
@@ -1634,6 +1644,17 @@ A quick read of the relevant part of the GCC
   header will remain compatible between different GCC releases.
 
 
+
+External Libraries
+
+
+  GCC 10 debug mode is able
+  produce backtraces thanks to http://www.w3.org/1999/xlink";
+  xlink:href="https://github.com/ianlancetaylor/libbacktrace";>libbacktrace.
+  To use the library you should define _GLIBCXX_DEBUG_BACKTRACE
+  and link with -lbacktrace.
+
+
   
 
   Concurrency
diff --git a/libstdc++-v3/include/debug/formatter.h b/libstdc++-v3/include/debug/formatter.h
index 220379994c0..9e5962a4744 100644
--- a/libstdc++-v3/include/debug/formatter.h
+++ b/libstdc++-v3/include/debug/formatter.h
@@ -31,6 +31,38 @@
 
 #include 
 
+#if defined(_GLIBCXX_DEBUG_BACKTRACE)
+# if !defined(BACKTRACE_SUPPORTED)
+#  include 
+# endif
+#endif
+
+#if BACKTRACE_SUPPORTED
+# define _GLIBCXX_DEBUG_USE_BACKTRACE 1
+# include 
+typedef backtrace_error_callback __backtrace_error_cb;
+typedef backtrace_full_callback __backtrace_full_cb;
+typedef backtrace_state __backtrace_state;
+#elif defined (_GLIBCXX_USE_C99_STDINT_TR1)
+# define _GLIBCXX_DEBUG_USE_BACKTRACE 1
+
+# if __cplusplus >= 201103L
+#  include  // For std::uintptr_t.
+typedef int (*__backtrace_full_cb) (void*, std::uintptr_t, const char*,
+int, const char*);
+# else
+#  include 
+t

[PATCH] Fix up g++.dg/cpp*/feat-cxx*.C tests (PR testsuite/90772, take 2, was: [PATCH] Avoid unnecessary inclusion of header)

2019-06-06 Thread Jakub Jelinek
On Thu, Jun 06, 2019 at 10:19:52PM +0200, Jakub Jelinek wrote:
> Not only those, but apparently also ::size_t or std::plus; this broke
> FAIL: g++.dg/cpp1y/feat-cxx14.C   (test for excess errors)
> FAIL: g++.dg/cpp1z/feat-cxx1z.C  -std=gnu++17 (test for excess errors)
> FAIL: g++.dg/cpp1z/pr85569.C  -std=c++17 (test for excess errors)
> FAIL: g++.dg/cpp2a/feat-cxx2a.C   (test for excess errors)
> 
> Fixed thusly, tested on x86_64-linux with check-c++-all, ok for trunk?

Bill has mentioned two other tests that are failing for similar reasons, so
here is an updated patch to cover those two too.  Tested again on
x86_64-linux with check-c++-all, ok for trunk?
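
For readers following along, a minimal sketch of the kind of change the ChangeLog below describes: qualify size_t as std::size_t and include the header that declares it, rather than relying on another header to drag it in. The alias name `array_alias` and the helper `alias_size` are made up for illustration and are not taken from the tests.

```cpp
#include <array>
#include <cassert>
#include <cstddef>  // declares std::size_t; do not rely on <array> for ::size_t

// Alias template in the style of the feat-cxx*.C tests, with the
// non-type parameter spelled std::size_t.
template<typename Tp, std::size_t Num>
  using array_alias = std::array<Tp, Num>;

// Small helper so the alias is actually instantiated.
inline std::size_t
alias_size()
{ return array_alias<int, 3>().size(); }
```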

2019-06-06  Jakub Jelinek  

PR testsuite/90772
* g++.dg/cpp1y/feat-cxx14.C: Use std::size_t instead of size_t.
* g++.dg/cpp1z/feat-cxx1z.C: Likewise.
* g++.dg/cpp2a/feat-cxx2a.C: Likewise.
* g++.dg/cpp1z/pr85569.C: Include .
* g++.dg/tree-ssa/pr80293.C: Include .
* g++.dg/tree-ssa/pr69336.C: Include .

--- gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C.jj  2017-09-12 09:35:46.955698280 
+0200
+++ gcc/testsuite/g++.dg/cpp1y/feat-cxx14.C 2019-06-06 22:08:59.853528085 
+0200
@@ -303,11 +303,11 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
--- gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C.jj  2019-01-16 09:35:07.360276820 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/feat-cxx1z.C 2019-06-06 22:09:21.571184644 
+0200
@@ -292,12 +292,12 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
 
--- gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C.jj  2019-01-16 09:35:07.800269539 
+0100
+++ gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C 2019-06-06 22:09:37.432934738 
+0200
@@ -291,12 +291,12 @@
 #if __has_include()
 #  define STD_ARRAY 1
 #  include 
-  template
+  template
 using array = std::array<_Tp, _Num>;
 #elif __has_include()
 #  define TR1_ARRAY 1
 #  include 
-  template
+  template
 typedef std::tr1::array<_Tp, _Num> array;
 #endif
 
--- gcc/testsuite/g++.dg/cpp1z/pr85569.C.jj 2018-12-05 09:16:42.870128432 
+0100
+++ gcc/testsuite/g++.dg/cpp1z/pr85569.C2019-06-06 22:15:46.462120520 
+0200
@@ -2,6 +2,7 @@
 
 #include 
 #include 
+#include 
 
 #define LIFT_FWD(x) std::forward(x)
 
--- gcc/testsuite/g++.dg/tree-ssa/pr80293.C.jj  2017-04-24 19:28:04.179949309 
+0200
+++ gcc/testsuite/g++.dg/tree-ssa/pr80293.C 2019-06-06 22:31:53.851868545 
+0200
@@ -2,6 +2,7 @@
 // { dg-options "-O2 -std=gnu++11 -fdump-tree-optimized" } */
 
 #include 
+#include 
 
 // Return a copy of the underlying memory of an arbitrary value.
 template <
--- gcc/testsuite/g++.dg/tree-ssa/pr69336.C.jj  2016-01-25 22:33:17.211951114 
+0100
+++ gcc/testsuite/g++.dg/tree-ssa/pr69336.C 2019-06-06 22:30:33.808132272 
+0200
@@ -3,6 +3,7 @@
 
 #include 
 #include 
+#include 
 
 
 template struct static_map


Jakub


Remove -fodr-type-merging

2019-06-06 Thread Jan Hubicka
Hi,
at the time of introducing ODR type merging via mangled type names I was
concerned about the streaming overhead of this infrastructure and
implemented a flag to disable it.  Maintaining this flag has become harder
over time, since we really want to establish ODR-based type equivalences,
and type streaming is now relatively cheap after the type
simplification and merging improvements.

This patch thus drops the flag. I plan to commit it tomorrow if there
are no complaints.

Honza

* common.opt (flto-odr-type-merging): Ignore.
* invoke.texi (-flto-odr-type-merging): Remove.
* ipa-devirt.c (odr_vtable_hasher:odr_name_hasher): Remove.
(can_be_vtable_hashed_p): Remove.
(hash_odr_vtable): Remove.
(odr_vtable_hasher::hash): Remove.
(types_same_for_odr): Remove.
(types_odr_comparable): Remove.
(odr_vtable_hasher::equal): Remove.
(odr_vtable_hash_type, odr_vtable_hash): Remove.
(add_type_duplicate): Do not synchronize vtable and name hashtables.
(get_odr_type): Do not use vtable hash.
(dump_odr_type): Remove commented out code.
(build_type_inheritance_graph): Do not allocate vtable hash.
(rebuild_type_inheritance_graph): Do not delete vtable hash.
* ipa-utils.h (type_with_linkage_p): Drop vtable hash path.
(odr_type_p): Likewise.
* tree.c (need_assembler_name_p): Remove flag_lto_odr_type_mering
test.
Index: common.opt
===
--- common.opt  (revision 272018)
+++ common.opt  (working copy)
@@ -1888,8 +1888,8 @@ Common Joined RejectNegative UInteger Va
 -flto-compression-level=   Use zlib compression level  for 
IL.
 
 flto-odr-type-merging
-Common Report Var(flag_lto_odr_type_mering) Init(1)
-Merge C++ types using One Definition Rule.
+Common Ignore
+Does nothing.  Preserved for backward compatibility.
 
 flto-report
 Common Report Var(flag_lto_report) Init(0)
Index: doc/invoke.texi
===
--- doc/invoke.texi (revision 272018)
+++ doc/invoke.texi (working copy)
@@ -7357,7 +7357,7 @@ Do not warn about compile-time overflow
 @opindex Wno-odr
 @opindex Wodr
 Warn about One Definition Rule violations during link-time optimization.
-Requires @option{-flto-odr-type-merging} to be enabled.  Enabled by default.
+Enabled by default.
 
 @item -Wopenmp-simd
 @opindex Wopenmp-simd
@@ -10353,12 +10353,6 @@ The value @samp{one} specifies that exac
 used while the value @samp{none} bypasses partitioning and executes
 the link-time optimization step directly from the WPA phase.
 
-@item -flto-odr-type-merging
-@opindex flto-odr-type-merging
-Enable streaming of mangled types names of C++ types and their unification
-at link time.  This increases size of LTO object files, but enables
-diagnostics about One Definition Rule violations.
-
 @item -flto-compression-level=@var{n}
 @opindex flto-compression-level
 This option specifies the level of compression used for intermediate
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 272018)
+++ ipa-devirt.c(working copy)
@@ -284,16 +284,6 @@ struct odr_name_hasher : pointer_hash type);
 }
 
-static bool
-can_be_vtable_hashed_p (tree t)
-{
-  /* vtable hashing can distinguish only main variants.  */
-  if (TYPE_MAIN_VARIANT (t) != t)
-return false;
-  /* Anonymous namespace types are always handled by name hash.  */
-  if (type_with_linkage_p (t) && type_in_anonymous_namespace_p (t))
-return false;
-  return (TREE_CODE (t) == RECORD_TYPE
- && TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)));
-}
-
-/* Hash type by assembler name of its vtable.  */
-
-static hashval_t
-hash_odr_vtable (const_tree t)
-{
-  tree v = BINFO_VTABLE (TYPE_BINFO (TYPE_MAIN_VARIANT (t)));
-  inchash::hash hstate;
-
-  gcc_checking_assert (in_lto_p);
-  gcc_checking_assert (!type_in_anonymous_namespace_p (t));
-  gcc_checking_assert (TREE_CODE (t) == RECORD_TYPE
-  && TYPE_BINFO (t) && BINFO_VTABLE (TYPE_BINFO (t)));
-  gcc_checking_assert (TYPE_MAIN_VARIANT (t) == t);
-
-  if (TREE_CODE (v) == POINTER_PLUS_EXPR)
-{
-  add_expr (TREE_OPERAND (v, 1), hstate);
-  v = TREE_OPERAND (TREE_OPERAND (v, 0), 0);
-}
-
-  hstate.add_hwi (IDENTIFIER_HASH_VALUE (DECL_ASSEMBLER_NAME (v)));
-  return hstate.end ();
-}
-
-/* Return the computed hashcode for ODR_TYPE.  */
-
-inline hashval_t
-odr_vtable_hasher::hash (const odr_type_d *odr_type)
-{
-  return hash_odr_vtable (odr_type->type);
-}
-
 /* For languages with One Definition Rule, work out if
types are the same based on their name.
 
@@ -404,60 +349,6 @@ types_same_for_odr (const_tree type1, co
   || (type_with_linkage_p (type2) && type_in_anonymous_namespace_p 
(type2)))
 return false;
 
-
-  /* ODR name of the type is set in DECL_ASSEMBLER_NAME of its TY

Re: libbacktrace integration for _GLIBCXX_DEBUG mode

2019-06-06 Thread Jonathan Wakely

On 06/06/19 22:33 +0200, François Dumont wrote:

Here is what I come up with.


Regarding allocation in print_function I would also prefer to avoid 
it. But this patch also aims at creating a backtrace_state object in 
case of UB so the alloc is perhaps not so important. I can't use 
string_view as I need to modify it to display only a part of it 


I was only referring to these strings, which allocated memory on every
call to print_function, but you don't modify:

+const string cxx1998 = "__cxx1998::";
+const string allocator = ", std::allocator<";
+const string safe_iterator = "__gnu_debug::_Safe_iterator<";

I see you've changed them now though.

through fsprintf. I could try to use "%.*s" however. I also haven't 
considered your remark about template parameters containing '<' yet.
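
A small sketch of the two points above, not the code from the patch: the prefix lives in a static char array, so nothing is allocated per call, and "%.*s" prints only part of a string without modifying or copying it first. The function name `format_trimmed` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Static storage: unlike a function-local std::string, this costs
// no allocation on each call.
static const char cxx1998_prefix[] = "__cxx1998::";

// Write NAME into OUT, skipping a leading "__cxx1998::" and showing at
// most MAX_LEN characters.  NAME itself is never modified.
inline int
format_trimmed(char* out, std::size_t out_size, const char* name, int max_len)
{
  const std::size_t plen = sizeof(cxx1998_prefix) - 1;
  if (std::strncmp(name, cxx1998_prefix, plen) == 0)
    name += plen;
  // The precision given through '*' limits how many characters are printed.
  return std::snprintf(out, out_size, "%.*s", max_len, name);
}
```

The same format string works with fprintf when writing straight to a stream.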




+#if defined(_GLIBCXX_DEBUG_BACKTRACE)
+# if !defined(BACKTRACE_SUPPORTED)
+#  if defined(__has_include) && !__has_include()
+#   error No libbacktrace backtrace-supported.h file found.
+#  endif
+#  include 
+# endif
+# if !BACKTRACE_SUPPORTED
+#  error libbacktrace not supported.
+# endif
+# include 
+#else
+# include  // For uintptr_t.


Please use  and std::uintptr_t.


I did so but then realized that to do so I had to be in C++11 mode. I 
used tr1/cstdint in pre-C++11 mode.


Ugh, right, of course, sorry.

Then I guess  is better than relying on TR1 (even though
 isn't technically part of C++98 either).
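
One way the pre-C++11 versus C++11 split discussed above could look. The header names are lost in the archived text, so using <cstdint> for C++11 and <stdint.h> otherwise is an assumption here, and `address_type` and `to_address_value` are illustrative names only.

```cpp
#include <cassert>

#if __cplusplus >= 201103L
# include <cstdint>   // C++11 and later: std::uintptr_t
typedef std::uintptr_t address_type;
#else
# include <stdint.h>  // earlier modes: the C99 header, where the target has it
typedef uintptr_t address_type;
#endif

// Convert a pointer to its integer representation.
inline address_type
to_address_value(const void* p)
{ return reinterpret_cast<address_type>(p); }
```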



Re: [PATCH] handle vla plus offset in strlen (PR 90662)

2019-06-06 Thread Jeff Law
On 6/5/19 4:51 PM, Martin Sebor wrote:
> One of my new tests for the strlen/sprintf integration tripped
> over an incomplete handling of VLAs by the strlen pass.  Where
> it can determine the length of a substring at some offset with
> other kinds of arrays, the pass fails with VLAs because they
> are represented as pointers to arrays.
> 
> The attached patch adds the missing handling so that code like
> the following can be fully folded even for VLAs.
> 
>   int f (int n)
>   {
> char a[n];
> strcpy (a, "12345");
> if (strlen (&a[2]) != 3)
>   abort ();
>   }
> 
> Tested on x86_64-linux.
> 
> Martin
> 
> gcc-90662.diff
> 
> PR tree-optimization/90662 - strlen of a string in a vla plus offset not 
> folded
> 
> gcc/ChangeLog:
> 
>   PR tree-optimization/90662
>   * tree-ssa-strlen.c (get_stridx): Handle simple VLAs and pointers
>   to arrays.
> 
> gcc/testsuite/ChangeLog:
> 
>   PR tree-optimization/90662
>   * gcc.dg/strlenopt-62.c: New test.
>   * gcc.dg/strlenopt-63.c: New test.
We're relying on the fact that in a MEM_REF the offset is always a byte
offset, so no scaling for wchars needed, right?

OK for the trunk.
Thanks,
Jeff



Re: undefined behavior in value_range::equiv_add()?

2019-06-06 Thread Aldy Hernandez

Meanwhile I have bootstrapped / tested the following which does the VARYING
thing.

Applied to trunk.  I think we need to backport this since this is a latent
wrong-code issue.  We can see to improve things on the trunk incrementally.


Folks, thanks so much for taking care of this.

After Richard's patch, my value_range_base::intersect patch no longer 
fails on vrp47, and no longer requires a special-case for undefined.


The attached patch splits out the intersect code into a value_range_base 
version, as we have for union_.
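
To illustrate the shape of the refactoring (a helper that returns the intersection by value, and a member that assigns the result to *this), here is a toy interval class. It is only a sketch: `toy_range` and its fields are invented, ranges are plain closed intervals of long, and none of the anti-range or equivalence handling in tree-vrp.c is modelled.

```cpp
#include <algorithm>
#include <cassert>

enum toy_kind { TOY_UNDEFINED, TOY_RANGE, TOY_VARYING };

struct toy_range
{
  toy_kind kind;
  long min, max;

  // Return the intersection of *A and *B without modifying either.
  static toy_range
  intersect_helper(const toy_range* a, const toy_range* b)
  {
    // If either range is varying the other one wins.
    if (b->kind == TOY_VARYING)
      return *a;
    if (a->kind == TOY_VARYING)
      return *b;
    // If either range is undefined the result is undefined.
    if (a->kind == TOY_UNDEFINED)
      return *a;
    if (b->kind == TOY_UNDEFINED)
      return *b;
    long lo = std::max(a->min, b->min);
    long hi = std::min(a->max, b->max);
    toy_range r = { lo <= hi ? TOY_RANGE : TOY_UNDEFINED, lo, hi };
    return r;
  }

  // The in-place operation becomes a thin wrapper around the helper.
  void
  intersect(const toy_range* other)
  { *this = intersect_helper(this, other); }
};
```

A derived class that carries extra state (equivalences, in the real code) can then call the same helper and handle its own state separately.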


OK?

Aldy
gcc/

	* tree-vrp.h (value_range_base::intersect): New.
	(value_range::intersect_helper): Move from here...
	(value_range_base::intersect_helper): ...to here.
	* tree-vrp.c (value_range::intersect_helper): Rename to...
	(value_range_base::intersect_helper): ...this, and rewrite to
	return a value instead of modifying THIS in place.
	Also, move equivalence handling...
	(value_range::intersect): ...here, while calling intersect_helper.
	* gimple-fold.c (size_must_be_zero_p): Use value_range_base when
	calling intersect.
	* gimple-ssa-evrp-analyze.c (record_ranges_from_incoming_edge):
	Same.
	* vr-values.c (vrp_evaluate_conditional_warnv_with_ops): Same.

diff --git a/gcc/gimple-fold.c b/gcc/gimple-fold.c
index b3e931744f8..8b8331eb555 100644
--- a/gcc/gimple-fold.c
+++ b/gcc/gimple-fold.c
@@ -684,10 +684,10 @@ size_must_be_zero_p (tree size)
   /* Compute the value of SSIZE_MAX, the largest positive value that
  can be stored in ssize_t, the signed counterpart of size_t.  */
   wide_int ssize_max = wi::lshift (wi::one (prec), prec - 1) - 1;
-  value_range valid_range (VR_RANGE,
-			   build_int_cst (type, 0),
-			   wide_int_to_tree (type, ssize_max));
-  value_range vr;
+  value_range_base valid_range (VR_RANGE,
+build_int_cst (type, 0),
+wide_int_to_tree (type, ssize_max));
+  value_range_base vr;
   get_range_info (size, vr);
   vr.intersect (&valid_range);
   return vr.zero_p ();
diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
index bb4e2d6e798..4c68af847e1 100644
--- a/gcc/gimple-ssa-evrp-analyze.c
+++ b/gcc/gimple-ssa-evrp-analyze.c
@@ -210,9 +210,10 @@ evrp_range_analyzer::record_ranges_from_incoming_edge (basic_block bb)
 	 getting first [64, +INF] and then ~[0, 0] from
 		 conditions like (s & 0x3cc0) == 0).  */
 	  value_range *old_vr = get_value_range (vrs[i].first);
-	  value_range tem (old_vr->kind (), old_vr->min (), old_vr->max ());
+	  value_range_base tem (old_vr->kind (), old_vr->min (),
+old_vr->max ());
 	  tem.intersect (vrs[i].second);
-	  if (tem.equal_p (*old_vr, /*ignore_equivs=*/true))
+	  if (tem.equal_p (*old_vr))
 		continue;
 	  push_value_range (vrs[i].first, vrs[i].second);
 	  if (is_fallthru
diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
index fdda64c30d5..d94de2b22ee 100644
--- a/gcc/tree-vrp.c
+++ b/gcc/tree-vrp.c
@@ -6020,30 +6020,26 @@ intersect_ranges (enum value_range_kind *vr0type,
 }
 
 
-/* Intersect the two value-ranges *VR0 and *VR1 and store the result
-   in *VR0.  This may not be the smallest possible such range.  */
+/* Helper for the intersection operation for value ranges.  Given two
+   value ranges VR0 and VR1, return the intersection of the two
+   ranges.  This may not be the smallest possible such range.  */
 
-void
-value_range::intersect_helper (value_range *vr0, const value_range *vr1)
+value_range_base
+value_range_base::intersect_helper (const value_range_base *vr0,
+const value_range_base *vr1)
 {
   /* If either range is VR_VARYING the other one wins.  */
   if (vr1->varying_p ())
-return;
+return *vr0;
   if (vr0->varying_p ())
-{
-  vr0->deep_copy (vr1);
-  return;
-}
+return *vr1;
 
   /* When either range is VR_UNDEFINED the resulting range is
  VR_UNDEFINED, too.  */
   if (vr0->undefined_p ())
-return;
+return *vr0;
   if (vr1->undefined_p ())
-{
-  vr0->set_undefined ();
-  return;
-}
+return *vr1;
 
   value_range_kind vr0type = vr0->kind ();
   tree vr0min = vr0->min ();
@@ -6053,28 +6049,34 @@ value_range::intersect_helper (value_range *vr0, const value_range *vr1)
   /* Make sure to canonicalize the result though as the inversion of a
  VR_RANGE can still be a VR_RANGE.  Work on a temporary so we can
  fall back to vr0 when this turns things to varying.  */
-  value_range tem;
+  value_range_base tem;
   tem.set_and_canonicalize (vr0type, vr0min, vr0max);
   /* If that failed, use the saved original VR0.  */
   if (tem.varying_p ())
-return;
-  vr0->update (tem.kind (), tem.min (), tem.max ());
+return *vr0;
 
-  /* If the result is VR_UNDEFINED there is no need to mess with
- the equivalencies.  */
-  if (vr0->undefined_p ())
-return;
+  return tem;
+}
 
-  /* The resulting set of equivalences for range intersection is the union of
- the two sets.  */
-  if (vr0->m_equiv && vr1->m_equiv && vr0->m_equiv != vr1->m_equiv)
-  

[PATCH] Disable PowerPC pc-relative support until the code is checked in

2019-06-06 Thread Michael Meissner
I will be starting to put out the patches to enable pc-relative support for a
future machine on the PowerPC.

Until those patches are approved and committed, we need to change the defaults
for the target so that pc-relative isn't the default.  Right now, the parts
of the compiler that do calls, etc. are able to generate the appropriate
pc-relative calls, but the addressing modes still use the TOC calls.
Unfortunately, this means the TOC register is not properly set up in the
prologue.

I have tested these patches and they don't generate any test suite regressions.
I have visually inspected the code to make sure the calls are using the TOC abi
instead of the pc-relative.  I did fix the few tests for pc-relative support to
add the necessary flags (-mpcrel) or GCC target pragma options (pcrel and
no-pcrel).  In addition, I made the -mpcrel and -mprefixed-addr options behave
like other options (i.e. if you do -mpcrel, it also enables -mcpu=target and
-mprefixed-addr).
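
The option-implication rule described above can be sketched with plain bit masks. The mask values and the names `MASK_*`, `FUTURE_MASKS_SERVER` and `imply_options` are hypothetical; the real OPTION_MASK_* constants and rs6000_isa_flags come from the rs6000 option machinery and are not reproduced here.

```cpp
#include <cassert>

// Hypothetical bit assignments, for illustration only.
const unsigned MASK_ISA_3_0       = 1u << 0;
const unsigned MASK_FUTURE        = 1u << 1;
const unsigned MASK_PREFIXED_ADDR = 1u << 2;
const unsigned MASK_PCREL         = 1u << 3;

// After the patch, the "future" server set no longer turns on
// pc-relative or prefixed addressing by itself.
const unsigned FUTURE_MASKS_SERVER = MASK_ISA_3_0 | MASK_FUTURE;

// -mpcrel implies the future ISA bits plus prefixed addressing;
// -mprefixed-addr implies just the future ISA bits.  Bits in
// IGNORE_MASKS (explicitly disabled by the user) are never set.
inline unsigned
imply_options(unsigned flags, unsigned ignore_masks)
{
  if (flags & MASK_PCREL)
    flags |= (FUTURE_MASKS_SERVER | MASK_PREFIXED_ADDR) & ~ignore_masks;
  else if (flags & MASK_PREFIXED_ADDR)
    flags |= FUTURE_MASKS_SERVER & ~ignore_masks;
  return flags;
}
```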

Can I check these patches into the trunk?

[gcc]
2019-06-06  Michael Meissner  

* config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Delete
enabling -mprefixed-addr and -mpcrel by default.
* config/rs6000/rs6000.c (rs6000_option_override_internal): Make
-mpcrel and -mprefixed-addr act like other swtiches (i.e. using
-mpcrel automatically sets -mcpu=future and -mprefixed-addr, and
-mprefixed-addr automatically sets just -mcpu=future).  Clean up
some thinkos in reporting errors for -mpcrel and -mprefixed-addr.

[gcc/testsuite]
2019-06-06  Michael Meissner  

* gcc.target/powerpc/localentry-1.c: Add -mpcrel option.
* gcc.target/powerpc/localentry-detect-1.c: Explicitly set and
unset -mpcrel in the target pragmas.
* gcc.target/powerpc/notoc-direct-1.c: Add -mpcrel option.
* gcc.target/powerpc/pcrel-sibcall-1.c: Explicitly set and
unset -mpcrel in the target pragmas.

Index: gcc/config/rs6000/rs6000-cpus.def
===
--- gcc/config/rs6000/rs6000-cpus.def   (revision 272004)
+++ gcc/config/rs6000/rs6000-cpus.def   (working copy)
@@ -77,9 +77,7 @@
 
 /* Support for a future processor's features.  */
 #define ISA_FUTURE_MASKS_SERVER(ISA_3_0_MASKS_SERVER   
\
-| OPTION_MASK_FUTURE   \
-| OPTION_MASK_PCREL\
-| OPTION_MASK_PREFIXED_ADDR)
+| OPTION_MASK_FUTURE)  \
 
 /* Flags that need to be turned off if -mno-future.  */
 #define OTHER_FUTURE_MASKS (OPTION_MASK_PCREL  \
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 272004)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -3896,8 +3896,15 @@ rs6000_option_override_internal (bool gl
   ignore_masks = rs6000_disable_incompatible_switches ();
 
   /* For the newer switches (vsx, dfp, etc.) set some of the older options,
- unless the user explicitly used the -mno- to disable the code.  */
-  if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_MISC)
+ unless the user explicitly used the -mno- to disable the code.  At
+ present -mfuture does not enable prefixed address or pc-relative support,
+ so if those are specified, enable the necessary additional bits.  */
+  if (TARGET_PCREL)
+rs6000_isa_flags |= ((ISA_FUTURE_MASKS_SERVER
+ | OPTION_MASK_PREFIXED_ADDR) & ~ignore_masks);
+  else if (TARGET_PREFIXED_ADDR)
+rs6000_isa_flags |= (ISA_FUTURE_MASKS_SERVER & ~ignore_masks);
+  else if (TARGET_P9_VECTOR || TARGET_MODULO || TARGET_P9_MISC)
 rs6000_isa_flags |= (ISA_3_0_MASKS_SERVER & ~ignore_masks);
   else if (TARGET_P9_MINMAX)
 {
@@ -4248,7 +4255,7 @@ rs6000_option_override_internal (bool gl
   /* -mpcrel requires prefixed load/store addressing.  */
   if (TARGET_PCREL && !TARGET_PREFIXED_ADDR)
 {
-  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_PREFIXED_ADDR) != 0)
error ("%qs requires %qs", "-mpcrel", "-mprefixed-addr");
 
   rs6000_isa_flags &= ~OPTION_MASK_PCREL;
@@ -4257,7 +4264,7 @@ rs6000_option_override_internal (bool gl
   /* -mprefixed-addr (and hence -mpcrel) requires -mcpu=future.  */
   if (TARGET_PREFIXED_ADDR && !TARGET_FUTURE)
 {
-  if ((rs6000_isa_flags_explicit & OPTION_MASK_PCREL) != 0)
+  if ((rs6000_isa_flags_explicit & OPTION_MASK_FUTURE) != 0)
error ("%qs requires %qs", "-mprefixed-addr", "-mcpu=future");
 
   rs6000_isa_flags &= ~(OPTION_MASK_PCREL | OPTION_MASK_PREFIXED_ADDR);
Index: gcc/testsuite/gcc.target/powerpc/localentry-1.c
===
--- gcc/testsuite/gcc.target/powerpc/localen

[PATCH] PR other/90695 reduce testcase to remove library dependency

2019-06-06 Thread Jonathan Wakely

This reproduces the original ICE fixed by r178857 (tested at r178852 and
r178860), without depending on a libstdc++ header that keeps changing.

The number of errors differs between C++14 and C++17 modes, so the fixed
test uses dg-excess-errors to match any number of them. The precise
errors aren't what's being tested for here anyway, the point of the test
is to verify the ICE in PR 50391 is fixed.

PR other/90695
* g++.dg/cpp0x/noexcept15.C: Remove dependency on library header.

Tested x86_64-linux. OK for trunk?


commit 2a63cb3a3500eb4ea3eec8b8761a767ec4637aa1
Author: Jonathan Wakely 
Date:   Thu Jun 6 23:26:52 2019 +0100

PR other/90695 reduce testcase to remove library dependency

This reproduces the original ICE fixed by r178857 (tested at r178852 and
r178860), without depending on a libstdc++ header that keeps changing.

The number of errors differs between C++14 and C++17 modes, so the fixed
test uses dg-excess-errors to match any number of them. The precise
errors aren't what's being tested for here anyway, the point of the test
is to verify the ICE in PR 50391 is fixed.

PR other/90695
* g++.dg/cpp0x/noexcept15.C: Remove dependency on library header.

diff --git a/gcc/testsuite/g++.dg/cpp0x/noexcept15.C 
b/gcc/testsuite/g++.dg/cpp0x/noexcept15.C
index 5cbbea8a91a..6c6eef68915 100644
--- a/gcc/testsuite/g++.dg/cpp0x/noexcept15.C
+++ b/gcc/testsuite/g++.dg/cpp0x/noexcept15.C
@@ -1,12 +1,46 @@
 // PR c++/50391
 // { dg-do compile { target c++11 } }
 
-#include 
+namespace std
+{
+  template
+struct integral_constant
+{ static constexpr T value = Val; };
+
+  template
+struct is_abstract
+: integral_constant
+{ };
+
+  template::value>
+struct is_destructible
+: integral_constant
+{ };
+
+  template
+struct is_destructible
+: integral_constant
+{ };
+
+  template
+struct is_nothrow_move_constructible
+: is_destructible
+{ };
+
+  template
+struct decay
+{ typedef T type; };
+
+  template
+struct decay
+{ typedef T type; };
+
+} // std
 
 template
   struct single
   {
-Tp elem;  // { dg-error "incomplete type" }
+Tp elem;
 
 constexpr single(const Tp& e)
 : elem(e) { }
@@ -30,3 +64,5 @@ foo(Blob *b)
 {
   make_single(*b);
 }
+
+// { dg-excess-errors "incomplete type|not a member" }


Re: [PATCH] Disable PowerPC pc-relative support until the code is checked in

2019-06-06 Thread Segher Boessenkool
Hi Mike,

On Thu, Jun 06, 2019 at 06:42:16PM -0400, Michael Meissner wrote:
> 2019-06-06  Michael Meissner  
> 
>   * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Delete
>   enabling -mprefixed-addr and -mpcrel by default.

Why disable prefixed-addr?

>   * config/rs6000/rs6000.c (rs6000_option_override_internal): Make
>   -mpcrel and -mprefixed-addr act like other swtiches (i.e. using

Typo ("switches").

>   -mpcrel automatically sets -mcpu=future and -mprefixed-addr, and

Automatically setting -mcpu= is a bad thing.  Instead, we should just
error out if someone tries to use -mpcrel with a CPU (or ABI, etc.) that
doesn't support it.  Or, is there any special reason you want it?

In the future, we will not have an -mprefixed-addr option (it will be
always on for CPUs that support it), and I don't see any real reason
to allow disabling pcrel either, but we'll see.


Segher


[PATCH] RISC-V: Move STARTFILE_PREFIX_SPEC into target OS files.

2019-06-06 Thread Jim Wilson
This fixes a problem reported for the RISC-V Haiku OS port, where putting
STARTFILE_PREFIX_SPEC in riscv.h breaks their port because they put
libraries in different directories than the UNIX convention.  The current
definition is definitely needed for Linux.  It should not be needed for rtems
or embedded elf which use cross compilers.  It may or may not be needed for
FreeBSD, I can't easily tell, but it is harmless, and can be removed later if
they don't want it.  So I'm moving the definition from riscv.h into the
freebsd.h and linux.h files.
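
As a reminder of how the macro being moved is put together: it is a run of adjacent string literals that the compiler concatenates into one space-separated list of directories. In the sketch below XLEN_SPEC and ABI_SPEC are replaced by hypothetical, already-expanded values ("64" and "lp64d"); in GCC they are spec strings that the driver expands from -march and -mabi.

```cpp
#include <cassert>
#include <string>

// Hypothetical fixed values standing in for the real spec strings.
#define XLEN_SPEC "64"
#define ABI_SPEC "lp64d"

// Same layout as the definition in the patch.
#define STARTFILE_PREFIX_SPEC \
   "/lib" XLEN_SPEC "/" ABI_SPEC "/ " \
   "/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ " \
   "/lib/ " \
   "/usr/lib/ "

// Return the concatenated search list as one string.
inline std::string
startfile_prefixes()
{ return STARTFILE_PREFIX_SPEC; }
```

With these stand-in values the list reads "/lib64/lp64d/ /usr/lib64/lp64d/ /lib/ /usr/lib/ ", which is the UNIX-style layout that does not suit a port keeping its libraries elsewhere.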

Tested with cross builds and make checks for riscv32-elf and riscv64-linux.
There were no regressions.  Also tested by checking the --print-search-dirs
output for elf, linux, and freebsd cross compiler builds.


Committed.

Jim

gcc/
PR target/89955
* config/riscv/riscv.h (STARTFILE_PREFIX_SPEC): Deleted.
* config/riscv/freebsd.h (STARTFILE_PREFIX_SPEC): Added.
* config/riscv/linux.h (STARTFILE_PREFIX_SPEC): Added.
---
 gcc/config/riscv/freebsd.h | 6 ++
 gcc/config/riscv/linux.h   | 6 ++
 gcc/config/riscv/riscv.h   | 6 --
 3 files changed, 12 insertions(+), 6 deletions(-)

diff --git a/gcc/config/riscv/freebsd.h b/gcc/config/riscv/freebsd.h
index 13d04ccbb47..bc516628285 100644
--- a/gcc/config/riscv/freebsd.h
+++ b/gcc/config/riscv/freebsd.h
@@ -52,3 +52,9 @@ along with GCC; see the file COPYING3.  If not see
 %{rdynamic:-export-dynamic}\
 -dynamic-linker " FBSD_DYNAMIC_LINKER "}   \
 %{static:-static}}"
+
+#define STARTFILE_PREFIX_SPEC  \
+   "/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/lib/ "\
+   "/usr/lib/ "
diff --git a/gcc/config/riscv/linux.h b/gcc/config/riscv/linux.h
index 58dd18b89f3..07ce80a847c 100644
--- a/gcc/config/riscv/linux.h
+++ b/gcc/config/riscv/linux.h
@@ -68,3 +68,9 @@ along with GCC; see the file COPYING3.  If not see
 %{static:-static}}"
 
 #define TARGET_ASM_FILE_END file_end_indicate_exec_stack
+
+#define STARTFILE_PREFIX_SPEC  \
+   "/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
+   "/lib/ "\
+   "/usr/lib/ "
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index 4edd2a60194..8856cee599e 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -893,12 +893,6 @@ extern unsigned riscv_stack_boundary;
   "%{mabi=lp64f:lp64f}" \
   "%{mabi=lp64d:lp64d}" \
 
-#define STARTFILE_PREFIX_SPEC  \
-   "/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
-   "/usr/lib" XLEN_SPEC "/" ABI_SPEC "/ "  \
-   "/lib/ "\
-   "/usr/lib/ "
-
 /* ISA constants needed for code generation.  */
 #define OPCODE_LW0x2003
 #define OPCODE_LD0x3003
-- 
2.17.1



Re: [PATCH] Fix PR90574

2019-06-06 Thread Jeff Law
On 6/6/19 5:43 AM, Richard Biener wrote:
> 
> The following fixes debugging experience (and coverage) for cases
> where CFG construction "optimizes" the CFG by squashing labels
> into the same basic-block, defeating the regular mechanism of
> dropping labels that are not reachable as done by CFG cleanup.
> 
> Writing coverage testcases is easy enough here, guality IIRC
> cannot test whether we stop at exactly a line - after the
> patch gdb when setting a breakpoint on line 4, stops at
> line 10 (stepping goes from 2 to 10 directly).
> 
> Bootstrap & regtest running on x86_64-unknown-linux-gnu.
> 
> The patch has no bad effects on code generation when optimizing,
> we just produce more "garbage" CFG upfront to leave optimizing
> the CFG to machinery that knows how to do it correctly.  It
> does have code-generation effects when not optimizing where
> for the first testcase instead of
> 
> nop
> .L2:
> cmpl$1, -4(%rbp)
> jne .L3
> 
> we now emit
> 
> cmpl$0, -4(%rbp)
> .L3:
> cmpl$1, -4(%rbp)
> jne .L4
> 
> and CFG cleanup done after RTL expansion elides the jump
> but not the compare.
> 
> Any objections?
> 
> Thanks,
> Richard.
> 
> 2019-06-06  Richard Biener  
> 
>   PR debug/90574
>   * tree-cfg.c (stmt_starts_bb_p): Split blocks at labels
>   that appear after user labels.
> 
>   * gcc.misc-tests/gcov-pr90574-1.c: New testcase.
>   * gcc.misc-tests/gcov-pr90574-2.c: Likewise.
No objections from me.
jeff


Go patch committed: Permit inlining temporary statements and references

2019-06-06 Thread Ian Lance Taylor
This patch to the Go frontend permits inlining functions that use
temporary statements and references.  This increases the number of
inlinable functions from 439 to 455.  An example is math/bits.Mul32,
which uses temporaries to handle the tuple assignment.  Bootstrapped
and ran Go testsuite on x86_64-pc-linux-gnu.  Committed to mainline.
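
The bookkeeping the patch adds to Export_function_body can be sketched in isolation: each temporary gets the next free index when it is recorded, and references are later spelled "$t" followed by that index. `Temp_stmt` and `Temporary_indexer` are made-up stand-ins, std::unordered_map replaces the frontend's Unordered_map macro, and the error reporting for too many temporaries is omitted.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <unordered_map>
#include <utility>

struct Temp_stmt { };  // hypothetical stand-in for Temporary_statement

class Temporary_indexer
{
 public:
  Temporary_indexer()
  : next_index_(0), indexes_()
  { }

  // Assign the next index to TEMP.  Each temporary is recorded once.
  unsigned int
  record(const Temp_stmt* temp)
  {
    unsigned int ret = this->next_index_++;
    bool inserted = this->indexes_.insert(std::make_pair(temp, ret)).second;
    assert(inserted);
    (void) inserted;
    return ret;
  }

  // Return the index of a previously recorded temporary.
  unsigned int
  index(const Temp_stmt* temp) const
  {
    std::unordered_map<const Temp_stmt*, unsigned int>::const_iterator p =
      this->indexes_.find(temp);
    assert(p != this->indexes_.end());
    return p->second;
  }

  // Spell a reference to TEMP as "$t" plus its index.
  std::string
  reference(const Temp_stmt* temp) const
  {
    char buf[50];
    std::snprintf(buf, sizeof buf, "$t%u", this->index(temp));
    return buf;
  }

 private:
  unsigned int next_index_;
  std::unordered_map<const Temp_stmt*, unsigned int> indexes_;
};
```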

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 271983)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-bc7374913367fba9b10dc284af87eb539fb6c5b2
+015785baa74629baafe520367b9c71707366c6eb
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/export.cc
===
--- gcc/go/gofrontend/export.cc (revision 271976)
+++ gcc/go/gofrontend/export.cc (working copy)
@@ -6,15 +6,14 @@
 
 #include "go-system.h"
 
-#include "go-sha1.h"
 #include "go-c.h"
-
+#include "go-diagnostics.h"
+#include "go-sha1.h"
 #include "gogo.h"
 #include "types.h"
 #include "expressions.h"
 #include "statements.h"
 #include "export.h"
-
 #include "go-linemap.h"
 #include "backend.h"
 
@@ -1297,3 +1296,33 @@ Stream_to_section::do_write(const char*
 {
   this->backend_->write_export_data (bytes, length);
 }
+
+// Class Export_function_body.
+
+// Record a temporary statement.
+
+unsigned int
+Export_function_body::record_temporary(const Temporary_statement* temp)
+{
+  unsigned int ret = this->next_temporary_index_;
+  if (ret > 0x7fff)
+go_error_at(temp->location(),
+   "too many temporary statements in export data");
+  ++this->next_temporary_index_;
+  std::pair val(temp, ret);
+  std::pair ins = this->temporary_indexes_.insert(val);
+  go_assert(ins.second);
+  return ret;
+}
+
+// Return the index of a temporary statement.
+
+unsigned int
+Export_function_body::temporary_index(const Temporary_statement* temp)
+{
+  Unordered_map(const Temporary_statement*, unsigned int)::const_iterator p =
+this->temporary_indexes_.find(temp);
+  go_assert(p != this->temporary_indexes_.end());
+  return p->second;
+}
Index: gcc/go/gofrontend/export.h
===
--- gcc/go/gofrontend/export.h  (revision 271891)
+++ gcc/go/gofrontend/export.h  (working copy)
@@ -20,6 +20,7 @@ class Type;
 class Package;
 class Import_init_set;
 class Backend;
+class Temporary_statement;
 
 // Codes used for the builtin types.  These are all negative to make
 // them easily distinct from the codes assigned by Export::write_type.
@@ -307,7 +308,8 @@ class Export_function_body : public Stri
 {
  public:
   Export_function_body(Export* exp, int indent)
-: exp_(exp), type_context_(NULL), indent_(indent)
+: exp_(exp), body_(), type_context_(NULL), next_temporary_index_(0),
+  temporary_indexes_(), indent_(indent)
   { }
 
   // Write a character to the body.
@@ -363,6 +365,14 @@ class Export_function_body : public Stri
   package_index(const Package* p) const
   { return this->exp_->package_index(p); }
 
+  // Record a temporary statement and return its index.
+  unsigned int
+  record_temporary(const Temporary_statement*);
+
+  // Return the index of a temporary statement.
+  unsigned int
+  temporary_index(const Temporary_statement*);
+
   // Return a reference to the completed body.
   const std::string&
   body() const
@@ -375,6 +385,10 @@ class Export_function_body : public Stri
   std::string body_;
   // Current type context.  Used to avoid duplicate type conversions.
   Type* type_context_;
+  // Index to give to next temporary statement.
+  unsigned int next_temporary_index_;
+  // Map temporary statements to indexes.
+  Unordered_map(const Temporary_statement*, unsigned int) temporary_indexes_;
   // Current indentation level: the number of spaces before each statement.
   int indent_;
 };
Index: gcc/go/gofrontend/expressions.cc
===
--- gcc/go/gofrontend/expressions.cc(revision 271983)
+++ gcc/go/gofrontend/expressions.cc(working copy)
@@ -1025,6 +1025,57 @@ Temporary_reference_expression::do_addre
   this->statement_->set_is_address_taken();
 }
 
+// Export a reference to a temporary.
+
+void
+Temporary_reference_expression::do_export(Export_function_body* efb) const
+{
+  unsigned int idx = efb->temporary_index(this->statement_);
+  char buf[50];
+  snprintf(buf, sizeof buf, "$t%u", idx);
+  efb->write_c_string(buf);
+}
+
+// Import a reference to a temporary.
+
+Expression*
+Temporary_reference_expression::do_import(Import_function_body* ifb,
+ Location loc)
+{
+  std::string id = ifb->read_identifier();
+  go_assert(id[0] == '$' && id[1] == 't');
+  const char *p = id.c_str();
+  char *end;
+  long idx = strtol(p + 2, &end, 10);
+  if (*end != '\0' || idx > 0x7fff)
+
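The export and import halves of the patch above agree on a "$t<index>" spelling for references to temporaries, with 0x7fff as the largest accepted index. A minimal round-trip sketch of that encoding in plain C follows. The names format_temp_ref, parse_temp_ref and MAX_TEMP_INDEX are hypothetical and are not part of the gofrontend sources; the error handling (returning -1) is an assumption made for illustration, since the real importer reports errors through its own diagnostics.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical limit mirroring the 0x7fff checks in the patch.  */
#define MAX_TEMP_INDEX 0x7fffL

/* Write "$t<idx>" into BUF.  Return the number of characters written,
   not counting the terminating NUL, or -1 if BUF is too small.  */
int
format_temp_ref (char *buf, size_t size, unsigned int idx)
{
  int n;

  assert (buf != NULL);
  n = snprintf (buf, size, "$t%u", idx);
  if (n < 0 || (size_t) n >= size)
    return -1;
  return n;
}

/* Parse "$t<idx>".  Return the index, or -1 if ID is malformed or the
   index is larger than MAX_TEMP_INDEX.  */
long
parse_temp_ref (const char *id)
{
  char *end;
  long idx;

  /* Require the "$t" prefix followed directly by a digit, so that
     strtol cannot skip whitespace or accept a sign.  */
  if (id[0] != '$' || id[1] != 't' || id[2] < '0' || id[2] > '9')
    return -1;
  errno = 0;
  idx = strtol (id + 2, &end, 10);
  if (errno != 0 || *end != '\0' || idx > MAX_TEMP_INDEX)
    return -1;
  return idx;
}
```

Checking for a leading digit before calling strtol is a choice made in this sketch only; it keeps inputs such as "$t-1" or "$t 5" from being accepted.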

Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-06 Thread Joseph Myers
On Wed, 5 Jun 2019, Jason Merrill wrote:

> > I think failing to credit (by name and email address) the person implied
> > by the commit metadata, in the absence of positive evidence (such as a
> > ChangeLog entry) for the change being authored by someone else, is just
> > wrong, in the same way it's wrong not to use --author when committing for
> > someone else in git.
> 
> It's wrong, but it's not importantly wrong.

I think it's importantly wrong not to have a name and email address for 
the committer in the absence of using such information for the author.  
(Whereas if the name or email address refer to the right person but are 
anachronistic for that commit, that's what I'd consider not importantly 
wrong.)

> For email addresses, I think that using @gcc.gnu.org would be the best
> approach for people that have such accounts, rather than an employer address
> from an arbitrary point in time.

I'm fine with use of @gcc.gnu.org (used together with a name for the 
person in question that is or was valid, at or after the time of some 
commit they made) for committers who in fact do have or did have such an 
address (as opposed to inventing such addresses for committers from the 
gcc2 era who never had such addresses, or anyone who only committed in the 
egcs.cygnus.com era and who no longer had an account by the time of the 
move to gcc.gnu.org).

When the commit adds a ChangeLog entry and thus contains contemporaneous 
information about the preferred name and email address for the author at 
that time, I think using that information (via the reposurgeon 
"changelogs" feature) is preferable to a generic author map entry (thus, 
the author map entries should be considered a fallback for those commits 
that didn't add a ChangeLog entry (or added one with bad syntax for which 
parsing fails, etc.)).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-06 Thread Joseph Myers
On Thu, 6 Jun 2019, Richard Earnshaw (lists) wrote:

> > For email addresses, I think that using @gcc.gnu.org would be the best
> > approach for people that have such accounts, rather than an employer
> > address from an arbitrary point in time.
> 
> Or @gnu.org for accounts that pre-date the switch to EGCS and CVS.

When were such addresses introduced?  I'm not sure if all the gcc2 
committers would have had them, or only @.ai.mit.edu if 
that's where the repository was (certainly many early ChangeLog entries 
tend to use the .ai.mit.edu form, if not just 
).

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [committed, amdgcn] Add -march=gfx906 for Vega20

2019-06-06 Thread Joseph Myers
On Thu, 6 Jun 2019, Andrew Stubbs wrote:

> This patch adds a new -march=gfx906 option, and a new multilib to go with it.

This is missing an invoke.texi update.

-- 
Joseph S. Myers
jos...@codesourcery.com


Re: [Contrib PATCH] Add scripts to convert GCC repo from SVN to Git

2019-06-06 Thread Ian Lance Taylor
On Thu, Jun 6, 2019 at 4:41 PM Joseph Myers  wrote:
>
> On Thu, 6 Jun 2019, Richard Earnshaw (lists) wrote:
>
> > > For email addresses, I think that using @gcc.gnu.org would be the best
> > > approach for people that have such accounts, rather than an employer
> > > address from an arbitrary point in time.
> >
> > Or @gnu.org for accounts that pre-date the switch to EGCS and CVS.
>
> When were such addresses introduced?  I'm not sure if all the gcc2
> committers would have had them, or only @.ai.mit.edu if
> that's where the repository was (certainly many early ChangeLog entries
> tend to use the .ai.mit.edu form, if not just
> ).

I got a @gnu.org account around 1990 or 1991, and I was hardly the
first, so they were introduced some time before then.

Ian


Re: [PATCH] Disable PowerPC pc-relative support until the code is checked in

2019-06-06 Thread Michael Meissner
On Thu, Jun 06, 2019 at 06:14:29PM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Thu, Jun 06, 2019 at 06:42:16PM -0400, Michael Meissner wrote:
> > 2019-06-06  Michael Meissner  
> > 
> > * config/rs6000/rs6000-cpus.def (ISA_FUTURE_MASKS_SERVER): Delete
> > enabling -mprefixed-addr and -mpcrel by default.
> 
> Why disable prefixed-addr?

Convenience, but I could leave it in, since we don't yet have any prefixed
instruction support.

> > * config/rs6000/rs6000.c (rs6000_option_override_internal): Make
> > -mpcrel and -mprefixed-addr act like other swtiches (i.e. using
> 
> Typo ("switches").

Thanks.

> > -mpcrel automatically sets -mcpu=future and -mprefixed-addr, and
> 
> Automatically setting -mcpu= is a bad thing.  Instead, we should just
> error out if someone tries to use -mpcrel with a CPU (or ABI, etc.) that
> doesn't support it.  Or, is there any special reason you want it?

Well, I was trying to be consistent with the other things (-mpower9-vector
automatically sets all of the other power9 options).  If you feel we don't need
the consistency, I can remove that part of the patch.

As I mentioned elsewhere, there is a real problem with options specified on the
command line and pragma/attribute target (basically, if you set -mpcrel on the
command line, and then do '#pragma GCC target ("cpu=power9")', it will
currently complain that -mfuture or -mcpu=future is not set).  I wanted to do
the minimum patch so other people could start using the target.

> In the future, we will not have an -mprefixed-addr option (it will be
> always on for CPUs that support it), and I don't see any real reason
> to allow disabling pcrel either, but we'll see.

Well I suspect for at least several months we will need the ability to turn off
pc-relative support but allow the other future stuff.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.ibm.com, phone: +1 (978) 899-4797



Re: [RFC][PR88838][SVE] Use 32-bit WHILELO in LP64 mode

2019-06-06 Thread Jeff Law
On 6/6/19 6:33 AM, Richard Sandiford wrote:
> Kugan Vivekanandarajah  writes:
>> Hi Richard,
>>
>> On Thu, 6 Jun 2019 at 22:07, Richard Sandiford
>>  wrote:
>>>
>>> Kugan Vivekanandarajah  writes:
 Hi Richard,

 On Thu, 6 Jun 2019 at 19:35, Richard Sandiford
  wrote:
>
> Kugan Vivekanandarajah  writes:
>> Hi Richard,
>>
>> Thanks for the review. Attached is the latest patch.
>>
>> For testcase like cond_arith_1.c, with the patch, gcc ICE in fwprop. I
>> am limiting fwprop in cases like this. Is there a better fix for this?
>> index cf2c9de..2c99285 100644
>> --- a/gcc/fwprop.c
>> +++ b/gcc/fwprop.c
>> @@ -1358,6 +1358,15 @@ forward_propagate_and_simplify (df_ref use,
>> rtx_insn *def_insn, rtx def_set)
>>else
>>  mode = GET_MODE (*loc);
>>
>> +  /* TODO. We can't get the mode for
>> + (set (reg:VNx16BI 109)
>> +  (unspec:VNx16BI [
>> +(reg:SI 131)
>> +(reg:SI 106)
>> +   ] UNSPEC_WHILE_LO))
>> + Thus, bailout when it is UNSPEC and MODEs are not compatible.  */
>> +  if (GET_MODE_CLASS (mode) != GET_MODE_CLASS (GET_MODE (reg)))
>> +return false;
>>new_rtx = propagate_rtx (*loc, mode, reg, src,
>>   optimize_bb_for_speed_p (BLOCK_FOR_INSN (use_insn)));
>
> What specifically goes wrong?  The unspec above isn't that unusual --
> many unspecs have different modes from their inputs.

 cond_arith_1.c:38:1: internal compiler error: in paradoxical_subreg_p,
 at rtl.h:3130
 0x135f1d3 paradoxical_subreg_p(machine_mode, machine_mode)
 ../../88838/gcc/rtl.h:3130
 0x135f1d3 propagate_rtx
 ../../88838/gcc/fwprop.c:683
 0x135f4a3 forward_propagate_and_simplify
 ../../88838/gcc/fwprop.c:1371
 0x135f4a3 forward_propagate_into
 ../../88838/gcc/fwprop.c:1430
 0x135fdcb fwprop
 ../../88838/gcc/fwprop.c:1519
 0x135fdcb execute
 ../../88838/gcc/fwprop.c:1550
 Please submit a full bug report,
 with preprocessed source if appropriate.


 in forward_propagate_and_simplify

 use_set:
 (set (reg:VNx16BI 96 [ loop_mask_52 ])
 (unspec:VNx16BI [
 (reg:SI 92 [ _3 ])
 (reg:SI 95 [ niters.36 ])
 ] UNSPEC_WHILE_LO))

 reg:
 (reg:SI 92 [ _3 ])

 *loc:
 (unspec:VNx16BI [
 (reg:SI 92 [ _3 ])
 (reg:SI 95 [ niters.36 ])
 ] UNSPEC_WHILE_LO)

 src:
 (subreg:SI (reg:DI 136 [ ivtmp_101 ]) 0)

 use_insn:
 (insn 87 86 88 4 (parallel [
 (set (reg:VNx16BI 96 [ loop_mask_52 ])
 (unspec:VNx16BI [
 (reg:SI 92 [ _3 ])
 (reg:SI 95 [ niters.36 ])
 ] UNSPEC_WHILE_LO))
 (clobber (reg:CC 66 cc))
 ]) 4255 {while_ultsivnx16bi}
  (expr_list:REG_UNUSED (reg:CC 66 cc)
 (nil)))

 I think we calculate the mode to be VNx16BI which is wrong?
 because of which in propgate_rtx,   !paradoxical_subreg_p (mode,
 GET_MODE (SUBREG_REG (new_rtx)  ICE
>>>
>>> Looks like something I hit on the ACLE branch, but didn't have a
>>> non-ACLE reproducer for (see 065881acf0de35ff7818c1fc92769e1c106e1028).
>>>
>>> Does the attached work?  The current call is wrong because "mode"
>>> is the mode of "x", not the mode of "new_rtx".
>>
>> Yes, attached patch works for this testcase. Are you planning to
>> commit it to trunk. I will wait for this.
> 
> Needs approval first. :-)
> 
> The patch was originally bootstrapped & regtested on aarch64-linux-gnu,
> but I'll repeat that for trunk and test x86_64-linux-gnu too.
Assuming that passes this is fine.
jeff


Re: [PATCH] Add warn_unused_result for malloc-like functions (PR tree-optimization/78902).

2019-06-06 Thread Jeff Law
On 6/6/19 2:01 AM, Martin Liška wrote:
> Hi.
> 
> The patch adds the warn_unused_result attribute to malloc-like functions.
> 
> Patch can bootstrap on x86_64-linux-gnu and survives regression tests.
> 
> Ready to be installed?
> Thanks,
> Martin
> 
> gcc/ChangeLog:
> 
> 2019-06-06  Martin Liska  
> 
>   PR tree-optimization/78902
>   * builtin-attrs.def (ATTR_WARN_UNUSED_RESULT): New.
>   (ATTR_MALLOC_NOTHROW_LEAF_LIST): Remove.
>   (ATTR_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LEAF_LIST): New.
>   (ATTR_ALLOC_SIZE_2_NOTHROW_LIST): Remove.
>   (ATTR_MALLOC_SIZE_1_NOTHROW_LEAF_LIST): Remove.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_LIST): New.
>   (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LIST): New.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
>   (ATTR_ALLOCA_SIZE_1_NOTHROW_LEAF_LIST): Remove.
>   (ATTR_ALLOCA_WARN_UNUSED_RESULT_SIZE_1_NOTHROW_LEAF_LIST): New.
>   (ATTR_MALLOC_SIZE_1_2_NOTHROW_LEAF_LIST):  Remove.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_SIZE_1_2_NOTHROW_LEAF_LIST):
>   New.
>   (ATTR_ALLOC_SIZE_2_NOTHROW_LEAF_LIST): Remove.
>   (ATTR_ALLOC_WARN_UNUSED_RESULT_SIZE_2_NOTHROW_LEAF_LIST): New.
>   (ATTR_MALLOC_NOTHROW_NONNULL): Remove.
>   (ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL): New.
>   (ATTR_MALLOC_NOTHROW_NONNULL_LEAF): Remove.
>   (ATTR_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
>   (ATTR_MALLOC_WARN_UNUSED_RESULT_NOTHROW_NONNULL_LEAF): New.
>   * builtins.def (BUILT_IN_ALIGNED_ALLOC): Change to use
>   warn_unused_result attribute.
>   (BUILT_IN_STRDUP): Likewise.
>   (BUILT_IN_STRNDUP): Likewise.
>   (BUILT_IN_ALLOCA): Likewise.
>   (BUILT_IN_CALLOC): Likewise.
>   (BUILT_IN_MALLOC): Likewise.
>   (BUILT_IN_REALLOC): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 2019-06-06  Martin Liska  
> 
>   PR tree-optimization/78902
>   * c-c++-common/asan/alloca_loop_unpoisoning.c: Use result
>   of __builtin_alloca.
>   * c-c++-common/asan/pr88619.c: Likewise.
>   * g++.dg/overload/using2.C: Likewise for malloc.
>   * gcc.dg/attr-alloc_size-5.c: Add new dg-warning.
>   * gcc.dg/nonnull-3.c: Use result of __builtin_strdup.
>   * gcc.dg/pr43643.c: Likewise.
>   * gcc.dg/pr59717.c: Likewise for calloc.
>   * gcc.dg/torture/pr71816.c: Likewise.
>   * gcc.dg/tree-ssa/pr78886.c: Likewise.
>   * gcc.dg/tree-ssa/pr79697.c: Likewise.
>   * gcc.dg/pr78902.c: New test.
OK.

Any thoughts on whether or not we'd want to IPA propagate this attribute
like we do for the malloc attribute?

jeff
> ---
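For readers unfamiliar with the attribute under discussion, here is a small sketch of how warn_unused_result behaves on a user-defined malloc-like function. The helper dup_string is hypothetical and is not one of the builtins touched by the patch; it only shows the same attribute pair applied by hand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical strdup-like helper.  The "malloc" attribute tells GCC
   that the returned pointer does not alias any other object, and
   "warn_unused_result" asks for a warning when a caller discards the
   return value.  */
__attribute__ ((malloc, warn_unused_result))
char *
dup_string (const char *s)
{
  size_t len;
  char *copy;

  assert (s != NULL);
  len = strlen (s) + 1;
  copy = malloc (len);
  if (copy != NULL)
    memcpy (copy, s, len);
  return copy;
}
```

A call written as a bare statement, such as "dup_string (name);", should draw a -Wunused-result warning from GCC, while "char *p = dup_string (name);" should not. This is the kind of source change the testsuite ChangeLog entries above describe as "Use result of ...".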


Re: [PATCH] A jump threading opportunity for condition branch

2019-06-06 Thread Jeff Law
On 5/31/19 9:03 AM, Jeff Law wrote:
> On 5/31/19 1:24 AM, Richard Biener wrote:
>> On Thu, 30 May 2019, Jeff Law wrote:
>>
>>> On 5/30/19 12:41 AM, Richard Biener wrote:
 On May 29, 2019 10:18:01 PM GMT+02:00, Jeff Law  wrote:
> On 5/23/19 6:11 AM, Richard Biener wrote:
>> On Thu, 23 May 2019, Jiufu Guo wrote:
>>
>>> Hi,
>>>
>>> Richard Biener  writes:
>>>
 On Tue, 21 May 2019, Jiufu Guo wrote:
>
> +}
> +
> +  if (TREE_CODE_CLASS (gimple_assign_rhs_code (def)) !=
> tcc_comparison)
> +return false;
> +
> +  /* Check if phi's incoming value is defined in the incoming
> basic_block.  */
> +  edge e = gimple_phi_arg_edge (phi, index);
> +  if (def->bb != e->src)
> +return false;
 why does this matter?

>>> Through preparing pathes and duplicating block, this transform can
> also
>>> help to combine a cmp in previous block and a gcond in current
> block.
>>> "if (def->bb != e->src)" make sure the cmp is define in the incoming
>>> block of the current; and then combining "cmp with gcond" is safe. 
> If
>>> the cmp is defined far from the incoming block, it would be hard to
>>> achieve the combining, and the transform may not needed.
>> We're in SSA form so the "combining" doesn't really care where the
>> definition comes from.
> Combining doesn't care, but we need to make sure the copy of the
> conditional ends up in the right block since it wouldn't necessarily be
> associated with def->bb anymore.  But I'd expect the sinking pass to
> make this a non-issue in practice anyway.
>
>>
> +
> +  if (!single_succ_p (def->bb))
> +return false;
 Or this?  The actual threading will ensure this will hold true.

>>> Yes, other thread code check this and ensure it to be true, like
>>> function thread_through_normal_block. Since this new function is
> invoked
>>> outside thread_through_normal_block, so, checking single_succ_p is
> also
>>> needed for this case.
>> I mean threading will isolate the path making this trivially true.
>> It's also no requirement for combining, in fact due to the single-use
>> check the definition can be sinked across the edge already (if
>> the edges dest didn't have multiple predecessors which this threading
>> will fix as well).
> I don't think so.  The CMP source block could end with a call and have
> an abnormal edge (for example).  We can't put the copied conditional
> before the call and putting it after the call essentially means
> creating
> a new block.
>
> The CMP source block could also end with a conditional.  Where do we
> put
> the one we want to copy into the CMP source block in that case? :-)
>
> This is something else we'd want to check if we ever allowed the the
> CMP
> defining block to not be the immediate predecessor of the conditional
> jump block.  If we did that we'd need to validate that the block where
> we're going to insert the copy of the jump has a single successor.

 But were just isolating a path here. The actual combine job is left to 
 followup cleanups. 
>>> Absolutely agreed.  My point was that there's some additional stuff we'd
>>> have to verify does the right thing if we wanted to allow the CMP to be
>>> somewhere other than in the immediate predecessor of the conditional
>>> jump block.
>>
>> For correctness?  No.  For the CMP to be forwarded?  No.  For optimality
>> maybe - forwarding a binary operation always incurs register pressure
>> increase.
> For correctness of the patch.  Conceptually I have _no_ issues with
> having the CMP in a different block than an immediate predecessor of the
> conditional jump block.  But the patch does certain code which would
> need to be audited with that change in mind.
> 
>>
>> Btw, as you already said sinking should have sinked the CMP to the
>> predecessor (since we have a single use in the PHI).
>>
>> So I hardly see the point of making this difference.
> :-)
So just to satisfy my curiosity I put in some instrumentation to check
for cases where the CMP is not in an immediate predecessor of the
conditional branch.  It happens.  It's not terribly common though.  I'd
guess it's cases where this code is running before sinking.

I went ahead and audited the patch for this case so that we could just
eliminate that test.  The key thing is that we don't use the block
with the CMP insn at all in this code.  So there's no possibility of
duplicating the conditional into the wrong block or anything like that.

Since this code is running from within thread_across_edge it can't be
called with complex/abnormal edges or any other cases that can't be
handled since we filter those out before calling thread_across_edge.

So it should be safe to just elimin

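As a rough illustration of the shape discussed in this thread, the following C function produces a value that is, at the GIMPLE level, a PHI of two comparison results feeding a conditional branch. The function select_path is hypothetical and is not taken from the patch or its testcases; whether GCC actually threads the jumps here depends on the version, the optimization level and pass ordering.

```c
/* Hypothetical example: T is set by a comparison on each incoming
   path and is used only by the conditional that follows.  Threading
   each predecessor straight to the matching return would let the
   comparison and the branch be combined on each path.  */
int
select_path (int a, int b, int use_less)
{
  int t;

  if (use_less)
    t = a < b;
  else
    t = a == b;

  if (t)
    return 1;
  return 2;
}
```

Dumps from -fdump-tree-all at -O2 are one way to see whether the PHI survives; no particular dump output is claimed here.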
libgo patch committed: Ignore unexported names in gccgoimporter

2019-06-06 Thread Ian Lance Taylor
This libgo patch changes the gccgoimporter package to ignore the
unexported and imported names.  Due to inlining, we can now see
unexported functions and variables, and functions and variables
imported from different packages.  Ignore them rather than reporting
them from this package.

Handle $hash and $equal functions consistently, so that we discard the
inline body if there is one.

Ignore names created for result parameters for inlining purposes.

Bootstrapped and ran Go testsuite on x86_64-pc-linux-gnu.  Committed
to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 272022)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-015785baa74629baafe520367b9c71707366c6eb
+e76c26059585433ce44e50cd7f8f504c6676f453
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: libgo/go/go/internal/gccgoimporter/parser.go
===
--- libgo/go/go/internal/gccgoimporter/parser.go(revision 271976)
+++ libgo/go/go/internal/gccgoimporter/parser.go(working copy)
@@ -261,6 +261,10 @@ func (p *parser) parseField(pkg *types.P
 // Param = Name ["..."] Type .
 func (p *parser) parseParam(pkg *types.Package) (param *types.Var, isVariadic bool) {
name := p.parseName()
+   // Ignore names invented for inlinable functions.
+   if strings.HasPrefix(name, "p.") || strings.HasPrefix(name, "r.") || strings.HasPrefix(name, "$ret") {
+   name = ""
+   }
if p.tok == '<' && p.scanner.Peek() == 'e' {
// EscInfo = "" . (optional and ignored)
p.next()
@@ -286,7 +290,14 @@ func (p *parser) parseParam(pkg *types.P
 // Var = Name Type .
 func (p *parser) parseVar(pkg *types.Package) *types.Var {
name := p.parseName()
-   return types.NewVar(token.NoPos, pkg, name, p.parseType(pkg))
+   v := types.NewVar(token.NoPos, pkg, name, p.parseType(pkg))
+   if name[0] == '.' || name[0] == '<' {
+   // This is an unexported variable,
+   // or a variable defined in a different package.
+   // We only want to record exported variables.
+   return nil
+   }
+   return v
 }
 
 // Conversion = "convert" "(" Type "," ConstValue ")" .
@@ -741,14 +752,17 @@ func (p *parser) parseFunc(pkg *types.Pa
}
 
name := p.parseName()
-   if strings.ContainsRune(name, '$') {
-   // This is a Type$equal or Type$hash function, which we don't want to parse,
-   // except for the types.
-   p.discardDirectiveWhileParsingTypes(pkg)
-   return nil
-   }
f := types.NewFunc(token.NoPos, pkg, name, p.parseFunctionType(pkg, nil))
p.skipInlineBody()
+
+   if name[0] == '.' || name[0] == '<' || strings.ContainsRune(name, '$') {
+   // This is an unexported function,
+   // or a function defined in a different package,
+   // or a type$equal or type$hash function.
+   // We only want to record exported functions.
+   return nil
+   }
+
return f
 }
 
@@ -769,7 +783,9 @@ func (p *parser) parseInterfaceType(pkg
embeddeds = append(embeddeds, p.parseType(pkg))
} else {
method := p.parseFunc(pkg)
-   methods = append(methods, method)
+   if method != nil {
+   methods = append(methods, method)
+   }
}
p.expect(';')
}
@@ -1050,23 +1066,6 @@ func (p *parser) parsePackageInit() Pack
return PackageInit{Name: name, InitFunc: initfunc, Priority: priority}
 }
 
-// Throw away tokens until we see a newline or ';'.
-// If we see a '<', attempt to parse as a type.
-func (p *parser) discardDirectiveWhileParsingTypes(pkg *types.Package) {
-   for {
-   switch p.tok {
-   case '\n', ';':
-   return
-   case '<':
-   p.parseType(pkg)
-   case scanner.EOF:
-   p.error("unexpected EOF")
-   default:
-   p.next()
-   }
-   }
-}
-
 // Create the package if we have parsed both the package path and package name.
 func (p *parser) maybeCreatePackage() {
if p.pkgname != "" && p.pkgpath != "" {
@@ -1204,7 +1203,9 @@ func (p *parser) parseDirective() {
case "var":
p.next()
v := p.parseVar(p.pkg)
-   p.pkg.Scope().Insert(v)
+   if v != nil {
+   p.pkg.Scope().Insert(v)
+   }
p.expectEOL()
 
case "const":

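The filtering rule that parseFunc applies after this patch can be restated compactly. The sketch below is a C restatement, not the Go code from the patch: should_record_func is a hypothetical name, and treating an empty name as "do not record" is an assumption added for safety (the Go code indexes name[0] directly).

```c
#include <stdbool.h>
#include <string.h>

/* Return true if a function with this export-data name should be
   recorded in the package scope.  Names starting with '.' are
   unexported, names starting with '<' come from a different package,
   and names containing '$' are type$equal or type$hash helpers.  */
bool
should_record_func (const char *name)
{
  if (name[0] == '\0')
    return false;   /* Assumption: empty names are never recorded.  */
  if (name[0] == '.' || name[0] == '<')
    return false;
  return strchr (name, '$') == NULL;
}
```

parseVar in the patch applies only the first two checks ('.' and '<'), not the '$' check. In both cases the type is still parsed before the name is tested, so the parser stays in sync with the input even when the result is dropped.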
