Re: [RFA][PATCH][tree-optimization/78496] 01/03 Do not lose range information from earlier VRP passes

2017-12-04 Thread Jakub Jelinek
On Sun, Dec 03, 2017 at 10:55:27PM -0700, Jeff Law wrote:
> As we touched on in IRC, the EVRP analyzer was doing something stupid
> which caused it to not merge in existing range information under certain
> circumstances.
> 
> Consider this fragment:
> 
>   x_1 = foo ()
>   if (x_1 > 2)
> __builtin_unreachable ();
>   if (x_1 < 0)
> __builtin_unreachable ();

Note that for say:
  x_1 = foo ();
  bar (x_1);
  if (x_1 > 2)
__builtin_unreachable ();
  if (x_1 < 0)
__builtin_unreachable ();
  ...
  further uses of x_1
we can't do that anymore (at least, can't remember it in
SSA_NAME_RANGE_INFO), as bar could not return/could loop etc.  Ditto with
any other code in between foo and the unreachable asserts if it doesn't
guarantee that we'll always reach the comparisons after the x_1 setter.
Even
  x_1 = foo ();
  bar ();
  if (x_1 > 2)
__builtin_unreachable ();
  if (x_1 < 0)
__builtin_unreachable ();
looks unclean, if bar doesn't return, then we'd just need to hope we don't
add further uses of x_1 in between foo and bar.  Some optimizations do stuff
like that, consider foo being a pass-through function.

Jakub


[PATCH] Re: loading of zeros into {x,y,z}mm registers (take 2)

2017-12-04 Thread Jan Beulich
>>> On 02.12.17 at 01:13,  wrote:
> On Fri, Dec 01, 2017 at 02:54:28PM +0100, Jakub Jelinek wrote:
>> Will try this:
> 
> That failed to bootstrap, but here is an updated version that passed
> bootstrap/regtest on x86_64-linux and i686-linux, ok for trunk?

Thanks for the quick fix. A related question though: The issue
could have been spotted (in the provided example) by gas, if
only it had been given enough information (i.e. that AVX512VL
insns are fine to use, but AVX512DQ ones aren't). Is there a
reason gcc doesn't pass any -march= to gas, nor issues any
.arch directives?

Jan



Re: [PATCH] Re: loading of zeros into {x,y,z}mm registers (take 2)

2017-12-04 Thread Jakub Jelinek
On Mon, Dec 04, 2017 at 01:15:46AM -0700, Jan Beulich wrote:
> >>> On 02.12.17 at 01:13,  wrote:
> > On Fri, Dec 01, 2017 at 02:54:28PM +0100, Jakub Jelinek wrote:
> >> Will try this:
> > 
> > That failed to bootstrap, but here is an updated version that passed
> > bootstrap/regtest on x86_64-linux and i686-linux, ok for trunk?
> 
> Thanks for the quick fix. A related question though: The issue
> could have been spotted (in the provided example) by gas, if
> only it had been given enough information (i.e. that AVX512VL
> insns are fine to use, but AVX512DQ ones aren't). Is there a
> reason gcc doesn't pass any -march= to gas, nor issues any
> .arch directives?

Because we've never done that on i?86/x86_64, so it would confuse users,
plus there can be instructions in inline asm, there is target attribute,
function multiversioning, etc.

A year or more ago I've been regtesting with hacks to let the assembler
complain, I just remember it wasn't that easy, but don't remember the
details anymore.

Jakub


[PATCH] Fix PR83238

2017-12-04 Thread Richard Biener

This fixes PR83238, biitstrapped and tested on x86_64-unknown-linux-gnu, 
applied.

Richard.

2017-12-04  Richard Biener  

PR tree-optimization/83238
* graphite-scop-detection.c (scop_detection::merge_sese): Make
code match comment, rejecting invalid SESE regions.

* gcc.dg/graphite/pr83238.c: New testcase.

Index: gcc/graphite-scop-detection.c
===
--- gcc/graphite-scop-detection.c   (revision 255309)
+++ gcc/graphite-scop-detection.c   (working copy)
@@ -570,8 +570,7 @@ scop_detection::merge_sese (sese_l first
  which post-dominates dom, until it stabilizes.  Also, ENTRY->SRC and
  EXIT->DEST should be in the same loop nest.  */
   if (!dominated_by_p (CDI_DOMINATORS, pdom, dom)
-  || loop_depth (entry->src->loop_father)
-!= loop_depth (exit->dest->loop_father))
+  || entry->src->loop_father != exit->dest->loop_father)
 return invalid_sese;
 
   /* For now we just bail out when there is a loop exit in the region
Index: gcc/testsuite/gcc.dg/graphite/pr83238.c
===
--- gcc/testsuite/gcc.dg/graphite/pr83238.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr83238.c (working copy)
@@ -0,0 +1,39 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -floop-parallelize-all" } */
+
+void
+vw (int *dk, int zd, int jb)
+{
+  int sq;
+  int *kv = &sq;
+
+  for (sq = 0; sq < 2; ++sq)
+{
+  int u1;
+
+  for (u1 = 0; u1 < 5; ++u1)
+   if (zd == 0)
+ return;
+}
+
+  for (;;)
+{
+  ++zd;
+  if (zd == 0)
+   while (jb != 0)
+ kv = &jb;
+
+  while (*dk < 1)
+   {
+ for (jb = 0; jb < 2; ++jb)
+   {
+   }
+ ++*dk;
+   }
+
+  for (*kv = 0; *kv < 2; ++*kv)
+   for (*dk = 0; *dk < 2; ++*dk)
+ {
+ }
+}
+}


Re: [PATCH] lra: Clobbers in a parallel are earlyclobbers (PR83245)

2017-12-04 Thread Andreas Krebbel
On 12/04/2017 01:13 AM, Vladimir Makarov wrote:
> 
> 
> On 12/02/2017 01:59 PM, Segher Boessenkool wrote:
>> The documentation (rtl.texi) says:
>>
>>When a @code{clobber} expression for a register appears inside a
>>@code{parallel} with other side effects, the register allocator
>>guarantees that the register is unoccupied both before and after that
>>insn if it is a hard register clobber.
>>
>> and at least the rs6000 backend relies on that (see PR83245).  This
>> patch restores that behaviour.
>>
>> Registers that are also used as operands in the instruction are not
>> treated as earlyclobber, so such insns also still work (PR80818, an
>> s390 testcase).
>>
>> Tested on powerpc64-linux {-m32,-m64}, also tested with a s390 cross.
>>
>> Andreas, can you confirm this really still works after this patch?
>> Vlad, if so, is this okay for trunk?
>>
>>
>>
> Yes, it is ok for me in any case, even if it creates a trouble for s390 
> PR80818.  Sorry for the inconvenience from my patch.  I should have been 
> more careful.

With the patch the original failure is back on s390 :( But I agree that the 
change is required.
PR80818 needs to be solved differently.

I think the problem with PR80818 is that in insns like:

(insn 472 27815 25079 10 (parallel [
(set (reg:SI 7 %r7 [orig:1282 _2003 ] [1282])
(plus:SI (plus:SI (gtu:SI (reg:CCU 33 %cc)
(const_int 0 [0]))
(reg:SI 7 %r7 [orig:1256 _1967 ] [1256]))
(reg:SI 2 %r2 [orig:1258 _1969 ] [1258])))
(clobber (reg:CC 33 %cc))
]) "t.f90":48 1661 {*addsi3_alc}
 (nil))

r33 is NOT recognized as being used in an operand when it comes to recording 
early clobbers. The
operand RTX is:
(gtu:SI (reg:CCU 33 %cc) (const_int 0 [0]))

The code at collect_non_operand_hard_regs only checks for exact matches with 
operand locations and
therefore does not consider r33 an operand:

  for (i = 0; i < data->insn_static_data->n_operands; i++)
if (! data->insn_static_data->operand[i].is_operator
&& x == data->operand_loc[i])
  /* It is an operand loc. Stop here.  */
  return list;

-Andreas-



[patch][i386, AVX] Adding missing mask[z]_range[_round]_s[d,s] intrinsics

2017-12-04 Thread Makhotina, Olga
Hi,

This patch adds missing intrinsics for _mm_mask[z]_range[_round]_[sd,ss].

04.12.2017 Olga Makhotina  

gcc/
* config/i386/avx512dqintrin.h (_mm_mask_range_sd, _mm_maskz_range_sd,
_mm_mask_range_round_sd, _mm_maskz_range_round_sd, _mm_mask_range_ss,
_mm_maskz_range_ss, _mm_mask_range_round_ss,
_mm_maskz_range_round_ss): New intrinsics.
(__builtin_ia32_rangesd128_round, __builtin_ia32_rangess128_round): 
Remove.
(__builtin_ia32_rangesd128_mask_round,
__builtin_ia32_rangess128_mask_round): New builtins.
* config/i386/i386-builtin.def (__builtin_ia32_rangesd128_round,
__builtin_ia32_rangess128_round): Remove.
(__builtin_ia32_rangesd128_mask_round,
__builtin_ia32_rangess128_mask_round): New builtins.
* config/i386/sse.md (ranges): Renamed to ...
(ranges): ... this.
((match_operand:VF_128 2 ""
"")): Changed to ...
((match_operand:VF_128 2 ""
"")): ... this.
("vrange\t{%3, %2, %1, %0|
%0, %1, %2, %3}"): Changed to ...
("vrange\t{%3, %2, 
%1,
%0|%0, %1,
%2, %3}"): ... this.

04.12.2017 Olga Makhotina  

gcc/testsuite/
* gcc.target/i386/avx512dq-vrangesd-1.c (_mm_mask_range_sd,
_mm_maskz_range_sd, _mm_mask_range_round_sd,
_mm_maskz_range_round_sd): Test new intrinsics.
* gcc.target/i386/avx512dq-vrangesd-2.c (_mm_range_sd, 
_mm_mask_range_sd,
_mm_maskz_range_sd, _mm_range_round_sd, _mm_mask_range_round_sd,
_mm_maskz_range_round_sd): Test new intrinsics.
* gcc.target/i386/avx512dq-vrangess-1.c (_mm_mask_range_ss,
_mm_maskz_range_ss, _mm_mask_range_round_ss,
_mm_maskz_range_round_ss): Test new intrinsics.
* gcc.target/i386/avx512dq-vrangess-2.c (_mm_range_ss, 
_mm_mask_range_ss,
_mm_maskz_range_ss, _mm_range_round_ss, _mm_mask_range_round_ss,
_mm_maskz_range_round_ss): Test new intrinsics.
* gcc.target/i386/avx-1.c (__builtin_ia32_rangesd128_round,
__builtin_ia32_rangess128_round): Remove builtins.
(__builtin_ia32_rangesd128_mask_round,
__builtin_ia32_rangess128_mask_round): Test new builtins.
* gcc.target/i386/sse-13.c: Ditto.
* gcc.target/i386/sse-23.c: Ditto.

Is it ok for trunk?

Thanks,
Olga



0001-range.patch
Description: 0001-range.patch


Re: [PATCH, rs6000] gimple folding of vec_msum()

2017-12-04 Thread Richard Biener
On Sat, Dec 2, 2017 at 12:08 AM, Bill Schmidt
 wrote:
> Hi Will,
>
>> On Dec 1, 2017, at 3:43 PM, Will Schmidt  wrote:
>>
>> On Fri, 2017-12-01 at 18:46 +0100, Richard Biener wrote:
>>> On December 1, 2017 6:22:21 PM GMT+01:00, Will Schmidt 
>>>  wrote:
 Hi,
 Add support for folding of vec_msum in GIMPLE.

 This uses the DOT_PROD_EXPR gimple op, which is sensitive to type
 mismatches:
 error: type mismatch in dot product reduction
 __vector signed int
 __vector signed char
 __vector unsigned char
 D.2798 = DOT_PROD_EXPR ;
 So for those cases with a signed/unsigned mismatch in the arguments,
 this
 converts those arguments to their signed type.

 This also adds a define_expand for sdot_prodv16qi. This is based on a
 similar
 existing entry.

 Testing coverage is handled by the existing
 gcc.target/powerpc/fold-vec-msum*.c tests.

 Sniff-tests have passed on P8.  full regtests currently running on
 other assorted
 power systems.
 OK for trunk with successful results?
>>>
>>> Note DOT_PROD_EXPR is only useful when the result is reduced to a scalar 
>>> later and the reduction order is irrelevant.
>>>
>>> This is because GIMPLE doesn't specify whether the reduction reduces 
>>> odd/even or high/low lanes of the argument vectors.  Does vec_msum specify 
>>> that?
>>
>> Not that I see, but there may be an implied intent here that just isn't
>> obvious to me.   I'll defer to ... someone. :-)
>>
>>> That said, it exists as a 'hack' for the vectorizer and isn't otherwise 
>>> useful for GIMPLE.
>>
>> OK.  With that in mind, should I just try to split this out into
>> separate multiply and add steps?
>
> No.  The semantics of vec_msum are very specific and can't be accurately 
> represented in GIMPLE.  This one should be left as a call until expand.

I think some of Richards patches also remove those tree codes in favor
of IFNs.  More thoroughly specifying them would be nice
though - I do expect most targets doing odd/even reduction for them.
Fact of the unspecifiedness is that for example we can't
even constant-fold those on GIMPLE!

Richard.

> Thanks!
> Bill
>
>>
>> Thanks,
>> -Will
>>
>>
>>
>>
>>>
>>>
>>> Richard.
>>>
 Thanks
 -Will

 [gcc]

 2017-12-01  Will Schmidt  

 * config/rs6000/altivec.md (sdot_prodv16qi): New.
 * config/rs6000/rs6000.c (rs6000_gimple_fold_builtin): Add support for
 gimple-folding of vec_msum.
 (builtin_function_type): Add entries for VMSUMU[BH]M and VMSUMMBM.

 diff --git a/gcc/config/rs6000/altivec.md
 b/gcc/config/rs6000/altivec.md
 index 7122f99..fa9e121 100644
 --- a/gcc/config/rs6000/altivec.md
 +++ b/gcc/config/rs6000/altivec.md
 @@ -3349,11 +3349,26 @@
(match_operand:V8HI 2 "register_operand" "v")]
UNSPEC_VMSUMSHM)))]
  "TARGET_ALTIVEC"
  "
 {
 -  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
 operands[2], operands[3]));
 +  emit_insn (gen_altivec_vmsumshm (operands[0], operands[1],
 +  operands[2], operands[3]));
 +  DONE;
 +}")
 +
 +(define_expand "sdot_prodv16qi"
 +  [(set (match_operand:V4SI 0 "register_operand" "=v")
 +(plus:V4SI (match_operand:V4SI 3 "register_operand" "v")
 +   (unspec:V4SI [(match_operand:V16QI 1
 "register_operand" "v")
 + (match_operand:V16QI 2
 "register_operand" "v")]
 +UNSPEC_VMSUMM)))]
 +  "TARGET_ALTIVEC"
 +  "
 +{
 +  emit_insn (gen_altivec_vmsummbm (operands[0], operands[1],
 +  operands[2], operands[3]));
  DONE;
 }")

 (define_expand "widen_usum3"
  [(set (match_operand:V4SI 0 "register_operand" "=v")
 diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
 index 551d9c4..552fcdd 100644
 --- a/gcc/config/rs6000/rs6000.c
 +++ b/gcc/config/rs6000/rs6000.c
 @@ -16614,10 +16614,40 @@ rs6000_gimple_fold_builtin
 (gimple_stmt_iterator *gsi)
case VSX_BUILTIN_CMPLE_2DI:
case VSX_BUILTIN_CMPLE_U2DI:
  fold_compare_helper (gsi, LE_EXPR, stmt);
  return true;

 +/* vec_msum.  */
 +case ALTIVEC_BUILTIN_VMSUMUHM:
 +case ALTIVEC_BUILTIN_VMSUMSHM:
 +case ALTIVEC_BUILTIN_VMSUMUBM:
 +case ALTIVEC_BUILTIN_VMSUMMBM:
 +  {
 +   arg0 = gimple_call_arg (stmt, 0);
 +   arg1 = gimple_call_arg (stmt, 1);
 +   tree arg2 = gimple_call_arg (stmt, 2);
 +   lhs = gimple_call_lhs (stmt);
 +   if ( TREE_TYPE (arg0) == TREE_TYPE (arg1))
 + g = gimple_build_assign (lhs, DOT_PROD_EXPR, arg0, arg1, arg2);
 +   else
 + {
 +   // For the case where we have a mix of signed/unsigned
>>

Re: [RFA, PATCH] Deprecate stabs if dwarf2 support available.

2017-12-04 Thread Richard Biener
On Sun, Dec 3, 2017 at 5:17 AM, Jim Wilson  wrote:
> This is based on an idea from Richard Biener that was mentioned here
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02376.html
>
> If we are interested in doing this, it would be accompanied by documentation
> changes to state that stabs is now deprecated in favor of dwarf2, release
> notes to announce the change, and a configure patch to deprecate --with-stabs.
> We will also need a patch for the debug torture testing in the testsuite to
> check for the new error.  There may also be other things we need that I 
> haven't
> noticed yet.

Note that "deprecation" doesn't match up with giving an error when using.
We should warn (or maybe just document deprecation) instead.

IMHO unless we can get rid of stabs support it isn't worth being this drastic.

Maybe we can add a DWARF2_ONLY_DEBUGGING_INFO or
NO_DBX_DEBUGGING_INFO macro so targets can decide they do not
want to support -gstabs and reject it this way?  I'm not sure if
#undefing DBX_DEBUGGING_INFO for those targets would be enough to
achieve this?  It seems clearly not.  Thus sth like

Index: gcc/opts.c
===
--- gcc/opts.c  (revision 255375)
+++ gcc/opts.c  (working copy)
@@ -2383,6 +2383,10 @@ common_handle_option (struct gcc_options

 case OPT_gstabs:
 case OPT_gstabs_:
+#ifndef DBX_DEBUGGING_INFO
+  warning ("-gstabs is deprecated on this target");
+
+#endif
   set_debug_level (DBX_DEBUG, code == OPT_gstabs_, arg, opts, opts_set,
   loc);
   break;

and/or ignore stabs and use -gdwarf -g1 when -gstabs is specified?

Seems x86_64-linux includes dbxelf.h via config.gcc so the above has
no effect there for example.

OTOH removing that would probably completely remove stabs support.

So unless we want to remove any support for GCC 8 I'd suggest to document
the deprecation in gcc-8/changes.html and emit a deprecation warning from
the --with-stabs configure option.  Note that deprecating stabs means
deprecating
those targets (or just debugging on those?) that only support stabs.

> I tried testing a number of targets that use stabs to see what would happen
> with this change.
>
> There are 3 targets that default to stabs, and -g continues to emit stabs
> with this patch: m68k-openbsd, pdp11, vax-openbsd.
>
> There are 3 targets that default to stabs, and -g now gives an error: avr-elf,
> i386-lynxos, powerpc-lynxos.  avr includes elfos.h which defines
> DWARF2_DEBUGGING_INFO.  We can temporarily fix avr by adding a #undef for 
> that.
> I think i386-lynxos and powerpc-lynxos should switch to dwarf2, as both i386
> and powerpc support dwarf2.  But if necessary, we can temporarily fix them
> with the same change.
>
> There are 2 targets that emit dwarf2 by default, but emit stabs when
> the --with-stabs configure option is used: mn10300-elf, v850-elf.  They
> continue to work fine by default, but -g fails if you use --with-stabs.  The
> solution here is to add a configure patch to give an error when --with-stabs
> is used.
>
> And of course for the vast majority of targets, which support dwarf2 by
> default, -g works fine, and -gstabs now gives an error.
>
> There is also a configuration for cygwin, with an old non-dwarf assembler,
> that defaults to stabs, but that probably doesn't even build anymore.  And
> there is also a configuration for pre-darwin9 that defaults to stabs, but that
> may not be buildable anymore either.
>
> This doesn't break AIX/XCOFF, as we aren't trying to remove any code at this
> time.
>
> It is possible that there might also be other targets affected by the change
> that I haven't noticed yet.  I tried searching for all targets using stabs,
> but might have missed a few.
>
> Jim
>
> gcc/
> * toplev.c (process_options): If DWARF2_DEBUGGING_INFO defined, and
> write_symbols set to DBX_DEBUG, give an error.
> ---
>  gcc/toplev.c | 6 ++
>  1 file changed, 6 insertions(+)
>
> diff --git a/gcc/toplev.c b/gcc/toplev.c
> index 2f154960a17..f5cee35234e 100644
> --- a/gcc/toplev.c
> +++ b/gcc/toplev.c
> @@ -1482,6 +1482,12 @@ process_options (void)
>   "target system does not support the %qs debug format",
>   debug_type_names[write_symbols]);
>
> +#if defined(DWARF2_DEBUGGING_INFO)
> +  if (write_symbols == DBX_DEBUG)
> +error_at (UNKNOWN_LOCATION,
> + "DBX/stabs debugging info format is deprecated, use DWARF2 
> instead.");
> +#endif
> +
>/* We know which debug output will be used so we can set flag_var_tracking
>   and flag_var_tracking_uninit if the user has not specified them.  */
>if (debug_info_level < DINFO_LEVEL_NORMAL
> --
> 2.14.1
>


Re: [RFA][PATCH][tree-optimization/78496] 02/03 Do not exploit __builtin_unreachable if -fsanitize=unreachable

2017-12-04 Thread Richard Biener
On Mon, Dec 4, 2017 at 6:55 AM, Jeff Law  wrote:
>
> As I brought my patches for pr78496 up to the tip of the trunk I noticed
> a couple testsuite regressions with -fsanitize=unreachable tests.
>
> The problem is VRP and EVRP analysis/optimization could exploit
> __builtin_unreachable to narrow the range of an object, then use that
> narrowed range to eliminate the __builtin_unreachable.  That seems
> fundamentally wrong if we're compiling with -fsanitize=unreachable.
>
> So this patch changes both to not exploit __builtin_unreachable when
> -fsanitize=unreachable.
>
> Bootstrapped and regression tested with all three patches in this kit.
>
> OK for the trunk?

Jakub already fixed this in sligthly different ways.

Richard.

> Jeff
>
> * gimple-ssa-evrp-analyze.c
> (evrp_range_analyzer::record_ranges_from_incoming_edge): Do not
> exploit __builtin_unreachable for -fsanitize=unreachable.
> * tree-vrp.c (remove_range_assertions): Similarly.
>
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
> index fb3d329..3cdf271 100644
> --- a/gcc/gimple-ssa-evrp-analyze.c
> +++ b/gcc/gimple-ssa-evrp-analyze.c
> @@ -193,7 +193,8 @@ evrp_range_analyzer::record_ranges_from_incoming_edge 
> (basic_block bb)
>
>   /* If pred_e is really a fallthru we can record value ranges
>  in SSA names as well.  */
> - bool is_fallthru = assert_unreachable_fallthru_edge_p (pred_e);
> + bool is_fallthru = (assert_unreachable_fallthru_edge_p (pred_e)
> + && (flag_sanitize & SANITIZE_UNREACHABLE) == 0);
>
>   /* Push updated ranges only after finding all of them to avoid
>  ordering issues that can lead to worse ranges.  */
> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
> index a86b382..d0435a0 100644
> --- a/gcc/tree-vrp.c
> +++ b/gcc/tree-vrp.c
> @@ -5181,7 +5181,8 @@ remove_range_assertions (void)
> is_unreachable = 0;
> if (single_pred_p (bb)
> && assert_unreachable_fallthru_edge_p
> -   (single_pred_edge (bb)))
> +   (single_pred_edge (bb))
> +   && (flag_sanitize & SANITIZE_UNREACHABLE) == 0)
>   is_unreachable = 1;
>   }
> /* Handle
>


Re: [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM

2017-12-04 Thread Richard Biener
On Mon, Dec 4, 2017 at 6:59 AM, Jeff Law  wrote:
> And finally, the meat of the fix for pr78496.  This is largely the patch
> that was posed right when stage1 closed with just minor updates.
>
> This patch embeds evrp analysis into DOM and exploits it in the minimal
> way necessary to fix pr78496 without regressing other tests in the
> testsuite.
>
> The key effect we're looking for here is to pick up a slew of jump
> threads in DOM2 by exploiting the range information during jump threading.
>
> These aren't handled well in the standard tree-vrp jump threading -- we
> could abuse ASSERT_EXPRs further, but it'd just be ugly and probably
> computationally expensive.
>
> The ranges we want fall out naturally from a dominator walk, hence all
> the recent work around pulling out EVRP analysis into a little module
> other passes could reuse.
>
> With these changes we pick up all the missing jump thread opportunities
> in DOM for pr78496.  Sadly, I've been able to pull together an automated
> test that I like.  About the best I could come up with would be to
> verify that a large number of jump threads were realized during DOM2.
> But that still seems rather fragile.  I also looked at examining the
> results of PRE, but the partial redundancies that were originally the
> source of complaints in that BZ have already been fixed.  I also looked
> at perhaps looking for PHIs with constant args and then perhaps trying
> to filter those, but got nowhere.
>
> So there's no direct test for pr78496.  Sigh.
>
>
>
> Running EVRP analysis in DOM had an unintended side effect.  Namely that
> SSA_NAMEs that got created after the EVRP optimization pass would have
> range information attached to them.  That caused two warning regressions
> with -O1 code in the C++ and Fortran testsuites.  The problem is the
> ranges attached to the new SSA_NAMES were very wide and there was code
> left on an unexecutable path that called an allocation routine (C++) and
> a memset (Fortran) using those names as arguments.  The uber-wide ranges
> in those circumstances triggered the false positive warnings.
>
> By exploiting the EVRP data during the standard part of DOM to optimize
> conditionals, we're able to prove the paths in question simply aren't
> executable.  We remove them and the warnings go away.
>
> This work compromises builtin-unreachable-6.  So the original test it
> kept and -fno-tree-dominator-opts is added to keep it from being
> compromised.  A new builtin-unreachable-6a test is created to very that
> DOM does indeed remove all the __builtin_unreachable paths.
>
> This work also results in us optimizing 20030922-2.c again (conditional
> equivalences).  EVRP analysis will note the conditional equivalence.  We
> don't propagate anything from EVRP analysis at this time, but we do use
> it to try and simplify GIMPLE_CONDs to a compile-time constant which is
> what happens in this case.
>
> I plan to check this in if/when the first two patches are approved.

Looks good to me.  Btw, did you look at adding a GIMPLE testcase
for 78496?  You'd have stable IL into DOM then...

Richard.

> Jeff
>
> ps. The x_vr_values hack is scheduled to go away as we remove
> tree-vrp.c's threading implementation in gcc-9 -- we'll be able to drop
> the simplification callback and instead have simplification live within
> the dom_opt_dom_walker class where it'll have access to vr_values.  The
> analogous version in tree-vrp.c just gets deleted at that time.
>
> pps. I know there's a bogus propagation patch that I need to go back and
> re-check based on comments from Richi.  That's definitely in the queue.
>
>
>
> * gimple-ssa-evrp-analyze.h
> (evrp_range_analyzer::get_vr_values): Simplify.
> * gimple-ssa-evrp-analyze.c: Corresponding changes.
>
> * tree-ssa-dom.c: Include alloc-pool.h, tree-vrp.h, vr-values.h
> and gimple-ssa-evrp-analyze.h.
> (dom_opt_dom_walker class): Add evrp_range_analyzer member.
> (simplify_stmt_for_jump_threading): Copy a blob of code from
> tree-vrp.c to use ranges to simplify statements.
> (dom_opt_dom_walker::before_dom_children): Call
> evrp_range_analyzer::{enter,record_ranges_from_stmt} methods.
> (dom_opt_dom_walker::after_dom_children): Similarly for
> evrp_range_analyzer::leave.
> (dom_opt_dom_walker::optimize_stmt): Use EVRP ranges to optimize
> conditionals.
>
> * gcc.dg/builtin-unreachable-6.c: Disable DOM.
> * gcc.dg/builtin-unreachable-6a.c: New test.
> * gcc.dg/tree-ssa/20030922-1.c: No longer XFAIL.
> * gcc.dg/ssa-dom-branch-1.c: Tweak expected output.
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.h b/gcc/gimple-ssa-evrp-analyze.h
> index 6216a90..4783e6f 100644
> --- a/gcc/gimple-ssa-evrp-analyze.h
> +++ b/gcc/gimple-ssa-evrp-analyze.h
> @@ -51,13 +51,7 @@ class evrp_range_analyzer
>   true, then we are transferring ownership.  Else we keep ownership.
>
>   This sho

[PATCH] Add testcase for PR83252 (was Re: patch to fix PR80818)

2017-12-04 Thread Jakub Jelinek
On Wed, Nov 29, 2017 at 05:21:22PM -0500, Vladimir Makarov wrote:
>   The following patch fixes
> 
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80818
> 
>   The patch was successfully tested and bootstrapped on x86_64. The patch
> has no test because it is hard to check the problem.  I checked manually

This changed fixed PR83252 which has a reasonably small testcase.
I've further reduced it using creduce (with -O0 -W{,maybe-}uninitialized
and/or -fsanitize=undefined checking, plus test that it succeeds with
r255258 and fails with r255257).  Can you please double check if the
testcase represents the same issue you were working on or if your change
merely made the bug latent again?

2017-12-04  Jakub Jelinek  

PR target/83252
* gcc.target/i386/i386.exp (check_effective_target_bmi2): Moved to ...
* lib/target-supports.exp (check_effective_target_bmi2): ... here.  
Guard with
i?86-*-* x86_64-*-*.
* g++.dg/opt/pr83252.C: New test.

--- gcc/testsuite/gcc.target/i386/i386.exp.jj   2017-11-24 08:58:04.0 
+0100
+++ gcc/testsuite/gcc.target/i386/i386.exp  2017-12-04 11:42:49.215754513 
+0100
@@ -207,17 +207,6 @@ proc check_effective_target_bmi { } {
 } "-mbmi" ]
 }
 
-# Return 1 if bmi2 instructions can be compiled.
-proc check_effective_target_bmi2 { } {
-return [check_no_compiler_messages bmi2 object {
-   unsigned int
-   _bzhi_u32 (unsigned int __X, unsigned int __Y)
-   {
-   return __builtin_ia32_bzhi_si (__X, __Y);
-   }
-} "-mbmi2" ]
-}
-
 # Return 1 if ADX instructions can be compiled.
 proc check_effective_target_adx { } {
 return [check_no_compiler_messages adx object {
--- gcc/testsuite/lib/target-supports.exp.jj2017-11-28 12:11:35.0 
+0100
+++ gcc/testsuite/lib/target-supports.exp   2017-12-04 11:34:55.080598710 
+0100
@@ -1852,6 +1852,20 @@ proc check_effective_target_avx512f_runt
 return 0
 }
 
+# Return 1 if bmi2 instructions can be compiled.
+proc check_effective_target_bmi2 { } {
+if { !([istarget i?86-*-*] || [istarget x86_64-*-*]) } {
+return 0
+}
+return [check_no_compiler_messages bmi2 object {
+   unsigned int
+   _bzhi_u32 (unsigned int __X, unsigned int __Y)
+   {
+   return __builtin_ia32_bzhi_si (__X, __Y);
+   }
+} "-mbmi2" ]
+}
+
 # Return 1 if the target supports executing MIPS Paired-Single instructions,
 # 0 otherwise.  Cache the result.
 
--- gcc/testsuite/g++.dg/opt/pr83252.C.jj   2017-12-04 11:31:29.945191880 
+0100
+++ gcc/testsuite/g++.dg/opt/pr83252.C  2017-12-04 11:41:33.041686570 +0100
@@ -0,0 +1,92 @@
+// PR target/83252
+// { dg-do run }
+// { dg-options "-O3" }
+// { dg-additional-options "-mbmi2 -mtune=intel" { target bmi2 } }
+
+#if __SIZEOF_INT__ == 4 && __SIZEOF_LONG_LONG__ == 8 && __CHAR_BIT__ == 8
+
+#ifdef __BMI2__
+#include "../../gcc.target/i386/bmi2-check.h"
+#endif
+
+long long h = 707493562598231894LL, i, n, x3, x5;
+long long j, l = -2228108721620697360LL, o, y9;
+int k, p, r, s, t = 2, u, w, z8, x7, y4, y5, y6, y7, y8, x1, x2, x4, x6, d;
+unsigned v, x = 751359462, z = 1, y3 = 60;
+unsigned *y = &x, *z2 = &z, *z3 = &v;
+unsigned long long z1 = 2;
+unsigned long long *z4 = &z1;
+long long *z7;
+unsigned long long z9 = 7091529791657LL;
+
+void
+foo ()
+{
+  if ((-2783342978U * (int) l || z) && z2 && h && z1 && z9 & ~-(-2783342978U * 
(int) l))
+{
+  i = 3060393125LL < n;
+  y7 = o >> *y - 751359400;
+  *z3 = x7;
+  long a = (o >> *y - 751359400 >> ~-(-2783342978U * (int) l) - 88480234) 
- (-2783342978U * (int) l);
+  y6 = a;
+  if (~0 % *z4 % 5)
+   y8 = -3 * (l - 4 ? : 407228174574);
+  if (y3 < 1)
+   {
+ long long *b = &y9;
+ z3 = 0;
+ int c = *z2;
+ *z7 = 0;
+ x1 = ~(-((unsigned) (-2783342978U * (unsigned long long) l)));
+ p = *b & j;
+ x2 = c;
+   }
+  else
+   {
+ j = 0;
+ int e = !0 % (7 % *z4);
+ r = ((s || !k) && t) - -(-2783342978U * (unsigned long long) l);
+ x3 = o >> *y - 751359400;
+ y9 = z9;
+ long f = o >> *y - 751359400 >> ~-(-2783342978U * (int) l) - 88480234;
+ x4 = z1;
+ u = n * f * e * y4;
+   }
+  if (8ULL * -(-(-2783342978U * (int) l)))
+   ;
+  else
+   {
+ *z3 = 0;
+ int g = 3 & y5;
+ x5 = (unsigned) (~o + 9223372036854775807 >> l);
+ z8 = g + y9;
+ v = j || ~0 + 9223372036854775807 >> ~-(-2783342978U * (int) l);
+ x6 = o >> (8 * l);
+ w = *y ? -2783342978U * l : 0;
+   }
+}
+}
+
+#ifdef __BMI2__
+void
+bmi2_test (void)
+{
+  foo ();
+  if (r != 88480289)
+__builtin_abort ();
+}
+#else
+int
+main ()
+{
+  foo ();
+  if (r != 88480289)
+__builtin_abort ();
+}
+#endif
+#else
+int
+main ()
+{
+}
+#endif


Jakub


Re: [RFA][PATCH][tree-optimization/78496] 01/03 Do not lose range information from earlier VRP passes

2017-12-04 Thread Richard Biener
On Mon, Dec 4, 2017 at 6:55 AM, Jeff Law  wrote:
> As we touched on in IRC, the EVRP analyzer was doing something stupid
> which caused it to not merge in existing range information under certain
> circumstances.
>
> Consider this fragment:
>
>   x_1 = foo ()
>   if (x_1 > 2)
> __builtin_unreachable ();
>   if (x_1 < 0)
> __builtin_unreachable ();
>
> Obviously the range for x_1 is [0,2] and we compute that range in the
> EVRP optimization pass as well as VRP.
>
> If a pass (say VRP) were to delete the __builtin_unreachable calls we'll
> be left with:
>
>
>   x_1 = foo ()
>
> Any subsequent EVRP analysis won't be able to generate range information
> for that statement -- ie, it looks like VR_VARYING.  Due to a dumb bug
> in the EVRP analysis we allowed that VR_VARYING to override any range
> that had been computed by an earlier VRP or EVRP pass.

Doh - probably not noticed because EVRP was only run "first" ...

>
> Fixing is trivial.  Always call update_value_range, even if the
> currently discovered range is VR_VARYING.
>
> Bootstrapped and regression tested, both in isolation and as part of
> this 3 part kit.
>
> OK for the trunk?

Ok.

Thanks,
Richard.

> Jeff
>
> * gimple-ssa-evrp-analyze.c
> (evrp_range_analyzer::extract_range_from_stmt):  Always use
> vr_values::update_value_range so preexisting range info is
> medged with new range info, even if the new range is VR_VARYING.
>
> diff --git a/gcc/gimple-ssa-evrp-analyze.c b/gcc/gimple-ssa-evrp-analyze.c
> index 551b1d6..fb3d329 100644
> --- a/gcc/gimple-ssa-evrp-analyze.c
> +++ b/gcc/gimple-ssa-evrp-analyze.c
> @@ -271,8 +271,7 @@ evrp_range_analyzer::record_ranges_from_stmt (gimple 
> *stmt)
>edge taken_edge;
>value_range vr = VR_INITIALIZER;
>vr_values->extract_range_from_stmt (stmt, &taken_edge, &output, &vr);
> -  if (output
> - && (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE))
> +  if (output)
> {
>   vr_values->update_value_range (output, &vr);
>
>


Re: [PATCH v2 3/4] [SPARC] Errata workaround for GRLIB-TN-0010

2017-12-04 Thread Eric Botcazou
> @@ -1024,6 +1040,31 @@ sparc_do_work_around_errata (void)
>   emit_insn_before (gen_nop (), target);
>   }
> 
> +  /* Insert a NOP between load instruction and atomic
> +  instruction.  Insert a NOP at branch target if load
> +  in delay slot and atomic instruction at branch target.  */
> +  if (sparc_fix_ut700
> +   && NONJUMP_INSN_P (insn)
> +   && (set = single_set (insn)) != NULL_RTX
> +   && MEM_P (SET_SRC (set))
> +   && REG_P (SET_DEST (set)))

Any particular reason to use MEM_P instead of mem_ref here?  It looks like 
there is also a case for the b2bst workaround (only loads are concerned).

If no, I'll make the changes along with more cosmetic fixes.

-- 
Eric Botcazou


Re: [RFC][PATCH] Extend DCE to remove unnecessary new/delete-pairs

2017-12-04 Thread Trevor Saunders
On Wed, Nov 29, 2017 at 08:56:44AM -0700, Martin Sebor wrote:
> On 11/29/2017 01:30 AM, Jakub Jelinek wrote:
> > On Tue, Nov 28, 2017 at 09:11:00PM -0700, Martin Sebor wrote:
> > > On 11/27/2017 02:22 AM, Dominik Inführ wrote:
> > > > Thanks for all the reviews! I’ve revised the patch, the 
> > > > operator_delete_flag is now stored in tree_decl_with_vis (there already 
> > > > seem to be some FUNCTION_DECL-flags in there). I’ve also added the 
> > > > option -fallocation-dce to disable this optimization. It bootstraps and 
> > > > no regressions on aarch64 and x86_64.
> > > > 
> > > It's great to be able to eliminate pairs of these calls.  For
> > > unpaired calls, though, I think it would be even more useful to
> > > also issue a warning.  Otherwise the elimination will mask bugs
> > 
> > ??  I hope you're only talking about allocation where the returned
> > pointer can't leak elsewhere, doing allocation in one function
> > (e.g. constructor, or whatever other function) and deallocation in some
> > other one is so common such a warning would be not just useless, but
> > harmful with almost all occurrences being false positives.
> > 
> > Warning on malloc/standard operator new or malloc/realloc-like function
> > when the return pointer can't escape the current function is reasonable.
> 
> Yes, warn for leaks, or for calls to delete/free with no matching
> new/malloc (when they can be detected).
> 
> From the test case included in the patch, warn on the first two
> of the following three functions:
> 
> +++ b/gcc/testsuite/g++.dg/cpp1y/new1.C
> @@ -0,0 +1,65 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-cddce-details" } */
> +
> +#include 
> +
> +void
> +new_without_use() {
> +  int *x = new int;
> +}
> +
> +void
> +new_array_without_use() {
> +  int *x = new int[5];
> +}
> +
> +void
> +new_primitive() {
> +  int *x = new int;
> +  delete x;
> +}
> 
> An obvious extension to such a checker would then be to also detect
> possible invalid deallocations, as in:
> 
>   void f (unsigned n)
>   {
> void *p = n < 256 ? alloca (n) : malloc (n);
> // ...
> free (p);
>   }
> 
> David Malcolm was working on something like that earlier this year
> so he might have some thoughts on this as well.

I also sent https://gcc.gnu.org/ml/gcc-patches/2017-05/msg00491.html a
while back, sorry I haven't gotten to  improving it yet though its gcc 9
stuff at this point I guess.

thanks

Trev

> 
> Martin


Re: [PATCH v2 3/4] [SPARC] Errata workaround for GRLIB-TN-0010

2017-12-04 Thread Daniel Cederman

Any particular reason to use MEM_P instead of mem_ref here?  It looks like
there is also a case for the b2bst workaround (only loads are concerned).


No, there was no particular reason. mem_ref seems like a better choice 
if it detects more types of loads.


/Daniel C



Re: [PATCH] final: Improve output for -dp and -fverbose-asm

2017-12-04 Thread Michael Matz
Hi,

On Thu, 30 Nov 2017, Martin Sebor wrote:

> adddi3_imm_carry_m1 seems (mostly) self-explanatory since it's built up 
> of common assembly mnemonics.  I confess I don't know what the number 
> after # means, either on powerpc64 or on any other target.

insn uid.

> > Or, for that matter, what "length" means?  Could be byte-length, sure. 
> > But OTOH, for a RISC target it's always four, so why print it?  The 
> > GCC developers surely meant cycle-length with that, nothing else makes 
> > sense.
> 
> Heh.  I thought it meant the length of the instruction in bytes, and it 
> made perfect sense to me.

It _does_.  My point was to show you that even lengthy (ahem) 
non-abbreviations are open to interpretation, and that it's not the number 
of characters that make the difference, but knowledge.  And perhaps, 
missing the latter, documentation.

> The basic point I'm making is that shortening length=NN to l=NN
> is not an improvement to the readability of the output

And I disagree.  It _is_ improving readability IMHO.  Basically the more 
often you need to mention token X, the shorter it can and should be.  If 
you mention something every line, it should be as short as possible, 
which is one character, and to give the eye some hold while scanning the 
line some visual punctuation like '=' should be added as well.

> as those consisting of multiple letters.  Does l stand for load,
> length, latency, or something else?

As context matters I think you're making up a problem that doesn't exist.

> This seems fairly elementary to me and I would have expected
> it to be non-controversial so I'm not sure why it's being
> challenged.  Don't we want the output to be generally useful?

Define "generally useful" in the context of -fverbose-asm.  I think it is 
already, and Seghers patch improves on this.

> What do we gain by adopting these terse abbreviations?

That OTOH seems obvious to me: lines that don't exceed terminal width.  
There's nothing more disturbing than line breaks in line-oriented formats 
like assembler.


Ciao,
Michael.


Avoid -Werror=format-overflow error in dbxout.c (dbxout_block) on Solaris/SPARC

2017-12-04 Thread Rainer Orth
Within test last week, 64-bit Solaris/SPARC bootstrap began to fail:

/vol/gcc/src/hg/trunk/local/gcc/dbxout.c: In function 'bool dbxout_block(tree, 
int, tree, int)':
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: error: '%lu' directive writing 
between 1 and 20 bytes into a region of size 14 [-Werror=format-overflow=]
 dbxout_block (tree block, int depth, tree args, int parent_blocknum)
 ^~~~
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: note: directive argument in 
the range [0, 18446744073709551614]
In file included from ./tm.h:26,
 from /vol/gcc/src/hg/trunk/local/gcc/target.h:52,
 from /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:72:
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sol2.h:353:11: note: 
'std::sprintf' output between 8 and 27 bytes into a destination of size 20
   sprintf ((LABEL), "*.L%s%lu", (PREFIX), (unsigned long)(NUM))
   ^
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3855:5: note: in expansion of macro 
'ASM_GENERATE_INTERNAL_LABEL'
 ASM_GENERATE_INTERNAL_LABEL (buf, "LBB", parent_blocknum);
 ^~~

The line numbers are extremely confusing, to say the least, though: the
one in the error and the first note refer to the begin of the function
definition and only the third note refers to the line of the actual
error.

Fixed as follows, which allowed sparcv9-sun-solaris2.11 bootstrap to
finish and passed regtest on sparc-sun-solaris2.11.

Ok for mainline?

Rainer

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


2017-12-04  Rainer Orth  

* dbxout.c (dbxout_block): Grow buf to 30 bytes.

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -3844,7 +3844,7 @@ dbxout_block (tree block, int depth, tre
 	  /* If we emitted any vars and didn't output any LBRAC/RBRAC,
 		 either at this level or any lower level, we need to emit
 		 an empty LBRAC/RBRAC pair now.  */
-	  char buf[20];
+	  char buf[30];
 	  const char *scope_start;
 
 	  ret = true;


[PATCH][gimple-interchange] Add reduction validity check

2017-12-04 Thread Richard Biener

I've noticed we perform FP reduction association without the required
checks for associative math.  I've added 
gcc.dg/tree-ssa/loop-interchange-1b.c to cover this.

I also noticed we happily interchange a loop with a reduction like

 sum = a[i] - sum;

where a change in order of elements isn't ok.  Unfortunately bwaves
is exactly a case where single_use != next_def (tried to simply remove
that case for now), because reassoc didn't have a chance to fix the
operand order.  Thus this patch exports the relevant handling from
the vectorizer (for stage1 having a separate infrastructure gathering /
analyzing of reduction/induction infrastructure would be nice...)
and uses it from interchange.  We then don't handle
gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer 
missed-opt is PR65930).  I didn't bother to split up the vectorizer
code further to implement relaxed validity checking but simply XFAILed
this testcase.

Earlier I simplified allocation stuff in the main loop which is why
this part is included in this patch.

Bootstrap running on x86_64-unknown-linux-gnu.

I'll see to craft a testcase with the sum = a[i] - sum; mis-handling.

Ok?

Thanks,
Richard.

2017-12-04  Richard Biener  

* tree-vectorizer.h (check_reduction_path): Declare.
* tree-vect-loop.c (check_reduction_path): New function, split out
from ...
(vect_is_simple_reduction): ... here.
* gimple-loop-interchange.cc: Include tree-vectorizer.h.
(loop_cand::analyze_iloop_reduction_var): Use single_imm_use.
Properly check for a supported reduction operation and a
valid expression if the reduction covers multiple stmts.
(prepare_perfect_loop_nest): Simpify allocation.
(pass_linterchange::execute): Likewise.

* gcc.dg/tree-ssa/loop-interchange-1.c: Add fast-math flags.
* gcc.dg/tree-ssa/loop-interchange-1b.c: New test variant.
* gcc.dg/tree-ssa/loop-interchange-4.c: XFAIL.


Index: gcc/gimple-loop-interchange.cc
===
--- gcc/gimple-loop-interchange.cc  (revision 255375)
+++ gcc/gimple-loop-interchange.cc  (working copy)
@@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
 #include "tree-ssa-loop-ivopts.h"
 #include "tree-ssa-dce.h"
 #include "tree-data-ref.h"
+#include "tree-vectorizer.h"
 
 /* This pass performs loop interchange: for example, the loop nest
 
@@ -551,23 +552,29 @@ loop_cand::analyze_iloop_reduction_var (
  in a way that reduction operation is seen as black box.  In general,
  we can ignore reassociation of reduction operator; we can handle fake
  reductions in which VAR is not even used to compute NEXT.  */
-  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
-{
-  stmt = USE_STMT (use_p);
-  if (is_gimple_debug (stmt))
-   continue;
-
-  if (!flow_bb_inside_loop_p (m_loop, gimple_bb (stmt)))
-   return false;
-
-  if (single_use != NULL)
-   return false;
+  if (! single_imm_use (var, &use_p, &single_use)
+  || ! flow_bb_inside_loop_p (m_loop, gimple_bb (single_use)))
+return false;
 
-  single_use = stmt;
-}
+  /* Check the reduction operation.  We require a commutative or
+ left-associative operation.  For FP math we also need to be allowed
+ to associate operations.  */
+  if (! is_gimple_assign (single_use)
+  || ! (commutative_tree_code (gimple_assign_rhs_code (single_use))
+   || (commutative_ternary_tree_code
+ (gimple_assign_rhs_code (single_use))
+   && (use_p->use == gimple_assign_rhs1_ptr (single_use)
+   || use_p->use == gimple_assign_rhs2_ptr (single_use)))
+   || (gimple_assign_rhs_code (single_use) == MINUS_EXPR
+   && use_p->use == gimple_assign_rhs1_ptr (single_use)))
+  || (FLOAT_TYPE_P (TREE_TYPE (var))
+ && ! flag_associative_math))
+return false;
 
+  /* Handle and verify a series of stmts feeding the reduction op.  */
   if (single_use != next_def
-  && !stmt_dominates_stmt_p (single_use, next_def))
+  && !check_reduction_path (UNKNOWN_LOCATION, m_loop, phi, next,
+   gimple_assign_rhs_code (single_use)))
 return false;
 
   /* Only support cases in which INIT is used in inner loop.  */
@@ -1964,7 +1971,7 @@ prepare_perfect_loop_nest (struct loop *
   vec *datarefs, vec *ddrs)
 {
   struct loop *start_loop = NULL, *innermost = loop;
-  struct loop *outermost = superloop_at_depth (loop, 0);
+  struct loop *outermost = loops_for_fn (cfun)->tree_root;
 
   /* Find loop nest from the innermost loop.  The outermost is the innermost
  outer*/
@@ -1985,21 +1992,26 @@ prepare_perfect_loop_nest (struct loop *
   if (!start_loop || !start_loop->inner)
 return false;
 
+  /* Prepare the data reference vector for the loop nest, pruning outer
+ loops we cannot handle.  */
   datarefs->create (20);
-  if ((s

[PATCH] Fix PR83255

2017-12-04 Thread Richard Biener

Been sitting in my tree (fixes SPEC 2k6 miscompares).

Bootstrapped and tested on x86-64-unknown-linux-gnu, applied.

Richard.

2017-12-04  Richard Biener  

PR tree-optimization/83255
* graphite-isl-ast-to-gimple.c (translate_isl_ast_node_for):
Re-add zero-iteration check.

* gcc.dg/graphite/pr83255.c: New testcase.

Index: gcc/graphite-isl-ast-to-gimple.c
===
--- gcc/graphite-isl-ast-to-gimple.c(revision 255375)
+++ gcc/graphite-isl-ast-to-gimple.c(working copy)
@@ -720,6 +720,32 @@ translate_isl_ast_node_for (loop_p conte
 ub = integer_zero_node;
 
   edge last_e = single_succ_edge (split_edge (next_e));
+
+  /* Compensate for the fact that we emit a do { } while loop from
+ a for ISL AST.
+ ???  We often miss constraints on niter because the SESE region
+ doesn't cover loop header copies.  Ideally we'd add constraints
+ for all relevant dominating conditions.  */
+  if (TREE_CODE (lb) == INTEGER_CST && TREE_CODE (ub) == INTEGER_CST
+  && tree_int_cst_compare (lb, ub) <= 0)
+;
+  else
+{
+  tree one = build_one_cst (POINTER_TYPE_P (type) ? sizetype : type);
+  /* Adding +1 and using LT_EXPR helps with loop latches that have a
+loop iteration count of "PARAMETER - 1".  For PARAMETER == 0 this
+becomes 2^k-1 due to integer overflow, and the condition lb <= ub
+is true, even if we do not want this.  However lb < ub + 1 is false,
+as expected.  */
+  tree ub_one = fold_build2 (POINTER_TYPE_P (type)
+? POINTER_PLUS_EXPR : PLUS_EXPR,
+type, ub, one);
+  create_empty_if_region_on_edge (next_e,
+ fold_build2 (LT_EXPR, boolean_type_node,
+  lb, ub_one));
+  next_e = get_true_edge_from_guard_bb (next_e->dest);
+}
+
   translate_isl_ast_for_loop (context_loop, node, next_e,
  type, lb, ub, ip);
   return last_e;
Index: gcc/testsuite/gcc.dg/graphite/pr83255.c
===
--- gcc/testsuite/gcc.dg/graphite/pr83255.c (nonexistent)
+++ gcc/testsuite/gcc.dg/graphite/pr83255.c (working copy)
@@ -0,0 +1,28 @@
+/* { dg-do run } */
+/* { dg-options "-O -floop-nest-optimize -fdump-tree-graphite-details" } */
+
+int rx, in;
+
+int
+main (void)
+{
+  const int tj = 3;
+  int as[tj];
+  static int l4;
+
+  while (l4 < 1)
+{
+  for (rx = 0; rx < tj; ++rx)
+   {
+ for (in = 0; in < tj; ++in)
+   as[in] = 1;
+ as[rx] = 0;
+   }
+  ++l4;
+}
+
+  if (as[tj - 1] != 0)
+__builtin_abort ();
+}
+
+/* { dg-final { scan-tree-dump "loop nest optimized" "graphite" } } */


Re: [PATCH][gimple-interchange] Add reduction validity check

2017-12-04 Thread Bin.Cheng
On Mon, Dec 4, 2017 at 1:11 PM, Richard Biener  wrote:
>
> I've noticed we perform FP reduction association without the required
> checks for associative math.  I've added
> gcc.dg/tree-ssa/loop-interchange-1b.c to cover this.
>
> I also noticed we happily interchange a loop with a reduction like
>
>  sum = a[i] - sum;
>
> where a change in order of elements isn't ok.  Unfortunately bwaves
> is exactly a case where single_use != next_def (tried to simply remove
> that case for now), because reassoc didn't have a chance to fix the
> operand order.  Thus this patch exports the relevant handling from
> the vectorizer (for stage1 having a separate infrastructure gathering /
> analyzing of reduction/induction infrastructure would be nice...)
> and uses it from interchange.  We then don't handle
> gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer
> missed-opt is PR65930).  I didn't bother to split up the vectorizer
> code further to implement relaxed validity checking but simply XFAILed
> this testcase.
>
> Earlier I simplified allocation stuff in the main loop which is why
> this part is included in this patch.
>
> Bootstrap running on x86_64-unknown-linux-gnu.
>
> I'll see to craft a testcase with the sum = a[i] - sum; mis-handling.
>
> Ok?
Sure.
Just for the record.  There is also similar associative check in
predcom.  As you suggested, a path extraction/checking interface for
associative checking would be great, given we have multiple users now.

Thanks,
bin
>
> Thanks,
> Richard.
>
> 2017-12-04  Richard Biener  
>
> * tree-vectorizer.h (check_reduction_path): Declare.
> * tree-vect-loop.c (check_reduction_path): New function, split out
> from ...
> (vect_is_simple_reduction): ... here.
> * gimple-loop-interchange.cc: Include tree-vectorizer.h.
> (loop_cand::analyze_iloop_reduction_var): Use single_imm_use.
> Properly check for a supported reduction operation and a
> valid expression if the reduction covers multiple stmts.
> (prepare_perfect_loop_nest): Simpify allocation.
> (pass_linterchange::execute): Likewise.
>
> * gcc.dg/tree-ssa/loop-interchange-1.c: Add fast-math flags.
> * gcc.dg/tree-ssa/loop-interchange-1b.c: New test variant.
> * gcc.dg/tree-ssa/loop-interchange-4.c: XFAIL.
>
>
> Index: gcc/gimple-loop-interchange.cc
> ===
> --- gcc/gimple-loop-interchange.cc  (revision 255375)
> +++ gcc/gimple-loop-interchange.cc  (working copy)
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-ssa-loop-ivopts.h"
>  #include "tree-ssa-dce.h"
>  #include "tree-data-ref.h"
> +#include "tree-vectorizer.h"
>
>  /* This pass performs loop interchange: for example, the loop nest
>
> @@ -551,23 +552,29 @@ loop_cand::analyze_iloop_reduction_var (
>   in a way that reduction operation is seen as black box.  In general,
>   we can ignore reassociation of reduction operator; we can handle fake
>   reductions in which VAR is not even used to compute NEXT.  */
> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
> -{
> -  stmt = USE_STMT (use_p);
> -  if (is_gimple_debug (stmt))
> -   continue;
> -
> -  if (!flow_bb_inside_loop_p (m_loop, gimple_bb (stmt)))
> -   return false;
> -
> -  if (single_use != NULL)
> -   return false;
> +  if (! single_imm_use (var, &use_p, &single_use)
> +  || ! flow_bb_inside_loop_p (m_loop, gimple_bb (single_use)))
> +return false;
>
> -  single_use = stmt;
> -}
> +  /* Check the reduction operation.  We require a commutative or
> + left-associative operation.  For FP math we also need to be allowed
> + to associate operations.  */
> +  if (! is_gimple_assign (single_use)
> +  || ! (commutative_tree_code (gimple_assign_rhs_code (single_use))
> +   || (commutative_ternary_tree_code
> + (gimple_assign_rhs_code (single_use))
> +   && (use_p->use == gimple_assign_rhs1_ptr (single_use)
> +   || use_p->use == gimple_assign_rhs2_ptr (single_use)))
> +   || (gimple_assign_rhs_code (single_use) == MINUS_EXPR
> +   && use_p->use == gimple_assign_rhs1_ptr (single_use)))
> +  || (FLOAT_TYPE_P (TREE_TYPE (var))
> + && ! flag_associative_math))
> +return false;
>
> +  /* Handle and verify a series of stmts feeding the reduction op.  */
>if (single_use != next_def
> -  && !stmt_dominates_stmt_p (single_use, next_def))
> +  && !check_reduction_path (UNKNOWN_LOCATION, m_loop, phi, next,
> +   gimple_assign_rhs_code (single_use)))
>  return false;
>
>/* Only support cases in which INIT is used in inner loop.  */
> @@ -1964,7 +1971,7 @@ prepare_perfect_loop_nest (struct loop *
>vec *datarefs, vec *ddrs)
>  {
>struct loop *start_loop = NULL, *innermost = loop;
> -  st

Re: [PATCH][gimple-interchange] Add reduction validity check

2017-12-04 Thread Richard Biener
On Mon, 4 Dec 2017, Bin.Cheng wrote:

> On Mon, Dec 4, 2017 at 1:11 PM, Richard Biener  wrote:
> >
> > I've noticed we perform FP reduction association without the required
> > checks for associative math.  I've added
> > gcc.dg/tree-ssa/loop-interchange-1b.c to cover this.
> >
> > I also noticed we happily interchange a loop with a reduction like
> >
> >  sum = a[i] - sum;
> >
> > where a change in order of elements isn't ok.  Unfortunately bwaves
> > is exactly a case where single_use != next_def (tried to simply remove
> > that case for now), because reassoc didn't have a chance to fix the
> > operand order.  Thus this patch exports the relevant handling from
> > the vectorizer (for stage1 having a separate infrastructure gathering /
> > analyzing of reduction/induction infrastructure would be nice...)
> > and uses it from interchange.  We then don't handle
> > gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer
> > missed-opt is PR65930).  I didn't bother to split up the vectorizer
> > code further to implement relaxed validity checking but simply XFAILed
> > this testcase.
> >
> > Earlier I simplified allocation stuff in the main loop which is why
> > this part is included in this patch.
> >
> > Bootstrap running on x86_64-unknown-linux-gnu.
> >
> > I'll see to craft a testcase with the sum = a[i] - sum; mis-handling.
> >
> > Ok?
> Sure.
> Just for the record.  There is also similar associative check in
> predcom.  As you suggested, a path extraction/checking interface for
> associative checking would be great, given we have multiple users now.

Yeah.  Note for interchange we don't need associativeness in the sense
as implemented by associative_tree_code, we need left-associativeness
which also MINUS_EXPR provides.  So my check isn't perfect.  I guess
we instead need

 associative_tree_code ()
 || (code == MINUS_EXPR
 && use_p->use == gimple_assign_rhs1_ptr (single_use))

where we could also allow RDIV_EXPR and other left-associative
tree codes (but check_reduction_path would do the wrong thing
with those at the moment but it has MINUS_EXPR handling that
would support sum = x - (y - sum) which the above rejects).

So I'm retesting with

  /* Check the reduction operation.  We require a left-associative 
operation.  
 For FP math we also need to be allowed to associate operations.  */
  enum tree_code code = gimple_assign_rhs_code (single_use);
  if (gassign *ass = dyn_cast  (single_use))
{
  enum tree_code code = gimple_assign_rhs_code (ass);
  if (! (associative_tree_code (code)
 || (code == MINUS_EXPR
 && use_p->use == gimple_assign_rhs1_ptr (ass)))
  || (FLOAT_TYPE_P (TREE_TYPE (var))
  && ! flag_associative_math))
return false;
}
  else
return false;

which is more restrictive than before.

Richard.


Re: [PATCH][gimple-interchange] Add reduction validity check

2017-12-04 Thread Bin.Cheng
On Mon, Dec 4, 2017 at 1:11 PM, Richard Biener  wrote:
>
> I've noticed we perform FP reduction association without the required
> checks for associative math.  I've added
> gcc.dg/tree-ssa/loop-interchange-1b.c to cover this.
>
> I also noticed we happily interchange a loop with a reduction like
>
>  sum = a[i] - sum;
>
> where a change in order of elements isn't ok.  Unfortunately bwaves
> is exactly a case where single_use != next_def (tried to simply remove
> that case for now), because reassoc didn't have a chance to fix the
> operand order.  Thus this patch exports the relevant handling from
> the vectorizer (for stage1 having a separate infrastructure gathering /
> analyzing of reduction/induction infrastructure would be nice...)
> and uses it from interchange.  We then don't handle
> gcc.dg/tree-ssa/loop-interchange-4.c anymore (similar vectorizer
> missed-opt is PR65930).  I didn't bother to split up the vectorizer
> code further to implement relaxed validity checking but simply XFAILed
> this testcase.
>
> Earlier I simplified allocation stuff in the main loop which is why
> this part is included in this patch.
>
> Bootstrap running on x86_64-unknown-linux-gnu.
>
> I'll see to craft a testcase with the sum = a[i] - sum; mis-handling.
>
> Ok?
>
> Thanks,
> Richard.
>
> 2017-12-04  Richard Biener  
>
> * tree-vectorizer.h (check_reduction_path): Declare.
> * tree-vect-loop.c (check_reduction_path): New function, split out
> from ...
> (vect_is_simple_reduction): ... here.
> * gimple-loop-interchange.cc: Include tree-vectorizer.h.
> (loop_cand::analyze_iloop_reduction_var): Use single_imm_use.
> Properly check for a supported reduction operation and a
> valid expression if the reduction covers multiple stmts.
> (prepare_perfect_loop_nest): Simpify allocation.
> (pass_linterchange::execute): Likewise.
>
> * gcc.dg/tree-ssa/loop-interchange-1.c: Add fast-math flags.
> * gcc.dg/tree-ssa/loop-interchange-1b.c: New test variant.
> * gcc.dg/tree-ssa/loop-interchange-4.c: XFAIL.
>
>
> Index: gcc/gimple-loop-interchange.cc
> ===
> --- gcc/gimple-loop-interchange.cc  (revision 255375)
> +++ gcc/gimple-loop-interchange.cc  (working copy)
> @@ -41,6 +41,7 @@ along with GCC; see the file COPYING3.
>  #include "tree-ssa-loop-ivopts.h"
>  #include "tree-ssa-dce.h"
>  #include "tree-data-ref.h"
> +#include "tree-vectorizer.h"
>
>  /* This pass performs loop interchange: for example, the loop nest
>
> @@ -551,23 +552,29 @@ loop_cand::analyze_iloop_reduction_var (
>   in a way that reduction operation is seen as black box.  In general,
>   we can ignore reassociation of reduction operator; we can handle fake
>   reductions in which VAR is not even used to compute NEXT.  */
> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
> -{
> -  stmt = USE_STMT (use_p);
> -  if (is_gimple_debug (stmt))
> -   continue;
> -
> -  if (!flow_bb_inside_loop_p (m_loop, gimple_bb (stmt)))
> -   return false;
> -
> -  if (single_use != NULL)
> -   return false;
> +  if (! single_imm_use (var, &use_p, &single_use)
> +  || ! flow_bb_inside_loop_p (m_loop, gimple_bb (single_use)))
> +return false;
>
> -  single_use = stmt;
> -}
> +  /* Check the reduction operation.  We require a commutative or
> + left-associative operation.  For FP math we also need to be allowed
> + to associate operations.  */
> +  if (! is_gimple_assign (single_use)
> +  || ! (commutative_tree_code (gimple_assign_rhs_code (single_use))
> +   || (commutative_ternary_tree_code
> + (gimple_assign_rhs_code (single_use))
> +   && (use_p->use == gimple_assign_rhs1_ptr (single_use)
> +   || use_p->use == gimple_assign_rhs2_ptr (single_use)))
> +   || (gimple_assign_rhs_code (single_use) == MINUS_EXPR
> +   && use_p->use == gimple_assign_rhs1_ptr (single_use)))
> +  || (FLOAT_TYPE_P (TREE_TYPE (var))
> + && ! flag_associative_math))
> +return false;
>
> +  /* Handle and verify a series of stmts feeding the reduction op.  */
>if (single_use != next_def
> -  && !stmt_dominates_stmt_p (single_use, next_def))
> +  && !check_reduction_path (UNKNOWN_LOCATION, m_loop, phi, next,
> +   gimple_assign_rhs_code (single_use)))
>  return false;
>
>/* Only support cases in which INIT is used in inner loop.  */
> @@ -1964,7 +1971,7 @@ prepare_perfect_loop_nest (struct loop *
>vec *datarefs, vec *ddrs)
>  {
>struct loop *start_loop = NULL, *innermost = loop;
> -  struct loop *outermost = superloop_at_depth (loop, 0);
> +  struct loop *outermost = loops_for_fn (cfun)->tree_root;
>
>/* Find loop nest from the innermost loop.  The outermost is the innermost
>   outer*/
> @@ -1985,

Re: [RFA][PATCH][tree-optimization/78496] 01/03 Do not lose range information from earlier VRP passes

2017-12-04 Thread Jeff Law
On 12/04/2017 01:11 AM, Jakub Jelinek wrote:
> On Sun, Dec 03, 2017 at 10:55:27PM -0700, Jeff Law wrote:
>> As we touched on in IRC, the EVRP analyzer was doing something stupid
>> which caused it to not merge in existing range information under certain
>> circumstances.
>>
>> Consider this fragment:
>>
>>   x_1 = foo ()
>>   if (x_1 > 2)
>> __builtin_unreachable ();
>>   if (x_1 < 0)
>> __builtin_unreachable ();
> 
> Note that for say:
>   x_1 = foo ();
>   bar (x_1);
>   if (x_1 > 2)
> __builtin_unreachable ();
>   if (x_1 < 0)
> __builtin_unreachable ();
>   ...
>   further uses of x_1
> we can't do that anymore (at least, can't remember it in
> SSA_NAME_RANGE_INFO), as bar could not return/could loop etc.
Right.  Anything reflected into SSA_NAME_RANGE_INFO has to be globally
true.  With the call to bar the transformation can't safely be applied.

  Ditto with
> any other code in between foo and the unreachable asserts if it doesn't
> guarantee that we'll always reach the comparisons after the x_1 setter.
> Even
>   x_1 = foo ();
>   bar ();
>   if (x_1 > 2)
> __builtin_unreachable ();
>   if (x_1 < 0)
> __builtin_unreachable ();
> looks unclean, if bar doesn't return, then we'd just need to hope we don't
> add further uses of x_1 in between foo and bar.  Some optimizations do stuff
> like that, consider foo being a pass-through function.
This one is less clear.  But I don't think we should be trying to
optimize this case anyway -- too little to be gained and too close to
doing something unexpected.

jeff


Re: [RFA][PATCH][tree-optimization/78496] 02/03 Do not exploit __builtin_unreachable if -fsanitize=unreachable

2017-12-04 Thread Jeff Law
On 12/04/2017 03:40 AM, Richard Biener wrote:
> On Mon, Dec 4, 2017 at 6:55 AM, Jeff Law  wrote:
>>
>> As I brought my patches for pr78496 up to the tip of the trunk I noticed
>> a couple testsuite regressions with -fsanitize=unreachable tests.
>>
>> The problem is VRP and EVRP analysis/optimization could exploit
>> __builtin_unreachable to narrow the range of an object, then use that
>> narrowed range to eliminate the __builtin_unreachable.  That seems
>> fundamentally wrong if we're compiling with -fsanitize=unreachable.
>>
>> So this patch changes both to not exploit __builtin_unreachable when
>> -fsanitize=unreachable.
>>
>> Bootstrapped and regression tested with all three patches in this kit.
>>
>> OK for the trunk?
> 
> Jakub already fixed this in sligthly different ways.
ACK.  Just looked at his fix.  I'd pondered doing it in one of those
support routines as well.  I'll verify his fix is sufficient with my
work.  Consider this patch dropped as I expect Jakub's patch to be
sufficient.

jeff


[PATCH][Middle-end]79538 missing -Wformat-overflow with %s and non-member array arguments

2017-12-04 Thread Qing Zhao
Hi,

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79538 

missing -Wformat-overflow with %s and non-member array arguments

-Wformat-overflow uses the routine "get_range_strlen" to decide the
maximum string length, however, currently "get_range_strlen" misses
the handling of non-member arrays.

Adding the handling of non-member array resolves the issue.
Adding test case pr79538.c into gcc.dg.

During gcc bootstrap, 2 source files (c-family/c-cppbuiltin.c,
fortran/class.c) were detected new warnings by -Wformat-overflow
due to the new handling of non-member array in "get_range_strlen".
in order to avoid these new warnings and continue with bootstrap,
updating these 2 files to avoid the warnings.

in c-family/c-cppbuiltin.c, the warning is following:

../../latest_gcc_2/gcc/c-family/c-cppbuiltin.c:1627:15: note:
‘sprintf’ output 2 or more bytes (assuming 257) into a destination
of size 256
  sprintf (buf1, "%s=%s", macro, buf2);
  ^~~~
in the above, buf1 and buf2 are declared as:
char buf1[256], buf2[256];

i.e, buf1 and buf2 have same size. adjusting the size of buf1 and
buf2 resolves the warning.

fortran/class.c has the similar issue as above. Instead of adjusting
size of the buffers, replacing sprintf with xasprintf.

bootstraped and tested on both X86 and aarch64. no regression.

Okay for trunk?

thanks.

Qing

===

gcc/ChangeLog

2017-11-30  Qing Zhao  mailto:qing.z...@oracle.com>>

  PR middle_end/79538
  * gimple-fold.c (get_range_strlen): Add the handling of non-member array.

gcc/fortran/ChangeLog

2017-11-30  Qing Zhao  mailto:qing.z...@oracle.com>>

   PR middle_end/79538
   * class.c (gfc_build_class_symbol): Replace call to
   sprintf with xasprintf to avoid format-overflow warning.
   (generate_finalization_wrapper): Likewise.
   (gfc_find_derived_vtab): Likewise.
   (find_intrinsic_vtab): Likewise.


gcc/c-family/ChangeLog

2017-11-30  Qing Zhao  mailto:qing.z...@oracle.com>>

  PR middle_end/79538 
* c-cppbuiltin.c (builtin_define_with_hex_fp_value):
  Adjust the size of buf1 and buf2, add a new buf to avoid
  format-overflow warning.

gcc/testsuite/ChangeLog

2017-11-30  Qing Zhao  mailto:qing.z...@oracle.com>>

  PR middle_end/79538
  * gcc.dg/pr79538.c: New test.

---
gcc/c-family/c-cppbuiltin.c| 10 -
gcc/fortran/class.c| 49 --
gcc/gimple-fold.c  | 13 +++
gcc/testsuite/gcc.dg/pr79538.c | 23 
4 files changed, 69 insertions(+), 26 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr79538.c

diff --git a/gcc/c-family/c-cppbuiltin.c b/gcc/c-family/c-cppbuiltin.c
index 2ac9616..9e33aed 100644
--- a/gcc/c-family/c-cppbuiltin.c
+++ b/gcc/c-family/c-cppbuiltin.c
@@ -1613,7 +1613,7 @@ builtin_define_with_hex_fp_value (const char *macro,
  const char *fp_cast)
{
  REAL_VALUE_TYPE real;
-  char dec_str[64], buf1[256], buf2[256];
+  char dec_str[64], buf[256], buf1[128], buf2[64];

  /* This is very expensive, so if possible expand them lazily.  */
  if (lazy_hex_fp_value_count < 12
@@ -1656,11 +1656,11 @@ builtin_define_with_hex_fp_value (const char *macro,

  /* Assemble the macro in the following fashion
 macro = fp_cast [dec_str fp_suffix] */
-  sprintf (buf1, "%s%s", dec_str, fp_suffix);
-  sprintf (buf2, fp_cast, buf1);
-  sprintf (buf1, "%s=%s", macro, buf2);
+  sprintf (buf2, "%s%s", dec_str, fp_suffix);
+  sprintf (buf1, fp_cast, buf2);
+  sprintf (buf, "%s=%s", macro, buf1);

-  cpp_define (parse_in, buf1);
+  cpp_define (parse_in, buf);
}

/* Return a string constant for the suffix for a value of type TYPE
diff --git a/gcc/fortran/class.c b/gcc/fortran/class.c
index ebbd41b..a08fb8d 100644
--- a/gcc/fortran/class.c
+++ b/gcc/fortran/class.c
@@ -602,7 +602,8 @@ bool
gfc_build_class_symbol (gfc_typespec *ts, symbol_attribute *attr,
gfc_array_spec **as)
{
-  char name[GFC_MAX_SYMBOL_LEN+1], tname[GFC_MAX_SYMBOL_LEN+1];
+  char tname[GFC_MAX_SYMBOL_LEN+1];
+  char *name;
  gfc_symbol *fclass;
  gfc_symbol *vtab;
  gfc_component *c;
@@ -633,17 +634,17 @@ gfc_build_class_symbol (gfc_typespec *ts, 
symbol_attribute *attr,
  rank = !(*as) || (*as)->rank == -1 ? GFC_MAX_DIMENSIONS : (*as)->rank;
  get_unique_hashed_string (tname, ts->u.derived);
  if ((*as) && attr->allocatable)
-sprintf (name, "__class_%s_%d_%da", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%da", tname, rank, (*as)->corank);
  else if ((*as) && attr->pointer)
-sprintf (name, "__class_%s_%d_%dp", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%dp", tname, rank, (*as)->corank);
  else if ((*as))
-sprintf (name, "__class_%s_%d_%dt", tname, rank, (*as)->corank);
+name = xasprintf ("__class_%s_%d_%dt", tname, 

Re: [PATCH,RFC] combine: Remove use_crosses_set_p

2017-12-04 Thread Segher Boessenkool
On Thu, Nov 30, 2017 at 09:26:37AM +, Segher Boessenkool wrote:
> This removes use_crosses_set_p, and uses modified_between_p instead
> everywhere it was used.  This improves optimisation.  I'm a little bit
> worried it might increase compile time; so far I haven't seen it do
> that though.
> 
> Longer term I would like to get rid of reg_stat completely, perhaps
> use dataflow everywhere.
> 
> Comments/ideas welcome, both on this patch and that longer-term plan.

Since there are no comments, I'll commit it now.


Segher


> 2017-11-30  Segher Boessenkool  
> 
>   * combine.c: Adjust comment.
>   (use_crosses_set_p): Delete.
>   (can_combine_p): Use modified_between_p instead of use_crosses_set_p.
>   (try_combine): Ditto.
> 
> ---
>  gcc/combine.c | 69 
> ++-
>  1 file changed, 7 insertions(+), 62 deletions(-)
> 
> diff --git a/gcc/combine.c b/gcc/combine.c
> index 22a382d..748c9a8 100644
> --- a/gcc/combine.c
> +++ b/gcc/combine.c
> @@ -39,7 +39,7 @@ along with GCC; see the file COPYING3.  If not see
> insn as having a logical link to the preceding insn.  The same is true
> for an insn explicitly using CC0.
>  
> -   We check (with use_crosses_set_p) to avoid combining in such a way
> +   We check (with modified_between_p) to avoid combining in such a way
> as to move a computation to a place where its value would be different.
>  
> Combination is done by mathematically substituting the previous
> @@ -482,7 +482,6 @@ static void record_dead_and_set_regs_1 (rtx, const_rtx, 
> void *);
>  static void record_dead_and_set_regs (rtx_insn *);
>  static int get_last_value_validate (rtx *, rtx_insn *, int, int);
>  static rtx get_last_value (const_rtx);
> -static int use_crosses_set_p (const_rtx, int);
>  static void reg_dead_at_p_1 (rtx, const_rtx, void *);
>  static int reg_dead_at_p (rtx, rtx_insn *);
>  static void move_deaths (rtx, rtx, int, rtx_insn *, rtx *);
> @@ -2011,7 +2010,7 @@ can_combine_p (rtx_insn *insn, rtx_insn *i3, rtx_insn 
> *pred ATTRIBUTE_UNUSED,
>|| (! all_adjacent
> && (((!MEM_P (src)
>   || ! find_reg_note (insn, REG_EQUIV, src))
> -&& use_crosses_set_p (src, DF_INSN_LUID (insn)))
> +&& modified_between_p (src, insn, i3))
> || (GET_CODE (src) == ASM_OPERANDS && MEM_VOLATILE_P (src))
> || GET_CODE (src) == UNSPEC_VOLATILE))
>/* Don't combine across a CALL_INSN, because that would possibly
> @@ -3727,7 +3726,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>   }
>else if (m_split_insn && NEXT_INSN (NEXT_INSN (m_split_insn)) == 
> NULL_RTX
>  && (next_nonnote_nondebug_insn (i2) == i3
> -|| ! use_crosses_set_p (PATTERN (m_split_insn), DF_INSN_LUID 
> (i2
> +|| !modified_between_p (PATTERN (m_split_insn), i2, i3)))
>   {
> rtx i2set, i3set;
> rtx newi3pat = PATTERN (NEXT_INSN (m_split_insn));
> @@ -3791,7 +3790,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
> || can_change_dest_mode (i2dest, added_sets_2,
>  GET_MODE (*split)))
> && (next_nonnote_nondebug_insn (i2) == i3
> -   || ! use_crosses_set_p (*split, DF_INSN_LUID (i2)))
> +   || !modified_between_p (*split, i2, i3))
> /* We can't overwrite I2DEST if its value is still used by
>NEWPAT.  */
> && ! reg_referenced_p (i2dest, newpat))
> @@ -3982,8 +3981,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>  && GET_CODE (XVECEXP (newpat, 0, 1)) == SET
>  && rtx_equal_p (SET_SRC (XVECEXP (newpat, 0, 1)),
>  XEXP (SET_SRC (XVECEXP (newpat, 0, 0)), 0))
> -&& ! use_crosses_set_p (SET_SRC (XVECEXP (newpat, 0, 1)),
> -DF_INSN_LUID (i2))
> +&& !modified_between_p (SET_SRC (XVECEXP (newpat, 0, 1)), i2, i3)
>  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != ZERO_EXTRACT
>  && GET_CODE (SET_DEST (XVECEXP (newpat, 0, 1))) != STRICT_LOW_PART
>  && ! (temp_expr = SET_DEST (XVECEXP (newpat, 0, 1)),
> @@ -4057,7 +4055,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
>be first.  The PARALLEL might also have been pre-existing in i3,
>so we need to make sure that we won't wrongly hoist a SET to i2
>that would conflict with a death note present in there.  */
> -  if (!use_crosses_set_p (SET_SRC (set1), DF_INSN_LUID (i2))
> +  if (!modified_between_p (SET_SRC (set1), i2, i3)
> && !(REG_P (SET_DEST (set1))
>  && find_reg_note (i2, REG_DEAD, SET_DEST (set1)))
> && !(GET_CODE (SET_DEST (set1)) == SUBREG
> @@ -4072,7 +4070,7 @@ try_combine (rtx_insn *i3, rtx_insn *i2, rtx_insn *i1, 
> rtx_insn *i0,
> newi2pat = set1;
>   

[PATCH][gimple-interchange] Random cleanups

2017-12-04 Thread Richard Biener

When skimming through the code I noticed the following (chatted on IRC
about parts of the changes).

Bootstrap / regtest running on x86_64-unknown-linux-gnu.

Will commit tomorrow unless you beat me to that.

Richard.

2017-12-04  Richard Biener  

* gimple-loop-interchange.cc (loop_cand::classify_simple_reduction):
Simplify.
(loop_cand::analyze_iloop_reduction_var): Reject dead reductions.
(loop_cand::analyze_oloop_reduction_var): Likewise.  Simplify.
(tree_loop_interchange::interchange_loops): Properly analyze
scalar evolution before instantiating a SCEV.

Index: gcc/gimple-loop-interchange.cc
===
--- gcc/gimple-loop-interchange.cc  (revision 255383)
+++ gcc/gimple-loop-interchange.cc  (working copy)
@@ -444,50 +444,21 @@ loop_cand::classify_simple_reduction (re
   if (!bb || bb->loop_father != m_outer)
return;
 
-  if (!is_gimple_assign (producer))
+  if (!gimple_assign_load_p (producer))
return;
 
-  code = gimple_assign_rhs_code (producer);
-  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
-   return;
-
-  lhs = gimple_assign_lhs (producer);
-  if (lhs != re->init)
-   return;
-
-  rhs = gimple_assign_rhs1 (producer);
-  if (!REFERENCE_CLASS_P (rhs))
-   return;
-
-  re->init_ref = rhs;
+  re->init_ref = gimple_assign_rhs1 (producer);
 }
   else if (!CONSTANT_CLASS_P (re->init))
 return;
 
-  /* Check how reduction variable is used.  Note usually reduction variable
- is used outside of its defining loop, we don't require that in terms of
- loop interchange.  */
-  if (!re->lcssa_phi)
-consumer = single_use_in_loop (re->next, m_loop);
-  else
-consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi), m_outer);
-
-  if (!consumer || !is_gimple_assign (consumer))
-return;
-
-  code = gimple_assign_rhs_code (consumer);
-  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
-return;
-
-  lhs = gimple_assign_lhs (consumer);
-  if (!REFERENCE_CLASS_P (lhs))
-return;
-
-  rhs = gimple_assign_rhs1 (consumer);
-  if (rhs != PHI_RESULT (re->lcssa_phi))
+  /* Check how reduction variable is used.  */
+  consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi), m_outer);
+  if (!consumer
+  || !gimple_store_p (consumer))
 return;
 
-  re->fini_ref = lhs;
+  re->fini_ref = gimple_get_lhs (consumer);
   re->consumer = consumer;
 
   /* Simple reduction with constant initializer.  */
@@ -608,6 +579,9 @@ loop_cand::analyze_iloop_reduction_var (
   else
return false;
 }
+  if (!lcssa_phi)
+return false;
+
   re = XCNEW (struct reduction);
   re->var = var;
   re->init = init;
@@ -681,15 +655,9 @@ loop_cand::analyze_oloop_reduction_var (
 
   /* Outer loop's reduction should only be used to initialize inner loop's
  simple reduction.  */
-  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
-{
-  stmt = USE_STMT (use_p);
-  if (is_gimple_debug (stmt))
-   continue;
-
-  if (stmt != inner_re->phi)
-   return false;
-}
+  if (! single_imm_use (var, &use_p, &stmt)
+  || stmt != inner_re->phi)
+return false;
 
   /* Check this reduction is correctly used outside of loop via lcssa phi.  */
   FOR_EACH_IMM_USE_FAST (use_p, iterator, next)
@@ -711,6 +679,8 @@ loop_cand::analyze_oloop_reduction_var (
   else
return false;
 }
+  if (!lcssa_phi)
+return false;
 
   re = XCNEW (struct reduction);
   re->var = var;
@@ -1146,12 +1116,18 @@ tree_loop_interchange::interchange_loops
   edge instantiate_below = loop_preheader_edge (loop_nest);
   gsi = gsi_last_bb (loop_preheader_edge (loop_nest)->src);
   i_niters = number_of_latch_executions (iloop.m_loop);
-  i_niters = instantiate_scev (instantiate_below, loop_nest, i_niters);
+  i_niters = analyze_scalar_evolution (loop_outer (iloop.m_loop), i_niters);
+  i_niters = instantiate_scev (instantiate_below, loop_outer (iloop.m_loop),
+  i_niters);
   i_niters = force_gimple_operand_gsi (&gsi, unshare_expr (i_niters), true,
   NULL_TREE, false, GSI_CONTINUE_LINKING);
   o_niters = number_of_latch_executions (oloop.m_loop);
   if (oloop.m_loop != loop_nest)
-o_niters = instantiate_scev (instantiate_below, loop_nest, o_niters);
+{
+  o_niters = analyze_scalar_evolution (loop_outer (oloop.m_loop), 
o_niters);
+  o_niters = instantiate_scev (instantiate_below, loop_outer 
(oloop.m_loop),
+  o_niters);
+}
   o_niters = force_gimple_operand_gsi (&gsi, unshare_expr (o_niters), true,
   NULL_TREE, false, GSI_CONTINUE_LINKING);
 


Re: [PATCH] final: Improve output for -dp and -fverbose-asm

2017-12-04 Thread Michael Matz
Hi,

On Thu, 30 Nov 2017, David Malcolm wrote:

> -fverbose-asm is on the border of compiler-debugging vs end-user usage.

I have yet to see a relevant percentage of end-users who even know what 
assembler is.  On the gcc.*@ and kernel.* mailing lists, sure.  But Joe 
Randomapp?

> FWIW an improvement to -fverbose-asm was explicitly mentioned in the
> gcc 7 release notes:
>   https://gcc.gnu.org/gcc-7/changes.html
> and I've seen at least some end-users comment favorably on that change;
> this was:
>   https://gcc.gnu.org/ml/gcc-patches/2016-08/msg01051.html
> which was originally a "-fasm-show-source" option.

Yes.  I think this was actually a disservice to readability of 
-fverbose-asm (sorry!) and would have preferred a suboption as well (but 
wouldn't have complained if with-sources would be the default).  First, it 
clutters the asm instructions with intervening non-aligned lines (and 
left-hanging even, giving more visual importance to them instead of what 
is actually important, which for a switch named verbose-asm seems the asm 
to me) and second it has the same problem as debugging scheduled code: 
jumping around crazy and stating the same source line multiple times.  
(Basically the same reason 'objdump -dS' is similarly ugly, and why the -S 
therein is an extra switch).

Luckily -dp still works as expected, so, ... well, I guess I'll live, and 
if not there's "grep -v '^#.*:[0-9]\+:'" :)


Ciao,
Michael.


Re: [PATCH][gimple-interchange] Random cleanups

2017-12-04 Thread Bin.Cheng
On Mon, Dec 4, 2017 at 3:43 PM, Richard Biener  wrote:
>
> When skimming through the code I noticed the following (chatted on IRC
> about parts of the changes).
>
> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>
> Will commit tomorrow unless you beat me to that.
>
> Richard.
>
> 2017-12-04  Richard Biener  
>
> * gimple-loop-interchange.cc (loop_cand::classify_simple_reduction):
> Simplify.
> (loop_cand::analyze_iloop_reduction_var): Reject dead reductions.
> (loop_cand::analyze_oloop_reduction_var): Likewise.  Simplify.
> (tree_loop_interchange::interchange_loops): Properly analyze
> scalar evolution before instantiating a SCEV.
>
> Index: gcc/gimple-loop-interchange.cc
> ===
> --- gcc/gimple-loop-interchange.cc  (revision 255383)
> +++ gcc/gimple-loop-interchange.cc  (working copy)
> @@ -444,50 +444,21 @@ loop_cand::classify_simple_reduction (re
>if (!bb || bb->loop_father != m_outer)
> return;
>
> -  if (!is_gimple_assign (producer))
> +  if (!gimple_assign_load_p (producer))
> return;
>
> -  code = gimple_assign_rhs_code (producer);
> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
> -   return;
> -
> -  lhs = gimple_assign_lhs (producer);
> -  if (lhs != re->init)
> -   return;
> -
> -  rhs = gimple_assign_rhs1 (producer);
> -  if (!REFERENCE_CLASS_P (rhs))
> -   return;
> -
> -  re->init_ref = rhs;
> +  re->init_ref = gimple_assign_rhs1 (producer);
>  }
>else if (!CONSTANT_CLASS_P (re->init))
>  return;
>
> -  /* Check how reduction variable is used.  Note usually reduction variable
> - is used outside of its defining loop, we don't require that in terms of
> - loop interchange.  */
> -  if (!re->lcssa_phi)
> -consumer = single_use_in_loop (re->next, m_loop);
> -  else
> -consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi), m_outer);
> -
> -  if (!consumer || !is_gimple_assign (consumer))
> -return;
> -
> -  code = gimple_assign_rhs_code (consumer);
> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
> -return;
> -
> -  lhs = gimple_assign_lhs (consumer);
> -  if (!REFERENCE_CLASS_P (lhs))
> -return;
> -
> -  rhs = gimple_assign_rhs1 (consumer);
> -  if (rhs != PHI_RESULT (re->lcssa_phi))
> +  /* Check how reduction variable is used.  */
> +  consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi), m_outer);
> +  if (!consumer
> +  || !gimple_store_p (consumer))
>  return;
>
> -  re->fini_ref = lhs;
> +  re->fini_ref = gimple_get_lhs (consumer);
>re->consumer = consumer;
>
>/* Simple reduction with constant initializer.  */
> @@ -608,6 +579,9 @@ loop_cand::analyze_iloop_reduction_var (
>else
> return false;
>  }
> +  if (!lcssa_phi)
> +return false;
> +
>re = XCNEW (struct reduction);
>re->var = var;
>re->init = init;
> @@ -681,15 +655,9 @@ loop_cand::analyze_oloop_reduction_var (
>
>/* Outer loop's reduction should only be used to initialize inner loop's
>   simple reduction.  */
> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
> -{
> -  stmt = USE_STMT (use_p);
> -  if (is_gimple_debug (stmt))
> -   continue;
> -
> -  if (stmt != inner_re->phi)
> -   return false;
> -}
> +  if (! single_imm_use (var, &use_p, &stmt)
> +  || stmt != inner_re->phi)
> +return false;
>
>/* Check this reduction is correctly used outside of loop via lcssa phi.  
> */
>FOR_EACH_IMM_USE_FAST (use_p, iterator, next)
> @@ -711,6 +679,8 @@ loop_cand::analyze_oloop_reduction_var (
>else
> return false;
>  }
> +  if (!lcssa_phi)
> +return false;
>
>re = XCNEW (struct reduction);
>re->var = var;
> @@ -1146,12 +1116,18 @@ tree_loop_interchange::interchange_loops
>edge instantiate_below = loop_preheader_edge (loop_nest);
>gsi = gsi_last_bb (loop_preheader_edge (loop_nest)->src);
>i_niters = number_of_latch_executions (iloop.m_loop);
> -  i_niters = instantiate_scev (instantiate_below, loop_nest, i_niters);
> +  i_niters = analyze_scalar_evolution (loop_outer (iloop.m_loop), i_niters);
> +  i_niters = instantiate_scev (instantiate_below, loop_outer (iloop.m_loop),
> +  i_niters);
>i_niters = force_gimple_operand_gsi (&gsi, unshare_expr (i_niters), true,
>NULL_TREE, false, 
> GSI_CONTINUE_LINKING);
>o_niters = number_of_latch_executions (oloop.m_loop);
>if (oloop.m_loop != loop_nest)
> -o_niters = instantiate_scev (instantiate_below, loop_nest, o_niters);
> +{
> +  o_niters = analyze_scalar_evolution (loop_outer (oloop.m_loop), 
> o_niters);
> +  o_niters = instantiate_scev (instantiate_below, loop_outer 
> (oloop.m_loop),
> +  o_niters);
> +}
Hmm, sorry to disturb.  IIRC, it's important to 

Re: [PATCH, obv?] Fix missing newlines from local-pure-const pass dump

2017-12-04 Thread Jeff Law
On 12/01/2017 11:42 AM, Luis Machado wrote:
> I noticed the debugging output from local-pure-const pass is missing a
> newline in a couple places, leading to this:
> 
>  local analysis of main
>scanning: i ={v} 0;
> Volatile stmt is not const/pure
> Volatile operand is not const/pure  scanning: j ={v} 20;
> Volatile stmt is not const/pure
> Volatile operand is not const/pure  scanning: vol.0_10 ={v} i;
> Volatile stmt is not const/pure
> 
> It should've been:
> 
>  local analysis of main
>scanning: i ={v} 0;
> Volatile stmt is not const/pure
> Volatile operand is not const/pure
>scanning: j ={v} 20;
> Volatile stmt is not const/pure
> Volatile operand is not const/pure
>scanning: vol.0_10 ={v} i;
> Volatile stmt is not const/pure
> 
> Seems fairly obvious. OK?
> 
> gcc/ChangeLog:
> 
> 2017-12-01  Luis Machado  
> 
>   * ipa-pure-const.c (check_decl): Add missing newline.
>   (state_from_flags): Likewise.
OK.
jeff


Re: [RFA, PATCH] Deprecate stabs if dwarf2 support available.

2017-12-04 Thread Jim Wilson
On Mon, Dec 4, 2017 at 2:38 AM, Richard Biener
 wrote:
> Note that "deprecation" doesn't match up with giving an error when using.
> We should warn (or maybe just document deprecation) instead.

OK.  I can warn instead.  That makes a lot of things simpler for
gcc-8.  Nothing will be broken by the change, except for the debug
torture support in the gcc testsuite, which is easy enough to fix.

> Maybe we can add a DWARF2_ONLY_DEBUGGING_INFO or
> NO_DBX_DEBUGGING_INFO macro so targets can decide they do not
> want to support -gstabs and reject it this way?  I'm not sure if
> #undefing DBX_DEBUGGING_INFO for those targets would be enough to
> achieve this?  It seems clearly not.  Thus sth like

I think we have too many *_DEBUGGING_INFO macros already.  And some of
them are undocumented and/or incorrectly documented.  So I'd rather
not add more.  DWARF2_LINENO_DEBUGGING_INFO is undocumented.
XCOFF_DEBUGGING_INFO is incorrectly documented.  The docs say it emits
a stabs variant, but it is used in both dbxout.c and dwarf2out.c, so
it is also emitting a dwarf2 variant.

> +#ifndef DBX_DEBUGGING_INFO
> +  warning ("-gstabs is deprecated on this target");
> +
> +#endif

It is already the case that you can't emit stabs if DBX_DEBUGGING_INFO
is not defined.  So this patch would do nothing useful.

rohan:2017$ ./xgcc -B./ -gstabs tmp.c
cc1: error: target system does not support the ‘stabs’ debug format
rohan:2018$

> and/or ignore stabs and use -gdwarf -g1 when -gstabs is specified?

Except that we can't emit dwarf-2 unless DWARF2_DEBUGGING_INFO is
defined.  So that takes us back to my patch which checks for
DWARF2_DEBUGGING_INFO.  It just needs to emit a warning instead of an
error, and then force a switch to dwarf2.

That leaves open the question of what to do about targets that only
support stabs.  We could do nothing for now and continue to emit
stabs.  We could call inform() to emit the deprecation message, to
avoid breaking -Werror, and continue to emit stabs.  I think the
latter is more useful, so I will try that.

> So unless we want to remove any support for GCC 8 I'd suggest to document
> the deprecation in gcc-8/changes.html and emit a deprecation warning from
> the --with-stabs configure option.  Note that deprecating stabs means
> deprecating
> those targets (or just debugging on those?) that only support stabs.

No, I don't want to remove stabs support at this time.  I just want to
start nudging people in the right direction, to switch to dwarf2.  I
think avr is the only actively maintained target that depends on
stabs, and we can ask them to add dwarf support.  For the other
targets, we can just drop debugging support if people are still
interested in the target, or drop the target if no one is still
interested in it.

Jim


[PATCH 0/2] v3: C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-04 Thread David Malcolm
On Fri, 2017-12-01 at 19:09 -0500, David Malcolm wrote:
> On Fri, 2017-12-01 at 22:56 +0100, Jakub Jelinek wrote:
> > On Fri, Dec 01, 2017 at 04:48:20PM -0500, David Malcolm wrote:
> > > PR c/83236 reports an issue where the C FE unhelpfully suggests
> > > the
> > > use
> > > of glibc's private "__ino_t" type when it fails to recognize
> > > "ino_t":
> > > 
> > > $ cat > test.c < > > #include 
> > > ino_t inode;
> > > EOF
> > > $ gcc -std=c89 -fsyntax-only test.c
> > > test.c:2:1: error: unknown type name 'ino_t'; did you mean
> > > '__ino_t'?
> > >  ino_t inode;
> > >  ^
> > >  __ino_t
> > > 
> > > This patch updates the C/C++ FEs suggestions for unrecognized
> > > identifiers
> > > so that they don't suggest names that are reserved for use by the
> > > implementation i.e. those that begin with an underscore and
> > > either
> > > an
> > > uppercase letter or another underscore.
> > > 
> > > However, it allows built-in macros that match this pattern to be
> > > suggested, since it's useful to be able to suggest __FILE__,
> > > __LINE__
> > > etc.  Other macros *are* filtered.
> > > 
> > > One wart with the patch: the existing macro-handling spellcheck
> > > code
> > > is in spellcheck-tree.c, and needs to call the the new function
> > > "name_reserved_for_implementation_p", however the latter relates
> > > to
> > > the C family of FEs.
> > > Perhaps I should move all of the the macro-handling stuff in
> > > spellcheck-tree.h/c to e.g. a new c-family/c-spellcheck.h/c as a
> > > first step?
> > > 
> > > Successfully bootstrapped®rtested on x86_64-pc-linux-gnu.
> > > 
> > > OK for trunk?
> > > 
> > > gcc/c/ChangeLog:
> > >   PR c/83236
> > >   * c-decl.c (lookup_name_fuzzy): Don't suggest names that are
> > >   reserved for use by the implementation.
> > > 
> > > gcc/cp/ChangeLog:
> > >   PR c/83236
> > >   * name-lookup.c (consider_binding_level): Don't suggest names
> > > that
> > >   are reserved for use by the implementation.
> > > 
> > > gcc/ChangeLog:
> > >   PR c/83236
> > >   * spellcheck-tree.c (name_reserved_for_implementation_p): New
> > >   function.
> > >   (should_suggest_as_macro_p): New function.
> > >   (find_closest_macro_cpp_cb): Move the check for NT_MACRO to
> > >   should_suggest_as_macro_p and call it.
> > >   (selftest::test_name_reserved_for_implementation_p): New
> > > function.
> > >   (selftest::spellcheck_tree_c_tests): Call it.
> > >   * spellcheck-tree.h (name_reserved_for_implementation_p): New
> > >   decl.
> > > 
> > > gcc/testsuite/ChangeLog:
> > >   PR c/83236
> > >   * c-c++-common/spellcheck-reserved.c: New test case.
> > > ---
> > >  gcc/c/c-decl.c   |  5 +++
> > >  gcc/cp/name-lookup.c | 18 +++---
> > >  gcc/spellcheck-tree.c| 46
> > > +++-
> > >  gcc/spellcheck-tree.h|  2 ++
> > >  gcc/testsuite/c-c++-common/spellcheck-reserved.c | 25
> > > +
> > >  5 files changed, 91 insertions(+), 5 deletions(-)
> > >  create mode 100644 gcc/testsuite/c-c++-common/spellcheck-
> > > reserved.c
> > > 
> > > diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
> > > index 56c63d8..dfd136d 100644
> > > --- a/gcc/c/c-decl.c
> > > +++ b/gcc/c/c-decl.c
> > > @@ -4041,6 +4041,11 @@ lookup_name_fuzzy (tree name, enum
> > > lookup_name_fuzzy_kind kind, location_t loc)
> > >   if (TREE_CODE (binding->decl) == FUNCTION_DECL)
> > > if (C_DECL_IMPLICIT (binding->decl))
> > >   continue;
> > > + /* Don't suggest names that are reserved for use by the
> > > +implementation.  */
> > > + if (name_reserved_for_implementation_p
> > > + (IDENTIFIER_POINTER (binding->id)))
> > 
> > Can't you use a temporary to avoid wrapping line between function
> > name and ( ?
> 
> Fixed.
> 
> > More importantly, does this mean if I mistype __builtin_strtchr it
> > won't suggest __builtin_strrchr?  Would be nice if the filtering
> > of the names reserved for implementation isn't done if the
> > name being looked up is reserved for implementation.
> 
> Good idea, thanks.
> 
> Here's an updated version of the patch.
> 
> Changed in v2:
> * don't filter suggestions if the name name being looked up
>   is itself reserved for implementation
> * fix wrapping in c-decl.c's lookup_name_fuzzy
> * name-lookup.c (consider_binding_level): rename new variable from
> "name"
>   to "suggestion" to avoid shadowing a param
> * spellcheck-tree.c (test_name_reserved_for_implementation_p): Add
> more
>   test coverage ("_" and "__")
> 
> One additional wart I noticed writing the testase is that the
> C and C++ frontends offer different suggestions for
> "__builtin_strtchr".
> C recomends:
>   __builtin_strchr
> whereas C++ recommends:
>   __builtin_strrchr
> 
> The reason is that the C FE visits the builtins in order of
> builtins.def,
> whereas C++ visits them in the reverse order.
> 
> Both have the same edit distance, and so the first "winner" in
> best_match varies betwee

[PATCH 1/2] Move macro-spellchecking code from "gcc" to new files in c-family

2017-12-04 Thread David Malcolm
The code for spellchecking macros really belongs in c-family, rather
than in gcc/spellcheck-tree.c, so this patch moves it there.

gcc/ChangeLog:
* Makefile.in (C_COMMON_OBJS): Add c-family/c-spellcheck.o.

gcc/c-family/ChangeLog:
* c-spellcheck.cc: New file, taken from macro-handling code in
spellcheck-tree.c.
* c-spellcheck.h: New file, taken from macro-handling code in
spellcheck-tree.h.

gcc/c/ChangeLog:
* c-decl.c: Include "c-family/c-spellcheck.h".

gcc/cp/ChangeLog:
* name-lookup.c: Include "c-family/c-spellcheck.h".

gcc/ChangeLog:
* spellcheck-tree.c (find_closest_macro_cpp_cb): Move to
c-family/c-spellcheck.cc.
(best_macro_match::best_macro_match): Likewise.
* spellcheck-tree.h
(struct edit_distance_traits): Move to
c-family/c-spellcheck.h.
(class best_macro_match): Likewise.
---
 gcc/Makefile.in  |  2 +-
 gcc/c-family/c-spellcheck.cc | 57 
 gcc/c-family/c-spellcheck.h  | 51 +++
 gcc/c/c-decl.c   |  1 +
 gcc/cp/name-lookup.c |  1 +
 gcc/spellcheck-tree.c| 30 ---
 gcc/spellcheck-tree.h| 26 
 7 files changed, 111 insertions(+), 57 deletions(-)
 create mode 100644 gcc/c-family/c-spellcheck.cc
 create mode 100644 gcc/c-family/c-spellcheck.h

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index 0baa292..c04c5fa 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1196,7 +1196,7 @@ C_COMMON_OBJS = c-family/c-common.o 
c-family/c-cppbuiltin.o c-family/c-dump.o \
   c-family/c-ppoutput.o c-family/c-pragma.o c-family/c-pretty-print.o \
   c-family/c-semantics.o c-family/c-ada-spec.o \
   c-family/c-ubsan.o c-family/known-headers.o \
-  c-family/c-attribs.o c-family/c-warn.o
+  c-family/c-attribs.o c-family/c-warn.o c-family/c-spellcheck.o
 
 # Language-independent object files.
 # We put the *-match.o and insn-*.o files first so that a parallel make
diff --git a/gcc/c-family/c-spellcheck.cc b/gcc/c-family/c-spellcheck.cc
new file mode 100644
index 000..db70a64
--- /dev/null
+++ b/gcc/c-family/c-spellcheck.cc
@@ -0,0 +1,57 @@
+/* Find near-matches for macros.
+   Copyright (C) 2016-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "tree.h"
+#include "cpplib.h"
+#include "spellcheck-tree.h"
+#include "c-family/c-spellcheck.h"
+
+/* A callback for cpp_forall_identifiers, for use by best_macro_match's ctor.
+   Process HASHNODE and update the best_macro_match instance pointed to be
+   USER_DATA.  */
+
+static int
+find_closest_macro_cpp_cb (cpp_reader *, cpp_hashnode *hashnode,
+  void *user_data)
+{
+  if (hashnode->type != NT_MACRO)
+return 1;
+
+  best_macro_match *bmm = (best_macro_match *)user_data;
+  bmm->consider (hashnode);
+
+  /* Keep iterating.  */
+  return 1;
+}
+
+/* Constructor for best_macro_match.
+   Use find_closest_macro_cpp_cb to find the closest matching macro to
+   NAME within distance < best_distance_so_far. */
+
+best_macro_match::best_macro_match (tree goal,
+   edit_distance_t best_distance_so_far,
+   cpp_reader *reader)
+: best_match  (goal, best_distance_so_far)
+{
+  cpp_forall_identifiers (reader, find_closest_macro_cpp_cb, this);
+}
diff --git a/gcc/c-family/c-spellcheck.h b/gcc/c-family/c-spellcheck.h
new file mode 100644
index 000..adc539a
--- /dev/null
+++ b/gcc/c-family/c-spellcheck.h
@@ -0,0 +1,51 @@
+/* Find near-matches for macros.
+   Copyright (C) 2016-2017 Free Software Foundation, Inc.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+

[PATCH 2/2] v3: C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-04 Thread David Malcolm
Changed in v3:
* updated for move to c-family/c-spellcheck.cc/h

Changed in v2:
* don't filter suggestions if the name name being looked up
is itself reserved for implementation
* fix wrapping in c-decl.c's lookup_name_fuzzy
* name-lookup.c (consider_binding_level): rename new variable from "name"
to "suggestion" to avoid shadowing a param
* spellcheck-tree.c (test_name_reserved_for_implementation_p): Add more
test coverage ("_" and "__")

One additional wart I noticed writing the testase is that the
C and C++ frontends offer different suggestions for "__builtin_strtchr".
C recomends:
  __builtin_strchr
whereas C++ recommends:
  __builtin_strrchr

The reason is that the C FE visits the builtins in order of builtins.def,
whereas C++ visits them in the reverse order.

Both have the same edit distance, and so the first "winner" in
best_match varies between FEs.

This is a pre-existing issue, though (not sure if it warrants a PR).

Blurb from v1 follows:

PR c/83236 reports an issue where the C FE suggests the use of glibc's
private "__ino_t" type when it fails to recognize "ino_t":

$ cat > test.c ident.str)
+  && !(hashnode->flags & NODE_BUILTIN))
+return false;
+
+  return true;
+}
 
 /* A callback for cpp_forall_identifiers, for use by best_macro_match's ctor.
Process HASHNODE and update the best_macro_match instance pointed to be
@@ -34,7 +65,7 @@ static int
 find_closest_macro_cpp_cb (cpp_reader *, cpp_hashnode *hashnode,
   void *user_data)
 {
-  if (hashnode->type != NT_MACRO)
+  if (!should_suggest_as_macro_p (hashnode))
 return 1;
 
   best_macro_match *bmm = (best_macro_match *)user_data;
@@ -55,3 +86,36 @@ best_macro_match::best_macro_match (tree goal,
 {
   cpp_forall_identifiers (reader, find_closest_macro_cpp_cb, this);
 }
+
+#if CHECKING_P
+
+namespace selftest {
+
+/* Selftests.  */
+
+/* Verify that name_reserved_for_implementation_p is sane.  */
+
+static void
+test_name_reserved_for_implementation_p ()
+{
+  ASSERT_FALSE (name_reserved_for_implementation_p (""));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("foo"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_foo"));
+  ASSERT_FALSE (name_reserved_for_implementation_p ("_42"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("_Foo"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("__"));
+  ASSERT_TRUE (name_reserved_for_implementation_p ("__foo"));
+}
+
+/* Run all of the selftests within this file.  */
+
+void
+c_spellcheck_cc_tests ()
+{
+  test_name_reserved_for_implementation_p ();
+}
+
+} // namespace selftest
+
+#endif /* #if CHECKING_P */
diff --git a/gcc/c-family/c-spellcheck.h b/gcc/c-family/c-spellcheck.h
index adc539a..838f4f2 100644
--- a/gcc/c-family/c-spellcheck.h
+++ b/gcc/c-family/c-spellcheck.h
@@ -22,6 +22,8 @@ along with GCC; see the file COPYING3.  If not see
 
 #include "spellcheck.h"
 
+extern bool name_reserved_for_implementation_p (const char *str);
+
 /* Specialization of edit_distance_traits for preprocessor macros.  */
 
 template <>
diff --git a/gcc/c/c-decl.c b/gcc/c/c-decl.c
index d7dad1a..30156da 100644
--- a/gcc/c/c-decl.c
+++ b/gcc/c/c-decl.c
@@ -4027,6 +4027,9 @@ lookup_name_fuzzy (tree name, enum lookup_name_fuzzy_kind 
kind, location_t loc)
  IDENTIFIER_POINTER (name),
  header_hint));
 
+  bool name_reserved
+= name_reserved_for_implementation_p (IDENTIFIER_POINTER (name));
+
   best_match bm (name);
 
   /* Look within currently valid scopes.  */
@@ -4042,6 +4045,14 @@ lookup_name_fuzzy (tree name, enum 
lookup_name_fuzzy_kind kind, location_t loc)
if (TREE_CODE (binding->decl) == FUNCTION_DECL)
  if (C_DECL_IMPLICIT (binding->decl))
continue;
+   /* Don't suggest names that are reserved for use by the
+  implementation, unless NAME itself is reserved.  */
+   if (!name_reserved)
+ {
+   const char *suggestion_str = IDENTIFIER_POINTER (binding->id);
+   if (name_reserved_for_implementation_p (suggestion_str))
+ continue;
+ }
switch (kind)
  {
  case FUZZY_LOOKUP_TYPENAME:
diff --git a/gcc/cp/name-lookup.c b/gcc/cp/name-lookup.c
index e79787f..7f2b9f68 100644
--- a/gcc/cp/name-lookup.c
+++ b/gcc/cp/name-lookup.c
@@ -5614,6 +5614,9 @@ consider_binding_level (tree name, best_match  &bm,
  bm.consider (IDENTIFIER_POINTER (best_matching_field));
   }
 
+  bool name_reserved
+= name_reserved_for_implementation_p (IDENTIFIER_POINTER (name));
+
   for (tree t = lvl->names; t; t = TREE_CHAIN (t))
 {
   tree d = t;

Re: [PATCH, obv?] Fix missing newlines from local-pure-const pass dump

2017-12-04 Thread Luis Machado



On 12/04/2017 02:01 PM, Jeff Law wrote:

On 12/01/2017 11:42 AM, Luis Machado wrote:

I noticed the debugging output from local-pure-const pass is missing a
newline in a couple places, leading to this:

  local analysis of main
scanning: i ={v} 0;
 Volatile stmt is not const/pure
 Volatile operand is not const/pure  scanning: j ={v} 20;
 Volatile stmt is not const/pure
 Volatile operand is not const/pure  scanning: vol.0_10 ={v} i;
 Volatile stmt is not const/pure

It should've been:

  local analysis of main
scanning: i ={v} 0;
 Volatile stmt is not const/pure
 Volatile operand is not const/pure
scanning: j ={v} 20;
 Volatile stmt is not const/pure
 Volatile operand is not const/pure
scanning: vol.0_10 ={v} i;
 Volatile stmt is not const/pure

Seems fairly obvious. OK?

gcc/ChangeLog:

2017-12-01  Luis Machado  

* ipa-pure-const.c (check_decl): Add missing newline.
(state_from_flags): Likewise.

OK.
jeff



Thanks. Pushed now.


Re: [PATCH,RFC] combine: Remove use_crosses_set_p

2017-12-04 Thread Eric Botcazou
> Since there are no comments, I'll commit it now.

The idea looks interesting but the timing is a bit more questionable.

-- 
Eric Botcazou


Re: [Patch][aarch64] Add missing thunderx2-t99 instruction scheduling pipeline descriptions.

2017-12-04 Thread Kyrill Tkachov

Hi Steve,

On 01/12/17 18:42, Steve Ellcey wrote:

There are a number of instruction types defined in aarch64.md which do not
have pipeline/scheduling information in thunderx2-t99.md.  This patch adds
some of them.  This patch includes all the missing types except the neon
ones that I hope to include in a follow-up patch.

Bootstrapped and tested with no regressions on a thunderx2.

I know we are in stage3 but I hope this type of plaform specific
change is still OK to checkin.



This looks ok to me (though I cannot approve) with a style nit inline.
I'd say this fairly low risk to take in, but it's up to the maintainers 
to make

the final call.

Thanks,
Kyrill


Steve Ellcey
sell...@cavium.com


2017-11-30  Steve Ellcey  

* config/aarch64/thunderx2-t99.md (thunderx2t99_branch): Add trap
to reservation.
(thunderx2t99_nothing): New insn reservation.
(thunderx2t99_mrs): New insn reservation.
(thunderx2t99_multiple): New insn reservation.
(thunderx2t99_alu_basi): Add bfx to reservation.
(thunderx2t99_fp_cmp): Add fccmps and fccmpd to reservation.


diff --git a/gcc/config/aarch64/thunderx2t99.md 
b/gcc/config/aarch64/thunderx2t99.md

index 5bcf4ff..5e48521 100644
--- a/gcc/config/aarch64/thunderx2t99.md
+++ b/gcc/config/aarch64/thunderx2t99.md
@@ -69,9 +69,26 @@

 (define_insn_reservation "thunderx2t99_branch" 1
   (and (eq_attr "tune" "thunderx2t99")
-   (eq_attr "type" "call,branch"))
+   (eq_attr "type" "call,branch,trap"))
   "thunderx2t99_i2")

+;; Misc instructions.
+
+(define_insn_reservation "thunderx2t99_nothing" 0
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "no_insn,block"))
+  "nothing")
+
+(define_insn_reservation "thunderx2t99_mrs" 0
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "mrs"))
+  "thunderx2t99_i2")
+
+(define_insn_reservation "thunderx2t99_multiple" 1
+  (and (eq_attr "tune" "thunderx2t99")
+   (eq_attr "type" "multiple"))
+  
"thunderx2t99_i0+thunderx2t99_i1+thunderx2t99_i2+thunderx2t99_ls0+thunderx2t99_ls1+thunderx2t99_sd+thunderx2t99_i1m1+thunderx2t99_i1m2+thunderx2t99_i1m3+thunderx2t99_ls0d1+thunderx2t99_ls0d2+thunderx2t99_ls0d3+thunderx2t99_ls1d1+thunderx2t99_ls1d2+thunderx2t99_ls1d3+thunderx2t99_f0+thunderx2t99_f1")
+


We try to adhere to the 80 columns per line rule in the scheduling 
description files as well,

so can you please use "\" to break this into multiple lines.


 ;; Integer arithmetic/logic instructions.

 ; Plain register moves are handled by renaming, and don't create any 
uops.

@@ -87,7 +104,7 @@
adc_reg,adc_imm,adcs_reg,adcs_imm,\
logic_reg,logic_imm,logics_reg,logics_imm,\
csel,adr,mov_imm,shift_reg,shift_imm,bfm,\
-   rbit,rev,extend,rotate_imm"))
+   bfx,rbit,rev,extend,rotate_imm"))
   "thunderx2t99_i012")

 (define_insn_reservation "thunderx2t99_alu_shift" 2
@@ -155,7 +172,7 @@

 (define_insn_reservation "thunderx2t99_fp_cmp" 5
   (and (eq_attr "tune" "thunderx2t99")
-   (eq_attr "type" "fcmps,fcmpd"))
+   (eq_attr "type" "fcmps,fcmpd,fccmps,fccmpd"))
   "thunderx2t99_f01")

 (define_insn_reservation "thunderx2t99_fp_divsqrt_s" 16




Re: [Patch][aarch64] Add missing thunderx2-t99 instruction scheduling pipeline descriptions.

2017-12-04 Thread Steve Ellcey
On Mon, 2017-12-04 at 17:18 +, Kyrill Tkachov wrote:

> > +(define_insn_reservation "thunderx2t99_multiple" 1
> > +  (and (eq_attr "tune" "thunderx2t99")
> > +   (eq_attr "type" "multiple"))
> > +  "thunderx2t99_i0+thunderx2t99_i1+thunderx2t99_i2+thunderx2t99_ls
> > 0+thunderx2t99_ls1+thunderx2t99_sd+thunderx2t99_i1m1+thunderx2t99_i
> > 1m2+thunderx2t99_i1m3+thunderx2t99_ls0d1+thunderx2t99_ls0d2+thunder
> > x2t99_ls0d3+thunderx2t99_ls1d1+thunderx2t99_ls1d2+thunderx2t99_ls1d
> > 3+thunderx2t99_f0+thunderx2t99_f1")
> > +
> We try to adhere to the 80 columns per line rule in the scheduling 
> description files as well,
> so can you please use "\" to break this into multiple lines.

Yes, I wasn't sure if whatever program parses this file would honor
backslashes so I didn't break it up.  The falkor_extra definition in
falkor.md that I looked at is more than 80 characters and that is one
of the reasons I wasn't sure backslashes would be honored.  But I see
other places (power8.md) where the backslashes are used so I will make
that change if and when the patch is approved.

Steve Ellcey
sell...@cavium.com


Re: [PATCH][gimple-interchange] Random cleanups

2017-12-04 Thread Richard Biener
On December 4, 2017 5:01:45 PM GMT+01:00, "Bin.Cheng"  
wrote:
>On Mon, Dec 4, 2017 at 3:43 PM, Richard Biener 
>wrote:
>>
>> When skimming through the code I noticed the following (chatted on
>IRC
>> about parts of the changes).
>>
>> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>>
>> Will commit tomorrow unless you beat me to that.
>>
>> Richard.
>>
>> 2017-12-04  Richard Biener  
>>
>> * gimple-loop-interchange.cc
>(loop_cand::classify_simple_reduction):
>> Simplify.
>> (loop_cand::analyze_iloop_reduction_var): Reject dead
>reductions.
>> (loop_cand::analyze_oloop_reduction_var): Likewise. 
>Simplify.
>> (tree_loop_interchange::interchange_loops): Properly analyze
>> scalar evolution before instantiating a SCEV.
>>
>> Index: gcc/gimple-loop-interchange.cc
>> ===
>> --- gcc/gimple-loop-interchange.cc  (revision 255383)
>> +++ gcc/gimple-loop-interchange.cc  (working copy)
>> @@ -444,50 +444,21 @@ loop_cand::classify_simple_reduction (re
>>if (!bb || bb->loop_father != m_outer)
>> return;
>>
>> -  if (!is_gimple_assign (producer))
>> +  if (!gimple_assign_load_p (producer))
>> return;
>>
>> -  code = gimple_assign_rhs_code (producer);
>> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
>> -   return;
>> -
>> -  lhs = gimple_assign_lhs (producer);
>> -  if (lhs != re->init)
>> -   return;
>> -
>> -  rhs = gimple_assign_rhs1 (producer);
>> -  if (!REFERENCE_CLASS_P (rhs))
>> -   return;
>> -
>> -  re->init_ref = rhs;
>> +  re->init_ref = gimple_assign_rhs1 (producer);
>>  }
>>else if (!CONSTANT_CLASS_P (re->init))
>>  return;
>>
>> -  /* Check how reduction variable is used.  Note usually reduction
>variable
>> - is used outside of its defining loop, we don't require that in
>terms of
>> - loop interchange.  */
>> -  if (!re->lcssa_phi)
>> -consumer = single_use_in_loop (re->next, m_loop);
>> -  else
>> -consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi),
>m_outer);
>> -
>> -  if (!consumer || !is_gimple_assign (consumer))
>> -return;
>> -
>> -  code = gimple_assign_rhs_code (consumer);
>> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
>> -return;
>> -
>> -  lhs = gimple_assign_lhs (consumer);
>> -  if (!REFERENCE_CLASS_P (lhs))
>> -return;
>> -
>> -  rhs = gimple_assign_rhs1 (consumer);
>> -  if (rhs != PHI_RESULT (re->lcssa_phi))
>> +  /* Check how reduction variable is used.  */
>> +  consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi),
>m_outer);
>> +  if (!consumer
>> +  || !gimple_store_p (consumer))
>>  return;
>>
>> -  re->fini_ref = lhs;
>> +  re->fini_ref = gimple_get_lhs (consumer);
>>re->consumer = consumer;
>>
>>/* Simple reduction with constant initializer.  */
>> @@ -608,6 +579,9 @@ loop_cand::analyze_iloop_reduction_var (
>>else
>> return false;
>>  }
>> +  if (!lcssa_phi)
>> +return false;
>> +
>>re = XCNEW (struct reduction);
>>re->var = var;
>>re->init = init;
>> @@ -681,15 +655,9 @@ loop_cand::analyze_oloop_reduction_var (
>>
>>/* Outer loop's reduction should only be used to initialize inner
>loop's
>>   simple reduction.  */
>> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
>> -{
>> -  stmt = USE_STMT (use_p);
>> -  if (is_gimple_debug (stmt))
>> -   continue;
>> -
>> -  if (stmt != inner_re->phi)
>> -   return false;
>> -}
>> +  if (! single_imm_use (var, &use_p, &stmt)
>> +  || stmt != inner_re->phi)
>> +return false;
>>
>>/* Check this reduction is correctly used outside of loop via
>lcssa phi.  */
>>FOR_EACH_IMM_USE_FAST (use_p, iterator, next)
>> @@ -711,6 +679,8 @@ loop_cand::analyze_oloop_reduction_var (
>>else
>> return false;
>>  }
>> +  if (!lcssa_phi)
>> +return false;
>>
>>re = XCNEW (struct reduction);
>>re->var = var;
>> @@ -1146,12 +1116,18 @@ tree_loop_interchange::interchange_loops
>>edge instantiate_below = loop_preheader_edge (loop_nest);
>>gsi = gsi_last_bb (loop_preheader_edge (loop_nest)->src);
>>i_niters = number_of_latch_executions (iloop.m_loop);
>> -  i_niters = instantiate_scev (instantiate_below, loop_nest,
>i_niters);
>> +  i_niters = analyze_scalar_evolution (loop_outer (iloop.m_loop),
>i_niters);
>> +  i_niters = instantiate_scev (instantiate_below, loop_outer
>(iloop.m_loop),
>> +  i_niters);
>>i_niters = force_gimple_operand_gsi (&gsi, unshare_expr
>(i_niters), true,
>>NULL_TREE, false,
>GSI_CONTINUE_LINKING);
>>o_niters = number_of_latch_executions (oloop.m_loop);
>>if (oloop.m_loop != loop_nest)
>> -o_niters = instantiate_scev (instantiate_below, loop_nest,
>o_niters);
>> +{
>> +  o_niters = analyze_scalar_evolution (loop_outer
>(oloop.m_loo

Re: [PATCH][gimple-interchange] Random cleanups

2017-12-04 Thread Bin.Cheng
On Mon, Dec 4, 2017 at 5:39 PM, Richard Biener  wrote:
> On December 4, 2017 5:01:45 PM GMT+01:00, "Bin.Cheng"  
> wrote:
>>On Mon, Dec 4, 2017 at 3:43 PM, Richard Biener 
>>wrote:
>>>
>>> When skimming through the code I noticed the following (chatted on
>>IRC
>>> about parts of the changes).
>>>
>>> Bootstrap / regtest running on x86_64-unknown-linux-gnu.
>>>
>>> Will commit tomorrow unless you beat me to that.
>>>
>>> Richard.
>>>
>>> 2017-12-04  Richard Biener  
>>>
>>> * gimple-loop-interchange.cc
>>(loop_cand::classify_simple_reduction):
>>> Simplify.
>>> (loop_cand::analyze_iloop_reduction_var): Reject dead
>>reductions.
>>> (loop_cand::analyze_oloop_reduction_var): Likewise.
>>Simplify.
>>> (tree_loop_interchange::interchange_loops): Properly analyze
>>> scalar evolution before instantiating a SCEV.
>>>
>>> Index: gcc/gimple-loop-interchange.cc
>>> ===
>>> --- gcc/gimple-loop-interchange.cc  (revision 255383)
>>> +++ gcc/gimple-loop-interchange.cc  (working copy)
>>> @@ -444,50 +444,21 @@ loop_cand::classify_simple_reduction (re
>>>if (!bb || bb->loop_father != m_outer)
>>> return;
>>>
>>> -  if (!is_gimple_assign (producer))
>>> +  if (!gimple_assign_load_p (producer))
>>> return;
>>>
>>> -  code = gimple_assign_rhs_code (producer);
>>> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
>>> -   return;
>>> -
>>> -  lhs = gimple_assign_lhs (producer);
>>> -  if (lhs != re->init)
>>> -   return;
>>> -
>>> -  rhs = gimple_assign_rhs1 (producer);
>>> -  if (!REFERENCE_CLASS_P (rhs))
>>> -   return;
>>> -
>>> -  re->init_ref = rhs;
>>> +  re->init_ref = gimple_assign_rhs1 (producer);
>>>  }
>>>else if (!CONSTANT_CLASS_P (re->init))
>>>  return;
>>>
>>> -  /* Check how reduction variable is used.  Note usually reduction
>>variable
>>> - is used outside of its defining loop, we don't require that in
>>terms of
>>> - loop interchange.  */
>>> -  if (!re->lcssa_phi)
>>> -consumer = single_use_in_loop (re->next, m_loop);
>>> -  else
>>> -consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi),
>>m_outer);
>>> -
>>> -  if (!consumer || !is_gimple_assign (consumer))
>>> -return;
>>> -
>>> -  code = gimple_assign_rhs_code (consumer);
>>> -  if (get_gimple_rhs_class (code) != GIMPLE_SINGLE_RHS)
>>> -return;
>>> -
>>> -  lhs = gimple_assign_lhs (consumer);
>>> -  if (!REFERENCE_CLASS_P (lhs))
>>> -return;
>>> -
>>> -  rhs = gimple_assign_rhs1 (consumer);
>>> -  if (rhs != PHI_RESULT (re->lcssa_phi))
>>> +  /* Check how reduction variable is used.  */
>>> +  consumer = single_use_in_loop (PHI_RESULT (re->lcssa_phi),
>>m_outer);
>>> +  if (!consumer
>>> +  || !gimple_store_p (consumer))
>>>  return;
>>>
>>> -  re->fini_ref = lhs;
>>> +  re->fini_ref = gimple_get_lhs (consumer);
>>>re->consumer = consumer;
>>>
>>>/* Simple reduction with constant initializer.  */
>>> @@ -608,6 +579,9 @@ loop_cand::analyze_iloop_reduction_var (
>>>else
>>> return false;
>>>  }
>>> +  if (!lcssa_phi)
>>> +return false;
>>> +
>>>re = XCNEW (struct reduction);
>>>re->var = var;
>>>re->init = init;
>>> @@ -681,15 +655,9 @@ loop_cand::analyze_oloop_reduction_var (
>>>
>>>/* Outer loop's reduction should only be used to initialize inner
>>loop's
>>>   simple reduction.  */
>>> -  FOR_EACH_IMM_USE_FAST (use_p, iterator, var)
>>> -{
>>> -  stmt = USE_STMT (use_p);
>>> -  if (is_gimple_debug (stmt))
>>> -   continue;
>>> -
>>> -  if (stmt != inner_re->phi)
>>> -   return false;
>>> -}
>>> +  if (! single_imm_use (var, &use_p, &stmt)
>>> +  || stmt != inner_re->phi)
>>> +return false;
>>>
>>>/* Check this reduction is correctly used outside of loop via
>>lcssa phi.  */
>>>FOR_EACH_IMM_USE_FAST (use_p, iterator, next)
>>> @@ -711,6 +679,8 @@ loop_cand::analyze_oloop_reduction_var (
>>>else
>>> return false;
>>>  }
>>> +  if (!lcssa_phi)
>>> +return false;
>>>
>>>re = XCNEW (struct reduction);
>>>re->var = var;
>>> @@ -1146,12 +1116,18 @@ tree_loop_interchange::interchange_loops
>>>edge instantiate_below = loop_preheader_edge (loop_nest);
>>>gsi = gsi_last_bb (loop_preheader_edge (loop_nest)->src);
>>>i_niters = number_of_latch_executions (iloop.m_loop);
>>> -  i_niters = instantiate_scev (instantiate_below, loop_nest,
>>i_niters);
>>> +  i_niters = analyze_scalar_evolution (loop_outer (iloop.m_loop),
>>i_niters);
>>> +  i_niters = instantiate_scev (instantiate_below, loop_outer
>>(iloop.m_loop),
>>> +  i_niters);
>>>i_niters = force_gimple_operand_gsi (&gsi, unshare_expr
>>(i_niters), true,
>>>NULL_TREE, false,
>>GSI_CONTINUE_LINKING);
>>>o_niters = number_of_latch_executions (oloop.m

Re: Avoid -Werror=format-overflow error in dbxout.c (dbxout_block) on Solaris/SPARC

2017-12-04 Thread Martin Sebor

On 12/04/2017 05:41 AM, Rainer Orth wrote:

Within test last week, 64-bit Solaris/SPARC bootstrap began to fail:

/vol/gcc/src/hg/trunk/local/gcc/dbxout.c: In function 'bool dbxout_block(tree, 
int, tree, int)':
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: error: '%lu' directive writing 
between 1 and 20 bytes into a region of size 14 [-Werror=format-overflow=]
 dbxout_block (tree block, int depth, tree args, int parent_blocknum)
 ^~~~
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: note: directive argument in 
the range [0, 18446744073709551614]
In file included from ./tm.h:26,
 from /vol/gcc/src/hg/trunk/local/gcc/target.h:52,
 from /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:72:
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sol2.h:353:11: note: 
'std::sprintf' output between 8 and 27 bytes into a destination of size 20
   sprintf ((LABEL), "*.L%s%lu", (PREFIX), (unsigned long)(NUM))
   ^
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3855:5: note: in expansion of macro 
'ASM_GENERATE_INTERNAL_LABEL'
 ASM_GENERATE_INTERNAL_LABEL (buf, "LBB", parent_blocknum);
 ^~~

The line numbers are extremely confusing, to say the least, though: the
one in the error and the first note refer to the begin of the function
definition and only the third note refers to the line of the actual
error.


I agree it looks confusing.  It's the result of the interaction
between macro tracking and inlining.

I think it's a general problem that affects many (though not all)
warnings emitted out of the middle-end.  For instance, the example
below results in similar output.  The output changes depending on
whether or not _FORTIFY_SOURCE is defined.

#include 

#define FOO(d, s) strcpy (d, s)

void foo (char *d, const char *s) { FOO (d, s); }

void sink (char*);

void bar (void)
{
  char a[3];

  foo (a, "1234567");   // bug here

  sink (a);
}

Without _FORTIFY_SOURCE:

d.c: In function ‘void bar()’:
d.c:3:26: warning: ‘void* __builtin_memcpy(void*, const void*, long 
unsigned int)’ writing 8 bytes into a region of size 3 overflows the 
destination [-Wstringop-overflow=]

 #define FOO(d, s) strcpy (d, s)
   ~~~^~
d.c:5:37: note: in expansion of macro ‘FOO’
 void foo (char *d, const char *s) { FOO (d, s); }
 ^~~

If bar() were a bigger function with multiple calls to foo() it
could be tricky to tell which of them caused the warning.

With _FORTIFY_SOURCE:

In file included from /usr/include/string.h:635,
 from d.c:1:
In function ‘char* strcpy(char*, const char*)’,
inlined from ‘void bar()’ at d.c:5:37:
/usr/include/bits/string3.h:110:33: warning: ‘void* 
__builtin___memcpy_chk(void*, const void*, long unsigned int, long 
unsigned int)’ writing 8 bytes into a region of size 3 overflows the 
destination [-Wstringop-overflow=]

   return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
  ~~~^~~

This suffers from the same problem as the first case: the line
number of the offending call in bar() is missing.

In both cases the problem is compounded by the diagnostic, as
a result of folding, referring to __builtin___memcpy_chk while
the source code contains a call to strcpy.

I don't know nearly enough about the internals of the diagnostic
subsystem to tell how difficult it might be to change the output
to make it more readable.  David Malcolm is the expert on this
area so he might have an idea what it would take.

Martin


Re: Avoid -Werror=format-overflow error in dbxout.c (dbxout_block) on Solaris/SPARC

2017-12-04 Thread Jeff Law
On 12/04/2017 05:41 AM, Rainer Orth wrote:
> Within test last week, 64-bit Solaris/SPARC bootstrap began to fail:
> 
> /vol/gcc/src/hg/trunk/local/gcc/dbxout.c: In function 'bool 
> dbxout_block(tree, int, tree, int)':
> /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: error: '%lu' directive 
> writing between 1 and 20 bytes into a region of size 14 
> [-Werror=format-overflow=]
>  dbxout_block (tree block, int depth, tree args, int parent_blocknum)
>  ^~~~
> /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: note: directive argument in 
> the range [0, 18446744073709551614]
> In file included from ./tm.h:26,
>  from /vol/gcc/src/hg/trunk/local/gcc/target.h:52,
>  from /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:72:
> /vol/gcc/src/hg/trunk/local/gcc/config/sparc/sol2.h:353:11: note: 
> 'std::sprintf' output between 8 and 27 bytes into a destination of size 20
>sprintf ((LABEL), "*.L%s%lu", (PREFIX), (unsigned long)(NUM))
>^
> /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3855:5: note: in expansion of macro 
> 'ASM_GENERATE_INTERNAL_LABEL'
>  ASM_GENERATE_INTERNAL_LABEL (buf, "LBB", parent_blocknum);
>  ^~~
> 
> The line numbers are extremely confusing, to say the least, though: the
> one in the error and the first note refer to the begin of the function
> definition and only the third note refers to the line of the actual
> error.
> 
> Fixed as follows, which allowed sparcv9-sun-solaris2.11 bootstrap to
> finish and passed regtest on sparc-sun-solaris2.11.
> 
> Ok for mainline?
OK.
jeff


Re: [PATCH 2/2] v3: C/C++: don't suggest implementation names as spelling fixes (PR c/83236)

2017-12-04 Thread Jason Merrill
On Mon, Dec 4, 2017 at 12:00 PM, David Malcolm  wrote:
> * don't filter suggestions if the name name being looked up
> is itself reserved for implementation

I wonder if we want to avoid filtering if the name being looked up
starts with a single _, on the theory that the user meant to write __.

Other than that, looks good.

Jason


Re: Avoid -Werror=format-overflow error in dbxout.c (dbxout_block) on Solaris/SPARC

2017-12-04 Thread David Malcolm
On Mon, 2017-12-04 at 12:18 -0700, Martin Sebor wrote:
> On 12/04/2017 05:41 AM, Rainer Orth wrote:
> > Within test last week, 64-bit Solaris/SPARC bootstrap began to
> > fail:
> > 
> > /vol/gcc/src/hg/trunk/local/gcc/dbxout.c: In function 'bool
> > dbxout_block(tree, int, tree, int)':
> > /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: error: '%lu'
> > directive writing between 1 and 20 bytes into a region of size 14
> > [-Werror=format-overflow=]
> >  dbxout_block (tree block, int depth, tree args, int
> > parent_blocknum)
> >  ^~~~
> > /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: note: directive
> > argument in the range [0, 18446744073709551614]
> > In file included from ./tm.h:26,
> >  from /vol/gcc/src/hg/trunk/local/gcc/target.h:52,
> >  from /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:72:
> > /vol/gcc/src/hg/trunk/local/gcc/config/sparc/sol2.h:353:11: note:
> > 'std::sprintf' output between 8 and 27 bytes into a destination of
> > size 20
> >sprintf ((LABEL), "*.L%s%lu", (PREFIX), (unsigned long)(NUM))
> >^
> > /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3855:5: note: in expansion
> > of macro 'ASM_GENERATE_INTERNAL_LABEL'
> >  ASM_GENERATE_INTERNAL_LABEL (buf, "LBB", parent_blocknum);
> >  ^~~
> > 
> > The line numbers are extremely confusing, to say the least, though:
> > the
> > one in the error and the first note refer to the begin of the
> > function
> > definition and only the third note refers to the line of the actual
> > error.
> 
> I agree it looks confusing.  It's the result of the interaction
> between macro tracking and inlining.
> 
> I think it's a general problem that affects many (though not all)
> warnings emitted out of the middle-end.  For instance, the example
> below results in similar output.  The output changes depending on
> whether or not _FORTIFY_SOURCE is defined.
> 
> #include 
> 
> #define FOO(d, s) strcpy (d, s)
> 
> void foo (char *d, const char *s) { FOO (d, s); }
> 
> void sink (char*);
> 
> void bar (void)
> {
>char a[3];
> 
>foo (a, "1234567");   // bug here
> 
>sink (a);
> }
> 
> Without _FORTIFY_SOURCE:
> 
> d.c: In function ‘void bar()’:
> d.c:3:26: warning: ‘void* __builtin_memcpy(void*, const void*, long 
> unsigned int)’ writing 8 bytes into a region of size 3 overflows the 
> destination [-Wstringop-overflow=]
>   #define FOO(d, s) strcpy (d, s)
> ~~~^~
> d.c:5:37: note: in expansion of macro ‘FOO’
>   void foo (char *d, const char *s) { FOO (d, s); }
>   ^~~
> 
> If bar() were a bigger function with multiple calls to foo() it
> could be tricky to tell which of them caused the warning.
> 
> With _FORTIFY_SOURCE:
> 
> In file included from /usr/include/string.h:635,
>   from d.c:1:
> In function ‘char* strcpy(char*, const char*)’,
>  inlined from ‘void bar()’ at d.c:5:37:
> /usr/include/bits/string3.h:110:33: warning: ‘void* 
> __builtin___memcpy_chk(void*, const void*, long unsigned int, long 
> unsigned int)’ writing 8 bytes into a region of size 3 overflows the 
> destination [-Wstringop-overflow=]
> return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
>~~~^~~
> 
> This suffers from the same problem as the first case: the line
> number of the offending call in bar() is missing.
> 
> In both cases the problem is compounded by the diagnostic, as
> a result of folding, referring to __builtin___memcpy_chk while
> the source code contains a call to strcpy.
> 
> I don't know nearly enough about the internals of the diagnostic
> subsystem to tell how difficult it might be to change the output
> to make it more readable.  David Malcolm is the expert on this
> area so he might have an idea what it would take.

[Did you mean to CC me on this, rather than dje?]

I'm not as familiar as I could be on the existing inlining-tracking
implementation - so much so that, in ignorance, I wrote my own version
of it earlier this year (doh!).

The warning implementation reads:

3151warning_at (loc, opt,
3152(integer_onep (range[0])
3153 ? G_("%K%qD writing %E byte into a region "
3154  "of size %E overflows the destination")
3155 : G_("%K%qD writing %E bytes into a region "
3156  "of size %E overflows the destination")),
3157exp, get_callee_fndecl (exp), range[0], 
objsize);


The inlining information is coming from that "%K" in the string, which
leads to tree-pretty-print.c's percent_K_format being called for "exp".
 This overwrites the "loc" provided for the warning_at with
EXPR_LOCATION of exp, and looks at TREE_BLOCK (exp), finding the
outermost containing block within its function, 

Re: [GCC 9][RFC][PATCH] Optimize PHIs with constant arguments better

2017-12-04 Thread Michael Matz
Hi,

On Thu, 30 Nov 2017, Jeff Law wrote:

> And after PHI propagation we have:
>   # m_5 = PHI <10(5), 12(6), 14(7)>
>   # _2 = PHI <10(5), 12(6), 14(7)>
>   # _3 = PHI <320(5), 384(6), 448(7)>
> :
>   goto ; [100.00%]
> 
> DCE will come along and wipe out m_5 and _2 if they are unused.

When I did something along these lines a long time ago I had to be a bit 
careful in not regressing performance.  Every PHI node with constants will 
generate N instructions (with N the arity), there's no coalescing 
possible.  And if the feeding PHI nodes don't go away it increases 
register pressure by (at least) one.

In one situation at that time I basically replaced N PHI nodes and 
N*N instructions (each calculating x op y for each x,y \in PHI) by N*N PHI 
nodes, increasing register pressure unsensibly.  As the N*N instructions 
formed a few expr trees the register pressure before the transformation 
was merely at about N+3.  In the end the benchmark was 30% slower than 
faster :)  (spilling added)

(If I read the patch correctly you don't handle the situation of "x op y" 
where both arguments come from PHI nodes, so it's probably not an issue 
for you, but I thought I'd mention it)


Ciao,
Michael.


[PATCH] gcc: xtensa: enable address sanitizer

2017-12-04 Thread Max Filippov
gcc/
2017-12-04  Max Filippov  

* config/xtensa/xtensa.c (xtensa_asan_shadow_offset): New
function.
(TARGET_ASAN_SHADOW_OFFSET): New macro definition.
* config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): Set to 1 if
ASAN is enabled.
---
 gcc/config/xtensa/xtensa.c | 12 
 gcc/config/xtensa/xtensa.h |  3 ++-
 2 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/gcc/config/xtensa/xtensa.c b/gcc/config/xtensa/xtensa.c
index 1e73b2f4405d..92b9a600df82 100644
--- a/gcc/config/xtensa/xtensa.c
+++ b/gcc/config/xtensa/xtensa.c
@@ -183,6 +183,7 @@ static bool xtensa_hard_regno_mode_ok (unsigned int, 
machine_mode);
 static bool xtensa_modes_tieable_p (machine_mode, machine_mode);
 static HOST_WIDE_INT xtensa_constant_alignment (const_tree, HOST_WIDE_INT);
 static HOST_WIDE_INT xtensa_starting_frame_offset (void);
+static unsigned HOST_WIDE_INT xtensa_asan_shadow_offset (void);
 
 
 
@@ -325,6 +326,9 @@ static HOST_WIDE_INT xtensa_starting_frame_offset (void);
 #undef TARGET_STARTING_FRAME_OFFSET
 #define TARGET_STARTING_FRAME_OFFSET xtensa_starting_frame_offset
 
+#undef TARGET_ASAN_SHADOW_OFFSET
+#define TARGET_ASAN_SHADOW_OFFSET xtensa_asan_shadow_offset
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 
@@ -4413,4 +4417,12 @@ xtensa_starting_frame_offset (void)
   return crtl->outgoing_args_size;
 }
 
+/* Implement TARGET_ASAN_SHADOW_OFFSET.  */
+
+static unsigned HOST_WIDE_INT
+xtensa_asan_shadow_offset (void)
+{
+  return HOST_WIDE_INT_UC (0x1000);
+}
+
 #include "gt-xtensa.h"
diff --git a/gcc/config/xtensa/xtensa.h b/gcc/config/xtensa/xtensa.h
index b4cf53708b3e..1602fae3d9ea 100644
--- a/gcc/config/xtensa/xtensa.h
+++ b/gcc/config/xtensa/xtensa.h
@@ -430,7 +430,8 @@ enum reg_class
 
 #define STACK_GROWS_DOWNWARD 1
 
-#define FRAME_GROWS_DOWNWARD flag_stack_protect
+#define FRAME_GROWS_DOWNWARD (flag_stack_protect \
+ || (flag_sanitize & SANITIZE_ADDRESS) != 0)
 
 /* The ARG_POINTER and FRAME_POINTER are not real Xtensa registers, so
they are eliminated to either the stack pointer or hard frame pointer.  */
-- 
2.1.4



Re: [PATCH] gcc: xtensa: enable address sanitizer

2017-12-04 Thread Jakub Jelinek
On Mon, Dec 04, 2017 at 01:28:53PM -0800, Max Filippov wrote:
> gcc/
> 2017-12-04  Max Filippov  
> 
>   * config/xtensa/xtensa.c (xtensa_asan_shadow_offset): New
>   function.
>   (TARGET_ASAN_SHADOW_OFFSET): New macro definition.
>   * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): Set to 1 if
>   ASAN is enabled.

Is this just for -fsanitize=kernel-address ?  Because I don't see any
libsanitizer/ changes (and those would need to go upstream first anyway).

Jakub


Re: [PATCH #2], PR target/81959, Fix ++int to _Float128 conversion on power9

2017-12-04 Thread Michael Meissner
On Fri, Dec 01, 2017 at 05:33:39PM -0600, Segher Boessenkool wrote:
> Okay for trunk.  Further improvements welcome ;-)  Thanks!

Here is the patch for GCC 7 (the bug shows up in GCC 7).  It is slightly
different due to the surrounding lines in rs6000.c being different.  Is it ok
to apply?  There were no regressions in the build.

This patch will not be needed in GCC 6.

[gcc]
2017-12-04  Michael Meissner  

Back port from trunk
2017-12-01  Michael Meissner  

PR target/81959
* config/rs6000/rs6000.c (rs6000_address_for_fpconvert): Check for
whether we can allocate pseudos before trying to fix an address.
* config/rs6000/rs6000.md (float_si2_hw): Make sure the
memory address is indexed or indirect.
(floatuns_si2_hw2): Likewise.

[gcct/testsuite]
2017-12-04  Michael Meissner  

Back port from trunk
2017-12-01  Michael Meissner  

PR target/81959
* gcc.target/powerpc/pr81959.c: New test.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 255341)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -39938,7 +39938,8 @@ rs6000_address_for_fpconvert (rtx x)
 
   gcc_assert (MEM_P (x));
   addr = XEXP (x, 0);
-  if (! legitimate_indirect_address_p (addr, strict_p)
+  if (can_create_pseudo_p ()
+  && ! legitimate_indirect_address_p (addr, strict_p)
   && ! legitimate_indexed_address_p (addr, strict_p))
 {
   if (GET_CODE (addr) == PRE_INC || GET_CODE (addr) == PRE_DEC)
Index: gcc/config/rs6000/rs6000.md
===
--- gcc/config/rs6000/rs6000.md (revision 255341)
+++ gcc/config/rs6000/rs6000.md (working copy)
@@ -14642,6 +14642,9 @@ (define_insn_and_split "float_si2_
 {
   if (GET_CODE (operands[2]) == SCRATCH)
 operands[2] = gen_reg_rtx (DImode);
+
+  if (MEM_P (operands[1]))
+operands[1] = rs6000_address_for_fpconvert (operands[1]);
 })
 
 (define_insn_and_split "float2"
@@ -14705,6 +14708,9 @@ (define_insn_and_split "floatuns_s
 {
   if (GET_CODE (operands[2]) == SCRATCH)
 operands[2] = gen_reg_rtx (DImode);
+
+  if (MEM_P (operands[1]))
+operands[1] = rs6000_address_for_fpconvert (operands[1]);
 })
 
 (define_insn_and_split "floatuns2"
Index: gcc/testsuite/gcc.target/powerpc/pr81959.c
===
--- gcc/testsuite/gcc.target/powerpc/pr81959.c  (nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/pr81959.c  (working copy)
@@ -0,0 +1,25 @@
+/* { dg-do compile { target { powerpc64*-*-* && lp64 } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-options "-mpower9-vector -O2 -mfloat128" } */
+
+/* PR 81959, the compiler raised on unrecognizable insn message in converting
+   int to __float128, where the int had a PRE_INC in the address.  */
+
+#ifndef ARRAY_SIZE
+#define ARRAY_SIZE 1024
+#endif
+
+void
+convert_int_to_float128 (__float128 * __restrict__ p,
+int * __restrict__ q)
+{
+  unsigned long i;
+
+  for (i = 0; i < ARRAY_SIZE; i++)
+p[i] = (__float128)q[i];
+}
+
+/* { dg-final { scan-assembler {\mlfiwax\M|\mlxsiwax\M} } } */
+/* { dg-final { scan-assembler {\mxscvsdqp\M}   } } */
+/* { dg-final { scan-assembler-not {\mmtvsrd\M} } } */
+/* { dg-final { scan-assembler-not {\mmtvsrw[sz]\M} } } */


Re: [PATCH] gcc: xtensa: enable address sanitizer

2017-12-04 Thread Max Filippov
On Mon, Dec 4, 2017 at 1:31 PM, Jakub Jelinek  wrote:
> On Mon, Dec 04, 2017 at 01:28:53PM -0800, Max Filippov wrote:
>> gcc/
>> 2017-12-04  Max Filippov  
>>
>>   * config/xtensa/xtensa.c (xtensa_asan_shadow_offset): New
>>   function.
>>   (TARGET_ASAN_SHADOW_OFFSET): New macro definition.
>>   * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): Set to 1 if
>>   ASAN is enabled.
>
> Is this just for -fsanitize=kernel-address ?  Because I don't see any
> libsanitizer/ changes (and those would need to go upstream first anyway).

Yes, it's for the -fsanitize=kernel-address. I'll port libsanitizer later.

-- 
Thanks.
-- Max


[PATCH] Fix typos in riscv register save/restore.

2017-12-04 Thread Jim Wilson
This patch fixes a few typos with computing registers to save/restore in
prologue/epilogue.  This fixes the github riscv/riscv-gcc issue #111.  The
typos are harmless, as these registers are call clobbered, but this was a
problem for one person modifying gcc for academic reasons, and hence the
typos need to be fixed.

Tested with a gcc make check, there were no regressions.  Committed.

gcc/
* config/riscv/riscv.c (riscv_for_each_saved_reg): Use GP_REG_LAST
instead of GP_REG_LAST-1.
(riscv_adjust_libcall_cfi_prologue): Likewise.
(riscv_adjust_libcall_cri_epilogue): Likewise.
* config/riscv/riscv.h (CALL_USED_REGISTERS): Change a6 to t6 in
comment.
---
 gcc/config/riscv/riscv.c | 6 +++---
 gcc/config/riscv/riscv.h | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/gcc/config/riscv/riscv.c b/gcc/config/riscv/riscv.c
index 5547d68..c7283d0 100644
--- a/gcc/config/riscv/riscv.c
+++ b/gcc/config/riscv/riscv.c
@@ -3336,7 +3336,7 @@ riscv_for_each_saved_reg (HOST_WIDE_INT sp_offset, 
riscv_save_restore_fn fn)
 
   /* Save the link register and s-registers. */
   offset = cfun->machine->frame.gp_sp_offset - sp_offset;
-  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST-1; regno++)
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
riscv_save_restore_reg (word_mode, regno, offset, fn);
@@ -3424,7 +3424,7 @@ riscv_adjust_libcall_cfi_prologue ()
   int saved_size = cfun->machine->frame.save_libcall_adjustment;
   int offset;
 
-  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST-1; regno++)
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
/* The save order is ra, s0, s1, s2 to s11.  */
@@ -3552,7 +3552,7 @@ riscv_adjust_libcall_cfi_epilogue ()
   dwarf = alloc_reg_note (REG_CFA_ADJUST_CFA, adjust_sp_rtx,
  dwarf);
 
-  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST-1; regno++)
+  for (int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
 if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
   {
reg = gen_rtx_REG (SImode, regno);
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index fe09e84..feada72 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -233,7 +233,7 @@ along with GCC; see the file COPYING3.  If not see
   1, 1 \
 }
 
-/* a0-a7, t0-a6, fa0-fa7, and ft0-ft11 are volatile across calls.
+/* a0-a7, t0-t6, fa0-fa7, and ft0-ft11 are volatile across calls.
The call RTLs themselves clobber ra.  */
 
 #define CALL_USED_REGISTERS\
-- 
2.7.4



[PATCH] emit a trap for buffer overflow in placement new

2017-12-04 Thread Martin Sebor

The -Wplacement-new option warns for buffer overflow in placement
new expressions with objects of constant sizes, but because it's
implemented completely in the C++ front end it misses the more
interesting non-constant sizes.

The attached patch instruments both forms of operator placement
new to emit a trap when __builtin_object_size() determines that
the pointer points to an object less than the specified number
of bytes.  This is done only when _FORTIFY_SOURCE is defined
to a non-zero value.  This makes it possible to prevent buffer
overflow in most of the same cases as in built-ins like strcpy,
though without warnings when the size is nor a C++ constant
integer expression.

On x86_64-linux it passes testing with no apparent regressions.
Can anyone think of problems with this solution?  If not, given
its simplicity, would it be appropriate even at this stage?

Martin

PS I added the tests to the G++ test suite rather than to
libstdc++ because the latter doesn't understand the DejaGnu
scan-tree-dump directive (looks like it's missing an import
for the .exp file that defines it).
libstdc++-v3/ChangeLog:

	* libsupc++/new (placement new)[_FORTIFY_SOURCE]: Emit
	__builtin_trap.

gcc/ChangeLog:

	* g++.dg/init/new49.C: New test.
	* g++.dg/init/new50.C: New test.

diff --git a/gcc/testsuite/g++.dg/init/new49.C b/gcc/testsuite/g++.dg/init/new49.C
new file mode 100644
index 000..5713aef
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/new49.C
@@ -0,0 +1,57 @@
+// { dg-do compile }
+// { dg-options "-O2 -Wno-placement-new -fdump-tree-optimized" }
+
+#define _FORTIFY_SOURCE 1
+#include 
+
+void sink (void*);
+
+void test_placement_new_single_auto ()
+{
+  char buf[2];
+  long *p = new (buf) long;
+  sink (p);
+}
+
+void* test_placement_new_signle_allocated ()
+{
+  char *buf = (char*)::operator new (2);
+  long *p = new (buf) long;
+  return p;
+}
+
+void test_placement_new_array_auto_const ()
+{
+  char buf[sizeof (int)];
+  int *p = new (buf) int[2];
+  sink (p);
+}
+
+void* test_placement_new_array_allocated_const ()
+{
+  char *buf = (char*)::operator new (sizeof (int));
+  int *p = new (buf) int[2];
+  return p;
+}
+
+void test_placement_new_array_auto_range (unsigned n)
+{
+  if (n < 2)
+n = 2;
+
+  char buf[sizeof (int)];
+  int *p = new (buf) int[n];
+  sink (p);
+}
+
+void test_placement_new_array_allocated_range (unsigned n)
+{
+  if (n < 2)
+n = 2;
+
+  char *buf = (char*)::operator new (sizeof (int));
+  int *p = new (buf) int[n];
+  sink (p);
+}
+
+// { dg-final { scan-tree-dump-times "__builtin_trap" 6 "optimized" } }
diff --git a/gcc/testsuite/g++.dg/init/new50.C b/gcc/testsuite/g++.dg/init/new50.C
new file mode 100644
index 000..1490992
--- /dev/null
+++ b/gcc/testsuite/g++.dg/init/new50.C
@@ -0,0 +1,22 @@
+// { dg-do compile }
+// { dg-options "-O2 -fdump-tree-optimized" }
+
+#include 
+
+void sink (void*);
+
+void test_placement_new_single ()
+{
+  char buf[2];
+  long *p = new (buf) long;   // { dg-warning "placement new" }
+  sink (p);
+}
+
+void test_placement_new_array ()
+{
+  char buf[2];
+  int *p = new (buf) int[2];  // { dg-warning "placement new" }
+  sink (p);
+}
+
+// { dg-final { scan-tree-dump-not "__builtin_trap" "optimized" } }
diff --git a/libstdc++-v3/libsupc++/new b/libstdc++-v3/libsupc++/new
index 0e408b1..ef7ea6d 100644
--- a/libstdc++-v3/libsupc++/new
+++ b/libstdc++-v3/libsupc++/new
@@ -165,10 +165,25 @@ void operator delete[](void*, std::size_t, std::align_val_t)
 #endif // __cpp_aligned_new
 
 // Default placement versions of operator new.
-inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
-inline void* operator new[](std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
-{ return __p; }
+inline void* operator new(std::size_t __n, void* __p) _GLIBCXX_USE_NOEXCEPT
+{
+#ifdef _FORTIFY_SOURCE && _FORTIFY_SOURCE > 0
+  if (__builtin_object_size (__p, 0) < __n)
+__builtin_trap ();
+#endif
+
+  return __p;
+}
+
+inline void* operator new[](std::size_t __n, void* __p) _GLIBCXX_USE_NOEXCEPT
+{
+#ifdef _FORTIFY_SOURCE && _FORTIFY_SOURCE > 0
+  if (__builtin_object_size (__p, 0) < __n)
+__builtin_trap ();
+#endif
+
+  return __p;
+}
 
 // Default placement versions of operator delete.
 inline void operator delete  (void*, void*) _GLIBCXX_USE_NOEXCEPT { }


Re: [PATCH] emit a trap for buffer overflow in placement new

2017-12-04 Thread Marc Glisse

On Mon, 4 Dec 2017, Martin Sebor wrote:


The -Wplacement-new option warns for buffer overflow in placement
new expressions with objects of constant sizes, but because it's
implemented completely in the C++ front end it misses the more
interesting non-constant sizes.

The attached patch instruments both forms of operator placement
new to emit a trap when __builtin_object_size() determines that
the pointer points to an object less than the specified number
of bytes.  This is done only when _FORTIFY_SOURCE is defined
to a non-zero value.  This makes it possible to prevent buffer
overflow in most of the same cases as in built-ins like strcpy,
though without warnings when the size is nor a C++ constant
integer expression.

On x86_64-linux it passes testing with no apparent regressions.
Can anyone think of problems with this solution?  If not, given
its simplicity, would it be appropriate even at this stage?


AFAIK, one can call this operator new manually on any pointer, including 
one-past-the-end pointers and null pointers. It is only with new 
expressions that the limitation comes in (because it runs a constructor 
afterwards). Not that people often do that...


--
Marc Glisse


C++ PATCH for c++/82373, constexpr if and non-constant condition

2017-12-04 Thread Jason Merrill
We need to require a constant-expression here, not just check for it.

The constexpr.c change was necessary to fix constexpr-lambda17.C.

Tested x86_64-pc-linux-gnu, applying to trunk.
commit 6ba5b86d80f9aea859b7d2b9bf7c517dadf3a701
Author: Jason Merrill 
Date:   Mon Dec 4 15:40:51 2017 -0500

PR c++/83273 - constexpr if allows non-constant condition

* semantics.c (finish_if_stmt_cond): Use require_constant_expression
rather than is_constant_expression.
* constexpr.c (potential_constant_expression_1) [LAMBDA_EXPR]: Allow
in C++17.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index f0370cc2aff..6dfecfc1b14 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -5524,6 +5524,14 @@ potential_constant_expression_1 (tree t, bool want_rval, 
bool strict, bool now,
   return RECUR (STMT_EXPR_STMT (t), rval);
 
 case LAMBDA_EXPR:
+  if (cxx_dialect >= cxx17)
+   /* In C++17 lambdas can be constexpr, don't give up yet.  */
+   return true;
+  else if (flags & tf_error)
+   error_at (loc, "lambda-expression is not a constant expression "
+ "before C++17");
+  return false;
+
 case DYNAMIC_CAST_EXPR:
 case PSEUDO_DTOR_EXPR:
 case NEW_EXPR:
diff --git a/gcc/cp/semantics.c b/gcc/cp/semantics.c
index e2daab4339e..c6726324ae6 100644
--- a/gcc/cp/semantics.c
+++ b/gcc/cp/semantics.c
@@ -731,7 +731,7 @@ finish_if_stmt_cond (tree cond, tree if_stmt)
 {
   cond = maybe_convert_cond (cond);
   if (IF_STMT_CONSTEXPR_P (if_stmt)
-  && is_constant_expression (cond)
+  && require_constant_expression (cond)
   && !value_dependent_expression_p (cond))
 {
   cond = instantiate_non_dependent_expr (cond);
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if12.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-if12.C
index 4e887f3b939..db984a64677 100644
--- a/gcc/testsuite/g++.dg/cpp1z/constexpr-if12.C
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if12.C
@@ -7,7 +7,7 @@ struct T {
 
 template 
 constexpr auto bf(T t) {
-if constexpr(t.foo()) {
+if constexpr(t.foo()) {// { dg-error "constant expression" }
 return false;
 }
 return true;
diff --git a/gcc/testsuite/g++.dg/cpp1z/constexpr-if13.C 
b/gcc/testsuite/g++.dg/cpp1z/constexpr-if13.C
new file mode 100644
index 000..55dbfd902ee
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1z/constexpr-if13.C
@@ -0,0 +1,10 @@
+// PR c++/83273
+// { dg-options -std=c++17 }
+
+int main()
+{
+  auto d = 42;
+  if constexpr (d > 0) {   // { dg-error "constant expression" }
+  return d;
+  }
+}


Re: [v3 PATCH] Implement LWG 2221, No formatted output operator for nullptr

2017-12-04 Thread Jonathan Wakely

On 03/12/17 23:08 +0200, Ville Voutilainen wrote:

Tested on Linux-x64.

2017-11-14  Ville Voutilainen  

   Implement LWG 2221
   * include/std/ostream (operator<<(nullptr_t)): New.
   * testsuite/27_io/basic_ostream/inserters_other/char/lwg2221.cc: New.



diff --git a/libstdc++-v3/include/std/ostream b/libstdc++-v3/include/std/ostream
index f7cab03..18011bc 100644
--- a/libstdc++-v3/include/std/ostream
+++ b/libstdc++-v3/include/std/ostream
@@ -245,6 +245,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  operator<<(const void* __p)
  { return _M_insert(__p); }

+#if __cplusplus > 201402L
+  __ostream_type&
+  operator<<(nullptr_t)
+  { return *this << "nullptr"; }
+#endif


As discussed on IRC, this requires a new symbol to be exported for the
std::ostream and std::wostream explicit instantiations, or the new
test will fail to link at -O0.

That should wait for stage 1.



Re: [PATCH] Fix -Wsystem-header warnings in libstdc++

2017-12-04 Thread Jonathan Wakely

On 01/12/17 16:10 +, Jonathan Wakely wrote:

On 01/12/17 15:11 +, Jonathan Wakely wrote:

This fixes a number of warnings that show up with -Wsystem-headers


This fixes some more.


And some more, this time for -Wunused, -Wcomment, and
-Wvariadic-macros.

Tested powerpc64le-linux, committed to trunk.

The purpose of these is not only to reduce noise if users use
-Wsystem-headers, but to get us to a place where PR 50871 can be fixed
(i.e. build and test libstdc++ without having useful warnings
suppressed by the #pragma GCC system_header lines).

commit e04e77ee6202b69652d033d419283bc35d67897a
Author: Jonathan Wakely 
Date:   Mon Dec 4 20:49:42 2017 +

Fix warnings in 

* include/bits/regex_compiler.tcc: Use C-style comment to work around
PR preprocessor/61638.
(__INSERT_REGEX_MATCHER): Replace GNU extension with __VA_ARGS__.

diff --git a/libstdc++-v3/include/bits/regex_compiler.tcc b/libstdc++-v3/include/bits/regex_compiler.tcc
index 1f7dd91b643..0c89800ea94 100644
--- a/libstdc++-v3/include/bits/regex_compiler.tcc
+++ b/libstdc++-v3/include/bits/regex_compiler.tcc
@@ -30,8 +30,9 @@
 
 // FIXME make comments doxygen format.
 
+/*
 // This compiler refers to "Regular Expression Matching Can Be Simple And Fast"
-// (http://swtch.com/~rsc/regexp/regexp1.html";),
+// (http://swtch.com/~rsc/regexp/regexp1.html),
 // but doesn't strictly follow it.
 //
 // When compiling, states are *chained* instead of tree- or graph-constructed.
@@ -51,7 +52,8 @@
 // article.
 //
 // That's why we introduced dummy node here -- "end_tag" is a dummy node.
-// All dummy node will be eliminated at the end of compiling process.
+// All dummy nodes will be eliminated at the end of compilation.
+*/
 
 namespace std _GLIBCXX_VISIBILITY(default)
 {
@@ -292,18 +294,18 @@ namespace __detail
   return true;
 }
 
-#define __INSERT_REGEX_MATCHER(__func, args...)\
+#define __INSERT_REGEX_MATCHER(__func, ...)\
 	do\
 	  if (!(_M_flags & regex_constants::icase))\
 	if (!(_M_flags & regex_constants::collate))\
-	  __func(args);\
+	  __func(__VA_ARGS__);\
 	else\
-	  __func(args);\
+	  __func(__VA_ARGS__);\
 	  else\
 	if (!(_M_flags & regex_constants::collate))\
-	  __func(args);\
+	  __func(__VA_ARGS__);\
 	else\
-	  __func(args);\
+	  __func(__VA_ARGS__);\
 	while (false)
 
   template

commit f2d7faeec2d63743c5acdc1dc77fca7ba64fce1f
Author: Jonathan Wakely 
Date:   Mon Dec 4 16:08:34 2017 +

Fix -Wunused warnings in libstdc++ headers

* config/io/basic_file_stdio.h (__basic_file): Remove name of unused
parameter.
* include/bits/boost_concept_check.h: Add pragmas to disable
-Wunused-local-typedef warnings.
* include/bits/codecvt.h (codecvt_byname)
(codecvt_byname): Remove name of unused
parameter.
* include/bits/locale_facets_nonio.tcc (time_get::do_get_weekday)
(time_get::do_get_monthname, time_get::do_get_year): Remove unused
variables.
* include/std/bitset (_Base_bitset<0>::_M_getword): Remove name of
unused parameter.
* include/std/streambuf (_IsUnused): Define.
(basic_streambuf::imbue, basic_streambuf::pbackfail)
(basic_streambuf::overflow): Add macro to unused parameters.
* testsuite/24_iterators/operations/prev_neg.cc: Adjust dg-error.

diff --git a/libstdc++-v3/config/io/basic_file_stdio.h b/libstdc++-v3/config/io/basic_file_stdio.h
index f959ea534cb..76435e417ef 100644
--- a/libstdc++-v3/config/io/basic_file_stdio.h
+++ b/libstdc++-v3/config/io/basic_file_stdio.h
@@ -63,7 +63,7 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
   __basic_file(__c_lock* __lock = 0) throw ();
 
 #if __cplusplus >= 201103L
-  __basic_file(__basic_file&& __rv, __c_lock* __lock = 0) noexcept
+  __basic_file(__basic_file&& __rv, __c_lock* = 0) noexcept
   : _M_cfile(__rv._M_cfile), _M_cfile_created(__rv._M_cfile_created)
   {
 	__rv._M_cfile = nullptr;
diff --git a/libstdc++-v3/include/bits/boost_concept_check.h b/libstdc++-v3/include/bits/boost_concept_check.h
index fb9a643c869..43ab03415e5 100644
--- a/libstdc++-v3/include/bits/boost_concept_check.h
+++ b/libstdc++-v3/include/bits/boost_concept_check.h
@@ -48,6 +48,9 @@ namespace __gnu_cxx _GLIBCXX_VISIBILITY(default)
 {
 _GLIBCXX_BEGIN_NAMESPACE_VERSION
 
+#pragma GCC diagnostic push
+#pragma GCC diagnostic ignored "-Wunused-local-typedefs"
+
 #define _IsUnused __attribute__ ((__unused__))
 
 // When the C-C code is in use, we would like this function to do as little
@@ -783,6 +786,7 @@ struct _Aux_require_same<_Tp,_Tp> { typedef _Tp _Type; };
 _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace
 
+#pragma GCC diagnostic pop
 #undef _IsUnused
 
 #endif // _GLIBCXX_BOOST_CONCEPT_CHECK
diff --git a/libstdc++-v3/include/bits/codecvt.h b/libstdc++-v3/include/bits/codecvt.h
index fc2da321ab6

[PR 83141] Prevent SRA from removing type changing assignment

2017-12-04 Thread Martin Jambor
Hi,

this is a followup to Richi's
https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02396.html to fix PR
83141.  The basic idea is simple, be just as conservative about type
changing MEM_REFs as we are about actual VCEs.

I have checked how that would affect compilation of SPEC 2006 and (non
LTO) Mozilla Firefox and am happy to report that the difference was
tiny.  However, I had to make the test less strict, otherwise testcase
gcc.dg/guality/pr54970.c kept failing because it contains folded memcpy
and expects us to track values accross:

  int a[] = { 1, 2, 3 };
  /* ... */
  __builtin_memcpy (&a, (int [3]) { 4, 5, 6 }, sizeof (a));
/* { dg-final { gdb-test 31 "a\[0\]" "4" } } */
/* { dg-final { gdb-test 31 "a\[1\]" "5" } } */
/* { dg-final { gdb-test 31 "a\[2\]" "6" } } */
  
SRA is able to load replacement of a[0] directly from the temporary
array which is apparently necessary to generate proper debug info.  I
have therefore allowed the current transformation to go forward if the
source does not contain any padding or if it is a read-only declaration.
I would especially appreciate a review of the type_contains_padding_p
predicate as I am not sure it caches all cases.

The added call to contains_vce_or_bfcref_p in build_accesses_from_assign
is not strictly necessary, it is there to avoid attempting total
scalarization in vain.  This is also tested by the dump scan in the
added testcase.

A very similar patch has passed bootstrap and testing on x86_64, this
exact one is undergoing both now.  OK for trunk if it passes too?

Thanks,

Martin


2017-12-04  Martin Jambor  

PR tree-optimization/83141

* tree-sra.c (type_contains_padding_p): New function.
(contains_vce_or_bfcref_p): Move up in the file, also test for
MEM_REFs implicitely changing types with padding.  Remove inline
keyword.
(build_accesses_from_assign): Check contains_vce_or_bfcref_p
before setting bit in should_scalarize_away_bitmap.

testsuite/
* gcc.dg/tree-ssa/pr83141.c: New test.
---
 gcc/testsuite/gcc.dg/tree-ssa/pr83141.c |  31 +
 gcc/tree-sra.c  | 116 ++--
 2 files changed, 128 insertions(+), 19 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83141.c

diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c 
b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
new file mode 100644
index 000..86caf66025b
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
@@ -0,0 +1,31 @@
+/* { dg-do run } */
+/* { dg-options "-O -fdump-tree-esra-details" } */
+
+volatile short vs;
+volatile long vl;
+
+struct A { short s; long i; long j; };
+struct A a, b;
+void foo ()
+{
+  struct A c;
+  __builtin_memcpy (&c, &b, sizeof (struct A));
+  __builtin_memcpy (&a, &c, sizeof (struct A));
+
+  vs = c.s;
+  vl = c.i;
+  vl = c.j;
+}
+int main()
+{
+  __builtin_memset (&b, 0, sizeof (struct A));
+  b.s = 1;
+  __builtin_memcpy ((char *)&b+2, &b, 2);
+  foo ();
+  __builtin_memcpy (&a, (char *)&a+2, 2);
+  if (a.s != 1)
+__builtin_abort ();
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump-not "Will attempt to totally scalarize" "esra" 
} } */
diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
index 866cff0edb0..0a9622e77f8 100644
--- a/gcc/tree-sra.c
+++ b/gcc/tree-sra.c
@@ -1141,6 +1141,101 @@ contains_view_convert_expr_p (const_tree ref)
   return false;
 }
 
+/* Return false if we can determine that TYPE has no padding, otherwise return
+   true.  */
+
+static bool
+type_contains_padding_p (tree type)
+{
+  if (is_gimple_reg_type (type))
+return false;
+
+  if (!tree_fits_uhwi_p (TYPE_SIZE (type)))
+return true;
+
+  switch (TREE_CODE (type))
+{
+case RECORD_TYPE:
+  {
+   unsigned HOST_WIDE_INT pos = 0;
+   for (tree fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
+ if (TREE_CODE (fld) == FIELD_DECL)
+   {
+ tree ft = TREE_TYPE (fld);
+
+ if (DECL_BIT_FIELD (fld)
+ || DECL_PADDING_P (fld)
+ || !tree_fits_uhwi_p (TYPE_SIZE (ft)))
+   return true;
+
+ tree t = bit_position (fld);
+ if (!tree_fits_uhwi_p (t))
+   return true;
+ unsigned HOST_WIDE_INT bp = tree_to_uhwi (t);
+ if (pos != bp)
+   return true;
+
+ pos += tree_to_uhwi (TYPE_SIZE (ft));
+ if (type_contains_padding_p (ft))
+   return true;
+   }
+
+   if (pos != tree_to_uhwi (TYPE_SIZE (type)))
+ return true;
+
+   return false;
+  }
+
+case ARRAY_TYPE:
+  return type_contains_padding_p (TYPE_SIZE (type));
+
+default:
+  return true;
+}
+  gcc_unreachable ();
+}
+
+/* Return true if REF contains a MEM_REF that performs type conversion from a
+   type with padding or any VIEW_CONVERT_EXPR or a COMPONENT_

Re: [PATCH v2 3/4] [SPARC] Errata workaround for GRLIB-TN-0010

2017-12-04 Thread Eric Botcazou
> No, there was no particular reason. mem_ref seems like a better choice
> if it detects more types of loads.

Right, it's supposed to detect all types of loads.  Here's what I have 
installed on the mainline and 7 branch.


2017-12-04  Eric Botcazou  

* config/sparc/sparc.c (sparc_do_work_around_errata): Use mem_ref
instead of MEM_P in a couple more places.  Fix formatting issues.

-- 
Eric BotcazouIndex: config/sparc/sparc.c
===
--- config/sparc/sparc.c	(revision 255291)
+++ config/sparc/sparc.c	(working copy)
@@ -1050,43 +1050,42 @@ sparc_do_work_around_errata (void)
   /* Look into the instruction in a delay slot.  */
   if (NONJUMP_INSN_P (insn)
 	  && (seq = dyn_cast  (PATTERN (insn
-	  {
-	jump = seq->insn (0);
-	insn = seq->insn (1);
-	  }
+	{
+	  jump = seq->insn (0);
+	  insn = seq->insn (1);
+	}
   else if (JUMP_P (insn))
 	jump = insn;
   else
 	jump = NULL;
 
-  /* Place a NOP at the branch target of an integer branch if it is
-	 a floating-point operation or a floating-point branch.  */
+  /* Place a NOP at the branch target of an integer branch if it is a
+	 floating-point operation or a floating-point branch.  */
   if (sparc_fix_gr712rc
-	  && jump != NULL_RTX
+	  && jump
 	  && get_attr_branch_type (jump) == BRANCH_TYPE_ICC)
 	{
 	  rtx_insn *target = next_active_insn (JUMP_LABEL_AS_INSN (jump));
 	  if (target
 	  && (fpop_insn_p (target)
-		  || ((JUMP_P (target)
-		   && get_attr_branch_type (target) == BRANCH_TYPE_FCC
+		  || (JUMP_P (target)
+		  && get_attr_branch_type (target) == BRANCH_TYPE_FCC)))
 	emit_insn_before (gen_nop (), target);
 	}
 
-  /* Insert a NOP between load instruction and atomic
-	 instruction.  Insert a NOP at branch target if load
-	 in delay slot and atomic instruction at branch target.  */
+  /* Insert a NOP between load instruction and atomic instruction.  Insert
+	 a NOP at branch target if there is a load in delay slot and an atomic
+	 instruction at branch target.  */
   if (sparc_fix_ut700
 	  && NONJUMP_INSN_P (insn)
 	  && (set = single_set (insn)) != NULL_RTX
-	  && MEM_P (SET_SRC (set))
+	  && mem_ref (SET_SRC (set))
 	  && REG_P (SET_DEST (set)))
 	{
 	  if (jump)
 	{
 	  rtx_insn *target = next_active_insn (JUMP_LABEL_AS_INSN (jump));
-	  if (target
-		  && atomic_insn_for_leon3_p (target))
+	  if (target && atomic_insn_for_leon3_p (target))
 		emit_insn_before (gen_nop (), target);
 	}
 
@@ -1098,7 +1097,9 @@ sparc_do_work_around_errata (void)
 	insert_nop = true;
 	}
 
-  /* Look for sequences that could trigger the GRLIB-TN-0013 errata.  */
+  /* Look for a sequence that starts with a fdiv or fsqrt instruction and
+	 ends with another fdiv or fsqrt instruction with no dependencies on
+	 the former, along with an appropriate pattern in between.  */
   if (sparc_fix_lost_divsqrt
 	  && NONJUMP_INSN_P (insn)
 	  && div_sqrt_insn_p (insn))
@@ -1229,8 +1230,8 @@ sparc_do_work_around_errata (void)
 		 then the sequence cannot be problematic.  */
 	  if (i == 0)
 		{
-		  if (((set = single_set (after)) != NULL_RTX)
-		  && (MEM_P (SET_DEST (set)) || MEM_P (SET_SRC (set
+		  if ((set = single_set (after)) != NULL_RTX
+		  && (MEM_P (SET_DEST (set)) || mem_ref (SET_SRC (set
 		break;
 
 		  after = next_active_insn (after);
@@ -1240,21 +1241,21 @@ sparc_do_work_around_errata (void)
 
 	  /* Add NOP if third instruction is a store.  */
 	  if (i == 1
-		  && ((set = single_set (after)) != NULL_RTX)
+		  && (set = single_set (after)) != NULL_RTX
 		  && MEM_P (SET_DEST (set)))
 		insert_nop = true;
 	}
 	}
-  else
+
   /* Look for a single-word load into an odd-numbered FP register.  */
-  if (sparc_fix_at697f
-	  && NONJUMP_INSN_P (insn)
-	  && (set = single_set (insn)) != NULL_RTX
-	  && GET_MODE_SIZE (GET_MODE (SET_SRC (set))) == 4
-	  && MEM_P (SET_SRC (set))
-	  && REG_P (SET_DEST (set))
-	  && REGNO (SET_DEST (set)) > 31
-	  && REGNO (SET_DEST (set)) % 2 != 0)
+  else if (sparc_fix_at697f
+	   && NONJUMP_INSN_P (insn)
+	   && (set = single_set (insn)) != NULL_RTX
+	   && GET_MODE_SIZE (GET_MODE (SET_SRC (set))) == 4
+	   && mem_ref (SET_SRC (set))
+	   && REG_P (SET_DEST (set))
+	   && REGNO (SET_DEST (set)) > 31
+	   && REGNO (SET_DEST (set)) % 2 != 0)
 	{
 	  /* The wrong dependency is on the enclosing double register.  */
 	  const unsigned int x = REGNO (SET_DEST (set)) - 1;


Re: [PATCH] Fix -Wsystem-header warnings in libstdc++

2017-12-04 Thread Jonathan Wakely

On 04/12/17 23:07 +, Jonathan Wakely wrote:

On 01/12/17 16:10 +, Jonathan Wakely wrote:

On 01/12/17 15:11 +, Jonathan Wakely wrote:

This fixes a number of warnings that show up with -Wsystem-headers


This fixes some more.


And some more, this time for -Wunused, -Wcomment, and
-Wvariadic-macros.

Tested powerpc64le-linux, committed to trunk.

The purpose of these is not only to reduce noise if users use
-Wsystem-headers, but to get us to a place where PR 50871 can be fixed
(i.e. build and test libstdc++ without having useful warnings
suppressed by the #pragma GCC system_header lines).


This warning remains:

/home/jwakely/gcc/8/include/c++/8.0.0/bits/valarray_array.h:56:12: warning: 
type qualifiers ignored on function return type [-Wignored-qualifiers]
inline _Tp*__restrict__
   ^~~

I think we can just remove the __restrict__.



Re: [RFA][PATCH][tree-optimization/78496] 03/03 Embed and exploit EVRP analysis within DOM

2017-12-04 Thread Jeff Law
On 12/04/2017 03:42 AM, Richard Biener wrote:
> On Mon, Dec 4, 2017 at 6:59 AM, Jeff Law  wrote:
>> And finally, the meat of the fix for pr78496.  This is largely the patch
>> that was posed right when stage1 closed with just minor updates.
>>
>> This patch embeds evrp analysis into DOM and exploits it in the minimal
>> way necessary to fix pr78496 without regressing other tests in the
>> testsuite.
>>
>> The key effect we're looking for here is to pick up a slew of jump
>> threads in DOM2 by exploiting the range information during jump threading.
>>
>> These aren't handled well in the standard tree-vrp jump threading -- we
>> could abuse ASSERT_EXPRs further, but it'd just be ugly and probably
>> computationally expensive.
>>
>> The ranges we want fall out naturally from a dominator walk, hence all
>> the recent work around pulling out EVRP analysis into a little module
>> other passes could reuse.
>>
>> With these changes we pick up all the missing jump thread opportunities
>> in DOM for pr78496.  Sadly, I've been able to pull together an automated
>> test that I like.  About the best I could come up with would be to
>> verify that a large number of jump threads were realized during DOM2.
>> But that still seems rather fragile.  I also looked at examining the
>> results of PRE, but the partial redundancies that were originally the
>> source of complaints in that BZ have already been fixed.  I also looked
>> at perhaps looking for PHIs with constant args and then perhaps trying
>> to filter those, but got nowhere.
>>
>> So there's no direct test for pr78496.  Sigh.
>>
>>
>>
>> Running EVRP analysis in DOM had an unintended side effect.  Namely that
>> SSA_NAMEs that got created after the EVRP optimization pass would have
>> range information attached to them.  That caused two warning regressions
>> with -O1 code in the C++ and Fortran testsuites.  The problem is the
>> ranges attached to the new SSA_NAMES were very wide and there was code
>> left on an unexecutable path that called an allocation routine (C++) and
>> a memset (Fortran) using those names as arguments.  The uber-wide ranges
>> in those circumstances triggered the false positive warnings.
>>
>> By exploiting the EVRP data during the standard part of DOM to optimize
>> conditionals, we're able to prove the paths in question simply aren't
>> executable.  We remove them and the warnings go away.
>>
>> This work compromises builtin-unreachable-6.  So the original test it
>> kept and -fno-tree-dominator-opts is added to keep it from being
>> compromised.  A new builtin-unreachable-6a test is created to very that
>> DOM does indeed remove all the __builtin_unreachable paths.
>>
>> This work also results in us optimizing 20030922-2.c again (conditional
>> equivalences).  EVRP analysis will note the conditional equivalence.  We
>> don't propagate anything from EVRP analysis at this time, but we do use
>> it to try and simplify GIMPLE_CONDs to a compile-time constant which is
>> what happens in this case.
>>
>> I plan to check this in if/when the first two patches are approved.
> 
> Looks good to me.  Btw, did you look at adding a GIMPLE testcase
> for 78496?  You'd have stable IL into DOM then...
I thought about it.  But we're still stuck doing something like counting
jump threads realized.  We have several tests of this nature and they
all feel rather fragile.  So I'm definitely looking for better ways to
handle testing of this stuff.

jeff


[PATCH] Don't add __builtin_unreachable for -Wreturn-type places at -O0

2017-12-04 Thread Jakub Jelinek
Hi!

When debugging what turned out to be just a -Wreturn-type reported
failure now being a fatal problem even at -O0, it occurs to me that
because we are not optimizing at -O0, adding the __builtin_unreachable
doesn't bring many optimization advantages (practically just optimize
away epilogue) and so I think I'd prefer to only add those
when optimizing.

Similarly, for UBSan, we instrument these with -fsanitize=return,
part of -fsanitize=unreachable, where we provide nice runtime errors,
but when somebody intentionally turns that off (e.g. to quickly get
past such issues to look for something else), say with
-fsanitize=undefined -fno-sanitize=return, the same error will be
reported as __builtin_unreachable with a cryptic  location.
I think it is better to just leave it out if -fsanitize=unreachable.

Changed thusly, bootstrapped/regtested on x86_64-linux and i686-linux, ok
for trunk?

2017-12-05  Jakub Jelinek  

* cp-gimplify.c (cp_maybe_instrument_return): Don't add
__builtin_unreachable if -O0 or if -fsanitize=unreachable.

* g++.dg/missing-return.C: Add -O to dg-options.

--- gcc/cp/cp-gimplify.c.jj 2017-12-04 22:29:04.759741988 +0100
+++ gcc/cp/cp-gimplify.c2017-12-04 22:37:07.283784470 +0100
@@ -1554,6 +1554,18 @@ cp_maybe_instrument_return (tree fndecl)
   || !targetm.warn_func_return (fndecl))
 return;
 
+  if (!sanitize_flags_p (SANITIZE_RETURN, fndecl)
+  /* Don't add __builtin_unreachable () if not optimizing, it will not
+improve any optimizations in that case, just break UB code.
+Don't add it if -fsanitize=unreachable -fno-sanitize=return either,
+UBSan covers this with ubsan_instrument_return above where sufficient
+information is provided, while the __builtin_unreachable () below
+if return sanitization is disabled will just result in hard to
+understand runtime error without location.  */
+  && (!optimize
+ || sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl)))
+return;
+
   tree t = DECL_SAVED_TREE (fndecl);
   while (t)
 {
--- gcc/testsuite/g++.dg/missing-return.C.jj2017-11-06 17:24:11.0 
+0100
+++ gcc/testsuite/g++.dg/missing-return.C   2017-12-05 00:25:13.028800127 
+0100
@@ -1,5 +1,5 @@
 /* { dg-do compile } */
-/* { dg-options "-Wreturn-type -fdump-tree-optimized" } */
+/* { dg-options "-Wreturn-type -fdump-tree-optimized -O" } */
 
 int foo(int a)
 {

Jakub


Re: [PATCH] emit a trap for buffer overflow in placement new

2017-12-04 Thread Martin Sebor

On 12/04/2017 03:44 PM, Marc Glisse wrote:

On Mon, 4 Dec 2017, Martin Sebor wrote:


The -Wplacement-new option warns for buffer overflow in placement
new expressions with objects of constant sizes, but because it's
implemented completely in the C++ front end it misses the more
interesting non-constant sizes.

The attached patch instruments both forms of operator placement
new to emit a trap when __builtin_object_size() determines that
the pointer points to an object less than the specified number
of bytes.  This is done only when _FORTIFY_SOURCE is defined
to a non-zero value.  This makes it possible to prevent buffer
overflow in most of the same cases as in built-ins like strcpy,
though without warnings when the size is nor a C++ constant
integer expression.

On x86_64-linux it passes testing with no apparent regressions.
Can anyone think of problems with this solution?  If not, given
its simplicity, would it be appropriate even at this stage?


AFAIK, one can call this operator new manually on any pointer, including
one-past-the-end pointers and null pointers. It is only with new
expressions that the limitation comes in (because it runs a constructor
afterwards). Not that people often do that...


Hmm, yes, there don't seem to be any requirements on calling
the operator by itself.  That's too bad.  I was going to move
the placement new checking into the middle-end to improve
the detection and avoid some false positives but the operator
disappears too early, before the size of the expression that's
being stuffed into the destination is known.

The only other solution that comes to mind to detect non-constant
overflow is to do something like:

  void* operator new (size_t n, void *p)
  {
if (__builtin_object_size (p, 0) < n)
 __builtin_warn ("buffer overflow in placement new");
return p;
  }

and emit the warning from __builtin_warn during expansion.  That
should work and be usable in other contexts as well (e.g., to
implement overflow and other error detection in user-defined
functions).  It would be kind of like the middle-end form of
Clang attribute diagnose_if.

Still, I wonder if tightening up the standard to require that
the pointer point to an object of at least n bytes in size would
run into problems or be met with objections.

Martin


[PATCH] Fix ICE with extend_ref_init_temps of structured binding (PR c++/81197)

2017-12-04 Thread Jakub Jelinek
Hi!

We were calling cp_finish_decl before cp_finish_decomp, so that the
latter has the initializer finalized and can decompose it, and
cp_finish_decomp was mangling the decl if needed.
Unfortunately, as the following testcase shows, sometimes we need
it mangled already earlier, during that cp_finish_decl call.

Most of the patch below arranges the mangling (which needs just the list
of the identifiers and namespace context) to be done before the
cp_finish_decl.
Another thing is that with the change we wouldn't ICE, but would emit
weird mangled names, like _ZGR7_ZDC1tE0.  The problem is that we don't
keep the list of the corresponding identifiers around and only store
the full mangled name in DECL_ASSEMBLER_NAME, but the DC + E
case us  and can appear elsewhere, including the lifetime
extended temporary manglings.  The patch just attempts to discover the DC
in the mangled name and use a substring of the mangled name, there should be
no substutions in there.

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

Note, the _ZGR* mangling still looks weird, the ABI says:

   ::= GR  _ # First temporary
   ::= GR   _# Subsequent temporaries

but we instead of the _ or  always emit a number, starting with 0.
And, that number increases on every GR we produce, rather than just when
referring to portions of some initializer of the same object.
So, with the patch we e.g. emit _ZGRDC1tE0, while LLVM emits _ZGRDC1tE_
(and ICEs on the other structured binding, where we emit
_ZGRN1A1BDC1u1v2wwEE1 but should be emitting _ZGRN1A1BDC1u1v2wwEE_
probably).

2017-12-05  Jakub Jelinek  

PR c++/81197
* cp-tree.h (cp_maybe_mangle_decomp): Declare.
* decl.c (cp_maybe_mangle_decomp): New function.
(cp_finish_decomp): Don't SET_DECL_ASSEMBLER_NAME here.
* parser.c (cp_convert_range_for,
cp_parser_decomposition_declaration): Call cp_maybe_mangle_decomp.
* pt.c (tsubst_expr): Likewise.
* mangle.c (write_unqualified_name): Handle DECL_DECOMPOSITION_P
where DECL_ASSEMBLER_NAME is already set.

* g++.dg/cpp1z/decomp34.C: New test.

--- gcc/cp/cp-tree.h.jj 2017-12-01 09:19:07.0 +0100
+++ gcc/cp/cp-tree.h2017-12-04 13:14:11.673367545 +0100
@@ -6141,6 +6141,7 @@ extern void start_decl_1  (tree, bool);
 extern bool check_array_initializer(tree, tree, tree);
 extern void cp_finish_decl (tree, tree, bool, tree, int);
 extern tree lookup_decomp_type (tree);
+extern void cp_maybe_mangle_decomp (tree, tree, unsigned int);
 extern void cp_finish_decomp   (tree, tree, unsigned int);
 extern int cp_complete_array_type  (tree *, tree, bool);
 extern int cp_complete_array_type_or_error (tree *, tree, bool, 
tsubst_flags_t);
--- gcc/cp/decl.c.jj2017-11-30 23:45:45.0 +0100
+++ gcc/cp/decl.c   2017-12-04 12:52:44.679418902 +0100
@@ -7359,6 +7359,25 @@ lookup_decomp_type (tree v)
   return *decomp_type_table->get (v);
 }
 
+/* Mangle a decomposition declaration if needed.  Arguments like
+   in cp_finish_decomp.  */
+
+void
+cp_maybe_mangle_decomp (tree decl, tree first, unsigned int count)
+{
+  if (!processing_template_decl
+  && !error_operand_p (decl)
+  && DECL_NAMESPACE_SCOPE_P (decl))
+{
+  auto_vec v;
+  v.safe_grow (count);
+  tree d = first;
+  for (unsigned int i = 0; i < count; i++, d = DECL_CHAIN (d))
+   v[count - i - 1] = d;
+  SET_DECL_ASSEMBLER_NAME (decl, mangle_decomp (decl, v));
+}
+}
+
 /* Finish a decomposition declaration.  DECL is the underlying declaration
"e", FIRST is the head of a chain of decls for the individual identifiers
chained through DECL_CHAIN in reverse order and COUNT is the number of
@@ -7630,8 +7649,6 @@ cp_finish_decomp (tree decl, tree first,
DECL_HAS_VALUE_EXPR_P (v[i]) = 1;
  }
 }
-  else if (DECL_NAMESPACE_SCOPE_P (decl))
-SET_DECL_ASSEMBLER_NAME (decl, mangle_decomp (decl, v));
 }
 
 /* Returns a declaration for a VAR_DECL as if:
--- gcc/cp/parser.c.jj  2017-12-01 22:13:19.0 +0100
+++ gcc/cp/parser.c 2017-12-04 12:56:04.883923605 +0100
@@ -11926,6 +11926,9 @@ cp_convert_range_for (tree statement, tr
 tf_warning_or_error);
   finish_for_expr (expression, statement);
 
+  if (VAR_P (range_decl) && DECL_DECOMPOSITION_P (range_decl))
+cp_maybe_mangle_decomp (range_decl, decomp_first_name, decomp_cnt);
+
   /* The declaration is initialized with *__begin inside the loop body.  */
   cp_finish_decl (range_decl,
  build_x_indirect_ref (input_location, begin, RO_UNARY_STAR,
@@ -13269,6 +13272,7 @@ cp_parser_decomposition_declaration (cp_
 
   if (decl != error_mark_node)
{
+ cp_maybe_mangle_decomp (decl, prev, v.length ());
  cp_finish_decl (decl, initializer, non_constant_p, NULL_TREE,
   

Update scheduling for generic tuning

2017-12-04 Thread Jan Hubicka
Hi,
currently generic is still scheduling for K8.  I have run several combinations
on Zen and Haswell and it seems that core is more sensitive to scheduling for
its microarchitecture than Zen (or that Zen's scheduler model is less tuned).
At the moment for SPECfp2006 there is only one benchmark that off noise benefit
from zen tuning and the difference is overall within noise.

For this reason I tune for generic.  I also verified that this model does not
regress buldozer. 

Bootstrapped/regtested x86_64-linux, comitted.

Honza

* athlon.md: Disable for generic.
* haswell.md: Enable for generic.
* i386.c (ix86_sched_init_global): Add core hooks for generic.
* x86-tune-sched.c (ix86_issue_rate): Increase issue rate for generic
to 4.
(ix86_adjust_cost): Move generic to haswell path.
Index: config/i386/athlon.md
===
--- config/i386/athlon.md   (revision 255388)
+++ config/i386/athlon.md   (working copy)
@@ -151,11 +151,11 @@
 
 ;; Jump instructions are executed in the branch unit completely transparent to 
us
 (define_insn_reservation "athlon_branch" 0
-(and (eq_attr "cpu" "athlon,k8,generic,amdfam10")
+(and (eq_attr "cpu" "athlon,k8,amdfam10")
  (eq_attr "type" "ibr"))
 "athlon-direct,athlon-ieu")
 (define_insn_reservation "athlon_call" 0
-(and (eq_attr "cpu" "athlon,k8,generic")
+(and (eq_attr "cpu" "athlon,k8")
  (eq_attr "type" "call,callv"))
 "athlon-vector,athlon-ieu")
 (define_insn_reservation "athlon_call_amdfam10" 0
@@ -166,15 +166,15 @@
 ;; Latency of push operation is 3 cycles, but ESP value is available
 ;; earlier
 (define_insn_reservation "athlon_push" 2
-(and (eq_attr "cpu" "athlon,k8,generic,amdfam10")
+(and (eq_attr "cpu" "athlon,k8,amdfam10")
  (eq_attr "type" "push"))
 "athlon-direct,athlon-agu,athlon-store")
 (define_insn_reservation "athlon_pop" 4
-(and (eq_attr "cpu" "athlon,k8,generic")
+(and (eq_attr "cpu" "athlon,k8")
  (eq_attr "type" "pop"))
 "athlon-vector,athlon-load,athlon-ieu")
 (define_insn_reservation "athlon_pop_k8" 3
-(and (eq_attr "cpu" "k8,generic")
+(and (eq_attr "cpu" "k8")
  (eq_attr "type" "pop"))
 "athlon-double,(athlon-ieu+athlon-load)")
 (define_insn_reservation "athlon_pop_amdfam10" 3
@@ -186,13 +186,13 @@
  (eq_attr "type" "leave"))
 "athlon-vector,(athlon-ieu+athlon-load)")
 (define_insn_reservation "athlon_leave_k8" 3
-(and (eq_attr "cpu" "k8,generic,amdfam10")
+(and (eq_attr "cpu" "k8,amdfam10")
  (eq_attr "type" "leave"))
 "athlon-double,(athlon-ieu+athlon-load)")
 
 ;; Lea executes in AGU unit with 2 cycles latency.
 (define_insn_reservation "athlon_lea" 2
-(and (eq_attr "cpu" "athlon,k8,generic")
+(and (eq_attr "cpu" "athlon,k8")
  (eq_attr "type" "lea"))
 "athlon-direct,athlon-agu,nothing")
 ;; Lea executes in AGU unit with 1 cycle latency on AMDFAM10
@@ -209,13 +209,13 @@
 
"athlon-vector,athlon-ieu0,athlon-mult,nothing,nothing,athlon-ieu0")
 ;; ??? Widening multiply is vector or double.
 (define_insn_reservation "athlon_imul_k8_DI" 4
-(and (eq_attr "cpu" "k8,generic,amdfam10")
+(and (eq_attr "cpu" "k8,amdfam10")
  (and (eq_attr "type" "imul")
   (and (eq_attr "mode" "DI")
(eq_attr "memory" "none,unknown"
 
"athlon-direct0,athlon-ieu0,athlon-mult,nothing,athlon-ieu0")
 (define_insn_reservation "athlon_imul_k8" 3
-(and (eq_attr "cpu" "k8,generic,amdfam10")
+(and (eq_attr "cpu" "k8,amdfam10")
  (and (eq_attr "type" "imul")
   (eq_attr "memory" "none,unknown")))
 "athlon-direct0,athlon-ieu0,athlon-mult,athlon-ieu0")
@@ -231,13 +231,13 @@
   (eq_attr "memory" "load,both")))
 
"athlon-vector,athlon-load,athlon-ieu,athlon-mult,nothing,nothing,athlon-ieu")
 (define_insn_reservation "athlon_imul_mem_k8_DI" 7
-(and (eq_attr "cpu" "k8,generic,amdfam10")
+(and (eq_attr "cpu" "k8,amdfam1

Re: [PR 83141] Prevent SRA from removing type changing assignment

2017-12-04 Thread Martin Jambor
On Tue, Dec 05 2017, Martin Jambor wrote:
Hi,

> Hi,
>
> this is a followup to Richi's
> https://gcc.gnu.org/ml/gcc-patches/2017-11/msg02396.html to fix PR
> 83141.  The basic idea is simple, be just as conservative about type
> changing MEM_REFs as we are about actual VCEs.
>
> I have checked how that would affect compilation of SPEC 2006 and (non
> LTO) Mozilla Firefox and am happy to report that the difference was
> tiny.  However, I had to make the test less strict, otherwise testcase
> gcc.dg/guality/pr54970.c kept failing because it contains folded memcpy
> and expects us to track values accross:
>
>   int a[] = { 1, 2, 3 };
>   /* ... */
>   __builtin_memcpy (&a, (int [3]) { 4, 5, 6 }, sizeof (a));
>   /* { dg-final { gdb-test 31 "a\[0\]" "4" } } */
>   /* { dg-final { gdb-test 31 "a\[1\]" "5" } } */
>   /* { dg-final { gdb-test 31 "a\[2\]" "6" } } */
>   
> SRA is able to load replacement of a[0] directly from the temporary
> array which is apparently necessary to generate proper debug info.  I
> have therefore allowed the current transformation to go forward if the
> source does not contain any padding or if it is a read-only declaration.

Ah, the read-only test is of course bogus, it was a last minute addition
when I was apparently already too tired to think it through.  Please
disregard that line in the patch (it has passed bootstrap and testing
without it).

Sorry for the noise,

Martin


> I would especially appreciate a review of the type_contains_padding_p
> predicate as I am not sure it caches all cases.
>
> The added call to contains_vce_or_bfcref_p in build_accesses_from_assign
> is not strictly necessary, it is there to avoid attempting total
> scalarization in vain.  This is also tested by the dump scan in the
> added testcase.
>
> A very similar patch has passed bootstrap and testing on x86_64, this
> exact one is undergoing both now.  OK for trunk if it passes too?
>
> Thanks,
>
> Martin
>
>
> 2017-12-04  Martin Jambor  
>
>   PR tree-optimization/83141
>
>   * tree-sra.c (type_contains_padding_p): New function.
>   (contains_vce_or_bfcref_p): Move up in the file, also test for
>   MEM_REFs implicitely changing types with padding.  Remove inline
>   keyword.
>   (build_accesses_from_assign): Check contains_vce_or_bfcref_p
>   before setting bit in should_scalarize_away_bitmap.
>
> testsuite/
>   * gcc.dg/tree-ssa/pr83141.c: New test.
> ---
>  gcc/testsuite/gcc.dg/tree-ssa/pr83141.c |  31 +
>  gcc/tree-sra.c  | 116 
> ++--
>  2 files changed, 128 insertions(+), 19 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
>
> diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c 
> b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
> new file mode 100644
> index 000..86caf66025b
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/tree-ssa/pr83141.c
> @@ -0,0 +1,31 @@
> +/* { dg-do run } */
> +/* { dg-options "-O -fdump-tree-esra-details" } */
> +
> +volatile short vs;
> +volatile long vl;
> +
> +struct A { short s; long i; long j; };
> +struct A a, b;
> +void foo ()
> +{
> +  struct A c;
> +  __builtin_memcpy (&c, &b, sizeof (struct A));
> +  __builtin_memcpy (&a, &c, sizeof (struct A));
> +
> +  vs = c.s;
> +  vl = c.i;
> +  vl = c.j;
> +}
> +int main()
> +{
> +  __builtin_memset (&b, 0, sizeof (struct A));
> +  b.s = 1;
> +  __builtin_memcpy ((char *)&b+2, &b, 2);
> +  foo ();
> +  __builtin_memcpy (&a, (char *)&a+2, 2);
> +  if (a.s != 1)
> +__builtin_abort ();
> +  return 0;
> +}
> +
> +/* { dg-final { scan-tree-dump-not "Will attempt to totally scalarize" 
> "esra" } } */
> diff --git a/gcc/tree-sra.c b/gcc/tree-sra.c
> index 866cff0edb0..0a9622e77f8 100644
> --- a/gcc/tree-sra.c
> +++ b/gcc/tree-sra.c
> @@ -1141,6 +1141,101 @@ contains_view_convert_expr_p (const_tree ref)
>return false;
>  }
>  
> +/* Return false if we can determine that TYPE has no padding, otherwise 
> return
> +   true.  */
> +
> +static bool
> +type_contains_padding_p (tree type)
> +{
> +  if (is_gimple_reg_type (type))
> +return false;
> +
> +  if (!tree_fits_uhwi_p (TYPE_SIZE (type)))
> +return true;
> +
> +  switch (TREE_CODE (type))
> +{
> +case RECORD_TYPE:
> +  {
> + unsigned HOST_WIDE_INT pos = 0;
> + for (tree fld = TYPE_FIELDS (type); fld; fld = DECL_CHAIN (fld))
> +   if (TREE_CODE (fld) == FIELD_DECL)
> + {
> +   tree ft = TREE_TYPE (fld);
> +
> +   if (DECL_BIT_FIELD (fld)
> +   || DECL_PADDING_P (fld)
> +   || !tree_fits_uhwi_p (TYPE_SIZE (ft)))
> + return true;
> +
> +   tree t = bit_position (fld);
> +   if (!tree_fits_uhwi_p (t))
> + return true;
> +   unsigned HOST_WIDE_INT bp = tree_to_uhwi (t);
> +   if (pos != bp)
> + return true;
> +
> +   pos += 

Re: [Patch][aarch64] Use IFUNCs to enable LSE instructions in libatomic on aarch64

2017-12-04 Thread Steve Ellcey
FYI: Since James approved the Aarch64 part and since I think the
generic part can be considered trivial, I have gone ahead and checked
this patch in.

Steve Ellcey
sell...@cavium.com

On Tue, 2017-11-28 at 16:12 -0800, Steve Ellcey wrote:
> On Tue, 2017-11-21 at 17:35 +, James Greenhalgh wrote:
> 
> > 
> > 
> > Thanks for the detailed explanation. I understood this, and my
> > opinion is
> > that the AArch64 parts of this patch are OK (and I don't know who
> > needs to
> > Ack the small generic changes you require).
> > 
> > Let's give Richard/Marcus 48 hours to object while we wait for an
> > OK on the
> > generic bits, and then OK for AArch64.
> > 
> > Thanks,
> > James
> > 
> > Reviewed-By: James Greenhalgh 
> James, I haven't seen anything from Richard or Marcus.  Do you think
> a
> review from a global maintainer is really needed?  There is no
> libatomic specific maintainer.  The only non-aarch64 specific change
> is using the macro IFUNC_RESOLVER_ARGS in place of the hardcoded
> 'void'
> argument for ifunc selector functions and for all non-aarch64
> platforms
> the macro will be defined as 'void' so there is no real change for
> any
> other platform.
> 
> Steve Ellcey
> sell...@cavium.com


Re: Avoid -Werror=format-overflow error in dbxout.c (dbxout_block) on Solaris/SPARC

2017-12-04 Thread Martin Sebor

On 12/04/2017 01:58 PM, David Malcolm wrote:

On Mon, 2017-12-04 at 12:18 -0700, Martin Sebor wrote:

On 12/04/2017 05:41 AM, Rainer Orth wrote:

Within test last week, 64-bit Solaris/SPARC bootstrap began to
fail:

/vol/gcc/src/hg/trunk/local/gcc/dbxout.c: In function 'bool
dbxout_block(tree, int, tree, int)':
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: error: '%lu'
directive writing between 1 and 20 bytes into a region of size 14
[-Werror=format-overflow=]
 dbxout_block (tree block, int depth, tree args, int
parent_blocknum)
 ^~~~
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3767:1: note: directive
argument in the range [0, 18446744073709551614]
In file included from ./tm.h:26,
 from /vol/gcc/src/hg/trunk/local/gcc/target.h:52,
 from /vol/gcc/src/hg/trunk/local/gcc/dbxout.c:72:
/vol/gcc/src/hg/trunk/local/gcc/config/sparc/sol2.h:353:11: note:
'std::sprintf' output between 8 and 27 bytes into a destination of
size 20
   sprintf ((LABEL), "*.L%s%lu", (PREFIX), (unsigned long)(NUM))
   ^
/vol/gcc/src/hg/trunk/local/gcc/dbxout.c:3855:5: note: in expansion
of macro 'ASM_GENERATE_INTERNAL_LABEL'
 ASM_GENERATE_INTERNAL_LABEL (buf, "LBB", parent_blocknum);
 ^~~

The line numbers are extremely confusing, to say the least, though:
the
one in the error and the first note refer to the begin of the
function
definition and only the third note refers to the line of the actual
error.


I agree it looks confusing.  It's the result of the interaction
between macro tracking and inlining.

I think it's a general problem that affects many (though not all)
warnings emitted out of the middle-end.  For instance, the example
below results in similar output.  The output changes depending on
whether or not _FORTIFY_SOURCE is defined.

#include 

#define FOO(d, s) strcpy (d, s)

void foo (char *d, const char *s) { FOO (d, s); }

void sink (char*);

void bar (void)
{
   char a[3];

   foo (a, "1234567");   // bug here

   sink (a);
}

Without _FORTIFY_SOURCE:

d.c: In function ‘void bar()’:
d.c:3:26: warning: ‘void* __builtin_memcpy(void*, const void*, long
unsigned int)’ writing 8 bytes into a region of size 3 overflows the
destination [-Wstringop-overflow=]
  #define FOO(d, s) strcpy (d, s)
~~~^~
d.c:5:37: note: in expansion of macro ‘FOO’
  void foo (char *d, const char *s) { FOO (d, s); }
  ^~~

If bar() were a bigger function with multiple calls to foo() it
could be tricky to tell which of them caused the warning.

With _FORTIFY_SOURCE:

In file included from /usr/include/string.h:635,
  from d.c:1:
In function ‘char* strcpy(char*, const char*)’,
 inlined from ‘void bar()’ at d.c:5:37:
/usr/include/bits/string3.h:110:33: warning: ‘void*
__builtin___memcpy_chk(void*, const void*, long unsigned int, long
unsigned int)’ writing 8 bytes into a region of size 3 overflows the
destination [-Wstringop-overflow=]
return __builtin___strcpy_chk (__dest, __src, __bos (__dest));
   ~~~^~~

This suffers from the same problem as the first case: the line
number of the offending call in bar() is missing.

In both cases the problem is compounded by the diagnostic, as
a result of folding, referring to __builtin___memcpy_chk while
the source code contains a call to strcpy.

I don't know nearly enough about the internals of the diagnostic
subsystem to tell how difficult it might be to change the output
to make it more readable.  David Malcolm is the expert on this
area so he might have an idea what it would take.


[Did you mean to CC me on this, rather than dje?]


I meant you but I'm interested in all expert opinions/suggestions.


I'm not as familiar as I could be on the existing inlining-tracking
implementation - so much so that, in ignorance, I wrote my own version
of it earlier this year (doh!).

The warning implementation reads:

3151warning_at (loc, opt,
3152(integer_onep (range[0])
3153 ? G_("%K%qD writing %E byte into a region "
3154  "of size %E overflows the destination")
3155 : G_("%K%qD writing %E bytes into a region "
3156  "of size %E overflows the destination")),
3157exp, get_callee_fndecl (exp), range[0], 
objsize);


The inlining information is coming from that "%K" in the string, which
leads to tree-pretty-print.c's percent_K_format being called for "exp".
 This overwrites the "loc" provided for the warning_at with
EXPR_LOCATION of exp, and looks at TREE_BLOCK (exp), finding the
outermost containing block within its function, and writing it into the
diagnostic_info's x_data.

This block is used by lhd_print_error_function (and C++'s
cp_print_error_function) to print t

Go patch committed: Omit nil check when dereferencing closure variable

2017-12-04 Thread Ian Lance Taylor
This patch to the Go frontend by Than McIntosh adds the "no nil check
needed" annotation to the dereference operations created in
Parse::enclosing_var_reference (this is safe since the closure object
is under control of the compiler, and pointer fields in it will always
be non-nil).  Bootstrapped and ran Go testsuite on
x86_64-pc-linux-gnu.  Committed to mainline.

Ian
Index: gcc/go/gofrontend/MERGE
===
--- gcc/go/gofrontend/MERGE (revision 255347)
+++ gcc/go/gofrontend/MERGE (working copy)
@@ -1,4 +1,4 @@
-338f7434175bb71f3c8905e9ad7f480aec3afee6
+297cf346f2400274946650ab9ecd039427fc986b
 
 The first line of this file holds the git revision number of the last
 merge done from the gofrontend repository.
Index: gcc/go/gofrontend/parse.cc
===
--- gcc/go/gofrontend/parse.cc  (revision 255340)
+++ gcc/go/gofrontend/parse.cc  (working copy)
@@ -2747,7 +2747,7 @@ Parse::enclosing_var_reference(Named_obj
   location);
   closure_ref =
   Expression::make_dereference(closure_ref,
-   Expression::NIL_CHECK_DEFAULT,
+   Expression::NIL_CHECK_NOT_NEEDED,
location);
 
   // The closure structure holds pointers to the variables, so we need
@@ -2755,7 +2755,8 @@ Parse::enclosing_var_reference(Named_obj
   Expression* e = Expression::make_field_reference(closure_ref,
   ins.first->index(),
   location);
-  e = Expression::make_dereference(e, Expression::NIL_CHECK_DEFAULT, location);
+  e = Expression::make_dereference(e, Expression::NIL_CHECK_NOT_NEEDED,
+   location);
   return Expression::make_enclosing_var_reference(e, var, location);
 }
 


Re: [PATCH 5/3] C++ bits to improve detection of attribute conflicts (PR 81544)

2017-12-04 Thread Jason Merrill
OK.

On Sun, Dec 3, 2017 at 9:59 AM, Martin Sebor  wrote:
> Ping: https://gcc.gnu.org/ml/gcc-patches/2017-08/msg01034.html
>
> The attached C++ only patch is rebased on the top of trunk.
>
> The remaining patches in the series (C FE and the back ends)
> have been approved.
>
> Martin
>
>
> On 08/23/2017 08:36 AM, Martin Sebor wrote:
>>
>> On 08/22/2017 09:51 PM, Jason Merrill wrote:
>>>
>>> The C and C++ front ends already have a diagnose_mismatched_attributes
>>> function, which seems like the natural place to add more conflict
>>> checking.
>>
>>
>> There are a few major problems with handling attribute conflicts
>> in diagnose_mismatched_attributes.
>>
>> The function only diagnoses conflicts, it doesn't resolve them
>> (choose one attribute or the other).  Currently, when they are
>> resolved, it's done in each attribute handler.  But this is done
>> inconsistently because of the limitations of the infrastructure:
>> no access to the last pushed declaration.  Often, it's not done
>> at all.  Making the declaration  available to every handler is
>> one of the design improvements of the patch.
>>
>> Attributes are defined in the attribute_spec tables in a number
>> of different places: all the back ends, in addition to the front
>> ends, but processed by the general infrastructure in
>> decl_attributes in attribs.c.  The infrastructure already has
>> all the smarts to either accept or reject a conflicting attribute.
>> None of this is available in diagnose_mismatched_attributes.
>> The current implementation of the function hardcodes knowledge
>> about a small number of attribute relationships in the code.
>> That's a poor approach given the table-driven attribute_spec
>> design.  For one, it makes it difficult for back-end maintainers
>> to add a new attribute that's mutually exclusive with another
>> (the back-end maintainer would need to change the front end).
>> It might explain why no back-end attribute conflicts are
>> diagnosed there.
>>
>> Some, maybe even most of these problems could be overcome by
>> moving the conflict resolution from decl_attributes to
>> diagnose_mismatched_attributes.  But it would mean duplicating
>> a fair amount of the logic between the two functions.  I think
>> that would result in an inferior solution.  From both a design
>> and implementation point of view, I feel the logic fits best
>> in the attribs.c.
>>
>> Martin
>
>


Re: [PATCH] gcc: xtensa: enable address sanitizer

2017-12-04 Thread augustine.sterl...@gmail.com
On Mon, Dec 4, 2017 at 1:28 PM, Max Filippov  wrote:
> gcc/
> 2017-12-04  Max Filippov  
>
> * config/xtensa/xtensa.c (xtensa_asan_shadow_offset): New
> function.
> (TARGET_ASAN_SHADOW_OFFSET): New macro definition.
> * config/xtensa/xtensa.h (FRAME_GROWS_DOWNWARD): Set to 1 if
> ASAN is enabled.

This is OK.


Re: [PATCH] Don't add __builtin_unreachable for -Wreturn-type places at -O0

2017-12-04 Thread Richard Biener
On December 5, 2017 12:48:02 AM GMT+01:00, Jakub Jelinek  
wrote:
>Hi!
>
>When debugging what turned out to be just a -Wreturn-type reported
>failure now being a fatal problem even at -O0, it occurs to me that
>because we are not optimizing at -O0, adding the __builtin_unreachable
>doesn't bring many optimization advantages (practically just optimize
>away epilogue) and so I think I'd prefer to only add those
>when optimizing.
>
>Similarly, for UBSan, we instrument these with -fsanitize=return,
>part of -fsanitize=unreachable, where we provide nice runtime errors,
>but when somebody intentionally turns that off (e.g. to quickly get
>past such issues to look for something else), say with
>-fsanitize=undefined -fno-sanitize=return, the same error will be
>reported as __builtin_unreachable with a cryptic  location.
>I think it is better to just leave it out if -fsanitize=unreachable.
>
>Changed thusly, bootstrapped/regtested on x86_64-linux and i686-linux,
>ok
>for trunk?

OK. 

Richard. 

>2017-12-05  Jakub Jelinek  
>
>   * cp-gimplify.c (cp_maybe_instrument_return): Don't add
>   __builtin_unreachable if -O0 or if -fsanitize=unreachable.
>
>   * g++.dg/missing-return.C: Add -O to dg-options.
>
>--- gcc/cp/cp-gimplify.c.jj2017-12-04 22:29:04.759741988 +0100
>+++ gcc/cp/cp-gimplify.c   2017-12-04 22:37:07.283784470 +0100
>@@ -1554,6 +1554,18 @@ cp_maybe_instrument_return (tree fndecl)
>   || !targetm.warn_func_return (fndecl))
> return;
> 
>+  if (!sanitize_flags_p (SANITIZE_RETURN, fndecl)
>+  /* Don't add __builtin_unreachable () if not optimizing, it will
>not
>+   improve any optimizations in that case, just break UB code.
>+   Don't add it if -fsanitize=unreachable -fno-sanitize=return either,
>+   UBSan covers this with ubsan_instrument_return above where
>sufficient
>+   information is provided, while the __builtin_unreachable () below
>+   if return sanitization is disabled will just result in hard to
>+   understand runtime error without location.  */
>+  && (!optimize
>+|| sanitize_flags_p (SANITIZE_UNREACHABLE, fndecl)))
>+return;
>+
>   tree t = DECL_SAVED_TREE (fndecl);
>   while (t)
> {
>--- gcc/testsuite/g++.dg/missing-return.C.jj   2017-11-06
>17:24:11.0 +0100
>+++ gcc/testsuite/g++.dg/missing-return.C  2017-12-05 00:25:13.028800127
>+0100
>@@ -1,5 +1,5 @@
> /* { dg-do compile } */
>-/* { dg-options "-Wreturn-type -fdump-tree-optimized" } */
>+/* { dg-options "-Wreturn-type -fdump-tree-optimized -O" } */
> 
> int foo(int a)
> {
>
>   Jakub



Re: [PATCH][i386,AVX] Enable VNNI support [1/5]

2017-12-04 Thread Kirill Yukhin
Hello Julia,
On 24 Oct 10:37, Koval, Julia wrote:
> Hi,
> This patch enables VNNI isaset option. The doc for isaset and instruction: 
> https://software.intel.com/sites/default/files/managed/c5/15/architecture-instruction-set-extensions-programming-reference.pdf
> 
> Ok for trunk?
Your patch is OK for trunk. I've checked it in.

--
Thanks, K
> 
> Thanks,
> Julia