Re: [Patch, testuite, committed] Fix some more tests that fail for non 32-bit int targets

2016-07-26 Thread Senthil Kumar Selvaraj

Mike Stump writes:

> On Jul 25, 2016, at 5:00 AM, Senthil Kumar Selvaraj 
>  wrote:
>> 
>>  The below patch fixes tests that fail for the avr target, because they
>>  assume ints are atleast 32 bits wide and pointers and longs have the
>>  same size.
>> 
>>  I've required int32plus support for one test, and for the other two,
>>  I've introduced a cast to intptr_t to avoid the pointer <-> int size
>>  difference warning.
>> 
>>  Reg tested on avr and x86_64 with no regressions. Committed as
>>  obvious.
>
> Can you use __INTPTR_TYPE__ instead?  This way, you don't need to add any 
> include, and the test case will work in cross environments without any libc, 
> which is at times handy.  Thanks.
>
> See grep INTPTR gcc/gcc/testsuite/gcc.dg/*.c for how people use it, and for 
> the unsigned version.
>
>> 2016-07-25  Senthil Kumar Selvaraj  
>> 
>>  * gcc.dg/torture/pr69352.c (foo): Cast to intptr_t instead of long.
>>  * gcc.dg/torture/pr69771.c: Require int32plus.
>>  * gcc.dg/torture/pr71866.c (inb): Add cast to intptr_t.

I'll keep that in mind, thanks.

Is the below patch ok? I used uintptr_t in the second test as the source
type is unsigned long.

Regards
Senthil

Index: pr69352.c
===
--- pr69352.c   (revision 238743)
+++ pr69352.c   (working copy)
@@ -1,6 +1,6 @@
 /* { dg-do compile } */
 
-#include 
+__extension__ typedef __INTPTR_TYPE__ intptr_t;
 
 int a[10][14], b, c, d, e, f, g, h, i;
 void bar (void);
Index: pr71866.c
===
--- pr71866.c   (revision 238743)
+++ pr71866.c   (working copy)
@@ -1,7 +1,8 @@
 /* { dg-do compile } */
 /* { dg-additional-options "-ftree-pre -fcode-hoisting" } */
 
-#include 
+__extension__ typedef __UINTPTR_TYPE__ uintptr_t;
+
 typedef unsigned char u8;
 extern unsigned long pci_io_base;
 u8 in_8 (const volatile void *);
@@ -26,7 +27,7 @@
 static inline
 u8 inb (unsigned long port)
 {
-  return readb((volatile void *)(intptr_t)pci_io_base + port);
+  return readb((volatile void *)(uintptr_t)pci_io_base + port);
 }
 static inline
 void outb (u8 val, unsigned long port)


Re: [RS6000] push_secondary_reload ICE

2016-07-26 Thread Segher Boessenkool
Hi Alan,

Thanks for the writeup.

On Tue, Jul 26, 2016 at 04:09:12PM +0930, Alan Modra wrote:
>   /* If this is a scalar floating point value and we want to load it into the
>  traditional Altivec registers, do it via a move via a traditional 
> floating
>  point register, unless we have D-form addressing.  Also make sure that
>  non-zero constants use a FPR.  */
>   if (!done_p && reg_addr[mode].scalar_in_vmx_p
>   && !mode_supports_vmx_dform (mode)
>   && (rclass == VSX_REGS || rclass == ALTIVEC_REGS)
>   && (memory_p || (GET_CODE (x) == CONST_DOUBLE)))
> {
>   ret = FLOAT_REGS;
>   default_p = false;
>   done_p = true;
> }
> 
> For the pr72103 testcase this is effectively saying we shouldn't store
> a DImode vsx reg to a stack slot directly, but instead should reload
> into a float reg.  Which seems a little odd to me..

I added SCALAR_FLOAT_MODE_P (mode) to the condition, which also fixes
this particular PR, and makes this do what the comment says it does.
Mike?  What is best here?  Change the comment or change the code?

> Anyway, regardless of whether this PR needs more attention, I believe
> we should stop using default_secondary_reload or ensure it isn't
> presented with uninitialized data.  The following cures the ICE and
> has been bootstrapped and regression tested powerpc64le-linux.
> 
> OK for mainline?

Yes please.  It is also okay for backports (it seems to be needed on 5
and 6, although the testcase in PR72103 does not fail there).

Thanks,


Segher

>   PR target/72103
>   * config/rs6000/rs6000.c (rs6000_secondary_reload): Initialize
>   sri->t_icode.


Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Ilya Enkovich
2016-07-26 0:08 GMT+03:00 Jeff Law :
> On 07/25/2016 12:32 PM, Richard Biener wrote:
>>
>> On July 25, 2016 8:01:17 PM GMT+02:00, Jeff Law  wrote:
>>>
>>> On 07/22/2016 05:36 AM, Richard Biener wrote:

 The thing that needs work I think is re-running of if-conversion.
>>>
>>> I wonder if we could revamp if-conversion to work on a subset of the
>>> CFG?   I can see that potentially being useful in other contexts.
>>> Would
>>> that work for you Richi?
>>
>>
>> Well, you need to make it not need post-dominators or preserve them (or
>> compute "post-dominators" on SESE regions).
>
> Oh, but it'd be so nice to have DOMs and/or PDOMs on regions.  But that's
> probably out of scope for gcc-7.
>
>
>>
>> What doesn't work with the idea to clone the epilogue using
>> __built-in_vectorized()
>> For the if- vs. Not if-converted loop?
>
> I must be missing something.   I don't see how builtin_vectorized_function
> helps, but maybe I've got the wrong built-in or don't understand what you're
> suggesting.
>
> It sounds like this is the biggest impediment to moving forward.  So let's
> reset and make sure we're all on the same page here.
>
> Ilya, what's the fundamental reason why we need to run if-conversion again?
> Yes, I know you want to if-convert the epilogue, but why?
>
> What are the consequences of not doing if-conversion on the epilogue?
> Presumably we miss a vectorization opportunity on the tail.  But that may be
> a reasonable limitation to allow the existing work to move forward while you
> go back and revamp things a little.

If we have some control-flow in a loop then we have to if-convert it
for vectorizer.
We need to preserve both versions: if-converted one for vectorizer and
the original
one to be used if vectorization fails.  For epilogues we have similar
situation and
need two versions.  I do it by running if-conversion on a copy of original loop.
Note that it doesn't run full if-conversion pass. If-conversion is
called for epilogue
loop only.

Thanks,
Ilya

>
> Jeff


[PATCH][AArch64] Cleanup frame push/pop code

2016-07-26 Thread Wilco Dijkstra
This patch improves the readability of the prolog and epilog code by moving 
some 
code into separate functions.  There is no difference in generated code.

OK for commit?

ChangeLog:
2016-07-26  Wilco Dijkstra  

gcc/
* config/aarch64/aarch64.c (aarch64_pushwb_pair_reg): Rename.
(aarch64_push_reg): New function to push 1 or 2 registers.
(aarch64_pop_reg): New function to pop 1 or 2 registers.
(aarch64_expand_prologue): Use aarch64_push_regs.
(aarch64_expand_epilogue): Use aarch64_pop_regs.

--
diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
index 
f3300f0909ddaac7359f189fcddcdce7e61ab51b..547eb6771084cda40fa7cd1a8973e6a17f6799d4
 100644
--- a/gcc/config/aarch64/aarch64.c
+++ b/gcc/config/aarch64/aarch64.c
@@ -2806,10 +2806,14 @@ aarch64_gen_storewb_pair (machine_mode mode, rtx base, 
rtx reg, rtx reg2,
 }
 
 static void
-aarch64_pushwb_pair_reg (machine_mode mode, unsigned regno1,
-unsigned regno2, HOST_WIDE_INT adjustment)
+aarch64_push_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment)
 {
   rtx_insn *insn;
+  machine_mode mode = (regno1 <= R30_REGNUM) ? DImode : DFmode;
+
+  if (regno2 == FIRST_PSEUDO_REGISTER)
+return aarch64_pushwb_single_reg (mode, regno1, adjustment);
+
   rtx reg1 = gen_rtx_REG (mode, regno1);
   rtx reg2 = gen_rtx_REG (mode, regno2);
 
@@ -2837,6 +2841,30 @@ aarch64_gen_loadwb_pair (machine_mode mode, rtx base, 
rtx reg, rtx reg2,
 }
 }
 
+static void
+aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment,
+ rtx *cfi_ops)
+{
+  machine_mode mode = (regno1 <= R30_REGNUM) ? DImode : DFmode;
+  rtx reg1 = gen_rtx_REG (mode, regno1);
+
+  *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg1, *cfi_ops);
+
+  if (regno2 == FIRST_PSEUDO_REGISTER)
+{
+  rtx mem = plus_constant (Pmode, stack_pointer_rtx, adjustment);
+  mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
+  emit_move_insn (reg1, gen_rtx_MEM (mode, mem));
+}
+  else
+{
+  rtx reg2 = gen_rtx_REG (mode, regno2);
+  *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg2, *cfi_ops);
+  emit_insn (aarch64_gen_loadwb_pair (mode, stack_pointer_rtx, reg1,
+ reg2, adjustment));
+}
+}
+
 static rtx
 aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,
rtx reg2)
@@ -3124,7 +3152,7 @@ aarch64_expand_prologue (void)
 R30_REGNUM, false);
}
  else
-   aarch64_pushwb_pair_reg (DImode, R29_REGNUM, R30_REGNUM, offset);
+   aarch64_push_regs (R29_REGNUM, R30_REGNUM, offset);
 
  /* Set up frame pointer to point to the location of the
 previous frame pointer on the stack.  */
@@ -3150,14 +3178,8 @@ aarch64_expand_prologue (void)
}
  else
{
- machine_mode mode1 = (reg1 <= R30_REGNUM) ? DImode : DFmode;
-
+ aarch64_push_regs (reg1, reg2, offset);
  skip_wb = true;
-
- if (reg2 == FIRST_PSEUDO_REGISTER)
-   aarch64_pushwb_single_reg (mode1, reg1, offset);
- else
-   aarch64_pushwb_pair_reg (mode1, reg1, reg2, offset);
}
}
 
@@ -3279,39 +3301,16 @@ aarch64_expand_epilogue (bool for_sibcall)
emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
 
   if (skip_wb)
-   {
- machine_mode mode1 = (reg1 <= R30_REGNUM) ? DImode : DFmode;
- rtx rreg1 = gen_rtx_REG (mode1, reg1);
-
- cfi_ops = alloc_reg_note (REG_CFA_RESTORE, rreg1, cfi_ops);
- if (reg2 == FIRST_PSEUDO_REGISTER)
-   {
- rtx mem = plus_constant (Pmode, stack_pointer_rtx, offset);
- mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
- mem = gen_rtx_MEM (mode1, mem);
- insn = emit_move_insn (rreg1, mem);
-   }
- else
-   {
- rtx rreg2 = gen_rtx_REG (mode1, reg2);
-
- cfi_ops = alloc_reg_note (REG_CFA_RESTORE, rreg2, cfi_ops);
- insn = emit_insn (aarch64_gen_loadwb_pair
-   (mode1, stack_pointer_rtx, rreg1,
-rreg2, offset));
-   }
-   }
+   aarch64_pop_regs (reg1, reg2, offset, &cfi_ops);
   else
-   {
- insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
-  GEN_INT (offset)));
-   }
+   emit_insn (gen_add2_insn (stack_pointer_rtx, GEN_INT (offset)));
 
   /* Reset the CFA to be SP + FRAME_SIZE.  */
   rtx new_cfa = stack_pointer_rtx;
   if (frame_size > 0)
new_cfa = plus_constant (Pmode, new_cfa, frame_size);
   cfi_ops = alloc_reg_note (REG_CFA_DEF_CFA, new_cfa, cfi_ops);
+  insn = get_last_insn ();
   REG_NOTES (insn) = cfi_ops;
   RTX_FRAME_RE

[patch, avr] Fix mmcu to specs issues

2016-07-26 Thread Pitchumani Sivanupandi

avr-gcc expected to find the device specs in the search paths specified. But
it doesn't work as expected when device specs in different place than the
installed location.

Example-1:
avr-gcc wrongly assumes the device specs will be in first identified
'device-specs' folder in prefix path. avr-gcc natively supports device
at90pwm1, but complains as it couldn't find the specs file in the first
'device-specs' directory in the search path.


$ avr-gcc test.c -mmcu=at90pwm1 -B /home/install/dev/atxmega128a1u/

avr-gcc: error: cannot access device-specs for _at90pwm1_ expected at 
_/home/install/dev/atxmega128a1u/device-specs/specs-at90pwm1_
avr-gcc: note: devices natively supported: ata5272 ata5505 ata5702m322 
ata5782 ata5790 ata5790n ata5791 ata5795 ata5831 ata6285 ata6286 ata6289 
ata6612c ata6613c ata6614q ata6616c ata6617c ata664251 ata8210 ata8510 
atmega103 atmega128 atmega128a atmega128rfa1 atmega128rfr2 atmega1280 
atmega1281 atmega1284 atmega1284p atmega1284rfr2 atmega16 atmega16a 
atmega16hva atmega16hva2 atmega16hvb atmega16hvbrevb atmega16m1 
atmega16u2 atmega16u4 atmega161 atmega162 atmega163 atmega164a 
atmega164p atmega164pa atmega165 atmega165a atmega165p atmega165pa 
atmega168 atmega168a atmega168p atmega168pa atmega168pb atmega169 
atmega169a atmega169p atmega169pa atmega256rfr2 atmega2560 atmega2561 
atmega2564rfr2 atmega32 atmega32a atmega32c1 atmega32hvb atmega32hvbrevb 
atmega32m1 atmega32u2 atmega32u4 atmega32u6 atmega323 atmega324a 
atmega324p atmega324pa atmega325 atmega325a atmega325p atmega325pa 
atmega3250 atmega3250a atmega3250p atmega3250pa atmega328 atmega328p 
atmega328pb atmega329 atmega329a atmega329p atmega329pa atmega3290 
atmega3290a atmega3290p atmega3290pa atmega406 atmega48 atmega48a 
atmega48p atmega48pa atmega48pb atmega64 atmega64a atmega64c1 
atmega64hve atmega64hve2 atmega64m1 atmega64rfr2 atmega640 atmega644 
atmega644a atmega644p atmega644pa atmega644rfr2 atmega645 atmega645a 
atmega645p atmega6450 atmega6450a atmega6450p atmega649 atmega649a 
atmega649p atmega6490 atmega6490a atmega6490p atmega8 atmega8a 
atmega8hva atmega8u2 atmega8515 atmega8535 atmega88 atmega88a atmega88p 
atmega88pa atmega88pb attiny10 attiny11 attiny12 attiny13 attiny13a 
attiny15 attiny1634 attiny167 attiny20 attiny22 attiny2313 attiny2313a 
attiny24 attiny24a attiny25 attiny26 attiny261 attiny261a attiny28 
attiny4 attiny40 attiny43u attiny4313 attiny44 attiny44a attiny441 
attiny45 attiny461 attiny461a attiny48 attiny5 attiny828 attiny84 
attiny84a attiny841 attiny85 attiny861 attiny861a attiny87 attiny88 
attiny9 atxmega128a1 atxmega128a1u atxmega128a3 atxmega128a3u 
atxmega128a4u atxmega128b1 atxmega128b3 atxmega128c3 atxmega128d3 
atxmega128d4 atxmega16a4 atxmega16a4u atxmega16c4 atxmega16d4 
atxmega16e5 atxmega192a3 atxmega192a3u atxmega192c3 atxmega192d3 
atxmega256a3 atxmega256a3b atxmega256a3bu atxmega256a3u atxmega256c3 
atxmega256d3 atxmega32a4 atxmega32a4u atxmega32c3 atxmega32c4 
atxmega32d3 atxmega32d4 atxmega32e5 atxmega384c3 atxmega384d3 
atxmega64a1 atxmega64a1u atxmega64a3 atxmega64a3u atxmega64a4u 
atxmega64b1 atxmega64b3 atxmega64c3 atxmega64d3 atxmega64d4 atxmega8e5 
at43usb320 at43usb355 at76c711 at86rf401 at90can128 at90can32 at90can64 
at90c8534 at90pwm1 at90pwm161 at90pwm2 at90pwm2b at90pwm216 at90pwm3 
at90pwm3b at90pwm316 at90pwm81 at90scr100 at90s1200 at90s2313 at90s2323 
at90s2333 at90s2343 at90s4414 at90s4433 at90s4434 at90s8515 at90s8535 
at90usb1286 at90usb1287 at90usb162 at90usb646 at90usb647 at90usb82 at94k 
m3000
avr-gcc: note: supported core architectures: avr2 avr25 avr3 avr31 avr35 
avr4 avr5 avr51 avr6 avrxmega2 avrxmega4 avrxmega5 avrxmega6 avrxmega7 
avrtiny avr1
avr-gcc: note: you can provide your own specs files, see 
 for details



Example-2:
Similar issue happens when -flto option is enabled and device specs in 
custom

search path.

atxmega128a1u device specs in custom path and that is passed with -B option.
Architecture spec files such as specs-avrxmega7 will be in installed 
location.


$ avr-gcc test.c -mmcu=atxmega128a1u -flto -B 
/home/install/dev/atxmega128a1u/


avr-gcc: error: cannot access device-specs for _avrxmega7_ expected at 
_/home/install/dev/atxmega128a1u/device-specs/specs-avrxmega7_
avr-gcc: note: supported core architectures: avr2 avr25 avr3 avr31 avr35 
avr4 avr5 avr51 avr6 avrxmega2 avrxmega4 avrxmega5 avrxmega6 avrxmega7 
avrtiny avr1
avr-gcc: note: you can provide your own specs files, see 
 for details

lto-wrapper: fatal error: avr-gcc returned 1 exit status
compilation terminated.
/home/avr-trunk/install/lib/gcc/avr/7.0.0/../../../../avr/bin/ld: error: 
lto-wrapper failed

collect2: error: ld returned 1 exit status

Attached patch to address these issues.

Fix:
Replace device-specs-file spec function with mmcu-to-device-specs. This will
not assume that specs files are in first identified device-specs directory.
Inst

Re: [RS6000] push_secondary_reload ICE

2016-07-26 Thread Alan Modra
On Tue, Jul 26, 2016 at 03:26:55AM -0500, Segher Boessenkool wrote:
> On Tue, Jul 26, 2016 at 04:09:12PM +0930, Alan Modra wrote:
> >   /* If this is a scalar floating point value and we want to load it into 
> > the
> >  traditional Altivec registers, do it via a move via a traditional 
> > floating
> >  point register, unless we have D-form addressing.  Also make sure that
> >  non-zero constants use a FPR.  */
> >   if (!done_p && reg_addr[mode].scalar_in_vmx_p
> >   && !mode_supports_vmx_dform (mode)
> >   && (rclass == VSX_REGS || rclass == ALTIVEC_REGS)
> >   && (memory_p || (GET_CODE (x) == CONST_DOUBLE)))
> > {
> >   ret = FLOAT_REGS;
> >   default_p = false;
> >   done_p = true;
> > }
> > 
> > For the pr72103 testcase this is effectively saying we shouldn't store
> > a DImode vsx reg to a stack slot directly, but instead should reload
> > into a float reg.  Which seems a little odd to me..
> 
> I added SCALAR_FLOAT_MODE_P (mode) to the condition, which also fixes
> this particular PR, and makes this do what the comment says it does.
> Mike?  What is best here?  Change the comment or change the code?

Also, the comment only mentions loads whereas the code affects both
loads and stores.

-- 
Alan Modra
Australia Development Lab, IBM


[PATCH] Fix PR72517, improve vector extract RTL expansion

2016-07-26 Thread Richard Biener

The following patch fixes a bug in selecting a "better" mode to
do a vector extract from.  The existing code was evidently written
without knowing what GET_MODE_WIDER_MODE does for vector modes
in the face of multiple supported vector sizes.  This results in
odd code-generation choices for HImode extract from V16QI for
example.

The patch adds a missing constraint - make the new vector element
mode have the same size as the extraction mode as this is what
vec_extract_optab requires.

With the patch the testcase gcc.dg/vect/slp-45.c doesn't spill anymore
with -msse4.1 or -mavx2 while it spills plenty without it.

Bootstrap and regtest ongoing on x86_64-unknown-linux-gnu.

Note that the patch will now happily choose to extract DImode
punning the source to V4DImode from V8SFmode (because the expander
appearantly always chooses an integer mode for bit extraction).
Hopefully we don't have targets that have separate register files
for float vs. int vectors (or a heavy penalty in such punning).
Note this is a pre-existing issue as well - we just less likely
spill now (which possibly may made have such targets happy).

Richard.

2016-07-26  Richard Biener  

PR middle-end/72517
* expmed.c (extract_bit_field_1): Constrain the vector mode
with element size matching the extraction mode size when
choosing a better vector mode to do the extraction from.

Index: gcc/expmed.c
===
*** gcc/expmed.c(revision 238743)
--- gcc/expmed.c(working copy)
*** extract_bit_field_1 (rtx str_rtx, unsign
*** 1581,1586 
--- 1581,1587 
  
for (; new_mode != VOIDmode ; new_mode = GET_MODE_WIDER_MODE (new_mode))
if (GET_MODE_SIZE (new_mode) == GET_MODE_SIZE (GET_MODE (op0))
+   && GET_MODE_UNIT_SIZE (new_mode) == GET_MODE_SIZE (tmode)
&& targetm.vector_mode_supported_p (new_mode))
  break;
if (new_mode != VOIDmode)


Re: [PATCH] Remove dead call to gimple_bb in slsr_process_phi

2016-07-26 Thread Richard Biener
On Mon, Jul 25, 2016 at 9:51 PM, Bill Schmidt
 wrote:
> Hi,
>
> As reported on the gcc mailing list, slsr_process_phi contains a dead call
> to gimple_bb.  I looked into why this wasn't noticed before, and concluded
> that we don't actually need the call.  To reach this point, the phi argument
> must not correspond to a strength-reduction candidate in the table, and must
> not be a function argument.  Since even simple copies and casts create entries
> in the candidate table, the definition of the phi arg has a non-trivial RHS
> that is not of a form useful to strength reduction, and thus there is no
> reason to proceed; the phi is itself therefore not a strength reduction
> candidate.
>
> The existing logic leaves arg_bb with its initial NULL value, which correctly
> terminates processing for this phi.  Thus I am just changing the logic to
> make this explicit, and we don't need the call to gimple_bb at all.
>
> Bootstrapped and tested on powerpc64le-unknown-linux-gnu with no regressions.
> Is this ok for trunk?  (The error is harmless, so I see no reason for a
> backport.)

Ok.

Richard.

> Thanks!
> Bill
>
>
> 2016-07-25  Bill Schmidt  
>
> * gimple-ssa-strength-reduction.c (slsr_process_phi): Remove dead
> and unnecessary call to gimple_bb.
>
>
> Index: gcc/gimple-ssa-strength-reduction.c
> ===
> --- gcc/gimple-ssa-strength-reduction.c (revision 238719)
> +++ gcc/gimple-ssa-strength-reduction.c (working copy)
> @@ -785,14 +785,10 @@ slsr_process_phi (gphi *phi, bool speed)
> savings += stmt_cost (arg_stmt, speed);
> }
> }
> -  else
> +  else if (SSA_NAME_IS_DEFAULT_DEF (arg))
> {
>   derived_base_name = arg;
> -
> - if (SSA_NAME_IS_DEFAULT_DEF (arg))
> -   arg_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> - else
> -   gimple_bb (SSA_NAME_DEF_STMT (arg));
> + arg_bb = single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun));
> }
>
>if (!arg_bb || arg_bb->loop_father != cand_loop)
>


Re: Gimple loop splitting v2

2016-07-26 Thread Richard Biener
On Mon, Jul 25, 2016 at 10:57 PM, Andrew Pinski  wrote:
> On Wed, Dec 2, 2015 at 5:23 AM, Michael Matz  wrote:
>> Hi,
>>
>> On Tue, 1 Dec 2015, Jeff Law wrote:
>>
>>> > So, okay for trunk?
>>> -ENOPATCH
>>
>> Sigh :)
>> Here it is.
>
>
> I found one problem with it.
> Take:
> void f(int *a, int M, int *b)
> {
>   for(int i = 0; i <= M; i++)
> {
>if (i < M)
> a[i] = i;
> }
> }
>  CUT ---
> There are two issues with the code as below.  The outer most loop's
> aux is still set which causes the vectorizer not to vector the loop.
> The other issue is I need to run pass_scev_cprop after pass_loop_split
> to get the induction variable usage after the loop gone so the
> vectorizer will work.

I think scev_cprop needs to be re-written to an utility so that the vectorizer
itself can (within its own cost-model) eliminate an induction using it.

Richard.

> Something like (note this is copy and paste from a terminal):
> diff --git a/gcc/passes.def b/gcc/passes.def
> index c327900..e8d6ea6 100644
> --- a/gcc/passes.def
> +++ b/gcc/passes.def
> @@ -262,8 +262,8 @@ along with GCC; see the file COPYING3.  If not see
>   NEXT_PASS (pass_copy_prop);
>   NEXT_PASS (pass_dce);
>   NEXT_PASS (pass_tree_unswitch);
> - NEXT_PASS (pass_scev_cprop);
>   NEXT_PASS (pass_loop_split);
> + NEXT_PASS (pass_scev_cprop);
>   NEXT_PASS (pass_record_bounds);
>   NEXT_PASS (pass_loop_distribution);
>   NEXT_PASS (pass_copy_prop);
> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
> index 5411530..e72ef19 100644
> --- a/gcc/tree-ssa-loop-split.c
> +++ b/gcc/tree-ssa-loop-split.c
> @@ -592,7 +592,11 @@ tree_ssa_split_loops (void)
>
>gcc_assert (scev_initialized_p ());
>FOR_EACH_LOOP (loop, 0)
> -loop->aux = NULL;
> +{
> +  loop->aux = NULL;
> +  if (loop_outer (loop))
> +   loop_outer (loop)->aux = NULL;
> +}

How does the iterator not visit loop_outer (loop)?!

>
>/* Go through all loops starting from innermost.  */
>FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
> @@ -631,7 +635,11 @@ tree_ssa_split_loops (void)
>  }
>
>FOR_EACH_LOOP (loop, 0)
> -loop->aux = NULL;
> +{
> +  loop->aux = NULL;
> +  if (loop_outer (loop))
> +   loop_outer (loop)->aux = NULL;
> +}
>
>if (changed)
>  return TODO_cleanup_cfg;
> -  CUT -
>
> Thanks,
> Andrew
>
>
>>
>>
>> Ciao,
>> Michael.
>> * common.opt (-fsplit-loops): New flag.
>> * passes.def (pass_loop_split): Add.
>> * opts.c (default_options_table): Add OPT_fsplit_loops entry at -O3.
>> (enable_fdo_optimizations): Add loop splitting.
>> * timevar.def (TV_LOOP_SPLIT): Add.
>> * tree-pass.h (make_pass_loop_split): Declare.
>> * tree-ssa-loop-manip.h (rewrite_into_loop_closed_ssa_1): Declare.
>> * tree-ssa-loop-unswitch.c: Include tree-ssa-loop-manip.h,
>> * tree-ssa-loop-split.c: New file.
>> * Makefile.in (OBJS): Add tree-ssa-loop-split.o.
>> * doc/invoke.texi (fsplit-loops): Document.
>> * doc/passes.texi (Loop optimization): Add paragraph about loop
>> splitting.
>>
>> testsuite/
>> * gcc.dg/loop-split.c: New test.
>>
>> Index: common.opt
>> ===
>> --- common.opt  (revision 231115)
>> +++ common.opt  (working copy)
>> @@ -2453,6 +2457,10 @@ funswitch-loops
>>  Common Report Var(flag_unswitch_loops) Optimization
>>  Perform loop unswitching.
>>
>> +fsplit-loops
>> +Common Report Var(flag_split_loops) Optimization
>> +Perform loop splitting.
>> +
>>  funwind-tables
>>  Common Report Var(flag_unwind_tables) Optimization
>>  Just generate unwind tables for exception handling.
>> Index: passes.def
>> ===
>> --- passes.def  (revision 231115)
>> +++ passes.def  (working copy)
>> @@ -252,6 +252,7 @@ along with GCC; see the file COPYING3.
>>   NEXT_PASS (pass_dce);
>>   NEXT_PASS (pass_tree_unswitch);
>>   NEXT_PASS (pass_scev_cprop);
>> + NEXT_PASS (pass_loop_split);
>>   NEXT_PASS (pass_record_bounds);
>>   NEXT_PASS (pass_loop_distribution);
>>   NEXT_PASS (pass_copy_prop);
>> Index: opts.c
>> ===
>> --- opts.c  (revision 231115)
>> +++ opts.c  (working copy)
>> @@ -532,6 +532,7 @@ static const struct default_options defa
>> regardless of them being declared inline.  */
>>  { OPT_LEVELS_3_PLUS_AND_SIZE, OPT_finline_functions, NULL, 1 },
>>  { OPT_LEVELS_1_PLUS_NOT_DEBUG, OPT_finline_functions_called_once, NULL, 
>> 1 },
>> +{ OPT_LEVELS_3_PLUS, OPT_fsplit_loops, NULL, 1 },
>>  { OPT_LEVELS_3_PLUS, OPT_funswitch_loops, NULL, 1 },
>>  { OPT_LEVELS_3_PLUS, OPT_fgcse_after_reload, NULL, 1 },
>>  { OPT_LEVELS_3_PLUS, OPT_ftree_lo

Re: warn on dead function calls [2/4] libsupc++/eh_alloc.cc fallout

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:

> Many warnings for dead-calls are emitted with patch on call to
> operator new in libsupc++/eh_alloc.cc, which I am not sure are correct
> or false positives, for instance:
> 
> /home/prathamesh.kulkarni/gcc-svn/trunk/libstdc++-v3/libsupc++/eh_alloc.cc:170:22:
> warning: Call from void* {anonymous}::pool::allocate(std::size_t) to
> void* operator new(std::size_t, void*) has no effect [-Wunused-value]
> new (f) free_entry;
>   ^
> It appears to me new() is defined as follows in libsupc++/new:
> // Default placement versions of operator new.
> inline void* operator new(std::size_t, void* __p) _GLIBCXX_USE_NOEXCEPT
> { return __p; }
> 
> So could it be considered as a dead call since new() doesn't have side-effect
> and it's return value is not assigned to any variable
> or is the warning wrong for the above call ?

The warning is wrong as it has semantic meaning in the C++ language.
Implementation-wise it is correct.  A way out would be to set
TREE_NO_WARNING on the call emitted by the FE.

Richard.


Re: warn for dead function calls [3/4] testsuite fallout

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:

> Hi,
> The following test-cases broke due to the warning.
> I think however the warning is right for all the cases:
> 
> a) g++.dg/tree-ssa/invalid-dom.C:
> I believe the call from main() to E::bar() is dead call ?

Can't find this testcase.

> b) libffi/testsuite/libffi.call/float.c:
> Call from main() to floating() is dead call.
> 
> c) libffi/testsuite/libffi.call/float3.c:
> Calls from main() to floating_1() and floating_2() are dead calls.
> 
> d) libffi/testsuite/libffi.call/negint.c:
> Call from main() to checking () is dead call.

Looks like dead calls in the above but libffi is maintained upstream
and just copied to GCC.  Please raise the issue upstream.

Richard.

> Should I update the test-cases to pass -Wno-unsued-value,
> or remove the calls ?
> 
> Thanks,
> Prathamesh
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: warn for dead function calls [4/4] stor-layout.c fallout

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:

> The following is an interesting case which broke stor-layout.c.
> The patch warned for the following call to be dead from
> bit_field_mode_iterator::next_mode() to get_mode_alignment ():
> 
>   /* Stop if the mode requires too much alignment.  */
>   if (GET_MODE_ALIGNMENT (m_mode) > m_align
>   && SLOW_UNALIGNED_ACCESS (m_mode, m_align))
> break;
> 
> GET_MODE_ALIGNMENT (MODE) is just #defined as get_mode_alignment (MODE)
> in machmode.h
> 
> SLOW_UNALIGNED_ACCESS (MODE, ALIGN) is #defined to STRICT_ALIGNMENT
> in defaults.h, and i386.h sets STRICT_ALIGNMENT to 0.
> So essentially it comes down to:
> 
> if (get_mode_alignment (m_mode) > m_align && 0)
>   break;
> 
> which clearly makes get_mode_alignment(m_mode) a dead call
> since it's a pure function.
> However if a target overrides SLOW_UNALIGNED_ACCESS(mode, align)
> and sets it to some runtime value, then the call won't be dead for that 
> target.
> 
> Should we split the above  in two different if conditions ?
> if (GET_MODE_ALIGNMENT (m_mode) > m_align)
>   if (SLOW_UNALIGNED_ACCESS (m_mode, m_align))
> break;

I'm surprised it's only one case that you hit ;)  Be prepared for
other targets to be broken similarly.

This hints at the general issue of issueing warnings after optimization,
they can easily become false positives.

Richard.


Re: [Bug tree-optimization] Fix for PR71994

2016-07-26 Thread Richard Biener
On Tue, Jul 26, 2016 at 5:13 AM, kugan
 wrote:
> Hi,
>
> For testcase in pr71994, type of bb conditional result and the type of the
> PHI stmt are different (as om.0_1 is int and the first PHI argument is
> _bool; PHI stmt uses a constant zero that comes from edge 2). Therefore when
> we optimize final range test stmt, we end up setting integer 1 for other PHI
> argument. This results in ICE.
>
>   :
>   om.0_1 = om;
>   _2 = om.0_1 >= 0;
>   int _3 = (int) _2;
>   if (om.0_1 != 0)
> goto ;
>   else
> goto ;
>
>   :
>   int _4 = om.0_1 & _3;
>   _Bool _12 = _4 != 0;
>   :
>
>   # _Bool _13 = PHI <0(2), _12(3)>
>
>
> IMHO, the fix should be that, we should check the type before replacing the
> value (punt otherwise). Attached patch does that. Bootstrapped and
> regression tested on x86_64-linux-gnu with no new regressions. Is this OK
> for trunk?

Ugh, this is undocumented spaghetti code I am not familiar with.  I
can't see if it
is safe to truncate a constant here so I'd feel safer to just punt instead.

Richard.

> Thanks,
> Kugan
>
> gcc/testsuite/ChangeLog:
>
> 2016-07-26  Kugan Vivekanandarajah  
>
> * gcc.dg/torture/pr71994.c: New test.
>
> gcc/ChangeLog:
>
> 2016-07-26  Kugan Vivekanandarajah  
>
> * tree-ssa-reassoc.c (maybe_optimize_range_tests): Check type
> compatibility.
>


Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Richard Biener
On Tue, Jul 26, 2016 at 11:57 AM, Ilya Enkovich  wrote:
> 2016-07-26 0:08 GMT+03:00 Jeff Law :
>> On 07/25/2016 12:32 PM, Richard Biener wrote:
>>>
>>> On July 25, 2016 8:01:17 PM GMT+02:00, Jeff Law  wrote:

 On 07/22/2016 05:36 AM, Richard Biener wrote:
>
> The thing that needs work I think is re-running of if-conversion.

 I wonder if we could revamp if-conversion to work on a subset of the
 CFG?   I can see that potentially being useful in other contexts.
 Would
 that work for you Richi?
>>>
>>>
>>> Well, you need to make it not need post-dominators or preserve them (or
>>> compute "post-dominators" on SESE regions).
>>
>> Oh, but it'd be so nice to have DOMs and/or PDOMs on regions.  But that's
>> probably out of scope for gcc-7.
>>
>>
>>>
>>> What doesn't work with the idea to clone the epilogue using
>>> __built-in_vectorized()
>>> For the if- vs. Not if-converted loop?
>>
>> I must be missing something.   I don't see how builtin_vectorized_function
>> helps, but maybe I've got the wrong built-in or don't understand what you're
>> suggesting.
>>
>> It sounds like this is the biggest impediment to moving forward.  So let's
>> reset and make sure we're all on the same page here.
>>
>> Ilya, what's the fundamental reason why we need to run if-conversion again?
>> Yes, I know you want to if-convert the epilogue, but why?
>>
>> What are the consequences of not doing if-conversion on the epilogue?
>> Presumably we miss a vectorization opportunity on the tail.  But that may be
>> a reasonable limitation to allow the existing work to move forward while you
>> go back and revamp things a little.
>
> If we have some control-flow in a loop then we have to if-convert it
> for vectorizer.
> We need to preserve both versions: if-converted one for vectorizer and
> the original
> one to be used if vectorization fails.  For epilogues we have similar
> situation and
> need two versions.  I do it by running if-conversion on a copy of original 
> loop.
> Note that it doesn't run full if-conversion pass. If-conversion is
> called for epilogue
> loop only.

But it will still compute post-dominators for the full function for example.

You have the if-converted loop available already - it's the loop we are going
to vectorize.  If if-conversion generated if (__builtin_vectorized_p ()) style
loop copies then you can simply create the epilogue in the same way.
If it didn't then the loop is already if-converted anyway.

I see no need to re-run if-conversion here.

Richard.

> Thanks,
> Ilya
>
>>
>> Jeff


Re: [PR70920] transform (intptr_t) x eq/ne CST to x eq/ne (typeof x) cst

2016-07-26 Thread Richard Biener
On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:

> On 25 July 2016 at 14:32, Richard Biener  wrote:
> > On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:
> >
> >> Hi Richard,
> >> The attached patch tries to fix PR70920.
> >> It adds your pattern from comment 1 in the PR
> >> (with additional gating on INTEGRAL_TYPE_P to avoid regressing 
> >> finalize_18.f90)
> >> and second pattern, which is reverse of the first transform.
> >> I needed to update ssa-dom-branch-1.c because with patch applied,
> >> jump threading removed the second if (i != 0B) block.
> >> The dumps with and without patch for ssa-dom-branch-1.c start
> >> to differ with forwprop1:
> >>
> >> before:
> >>  :
> >>   _1 = temp_16(D)->code;
> >>   _2 = _1 == 42;
> >>   _3 = (int) _2;
> >>   _4 = (long int) _3;
> >>   temp_17 = (struct rtx_def *) _4;
> >>   if (temp_17 != 0B)
> >> goto ;
> >>   else
> >> goto ;
> >>
> >> after:
> >> :
> >>   _1 = temp_16(D)->code;
> >>   _2 = _1 == 42;
> >>   _3 = (int) _2;
> >>   _4 = (long int) _2;
> >>   temp_17 = (struct rtx_def *) _4;
> >>   if (_1 == 42)
> >> goto ;
> >>   else
> >> goto ;
> >>
> >> I suppose the transform is correct for above test-case ?
> >>
> >> Then vrp dump shows:
> >>  Threaded jump 5 --> 9 to 13
> >>   Threaded jump 8 --> 9 to 13
> >>   Threaded jump 3 --> 9 to 13
> >>   Threaded jump 12 --> 9 to 14
> >> Removing basic block 9
> >> basic block 9, loop depth 0
> >>  pred:
> >> if (i1_10(D) != 0B)
> >>   goto ;
> >> else
> >>   goto ;
> >>  succ:   10
> >>  11
> >>
> >> So there remained two instances of if (i1_10 (D) != 0B) in dom2 dump file,
> >> and hence needed to update the test-case.
> >>
> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
> >> OK to commit ?
> >
> > --- a/gcc/match.pd
> > +++ b/gcc/match.pd
> > @@ -3408,3 +3408,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
> >  { CONSTRUCTOR_ELT (ctor, idx / k)->value; })
> > (BIT_FIELD_REF { CONSTRUCTOR_ELT (ctor, idx / k)->value; }
> >@1 { bitsize_int ((idx % k) * width); })
> > +
> > +/* PR70920: Transform (intptr_t)x eq/ne CST to x eq/ne (typeof x) CST.
> > */
> > +
> > +(for cmp (ne eq)
> > + (simplify
> > +  (cmp (convert@2 @0) INTEGER_CST@1)
> > +  (if (POINTER_TYPE_P (TREE_TYPE (@0))
> > +   && INTEGRAL_TYPE_P (TREE_TYPE (@2)))
> >
> > you can use @1 here and omit @2.
> >
> > +   (cmp @0 (convert @1)
> > +
> > +/* Reverse of the above case:
> > +   x has integral_type, CST is a pointer constant.
> > +   Transform (typeof CST)x eq/ne CST to x eq/ne (typeof x) CST.  */
> > +
> > +(for cmp (ne eq)
> > + (simplify
> > +  (cmp (convert @0) @1)
> > +  (if (POINTER_TYPE_P (TREE_TYPE (@1))
> > +   && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
> > +(cmp @0 (convert @1)
> >
> > The second pattern lacks the INTEGER_CST on @1 so it doesn't match
> > its comment.  Please do not add vertical space between pattern
> > comment and pattern.
> >
> > Please place patterns not at the end of match.pd but where similar
> > transforms are done.  Like after
> >
> > /* Simplify pointer equality compares using PTA.  */
> > (for neeq (ne eq)
> >  (simplify
> >   (neeq @0 @1)
> >   (if (POINTER_TYPE_P (TREE_TYPE (@0))
> >&& ptrs_compare_unequal (@0, @1))
> >{ neeq == EQ_EXPR ? boolean_false_node : boolean_true_node; })))
> >
> > please also share the (for ...) for both patterns or merge them
> > by changing the condition to
> >
> >   (if ((POINTER_TYPE_P (TREE_TYPE (@0))
> > && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
> >|| (INTEGRAL_TYPE_P (TREE_TYPE (@0))
> >&& POINTER_TYPE_P (TREE_TYPE (@1
> >
> Hi,
> Done suggested changes in this version.
> pr70920-4.c (test-case in patch) is now folded during  ccp instead of
> forwprop after merging the
> two patterns.
> Passes bootstrap+test on x86_64-unknown-linux-gnu.
> OK for trunk ?

(please paste in ChangeLog entries rather than attaching them).

In gcc.dg/tree-ssa/ssa-dom-branch-1.c you need to adjust the comment
before the dump-scan you adjust.

Ok with that change.

Thanks,
Richard.


Re: [PR71078] x / abs(x) -> copysign (1.0, x)

2016-07-26 Thread Richard Biener
On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:

> Hi,
> The attached patch tries to fix PR71078.
> I am not sure if I have got the converts right.
> I put (convert? @0) and (convert1? (abs @1))
> to match for cases when operands's types may
> be different from outermost type like in pr71078-3.c

Types of RDIV_EXPR have to be the same so as you have a
match on @0 the converts need to be either both present
or not present.

+  (if (FLOAT_TYPE_P (type)

as you special-case several types below please use SCALAR_FLOAT_TYPE_P
here.

+   && ! HONOR_NANS (type)
+   && ! HONOR_INFINITIES (type))
+   (switch
+(if (type == float_type_node)
+ (BUILT_IN_COPYSIGNF { build_one_cst (type); } (convert @0)))

please use if (types_match (type, float_type_node)) instead of
pointer equality.  I _think_ you can do better here by using
IFN_COPYSIGN but possibly only so if the target supports it.
Richard - this seems to be the first pattern in need of
generating a builtin where no other was there to match the type
to - any idea how we can safely use the internal function here?
I see those do not have an expander that would fall back to
expanding the regular builtin, correct?

Please place the pattern next to

/* Optimize -A / A to -1.0 if we don't care about
   NaNs or Infinities.  */
(simplify
 (rdiv:C @0 (negate @0))
 (if (FLOAT_TYPE_P (type)
  && ! HONOR_NANS (type)
  && ! HONOR_INFINITIES (type))
  { build_minus_one_cst (type); }))

where it logically belongs.

Thanks,
Richard.

> test-case (included in patch).
> Bootstrap+test in progress on x86_64-unknown-linux-gnu.
> 
> Thanks,
> Prathamesh
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


GCC 4.9.4 Status Report (2016-07-26), branch frozen for release

2016-07-26 Thread Richard Biener

The 4.9 branch is now frozen for the final GCC 4.9.4 release, I will
announce GCC 4.9.4 RC1 once it has built.

Richard.


Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-26 Thread kugan



Hi Riachard,

Thanks for the review. Here is an updated patch with comments below.


+/* Restore/Pop all the old VRs maintained in the cond_stack.  */
+
+void evrp_dom_walker::finalize_dom_walker ()
+{
+  while (!cond_stack.is_empty ())
+{
+  tree var = cond_stack.last ().second;
+  pop_value_range (var);
+  cond_stack.pop ();
+}

why is this necessary at all?  Looks like a waste of time (and the
stack should be empty here
anyway).  I think you leak cond_stack as well, I suppose using
auto_vec might work here,
you have to check.

Done.





+ /* Discover VR when condition is true.  */
+ if (te == e
+ && !TREE_OVERFLOW_P (op0)
+ && !TREE_OVERFLOW_P (op1))



This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
set_value_range otherwise:

gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
  && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
(max)));


Ok, instead make sure to remove the overflow flag via

  if (TREE_OVERFLOW_P (op0))
op0 = drop_tree_overflow (op0);

 ...
Done.



it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and FALSE_VALUE
cases and only invert the tree comparison for the false edge.

Done.



+ tree cond = build2 (code, boolean_type_node, op0, op1);
+ extract_range_for_var_from_comparison_expr (&vr, op0, cond);

no wasteful tree building here please.  Instead of cond pass in code,
op0 and op1
as separate arguments.

Done.



+ /* If we found any usable VR, set the VR to ssa_name and create a
+PUSH old value in the cond_stack with the old VR.  */
+ if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
+   {
+ value_range *new_vr = vrp_value_range_pool.allocate ();
+ memset (new_vr, 0, sizeof (*new_vr));
+ *new_vr = vr;
+ cond_stack.safe_push (std::make_pair (bb, op0));

the memset looks redundant given you copy-assing from vr anyway.

Done.



You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
both i_1 and x_2 might have ranges that are useful.
I will address this once the basic structure is in place if that is OK 
with you.




push and pop_value_range should be methods in the dom walker class
and vrp_stack and cond_stack should be a single stack.  You can omit
basic_block if you use a "magic" NULL_TREE var as marker (simply
push a NULL_TREE, NULL pair at the end of before_dom_children).


Done.



you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE

why do you need those TREE_OVERFLOW checks?

+ tree cond = build2 (code, boolean_type_node, op0, op1);
+ tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
+ extract_range_from_assert (&vr, a);

so I was hoping that the "refactoring" patch in the series would expose a
more
useful interface than extract_range_from_assert ... namely one that can
extract a range from the comparison directly and does not require building
a scratch ASSERT_EXPR.



I have split extract_range_from_assert into
extract_range_for_var_from_comparison_expr and using it now. Is that better?


See above.




+ /* If we found any usable VR, set the VR to ssa_name and create
a
+restore point in the cond_stack with the  old VR. */
+ if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
+   {
+ value_range *new_vr = XCNEW (value_range);
+ *new_vr = vr;
+ cond_stack.safe_push (std::make_pair (bb,
+   std::make_pair (op0,
+
old_vr)));
+ change_value_range (op0, new_vr);

I don't like 'change_value_range' as interface, please integrate that into
a push/pop_value_range style interface instead.



Tried using push/pop interface.



+   vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
+}
+
+  return NULL;

you should return taken_edge_p (misnamed as it isn't a pointer) and take
advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
defer this as a followup improvement).



I have added a TODO to this effect and will comeback to it.



Note that the advantage of a DOM-based VRP is that backtracking is easy
to implement (you don't do that yet).  That is, after DEF got a (better)
value-range you can simply re-visit all the defs of its uses (and
recursively).
I think you have to be careful with stmts that might prematurely leave a
BB
though (like via EH).  So sth for a followup as well.



Is this OK now. Bootstrapped and regression tested on x86_64-linux  with no
new regressions.


Better, still not perfect.

I'm also arguing on the pass placement now - you put it where it may make sense
for IPA VRP (but then IPA VRP could simply do this in its analysis phase) but
not so much where it makes sense during the early pipeline.  I think it makes
sense after FRE.

Looking at how you finalize I see several issues with simply re-using
vrp_fin

Re: [patch, avr] Fix mmcu to specs issues

2016-07-26 Thread Georg-Johann Lay

On 26.07.2016 12:20, Pitchumani Sivanupandi wrote:

avr-gcc expected to find the device specs in the search paths specified. But
it doesn't work as expected when device specs in different place than the
installed location.

Example-1:
avr-gcc wrongly assumes the device specs will be in first identified
'device-specs' folder in prefix path. avr-gcc natively supports device
at90pwm1, but complains as it couldn't find the specs file in the first
'device-specs' directory in the search path.


$ avr-gcc test.c -mmcu=at90pwm1 -B /home/install/dev/atxmega128a1u/

avr-gcc: error: cannot access device-specs for _at90pwm1_ expected at


Are the "_"s literally? Then the spec file name should be "specs-_at90pwm1_".


_/home/install/dev/atxmega128a1u/device-specs/specs-at90pwm1_
avr-gcc: note: devices natively supported: ata5272 ata5505 ata5702m322 ata5782
ata5790 ata5790n ata5791 ata5795 ata5831 ata6285 ata6286 ata6289 ata6612c
ata6613c ata6614q ata6616c ata6617c ata664251 ata8210 ata8510 atmega103
atmega128 atmega128a atmega128rfa1 atmega128rfr2 atmega1280 atmega1281
atmega1284 atmega1284p atmega1284rfr2 atmega16 atmega16a atmega16hva
atmega16hva2 atmega16hvb atmega16hvbrevb atmega16m1 atmega16u2 atmega16u4
atmega161 atmega162 atmega163 atmega164a atmega164p atmega164pa atmega165
atmega165a atmega165p atmega165pa atmega168 atmega168a atmega168p atmega168pa
atmega168pb atmega169 atmega169a atmega169p atmega169pa atmega256rfr2
atmega2560 atmega2561 atmega2564rfr2 atmega32 atmega32a atmega32c1 atmega32hvb
atmega32hvbrevb atmega32m1 atmega32u2 atmega32u4 atmega32u6 atmega323
atmega324a atmega324p atmega324pa atmega325 atmega325a atmega325p atmega325pa
atmega3250 atmega3250a atmega3250p atmega3250pa atmega328 atmega328p
atmega328pb atmega329 atmega329a atmega329p atmega329pa atmega3290 atmega3290a
atmega3290p atmega3290pa atmega406 atmega48 atmega48a atmega48p atmega48pa
atmega48pb atmega64 atmega64a atmega64c1 atmega64hve atmega64hve2 atmega64m1
atmega64rfr2 atmega640 atmega644 atmega644a atmega644p atmega644pa
atmega644rfr2 atmega645 atmega645a atmega645p atmega6450 atmega6450a
atmega6450p atmega649 atmega649a atmega649p atmega6490 atmega6490a atmega6490p
atmega8 atmega8a atmega8hva atmega8u2 atmega8515 atmega8535 atmega88 atmega88a
atmega88p atmega88pa atmega88pb attiny10 attiny11 attiny12 attiny13 attiny13a
attiny15 attiny1634 attiny167 attiny20 attiny22 attiny2313 attiny2313a attiny24
attiny24a attiny25 attiny26 attiny261 attiny261a attiny28 attiny4 attiny40
attiny43u attiny4313 attiny44 attiny44a attiny441 attiny45 attiny461 attiny461a
attiny48 attiny5 attiny828 attiny84 attiny84a attiny841 attiny85 attiny861
attiny861a attiny87 attiny88 attiny9 atxmega128a1 atxmega128a1u atxmega128a3
atxmega128a3u atxmega128a4u atxmega128b1 atxmega128b3 atxmega128c3 atxmega128d3
atxmega128d4 atxmega16a4 atxmega16a4u atxmega16c4 atxmega16d4 atxmega16e5
atxmega192a3 atxmega192a3u atxmega192c3 atxmega192d3 atxmega256a3 atxmega256a3b
atxmega256a3bu atxmega256a3u atxmega256c3 atxmega256d3 atxmega32a4 atxmega32a4u
atxmega32c3 atxmega32c4 atxmega32d3 atxmega32d4 atxmega32e5 atxmega384c3
atxmega384d3 atxmega64a1 atxmega64a1u atxmega64a3 atxmega64a3u atxmega64a4u
atxmega64b1 atxmega64b3 atxmega64c3 atxmega64d3 atxmega64d4 atxmega8e5
at43usb320 at43usb355 at76c711 at86rf401 at90can128 at90can32 at90can64
at90c8534 at90pwm1 at90pwm161 at90pwm2 at90pwm2b at90pwm216 at90pwm3 at90pwm3b
at90pwm316 at90pwm81 at90scr100 at90s1200 at90s2313 at90s2323 at90s2333
at90s2343 at90s4414 at90s4433 at90s4434 at90s8515 at90s8535 at90usb1286
at90usb1287 at90usb162 at90usb646 at90usb647 at90usb82 at94k m3000
avr-gcc: note: supported core architectures: avr2 avr25 avr3 avr31 avr35 avr4
avr5 avr51 avr6 avrxmega2 avrxmega4 avrxmega5 avrxmega6 avrxmega7 avrtiny avr1
avr-gcc: note: you can provide your own specs files, see
 for details


Example-2:
Similar issue happens when -flto option is enabled and device specs in custom
search path.

atxmega128a1u device specs in custom path and that is passed with -B option.
Architecture spec files such as specs-avrxmega7 will be in installed location.

$ avr-gcc test.c -mmcu=atxmega128a1u -flto -B /home/install/dev/atxmega128a1u/

avr-gcc: error: cannot access device-specs for _avrxmega7_ expected at
_/home/install/dev/atxmega128a1u/device-specs/specs-avrxmega7_
avr-gcc: note: supported core architectures: avr2 avr25 avr3 avr31 avr35 avr4
avr5 avr51 avr6 avrxmega2 avrxmega4 avrxmega5 avrxmega6 avrxmega7 avrtiny avr1
avr-gcc: note: you can provide your own specs files, see
 for details
lto-wrapper: fatal error: avr-gcc returned 1 exit status
compilation terminated.
/home/avr-trunk/install/lib/gcc/avr/7.0.0/../../../../avr/bin/ld: error:
lto-wrapper failed
collect2: error: ld returned 1 exit status

Attached patch to address these issues.

Fix:
Replace device-specs-file spec function with mmcu-to-device-specs. This will
not as

Re: [Bug tree-optimization] Fix for PR71994

2016-07-26 Thread kugan

Hi Richard,

On 26/07/16 21:48, Richard Biener wrote:

On Tue, Jul 26, 2016 at 5:13 AM, kugan
 wrote:

Hi,

For testcase in pr71994, type of bb conditional result and the type of the
PHI stmt are different (as om.0_1 is int and the first PHI argument is
_bool; PHI stmt uses a constant zero that comes from edge 2). Therefore when
we optimize final range test stmt, we end up setting integer 1 for other PHI
argument. This results in ICE.

  :
  om.0_1 = om;
  _2 = om.0_1 >= 0;
  int _3 = (int) _2;
  if (om.0_1 != 0)
goto ;
  else
goto ;

  :
  int _4 = om.0_1 & _3;
  _Bool _12 = _4 != 0;
  :

  # _Bool _13 = PHI <0(2), _12(3)>


IMHO, the fix should be that, we should check the type before replacing the
value (punt otherwise). Attached patch does that. Bootstrapped and
regression tested on x86_64-linux-gnu with no new regressions. Is this OK
for trunk?


Ugh, this is undocumented spaghetti code I am not familiar with.  I
can't see if it
is safe to truncate a constant here so I'd feel safer to just punt instead.



Please see the attached patch,  which now constant truncation. Testing 
in progress. Is this OK for trunk if there is no new regressions?


Thanks,
Kugan

gcc/testsuite/ChangeLog:

2016-07-26  Kugan Vivekanandarajah  

* gcc.dg/torture/pr71994.c: New test.

gcc/ChangeLog:

2016-07-26  Kugan Vivekanandarajah  

	* tree-ssa-reassoc.c (maybe_optimize_range_tests): Check type 
compatibility.
diff --git a/gcc/testsuite/gcc.dg/torture/pr71994.c 
b/gcc/testsuite/gcc.dg/torture/pr71994.c
index e69de29..8f5e92c 100644
--- a/gcc/testsuite/gcc.dg/torture/pr71994.c
+++ b/gcc/testsuite/gcc.dg/torture/pr71994.c
@@ -0,0 +1,14 @@
+/* PR tree-optimization/71994 */
+/* { dg-do compile } */
+int om, h6;
+
+void eo (void)
+{
+  const int tl = 1;
+  int ln;
+
+  h6 = (om + tl) > 0;
+  ln = om && (om & h6);
+  h6 = om;
+  om = ln < h6;
+}
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 18cf978..21742dd 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -3636,7 +3636,9 @@ maybe_optimize_range_tests (gimple *stmt)
  gcc_assert (bb == last_bb);
  new_op = ops[bbinfo[idx].first_idx++]->op;
}
- if (bbinfo[idx].op != new_op)
+ if (bbinfo[idx].op != new_op
+ && types_compatible_p (TREE_TYPE (new_op),
+TREE_TYPE (bbinfo[idx].op)))
{
  imm_use_iterator iter;
  use_operand_p use_p;


[PATCH] Add mark_spam.py script

2016-07-26 Thread Martin Liška
Hello.

This is python script that utilizes bugzilla API and marks PRs as spam:

$ ./mark_spam.py --help
usage: mark_spam.py [-h] [--verbose] api_key range

Mark Bugzilla issues as spam.

positional arguments:
  api_key API key
  range   Range of IDs, e.g. 10-23,24,25,27

optional arguments:
  -h, --help  show this help message and exit
  --verbose   Verbose logging

Sample usage:
$ ./mark_spam.py my_api_key 72634-72636
Marking as spam: PR72634
Marking as spam: PR72635
Marking as spam: PR72636

API key can be set up here:
https://gcc.gnu.org/bugzilla/userprefs.cgi?tab=apikey

Sample PR marked by the script: 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72635

Ready to install?
Martin
>From 467dc2cf8f0c549f5d7ee190efe59c841a9acad9 Mon Sep 17 00:00:00 2001
From: marxin 
Date: Tue, 26 Jul 2016 14:34:55 +0200
Subject: [PATCH] Add mark_spam.py script

contrib/ChangeLog:

2016-07-26  Martin Liska  

	* mark_spam.py: New file.
---
 contrib/mark_spam.py | 86 
 1 file changed, 86 insertions(+)
 create mode 100755 contrib/mark_spam.py

diff --git a/contrib/mark_spam.py b/contrib/mark_spam.py
new file mode 100755
index 000..cc394dc
--- /dev/null
+++ b/contrib/mark_spam.py
@@ -0,0 +1,86 @@
+#!/usr/bin/env python3
+#
+# Script to mark bunch of PRs as spam 
+#
+# This file is part of GCC.
+#
+# GCC is free software; you can redistribute it and/or modify it under
+# the terms of the GNU General Public License as published by the Free
+# Software Foundation; either version 3, or (at your option) any later
+# version.
+#
+# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+# WARRANTY; without even the implied warranty of MERCHANTABILITY or
+# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+# for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with GCC; see the file COPYING3.  If not see
+# .  */
+#
+#
+#
+
+import requests
+import json
+import argparse
+
+base_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/'
+
+def mark_as_spam(id, api_key, verbose):
+print('Marking as spam: PR%d' % id)
+# 1) get bug info to find 'cc'
+u = base_url + 'bug/' + str(id)
+r = requests.get(u)
+response = json.loads(r.text)
+
+# 2) mark the bug as spam
+cc_list = response['bugs'][0]['cc']
+data = {
+'status': 'RESOLVED',
+'resolution': 'INVALID',
+'summary': 'spam',
+'ids': [id],
+'api_key': api_key,
+'comment': { 'comment': 'spam'},
+'product': 'gcc',
+'component': 'spam',
+'version': 'unknown',
+'cc': {'remove': cc_list},
+'priority': 'P5',
+'severity': 'trivial',
+'assigned_to': 'unassig...@gcc.gnu.org' }
+
+r = requests.put(u, json = data)
+if verbose:
+print(r)
+print(r.text)
+
+# 3) mark the first comment as spam
+r = requests.get(u + '/comment')
+response = json.loads(r.text)
+comment_id = response['bugs'][str(id)]['comments'][0]['id']
+
+u2 = '%sbug/comment/%d/tags' % (base_url, comment_id)
+r = requests.put(u2, json = {'comment_id': comment_id, 'add': ['spam'], 'api_key': api_key})
+if verbose:
+print(r)
+print(r.text)
+
+parser = argparse.ArgumentParser(description='Mark Bugzilla issues as spam.')
+parser.add_argument('api_key', help = 'API key')
+parser.add_argument('range', help = 'Range of IDs, e.g. 10-23,24,25,27')
+parser.add_argument('--verbose', action = 'store_true', help = 'Verbose logging')
+
+args = parser.parse_args()
+
+chunks = args.range.split(',')
+for c in chunks:
+parts = list(map(lambda x: int(x), c.split('-')))
+if len(parts) == 1:
+r = [parts[0]]
+else:
+r = range(parts[0], parts[1] + 1)
+
+for id in r:
+mark_as_spam(id, args.api_key, args.verbose)
-- 
2.9.2

#!/usr/bin/env python3
#
# Script to mark bunch of PRs as spam 
#
# This file is part of GCC.
#
# GCC is free software; you can redistribute it and/or modify it under
# the terms of the GNU General Public License as published by the Free
# Software Foundation; either version 3, or (at your option) any later
# version.
#
# GCC is distributed in the hope that it will be useful, but WITHOUT ANY
# WARRANTY; without even the implied warranty of MERCHANTABILITY or
# FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
# for more details.
#
# You should have received a copy of the GNU General Public License
# along with GCC; see the file COPYING3.  If not see
# .  */
#
#
#

import requests
import json
import argparse

base_url = 'https://gcc.gnu.org/bugzilla/rest.cgi/'

def mark_as_spam(id, api_key, verbose):
print('Marking as spam: PR%d' % id)
# 1) get bug info to find 'cc'
u = base_url + 'bug/' + str(id)
r = requests.get(u)
response = json.loads(r.text)

# 2) mark the bug as 

Re: [Bug tree-optimization] Fix for PR71994

2016-07-26 Thread Richard Biener
On Tue, Jul 26, 2016 at 2:34 PM, kugan
 wrote:
> Hi Richard,
>
>
> On 26/07/16 21:48, Richard Biener wrote:
>>
>> On Tue, Jul 26, 2016 at 5:13 AM, kugan
>>  wrote:
>>>
>>> Hi,
>>>
>>> For testcase in pr71994, type of bb conditional result and the type of
>>> the
>>> PHI stmt are different (as om.0_1 is int and the first PHI argument is
>>> _bool; PHI stmt uses a constant zero that comes from edge 2). Therefore
>>> when
>>> we optimize final range test stmt, we end up setting integer 1 for other
>>> PHI
>>> argument. This results in ICE.
>>>
>>>   :
>>>   om.0_1 = om;
>>>   _2 = om.0_1 >= 0;
>>>   int _3 = (int) _2;
>>>   if (om.0_1 != 0)
>>> goto ;
>>>   else
>>> goto ;
>>>
>>>   :
>>>   int _4 = om.0_1 & _3;
>>>   _Bool _12 = _4 != 0;
>>>   :
>>>
>>>   # _Bool _13 = PHI <0(2), _12(3)>
>>>
>>>
>>> IMHO, the fix should be that, we should check the type before replacing
>>> the
>>> value (punt otherwise). Attached patch does that. Bootstrapped and
>>> regression tested on x86_64-linux-gnu with no new regressions. Is this OK
>>> for trunk?
>>
>>
>> Ugh, this is undocumented spaghetti code I am not familiar with.  I
>> can't see if it
>> is safe to truncate a constant here so I'd feel safer to just punt
>> instead.
>>
>
> Please see the attached patch,  which now constant truncation. Testing in
> progress. Is this OK for trunk if there is no new regressions?

Yes.

Thanks,
Richard.

>
> Thanks,
> Kugan
>
> gcc/testsuite/ChangeLog:
>
> 2016-07-26  Kugan Vivekanandarajah  
>
> * gcc.dg/torture/pr71994.c: New test.
>
> gcc/ChangeLog:
>
> 2016-07-26  Kugan Vivekanandarajah  
>
> * tree-ssa-reassoc.c (maybe_optimize_range_tests): Check type
> compatibility.


Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Ilya Enkovich
2016-07-26 14:51 GMT+03:00 Richard Biener :
> On Tue, Jul 26, 2016 at 11:57 AM, Ilya Enkovich  
> wrote:
>> 2016-07-26 0:08 GMT+03:00 Jeff Law :
>>> On 07/25/2016 12:32 PM, Richard Biener wrote:

 On July 25, 2016 8:01:17 PM GMT+02:00, Jeff Law  wrote:
>
> On 07/22/2016 05:36 AM, Richard Biener wrote:
>>
>> The thing that needs work I think is re-running of if-conversion.
>
> I wonder if we could revamp if-conversion to work on a subset of the
> CFG?   I can see that potentially being useful in other contexts.
> Would
> that work for you Richi?


 Well, you need to make it not need post-dominators or preserve them (or
 compute "post-dominators" on SESE regions).
>>>
>>> Oh, but it'd be so nice to have DOMs and/or PDOMs on regions.  But that's
>>> probably out of scope for gcc-7.
>>>
>>>

 What doesn't work with the idea to clone the epilogue using
 __built-in_vectorized()
 For the if- vs. Not if-converted loop?
>>>
>>> I must be missing something.   I don't see how builtin_vectorized_function
>>> helps, but maybe I've got the wrong built-in or don't understand what you're
>>> suggesting.
>>>
>>> It sounds like this is the biggest impediment to moving forward.  So let's
>>> reset and make sure we're all on the same page here.
>>>
>>> Ilya, what's the fundamental reason why we need to run if-conversion again?
>>> Yes, I know you want to if-convert the epilogue, but why?
>>>
>>> What are the consequences of not doing if-conversion on the epilogue?
>>> Presumably we miss a vectorization opportunity on the tail.  But that may be
>>> a reasonable limitation to allow the existing work to move forward while you
>>> go back and revamp things a little.
>>
>> If we have some control-flow in a loop then we have to if-convert it
>> for vectorizer.
>> We need to preserve both versions: if-converted one for vectorizer and
>> the original
>> one to be used if vectorization fails.  For epilogues we have similar
>> situation and
>> need two versions.  I do it by running if-conversion on a copy of original 
>> loop.
>> Note that it doesn't run full if-conversion pass. If-conversion is
>> called for epilogue
>> loop only.
>
> But it will still compute post-dominators for the full function for example.
>
> You have the if-converted loop available already - it's the loop we are going
> to vectorize.  If if-conversion generated if (__builtin_vectorized_p ()) style
> loop copies then you can simply create the epilogue in the same way.
> If it didn't then the loop is already if-converted anyway.
>

Agree.  Calling if-conversion is just much simpler in implementation.

Thanks,
Ilya

> I see no need to re-run if-conversion here.
>
> Richard.
>
>> Thanks,
>> Ilya
>>
>>>
>>> Jeff


Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Richard Biener
On Tue, Jul 26, 2016 at 3:03 PM, Ilya Enkovich  wrote:
> 2016-07-26 14:51 GMT+03:00 Richard Biener :
>> On Tue, Jul 26, 2016 at 11:57 AM, Ilya Enkovich  
>> wrote:
>>> 2016-07-26 0:08 GMT+03:00 Jeff Law :
 On 07/25/2016 12:32 PM, Richard Biener wrote:
>
> On July 25, 2016 8:01:17 PM GMT+02:00, Jeff Law  wrote:
>>
>> On 07/22/2016 05:36 AM, Richard Biener wrote:
>>>
>>> The thing that needs work I think is re-running of if-conversion.
>>
>> I wonder if we could revamp if-conversion to work on a subset of the
>> CFG?   I can see that potentially being useful in other contexts.
>> Would
>> that work for you Richi?
>
>
> Well, you need to make it not need post-dominators or preserve them (or
> compute "post-dominators" on SESE regions).

 Oh, but it'd be so nice to have DOMs and/or PDOMs on regions.  But that's
 probably out of scope for gcc-7.


>
> What doesn't work with the idea to clone the epilogue using
> __built-in_vectorized()
> For the if- vs. Not if-converted loop?

 I must be missing something.   I don't see how builtin_vectorized_function
 helps, but maybe I've got the wrong built-in or don't understand what 
 you're
 suggesting.

 It sounds like this is the biggest impediment to moving forward.  So let's
 reset and make sure we're all on the same page here.

 Ilya, what's the fundamental reason why we need to run if-conversion again?
 Yes, I know you want to if-convert the epilogue, but why?

 What are the consequences of not doing if-conversion on the epilogue?
 Presumably we miss a vectorization opportunity on the tail.  But that may 
 be
 a reasonable limitation to allow the existing work to move forward while 
 you
 go back and revamp things a little.
>>>
>>> If we have some control-flow in a loop then we have to if-convert it
>>> for vectorizer.
>>> We need to preserve both versions: if-converted one for vectorizer and
>>> the original
>>> one to be used if vectorization fails.  For epilogues we have similar
>>> situation and
>>> need two versions.  I do it by running if-conversion on a copy of original 
>>> loop.
>>> Note that it doesn't run full if-conversion pass. If-conversion is
>>> called for epilogue
>>> loop only.
>>
>> But it will still compute post-dominators for the full function for example.
>>
>> You have the if-converted loop available already - it's the loop we are going
>> to vectorize.  If if-conversion generated if (__builtin_vectorized_p ()) 
>> style
>> loop copies then you can simply create the epilogue in the same way.
>> If it didn't then the loop is already if-converted anyway.
>>
>
> Agree.  Calling if-conversion is just much simpler in implementation.

Agreed, but it's also quadratic in the number of vectorized loops in a function.
Not sure if it is really very much simpler either.

Richard.

> Thanks,
> Ilya
>
>> I see no need to re-run if-conversion here.
>>
>> Richard.
>>
>>> Thanks,
>>> Ilya
>>>

 Jeff


Re: [PR70920] transform (intptr_t) x eq/ne CST to x eq/ne (typeof x) cst

2016-07-26 Thread Prathamesh Kulkarni
On 26 July 2016 at 17:28, Richard Biener  wrote:
> On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:
>
>> On 25 July 2016 at 14:32, Richard Biener  wrote:
>> > On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:
>> >
>> >> Hi Richard,
>> >> The attached patch tries to fix PR70920.
>> >> It adds your pattern from comment 1 in the PR
>> >> (with additional gating on INTEGRAL_TYPE_P to avoid regressing 
>> >> finalize_18.f90)
>> >> and second pattern, which is reverse of the first transform.
>> >> I needed to update ssa-dom-branch-1.c because with patch applied,
>> >> jump threading removed the second if (i != 0B) block.
>> >> The dumps with and without patch for ssa-dom-branch-1.c start
>> >> to differ with forwprop1:
>> >>
>> >> before:
>> >>  :
>> >>   _1 = temp_16(D)->code;
>> >>   _2 = _1 == 42;
>> >>   _3 = (int) _2;
>> >>   _4 = (long int) _3;
>> >>   temp_17 = (struct rtx_def *) _4;
>> >>   if (temp_17 != 0B)
>> >> goto ;
>> >>   else
>> >> goto ;
>> >>
>> >> after:
>> >> :
>> >>   _1 = temp_16(D)->code;
>> >>   _2 = _1 == 42;
>> >>   _3 = (int) _2;
>> >>   _4 = (long int) _2;
>> >>   temp_17 = (struct rtx_def *) _4;
>> >>   if (_1 == 42)
>> >> goto ;
>> >>   else
>> >> goto ;
>> >>
>> >> I suppose the transform is correct for above test-case ?
>> >>
>> >> Then vrp dump shows:
>> >>  Threaded jump 5 --> 9 to 13
>> >>   Threaded jump 8 --> 9 to 13
>> >>   Threaded jump 3 --> 9 to 13
>> >>   Threaded jump 12 --> 9 to 14
>> >> Removing basic block 9
>> >> basic block 9, loop depth 0
>> >>  pred:
>> >> if (i1_10(D) != 0B)
>> >>   goto ;
>> >> else
>> >>   goto ;
>> >>  succ:   10
>> >>  11
>> >>
>> >> So there remained two instances of if (i1_10 (D) != 0B) in dom2 dump file,
>> >> and hence needed to update the test-case.
>> >>
>> >> Bootstrapped and tested on x86_64-unknown-linux-gnu.
>> >> OK to commit ?
>> >
>> > --- a/gcc/match.pd
>> > +++ b/gcc/match.pd
>> > @@ -3408,3 +3408,23 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
>> >  { CONSTRUCTOR_ELT (ctor, idx / k)->value; })
>> > (BIT_FIELD_REF { CONSTRUCTOR_ELT (ctor, idx / k)->value; }
>> >@1 { bitsize_int ((idx % k) * width); })
>> > +
>> > +/* PR70920: Transform (intptr_t)x eq/ne CST to x eq/ne (typeof x) CST.
>> > */
>> > +
>> > +(for cmp (ne eq)
>> > + (simplify
>> > +  (cmp (convert@2 @0) INTEGER_CST@1)
>> > +  (if (POINTER_TYPE_P (TREE_TYPE (@0))
>> > +   && INTEGRAL_TYPE_P (TREE_TYPE (@2)))
>> >
>> > you can use @1 here and omit @2.
>> >
>> > +   (cmp @0 (convert @1)
>> > +
>> > +/* Reverse of the above case:
>> > +   x has integral_type, CST is a pointer constant.
>> > +   Transform (typeof CST)x eq/ne CST to x eq/ne (typeof x) CST.  */
>> > +
>> > +(for cmp (ne eq)
>> > + (simplify
>> > +  (cmp (convert @0) @1)
>> > +  (if (POINTER_TYPE_P (TREE_TYPE (@1))
>> > +   && INTEGRAL_TYPE_P (TREE_TYPE (@0)))
>> > +(cmp @0 (convert @1)
>> >
>> > The second pattern lacks the INTEGER_CST on @1 so it doesn't match
>> > its comment.  Please do not add vertical space between pattern
>> > comment and pattern.
>> >
>> > Please place patterns not at the end of match.pd but where similar
>> > transforms are done.  Like after
>> >
>> > /* Simplify pointer equality compares using PTA.  */
>> > (for neeq (ne eq)
>> >  (simplify
>> >   (neeq @0 @1)
>> >   (if (POINTER_TYPE_P (TREE_TYPE (@0))
>> >&& ptrs_compare_unequal (@0, @1))
>> >{ neeq == EQ_EXPR ? boolean_false_node : boolean_true_node; })))
>> >
>> > please also share the (for ...) for both patterns or merge them
>> > by changing the condition to
>> >
>> >   (if ((POINTER_TYPE_P (TREE_TYPE (@0))
>> > && INTEGRAL_TYPE_P (TREE_TYPE (@1)))
>> >|| (INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> >&& POINTER_TYPE_P (TREE_TYPE (@1
>> >
>> Hi,
>> Done suggested changes in this version.
>> pr70920-4.c (test-case in patch) is now folded during  ccp instead of
>> forwprop after merging the
>> two patterns.
>> Passes bootstrap+test on x86_64-unknown-linux-gnu.
>> OK for trunk ?
>
> (please paste in ChangeLog entries rather than attaching them).
Will do henceforth.
>
> In gcc.dg/tree-ssa/ssa-dom-branch-1.c you need to adjust the comment
> before the dump-scan you adjust.
>
> Ok with that change.
Thanks, committed as r238754 after adjusting the comment in ssa-dom-branch-1.c.

Thanks,
Prathamesh
>
> Thanks,
> Richard.


Re: warn for dead function calls [3/4] testsuite fallout

2016-07-26 Thread Prathamesh Kulkarni
On 26 July 2016 at 17:06, Richard Biener  wrote:
> On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The following test-cases broke due to the warning.
>> I think however the warning is right for all the cases:
>>
>> a) g++.dg/tree-ssa/invalid-dom.C:
>> I believe the call from main() to E::bar() is dead call ?
>
> Can't find this testcase.
oops, it's dom-invalid.C -;)
>
>> b) libffi/testsuite/libffi.call/float.c:
>> Call from main() to floating() is dead call.
>>
>> c) libffi/testsuite/libffi.call/float3.c:
>> Calls from main() to floating_1() and floating_2() are dead calls.
>>
>> d) libffi/testsuite/libffi.call/negint.c:
>> Call from main() to checking () is dead call.
>
> Looks like dead calls in the above but libffi is maintained upstream
> and just copied to GCC.  Please raise the issue upstream.
Ok I will raise the issue upstream.

Thanks,
Prathamesh
>
> Richard.
>
>> Should I update the test-cases to pass -Wno-unsued-value,
>> or remove the calls ?
>>
>> Thanks,
>> Prathamesh
>>
>>
>
> --
> Richard Biener 
> SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> 21284 (AG Nuernberg)


Re: [PATCH][AArch64] Cleanup frame push/pop code

2016-07-26 Thread Richard Earnshaw (lists)
On 26/07/16 11:08, Wilco Dijkstra wrote:
> This patch improves the readability of the prolog and epilog code by moving 
> some 
> code into separate functions.  There is no difference in generated code.
> 
> OK for commit?
> 
> ChangeLog:
> 2016-07-26  Wilco Dijkstra  
> 
> gcc/
>   * config/aarch64/aarch64.c (aarch64_pushwb_pair_reg): Rename.
>   (aarch64_push_reg): New function to push 1 or 2 registers.
>   (aarch64_pop_reg): New function to pop 1 or 2 registers.
>   (aarch64_expand_prologue): Use aarch64_push_regs.
>   (aarch64_expand_epilogue): Use aarch64_pop_regs.
> 

This is OK.  However, as a follow-up clean up patch, can we change use
of FIRST_PSEUDO_REGISTER to INVALID_REGNUM?  Using F_S_P is an abuse here.

R.

> --
> diff --git a/gcc/config/aarch64/aarch64.c b/gcc/config/aarch64/aarch64.c
> index 
> f3300f0909ddaac7359f189fcddcdce7e61ab51b..547eb6771084cda40fa7cd1a8973e6a17f6799d4
>  100644
> --- a/gcc/config/aarch64/aarch64.c
> +++ b/gcc/config/aarch64/aarch64.c
> @@ -2806,10 +2806,14 @@ aarch64_gen_storewb_pair (machine_mode mode, rtx 
> base, rtx reg, rtx reg2,
>  }
>  
>  static void
> -aarch64_pushwb_pair_reg (machine_mode mode, unsigned regno1,
> -  unsigned regno2, HOST_WIDE_INT adjustment)
> +aarch64_push_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT 
> adjustment)
>  {
>rtx_insn *insn;
> +  machine_mode mode = (regno1 <= R30_REGNUM) ? DImode : DFmode;
> +
> +  if (regno2 == FIRST_PSEUDO_REGISTER)
> +return aarch64_pushwb_single_reg (mode, regno1, adjustment);
> +
>rtx reg1 = gen_rtx_REG (mode, regno1);
>rtx reg2 = gen_rtx_REG (mode, regno2);
>  
> @@ -2837,6 +2841,30 @@ aarch64_gen_loadwb_pair (machine_mode mode, rtx base, 
> rtx reg, rtx reg2,
>  }
>  }
>  
> +static void
> +aarch64_pop_regs (unsigned regno1, unsigned regno2, HOST_WIDE_INT adjustment,
> +   rtx *cfi_ops)
> +{
> +  machine_mode mode = (regno1 <= R30_REGNUM) ? DImode : DFmode;
> +  rtx reg1 = gen_rtx_REG (mode, regno1);
> +
> +  *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg1, *cfi_ops);
> +
> +  if (regno2 == FIRST_PSEUDO_REGISTER)
> +{
> +  rtx mem = plus_constant (Pmode, stack_pointer_rtx, adjustment);
> +  mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
> +  emit_move_insn (reg1, gen_rtx_MEM (mode, mem));
> +}
> +  else
> +{
> +  rtx reg2 = gen_rtx_REG (mode, regno2);
> +  *cfi_ops = alloc_reg_note (REG_CFA_RESTORE, reg2, *cfi_ops);
> +  emit_insn (aarch64_gen_loadwb_pair (mode, stack_pointer_rtx, reg1,
> +   reg2, adjustment));
> +}
> +}
> +
>  static rtx
>  aarch64_gen_store_pair (machine_mode mode, rtx mem1, rtx reg1, rtx mem2,
>   rtx reg2)
> @@ -3124,7 +3152,7 @@ aarch64_expand_prologue (void)
>R30_REGNUM, false);
>   }
> else
> - aarch64_pushwb_pair_reg (DImode, R29_REGNUM, R30_REGNUM, offset);
> + aarch64_push_regs (R29_REGNUM, R30_REGNUM, offset);
>  
> /* Set up frame pointer to point to the location of the
>previous frame pointer on the stack.  */
> @@ -3150,14 +3178,8 @@ aarch64_expand_prologue (void)
>   }
> else
>   {
> -   machine_mode mode1 = (reg1 <= R30_REGNUM) ? DImode : DFmode;
> -
> +   aarch64_push_regs (reg1, reg2, offset);
> skip_wb = true;
> -
> -   if (reg2 == FIRST_PSEUDO_REGISTER)
> - aarch64_pushwb_single_reg (mode1, reg1, offset);
> -   else
> - aarch64_pushwb_pair_reg (mode1, reg1, reg2, offset);
>   }
>   }
>  
> @@ -3279,39 +3301,16 @@ aarch64_expand_epilogue (bool for_sibcall)
>   emit_insn (gen_stack_tie (stack_pointer_rtx, stack_pointer_rtx));
>  
>if (skip_wb)
> - {
> -   machine_mode mode1 = (reg1 <= R30_REGNUM) ? DImode : DFmode;
> -   rtx rreg1 = gen_rtx_REG (mode1, reg1);
> -
> -   cfi_ops = alloc_reg_note (REG_CFA_RESTORE, rreg1, cfi_ops);
> -   if (reg2 == FIRST_PSEUDO_REGISTER)
> - {
> -   rtx mem = plus_constant (Pmode, stack_pointer_rtx, offset);
> -   mem = gen_rtx_POST_MODIFY (Pmode, stack_pointer_rtx, mem);
> -   mem = gen_rtx_MEM (mode1, mem);
> -   insn = emit_move_insn (rreg1, mem);
> - }
> -   else
> - {
> -   rtx rreg2 = gen_rtx_REG (mode1, reg2);
> -
> -   cfi_ops = alloc_reg_note (REG_CFA_RESTORE, rreg2, cfi_ops);
> -   insn = emit_insn (aarch64_gen_loadwb_pair
> - (mode1, stack_pointer_rtx, rreg1,
> -  rreg2, offset));
> - }
> - }
> + aarch64_pop_regs (reg1, reg2, offset, &cfi_ops);
>else
> - {
> -   insn = emit_insn (gen_add2_insn (stack_pointer_rtx,
> -GEN_INT (offset)));
> - }
> + emit_insn (gen_add2_insn (stack_pointer_rtx, GEN_INT (

Re: warn for dead function calls [4/4] stor-layout.c fallout

2016-07-26 Thread Prathamesh Kulkarni
On 26 July 2016 at 17:07, Richard Biener  wrote:
> On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:
>
>> The following is an interesting case which broke stor-layout.c.
>> The patch warned for the following call to be dead from
>> bit_field_mode_iterator::next_mode() to get_mode_alignment ():
>>
>>   /* Stop if the mode requires too much alignment.  */
>>   if (GET_MODE_ALIGNMENT (m_mode) > m_align
>>   && SLOW_UNALIGNED_ACCESS (m_mode, m_align))
>> break;
>>
>> GET_MODE_ALIGNMENT (MODE) is just #defined as get_mode_alignment (MODE)
>> in machmode.h
>>
>> SLOW_UNALIGNED_ACCESS (MODE, ALIGN) is #defined to STRICT_ALIGNMENT
>> in defaults.h, and i386.h sets STRICT_ALIGNMENT to 0.
>> So essentially it comes down to:
>>
>> if (get_mode_alignment (m_mode) > m_align && 0)
>>   break;
>>
>> which clearly makes get_mode_alignment(m_mode) a dead call
>> since it's a pure function.
>> However if a target overrides SLOW_UNALIGNED_ACCESS(mode, align)
>> and sets it to some runtime value, then the call won't be dead for that 
>> target.
>>
>> Should we split the above  in two different if conditions ?
>> if (GET_MODE_ALIGNMENT (m_mode) > m_align)
>>   if (SLOW_UNALIGNED_ACCESS (m_mode, m_align))
>> break;
>
> I'm surprised it's only one case that you hit ;)  Be prepared for
> other targets to be broken similarly.
>
> This hints at the general issue of issueing warnings after optimization,
> they can easily become false positives.
Hmm, this would indeed give rise to such false positives :/
I wonder whether we should restrict the warning only for cases when the call
is outermost expression ?
Not sure how to go about that. Maybe add a new flag to tree_exp for CALL_EXPR
say OUTERMOST_CALL_P, which is set by FE when the call is determined to be
outermost expression  ?

Thanks,
Prathamesh
>
> Richard.


Re: [PR71078] x / abs(x) -> copysign (1.0, x)

2016-07-26 Thread Richard Sandiford
Richard Biener  writes:
> On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:
>
>> Hi,
>> The attached patch tries to fix PR71078.
>> I am not sure if I have got the converts right.
>> I put (convert? @0) and (convert1? (abs @1))
>> to match for cases when operands's types may
>> be different from outermost type like in pr71078-3.c
>
> Types of RDIV_EXPR have to be the same so as you have a
> match on @0 the converts need to be either both present
> or not present.
>
> +  (if (FLOAT_TYPE_P (type)
>
> as you special-case several types below please use SCALAR_FLOAT_TYPE_P
> here.
>
> +   && ! HONOR_NANS (type)
> +   && ! HONOR_INFINITIES (type))
> +   (switch
> +(if (type == float_type_node)
> + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (convert @0)))
>
> please use if (types_match (type, float_type_node)) instead of
> pointer equality.

IMO it'd be better to have some way of using mathfn_builtin from match.pd.

> I _think_ you can do better here by using
> IFN_COPYSIGN but possibly only so if the target supports it.
> Richard - this seems to be the first pattern in need of
> generating a builtin where no other was there to match the type
> to - any idea how we can safely use the internal function here?

I don't think there's any benefit to using the internal function here.
At the moment we only use internal functions before cfgexpand if they
let us do something that we couldn't do otherwise, such as vectorising a
function or avoiding unnecessary effects on errno.  All other cases are
handled by cfgexpand itself.

> I see those do not have an expander that would fall back to
> expanding the regular builtin, correct?

Right.  In:

https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01057.html

you weren't keen on the idea of tree codes introducing new dependencies
on libm, so I carried that priniple over to the internal functions
(which are really just extended tree codes).  I suppose copysign is
a special case since we can always open code it, but in general we
shouldn't fall back to something that could generate a call.

Thanks,
Richard


Re: warn for dead function calls [3/4] testsuite fallout

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:

> On 26 July 2016 at 17:06, Richard Biener  wrote:
> > On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> The following test-cases broke due to the warning.
> >> I think however the warning is right for all the cases:
> >>
> >> a) g++.dg/tree-ssa/invalid-dom.C:
> >> I believe the call from main() to E::bar() is dead call ?
> >
> > Can't find this testcase.
> oops, it's dom-invalid.C -;)

Ok.  This testcase was for an infinite loop during compiling and
likely was sensitive to inlining decisions.  We likely dervirtualize
the call - maybe add -fno-devirtualize instead?

> >
> >> b) libffi/testsuite/libffi.call/float.c:
> >> Call from main() to floating() is dead call.
> >>
> >> c) libffi/testsuite/libffi.call/float3.c:
> >> Calls from main() to floating_1() and floating_2() are dead calls.
> >>
> >> d) libffi/testsuite/libffi.call/negint.c:
> >> Call from main() to checking () is dead call.
> >
> > Looks like dead calls in the above but libffi is maintained upstream
> > and just copied to GCC.  Please raise the issue upstream.
> Ok I will raise the issue upstream.
> 
> Thanks,
> Prathamesh
> >
> > Richard.
> >
> >> Should I update the test-cases to pass -Wno-unsued-value,
> >> or remove the calls ?
> >>
> >> Thanks,
> >> Prathamesh
> >>
> >>
> >
> > --
> > Richard Biener 
> > SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
> > 21284 (AG Nuernberg)
> 
> 

-- 
Richard Biener 
SUSE LINUX GmbH, GF: Felix Imendoerffer, Jane Smithard, Graham Norton, HRB 
21284 (AG Nuernberg)


Re: warn for dead function calls [4/4] stor-layout.c fallout

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:

> On 26 July 2016 at 17:07, Richard Biener  wrote:
> > On Tue, 26 Jul 2016, Prathamesh Kulkarni wrote:
> >
> >> The following is an interesting case which broke stor-layout.c.
> >> The patch warned for the following call to be dead from
> >> bit_field_mode_iterator::next_mode() to get_mode_alignment ():
> >>
> >>   /* Stop if the mode requires too much alignment.  */
> >>   if (GET_MODE_ALIGNMENT (m_mode) > m_align
> >>   && SLOW_UNALIGNED_ACCESS (m_mode, m_align))
> >> break;
> >>
> >> GET_MODE_ALIGNMENT (MODE) is just #defined as get_mode_alignment (MODE)
> >> in machmode.h
> >>
> >> SLOW_UNALIGNED_ACCESS (MODE, ALIGN) is #defined to STRICT_ALIGNMENT
> >> in defaults.h, and i386.h sets STRICT_ALIGNMENT to 0.
> >> So essentially it comes down to:
> >>
> >> if (get_mode_alignment (m_mode) > m_align && 0)
> >>   break;
> >>
> >> which clearly makes get_mode_alignment(m_mode) a dead call
> >> since it's a pure function.
> >> However if a target overrides SLOW_UNALIGNED_ACCESS(mode, align)
> >> and sets it to some runtime value, then the call won't be dead for that 
> >> target.
> >>
> >> Should we split the above  in two different if conditions ?
> >> if (GET_MODE_ALIGNMENT (m_mode) > m_align)
> >>   if (SLOW_UNALIGNED_ACCESS (m_mode, m_align))
> >> break;
> >
> > I'm surprised it's only one case that you hit ;)  Be prepared for
> > other targets to be broken similarly.
> >
> > This hints at the general issue of issueing warnings after optimization,
> > they can easily become false positives.
> Hmm, this would indeed give rise to such false positives :/
> I wonder whether we should restrict the warning only for cases when the call
> is outermost expression ?
> Not sure how to go about that. Maybe add a new flag to tree_exp for CALL_EXPR
> say OUTERMOST_CALL_P, which is set by FE when the call is determined to be
> outermost expression  ?

That doesn't sound like a good idea.  But I have no good idea either
but to suggest this kind of warning is bad ;)

Richard.


Re: [RFC][IPA-VRP] Early VRP Implementation

2016-07-26 Thread Richard Biener
On Tue, Jul 26, 2016 at 2:27 PM, kugan
 wrote:
>
>
> Hi Riachard,
>
> Thanks for the review. Here is an updated patch with comments below.
>
>> +/* Restore/Pop all the old VRs maintained in the cond_stack.  */
>> +
>> +void evrp_dom_walker::finalize_dom_walker ()
>> +{
>> +  while (!cond_stack.is_empty ())
>> +{
>> +  tree var = cond_stack.last ().second;
>> +  pop_value_range (var);
>> +  cond_stack.pop ();
>> +}
>>
>> why is this necessary at all?  Looks like a waste of time (and the
>> stack should be empty here
>> anyway).  I think you leak cond_stack as well, I suppose using
>> auto_vec might work here,
>> you have to check.
>
> Done.
>
>>

 + /* Discover VR when condition is true.  */
 + if (te == e
 + && !TREE_OVERFLOW_P (op0)
 + && !TREE_OVERFLOW_P (op1))
>>>
>>>
>>>
>>> This is mainly to handle gcc.dg/pr38934.c. This would result in ICE from
>>> set_value_range otherwise:
>>>
>>> gcc_assert ((!TREE_OVERFLOW_P (min) || is_overflow_infinity (min))
>>>   && (!TREE_OVERFLOW_P (max) || is_overflow_infinity
>>> (max)));
>>
>>
>> Ok, instead make sure to remove the overflow flag via
>>
>>   if (TREE_OVERFLOW_P (op0))
>> op0 = drop_tree_overflow (op0);
>
>  ...
> Done.
>
>>
>> it looks like you want to merge the if ( & EDGE_TRUE_VALUE) and
>> FALSE_VALUE
>> cases and only invert the tree comparison for the false edge.
>
> Done.
>
>>
>> + tree cond = build2 (code, boolean_type_node, op0, op1);
>> + extract_range_for_var_from_comparison_expr (&vr, op0, cond);
>>
>> no wasteful tree building here please.  Instead of cond pass in code,
>> op0 and op1
>> as separate arguments.
>
> Done.
>
>>
>> + /* If we found any usable VR, set the VR to ssa_name and create
>> a
>> +PUSH old value in the cond_stack with the old VR.  */
>> + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
>> +   {
>> + value_range *new_vr = vrp_value_range_pool.allocate ();
>> + memset (new_vr, 0, sizeof (*new_vr));
>> + *new_vr = vr;
>> + cond_stack.safe_push (std::make_pair (bb, op0));
>>
>> the memset looks redundant given you copy-assing from vr anyway.
>
> Done.
>
>>
>> You seem to miss the chance to register ranges for x_2 in i_1 < x_2 where
>> both i_1 and x_2 might have ranges that are useful.
>
> I will address this once the basic structure is in place if that is OK with
> you.

Sure.  Just note it down somewhere.

>>
>> push and pop_value_range should be methods in the dom walker class
>> and vrp_stack and cond_stack should be a single stack.  You can omit
>> basic_block if you use a "magic" NULL_TREE var as marker (simply
>> push a NULL_TREE, NULL pair at the end of before_dom_children).
>>
> Done.
>
>

 you can use e->flags & EDGE_TRUE_VALUE/EDGE_FALSE_VALUE

 why do you need those TREE_OVERFLOW checks?

 + tree cond = build2 (code, boolean_type_node, op0, op1);
 + tree a = build2 (ASSERT_EXPR, TREE_TYPE (op0), op0, cond);
 + extract_range_from_assert (&vr, a);

 so I was hoping that the "refactoring" patch in the series would expose
 a
 more
 useful interface than extract_range_from_assert ... namely one that can
 extract a range from the comparison directly and does not require
 building
 a scratch ASSERT_EXPR.
>>>
>>>
>>>
>>> I have split extract_range_from_assert into
>>> extract_range_for_var_from_comparison_expr and using it now. Is that
>>> better?
>>
>>
>> See above.
>>


 + /* If we found any usable VR, set the VR to ssa_name and
 create
 a
 +restore point in the cond_stack with the  old VR. */
 + if (vr.type == VR_RANGE || vr.type == VR_ANTI_RANGE)
 +   {
 + value_range *new_vr = XCNEW (value_range);
 + *new_vr = vr;
 + cond_stack.safe_push (std::make_pair (bb,
 +   std::make_pair (op0,
 +
 old_vr)));
 + change_value_range (op0, new_vr);

 I don't like 'change_value_range' as interface, please integrate that
 into
 a push/pop_value_range style interface instead.
>>>
>>>
>>>
>>> Tried using push/pop interface.



 +   vrp_visit_stmt (stmt, &taken_edge_p, &output_p);
 +}
 +
 +  return NULL;

 you should return taken_edge_p (misnamed as it isn't a pointer) and take
 advantage of EDGE_EXECUTABLE.  Again see DOM/SCCVN (might want to
 defer this as a followup improvement).
>>>
>>>
>>>
>>> I have added a TODO to this effect and will comeback to it.



 Note that the advantage of a DOM-based VRP is that backtracking is easy
 to implement (you don't do that yet).  That is, after DEF got a (better)
 value-range you can simply re-visit all the de

Re: [patch] Get rid of stack trampolines for nested functions

2016-07-26 Thread Eric Botcazou
> So i think with the various nits I pointed out the target independent
> stuff is good to go on the trunk.  Then you can just iterate with the
> target guys to get those wrapped up.

OK, I'll do a pass on the entire patch, post a second version with the 
required fixes, split into 2 parts as agreed, and get approval from the 
architecture maintainers individually.

-- 
Eric Botcazou


Re: [PATCH] Replacing gcc's dependence on libiberty's fnmatch to gnulib's fnmatch

2016-07-26 Thread ayush goel
On 26 July 2016 at 3:38:59 AM, Manuel López-Ibáñez
(lopeziba...@gmail.com) wrote:
> On 25 July 2016 at 18:18, ayush goel wrote:
> > On top of the previously filed patch for importing gnulib (the link
> > isn’t available on the archive yet, however this contains some of the
> > information: 
> > http://gcc.1065356.n5.nabble.com/Importing-gnulib-into-the-gcc-tree-td1275807.html#a1279573)
> > now I have replaced another function from libiberty with the
> > corresponding version from gnulib.
> > Even though in both OSX and GNU/Linux, fnmatch is provided by the GNU
> > libc already, so the copy in libiberty is not used in your systems.
> > However since the objective is to replace whatever functions can be
> > leveraged by gnulib, these changes have been made.
>
> Why the change from "fnmatch.h" to ?

Gnulib doesn’t contain a header for fnmatch. It itself relies on
glib’c fnmatch.h

>
> Also, are the files in gnulib and libiberty semantically identical?
> The wiki page does not say anything about this. How did you check
> this?

Well the online docs for libiberty and gnulib claim the same
definition for fnmatch. Apart from this I’ve manually gone through the
source code and they seem to be semantically similar.
Also the fact that the system builds fine and the tests also execute
fine could serve as a manifestation for the fact that they are
semantically same.
>
> GCC can run on other systems besides OSX and GNU/Linux, how can you
> test that your change does not break anything on those systems?
>
Well I have access to these two systems only. How would you suggest I
test my patches on all possible systems?

> Cheers,
>
> Manuel.
>

-Ayush


RE: [PATCH 3/4] Add support to run auto-vectorization tests for multiple effective targets

2016-07-26 Thread Robert Suchanek
Hi,

> On May 5, 2016, at 8:14 AM, Robert Suchanek  
> wrote:
> >
> > I'm resending this patch as it has been rebased and updated.  I reverted a
> change
> > to check_effective_target_vect_call_lrint procedure because it does not use
> > cached result.
> 
> Ok.
> 
> Please ensure that the compilation flag is mixed into the test case name so
> that as you iterate over them, the test case names are unique.

An effective target is likely to have a unique flag to enable a given set of
SIMD operations and this is mixed into test case names.

I double-checked this with mips-mti-linux-gnu where auto-vectorization tests
can be run twice i.e. for -mmsa and -mpaired-single.

The patch was rebased once again and tested on x86_64-unknown-linux-gnu.

Committed as r238755.

Thanks and regards,
Robert



[AArch64] Handle HFAs of float16 types properly

2016-07-26 Thread James Greenhalgh

Hi,

It looks like we've not been handling structures of 16-bit floating-point
data correctly for AArch64. For some reason we end up passing them
packed in to integer registers. That is to say, on trunk and GCC 6, for:

  struct x {
__fp16 x[4];
  };

  __fp16
  foo1 (struct x x)
  {
return x.x[1];
  }

We generate:

  foo1:
sbfxx0, x0, 16, 16
mov v0.h[0], w0
ret

Which is wrong.

This patch fixes that, so now we generate:

  foo1:
umovw0, v1.h[0]
sxthx0, w0
mov v0.h[0], w0
ret

Far from optimal (I'll work on that...) but at least getting the data from
the right register bank!

To do this we need to keep around a reference to the fp16 type after we
construct it. I've moved this initialisation to a new function
aarch64_init_fp16_types in aarch64-builtins.c and made the references
available through arm_neon.h.

After that, we want to remove the #if 0 wrapping HFmode support in
aarch64_gimplify_va_arg_expr in aarch64.c, and add HFmode to the
REAL_TYPE and COMPLEX_TYPE support in aapcs_vfp_sub_candidate.

Strictly speaking, we don't need the hunk regarding COMPLEX_TYPE.
We can't build complex forms of __fp16. But, were we ever to support the
_Float16 type we'd need this. Rather than leave the chance it will be
forgotten about, I've just added it here. If the maintainers would prefer,
I can change this to a TODO and put a sticky-note somewhere near my desk.

With those simple changes, we fix the argument passing. The rest of the
patch is an update to the various testcases in aapcs64.exp to fully cover
various __fp16 cases (both naked, and within an HFA).

Bootstrapped on aarch64-none-linux-gnu and tested with no issues. Also
tested on aarch64_be-none-elf. All test came back clean.

OK? As this is an ABI break, I'm not proposing for it to go back to GCC 6,
though it will apply cleanly there if the maintainers support that.

Thanks,
James

---

gcc/

2016-07-26  James Greenhalgh  

* config/aarch64/aarch64.h (aarch64_fp16_type_node): Declare.
(aarch64_fp16_ptr_type_node): Likewise.
* config/aarch64/aarch64-simd-builtins.c
(aarch64_fp16_ptr_type_node): Define.
(aarch64_init_fp16_types): New, refactored out of...
(aarch64_init_builtins): ...here, update to call
aarch64_init_fp16_types.
* config/aarch64/aarch64.c (aarch64_gimplify_va_arg_expr): Handle
HFmode.
(aapcs_vfp_sub_candidate): Likewise.

gcc/testsuite/

2016-07-26  James Greenhalgh  

* gcc.target/aarch64/aapcs64/abitest-common.h: Define half-precision
registers.
* gcc.target/aarch64/aapcs64/abitest.S (dumpregs): Add assembly for
saving the half-precision registers.
* gcc.target/aarch64/aapcs64/func-ret-1.c: Test that an __fp16
value is returned in h0.
* gcc.target/aarch64/aapcs64/test_2.c: Check that __FP16 arguments
are passed in FP/SIMD registers.
* gcc.target/aarch64/aapcs64/test_27.c: New, test that __fp16 HFA
passing works corrcetly.
* gcc.target/aarch64/aapcs64/type-def.h (hfa_f16x1_t): New.
(hfa_f16x2_t): Likewise.
(hfa_f16x3_t): Likewise.
* gcc.target/aarch64/aapcs64/va_arg-1.c: Check that __fp16 values
are promoted to double and passed in a double register.
* gcc.target/aarch64/aapcs64/va_arg-2.c: Check that __fp16 values
are promoted to double and stacked.
* gcc.target/aarch64/aapcs64/va_arg-4.c: Check stacking of HFA of
__fp16 data types.
* gcc.target/aarch64/aapcs64/va_arg-5.c: Likewise.
* gcc.target/aarch64/aapcs64/va_arg-16.c: New, check HFAs of
__fp16 first get passed in FP/SIMD registers, then stacked.

diff --git a/gcc/config/aarch64/aarch64-builtins.c b/gcc/config/aarch64/aarch64-builtins.c
index ca91d91..1de325a 100644
--- a/gcc/config/aarch64/aarch64-builtins.c
+++ b/gcc/config/aarch64/aarch64-builtins.c
@@ -443,13 +443,15 @@ static struct aarch64_simd_type_info aarch64_simd_types [] = {
 };
 #undef ENTRY
 
-/* This type is not SIMD-specific; it is the user-visible __fp16.  */
-static tree aarch64_fp16_type_node = NULL_TREE;
-
 static tree aarch64_simd_intOI_type_node = NULL_TREE;
 static tree aarch64_simd_intCI_type_node = NULL_TREE;
 static tree aarch64_simd_intXI_type_node = NULL_TREE;
 
+/* The user-visible __fp16 type, and a pointer to that type.  Used
+   across the back-end.  */
+tree aarch64_fp16_type_node = NULL_TREE;
+tree aarch64_fp16_ptr_type_node = NULL_TREE;
+
 static const char *
 aarch64_mangle_builtin_scalar_type (const_tree type)
 {
@@ -883,6 +885,21 @@ aarch64_init_builtin_rsqrt (void)
   }
 }
 
+/* Initialize the backend types that support the user-visible __fp16
+   type, also initialize a pointer to that type, to be used when
+   forming HFAs.  */
+
+static void
+aarch64_init_fp16_types (void)
+{
+  aarch64_fp16_type_node = make_node (REAL_TYPE);
+  TYPE_PRECISION (aarch64_fp16_ty

Re: [PR71078] x / abs(x) -> copysign (1.0, x)

2016-07-26 Thread Richard Biener
On Tue, 26 Jul 2016, Richard Sandiford wrote:

> Richard Biener  writes:
> > On Mon, 25 Jul 2016, Prathamesh Kulkarni wrote:
> >
> >> Hi,
> >> The attached patch tries to fix PR71078.
> >> I am not sure if I have got the converts right.
> >> I put (convert? @0) and (convert1? (abs @1))
> >> to match for cases when operands's types may
> >> be different from outermost type like in pr71078-3.c
> >
> > Types of RDIV_EXPR have to be the same so as you have a
> > match on @0 the converts need to be either both present
> > or not present.
> >
> > +  (if (FLOAT_TYPE_P (type)
> >
> > as you special-case several types below please use SCALAR_FLOAT_TYPE_P
> > here.
> >
> > +   && ! HONOR_NANS (type)
> > +   && ! HONOR_INFINITIES (type))
> > +   (switch
> > +(if (type == float_type_node)
> > + (BUILT_IN_COPYSIGNF { build_one_cst (type); } (convert @0)))
> >
> > please use if (types_match (type, float_type_node)) instead of
> > pointer equality.
> 
> IMO it'd be better to have some way of using mathfn_builtin from match.pd.

Yeah ... but that is not supported.  I suppose we could "easily"
allow BUILT_IN_COPYSIGN for all types when the generator inserts
a mathfn_built_in_2 call unconditionally for fn-kind result ops.
Of course we'd need to deal with it failing and it would be
redundant work for most of the existing patterns.  See patch below
which should allow

 (BUILT_IN_COPYSIGN { build_cone_cst (type); } (conver @0))

for all types.  Well.  We probably should check for END_BUILTINS
as return value and fail pattern generation late here...

> > I _think_ you can do better here by using
> > IFN_COPYSIGN but possibly only so if the target supports it.
> > Richard - this seems to be the first pattern in need of
> > generating a builtin where no other was there to match the type
> > to - any idea how we can safely use the internal function here?
> 
> I don't think there's any benefit to using the internal function here.

The advantage is solely that the internal function is "overloaded"
vs the builtin having variants for all the float types that are
supported.

> At the moment we only use internal functions before cfgexpand if they
> let us do something that we couldn't do otherwise, such as vectorising a
> function or avoiding unnecessary effects on errno.  All other cases are
> handled by cfgexpand itself.
> 
> > I see those do not have an expander that would fall back to
> > expanding the regular builtin, correct?
> 
> Right.  In:
> 
> https://gcc.gnu.org/ml/gcc-patches/2015-08/msg01057.html
> 
> you weren't keen on the idea of tree codes introducing new dependencies
> on libm, so I carried that priniple over to the internal functions
> (which are really just extended tree codes).  I suppose copysign is
> a special case since we can always open code it, but in general we
> shouldn't fall back to something that could generate a call.

Richard.

Index: gcc/builtins.c
===
--- gcc/builtins.c  (revision 238743)
+++ gcc/builtins.c  (working copy)
@@ -1767,7 +1767,7 @@ expand_builtin_classify_type (tree exp)
This is purely an operation on function codes; it does not guarantee
that the target actually has an implementation of the function.  */
 
-static built_in_function
+built_in_function
 mathfn_built_in_2 (tree type, combined_fn fn)
 {
   built_in_function fcode, fcodef, fcodel;
Index: gcc/builtins.h
===
--- gcc/builtins.h  (revision 238743)
+++ gcc/builtins.h  (working copy)
@@ -61,6 +61,7 @@ extern tree c_strlen (tree, int);
 extern void expand_builtin_setjmp_setup (rtx, rtx);
 extern void expand_builtin_setjmp_receiver (rtx);
 extern void expand_builtin_update_setjmp_buf (rtx);
+extern built_in_function mathfn_built_in_2 (tree, combined_fn);
 extern tree mathfn_built_in (tree, enum built_in_function fn);
 extern tree mathfn_built_in (tree, combined_fn);
 extern rtx builtin_strncpy_read_str (void *, HOST_WIDE_INT, machine_mode);
Index: gcc/genmatch.c
===
--- gcc/genmatch.c  (revision 238743)
+++ gcc/genmatch.c  (working copy)
@@ -2333,6 +2333,14 @@ expr::gen_transform (FILE *f, int indent
   else
 opr_name = operation->id;
 
+  if (operation->kind == id_base::FN)
+{
+  fprintf_indent (f, indent,
+ "combined_fn tem_fcode = as_combined_fn ("
+ "mathfn_built_in_2 (%s, %s));\n", type, opr_name);
+  opr_name = "tem_fcode";
+}
+
   if (gimple)
 {
   if (*opr == CONVERT_EXPR)



Re: [Bug tree-optimization] Fix for PR71994

2016-07-26 Thread Jeff Law

On 07/25/2016 09:13 PM, kugan wrote:

Hi,

For testcase in pr71994, type of bb conditional result and the type of
the PHI stmt are different (as om.0_1 is int and the first PHI argument
is _bool; PHI stmt uses a constant zero that comes from edge 2).
Therefore when we optimize final range test stmt, we end up setting
integer 1 for other PHI argument. This results in ICE.

  :
  om.0_1 = om;
  _2 = om.0_1 >= 0;
  int _3 = (int) _2;
  if (om.0_1 != 0)
goto ;
  else
goto ;

  :
  int _4 = om.0_1 & _3;
  _Bool _12 = _4 != 0;
  :

  # _Bool _13 = PHI <0(2), _12(3)>


IMHO, the fix should be that, we should check the type before replacing
the value (punt otherwise). Attached patch does that. Bootstrapped and
regression tested on x86_64-linux-gnu with no new regressions. Is this
OK for trunk?

Thanks,
Kugan

gcc/testsuite/ChangeLog:

2016-07-26  Kugan Vivekanandarajah  

* gcc.dg/torture/pr71994.c: New test.

gcc/ChangeLog:

2016-07-26  Kugan Vivekanandarajah  

* tree-ssa-reassoc.c (maybe_optimize_range_tests): Check type
compatibility.
I'd kind of like to see some analysis of how we've got a bool here -- 
ISTM it ought to have been converted it to the type of the LHS of the 
PHI when propagated.


Jeff


[PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Michael Meissner
When I originally developed the IEEE 128-bit floating point support, the
emulation routines in libgcc did not raise errors on signalling NaNs.  In the
course of adding full support for IEEE 128-bit floating point, we now have
added exception signaling support in the library.  This means the C99/IEEE
built-in functions isgreater, isgreaterequal, isless, islessequal, and
islessgreater now will raise an error when you compare a signaling NaN.  These
functions are mandated not to raise an error.

These patches add calls to __unordkf3 to validate that both arguments are
ordered before calling the ge/le/eq comparison function.  I have done
bootstraps and make check on both little endian Power8 and big endian Power7
(both 32-bit and 64-bit tests done on the Power7 box) with no regressions.  Are
these patches ok for the trunk?

In addition, since the glibc group needs this functionality to complete the
__float128 library support, I would like to get it into GCC 6.2 if it is still
possible.  I know this is last minute, but this patch is important to the GLIBC
group, and it was only noticed recently.  I will be starting the bootstrap and
regression test on the gcc6 branch shortly.  If the patch works on gcc6, can I
commit it to the gcc6 branch ASAP, or will it have to wait for GCC 6.3?

[gcc]
2016-07-25  Michael Meissner  

PR target/71869
* config/rs6000/rs6000.c (rs6000_generate_compare): Rework
__float128 support when we don't have hardware support, so that
the IEEE built-in functions like isgreater, first call __unordkf3
to make sure neither operand is a NaN, and if both operands are
ordered, do the normal comparison.

[gcc/testsuite]
2016-07-25  Michael Meissner  

PR target/71869
* gcc.target/powerpc/float128-cmp.c: New test to make sure that
IEEE built-in functions first check to see if the arguments are
ordered.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 238730)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -21755,8 +21755,8 @@ rs6000_generate_compare (rtx cmp, machin
   else if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
 {
   rtx libfunc = NULL_RTX;
-  bool uneq_or_ltgt = false;
-  rtx dest = gen_reg_rtx (SImode);
+  bool check_nan = false;
+  rtx dest;
 
   switch (code)
{
@@ -21783,21 +21783,23 @@ rs6000_generate_compare (rtx cmp, machin
 
case UNGE:
case UNGT:
- libfunc = optab_libfunc (le_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (ge_optab, mode);
  code = (code == UNGE) ? GE : GT;
  break;
 
case UNLE:
case UNLT:
- libfunc = optab_libfunc (ge_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (le_optab, mode);
  code = (code == UNLE) ? LE : LT;
  break;
 
case UNEQ:
case LTGT:
- libfunc = optab_libfunc (le_optab, mode);
- uneq_or_ltgt = true;
- code = (code = UNEQ) ? NE : EQ;
+ check_nan = true;
+ libfunc = optab_libfunc (eq_optab, mode);
+ code = (code = UNEQ) ? EQ : NE;
  break;
 
default:
@@ -21805,21 +21807,56 @@ rs6000_generate_compare (rtx cmp, machin
}
 
   gcc_assert (libfunc);
-  dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
- SImode, 2, op0, mode, op1, mode);
 
-  /* If this is UNEQ or LTGT, we call __lekf2, which returns -1 for less
-than, 0 for equal, +1 for greater, and +2 for nan.  We add 1, to give
-a value of 0..3, and then do and AND immediate of 1 to isolate whether
-it is 0/Nan (i.e. bottom bit is 0), or less than/greater than
-(i.e. bottom bit is 1).  */
-  if (uneq_or_ltgt)
-   {
- rtx add_result = gen_reg_rtx (SImode);
- rtx and_result = gen_reg_rtx (SImode);
- emit_insn (gen_addsi3 (add_result, dest, GEN_INT (1)));
- emit_insn (gen_andsi3 (and_result, add_result, GEN_INT (1)));
- dest = and_result;
+  if (!check_nan)
+   dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+   SImode, 2, op0, mode, op1, mode);
+
+  /* The library signals an exception for signalling NaNs, so we need to
+handle isgreater, etc. by first checking isordered.  */
+  else
+   {
+ rtx ne_rtx, normal_dest, unord_dest;
+ rtx unord_func = optab_libfunc (unord_optab, mode);
+ rtx join_label = gen_label_rtx ();
+ rtx join_ref = gen_rtx_LABEL_REF (VOIDmode, join_label);
+ rtx unord_cmp = gen_reg_rtx (comp_mode);
+
+
+ /* Test for either value being a NaN. 

Re: [PATCH 2/6] use auto_sbitmap in various places

2016-07-26 Thread Jeff Law

On 07/25/2016 07:55 PM, Trevor Saunders wrote:

On Mon, Jul 25, 2016 at 11:18:07AM -0600, Jeff Law wrote:

On 07/24/2016 05:44 AM, tbsaunde+...@tbsaunde.org wrote:

From: Trevor Saunders 

gcc/ChangeLog:

2016-07-24  Trevor Saunders  

* bt-load.c (compute_out): Use auto_sbitmap class.
(link_btr_uses): Likewise.
* cfganal.c (mark_dfs_back_edges): Likewise.
(post_order_compute): Likewise.
(inverted_post_order_compute): Likewise.
(pre_and_rev_post_order_compute_fn): Likewise.
(single_pred_before_succ_order): Likewise.
* cfgexpand.c (pass_expand::execute): Likewise.
* cfgloop.c (verify_loop_structure): Likewise.
* cfgloopmanip.c (fix_bb_placements): Likewise.
(remove_path): Likewise.
(update_dominators_in_loop): Likewise.
* cfgrtl.c (break_superblocks): Likewise.
* ddg.c (check_sccs): Likewise.
(create_ddg_all_sccs): Likewise.
* df-core.c (df_worklist_dataflow): Likewise.
* dse.c (dse_step3): Likewise.
* except.c (eh_region_outermost): Likewise.
* function.c (thread_prologue_and_epilogue_insns): Likewise.
* gcse.c (prune_expressions): Likewise.
(prune_insertions_deletions): Likewise.
* gimple-ssa-backprop.c (backprop::~backprop): Likewise.
* graph.c (draw_cfg_nodes_no_loops): Likewise.
* ira-lives.c (remove_some_program_points_and_update_live_ranges): 
Likewise.
* lcm.c (compute_earliest): Likewise.
(compute_farthest): Likewise.
* loop-unroll.c (unroll_loop_constant_iterations): Likewise.
(unroll_loop_runtime_iterations): Likewise.
(unroll_loop_stupid): Likewise.
* lower-subreg.c (decompose_multiword_subregs): Likewise.
* lra-lives.c: Likewise.
* lra.c (lra): Likewise.
* modulo-sched.c (schedule_reg_moves): Likewise.
(optimize_sc): Likewise.
(get_sched_window): Likewise.
(sms_schedule_by_order): Likewise.
(check_nodes_order): Likewise.
(order_nodes_of_sccs): Likewise.
(order_nodes_in_scc): Likewise.
* recog.c (split_all_insns): Likewise.
* regcprop.c (pass_cprop_hardreg::execute): Likewise.
* reload1.c (reload): Likewise.
* sched-rgn.c (haifa_find_rgns): Likewise.
(split_edges): Likewise.
(compute_trg_info): Likewise.
* sel-sched.c (init_seqno): Likewise.
* store-motion.c (remove_reachable_equiv_notes): Likewise.
* tree-into-ssa.c (update_ssa): Likewise.
* tree-ssa-live.c (live_worklist): Likewise.
* tree-ssa-loop-im.c (fill_always_executed_in): Likewise.
* tree-ssa-loop-ivcanon.c (try_unroll_loop_completely):
* Likewise.
(try_peel_loop): Likewise.
* tree-ssa-loop-manip.c (tree_transform_and_unroll_loop):
* Likewise.
* tree-ssa-pre.c (compute_antic): Likewise.
* tree-ssa-reassoc.c (undistribute_ops_list): Likewise.
* tree-stdarg.c (reachable_at_most_once): Likewise.
* tree-vect-slp.c (vect_attempt_slp_rearrange_stmts): Likewise.
* var-tracking.c (vt_find_locations): Likewise.

While looking at this, I noticed some more.  Check the local sbitmaps in
ddg.c::find_nodes_on_paths.


yeah, there was a couple places like that which I left for the time
being mostly because they swapped bitmaps and I wanted to add a way of
doing that first.


I wonder if building a plugin to help find this kind of thing would help.
Essentially what we want is TYPE which we're considering converting to an
auto_TYPE.  We want to see the declaration, possibly an allocation and an
explicit release.  All paths have to pass through an explicit release.  The
object must not escape.  We can have a whitelist of routines where the
object can be passed in as a parameter.

Given that kind of infrastructure we ought to be able to look at a type and
say, yes, this seems to make sense to turn into an auto and here's all the
places that need twiddling for that change.

That infrastructure could also do things like say "X escaped via call Y" or
a path doesn't release which can further guide the process, or show leaks.


yeah, that would definitely be nice, and I've considered doing it
before, but been afraid it would be a bit of work.  I suppose though if
it took work it would be because I learned some things so maybe its
worth doing for bitmap and all the uses of malloc / free.
It also would likely to be extendable to other uses as well.  There's a 
ton of parallels between this and doing things like double-free, 
use-after-free, leaked resource detection.






diff --git a/gcc/tree-ssa-pre.c b/gcc/tree-ssa-pre.c
index a5f3f71..6247a4c 100644
--- a/gcc/tree-ssa-pre.c
+++ b/gcc/tree-ssa-pre.c
@@ -2451,7 +2451,7 @@ compute_antic (void)

has_abnormal_preds seems ripe for a similar change in this function.


unfortunately its a file static, so that needs work first.
Ah.  Missed tha

Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Jeff Law

On 07/26/2016 03:57 AM, Ilya Enkovich wrote:


Ilya, what's the fundamental reason why we need to run
if-conversion again? Yes, I know you want to if-convert the
epilogue, but why?

What are the consequences of not doing if-conversion on the
epilogue? Presumably we miss a vectorization opportunity on the
tail.  But that may be a reasonable limitation to allow the
existing work to move forward while you go back and revamp things a
little.


If we have some control-flow in a loop then we have to if-convert it
for vectorizer. We need to preserve both versions: if-converted one
for vectorizer and the original one to be used if vectorization
fails.  For epilogues we have similar situation and need two
versions.  I do it by running if-conversion on a copy of original
loop. Note that it doesn't run full if-conversion pass. If-conversion
is called for epilogue loop only.
Right.  So what I think Richi wants you to try is to use the 
if-converted loop to construct the if-converted epilogue.  It seems 
conceptually simple and low cost -- the question is on the 
implementation side.  I have no clue how painful that would be.


jeff



Re: [PATCH, vec-tails 07/10] Support loop epilogue combining

2016-07-26 Thread Ilya Enkovich
2016-07-26 18:26 GMT+03:00 Jeff Law :
> On 07/26/2016 03:57 AM, Ilya Enkovich wrote:
>>>
>>>
>>> Ilya, what's the fundamental reason why we need to run
>>> if-conversion again? Yes, I know you want to if-convert the
>>> epilogue, but why?
>>>
>>> What are the consequences of not doing if-conversion on the
>>> epilogue? Presumably we miss a vectorization opportunity on the
>>> tail.  But that may be a reasonable limitation to allow the
>>> existing work to move forward while you go back and revamp things a
>>> little.
>>
>>
>> If we have some control-flow in a loop then we have to if-convert it
>> for vectorizer. We need to preserve both versions: if-converted one
>> for vectorizer and the original one to be used if vectorization
>> fails.  For epilogues we have similar situation and need two
>> versions.  I do it by running if-conversion on a copy of original
>> loop. Note that it doesn't run full if-conversion pass. If-conversion
>> is called for epilogue loop only.
>
> Right.  So what I think Richi wants you to try is to use the if-converted
> loop to construct the if-converted epilogue.  It seems conceptually simple
> and low cost -- the question is on the implementation side.  I have no clue
> how painful that would be.

Probably another part of if-conversion may be re-used to build required
epilogue.  I'll have a look.

Thanks,
Ilya

>
> jeff
>


[PATCH] gfortran: Fix allocation of diagnostig string (was too small).

2016-07-26 Thread Dominik Vogt
The attached patch fixes an out of bound write to memory allocated
with alloca() on the stack.  This rarely ever happened because on
one hand -fbounds-check needs to be enabled, and on the other hand
alloca() used to allocate a few bytes extra most of the time so
most of the time the excess write did no harm.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/fortran/ChangeLog

* trans-array.c (gfc_conv_array_ref): Fix allocation of diagnostic
message (was too small).
>From a364536c94c5b5c124c3fd6e5cb547aa941aca12 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Tue, 26 Jul 2016 13:17:29 +0100
Subject: [PATCH] gfortran: Fix allocation of diagnostig string (was too
 small).

---
 gcc/fortran/trans-array.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/gcc/fortran/trans-array.c b/gcc/fortran/trans-array.c
index e95c8dd..7572755 100644
--- a/gcc/fortran/trans-array.c
+++ b/gcc/fortran/trans-array.c
@@ -3332,7 +3332,7 @@ gfc_conv_array_ref (gfc_se * se, gfc_array_ref * ar, 
gfc_expr *expr,
  if (ref->type == REF_ARRAY && &ref->u.ar == ar)
break;
  if (ref->type == REF_COMPONENT)
-   len += 1 + strlen (ref->u.c.component->name);
+   len += 2 + strlen (ref->u.c.component->name);
}
 
   var_name = XALLOCAVEC (char, len);
-- 
2.3.0



Re: [PATCH] Fix unsafe function attributes for special functions (PR 71876)

2016-07-26 Thread Bernd Edlinger
On 07/22/16 14:49, Bernd Edlinger wrote:
>
> Hmm, cough...
>
> Boot-strap and reg-test was OK, but...
>
> I just noticed that the new built-ins do not quite work as expected.
>
> First with C++ there is no warning in this example although the
> parameters are different:
>
> cat test2.C
> extern "C"
> void savectx   () __attribute__((nothrow));
>
> int test () throw()
> {
>savectx ();
>return 0;
> }
>
>
> and what I do not understand at all in the moment, is:
> although both the builtin and the explict header say
> "nothrow" there is still eh code emitted that was not
> there before.
>

I have enterd this in bugzilla here:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71973

Before that is solved, new built-in functions will not help
too much, that's clear now.

> So I would like to split that patch again, in the
> cleanup of special_function_p and drop the new built-ins
> in the moment, until I understand what went wrong there.
> The built-in were just meant to bring up warnings, and
> are therefore less important.
>
>
> Sorry for all the iterations this took already :(
>
> So here is the next version of the patch that I am gonna try.
>
>
> Is it OK for trunk?
>
>
> Thanks
> Bernd.


Richard, this was the latest version of the patch:
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg01481.html


Are you OK with my clean-up of the presumably dead function names,
or would you like to keep the quirks in special_function_p for now
and just remove ECF_NORETURN and ECF_LEAF?


Note, that I still consider it incorrect, that special_function_p
cannot distinguish between C and C++ bindings though.


Thanks
Bernd.


Re: [PATCH PR71734] Add missed check that reference defined inside loop.

2016-07-26 Thread Yuri Rumyantsev
Hi Richard,

It turned out that the patch proposed by you does not work properly
for nested loop. If we slightly change pr70729.cc to
(non-essential part is omitted
void inline Ss::foo (float w)
{
#pragma omp simd
  for (int i=0; i:
> Richard,
>
> Jakub has already fixed it.
> Sorry for troubles.
> Yuri.
>
> 2016-07-19 18:29 GMT+03:00 Renlin Li :
>> Hi Yuri,
>>
>> I saw this test case runs on arm platforms, and maybe other platforms as
>> well.
>>
>> testsuite/g++.dg/vect/pr70729.cc:7:10: fatal error: xmmintrin.h: No such
>> file or directory
>>
>> Before the change here, it's gated by vect_simd_clones target selector,
>> which limit it to i?86/x86_64 platform only.
>>
>> Regards,
>> Renlin Li
>>
>>
>>
>>
>> On 08/07/16 15:07, Yuri Rumyantsev wrote:
>>>
>>> Hi Richard,
>>>
>>> Thanks for your help - your patch looks much better.
>>> Here is new patch in which additional argument was added to determine
>>> source loop of reference.
>>>
>>> Bootstrap and regression testing did not show any new failures.
>>>
>>> Is it OK for trunk?
>>> ChangeLog:
>>> 2016-07-08  Yuri Rumyantsev  
>>>
>>> PR tree-optimization/71734
>>> * tree-ssa-loop-im.c (ref_indep_loop_p_1): Add REF_LOOP argument which
>>> contains REF, use it to check safelen, assume that safelen value
>>> must be greater 1, fix style.
>>> (ref_indep_loop_p_2): Add REF_LOOP argument.
>>> (ref_indep_loop_p): Pass LOOP as additional argument to
>>> ref_indep_loop_p_2.
>>> gcc/testsuite/ChangeLog:
>>>  * g++.dg/vect/pr70729.cc: Delete redundant dg options, fix style.
>>>
>>> 2016-07-08 11:18 GMT+03:00 Richard Biener :

 On Thu, Jul 7, 2016 at 5:38 PM, Yuri Rumyantsev 
 wrote:
>
> I checked simd3.f90 and found out that my additional check reject
> independence of references
>
> REF is independent in loop#3
> .istart0.19, .iend0.20
> which are defined in loop#1 which is outer for loop#3.
> Note that these references are defined by
> _103 = __builtin_GOMP_loop_dynamic_next (&.istart0.19, &.iend0.20);
> which is in loop#1.
> It is clear that both these references can not be independent for
> loop#3.


 Ok, so we end up calling ref_indep_loop for ref in LOOP also for inner
 loops
 of LOOP to catch memory references in those as well.  So the issue is
 really
 that we look at the wrong loop for safelen and we _do_ want to apply
 safelen
 to inner loops as well.

 So better track the loop we are ultimately asking the question for, like
 in the
 attached patch (fixes the testcase for me).

 Richard.



> 2016-07-07 17:11 GMT+03:00 Richard Biener :
>>
>> On Thu, Jul 7, 2016 at 4:04 PM, Yuri Rumyantsev 
>> wrote:
>>>
>>> I Added this check because of new failures in libgomp.fortran suite.
>>> Here is copy of Jakub message:
>>> --- Comment #29 from Jakub Jelinek  ---
>>> The #c27 r237844 change looks bogus to me.
>>> First of all, IMNSHO you can argue this way only if ref is a reference
>>> seen in
>>> loop LOOP,
>>
>>
>> or inner loops of LOOP I guess.  I _think_ we never call
>> ref_indep_loop_p_1 with
>> a REF whose loop is not a sub-loop of LOOP or LOOP itself (as it would
>> not make
>> sense to do that, it would be a waste of time).
>>
>> So only if "or inner loops of LOOP" is not correct the check would be
>> needed
>> but then my issue with unrolling an inner loop and turning a ref that
>> safelen
>> does not apply to into a ref that it now applies to arises.
>>
>> I don't fully get what Jakub is hinting at.
>>
>> Can you install the safelen > 0 -> safelen > 1 fix please?  Jakub, can
>> you
>> explain that bitmap check with a simple testcase?
>>
>> Thanks,
>> Richard.
>>
>>> which is the case of e.g. *.omp_data_i_23(D).a ref in simd3.f90 -O2
>>> -fopenmp -msse2, but not the D.3815[0] case tested during can_sm_ref_p
>>> - the
>>> D.3815[0] = 0; as well as something = D.3815[0]; stmt found in the
>>> outer loop
>>> obviously can be dependent on many of the loads and/or stores in the
>>> loop, be
>>> it "omp simd array" or not.
>>> Say for
>>> void
>>> foo (int *p, int *q)
>>> {
>>>#pragma omp simd
>>>for (int i = 0; i < 1024; i++)
>>>  p[i] += q[0];
>>> }
>>> sure, q[0] can't alias p[0] ... p[1022], the earlier iterations could
>>> write
>>> something that changes its value, and then it would behave differently
>>> from
>>> using VF = 1024, where everything is performed in parallel.
>>> Though, actually, it can alias, just it would have to write the same
>>> value as
>>> was there.  So, if this is used to determine if it is safe to hoist
>>> the load
>>> before the loop, it is fine, if it is used to determine if &q[0] >=
>>> &p[0] &&
>>> &q[0] <= &p[1023], then it is n

Re: [PATCH 2/2][v4] Drop excess size used for run time allocated stack variables.

2016-07-26 Thread Dominik Vogt
Finally a patch that works and is simple.  Bootstrapped and
regression tested on s390, s390x biarch and x86_64.  The new patch
exploits the known alignment of (stack pointer +
STACK_DYNAMIC_OFFSET) as described earlier (see below).  I think
that is the right way to get rid of the extra allocation.  It
took a long time to understand the problem.

As the patch triggers a bug in the fortran compiler, the
der_type.f90 test case may fail on some targets if this patch is
used without the fortran fix that I've posted in another thread.

(The patch also contains a fix for a typo in a comment in the
patched function.)

See ChangeLog for a full description of the new patch.

Since the patch is all new, we're not going to commit it without a
new OK.

> Actually I was goind to abandon the patch in its current state.
> :-)  We talked about it internally and concluded that the problem
> is really this:
> 
>  * get_dynamic_stack_size is passed a SIZE of a data block (which
>is allocated elsewhere), the SIZE_ALIGN of the SIZE (i.e. the
>alignment of the underlying memory units (e.g. 32 bytes split
>into 4 times 8 bytes = 64 bit alignment) and the
>REQUIRED_ALIGN of the data portion of the allocated memory.
>  * Assuming the function is called with SIZE = 2, SIZE_ALIGN = 8
>and REQUIRED_ALIGN = 64 it first adds 7 bytes to SIZE -> 9.
>This is what is needed to have two bytes 8-byte-aligned at some
>memory location without any known alignment.
>  * Finally round_push is called to round up SIZE to a multiple of
>the stack slot size.
> 
> The key to understanding this is that the function assumes that
> STACK_DYNMAIC_OFFSET is completely unknown at the time its called
> and therefore it does not make assumptions about the alignment of
> STACKPOINTER + STACK_DYNMAIC_OFFSET.  The latest patch simply
> hard-codes that SP + SDO is supposed to be aligned to at least
> stack slot size (and does that in a very complicated way).  Since
> there is no guarantee that this is the case on all targets, the
> patch is broken.  It may miscalculate a SIZE that is too small in
> some cases.
> 
> However, on many targets there is some guarantee about the
> alignment of SP + SDO even if the actual value of SDO is unknown.
> On s390x it's always 8-byte-aligned (stack slot size).  So the
> right fix should be to add knowledge about the target's guaranteed
> alignment of SP + SDO to the function.  I'm right now testing a
> much simpler patch that uses
> REGNO_POINTER_ALIGN(VIRTUAL_STACK_DYNAMIC_REGNUM) as the
> alignment.

Ciao

Dominik ^_^  ^_^

-- 

Dominik Vogt
IBM Germany
gcc/ChangeLog

* explow.c (get_dynamic_stack_size): Take known alignment of stack
pointer + STACK_DYNAMIC_OFFSET into account when calculating the size
needed.
Correct a typo in a comment.
>From 7d7a6b7fdb189759eb11b96c93ee4adfa8608a97 Mon Sep 17 00:00:00 2001
From: Dominik Vogt 
Date: Fri, 29 Apr 2016 08:36:59 +0100
Subject: [PATCH] Reduce size allocated for run time allocated stack
 variables.

The present calculation sometimes led to more stack memory being used than
necessary with alloca.  First, (STACK_BOUNDARY -1) would be added to the
allocated size:

  size = plus_constant (Pmode, size, extra);
  size = force_operand (size, NULL_RTX);

Then round_push was called and added another (STACK_BOUNDARY - 1) before
rounding down to a multiple of STACK_BOUNDARY.  On s390x this resulted in
adding 14 before rounding down for "x" in the test case pr36728-1.c.

The problem was that get_dynamic_stack_size did not take into account that the
target might guarantee some alignment of (stack_pointer + STACK_DYNAMIC_OFFSET)
even if the value of STACK_DYNAMIC_OFFSET is not known yet.
---
 gcc/explow.c | 14 ++
 1 file changed, 10 insertions(+), 4 deletions(-)

diff --git a/gcc/explow.c b/gcc/explow.c
index a345690..f97a214 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -1224,9 +1224,15 @@ get_dynamic_stack_size (rtx *psize, unsigned size_align,
  example), so we must preventively align the value.  We leave space
  in SIZE for the hole that might result from the alignment operation.  */
 
-  extra = (required_align - BITS_PER_UNIT) / BITS_PER_UNIT;
-  size = plus_constant (Pmode, size, extra);
-  size = force_operand (size, NULL_RTX);
+  unsigned known_align = REGNO_POINTER_ALIGN (VIRTUAL_STACK_DYNAMIC_REGNUM);
+  if (known_align == 0)
+known_align = BITS_PER_UNIT;
+  if (required_align > known_align)
+{
+  extra = (required_align - known_align) / BITS_PER_UNIT;
+  size = plus_constant (Pmode, size, extra);
+  size = force_operand (size, NULL_RTX);
+}
 
   if (flag_stack_usage_info && pstack_usage_size)
 *pstack_usage_size += extra;
@@ -1235,7 +1241,7 @@ get_dynamic_stack_size (rtx *psize, unsigned size_align,
 size_align = BITS_PER_UNIT;
 
   /* Round the size to a multiple of the required stack alignment.
- Since the stack if presumed to be rounded before this allocation,

Re: [Patch, testuite, committed] Fix some more tests that fail for non 32-bit int targets

2016-07-26 Thread Mike Stump
On Jul 26, 2016, at 1:08 AM, Senthil Kumar Selvaraj 
 wrote:
> Is the below patch ok?

Ok.  Thanks.  Such changes are trivial, usual and customary.

Re: [PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Segher Boessenkool
On Tue, Jul 26, 2016 at 11:23:38AM -0400, Michael Meissner wrote:
> When I originally developed the IEEE 128-bit floating point support, the
> emulation routines in libgcc did not raise errors on signalling NaNs.  In the
> course of adding full support for IEEE 128-bit floating point, we now have
> added exception signaling support in the library.  This means the C99/IEEE
> built-in functions isgreater, isgreaterequal, isless, islessequal, and
> islessgreater now will raise an error when you compare a signaling NaN.  These
> functions are mandated not to raise an error.
> 
> These patches add calls to __unordkf3 to validate that both arguments are
> ordered before calling the ge/le/eq comparison function.  I have done
> bootstraps and make check on both little endian Power8 and big endian Power7
> (both 32-bit and 64-bit tests done on the Power7 box) with no regressions.  
> Are
> these patches ok for the trunk?

They are okay for trunk (some testcase stuff below).

For the 6 branch, I'd rather see a little bit of testing on trunk first.
We still have some time (I hope).


> --- gcc/testsuite/gcc.target/powerpc/float128-cmp.c   (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/float128-cmp.c   (revision 0)
> @@ -0,0 +1,17 @@
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target powerpc_float128_sw_ok } */
> +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
> "-mcpu=power7" } } */
> +/* { dg-options "-O2 -mcpu=power7 -mfloat128" } */
> +
> +#ifndef __FLOAT128__
> +#error "-mfloat128 is not supported."
> +#endif

This should never trigger, it is only there for debug, right?

> +int
> +test_isgreater (__float128 a, __float128 b)
> +{
> +  return __builtin_isgreater (a, b);
> +}
> +
> +/* { dg-final { scan-assembler "bl __\[gl\]ekf2"} } */
> +/* { dg-final { scan-assembler "bl __unordkf2" } } */

There are some extra spaces in that [gl]ekf2 line.

Could you test all five functions please?  Use multiple testcases, maybe.

Thanks,


Segher


[PATCH 2/3] Use class substring_loc in c-format.c (PR c/52952)

2016-07-26 Thread David Malcolm
This patch updates c-format.c to use the new class substring_loc, added
in the previous patch, replacing location_column_from_byte_offset.
Hence with this patch, Wformat can underline the precise erroneous
format string in many more cases.

The patch also introduces two new functions for emitting Wformat
warnings: format_warning_at_substring and format_warning_at_char,
providing an inform in the face of macros where the pertinent part of
the format string may be separate from the function call.

Successfully bootstrapped®rtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

Successful selftest run for stage 1 on powerpc-ibm-aix7.1.3.0 (gcc111)
in conjunction with the rest of the patch kit.

config-list.mk test run is in progress.

OK for trunk if it passes testing? (on top of patch 1)

gcc/c-family/ChangeLog:
PR c/52952
* c-format.c: Include "diagnostic.h".
(location_column_from_byte_offset): Delete.
(location_from_offset): Delete.
(format_warning_va): New function.
(format_warning_at_substring): New function.
(format_warning_at_char): New function.
(check_format_arg): Capture location of format_tree and pass to
check_format_info_main.
(check_format_info_main): Add params FMT_PARAM_LOC and
FORMAT_STRING_CST.  Convert calls to warning_at to calls to
format_warning_at_char.  Pass a substring_loc instance to
check_format_types.
(check_format_types): Convert first param from a location_t
to a const substring_loc & and rename to "fmt_loc".  Attempt
to extract the range of the relevant parameter and pass it
to format_type_warning.
(format_type_warning): Convert first param from a location_t
to a const substring_loc & and rename to "fmt_loc".  Add
params "param_range" and "type".  Replace calls to warning_at
with calls to format_warning_at_substring.

gcc/testsuite/ChangeLog:
PR c/52952
* gcc.dg/cpp/pr66415-1.c: Likewise.
* gcc.dg/format/asm_fprintf-1.c: Update column numbers.
* gcc.dg/format/c90-printf-1.c: Likewise.
* gcc.dg/format/diagnostic-ranges.c: New test case.
---
 gcc/c-family/c-format.c | 476 +++-
 gcc/testsuite/gcc.dg/cpp/pr66415-1.c|   8 +-
 gcc/testsuite/gcc.dg/format/asm_fprintf-1.c |   6 +-
 gcc/testsuite/gcc.dg/format/c90-printf-1.c  |  14 +-
 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c | 222 +++
 5 files changed, 544 insertions(+), 182 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c

diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index c19c411..5b79588 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -29,6 +29,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "intl.h"
 #include "langhooks.h"
 #include "c-format.h"
+#include "diagnostic.h"
 
 /* Handle attributes associated with format checking.  */
 
@@ -65,78 +66,169 @@ static int first_target_format_type;
 static const char *format_name (int format_num);
 static int format_flags (int format_num);
 
-/* Given a string S of length LINE_WIDTH, find the visual column
-   corresponding to OFFSET bytes.   */
+/* Emit a warning governed by option OPT, using GMSGID as the format
+   string and AP as its arguments.
 
-static unsigned int
-location_column_from_byte_offset (const char *s, int line_width,
- unsigned int offset)
-{
-  const char * c = s;
-  if (*c != '"')
-return 0;
+   Attempt to obtain precise location information within a string
+   literal from FMT_LOC.
+
+   Case 1: if substring location is available, and is within the range of
+   the format string itself, the primary location of the
+   diagnostic is the substring range obtained from FMT_LOC, with the
+   caret at the *end* of the substring range.
+
+   For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf ("hello %i", msg);
+~^
+
+   Case 2: if the substring location is available, but is not within
+   the range of the format string, the primary location is that of the
+   format string, and an note is emitted showing the substring location.
+
+   For example:
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf("hello " INT_FMT " world", msg);
+^
+ test.c:19: note: format string is defined here
+ #define INT_FMT "%i"
+  ~^
+
+   Case 3: if precise substring information is unavailable, the primary
+   location is that of the whole string passed to FMT_LOC's constructor.
+   For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf(fmt, msg);
+^~~
+
+   For each of cases 1-3, if param_range is non-NULL, then it is used
+   as a secondary range within the warning.  For exa

[PATCH 1/3] (v2) On-demand locations within string-literals

2016-07-26 Thread David Malcolm
This is an updated version of:
  "[PATCH] RFC: On-demand locations within string-literals"
https://gcc.gnu.org/ml/gcc-patches/2016-07/msg00441.html

Changes in v2:
- Tweaks to substring location selftests
- Many more selftests (EBCDIC, the various wide string types, etc)
- Clean up conditions in charset.c; require source == execution charset
  to have substring locations
- Make string_concat_db field private
- Return error messages rather than bool
- Fix source_range for charset.c:convert_escape
- Introduce class substring_loc
- Handle bad input locations more gracefully
- Ensure that we can read substring information for a token which
  starts in one linemap and ends in another (seen in
  gcc.dg/cpp/pr69985.c)

This patch implements precise tracking of source locations for the
individual chars within string literals, so that we can e.g. underline
specific ranges in -Wformat diagnostics.  It handles macros,
concatenated tokens, escaped characters etc.

The idea is to replace the limited implementation of this we currently
have in c-format.c (see r223470 [1]).  Doing so happens in patch 2 of
the kit; this patch just provides the infrastructure to do so.

As before the patch implements a new mode within libcpp's string literal
lexer.  It's disabled during the regular lexer, but it's available
through a low-level interface in input.{c|h} which can rerun the libcpp
code and capture the per-char source_ranges for when we need to issue a
diagnostic.  It also now adds a higher-level interface in c-common.h:
class substring_loc.

As before, to handle concatentation the patch adds some extra data
storage: every time a string concatenation happens in c-lex.c: it stores
the locations of the component tokens in a hash_map, keyed by the
spelling location of the start first token (see class string_concat_db
in input.h).

Hence it's only storing extra data for string concatenations,
not for simple string literals.

As before, this doesn't support the C++ frontend yet, but it doesn't
regress the status quo for c-format.c from C++.  I have a patch for
the C++ FE that records string concatenation information to the lexer,
but given that it's not used yet, I didn't add that in this patch, as
the data would be redundant.

This version of the patch properly handles encodings (and adds a
lot of test coverage for this to input.c).  It makes the simplifying
restriction that precise source location information is only available
if source charset == execution charset, as discussed on this list,
failing gracefully when this isn't the case.

I believe I can self-approve the changes to input.c, input.h, libcpp,
and the testsuite; the remaining changes needing approval are those
to c-family, to gcc.c, and to selftest.h.

Successfully bootstrapped®rtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

Successful selftest run for stage 1 on powerpc-ibm-aix7.1.3.0 (gcc111)
in conjunction with the rest of the patch kit.

config-list.mk test run is in progress.

OK for trunk if it passes testing? (by itself)

[1]  
https://gcc.gnu.org/git/?p=gcc.git;a=commitdiff;h=d5a2ddc76a109258297ff345957c35cb50116c94#patch2

gcc/c-family/ChangeLog:
* c-common.c (get_cpp_ttype_from_string_type): New function.
(g_string_concat_db): New global.
(substring_loc::get_range): New method.
* c-common.h (g_string_concat_db): New declaration.
(class substring_loc): New class.
* c-lex.c (lex_string): When concatenating strings, capture the
locations of all tokens using a new obstack, and record the
concatenation locations within g_string_concat_db.
* c-opts.c (c_common_init_options): Construct g_string_concat_db
on the ggc-heap.

gcc/ChangeLog:
* gcc.c (cpp_options): Rename string to...
(cpp_options_): ...this, to avoid clashing with struct in
cpplib.h.
(static_specs): Update initialize for above renaming
* input.c (string_concat::string_concat): New constructor.
(string_concat_db::string_concat_db): New constructor.
(string_concat_db::record_string_concatenation): New method.
(string_concat_db::get_string_concatenation): New method.
(string_concat_db::get_key_loc): New method.
(class auto_cpp_string_vec): New class.
(get_substring_ranges_for_loc): New function.
(get_source_range_for_substring): New function.
(get_num_source_ranges_for_substring): New function.
(class selftest::lexer_test_options): New class.
(struct selftest::lexer_test): New struct.
(class selftest::ebcdic_execution_charset): New class.
(selftest::ebcdic_execution_charset::s_singleton): New variable.
(selftest::lexer_test::lexer_test): New constructor.
(selftest::lexer_test::~lexer_test): New destructor.
(selftest::lexer_test::get_token): New method.
(selftest::assert_char_at_range): New function.
(ASSERT_CHAR_AT_RANG

[PATCH 3/3] c-format.c: suggest the correct format string to use (PR c/64955)

2016-07-26 Thread David Malcolm
This adds fix-it hints to c-format.c so that it can (sometimes) suggest
the format string the user should have used.

The patch adds selftests for the new code in c-format.c.  These
selftests are thus lang-specific.  This is the first time we've had
lang-specific selftests, and hence the patch also adds a langhook for
running them.  (Note that currently the Makefile only invokes the
selftests for cc1).

Successfully bootstrapped®rtested in conjunction with the rest of the
patch kit on x86_64-pc-linux-gnu.

Successful selftest run for stage 1 on powerpc-ibm-aix7.1.3.0 (gcc111)
in conjunction with the rest of the patch kit.

config-list.mk test run is in progress.

OK for trunk if it passes testing?

gcc/c-family/ChangeLog:
PR c/64955
* c-common.h (selftest::c_format_c_tests): New declaration.
(selftest::run_c_tests): New declaration.
* c-format.c: Include "selftest.h.
(format_warning_va): Add param "corrected_substring" and use
it to add a replacement fix-it hint.
(format_warning_at_substring): Likewise.
(format_warning_at_char): Update for new param of
format_warning_va.
(check_format_info_main): Pass "fki" to check_format_types.
(check_format_types): Add param "fki" and pass it to
format_type_warning.
(deref_n_times): New function.
(get_modifier_for_format_len): New function.
(selftest::test_get_modifier_for_format_len): New function.
(get_format_for_type): New function.
(format_type_warning): Add param "fki" and use it to attempt
to provide hints for argument types when calling
format_warning_at_substring.
(selftest::get_info): New function.
(selftest::assert_format_for_type_streq): New function.
(ASSERT_FORMAT_FOR_TYPE_STREQ): New macro.
(selftest::test_get_format_for_type_printf): New function.
(selftest::test_get_format_for_type_scanf): New function.
(selftest::c_format_c_tests): New function.

gcc/c/ChangeLog:
PR c/64955
* c-lang.c (LANG_HOOKS_RUN_LANG_SELFTESTS): If CHECKING_P, wire
this up to selftest::run_c_tests.
(selftest::run_c_tests): New function.

gcc/ChangeLog:
PR c/64955
* langhooks-def.h (LANG_HOOKS_RUN_LANG_SELFTESTS): New default
do-nothing langhook.
(LANG_HOOKS_INITIALIZER): Add LANG_HOOKS_RUN_LANG_SELFTESTS.
* langhooks.h (struct lang_hooks): Add run_lang_selftests.
* selftest-run-tests.c: Include "tree.h" and "langhooks.h".
(selftest::run_tests): Call lang_hooks.run_lang_selftests.

gcc/testsuite/ChangeLog:
PR c/64955
* gcc.dg/format/diagnostic-ranges.c: Add fix-it hints to expected
output.
---
 gcc/c-family/c-common.h |   7 +
 gcc/c-family/c-format.c | 268 ++--
 gcc/c/c-lang.c  |  22 ++
 gcc/langhooks-def.h |   4 +-
 gcc/langhooks.h |   3 +
 gcc/selftest-run-tests.c|   5 +
 gcc/testsuite/gcc.dg/format/diagnostic-ranges.c |  30 ++-
 7 files changed, 319 insertions(+), 20 deletions(-)

diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
index 7b5da57..61f9ced 100644
--- a/gcc/c-family/c-common.h
+++ b/gcc/c-family/c-common.h
@@ -1533,4 +1533,11 @@ extern bool valid_array_size_p (location_t, tree, tree);
 extern bool cilk_ignorable_spawn_rhs_op (tree);
 extern bool cilk_recognize_spawn (tree, tree *);
 
+#if CHECKING_P
+namespace selftest {
+  extern void c_format_c_tests (void);
+  extern void run_c_tests (void);
+} // namespace selftest
+#endif /* #if CHECKING_P */
+
 #endif /* ! GCC_C_COMMON_H */
diff --git a/gcc/c-family/c-format.c b/gcc/c-family/c-format.c
index 5b79588..f5a4011 100644
--- a/gcc/c-family/c-format.c
+++ b/gcc/c-family/c-format.c
@@ -30,6 +30,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "langhooks.h"
 #include "c-format.h"
 #include "diagnostic.h"
+#include "selftest.h"
 
 /* Handle attributes associated with format checking.  */
 
@@ -126,11 +127,21 @@ static int format_flags (int format_num);
  printf(fmt, msg);
 ^~~  ~~~
 
+   If CORRECTED_SUBSTRING is non-NULL, use it for cases 1 and 2 to provide
+   a fix-it hint, suggesting that it should replace the text within the
+   substring range.  For example:
+
+ test.c:90:10: warning: problem with '%i' here [-Wformat=]
+ printf ("hello %i", msg);
+~^
+%s
+
Return true if a warning was emitted, false otherwise.  */
 
-ATTRIBUTE_GCC_DIAG (4,0)
+ATTRIBUTE_GCC_DIAG (5,0)
 static bool
 format_warning_va (const substring_loc &fmt_loc, source_range *param_range,
+  const char *corrected_substring,
   int opt, const char *gmsgid, va_list *ap)
 {
   bool substring_within_range = false;
@@ -174,6 +185,9 @@ forma

Re: [PATCH] gfortran: Fix allocation of diagnostig string (was too small).

2016-07-26 Thread Janne Blomqvist
On Tue, Jul 26, 2016 at 6:42 PM, Dominik Vogt  wrote:
> The attached patch fixes an out of bound write to memory allocated
> with alloca() on the stack.  This rarely ever happened because on
> one hand -fbounds-check needs to be enabled, and on the other hand
> alloca() used to allocate a few bytes extra most of the time so
> most of the time the excess write did no harm.

Ok for trunk, thanks.


-- 
Janne Blomqvist


Re: [PATCH] Replacing gcc's dependence on libiberty's fnmatch to gnulib's fnmatch

2016-07-26 Thread Manuel López-Ibáñez
On 26 July 2016 at 14:51, ayush goel  wrote:
> On 26 July 2016 at 3:38:59 AM, Manuel López-Ibáñez
> (lopeziba...@gmail.com) wrote:
>> Why the change from "fnmatch.h" to ?
>
> Gnulib doesn’t contain a header for fnmatch. It itself relies on
> glib’c fnmatch.h

I see two modules here:

https://www.gnu.org/software/gnulib/MODULES.html#module=fnmatch
https://www.gnu.org/software/gnulib/MODULES.html#module=fnmatch-gnu

Both of them generate a header file if the one from glibc is not
present or it is broken:

http://git.savannah.gnu.org/gitweb/?p=gnulib.git;a=blob_plain;f=modules/fnmatch

Does GCC use the GNU extensions of fnmatch? If not, you probably only
need fnmatch; otherwise you may need to also import fnmatch-gnu.

>> Also, are the files in gnulib and libiberty semantically identical?
>> The wiki page does not say anything about this. How did you check
>> this?
>
> Well the online docs for libiberty and gnulib claim the same
> definition for fnmatch. Apart from this I’ve manually gone through the
> source code and they seem to be semantically similar.

Ah, good! You should mention this in future submissions (and in the wiki page).

> Also the fact that the system builds fine and the tests also execute
> fine could serve as a manifestation for the fact that they are
> semantically same.

This is a necessary condition but not a sufficient condition.
Unfortunately, comparing the two functions in detail is still
necessary even if the tests succeed.

>> GCC can run on other systems besides OSX and GNU/Linux, how can you
>> test that your change does not break anything on those systems?
>>
> Well I have access to these two systems only. How would you suggest I
> test my patches on all possible systems?

Well, you could find fnmatch.h in your system and remove/rename it
temporarily (also remove the libiberty versions under include/), then
try to bootstrap and see what happens. If the system fnmatch.h is not
used, the build system of gnulib should create a funmatch.h (check if
it is created, otherwise GCC found a different fnmatch.h).

Cheers,

Manuel.


[Patch GCC]Support constraint flags in loop structure.

2016-07-26 Thread Bin Cheng
Hi,
This patch adds support for constraint flags in loop structure.  Different to 
existing boolean flags which are set by niter analyzer, constraint flag is 
mainly set by consumers and affects certain semantics of niter analyzer APIs.  
Currently niter APIs affected are number_of_iterations_exit* and their callers. 
 Constraint flags are added to support handling of possible infinite loops in 
GCC.  One typical usecase of constraint is in vectorizer, as described in 
patch's comment:

  /* ...
   1) Compute niter->assumptions by calling niter analyzer API and
  record it as possible condition for loop versioning.
   2) Clear buffered result of niter/scev analyzer.
   3) Set constraint LOOP_C_FINITE assuming the loop is finite.
   4) Analyze data references.  Since data reference analysis depends
  on niter/scev analyzer, the point is that niter/scev analysis
  is done under circumstance of LOOP_C_FINITE constraint.
   5) Version the loop with assumptions computed in step 1).
   6) Vectorize the versioned loop in which assumptions is guarded to
  be true.
   7) Update constraints in versioned loops so that niter analyzer
  in following passes can use it.

 Note consumers are usually the loop optimizers and it is consumers'
 responsibility to set/clear constraints correctly.  Failing to do
 that might result in hard to track bugs in niter/scev analyzer.  */

Next patch will use constraint to vectorize possible infinite loop by 
versioning, I would also expect possible infinite loops (loops with 
assumptions) can be handled by more optimizers.  This patch itself doesn't 
change GCC behavior, bootstrap and test on x86_64.  Any comments?

Thanks,
bin

2016-07-25  Bin Cheng  

* cfgloop.h (struct loop): New field constraints.
(LOOP_C_INFINITE, LOOP_C_FINITE): New macros.
(loop_constraint_set, loop_constraint_clr, loop_constraint_set_p): New
functions.
* cfgloop.c (alloc_loop): Initialize new field.
* tree-ssa-loop-niter.c (number_of_iterations_exit_assumptions):
Adjust niter analysis wrto loop constraints.diff --git a/gcc/cfgloop.c b/gcc/cfgloop.c
index 2087b90..b5c920c 100644
--- a/gcc/cfgloop.c
+++ b/gcc/cfgloop.c
@@ -339,6 +339,7 @@ alloc_loop (void)
   loop->exits = ggc_cleared_alloc ();
   loop->exits->next = loop->exits->prev = loop->exits;
   loop->can_be_parallel = false;
+  loop->constraints = 0;
   loop->nb_iterations_upper_bound = 0;
   loop->nb_iterations_likely_upper_bound = 0;
   loop->nb_iterations_estimate = 0;
diff --git a/gcc/cfgloop.h b/gcc/cfgloop.h
index dfc7525..c5936f1 100644
--- a/gcc/cfgloop.h
+++ b/gcc/cfgloop.h
@@ -147,6 +147,30 @@ struct GTY ((chain_next ("%h.next"))) loop {
   /* Auxiliary info specific to a pass.  */
   PTR GTY ((skip (""))) aux;
 
+  /* Different to other boolean flags which are set by niter analyzer,
+ constraint is set by consumers and it affects certain semantics
+ of niter analyzer APIs.  Currently the APIs affected are functions
+ number_of_iterations_exit* and their callers.  One typical usecase
+ of constraints is to vectorize possibly infinite loop:
+
+   1) Compute niter->assumptions by calling niter analyzer API and
+ record it as possible condition for loop versioning.
+   2) Clear buffered result of niter/scev analyzer.
+   3) Set constraint LOOP_C_FINITE assuming the loop is finite.
+   4) Analyze data references.  Since data reference analysis depends
+ on niter/scev analyzer, the point is that niter/scev analysis
+ is done under circumstance of LOOP_C_FINITE constraint.
+   5) Version the loop with assumptions computed in step 1).
+   6) Vectorize the versioned loop in which assumptions is guarded to
+ be true.
+   7) Update constraints in versioned loops so that niter analyzer
+ in following passes can use it.
+
+ Note consumers are usually the loop optimizers and it is consumers'
+ responsibility to set/clear constraints correctly.  Failing to do
+ that might result in hard to track bugs in niter/scev analyzer.  */
+  unsigned constraints;
+
   /* The number of times the latch of the loop is executed.  This can be an
  INTEGER_CST, or a symbolic expression representing the number of
  iterations like "N - 1", or a COND_EXPR containing the runtime
@@ -221,6 +245,29 @@ struct GTY ((chain_next ("%h.next"))) loop {
   basic_block former_header;
 };
 
+/* Set if the loop is known to be infinite.  */
+#define LOOP_C_INFINITE(1 << 0)
+/* Set if the loop is known to be finite without any assumptions.  */
+#define LOOP_C_FINITE  (1 << 1)
+
+static inline void
+loop_constraint_set (struct loop *loop, unsigned c)
+{
+  loop->constraints |= c;
+}
+
+static inline void
+loop_constraint_clr (struct loop *loop, unsigned c)
+{
+  loop->constraints &= ~c;
+}
+
+static inline bool
+loop_constraint_set_p (

[PATCH GCC]Vectorize possible infinite loop by loop versioning.

2016-07-26 Thread Bin Cheng
Hi,
This patch vectorizes possible infinite loops by versioning.  It analyzes loops 
considered for vectorization using loop constraint facility which was 
introduced by previous patch; then vectorizes the loop with respect to 
assumptions of loop niters information; at last, it sets constraint flags for 
versioned loop so that niter analyzer in following optimizers can take 
advantage of it.  The patch also adds two tests.

Bootstrap and test on x86_64.  Any comments?

Thanks,
bin

2016-07-25  Bin Cheng  

PR tree-optimization/57558
* tree-vect-loop-manip.c (vect_create_cond_for_niters_checks): New
function.
(vect_loop_versioning): Support versioning with niter assumptions.
* tree-vect-loop.c (tree-ssa-loop.h): Include header file.
(vect_get_loop_niters): New parameter.  Reimplement to support
assumptions in loop niter info.
(vect_analyze_loop_form_1, vect_analyze_loop_form): Ditto.
(new_loop_vec_info): Init LOOP_VINFO_NITERS_ASSUMPTIONS.
(vect_estimate_min_profitable_iters): Use LOOP_REQUIRES_VERSIONING.
Support loop versioning for niters.
* tree-vectorizer.c (tree-ssa-loop-niter.h): Include header file.
(vect_free_loop_info_assumptions): New function.
(vectorize_loops): Free loop niter info for loops with flag
LOOP_F_ASSUMPTIONS set if vectorization failed.
* tree-vectorizer.h (struct _loop_vec_info): New field
num_iters_assumptions.
(LOOP_VINFO_NITERS_ASSUMPTIONS): New macro.
(LOOP_REQUIRES_VERSIONING_FOR_NITERS): New macro.
(LOOP_REQUIRES_VERSIONING): New macro.
(vect_free_loop_info_assumptions): New decl.

gcc/testsuite/ChangeLog
2016-07-25  Bin Cheng  

PR tree-optimization/57558
* gcc.dg/vect/pr57558-1.c: New test.
* gcc.dg/vect/pr57558-2.c: New test.diff --git a/gcc/tree-vect-loop-manip.c b/gcc/tree-vect-loop-manip.c
index 819abcd..c1381b3 100644
--- a/gcc/tree-vect-loop-manip.c
+++ b/gcc/tree-vect-loop-manip.c
@@ -2082,6 +2082,37 @@ vect_do_peeling_for_alignment (loop_vec_info loop_vinfo, 
tree ni_name,
   free_original_copy_tables ();
 }
 
+/* Function vect_create_cond_for_niters_checks.
+
+   Create a conditional expression that represents the run-time checks for
+   loop's niter.  The loop is guaranteed to to terminate if the run-time
+   checks hold.
+
+   Input:
+   COND_EXPR  - input conditional expression.  New conditions will be chained
+   with logical AND operation.  If it is NULL, then the function
+   is used to return the number of alias checks.
+   LOOP_VINFO - field LOOP_VINFO_MAY_ALIAS_STMTS contains the list of ddrs
+   to be checked.
+
+   Output:
+   COND_EXPR - conditional expression.
+
+   The returned COND_EXPR is the conditional expression to be used in the
+   if statement that controls which version of the loop gets executed at
+   runtime.  */
+
+static void
+vect_create_cond_for_niters_checks (loop_vec_info loop_vinfo, tree *cond_expr)
+{
+  tree part_cond_expr = LOOP_VINFO_NITERS_ASSUMPTIONS (loop_vinfo);
+
+  if (*cond_expr)
+*cond_expr = fold_build2 (TRUTH_AND_EXPR, boolean_type_node,
+ *cond_expr, part_cond_expr);
+  else
+*cond_expr = part_cond_expr;
+}
 
 /* Function vect_create_cond_for_align_checks.
 
@@ -2330,7 +2361,7 @@ void
 vect_loop_versioning (loop_vec_info loop_vinfo,
  unsigned int th, bool check_profitability)
 {
-  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
+  struct loop *loop = LOOP_VINFO_LOOP (loop_vinfo), *nloop;
   struct loop *scalar_loop = LOOP_VINFO_SCALAR_LOOP (loop_vinfo);
   basic_block condition_bb;
   gphi_iterator gsi;
@@ -2347,14 +2378,19 @@ vect_loop_versioning (loop_vec_info loop_vinfo,
   tree scalar_loop_iters = LOOP_VINFO_NITERS (loop_vinfo);
   bool version_align = LOOP_REQUIRES_VERSIONING_FOR_ALIGNMENT (loop_vinfo);
   bool version_alias = LOOP_REQUIRES_VERSIONING_FOR_ALIAS (loop_vinfo);
+  bool version_niter = LOOP_REQUIRES_VERSIONING_FOR_NITERS (loop_vinfo);
 
   if (check_profitability)
-{
-  cond_expr = fold_build2 (GT_EXPR, boolean_type_node, scalar_loop_iters,
-  build_int_cst (TREE_TYPE (scalar_loop_iters), 
th));
-  cond_expr = force_gimple_operand_1 (cond_expr, &cond_expr_stmt_list,
- is_gimple_condexpr, NULL_TREE);
-}
+cond_expr = fold_build2 (GT_EXPR, boolean_type_node, scalar_loop_iters,
+build_int_cst (TREE_TYPE (scalar_loop_iters),
+  th));
+
+  if (version_niter)
+vect_create_cond_for_niters_checks (loop_vinfo, &cond_expr);
+
+  if (cond_expr)
+cond_expr = force_gimple_operand_1 (cond_expr, &cond_expr_stmt_list,
+   is_gimple_condexpr, NULL_TREE);
 
   if (version_align)
 vect_create_cond_for_align_checks (loop_vinfo,

Re: [PATCH] Remove special streaming of builtins

2016-07-26 Thread H.J. Lu
On Mon, Jul 25, 2016 at 4:35 AM, Richard Biener  wrote:
>
> So I needed to fix that builtins appearing in BLOCK_VARs and the solution
> I came up with accidentially disabled streaming via the special path.
> Thus the following patch removes the special-casing completely and makes
> the BLOCK_VARs handling work the same way as for regular externs (by
> streaming a local copy).  We stream each builtin decl once and then
> refer to it via the decl index (which is cheaper than the special
> casing).
>
> I'm not 100% this solves for example the -fno-math-errno inlining
> across TUs (it certainly doesn't if you use attribute optimize with
> -fno-math-errno), but it eventually should by means of having two
> different BUILT_IN_XXX if they have different behavior.  At least
> if all relevant bits are set on the function _type_ rather than
> the decl which I think we still lto-symtab replace with one
> entity during WPA(?)
>
> Well.
>
> LTO bootstrapped and tested on x86_64-unknown-linux-gnu (c,c++,fortran),
> bootstrapped on x86_64-unknown-linux-gnu (all), testing in progress.
>
> I might have not catched all fndecl compares.
>
> Will apply to trunk if testing completes.  As said, maybe followup
> cleanups possible, at least to lto-opts.c / lto-wrapper.
>
> Richard.
>
> 2016-07-25  Richard Biener  
>
> * cgraph.c (cgraph_node::verify_node): Compare against builtin
> by using DECL_BUILT_IN_CLASS and DECL_FUNCTION_CODE.
> * tree-chkp.c (chkp_gimple_call_builtin_p): Likewise.
> * tree-streamer.h (streamer_handle_as_builtin_p): Remove.
> (streamer_get_builtin_tree): Likewise.
> (streamer_write_builtin): Likewise.
> * lto-streamer.h (LTO_builtin_decl): Remove.
> * lto-streamer-in.c (lto_read_tree_1): Remove assert.
> (lto_input_scc): Remove LTO_builtin_decl handling.
> (lto_input_tree_1): Liekwise.
> * lto-streamer-out.c (lto_output_tree_1): Remove special
> handling of builtins.
> (DFS::DFS): Likewise.
> * tree-streamer-in.c (streamer_get_builtin_tree): Remove.
> * tree-streamer-out.c (pack_ts_function_decl_value_fields): Remove
> assert.
> (streamer_write_builtin): Remove.
>
> lto/
> * lto.c (compare_tree_sccs_1): Remove streamer_handle_as_builtin_p 
> uses.
> (unify_scc): Likewise.
> (lto_read_decls): Likewise.
>

This caused:

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72683


-- 
H.J.


Re: [PATCH] Replacing gcc's dependence on libiberty's fnmatch to gnulib's fnmatch

2016-07-26 Thread Prathamesh Kulkarni
On 26 July 2016 at 19:21, ayush goel  wrote:
> On 26 July 2016 at 3:38:59 AM, Manuel López-Ibáñez
> (lopeziba...@gmail.com) wrote:
>> On 25 July 2016 at 18:18, ayush goel wrote:
>> > On top of the previously filed patch for importing gnulib (the link
>> > isn’t available on the archive yet, however this contains some of the
>> > information: 
>> > http://gcc.1065356.n5.nabble.com/Importing-gnulib-into-the-gcc-tree-td1275807.html#a1279573)
>> > now I have replaced another function from libiberty with the
>> > corresponding version from gnulib.
>> > Even though in both OSX and GNU/Linux, fnmatch is provided by the GNU
>> > libc already, so the copy in libiberty is not used in your systems.
>> > However since the objective is to replace whatever functions can be
>> > leveraged by gnulib, these changes have been made.
>>
>> Why the change from "fnmatch.h" to ?
>
> Gnulib doesn’t contain a header for fnmatch. It itself relies on
> glib’c fnmatch.h
>
>>
>> Also, are the files in gnulib and libiberty semantically identical?
>> The wiki page does not say anything about this. How did you check
>> this?
>
> Well the online docs for libiberty and gnulib claim the same
> definition for fnmatch. Apart from this I’ve manually gone through the
> source code and they seem to be semantically similar.
> Also the fact that the system builds fine and the tests also execute
> fine could serve as a manifestation for the fact that they are
> semantically same.
>>
>> GCC can run on other systems besides OSX and GNU/Linux, how can you
>> test that your change does not break anything on those systems?
>>
> Well I have access to these two systems only. How would you suggest I
> test my patches on all possible systems?
Maybe building contrib/config-list.mk could help ? AFAIK, it will cross build
the gcc tree for different targets, but not run the testsuite, however
I could be wrong.

Thanks,
Prathamesh
>
>> Cheers,
>>
>> Manuel.
>>
>
> -Ayush


[PATCH] Fix wrong code on aarch64 due to paradoxical subreg

2016-07-26 Thread Bernd Edlinger
Hi!

As described in PR 71779 and PR 70903 we have a wrong code issue on aarch64
which is triggered by a paradoxical subreg that can be created in
store_bit_field_using_insv when the target has an EP_insv bitfield instruction.

The attached patch changes this insn...

(insn 1047 1046 1048 (set (reg:DI 481)
(subreg:DI (reg/f:SI 479) 0)) y.c:12702 -1
 (nil))

... into a more simple insn:

(insn 1047 1046 1048 (set (reg:DI 479)
(zero_extend:DI (reg:SI 480))) isl_input.c:2496 -1
 (nil))

which manages to fix two bugs at the same time.

The patch does not include a test case, as I was looking at the PR 71779
just by curiosity, and I have only a theoretical interest in aarch64. However,
it should be easy to extract one from PR 70903 at least, but I can't do that by
myself.

Therefore I would like to leave the test case(s) to Cristina Tamar from ARM,
and/or Andreas Schwab, who have also helped out with bootstrapping and
reg-testing on aarch64 and aarch64-ilp32.

Bootstrap and reg-test on x86_64-linux-gnu with no regressions.
Successfully bootstrapped on aarch64_ilp32 and the ISL testsuite passes.
Also successfully bootstrapped on an aarch64 juno board and no regressions.


Is it OK for trunk?


Thanks
Bernd.2016-07-25  Bernd Edlinger  

	PR rtl-optimization/71779
	PR rtl-optimization/70903
	* expmed.c (store_bit_field_using_insv): Don't generate a paradoxical
	subreg.

Index: gcc/calls.c
Index: expmed.c
===
--- expmed.c	(revision 238694)
+++ expmed.c	(working copy)
@@ -664,14 +664,7 @@ store_bit_field_using_insv (const extraction_insn
 	 if we must narrow it, be sure we do it correctly.  */
 
 	  if (GET_MODE_SIZE (GET_MODE (value)) < GET_MODE_SIZE (op_mode))
-	{
-	  tmp = simplify_subreg (op_mode, value1, GET_MODE (value), 0);
-	  if (! tmp)
-		tmp = simplify_gen_subreg (op_mode,
-	   force_reg (GET_MODE (value),
-		  value1),
-	   GET_MODE (value), 0);
-	}
+	tmp = convert_to_mode (op_mode, value1, 1);
 	  else
 	{
 	  tmp = gen_lowpart_if_possible (op_mode, value1);




Re: [PATCH 1/3] (v2) On-demand locations within string-literals

2016-07-26 Thread Manuel López-Ibáñez

On 26/07/16 18:11, David Malcolm wrote:


gcc/ChangeLog:
* gcc.c (cpp_options): Rename string to...
(cpp_options_): ...this, to avoid clashing with struct in
cpplib.h.


It seems to me that you need this because  now gcc.c includes cpplib.h via 
input.h, which seems wrong.


input.h was FE-independent (it depends on line-map.h but it is an accident of 
history that line-map.h is in libcpp since it doesn't depend on anything from 
libcpp [*]). Note that input.h is included in coretypes.h, so this means that 
now cpplib.h is included almost everywhere! [**]


There is the following in coretypes.h:

/* Provide forward struct declaration so that we don't have to include
   all of cpplib.h whenever a random prototype includes a pointer.
   Note that the cpp_reader and cpp_token typedefs remain part of
   cpplib.h.  */

struct cpp_reader;
struct cpp_token;

precisely to avoid including cpplib.h.


If I understand correctly, cpplib.h is needed in input.h because of this 
declaration:


+extern const char *get_source_range_for_substring (cpp_reader *pfile,
+  string_concat_db *concats,
+  location_t strloc,
+  enum cpp_ttype type,
+  int start_idx, int end_idx,
+  source_range *out_range);


Does this really need to be in input.h ?  It seems something that only C-family 
languages will be able to use. Note that you need a reader to use this 
function, and for that, you need to already include cpplib.h.


Perhaps it could live for now in c-format.c, since it is the only place using 
it?

Cheers,

Manuel.

[*] In an ideal world, we would have a language-agnostic diagnostics library 
that would include line-map and that would be used by libcpp and the rest of 
GCC, so that we can remove all the error-routines in libcpp and the awkward 
glue code that ties it into diagnostics.c.


[**] And it seems that we are slowly undoing all the work that was done by 
Andrew MacLeod to clean up the .h web and remove dependencies 
(https://gcc.gnu.org/wiki/rearch).





Re: [PATCH] accept flexible arrays in struct in unions (c++/71912 - [6/7 regression])

2016-07-26 Thread Jason Merrill

On 07/23/2016 01:18 PM, Martin Sebor wrote:

+  /* A pair of the first non-static non-empty data members following
+ either the flexible array member, if found, or the zero-length
+ array member otherwise.  AFTER[1] refers to the first such data
+ member of a union that the struct containing the flexible array
+ member or zero-length array is a member, or NULL when no such
+ union exists.  AFTER[0] refers to the first such data member
+ that is not a member of such a union.  */


This is pretty hard to follow, could you add an example?  Why do we want 
to track these separately rather than look at DECL_CONTEXT at diagnostic 
time?



+  /* The type in which an anonymous struct or union containing ARRAY
+ is defined or null if no such anonymous struct or union exists.  */
+  tree anonctx;


It seems clearer to find this at diagnostic time by following TYPE_CONTEXT.


  /* Find the next non-static data member if it exists.  */
  for (next = fld;
   (next = DECL_CHAIN (next))
 && TREE_CODE (next) != FIELD_DECL; );


I think next_initializable_field would be useful here.


   if (TREE_CODE (fld) != TYPE_DECL
  && RECORD_OR_UNION_TYPE_P (fldtype)
- && TYPE_ANONYMOUS_P (fldtype))
+ && VAR_DECL != TREE_CODE (fld)
+ && (FIELD_DECL != TREE_CODE (fld) || !DECL_FIELD_IS_BASE (fld)))


Please put the constant on the RHS of comparisons.

It seems that you're only interested in FIELD_DECL here, so maybe move 
up the



  /* Skip anything that's not a (non-static) data member.  */
  if (TREE_CODE (fld) != FIELD_DECL)
continue;


and remove the checks for TYPE_DECL and VAR_DECL?  For that matter, you 
could also move up the DECL_ARTIFICIAL check so you don't need to check 
DECL_FIELD_IS_BASE.



+ /* Is the member an anonymous struct or union?  */
+ bool anon_p = (!TYPE_ANONYMOUS_P (fldtype)
+|| !DECL_NAME (fld)
+|| anon_aggrname_p (DECL_NAME (fld)));


This looks like anon_p will be true for any non-static data member of 
non-anonymous type?  Maybe you want ANON_AGGR_TYPE_P?



+ /* If this member isn't anonymous and a prior non-flexible array
+member has been seen in one of the enclosing structs, clear
+the FIRST member since it doesn't contribute to the flexible
+array struct's members.  */
+ if (first && !array && !anon_p)
+   fmem->first = NULL_TREE;


Why is this needed?  For a non-anonymous member, presumably we'll give 
an error when analyzing the type of the member on its own, so we 
shouldn't need to complain again here.



  /* Replace the zero-length array if it's been stored and
 reset the after pointer.  */
  if (TYPE_DOMAIN (TREE_TYPE (fmem->array)))
{
  fmem->after[bool (pun)] = NULL_TREE;
  fmem->array = fld;
}


So we silently allow a zero-length array followed by a flexible array?  Why?


+ msg = G_("zero-size array member %qD in an otherwise empty %q#T");
+
+   if (msg && pedwarn (DECL_SOURCE_LOCATION (fmem->array),
+   OPT_Wpedantic, msg, fmem->array, fmemctx))
+ {
+   inform (location_of (t), "in the definition of %q#T", fmemctx);
+
+   /* Prevent the same flexible array member from being diagnosed
+  more than once if it happens to be nested in more than one
+  union and overlap with another member.  This avoids multiple
+  warnings for perverse cases like the the following where
+  U::U1::X::a1 would otherwise be diagnosed first followed by
+  S::U1::X::a1:
+struct S {
+  union U {
+union U1 { struct X { int n1, a1[]; } x1; } u1;
+union U2 { struct X { int n2, a2[]; } x1; } u2;
+  } u;
+} s;
+   */
+   TREE_NO_WARNING (fmem->array) = 1;
+ }


There doesn't seem to be anything that checks TREE_NO_WARNING for the 
zero-size array case.



+   /* Avoid issuing further duiagnostics after the error above.  */


Typo.


  if (fmem == &flexmems
  && !TYPE_ANONYMOUS_P (t) && !anon_aggrname_p (TYPE_IDENTIFIER (t)))


I think you want ANON_AGGR_TYPE_P here, too.

I also wonder about integrating this with layout_class_type, since that 
is also looking at layout issues.


Jason



Re: [PATCH] Remove special streaming of builtins

2016-07-26 Thread Richard Biener
On July 26, 2016 7:26:46 PM GMT+02:00, "H.J. Lu"  wrote:
>On Mon, Jul 25, 2016 at 4:35 AM, Richard Biener 
>wrote:
>>
>> So I needed to fix that builtins appearing in BLOCK_VARs and the
>solution
>> I came up with accidentially disabled streaming via the special path.
>> Thus the following patch removes the special-casing completely and
>makes
>> the BLOCK_VARs handling work the same way as for regular externs (by
>> streaming a local copy).  We stream each builtin decl once and then
>> refer to it via the decl index (which is cheaper than the special
>> casing).
>>
>> I'm not 100% this solves for example the -fno-math-errno inlining
>> across TUs (it certainly doesn't if you use attribute optimize with
>> -fno-math-errno), but it eventually should by means of having two
>> different BUILT_IN_XXX if they have different behavior.  At least
>> if all relevant bits are set on the function _type_ rather than
>> the decl which I think we still lto-symtab replace with one
>> entity during WPA(?)
>>
>> Well.
>>
>> LTO bootstrapped and tested on x86_64-unknown-linux-gnu
>(c,c++,fortran),
>> bootstrapped on x86_64-unknown-linux-gnu (all), testing in progress.
>>
>> I might have not catched all fndecl compares.
>>
>> Will apply to trunk if testing completes.  As said, maybe followup
>> cleanups possible, at least to lto-opts.c / lto-wrapper.
>>
>> Richard.
>>
>> 2016-07-25  Richard Biener  
>>
>> * cgraph.c (cgraph_node::verify_node): Compare against
>builtin
>> by using DECL_BUILT_IN_CLASS and DECL_FUNCTION_CODE.
>> * tree-chkp.c (chkp_gimple_call_builtin_p): Likewise.
>> * tree-streamer.h (streamer_handle_as_builtin_p): Remove.
>> (streamer_get_builtin_tree): Likewise.
>> (streamer_write_builtin): Likewise.
>> * lto-streamer.h (LTO_builtin_decl): Remove.
>> * lto-streamer-in.c (lto_read_tree_1): Remove assert.
>> (lto_input_scc): Remove LTO_builtin_decl handling.
>> (lto_input_tree_1): Liekwise.
>> * lto-streamer-out.c (lto_output_tree_1): Remove special
>> handling of builtins.
>> (DFS::DFS): Likewise.
>> * tree-streamer-in.c (streamer_get_builtin_tree): Remove.
>> * tree-streamer-out.c (pack_ts_function_decl_value_fields):
>Remove
>> assert.
>> (streamer_write_builtin): Remove.
>>
>> lto/
>> * lto.c (compare_tree_sccs_1): Remove
>streamer_handle_as_builtin_p uses.
>> (unify_scc): Likewise.
>> (lto_read_decls): Likewise.
>>
>
>This caused:
>
>https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72683

Probably another by-decl built-in function compare in the mpx support.  I have 
fixed the one that triggered on a not mpx capable machine.  Ilya possibly knows 
where the other one(s) are lurking off his head?

Richard.




Re: [PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Michael Meissner
On Tue, Jul 26, 2016 at 11:05:32AM -0500, Segher Boessenkool wrote:
> > --- gcc/testsuite/gcc.target/powerpc/float128-cmp.c (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/float128-cmp.c (revision 0)
> > @@ -0,0 +1,17 @@
> > +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> > +/* { dg-require-effective-target powerpc_float128_sw_ok } */
> > +/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { 
> > "-mcpu=power7" } } */
> > +/* { dg-options "-O2 -mcpu=power7 -mfloat128" } */
> > +
> > +#ifndef __FLOAT128__
> > +#error "-mfloat128 is not supported."
> > +#endif
> 
> This should never trigger, it is only there for debug, right?
> 
> > +int
> > +test_isgreater (__float128 a, __float128 b)
> > +{
> > +  return __builtin_isgreater (a, b);
> > +}
> > +
> > +/* { dg-final { scan-assembler "bl __\[gl\]ekf2"} } */
> > +/* { dg-final { scan-assembler "bl __unordkf2" } } */
> 
> There are some extra spaces in that [gl]ekf2 line.
> 
> Could you test all five functions please?  Use multiple testcases, maybe.

I decided to write an executable test rather than do more assembly tests.  The
patch to rs6000.c is unchanged, and the test now tests all of the comparison
operators and functions with various values including quiet NaNs and signalling
NaNs.  It passes on power7 big endian 64-bit, power7 big endian 32-bit, power8
little endian 64-bit, and I ran it on the power9 simulator with hardware
float128 support.

[gcc]
2016-07-26  Michael Meissner  

PR target/71869
* config/rs6000/rs6000.c (rs6000_generate_compare): Rework
__float128 support when we don't have hardware support, so that
the IEEE built-in functions like isgreater, first call __unordkf3
to make sure neither operand is a NaN, and if both operands are
ordered, do the normal comparison.

[gcc/testsuite]
2016-07-26  Michael Meissner  

PR target/71869
* gcc.target/powerpc/float128-cmp.c: New test to make sure that
IEEE built-in functions handle quiet and signalling NaNs
correctly.

-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 238730)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -21755,8 +21755,8 @@ rs6000_generate_compare (rtx cmp, machin
   else if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
 {
   rtx libfunc = NULL_RTX;
-  bool uneq_or_ltgt = false;
-  rtx dest = gen_reg_rtx (SImode);
+  bool check_nan = false;
+  rtx dest;
 
   switch (code)
{
@@ -21783,21 +21783,23 @@ rs6000_generate_compare (rtx cmp, machin
 
case UNGE:
case UNGT:
- libfunc = optab_libfunc (le_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (ge_optab, mode);
  code = (code == UNGE) ? GE : GT;
  break;
 
case UNLE:
case UNLT:
- libfunc = optab_libfunc (ge_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (le_optab, mode);
  code = (code == UNLE) ? LE : LT;
  break;
 
case UNEQ:
case LTGT:
- libfunc = optab_libfunc (le_optab, mode);
- uneq_or_ltgt = true;
- code = (code = UNEQ) ? NE : EQ;
+ check_nan = true;
+ libfunc = optab_libfunc (eq_optab, mode);
+ code = (code = UNEQ) ? EQ : NE;
  break;
 
default:
@@ -21805,21 +21807,56 @@ rs6000_generate_compare (rtx cmp, machin
}
 
   gcc_assert (libfunc);
-  dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
- SImode, 2, op0, mode, op1, mode);
 
-  /* If this is UNEQ or LTGT, we call __lekf2, which returns -1 for less
-than, 0 for equal, +1 for greater, and +2 for nan.  We add 1, to give
-a value of 0..3, and then do and AND immediate of 1 to isolate whether
-it is 0/Nan (i.e. bottom bit is 0), or less than/greater than
-(i.e. bottom bit is 1).  */
-  if (uneq_or_ltgt)
-   {
- rtx add_result = gen_reg_rtx (SImode);
- rtx and_result = gen_reg_rtx (SImode);
- emit_insn (gen_addsi3 (add_result, dest, GEN_INT (1)));
- emit_insn (gen_andsi3 (and_result, add_result, GEN_INT (1)));
- dest = and_result;
+  if (!check_nan)
+   dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+   SImode, 2, op0, mode, op1, mode);
+
+  /* The library signals an exception for signalling NaNs, so we need to
+handle isgreater, etc. by first checking isordered.  */
+  else
+   {
+ rtx ne_rtx, normal_dest, unord_dest;
+ rtx unord_func = optab_libfunc (unord_optab, mode);
+ rtx join_label = gen_label_rtx ();
+ rtx 

Re: [PATCH] correct atomic_compare_exchange_n return type (c++/71675)

2016-07-26 Thread Martin Sebor

On 07/25/2016 05:03 AM, Renlin Li wrote:

Hi Martin,

I observed the following error:

ERROR: gcc.dg/atomic/pr71675.c   -O2 -flto -fuse-linker-plugin
-fno-fat-lto-objects : syntax error in target selector "target c11" for
" dg-do 3 compile { target c11 } "

It seems we don't have a c11 effective target check available
in dejagnu target-supports.exp.


Thanks for pointing this out.  I had trouble getting this or any
of the other atomic tests to run on their own with make check-c
RUNTESTFLAGS='atomic.exp'  They simply don't.  I didn't notice
a failure in the full test suite run and since the test compiled
successfully outside DejaGnu I didn't take the time to understand
why it wouldn't run on its own or with the rest of the atomic
subset, or to confirm that the c11 target selector was understood.
Looks like I managed to run into not one but two gotchas with
this trivial test.

I've committed r238766 and replaced the unsupported target selector
with a dg-options directive as Jeff and Jason suggested.

Martin



Thanks,
Renlin


diff --git a/gcc/testsuite/gcc.dg/atomic/pr71675.c
b/gcc/testsuite/gcc.dg/atomic/pr71675.c
new file mode 100644
index 000..0e344ac
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/atomic/pr71675.c
@@ -0,0 +1,32 @@
+/* PR c++/71675 - __atomic_compare_exchange_n returns wrong type for
typed enum
+ */
+/* { dg-do compile { target c11 } } */




[PATCH 2/2] Use static_assert for STATIC_ASSERT for C++11 onwards

2016-07-26 Thread David Malcolm
C++11 has a
  static_assert (COND, MESSAGE)
which gives more readable error messages for STATIC_ASSERT than our
current implementation.

This patch makes us use it if __cplusplus >= 201103L

There's also a provisional static_assert (COND) in C++1z, but presumably
we should wait until that one is fully standardized before using it.

Bootstrapped®rtested on x86_64-pc-linux-gnu (in conjunction
with the previous patch)

OK for trunk?

gcc/ChangeLog:
* system.h (STATIC_ASSERT): Use static_assert if building
with C++11 onwards.
---
 gcc/system.h | 7 ++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/gcc/system.h b/gcc/system.h
index 78a7da6..8a17197 100644
--- a/gcc/system.h
+++ b/gcc/system.h
@@ -752,9 +752,14 @@ extern void fancy_abort (const char *, int, const char *) 
ATTRIBUTE_NORETURN;
 #define STATIC_CONSTANT_P(X) (false && (X))
 #endif
 
-/* Until we can use C++11's static_assert.  */
+/* static_assert (COND, MESSAGE) is available in C++11 onwards.  */
+#if __cplusplus >= 201103L
+#define STATIC_ASSERT(X) \
+  static_assert ((X), #X)
+#else
 #define STATIC_ASSERT(X) \
   typedef int assertion1[(X) ? 1 : -1] ATTRIBUTE_UNUSED
+#endif
 
 /* Provide a fake boolean type.  We make no attempt to use the
C99 _Bool, as it may not be available in the bootstrap compiler,
-- 
1.8.5.3



[PATCH 1/2] Add a STATIC_ASSERT on sizeof (struct cp_token)

2016-07-26 Thread David Malcolm
On Mon, 2016-07-25 at 10:36 +0200, Richard Biener wrote:
> On Fri, Jul 22, 2016 at 4:11 PM, Jakub Jelinek 
> wrote:
> > On Fri, Jul 22, 2016 at 10:33:50AM -0400, David Malcolm wrote:
> > > gcc/cp/ChangeLog:
> > >   * parser.h (struct cp_token): Add a STATIC_ASSERT on the
> > >   size of the struct.
> > > ---
> > >  gcc/cp/parser.h | 9 +
> > >  1 file changed, 9 insertions(+)
> > > 
> > > diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
> > > index ccbace9..8c1de57 100644
> > > --- a/gcc/cp/parser.h
> > > +++ b/gcc/cp/parser.h
> > > @@ -71,6 +71,15 @@ struct GTY (()) cp_token {
> > >  "|| (%1.type == CPP_DECLTYPE)"))) u;
> > >  };
> > > 
> > > +/* The C++ frontend lexes everything first, and keeps the tokens
> > > +   in memory, so there are possibly millions of tokens in
> > > memory.
> > > +   Ensure that we don't accidentally grow the structure.  */
> > > +STATIC_ASSERT (sizeof (cp_token) ==
> > > +(2 // "type" and "keyword"
> > > + + 1 // "flags"
> > > + + 1 // bitfields
> > > + + 4 // location_t
> > > + + sizeof (void *))); // union
> > 
> > Doesn't that assume way too much on the host data layout?
> > This can be compiled with weirdo system compilers or on weirdo
> > hosts,
> > I bet we don't really support non-8-bit char hosts, but still this
> > might
> > break somewhere...
> > 
> > The formatting is wrong BTW, == shouldn't appear at the end of
> > line.
> 
> Maybe restrict it to __GNUC__ and __x86_64__ or so.
> 
> Richard.
> 
> > Jakub

Bootstrapped®rtested on x86_64-pc-linux-gnu (in conjunction
with the followup patch)

OK for trunk if it passes testing by itself?

gcc/cp/ChangeLog:
* parser.h (struct cp_token): Add a STATIC_ASSERT on the
size of the struct.
---
 gcc/cp/parser.h | 17 +
 1 file changed, 17 insertions(+)

diff --git a/gcc/cp/parser.h b/gcc/cp/parser.h
index 2923378..bc961c5 100644
--- a/gcc/cp/parser.h
+++ b/gcc/cp/parser.h
@@ -71,6 +71,23 @@ struct GTY (()) cp_token {
   "|| (%1.type == CPP_DECLTYPE)"))) u;
 };
 
+/* The C++ frontend lexes everything first, and keeps the tokens
+   in memory, so there are possibly millions of tokens in memory.
+
+   Use a STATIC_ASSERT to ensure that we don't accidentally grow
+   the structure.
+
+   To avoid introducing too many assumptions on the host data layout,
+   only enable the assertion when compiling with GCC for a
+   known-good host.  */
+#if defined (__GNUC__) && defined (__x86_64__)
+STATIC_ASSERT (sizeof (cp_token) ==
+  (2 // "type" and "keyword"
+   + 1 // "flags"
+   + 1 // bitfields
+   + 4 // location_t
+   + sizeof (void *))); // union
+#endif /* #if defined (__GNUC__) && defined (__x86_64__) */
 
 /* We use a stack of token pointer for saving token sets.  */
 typedef struct cp_token *cp_token_position;
-- 
1.8.5.3



Re: [PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Segher Boessenkool
On Tue, Jul 26, 2016 at 04:09:01PM -0400, Michael Meissner wrote:
> > Could you test all five functions please?  Use multiple testcases, maybe.
> 
> I decided to write an executable test rather than do more assembly tests.  The
> patch to rs6000.c is unchanged, and the test now tests all of the comparison
> operators and functions with various values including quiet NaNs and 
> signalling
> NaNs.  It passes on power7 big endian 64-bit, power7 big endian 32-bit, power8
> little endian 64-bit, and I ran it on the power9 simulator with hardware
> float128 support.

> --- gcc/testsuite/gcc.target/powerpc/float128-cmp.c   (revision 0)
> +++ gcc/testsuite/gcc.target/powerpc/float128-cmp.c   (revision 0)
> @@ -0,0 +1,106 @@
> +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> +/* { dg-require-effective-target ppc_float128_sw } */
> +/* { dg-options "-mvsx -O2 -mfloat128" } */

dg-do compile?  That's not testing much then, as an executable test!


Segher


Re: [PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Michael Meissner
On Tue, Jul 26, 2016 at 06:26:19PM -0500, Segher Boessenkool wrote:
> On Tue, Jul 26, 2016 at 04:09:01PM -0400, Michael Meissner wrote:
> > > Could you test all five functions please?  Use multiple testcases, maybe.
> > 
> > I decided to write an executable test rather than do more assembly tests.  
> > The
> > patch to rs6000.c is unchanged, and the test now tests all of the comparison
> > operators and functions with various values including quiet NaNs and 
> > signalling
> > NaNs.  It passes on power7 big endian 64-bit, power7 big endian 32-bit, 
> > power8
> > little endian 64-bit, and I ran it on the power9 simulator with hardware
> > float128 support.
> 
> > --- gcc/testsuite/gcc.target/powerpc/float128-cmp.c (revision 0)
> > +++ gcc/testsuite/gcc.target/powerpc/float128-cmp.c (revision 0)
> > @@ -0,0 +1,106 @@
> > +/* { dg-do compile { target { powerpc*-*-linux* } } } */
> > +/* { dg-require-effective-target ppc_float128_sw } */
> > +/* { dg-options "-mvsx -O2 -mfloat128" } */
> 
> dg-do compile?  That's not testing much then, as an executable test!

Good catch.  Hopefully, third time is a charm.  I verified that changing the
dg-do compile to dg-do run did run properly on a big endian power7 (both 32-bit
and 64-bit) and a little endian power8.

[gcc]
2016-07-26  Michael Meissner  

PR target/71869
* config/rs6000/rs6000.c (rs6000_generate_compare): Rework
__float128 support when we don't have hardware support, so that
the IEEE built-in functions like isgreater, first call __unordkf3
to make sure neither operand is a NaN, and if both operands are
ordered, do the normal comparison.

[gcc/testsuite]
2016-07-26  Michael Meissner  

PR target/71869
* gcc.target/powerpc/float128-cmp.c: New test to make sure that
IEEE built-in functions handle quiet and signalling NaNs
correctly.


-- 
Michael Meissner, IBM
IBM, M/S 2506R, 550 King Street, Littleton, MA 01460-6245, USA
email: meiss...@linux.vnet.ibm.com, phone: +1 (978) 899-4797
Index: gcc/config/rs6000/rs6000.c
===
--- gcc/config/rs6000/rs6000.c  (revision 238730)
+++ gcc/config/rs6000/rs6000.c  (working copy)
@@ -21755,8 +21755,8 @@ rs6000_generate_compare (rtx cmp, machin
   else if (!TARGET_FLOAT128_HW && FLOAT128_VECTOR_P (mode))
 {
   rtx libfunc = NULL_RTX;
-  bool uneq_or_ltgt = false;
-  rtx dest = gen_reg_rtx (SImode);
+  bool check_nan = false;
+  rtx dest;
 
   switch (code)
{
@@ -21783,21 +21783,23 @@ rs6000_generate_compare (rtx cmp, machin
 
case UNGE:
case UNGT:
- libfunc = optab_libfunc (le_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (ge_optab, mode);
  code = (code == UNGE) ? GE : GT;
  break;
 
case UNLE:
case UNLT:
- libfunc = optab_libfunc (ge_optab, mode);
+ check_nan = true;
+ libfunc = optab_libfunc (le_optab, mode);
  code = (code == UNLE) ? LE : LT;
  break;
 
case UNEQ:
case LTGT:
- libfunc = optab_libfunc (le_optab, mode);
- uneq_or_ltgt = true;
- code = (code = UNEQ) ? NE : EQ;
+ check_nan = true;
+ libfunc = optab_libfunc (eq_optab, mode);
+ code = (code = UNEQ) ? EQ : NE;
  break;
 
default:
@@ -21805,21 +21807,56 @@ rs6000_generate_compare (rtx cmp, machin
}
 
   gcc_assert (libfunc);
-  dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
- SImode, 2, op0, mode, op1, mode);
 
-  /* If this is UNEQ or LTGT, we call __lekf2, which returns -1 for less
-than, 0 for equal, +1 for greater, and +2 for nan.  We add 1, to give
-a value of 0..3, and then do and AND immediate of 1 to isolate whether
-it is 0/Nan (i.e. bottom bit is 0), or less than/greater than
-(i.e. bottom bit is 1).  */
-  if (uneq_or_ltgt)
-   {
- rtx add_result = gen_reg_rtx (SImode);
- rtx and_result = gen_reg_rtx (SImode);
- emit_insn (gen_addsi3 (add_result, dest, GEN_INT (1)));
- emit_insn (gen_andsi3 (and_result, add_result, GEN_INT (1)));
- dest = and_result;
+  if (!check_nan)
+   dest = emit_library_call_value (libfunc, NULL_RTX, LCT_CONST,
+   SImode, 2, op0, mode, op1, mode);
+
+  /* The library signals an exception for signalling NaNs, so we need to
+handle isgreater, etc. by first checking isordered.  */
+  else
+   {
+ rtx ne_rtx, normal_dest, unord_dest;
+ rtx unord_func = optab_libfunc (unord_optab, mode);
+ rtx join_label = gen_label_rtx ();
+ rtx join_ref = gen_rtx_LABEL_REF (VOIDmode, join_label);
+ rtx unord_cmp = gen_reg_rtx (comp_mode);
+
+
+ /* Test for either value being a NaN.  */
+ gcc_assert (unor

Re: [Bug tree-optimization] Fix for PR71994

2016-07-26 Thread kugan

Hi Jeff,

Thanks for your comments.




* tree-ssa-reassoc.c (maybe_optimize_range_tests): Check type
compatibility.

I'd kind of like to see some analysis of how we've got a bool here --
ISTM it ought to have been converted it to the type of the LHS of the
PHI when propagated.


You are right. The problem was with the order of checking tcc_compare 
and calling get_ops. We ended up calling get_ops where we should not.


Bootstrap and regression testing is ongoing. Is this OK for trunk if no 
regressions?


Thanks,
Kugan


gcc/testsuite/ChangeLog:

2016-07-27  Kugan Vivekanandarajah  

* gcc.dg/torture/pr71994.c: New test.

gcc/ChangeLog:

2016-07-27  Kugan Vivekanandarajah  

* tree-ssa-reassoc.c (maybe_optimize_range_tests): Check tcc_comparison
 before calling get_ops.




diff --git a/gcc/testsuite/gcc.dg/torture/pr71994.c 
b/gcc/testsuite/gcc.dg/torture/pr71994.c
index e69de29..8f5e92c 100644
--- a/gcc/testsuite/gcc.dg/torture/pr71994.c
+++ b/gcc/testsuite/gcc.dg/torture/pr71994.c
@@ -0,0 +1,14 @@
+/* PR tree-optimization/71994 */
+/* { dg-do compile } */
+int om, h6;
+
+void eo (void)
+{
+  const int tl = 1;
+  int ln;
+
+  h6 = (om + tl) > 0;
+  ln = om && (om & h6);
+  h6 = om;
+  om = ln < h6;
+}
diff --git a/gcc/tree-ssa-reassoc.c b/gcc/tree-ssa-reassoc.c
index 18cf978..8f2256f 100644
--- a/gcc/tree-ssa-reassoc.c
+++ b/gcc/tree-ssa-reassoc.c
@@ -3520,10 +3520,10 @@ maybe_optimize_range_tests (gimple *stmt)
 push into ops the individual range test arguments
 of the bitwise or resp. and, recursively.  */
  if (TREE_CODE (rhs) == SSA_NAME
- && !get_ops (rhs, code, &ops,
-   loop_containing_stmt (stmt))
  && (TREE_CODE_CLASS (gimple_assign_rhs_code (stmt))
  != tcc_comparison)
+ && !get_ops (rhs, code, &ops,
+   loop_containing_stmt (stmt))
  && has_single_use (rhs))
{
  /* Otherwise, push the _234 range test itself.  */


Re: [PATCH], Fix PR 71869, Correctly implement isgreater, etc. on PowerPC __float128

2016-07-26 Thread Segher Boessenkool
On Tue, Jul 26, 2016 at 08:04:32PM -0400, Michael Meissner wrote:
> > dg-do compile?  That's not testing much then, as an executable test!
> 
> Good catch.  Hopefully, third time is a charm.  I verified that changing the
> dg-do compile to dg-do run did run properly on a big endian power7 (both 
> 32-bit
> and 64-bit) and a little endian power8.
> 
> [gcc]
> 2016-07-26  Michael Meissner  
> 
>   PR target/71869
>   * config/rs6000/rs6000.c (rs6000_generate_compare): Rework
>   __float128 support when we don't have hardware support, so that
>   the IEEE built-in functions like isgreater, first call __unordkf3
>   to make sure neither operand is a NaN, and if both operands are
>   ordered, do the normal comparison.
> 
> [gcc/testsuite]
> 2016-07-26  Michael Meissner  
> 
>   PR target/71869
>   * gcc.target/powerpc/float128-cmp.c: New test to make sure that
>   IEEE built-in functions handle quiet and signalling NaNs
>   correctly.

Okay for trunk.  Okay for 6 once there is independent testing (on
gcc-testresults, say) for all affected targets and you aren't comfortable
waiting any longer ;-)


Segher


Re: [PATCH/AARCH64] Add scheduler for vulcan.

2016-07-26 Thread Virendra Pathak
Hi James,

> Would you mind taking a look at some of these techniques to see if you can
> reduce the size of the generated automata without causing too much
> trouble for code generation for Vulcan?

Thanks for the review. I will look into this.


with regards,
Virendra Pathak


On Mon, Jul 25, 2016 at 4:03 PM, James Greenhalgh
 wrote:
> On Wed, Jul 20, 2016 at 03:07:45PM +0530, Virendra Pathak wrote:
>> Hi gcc-patches group,
>>
>> Please find the patch for adding the basic scheduler for vulcan
>> in the aarch64 port.
>>
>> Tested the patch with compiling cross aarch64-linux-gcc,
>> bootstrapped native aarch64-unknown-linux-gnu and
>> run gcc regression.
>>
>> Kindly review and merge the patch to trunk, if the patch is okay.
>>
>> There are few TODO in this patch which we have planned to
>> submit in the next submission e.g. crc and crypto
>> instructions, further improving integer & fp load/store
>> based on addressing mode of the address.
>
> Hi Virendra,
>
> Thanks for the patch, I have some concerns about the size of the
> automata that this description generates. As you can see
> (use (automata_option "stats") in the description to enable statistics)
> this scheduler description generates a 10x larger automata for Vulcan than
> the second largest description we have for AArch64 (cortex_a53_advsimd):
>
>   Automaton `cortex_a53_advsimd'
>  9072 NDFA states,  49572 NDFA arcs
>  9072 DFA states,   49572 DFA arcs
>  4050 minimal DFA states,   23679 minimal DFA arcs
>   368 all insns 11 insn equivalence classes
> 0 locked states
>   28759 transition comb vector els, 44550 trans table els: use simple vect
>   44550 min delay table els, compression factor 2
>
>   Automaton `vulcan'
> 103223 NDFA states,  651918 NDFA arcs
> 103223 DFA states,   651918 DFA arcs
> 45857 minimal DFA states,   352255 minimal DFA arcs
>   368 all insns 28 insn equivalence classes
> 0 locked states
>   429671 transition comb vector els, 1283996 trans table els: use comb vect
>   1283996 min delay table els, compression factor 2
>
> Such a large automaton increases compiler build time and memory consumption,
> often for little scheduling benefit.
>
> Normally such a large automaton comes from using a large repeat
> expression (*). For example in your modeling of divisions:
>
>> +(define_insn_reservation "vulcan_div" 13
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "sdiv,udiv"))
>> +  "vulcan_i1*13")
>> +
>> +(define_insn_reservation "vulcan_fp_divsqrt_s" 16
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "fdivs,fsqrts"))
>> +  "vulcan_f0*8|vulcan_f1*8")
>> +
>> +(define_insn_reservation "vulcan_fp_divsqrt_d" 23
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "fdivd,fsqrtd"))
>> +  "vulcan_f0*12|vulcan_f1*12")
>
> In other pipeline models, we try to keep these repeat numbers low to avoid
> the large state-space growth they cause. For example, the Cortex-A57
> pipeline model describes them as:
>
>   (define_insn_reservation "cortex_a57_fp_divd" 16
> (and (eq_attr "tune" "cortexa57")
>  (eq_attr "type" "fdivd, fsqrtd, neon_fp_div_d, neon_fp_sqrt_d"))
> "ca57_cx2_block*3")
>
> The lower accuracy is acceptable because of the nature of the scheduling
> model. For a machine with an issue rate of "4" like Vulcan, each cycle the
> compiler models it tries to find four instructions to schedule, before it
> progresses the state of the automaton. If an instruction is modelled as
> blocking the "vulcan_i1" unit for 13 cycles, that means up to 52
> instructions that the scheduler would have to find before issuing the next
> instruction which would use vulcan_i1. Because scheduling works within
> basic-blocks, the chance of finding so many independent instructions is
> extremely low, and so you'd never see the benefit of the 13-cycle block.
>
> I tried lowering the repeat expressions as so:
>
>> +(define_insn_reservation "vulcan_div" 13
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "sdiv,udiv"))
>> +  "vulcan_i1*3")
>> +
>> +(define_insn_reservation "vulcan_fp_divsqrt_s" 16
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "fdivs,fsqrts"))
>> +  "vulcan_f0*3|vulcan_f1*3")
>> +
>> +(define_insn_reservation "vulcan_fp_divsqrt_d" 23
>> +  (and (eq_attr "tune" "vulcan")
>> +   (eq_attr "type" "fdivd,fsqrtd"))
>> +  "vulcan_f0*5|vulcan_f1*5")
>
> Which more than halves the size of the generated automaton:
>
>   Automaton `vulcan'
> 45370 NDFA states,  319261 NDFA arcs
> 45370 DFA states,   319261 DFA arcs
> 20150 minimal DFA states,   170824 minimal DFA arcs
>   368 all insns 28 insn equivalence classes
> 0 locked states
>   215565 transition comb vector els, 564200 trans table els: use comb vect
>   564200 min delay table els, compression factor 2
>
> The other technique we use to reduce the size of the generated automato

Re: Gimple loop splitting v2

2016-07-26 Thread Andrew Pinski
On Tue, Jul 26, 2016 at 4:32 AM, Richard Biener
 wrote:
> On Mon, Jul 25, 2016 at 10:57 PM, Andrew Pinski  wrote:
>> On Wed, Dec 2, 2015 at 5:23 AM, Michael Matz  wrote:
>>> Hi,
>>>
>>> On Tue, 1 Dec 2015, Jeff Law wrote:
>>>
 > So, okay for trunk?
 -ENOPATCH
>>>
>>> Sigh :)
>>> Here it is.
>>
>>
>> I found one problem with it.
>> Take:
>> void f(int *a, int M, int *b)
>> {
>>   for(int i = 0; i <= M; i++)
>> {
>>if (i < M)
>> a[i] = i;
>> }
>> }
>>  CUT ---
>> There are two issues with the code as below.  The outer most loop's
>> aux is still set which causes the vectorizer not to vector the loop.
>> The other issue is I need to run pass_scev_cprop after pass_loop_split
>> to get the induction variable usage after the loop gone so the
>> vectorizer will work.
>
> I think scev_cprop needs to be re-written to an utility so that the vectorizer
> itself can (within its own cost-model) eliminate an induction using it.
>
> Richard.
>
>> Something like (note this is copy and paste from a terminal):
>> diff --git a/gcc/passes.def b/gcc/passes.def
>> index c327900..e8d6ea6 100644
>> --- a/gcc/passes.def
>> +++ b/gcc/passes.def
>> @@ -262,8 +262,8 @@ along with GCC; see the file COPYING3.  If not see
>>   NEXT_PASS (pass_copy_prop);
>>   NEXT_PASS (pass_dce);
>>   NEXT_PASS (pass_tree_unswitch);
>> - NEXT_PASS (pass_scev_cprop);
>>   NEXT_PASS (pass_loop_split);
>> + NEXT_PASS (pass_scev_cprop);
>>   NEXT_PASS (pass_record_bounds);
>>   NEXT_PASS (pass_loop_distribution);
>>   NEXT_PASS (pass_copy_prop);
>> diff --git a/gcc/tree-ssa-loop-split.c b/gcc/tree-ssa-loop-split.c
>> index 5411530..e72ef19 100644
>> --- a/gcc/tree-ssa-loop-split.c
>> +++ b/gcc/tree-ssa-loop-split.c
>> @@ -592,7 +592,11 @@ tree_ssa_split_loops (void)
>>
>>gcc_assert (scev_initialized_p ());
>>FOR_EACH_LOOP (loop, 0)
>> -loop->aux = NULL;
>> +{
>> +  loop->aux = NULL;
>> +  if (loop_outer (loop))
>> +   loop_outer (loop)->aux = NULL;
>> +}
>
> How does the iterator not visit loop_outer (loop)?!

The iterator with flags of 0 does not visit the the root.  So the way
to fix this is change 0 (which is the flags) with LI_INCLUDE_ROOT so
we zero out the root too.

Thanks,
Andrew

>
>>
>>/* Go through all loops starting from innermost.  */
>>FOR_EACH_LOOP (loop, LI_FROM_INNERMOST)
>> @@ -631,7 +635,11 @@ tree_ssa_split_loops (void)
>>  }
>>
>>FOR_EACH_LOOP (loop, 0)
>> -loop->aux = NULL;
>> +{
>> +  loop->aux = NULL;
>> +  if (loop_outer (loop))
>> +   loop_outer (loop)->aux = NULL;
>> +}
>>
>>if (changed)
>>  return TODO_cleanup_cfg;
>> -  CUT -
>>
>> Thanks,
>> Andrew
>>
>>
>>>
>>>
>>> Ciao,
>>> Michael.
>>> * common.opt (-fsplit-loops): New flag.
>>> * passes.def (pass_loop_split): Add.
>>> * opts.c (default_options_table): Add OPT_fsplit_loops entry at -O3.
>>> (enable_fdo_optimizations): Add loop splitting.
>>> * timevar.def (TV_LOOP_SPLIT): Add.
>>> * tree-pass.h (make_pass_loop_split): Declare.
>>> * tree-ssa-loop-manip.h (rewrite_into_loop_closed_ssa_1): Declare.
>>> * tree-ssa-loop-unswitch.c: Include tree-ssa-loop-manip.h,
>>> * tree-ssa-loop-split.c: New file.
>>> * Makefile.in (OBJS): Add tree-ssa-loop-split.o.
>>> * doc/invoke.texi (fsplit-loops): Document.
>>> * doc/passes.texi (Loop optimization): Add paragraph about loop
>>> splitting.
>>>
>>> testsuite/
>>> * gcc.dg/loop-split.c: New test.
>>>
>>> Index: common.opt
>>> ===
>>> --- common.opt  (revision 231115)
>>> +++ common.opt  (working copy)
>>> @@ -2453,6 +2457,10 @@ funswitch-loops
>>>  Common Report Var(flag_unswitch_loops) Optimization
>>>  Perform loop unswitching.
>>>
>>> +fsplit-loops
>>> +Common Report Var(flag_split_loops) Optimization
>>> +Perform loop splitting.
>>> +
>>>  funwind-tables
>>>  Common Report Var(flag_unwind_tables) Optimization
>>>  Just generate unwind tables for exception handling.
>>> Index: passes.def
>>> ===
>>> --- passes.def  (revision 231115)
>>> +++ passes.def  (working copy)
>>> @@ -252,6 +252,7 @@ along with GCC; see the file COPYING3.
>>>   NEXT_PASS (pass_dce);
>>>   NEXT_PASS (pass_tree_unswitch);
>>>   NEXT_PASS (pass_scev_cprop);
>>> + NEXT_PASS (pass_loop_split);
>>>   NEXT_PASS (pass_record_bounds);
>>>   NEXT_PASS (pass_loop_distribution);
>>>   NEXT_PASS (pass_copy_prop);
>>> Index: opts.c
>>> ===
>>> --- opts.c  (revision 231115)
>>> +++ opts.c  (working copy)
>>> @@ -532,6 +532,7 @@ static const struct default_options defa
>>> regardless of them being declared inline.  *

[C++ PATCH] for PR72457

2016-07-26 Thread Markus Trippelsdorf
On 2016.07.23 at 22:55 -0400, Jason Merrill wrote:
> Using build_value_init in a base initialization is wrong, because it
> calls the complete object constructor and misses protected access.  So
> let's handle list-value-initialization in expand_aggr_init_1.
> 
> Tested x86_64-pc-linux-gnu, applying to trunk.

This patch causes https://gcc.gnu.org/bugzilla/show_bug.cgi?id=72457.
And because it was backported, the gcc-6 branch is also affected.

The following fix was tested on ppc64le. OK for trunk and gcc-6?

(Unfortunately the reduced testcase is much too big.)

 PR c++/72457
 *constexpr.c (cx_check_missing_mem_inits): Handle potential
 NULL_TREE.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 6bcb41ae8254..83fd9a4896ac 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -734,7 +734,7 @@ cx_check_missing_mem_inits (tree fun, tree body, bool 
complain)
  || DECL_ARTIFICIAL (index))
continue;
}
-  for (; field != index; field = DECL_CHAIN (field))
+  for (; field != NULL_TREE && field != index; field = DECL_CHAIN (field))
{
  tree ftype;
  if (TREE_CODE (field) != FIELD_DECL

-- 
Markus