Re: Fix regrename compare-debug issue

2016-05-05 Thread Eric Botcazou
> When scanning addresses inside a debug insn, we shouldn't use normal
> base/index classes. This shows as a compare-debug issue on Alpha, where
> INDEX_REG_CLASS is NO_REGS, and this prevented a chain from being
> renamed with debugging turned on.
> 
> Uros has reported that this patch resolves the issues he was seeing on
> Alpha, and I've bootstrapped and tested it on x86_64-linux. Ok?

OK, thanks.  It might be worthwhile to add a sentence somewhere (maybe at the end 
of the head comment of the file) documenting the special treatment applied to 
debug insns during the pass.

-- 
Eric Botcazou


Re: Enabling -frename-registers?

2016-05-05 Thread Ramana Radhakrishnan
On Wed, May 4, 2016 at 4:20 PM, Wilco Dijkstra  wrote:
> Bernd Schmidt wrote:
>> On 05/04/2016 03:25 PM, Ramana Radhakrishnan wrote:
>>> On ARM / AArch32 I haven't seen any performance data yet - the one place we
>>> are concerned about the impact is on Thumb2 code size as regrename may end up
>>> inadvertently putting more things in high registers.
>>
>> In theory at least arm_preferred_rename_class is designed to make the
>> opposite happen.
>>
>> Bernd
>
> I do not see that working unfortunately - Thumb-2 codesize increases by a few
> percent even with -Os.  This is primarily due to replacing a low register with
> IP, which often changes a 16-bit instruction like:
>
> movs r2, #8
>
> into a 32-bit one:
>
> mov ip, #8
>
> This will also affect other targets with multiple instruction sizes. So I
> think it should check the size of the new instruction patterns and only accept
> a rename if it is not larger (certainly with -Os).

Can you file a bugzilla entry with a testcase that folks can look at please ?

Ramana
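
The size check suggested in the quoted text could be sketched roughly as follows. This is only an illustrative model, not regrename code: the helper names thumb2_mov_imm8_size and rename_keeps_size are made up here, and the size rule covers only the "movs Rd, #imm8" case from the example (16-bit for r0-r7 when the flag-setting form is acceptable, 32-bit otherwise).

```c
#include <assert.h>

/* Hypothetical size model for a Thumb-2 "mov Rd, #imm8" (imm8 in 0..255).
   Assumes the flag-setting 16-bit MOVS form may be used for low registers.  */
static int
thumb2_mov_imm8_size (int rd)
{
  assert (rd >= 0 && rd <= 15);
  return rd <= 7 ? 2 : 4;   /* Size in bytes.  */
}

/* Accept a rename only if the instruction does not get larger.  */
static int
rename_keeps_size (int old_rd, int new_rd)
{
  return thumb2_mov_imm8_size (new_rd) <= thumb2_mov_imm8_size (old_rd);
}
```

Under this model, renaming r2 to ip (r12) would be rejected, while renaming r2 to r3 or ip to r2 would be accepted.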

>


Re: [Patch AArch64] Fix PR target/63874

2016-05-05 Thread Ramana Radhakrishnan
On Thu, Mar 31, 2016 at 2:11 PM, Ramana Radhakrishnan
 wrote:
> Hi,
>
> In this PR we have a situation where we aren't really detecting
> weak references vs weak definitions. If one has a weak definition
> that binds locally there's no reason not to put out PC relative
> relocations.
>
> However if you have a genuine weak reference that is
> known not to bind locally it makes very little sense
> to put out an entry into the literal pool which doesn't always
> work with DSOs and shared objects.
>
> Tested aarch64-none-linux-gnu bootstrap and regression test with no 
> regressions
>
> This is not a regression and given what we've seen recently with protected
> symbols and binds_locally_p I'd rather this were queued for GCC 7.
>
> Ok ?

Ping ^ 2.

https://gcc.gnu.org/ml/gcc-patches/2016-03/msg01680.html

regards
Ramana
>
> regards
> Ramana
>
> gcc/
>
> * config/aarch64/aarch64.c (aarch64_classify_symbol): Typo in comment fixed.
>   Only force to memory if it is a weak external reference.
>
>
> gcc/testsuite
>
> * gcc.target/aarch64/pr63874.c: New test.
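
For readers unfamiliar with the distinction drawn in the quoted description, here is a minimal sketch of a weak definition versus a weak reference. It is not the pr63874.c testcase; the names weak_default, optional_hook and call_hook_or_default are invented for illustration, and the behaviour described assumes an ELF target and GCC-style attributes.

```c
/* Weak definition: this translation unit provides the symbol; a strong
   definition elsewhere would override it at link time.  */
__attribute__ ((weak)) int
weak_default (void)
{
  return 42;
}

/* Weak reference: declared here but defined nowhere in this program, so on
   ELF targets its address resolves to null at link time.  */
extern int optional_hook (void) __attribute__ ((weak));

int
call_hook_or_default (void)
{
  /* The address test is only meaningful because the reference is weak.  */
  return optional_hook ? optional_hook () : weak_default ();
}
```

The address of optional_hook is not known to bind locally, which is the case the patch description singles out.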


Re: [PATCH] Handle also switch for -Wdangling-else

2016-05-05 Thread Marek Polacek
On Wed, May 04, 2016 at 09:54:29PM +0200, Jakub Jelinek wrote:
> Hi!
> 
> This patch lets us warn about dangling else even if there is a switch
> without {}s around the body.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2016-05-04  Jakub Jelinek  
> 
>   * c-parser.c (c_parser_switch_statement): Add IF_P argument,
>   parse it through to c_parser_c99_block_statement.
>   (c_parser_statement_after_labels): Adjust c_parser_switch_statement
>   caller.
> 
>   * parser.c (cp_parser_selection_statement): For RID_SWITCH,
>   pass if_p instead of NULL to cp_parser_implicitly_scoped_statement.
> 
>   * c-c++-common/Wdangling-else-4.c: New test.

Ok, thanks.

Marek
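
As an illustration of the case the patch is about (this is a sketch, not the Wdangling-else-4.c test itself, and which_branch is a made-up name), a brace-less switch between the two ifs still leaves the else bound to the inner if:

```c
/* Returns 1 if the inner "then" ran, 2 if the else ran, 0 if neither.
   The indentation is deliberately misleading: the else looks as if it
   belonged to "if (a)", but it binds to "if (c)".  */
static int
which_branch (int a, int b, int c)
{
  int result = 0;

  if (a)
    switch (b)
      case 0:
        if (c)
          result = 1;
  else
    result = 2;

  return result;
}
```

A compiler with -Wdangling-else (or -Wparentheses) enabled is expected to warn on this function; the code still compiles and behaves as the comment describes.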


[PATCH] MIPS: In mips_print_operand_address pass the mode argument to mips_classify_address

2016-05-05 Thread Andrew Bennett
Hi,

Currently the mips_print_operand_address function ignores its mode argument,
and when it calls mips_classify_address it forces the mode argument to be the
machine's word mode.  This patch makes mips_print_operand_address pass the mode
argument provided to it to mips_classify_address, so that it uses the actual
mode of the mem rtx.  This patch is also a prerequisite for the following
patch: https://gcc.gnu.org/ml/gcc-patches/2016-05/msg00314.html.

I have tested the patch on the mips-mti-elf toolchain and there have been no
regressions.

The patch and ChangeLog are below.


Ok to commit?

Many thanks,



Andrew


gcc/
* config/mips/mips.c (mips_print_operand_address): Pass the mode
argument to mips_classify_address.



diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
index 399f231..6cdda3b 100644
--- a/gcc/config/mips/mips.c
+++ b/gcc/config/mips/mips.c
@@ -8634,11 +8634,14 @@ mips_print_operand (FILE *file, rtx op, int letter)
 /* Implement TARGET_PRINT_OPERAND_ADDRESS.  */
 
 static void
-mips_print_operand_address (FILE *file, machine_mode /*mode*/, rtx x)
+mips_print_operand_address (FILE *file, machine_mode mode, rtx x)
 {
   struct mips_address_info addr;
 
-  if (mips_classify_address (&addr, x, word_mode, true))
+  if (mode == VOIDmode)
+mode = word_mode;
+
+  if (mips_classify_address (&addr, x, mode, true))
 switch (addr.type)
   {
   case ADDRESS_REG:



RE: [PATCH] MIPS: Ensure that lo_sums do not contain an unaligned symbol

2016-05-05 Thread Matthew Fortune
Hi Andrew,

Thanks for working on this; it is a painful area.  There's a bit more to do
but this is cleaning up some sneaky bugs.  Can you create a GCC bugzilla
entry if you haven't already as we should record where these bugs exist and
when they are fixed?

See my comments but I think that you are fixing more variants of this bug
than your summary states so we need to capture the detail on what code is
affected by these issues.

Andrew Bennett  writes:
> different offsets.  Let's show this with an example C program.
> 
> struct
> {
>   short s;
>   unsigned long long l;
> } h;
> 
> void foo (void)
> {
>   h.l = 0;
> }
> 
> When this is compiled for MIPS it produces the following assembly:
> 
> lui $2,%hi(h+8)
> sw  $0,%lo(h+8)($2)
> jr  $31
> sw  $0,%lo(h+12)($2)

This looks like a stale example: h+8 implies 8 bytes of data preceding 'l'.
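
The offset of 'l' can be checked with offsetof.  The sketch below is an assumption-laden illustration: the quoted example shows no attribute on the struct, so the packed variant is only a guess at one way 'l' could end up unaligned, and the struct and function names are invented here.

```c
#include <stddef.h>

/* Layout as written in the quoted example: 'l' is naturally aligned.  */
struct natural_layout
{
  short s;
  unsigned long long l;
};

/* Hypothetical packed variant: 'l' directly follows 's'.  */
struct packed_layout
{
  short s;
  unsigned long long l;
} __attribute__ ((packed));

static size_t
natural_offset_of_l (void)
{
  return offsetof (struct natural_layout, l);
}

static size_t
packed_offset_of_l (void)
{
  return offsetof (struct packed_layout, l);
}
```

With the natural layout the offset equals the ABI alignment of unsigned long long; with the packed layout it is sizeof (short).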


>diff --git a/gcc/config/mips/mips.c b/gcc/config/mips/mips.c
>index 6cdda3b..f07e433 100644
>--- a/gcc/config/mips/mips.c
>+++ b/gcc/config/mips/mips.c
>@@ -2354,6 +2354,38 @@ mips_valid_lo_sum_p (enum mips_symbol_type symbol_type, 
>machine_mode mode)
>   return true;
> }
> 
>+/* Return true if X in LO_SUM (REG, X) is a valid.  */

is a valid...

I would however merge this code into mips_valid_lo_sum_p as the new function
name is fairly confusing.  Both call sites for this function have the
symbol_type available which mips_valid_lo_sum_p requires and also have the
mode, reg, x so just add those to mips_valid_lo_sum_p.

>+
>+bool
>+mips_valid_lo_sum_lo_part_p (machine_mode mode, rtx reg, rtx x)
>+{
>+   rtx symbol = NULL;

three space indent.

>+
>+   if (mips_abi != ABI_32)
>+ return true;

I don't think this is limited to o32.  I was thinking this was just about
splitting a multi-word unaligned access but actually the test cases in this
patch show that it is also about accessing unaligned elements in a structure
using the same 'hi' part for multiple 'lo' parts with differing offsets.

In the end I think there are no word-size or ABI-specific issues here; it is
quite general.
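
To make the shared-'hi' problem concrete, here is a small arithmetic sketch of how %hi and %lo split a 32-bit address.  The helper names are invented for this note; the formulas are the usual MIPS ones, where %lo is a signed 16-bit value and %hi is rounded to compensate.

```c
#include <stdint.h>

/* %hi(addr): upper 16 bits, rounded so that adding the sign-extended
   %lo(addr) reconstructs the address.  */
static uint32_t
mips_hi (uint32_t addr)
{
  return ((addr + 0x8000u) >> 16) & 0xffffu;
}

/* %lo(addr): low 16 bits, interpreted as a signed 16-bit offset.  */
static int32_t
mips_lo (uint32_t addr)
{
  int32_t lo = (int32_t) (addr & 0xffffu);
  return lo >= 0x8000 ? lo - 0x10000 : lo;
}

/* Nonzero if a register loaded with %hi(base) can also be used to
   address base + delta through a %lo(base + delta) offset.  */
static int
hi_part_shared (uint32_t base, uint32_t delta)
{
  return mips_hi (base) == mips_hi (base + delta);
}
```

For a symbol at 0x17ffc (only 4-byte aligned), the word at +4 needs a different 'hi' part, so reusing the first one would address the wrong location.  For an 8-byte-aligned symbol an offset of 4 can never cross that boundary, which is why the symbol's alignment relative to the access size matters.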

>+   if (mode == BLKmode)
>+ return true;

Why is this special? Does the core GCC code ensure that lo_sum on a BLKmode
cannot have a constant offset greater than alignment?

>+   if (reg && REG_P (reg) && REGNO (reg) == GLOBAL_POINTER_REGNUM)
>+ return true;

I don't think reg needs to be an optional argument; it is available at both
call sites.  A comment to say why offsets from the global pointer are
not affected would also be useful.

>+
>+   if (GET_CODE (x) == CONST
>+   && GET_CODE (XEXP (x, 0)) == PLUS
>+   && GET_CODE (XEXP (XEXP (x, 0), 0)) == SYMBOL_REF)
>+ symbol = XEXP (XEXP (x, 0), 0);
>+   else if (GET_CODE (x) == SYMBOL_REF)
>+ symbol = x;
>+
>+   if (symbol
>+   && SYMBOL_REF_DECL (symbol)
>+   && (GET_MODE_ALIGNMENT (mode) / BITS_PER_UNIT) >
>+ (DECL_ALIGN_UNIT (SYMBOL_REF_DECL (symbol

This needs another bracket to cover the multiline '>' condition.

>+ return false;
>+
>+   return true;
>+}
>+
> /* Return true if X is a valid address for machine mode MODE.  If it is,
>fill in INFO appropriately.  STRICT_P is true if REG_OK_STRICT is in
>effect.  */
>@@ -2394,7 +2426,8 @@ mips_classify_address (struct mips_address_info *info, 
>rtx x,
>   info->symbol_type
>   = mips_classify_symbolic_expression (info->offset, SYMBOL_CONTEXT_MEM);
>   return (mips_valid_base_register_p (info->reg, mode, strict_p)
>-&& mips_valid_lo_sum_p (info->symbol_type, mode));
>+&& mips_valid_lo_sum_p (info->symbol_type, mode)
>+&& mips_valid_lo_sum_lo_part_p (mode, info->reg, info->offset));

As above this can become:

  && mips_valid_lo_sum_p (info->symbol_type, mode, info->reg,
  info->offset)

> 
> case CONST_INT:
>   /* Small-integer addresses don't occur very often, but they
>@@ -3143,6 +3176,8 @@ mips_split_symbol (rtx temp, rtx addr, machine_mode 
>mode, rtx *low_out)
>   high = gen_rtx_HIGH (Pmode, copy_rtx (addr));
>   high = mips_force_temporary (temp, high);
>   *low_out = gen_rtx_LO_SUM (Pmode, high, addr);
>+if (!mips_valid_lo_sum_lo_part_p (mode, NULL, addr))

tab indent.  This can also become:

if (!mips_valid_lo_sum_p (symbol_type, mode, high, addr))

>+*low_out = mips_force_temporary (temp, *low_out);
>   break;
> }
> return true;

General comment on the tests... I don't think any of the '-mfp??' related tests
are actually doing anything with the FPU. All the load/store operations are
happening in GP registers so the FP mode is irrelevant. This makes some
tests redundant.

>diff --git a/gcc/testsuite/gcc.target/mips/hi-lo-reloc-offset1.c 
>b/gcc/testsuite/gcc.target/mips/hi-lo-reloc-offset1.c
>new file mode 100644
>index 000..dc81fd9
>--- /dev/null
>+++ b/gcc/testsuite/gc

Re: [PATCH] Some further XMM16+ improvements

2016-05-05 Thread Kirill Yukhin
Hi Jakub,
On 03 May 20:57, Jakub Jelinek wrote:
> Hi!
> 
> This patch improves code generation e.g. on the first attached testcase
> and allows accepting the second one.
> 
> I've noticed we don't allow TFmode or V1TImode in xmm16+ regs at all,
> while they are allowed in xmm0-xmm15, so IMHO should be ok even with
> AVX512VL.
> 
> Wonder if it wouldn't be better to add a new constraint that would act
> like v constraint for TARGET_AVX512VL and like x constraint otherwise,
> that might greatly simplify the i386.md changes in this patch.
Good idea, I thought about that myself. IMHO this might be a follow up.

> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
> or with some changes?  Haven't figured out how to test the *andnot*
> and ** patterns though.
Are you going to commit testcases?
Yeah, tests for FP *logic* look odd, so I am OK for not having them.

--
Thanks, K
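
For reference, the mnemonic choice the quoted *movtf_internal hunks make can be modelled in plain C as below.  This is a simplified sketch, not the insn output code: tf_move_mnemonic and ext_rex_sse_reg are names made up here, operands are passed as xmm register numbers (-1 for memory), and the MODE_V4SF case and the "%v" prefix handling are left out.

```c
#include <stdbool.h>

/* True for xmm16-xmm31, which can only be encoded with EVEX.  */
static bool
ext_rex_sse_reg (int xmm)
{
  return xmm >= 16 && xmm <= 31;
}

/* Pick the integer-domain move mnemonic for a 128-bit move.  */
static const char *
tf_move_mnemonic (bool aligned, bool avx512vl, int dst_xmm, int src_xmm)
{
  if (avx512vl && (ext_rex_sse_reg (dst_xmm) || ext_rex_sse_reg (src_xmm)))
    return aligned ? "vmovdqa64" : "vmovdqu64";
  return aligned ? "vmovdqa" : "vmovdqu";
}
```

The EVEX-only forms are needed as soon as either operand is one of xmm16-xmm31, since the plain vmovdqa/vmovdqu encodings cannot name those registers.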


> 2016-05-03  Jakub Jelinek  
> 
>   * config/i386/i386.h (VALID_AVX512VL_128_REG_MODE): Allow
>   TFmode and V1TImode in xmm16+ registers for TARGET_AVX512VL.
>   * config/i386/i386.md (avx512fvecmode): New mode attr.
>   (*pushtf): Use v constraint instead of x.
>   (*movtf_internal): Likewise.  For TARGET_AVX512VL and
>   xmm16+ registers, use vmovdqu64 or vmovdqa64 instructions.
>   (*absneg2): Add avx512vl alternatives.
>   (*absnegtf2_sse): Likewise.
>   (copysign3_const, copysign3_var): Likewise.
>   * config/i386/sse.md (*andnot3): Add avx512vl and
>   avx512f alternatives.
>   (*andnottf3, *3, *tf3): Likewise.
> 
> --- gcc/config/i386/i386.h.jj 2016-03-30 16:00:17.0 +0200
> +++ gcc/config/i386/i386.h2016-05-03 15:55:46.656342870 +0200
> @@ -1126,7 +1126,8 @@ extern const char *host_detect_local_cpu
>  
>  #define VALID_AVX512VL_128_REG_MODE(MODE)\
>((MODE) == V2DImode || (MODE) == V2DFmode || (MODE) == V16QImode   \
> -   || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode)
> +   || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode \
> +   || (MODE) == TFmode || (MODE) == V1TImode)
>  
>  #define VALID_SSE2_REG_MODE(MODE)\
>((MODE) == V16QImode || (MODE) == V8HImode || (MODE) == V2DFmode   \
> --- gcc/config/i386/i386.md.jj2016-05-03 14:16:14.0 +0200
> +++ gcc/config/i386/i386.md   2016-05-03 17:13:46.643545826 +0200
> @@ -1165,6 +1165,10 @@ (define_mode_attr ssevecmode
>  (define_mode_attr ssevecmodelower
>[(QI "v16qi") (HI "v8hi") (SI "v4si") (DI "v2di") (SF "v4sf") (DF "v2df")])
>  
> +;; AVX512F vector mode corresponding to a scalar mode
> +(define_mode_attr avx512fvecmode
> +  [(QI "V64QI") (HI "V32HI") (SI "V16SI") (DI "V8DI") (SF "V16SF") (DF 
> "V8DF")])
> +
>  ;; Instruction suffix for REX 64bit operators.
>  (define_mode_attr rex64suffix [(SI "") (DI "{q}")])
>  
> @@ -2928,7 +2932,7 @@ (define_insn "*insvqi"
>  
>  (define_insn "*pushtf"
>[(set (match_operand:TF 0 "push_operand" "=<,<")
> - (match_operand:TF 1 "general_no_elim_operand" "x,*roF"))]
> + (match_operand:TF 1 "general_no_elim_operand" "v,*roF"))]
>"TARGET_64BIT || TARGET_SSE"
>  {
>/* This insn should be already split before reg-stack.  */
> @@ -3107,8 +3111,8 @@ (define_expand "mov"
>"ix86_expand_move (mode, operands); DONE;")
>  
>  (define_insn "*movtf_internal"
> -  [(set (match_operand:TF 0 "nonimmediate_operand" "=x,x ,m,?*r ,!o")
> - (match_operand:TF 1 "general_operand"  "C ,xm,x,*roF,*rC"))]
> +  [(set (match_operand:TF 0 "nonimmediate_operand" "=v,v ,m,?*r ,!o")
> + (match_operand:TF 1 "general_operand"  "C ,vm,v,*roF,*rC"))]
>"(TARGET_64BIT || TARGET_SSE)
> && !(MEM_P (operands[0]) && MEM_P (operands[1]))
> && (!can_create_pseudo_p ()
> @@ -3133,6 +3137,10 @@ (define_insn "*movtf_internal"
>   {
> if (get_attr_mode (insn) == MODE_V4SF)
>   return "%vmovups\t{%1, %0|%0, %1}";
> +   else if (TARGET_AVX512VL
> +&& (EXT_REX_SSE_REG_P (operands[0])
> +|| EXT_REX_SSE_REG_P (operands[1])))
> + return "vmovdqu64\t{%1, %0|%0, %1}";
> else
>   return "%vmovdqu\t{%1, %0|%0, %1}";
>   }
> @@ -3140,6 +3148,10 @@ (define_insn "*movtf_internal"
>   {
> if (get_attr_mode (insn) == MODE_V4SF)
>   return "%vmovaps\t{%1, %0|%0, %1}";
> +   else if (TARGET_AVX512VL
> +&& (EXT_REX_SSE_REG_P (operands[0])
> +|| EXT_REX_SSE_REG_P (operands[1])))
> + return "vmovdqa64\t{%1, %0|%0, %1}";
> else
>   return "%vmovdqa\t{%1, %0|%0, %1}";
>   }
> @@ -9253,10 +9265,10 @@ (define_expand "2"
>"ix86_expand_fp_absneg_operator (, mode, operands); DONE;")
>  
>  (define_insn "*absneg2"
> -  [(set (match_operand:MODEF 0 "register_operand" "=x,x,f,!r")
> +  [(set (match_operand:MODEF 0 "register_operand" "=x,x,v,v,f,!r")
>   (match_operator:MODEF 3 "absneg_oper

Re: [PATCH] Some further XMM16+ improvements

2016-05-05 Thread Jakub Jelinek
On Thu, May 05, 2016 at 12:49:57PM +0300, Kirill Yukhin wrote:
> Hi Jakub,
> On 03 May 20:57, Jakub Jelinek wrote:
> > This patch improves code generation e.g. on the first attached testcase
> > and allows accepting the second one.
> > 
> > I've noticed we don't allow TFmode or V1TImode in xmm16+ regs at all,
> > while they are allowed in xmm0-xmm15, so IMHO should be ok even with
> > AVX512VL.
> > 
> > Wonder if it wouldn't be better to add a new constraint that would act
> > like v constraint for TARGET_AVX512VL and like x constraint otherwise,
> > that might greatly simplify the i386.md changes in this patch.
> Good idea, I thought about that myself. IMHO this might be a follow up.

Ok, will add that to todo.

> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
> > or with some changes?  Haven't figured out how to test the *andnot*
> > and ** patterns though.
> Are you going to commit testcases?

I didn't mean to in this case, but guess I could (as for the other patches,
dg-do assemble only, I think trying to scan the assembly might be too fragile,
it is up to the RA to decide).

> Yeah, tests for FP *logic* look odd, so I am OK for not having them.

So, is the patch ok for trunk with the two testcases turned into
dg-do assemble tests, or do you want me to repost with that, or add the
Yv constraint right away, something else?

Jakub


Re: [PATCH] Improve _fmadd__mask3

2016-05-05 Thread Kirill Yukhin
Hello Jakub,
On 04 May 21:31, Jakub Jelinek wrote:
> Hi!
> 
> As the testcase can show, we should be using v constraint and generate
> better code that way.
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2016-05-04  Jakub Jelinek  
> 
>   * config/i386/sse.md (_fmadd__mask3): Use
>   v constraint instead of x.
> 
>   * gcc.target/i386/avx512f-vfmadd-1.c: New test.
Didn't get what the test checks?
It works fine w/o patch (generating extra moves though)
Maybe scan-asm that xmm{16,17,18} actually hit FMA?

--
Thanks, K

> 
> --- gcc/config/i386/sse.md.jj 2016-05-04 14:36:08.0 +0200
> +++ gcc/config/i386/sse.md2016-05-04 15:16:44.180894303 +0200
> @@ -3327,10 +3327,10 @@ (define_insn "_fmadd__mask
> (set_attr "mode" "")])
>  
>  (define_insn "_fmadd__mask3"
> -  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=x")
> +  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
>   (vec_merge:VF_AVX512VL
> (fma:VF_AVX512VL
> - (match_operand:VF_AVX512VL 1 "register_operand" "x")
> + (match_operand:VF_AVX512VL 1 "register_operand" "v")
>   (match_operand:VF_AVX512VL 2 "nonimmediate_operand" 
> "")
>   (match_operand:VF_AVX512VL 3 "register_operand" "0"))
> (match_dup 3)
> --- gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c.jj   2016-05-04 
> 15:35:54.919506742 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c  2016-05-04 
> 15:36:08.648326113 +0200
> @@ -0,0 +1,24 @@
> +/* { dg-do assemble { target { avx512f && { ! ia32 } } } } */
> +/* { dg-options "-O2 -mavx512f" } */
> +
> +#include 
> +
> +void
> +f1 (__m512d x, __m512d y, __m512d z, __mmask8 m)
> +{
> +  register __m512d a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
> +  a = x; b = y; c = z;
> +  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
> +  a = _mm512_mask3_fmadd_round_pd (c, b, a, m, _MM_FROUND_TO_NEG_INF | 
> _MM_FROUND_NO_EXC);
> +  asm volatile ("" : "+v" (a));
> +}
> +
> +void
> +f2 (__m512 x, __m512 y, __m512 z, __mmask8 m)
> +{
> +  register __m512 a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
> +  a = x; b = y; c = z;
> +  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
> +  a = _mm512_mask3_fmadd_round_ps (c, b, a, m, _MM_FROUND_TO_NEG_INF | 
> _MM_FROUND_NO_EXC);
> +  asm volatile ("" : "+v" (a));
> +}
> 
>   Jakub


Re: [patch] libstdc++/69703 ignore endianness in codecvt_utf8

2016-05-05 Thread Jonathan Wakely

On 04/05/16 17:19 +0100, Andre Vieira (lists) wrote:
> On 20/04/16 18:40, Jonathan Wakely wrote:
>> On 19/04/16 19:07 +0100, Jonathan Wakely wrote:
>>> This was reported as a bug in the Filesystem library, but it's
>>> actually a problem in the codecvt_utf8 facet that it uses.
>>
>> The fix had a silly typo meaning it didn't work for big endian
>> targets, which was revealed by the improved tests I added.
>>
>> Tested x86_64-linux and powerpc64-linux, committed to trunk.
>
> Hi Jonathan,
>
> We are seeing experimental/filesystem/path/native/string.cc fail on
> baremetal targets. I'm guessing this is missing a
> 'dg-require-filesystem-ts', as seen on other tests like
> experimental/filesystem/path/modifiers/swap.cc.
>
> Cheers,
> Andre

Sorry about that, I've committed the missing directive.


commit 775381d7e53d5a68a2725b6b72f081d254d9380b
Author: Jonathan Wakely 
Date:   Thu May 5 10:42:35 2016 +0100

Add dg-require-filesystem-ts directive to test

	* testsuite/experimental/filesystem/path/native/string.cc: Add
	dg-require-filesystem-ts directive.

diff --git a/libstdc++-v3/testsuite/experimental/filesystem/path/native/string.cc b/libstdc++-v3/testsuite/experimental/filesystem/path/native/string.cc
index 05ff57c..e56fda7 100644
--- a/libstdc++-v3/testsuite/experimental/filesystem/path/native/string.cc
+++ b/libstdc++-v3/testsuite/experimental/filesystem/path/native/string.cc
@@ -16,6 +16,7 @@
 // .
 
 // { dg-options "-std=gnu++11 -lstdc++fs" }
+// { dg-require-filesystem-ts "" }
 
 #include 
 #include 


Re: [PATCH] Some further XMM16+ improvements

2016-05-05 Thread Kirill Yukhin
On 05 May 11:56, Jakub Jelinek wrote:
> On Thu, May 05, 2016 at 12:49:57PM +0300, Kirill Yukhin wrote:
> > Hi Jakub,
> > On 03 May 20:57, Jakub Jelinek wrote:
> > > This patch improves code generation e.g. on the first attached testcase
> > > and allows accepting the second one.
> > > 
> > > I've noticed we don't allow TFmode or V1TImode in xmm16+ regs at all,
> > > while they are allowed in xmm0-xmm15, so IMHO should be ok even with
> > > AVX512VL.
> > > 
> > > Wonder if it wouldn't be better to add a new constraint that would act
> > > like v constraint for TARGET_AVX512VL and like x constraint otherwise,
> > > that might greatly simplify the i386.md changes in this patch.
> > Good idea, I thought about that myself. IMHO this might be a follow up.
> 
> Ok, will add that to todo.
> 
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk,
> > > or with some changes?  Haven't figured out how to test the *andnot*
> > > and ** patterns though.
> > Are you going to commit testcases?
> 
> I didn't mean to in this case, but guess I could (as for the other patches,
> dg-do assemble only, I think trying to scan the assembly might be too fragile,
> it is up to the RA to decide).
> 
> > Yeah, tests for FP *logic* look odd, so I am OK for not having them.
> 
> So, is the patch ok for trunk with the two testcases turned into
> dg-do assemble tests, or do you want me to repost with that, or add the
> Yv constraint right away, something else?
Nope. Patch is pre-OK. Thanks!
> 
>   Jakub

--
K


Re: [C++ PATCH] PR c++/69855

2016-05-05 Thread Paolo Carlini

.. minor nit: the new testcase has a number of trailing blank lines.

Paolo.


Re: [PATCH] Some further XMM16+ improvements

2016-05-05 Thread Jakub Jelinek
On Thu, May 05, 2016 at 01:34:07PM +0300, Kirill Yukhin wrote:
> > So, is the patch ok for trunk with the two testcases turned into
> > dg-do assemble tests, or do you want me to repost with that, or add the
> > Yv constraint right away, something else?
> Nope. Patch is pre-OK. Thanks!

Actually, it isn't all that hard to add the new constraint and use it.
So here is a so far untested patch, though because of the scan-assembler
it depends on the PR target/70927 fix.
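
As a reading aid for the Yv definition in the patch below, the class selection can be modelled like this.  It is a sketch only: the enum and function names are invented, and it assumes SSE_REGS covers xmm0-xmm15 while ALL_SSE_REGS covers xmm0-xmm31.

```c
#include <stdbool.h>

enum sse_class_model
{
  MODEL_NO_REGS,        /* No register allowed.  */
  MODEL_SSE_REGS,       /* xmm0-xmm15 (assumed).  */
  MODEL_ALL_SSE_REGS    /* xmm0-xmm31 (assumed).  */
};

/* Mirrors "TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS".  */
static enum sse_class_model
yv_class (bool target_sse, bool target_avx512vl)
{
  return target_avx512vl ? MODEL_ALL_SSE_REGS
         : target_sse ? MODEL_SSE_REGS
         : MODEL_NO_REGS;
}

/* Would register xmmN satisfy the modelled constraint?  */
static bool
yv_allows_xmm (int xmm, bool target_sse, bool target_avx512vl)
{
  switch (yv_class (target_sse, target_avx512vl))
    {
    case MODEL_ALL_SSE_REGS:
      return xmm >= 0 && xmm <= 31;
    case MODEL_SSE_REGS:
      return xmm >= 0 && xmm <= 15;
    default:
      return false;
    }
}
```

In other words, Yv behaves like v when AVX512VL is available and like x otherwise, which is what lets a single alternative replace the separate x and v ones.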

2016-05-03  Jakub Jelinek  

* config/i386/constraints.md (Yv): New constraint.
* config/i386/i386.h (VALID_AVX512VL_128_REG_MODE): Allow
TFmode and V1TImode in xmm16+ registers for TARGET_AVX512VL.
* config/i386/i386.md (avx512fvecmode): New mode attr.
(*pushtf): Use v constraint instead of x.
(*movtf_internal): Likewise.  For TARGET_AVX512VL and
xmm16+ registers, use vmovdqu64 or vmovdqa64 instructions.
(*absneg2): Use Yv constraint instead of x constraint.
(*absnegtf2_sse): Likewise.
(copysign3_const, copysign3_var): Likewise.
* config/i386/sse.md (*andnot3): Add avx512vl and
avx512f alternatives.
(*andnottf3, *3, *tf3): Likewise.

* gcc.target/i386/avx512dq-abs-copysign-1.c: New test.
* gcc.target/i386/avx512vl-abs-copysign-1.c: New test.
* gcc.target/i386/avx512vl-abs-copysign-2.c: New test.

--- gcc/config/i386/constraints.md.jj   2016-05-03 13:44:31.0 +0200
+++ gcc/config/i386/constraints.md  2016-05-05 12:03:50.197071618 +0200
@@ -145,6 +145,10 @@ (define_register_constraint "Yr"
  "TARGET_SSE ? (X86_TUNE_AVOID_4BYTE_PREFIXES ? NO_REX_SSE_REGS : 
ALL_SSE_REGS) : NO_REGS"
  "@internal Lower SSE register when avoiding REX prefix and all SSE registers 
otherwise.")
 
+(define_register_constraint "Yv"
+ "TARGET_AVX512VL ? ALL_SSE_REGS : TARGET_SSE ? SSE_REGS : NO_REGS"
+ "@internal For AVX512VL, any EVEX encodable SSE register 
(@code{%xmm0-%xmm31}), otherwise any SSE register.")
+
 ;; We use the B prefix to denote any number of internal operands:
 ;;  f  FLAGS_REG
 ;;  g  GOT memory operand.
--- gcc/config/i386/i386.h.jj   2016-05-03 21:27:41.253864955 +0200
+++ gcc/config/i386/i386.h  2016-05-05 12:04:06.627852607 +0200
@@ -1126,7 +1126,8 @@ extern const char *host_detect_local_cpu
 
 #define VALID_AVX512VL_128_REG_MODE(MODE)  \
   ((MODE) == V2DImode || (MODE) == V2DFmode || (MODE) == V16QImode \
-   || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode)
+   || (MODE) == V4SImode || (MODE) == V4SFmode || (MODE) == V8HImode   \
+   || (MODE) == TFmode || (MODE) == V1TImode)
 
 #define VALID_SSE2_REG_MODE(MODE)  \
   ((MODE) == V16QImode || (MODE) == V8HImode || (MODE) == V2DFmode \
--- gcc/config/i386/i386.md.jj  2016-05-03 21:27:45.560807504 +0200
+++ gcc/config/i386/i386.md 2016-05-05 12:13:21.355458467 +0200
@@ -1165,6 +1165,10 @@ (define_mode_attr ssevecmode
 (define_mode_attr ssevecmodelower
   [(QI "v16qi") (HI "v8hi") (SI "v4si") (DI "v2di") (SF "v4sf") (DF "v2df")])
 
+;; AVX512F vector mode corresponding to a scalar mode
+(define_mode_attr avx512fvecmode
+  [(QI "V64QI") (HI "V32HI") (SI "V16SI") (DI "V8DI") (SF "V16SF") (DF 
"V8DF")])
+
 ;; Instruction suffix for REX 64bit operators.
 (define_mode_attr rex64suffix [(SI "") (DI "{q}")])
 
@@ -2928,7 +2932,7 @@ (define_insn "*insvqi"
 
 (define_insn "*pushtf"
   [(set (match_operand:TF 0 "push_operand" "=<,<")
-   (match_operand:TF 1 "general_no_elim_operand" "x,*roF"))]
+   (match_operand:TF 1 "general_no_elim_operand" "v,*roF"))]
   "TARGET_64BIT || TARGET_SSE"
 {
   /* This insn should be already split before reg-stack.  */
@@ -3107,8 +3111,8 @@ (define_expand "mov"
   "ix86_expand_move (mode, operands); DONE;")
 
 (define_insn "*movtf_internal"
-  [(set (match_operand:TF 0 "nonimmediate_operand" "=x,x ,m,?*r ,!o")
-   (match_operand:TF 1 "general_operand"  "C ,xm,x,*roF,*rC"))]
+  [(set (match_operand:TF 0 "nonimmediate_operand" "=v,v ,m,?*r ,!o")
+   (match_operand:TF 1 "general_operand"  "C ,vm,v,*roF,*rC"))]
   "(TARGET_64BIT || TARGET_SSE)
&& !(MEM_P (operands[0]) && MEM_P (operands[1]))
&& (!can_create_pseudo_p ()
@@ -3133,6 +3137,10 @@ (define_insn "*movtf_internal"
{
  if (get_attr_mode (insn) == MODE_V4SF)
return "%vmovups\t{%1, %0|%0, %1}";
+ else if (TARGET_AVX512VL
+  && (EXT_REX_SSE_REG_P (operands[0])
+  || EXT_REX_SSE_REG_P (operands[1])))
+   return "vmovdqu64\t{%1, %0|%0, %1}";
  else
return "%vmovdqu\t{%1, %0|%0, %1}";
}
@@ -3140,6 +3148,10 @@ (define_insn "*movtf_internal"
{
  if (get_attr_mode (insn) == MODE_V4SF)
return "%vmovaps\t{%1, %0|%0, %1}";
+ else if (TARGET_AVX512VL
+  && (EXT_REX_SSE_REG_P (operands[0])
+  || E

Re: [PATCH] Improve _fmadd__mask3

2016-05-05 Thread Jakub Jelinek
On Thu, May 05, 2016 at 01:01:39PM +0300, Kirill Yukhin wrote:
> Hello Jakub,
> On 04 May 21:31, Jakub Jelinek wrote:
> > Hi!
> > 
> > As the testcase can show, we should be using v constraint and generate
> > better code that way.
> > 
> > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > 
> > 2016-05-04  Jakub Jelinek  
> > 
> > * config/i386/sse.md (_fmadd__mask3): Use
> > v constraint instead of x.
> > 
> > * gcc.target/i386/avx512f-vfmadd-1.c: New test.
> Didn't get what the test checks?
> It works fine w/o patch (generating extra moves though)
> Maybe scan-asm that xmm{16,17,18} actually hit FMA?

Like this?

2016-05-05  Jakub Jelinek  

* config/i386/sse.md (_fmadd__mask3): Use
v constraint instead of x.

* gcc.target/i386/avx512f-vfmadd-1.c: New test.

--- gcc/config/i386/sse.md.jj   2016-05-05 12:57:58.772804841 +0200
+++ gcc/config/i386/sse.md  2016-05-05 12:58:06.875697073 +0200
@@ -3409,10 +3409,10 @@ (define_insn "_fmadd__mask
(set_attr "mode" "")])
 
 (define_insn "_fmadd__mask3"
-  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=x")
+  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
(vec_merge:VF_AVX512VL
  (fma:VF_AVX512VL
-   (match_operand:VF_AVX512VL 1 "register_operand" "x")
+   (match_operand:VF_AVX512VL 1 "register_operand" "v")
(match_operand:VF_AVX512VL 2 "nonimmediate_operand" 
"")
(match_operand:VF_AVX512VL 3 "register_operand" "0"))
  (match_dup 3)
--- gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c.jj 2016-05-05 
12:58:06.876697060 +0200
+++ gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c2016-05-05 
13:31:03.123435963 +0200
@@ -0,0 +1,27 @@
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mavx512f" } */
+
+#include 
+
+void
+f1 (__m512d x, __m512d y, __m512d z, __mmask8 m)
+{
+  register __m512d a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
+  a = x; b = y; c = z;
+  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
+  a = _mm512_mask3_fmadd_round_pd (c, b, a, m, _MM_FROUND_TO_NEG_INF | 
_MM_FROUND_NO_EXC);
+  asm volatile ("" : "+v" (a));
+}
+
+void
+f2 (__m512 x, __m512 y, __m512 z, __mmask8 m)
+{
+  register __m512 a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
+  a = x; b = y; c = z;
+  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
+  a = _mm512_mask3_fmadd_round_ps (c, b, a, m, _MM_FROUND_TO_NEG_INF | 
_MM_FROUND_NO_EXC);
+  asm volatile ("" : "+v" (a));
+}
+
+/* { dg-final { scan-assembler "vfmadd231pd\[^\n\r\]*zmm16" } } */
+/* { dg-final { scan-assembler "vfmadd231ps\[^\n\r\]*zmm16" } } */


Jakub
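
What the two scan-assembler patterns above require is that the register appears on the same line as the mnemonic.  A rough C model of that check follows; the real directive uses a Tcl regular expression, and line_has_insn_with_reg is a name invented for this sketch.

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if some line of ASM_TEXT contains MNEMONIC followed, later on
   the same line, by REG.  Models "MNEMONIC[^\n\r]*REG".  */
static int
line_has_insn_with_reg (const char *asm_text, const char *mnemonic,
                        const char *reg)
{
  size_t mlen = strlen (mnemonic);
  size_t rlen = strlen (reg);
  const char *p = asm_text;

  if (mlen == 0)
    return 0;

  while ((p = strstr (p, mnemonic)) != NULL)
    {
      const char *rest = p + mlen;
      size_t span = strcspn (rest, "\r\n");  /* Length up to end of line.  */
      size_t i;

      for (i = 0; i + rlen <= span; i++)
        if (strncmp (rest + i, reg, rlen) == 0)
          return 1;
      p = rest;
    }
  return 0;
}
```

An fma on zmm0-zmm2 followed by a separate move into zmm16 would therefore not satisfy the check, which is the extra-moves situation Kirill described.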


[PATCH][ARM] PR target/70830: Avoid POP-{reglist}^ when returning from interrupt handlers

2016-05-05 Thread Kyrill Tkachov

Hi all,

In this PR we deal with some fallout from the conversion to unified assembly.
We now end up emitting instructions like:
  pop {r0,r1,r2,r3,pc}^
which is not legal. We have to use an LDM form.

There are bugs in two arm.c functions: output_return_instruction and 
arm_output_multireg_pop.

In output_return_instruction the buggy hunk from the conversion was:
  else
-   if (TARGET_UNIFIED_ASM)
  sprintf (instr, "pop%s\t{", conditional);
-   else
- sprintf (instr, "ldm%sfd\t%%|sp!, {", conditional);

The code was already very obscurely structured and arguably the bug was latent.
It emitted POP only when TARGET_UNIFIED_ASM was on, and since
TARGET_UNIFIED_ASM was on only for Thumb, we never went down this path in
interrupt handling code, since the interrupt attribute is only available for
ARM code.  After the removal of TARGET_UNIFIED_ASM we ended up using POP
unconditionally.  So this patch adds a check for IS_INTERRUPT and outputs the
appropriate LDM form.

In arm_output_multireg_pop the buggy hunk was:
-  if ((regno_base == SP_REGNUM) && TARGET_THUMB)
+  if ((regno_base == SP_REGNUM) && update)
 {
-  /* Output pop (not stmfd) because it has a shorter encoding.  */
-  gcc_assert (update);
   sprintf (pattern, "pop%s\t{", conditional);
 }

Again, the POP was guarded on TARGET_THUMB and so would never be taken on
interrupt handling routines.  This patch guards that with the appropriate check
on interrupt return.

Also, there are a couple of bugs in the 'else' branch of that 'if':
* The "ldmfd%s" was output without a '\t' at the end, which meant that the base
register name would be concatenated with the 'ldmfd', creating invalid assembly.

* The logic:

  if (regno_base == SP_REGNUM)
  /* update is never true here, hence there is no need to handle
 pop here.  */
sprintf (pattern, "ldmfd%s", conditional);

  if (update)
sprintf (pattern, "ldmia%s\t", conditional);
  else
sprintf (pattern, "ldm%s\t", conditional);

meant that for "regno_base == SP_REGNUM && !update" we'd end up printing
"ldmfd%sldm%s\t" to pattern.  I didn't manage to reproduce that condition
though, so maybe it can't ever occur.
This patch fixes both these issues nevertheless.

I've added the testcase from the PR to catch the fix in
output_return_instruction.  The testcase doesn't catch the bugs in
arm_output_multireg_pop, but the existing tests gcc.target/arm/interrupt-1.c
and gcc.target/arm/interrupt-2.c would have caught them if only they were
assemble tests rather than just compile.  So this patch makes them assemble
tests (and reverts the scan-assembler checks to scan for the correct LDM
pattern).

Bootstrapped and tested on arm-none-linux-gnueabihf.
Ok for trunk and GCC 6?

Thanks,
Kyrill
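
The mnemonic selection that the arm_output_multireg_pop change aims for can be summarised with the following model.  It is a sketch under assumptions: multireg_pop_mnemonic is a made-up name, the condition suffix, base register and writeback "!" are not modelled, and the final plain "ldm" case is taken from the description above rather than from the visible part of the diff.

```c
#include <stdbool.h>

/* Which mnemonic to start the pattern with.  */
static const char *
multireg_pop_mnemonic (bool base_is_sp, bool update, bool interrupt_p,
                       bool return_pc)
{
  /* Can't use POP if returning from an interrupt.  */
  if (base_is_sp && !(interrupt_p && return_pc))
    return "pop";

  /* ldmfd for an SP base, otherwise ldmia/ldm; the semantics are the
     same, the spelling is just a convention.  */
  if (base_is_sp)
    return "ldmfd";
  return update ? "ldmia" : "ldm";
}
```

The only case that changes for interrupt handlers is an SP-based pop that also restores the PC, which falls through to the LDM form.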

2016-05-05  Kyrylo Tkachov  

PR target/70830
* config/arm/arm.c (arm_output_multireg_pop): Avoid POP instruction
when popping the PC and within an interrupt handler routine.
Add missing tab to output of "ldmfd".
(output_return_instruction): Output LDMFD with SP update rather
than POP when returning from interrupt handler.

2016-05-05  Kyrylo Tkachov  

PR target/70830
* gcc.target/arm/interrupt-1.c: Change dg-compile to dg-assemble.
Add -save-temps to dg-options.
Scan for ldmfd rather than pop instruction.
* gcc.target/arm/interrupt-2.c: Likewise.
* gcc.target/arm/pr70830.c: New test.
diff --git a/gcc/config/arm/arm.c b/gcc/config/arm/arm.c
index 5136bc0171c8cb1f670096cae93037cf4611c4c5..2f1de2bcb7d69889eb080e338f8e939cc038b63b 100644
--- a/gcc/config/arm/arm.c
+++ b/gcc/config/arm/arm.c
@@ -17582,6 +17582,7 @@ arm_output_multireg_pop (rtx *operands, bool return_pc, rtx cond, bool reverse,
   int num_saves = XVECLEN (operands[0], 0);
   unsigned int regno;
   unsigned int regno_base = REGNO (operands[1]);
+  bool interrupt_p = IS_INTERRUPT (arm_current_func_type ());
 
   offset = 0;
   offset += update ? 1 : 0;
@@ -17599,7 +17600,8 @@ arm_output_multireg_pop (rtx *operands, bool return_pc, rtx cond, bool reverse,
 }
 
   conditional = reverse ? "%?%D0" : "%?%d0";
-  if ((regno_base == SP_REGNUM) && update)
+  /* Can't use POP if returning from an interrupt.  */
+  if ((regno_base == SP_REGNUM) && !(interrupt_p && return_pc))
 {
   sprintf (pattern, "pop%s\t{", conditional);
 }
@@ -17608,11 +17610,8 @@ arm_output_multireg_pop (rtx *operands, bool return_pc, rtx cond, bool reverse,
   /* Output ldmfd when the base register is SP, otherwise output ldmia.
  It's just a convention, their semantics are identical.  */
   if (regno_base == SP_REGNUM)
-	  /* update is never true here, hence there is no need to handle
-	 pop here.  */
-	sprintf (pattern, "ldmfd%s", conditional);
-
-  if (update)
+	sprintf (pattern, "ldmfd%s\t", conditional);
+  else if (update)
 	sprintf (pattern, "ldmia%s\t", conditional);
   else
 	sprintf (pattern, "ldm%s\t", conditional);
@@ -1763

RE: [PATCH] MIPS: In mips_print_address_operand pass the mode argument to mips_classify_address

2016-05-05 Thread Matthew Fortune
Andrew Bennett  writes:
> gcc/
>   * config/mips/mips.c (mips_print_operand_address): Pass the mode
> argument to
>   mips_classify_address.

Changelog content should wrap at 74 chars.

> I have tested the patch on the mips-mti-elf toolchain and there have
> been no regressions.

OK, if a linux testsuite run is also clean.  I do not expect any changes
in output from this change.

Thanks,
Matthew


Re: [PATCH] Improve _fmadd__mask3

2016-05-05 Thread Kirill Yukhin

On 05 May 13:33, Jakub Jelinek wrote:
> On Thu, May 05, 2016 at 01:01:39PM +0300, Kirill Yukhin wrote:
> > Hello Jakub,
> > On 04 May 21:31, Jakub Jelinek wrote:
> > > Hi!
> > > 
> > > As the testcase can show, we should be using v constraint and generate
> > > better code that way.
> > > 
> > > Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> > > 
> > > 2016-05-04  Jakub Jelinek  
> > > 
> > >   * config/i386/sse.md (_fmadd__mask3): Use
> > >   v constraint instead of x.
> > > 
> > >   * gcc.target/i386/avx512f-vfmadd-1.c: New test.
> > Didn't get what the test checks?
> > It works fine w/o patch (generating extra moves though)
> > Maybe scan-asm that xmm{16,17,18} actually hit FMA?
> 
> Like this?
Yeah.  OK for trunk.
> 
> 2016-05-05  Jakub Jelinek  
> 
>   * config/i386/sse.md (_fmadd__mask3): Use
>   v constraint instead of x.
> 
>   * gcc.target/i386/avx512f-vfmadd-1.c: New test.
> 
> --- gcc/config/i386/sse.md.jj 2016-05-05 12:57:58.772804841 +0200
> +++ gcc/config/i386/sse.md	2016-05-05 12:58:06.875697073 +0200
> @@ -3409,10 +3409,10 @@ (define_insn "_fmadd__mask
> (set_attr "mode" "")])
>  
>  (define_insn "_fmadd__mask3"
> -  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=x")
> +  [(set (match_operand:VF_AVX512VL 0 "register_operand" "=v")
>   (vec_merge:VF_AVX512VL
> (fma:VF_AVX512VL
> - (match_operand:VF_AVX512VL 1 "register_operand" "x")
> + (match_operand:VF_AVX512VL 1 "register_operand" "v")
>   (match_operand:VF_AVX512VL 2 "nonimmediate_operand" 
> "")
>   (match_operand:VF_AVX512VL 3 "register_operand" "0"))
> (match_dup 3)
> --- gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c.jj	2016-05-05 12:58:06.876697060 +0200
> +++ gcc/testsuite/gcc.target/i386/avx512f-vfmadd-1.c	2016-05-05 13:31:03.123435963 +0200
> @@ -0,0 +1,27 @@
> +/* { dg-do compile { target { ! ia32 } } } */
> +/* { dg-options "-O2 -mavx512f" } */
> +
> +#include 
> +
> +void
> +f1 (__m512d x, __m512d y, __m512d z, __mmask8 m)
> +{
> +  register __m512d a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
> +  a = x; b = y; c = z;
> +  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
> +  a = _mm512_mask3_fmadd_round_pd (c, b, a, m, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
> +  asm volatile ("" : "+v" (a));
> +}
> +
> +void
> +f2 (__m512 x, __m512 y, __m512 z, __mmask8 m)
> +{
> +  register __m512 a __asm ("xmm16"), b __asm ("xmm17"), c __asm ("xmm18");
> +  a = x; b = y; c = z;
> +  asm volatile ("" : "+v" (a), "+v" (b), "+v" (c));
> +  a = _mm512_mask3_fmadd_round_ps (c, b, a, m, _MM_FROUND_TO_NEG_INF | _MM_FROUND_NO_EXC);
> +  asm volatile ("" : "+v" (a));
> +}
> +
> +/* { dg-final { scan-assembler "vfmadd231pd\[^\n\r\]*zmm16" } } */
> +/* { dg-final { scan-assembler "vfmadd231ps\[^\n\r\]*zmm16" } } */
> 
> 
>   Jakub


[PATCH, i386, AVX-512] Fix sse-14.c (Intel assembly)

2016-05-05 Thread Petr Murzin
Hello,

The attached patch fixes sse-14.c to compile with -masm=intel.
Bootstrapped. No regressions detected.

Please have a look. Is it ok for trunk?

2016-05-05  Petr Murzin  

gcc/
* config/i386/sse.md: Use proper operand modifiers.
* config/i386/i386.c (ix86_print_operand): Expand check for size
override codes for Intel syntax.

Thanks,
Petr


fix_intel_syntax.patch
Description: Binary data


Re: Enabling -frename-registers?

2016-05-05 Thread Wilco Dijkstra
Ramana Radhakrishnan wrote:
>
> Can you file a bugzilla entry with a testcase that folks can look at please ?

I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961. Unfortunately
I don't have a simple testcase that I can share.

Wilco



Re: [C++ PATCH] PR c++/69855

2016-05-05 Thread Ville Voutilainen
On 5 May 2016 at 13:36, Paolo Carlini  wrote:
> .. minor nit: the new testcase has a number of trailing blank lines.


New patch attached. :)


69855.diff5
Description: Binary data


Re: [PING][PATCH] New plugin event when evaluating a constexpr call

2016-05-05 Thread Andres Tiraboschi
Hi,
thanks for the feedback, I'll do the changes.

2016-05-04 13:16 GMT-03:00 Jason Merrill :
> On 05/02/2016 03:28 PM, Andres Tiraboschi wrote:
>>
>> +  constexpr_call_info call_info;
>> +  call_info.function = t;
>> +  call_info.call_stack = call_stack;
>> +  call_info.ctx = ctx;
>> +  call_info.lval_p = lval;
>> +  call_info.non_constant_p = non_constant_p;
>> +  call_info.overflow_p = overflow_p;
>> +  call_info.result = NULL_TREE;
>> +
>> +  invoke_plugin_callbacks (PLUGIN_EVAL_CALL_CONSTEXPR, &call_info);
>
>
> Let's move this into a separate function so that it doesn't increase the
> stack footprint of cxx_eval_call_expression.
>
> Jason
>


Re: [PATCH #3], Fix _Complex when there are multiple FP types the same size

2016-05-05 Thread Jakub Jelinek
On Mon, May 02, 2016 at 05:10:24PM -0400, Michael Meissner wrote:
> [gcc/fortran]
> 2016-05-02  Michael Meissner  
> 
>   * trans-types.c (gfc_build_complex_type):

Something missing above...

Jakub


[committed] Improve gfc_match_omp_clauses

2016-05-05 Thread Jakub Jelinek
Hi!

Before trying to add another 11 new clauses (for OpenMP 4.5), I realized
that we already have far too many to try matching all of them each time;
while the matchers are guarded with mask & something, with >= 58 clauses even
that is not very effective, and it is hardly readable either.

So, this patch reworks them the way the Fortran FE does elsewhere: peek at
the first char of the clause, switch on that letter, and only try the
clauses starting with it.  Additionally, I've sorted the clauses
alphabetically within each switch case.
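
To illustrate the idea outside the front end, here is a minimal sketch of
first-letter dispatch over a handful of clause names.  It assumes nothing
about the real matcher: enum clause, starts_with and match_clause are
hypothetical names, and the real code uses gfc_peek_ascii_char and
gfc_match rather than plain string functions.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical clause identifiers, for illustration only.  */
enum clause
{
  CLAUSE_NONE,
  CLAUSE_ASYNC,
  CLAUSE_FINAL,
  CLAUSE_FIRSTPRIVATE,
  CLAUSE_IF,
  CLAUSE_PRIVATE
};

/* Nonzero if S begins with keyword KW.  */
static int
starts_with (const char *s, const char *kw)
{
  return strncmp (s, kw, strlen (kw)) == 0;
}

/* Peek at the first character and only try the keywords that can start
   with it.  Keywords within each case are kept in alphabetical order.  */
static enum clause
match_clause (const char *s)
{
  switch (s[0])
    {
    case 'a':
      if (starts_with (s, "async"))
        return CLAUSE_ASYNC;
      break;
    case 'f':
      if (starts_with (s, "final"))
        return CLAUSE_FINAL;
      if (starts_with (s, "firstprivate"))
        return CLAUSE_FIRSTPRIVATE;
      break;
    case 'i':
      if (starts_with (s, "if"))
        return CLAUSE_IF;
      break;
    case 'p':
      if (starts_with (s, "private"))
        return CLAUSE_PRIVATE;
      break;
    default:
      break;
    }
  return CLAUSE_NONE;
}
```

A clause that starts with 'p' costs one comparison here instead of walking
past every earlier keyword, which is the effect the rework is after.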

Bootstrapped/regtested on x86_64-linux and i686-linux, committed to trunk.

2016-05-05  Jakub Jelinek  

* openmp.c (gfc_match_omp_clauses): Restructuralize, so that clause
parsing is done in a big switch based on gfc_peek_ascii_char and
individual clauses under their first letters are sorted too.

--- gcc/fortran/openmp.c.jj 2016-04-07 23:27:44.0 +0200
+++ gcc/fortran/openmp.c	2016-05-05 11:05:39.726462694 +0200
@@ -640,646 +640,711 @@ gfc_match_omp_clauses (gfc_omp_clauses *
   needs_space = false;
   first = false;
   gfc_gobble_whitespace ();
-  if ((mask & OMP_CLAUSE_ASYNC) && !c->async)
-   if (gfc_match ("async") == MATCH_YES)
- {
-   c->async = true;
-   needs_space = false;
-   if (gfc_match (" ( %e )", &c->async_expr) != MATCH_YES)
- {
-   c->async_expr = gfc_get_constant_expr (BT_INTEGER,
-  gfc_default_integer_kind,
- &gfc_current_locus);
-   mpz_set_si (c->async_expr->value.integer, GOMP_ASYNC_NOVAL);
- }
-   continue;
- }
-  if ((mask & OMP_CLAUSE_GANG) && !c->gang)
-   if (gfc_match ("gang") == MATCH_YES)
- {
-   c->gang = true;
-   if (match_oacc_clause_gang(c) == MATCH_YES)
- needs_space = false;
-   else
- needs_space = true;
-   continue;
- }
-  if ((mask & OMP_CLAUSE_WORKER) && !c->worker)
-   if (gfc_match ("worker") == MATCH_YES)
- {
-   c->worker = true;
-   if (gfc_match (" ( num : %e )", &c->worker_expr) == MATCH_YES
-   || gfc_match (" ( %e )", &c->worker_expr) == MATCH_YES)
- needs_space = false;
-   else
- needs_space = true;
-   continue;
- }
-  if ((mask & OMP_CLAUSE_VECTOR_LENGTH) && c->vector_length_expr == NULL
- && gfc_match ("vector_length ( %e )", &c->vector_length_expr)
- == MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_VECTOR) && !c->vector)
-   if (gfc_match ("vector") == MATCH_YES)
- {
-   c->vector = true;
-   if (gfc_match (" ( length : %e )", &c->vector_expr) == MATCH_YES
-   || gfc_match (" ( %e )", &c->vector_expr) == MATCH_YES)
- needs_space = false;
-   else
- needs_space = true;
-   continue;
- }
-  if ((mask & OMP_CLAUSE_IF) && c->if_expr == NULL
- && gfc_match ("if ( %e )", &c->if_expr) == MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_FINAL) && c->final_expr == NULL
- && gfc_match ("final ( %e )", &c->final_expr) == MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_NUM_THREADS) && c->num_threads == NULL
- && gfc_match ("num_threads ( %e )", &c->num_threads) == MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_PRIVATE)
- && gfc_match_omp_variable_list ("private (",
- &c->lists[OMP_LIST_PRIVATE], true)
-== MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_FIRSTPRIVATE)
- && gfc_match_omp_variable_list ("firstprivate (",
- &c->lists[OMP_LIST_FIRSTPRIVATE],
- true)
-== MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_LASTPRIVATE)
- && gfc_match_omp_variable_list ("lastprivate (",
- &c->lists[OMP_LIST_LASTPRIVATE],
- true)
-== MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_COPYPRIVATE)
- && gfc_match_omp_variable_list ("copyprivate (",
- &c->lists[OMP_LIST_COPYPRIVATE],
- true)
-== MATCH_YES)
-   continue;
-  if ((mask & OMP_CLAUSE_SHARED)
- && gfc_match_omp_variable_list ("shared (",
- &c->lists[OMP_LIST_SHARED], true)
-== MATCH_YES)
-   continue;
-  if (mask & OMP_CLAUSE_COPYIN)
-   {
- if (openacc)
-   {
- if (gfc_match ("copyin ( ") == MATCH_YES
- && gfc_match_omp_map_clause (&c->lists[OMP_LIST_MAP],

Re: Improve pure/const propagation across interposable function with non-interposable aliases

2016-05-05 Thread Rainer Orth
Hi Jan,

>   * gcc.dg/ipa/pure-const-3.c: New testcase.

the testcase FAILs:

UNRESOLVED: gcc.dg/ipa/pure-const-3.c scan-ipa-dump pure-const "found to be 
const"

The log shows

gcc.dg/ipa/pure-const-3.c: dump file does not exist

The following patch fixes this.  Tested with the appropriate runtest
invocation on i386-pc-solaris2.12.  I guess this is obvious?

Rainer


2016-05-05  Rainer Orth  

* gcc.dg/ipa/pure-const-3.c: Scan local-pure-const1 dump.

# HG changeset patch
# Parent  d39b4a1eb735ab8dd6f6029ab882650de945f341
Fix gcc.dg/ipa/pure-const-3.c scan

diff --git a/gcc/testsuite/gcc.dg/ipa/pure-const-3.c b/gcc/testsuite/gcc.dg/ipa/pure-const-3.c
--- a/gcc/testsuite/gcc.dg/ipa/pure-const-3.c
+++ b/gcc/testsuite/gcc.dg/ipa/pure-const-3.c
@@ -21,4 +21,4 @@ main()
 __builtin_abort ();
   return 0;
 }
-/* { dg-final { scan-ipa-dump "found to be const" "pure-const"} } */
+/* { dg-final { scan-tree-dump "found to be const" "local-pure-const1"} } */

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


Missing pointer dereference in tree-affine.c

2016-05-05 Thread Richard Sandiford
wide_int_constant_multiple_p used:

  if (*mult_set && mult != 0)
return false;

to check whether we had previously seen a nonzero multiple, but "mult" is
a pointer to the previous value rather than the previous value itself.
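
A small stand-alone sketch of the difference, using long in place of
widest_int; the function names are hypothetical and only the zero-value
branch is modelled:

```c
#include <stdbool.h>

/* Buggy form: MULT is a pointer, so "mult != 0" tests the pointer itself
   and is true for any valid pointer.  Once *MULT_SET is true, this
   rejects even when the recorded multiple is zero.  */
static bool
record_zero_multiple_buggy (bool *mult_set, long *mult)
{
  if (*mult_set && mult != 0)
    return false;
  *mult_set = true;
  *mult = 0;
  return true;
}

/* Fixed form: compare the previously recorded value.  */
static bool
record_zero_multiple_fixed (bool *mult_set, long *mult)
{
  if (*mult_set && *mult != 0)
    return false;
  *mult_set = true;
  *mult = 0;
  return true;
}
```

With a previously recorded multiple of 0, the buggy form returns false while
the fixed form returns true; with a nonzero recorded multiple both return
false.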

Noticed by inspection while working on another patch, so I don't have a
testcase.  I tried adding an assert for combinations that were wrongly
rejected before but it didn't trigger during a bootstrap and regtest.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* tree-affine.c (wide_int_constant_multiple_p): Add missing
pointer dereference.

diff --git a/gcc/tree-affine.c b/gcc/tree-affine.c
index 32f2301..4884241 100644
--- a/gcc/tree-affine.c
+++ b/gcc/tree-affine.c
@@ -769,7 +769,7 @@ wide_int_constant_multiple_p (const widest_int &val, const widest_int &div,
 
   if (val == 0)
 {
-  if (*mult_set && mult != 0)
+  if (*mult_set && *mult != 0)
return false;
   *mult_set = true;
   *mult = 0;



Fix handling of negative bitpos in expand_debug_expr

2016-05-05 Thread Richard Sandiford
expand_debug_expr handled negative bit positions using:

else if (bitpos < 0)
  {
HOST_WIDE_INT units
  = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
op0 = adjust_address_nv (op0, mode1, units);
bitpos += units * BITS_PER_UNIT;
  }

Here "units" is the negative of the (negative) byte offset, so I think
we should be offsetting OP0 by -units instead.  E.g. a bitpos of -17
would give units==3, so this code would move OP0 up by 3 bytes and set
bitpos to 7, giving a total bitpos of 31.
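
The arithmetic can be checked with a small stand-alone sketch.  The struct
and function names are hypothetical, BITS_PER_UNIT is assumed to be 8, and
the byte offset is modelled as a plain integer rather than an address
adjustment:

```c
#include <assert.h>
#include <stdbool.h>

enum { BITS_PER_UNIT = 8 };

/* A byte offset applied to the address plus the remaining bit position.  */
struct split
{
  long byte_offset;
  long bitpos;
};

/* Split a negative BITPOS into a byte offset and a non-negative bit
   position.  NEGATE_UNITS selects the corrected sign of the byte offset.  */
static struct split
split_negative_bitpos (long bitpos, bool negate_units)
{
  assert (bitpos < 0);
  long units = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
  struct split s;
  s.byte_offset = negate_units ? -units : units;
  s.bitpos = bitpos + units * BITS_PER_UNIT;
  return s;
}

/* The bit position that the split actually describes.  */
static long
total_bitpos (struct split s)
{
  return s.byte_offset * BITS_PER_UNIT + s.bitpos;
}
```

For a bitpos of -17, units is 3 and the remaining bitpos is 7.  With an
offset of +3 bytes the total is 31, while with -3 bytes the total is -17, as
intended.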

Just noticed by inspection.  An assert triggered for:

gcc.target/i386/mpx/bitfields-1-lbv.c
gcc.target/i386/mpx/field-addr-7-lbv.c
gcc.target/i386/mpx/reference-3-lbv.cpp
gcc.target/i386/mpx/reference-4-lbv.cpp

at -m32 but otherwise this case doesn't seem to trigger during a
bootstrap and regtest.

Tested on x86_64-linux-gnu.  OK to install?

Thanks,
Richard


gcc/
* cfgexpand.c (expand_debug_expr): Fix address offset for negative
bitpos.

diff --git a/gcc/cfgexpand.c b/gcc/cfgexpand.c
index 21f21c9..77a1964 100644
--- a/gcc/cfgexpand.c
+++ b/gcc/cfgexpand.c
@@ -4473,7 +4473,7 @@ expand_debug_expr (tree exp)
  {
HOST_WIDE_INT units
  = (-bitpos + BITS_PER_UNIT - 1) / BITS_PER_UNIT;
-   op0 = adjust_address_nv (op0, mode1, units);
+   op0 = adjust_address_nv (op0, mode1, -units);
bitpos += units * BITS_PER_UNIT;
  }
else if (bitpos == 0 && bitsize == GET_MODE_BITSIZE (mode))



Re: [PATCH] Better location info for "incomplete type" error msg (PR c/70756)

2016-05-05 Thread Marek Polacek
On Wed, May 04, 2016 at 11:52:39AM -0400, Jason Merrill wrote:
> On Wed, May 4, 2016 at 9:00 AM, Marek Polacek  wrote:
> > On Tue, May 03, 2016 at 08:05:47PM -0400, Jason Merrill wrote:
> >> Looks good.
> >>
> >> But I don't see a C++ testcase; can the test go into c-c++-common?
> >
> > Sadly, no.  As of now, the patch doesn't improve things for C++ (?).  Seems
> > we'd need to pass better locations down to pointer_int_sum / size_in_bytes.
> > It cascades :(.
> 
> Sure.  But can you fix that, too, while you're thinking about it?
> Passing the location to cp_pointer_int_sum and pointer_diff seems
> pretty simple.

That's true, that was pretty simple, actually.  And while at it, I also
added a location parameter to cp_build_modify_expr.  With that, we generate
better diagnostics even for C++, so I could move the test to c-c++-common.
And I also added another test, this time with -Wpointer-arith diagnostics,
which this patch improves as well.

Bootstrapped/regtested on x86_64-linux, ok for trunk?

2016-05-05  Marek Polacek  

PR c/70756
* c-common.c (pointer_int_sum): Call size_in_bytes_loc instead of
size_in_bytes and pass LOC to it.

* c-decl.c (build_compound_literal): Pass LOC down to
c_incomplete_type_error.
* c-tree.h (require_complete_type): Adjust declaration.
(c_incomplete_type_error): Likewise.
* c-typeck.c (require_complete_type): Add location parameter, pass it
down to c_incomplete_type_error.
(c_incomplete_type_error): Add location parameter, pass it down to
error_at.
(build_component_ref): Pass location down to c_incomplete_type_error.
(default_conversion): Pass location down to require_complete_type.
(build_array_ref): Likewise.
(build_function_call_vec): Likewise.
(convert_arguments): Likewise.
(build_unary_op): Likewise.
(build_c_cast): Likewise.
(build_modify_expr): Likewise.
(convert_for_assignment): Likewise.
(c_finish_omp_clauses): Likewise.

* call.c (build_new_op_1): Pass LOC to cp_build_modify_expr.
* cp-tree.h (cp_build_modify_expr): Update declaration.
(cxx_incomplete_type_error, cxx_incomplete_type_diagnostic): New inline
overloads.
* cp-ubsan.c (cp_ubsan_dfs_initialize_vtbl_ptrs): Pass INPUT_LOCATION to
cp_build_modify_expr.
* decl2.c (set_guard): Likewise.
(handle_tls_init): Likewise.
* init.c (perform_member_init): Likewise.
(expand_virtual_init): Likewise.
(build_new_1): Likewise.
(build_vec_delete_1): Likewise.
(get_temp_regvar): Likewise.
(build_vec_init): Likewise.
* method.c (do_build_copy_assign): Likewise.
(assignable_expr): Likewise.
* semantics.c (finish_omp_for): Likewise.
* typeck.c (cp_build_binary_op): Pass LOCATION to pointer_diff and
cp_pointer_int_sum.
(cp_pointer_int_sum): Add location parameter.  Pass it down to
pointer_int_sum.
(pointer_diff): Add location parameter.  Use it.
(build_modify_expr): Pass location down to cp_build_modify_expr.
(cp_build_modify_expr): Add location parameter.  Use it.
(build_x_modify_expr): Pass location down to cp_build_modify_expr.
* typeck2.c (cxx_incomplete_type_diagnostic,
cxx_incomplete_type_error): Add location parameter.

* langhooks-def.h (lhd_incomplete_type_error): Adjust declaration.
* langhooks.c (lhd_incomplete_type_error): Add location parameter.
* langhooks.h (incomplete_type_error): Likewise.
* tree.c (size_in_bytes_loc): Renamed from size_in_bytes.  Add location
parameter, pass it down to incomplete_type_error.
* tree.h (size_in_bytes): New inline overload.
(size_in_bytes_loc): Renamed from size_in_bytes.

* c-c++-common/pr70756-2.c: New test.
* c-c++-common/pr70756.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 63a18c8..150bdb2 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -4269,7 +4269,7 @@ pointer_int_sum (location_t loc, enum tree_code resultcode,
   size_exp = integer_one_node;
 }
   else
-size_exp = size_in_bytes (TREE_TYPE (result_type));
+size_exp = size_in_bytes_loc (loc, TREE_TYPE (result_type));
 
   /* We are manipulating pointer values, so we don't need to warn
  about relying on undefined signed overflow.  We disable the
diff --git gcc/c/c-decl.c gcc/c/c-decl.c
index 7094efc..48fa65c 100644
--- gcc/c/c-decl.c
+++ gcc/c/c-decl.c
@@ -5112,7 +5112,7 @@ build_compound_literal (location_t loc, tree type, tree init, bool non_const)
 
   if (type == error_mark_node || !COMPLETE_TYPE_P (type))
 {
-  c_incomplete_type_error (NULL_TREE, type);
+  c_incomplete_type_error (loc, NULL_TREE, type);
   return error_mark_node;
 }
 
diff --git gcc/c/c-tree.h gcc/c/c-tree.h
index 0

Re: Inline across -ffast-math boundary

2016-05-05 Thread Rainer Orth
Richard Biener  writes:

>> >> This new testcase does not pass on bare-metal configs (using newlib),
>> >> because:
>> >> warning: implicit declaration of function 'isnanf'
>> >> [-Wimplicit-function-declaration]
>> >> warning: incompatible implicit declaration of built-in function 'isnanf'
>> >>
>> >> I'm not sure what's the appropriate dg-require to avoid this?
>> >
>> > c99_runtime I guess.
>> >
>> Indeed, that what was used in a previous occurrence of a similar
>> problem (PR 69227).
>> 
>> I've attached the small (obvious?) patch to make the new inline-8.c
>> test UNSUPPORTED
>> without c99_runtime.
>> 
>> OK?
>
> Ok.

The testcase still FAILs on Solaris:

FAIL: gcc.dg/ipa/inline-8.c (test for excess errors)
Excess errors:
/vol/gcc/src/hg/trunk/local/gcc/testsuite/gcc.dg/ipa/inline-8.c:12:10: warning: 
implicit declaration of function 'isnanf' [-Wimplicit-function-declaration]

The isnanf(3C) manpage claims that isnanf() is declared in 
(which is true), but the prototype is also in  under
__EXTENSIONS__ && !_XOPEN_SOURCE.

The function is not in C99 or XPG7, and instead of trying to do the
equivalent of AC_USE_SYSTEM_EXTENSIONS, I've followed the lead of
gcc.target/arm/pr59896.c and just declare it in the testcase.

Tested with the appropriate runtest invocations on i386-pc-solaris2.12
and x86_64-pc-linux-gnu, installed on mainline.

Rainer


2016-05-04  Rainer Orth  

* gcc.dg/ipa/inline-8.c (isnanf): Declare.

# HG changeset patch
# Parent  7654b342bb270f12f16abf76b78e235c2798832f
Declare isnanf in gcc.dg/ipa/inline-8.c

diff --git a/gcc/testsuite/gcc.dg/ipa/inline-8.c b/gcc/testsuite/gcc.dg/ipa/inline-8.c
--- a/gcc/testsuite/gcc.dg/ipa/inline-8.c
+++ b/gcc/testsuite/gcc.dg/ipa/inline-8.c
@@ -5,6 +5,7 @@
 /* { dg-options "-O2"  } */
 /* { dg-add-options c99_runtime } */
 #include 
+extern int isnanf (float);
 /* Can't be inlined because isnanf will be optimized out.  */
 int
 cmp (float a)

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[v3] Enable dg-require-sharedlib on Solaris

2016-05-05 Thread Rainer Orth
I happened to notice that dg-require-sharedlib hardcodes the targets
that support shared libraries, and Solaris is missing.  Fixed with the
following patch.

Bootstrapped on i386-pc-solaris2.12, the affected testcases now PASS.
Ok for mainline?

Rainer


2016-05-04  Rainer Orth  

* testsuite/lib/libstdc++.exp (libstdc++_init): Enable on
*-*-solaris*.

# HG changeset patch
# Parent  ba48a13c1219b37de7d83394a829bc065b62b24f
Enable dg-require-sharedlib on Solaris

diff --git a/libstdc++-v3/testsuite/lib/libstdc++.exp b/libstdc++-v3/testsuite/lib/libstdc++.exp
--- a/libstdc++-v3/testsuite/lib/libstdc++.exp
+++ b/libstdc++-v3/testsuite/lib/libstdc++.exp
@@ -184,8 +184,9 @@ proc libstdc++_init { testfile } {
 set v3-sharedlib 0
 set sharedlibdir [lookfor_file $blddir src/.libs/libstdc++.$shlib_ext]
 if {$sharedlibdir != ""} {
-	if { ([string match "*-*-linux*" $target_triplet]
-	  || [string match "*-*-gnu*" $target_triplet])
+	if { ([string match "*-*-gnu*" $target_triplet]
+	  || [string match "*-*-linux*" $target_triplet]
+	  || [string match "*-*-solaris*" $target_triplet])
 	 && [isnative] } then {
 	set v3-sharedlib 1
 	verbose -log "shared library support detected"

-- 
-
Rainer Orth, Center for Biotechnology, Bielefeld University


[patch] Coalesce in more cases

2016-05-05 Thread Eric Botcazou
Hi,

gimple_can_coalesce_p is rather picky about the conditions under which SSA 
names can be coalesced.  In particular, when it comes to the type, it's:

  /* Now check the types.  If the types are the same, then we should
 try to coalesce V1 and V2.  */
  tree t1 = TREE_TYPE (name1);
  tree t2 = TREE_TYPE (name2);
  if (t1 == t2)

or

  /* If the types are not the same, check for a canonical type match.  This
 (for example) allows coalescing when the types are fundamentally the
 same, but just have different names. 

 Note pointer types with different address spaces may have the same
 canonical type.  Those are rejected for coalescing by the
 types_compatible_p check.  */
  if (TYPE_CANONICAL (t1)
  && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
  && types_compatible_p (t1, t2))
goto check_modes;

The test on TYPE_CANONICAL looks overkill to me.  It's needed in the non-
optimized case (-fno-tree-coalesce-vars) as compute_samebase_partition_bases 
uses TYPE_CANONICAL to discriminate partitions, but it's not needed in the 
optimized case as compute_optimized_partition_bases uses the full information.
For example, in Ada it prevents subtypes from being coalesced with types and 
in C++ it prevents different pointer types from being coalesced.  Hence the 
attached patch, which lifts the restriction in the optimized case.

Tested on x86_64-suse-linux, OK for the mainline?


2016-05-05  Eric Botcazou  

* tree-ssa-coalesce.c (gimple_can_coalesce_p): In the optimized case,
allow coalescing if the types are compatible.

-- 
Eric Botcazou
Index: tree-ssa-coalesce.c
===
--- tree-ssa-coalesce.c	(revision 235900)
+++ tree-ssa-coalesce.c	(working copy)
@@ -1569,17 +1569,24 @@ gimple_can_coalesce_p (tree name1, tree
 			var2 ? LOCAL_DECL_ALIGNMENT (var2) : TYPE_ALIGN (t2)))
 return false;
 
-  /* If the types are not the same, check for a canonical type match.  This
+  /* If the types are not the same, see whether they are compatible.  This
  (for example) allows coalescing when the types are fundamentally the
- same, but just have different names. 
+ same, but just have different names.
 
- Note pointer types with different address spaces may have the same
- canonical type.  Those are rejected for coalescing by the
- types_compatible_p check.  */
-  if (TYPE_CANONICAL (t1)
-  && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
-  && types_compatible_p (t1, t2))
-goto check_modes;
+ In the non-optimized case, we must first test TYPE_CANONICAL because
+ we use it to compute the partition_to_base_index of the map.  */
+  if (flag_tree_coalesce_vars)
+{
+  if (types_compatible_p (t1, t2))
+	goto check_modes;
+}
+  else
+{
+  if (TYPE_CANONICAL (t1)
+	  && TYPE_CANONICAL (t1) == TYPE_CANONICAL (t2)
+	  && types_compatible_p (t1, t2))
+	goto check_modes;
+}
 
   return false;
 }


RE: [PATCH 1/4] [MIPS] Add support for MIPS SIMD Architecture (MSA)

2016-05-05 Thread Robert Suchanek
Hi Matthew,

Revised patch attached.

Tested with mips-img-linux-gnu and bootstrapped x86_64-unknown-linux-gnu.

> > mips_gen_const_int_vector
> This should use gen_int_for_mode instead of GEN_INT to avoid the issues that
> msa_ldi is
> trying to handle.

gen_int_mode cannot be used to generate a vector of constants as it expects a 
scalar mode.
AFAICS, there isn't any generic helper to replace this.

> 
> > mips_const_vector_same_bytes_p
> comment on this function is same as previous function

Corrected.

> 
> > mips_msa_idiv_insns
> Why not just update mips_idiv_insns and add a mode argument?

Done. 

> 
> > Implement TARGET_PRINT_OPERAND.
> Comment spacing between 'E' 'B' and description is different to existing

Updated.

> 
> > mips_print_operand
> case 'v' subcases V4SImode and V4SFmode are identical. same for DI/DF.

Updated.

> 
> >@@ -12272,13 +12837,25 @@ mips_class_max_nregs (enum reg_class rclass,
> machine_mode mode)
> >   if (hard_reg_set_intersect_p (left, reg_class_contents[(int) ST_REGS]))
> > {
> >   if (HARD_REGNO_MODE_OK (ST_REG_FIRST, mode))
> >-size = MIN (size, 4);
> >+{
> >+  if (MSA_SUPPORTED_MODE_P (mode))
> >+size = MIN (size, UNITS_PER_MSA_REG);
> >+  else
> >+size = MIN (size, UNITS_PER_FPREG);
> >+}
> >+
> 
> This hunk should be removed. MSA modes are not supported in ST_REGS.

Indeed.  Removed.

> 
> >@@ -12299,6 +12876,10 @@ mips_cannot_change_mode_class (machine_mode from,
> >   && INTEGRAL_MODE_P (from) && INTEGRAL_MODE_P (to))
> > return false;
> >
> >+  /* Allow conversions between different MSA vector modes and TImode.  */
> 
> Remove 'and TImode' we do not support it.

Done.

> 
> >@@ -19497,9 +21284,64 @@ mips_expand_vec_unpack (rtx operands[2], bool
> unsigned_p, bool high_p)
> >+if (!unsigned_p)
> >+{
> >+  /* Extract sign extention for each element comparing each element with
> >+ immediate zero.  */
> >+  tmp = gen_reg_rtx (imode);
> >+  emit_insn (cmpFunc (tmp, operands[1], CONST0_RTX (imode)));
> >+}
> >+else
> >+{
> >+  tmp = force_reg (imode, CONST0_RTX (imode));
> >+}
> 
> Indentation and unnecessary braces on the else.

Fixed.

> 
> +   A single N-word move is usually the same cost as N single-word moves.
> +   For MSA, we set MOVE_MAX to 16 bytes.
> +   Then, MAX_MOVE_MAX is 16 unconditionally.  */
> +#define MOVE_MAX (TARGET_MSA ? 16 : UNITS_PER_WORD)
> +#define MAX_MOVE_MAX 16
> 
> The 16 here should be UNITS_PER_MSA_REG
> 

The changes have been reverted because of the link to the MAX_FIXED_MODE_SIZE
macro, which causes failures in the by_pieces logic if MOVE_MAX_PIECES is
larger than MAX_FIXED_MODE_SIZE.
As it stands, vector modes appear to be handled explicitly in the common code,
so it's unlikely we need to modify any of these.
If we do, then it will be included in the follow-up.

> > mips_expand_builtin_insn
> 
> General comment about operations that take an immediate. There is code to
> perform range
> checking but it does not seem to leave any trail when the maybe_expand_insn
> fails to
> tell the user it was an out of range immediate that was the problem. (follow 
> up
> work)

Will do.

> 
> >+case CODE_FOR_msa_andi_b:
> >+case CODE_FOR_msa_ori_b:
> >+case CODE_FOR_msa_nori_b:
> >+case CODE_FOR_msa_xori_b:
> >+  gcc_assert (has_target_p && nops == 3);
> >+  if (!CONST_INT_P (ops[2].value))
> >+break;
> >+  ops[2].mode = ops[0].mode;
> >+  /* We need to convert the unsigned value to signed.  */
> >+  val = sext_hwi (INTVAL (ops[2].value),
> >+  GET_MODE_UNIT_PRECISION (ops[2].mode));
> >+  ops[2].value = mips_gen_const_int_vector (ops[2].mode, val);
> >+  break
> 
> Isn't the sext_hwi just effectively doing what gen_int_for_mode would? Fixing
> mips_gen_const_int_vector would eliminate all of them.

That's correct. I've moved it to mips_gen_const_int_vector and used gen_int_mode.

> 
> >@@ -527,7 +551,9 @@ (define_attr "insn_count" ""
> >  (const_int 2)
> >
> >  (eq_attr "type" "idiv,idiv3")
> >- (symbol_ref "mips_idiv_insns ()")
> >+ (cond [(eq_attr "mode" "TI")
> >+(symbol_ref "mips_msa_idiv_insns () * 4")]
> >+(symbol_ref "mips_idiv_insns () * 4"))
> 
> Why *4?

I'm not sure but it appears to be introduced long ago.
I removed it and used only mips_idiv_insns with the mode.

> 
> >@@ -1537,8 +1553,10 @@ FP_ASM_SPEC "\
> > #define LONG_LONG_ACCUM_TYPE_SIZE (TARGET_64BIT ? 128 : 64)
> >
> > /* long double is not a fixed mode, but the idea is that, if we
> >-   support long double, we also want a 128-bit integer type.  */
> >-#define MAX_FIXED_MODE_SIZE LONG_DOUBLE_TYPE_SIZE
> >+   support long double, we also want a 128-bit integer type.
> >+   For MSA, we support an integer type with a width of BITS_PER_MSA_REG.  */
> >+#define MAX_FIXED_MODE_SIZE \
> >+  (TARGET_MSA ? BITS_PER_MSA_REG : LONG_DOUBLE_TYPE_SIZE)
> 
> This doesn't seem right. We don't support TImode 

[PATCH PR70935, Regression 6,7]

2016-05-05 Thread Yuri Rumyantsev
Hi All,

Here is a simple patch which cures the problem with an invalid
transformation of an endless loop. The fix is simply to check that the
guard edge destination is not the loop latch block.

Bootstrapping and regression testing did not show any new failures.
Is it OK for trunk?

ChangeLog:
2016-05-05  Yuri Rumyantsev  

PR debug/70935
* tree-ssa-loop-unswitch.c (find_loop_guard): Reject guard edge with
loop latch destination.

gcc/testsuite/ChangeLog
* gcc.dg/torture/pr70935.c: New test.


patch.1
Description: Binary data


[PATCH PR57206]Add test since the PR is fixed by patch to PR48052

2016-05-05 Thread Bin Cheng
Hi,
This patch adds a test for PR57206.  The issue itself was fixed long ago by
the patch for PR48052.

Tested on x86_64.  It's an obvious change, applied on trunk.

Thanks
bin

gcc/testsuite/ChangeLog
2016-05-04  Bin Cheng  

PR tree-optimization/57206
* gcc.dg/vect/pr57206.c: New test.

diff --git a/gcc/testsuite/gcc.dg/vect/pr57206.c b/gcc/testsuite/gcc.dg/vect/pr57206.c
new file mode 100644
index 000..009688e
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/vect/pr57206.c
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target vect_float } */
+
+void bad0(float * d, unsigned int n)
+{
+  unsigned int i;
+  for (i=n; i>0; --i) 
+d[n-i] = 0.0;
+}
+
+/* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect" } } */


Re: Enabling -frename-registers?

2016-05-05 Thread Mike Stump
On May 5, 2016, at 6:00 AM, Wilco Dijkstra  wrote:
> 
> Ramana Radhakrishnan wrote:
>> 
>> Can you file a bugzilla entry with a testcase that folks can look at please ?
> 
> I created https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70961. Unfortunately
> I don't have a simple testcase that I can share.

Simple has never been a requirement for bug submission.  It is merely nice.



[PATCH,rs6000] Add built-in support for new Power9 darn (deliver a random number) instruction

2016-05-05 Thread Kelvin Nilsen
This patch adds built-in function support for the Power9 darn 
instruction.
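
A minimal sketch of how the new built-in might be called from C follows. The wrapper name get_random_u64 and the retry count are hypothetical. The sketch assumes that an all-ones result means the hardware could not deliver a number, and that _ARCH_PWR9 is defined when compiling for Power9; both should be checked against the ISA 3.0 documentation and the extend.texi text in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: store a conditioned 64-bit random number in *OUT
   and return 1, or return 0 if none could be obtained.  */
static int
get_random_u64 (uint64_t *out)
{
  assert (out != 0);
#if defined (__powerpc64__) && defined (_ARCH_PWR9)
  /* Assumption: an all-ones result signals that the hardware could not
     deliver a number, so retry a few times before giving up.  */
  for (int attempt = 0; attempt < 10; attempt++)
    {
      uint64_t value = (uint64_t) __builtin_darn ();
      if (value != UINT64_MAX)
        {
          *out = value;
          return 1;
        }
    }
  return 0;
#else
  /* Not a 64-bit Power9 target: the built-in is unavailable here.  */
  return 0;
#endif
}
```

On any other target the wrapper simply reports failure, so callers need a fallback source of randomness.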

I have bootstrapped and tested on both powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu with no regressions.  Is this ok for trunk
and for backpatching to ???.

Thanks,
Kelvin


gcc/testsuite/ChangeLog:

2016-05-04  Kelvin Nilsen  

* gcc.target/powerpc/darn-0.c: New test.
* gcc.target/powerpc/darn-1.c: New test.
* gcc.target/powerpc/darn-2.c: New test.


gcc/ChangeLog:

2016-05-04  Kelvin Nilsen  

* config/rs6000/altivec.h: Add macro definitions for darn,
darn_32, and darn_raw.
* config/rs6000/altivec.md (UNSPEC_DARN): New unspec constant.
(UNSPEC_DARN_32): New unspec constant.
(UNSPEC_DARN_RAW): New unspec constant.
("darn_32"): New instruction.
("darn_raw"): New instruction.
("darn"): New instruction.
* config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_0): Add
support and documentation for this macro.
(BU_P9_MISC_1): New macro definition.
(BU_P9_64BIT_MISC_0): New macro definition.
(BU_P9_MISC_0): New macro definition.
("darn_32"): New builtin definition.
("darn_raw"): New builtin definition.
("darn"): New builtin definition.
* config/rs6000/rs6000.c: Add #define RS6000_BUILTIN_0 and #undef
RS6000_BUILTIN_0 directives to surround each occurrence of
#include "rs6000-builtin.def".
(rs6000_builtin_mask_calculate): Add in the RS6000_BTM_MODULO and
RS6000_BTM_64BIT flags to the returned mask, depending on
configuration. 
(def_builtin): Correct an error in the assignments made to the
debugging variable attr_string.
(rs6000_expand_builtin): Add support for no-operand built-in
functions. 
(builtin_function_type): Remove fatal_error assertion that is no
longer valid.
(rs6000_common_init_builtins): Add support for no-operand built-in
functions. 
* config/rs6000/rs6000.h (RS6000_BTM_MODULO): New macro
definition. 
(RS6000_BTM_PURE): Enhance comment to clarify intent of this flag
definition. 
(RS6000_BTM_64BIT): New macro definition.
* doc/extend.texi: Document __builtin_darn (void),
__builtin_darn_raw (void), and __builtin_darn_32 (void) built-in
functions. 

Index: gcc/config/rs6000/altivec.h
===
--- gcc/config/rs6000/altivec.h (revision 235884)
+++ gcc/config/rs6000/altivec.h (working copy)
@@ -382,6 +382,11 @@
 #define vec_vsubuqm __builtin_vec_vsubuqm
 #define vec_vupkhsw __builtin_vec_vupkhsw
 #define vec_vupklsw __builtin_vec_vupklsw
+
+/* Non-Vector additions added in ISA 3.0. */
+#define darn __builtin_darn
+#define darn_32 __builtin_darn_32
+#define darn_raw __builtin_darn_raw
 #endif
 
 /* Predicates.
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 235884)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -73,6 +73,9 @@
UNSPEC_VUNPACK_LO_SIGN_DIRECT
UNSPEC_VUPKHPX
UNSPEC_VUPKLPX
+   UNSPEC_DARN
+   UNSPEC_DARN_32
+   UNSPEC_DARN_RAW
UNSPEC_DST
UNSPEC_DSTT
UNSPEC_DSTST
@@ -3590,6 +3593,37 @@
   [(set_attr "length" "4")
(set_attr "type" "vecsimple")])
 
+(define_insn "darn_32"
+  [(set (match_operand:SI 0 "register_operand" "")
+(unspec:SI [(const_int 0)] UNSPEC_DARN_32))]
+  "TARGET_MODULO"
+  {
+ return "darn %0,0";
+  }
+  [(set_attr "type" "add")  
+   (set_attr "length" "4")])
+
+(define_insn "darn_raw"
+  [(set (match_operand:DI 0 "register_operand" "")
+(unspec:DI [(const_int 0)] UNSPEC_DARN_RAW))]
+  "TARGET_MODULO && TARGET_64BIT"
+  {
+ return "darn %0,2";
+  }
+  [(set_attr "type" "add")  
+   (set_attr "length" "4")])
+
+(define_insn "darn"
+  [(set (match_operand:DI 0 "register_operand" "")
+(unspec:DI [(const_int 0)] UNSPEC_DARN))]
+  "TARGET_MODULO && TARGET_64BIT"
+  {
+ return "darn %0,1";
+  }
+  [(set_attr "type" "add")  
+   (set_attr "length" "4")])
+
+
 (define_expand "bcd_"
   [(parallel [(set (reg:CCFP 74)
   (compare:CCFP
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 235884)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -24,6 +24,7 @@
.  */
 
 /* Before including this file, some macros must be defined:
+   RS6000_BUILTIN_0 -- 0 arg builtins
RS6000_BUILTIN_1 -- 1 arg builtins
RS6000_BUILTIN_2 -- 2 arg builtins
RS6000_BUILTIN_3 -- 3 arg builtins
@@ -43,6 +44,10 @@
ATTRbuiltin attribute information.
ICODE   Insn code of the function that implents the builtin.  */
 
+#ifndef RS6000_BUILTIN_0
+  #error "RS6000_BUILTIN

Re: [v3] Enable dg-require-sharedlib on Solaris

2016-05-05 Thread Jonathan Wakely

On 05/05/16 16:36 +0200, Rainer Orth wrote:

> I happened to notice that dg-require-sharedlib hardcodes the targets
> that support shared libraries, and Solaris is missing.  Fixed with the
> following patch.
>
> Bootstrapped on i386-pc-solaris2.12, the affected testcases now PASS.
> Ok for mainline?


Looks good - OK, thanks.



[PATCH] Make basic asm implicitly clobber memory

2016-05-05 Thread Bernd Edlinger
Hi!

this patch is inspired by recent discussion about basic asm:

Currently a basic asm is an instruction scheduling barrier,
but not a memory barrier and, most surprisingly, basic asm
does _not_ implicitly clobber CC on targets where
extended asm always implicitly clobbers CC, even if
nothing is in the clobber section.

This patch makes basic asm implicitly clobber CC on certain
targets, and makes the basic asm implicitly clobber memory,
but no general registers, which is what could be expected.

This is however only done for basic asm with non-empty
assembler string, which is in sync with David's proposed
basic asm warnings patch.
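
For comparison, this is how a memory clobber is spelled out explicitly today with extended asm. It is only a sketch of the barrier effect that the patch would make implicit for non-empty basic asm; the function and variable names are made up, and the empty assembler string is used so that the example does not depend on any target's instruction set.

```c
static int shared_flag;

/* Store to a global, then execute an extended asm with an explicit
   "memory" clobber.  The clobber tells GCC that the asm may read or
   write any memory, so the store is not moved past it and the value
   returned afterwards is loaded from memory again.  */
static int
store_then_barrier (void)
{
  shared_flag = 1;
  __asm__ __volatile__ ("" : : : "memory");
  return shared_flag;
}
```

The extended asm syntax is a GCC extension that Clang also accepts.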

Due to the change in the tree representation, where
ASM_INPUT can now be the first element of a
PARALLEL block with the implicit clobber elements,
there are some changes necessary.

Most of the changes in the middle end were necessary
because extract_asm_operands can not be used to find out
if a PARALLEL block is an asm statement, but in most cases
asm_noperands can be used instead.

There are also changes necessary in two targets: pa, and ia64.
I have successfully built cross-compilers for these targets.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
OK for trunk?


Thanks
Bernd.

gcc/
2016-05-05  Bernd Edlinger  

PR c/24414
* cfgexpand.c (expand_asm_loc): Remove handling for ADDR_EXPR.
Implicitly clobber memory for basic asm with non-empty assembler
string.  Use targetm.md_asm_adjust also here.
* compare-elim.c (arithmetic_flags_clobber_p): Use asm_noperands here.
* final.c (final_scan_insn): Handle basic asm in PARALLEL block.
* gimple.c (gimple_asm_clobbers_memory_p): Handle basic asm with
non-empty assembler string.
* ira.c (compute_regs_asm_clobbered): Use asm_noperands here.
* recog.c (asm_noperands): Handle basic asm in PARALLEL block.
(decode_asm_operands): Handle basic asm in PARALLEL block.
(extract_insn): Handle basic asm in PARALLEL block.
* doc/extend.texi: Mention new behavior of basic asm.
* config/ia64/ia64.c (rtx_needs_barrier): Handle ASM_INPUT here.
* config/pa/pa.c (branch_to_delay_slot_p, branch_needs_nop_p,
branch_needs_nop_p): Use asm_noperands.

gcc/testsuite/
2016-05-05  Bernd Edlinger  

PR c/24414
* gcc.target/i386/pr24414.c: New test.
Index: gcc/cfgexpand.c
===
--- gcc/cfgexpand.c	(revision 231412)
+++ gcc/cfgexpand.c	(working copy)
@@ -2655,9 +2655,6 @@ expand_asm_loc (tree string, int vol, location_t l
 {
   rtx body;
 
-  if (TREE_CODE (string) == ADDR_EXPR)
-string = TREE_OPERAND (string, 0);
-
   body = gen_rtx_ASM_INPUT_loc (VOIDmode,
 ggc_strdup (TREE_STRING_POINTER (string)),
 locus);
@@ -2664,6 +2661,34 @@ expand_asm_loc (tree string, int vol, location_t l
 
   MEM_VOLATILE_P (body) = vol;
 
+  /* Non-empty basic ASM implicitly clobbers memory.  */
+  if (TREE_STRING_LENGTH (string) != 0)
+{
+  rtx asm_op, clob;
+  unsigned i, nclobbers;
+  auto_vec<rtx> input_rvec, output_rvec;
+  auto_vec<const char *> constraints;
+  auto_vec<rtx> clobber_rvec;
+  HARD_REG_SET clobbered_regs;
+  CLEAR_HARD_REG_SET (clobbered_regs);
+
+  clob = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (VOIDmode));
+  clobber_rvec.safe_push (clob);
+
+  if (targetm.md_asm_adjust)
+	targetm.md_asm_adjust (output_rvec, input_rvec,
+			   constraints, clobber_rvec,
+			   clobbered_regs);
+
+  asm_op = body;
+  nclobbers = clobber_rvec.length ();
+  body = gen_rtx_PARALLEL (VOIDmode, rtvec_alloc (1 + nclobbers));
+
+  XVECEXP (body, 0, 0) = asm_op;
+  for (i = 0; i < nclobbers; i++)
+	XVECEXP (body, 0, i + 1) = gen_rtx_CLOBBER (VOIDmode, clobber_rvec[i]);
+}
+
   emit_insn (body);
 }
 
Index: gcc/compare-elim.c
===
--- gcc/compare-elim.c	(revision 231412)
+++ gcc/compare-elim.c	(working copy)
@@ -162,7 +162,7 @@ arithmetic_flags_clobber_p (rtx_insn *insn)
   if (!NONJUMP_INSN_P (insn))
 return false;
   pat = PATTERN (insn);
-  if (extract_asm_operands (pat))
+  if (asm_noperands (pat) >= 0)
 return false;
 
   if (GET_CODE (pat) == PARALLEL && XVECLEN (pat, 0) == 2)
Index: gcc/doc/extend.texi
===
--- gcc/doc/extend.texi	(revision 231412)
+++ gcc/doc/extend.texi	(working copy)
@@ -7442,6 +7442,10 @@ all basic @code{asm} blocks use the assembler dial
 Basic @code{asm} provides no
 mechanism to provide different assembler strings for different dialects.
 
+For basic @code{asm} with non-empty assembler string GCC assumes
+the assembler block does not change any general purpose registers,
+but it may read or write any globally accessible variable.
+
 Here is an example of basic @code{asm} for i386:
 
 @example
Index: gcc/final.c
==

Re: [PATCH], Add PowerPC ISA 3.0 vector d-form addressing

2016-05-05 Thread Michael Meissner
On Wed, May 04, 2016 at 11:16:52AM -0500, Segher Boessenkool wrote:
> Hi Mike,
> 
> On Tue, May 03, 2016 at 06:39:55PM -0400, Michael Meissner wrote:
> > With this patch, I enable -mlra if the user did not specify either -mlra or
> > -mno-lra on the command line, and -mcpu=power9 or -mpower9-dform-vector were
> > used. I also enabled -mvsx-timode if LRA was used, which also is a RELOAD
> > issue, that works with LRA.
> 
> I don't like enabling LRA if the user didn't ask for it; it is a bit too
> surprising.  What do you do if there is -mno-lra explicitly?  You can just
> do the same if no-lra is implicit?

Ok.

> > * doc/md.texi (wO constraint): Likewise.
> 
> Everything is "likewise", that isn't very helpful.  Writing big changelogs
> is annoying, I totally agree, but please try a bit harder.
> 
> > --- gcc/config/rs6000/rs6000.opt
> > (.../svn+ssh://meiss...@gcc.gnu.org/svn/gcc/trunk/gcc/config/rs6000)
> > (revision 235831)
> > +++ gcc/config/rs6000/rs6000.opt(.../gcc/config/rs6000) (working copy)
> > @@ -470,8 +470,8 @@ Target RejectNegative Joined UInteger Va
> >  -mlong-double-  Specify size of long double (64 or 128 bits).
> >  
> >  mlra
> > -Target Report Var(rs6000_lra_flag) Init(0) Save
> > -Use LRA instead of reload.
> > +Target Undocumented Mask(LRA) Var(rs6000_isa_flags)
> > +Use the LRA register allocator instead of the reload register allocator.
> 
> It wasn't "undocumented" before?  Why the change to a mask bit btw?

It was always meant to be undocumented, but I changed it to be similar to
before. I am trying to change all of the random switches that set a word to be
an option mask, so I made that part of the change in these next patches. I did
remove setting it for -mcpu=power9.
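
To make the "option mask" idea concrete, here is a toy sketch in C of several switches sharing one flag word instead of each owning a separate variable. The type isa_flags_t, the TOY_MASK_* bit positions and both helper functions are hypothetical; the real rs6000_isa_flags bits are generated from rs6000.opt and are not reproduced here.

```c
#include <stdint.h>

/* Hypothetical flag word and bit assignments, for illustration only.  */
typedef uint64_t isa_flags_t;

#define TOY_MASK_LRA             ((isa_flags_t) 1 << 0)
#define TOY_MASK_P9_DFORM_SCALAR ((isa_flags_t) 1 << 1)
#define TOY_MASK_P9_DFORM_VECTOR ((isa_flags_t) 1 << 2)

/* Return FLAGS with the bits in MASK set (ON != 0) or cleared.  */
static isa_flags_t
set_option (isa_flags_t flags, isa_flags_t mask, int on)
{
  return on ? (flags | mask) : (flags & ~mask);
}

/* Return nonzero if any bit of MASK is set in FLAGS.  */
static int
option_enabled (isa_flags_t flags, isa_flags_t mask)
{
  return (flags & mask) != 0;
}
```

One benefit of this layout is that a group of related options can be saved, restored or compared as a single word.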

> > +mpower9-dform-scalar
> > +Target Report Mask(P9_DFORM_SCALAR) Var(rs6000_isa_flags)
> > +Use/do not use scalar register+offset memory instructions added in ISA 3.0.
> > +
> > +mpower9-dform-vector
> > +Target Report Mask(P9_DFORM_VECTOR) Var(rs6000_isa_flags)
> > +Use/do not use vector register+offset memory instructions added in ISA 3.0.
> > +
> >  mpower9-dform
> > -Target Undocumented Mask(P9_DFORM) Var(rs6000_isa_flags)
> > -Use/do not use vector and scalar instructions added in ISA 3.0.
> > +Target Report Var(TARGET_P9_DFORM_BOTH) Init(-1) Save
> > +Use/do not use register+offset memory instructions added in ISA 3.0.
> 
> These should probably all be undocumented, though (they're not something
> users should use).

I will make -mpower9-dform public (which I thought it was, but evidently I
missed adding the documentation for GCC 6), but I will make the -scalar and
-vector forms private.

> > +/* Return true if the ADDR is an acceptiable address for a quad memory
> ^ spelling

Ok.

> > + if (((addr_mask & RELOAD_REG_QUAD_OFFSET) == 0)
> > + || !quad_address_p (addr, mode, false))
> 

Here is the latest version of the patch. Like before, it bootstraps and has no
regressions. Is it ok to apply to the trunk?

[gcc]
2016-05-05  Michael Meissner  

* config/rs6000/rs6000-cpus.def (ISA_3_0_MASKS_SERVER): Use
-mpower9-dform-scalar instead of -mpower9-dform. Add note not to
include -mpower9-dform-vector until we switch over to LRA.
(POWERPC_MASKS): Add -mlra. Split -mpower9-dform into two
switches, -mpower9-dform-scalar and -mpower9-dform-vector.
* config/rs6000/rs6000.opt (-mlra): Switch to being an option mask
bit instead of being a separate word. Split -mpower9-dform into
two switches, -mpower9-dform-scalar and -mpower9-dform-vector.
* config/rs6000/rs6000.c (RELOAD_REG_QUAD_OFFSET): New addr_mask
for the register class supporting 128-bit quad word memory
offsets.
(mode_supports_vsx_dform_quad): Helper function to return if the
register class uses quad word memory offsets.
(rs6000_debug_addr_mask): Add support for quad word memory
offsets.
(rs6000_debug_reg_global): Use TARGET_LRA instead of calling the
lra_p target hook.
(rs6000_setup_reg_addr_masks): If ISA 3.0 vector d-form
instructions are enabled, set up the appropriate addr_masks for
128-bit types.
(rs6000_init_hard_regno_mode_ok): wb constraint is now based on
-mpower9-dform-scalar, instead of -mpower9-dform.
(rs6000_option_override_internal): Split -mpower9-dform into two
switches, -mpower9-dform-scalar and -mpower9-dform-vector. The
-mpower9-dform switch sets or clears both. If we are not using the
LRA register allocator, do not enable -mpower9-dform-vector by
default. If we are using LRA, enable -mpower9-dform-vector and
-mvsx-timode if it is appropriate. Issue a warning if either
-mpower9-dform-vector or -mvsx-timode are explicitly used without
enabling LRA.
(quad_address_offset_p): New helper function to return if the
offset is legal for 

[PATCH], Add PowerPC ISA 3.0 min/max support

2016-05-05 Thread Michael Meissner
This patch was originally meant for GCC 6.x, but the stage3 submission window
closed before I could submit this patch.

This patch adds support for the new ISA 3.0 instructions for doing min, max,
and comparison.

Unlike the existing XSMINDP and XSMAXDP instructions, the new XSMINCDP and
XSMAXCDP instructions do the right thing with regard to one of the arguments
being NaN (not a number). This means these instructions can be generated even
if the -ffast-math switch is not used.
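
One way to read "do the right thing" is that the result matches the plain C conditional expression even when an operand is NaN. That reading is an interpretation and is not spelled out in this mail. The helper names below are made up; the sketch only shows the C-level semantics that a compiler has to preserve when -ffast-math is off.

```c
/* C-style minimum: if either operand is NaN the comparison is false,
   so the second operand is returned.  */
static double
c_style_min (double a, double b)
{
  return a < b ? a : b;
}

/* C-style maximum, with the same NaN behaviour.  */
static double
c_style_max (double a, double b)
{
  return a > b ? a : b;
}
```

Note that the result depends on operand order when a NaN is involved, which is why an instruction with different NaN handling cannot be substituted without -ffast-math.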

In addition, the instructions XSCMPEQDP, XSCMPGTDP, and XSCMPGEDP generate
either all 1's or all 0's (similar to the vector forms of the instructions),
which allows floating point conditional move sequences to be generated with the
comparison and XXSEL instruction.
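
The compare-then-select idea can be sketched in portable C as follows. This is a software model only, with hypothetical helper names, and it assumes that double and uint64_t have the same size; the comparison produces an all-ones or all-zeros mask, and the select picks bits from one of two values according to that mask.

```c
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (double) == sizeof (uint64_t),
                "sketch assumes a 64-bit double");

static uint64_t
double_to_bits (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}

static double
bits_to_double (uint64_t bits)
{
  double x;
  memcpy (&x, &bits, sizeof x);
  return x;
}

/* Model of a mask-generating compare: all ones if A > B, else zero.  */
static uint64_t
cmpgt_mask (double a, double b)
{
  return a > b ? UINT64_MAX : 0;
}

/* Model of a bitwise select: take bits of IF_TRUE where MASK is set,
   bits of IF_FALSE elsewhere.  */
static double
select_by_mask (uint64_t mask, double if_true, double if_false)
{
  return bits_to_double ((double_to_bits (if_true) & mask)
                         | (double_to_bits (if_false) & ~mask));
}

/* Branch-free equivalent of "a > b ? x : y".  */
static double
cmove_gt (double a, double b, double x, double y)
{
  return select_by_mask (cmpgt_mask (a, b), x, y);
}
```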

At the present time, the code does not support comparisons involving >= and <=
unless the -ffast-math option is used. I hope eventually to support generating
these instructions without having -ffast-math used.

The underlying reason is when fast math is not used, we change the condition
from:

(ge:SI (reg:CCFP ) (const_int 0))

to:

(ior:SI (gt:SI (reg:CCFP ) (const_int 0))
(eq:SI (reg:CCFP ) (const_int 0)))

The machine independent portion of the compiler does not recognize this when
trying to generate conditional moves.
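
At the source level the two forms compute the same value, including for NaN operands, which is why the rewrite is valid. A small C sketch with made-up function names shows the equivalence; the problem described in this mail is only that the second shape is not recognized when conditional moves are generated.

```c
/* The comparison as written by the user.  */
static int
ge_direct (double a, double b)
{
  return a >= b;
}

/* The same comparison expressed as GT inclusive-or EQ, mirroring the
   rewritten condition.  With a NaN operand both parts are false.  */
static int
ge_as_gt_or_eq (double a, double b)
{
  return (a > b) | (a == b);
}
```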

I would imagine the 'fix' is to generate GE/LE all of the time, and then have a
splitter that converts it to IOR of GT/EQ if it is not a conditional move with
ISA 3.0 instructions.

I have bootstrapped the compiler and there were no regressions.  Is it ok to
apply to the trunk?

Note, this patch is independent of the vector d-form support patch I just
submitted.

[gcc]
2016-05-05  Michael Meissner  

* config/rs6000/predicates.md (all_ones_constant): New predicate
to recognize a vector/integer constant that is all 1's.
(min_max_operator): Don't match umin or umax, since it is only
used for floating point min/maxes.
(fpmask_comparison_operator): New predicate for returning true if
a comparison operator can generate -1/0 mask.
* config/rs6000/rs6000.c (print_operand): Add support for ISA 3.0
floating point min/max instructions.
(rs6000_emit_power9_minmax): New function to generate ISA 3.0
XSMAXCDP and XSMINCDP instructions.
(rs6000_emit_power9_cmove): New function to generate ISA 3.0
XSCMP{EQ,GE,GT,NE}DP instructions and XXSEL to speed up floating
point conditional moves.
(rs6000_emit_cmove): Add support for ISA 3.0 floating point
condition moves and min/max instructions.
* config/rs6000/rs6000.h (TARGET_MINMAX_SF): New target macros to
say whether the machine has floating point min/maxes that can be
generated directly.
(TARGET_MINMAX_DF): Likewise.
(PRINT_OPERAND_PUNCT_VALID_P): Add '@' for ISA 3.0 min/max.
* config/rs6000/rs6000.md (SFDF2): New iterator to allow mixed
mode floating point conditional moves.
(fp_minmax): New code iterator for ISA 3.0 min/max support.
(minmax): New code attributes for ISA 3.0 min/max support.
(MINMAX): Likewise.
(smax3): Rework floating point min/max to be a combined insn
using code iterators for min and max. Add support for ISA 3.0
min/max instructions.
(s3): Likewise.
(smax3_vsx): Likewise.
(s3_vsx): Likewise.
(smin3): Likewise.
(smin3_vsx): Likewise.
(*sdfsf3_vsx_1): Add support for min/max where one or both
operands are float that are promoted to double.
(sdfsf3_vsx_2): Likewise.
(sdfsf3_vsx_3): Likewise.
(sdfsf3_vsx_4): Likewise.
(min/max splitter, non-VSX): Use TARGET_MINMAX_.
(SF conditional move splitter): Merge SF and DF insns for
generating conditional move into one insn.
(movsfcc): Likewise.
(movcc): Likewise.
(fselsfsf4): Merge SF/DF conditional move variants into one insn
that handles different types for comparison and move.
(fseldfsf4): Likewise.
(DF conditional move splitter): Likewise.
(movdfcc): Likewise.
(fseldfdf4): Likewise.
(fselsfdf4): Likewise.
(movcc_p9): New floating point conditional
move support for ISA 3.0.
(fpmask): Likewise.
(xxsel): Likewise.
(lfiwax): Correct scratch constraint to be wi, since we don't
require direct move support.
(lfiwzx): Likewise.
(floatsi2_lfiwax_mem): Combine alternatives into a single
alternative.
(floatunssi2_lfiwzx_mem): Likewise.
(fix_truncsi2): Fix comment.
(fix_truncdi2_fctidz): Allow any VSX register instead of
Altivec register as the second alternative.
(fixuns_truncdi2_fctiduz): Likewise.

[gcc/testsuite]
2016-05-05  Michael Meissner  

* gcc.target/powerpc/p9-minmax-1.c: New tests for ISA 3.0 min/max
and con

[PATCH, i386]: Change true_regnum to REGNO in peephole2 and post-reload splitters

2016-05-05 Thread Uros Bizjak
Hello!

There is no point in using true_regnum to determine the regno of a register
operand in peephole2s and post-reload splitters. REGNO can be used
instead.

2016-05-05  Uros Bizjak  

* config/i386/i386.md (peephole2 patterns): Change true_regnum
to REGNO in all peephole2 patterns.
(post-reload splitters): Change true_regnum to REGNO in
post-reload splitters.
(zero_extend splitters): Use general_reg_operand and
nonimmediate_gr_operand predicates.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
Index: i386.md
===
--- i386.md (revision 235906)
+++ i386.md (working copy)
@@ -3777,20 +3777,18 @@
   "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
 
 (define_split
-  [(set (match_operand:DI 0 "register_operand")
-   (zero_extend:DI (match_operand:SI 1 "register_operand")))]
+  [(set (match_operand:DI 0 "general_reg_operand")
+   (zero_extend:DI (match_operand:SI 1 "general_reg_operand")))]
   "!TARGET_64BIT && reload_completed
-   && !(MMX_REG_P (operands[0]) || SSE_REG_P (operands[0]))
-   && true_regnum (operands[0]) == true_regnum (operands[1])"
+   && REGNO (operands[0]) == REGNO (operands[1])"
   [(set (match_dup 4) (const_int 0))]
   "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
 
 (define_split
-  [(set (match_operand:DI 0 "nonimmediate_operand")
-   (zero_extend:DI (match_operand:SI 1 "nonimmediate_operand")))]
+  [(set (match_operand:DI 0 "nonimmediate_gr_operand")
+   (zero_extend:DI (match_operand:SI 1 "nonimmediate_gr_operand")))]
   "!TARGET_64BIT && reload_completed
-   && !(MEM_P (operands[0]) && MEM_P (operands[1]))
-   && !(MMX_REG_P (operands[0]) || SSE_REG_P (operands[0]))"
+   && !(MEM_P (operands[0]) && MEM_P (operands[1]))"
   [(set (match_dup 3) (match_dup 1))
(set (match_dup 4) (const_int 0))]
   "split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);")
@@ -3828,7 +3826,8 @@
   [(parallel [(set (match_dup 0) (and:SI (match_dup 0) (match_dup 2)))
  (clobber (reg:CC FLAGS_REG))])]
 {
-  if (true_regnum (operands[0]) != true_regnum (operands[1]))
+  if (!REG_P (operands[1])
+  || REGNO (operands[0]) != REGNO (operands[1]))
 {
   ix86_expand_clear (operands[0]);
 
@@ -3875,7 +3874,8 @@
   [(parallel [(set (match_dup 0) (and:SI (match_dup 0) (const_int 255)))
  (clobber (reg:CC FLAGS_REG))])]
 {
-  if (true_regnum (operands[0]) != true_regnum (operands[1]))
+  if (!REG_P (operands[1])
+  || REGNO (operands[0]) != REGNO (operands[1]))
 {
   ix86_expand_clear (operands[0]);
 
@@ -3988,8 +3988,8 @@
 
   /* Generate a cltd if possible and doing so it profitable.  */
   if ((optimize_function_for_size_p (cfun) || TARGET_USE_CLTD)
-  && true_regnum (operands[1]) == AX_REG
-  && true_regnum (operands[2]) == DX_REG)
+  && REGNO (operands[1]) == AX_REG
+  && REGNO (operands[2]) == DX_REG)
 {
   emit_insn (gen_ashrsi3_cvt (operands[2], operands[1], GEN_INT (31)));
 }
@@ -4030,8 +4030,8 @@
(set (match_operand:SI 3 "memory_operand") (match_dup 2))]
   "/* cltd is shorter than sarl $31, %eax */
!optimize_function_for_size_p (cfun)
-   && true_regnum (operands[1]) == AX_REG
-   && true_regnum (operands[2]) == DX_REG
+   && REGNO (operands[1]) == AX_REG
+   && REGNO (operands[2]) == DX_REG
&& peep2_reg_dead_p (2, operands[1])
&& peep2_reg_dead_p (3, operands[2])
&& !reg_mentioned_p (operands[2], operands[3])"
@@ -4052,19 +4052,19 @@
 {
   split_double_mode (DImode, &operands[0], 1, &operands[3], &operands[4]);
 
-  if (true_regnum (operands[3]) != true_regnum (operands[1]))
+  if (REGNO (operands[3]) != REGNO (operands[1]))
 emit_move_insn (operands[3], operands[1]);
 
   /* Generate a cltd if possible and doing so it profitable.  */
   if ((optimize_function_for_size_p (cfun) || TARGET_USE_CLTD)
-  && true_regnum (operands[3]) == AX_REG
-  && true_regnum (operands[4]) == DX_REG)
+  && REGNO (operands[3]) == AX_REG
+  && REGNO (operands[4]) == DX_REG)
 {
   emit_insn (gen_ashrsi3_cvt (operands[4], operands[3], GEN_INT (31)));
   DONE;
 }
 
-  if (true_regnum (operands[4]) != true_regnum (operands[1]))
+  if (REGNO (operands[4]) != REGNO (operands[1]))
 emit_move_insn (operands[4], operands[1]);
 
   emit_insn (gen_ashrsi3_cvt (operands[4], operands[4], GEN_INT (31)));
@@ -5207,7 +5207,7 @@
   "TARGET_SSE_PARTIAL_REG_DEPENDENCY
&& optimize_function_for_speed_p (cfun)
&& epilogue_completed
-   && (!SSE_REG_P (operands[1])
+   && (!REG_P (operands[1])
|| REGNO (operands[0]) != REGNO (operands[1]))
&& (!EXT_REX_SSE_REG_P (operands[0])
|| TARGET_AVX512VL)"
@@ -5235,7 +5235,7 @@
   "TARGET_SSE_PARTIAL_REG_DEPENDENCY
&& optimize_function_for_speed_p (cfun)
&& epilogue_completed
-   && (!SSE_REG_P (opera

Re: [PATCH,rs6000] Add built-in support for new Power9 darn (deliver a random number) instruction

2016-05-05 Thread Bernhard Reutner-Fischer
On May 5, 2016 6:26:01 PM GMT+02:00, Kelvin Nilsen 
 wrote:

>+  /* Handle simple no-argument operations. */
>+  d = bdesc_0arg;
>+  for (i = 0; i < ARRAY_SIZE (bdesc_0arg); i++, d++)
>+if (d->code == fcode)
>+  return rs6000_expand_zeroop_builtin (d->icode, target);
>+
>+  gcc_assert (false);
>   gcc_unreachable ();
> }

Surplus assert.

>+if (TARGET_DEBUG_BUILTIN)
>+  fprintf (stderr, "rs6000_builtin, skip no-argumen %s\n",
>d->name);

s/argumen %s/argument %s/

thanks,



[PATCHv2,rs6000] Add built-in support for new Power9 darn (deliver a random number) instruction

2016-05-05 Thread Kelvin Nilsen


This patch adds built-in function support for the Power9 darn
instruction.  This patch and ChangeLog are identical to the one I sent
earlier today.  I have now completed additional testing and made a few
changes to my description.

I have bootstrapped and tested this patch against the trunk and against
the gcc-6-branch on both powerpc64le-unknown-linux-gnu and
powerpc64-unknown-linux-gnu with no regressions.  Is this ok for trunk
and for backporting to GCC 6 after a few days of burn-in time on the trunk?

Thanks,
Kelvin


gcc/testsuite/ChangeLog:

2016-05-04  Kelvin Nilsen  

* gcc.target/powerpc/darn-0.c: New test.
* gcc.target/powerpc/darn-1.c: New test.
* gcc.target/powerpc/darn-2.c: New test.


gcc/ChangeLog:

2016-05-04  Kelvin Nilsen  

* config/rs6000/altivec.h: Add macro definitions for darn,
darn_32, and darn_raw.
* config/rs6000/altivec.md (UNSPEC_DARN): New unspec constant.
(UNSPEC_DARN_32): New unspec constant.
(UNSPEC_DARN_RAW): New unspec constant.
("darn_32"): New instruction.
("darn_raw"): New instruction.
("darn"): New instruction.
* config/rs6000/rs6000-builtin.def (RS6000_BUILTIN_0): Add
support and documentation for this macro.
(BU_P9_MISC_1): New macro definition.
(BU_P9_64BIT_MISC_0): New macro definition.
(BU_P9_MISC_0): New macro definition.
("darn_32"): New builtin definition.
("darn_raw"): New builtin definition.
("darn"): New builtin definition.
* config/rs6000/rs6000.c: Add #define RS6000_BUILTIN_0 and #undef
RS6000_BUILTIN_0 directives to surround each occurrence of
#include "rs6000-builtin.def".
(rs6000_builtin_mask_calculate): Add in the RS6000_BTM_MODULO and
RS6000_BTM_64BIT flags to the returned mask, depending on
configuration.
(def_builtin): Correct an error in the assignments made to the
debugging variable attr_string.
(rs6000_expand_builtin): Add support for no-operand built-in
functions.
(builtin_function_type): Remove fatal_error assertion that is no
longer valid.
(rs6000_common_init_builtins): Add support for no-operand built-in
functions.
* config/rs6000/rs6000.h (RS6000_BTM_MODULO): New macro
definition.
(RS6000_BTM_PURE): Enhance comment to clarify intent of this flag
definition.
(RS6000_BTM_64BIT): New macro definition.
* doc/extend.texi: Document __builtin_darn (void),
__builtin_darn_raw (void), and __builtin_darn_32 (void) built-in
functions.

Index: gcc/config/rs6000/altivec.h
===
--- gcc/config/rs6000/altivec.h (revision 235884)
+++ gcc/config/rs6000/altivec.h (working copy)
@@ -382,6 +382,11 @@
 #define vec_vsubuqm __builtin_vec_vsubuqm
 #define vec_vupkhsw __builtin_vec_vupkhsw
 #define vec_vupklsw __builtin_vec_vupklsw
+
+/* Non-Vector additions added in ISA 3.0. */
+#define darn __builtin_darn
+#define darn_32 __builtin_darn_32
+#define darn_raw __builtin_darn_raw
 #endif

 /* Predicates.
Index: gcc/config/rs6000/altivec.md
===
--- gcc/config/rs6000/altivec.md(revision 235884)
+++ gcc/config/rs6000/altivec.md(working copy)
@@ -73,6 +73,9 @@
UNSPEC_VUNPACK_LO_SIGN_DIRECT
UNSPEC_VUPKHPX
UNSPEC_VUPKLPX
+   UNSPEC_DARN
+   UNSPEC_DARN_32
+   UNSPEC_DARN_RAW
UNSPEC_DST
UNSPEC_DSTT
UNSPEC_DSTST
@@ -3590,6 +3593,37 @@
   [(set_attr "length" "4")
(set_attr "type" "vecsimple")])

+(define_insn "darn_32"
+  [(set (match_operand:SI 0 "register_operand" "")
+(unspec:SI [(const_int 0)] UNSPEC_DARN_32))]
+  "TARGET_MODULO"
+  {
+ return "darn %0,0";
+  }
+  [(set_attr "type" "add")
+   (set_attr "length" "4")])
+
+(define_insn "darn_raw"
+  [(set (match_operand:DI 0 "register_operand" "")
+(unspec:DI [(const_int 0)] UNSPEC_DARN_RAW))]
+  "TARGET_MODULO && TARGET_64BIT"
+  {
+ return "darn %0,2";
+  }
+  [(set_attr "type" "add")
+   (set_attr "length" "4")])
+
+(define_insn "darn"
+  [(set (match_operand:DI 0 "register_operand" "")
+(unspec:DI [(const_int 0)] UNSPEC_DARN))]
+  "TARGET_MODULO && TARGET_64BIT"
+  {
+ return "darn %0,1";
+  }
+  [(set_attr "type" "add")
+   (set_attr "length" "4")])
+
+
 (define_expand "bcd_"
   [(parallel [(set (reg:CCFP 74)
   (compare:CCFP
Index: gcc/config/rs6000/rs6000-builtin.def
===
--- gcc/config/rs6000/rs6000-builtin.def(revision 235884)
+++ gcc/config/rs6000/rs6000-builtin.def(working copy)
@@ -24,6 +24,7 @@
.  */

 /* Before including this file, some macros must be defined:
+   RS6000_BUILTIN_0 -- 0 arg builtins
RS6000_BUILTIN_1 -- 1 arg builtins
RS6000_BUILTIN_2 -- 2 a

[PATCH, ARM] use vmov.i64 to load 0 into FP reg if neon enabled

2016-05-05 Thread Jim Wilson
For this simple testcase

double
sub (void)
{
  return 0.0;
}

Without the attached patch, an ARM compiler with neon support enabled, gives
 vldr.64 d0, .L2
With the attached patch, an ARM compiler with neon enabled, gives
 vmov.i64 d0, #0 @ float
which is faster and smaller, as there is no load from a constant pool entry.
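
The reason an integer move of #0 can stand in for the floating-point load is that +0.0 has an all-zero bit pattern in IEEE 754 binary64. The helper below is a hypothetical name used only to check that, and the sketch assumes double is a 64-bit IEEE type, as it is on ARM.

```c
#include <stdint.h>
#include <string.h>

_Static_assert (sizeof (double) == sizeof (uint64_t),
                "sketch assumes a 64-bit double");

/* Return the raw bit pattern of X.  */
static uint64_t
bits_of_double (double x)
{
  uint64_t bits;
  memcpy (&bits, &x, sizeof bits);
  return bits;
}
```

Negative zero has the sign bit set, so it is not covered by this trick and would need a different immediate.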

There are a few ways to implement this.  I added a neon enabled
attribute.  Another way to do this would be a new constraint, like Dg,
that tests for both neon and 0.

I don't see any mention of targets that only support single-float in
the ARM ARM, so it isn't obvious how to handle that.  I see no targets
that support both neon and single-float, but maybe I need to check for
that anyways?

Most of the patch involves renumbering constraints and matching
attributes.  The new alternative w/G must come before w/UvF or else we
still get a constant pool reference.  Otherwise the patch is pretty
small and simple.

We can do the same thing in the movdi pattern.  I haven't tried
writing that yet.

This patch was tested with a bootstrap and make check in an armhf
schroot on an xgene box.  There were no regressions.

OK to check in?

Jim
	* config/arm/arm.md (arch): Add neon.
	(arch_enabled): Return yes for arch neon when TARGET_NEON.
	* config/arm/vfp.md (movdf_vfp): Add w/G as alternative 3.  Add
	neon_move as type for alt 3.  Add arch attr enabling alt 3 for neon.
	Emit vmov.i64 for alt 3.  Renumber alternatives 3 to 8.  Adjust
	attributes for alt renumbering.  Mark alt 3 as non-predicable.
	(thumb2_movdf_vfp): Likewise.

Index: config/arm/arm.md
===
--- config/arm/arm.md	(revision 235793)
+++ config/arm/arm.md	(working copy)
@@ -121,7 +121,7 @@
 ; arm_arch6.  "v6t2" for Thumb-2 with arm_arch6.  This attribute is
 ; used to compute attribute "enabled", use type "any" to enable an
 ; alternative in all cases.
-(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3"
+(define_attr "arch" "any,a,t,32,t1,t2,v6,nov6,v6t2,neon_for_64bits,avoid_neon_for_64bits,iwmmxt,iwmmxt2,armv6_or_vfpv3,neon"
   (const_string "any"))
 
 (define_attr "arch_enabled" "no,yes"
@@ -177,6 +177,10 @@
 	 (and (eq_attr "arch" "armv6_or_vfpv3")
 	  (match_test "arm_arch6 || TARGET_VFP3"))
 	 (const_string "yes")
+
+	 (and (eq_attr "arch" "neon")
+	  (match_test "TARGET_NEON"))
+	 (const_string "yes")
 	]
 
 	(const_string "no")))
Index: config/arm/vfp.md
===
--- config/arm/vfp.md	(revision 235793)
+++ config/arm/vfp.md	(working copy)
@@ -394,8 +394,8 @@
 ;; DFmode moves
 
 (define_insn "*movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w  ,Uv,r, m,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,UvF,w ,mF,r,w,r"))]
+  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w,w  ,Uv,r, m,w,r")
+	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,G,UvF,w ,mF,r,w,r"))]
   "TARGET_ARM && TARGET_HARD_FLOAT && TARGET_VFP
&& (   register_operand (operands[0], DFmode)
|| register_operand (operands[1], DFmode))"
@@ -410,16 +410,18 @@
   case 2:
 	gcc_assert (TARGET_VFP_DOUBLE);
 return \"vmov%?.f64\\t%P0, %1\";
-  case 3: case 4:
+  case 3:
+	return \"vmov.i64\\t%P0, #0@ float\";
+  case 4: case 5:
 	return output_move_vfp (operands);
-  case 5: case 6:
+  case 6: case 7:
 	return output_move_double (operands, true, NULL);
-  case 7:
+  case 8:
 	if (TARGET_VFP_SINGLE)
 	  return \"vmov%?.f32\\t%0, %1\;vmov%?.f32\\t%p0, %p1\";
 	else
 	  return \"vmov%?.f64\\t%P0, %P1\";
-  case 8:
+  case 9:
 return \"#\";
   default:
 	gcc_unreachable ();
@@ -426,23 +428,24 @@
   }
 }
   "
-  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,f_loadd,f_stored,\
+  [(set_attr "type" "f_mcrr,f_mrrc,fconstd,neon_move,f_loadd,f_stored,\
  load2,store2,ffarithd,multiple")
-   (set (attr "length") (cond [(eq_attr "alternative" "5,6,8") (const_int 8)
-			   (eq_attr "alternative" "7")
+   (set (attr "length") (cond [(eq_attr "alternative" "6,7,9") (const_int 8)
+			   (eq_attr "alternative" "8")
 (if_then_else
  (match_test "TARGET_VFP_SINGLE")
  (const_int 8)
  (const_int 4))]
 			  (const_int 4)))
-   (set_attr "predicable" "yes")
-   (set_attr "pool_range" "*,*,*,1020,*,1020,*,*,*")
-   (set_attr "neg_pool_range" "*,*,*,1004,*,1004,*,*,*")]
+   (set_attr "predicable" "yes,yes,yes,no,yes,yes,yes,yes,yes,yes")
+   (set_attr "pool_range" "*,*,*,*,1020,*,1020,*,*,*")
+   (set_attr "neg_pool_range" "*,*,*,*,1004,*,1004,*,*,*")
+   (set_attr "arch" "any,any,any,neon,any,any,any,any,any,any")]
 )
 
 (define_insn "*thumb2_movdf_vfp"
-  [(set (match_operand:DF 0 "nonimmediate_soft_df_operand" "=w,?r,w ,w  ,Uv,r ,m,w,r")
-	(match_operand:DF 1 "soft_df_operand"		   " ?r,w,Dy,UvF,w, mF,r, w

[PATCH, i386]: Fix too broad (mem,reg)->(const,reg) splitters

2016-05-05 Thread Uros Bizjak
Hello!

This patch fixes several overly broad (mem,reg)->(const,reg) splitters
that block other, similar splitters. The solution is to check in the
splitter condition whether the splitter will actually transform the
insn, instead of using FAIL in the splitter preparation statements.
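To illustrate the difference, here is a toy model in C. It is not GCC's
generated recognizer: toy_insn, toy_splitter, toy_split and the two tables
are hypothetical names, and the model assumes (as the paragraph above
implies) that the first splitter whose condition holds is the only one
tried.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for an insn: only records whether a constant source exists.  */
struct toy_insn { bool has_constant_src; };

struct toy_splitter {
  bool (*condition) (const struct toy_insn *);  /* splitter condition */
  bool (*prepare) (const struct toy_insn *);    /* false models FAIL */
  int id;
};

/* Returns the id of the splitter that split INSN, or 0 if none did.
   The first splitter whose condition holds is committed to; if its
   preparation FAILs, later splitters are never tried.  */
int toy_split (const struct toy_splitter *s, size_t n,
               const struct toy_insn *insn)
{
  for (size_t i = 0; i < n; i++)
    if (s[i].condition (insn))
      return s[i].prepare (insn) ? s[i].id : 0;
  return 0;
}

static bool always (const struct toy_insn *insn)
{
  (void) insn;
  return true;
}

static bool has_const (const struct toy_insn *insn)
{
  return insn->has_constant_src;
}

/* Old style: broad condition, the real check hidden in the preparation.  */
const struct toy_splitter old_style[] = {
  { always, has_const, 1 },
  { always, always, 2 },
};

/* New style: the check is part of the condition.  */
const struct toy_splitter new_style[] = {
  { has_const, always, 1 },
  { always, always, 2 },
};
```

For an insn without a constant source, toy_split on old_style returns 0
because splitter 1 claims the insn and then FAILs, while on new_style it
returns 2 because splitter 1 declines up front.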

2016-05-06  Uros Bizjak  

PR target/70873
* config/i386/i386-protos.h (ix86_standard_x87sse_constant_load_p):
New prototype.
* config/i386/i386.c (ix86_standard_x87sse_constant_load_p): New.
* config/i386/i386.md (push mem splitter): Use find_constant_src in
the splitter condition.
(FP load splitter): Use ix86_standard_x87sse_constant_load_p in
the splitter condition.
(FP float_extend load splitter): Ditto.

Patch was bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN.

Uros.
diff --git a/gcc/config/i386/i386-protos.h b/gcc/config/i386/i386-protos.h
index 4145ed5..447f67e 100644
--- a/gcc/config/i386/i386-protos.h
+++ b/gcc/config/i386/i386-protos.h
@@ -52,6 +52,7 @@ extern const char *standard_80387_constant_opcode (rtx);
 extern rtx standard_80387_constant_rtx (int);
 extern int standard_sse_constant_p (rtx, machine_mode);
 extern const char *standard_sse_constant_opcode (rtx_insn *, rtx);
+extern bool ix86_standard_x87sse_constant_load_p (const rtx_insn *, rtx);
 extern bool symbolic_reference_mentioned_p (rtx);
 extern bool extended_reg_mentioned_p (rtx);
 extern bool x86_extended_QIreg_mentioned_p (rtx_insn *);
diff --git a/gcc/config/i386/i386.c b/gcc/config/i386/i386.c
index 9680aaf..05476f3 100644
--- a/gcc/config/i386/i386.c
+++ b/gcc/config/i386/i386.c
@@ -11219,6 +11219,26 @@ standard_sse_constant_opcode (rtx_insn *insn, rtx x)
   gcc_unreachable ();
 }
 
+/* Returns true if INSN can be transformed from a memory load
+   to a supported FP constant load.  */
+
+bool
+ix86_standard_x87sse_constant_load_p (const rtx_insn *insn, rtx dst)
+{
+  rtx src = find_constant_src (insn);
+
+  gcc_assert (REG_P (dst));
+
+  if (src == NULL
+  || (SSE_REGNO_P (REGNO (dst))
+ && standard_sse_constant_p (src, GET_MODE (dst)) != 1)
+  || (STACK_REGNO_P (REGNO (dst))
+  && standard_80387_constant_p (src) < 1))
+return false;
+
+  return true;
+}
+
 /* Returns true if OP contains a symbol reference */
 
 bool
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index dd56b05..0bf01ab 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -3072,14 +3072,10 @@
 (define_split
   [(set (match_operand:SF 0 "push_operand")
(match_operand:SF 1 "memory_operand"))]
-  "reload_completed"
+  "reload_completed
+   && find_constant_src (insn)"
   [(set (match_dup 0) (match_dup 2))]
-{
-  operands[2] = find_constant_src (curr_insn);
-
-  if (operands[2] == NULL_RTX)
-FAIL;
-})
+  "operands[2] = find_constant_src (curr_insn);")
 
 (define_split
   [(set (match_operand 0 "push_operand")
@@ -3601,19 +3597,10 @@
&& (GET_MODE (operands[0]) == TFmode
|| GET_MODE (operands[0]) == XFmode
|| GET_MODE (operands[0]) == DFmode
-   || GET_MODE (operands[0]) == SFmode)"
+   || GET_MODE (operands[0]) == SFmode)
+   && ix86_standard_x87sse_constant_load_p (insn, operands[0])"
   [(set (match_dup 0) (match_dup 2))]
-{
-  operands[2] = find_constant_src (curr_insn);
-
-  if (operands[2] == NULL_RTX
-  || (SSE_REGNO_P (REGNO (operands[0]))
- && standard_sse_constant_p (operands[2],
- GET_MODE (operands[0])) != 1)
-  || (STACK_REGNO_P (REGNO (operands[0]))
-  && standard_80387_constant_p (operands[2]) < 1))
-FAIL;
-})
+  "operands[2] = find_constant_src (curr_insn);")
 
 (define_split
   [(set (match_operand 0 "any_fp_register_operand")
@@ -3621,19 +3608,10 @@
   "reload_completed
&& (GET_MODE (operands[0]) == TFmode
|| GET_MODE (operands[0]) == XFmode
-   || GET_MODE (operands[0]) == DFmode)"
+   || GET_MODE (operands[0]) == DFmode)
+   && ix86_standard_x87sse_constant_load_p (insn, operands[0])"
   [(set (match_dup 0) (match_dup 2))]
-{
-  operands[2] = find_constant_src (curr_insn);
-
-  if (operands[2] == NULL_RTX
-  || (SSE_REGNO_P (REGNO (operands[0]))
- && standard_sse_constant_p (operands[2],
- GET_MODE (operands[0])) != 1)
-  || (STACK_REGNO_P (REGNO (operands[0]))
-  && standard_80387_constant_p (operands[2]) < 1))
-FAIL;
-})
+  "operands[2] = find_constant_src (curr_insn);")
 
 ;; Split the load of -0.0 or -1.0 into fldz;fchs or fld1;fchs sequence
 (define_split


Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-05-05 Thread Eduard Sanou
Hi,

I've applied the changes suggested in several messages in this
thread.  Find the patch and changelog attached.

On 16-04-29 09:17:44, Jakub Jelinek wrote:
> First of all, using error instead of fatal_error achieves just that too,
> except that it allows also detecting other errors in the source.
> fatal_error is meant for cases where there is no way to go forward with the
> compilation, while here exists a perfectly reasonable way to go forward
> (assume the env var is not set, use a hardcoded timestamp, ...).

I've changed the use of fatal_error to error_at.  Now the parsing will
return -1 if there's an error, and gcc will be able to output the line
where the __DATE__ or __TIME__ macro was used that caused the error.

> Doing this on the gcc/ side is of course reasonable, but can be done through
> callbacks, libcpp already has lots of other callbacks into the gcc/ code,
> look for e.g. cpp_get_callbacks in gcc/c-family/* and in libcpp/ for
> corresponding code.

I've added a callback that allows parsing the SOURCE_DATE_EPOCH env var
right at the point where the date macros are expanded.  I've removed
some unneeded stuff from my previous patch.

> Also, as a follow-up, guess the driver should set this
> env var for the -fcompare-debug case if not already set, to something that
> matches the current date, so that __TIME__ macros expands the same in
> between both compilations, even when they don't compile both in the same
> second.

I've also patched the driver to set the SOURCE_DATE_EPOCH env var to the
current date when -fcompare-debug is set.  I've tested this by building
LLVM with that flag.
Markus, could you verify that the patch indeed fixes the issue?


Martin Sebor wrote:
> I'm sorry I'm a little late but I have a couple of minor comments
> on the patch:

Thanks for the comments :)

> In most texts (e.g. the C and POSIX standards), the name of
> an environment variable doesn't include the dollar sign.  In
> other diagnostic messages GCC doesn't print one.  I suggest
> to follow the established practice and remove the dollar sign
> from this error message as well.

Change applied.

> I would also suggest to issue a single generic error message
> explaining what the valid value of the variable is instead of
> trying to describe what's wrong with it, for example as follows
> (note also the hyphen in "non-negative" which is the prevalent
> style used by other GCC messages and GNU documentation).
> 
>   "environment variable SOURCE_DATE_EPOCH must expand to a non-
>   negative integer less than or equal to %qlli", LLONG_MAX

Applied as well.
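
As a rough illustration of the validation being discussed (a standalone
sketch, not the code in the patch): toy_parse_source_date_epoch and
TOY_MAX_SOURCE_DATE_EPOCH are hypothetical names, the limit is the
253402300799 value quoted elsewhere in this thread, and -1 stands in for
the error path where the real code would call error_at.

```c
#include <errno.h>
#include <stdlib.h>

/* Largest accepted value; assumption: same figure as the patch's
   MAX_SOURCE_DATE_EPOCH.  */
#define TOY_MAX_SOURCE_DATE_EPOCH 253402300799LL

/* Parse VALUE (the text of SOURCE_DATE_EPOCH) as a non-negative integer.
   Returns the epoch, or -1 if VALUE is NULL, empty, not a number, has
   trailing junk, is negative, or exceeds TOY_MAX_SOURCE_DATE_EPOCH.  */
long long toy_parse_source_date_epoch (const char *value)
{
  char *end;
  long long epoch;

  if (value == NULL || *value == '\0')
    return -1;

  errno = 0;
  epoch = strtoll (value, &end, 10);
  /* errno catches out-of-range input; *end catches non-numeric text.  */
  if (errno != 0 || *end != '\0')
    return -1;
  if (epoch < 0 || epoch > TOY_MAX_SOURCE_DATE_EPOCH)
    return -1;
  return epoch;
}
```

A caller would pass getenv ("SOURCE_DATE_EPOCH") and, on -1, emit the
single generic diagnostic suggested above.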

> The +%s option to the date command is a non-standard extension
> that's not universally available.  To avoid confusing users on
> systems that don't support it I would suggest to either avoid
> mentioning or to clarify that it's a Linux command.

I've added a comment to clarify this fact in the documentation.

Cheers,
Dhole
gcc/c-family/ChangeLog:

2016-05-05  Eduard Sanou  

* c-common.c (get_source_date_epoch): Renamed to
cb_get_source_date_epoch.
* c-common.c (cb_get_source_date_epoch): Use a single generic error
message when the parsing fails.  Use error_at instead of fatal_error.
* c-common.h (get_source_date_epoch): Renamed to
cb_get_source_date_epoch.
* c-common.h (cb_get_source_date_epoch): Prototype.
* c-common.h (MAX_SOURCE_DATE_EPOCH): Define.
* c-lex.c (init_c_lex): Set cb->get_source_date_epoch callback.
* c-lex.c (c_lex_with_flags): Remove initialization of
source_date_epoch.

gcc/ChangeLog:

2016-05-05  Eduard Sanou  

* doc/cppenv.texi: Note that the `%s` in `date` is a non-standard
extension.
* gcc.c (driver_handle_option): Call
setenv_SOURCE_DATE_EPOCH_current_time.
* gcc.c (setenv_SOURCE_DATE_EPOCH_current_time): New function, sets
the SOURCE_DATE_EPOCH environment variable to the current time.

libcpp/ChangeLog:

2016-05-05  Eduard Sanou  

* include/cpplib.h (cpp_callbacks): Add get_source_date_epoch
callback.
* include/cpplib.h (cpp_init_source_date_epoch): Remove.
* init.c (cpp_init_source_date_epoch): Remove.
* internal.h (cpp_reader): Remove source_date_epoch.
* macro.c (_cpp_builtin_macro_text): Use get_source_date_epoch
callback.
diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index d45bf1b..22974c5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -12788,7 +12788,7 @@ valid_array_size_p (location_t loc, tree type, tree name)
timestamp to replace embedded current dates to get reproducible
results.  Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
 time_t
-get_source_date_epoch ()
+cb_get_source_date_epoch (cpp_reader *pfile ATTRIBUTE_UNUSED)
 {
   char *source_date_epoch;
   long long epoch;
@@ -12800,19 +12800,14 @@ get_source_date_epoch ()
 
   errno = 0;
   epoch 

Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-05-05 Thread Dhole
On 16-04-29 09:17:44, Jakub Jelinek wrote:
> > Bernd: I'll see if I can prepare a testcase; first I need to get
> > familiar with the testing framework and learn how to set environment
> > variables in tests.  Any tips on that will be really welcome!
> 
> grep for dg-set-target-env-var in various tests.

I've been looking at how the test infrastructure works, and I'm having
some difficulties with setting the env var.

I've written a test case which fails (when it shouldn't) and I don't see
why.

I'm attaching the test file.

I'm running it with:
$ make check-gcc RUNTESTFLAGS=cpp.exp=source_date_epoch-1.c

What I find strange, however, is that if I set the env var from the
command line, it seems to pass:
$ SOURCE_DATE_EPOCH=123456 make check-gcc RUNTESTFLAGS=cpp.exp=source_date_epoch-1.c


P.S.: I've just sent another message to the thread with the patch
implementing the other mentioned issues.  I've mistakenly sent it from
another email account of mine: 

Cheers,
-- 
Dhole
/* { dg-do run } */
/* { dg-set-target-env-var SOURCE_DATE_EPOCH "123456" } */

int
main(void)
{
  __builtin_printf ("%s %s\n", __DATE__, __TIME__);
  return 0;
}

/* { dg-output "Jan  2 1970 10:17:36" } */
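
The expected string in the dg-output line can be cross-checked outside
the testsuite. The following is a sketch only: toy_format_epoch is a
hypothetical helper, and it assumes the timestamp is interpreted as UTC
and printed in the "Mmm dd yyyy hh:mm:ss" layout of __DATE__ and __TIME__.

```c
#include <stdio.h>
#include <time.h>

/* Format EPOCH (seconds since 1970-01-01 00:00:00 UTC) in the layout of
   "__DATE__ __TIME__", with a space-padded day of month.
   Returns 0 on success, -1 on failure.  */
int toy_format_epoch (long long epoch, char *buf, size_t size)
{
  time_t t = (time_t) epoch;
  struct tm *tm = gmtime (&t);

  if (tm == NULL)
    return -1;
  /* %e is the space-padded day of month (C99).  */
  if (strftime (buf, size, "%b %e %Y %H:%M:%S", tm) == 0)
    return -1;
  return 0;
}
```

For 123456 this gives "Jan  2 1970 10:17:36" in the default "C" locale
(123456 = 86400 + 10*3600 + 17*60 + 36), so the expectation in the test
itself looks right.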




Fix for PR68159 in Libiberty Demangler (6)

2016-05-05 Thread Marcel Böhme
Hi,

This patch fixes
* the stack overflow reported in PR68159 in cplus_demangle_print_callback,
* a potential stack overflow in d_demangle_callback,
* a potential stack overflow in is_ctor_or_dtor, and
* six potential buffer overflows (less memory is initialised than needed
  because of integer overflow).

The stack overflow reported in PR68159 occurs because an array is given too
much memory on the stack (447kb).
Similar stack overflows might occur in the remaining five dynamically allocated
arrays in this and the other two functions.

Since the array size is controlled by the mangled string, we had better guard
against integer overflows, and thus buffer overflows, for these six arrays.

The patch allocates memory from the heap (xmalloc) instead of from the stack 
(dynamic arrays, alloca), checks for integer overflows, and frees the allocated 
memory before function return / abort.
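
The general pattern (check the multiplication, then allocate on the heap)
can be sketched as follows. This is not the patch itself: the patch uses
xmalloc and xmalloc_failed and compares against INT_MAX, whereas this
standalone version uses plain malloc, checks against SIZE_MAX, and
returns NULL. toy_alloc_array is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate room for COUNT elements of ELEM_SIZE bytes on the heap.
   Returns NULL if COUNT is negative, if COUNT * ELEM_SIZE would overflow
   size_t, or if malloc fails.  The caller frees the result.  */
void *toy_alloc_array (int count, size_t elem_size)
{
  if (count < 0)
    return NULL;
  /* Division avoids performing the multiplication that might overflow.  */
  if (elem_size != 0 && (size_t) count > SIZE_MAX / elem_size)
    return NULL;
  /* Request at least one byte so that NULL always signals failure.  */
  size_t bytes = (size_t) count * elem_size;
  return malloc (bytes != 0 ? bytes : 1);
}
```

Each call site then pairs the allocation with a free on every return
path, which is the part of the change that needs the most care.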

Best regards,
- Marcel


Index: ChangeLog
===
--- ChangeLog   (revision 235941)
+++ ChangeLog   (working copy)
@@ -1,3 +1,14 @@
+2016-05-06  Marcel Böhme  
+
+   PR c++/68159
+   * cp-demangle.c: Check for overflow and allocate arrays of user-defined 
+   size on the heap, not on the stack.
+   (CP_DYNAMIC_ARRAYS): Remove redundant definition.
+   (cplus_demangle_print_callback): Check for overflow. Allocate memory 
+   for two arrays on the heap. Free memory before return / exit.
+   (d_demangle_callback): Likewise.
+   (is_ctor_or_dtor): Likewise. 
+
 2016-05-02  Marcel Böhme  
 
PR c++/70498
Index: cp-demangle.c
===
--- cp-demangle.c   (revision 235941)
+++ cp-demangle.c   (working copy)
@@ -186,20 +186,6 @@ static void d_init_info (const char *, int, size_t
 #define CP_STATIC_IF_GLIBCPP_V3
 #endif /* ! defined(IN_GLIBCPP_V3) */
 
-/* See if the compiler supports dynamic arrays.  */
-
-#ifdef __GNUC__
-#define CP_DYNAMIC_ARRAYS
-#else
-#ifdef __STDC__
-#ifdef __STDC_VERSION__
-#if __STDC_VERSION__ >= 199901L
-#define CP_DYNAMIC_ARRAYS
-#endif /* __STDC__VERSION >= 199901L */
-#endif /* defined (__STDC_VERSION__) */
-#endif /* defined (__STDC__) */
-#endif /* ! defined (__GNUC__) */
-
 /* We avoid pulling in the ctype tables, to prevent pulling in
additional unresolved symbols when this code is used in a library.
FIXME: Is this really a valid reason?  This comes from the original
@@ -4125,26 +4111,26 @@ cplus_demangle_print_callback (int options,
   struct d_print_info dpi;
 
   d_print_init (&dpi, callback, opaque, dc);
+  
+  if (dpi.num_copy_templates > INT_MAX / (int) sizeof (*dpi.copy_templates))
+xmalloc_failed(INT_MAX);
+  dpi.copy_templates = 
+  (struct d_print_template *) xmalloc (dpi.num_copy_templates 
+ * sizeof (*dpi.copy_templates));
 
-  {
-#ifdef CP_DYNAMIC_ARRAYS
-__extension__ struct d_saved_scope scopes[dpi.num_saved_scopes];
-__extension__ struct d_print_template temps[dpi.num_copy_templates];
+  if (dpi.num_saved_scopes > INT_MAX / (int) sizeof (*dpi.saved_scopes))
+xmalloc_failed(INT_MAX);
+  dpi.saved_scopes = 
+  (struct d_saved_scope *) xmalloc (dpi.num_saved_scopes 
+   * sizeof (*dpi.saved_scopes));
 
-dpi.saved_scopes = scopes;
-dpi.copy_templates = temps;
-#else
-dpi.saved_scopes = alloca (dpi.num_saved_scopes
-  * sizeof (*dpi.saved_scopes));
-dpi.copy_templates = alloca (dpi.num_copy_templates
-* sizeof (*dpi.copy_templates));
-#endif
+  d_print_comp (&dpi, options, dc);
 
-d_print_comp (&dpi, options, dc);
-  }
-
   d_print_flush (&dpi);
 
+  free(dpi.copy_templates);
+  free(dpi.saved_scopes);
+
   return ! d_print_saw_error (&dpi);
 }
 
@@ -5945,18 +5931,17 @@ d_demangle_callback (const char *mangled, int opti
 
   cplus_demangle_init_info (mangled, options, strlen (mangled), &di);
 
-  {
-#ifdef CP_DYNAMIC_ARRAYS
-__extension__ struct demangle_component comps[di.num_comps];
-__extension__ struct demangle_component *subs[di.num_subs];
+  if (di.num_comps > INT_MAX / (int) sizeof (*di.comps))
+xmalloc_failed(INT_MAX);
+  di.comps = (struct demangle_component *) xmalloc (di.num_comps 
+   * sizeof (*di.comps));
 
-di.comps = comps;
-di.subs = subs;
-#else
-di.comps = alloca (di.num_comps * sizeof (*di.comps));
-di.subs = alloca (di.num_subs * sizeof (*di.subs));
-#endif
+  if (di.num_subs > INT_MAX / (int) sizeof (*di.subs))
+xmalloc_failed(INT_MAX);
+  di.subs = (struct demangle_component **) xmalloc (di.num_subs 
+  * sizeof (*di.subs));
 
+  {
 switch (type)
   {
   case DCT_TYPE:
@@ -5977,6 +5962,8 @@ d_demangle_callback (const char *mangled, int opti
d_advance (&di, strlen (d_str (&di)));

[RS6000] complex long double ABI_V4 fix

2016-05-05 Thread Alan Modra
Revision 235792 regressed compat/scalar-by-value-6 for powerpc-linux
-m32 due to accidentally changing the ABI.  By another historical
accident, complex long double is stupidly passed in gprs for -m32.

Bootstrapped and regression tested powerpc64-linux.  Also fixes
gfortran.dg/{large_real_kind_2.F90,large_real_kind_form_io_1.f90}.
OK to apply?

* config/rs6000/rs6000.c (rs6000_function_arg): Exclude IBM
complex long double from args passed in fprs for ABI_V4.
(rs6000_function_arg_boundary, rs6000_function_arg_advance_1,
rs6000_gimplify_va_arg): Likewise.

diff --git a/gcc/config/rs6000/rs6000.c b/gcc/config/rs6000/rs6000.c
index 1215925..9c7a37b 100644
--- a/gcc/config/rs6000/rs6000.c
+++ b/gcc/config/rs6000/rs6000.c
@@ -10142,6 +10142,7 @@ rs6000_function_arg_boundary (machine_mode mode, const_tree type)
   && (GET_MODE_SIZE (mode) == 8
  || (TARGET_HARD_FLOAT
  && TARGET_FPRS
+ && !(mode == ICmode || (!TARGET_IEEEQUAD && mode == TCmode))
  && FLOAT128_2REG_P (mode
 return 64;
   else if (FLOAT128_VECTOR_P (mode))
@@ -10524,7 +10525,8 @@ rs6000_function_arg_advance_1 (CUMULATIVE_ARGS *cum, machine_mode mode,
   if (TARGET_HARD_FLOAT && TARGET_FPRS
  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
  || (TARGET_DOUBLE_FLOAT && mode == DFmode)
- || FLOAT128_2REG_P (mode)
+ || (!(mode == ICmode || (!TARGET_IEEEQUAD && mode == TCmode))
+ && FLOAT128_2REG_P (mode))
  || DECIMAL_FLOAT_MODE_P (mode)))
{
  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -11185,7 +11187,10 @@ rs6000_function_arg (cumulative_args_t cum_v, machine_mode mode,
   if (TARGET_HARD_FLOAT && TARGET_FPRS
  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
  || (TARGET_DOUBLE_FLOAT && mode == DFmode)
- || FLOAT128_2REG_P (mode)
+ /* ABI_V4 passes complex IBM long double in 8 gprs.
+Stupid, but we can't change the ABI now.  */
+ || (!(mode == ICmode || (!TARGET_IEEEQUAD && mode == TCmode))
+ && FLOAT128_2REG_P (mode))
  || DECIMAL_FLOAT_MODE_P (mode)))
{
  /* _Decimal128 must use an even/odd register pair.  This assumes
@@ -12107,19 +12112,21 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
   rsize = (size + 3) / 4;
   align = 1;
 
+  machine_mode mode = TYPE_MODE (type);
   if (TARGET_HARD_FLOAT && TARGET_FPRS
-  && ((TARGET_SINGLE_FLOAT && TYPE_MODE (type) == SFmode)
+  && ((TARGET_SINGLE_FLOAT && mode == SFmode)
   || (TARGET_DOUBLE_FLOAT 
-  && (TYPE_MODE (type) == DFmode 
- || FLOAT128_2REG_P (TYPE_MODE (type))
- || DECIMAL_FLOAT_MODE_P (TYPE_MODE (type))
+  && (mode == DFmode
+ || (!(mode == ICmode || (!TARGET_IEEEQUAD && mode == TCmode))
+ && FLOAT128_2REG_P (mode))
+ || DECIMAL_FLOAT_MODE_P (mode)
 {
   /* FP args go in FP registers, if present.  */
   reg = fpr;
   n_reg = (size + 7) / 8;
   sav_ofs = ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? 8 : 4) * 4;
   sav_scale = ((TARGET_HARD_FLOAT && TARGET_DOUBLE_FLOAT) ? 8 : 4);
-  if (TYPE_MODE (type) != SFmode && TYPE_MODE (type) != SDmode)
+  if (mode != SFmode && mode != SDmode)
align = 8;
 }
   else
@@ -12139,7 +12146,7 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
   addr = create_tmp_var (ptr_type_node, "addr");
 
   /*  AltiVec vectors never go in registers when -mabi=altivec.  */
-  if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (TYPE_MODE (type)))
+  if (TARGET_ALTIVEC_ABI && ALTIVEC_VECTOR_MODE (mode))
 align = 16;
   else
 {
@@ -12160,7 +12167,7 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
}
   /* _Decimal128 is passed in even/odd fpr pairs; the stored
 reg number is 0 for f1, so we want to make it odd.  */
-  else if (reg == fpr && TYPE_MODE (type) == TDmode)
+  else if (reg == fpr && mode == TDmode)
{
  t = build2 (BIT_IOR_EXPR, TREE_TYPE (reg), unshare_expr (reg),
  build_int_cst (TREE_TYPE (reg), 1));
@@ -12187,7 +12194,7 @@ rs6000_gimplify_va_arg (tree valist, tree type, gimple_seq *pre_p,
 FP register for 32-bit binaries.  */
   if (TARGET_32BIT
  && TARGET_HARD_FLOAT && TARGET_FPRS
- && TYPE_MODE (type) == SDmode)
+ && mode == SDmode)
t = fold_build_pointer_plus_hwi (t, size);
 
   gimplify_assign (addr, t, pre_p);

-- 
Alan Modra
Australia Development Lab, IBM


Re: Allow embedded timestamps by C/C++ macros to be set externally (3)

2016-05-05 Thread Andreas Schwab
Eduard Sanou  writes:

> diff --git a/gcc/c-family/c-common.h b/gcc/c-family/c-common.h
> index 1714284..dea2900 100644
> --- a/gcc/c-family/c-common.h
> +++ b/gcc/c-family/c-common.h
> @@ -1086,6 +1086,16 @@ extern vec *make_tree_vector_copy (const 
> vec *);
> c_register_builtin_type.  */
>  extern GTY(()) tree registered_builtin_types;
>  
> +/* Read SOURCE_DATE_EPOCH from environment to have a deterministic
> +   timestamp to replace embedded current dates to get reproducible
> +   results.  Returns -1 if SOURCE_DATE_EPOCH is not defined.  */
> +extern time_t cb_get_source_date_epoch (cpp_reader *pfile);
> +
> +/* The value (as a unix timestamp) corresponds to date 
> +   "Dec 31  23:59:59 UTC", which is the latest date that __DATE__ and 
> +   __TIME__ can store.  */
> +#define MAX_SOURCE_DATE_EPOCH 253402300799

This is bigger than INT_MAX; doesn't it trigger a warning that breaks
bootstrap?
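
One way to sidestep the question, shown here only as a sketch with
hypothetical names (TOY_MAX_SOURCE_DATE_EPOCH, toy_epoch_limit_exceeds_int),
is to give the constant an explicit long long type. Whether the unsuffixed
form actually warns depends on the language standard and host in use, which
this example does not try to settle.

```c
#include <limits.h>

/* 9999-12-31 23:59:59 UTC as seconds since the Unix epoch.  The LL suffix
   makes the constant long long regardless of the width of int or long.  */
#define TOY_MAX_SOURCE_DATE_EPOCH 253402300799LL

/* Returns 1 if the constant does not fit in int, 0 otherwise.  */
int toy_epoch_limit_exceeds_int (void)
{
  return TOY_MAX_SOURCE_DATE_EPOCH > (long long) INT_MAX;
}
```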

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [RS6000] complex long double ABI_V4 fix

2016-05-05 Thread Alan Modra
On Fri, May 06, 2016 at 03:54:43PM +0930, Alan Modra wrote:
> Revision 235792 regressed compat/scalar-by-value-6 for powerpc-linux

Sorry, typo in the revision.  Should be 235794, git 3c62cae0.

-- 
Alan Modra
Australia Development Lab, IBM


Re: [PATCH] Make basic asm implicitly clobber memory

2016-05-05 Thread David Wohlferd

On 5/5/2016 10:29 AM, Bernd Edlinger wrote:

Hi!

this patch is inspired by recent discussion about basic asm:

Currently a basic asm is an instruction scheduling barrier,
but not a memory barrier, and, most surprisingly, basic asm
does _not_ implicitly clobber CC on targets where
extended asm always implicitly clobbers CC, even if
nothing is in the clobber section.

This patch makes basic asm implicitly clobber CC on certain
targets, and makes the basic asm implicitly clobber memory,
but no general registers, which is what could be expected.

This is however only done for basic asm with non-empty
assembler string, which is in sync with David's proposed
basic asm warnings patch.
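
For comparison, this is how a memory clobber is spelled explicitly today
with extended asm. It is a sketch with hypothetical names (toy_shared_flag,
toy_compiler_barrier, toy_store_then_reload); it uses an empty extended
asm, so it shows the semantics the patch would make implicit for non-empty
basic asm rather than the patched behavior itself.

```c
/* A global the compiler could otherwise keep cached in a register.  */
int toy_shared_flag;

/* Extended asm with an explicit "memory" clobber: the compiler must
   assume the statement may read or write any memory.  */
static inline void toy_compiler_barrier (void)
{
  __asm__ __volatile__ ("" : : : "memory");
}

/* The store is completed before the barrier and the value is loaded
   again after it, instead of being forwarded from VALUE.  */
int toy_store_then_reload (int value)
{
  toy_shared_flag = value;
  toy_compiler_barrier ();
  return toy_shared_flag;
}
```

The observable result is unchanged (the function still returns the value
it stored); only the generated loads and stores differ.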

Due to the change in the tree representation, where
ASM_INPUT can now be the first element of a
PARALLEL block with the implicit clobber elements,
there are some changes necessary.

Most of the changes in the middle end were necessary
because extract_asm_operands cannot be used to find out
if a PARALLEL block is an asm statement, but in most cases
asm_noperands can be used instead.

There are also changes necessary in two targets: pa, and ia64.
I have successfully built cross-compilers for these targets.

Bootstrapped and reg-tested on x86_64-pc-linux-gnu.
OK for trunk?


A few questions:

1) I'm not clear precisely what problem this patch fixes.  It's true 
that some people have incorrectly assumed that basic asm clobbers memory 
and this change would fix their code.  But some people also incorrectly 
assume it clobbers registers.  I assume that's why Jeff Law proposed 
making basic asm "an opaque blob that read/write/clobber any register or 
memory location."  Do we have enough problem reports from users to know 
which is the real solution here?


2) The -Wbasic-asm warning patch wasn't approved for v6.  If we are 
going to change this behavior now, is it time?


3) I assume there are good reasons why extended asm can't be used at top 
level.  Will adding these clobbers cause those problems in basic asm too?


4) There are more basic asm docs that need to change: "It also does not 
know about side effects of the assembler code, such as modifications to 
memory or registers. Unlike some compilers, GCC assumes that no changes 
to either memory or registers occur. This assumption may change in a 
future release."


dw