Re: [PATCH] Fix for PR26702: Emit .size for BSS variables on arm-eabi

2015-04-30 Thread Bin.Cheng
On Thu, Apr 23, 2015 at 10:51 PM, Ramana Radhakrishnan
 wrote:
> On Mon, Mar 30, 2015 at 9:25 PM, Kwok Cheung Yeung  
> wrote:
>> This is a simple patch that ensures that a .size directive is emitted when
>> space is allocated for a static variable in the BSS on bare-metal ARM
>> targets. This allows other tools such as GDB to look up the size of the
>> object correctly.
>>
>> Before:
>>
>> $ readelf -s pr26702.o
>>
>> Symbol table '.symtab' contains 10 entries:
>>    Num: Value  Size Type    Bind   Vis      Ndx Name
>> ...
>>      6:           0 NOTYPE  LOCAL  DEFAULT    3 static_foo
>> ...
>>
>> After:
>>
>> $ readelf -s pr26702.o
>>
>> Symbol table '.symtab' contains 10 entries:
>>    Num: Value  Size Type    Bind   Vis      Ndx Name
>> ...
>>      6:           4 NOTYPE  LOCAL  DEFAULT    3 static_foo
>> ...
>>
>> The testsuite has been run with an i686-pc-linux-gnu hosted cross-compiler
>> targeted at arm-none-eabi with no regressions.
>>
>> Kwok
>>
>>
>> 2015-03-30  Kwok Cheung Yeung  
>>
>> gcc/
>> PR target/26702
>> * config/arm/unknown-elf.h (ASM_OUTPUT_ALIGNED_DECL_LOCAL): Emit
>> size of local.
>>
>> gcc/testsuite/
>> PR target/26702
>> * gcc.target/arm/pr26702.c: New test.
>>
>> Index: gcc/testsuite/gcc.target/arm/pr26702.c
>> ===
>> --- gcc/testsuite/gcc.target/arm/pr26702.c  (revision 0)
>> +++ gcc/testsuite/gcc.target/arm/pr26702.c  (revision 0)
>> @@ -0,0 +1,4 @@
>> +/* { dg-do compile { target arm*-*-eabi* } } */
>> +/* { dg-final { scan-assembler "\\.size\[\\t \]+static_foo, 4" } } */
>> +int foo;
>> +static int static_foo;
>> Index: gcc/config/arm/unknown-elf.h
>> ===
>> --- gcc/config/arm/unknown-elf.h(revision 447549)
>> +++ gcc/config/arm/unknown-elf.h(working copy)
>> @@ -81,6 +81,8 @@
>>ASM_OUTPUT_ALIGN (FILE, floor_log2 (ALIGN / BITS_PER_UNIT)); \
>>ASM_OUTPUT_LABEL (FILE, NAME);   \
>>fprintf (FILE, "\t.space\t%d\n", SIZE ? (int)(SIZE) : 1); \
>> +  fprintf (FILE, "\t.size\t%s, %d\n",  \
>> +  NAME, SIZE ? (int)(SIZE) : 1);   \
>>  }  \
>>while (0)
>>
>
>
> Now applied as attached with the following modifications.
>
> Sorry about the delay - I've been away for a bit and couldn't attend
> to committing this.

Hi Kwok,
The newly introduced test case fails on arm-none-linux-gnueabi and
arm-none-linux-gnueabihf.  Could you please have a look at it?

FAIL: gcc.target/arm/pr26702.c scan-assembler \\.size[\\t ]+static_foo, 4

PR65937 is filed for tracking this.

Thanks,
bin


>
> Thanks
> Ramana


Re: More type narrowing in match.pd

2015-04-30 Thread Marc Glisse

On Wed, 29 Apr 2015, Jeff Law wrote:

This is an incremental improvement to the type narrowing in match.pd. It's 
largely based on the pattern I added to fix 47477.


Basically if we have

(bit_and (arith_op (convert A) (convert B)) mask)

Where the conversions are widening and the mask turns off all bits outside 
the original types of A & B, then we can turn that into


(bit_and (arith_op A B) mask)

We may need to convert A & B to an unsigned type with the same 
width/precision as their original type, but that's still better than a 
widening conversion.
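
The equivalence described above can be checked exhaustively for 8-bit
operands. The following is only a sketch in plain C++, not match.pd code:
the helper names are made up for illustration, and the narrow wrapping
addition is modelled by truncating the sum to uint8_t.

```cpp
#include <cstdint>

// Wide form: (bit_and (plus (convert A) (convert B)) mask), computed in int.
inline int wide_add_masked(std::uint8_t a, std::uint8_t b, int mask) {
  return (static_cast<int>(a) + static_cast<int>(b)) & mask;
}

// Narrow form: the addition wraps at 8 bits (modelled by truncating the
// sum to uint8_t), then the mask is applied and the result is widened.
inline int narrow_add_masked(std::uint8_t a, std::uint8_t b, int mask) {
  const std::uint8_t sum = static_cast<std::uint8_t>(a + b);
  return static_cast<int>(sum & static_cast<std::uint8_t>(mask));
}

// Compares both forms for every pair of 8-bit operands.
inline bool forms_agree(int mask) {
  for (int a = 0; a < 256; ++a)
    for (int b = 0; b < 256; ++b) {
      const std::uint8_t ua = static_cast<std::uint8_t>(a);
      const std::uint8_t ub = static_cast<std::uint8_t>(b);
      if (wide_add_masked(ua, ub, mask) != narrow_add_masked(ua, ub, mask))
        return false;
    }
  return true;
}
```

With these helpers, masks that fit in the original 8-bit type (0xff, 0x0f)
make the two forms agree, whereas a mask such as 0x1ff, which keeps a bit
outside that type, makes them differ (for example a = 255, b = 1).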


Bootstrapped and regression tested on x86_64-linux-gnu.

OK for the trunk?


+/* This is another case of narrowing, specifically when there's an outer
+   BIT_AND_EXPR which masks off bits outside the type of the innermost
+   operands.   Like the previous case we have to convert the operands
+   to unsigned types to avoid introducing undefined behaviour for the
+   arithmetic operation.  */
+(for op (minus plus)

No mult? or widen_mult with a different pattern? (maybe that's already 
done elsewhere)


+  (simplify
+(bit_and (op (convert@2 @0) (convert@3 @1)) INTEGER_CST@4)

Maybe op@5 and then test single_use on @5? If I compute something, and 
before using it I test if the result is odd, I may not want to recompute 
it.


+(if (INTEGRAL_TYPE_P (type)

Can this be false, or is it for documentation?

+/* We check for type compatibility between @0 and @1 below,
+   so there's no need to check that @1/@3 are integral types.  */
+&& INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
+/* The precision of the type of each operand must match the
+   precision of the mode of each operand, similarly for the
+   result.  */

A nicely named helper that does this test would be cool. Every time I
see it I have to think again why it is necessary, and if there was a
function, I could refer to the comment above its definition ;-)

+&& (TYPE_PRECISION (TREE_TYPE (@0))
+== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
+&& (TYPE_PRECISION (TREE_TYPE (@1))
+== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1))))
+&& TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
+/* The inner conversion must be a widening conversion.  */
+&& TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
+&& ((GENERIC
+ && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
+ == TYPE_MAIN_VARIANT (TREE_TYPE (@1))))
+|| (GIMPLE
+&& types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))))

We don't need to be that strict, but this probably covers the most
common case.

+&& (tree_int_cst_min_precision (@4, UNSIGNED)
+<= TYPE_PRECISION (TREE_TYPE (@0))))
+  (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
+   (with { tree ntype = TREE_TYPE (@0); }
+ (convert (bit_and (op @0 @1) (convert:ntype @4)))))
+  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
+   (convert (bit_and (op (convert:utype @0) (convert:utype @1))
+ (convert:utype @4)))))))


--
Marc Glisse


RE: [PATCH 2/3, ARM, libgcc, ping6] Code size optimization for the fmul/fdiv and dmul/ddiv function in libgcc

2015-04-30 Thread Thomas Preud'homme
Here is an updated patch that prefixes local symbols with __ for more safety.
They appear in the symtab as local so it is not strictly necessary but one is
never too cautious. Being local, they also do not generate any PLT entry.
They appear only because the jumps are from one section to another
(which is the whole purpose of this patch) and thus need a static relocation.

I hope this revised version addresses all your concerns.

ChangeLog entry is unchanged:

*** gcc/libgcc/ChangeLog ***

2015-04-30   Tony Wang 

* config/arm/ieee754-sf.S: Expose symbols around fragment boundaries as 
function symbols.
* config/arm/ieee754-df.S: Same with above

diff --git a/libgcc/config/arm/ieee754-df.S b/libgcc/config/arm/ieee754-df.S
index c1468dc..39b0028 100644
--- a/libgcc/config/arm/ieee754-df.S
+++ b/libgcc/config/arm/ieee754-df.S
@@ -559,7 +559,7 @@ ARM_FUNC_ALIAS aeabi_l2d floatdidf
 
 #ifdef L_arm_muldivdf3
 
-ARM_FUNC_START muldf3
+ARM_FUNC_START muldf3, function_section
 ARM_FUNC_ALIAS aeabi_dmul muldf3
do_push {r4, r5, r6, lr}
 
@@ -571,7 +571,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3
COND(and,s,ne)  r5, ip, yh, lsr #20
teqne   r4, ip
teqne   r5, ip
-   bleqLSYM(Lml_s)
+   bleq__Lml_s
 
@ Add exponents together
add r4, r4, r5
@@ -689,7 +689,7 @@ ARM_FUNC_ALIAS aeabi_dmul muldf3
subsip, r4, #(254 - 1)
do_it   hi
cmphi   ip, #0x700
-   bhi LSYM(Lml_u)
+   bhi __Lml_u
 
@ Round the result, merge final exponent.
cmp lr, #0x8000
@@ -716,9 +716,12 @@ LSYM(Lml_1):
mov lr, #0
subsr4, r4, #1
 
-LSYM(Lml_u):
+   FUNC_END aeabi_dmul
+   FUNC_END muldf3
+
+ARM_SYM_START __Lml_u
@ Overflow?
-   bgt LSYM(Lml_o)
+   bgt __Lml_o
 
@ Check if denormalized result is possible, otherwise return signed 0.
cmn r4, #(53 + 1)
@@ -778,10 +781,11 @@ LSYM(Lml_u):
do_it   eq
biceq   xl, xl, r3, lsr #31
RETLDM  "r4, r5, r6"
+   SYM_END __Lml_u
 
@ One or both arguments are denormalized.
@ Scale them leftwards and preserve sign bit.
-LSYM(Lml_d):
+ARM_SYM_START __Lml_d
teq r4, #0
bne 2f
and r6, xh, #0x8000
@@ -804,8 +808,9 @@ LSYM(Lml_d):
beq 3b
orr yh, yh, r6
RET
+   SYM_END __Lml_d
 
-LSYM(Lml_s):
+ARM_SYM_START __Lml_s
@ Isolate the INF and NAN cases away
teq r4, ip
and r5, ip, yh, lsr #20
@@ -817,10 +822,11 @@ LSYM(Lml_s):
orrsr6, xl, xh, lsl #1
do_it   ne
COND(orr,s,ne)  r6, yl, yh, lsl #1
-   bne LSYM(Lml_d)
+   bne __Lml_d
+   SYM_END __Lml_s
 
@ Result is 0, but determine sign anyway.
-LSYM(Lml_z):
+ARM_SYM_START __Lml_z
eor xh, xh, yh
and xh, xh, #0x8000
mov xl, #0
@@ -832,41 +838,42 @@ LSYM(Lml_z):
moveq   xl, yl
moveq   xh, yh
COND(orr,s,ne)  r6, yl, yh, lsl #1
-   beq LSYM(Lml_n) @ 0 * INF or INF * 0 -> NAN
+   beq __Lml_n @ 0 * INF or INF * 0 -> NAN
teq r4, ip
bne 1f
orrsr6, xl, xh, lsl #12
-   bne LSYM(Lml_n) @ NAN *  -> NAN
+   bne __Lml_n @ NAN *  -> NAN
 1: teq r5, ip
-   bne LSYM(Lml_i)
+   bne __Lml_i
orrsr6, yl, yh, lsl #12
do_it   ne, t
movne   xl, yl
movne   xh, yh
-   bne LSYM(Lml_n) @  * NAN -> NAN
+   bne __Lml_n @  * NAN -> NAN
+   SYM_END __Lml_z
 
@ Result is INF, but we need to determine its sign.
-LSYM(Lml_i):
+ARM_SYM_START __Lml_i
eor xh, xh, yh
+   SYM_END __Lml_i
 
@ Overflow: return INF (sign already in xh).
-LSYM(Lml_o):
+ARM_SYM_START __Lml_o
and xh, xh, #0x8000
orr xh, xh, #0x7f00
orr xh, xh, #0x00f0
mov xl, #0
RETLDM  "r4, r5, r6"
+   SYM_END __Lml_o
 
@ Return a quiet NAN.
-LSYM(Lml_n):
+ARM_SYM_START __Lml_n
orr xh, xh, #0x7f00
orr xh, xh, #0x00f8
RETLDM  "r4, r5, r6"
+   SYM_END __Lml_n
 
-   FUNC_END aeabi_dmul
-   FUNC_END muldf3
-
-ARM_FUNC_START divdf3
+ARM_FUNC_START divdf3 function_section
 ARM_FUNC_ALIAS aeabi_ddiv divdf3

do_push {r4, r5, r6, lr}
@@ -985,7 +992,7 @@ ARM_FUNC_ALIAS aeabi_ddiv divdf3
subsip, r4, #(254 - 1)
do_it   hi
cmphi   ip, #0x700
-   bhi LSYM(Lml_u)
+   bhi __Lml_u
 
@ Round the result, merge final exponent.
subsip, r5, yh
@@ -1009,13 +1016,13 @@ LSYM(Ldv_1):
orr xh, xh, #0x0010
mov lr, #0
subsr4, r4, #1
-   b   LSYM(Lml_u)
+   b   __Lml_u
 
@ Result mightt need to be denormaliz

Re: [PATCH, GCC, stage1] Fallback to copy-prop if constant-prop not possible

2015-04-30 Thread Bin.Cheng
On Fri, Apr 24, 2015 at 12:52 PM, Thomas Preud'homme
 wrote:
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Friday, April 24, 2015 11:15 AM
>>
>> So revised review is "ok for the trunk" :-)
>
> Committed.
Hi Thomas,
The newly introduced test fails on arm-none-linux-gnueabi and
arm-none-linux-gnueabihf.  Could you please have a look at it?
FAIL: gcc.target/arm/pr64616.c scan-assembler-times ldr 2

GCC was configured with
gcc/configure --target=arm-none-linux-gnueabi --prefix=
--with-sysroot=... --enable-shared --disable-libsanitizer
--disable-libssp --disable-libmudflap
--with-plugin-ld=arm-none-linux-gnueabi-ld --enable-checking=yes
--enable-languages=c,c++,fortran --with-gmp=... --with-mpfr=...
--with-mpc=... --with-isl=... --with-cloog=... --with-arch=armv7-a
--with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a

Thanks,
bin

>
> Best regards,
>
> Thomas
>
>
>


Re: Mostly rewrite genrecog

2015-04-30 Thread Richard Sandiford
Andreas Schwab  writes:
> Richard Sandiford  writes:
>
>> /* Represents a test and the action that should be taken on the result.
>>If a transition exists for the test outcome, the machine switches
>>to the transition's target state.  If no suitable transition exists,
>>the machine either falls through to the next decision or, if there are no
>>more decisions to try, fails the match.  */
>> struct decision : list_head 
>> {
>>   decision (const test &);
>>
>>   void set_parent (list_head  *s);
>>   bool if_statement_p (uint64_t * = 0) const;
>>
>>   /* The state to which this decision belongs.  */
>>   state *s;
>>
>>   /* Links to other decisions in the same state.  */
>>   decision *prev, *next;
>>
>>   /* The test to perform.  */
>>   struct test test;
>> };
>
> ../../gcc/genrecog.c:1467: error: declaration of 'test decision::test'
> ../../gcc/genrecog.c:1051: error: changes meaning of 'test' from 'struct test'
>
> Bootstrap compiler is gcc 4.3.4.

Bah.  Does it like "::test test" instead of "struct test test"?
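
For reference, one way to sidestep this class of error is to qualify every
mention of the type inside the class. The sketch below is a reduced
stand-in, not the genrecog code: the `kind` field and the `decision_kind`
helper are hypothetical, and whether GCC 4.3.4 accepts it has not been
verified here.

```cpp
// Stand-in for genrecog's test type; the single field is hypothetical.
struct test
{
  int kind;
};

struct decision
{
  // Every mention of the type is written as ::test, so the unqualified
  // name 'test' is never looked up inside the class before the member
  // below gives that name a new meaning.
  explicit decision (const ::test &t) : test (t) {}

  /* The test to perform.  */
  ::test test;
};

// Hypothetical accessor, only here to exercise the member.
inline int decision_kind (const decision &d) { return d.test.kind; }
```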

Richard



Re: ping: [PATCH, ARM] attribute target (thumb,arm) [0-6]

2015-04-30 Thread Ramana Radhakrishnan
On Mon, Apr 20, 2015 at 9:35 AM, Christian Bruel  wrote:
> Hello Ramana
>
>>>
>>
>> Can you respin this now that we are in stage1 again ?
>>
>> Ramana
>>
>
> Attached the rebased, rechecked set of patches. Original with comments
> posted in
>
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02455.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02458.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02460.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02461.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02463.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02467.html
> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02468.html
>
> many thanks,
>
> Christian


A general note: please reply to each of the patches with a rebased
patch as a separate email. Furthermore, all your patches appear to
have DOS line endings, so they don't apply cleanly. Please don't
include spurious headers in your patch submission, as they make the
patch hard to apply; please create it in a way that it is easily
applied by someone trying it out. It looks like p4 needs a respin, as
I got a reject when applying the documentation patch to my tree.

I tried the following decoration on foo in gcc.target/arm/attr_arm.c


int __attribute__((target("arm, fpu=vfpv4")))
foo(int a)
{
  return a ? 1 : 5;
}


And the compiler accepts it just fine.

Given that with LTO we are now using target attributes to decide
inlining - I'm not convinced that the inline asm case goes away. In
fact it only makes things worse so I'm almost convinced to forbid
inlining from "arm" to "thumb" or vice-versa, which is a reversal of
my earlier position. I hadn't twigged that LTO would reuse this
infrastructure and it's probably simpler to prevent inlining in those
cases.

Thoughts ?

So in essence I'm still playing with this and would like to iterate
towards a quick solution.

Ramana


Re: [PATCH] PR target/48904 x86_64-knetbsd-gnu missing defs

2015-04-30 Thread Bernhard Reutner-Fischer
Hi,

On 30 April 2015 at 07:00, Jeff Law  wrote:
> On 04/29/2015 02:01 AM, Bernhard Reutner-Fischer wrote:
>>
>> 2012-09-21  H.J. Lu  
>>
>> PR target/48904
>> * config.gcc (x86_64-*-knetbsd*-gnu): Add i386/knetbsd-gnu64.h.
>> * config/i386/knetbsd-gnu64.h: New file
>
> OK.  Please install on the trunk.

Hmm, according to https://www.debian.org/ports/netbsd/ the Debian
knetbsd port has been abandoned since about 2002.
If this is true (please confirm) then we should probably remove knetbsd from
- upstream config repo
- GCC
- binutils-gdb

instead of the above patchlet.
This would work equally well for me WRT config-list.mk builds..
[I should have checked this earlier, sorry..]
>
> Thanks,
> Jeff
>


Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-30 Thread Bin.Cheng
On Mon, Apr 27, 2015 at 8:01 PM, Uros Bizjak  wrote:
> On Wed, Feb 4, 2015 at 2:21 PM, Christian Bruel  
> wrote:
>> While trying to reduce the PR64835 case for ARM and x86, I noticed that the
>> alignment flags are cleared for x86 when attribute optimized is used.
>>
>> With the attached testcases, the visible effects are twofold :
>>
>> 1) Functions compiled in with attribute optimize (-O2) are not aligned as if
>> they were with the -O2 flag.
>>
>> 2) can_inline_edge_p fails because opts_for_fn (caller->decl) != opts_for_fn
> (callee->decl)) even though they are compiled with the same optimization
>> level.
>
> 2015-02-06  Christian Bruel  
>
> PR target/64835
> * config/i386/i386.c (ix86_default_align): New function.
> (ix86_override_options_after_change): Call ix86_default_align.
> (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): New hook.
> (ix86_override_options_after_change): New function.
>
> 2015-02-06  Christian Bruel  
>
> PR target/64835
> * gcc.dg/ipa/iinline-attr.c: New test.
> * gcc.target/i386/iinline-attr-2.c: New test.
>
> OK for mainline.

Hi Christian,
I noticed that the test case gcc.dg/ipa/iinline-attr.c fails on aarch64.
The original patch is x86-specific, but the test case was added as a
generic one.  Could you please have a look at this?

FAIL: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline
"hooray[^\\n]*inline copy in test"

Thanks,
bin
>
> Thanks,
> Uros


Re: ping: [PATCH, ARM] attribute target (thumb,arm) [0-6]

2015-04-30 Thread Christian Bruel


On 04/30/2015 09:43 AM, Ramana Radhakrishnan wrote:
> On Mon, Apr 20, 2015 at 9:35 AM, Christian Bruel  
> wrote:
>> Hello Ramana
>>

>>>
>>> Can you respin this now that we are in stage1 again ?
>>>
>>> Ramana
>>>
>>
>> Attached the rebased, rechecked set of patches. Original with comments
>> posted in
>>
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02455.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02458.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02460.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02461.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02463.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02467.html
>> https://gcc.gnu.org/ml/gcc-patches/2014-11/msg02468.html
>>
>> many thanks,
>>
>> Christian
> 
> 
> A general note: please reply to each of the patches with a rebased
> patch as a separate email. Furthermore, all your patches appear to
> have DOS line endings, so they don't apply cleanly. Please don't
> include spurious headers in your patch submission, as they make the
> patch hard to apply; please create it in a way that it is easily
> applied by someone trying it out. It looks like p4 needs a respin, as
> I got a reject when applying the documentation patch to my tree.
> 

OK, thanks for the suggestions and sorry for the p4 reject. The sources
are moving fast and I have a hard time keeping up with rebases.

> I tried the following decoration on foo in gcc.target/arm/attr_arm.c
> 
> 
> int __attribute__((target("arm, fpu=vfpv4")))
> foo(int a)
> {
>   return a ? 1 : 5;
> }
> 
> 
> And the compiler accepts it just fine.

Indeed, that's a mistake for now. Attributes other than the arm/thumb ones
should be rejected (possibly with a "not yet implemented" warning for
the fpu, and an error for the others) until we extend it.

> 
> Given that with LTO we are now using target attributes to decide
> inlining - I'm not convinced that the inline asm case goes away. In
> fact it only makes things worse so I'm almost convinced to forbid
> inlining from "arm" to "thumb" or vice-versa, which is a reversal of
> my earlier position. I hadn't twigged that LTO would reuse this
> infrastructure and it's probably simpler to prevent inlining in those
> cases.

I can resurrect the inline check chunk. FYI, with a few small examples
the arm/thumb attribute is correctly handled by LTO.

> 
> Thoughts ?
> 
> So in essence I'm still playing with this and would like to iterate
> towards a quick solution.
> 

Thanks, it would be good if we could land the arm/thumb attribute and
start the fpu extensions separately. (I'm currently playing with
fpu=neon, but it will take time to have something solid.)

Christian

> Ramana
> 


Re: [PATCH, x86] Add TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE hook

2015-04-30 Thread Christian Bruel
OK, I'll have a look.

thanks

Christian


On 04/30/2015 10:27 AM, Bin.Cheng wrote:
> On Mon, Apr 27, 2015 at 8:01 PM, Uros Bizjak  wrote:
>> On Wed, Feb 4, 2015 at 2:21 PM, Christian Bruel  
>> wrote:
>>> While trying to reduce the PR64835 case for ARM and x86, I noticed that the
>>> alignment flags are cleared for x86 when attribute optimized is used.
>>>
>>> With the attached testcases, the visible effects are twofold :
>>>
>>> 1) Functions compiled in with attribute optimize (-O2) are not aligned as if
>>> they were with the -O2 flag.
>>>
>>> 2) can_inline_edge_p fails because opts_for_fn (caller->decl) != opts_for_fn
>>> (callee->decl)) even though they are compiled with the same optimization
>>> level.
>>
>> 2015-02-06  Christian Bruel  
>>
>> PR target/64835
>> * config/i386/i386.c (ix86_default_align): New function.
>> (ix86_override_options_after_change): Call ix86_default_align.
>> (TARGET_OVERRIDE_OPTIONS_AFTER_CHANGE): New hook.
>> (ix86_override_options_after_change): New function.
>>
>> 2015-02-06  Christian Bruel  
>>
>> PR target/64835
>> * gcc.dg/ipa/iinline-attr.c: New test.
>> * gcc.target/i386/iinline-attr-2.c: New test.
>>
>> OK for mainline.
> 
> Hi Christian,
> I noticed that the test case gcc.dg/ipa/iinline-attr.c fails on aarch64.
> The original patch is x86-specific, but the test case was added as a
> generic one.  Could you please have a look at this?
> 
> FAIL: gcc.dg/ipa/iinline-attr.c scan-ipa-dump inline
> "hooray[^\\n]*inline copy in test"
> 
> Thanks,
> bin
>>
>> Thanks,
>> Uros


Re: niter_base simplification

2015-04-30 Thread François Dumont

On 27/04/2015 13:55, Jonathan Wakely wrote:

On 22/04/15 22:10 +0200, François Dumont wrote:

Hello

   I don't know if I am missing something but I think __niter_base 
could be simplified to remove usage of _Iter_base. Additionally I 
overload it to also remove __normal_iterator layer even if behind a 
reverse_iterator or move_iterator, might help compiler to optimize 
code, no ? If not, might allow other algo optimization in the future...


   I preferred to provide a __make_reverse_iterator to allow the
latter in C++11 and not only in C++14. Is it fine to do it this way,
or do you prefer to simply get rid of all this part?


It's fine to add __make_reverse_iterator but see my comment below.

   * include/bits/cpp_type_traits.h (__gnu_cxx::__normal_iterator): 
Delete.


You're removing __is_normal_iterator not __normal_iterator.


   * include/bits/stl_algobase.h (std::__niter_base): Adapt.
   * include/bits/stl_iterator.h (__make_reverse_iterator): New in 
C++11.

   (std::__niter_base): Overloads for std::reverse_iterator,
   __gnu_cxx::__normal_iterator and std::move_iterator.

Tested under Linux x86_64. I checked that std::copy still ends up 
calling __builtin_memmove when used on vector iterators.


François



diff --git a/libstdc++-v3/include/bits/stl_algobase.h 
b/libstdc++-v3/include/bits/stl_algobase.h

index 0bcb133..73eea6b 100644
--- a/libstdc++-v3/include/bits/stl_algobase.h
+++ b/libstdc++-v3/include/bits/stl_algobase.h
@@ -270,17 +270,12 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
  return __a;
}

-  // If _Iterator is a __normal_iterator return its base (a plain 
pointer,
-  // normally) otherwise return it untouched.  See copy, fill, ... 
+  // Fallback implementation of the function used to remove the

+  // __normal_iterator wrapper. See copy, fill, ...


It's a bit strange to have a function with no other overloads visible
described as a fallback. It would be good to say that the other
definition is in bits/stl_iterator.h


  template
-struct _Niter_base
-: _Iter_base<_Iterator, __is_normal_iterator<_Iterator>::__value>
-{ };
-
-  template
-inline typename _Niter_base<_Iterator>::iterator_type
+inline _Iterator
__niter_base(_Iterator __it)
-{ return std::_Niter_base<_Iterator>::_S_base(__it); }
+{ return __it; }

  // Likewise, for move_iterator.


This comment no longer makes sense, because you've removed the comment
on _Niter_base that it referred to. Please restore the original text
of the _Niter_base comment for _Miter_base.

(Alternatively, could the same simplification be made for
__miter_base? Do we need _Miter_base<> or just two overloads of
__miter_base()?)


Definitely, I already have a patch for that.





  template
diff --git a/libstdc++-v3/include/bits/stl_iterator.h 
b/libstdc++-v3/include/bits/stl_iterator.h

index 4a9189e..3aad9f3 100644
--- a/libstdc++-v3/include/bits/stl_iterator.h
+++ b/libstdc++-v3/include/bits/stl_iterator.h
@@ -390,7 +390,15 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
{ return __y.base() - __x.base(); }
  //@}

-#if __cplusplus > 201103L
+#if __cplusplus == 201103L
+  template
+inline reverse_iterator<_Iterator>
+__make_reverse_iterator(_Iterator __i)
+{ return reverse_iterator<_Iterator>(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::__make_reverse_iterator(_Iter)
+#elif __cplusplus > 201103L
#define __cpp_lib_make_reverse_iterator 201402

  // _GLIBCXX_RESOLVE_LIB_DEFECTS
@@ -400,6 +408,17 @@ _GLIBCXX_BEGIN_NAMESPACE_VERSION
inline reverse_iterator<_Iterator>
make_reverse_iterator(_Iterator __i)
{ return reverse_iterator<_Iterator>(__i); }
+
+# define _GLIBCXX_MAKE_REVERSE_ITERATOR(_Iter) \
+  std::make_reverse_iterator(_Iter)
+#endif
+
+#if __cplusplus >= 201103L
+  template
+auto
+__niter_base(reverse_iterator<_Iterator> __it)
+-> decltype(_GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base())))
+{ return _GLIBCXX_MAKE_REVERSE_ITERATOR(__niter_base(__it.base())); }

#endif



It might be simpler to just add __make_reverse_iterator for >= 201103L
and then always use std::__make_reverse_iterator instead of a macro.

That's similar to what we do for std::__addressof and std::addressof.
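
The arrangement suggested above might look roughly as follows. This is a
sketch outside of libstdc++: the namespace `sketch` and the name
`make_reverse_iterator_impl` are placeholders for std and
__make_reverse_iterator, chosen so the example compiles on its own.

```cpp
#include <iterator>

namespace sketch
{
  // Internal helper, available from C++11 onwards, so library code can
  // always call it without going through a macro.
  template<typename Iterator>
    inline std::reverse_iterator<Iterator>
    make_reverse_iterator_impl(Iterator i)
    { return std::reverse_iterator<Iterator>(i); }

#if __cplusplus > 201103L
  // Public name, only provided for C++14 and later; it simply forwards
  // to the internal helper.
  template<typename Iterator>
    inline std::reverse_iterator<Iterator>
    make_reverse_iterator(Iterator i)
    { return make_reverse_iterator_impl(i); }
#endif
}
```

Internal callers would use only the helper, so no conditional macro is
needed to select between the two names.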

OK, attached is the patch I plan to commit, which I am testing
at the moment.


François

Index: ChangeLog
===
--- ChangeLog	(revision 222611)
+++ ChangeLog	(working copy)
@@ -1,5 +1,14 @@
 2015-04-30  François Dumont  
 
+	* include/bits/cpp_type_traits.h
+	(__gnu_cxx::__is_normal_iterator): Delete.
+	* include/bits/stl_algobase.h (std::__niter_base): Adapt.
+	* include/bits/stl_iterator.h (__make_reverse_iterator): New in C++11.
+	(std::__niter_base): Overloads for std::reverse_iterator,
+	__gnu_cxx::__normal_iterator and std::move_iterator.
+
+2015-04-30  François Dumont  
+
 	* include/bits/hashtable_policy.h (_Prime_rehash_policy::_S_n_primes):
 	Delete.
 	* src/c++11/hashtable_c++0x.cc (_Pr

Re: [patch] Rewrite check_global_declarations() generically

2015-04-30 Thread Richard Biener
On Wed, Apr 29, 2015 at 3:01 AM, Aldy Hernandez  wrote:
> [This is actually for the debug-early branch, but I figured I'd avoid the
> [debug-early] subject line to alert others of the upcoming change. Actually,
> I should've adapted this and submitted it to mainline, but considering I
> should be submitting the debug-early work "Real Soon Now" (tm), I don't want
> to get side-tracked.]
>
> This is one of those changes that Richi likes-- moving stuff out of the
> front-ends and into generic land...
>
> The check_global_declarations checks were failing in the branch because
> mainline depends on both, trees being present, and tree bits being set late
> in the compilation, sometimes as late as RTL (in the case of
> TREE_SYMBOL_REFERENCED).  The front-ends were making a list of interesting
> globals, and feeding them to check_global_declarations() to issue certain
> use without define (and vice versa) warnings.
>
> Instead of caching these globals from the front-end all the way to the
> back-end, I redesigned it to check global declarations generically, in a
> language agnostic way, while using symtab/cgraph to determine usage (instead
> of TREE_USED and other magic bits that weren't as accurate or in certain
> cases, available).
>
> With this patch we fix a slew of regressions on the debug-early branch, and
> we find a lot more legitimate warnings.  I had to adjust a lot of tests,
> because we're being much more aggressive.  IMO, this is good. Interestingly
> enough, I even found the following in gengtype.c:
>
> +/* ?? Why are we keeping this?  Is this actually used anywhere?  */
> +static void ATTRIBUTE_UNUSED
>  output_typename (outf_p of, const_type_p t)
>
> The reason mainline was not picking this up was because output_typename()
> was calling itself, thus the infrastructure assumed it was used.  By the
> way, can I remove this unused symbol instead of papering over the problem
> with ATTRIBUTE_UNUSED?
>
> The attached patch was tested with GCC and GDB.  It only has one regression,
> which I've asked Jason to look at to determine if it is a false positive or
> not: g++.dg/torture/pr46383.C.
>
> Gentlemen, is this an approach you can bless?  Don't worry about fully
> reviewing it (unless you want to-- no complaints here), I just want to make
> sure it's something I can commit to the branch and continue onto other
> regressions.  You will all get a chance to crucify the entire branch real
> soon :).

Yeah, the approach looks sane.

Richard.

> Thanks.
> Aldy


Re: More type narrowing in match.pd

2015-04-30 Thread Richard Biener
On Thu, Apr 30, 2015 at 5:52 AM, Jeff Law  wrote:
>
> This is an incremental improvement to the type narrowing in match.pd. It's
> largely based on the pattern I added to fix 47477.
>
> Basically if we have
>
> (bit_and (arith_op (convert A) (convert B)) mask)
>
> Where the conversions are widening and the mask turns off all bits outside
> the original types of A & B, then we can turn that into
>
> (bit_and (arith_op A B) mask)
>
> We may need to convert A & B to an unsigned type with the same
> width/precision as their original type, but that's still better than a
> widening conversion.
>
> Bootstrapped and regression tested on x86_64-linux-gnu.
>
> OK for the trunk?

Without looking too closely at this patch, I'll note that we might want to
improve the previous one first to also handle a constant 2nd operand
for the operation (your new one also misses that).

I have in my local dev tree (so completely untested...)

@@ -1040,31 +1052,22 @@ (define_operator_list CBRT BUILT_IN_CBRT
operation and convert the result to the desired type.  */
 (for op (plus minus)
   (simplify
-(convert (op (convert@2 @0) (convert@3 @1)))
+(convert (op:c@4 (convert@2 @0) (convert?@3 @1)))
 (if (INTEGRAL_TYPE_P (type)
-/* We check for type compatibility between @0 and @1 below,
-   so there's no need to check that @1/@3 are integral types.  */
 && INTEGRAL_TYPE_P (TREE_TYPE (@0))
-&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
+&& INTEGRAL_TYPE_P (TREE_TYPE (@4))
 /* The precision of the type of each operand must match the
precision of the mode of each operand, similarly for the
result.  */
 && (TYPE_PRECISION (TREE_TYPE (@0))
 == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
-&& (TYPE_PRECISION (TREE_TYPE (@1))
-== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1))))
-&& TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
 /* The inner conversion must be a widening conversion.  */
 && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
-&& ((GENERIC
- && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
- == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
- && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
- == TYPE_MAIN_VARIANT (type)))
-|| (GIMPLE
-&& types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))
-&& types_compatible_p (TREE_TYPE (@0), type))))
+/* The final precision should match that of operand @0.  */
+&& TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (@0))
+/* Make sure the wide operation is dead after the transform.  */
+&& (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
-   (convert (op @0 @1)))
+   (convert (op @0 (convert @1))))
   (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
 (convert (op (convert:utype @0) (convert:utype @1)))))))

and it was noticed multiple times that the type comparison boiler-plate
needs some helper function.  Like

Index: gimple-match-head.c
===
--- gimple-match-head.c (revision 222375)
+++ gimple-match-head.c (working copy)
@@ -861,3 +861,8 @@
   return op;
 }

+inline bool
+types_match (tree t1, tree t2)
+{
+  return types_compatible_p (t1, t2);
+}
Index: generic-match-head.c
===
--- generic-match-head.c(revision 222375)
+++ generic-match-head.c(working copy)
@@ -70,4 +70,8 @@
 #include "dumpfile.h"
 #include "generic-match.h"

-
+inline bool
+types_match (tree t1, tree t2)
+{
+  return TYPE_MAIN_VARIANT (t1) == TYPE_MAIN_VARIANT (t2);
+}

and then just use types_match (TREE_TYPE (...), TREE_TYPE (...)) everywhere.

I'll also add to the comment about using has_single_use - that will cause
missed optimizations for pattern uses via gimple_build which adds stmts
to a sequence and does not have SSA operands computed (so everything
will appear as having zero uses).  We need to add an abstraction for the
single-use test as well, like for generic-match-head.c

inline bool
single_use (tree t)
{
  return true;
}

and for gimple-match-head.c

inline bool
single_use (tree t)
{
  return TREE_CODE (t) != SSA_NAME || has_zero_uses (t) || has_single_use (t);
}

And if you'd like to lend a helping hand with adding patterns, then
transitioning patterns from fold-const.c to match.pd is more appreciated
than inventing new ones ;)

Thanks,
Richard.

>
> diff --git a/gcc/ChangeLog b/gcc/ChangeLog
> index 5c7558a..51f68ab 100644
> --- a/gcc/ChangeLog
> +++ b/gcc/ChangeLog
> @@ -1,3 +1,8 @@
> +2015-04-29  Jeff Law  
> +
> +   * match.pd (bit_and (plus/minus (convert @0) (convert @1) mask): New
> +   simplifier to narrow arithmetic.
> +
>  2015-04-29  Mikhail Maltsev  
>
> * dojump.c (do_compare_rtx_and_jum

[PATCH][PING] Skip preprocessor directives in mklog

2015-04-30 Thread Yury Gribov

On 04/21/2015 02:26 PM, Yury Gribov wrote:

Hi all,

Contrib/mklog is currently fooled by preprocessor directives inside
functions and produces an invalid ChangeLog.  The attached patch fixes this.

Tested with my local mklog testsuite and http://paste.debian.net/167999/.
Ok to commit?




[C PATCH, committed] Better location for an unknown field in initializer

2015-04-30 Thread Marek Polacek
Bootstrapped/regtested on x86_64-linux, applying to trunk.

2015-04-29  Marek Polacek  

* c-typeck.c (set_init_label): Call error_at instead of error and
pass LOC to it.

* gcc.dg/init-bad-8.c: New test.

diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
index c58e918..466079f 100644
--- gcc/c/c-typeck.c
+++ gcc/c/c-typeck.c
@@ -7926,7 +7926,7 @@ set_init_label (location_t loc, tree fieldname,
   field = lookup_field (constructor_type, fieldname);
 
   if (field == 0)
-error ("unknown field %qE specified in initializer", fieldname);
+error_at (loc, "unknown field %qE specified in initializer", fieldname);
   else
 do
   {
diff --git gcc/testsuite/gcc.dg/init-bad-8.c gcc/testsuite/gcc.dg/init-bad-8.c
index e69de29..b321323 100644
--- gcc/testsuite/gcc.dg/init-bad-8.c
+++ gcc/testsuite/gcc.dg/init-bad-8.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "" } */
+
+struct S { int i, j, k; };
+
+void
+foo (void)
+{
+  struct S s = { .i = 1, .j = 2, .l = 4}; /* { dg-error "34:unknown field .l. specified in initializer" } */
+}

Marek


Re: [PATCH 1/4] match.pd: Add x + (x & 1) -> (x + 1) & ~1 pattern

2015-04-30 Thread Richard Biener
On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes
 wrote:
> gcc.dg/20150120-1.c: New test
>
> Rounding an integer to the next even integer is sometimes written x +=
> x & 1. The equivalent x = (x+1)&~1 usually uses one less register, and
> in practical cases only the new value of x will be used (making it
> unlikely that the subexpression x&1 has any uses).

Now that we are in stage1 again

You are missing a ChangeLog entry and fail to state how you tested the patch.

Otherwise the patch looks ok.
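
For readers who want to convince themselves of the identity behind the
pattern, here is a minimal sketch in plain C.  It is not part of the patch;
the function names are made up for illustration, and it uses unsigned
arithmetic so that wraparound is well defined.

```c
#include <stdint.h>

/* Rounds x up to the next even value, written as in the testcases:
   x + (x & 1).  */
uint32_t
round_even_add (uint32_t x)
{
  return x + (x & 1u);
}

/* The form the proposed simplifier rewrites to: (x + 1) & ~1.  */
uint32_t
round_even_mask (uint32_t x)
{
  return (x + 1u) & ~1u;
}

/* Returns 1 if both forms agree for every x in [lo, hi], else 0.
   Requires lo <= hi.  Unsigned arithmetic wraps, so UINT32_MAX is a
   valid input for both forms.  */
int
forms_agree (uint32_t lo, uint32_t hi)
{
  for (uint32_t x = lo; ; x++)
    {
      if (round_even_add (x) != round_even_mask (x))
        return 0;
      if (x == hi)
        break;
    }
  return 1;
}
```

Calling forms_agree over a range that includes UINT32_MAX also exercises the
wrapping case, where both forms yield 0.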

Thanks,
Richard.

> Signed-off-by: Rasmus Villemoes 
> ---
>  gcc/match.pd  |  6 +
>  gcc/testsuite/gcc.dg/20150120-1.c | 51 +++
>  2 files changed, 57 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/20150120-1.c
>
> diff --git gcc/match.pd gcc/match.pd
> index 81c4ee6..ecefcfb 100644
> --- gcc/match.pd
> +++ gcc/match.pd
> @@ -255,6 +255,12 @@ along with GCC; see the file COPYING3.  If not see
>(bitop @0 @0)
>(non_lvalue @0)))
>
> +/* x + (x & 1) -> (x + 1) & ~1 */
> +(simplify
> + (plus:c @0 (bit_and@2 @0 integer_onep@1))
> + (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
> +  (bit_and (plus @0 @1) (bit_not @1
> +
>  (simplify
>   (abs (negate @0))
>   (abs @0))
> diff --git gcc/testsuite/gcc.dg/20150120-1.c gcc/testsuite/gcc.dg/20150120-1.c
> new file mode 100644
> index 000..18906c4
> --- /dev/null
> +++ gcc/testsuite/gcc.dg/20150120-1.c
> @@ -0,0 +1,51 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-original" } */
> +
> +/* x + (x & 1) -> (x + 1) & ~1 */
> +int
> +fn1 (int x)
> +{
> +   return x + (x & 1);
> +}
> +int
> +fn2 (int x)
> +{
> +   return (x & 1) + x;
> +}
> +int
> +fn3 (int x)
> +{
> +   return x + (1 & x);
> +}
> +int
> +fn4 (int x)
> +{
> +   return (1 & x) + x;
> +}
> +unsigned int
> +fn5 (unsigned int x)
> +{
> +   return x + (x & 1);
> +}
> +unsigned int
> +fn6 (unsigned int x)
> +{
> +   return (x & 1) + x;
> +}
> +unsigned int
> +fn7 (unsigned int x)
> +{
> +   return x + (x % 2);
> +}
> +unsigned int
> +fn8 (unsigned int x)
> +{
> +   return (x % 2) + x;
> +}
> +unsigned int
> +fn9 (unsigned int x)
> +{
> +   return (1LL & x) + x;
> +}
> +
> +/* { dg-final { scan-tree-dump-times "x \\+ 1" 9 "original" } } */
> --
> 2.1.3
>


Re: [PATCH] Fix up tm_clone_hasher

2015-04-30 Thread Marek Polacek
Ping.

On Wed, Apr 22, 2015 at 05:24:43PM +0200, Marek Polacek wrote:
> handle_cache_entry in tm_clone_hasher looks wrong: the condition
> if (e != HTAB_EMPTY_ENTRY || e != HTAB_DELETED_ENTRY) is always true.  While
> it could be fixed by just changing || into &&, I decided to follow suit and
> do what we do in handle_cache_entry's elsewhere in the codebase.  I've fixed
> a formatting issue below while at it.
> 
> Bootstrapped/regtested on x86_64-linux, ok for trunk?
> I think this should also go into 5.1.
> 
> 2015-04-22  Marek Polacek  
> 
>   * varasm.c (handle_cache_entry): Fix logic.
> 
> diff --git gcc/varasm.c gcc/varasm.c
> index 1597de1..3fc0316 100644
> --- gcc/varasm.c
> +++ gcc/varasm.c
> @@ -5779,21 +5779,20 @@ struct tm_clone_hasher : ggc_cache_hasher
>static hashval_t hash (tree_map *m) { return tree_map_hash (m); }
>static bool equal (tree_map *a, tree_map *b) { return tree_map_eq (a, b); }
>  
> -  static void handle_cache_entry (tree_map *&e)
> +  static void
> +  handle_cache_entry (tree_map *&e)
>{
> -if (e != HTAB_EMPTY_ENTRY || e != HTAB_DELETED_ENTRY)
> -  {
> - extern void gt_ggc_mx (tree_map *&);
> - if (ggc_marked_p (e->base.from))
> -   gt_ggc_mx (e);
> - else
> -   e = static_cast<tree_map *> (HTAB_DELETED_ENTRY);
> -  }
> +extern void gt_ggc_mx (tree_map *&);
> +if (e == HTAB_EMPTY_ENTRY || e == HTAB_DELETED_ENTRY)
> +  return;
> +else if (ggc_marked_p (e->base.from))
> +  gt_ggc_mx (e);
> +else
> +  e = static_cast<tree_map *> (HTAB_DELETED_ENTRY);
>}
>  };
>  
> -static GTY((cache))
> - hash_table<tm_clone_hasher> *tm_clone_hash;
> +static GTY((cache)) hash_table<tm_clone_hasher> *tm_clone_hash;
>  
>  void
>  record_tm_clone_pair (tree o, tree n)
> 
>   Marek

Marek
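
To see why the original condition is always true, consider the following
minimal C sketch.  EMPTY_ENTRY and DELETED_ENTRY are stand-ins chosen for
illustration (two distinct sentinel pointers), not the real hash table
macros, and the function names are hypothetical.

```c
#include <stddef.h>

/* Two distinct sentinel values, assumed here purely for illustration.  */
#define EMPTY_ENTRY   ((const void *) 0)
#define DELETED_ENTRY ((const void *) 1)

/* The shape of the old test: e cannot equal both sentinels at once, so
   at least one of the two inequalities always holds.  */
int
is_live_buggy (const void *e)
{
  return e != EMPTY_ENTRY || e != DELETED_ENTRY;
}

/* The minimal fix mentioned in the mail: replace || with &&.  */
int
is_live_fixed (const void *e)
{
  return e != EMPTY_ENTRY && e != DELETED_ENTRY;
}
```

With the buggy form, an empty or deleted slot is treated as live, which is
what would lead to dereferencing a sentinel in e->base.from.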


Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern

2015-04-30 Thread Richard Biener
On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes
 wrote:
> Generalizing the x+(x&1) pattern, one can round up x to a multiple of
> a 2^k by adding the negative of x modulo 2^k. But it is fewer
> instructions, and presumably requires fewer registers, to do the more
> common (x+m)&~m where m=2^k-1.
>
> Signed-off-by: Rasmus Villemoes 
> ---
>  gcc/match.pd  |  9 ++
>  gcc/testsuite/gcc.dg/20150120-4.c | 59 +++
>  2 files changed, 68 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c
>
> diff --git gcc/match.pd gcc/match.pd
> index 47865f1..93c2298 100644
> --- gcc/match.pd
> +++ gcc/match.pd
> @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3.  If not see
>   (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
>(bit_ior @0 (bit_not @1
>
> +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1.  */
> +(simplify
> + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1))

I think you want to restrict this to INTEGER_CST@1

> + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1),
> +@1, build_one_cst (TREE_TYPE (@1))); }

We shouldn't dispatch to fold_binary in patterns.  int_const_binop would
be the appropriate function to use - but what happens for @1 == INT_MAX
where @1 + 1 overflows?  Similarly, is this also valid for negative @1
and thus signed mask types?  IMHO we should check whether @1
is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1),
false, TYPE_PRECISION (TREE_TYPE (@1))).

As with the other patch a ChangeLog entry is missing as well as stating
how you tested the patch.
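
As an illustration of the condition under discussion, here is a small C
sketch, not GCC code.  It checks "m is a low-bit mask 2^k-1" with the
m & (m + 1) trick on unsigned values, which is a substitute for the
wi::mask comparison suggested above, and it shows both forms of the
round-up.  All function names are made up for the example.

```c
#include <stdint.h>

/* Nonzero if m has the form 2^k - 1 (including 0 and all-ones).
   Done in unsigned arithmetic, so m + 1 wrapping to 0 is well defined.  */
int
is_low_mask (uint32_t m)
{
  return (m & (m + 1u)) == 0;
}

/* The form matched by the pattern: x + ((-x) & m).  */
uint32_t
round_up_neg (uint32_t x, uint32_t m)
{
  return x + ((0u - x) & m);
}

/* The replacement form: (x + m) & ~m.  */
uint32_t
round_up_mask (uint32_t x, uint32_t m)
{
  return (x + m) & ~m;
}
```

For m = 7 both forms round 13 up to 16.  For m = 9, which is not a low-bit
mask, they differ (x = 1 gives 10 and 2), which is why the pattern needs
the mask check.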

Thanks,
Richard.

> +  (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
> +   && cst && integer_pow2p (cst))
> +   (bit_and (plus @0 @1) (bit_not @1)
> +
>  (simplify
>   (abs (negate @0))
>   (abs @0))
> diff --git gcc/testsuite/gcc.dg/20150120-4.c gcc/testsuite/gcc.dg/20150120-4.c
> new file mode 100644
> index 000..c3552bf
> --- /dev/null
> +++ gcc/testsuite/gcc.dg/20150120-4.c
> @@ -0,0 +1,59 @@
> +/* { dg-do compile } */
> +/* { dg-options "-O2 -fdump-tree-original" } */
> +
> +/* x + ((-x) & m) -> (x + m) & ~m for m one less than a pow2.  */
> +int
> +fn1 (int x)
> +{
> +   return x + ((-x) & 7);
> +}
> +int
> +fn2 (int x)
> +{
> +   return ((-x) & 7) + x;
> +}
> +unsigned int
> +fn3 (unsigned int x)
> +{
> +   return x + ((-x) & 7);
> +}
> +unsigned int
> +fn4 (unsigned int x)
> +{
> +   return ((-x) & 7) + x;
> +}
> +unsigned int
> +fn5 (unsigned int x)
> +{
> +   return x + ((-x) % 8);
> +}
> +unsigned int
> +fn6 (unsigned int x)
> +{
> +   return ((-x) % 8) + x;
> +}
> +int
> +fn7 (int x)
> +{
> +   return x + ((-x) & 9);
> +}
> +int
> +fn8 (int x)
> +{
> +   return ((-x) & 9) + x;
> +}
> +unsigned int
> +fn9 (unsigned int x)
> +{
> +   return x + ((-x) & ~0U);
> +}
> +unsigned int
> +fn10 (unsigned int x)
> +{
> +   return ((-x) & ~0U) + x;
> +}
> +
> +
> +/* { dg-final { scan-tree-dump-times "x \\+ 7" 6 "original" } } */
> +/* { dg-final { scan-tree-dump-times "-x & 9" 2 "original" } } */
> +/* { dg-final { scan-tree-dump-times "return 0" 2 "original" } } */
> --
> 2.1.3
>


RE: [Ping^3] [PATCH, ARM, libgcc] New aeabi_idiv function for armv6-m

2015-04-30 Thread Hale Wang
> -Original Message-
> From: Hale Wang [mailto:hale.w...@arm.com]
> Sent: Monday, February 09, 2015 9:54 AM
> To: Richard Earnshaw
> Cc: Hale Wang; gcc-patches; Matthew Gretton-Dann
> Subject: RE: [Ping^2] [PATCH, ARM, libgcc] New aeabi_idiv function for
> armv6-m
> 
> Ping https://gcc.gnu.org/ml/gcc-patches/2014-12/msg01059.html.
> 

Ping for trunk. Is it ok for trunk now?

Thanks,
Hale
> > -Original Message-
> > From: gcc-patches-ow...@gcc.gnu.org [mailto:gcc-patches-
> > ow...@gcc.gnu.org] On Behalf Of Hale Wang
> > Sent: Friday, December 12, 2014 9:36 AM
> > To: gcc-patches
> > Subject: RE: [Ping] [PATCH, ARM, libgcc] New aeabi_idiv function for
> > armv6- m
> >
> > Ping? Already applied to arm/embedded-4_9-branch, is it OK for trunk?
> >
> > -Hale
> >
> > > -Original Message-
> > > From: Joey Ye [mailto:joey.ye...@gmail.com]
> > > Sent: Thursday, November 27, 2014 10:01 AM
> > > To: Hale Wang
> > > Cc: gcc-patches
> > > Subject: Re: [PATCH, ARM, libgcc] New aeabi_idiv function for
> > > armv6-m
> > >
> > > OK applying to arm/embedded-4_9-branch, though you still need
> > > maintainer approval into trunk.
> > >
> > > - Joey
> > >
> > > On Wed, Nov 26, 2014 at 11:43 AM, Hale Wang wrote:
> > > > Hi,
> > > >
> > > > This patch ports the aeabi_idiv routine from Linaro Cortex-Strings
> > > > (https://git.linaro.org/toolchain/cortex-strings.git), which was
> > > > contributed by ARM under Free BSD license.
> > > >
> > > > The new aeabi_idiv routine is used to replace the one in
> > > > libgcc/config/arm/lib1funcs.S. This replacement happens within the
> > > > Thumb1 wrapper. The new routine is under LGPLv3 license.
> > > >
> > > > The main advantage of this version is that it can improve the
> > > > performance of the aeabi_idiv function for Thumb1. This solution
> > > > will also increase the code size. So it will only be used if
> > > > __OPTIMIZE_SIZE__ is not defined.
> > > >
> > > > Make check passed for armv6-m.
> > > >
> > > > OK for trunk?
> > > >
> > > > Thanks,
> > > > Hale Wang
> > > >
> > > > libgcc/ChangeLog:
> > > >
> > > > 2014-11-26  Hale Wang  
> > > >
> > > > * config/arm/lib1funcs.S: Add new wrapper.
> > > >
> > > > ===
> > > > diff --git a/libgcc/config/arm/lib1funcs.S
> > > > b/libgcc/config/arm/lib1funcs.S index b617137..de66c81 100644
> > > > --- a/libgcc/config/arm/lib1funcs.S
> > > > +++ b/libgcc/config/arm/lib1funcs.S
> > > > @@ -306,34 +306,12 @@ LSYM(Lend_fde):
> > > >  #ifdef __ARM_EABI__
> > > >  .macro THUMB_LDIV0 name signed
> > > >  #if defined(__ARM_ARCH_6M__)
> > > > -   .ifc \signed, unsigned
> > > > -   cmp r0, #0
> > > > -   beq 1f
> > > > -   mov r0, #0
> > > > -   mvn r0, r0  @ 0x
> > > > -1:
> > > > -   .else
> > > > -   cmp r0, #0
> > > > -   beq 2f
> > > > -   blt 3f
> > > > +
> > > > +   push{r0, lr}
> > > > mov r0, #0
> > > > -   mvn r0, r0
> > > > -   lsr r0, r0, #1  @ 0x7fff
> > > > -   b   2f
> > > > -3: mov r0, #0x80
> > > > -   lsl r0, r0, #24 @ 0x8000
> > > > -2:
> > > > -   .endif
> > > > -   push{r0, r1, r2}
> > > > -   ldr r0, 4f
> > > > -   adr r1, 4f
> > > > -   add r0, r1
> > > > -   str r0, [sp, #8]
> > > > -   @ We know we are not on armv4t, so pop pc is safe.
> > > > -   pop {r0, r1, pc}
> > > > -   .align  2
> > > > -4:
> > > > -   .word   __aeabi_idiv0 - 4b
> > > > +   bl  SYM(__aeabi_idiv0)
> > > > +   pop {r1, pc}
> > > > +
> > > >  #elif defined(__thumb2__)
> > > > .syntax unified
> > > > .ifc \signed, unsigned
> > > > @@ -927,7 +905,158 @@ LSYM(Lover7):
> > > > add dividend, work
> > > >.endif
> > > >  LSYM(Lgot_result):
> > > > -.endm
> > > > +.endm
> > > > +
> > > > +#if defined(__prefer_thumb__) && !defined(__OPTIMIZE_SIZE__)
> > > > +.macro BranchToDiv n, label
> > > > +   lsr curbit, dividend, \n
> > > > +   cmp curbit, divisor
> > > > +   blo \label
> > > > +.endm
> > > > +
> > > > +.macro DoDiv n
> > > > +   lsr curbit, dividend, \n
> > > > +   cmp curbit, divisor
> > > > +   bcc 1f
> > > > +   lsl curbit, divisor, \n
> > > > +   sub dividend, dividend, curbit
> > > > +
> > > > +1: adc result, result
> > > > +.endm
> > > > +
> > > > +.macro THUMB1_Div_Positive
> > > > +   mov result, #0
> > > > +   BranchToDiv #1, LSYM(Lthumb1_div1)
> > > > +   BranchToDiv #4, LSYM(Lthumb1_div4)
> > > > +   BranchToDiv #8, LSYM(Lthumb1_div8)
> > > > +   BranchToDiv #12, LSYM(Lthumb1_div12)
> > > > +   BranchToDiv #16, LSYM(Lthumb1_div16)
> > > > +LSYM(Lthumb1_div_large_positive):
> > > > +   mov result, #0xff
> > > > +   lsl divisor, divisor, #8
> > > > +   rev result, result
>
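
For orientation, the DoDiv macro quoted above performs one step of binary
long division: compare the shifted dividend with the divisor, subtract the
shifted divisor if it fits, and shift the resulting bit into the quotient.
The following C sketch shows that step as a plain loop.  It is only an
illustration under that reading; the real routine is unrolled and branches
into the middle of the sequence, and the function name here is made up.

```c
#include <stdint.h>

/* Unsigned shift-and-subtract division.  The caller must pass a
   nonzero divisor.  */
uint32_t
udiv_shift_subtract (uint32_t dividend, uint32_t divisor)
{
  uint32_t result = 0;

  for (int n = 31; n >= 0; n--)
    {
      /* Make room for the next quotient bit (the "adc result, result"
         step in the macro).  */
      result <<= 1;

      /* If divisor << n fits into what is left of the dividend, subtract
         it and record a 1 bit.  The comparison is done on the shifted
         dividend so that divisor << n can never overflow.  */
      if ((dividend >> n) >= divisor)
        {
          dividend -= divisor << n;
          result |= 1u;
        }
    }
  return result;
}
```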

Re: c-family PATCH to improve -Wbool-compare (PR c/64610)

2015-04-30 Thread Andreas Schwab
Marek Polacek  writes:

>   PR c/64610
>   * c-common.c (maybe_warn_bool_compare): Warn when comparing a boolean
>   with 0/1.

/usr/local/gcc/gcc-20150430/Build/./prev-gcc/xg++ 
-B/usr/local/gcc/gcc-20150430/Build/./prev-gcc/ -B/usr/aarch64-suse-linux/bin/ 
-nostdinc++ 
-B/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/src/.libs
 
-B/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/libsupc++/.libs
  
-I/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/include/aarch64-suse-linux
  
-I/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/include
  -I/usr/local/gcc/gcc-20150430/libstdc++-v3/libsupc++ 
-L/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/src/.libs
 
-L/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/libsupc++/.libs
 -c   -g -O2 -gtoggle -DIN_GCC-fno-exceptions -fno-rtti 
-fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
-Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
-Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror -fno-common 
 -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. -I../../gcc/../include 
-I../../gcc/../libcpp/include  -I../../gcc/../libdecnumber 
-I../../gcc/../libdecnumber/dpd -I../libdecnumber -I../../gcc/../libbacktrace   
-o expr.o -MT expr.o -MMD -MP -MF ./.deps/expr.TPo ../../gcc/expr.c
../../gcc/expr.c: In function 'int can_store_by_pieces(long unsigned int, 
rtx_def* (*)(void*, long int, machine_mode), void*, unsigned int, bool)':
../../gcc/expr.c:2496:16: error: comparison of constant 'true' with boolean 
expression is always true [-Werror=bool-compare]
reverse <= (HAVE_PRE_DECREMENT || HAVE_POST_DECREMENT);
^
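
The diagnostic can be understood from a small truth table.  The sketch below
turns the constant into a parameter (so it does not itself trigger the
warning); the function name is hypothetical and it assumes, as the error
message reports, that the macro expression folds to true on this target.

```c
#include <stdbool.h>

/* Same shape as the flagged comparison, with the folded constant passed
   in as have_decrement.  A bool is 0 or 1, so when have_decrement is
   true the comparison cannot fail.  */
bool
reverse_allowed (bool reverse, bool have_decrement)
{
  return reverse <= have_decrement;
}
```

Only the combination reverse == true, have_decrement == false yields false,
so on targets where the constant is true the test is vacuous.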

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-30 Thread Richard Biener
On Tue, Apr 28, 2015 at 3:55 PM, Alan Lawrence  wrote:
> Tree if-conversion currently bails out for loops that (a) contain nested
> loops; (b) have more than one exit; (c) where the exit block (source of the
> exit edge) does not dominate the loop latch; (d) where the exit block is the
> loop header, or there are statements after the exit.
>
> This patch removes restrictions (c) and (d). The intuition is that, for (c),
> "if (P) {... if (Q) break;}" is equivalent to "if (P) {...}; if (P&&Q)
> break;" and this is mostly handled by existing code for propagating
> conditions. For (d), "if (P) break; stmts" is equivalent to "if (!P) stmts;
> if (P) break;" - this requires inserting the predicated stmts before the
> branch rather than after.

Hum - so you empty the latch by conditionalizing code on the exit condition?

> Mostly thus this patch is just removing assumptions about when we do/don't
> need to store predicates. One 'gotcha' was in some test cases the latch
> block passed into if-conversion is non-empty; in such cases, if-conversion
> will now restore "good form" by moving the statement into the exit block
> (predicated with !exit-condition).

Indeed.

> The condition on dominance in add_to_predicate_list, I haven't quite managed
> to convince myself is right; we _do_ want to store a predicate for the latch
> block to handle the above case, but I'm not totally sure of the
> postdominance condition - I think it may store conditions in cases where we
> don't really need to (e.g. "for (;;) { ... if (P) { for (;;) ; } }" which
> might look nested but isn't, and has no route to the function exit).
> However, storing conditions when we don't need to, is OK, unlike failing to
> store when we do need to ;).

So you still restrict loop form to two blocks - just the latch may now be
non-empty?  Thus I'd say keeping the existing check but amending it by
&& bb != loop->latch would be better.

Otherwise the patch looks good to me.

Can you please add at least one testcase for c) and d) where we now
vectorize something after the patch but not before?

Thanks,
Richard.

> A simple example of the patch at work:
>
> int
> foo ()
> {
>   for (int i = 0; i < N ; i++)
>   {
> int m = (a[i] & i) ? 5 : 4;
> b[i] = a[i] * m;
>   }
> }
>
> compiled at -O3, -fdump-tree-ivcanon shows this immediately before
> tree-if-conversion:
>
> ...function entry, variables, etc...
>   :
>   _10 = a[0];
>   goto ;
>
>   :
>   _5 = a[i_9];
>   _6 = _5 & i_9;
>   if (_6 != 0)
> goto ;
>   else
> goto ;
>
>   :
>
>   :
>   # m_14 = PHI <5(3), 4(4)>
>
>   :
>   # m_2 = PHI 
>   # _15 = PHI <_5(5), _10(2)>
>   # i_16 = PHI 
>   # ivtmp_13 = PHI 
>   _7 = m_2 * _15;
>   b[i_16] = _7;
>   i_9 = i_16 + 1;
>   ivtmp_3 = ivtmp_13 - 1;
>   if (ivtmp_3 != 0)
> goto ;
>   else
> goto ;
>
> which previously was not if-converted. With this patch:
>
>   :
>   _10 = a[0];
>   goto ;
>
>   :
>
>   :
>   # m_2 = PHI 
>   # _15 = PHI <_5(3), _10(2)>
>   # i_16 = PHI 
>   # ivtmp_13 = PHI 
>   _7 = m_2 * _15;
>   b[i_16] = _7;
>   i_9 = i_16 + 1;
>   ivtmp_3 = ivtmp_13 - 1;
>   _5 = a[i_9];
>   _6 = _5 & i_9;
>   m_14 = _6 != 0 ? 5 : 4;
>   if (ivtmp_3 != 0)
> goto ;
>   else
> goto ;
>
>   :
>   return;
>
> (Unfortunately the vectorizer still doesn't handle this loop either, but
> that's another issue/patch...)
>
> Bootstrapped + check-gcc on x86_64-unknown-linux-gnu and
> aarch64-none-linux-gnu.
> Cross-tested check-gcc on aarch64-none-elf.
> I'm investigating impact on benchmarks - on AArch64 Spec2k6, this touches a
> number of object files, leading to an overall slight decrease in the number
> of instructions, but no change that looks significant (specifically, no more
> or less vectorization).
>
> Is this OK for trunk?
>
> Cheers, Alan
>


Re: ping: [PATCH, ARM] attribute target (thumb,arm) [0-6]

2015-04-30 Thread Ramana Radhakrishnan



Christian



A general note, please reply to each of the patches with a rebased
patch as a separate email. Furthermore, all your patches appear to
have DOS line endings, so they don't seem to apply cleanly. Please
don't have spurious headers in your patch submission - it then makes
it hard to apply; please create it in a way that it is easily applied by
someone trying it out. It looks like p4 needs a respin as I got a
reject while trying to apply the documentation patch to my tree.



OK, thanks for the suggestions and sorry for the p4 reject. The sources
are moving fast and I have a hard time keeping up with rebases.


I understand.




I tried the following decoration on foo in gcc.target/arm/attr_arm.c


int __attribute__((target("arm, fpu=vfpv4")))
foo(int a)
{
   return a ? 1 : 5;
}


And the compiler accepts it just fine.


Indeed, it's a mistake for now. Attributes other than the arm/thumb ones
shall be rejected (eventually with a "not yet implemented" warning for
the fpu, error for the others) until we extend it.


Yep - funnily enough if you remove "arm" and just use "fpu=vfpv4", I 
think you get an error.






Given that with LTO we are now using target attributes to decide
inlining - I'm not convinced that the inline asm case goes away. In
fact it only makes things worse so I'm almost convinced to forbid
inlining from "arm" to "thumb" or vice-versa, which is a reversal of
my earlier position. I hadn't twigged that LTO would reuse this
infrastructure and it's probably simpler to prevent inlining in those
cases.


I can resurrect the inline check chunk. FYI, with a few small examples
arm/thumb attribute is correctly handled by LTO


Yes it would work with normal C code as it does normally - I'm worried 
about functions with inline asm. We've just increased the inlining scope 
with lto and that would mean things are a bit more painful ?







Thoughts ?

So in essence I'm still playing with this and would like to iterate
towards a quick solution.



thanks, that would be good if we could land the arm/thumb attribute and
start the fpu extensions separately. (I'm currently playing with
fpu=neon but it will take time to have something solid).


Absolutely - I'd rather spend the time first in polishing this up. 
Extending it for other options can be something you look at separately.


BTW I was pointed at a PR for this yesterday by a colleague - 
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=59884


So, let's use that as the PR for this work.

regards
Ramana



Christian


Ramana





Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8

2015-04-30 Thread Bin.Cheng
On Mon, Apr 27, 2015 at 9:26 PM, Bill Schmidt
 wrote:
> On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote:
>> On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt
>>  wrote:
>
>>
>> > Index: gcc/testsuite/gcc.dg/vect/vect-33.c
>> > ===
>> > --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118)
>> > +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy)
>> > @@ -36,9 +36,10 @@ int main (void)
>> >return main1 ();
>> >  }
>> >
>> > +/* vect_hw_misalign && { ! vect64 } */
>> >
>> >  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } */
>> > -/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { 
>> > target { vect_hw_misalign && { {! vect64} || vect_multiple_sizes } } } } } 
>> > */
>> > +/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" { 
>> > target { { { ! powerpc*-*-* } && vect_hw_misalign } && { { ! vect64 } || 
>> > vect_multiple_sizes } } } } }  */
>> >  /* { dg-final { scan-tree-dump "Alignment of access forced using peeling" 
>> > "vect" { target { vector_alignment_reachable && { vect64 && {! 
>> > vect_multiple_sizes} } } } } } */
>> >  /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
>> > versioning" 1 "vect" { target { { {! vector_alignment_reachable} || {! 
>> > vect64} } && {! vect_hw_misalign} } } } } */
>> >  /* { dg-final { cleanup-tree-dump "vect" } } */
>>
>> Hi Bill,
>> With this change, the test case is skipped on aarch64 now.  Since it
>> passed before, Is it expected to act like this on 64bit platforms?
>
> Hi Bin,
>
> No, that's a mistake on my part -- thanks for the report!  That first
> added line was not intended to be part of the patch:
>
> +/* vect_hw_misalign && { ! vect64 } */
>
> Please try removing that line and verify that the patch succeeds again
> for ARM.  Assuming so, I'll prepare a patch to fix this.
>
> It looks like this mistake was introduced only in this particular test,
> but please let me know if you see any other anomalies.
Hi Bill,
I chased the wrong branch.  The test disappeared on the fsf-48 branch in
our build, rather than on trunk.  I guess it's not your patch's fault.
Will follow up and get back to you later.
Sorry for the inconvenience.

Thanks,
bin
>
> Thanks very much!
>
> Bill
>>
>> PASS->NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects
>> scan-tree-dump-times vect "Vectorizing an unaligned access" 0
>> PASS->NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "Vectorizing
>> an unaligned access" 0
>>
>> Thanks,
>> bin
>>
>
>


Re: Mostly rewrite genrecog

2015-04-30 Thread Eric Botcazou
> The generated code.  genrecog.c itself isn't bad. :-)

Nice work then.

> OK.  I'd left the head comment alone because it just described the
> interface, which hasn't changed.  But I suppose past lack of commentary
> doesn't justify future lack of commentary.  Here's what I added:
> [...]
> BTW, hope at least part of the doubling in size is due to more commentary
> in the code itself.

I see.  Thanks a lot for writing down the description of the algorithm!

> I'd rather leave stuff like that to someone who wants it rather than try
> to write routines speculatively in the hope that someone would find them
> useful.

OK.

-- 
Eric Botcazou


Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-30 Thread Alan Lawrence

Richard Biener wrote:

On Tue, Apr 28, 2015 at 3:55 PM, Alan Lawrence  wrote:

Tree if-conversion currently bails out for loops that (a) contain nested
loops; (b) have more than one exit; (c) where the exit block (source of the
exit edge) does not dominate the loop latch; (d) where the exit block is the
loop header, or there are statements after the exit.

This patch removes restrictions (c) and (d). The intuition is that, for (c),
"if (P) {... if (Q) break;}" is equivalent to "if (P) {...}; if (P&&Q)
break;" and this is mostly handled by existing code for propagating
conditions. For (d), "if (P) break; stmts" is equivalent to "if (!P) stmts;
if (P) break;" - this requires inserting the predicated stmts before the
branch rather than after.


Hum - so you empty the latch by conditionalizing code on the exit condition?


Well, !(exit condition), but yes.
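
The two equivalences from the patch description can be written out at source
level as follows.  This is only an illustrative C sketch with made-up
function names; the pass itself works on GIMPLE.  Each loop has a single
exit, and the caller is assumed to pass an array containing the terminating
element.

```c
/* (d) exit first, statements after:  if (P) break; stmts  */
int
sum_break_first (const int *a)
{
  int sum = 0;
  for (int i = 0; ; i++)
    {
      if (a[i] == 0)          /* P */
        break;
      sum += a[i];            /* stmts */
    }
  return sum;
}

/* (d) rewritten:  if (!P) stmts; if (P) break;  */
int
sum_predicated (const int *a)
{
  int sum = 0;
  for (int i = 0; ; i++)
    {
      int p = (a[i] == 0);
      if (!p)
        sum += a[i];
      if (p)
        break;
    }
  return sum;
}

/* (c) exit nested under a condition:  if (P) { ... if (Q) break; }  */
int
count_odd_nested (const int *a)
{
  int n = 0;
  for (int i = 0; ; i++)
    {
      if (a[i] & 1)           /* P */
        {
          n++;
          if (a[i] < 0)       /* Q */
            break;
        }
    }
  return n;
}

/* (c) rewritten:  if (P) { ... }; if (P && Q) break;  */
int
count_odd_flat (const int *a)
{
  int n = 0;
  for (int i = 0; ; i++)
    {
      int p = a[i] & 1;
      int q = a[i] < 0;
      if (p)
        n++;
      if (p && q)
        break;
    }
  return n;
}
```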


So you still restrict loop form to two blocks - just the latch may now be
non-empty?  Thus I'd say keeping the existing check but amending it by
&& bb != loop->latch would be better.


The idea was to try to end up with a loop with exactly two blocks, a main block 
with a condition at the end, and an empty latch; but to convert more 
bad-loop-form loops into this form.



Otherwise the patch looks good to me.

Can you please add at least one testcase for c) and d) where we now
vectorize something after the patch but not before?


So I think I have made an inconsistency, by changing the logic for 
dominance-of-latch to postdominance-of-header, in one place but not another 
(where it deals with conditional stores) - but I haven't managed to tickle that yet.


However, I'm struggling to find a case where this patch enables vectorization; 
the fancy if-converted loops tend to have other problems preventing 
vectorization, e.g. location of PHI nodes. In contrast, your suggestion of 
putting in another loop-header-copying pass, enables both if-conversion (with 
the existing tree-if-conv.c) and vectorization of a bunch of things (including 
the example I posted of this patch if-converting but still not vectorizing). So 
(short of massive changes to the vectorizer) that approach now feels more 
promising, although there are a good bunch of scan-tree-dump test failures that 
I need to look into...


Where (something like) this patch might be useful, could be as a first step to 
handling loops with multiple exits, that is, changing


for (;;)
{
 S1;
 if (P) break;
 S2;
 if (Q) break;
 S3;
}

into the equivalent of

for (;;)
{
  S1;
  if (!P) S2;
  if (!P && !Q) S3;
  if (P || Q) break;
}

But that's more work, and another patch, and I'm not yet clear how many loops of 
that form the vectorizer would do anything with anyway (let alone profitably!)...
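
To make the multi-exit transformation concrete, here is a C sketch with
specific statements filled in for S1, S2, S3, P and Q.  The struct and
function names are hypothetical, chosen only so the two loop shapes can be
compared.  Note that Q is evaluated only when P is false and after S2 has
run, matching the order in the original loop.

```c
/* Hypothetical result record, used only to compare the two loop shapes.  */
struct scan_result
{
  int visited;   /* number of times S1 ran */
  int sum;       /* accumulated by S2 */
  int last;      /* written by S3 */
};

/* Original shape with two exits.  The caller must guarantee that a zero
   element or the limit is reached before the end of the array.  */
struct scan_result
scan_two_exits (const int *a, int limit)
{
  struct scan_result r = { 0, 0, 0 };
  for (int i = 0; ; i++)
    {
      r.visited++;               /* S1 */
      if (a[i] == 0)             /* P */
        break;
      r.sum += a[i];             /* S2 */
      if (r.sum >= limit)        /* Q */
        break;
      r.last = a[i];             /* S3 */
    }
  return r;
}

/* Single-exit shape: S2 and S3 are predicated, one combined exit test.  */
struct scan_result
scan_single_exit (const int *a, int limit)
{
  struct scan_result r = { 0, 0, 0 };
  for (int i = 0; ; i++)
    {
      r.visited++;               /* S1 */
      int p = (a[i] == 0);
      int q = 0;
      if (!p)
        {
          r.sum += a[i];         /* S2 under !P */
          q = (r.sum >= limit);
        }
      if (!p && !q)
        r.last = a[i];           /* S3 under !P && !Q */
      if (p || q)
        break;
    }
  return r;
}

/* Returns 1 if both shapes produce the same result for this input.  */
int
same_result (const int *a, int limit)
{
  struct scan_result x = scan_two_exits (a, limit);
  struct scan_result y = scan_single_exit (a, limit);
  return x.visited == y.visited && x.sum == y.sum && x.last == y.last;
}
```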


Cheers, Alan



[PATCH][docs] Re: Update __atomic builtins documentation.

2015-04-30 Thread Matthew Wahab

[added tags to subject]

Ping.

On 20/04/15 14:29, Matthew Wahab wrote:

Hello,

The documentation for the __atomic builtins isn't clear about their expectations
and behaviour. In particular, assumptions about the C11/C++11 restrictions on
programs should be stated and the different behaviour of memory models in fences
and in operations should be noted. The behaviour of compare-exchange when the
compare fails is also confusing and the description of the implementation of the
__atomics is mixed in with the description of their functionality.

This patch tries to deal with some of these problems.
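
As an illustration of the compare-exchange behaviour mentioned above, here
is a minimal C sketch using the existing __atomic_compare_exchange_n and
__atomic_load_n builtins.  The helper names are made up for the example, and
the memory model choices are one reasonable option rather than a
recommendation from the patch.

```c
#include <stdbool.h>

/* Returns what is left in `expected' after one strong compare-exchange
   on an object holding `current'.  When the comparison fails, the
   builtin writes the object's actual value into `expected'.  */
int
expected_after_cas (int current, int guess, int desired)
{
  int object = current;
  int expected = guess;
  __atomic_compare_exchange_n (&object, &expected, desired, false,
                               __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
  return expected;
}

/* Returns the object's value after the same operation: `desired' if the
   comparison succeeded, otherwise unchanged.  */
int
object_after_cas (int current, int guess, int desired)
{
  int object = current;
  int expected = guess;
  __atomic_compare_exchange_n (&object, &expected, desired, false,
                               __ATOMIC_SEQ_CST, __ATOMIC_RELAXED);
  return object;
}

/* Typical retry loop: add delta unless the result would exceed cap.
   On failure `expected' already holds the fresh value, so no reload is
   needed before retrying.  */
bool
add_capped (int *counter, int delta, int cap)
{
  int expected = __atomic_load_n (counter, __ATOMIC_RELAXED);
  for (;;)
    {
      if (expected + delta > cap)
        return false;
      if (__atomic_compare_exchange_n (counter, &expected, expected + delta,
                                       false, __ATOMIC_SEQ_CST,
                                       __ATOMIC_RELAXED))
        return true;
    }
}
```

The failure memory model is weaker than the success one here because a
failed compare-exchange only performs a load.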

Tested by looking at the html.

Ok for trunk?
Matthew

2015-04-20  Matthew Wahab  

* doc/extend.texi (__atomic Builtins): Move implementation details
to the end of the description, rewrite opening paragraphs, state
difference with __sync builtins, state C11/C++11 assumptions,
weaken itemized descriptions, add explanation of memory model
behaviour, expand description of compare-exchange, simplify text.



diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7470e40..5b551c1 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -8353,45 +8353,47 @@ are not prevented from being speculated to before the barrier.
 @node __atomic Builtins
 @section Built-in Functions for Memory Model Aware Atomic Operations
 
-The following built-in functions approximately match the requirements for
-C++11 memory model. Many are similar to the @samp{__sync} prefixed built-in
-functions, but all also have a memory model parameter.  These are all
-identified by being prefixed with @samp{__atomic}, and most are overloaded
-such that they work with multiple types.
-
-GCC allows any integral scalar or pointer type that is 1, 2, 4, or 8
-bytes in length. 16-byte integral types are also allowed if
-@samp{__int128} (@pxref{__int128}) is supported by the architecture.
-
-Target architectures are encouraged to provide their own patterns for
-each of these built-in functions.  If no target is provided, the original 
-non-memory model set of @samp{__sync} atomic built-in functions are
-utilized, along with any required synchronization fences surrounding it in
-order to achieve the proper behavior.  Execution in this case is subject
-to the same restrictions as those built-in functions.
-
-If there is no pattern or mechanism to provide a lock free instruction
-sequence, a call is made to an external routine with the same parameters
-to be resolved at run time.
+The following built-in functions approximately match the requirements
+for C++11 concurrency and memory models.  They are all
+identified by being prefixed with @samp{__atomic} and most are
+overloaded so that they work with multiple types.
+
+These functions are intended to replace the legacy @samp{__sync}
+builtins.  The main difference is that the memory model to be used is a
+parameter to the functions.  New code should always use the
+@samp{__atomic} builtins rather than the @samp{__sync} builtins.
+
+Note that the @samp{__atomic} builtins assume that programs will
+conform to the C++11 model for concurrency.  In particular, they assume
+that programs are free of data races.  See the C++11 standard for
+detailed definitions.
+
+The @samp{__atomic} builtins can be used with any integral scalar or
+pointer type that is 1, 2, 4, or 8 bytes in length.  16-byte integral
+types are also allowed if @samp{__int128} (@pxref{__int128}) is
+supported by the architecture.
 
 The four non-arithmetic functions (load, store, exchange, and 
 compare_exchange) all have a generic version as well.  This generic
 version works on any data type.  If the data type size maps to one
 of the integral sizes that may have lock free support, the generic
-version utilizes the lock free built-in function.  Otherwise an
+version uses the lock free built-in function.  Otherwise an
 external call is left to be resolved at run time.  This external call is
 the same format with the addition of a @samp{size_t} parameter inserted
 as the first parameter indicating the size of the object being pointed to.
 All objects must be the same size.
 
 There are 6 different memory models that can be specified.  These map
-to the same names in the C++11 standard.  Refer there or to the
-@uref{http://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync,GCC wiki on
-atomic synchronization} for more detailed definitions.  These memory
-models integrate both barriers to code motion as well as synchronization
-requirements with other threads. These are listed in approximately
-ascending order of strength. It is also possible to use target specific
-flags for memory model flags, like Hardware Lock Elision.
+to the C++11 memory models with the same names, see the C++11 standard
+or the @uref{http://gcc.gnu.org/wiki/Atomic/GCCMM/AtomicSync,GCC wiki
+on atomic synchronization} for detailed definitions.  Individual
+targets may also support additional memory models for use on specific
+architectures.
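
For readers unfamiliar with the builtins documented above, here is a minimal sketch of how the memory model is passed as a parameter.  It is not part of the patch; the function and variable names (record_hit, publish, consume, claim) are made up for illustration, and only the __atomic builtins and __ATOMIC_* constants come from GCC.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative shared counter; the name is hypothetical.  */
static uint32_t hits;

/* Atomically add one and return the previous value.  Relaxed ordering
   suffices when only the count itself matters.  */
uint32_t
record_hit (void)
{
  return __atomic_fetch_add (&hits, 1, __ATOMIC_RELAXED);
}

/* Store with release semantics, to be paired with an acquire load.  */
void
publish (uint32_t *slot, uint32_t value)
{
  __atomic_store_n (slot, value, __ATOMIC_RELEASE);
}

/* Load with acquire semantics.  */
uint32_t
consume (uint32_t *slot)
{
  return __atomic_load_n (slot, __ATOMIC_ACQUIRE);
}

/* Strong compare-and-exchange: returns nonzero if *slot held EXPECTED
   and was replaced by DESIRED.  */
int
claim (uint32_t *slot, uint32_t expected, uint32_t desired)
{
  return __atomic_compare_exchange_n (slot, &expected, desired, 0,
                                      __ATOMIC_ACQ_REL, __ATOMIC_ACQUIRE);
}

/* Single-threaded sanity check of the calls above.  */
void
check_atomics (void)
{
  uint32_t slot = 0;
  assert (record_hit () == 0);
  assert (record_hit () == 1);
  publish (&slot, 7);
  assert (consume (&slot) == 7);
  assert (claim (&slot, 7, 9));
  assert (!claim (&slot, 7, 11));
}
```

The legacy __sync equivalents take no such parameter, which is the main difference the new text points out.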

Re: [PATCH] Remove some restrictions on loop shape in tree-if-conv.c

2015-04-30 Thread Richard Biener
On Thu, Apr 30, 2015 at 12:34 PM, Alan Lawrence  wrote:
> Richard Biener wrote:
>>
>> On Tue, Apr 28, 2015 at 3:55 PM, Alan Lawrence 
>> wrote:
>>>
>>> Tree if-conversion currently bails out for loops that (a) contain nested
>>> loops; (b) have more than one exit; (c) where the exit block (source of
>>> the
>>> exit edge) does not dominate the loop latch; (d) where the exit block is
>>> the
>>> loop header, or there are statements after the exit.
>>>
>>> This patch removes restrictions (c) and (d). The intuition is that, for
>>> (c),
>>> "if (P) {... if (Q) break;}" is equivalent to "if (P) {...}; if (P&&Q)
>>> break;" and this is mostly handled by existing code for propagating
>>> conditions. For (d), "if (P) break; stmts" is equivalent to "if (!P)
>>> stmts;
>>> if (P) break;" - this requires inserting the predicated stmts before the
>>> branch rather than after.
>>
>>
>> Hum - so you empty the latch by conditionalizing code on the exit
>> condition?
>
>
> Well, !(exit condition), but yes.
>
>> So you still restrict loop form to two blocks - just the latch may now be
>> non-empty?  Thus I'd say keeping the existing check but amending it by
>> && bb != loop->latch would be better.
>
>
> The idea was to try to end up with a loop with exactly two blocks, a main
> block with a condition at the end, and an empty latch; but to convert more
> bad-loop-form loops into this form.
>
>> Otherwise the patch looks good to me.
>>
>> Can you please add at least one testcase for c) and d) where we now
>> vectorize something after the patch but not before?
>
>
> So I think I have made an inconsistency, by changing the logic for
> dominance-of-latch to postdominance-of-header, in one place but not another
> (where it deals with conditional stores) - but I haven't managed to tickle
> that yet.
>
> However, I'm struggling to find a case where this patch enables
> vectorization; the fancy if-converted loops tend to have other problems
> preventing vectorization, e.g. location of PHI nodes. In contrast, your
> suggestion of putting in another loop-header-copying pass, enables both
> if-conversion (with the existing tree-if-conv.c) and vectorization of a
> bunch of things (including the example I posted of this patch if-converting
> but still not vectorizing). So (short of massive changes to the vectorizer)
> that approach now feels more promising, although there are a good bunch of
> scan-tree-dump test failures that I need to look into...

Heh.  I would have said the loop header copying could be done by the
vectorizer itself: when it detects the non-empty latch, simply call a factored
function from loop header copying that forcefully duplicates the header
(well, or simply inline the short relevant portion).

We'd eventually want to undo this if vectorization fails but that's possibly
a secondary concern.

Another possibility is to piggy-back that onto the if-conversion pass and
force the use of versioning for that.

Moving the CH pass is something entirely different and I'd rather not do that.

Richard.


> Where (something like) this patch might be useful, could be as a first step
> to handling loops with multiple exits, that is, changing
>
> for (;;)
> {
>  S1;
>  if (P) break;
>  S2;
>  if (Q) break;
>  S3;
> }
>
> into the equivalent of
>
> for (;;)
> {
>   S1;
>   if (!P) S2;
>   if (!P && !Q) S3;
>   if (P || Q) break;
> }
>
> But that's more work, and another patch, and I'm not yet clear how many
> loops of that form the vectorizer would do anything with anyway (let alone
> profitably!)...
>
> Cheers, Alan
>
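
The two loop shapes quoted above can be compared with a small C sketch.  This is only an illustration under assumptions: S1/S2/S3 are modelled as appending a letter to a trace, and P and Q are pure tests on the iteration number (p_at and q_at are hypothetical parameters).  It says nothing about what tree-if-conv.c actually emits.

```c
#include <assert.h>
#include <string.h>

/* Record of which statements ran, in order.  */
struct trace { int len; char ops[64]; };

static void
emit (struct trace *t, char c)
{
  if (t->len < (int) sizeof t->ops)
    t->ops[t->len++] = c;
}

/* Original shape: two exits in the middle of the body.
   At least one of P_AT, Q_AT must be >= 0 or the loop never exits.  */
void
multi_exit (struct trace *t, int p_at, int q_at)
{
  for (int i = 0;; i++)
    {
      emit (t, 'a');              /* S1 */
      if (i == p_at) break;       /* P */
      emit (t, 'b');              /* S2 */
      if (i == q_at) break;       /* Q */
      emit (t, 'c');              /* S3 */
    }
}

/* Predicated shape: one exit at the end of the body.  Q is only
   evaluated when P is false, as in the original.  */
void
single_exit (struct trace *t, int p_at, int q_at)
{
  for (int i = 0;; i++)
    {
      emit (t, 'a');                      /* S1 */
      int p = (i == p_at);
      if (!p) emit (t, 'b');              /* if (!P) S2 */
      int q = !p && (i == q_at);
      if (!p && !q) emit (t, 'c');        /* if (!P && !Q) S3 */
      if (p || q) break;
    }
}

/* Nonzero if both shapes execute the same statements in the same order.  */
int
same_trace (int p_at, int q_at)
{
  struct trace a = { 0, { 0 } }, b = { 0, { 0 } };
  multi_exit (&a, p_at, q_at);
  single_exit (&b, p_at, q_at);
  return a.len == b.len && memcmp (a.ops, b.ops, (size_t) a.len) == 0;
}

void
check_loops (void)
{
  assert (same_trace (3, 5));   /* leaves through P */
  assert (same_trace (4, 1));   /* leaves through Q */
}
```

If Q had side effects or depended on S2, the guard on its evaluation would matter; the sketch keeps it for that reason.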


Re: More type narrowing in match.pd

2015-04-30 Thread Marc Glisse

On Thu, 30 Apr 2015, Richard Biener wrote:


I have in my local dev tree (so completely untested...)

@@ -1040,31 +1052,22 @@ (define_operator_list CBRT BUILT_IN_CBRT
   operation and convert the result to the desired type.  */
(for op (plus minus)
  (simplify
-(convert (op (convert@2 @0) (convert@3 @1)))
+(convert (op:c@4 (convert@2 @0) (convert?@3 @1)))


I believe the :c here requires extra code further down, so we don't turn 
a-b into b-a.



(if (INTEGRAL_TYPE_P (type)
-/* We check for type compatibility between @0 and @1 below,
-   so there's no need to check that @1/@3 are integral types.  */
&& INTEGRAL_TYPE_P (TREE_TYPE (@0))
-&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
+&& INTEGRAL_TYPE_P (TREE_TYPE (@4))
/* The precision of the type of each operand must match the
   precision of the mode of each operand, similarly for the
   result.  */
&& (TYPE_PRECISION (TREE_TYPE (@0))
== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0
-&& (TYPE_PRECISION (TREE_TYPE (@1))
-== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1
-&& TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
/* The inner conversion must be a widening conversion.  */
&& TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
-&& ((GENERIC
- && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
- == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
- && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
- == TYPE_MAIN_VARIANT (type)))
-|| (GIMPLE
-&& types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))
-&& types_compatible_p (TREE_TYPE (@0), type
+/* The final precision should match that of operand @0.  */
+&& TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (@0))
+/* Make sure the wide operation is dead after the transform.  */
+&& (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
  (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
-   (convert (op @0 @1)))
+   (convert (op @0 (convert @1
  (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
   (convert (op (convert:utype @0) (convert:utype @1)))


--
Marc Glisse


Re: More type narrowing in match.pd

2015-04-30 Thread Richard Biener
On Thu, Apr 30, 2015 at 12:53 PM, Marc Glisse  wrote:
> On Thu, 30 Apr 2015, Richard Biener wrote:
>
>> I have in my local dev tree (so completely untested...)
>>
>> @@ -1040,31 +1052,22 @@ (define_operator_list CBRT BUILT_IN_CBRT
>>operation and convert the result to the desired type.  */
>> (for op (plus minus)
>>   (simplify
>> -(convert (op (convert@2 @0) (convert@3 @1)))
>> +(convert (op:c@4 (convert@2 @0) (convert?@3 @1)))
>
>
> I believe the :c here requires extra code further down, so we don't turn a-b
> into b-a.

Indeed.  I've added :c only for minus as 5 - x can't be canonicalized to
move the constant to 2nd position which is always possible for plus.

Might be cleaner to add a separate pattern for that case.

Richard.

>
>> (if (INTEGRAL_TYPE_P (type)
>> -/* We check for type compatibility between @0 and @1 below,
>> -   so there's no need to check that @1/@3 are integral types.  */
>> && INTEGRAL_TYPE_P (TREE_TYPE (@0))
>> -&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
>> +&& INTEGRAL_TYPE_P (TREE_TYPE (@4))
>> /* The precision of the type of each operand must match the
>>precision of the mode of each operand, similarly for the
>>result.  */
>> && (TYPE_PRECISION (TREE_TYPE (@0))
>> == GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0
>> -&& (TYPE_PRECISION (TREE_TYPE (@1))
>> -== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1
>> -&& TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
>> /* The inner conversion must be a widening conversion.  */
>> && TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE
>> (@0))
>> -&& ((GENERIC
>> - && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
>> - == TYPE_MAIN_VARIANT (TREE_TYPE (@1)))
>> - && (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
>> - == TYPE_MAIN_VARIANT (type)))
>> -|| (GIMPLE
>> -&& types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))
>> -&& types_compatible_p (TREE_TYPE (@0), type
>> +/* The final precision should match that of operand @0.  */
>> +&& TYPE_PRECISION (type) == TYPE_PRECISION (TREE_TYPE (@0))
>> +/* Make sure the wide operation is dead after the transform.  */
>> +&& (TREE_CODE (@4) != SSA_NAME || has_single_use (@4)))
>>   (if (TYPE_OVERFLOW_WRAPS (TREE_TYPE (@0)))
>> -   (convert (op @0 @1)))
>> +   (convert (op @0 (convert @1
>>   (with { tree utype = unsigned_type_for (TREE_TYPE (@0)); }
>>(convert (op (convert:utype @0) (convert:utype @1)))
>
>
> --
> Marc Glisse


RE: [PATCH, combine] Try REG_EQUAL for nonzero_bits

2015-04-30 Thread Thomas Preud'homme
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Tuesday, April 28, 2015 12:27 AM
> To: Thomas Preud'homme; 'Eric Botcazou'
> Cc: gcc-patches@gcc.gnu.org
> Subject: Re: [PATCH, combine] Try REG_EQUAL for nonzero_bits
> 
> On 04/27/2015 04:26 AM, Thomas Preud'homme wrote:
> >> From: Jeff Law [mailto:l...@redhat.com]
> >> Sent: Saturday, April 25, 2015 3:00 AM
> >> Do you have a testcase where this change can result in better
> generated
> >> code.  If so please add that testcase.  It's OK if it's ARM specific.
> >
> > Hi Jeff,
> >
> > Last time I tried I couldn't reduce the code to a small testcase but if I
> remember
> > well it was mostly due to the problem of finding a good test for creduce
> > (zero extension is not unique enough). I'll try again with a more manual
> approach
> > and get back to you.
> OK.  No need for heroics -- give it a shot, but don't burn an insane
> amount of time on it.  If we can't get to a reasonable testcase, then so
> be it.

Sadly I couldn't get a testcase. I get almost the same sequence of instructions as 
the program in which we found the problem, but couldn't get exactly the same. In all 
the cases I constructed, the nonzero_bits info we already have was enough for 
combine to do its job. I couldn't find what causes this information to be 
inaccurate. I will try to investigate a bit further on Monday, as another pass 
might not be doing its job properly. Or maybe there's something that prevents 
the information from being propagated.

Best regards,

Thomas





Re: niter_base simplification

2015-04-30 Thread Jonathan Wakely

On 30/04/15 10:40 +0200, François Dumont wrote:

On 27/04/2015 13:55, Jonathan Wakely wrote:

(Alternatively, could the same simplification be made for
__miter_base? Do we need _Miter_base<> or just two overloads of
__miter_base()?)


Definitely, I already have a patch for that.


Great :-)


It might be simpler to just add __make_reverse_iterator for >= 201103L
and then always use std::__make_reverse_iterator instead of a macro.

That's similar to what we do for std::__addressof and std::addressof.

OK, attached is the patch I plan to commit, which I am testing 
at the moment.


Looks good, OK for trunk assuming the tests pass.

Thanks!


Re: c-family PATCH to improve -Wbool-compare (PR c/64610)

2015-04-30 Thread Marek Polacek
On Thu, Apr 30, 2015 at 11:42:18AM +0200, Andreas Schwab wrote:
> Marek Polacek  writes:
> 
> > PR c/64610
> > * c-common.c (maybe_warn_bool_compare): Warn when comparing a boolean
> > with 0/1.
> 
> /usr/local/gcc/gcc-20150430/Build/./prev-gcc/xg++ 
> -B/usr/local/gcc/gcc-20150430/Build/./prev-gcc/ 
> -B/usr/aarch64-suse-linux/bin/ -nostdinc++ 
> -B/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/src/.libs
>  
> -B/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/libsupc++/.libs
>   
> -I/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/include/aarch64-suse-linux
>   
> -I/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/include
>   -I/usr/local/gcc/gcc-20150430/libstdc++-v3/libsupc++ 
> -L/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/src/.libs
>  
> -L/usr/local/gcc/gcc-20150430/Build/prev-aarch64-suse-linux/libstdc++-v3/libsupc++/.libs
>  -c   -g -O2 -gtoggle -DIN_GCC-fno-exceptions -fno-rtti 
> -fasynchronous-unwind-tables -W -Wall -Wno-narrowing -Wwrite-strings 
> -Wcast-qual -Wmissing-format-attribute -Woverloaded-virtual -pedantic 
> -Wno-long-long -Wno-variadic-macros -Wno-overlength-strings -Werror 
> -fno-common  -DHAVE_CONFIG_H -I. -I. -I../../gcc -I../../gcc/. 
> -I../../gcc/../include -I../../gcc/../libcpp/include  
> -I../../gcc/../libdecnumber -I../../gcc/../libdecnumber/dpd -I../libdecnumber 
> -I../../gcc/../libbacktrace   -o expr.o -MT expr.o -MMD -MP -MF 
> ./.deps/expr.TPo ../../gcc/expr.c
> ../../gcc/expr.c: In function 'int can_store_by_pieces(long unsigned int, 
> rtx_def* (*)(void*, long int, machine_mode), void*, unsigned int, bool)':
> ../../gcc/expr.c:2496:16: error: comparison of constant 'true' with boolean 
> expression is always true [-Werror=bool-compare]
> reverse <= (HAVE_PRE_DECREMENT || HAVE_POST_DECREMENT);
> ^

Yes, that is a bug in my code; I think I'll apply the following after
regtest/bootstrap (with a proper test).  If you could perhaps try the
patch as well, it'd be appreciated.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 7d314f8..ada8e8a 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -11924,6 +11924,17 @@ maybe_warn_bool_compare (location_t loc, enum 
tree_code code, tree op0,
 }
   else if (integer_zerop (cst) || integer_onep (cst))
 {
+  /* If the non-constant operand isn't of a boolean type, we
+don't want to warn here.  */
+  tree noncst = TREE_CODE (op0) == INTEGER_CST ? op1 : op0;
+  /* Handle booleans promoted to integers.  */
+  if (CONVERT_EXPR_P (noncst)
+ && TREE_TYPE (noncst) == integer_type_node
+ && TREE_CODE (TREE_TYPE (TREE_OPERAND (noncst, 0))) == BOOLEAN_TYPE)
+   /* Warn.  */;
+  else if (TREE_CODE (TREE_TYPE (noncst)) != BOOLEAN_TYPE
+  && !truth_value_p (TREE_CODE (noncst)))
+   return;
   /* Do some magic to get the right diagnostics.  */
   bool flag = TREE_CODE (op0) == INTEGER_CST;
   flag = integer_zerop (cst) ? flag : !flag;

Marek
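
For context, the false positive has roughly the following shape.  This is a reduced sketch, not the code from expr.c: count_passes and its parameters are hypothetical stand-ins for the HAVE_PRE_DECREMENT / HAVE_POST_DECREMENT macros.  The non-constant operand is a plain int, so the comparison against a boolean expression is not always true.

```c
#include <assert.h>

/* REVERSE is an int loop counter, not a boolean: it reaches 2 when the
   right-hand side is 1, at which point the comparison is false.  */
int
count_passes (int have_pre, int have_post)
{
  int passes = 0;
  for (int reverse = 0; reverse <= (have_pre || have_post); reverse++)
    passes++;
  return passes;
}

void
check_passes (void)
{
  assert (count_passes (0, 0) == 1);   /* only reverse == 0 */
  assert (count_passes (0, 1) == 2);   /* reverse == 0 and 1 */
}
```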


Re: [rs6000] Fix compare debug failure on AIX

2015-04-30 Thread Eric Botcazou
> We might want to check if doing -Og and not just -O0.

You're right, thanks, amended patch attached, same ChangeLog.

-- 
Eric Botcazou

Index: config/rs6000/rs6000.c
===
--- config/rs6000/rs6000.c	(revision 222439)
+++ config/rs6000/rs6000.c	(working copy)
@@ -21932,8 +21932,8 @@ rs6000_stack_info (void)
   /* Determine if we need to allocate any stack frame:
 
  For AIX we need to push the stack if a frame pointer is needed
- (because the stack might be dynamically adjusted), if we are
- debugging, if we make calls, or if the sum of fp_save, gp_save,
+ (because the stack might be dynamically adjusted), if we want
+ to debug, if we make calls, or if the sum of fp_save, gp_save,
  and local variables are more than the space needed to save all
  non-volatile registers: 32-bit: 18*8 + 19*4 = 220 or 64-bit: 18*8
  + 18*8 = 288 (GPR13 reserved).
@@ -21950,7 +21950,7 @@ rs6000_stack_info (void)
   else if (frame_pointer_needed)
 info_ptr->push_p = 1;
 
-  else if (TARGET_XCOFF && write_symbols != NO_DEBUG)
+  else if (TARGET_XCOFF && (!optimize || optimize_debug))
 info_ptr->push_p = 1;
 
   else

[PATCH] [libstdc++] Add uniform container erasure.

2015-04-30 Thread Ed Smith-Rowland

This has been in my tree for a good while.

It is fairly simple and adds C++ experimental container erasure.

Builds and tests cleanly on x86_64-linux.

OK?

Index: include/Makefile.am
===
--- include/Makefile.am (revision 222573)
+++ include/Makefile.am (working copy)
@@ -645,15 +645,26 @@
 experimental_headers = \
${experimental_srcdir}/algorithm \
${experimental_srcdir}/any \
+   ${experimental_srcdir}/array \
${experimental_srcdir}/chrono \
+   ${experimental_srcdir}/deque \
+   ${experimental_srcdir}/erase_if.tcc \
+   ${experimental_srcdir}/forward_list \
${experimental_srcdir}/functional \
+   ${experimental_srcdir}/list \
+   ${experimental_srcdir}/map \
${experimental_srcdir}/optional \
${experimental_srcdir}/ratio \
+   ${experimental_srcdir}/set \
+   ${experimental_srcdir}/string \
${experimental_srcdir}/string_view \
+   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/system_error \
-   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/tuple \
-   ${experimental_srcdir}/type_traits
+   ${experimental_srcdir}/type_traits \
+   ${experimental_srcdir}/unordered_map \
+   ${experimental_srcdir}/unordered_set \
+   ${experimental_srcdir}/vector
 
 # This is the common subset of C++ files that all three "C" header models use.
 c_base_srcdir = $(C_INCLUDE_DIR)
Index: include/Makefile.in
===
--- include/Makefile.in (revision 222573)
+++ include/Makefile.in (working copy)
@@ -912,15 +912,26 @@
 experimental_headers = \
${experimental_srcdir}/algorithm \
${experimental_srcdir}/any \
+   ${experimental_srcdir}/array \
${experimental_srcdir}/chrono \
+   ${experimental_srcdir}/deque \
+   ${experimental_srcdir}/erase_if.tcc \
+   ${experimental_srcdir}/forward_list \
${experimental_srcdir}/functional \
+   ${experimental_srcdir}/list \
+   ${experimental_srcdir}/map \
${experimental_srcdir}/optional \
${experimental_srcdir}/ratio \
+   ${experimental_srcdir}/set \
+   ${experimental_srcdir}/string \
${experimental_srcdir}/string_view \
+   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/system_error \
-   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/tuple \
-   ${experimental_srcdir}/type_traits
+   ${experimental_srcdir}/type_traits \
+   ${experimental_srcdir}/unordered_map \
+   ${experimental_srcdir}/unordered_set \
+   ${experimental_srcdir}/vector
 
 
 # This is the common subset of C++ files that all three "C" header models use.
Index: include/experimental/array
===
--- include/experimental/array  (revision 0)
+++ include/experimental/array  (working copy)
@@ -0,0 +1,180 @@
+//  https://gist.github.com/lichray/6034753
+//  -*- C++ -*-
+
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// .
+
+/** @file experimental/array
+ *  This is a TS C++ Library header.
+ */
+
+#ifndef _GLIBCXX_EXPERIMENTAL_ARRAY
+#define _GLIBCXX_EXPERIMENTAL_ARRAY 1
+
+#pragma GCC system_header
+
+#if __cplusplus <= 201103L
+# include 
+#else
+
+#include 
+#include 
+#include 
+
+namespace std
+{
+namespace experimental
+{
+inline namespace fundamentals_v2
+{
+
+//  Made this up...
+#define __cpp_lib_experimental_make_array 201411
+
+namespace __detail
+{
+
+  template
+struct __lazy_conditional;
+
+  template
+struct __lazy_conditional
+{
+  using type = typename _Tp::type;
+};
+
+  template
+struct __lazy_conditional
+{
+  using type = typename _Tp::type;
+};
+
+  template
+struct __lazy_conditional
+{
+  using type = typename _Up::type;
+};
+
+  template
+using __if = 

Re: [RFC][PATCH 2/3] Propagate and save value ranges wrapped information

2015-04-30 Thread Richard Biener
On Thu, Apr 23, 2015 at 12:11 AM, Kugan
 wrote:
>
> On 19/01/15 22:28, Richard Biener wrote:
>> On Sat, 17 Jan 2015, Kugan wrote:
>>
>>>
>>> This patch propagate value range wrapps attribute and save this to
>>> SSA_NAME.
>>
>> diff --git a/gcc/tree-vrp.c b/gcc/tree-vrp.c
>> index 9b7695d..832c35d 100644
>> --- a/gcc/tree-vrp.c
>> +++ b/gcc/tree-vrp.c
>> @@ -103,6 +103,9 @@ struct value_range_d
>>tree min;
>>tree max;
>>
>> +  /* Set to true if values in this value range could wrapp. */
>> +  bool is_wrapped;
>> +
>>/* Set of SSA names whose value ranges are equivalent to this one.
>>   This set is only valid when TYPE is VR_RANGE or VR_ANTI_RANGE.  */
>>bitmap equiv;
>>
>> I can't make sense of this description (wrap with one p as well).
>> I assume you mean that the expression that has this value-range
>> assigned has an operation that may have wrapped?  (a value
>> can't wrap)
>>
>> You need to specify how is_wrapped behaves for range union and
>> intersect operations and which operations can wrap.
>>
>> I miss an overall description of these patches as to a) why you
>> need this information and b) why it helps.
>>
>> It's now also too late and thus you have plenty of time until stage1
>> starts again.
>
>
> Thanks Richard for the comments. Now that stage1 is open, here is the
> modified patch with the changes requested.
>
> Due to wrapping in the value range computation, there was a regression
> in alpha-linux
> (https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02458.html) while using
> value range information to remove zero/sign extensions in rtl
> expansion. Hence I had to revert the patch that enabled zero/sign
> extension. Now I am propagating this wrap_p information to SSA_NAME so
> that we know, when used in PROMOTE_MODE, the values can have
> unpredictable bits beyond the type width.
>
> I have also updated the comments as below:
>
>
> +  /* Set to true if the values in this range might have been wrapped
> + during the operation that computed it.
> +
> + This is mainly used in zero/sign-extension elimination where value
> ranges
> + computed are for the type of SSA_NAME and computation is
> ultimately done
> + in PROMOTE_MODE.  Therefore, the value ranges has to be correct upto
> + PROMOTE_MODE precision.  If the operation can WRAP, higher bits in
> + PROMOTE_MODE can be unpredictable and cannot be used in zero/sign
> extension
> + elimination; additional wrap_p attribute is needed to show this.
> +
> + For example:
> + on alpha where PROMOTE_MODE is 64 bit and _344 is a 32 bit unsigned
> + variable,
> + _343 = ivtmp.179_52 + 2147483645;  [0x8004, 0x80043]
> +
> + the value range VRP will compute is:
> +
> + _344 = _343 * 2;  [0x8, 0x86]
> + _345 = (integer(kind=4)) _344;[0x8, 0x86]
> +
> + In PROMOTE_MODE, there will be garbage above the type width.  In
> places
> + like this, attribute wrap_p will be true.
> +
> + wrap_p in range union operation will be true if either of the
> value range
> + has wrap_p set.  In intersect operation, true when both the value
> ranges
> + have wrap_p set.  */
> +  bool wrap_p;
> +

So what you'd like to know is whether the value range is the same if
all operations were carried out in infinite precision (well, in PROMOTE_MODE
precision).

I'm not sure you can simply assert that we didn't wrap for example in
extract_range_from_assert.  Consider your above code and

  if (_344 < 0x87)
{
  _346 = ASSERT_EXPR <_344, _344 < 0x87>;
...

_346 definitely should have wrap_p set.

That said, I'm not convinced this is a sustainable approach to the issue.

I've long pondered with replacing the VRP overflow checking code
(for -fstrict-overflow) with keeping two lattices - one honoring undefined
overflow and one not and then comparing the results in the end.

A similar approach could be used for your wrap_p flag.

BUT ... a way more sensible approach to reduce the required sign/zero
extensions for PROMOTE_MODE targets is to lower the IL earlier,
before we perform the 2nd VRP run and thus have the sign-/zero-extensions
being eliminated by GIMPLE optimizers rather than only at RTL time.

ISTR you (or somebody else) played with that a bit?

Thanks,
Richard.
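
To illustrate the underlying problem, here is a sketch in C (function names are made up; 64-bit arithmetic merely stands in for a PROMOTE_MODE register, which is an assumption of the example).  The same expression gives different upper bits depending on whether it is evaluated in the 32-bit type or in the promoted precision, so a range computed for the 32-bit type says nothing about the bits above it once the operation wraps.

```c
#include <assert.h>
#include <stdint.h>

/* Evaluated in the precision of the type: wraps modulo 2^32.  */
uint32_t
in_type_precision (uint32_t iv)
{
  return (iv + 2147483645u) * 2u;
}

/* Evaluated in a wider, promoted precision: no wrap at 2^32.  */
uint64_t
in_promoted_precision (uint32_t iv)
{
  return ((uint64_t) iv + 2147483645u) * 2u;
}

void
check_wrap (void)
{
  /* 3 + 2147483645 is 2^31; doubling it wraps to 0 in 32 bits.  */
  assert (in_type_precision (3) == 0);
  /* The promoted computation keeps bit 32 set.  */
  assert (in_promoted_precision (3) >> 32 == 1);
  /* Without wrapping, both agree and the upper bits are zero.  */
  assert (in_promoted_precision (0) == in_type_precision (0));
}
```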

> Thanks,
> Kugan
>
>
> gcc/testsuite/ChangeLog:
>
> 2015-04-22  Kugan Vivekanandarajah  
>
> * gcc.dg/tree-ssa/vrp92.c: Update scanned pattern.
>
> gcc/ChangeLog:
>
> 2015-04-22  Kugan Vivekanandarajah  
>
> * builtins.c (determine_block_size): Use new definition of
>  get_range_info.
> * gimple-pretty-print.c (dump_ssaname_info): Dump new wrap_p info.
> * internal-fn.c (get_range_pos_neg): Use new definition of
>  get_range_info.
> (get_min_precision): Likewise.
> * tree-ssa-copy.c (fini_copy_prop): Use new definition of
>  duplicate_ssa_range_info.
> * tree-ssa-pre.c (insert_into_preds_of_block): Likewise.
>   

Re: [RFC][PATCH 2/3] Propagate and save value ranges wrapped information

2015-04-30 Thread Jakub Jelinek
On Thu, Apr 30, 2015 at 01:35:25PM +0200, Richard Biener wrote:
> I've long pondered with replacing the VRP overflow checking code
> (for -fstrict-overflow) with keeping two lattices - one honoring undefined
> overflow and one not and then comparing the results in the end.

Yeah, that would be greatly appreciated.  The (OVF) stuff is a complete mess.

Jakub


Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern

2015-04-30 Thread Marc Glisse

On Thu, 30 Apr 2015, Richard Biener wrote:


On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes
 wrote:

Generalizing the x+(x&1) pattern, one can round up x to a multiple of
2^k by adding the negative of x modulo 2^k. But it is fewer
instructions, and presumably requires fewer registers, to do the more
common (x+m)&~m where m=2^k-1.

Signed-off-by: Rasmus Villemoes 
---
 gcc/match.pd  |  9 ++
 gcc/testsuite/gcc.dg/20150120-4.c | 59 +++
 2 files changed, 68 insertions(+)
 create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c

diff --git gcc/match.pd gcc/match.pd
index 47865f1..93c2298 100644
--- gcc/match.pd
+++ gcc/match.pd
@@ -273,6 +273,15 @@ along with GCC; see the file COPYING3.  If not see
  (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
   (bit_ior @0 (bit_not @1

+/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1.  */
+(simplify
+ (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1))


I think you want to restrict this to INTEGER_CST@1


Is this only to make the following test easier (a good enough reason for 
me) or is there some fundamental reason why this transformation would be 
wrong for vectors?



+ (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1),
+@1, build_one_cst (TREE_TYPE (@1))); }


We shouldn't dispatch to fold_binary in patterns.  int_const_binop would
be the appropriate function to use - but what happens for @1 == INT_MAX
where @1 + 1 overflows?  Similar, is this also valid for negative @1
and thus signed mask types?  IMHO we should check whether @1
is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1),
false, TYPE_PRECISION (TREE_TYPE (@1)).

As with the other patch a ChangeLog entry is missing as well as stating
how you tested the patch.

Thanks,
Richard.


+  (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
+   && cst && integer_pow2p (cst))
+   (bit_and (plus @0 @1) (bit_not @1)


--
Marc Glisse
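
For reference, the identity behind the proposed pattern can be checked in plain C.  This is a sketch on unsigned 32-bit values with made-up function names; it does not model match.pd or the wi:: API.  It also shows why the m == 2^k-1 condition matters: for a mask that is not a run of low bits the two forms disagree.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if M has the form 2^k - 1 (a run of low bits, possibly empty
   or all ones).  */
int
is_low_mask (uint32_t m)
{
  return (m & (m + 1)) == 0;
}

/* x + ((-x) & m): add the negative of x modulo 2^k.  */
uint32_t
round_up_neg (uint32_t x, uint32_t m)
{
  return x + ((0u - x) & m);
}

/* (x + m) & ~m: the more common form.  */
uint32_t
round_up_add (uint32_t x, uint32_t m)
{
  return (x + m) & ~m;
}

void
check_round_up (void)
{
  /* Both forms round 13 up to the next multiple of 8.  */
  assert (round_up_neg (13, 7) == 16);
  assert (round_up_add (13, 7) == 16);
  /* They also agree when the addition wraps.  */
  assert (round_up_neg (0xFFFFFFFFu, 7) == round_up_add (0xFFFFFFFFu, 7));
  /* For m == 5, which is not 2^k - 1, they differ.  */
  assert (!is_low_mask (5));
  assert (round_up_neg (1, 5) != round_up_add (1, 5));
}
```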


Re: Mostly rewrite genrecog

2015-04-30 Thread Richard Sandiford
Bin.Cheng  writes:
> Hi Richard,
> I noticed that this patch caused ICE for gcc.target/arm/mmx-2.c on
> arm-none-linux-gnueabi.  Could you please have a look at it?
>
> The log message is as below,
> /projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c: In function 'foo':
> /projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:166:1:
> error: unrecognizable insn:
> (insn 541 540 542 2 (set (reg:V4HI 512 [ D.4809 ])
> (vec_merge:V4HI (vec_select:V4HI (reg:V4HI 510 [ D.4806 ])
> (parallel [
> (const_int 2 [0x2])
> (const_int 0 [0])
> (const_int 3 [0x3])
> (const_int 1 [0x1])
> ]))
> (vec_select:V4HI (reg:V4HI 511 [ D.4806 ])
> (parallel [
> (const_int 0 [0])
> (const_int 2 [0x2])
> (const_int 1 [0x1])
> (const_int 3 [0x3])
> ]))
> (const_int 5 [0x5])))
> /projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:159 -1
>  (nil))
> /projects/.../src/gcc/gcc/testsuite/gcc.target/arm/mmx-2.c:166:1:
> internal compiler error: in extract_insn, at recog.c:2341
> 0xa42d2a _fatal_insn(char const*, rtx_def const*, char const*, int, char 
> const*)
> /projects/.../src/gcc/gcc/rtl-error.c:110
> 0xa42d59 _fatal_insn_not_found(rtx_def const*, char const*, int, char const*)
> /projects/.../src/gcc/gcc/rtl-error.c:118
> 0xa15ff7 extract_insn(rtx_insn*)
> /projects/.../src/gcc/gcc/recog.c:2341
> 0x7ffb42 instantiate_virtual_regs_in_insn
> /projects/.../src/gcc/gcc/function.c:1598
> 0x7ffb42 instantiate_virtual_regs
> /projects/.../src/gcc/gcc/function.c:1966
> 0x7ffb42 execute
> /projects/.../src/gcc/gcc/function.c:2015
> Please submit a full bug report,
> with preprocessed source if appropriate.
> Please include the complete backtrace with any bug report.
> See  for instructions.
>
> GCC is configured with
>
> gcc/configure --target=arm-none-linux-gnueabi --prefix=
> --with-sysroot=... --enable-shared --disable-libsanitizer
> --disable-libssp --disable-libmudflap
> --with-plugin-ld=arm-none-linux-gnueabi-ld --enable-checking=yes
> --enable-languages=c,c++,fortran --with-gmp=... --with-mpfr=...
> --with-mpc=... --with-isl=... --with-cloog=... --with-arch=armv7-a
> --with-fpu=vfpv3-d16 --with-float=softfp --with-arch=armv7-a

Sorry about that, thought I'd tested that combination.

I installed the patch below as obvious after testing on arm-linux-gnu.

Thanks,
Richard


gcc/
* genrecog.c (simplify_tests): Check that CONST_INT and XWINT tests
are for the same position.

Index: gcc/genrecog.c
===
--- gcc/genrecog.c  2015-04-30 09:06:17.706538299 +0100
+++ gcc/genrecog.c  2015-04-30 12:49:58.689309916 +0100
@@ -1597,7 +1597,8 @@ simplify_tests (state *s)
  && d->if_statement_p (&label)
  && label == CONST_INT)
if (decision *second = d->first->to->singleton ())
- if (second->test.kind == test::WIDE_INT_FIELD
+ if (d->test.pos == second->test.pos
+ && second->test.kind == test::WIDE_INT_FIELD
  && second->test.u.opno == 0
  && second->if_statement_p (&label)
  && IN_RANGE (int64_t (label),



Re: [PATCH][AARCH64]Use shl for vec_shr_ rtx pattern.

2015-04-30 Thread Renlin Li

Hi Marcus,

On 29/04/15 13:06, Marcus Shawcroft wrote:

I think there is another issue here, this change:

  if (BYTES_BIG_ENDIAN)
-  return "ushl %d0, %d1, %2";
+  return "shl %d0, %d1, %2";
  else
return "ushr %d0, %d1, %2";

is in the context of:

(define_insn "vec_shr_"
   [(set (match_operand:VD 0 "register_operand" "=w")
 (lshiftrt:VD (match_operand:VD 1 "register_operand" "w")
  (match_operand:SI 2 "immediate_operand" "i")))]


You are right. This pattern is ambiguous. I have updated the patch to 
represent vec_shr as an unspec. This will prevent other rtx patterns 
from implicitly matching this one.


The new patch is attached, is it Okay to commit?

Regards,
Renlin Li

gcc/ChangeLog:

2015-04-30  Renlin Li  

* config/aarch64/aarch64-simd.md (vec_shr): Defined as an unspec.
* config/aarch64/iterators.md (unspec): Add UNSPEC_VEC_SHR.

gcc/testsuite/ChangeLog:

2015-04-30  Renlin Li  

* gcc.target/aarch64/vect-reduc-or_1.c: New.


The RTL describes a right shift of the bits within each element in the
vector while the optab expects  a right shift of the elements within
the vector?

/Marcus
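
To illustrate the distinction raised above, here is a minimal C sketch
that models a 64-bit vector of eight byte lanes as a uint64_t.  The
function names are hypothetical and the model deliberately ignores lane
ordering on big-endian targets; it only shows that shifting the bits
within each element and shifting the whole vector are different
operations.

```c
#include <assert.h>
#include <stdint.h>

/* Shift the bits of each 8-bit lane right by N, independently of the
   other lanes.  This models what (lshiftrt:V8QI ...) describes.  */
uint64_t
lane_shift_right (uint64_t v, unsigned n)
{
  uint64_t out = 0;

  assert (n < 8);
  for (unsigned lane = 0; lane < 8; lane++)
    {
      uint8_t elt = (uint8_t) (v >> (8 * lane));
      out |= (uint64_t) (uint8_t) (elt >> n) << (8 * lane);
    }
  return out;
}

/* Shift the whole 64-bit vector right by N bits, so that elements move
   between lanes.  This models a whole-vector shift such as the one the
   vec_shr optab asks for.  */
uint64_t
whole_vector_shift_right (uint64_t v, unsigned n)
{
  assert (n < 64);
  return v >> n;
}
```

With the lanes 0x01..0x08, a whole-vector shift by 8 bits moves every
element down one lane, whereas a per-lane shift never moves data across
lane boundaries.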

diff --git a/gcc/config/aarch64/aarch64-simd.md b/gcc/config/aarch64/aarch64-simd.md
index 0557570..6304eae6 100644
--- a/gcc/config/aarch64/aarch64-simd.md
+++ b/gcc/config/aarch64/aarch64-simd.md
@@ -783,12 +783,13 @@
 ;; For 64-bit modes we use ushl/r, as this does not require a SIMD zero.
 (define_insn "vec_shr_"
   [(set (match_operand:VD 0 "register_operand" "=w")
-(lshiftrt:VD (match_operand:VD 1 "register_operand" "w")
-		 (match_operand:SI 2 "immediate_operand" "i")))]
+(unspec:VD [(match_operand:VD 1 "register_operand" "w")
+		(match_operand:SI 2 "immediate_operand" "i")]
+		   UNSPEC_VEC_SHR))]
   "TARGET_SIMD"
   {
 if (BYTES_BIG_ENDIAN)
-  return "ushl %d0, %d1, %2";
+  return "shl %d0, %d1, %2";
 else
   return "ushr %d0, %d1, %2";
   }
diff --git a/gcc/config/aarch64/iterators.md b/gcc/config/aarch64/iterators.md
index 1fdff04..498358a 100644
--- a/gcc/config/aarch64/iterators.md
+++ b/gcc/config/aarch64/iterators.md
@@ -278,6 +278,7 @@
 UNSPEC_PMULL; Used in aarch64-simd.md.
 UNSPEC_PMULL2   ; Used in aarch64-simd.md.
 UNSPEC_REV_REGLIST  ; Used in aarch64-simd.md.
+UNSPEC_VEC_SHR  ; Used in aarch64-simd.md.
 ])
 
 ;; ---
diff --git a/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c b/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
new file mode 100644
index 000..f5d9460
--- /dev/null
+++ b/gcc/testsuite/gcc.target/aarch64/vect-reduc-or_1.c
@@ -0,0 +1,34 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -fdump-tree-vect-all" } */
+/* Write a reduction loop to be reduced using whole vector right shift.  */
+
+extern void abort (void);
+
+unsigned char in[8] __attribute__((__aligned__(16)));
+
+int
+main (unsigned char argc, char **argv)
+{
+  unsigned char i = 0;
+  unsigned char sum = 1;
+
+  for (i = 0; i < 8; i++)
+in[i] = (i + i + 1) & 0xfd;
+
+  /* Prevent constant propagation of the entire loop below.  */
+  asm volatile ("" : : : "memory");
+
+  for (i = 0; i < 8; i++)
+sum |= in[i];
+
+  if (sum != 13)
+{
+  __builtin_printf("Failed %d\n", sum);
+  abort();
+}
+
+  return 0;
+}
+
+/* { dg-final { scan-tree-dump "Reduce using vector shifts" "vect" } } */
+/* { dg-final { cleanup-tree-dump "vect" } } */


Re: [PATCH][AArch64] Properly handle mvn-register and add EON+shift pattern and cost appropriately

2015-04-30 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01434.html

Thanks,
Kyrill

On 23/04/15 17:57, Kyrill Tkachov wrote:

[resending due to mail client messing up.]

Hi all,

The EON instruction can be expressed either by (xor (not a) b) or (not (xor
a b)),
simplify-rtx canonicalizes to the second form and we have a pattern for it
(*xor_one_cmpl3) but we don't have a pattern for the shifted operand
version. This patch adds that pattern as well as the proper handling of it
in
aarch64_rtx_costs. The zero-extend version of it is also added.
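
As a small illustration of why the two RTL forms describe the same
operation, the following C sketch computes both with a shifted second
operand.  The function names are hypothetical and only model the
arithmetic; they say nothing about how the patterns are written in
aarch64.md.

```c
#include <assert.h>
#include <stdint.h>

/* (not (xor a (lshiftrt b shift))): the form simplify-rtx is said to
   canonicalize to.  */
uint64_t
eon_not_of_xor (uint64_t a, uint64_t b, unsigned shift)
{
  assert (shift < 64);
  return ~(a ^ (b >> shift));
}

/* (xor (not a) (lshiftrt b shift)): the alternative form.  */
uint64_t
eon_xor_of_not (uint64_t a, uint64_t b, unsigned shift)
{
  assert (shift < 64);
  return (~a) ^ (b >> shift);
}
```

Both functions return the same value for every input, because
complementing one XOR operand complements the result.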

While we're in the NOT case of rtx costs this patch also
corrects the costing of MVN+shift operations.

With this patch, for C code:
unsigned long
baz (unsigned int a, unsigned int b)
{
   return ~(a ^ (b << 6));
}

unsigned long
foo (unsigned long a, unsigned long b)
{
   return ~(a ^ (b >> 24));
}


we now generate:
baz:
eon w0, w0, w1, lsl 6
ret

foo:
eon x0, x0, x1, lsr 24
ret

instead of the previous:
baz:
eor w0, w0, w1, lsl 6
mvn w0, w0
uxtw x0, w0
ret

foo:
eor x0, x0, x1, lsr 24
mvn x0, x0
ret


Bootstrapped and tested on aarch64-linux.
Ok for trunk?
Thanks,
Kyrill

2015-04-23  Kyrylo Tkachov  

* config/aarch64/aarch64.md
(*eor_one_cmpl_3_alt):
New pattern.
(*eor_one_cmpl_sidi3_alt_ze): Likewise.
* config/aarch64/aarch64.c (aarch64_rtx_costs): Handle MVN-shift
appropriately.  Handle alternative EON form.




Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Trevor Saunders
On Thu, Apr 30, 2015 at 08:54:05AM +0200, Andreas Schwab wrote:
> Trevor Saunders  writes:
> 
> >> diff --git a/libobjc/encoding.c b/libobjc/encoding.c
> >> index 7333908..20ace46 100644
> >> --- a/libobjc/encoding.c
> >> +++ b/libobjc/encoding.c
> >> @@ -1167,7 +1167,7 @@ objc_layout_structure_next_member (struct 
> >> objc_struct_layout *layout)
> >>/* Record must have at least as much alignment as any field.
> >>   Otherwise, the alignment of the field within the record
> >>   is meaningless.  */
> >> -#ifndef PCC_BITFIELD_TYPE_MATTERS
> >> +#if !PCC_BITFIELD_TYPE_MATTERS
> 
> With `#define PCC_BITFIELD_TYPE_MATTERS true' this expands to `#if
> !true' which evaluates to 1 since true isn't a defined identifier?

I think true is a defined identifier since this is compiled as c11.

tbsaunde@iceball:/src/gcc1-opt$ cat test.c
#define FOO true
#if !FOO
hello
#endif
tbsaunde@iceball:/src/gcc1-opt$ gcc/xgcc -B gcc/ -E test.c
# 1 "test.c"
# 1 ""
# 1 ""
# 1 "/usr/include/stdc-predef.h" 1 3 4
# 1 "" 2
# 1 "test.c"


hello
tbsaunde@iceball:/src/gcc1-opt$

Trev

> 
> Andreas.
> 
> -- 
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: [PATCH][ARM] Do not lower cost of setting core reg to constant. It doesn't have any effect

2015-04-30 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01330.html

Thanks,
Kyrill

On 22/04/15 17:18, Kyrill Tkachov wrote:

Hi all,

This hunk that slightly reduces the cost of immediate moves doesn't actually 
have any effect.
In the whole of SPEC2006 it didn't make a difference. In any case, I'd like to 
move to a point
where we use COSTS_N_INSNS units for our costs and do not increment or decrement 
them by one.

This patch removes that bit of logic and makes it slightly cleaner to look at. 
As far as I know
its logic has never been confirmed in practice.

Bootstrapped and tested on arm.

Ok for trunk?

Thanks,
Kyrill

2015-04-22  Kyrylo Tkachov  

  * config/arm/arm.c (arm_new_rtx_costs): Do not lower cost
  immediate moves.




Re: [PATCH][AArch64] Properly cost FABD pattern

2015-04-30 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01329.html

Thanks,
Kyrill

On 22/04/15 17:01, Kyrill Tkachov wrote:

Hi all,

In rtx costs we do not handle the FP abs (minus (a b)) case which maps down
to a FABD instruction.
This patch fixes that. FABD behaves similarly to the FADD class of
instructions unlike simple FABS
which is closer to FNEG.

Tested aarch64-none-elf.
Ok for trunk?

Thanks,
Kyrill

2015-04-22  Kyrylo Tkachov  

 * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle pattern for
 fabd in ABS case.




Re: [PATCH][ARM][stage-1] Initialise cost to COSTS_N_INSNS (1) and increment in arm rtx costs

2015-04-30 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01130.html

Thanks,
Kyrill

On 21/04/15 10:11, Kyrill Tkachov wrote:

Hi all,

This is the first of a series to clean up and simplify the arm rtx costs 
function.
This patch initialises the cost to COSTS_N_INSNS (1) at the top and increments 
it when appropriate
in the rest of the function. This makes it more similar to the aarch64 rtx 
costs function and saves
us the trouble of having to remember to initialise the cost to COSTS_N_INSNS 
(1) in each case of the
switch statement.

Bootstrapped and tested arm-none-linux-gnueabihf.
Compiled some large programs with no codegen difference, except some DIV 
synthesis algorithms were changed,
presumably due to the cost of SDIV/UDIV, which is now being correctly 
calculated (before it was missing the
baseline COSTS_N_INSNS (1)).
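
The shape of the change can be sketched in a few lines of C.  This is
not the arm backend code: the enum, the function and the
SKETCH_COSTS_N_INSNS macro are hypothetical stand-ins (the macro is
modelled on my understanding that COSTS_N_INSNS (N) scales N by 4), and
the extra division cost is passed in as a parameter purely for
illustration.

```c
/* Hypothetical stand-in for GCC's COSTS_N_INSNS; assumed to scale by 4.  */
#define SKETCH_COSTS_N_INSNS(N) ((N) * 4)

enum sketch_op { SKETCH_MOVE, SKETCH_ADD, SKETCH_DIV };

/* Initialise the cost to one instruction at the top, then only add the
   extra cost in the cases that need it.  No case has to remember to
   set the baseline itself.  */
int
sketch_insn_cost (enum sketch_op op, int extra_div_cost)
{
  int cost = SKETCH_COSTS_N_INSNS (1);

  switch (op)
    {
    case SKETCH_DIV:
      /* Previously the baseline was forgotten here; now it is implied.  */
      cost += extra_div_cost;
      break;
    default:
      break;
    }
  return cost;
}
```

With this layout a division costs the baseline plus its extra cost,
which matches the SDIV/UDIV difference described above.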

Ok for trunk?

Thanks,
Kyrill

2015-04-21  Kyrylo Tkachov  

  * config/arm/arm.c (arm_new_rtx_costs): Initialise cost to
  COSTS_N_INSNS (1) and increment it appropriately throughout the
  function.




Re: [PATCH][ARM] Handle UNSPEC_VOLATILE in rtx costs and don't recurse inside the unspec

2015-04-30 Thread Kyrill Tkachov

Ping.
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01047.html

Thanks,
Kyrill

On 20/04/15 17:28, Kyrill Tkachov wrote:

Hi all,

A pet project of mine is to get to the point where backend rtx costs functions 
won't have
to handle rtxes that don't match down to any patterns/expanders we have. Or at 
least limit such cases.
A case dealt with in this patch is QImode PLUS. We don't actually generate or 
handle these anywhere in
the arm backend *except* in sync.md where, for example, 
atomic_ matches:
(set (match_operand:QHSD 0 "mem_noofs_operand" "+Ua")
  (unspec_volatile:QHSD
[(syncop:QHSD (match_dup 0)
   (match_operand:QHSD 1 "" ""))
 (match_operand:SI 2 "const_int_operand")];; model
VUNSPEC_ATOMIC_OP))

Here QHSD can contain QImode and HImode while syncop can be PLUS.
Now immediately during splitting in arm_split_atomic_op we convert that
QImode PLUS into an SImode one, so we never actually generate any kind of 
QImode add operations
(how would we? we don't have define_insns for such things) but the RTL 
optimisers will get a hold
of the UNSPEC_VOLATILE in the meantime and ask for its cost (for example, cse 
when building libatomic).
Currently we don't handle UNSPEC_VOLATILE (VUNSPEC_ATOMIC_OP) so the arm rtx 
costs function just recurses
into the QImode PLUS that I'd like to avoid.
This patch stops that by passing the VUNSPEC_ATOMIC_OP into arm_unspec_cost and 
handling it there
(very straightforwardly just returning COSTS_N_INSNS (2); there's no indication 
that we want to do anything
smarter here) and stopping the recursion.

This is a small step in the direction of not having to care about obviously 
useless rtxes in the backend.
The astute reader might notice that in sync.md we also have the pattern 
atomic_fetch_
which expands to/matches this:
(set (match_operand:QHSD 0 "s_register_operand" "=&r")
  (match_operand:QHSD 1 "mem_noofs_operand" "+Ua"))
 (set (match_dup 1)
  (unspec_volatile:QHSD
[(syncop:QHSD (match_dup 1)
   (match_operand:QHSD 2 "" ""))
 (match_operand:SI 3 "const_int_operand")];; model
VUNSPEC_ATOMIC_OP))


Here the QImode PLUS is in a PARALLEL together with the UNSPEC, so it might 
have rtx costs called on it
as well. This will always be a (plus (reg) (mem)) rtx, which is unlike any 
other normal rtx we generate
in the arm backend. I'll try to get a patch to handle that case, but I'm still 
thinking on how to best
do that.

Tested arm-none-eabi, I didn't see any codegen differences in some compiled 
codebases.

Ok for trunk?

P.S. I know that expmed creates all kinds of irregular rtxes and asks for their 
costs. I'm hoping to clean that
up at some point...

2015-04-20  Kyrylo Tkachov  

  * config/arm/arm.c (arm_new_rtx_costs): Handle UNSPEC_VOLATILE.
  (arm_unspec_cost): Allow UNSPEC_VOLATILE.  Do not recurse inside
  unknown unspecs.




[PING] AArch64 costs patches

2015-04-30 Thread Kyrill Tkachov

Hi all,

I'd like to ping these 3 aarch64 costs patches.

[AArch64] Use extend_arith rtx cost appropriately: 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01051.html
[AArch64] Properly cost MNEG/[SU]MNEGL patterns 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01050.html
[AArch64] Properly handle SHIFT ops and EXTEND in aarch64_rtx_mult_cost 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01049.html

Thanks,
Kyrill



Re: [PATCH][expr.c] PR 65358 Avoid clobbering partial argument during sibcall

2015-04-30 Thread Kyrill Tkachov


On 28/04/15 10:54, Kyrill Tkachov wrote:

On 27/04/15 21:13, Jeff Law wrote:

On 04/21/2015 11:33 AM, Kyrill Tkachov wrote:

On 21/04/15 15:09, Jeff Law wrote:

On 04/21/2015 02:30 AM, Kyrill Tkachov wrote:

From reading config/stormy16/stormy-abi it seems to me that we don't
pass arguments partially in stormy16, so this code would never be called
there. That leaves pa as the potential problematic target.
I don't suppose there's an easy way to test on pa? My checkout of
binutils
doesn't seem to include a sim target for it.

No simulator, no machines in the testfarm, the box I had access to via
parisc-linux.org seems dead and my ancient PA overheats well before a
bootstrap could complete.  I often regret knowing about the backwards
way many things were done on the PA because it makes me think about
cases that only matter on dead architectures.

So what should be the action plan here? I can't add an assert on
positive result as a negative result is valid.

We want to catch the case where this would cause trouble on
pa, or change the patch until we're confident that it's fine
for pa.

That being said, reading the documentation of STACK_GROWS_UPWARD
and ARGS_GROW_DOWNWARD I'm having a hard time visualising a case
where this would cause trouble on pa.

Is the problem that in the function:

+/* Add SIZE to X and check whether it's greater than Y.
+   If it is, return the constant amount by which it's greater or smaller.
+   If the two are not statically comparable (for example, X and Y contain
+   different registers) return -1.  This is used in expand_push_insn to
+   figure out if reading SIZE bytes from location X will end up reading
from
+   location Y.  */
+static int
+memory_load_overlap (rtx x, rtx y, HOST_WIDE_INT size)
+{
+  rtx tmp = plus_constant (Pmode, x, size);
+  rtx sub = simplify_gen_binary (MINUS, Pmode, tmp, y);
+
+  if (!CONST_INT_P (sub))
+return -1;
+
+  return INTVAL (sub);
+}

for ARGS_GROW_DOWNWARD we would be reading 'backwards' from x,
so the function should be something like the following?

So I had to go back and compile some simple examples.

References to outgoing arguments will be SP relative.  References to the
incoming arguments will be ARGP relative.  And that brings me to
another issue.  Isn't X in this context the incoming argument slot and
the destination an outgoing argument slot?

If so, the approach of memory_load_overlap simply won't work on a target
with calling conventions like the PA.  And you might really want to
consider punting for these kind of calling conventions

Ok, thanks for the guidance.
How about this? This patch disables sibcall optimisation when
encountering a partial argument when ARGS_GROW_DOWNWARD && 
!STACK_GROWS_DOWNWARD.
Hopefully this shouldn't harm codegen on parisc if, as you say, it's rare to 
have
partial arguments anyway on PA due to the large number of argument regs.

I tested this on arm and bootstrapped on x86_64.
I am now going through the process of getting access to a Debian PA machine to
give it a test there (thanks Dave!)

Ok if testing comes clean?

Hi Jeff,

So I got access to an hppa machine.
But as mentioned here https://gcc.gnu.org/ml/gcc/2015-04/msg00364.html
I don't think I can bootstrap and run a 64-bit pa testsuite.
I've verified that the compiler builds, but is there anything more I can
do in testing this patch?
The latest version is at 
https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01713.html
and disables sibcall optimisation on a target like pa when a partial argument 
is encountered.

Thanks,
Kyrill




Thanks,
Kyrill

2015-04-28  Kyrylo Tkachov  

  PR target/65358
  * calls.c (expand_call): Cancel sibcall optimisation when encountering
  partial argument on targets with ARGS_GROW_DOWNWARD and
  !STACK_GROWS_DOWNWARD.
  * expr.c (memory_load_overlap): New function.
  (emit_push_insn): When pushing partial args to the stack would
  clobber the register part load the overlapping part into a pseudo
  and put it into the hard reg after pushing.

2015-04-28  Honggyu Kim  

  PR target/65358
  * gcc.dg/pr65358.c: New test.


If you hadn't already done the work, I'd suggest punting for any case
where we have args partially in regs and partially in memory :-)

More thoughts when I can get an hour or two to remind myself how all
this stuff works on the PA.

I will note that testing on the PA is unlikely to show anything simply
because it uses 8 parameter passing registers.  So it's rare to pass
anything in memory at all.  Even rarer to have something partially in
memory and partially in registers.



Jeff






Re: Mostly rewrite genrecog

2015-04-30 Thread Andreas Schwab
Richard Sandiford  writes:

> Andreas Schwab  writes:
>> Richard Sandiford  writes:
>>
>>> /* Represents a test and the action that should be taken on the result.
>>>If a transition exists for the test outcome, the machine switches
>>>to the transition's target state.  If no suitable transition exists,
>>>the machine either falls through to the next decision or, if there are no
>>>more decisions to try, fails the match.  */
>>> struct decision : list_head 
>>> {
>>>   decision (const test &);
>>>
>>>   void set_parent (list_head  *s);
>>>   bool if_statement_p (uint64_t * = 0) const;
>>>
>>>   /* The state to which this decision belongs.  */
>>>   state *s;
>>>
>>>   /* Links to other decisions in the same state.  */
>>>   decision *prev, *next;
>>>
>>>   /* The test to perform.  */
>>>   struct test test;
>>> };
>>
>> ../../gcc/genrecog.c:1467: error: declaration of 'test decision::test'
>> ../../gcc/genrecog.c:1051: error: changes meaning of 'test' from 'struct 
>> test'
>>
>> Bootstrap compiler is gcc 4.3.4.
>
> Bah.  Does it like "::test test" instead of "struct test test"?

Same error.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH][AArch64] Use extend_arith rtx cost appropriately

2015-04-30 Thread Marcus Shawcroft
On 20 April 2015 at 17:48, Kyrill Tkachov  wrote:
> Hi all,
>
> When calculating the rtx costs of an arithmetic operation combined with
> zero or sign extension of its operand we should use the extend_arith
> cost rather than the arith_shift cost.
>
> Bootstrapped and tested on aarch64-linux.
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-04-20  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_rtx_costs): Use extend_arith
> rather than arith_shift cost when costing ADD/MINUS of an
> extended value.

OK /Marcus


Re: [PATCH][AArch64] Properly cost MNEG/[SU]MNEGL patterns

2015-04-30 Thread Marcus Shawcroft
On 20 April 2015 at 17:36, Kyrill Tkachov  wrote:
> Hi all,
>
> Currently we do not handle the MNEG patterns properly in rtx costs.
> These instructions are similar to the MSUB ones.
> This patch handles them by catching the NEG at the appropriate position,
> extracting its operands and letting the rest of the aarch64_rtx_mult_cost
> function
> handle the additional costs.
>
> Tested on aarch64-none-elf.
>
> Ok trunk?
>
> Thanks,
> Kyrill
>
> N.B.
> This patch's context depends on:
> https://gcc.gnu.org/ml/gcc-patches/2015-04/msg01049.html
>
> 2015-04-20  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_rtx_mult_cost): Handle MNEG
> and [SU]MNEGL patterns.


OK /Marcus


Re: [PATCH] add self-tuning to x86 hardware fast path in libitm

2015-04-30 Thread Nuno Diegues
> Patch looks good to me now. It would be perhaps nice to have an
> environment variable to turn the adaptive algorithm off for tests,
> but that's not critical.

Yes, that makes perfect sense.


> It would be also nice to test it on something else, but I understand
> it's difficult to find other software using the STM syntax.

Indeed. I'll try to find some time to work on that, but it may take a while.


> I can't approve the patch however. I believe it's big enough that you
> may need a copyright assignment.

I have signed a Form Assignment from the Free Software Foundation to
deal exactly with those matters for this patch to the libitm. Torvald
Riegel had advised me to do so.

I have not, however, received any further information; so I'm left
wondering if it went through or if it is still hanging. I will ping
back to FSF to check that out perhaps?


Best regards,
-- Nuno Diegues



>
> -Andi
>
> --
> a...@linux.intel.com -- Speaking for myself only


Re: [RFC][PATCH 2/3] Propagate and save value ranges wrapped information

2015-04-30 Thread Richard Biener
On Thu, 30 Apr 2015, Jakub Jelinek wrote:

> On Thu, Apr 30, 2015 at 01:35:25PM +0200, Richard Biener wrote:
> > I've long pondered with replacing the VRP overflow checking code
> > (for -fstrict-overflow) with keeping two lattices - one honoring undefined
> > overflow and one not and then comparing the results in the end.
> 
> Yeah, that would be greatly appreciated.  The (OVF) stuff is complete mess.

Just to explain a little bit, the idea is to have two lattices
and run the VRP propagation stage on each one, once with
flag_wrapv forced to 1 (all conditional on -Wstrict-overflow of course).
At stmt simplification time then determine if the outcome is dependent
on the choice of the lattice and if so, emit a strict-overflow warning.

This will avoid the missed optimizations we currently have due to the
(OVF) stuff.

And it will simplify the code and make it easier to maintain.

As for doing sth similar for PROMOTE_MODE you'd have to replace
all SSA names TREE_TYPE (and hope VRP doesn't explode on inconsistencies).
I don't think that's going to fly well though.  Instead you could
"lower" at VRP instrumentation time, inserting sign-/zero-extensions
and remove those not necessary at VRPs final stage.  Of course that
would tie lowering to VRP which probably we don't want to do.
I've meanwhile found the prototype lowering pass and as I've commented
on that patch you need to add at least a sign-extension tree operator
for efficiency (otherwise you need two stmts and an intermediate
non-promoted type).  zero-extension can be done via a BIT_AND_EXPR
(not 100% nice, but we avoid having two ways to compute sth which
speaks against having an explicit zero-extension operator).
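
As a rough illustration of the asymmetry, here is a C sketch for an
8-bit value held in a 32-bit register.  The function names are
hypothetical and this is ordinary C rather than GIMPLE; it only shows
that zero-extension is a single AND with a mask, while sign-extension
without a dedicated operator takes more than one operation.

```c
#include <stdint.h>

/* Zero-extend the low 8 bits: one AND with a constant mask, which is
   what a BIT_AND_EXPR would express.  */
uint32_t
zero_extend_8 (uint32_t x)
{
  return x & 0xffu;
}

/* Sign-extend the low 8 bits without a dedicated operator: mask, flip
   the sign bit, then subtract it again.  A shift-left followed by an
   arithmetic shift-right is the other common two-step form.  */
int32_t
sign_extend_8 (uint32_t x)
{
  return (int32_t) ((x & 0xffu) ^ 0x80u) - 0x80;
}
```

The second function needs an intermediate result, which is the
efficiency concern behind asking for an explicit sign-extension
operator.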

Richard.


Re: [PATCH][AArch64] Properly handle SHIFT ops and EXTEND in aarch64_rtx_mult_cost

2015-04-30 Thread Marcus Shawcroft
On 20 April 2015 at 17:35, Kyrill Tkachov  wrote:
> Hi all,
>
> The aarch64_rtx_mult_cost helper is supposed to handle multiplication costs
> as well as
> PLUS/MINUS operations combined with multiplication or shift operations. The
> shift
> operations may contain an extension. Currently we do not handle all these
> cases properly.
> We also don't handle other supported shift types besides ASHIFT.
>
> This patch addresses that by beefing up aarch64_rtx_mult_cost to handle
> extensions inside the shifts and handling the other kinds of supported
> shifts.
>
> Bootstrapped and tested on aarch64-linux.
>
> Ok for trunk?
>
> Thanks,
> Kyrill
>
> 2015-04-20  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_shift_p): New function.
> (aarch64_rtx_mult_cost): Update comment to reflect that it also handles
> combined arithmetic-shift ops.  Properly handle all shift and extend
> operations that can occur in combination with PLUS/MINUS.
> Rename maybe_fma to compound_p.
> (aarch64_rtx_costs): Use aarch64_shift_p when costing compound
> arithmetic and shift operations.

+/* Return true iff CODE is a shift supported in combination
+   with arithmetic instructions.  */
+static bool
+aarch64_shift_p (enum rtx_code code)

OK, but add the blank line between the function comment and the
function please. /Marcus


Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Jakub Jelinek
On Thu, Apr 30, 2015 at 07:58:33AM -0400, Trevor Saunders wrote:
> On Thu, Apr 30, 2015 at 08:54:05AM +0200, Andreas Schwab wrote:
> > Trevor Saunders  writes:
> > 
> > >> diff --git a/libobjc/encoding.c b/libobjc/encoding.c
> > >> index 7333908..20ace46 100644
> > >> --- a/libobjc/encoding.c
> > >> +++ b/libobjc/encoding.c
> > >> @@ -1167,7 +1167,7 @@ objc_layout_structure_next_member (struct 
> > >> objc_struct_layout *layout)
> > >>/* Record must have at least as much alignment as any field.
> > >>   Otherwise, the alignment of the field within the record
> > >>   is meaningless.  */
> > >> -#ifndef PCC_BITFIELD_TYPE_MATTERS
> > >> +#if !PCC_BITFIELD_TYPE_MATTERS
> > 
> > With `#define PCC_BITFIELD_TYPE_MATTERS true' this expands to `#if
> > !true' which evaluates to 1 since true isn't a defined identifier?
> 
> I think true is a defined identifier since this is compiled as c11.

true is not a defined identifier, neither in C89, nor in C99, nor in C11.
In C, true may be a macro if stdbool.h is included.
system.h has:
#undef TRUE
#undef FALSE

#ifdef __cplusplus
  /* Obsolete.  */
# define TRUE true
# define FALSE false
#else /* !__cplusplus */
# undef bool
# undef true
# undef false

# define bool unsigned char
# define true 1
# define false 0

  /* Obsolete.  */
# define TRUE true
# define FALSE false
#endif /* !__cplusplus */
if it is included.

Jakub


Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Andreas Schwab
Trevor Saunders  writes:

> I think true is a defined identifier since this is compiled as c11.
>
> tbsaunde@iceball:/src/gcc1-opt$ cat test.c
> #define FOO true
> #if !FOO
> hello
> #endif
> tbsaunde@iceball:/src/gcc1-opt$ gcc/xgcc -B gcc/ -E test.c
> # 1 "test.c"
> # 1 ""
> # 1 ""
> # 1 "/usr/include/stdc-predef.h" 1 3 4
> # 1 "" 2
> # 1 "test.c"
>
>
> hello

Since you get hello this proves that true is *not* defined.
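
The rule behind this can be shown with a small self-contained C sketch.
In an #if expression, any identifier left over after macro expansion is
replaced by 0.  The names below are hypothetical, and NOT_A_MACRO is
deliberately never defined; it stands in for `true' when no header has
defined it.  Language modes in which true is a keyword (C++, and as far
as I know C23) treat that particular identifier specially, so the
sketch avoids it.

```c
/* FIRST expands to an identifier that is not a macro, so inside #if it
   evaluates as 0 and !FIRST is 1.  */
#define FIRST NOT_A_MACRO

#if !FIRST
# define FIRST_NEGATED 1
#else
# define FIRST_NEGATED 0
#endif

/* SECOND expands to the constant 1, so !SECOND is 0.  */
#define SECOND 1

#if !SECOND
# define SECOND_NEGATED 1
#else
# define SECOND_NEGATED 0
#endif

int
negated_undefined_identifier (void)
{
  return FIRST_NEGATED;
}

int
negated_one (void)
{
  return SECOND_NEGATED;
}
```

This is why a macro defined as `true' only behaves as intended under
`#if' when something such as system.h has defined true as 1 first.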

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [PATCH 4/4] match.pd: Add x + ((-x) & m) -> (x + m) & ~m pattern

2015-04-30 Thread Richard Biener
On Thu, Apr 30, 2015 at 1:44 PM, Marc Glisse  wrote:
> On Thu, 30 Apr 2015, Richard Biener wrote:
>
>> On Wed, Jan 21, 2015 at 11:49 AM, Rasmus Villemoes
>>  wrote:
>>>
>>> Generalizing the x+(x&1) pattern, one can round up x to a multiple of
>>> a 2^k by adding the negative of x modulo 2^k. But it is fewer
>>> instructions, and presumably requires fewer registers, to do the more
>>> common (x+m)&~m where m=2^k-1.
>>>
>>> Signed-off-by: Rasmus Villemoes 
>>> ---
>>>  gcc/match.pd  |  9 ++
>>>  gcc/testsuite/gcc.dg/20150120-4.c | 59
>>> +++
>>>  2 files changed, 68 insertions(+)
>>>  create mode 100644 gcc/testsuite/gcc.dg/20150120-4.c
>>>
>>> diff --git gcc/match.pd gcc/match.pd
>>> index 47865f1..93c2298 100644
>>> --- gcc/match.pd
>>> +++ gcc/match.pd
>>> @@ -273,6 +273,15 @@ along with GCC; see the file COPYING3.  If not see
>>>   (if (TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
>>>(bit_ior @0 (bit_not @1
>>>
>>> +/* x + ((-x) & m) -> (x + m) & ~m when m == 2^k-1.  */
>>> +(simplify
>>> + (plus:c @0 (bit_and@2 (negate @0) CONSTANT_CLASS_P@1))
>>
>>
>> I think you want to restrict this to INTEGER_CST@1
>
>
> Is this only to make the following test easier (a good enough reason for me)
> or is there some fundamental reason why this transformation would be wrong
> for vectors?

Good question - I suppose it also works for vectors (well, the predicates
don't).  For non-integers or complex ints we shouldn't arrive here as
we can't have bit_and for them.  For pointers we can't have plus on them.

So yes, it makes the following tests easier.  A TODO comment for vectors
might be appropriate (we'd simply need a predicate that can test for
all elements being 2^k-1).
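
For the scalar case, the identity can be checked with a minimal C
sketch on unsigned 32-bit values.  The function names are hypothetical;
the mask test (m & (m + 1)) == 0 is one common way to check that m has
the form 2^k-1 and is not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero when M is of the form 2^k-1, i.e. a run of low set bits.  */
int
is_low_bit_mask (uint32_t m)
{
  return (m & (m + 1)) == 0;
}

/* x + ((-x) & m): add the distance to the next multiple of m + 1.  */
uint32_t
round_up_negate_form (uint32_t x, uint32_t m)
{
  assert (is_low_bit_mask (m));
  return x + ((0u - x) & m);
}

/* (x + m) & ~m: the more common form the pattern rewrites to.  */
uint32_t
round_up_add_mask_form (uint32_t x, uint32_t m)
{
  assert (is_low_bit_mask (m));
  return (x + m) & ~m;
}
```

For example, with m = 7 both forms round 13 up to 16 and leave 16
unchanged.  A vector predicate would have to apply the same mask test
to every element.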

Richard.

>
>>> + (with { tree cst = fold_binary (PLUS_EXPR, TREE_TYPE (@1),
>>> +@1, build_one_cst (TREE_TYPE (@1))); }
>>
>>
>> We shouldn't dispatch to fold_binary in patterns.  int_const_binop would
>> be the appropriate function to use - but what happens for @1 == INT_MAX
>> where @1 + 1 overflows?  Similar, is this also valid for negative @1
>> and thus signed mask types?  IMHO we should check whether @1
>> is equal to wi::mask (TYPE_PRECISION (TREE_TYPE (@1)) - wi::clz (@1),
>> false, TYPE_PRECISION (TREE_TYPE (@1)).
>>
>> As with the other patch a ChangeLog entry is missing as well as stating
>> how you tested the patch.
>>
>> Thanks,
>> Richard.
>>
>>> +  (if ((TREE_CODE (@2) != SSA_NAME || has_single_use (@2))
>>> +   && cst && integer_pow2p (cst))
>>> +   (bit_and (plus @0 @1) (bit_not @1)
>
>
> --
> Marc Glisse


Re: Mostly rewrite genrecog

2015-04-30 Thread Richard Biener
On Thu, Apr 30, 2015 at 2:08 PM, Andreas Schwab  wrote:
> Richard Sandiford  writes:
>
>> Andreas Schwab  writes:
>>> Richard Sandiford  writes:
>>>
>>>> /* Represents a test and the action that should be taken on the result.
>>>>    If a transition exists for the test outcome, the machine switches
>>>>    to the transition's target state.  If no suitable transition exists,
>>>>    the machine either falls through to the next decision or, if there are
>>>>    no more decisions to try, fails the match.  */
>>>> struct decision : list_head 
>>>> {
>>>>   decision (const test &);
>>>>
>>>>   void set_parent (list_head  *s);
>>>>   bool if_statement_p (uint64_t * = 0) const;
>>>>
>>>>   /* The state to which this decision belongs.  */
>>>>   state *s;
>>>>
>>>>   /* Links to other decisions in the same state.  */
>>>>   decision *prev, *next;
>>>>
>>>>   /* The test to perform.  */
>>>>   struct test test;
>>>> };
>>>
>>> ../../gcc/genrecog.c:1467: error: declaration of 'test decision::test'
>>> ../../gcc/genrecog.c:1051: error: changes meaning of 'test' from 'struct 
>>> test'
>>>
>>> Bootstrap compiler is gcc 4.3.4.
>>
>> Bah.  Does it like "::test test" instead of "struct test test"?
>
> Same error.

You have to use a different name I believe (or -fpermissive).

Richard.

> Andreas.
>
> --
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Trevor Saunders
On Thu, Apr 30, 2015 at 08:40:50AM +0200, Andreas Schwab wrote:
> Trevor Saunders  writes:
> 
> > actually pointing out libobjc/encoding.c was more useful since that makes
> > it pretty clear the ifndef PCC_BITFIELD_TYPE_MATTERS there just needs to
> > be changed to #if !
> 
> That probably won't work on arm or powerpc or vax:
> 
> gcc/config/arm/arm.h:#define PCC_BITFIELD_TYPE_MATTERS TARGET_AAPCS_BASED
> gcc/config/rs6000/sysv4.h:#define PCC_BITFIELD_TYPE_MATTERS 
> (TARGET_BITFIELD_TYPE)
> gcc/config/vax/vax.h:#define PCC_BITFIELD_TYPE_MATTERS (! 
> TARGET_VAXC_ALIGNMENT)

hrmph, I don't see how this code ever worked correctly on those targets.
Consider the arm case: the value of PCC_BITFIELD_TYPE_MATTERS depends on
arm_abi, so whether the bitfield type matters depends on what abi libobjc is
being built for, but it's not obvious how libobjc is dealing with that.
I suppose it could be that libobjc is using this macro to learn something
else that is only sort of related somehow.  Unfortunately this code seems to
come from the creation of libobjc/ in 11998 and though the commit says
it is a move from gcc/objc/ nothing appears to have been removed from
gcc/objc/.

I guess the "best" thing to do is just add a
__PCC_BITFIELD_TYPE_MATTERS__ that gcc defines and use that in libobjc?

Trev

> 
> Andreas.
> 
> -- 
> Andreas Schwab, sch...@linux-m68k.org
> GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
> "And now for something completely different."


Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Jakub Jelinek
On Thu, Apr 30, 2015 at 08:25:14AM -0400, Trevor Saunders wrote:
> On Thu, Apr 30, 2015 at 08:40:50AM +0200, Andreas Schwab wrote:
> > Trevor Saunders  writes:
> > 
> > > > actually pointing out libobjc/encoding.c was more useful since that makes
> > > it pretty clear the ifndef PCC_BITFIELD_TYPE_MATTERS there just needs to
> > > be changed to #if !
> > 
> > That probably won't work on arm or powerpc or vax:
> > 
> > gcc/config/arm/arm.h:#define PCC_BITFIELD_TYPE_MATTERS TARGET_AAPCS_BASED
> > gcc/config/rs6000/sysv4.h:#define   PCC_BITFIELD_TYPE_MATTERS 
> > (TARGET_BITFIELD_TYPE)
> > gcc/config/vax/vax.h:#define PCC_BITFIELD_TYPE_MATTERS (! 
> > TARGET_VAXC_ALIGNMENT)
> 
> hrmph, I don't see how this code ever worked correctly on those targets.
> Consider the arm case: the value of PCC_BITFIELD_TYPE_MATTERS depends on
> arm_abi, so whether the bitfield type matters depends on what ABI libobjc
> is being built for, but it's not obvious how libobjc is dealing with that.
> I suppose it could be that libobjc is using this macro to learn something
> else that is only loosely related.  Unfortunately this code seems to
> come from the creation of libobjc/ in 11998 and though the commit says
> it is a move from gcc/objc/ nothing appears to have been removed from
> gcc/objc/.
> 
> I guess the "best" thing to do is just add a
> __PCC_BITFIELD_TYPE_MATTERS__ that gcc defines and use that in libobjc?

I think adding too many predefines, especially rarely used ones, is
harmful.  It certainly isn't free: consider -g3 or -dD, where they all end
up in the output, the additional gcc startup overhead, ...
Can't libobjc's configury just test for that?

Jakub


Re: [PATCH, rs6000, testsuite, PR65456] Changes for unaligned vector load/store support on POWER8

2015-04-30 Thread Bill Schmidt
On Thu, 2015-04-30 at 18:26 +0800, Bin.Cheng wrote:
> On Mon, Apr 27, 2015 at 9:26 PM, Bill Schmidt
>  wrote:
> > On Mon, 2015-04-27 at 14:23 +0800, Bin.Cheng wrote:
> >> On Mon, Mar 30, 2015 at 1:42 AM, Bill Schmidt
> >>  wrote:
> >
> >>
> >> > Index: gcc/testsuite/gcc.dg/vect/vect-33.c
> >> > ===
> >> > --- gcc/testsuite/gcc.dg/vect/vect-33.c (revision 221118)
> >> > +++ gcc/testsuite/gcc.dg/vect/vect-33.c (working copy)
> >> > @@ -36,9 +36,10 @@ int main (void)
> >> >return main1 ();
> >> >  }
> >> >
> >> > +/* vect_hw_misalign && { ! vect64 } */
> >> >
> >> >  /* { dg-final { scan-tree-dump-times "vectorized 1 loops" 1 "vect"  } } 
> >> > */
> >> > -/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" 
> >> > { target { vect_hw_misalign && { {! vect64} || vect_multiple_sizes } } } 
> >> > } } */
> >> > +/* { dg-final { scan-tree-dump "Vectorizing an unaligned access" "vect" 
> >> > { target { { { ! powerpc*-*-* } && vect_hw_misalign } && { { ! vect64 } 
> >> > || vect_multiple_sizes } } } } }  */
> >> >  /* { dg-final { scan-tree-dump "Alignment of access forced using 
> >> > peeling" "vect" { target { vector_alignment_reachable && { vect64 && {! 
> >> > vect_multiple_sizes} } } } } } */
> >> >  /* { dg-final { scan-tree-dump-times "Alignment of access forced using 
> >> > versioning" 1 "vect" { target { { {! vector_alignment_reachable} || {! 
> >> > vect64} } && {! vect_hw_misalign} } } } } */
> >> >  /* { dg-final { cleanup-tree-dump "vect" } } */
> >>
> >> Hi Bill,
> >> With this change, the test case is skipped on aarch64 now.  Since it
> >> passed before, Is it expected to act like this on 64bit platforms?
> >
> > Hi Bin,
> >
> > No, that's a mistake on my part -- thanks for the report!  That first
> > added line was not intended to be part of the patch:
> >
> > +/* vect_hw_misalign && { ! vect64 } */
> >
> > Please try removing that line and verify that the patch succeeds again
> > for ARM.  Assuming so, I'll prepare a patch to fix this.
> >
> > It looks like this mistake was introduced only in this particular test,
> > but please let me know if you see any other anomalies.
> Hi Bill,
> I chased the wrong branch.  The test disappeared on fsf-48 branch in
> our build, rather than trunk.  I guess it's not your patch's fault.
> Will follow up and get back to you later.
> Sorry for the inconvenience.

OK, thanks for letting me know!  There was still a bad line in this
patch, although it was only introduced in 5.1 and trunk, so I guess that
wasn't responsible in this case.  Thanks for checking!

Bill

> 
> Thanks,
> bin
> >
> > Thanks very much!
> >
> > Bill
> >>
> >> PASS->NA: gcc.dg/vect/vect-33.c -flto -ffat-lto-objects
> >> scan-tree-dump-times vect "Vectorizing an unaligned access" 0
> >> PASS->NA: gcc.dg/vect/vect-33.c scan-tree-dump-times vect "Vectorizing
> >> an unaligned access" 0
> >>
> >> Thanks,
> >> bin
> >>
> >
> >
> 




Re: [PATCH] [libstdc++] Add uniform container erasure.

2015-04-30 Thread Jonathan Wakely

On 30/04/15 07:33 -0400, Ed Smith-Rowland wrote:

This has been in my tree for a good while.

It is fairly simple and adds C++ experimental container erasure.


And make_array, which isn't in the working paper yet, so I'd prefer to
leave that part out for now.



Index: include/experimental/erase_if.tcc
===
--- include/experimental/erase_if.tcc   (revision 0)
+++ include/experimental/erase_if.tcc   (working copy)
@@ -0,0 +1,70 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2015 Free Software Foundation, Inc.
+//
+// This file is part of the GNU ISO C++ Library.  This library is free
+// software; you can redistribute it and/or modify it under the
+// terms of the GNU General Public License as published by the
+// Free Software Foundation; either version 3, or (at your option)
+// any later version.
+
+// This library is distributed in the hope that it will be useful,
+// but WITHOUT ANY WARRANTY; without even the implied warranty of
+// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+// GNU General Public License for more details.
+
+// Under Section 7 of GPL version 3, you are granted additional
+// permissions described in the GCC Runtime Library Exception, version
+// 3.1, as published by the Free Software Foundation.
+
+// You should have received a copy of the GNU General Public License and
+// a copy of the GCC Runtime Library Exception along with this program;
+// see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+// <http://www.gnu.org/licenses/>.
+
+/** @file experimental/erase_if.tcc
+ *  This is an internal header file, included by other library headers.
+ *  Do not attempt to use it directly. @headername{erase_if}


The Doxygen @headername command tells users which header they are
supposed to include, rather than this one. Since there is no
erase_if header, that's wrong. I'd just omit the @headername.



+ */
+
+#ifndef _GLIBCXX_EXPERIMENTAL_ERASE_IF_TCC
+#define _GLIBCXX_EXPERIMENTAL_ERASE_IF_TCC 1
+
+#pragma GCC system_header
+
+#if __cplusplus <= 201103L
+# include 
+#else
+
+namespace std
+{
+namespace experimental
+{
+inline namespace fundamentals_v2
+{
+
+  namespace __detail
+  {
+template
+  void
+  __erase_nodes_if(_Container& __cont, _Predicate __pred)
+  {
+   for (auto __iter = __cont.begin(), __last = __cont.end();
+__iter != __last;)
+   {
+ if (__pred(*__iter))
+   __iter = __cont.erase(__iter);
+ else
+   ++__iter;
+   }
+  }
+  }


This file doesn't really seem like a .tcc to me, it isn't providing
definitions of templates declared elsewhere (specifically in an
erase_if.h header).

Maybe we want an experimental/bits/ directory for this sort of thing
(which I could also use for the filesystem headers I'm about to
commit) but in the meanwhile I think just experimental/erase_if.h is a
better name.


OK for trunk with those changes (remove make_array, rename erase_if.tcc)
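
For anyone who wants to try the node-erasure loop outside the library, here is
a minimal standalone sketch of the same idiom. The names erase_nodes_if and
drop_odd_keys are made up for illustration (the patch uses the reserved name
__detail::__erase_nodes_if), and this is not the committed code. It relies on
erase() returning the iterator that follows the erased element, which holds
for the node-based standard containers since C++11.

```cpp
#include <map>
#include <string>
#include <utility>

// Illustrative stand-in for the patch's __detail::__erase_nodes_if.
// Erasing from a node-based container only invalidates the erased
// iterator, so the cached end iterator stays usable across the loop.
template<typename Container, typename Predicate>
void erase_nodes_if(Container& cont, Predicate pred)
{
  for (auto iter = cont.begin(), last = cont.end(); iter != last;)
    {
      if (pred(*iter))
        iter = cont.erase(iter);   // erase() hands back the next element
      else
        ++iter;
    }
}

// Hypothetical helper: drop every entry whose key is odd.
std::map<int, std::string> drop_odd_keys(std::map<int, std::string> m)
{
  erase_nodes_if(m, [](const std::pair<const int, std::string>& kv)
                    { return kv.first % 2 != 0; });
  return m;
}
```

Sequence containers such as std::vector would presumably want the
remove_if/erase idiom instead, since erasing one element at a time from the
middle is quadratic there.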




Re: [PATCH 3/8] add default for PCC_BITFIELD_TYPE_MATTERS

2015-04-30 Thread Trevor Saunders
On Thu, Apr 30, 2015 at 02:33:44PM +0200, Jakub Jelinek wrote:
> On Thu, Apr 30, 2015 at 08:25:14AM -0400, Trevor Saunders wrote:
> > On Thu, Apr 30, 2015 at 08:40:50AM +0200, Andreas Schwab wrote:
> > > Trevor Saunders  writes:
> > > 
> > > > actually pointing out libobjc/encoding.c was more useful since that makes
> > > > it pretty clear the ifndef PCC_BITFIELD_TYPE_MATTERS there just needs to
> > > > be changed to #if !
> > > 
> > > That probably won't work on arm or powerpc or vax:
> > > 
> > > gcc/config/arm/arm.h:#define PCC_BITFIELD_TYPE_MATTERS TARGET_AAPCS_BASED
> > > gcc/config/rs6000/sysv4.h:#define PCC_BITFIELD_TYPE_MATTERS 
> > > (TARGET_BITFIELD_TYPE)
> > > gcc/config/vax/vax.h:#define PCC_BITFIELD_TYPE_MATTERS (! 
> > > TARGET_VAXC_ALIGNMENT)
> > 
> > hrmph, I don't see how this code ever worked correctly on those targets.
> > Consider the arm case: the value of PCC_BITFIELD_TYPE_MATTERS depends on
> > arm_abi, so whether the bitfield type matters depends on what ABI libobjc
> > is being built for, but it's not obvious how libobjc is dealing with that.
> > I suppose it could be that libobjc is using this macro to learn something
> > else that is only loosely related.  Unfortunately this code seems to
> > come from the creation of libobjc/ in 11998 and though the commit says
> > it is a move from gcc/objc/ nothing appears to have been removed from
> > gcc/objc/.
> > 
> > I guess the "best" thing to do is just add a
> > __PCC_BITFIELD_TYPE_MATTERS__ that gcc defines and use that in libobjc?
> 
> I think adding too many predefines, especially rarely used ones, is
> harmful.  It certainly isn't free: consider -g3 or -dD, where they all end
> up in the output, the additional gcc startup overhead, ...

There was a reason I said "best"; I don't think it's a great design
either.

> Can't just libobjc configury test for that?

I suppose it can test what happens with alignment of different types in
structs.  I guess I'm not really awake yet, and I'm pretty wary of
this since I really have no idea what it's trying to do.
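
A rough sketch of what such a probe could look at, assuming the zero-width
bit-field check described in the GCC internals documentation for
PCC_BITFIELD_TYPE_MATTERS. The struct and function names are made up for
illustration; nothing like this exists in libobjc, and whether it captures
what encoding.c actually needs is exactly the open question.

```cpp
#include <cstddef>

// Hypothetical probe types (illustrative names, not from libobjc).
// Only the declared type of the zero-width bit-field differs.
struct probe_char_zero { char x; char : 0; char y; };
struct probe_int_zero  { char x; int  : 0; char y; };

// True when the declared type of the bit-field influenced the layout,
// i.e. the compiler behaves as if PCC_BITFIELD_TYPE_MATTERS were nonzero
// for the ABI currently being compiled for.
constexpr bool bitfield_type_matters()
{
  return sizeof(probe_int_zero) > sizeof(probe_char_zero);
}

// A second observation: with the PCC behaviour a named int bit-field
// gives the whole struct the alignment of int.
struct probe_named { char c; int b : 1; };

constexpr bool named_bitfield_aligns_struct()
{
  return alignof(probe_named) == alignof(int);
}
```

Because the answer depends on the ABI in effect, a configure check along these
lines would have to be run with the same flags each libobjc multilib is built
with.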

Trev

> 
>   Jakub


[PR testsuite/65205] Fix dg-shouldfail usage in OpenACC libgomp tests

2015-04-30 Thread Thomas Schwinge
Hi!

Here is a patch, prepared by Jim Norris, to fix dg-shouldfail usage in
OpenACC libgomp tests.  It introduces two regressions (that is, makes the
existing errors visible), which shall then be fixed later on:
libgomp.oacc-c-c++-common/lib-3.c, and
libgomp.oacc-c-c++-common/lib-42.c.

As obvious, committed to trunk in r222620:

commit cf9c09c49e63176ff8a1fba429971cb13226260b
Author: tschwinge 
Date:   Thu Apr 30 12:44:39 2015 +

[PR testsuite/65205] Fix dg-shouldfail usage in OpenACC libgomp tests

PR testsuite/65205
libgomp/
* testsuite/lib/libgomp.exp
(check_effective_target_openacc_host_selected)
(check_effective_target_openacc_host_nonshm_selected): New
procedures.
* testsuite/libgomp.oacc-c-c++-common/clauses-2.c: Fix misuse of
dg-shouldfail.
* testsuite/libgomp.oacc-c-c++-common/lib-1.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-11.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-16.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-17.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-18.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-2.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-20.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-21.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-22.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-23.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-25.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-26.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-27.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-28.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-29.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-3.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-30.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-34.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-35.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-36.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-39.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-4.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-40.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-42.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-43.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-44.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-47.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-48.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-52.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-53.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-54.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-57.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-58.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-62.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-63.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-64.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-65.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-67.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-68.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-71.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-77.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/lib-80.c: Likewise.
* testsuite/libgomp.oacc-c-c++-common/present-1.c: Likewise.

git-svn-id: svn+ssh://gcc.gnu.org/svn/gcc/trunk@222620 
138bc75d-0d04-0410-961f-82ee72b054a4
---
 libgomp/ChangeLog  |   53 
 libgomp/testsuite/lib/libgomp.exp  |   20 
 .../libgomp.oacc-c-c++-common/clauses-2.c  |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-1.c|3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-11.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-16.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-17.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-18.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-2.c|3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-20.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-21.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-22.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-23.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-25.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-26.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-27.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-28.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-29.c   |3 +-
 .../testsuite/libgomp.oacc-c-c++-common/lib-3.c|3 +-
 .../testsuite/lib

Re: [Patch, fortran, pr65548, 2nd take, v3] [5/6 Regression] gfc_conv_procedure_call

2015-04-30 Thread Andre Vehreschild
Hi all,

this is just a service release. I noticed that the new testcase in the
previous version included the testcase of the initial patch, which is
already on trunk. I therefore replaced the testcase allocate_with_source_5.f90
by allocate_with_source_6.f90 (the extended testcase). Besides this, there is
no difference between this and the patch in:

https://gcc.gnu.org/ml/fortran/2015-04/msg00121.html

Sorry for the mess. For a description of the original patch's scope see below.

Bootstraps and regtests ok on x86_64-linux-gnu/F21.

Ok for trunk?

Regards,
Andre

On Wed, 29 Apr 2015 14:31:01 +0200
Andre Vehreschild  wrote:

> Hi all,
> 
> after the first patch to fix the issue reported in the pr, some more issues
> were reported, which are now fixed by this new patch, aka the 2nd take.
> 
> The patch modifies the gfc_trans_allocate() in order to pre-evaluate all
> source= expressions. It no longer rejects array valued source= expressions,
> but just uses gfc_conv_expr_descriptor () for most of them. Furthermore,
> allocate is now again able to allocate arrays of strings. This feature
> previously escaped my attention.
> 
> Although the reporter has not yet confirmed that the patch fixes his issue, I
> would like to post it for review, because there are more patches in my
> pipeline that depend on this one.
> 
> Bootstraps and regtests ok on x86_64-linux-gnu/F21.
> 
> Ok, for trunk?
> 
> Regards,
>   Andre


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr65548_3.clog
Description: Binary data
diff --git a/gcc/fortran/trans-stmt.c b/gcc/fortran/trans-stmt.c
index 53e9bcc..1e435be 100644
--- a/gcc/fortran/trans-stmt.c
+++ b/gcc/fortran/trans-stmt.c
@@ -5148,14 +5148,11 @@ gfc_trans_allocate (gfc_code * code)
   TREE_USED (label_finish) = 0;
 }
 
-  /* When an expr3 is present, try to evaluate it only once.  In most
- cases expr3 is invariant for all elements of the allocation list.
- Only exceptions are arrays.  Furthermore the standards prevent a
- dependency of expr3 on the objects in the allocate list.  Therefore
- it is safe to pre-evaluate expr3 for complicated expressions, i.e.
- everything not a variable or constant.  When an array allocation
- is wanted, then the following block nevertheless evaluates the
- _vptr, _len and element_size for expr3.  */
+  /* When an expr3 is present evaluate it only once.  The standards prevent a
+ dependency of expr3 on the objects in the allocate list.  An expr3 can
+ be pre-evaluated in all cases.  One just has to make sure, to use the
+ correct way, i.e., to get the descriptor or to get a reference
+ expression.  */
   if (code->expr3)
 {
   bool vtab_needed = false;
@@ -5168,75 +5165,86 @@ gfc_trans_allocate (gfc_code * code)
 	   al = al->next)
 	vtab_needed = (al->expr->ts.type == BT_CLASS);
 
-  /* A array expr3 needs the scalarizer, therefore do not process it
-	 here.  */
-  if (code->expr3->expr_type != EXPR_ARRAY
-	  && (code->expr3->rank == 0
-	  || code->expr3->expr_type == EXPR_FUNCTION)
-	  && (!code->expr3->symtree
-	  || !code->expr3->symtree->n.sym->as)
-	  && !gfc_is_class_array_ref (code->expr3, NULL))
-	{
-	  /* When expr3 is a variable, i.e., a very simple expression,
+  /* When expr3 is a variable, i.e., a very simple expression,
 	 then convert it once here.  */
-	  if ((code->expr3->expr_type == EXPR_VARIABLE)
-	  || code->expr3->expr_type == EXPR_CONSTANT)
-	{
-	  if (!code->expr3->mold
-		  || code->expr3->ts.type == BT_CHARACTER
-		  || vtab_needed)
-		{
-		  /* Convert expr3 to a tree.  */
-		  gfc_init_se (&se, NULL);
-		  se.want_pointer = 1;
-		  gfc_conv_expr (&se, code->expr3);
-		  if (!code->expr3->mold)
-		expr3 = se.expr;
-		  else
-		expr3_tmp = se.expr;
-		  expr3_len = se.string_length;
-		  gfc_add_block_to_block (&block, &se.pre);
-		  gfc_add_block_to_block (&post, &se.post);
-		}
-	  /* else expr3 = NULL_TREE set above.  */
-	}
-	  else
+  if (code->expr3->expr_type == EXPR_VARIABLE
+	  || code->expr3->expr_type == EXPR_ARRAY
+	  || code->expr3->expr_type == EXPR_CONSTANT)
+	{
+	  if (!code->expr3->mold
+	  || code->expr3->ts.type == BT_CHARACTER
+	  || vtab_needed)
 	{
-	  /* In all other cases evaluate the expr3 and create a
-		 temporary.  */
+	  /* Convert expr3 to a tree.  */
 	  gfc_init_se (&se, NULL);
-	  if (code->expr3->rank != 0
-		  && code->expr3->expr_type == EXPR_FUNCTION
-		  && code->expr3->value.function.isym)
+	  /* For all "simple" expression just get the descriptor or the
+		 reference, respectively, depending on the rank of the expr.  */
+	  if (code->expr3->rank != 0)
 		gfc_conv_expr_descriptor (&se, code->expr3);
 	  else
 		gfc_conv_expr_reference (&se, code->expr3);
-	  if (code->expr3->ts.type == BT_CLASS)
-		gfc_conv_class_to_class (&se, code->expr3,
-	 code->expr3->ts,
-	 false, true,
-	 false, false);

C/C++ PATCH to fix latest -Wbool-compare extension

2015-04-30 Thread Marek Polacek
The problem here was that the -Wbool-compare warning about always false/true
comparisons with 0/1 was assuming that both operands are of a boolean type.
That was wrong, so check for that, but don't get confused by bools promoted
to int.

This bug is blocking aarch64 bootstrap, so I'm taking the liberty of committing
it right away.

Bootstrapped/regtested on x86_64-linux, applying to trunk.
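
To illustrate the distinction (a sketch with made-up function names, not part
of the patch): the comparisons below take an int, so their results vary with
the argument, which is why they must not be diagnosed. Had the parameter been
declared bool, each comparison would have a fixed result, and that is the case
the warning is meant to catch.

```cpp
// Comparisons of a plain int against 0 or 1.  None of these is
// "always true" or "always false", so no -Wbool-compare diagnostic
// is justified for them.
bool int_le_one(int i)  { return i <= 1; }   // false for i == 2
bool int_lt_zero(int i) { return i < 0; }    // true for i == -1
bool int_gt_one(int i)  { return i > 1; }    // true for i == 2
bool int_ge_zero(int i) { return i >= 0; }   // false for i == -1
```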

2015-04-30  Marek Polacek  

* c-common.c (maybe_warn_bool_compare): When comparing with 0/1,
require that the non-constant be of a boolean type.

* c-c++-common/Wbool-compare-3.c: New test.

diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
index 7d314f8..ada8e8a 100644
--- gcc/c-family/c-common.c
+++ gcc/c-family/c-common.c
@@ -11924,6 +11924,17 @@ maybe_warn_bool_compare (location_t loc, enum 
tree_code code, tree op0,
 }
   else if (integer_zerop (cst) || integer_onep (cst))
 {
+  /* If the non-constant operand isn't of a boolean type, we
+don't want to warn here.  */
+  tree noncst = TREE_CODE (op0) == INTEGER_CST ? op1 : op0;
+  /* Handle booleans promoted to integers.  */
+  if (CONVERT_EXPR_P (noncst)
+ && TREE_TYPE (noncst) == integer_type_node
+ && TREE_CODE (TREE_TYPE (TREE_OPERAND (noncst, 0))) == BOOLEAN_TYPE)
+   /* Warn.  */;
+  else if (TREE_CODE (TREE_TYPE (noncst)) != BOOLEAN_TYPE
+  && !truth_value_p (TREE_CODE (noncst)))
+   return;
   /* Do some magic to get the right diagnostics.  */
   bool flag = TREE_CODE (op0) == INTEGER_CST;
   flag = integer_zerop (cst) ? flag : !flag;
diff --git gcc/testsuite/c-c++-common/Wbool-compare-3.c 
gcc/testsuite/c-c++-common/Wbool-compare-3.c
index e69de29..bac4f47 100644
--- gcc/testsuite/c-c++-common/Wbool-compare-3.c
+++ gcc/testsuite/c-c++-common/Wbool-compare-3.c
@@ -0,0 +1,30 @@
+/* { dg-do compile } */
+/* { dg-options "-Wbool-compare" } */
+
+#ifndef __cplusplus
+# define bool _Bool
+#endif
+
+#define A 0
+#define B 1
+
+int
+foo (int i, bool b)
+{
+  int r = 0;
+
+  r += i <= (A || B);
+  r += i <= b;
+  r += i <= A;
+  r += i < (A || B);
+  r += i < b;
+  r += i < A;
+  r += i > (A || B);
+  r += i > b;
+  r += i > A;
+  r += i >= (A || B);
+  r += i >= b;
+  r += i >= A;
+
+  return r;
+}

Marek


Fix g++.dg/lto/20101010-4

2015-04-30 Thread Jan Hubicka
Hi,
this patch fixes ICE when comparing NULLPTR_TYPEs.

Committed as obvious.

Honza

Index: ChangeLog
===
--- ChangeLog   (revision 222620)
+++ ChangeLog   (working copy)
@@ -1,3 +1,9 @@
+2015-04-30  Jan Hubicka  
+
+   PR lto/65948
+   * ipa-devirt.c (odr_types_equivalent_p): NULLPTR_TYPE is equivalent
+   to itself.
+
 2015-04-30  Richard Sandiford  
 
* genrecog.c (simplify_tests): Check that CONST_INT and XWINT tests
Index: ipa-devirt.c
===
--- ipa-devirt.c(revision 222620)
+++ ipa-devirt.c(working copy)
@@ -1537,6 +1537,7 @@ odr_types_equivalent_p (tree t1, tree t2
break;
   }
 case VOID_TYPE:
+case NULLPTR_TYPE:
   break;
 
 default:


Re: C PATCH to reject va_arg (ap, void) (PR c/65901)

2015-04-30 Thread Marek Polacek
On Wed, Apr 29, 2015 at 06:41:22PM +0200, Marek Polacek wrote:
> On Tue, Apr 28, 2015 at 09:07:09PM -0600, Martin Sebor wrote:
> > The error message in the test cases below isn't quite right.
> > The type of the aggregates isn't undefined, it's incomplete.
> > Looking at the function, I wonder if the first argument
> > should be EXPR rather than NULL_TREE? Alternatively,
> > experimenting with other cases where GCC diagnoses invalid
> > uses of incomplete type, I see that it issues:
> > 
> >   "invalid application of %qs to incomplete type %qT"
> > 
> > which might work even better here since we could name the
> > expression (va_arg).
> 
> Yeah, I haven't concerned myself with the exact wording of the error
> message much, and I agree it could be improved.  But passing down the
> EXPR would mean that the compiler outputs "'ap' has an incomplete type"
> and that looks wrong as well.  I think I'm going to apply the following
> tomorrow if I hear no objections (and it passes testing).  Thanks for 
> noticing.

Committed now.
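
For contrast with the rejected cases in the testcase, a minimal sketch of a
well-formed use, where the second argument to va_arg names a complete type.
The function name sum_ints is made up for illustration; the patch itself only
changes the C front end's diagnostic, and this sketch is written so that it is
also valid C apart from the header name.

```cpp
#include <cstdarg>

// Sums `count` int arguments.  va_arg is given a complete type (int);
// passing void or a forward-declared struct/union/enum instead is what
// the new error message is about.
int sum_ints(int count, ...)
{
  va_list ap;
  va_start(ap, count);
  int total = 0;
  for (int i = 0; i < count; ++i)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}
```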
 
> (And I think c_incomplete_type_error deserves some TLC; I'll post a separate
> patch.)
> 
> 2015-04-29  Marek Polacek  
> 
>   * c-typeck.c (c_build_va_arg): Clarify the error message.
> 
>   * gcc.dg/pr65901.c (foo): Adjust dg-error.
> 
> diff --git gcc/c/c-typeck.c gcc/c/c-typeck.c
> index c58e918..028d2f81 100644
> --- gcc/c/c-typeck.c
> +++ gcc/c/c-typeck.c
> @@ -12645,14 +12645,17 @@ c_build_qualified_type (tree type, int type_quals)
>  tree
>  c_build_va_arg (location_t loc, tree expr, tree type)
>  {
> -  if (warn_cxx_compat && TREE_CODE (type) == ENUMERAL_TYPE)
> -warning_at (loc, OPT_Wc___compat,
> - "C++ requires promoted type, not enum type, in %");
> -  if (type == error_mark_node || !COMPLETE_TYPE_P (type))
> +  if (error_operand_p (type))
> +return error_mark_node;
> +  else if (!COMPLETE_TYPE_P (type))
>  {
> -  c_incomplete_type_error (NULL_TREE, type);
> +  error_at (loc, "second argument to % is of incomplete "
> + "type %qT", type);
>return error_mark_node;
>  }
> +  else if (warn_cxx_compat && TREE_CODE (type) == ENUMERAL_TYPE)
> +warning_at (loc, OPT_Wc___compat,
> + "C++ requires promoted type, not enum type, in %");
>return build_va_arg (loc, expr, type);
>  }
>  
> diff --git gcc/testsuite/gcc.dg/pr65901.c gcc/testsuite/gcc.dg/pr65901.c
> index 8708a1e..b40eea3 100644
> --- gcc/testsuite/gcc.dg/pr65901.c
> +++ gcc/testsuite/gcc.dg/pr65901.c
> @@ -9,8 +9,8 @@ union U;
>  void
>  foo (__builtin_va_list ap)
>  {
> -  __builtin_va_arg (ap, void);  /* { dg-error "invalid use of void 
> expression" } */
> -  __builtin_va_arg (ap, struct S);  /* { dg-error "invalid use of undefined 
> type" } */
> -  __builtin_va_arg (ap, enum E);  /* { dg-error "invalid use of undefined 
> type" } */
> -  __builtin_va_arg (ap, union U);  /* { dg-error "invalid use of undefined 
> type" } */
> +  __builtin_va_arg (ap, void);  /* { dg-error "second argument to .va_arg. 
> is of incomplete type .void." } */
> +  __builtin_va_arg (ap, struct S);  /* { dg-error "second argument to 
> .va_arg. is of incomplete type .struct S." } */
> +  __builtin_va_arg (ap, enum E);  /* { dg-error "second argument to .va_arg. 
> is of incomplete type .enum E." } */
> +  __builtin_va_arg (ap, union U);  /* { dg-error "second argument to 
> .va_arg. is of incomplete type .union U." } */
>  }

Marek


Re: c-family PATCH to improve -Wbool-compare (PR c/64610)

2015-04-30 Thread Andreas Schwab
Marek Polacek  writes:

> diff --git gcc/c-family/c-common.c gcc/c-family/c-common.c
> index 7d314f8..ada8e8a 100644
> --- gcc/c-family/c-common.c
> +++ gcc/c-family/c-common.c
> @@ -11924,6 +11924,17 @@ maybe_warn_bool_compare (location_t loc, enum 
> tree_code code, tree op0,
>  }
>else if (integer_zerop (cst) || integer_onep (cst))
>  {
> +  /* If the non-constant operand isn't of a boolean type, we
> +  don't want to warn here.  */
> +  tree noncst = TREE_CODE (op0) == INTEGER_CST ? op1 : op0;
> +  /* Handle booleans promoted to integers.  */
> +  if (CONVERT_EXPR_P (noncst)
> +   && TREE_TYPE (noncst) == integer_type_node
> +   && TREE_CODE (TREE_TYPE (TREE_OPERAND (noncst, 0))) == BOOLEAN_TYPE)
> + /* Warn.  */;
> +  else if (TREE_CODE (TREE_TYPE (noncst)) != BOOLEAN_TYPE
> +&& !truth_value_p (TREE_CODE (noncst)))
> + return;
>/* Do some magic to get the right diagnostics.  */
>bool flag = TREE_CODE (op0) == INTEGER_CST;
>flag = integer_zerop (cst) ? flag : !flag;

Looks good, successfully built stage3.

Andreas.

-- 
Andreas Schwab, sch...@linux-m68k.org
GPG Key fingerprint = 58CA 54C7 6D53 942B 1756  01D3 44D5 214B 8276 4ED5
"And now for something completely different."


Re: [Patch, fortran, PR44672, v5] [F08] ALLOCATE with SOURCE and no array-spec

2015-04-30 Thread Andre Vehreschild
Hi all,

and also for this bug, I would like to present an updated patch. It was brought
to my attention that the previous patch did not fix statements like:

allocate(m, source=[(I, I=1, n)])

where n is a variable and

type p
  class(*), allocatable :: m(:,:)
end type
real mat(2,3)
type(P) :: o
allocate(o%m, source=mat)

The new version of the patch now fixes those issues as well, and furthermore
addresses some issues (most probably not all) where the rank of the
source=-variable and the rank of the array to allocate differ. For example,
when one does:

real v(:)
allocate(v, source= arr(1,2:3))

where arr has a rank of 2 and only the source=-expression has a rank of one,
which is then compatible with v. Nevertheless, this needed addressing when
setting up the descriptor of v and during the data copy.

Bootstrap ok on x86_64-linux-gnu/f21.
Regtests with one regression in gfortran.dg/alloc_comp_constructor_1.f90, which
is addressed in the patch for pr58586, whose final version is in preparation.

Ok for trunk in combination with 58586 once both are reviewed?

Regards,
Andre


On Wed, 29 Apr 2015 17:23:58 +0200
Andre Vehreschild  wrote:

> Hi all,
> 
> this is the fourth version of the patch, adapting to the current state of
> trunk. This patch is based on my patch for 65548 version 2 and needs that
> patch applied beforehand to apply cleanly. The patch for 65548 is available
> from:
> 
> https://gcc.gnu.org/ml/fortran/2015-04/msg00121.html
> 
> Scope:
> 
> Allow allocate of arrays w/o having to give an array-spec as specified in
> F2008:C633. An example is:
> 
> integer, dimension(:) :: arr
> allocate(arr, source = [1,2,3])
> 
> Solution:
> 
> While resolving an allocate, the objects to allocate are checked for whether
> they carry an array-spec; if not, the array-spec of the source=-expression is
> transferred. Unfortunately some source=-expressions are not easy to handle and
> have to be assigned to a temporary variable first. Only with the temporary
> variable is gfc_trans_allocate() then able to compute the array descriptor
> correctly and allocate with correct array bounds.
> 
> Side notes:
> 
> This patch creates a regression in alloc_comp_constructor_1.f90 where two
> free()'s have gone missing. This will be fixed by the patch for pr58586 and
> is therefore not repeated here.
> 
> Bootstraps and regtests ok on x86_64-linux-gnu/f21.
> 
> Ok for trunk?
> 
> Regards,
>   Andre
> 


-- 
Andre Vehreschild * Email: vehre ad gmx dot de 


pr44672_5.clog
Description: Binary data
diff --git a/gcc/fortran/gfortran.h b/gcc/fortran/gfortran.h
index 832a6ce..9b5f4cf 100644
--- a/gcc/fortran/gfortran.h
+++ b/gcc/fortran/gfortran.h
@@ -2394,6 +2394,9 @@ typedef struct gfc_code
 {
   gfc_typespec ts;
   gfc_alloc *list;
+  /* Take the array specification from expr3 to allocate arrays
+	 without an explicit array specification.  */
+  unsigned arr_spec_from_expr3:1;
 }
 alloc;
 
diff --git a/gcc/fortran/resolve.c b/gcc/fortran/resolve.c
index 316b413..41026af 100644
--- a/gcc/fortran/resolve.c
+++ b/gcc/fortran/resolve.c
@@ -6804,7 +6804,7 @@ conformable_arrays (gfc_expr *e1, gfc_expr *e2)
have a trailing array reference that gives the size of the array.  */
 
 static bool
-resolve_allocate_expr (gfc_expr *e, gfc_code *code)
+resolve_allocate_expr (gfc_expr *e, gfc_code *code, bool *array_alloc_wo_spec)
 {
   int i, pointer, allocatable, dimension, is_abstract;
   int codimension;
@@ -7103,13 +7103,24 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code)
   if (!ref2 || ref2->type != REF_ARRAY || ref2->u.ar.type == AR_FULL
   || (dimension && ref2->u.ar.dimen == 0))
 {
-  gfc_error ("Array specification required in ALLOCATE statement "
-		 "at %L", &e->where);
-  goto failure;
+  /* F08:C633.  */
+  if (code->expr3)
+	{
+	  if (!gfc_notify_std (GFC_STD_F2008, "Array specification required "
+			   "in ALLOCATE statement at %L", &e->where))
+	goto failure;
+	  *array_alloc_wo_spec = true;
+	}
+  else
+	{
+	  gfc_error ("Array specification required in ALLOCATE statement "
+		 "at %L", &e->where);
+	  goto failure;
+	}
 }
 
   /* Make sure that the array section reference makes sense in the
-context of an ALLOCATE specification.  */
+ context of an ALLOCATE specification.  */
 
   ar = &ref2->u.ar;
 
@@ -7124,7 +7135,7 @@ resolve_allocate_expr (gfc_expr *e, gfc_code *code)
 
   for (i = 0; i < ar->dimen; i++)
 {
-  if (ref2->u.ar.type == AR_ELEMENT)
+  if (ar->type == AR_ELEMENT || ar->type == AR_FULL)
 	goto check_symbols;
 
   switch (ar->dimen_type[i])
@@ -7201,12 +7212,18 @@ failure:
   return false;
 }
 
+
 static void
 resolve_allocate_deallocate (gfc_code *code, const char *fcn)
 {
   gfc_expr *stat, *errmsg, *pe, *qe;
   gfc_alloc *a, *p, *q;
 
+  /* When this flag is set already, then this allocate has already been
+ resolved.  Doing so again, would result in an endless loop.  */
+  if (code->ext.alloc.arr_sp

C++ PATCH for c++/65876 (ICE with constexpr and arrays)

2015-04-30 Thread Jason Merrill
In this testcase, we delay expanding the VEC_INIT_EXPR until 
gimplification time, at which point we've already thrown away some 
constexpr bodies.  Since this location doesn't require a constant 
expression we can safely treat it as a non-constant expression.


Tested x86_64-pc-linux-gnu, applying to 5 branch.  On the trunk I want 
to address the cause of the problem.
commit 3dc40c07723fe4490a49176e7420ba5afe692db1
Author: Jason Merrill 
Date:   Tue Apr 28 22:17:30 2015 -0400

	PR c++/65876
	* constexpr.c (cxx_eval_call_expression): Fail gracefully if cgraph
	threw away DECL_SAVED_TREE.

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index 0e333aa..5e65f29 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -1355,7 +1355,14 @@ cxx_eval_call_expression (const constexpr_ctx *ctx, tree t,
 		 fun = DECL_CHAIN (fun))
 		  if (DECL_SAVED_TREE (fun))
 		break;
-	  gcc_assert (DECL_SAVED_TREE (fun));
+	  if (!DECL_SAVED_TREE (fun))
+		{
+		  /* cgraph/gimplification have released the DECL_SAVED_TREE
+		 for this function.  Fail gracefully.  */
+		  gcc_assert (ctx->quiet);
+		  *non_constant_p = true;
+		  return t;
+		}
 	  tree parms, res;
 
 	  /* Unshare the whole function body.  */
diff --git a/gcc/testsuite/g++.dg/cpp0x/constexpr-array12.C b/gcc/testsuite/g++.dg/cpp0x/constexpr-array12.C
new file mode 100644
index 000..ec81fff
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp0x/constexpr-array12.C
@@ -0,0 +1,33 @@
+// PR c++/65876
+// { dg-do compile { target c++11 } }
+
+template
+struct duration
+{
+constexpr duration() : r(0) {}
+
+template
+constexpr duration(duration x) : r(x.count()) {}
+
+constexpr int count() { return 0; }
+
+int r;
+};
+
+struct Config {
+duration<1> timeout { duration<2>() };
+};
+
+Config make_config()
+{
+return {};
+}
+
+struct ConfigArray {
+ConfigArray();
+Config all_configs[1];
+};
+
+ConfigArray::ConfigArray()
+{
+}


RE: Refactor gcc/tree-vectorize.c:vectorize_loops

2015-04-30 Thread Aditya K
Thank you very much Jeff.

-Aditya


> Date: Wed, 29 Apr 2015 23:44:20 -0600
> From: l...@redhat.com
> To: hiradi...@msn.com; ja...@redhat.com
> CC: gcc-patches@gcc.gnu.org
> Subject: Re: Refactor gcc/tree-vectorize.c:vectorize_loops
>
> On 04/29/2015 08:37 AM, Aditya K wrote:
>>
>> Thanks for the feedback. I have added comment and properly indented the code.
> I made a couple more formatting fixes (spaces -> tab & line wrapping),
> improved the ChangeLog, did a bootstrap & regression test on
> x86_64-linux-gnu and installed the final patch on the trunk.
>
> Thanks,
> Jeff
>
  

Re: [PATCH] [libstdc++] Add uniform container erasure.

2015-04-30 Thread Ed Smith-Rowland



> And make_array, which isn't in the working paper yet, so I'd prefer to
> leave that part out for now.

D'oh!  Sorry about that..  Removed.

> The Doxygen @headername command tells users which header they are
> supposed to include, rather than this one. Since there is no
>  header that's wrong. I'd just omit the @headername.

Done.

> This file doesn't really seem like a .tcc to me, it isn't providing
> definitions of templates declared elsewhere (specifically in an
> erase_if.h header).
>
> Maybe we want an experimental/bits/ directory for this sort of thing
> (which I could also use for the filesystem headers I'm about to
> commit) but in the meanwhile I think just experimental/erase_if.h is a
> better name.

Done. (I didn't make experimental/bits - I'll move erase_if.h after you
add the bits directory).

> OK for trunk with those changes (remove make_array, rename erase_if.tcc)

Rebuilt, retested and committed as 222630.
Altered patch attached.

Thanks,

Ed

2015-04-30  Edward Smith-Rowland  <3dw...@verizon.net>

Add fundamentals TR container erasure.
* include/Makefile.am: Add new headers.
* include/Makefile.in: Add new headers.
* include/experimental/array: New.
* include/experimental/deque: New.
* include/experimental/erase_if.h: New.
* include/experimental/forward_list: New.
* include/experimental/list: New.
* include/experimental/map: New.
* include/experimental/set: New.
* include/experimental/string: New.
* include/experimental/unordered_map: New.
* include/experimental/unordered_set: New.
* include/experimental/vector: New.
* testsuite/experimental/deque/erasure.cc: New.
* testsuite/experimental/forward_list/erasure.cc: New.
* testsuite/experimental/list/erasure.cc: New.
* testsuite/experimental/map/erasure.cc: New.
* testsuite/experimental/set/erasure.cc: New.
* testsuite/experimental/string/erasure.cc: New.
* testsuite/experimental/unordered_map/erasure.cc: New.
* testsuite/experimental/unordered_set/erasure.cc: New.
* testsuite/experimental/vector/erasure.cc: New.
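
For readers unfamiliar with the idea behind uniform container erasure, the following is a minimal sketch and not the code from the attached patch. The namespace `sketch` and both overloads are hypothetical, written only to show why one call syntax needs different implementations for sequence containers and node-based containers.

```cpp
#include <algorithm>
#include <map>
#include <vector>

namespace sketch  // hypothetical namespace, not the one used by the patch
{
  // Sequence container: the classic erase-remove idiom.
  template<typename T, typename Alloc, typename Pred>
    void
    erase_if (std::vector<T, Alloc> &c, Pred pred)
    { c.erase (std::remove_if (c.begin (), c.end (), pred), c.end ()); }

  // Node-based associative container: remove_if cannot be used here,
  // so erase one node at a time.  Since C++11, erase () returns the
  // iterator that follows the removed element.
  template<typename K, typename V, typename Cmp, typename Alloc,
	   typename Pred>
    void
    erase_if (std::map<K, V, Cmp, Alloc> &c, Pred pred)
    {
      for (auto it = c.begin (); it != c.end ();)
	if (pred (*it))
	  it = c.erase (it);
	else
	  ++it;
    }
}
```

With overloads like these, a caller writes `sketch::erase_if (c, pred)` for either container and does not have to remember which idiom applies.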

Index: include/Makefile.am
===
--- include/Makefile.am (revision 222599)
+++ include/Makefile.am (working copy)
@@ -646,14 +646,24 @@
${experimental_srcdir}/algorithm \
${experimental_srcdir}/any \
${experimental_srcdir}/chrono \
+   ${experimental_srcdir}/deque \
+   ${experimental_srcdir}/erase_if.h \
+   ${experimental_srcdir}/forward_list \
${experimental_srcdir}/functional \
+   ${experimental_srcdir}/list \
+   ${experimental_srcdir}/map \
${experimental_srcdir}/optional \
${experimental_srcdir}/ratio \
+   ${experimental_srcdir}/set \
+   ${experimental_srcdir}/string \
${experimental_srcdir}/string_view \
+   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/system_error \
-   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/tuple \
-   ${experimental_srcdir}/type_traits
+   ${experimental_srcdir}/type_traits \
+   ${experimental_srcdir}/unordered_map \
+   ${experimental_srcdir}/unordered_set \
+   ${experimental_srcdir}/vector
 
 # This is the common subset of C++ files that all three "C" header models use.
 c_base_srcdir = $(C_INCLUDE_DIR)
Index: include/Makefile.in
===
--- include/Makefile.in (revision 222599)
+++ include/Makefile.in (working copy)
@@ -913,14 +913,24 @@
${experimental_srcdir}/algorithm \
${experimental_srcdir}/any \
${experimental_srcdir}/chrono \
+   ${experimental_srcdir}/deque \
+   ${experimental_srcdir}/erase_if.h \
+   ${experimental_srcdir}/forward_list \
${experimental_srcdir}/functional \
+   ${experimental_srcdir}/list \
+   ${experimental_srcdir}/map \
${experimental_srcdir}/optional \
${experimental_srcdir}/ratio \
+   ${experimental_srcdir}/set \
+   ${experimental_srcdir}/string \
${experimental_srcdir}/string_view \
+   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/system_error \
-   ${experimental_srcdir}/string_view.tcc \
${experimental_srcdir}/tuple \
-   ${experimental_srcdir}/type_traits
+   ${experimental_srcdir}/type_traits \
+   ${experimental_srcdir}/unordered_map \
+   ${experimental_srcdir}/unordered_set \
+   ${experimental_srcdir}/vector
 
 
 # This is the common subset of C++ files that all three "C" header models use.
Index: include/experimental/deque
===
--- include/experimental/deque  (revision 0)
+++ include/experimental/deque  (working copy)
@@ -0,0 +1,72 @@
+//  -*- C++ -*-
+
+// Copyright (C) 2015 Free Software Foundation

Re: [PATCH] PR target/48904 x86_64-knetbsd-gnu missing defs

2015-04-30 Thread Guillem Jover
Hi!

On Thu, 2015-04-30 at 09:58:28 +0200, Bernhard Reutner-Fischer wrote:
> On 30 April 2015 at 07:00, Jeff Law  wrote:
> > On 04/29/2015 02:01 AM, Bernhard Reutner-Fischer wrote:
> >>
> >> 2012-09-21  H.J. Lu  
> >>
> >> PR target/48904
> >> * config.gcc (x86_64-*-knetbsd*-gnu): Add i386/knetbsd-gnu64.h.
> >> * config/i386/knetbsd-gnu64.h: New file
> >
> > OK.  Please install on the trunk.
> 
> hmz, according to https://www.debian.org/ports/netbsd/ the debian
> knetbsd port is abandoned since about 2002.

Actually, that page refers to the GNU/NetBSD port, which was based on
NetBSD's libc. The GNU/kNetBSD port (based on glibc) was started at
the same time as GNU/kFreeBSD, but was short-lived and put into
hibernation to focus the effort on GNU/kFreeBSD. But that
has been a very long winter…

> If this is true (please confirm) then we should probably remove knetbsd from
> - upstream config repo
> - GCC
> - binutils-gdb

The difference between removing support in the toolchain and in the
upstream config repo is that for the former it will just need forward
porting the patches, but for the latter it takes a very long time and
lots of prodding to get upstream projects to update to current config.*
scripts. (In Debian we have been switching to always update the config.*
scripts at build time so it would probably not be too bad going forward
for us, but it would when using projects directly from upstream.)

Although I always find it a bit sad to remove ports support, I have the
impression the GNU/kNetBSD port is not coming to life any time soon. So
you might want to wait a bit for comments from others, perhaps someone
has been working on such a port that we are not aware of, but otherwise I
think removal would be fine.

Thanks,
Guillem


Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value to output_constant_pool_2

2015-04-30 Thread Jeff Law

On 04/29/2015 04:30 AM, rohitarul...@freescale.com wrote:




> Jeff, I have made the changes as per your comments and attached the patch.
> If the patch is OK, I will proceed with the regression tests.

This patch refers back to 60158 and based on what I see in 60158, it 
appears I should be looking for a .data.rel.ro.local section which 
contains the address of a string constant.  But the constants are being 
put into .rodata.str1.4.  And if the issue is we're putting bits into 
the wrong section and don't have an appropriate .fixup section, then 
ISTM that the test should be compiled, then objdump used to verify the 
sections and/or relocations.


An additional concern is that I get the same code for the included 
testcase with or without your changes.  This is with a 
powerpc-softfloat-linux-gnuspe configured compiler -- which matches what 
I saw in pr 60158.


So while the patch seems reasonable, I'm concerned that I've been unable 
to show it changing anything.


Thoughts?

Jeff




Allow inlining across -fstrict-aliasing boundary

2015-04-30 Thread Jan Hubicka
Hi,
this patch permits inlining across flag_strict_aliasing boundaries.  This is
hopefully safe because the memory accesses in -fno-strict-aliasing code should
have alias set 0.

This should allow building packages that currently fail on always_inline
because of their use of explicit optimization attributes, and hopefully also
solve the performance issues with Firefox LTO+FDO builds that crept in shortly
before the release.

Bootstrapped/regtested x86_64-linux, will commit it to mainline and to the branch
later next week if no problems show up.

Honza

PR ipa/65873
* ipa-inline.c (can_inline_edge_p): It is safe to inline across
-fstrict-aliasing boundaries.
Index: ipa-inline.c
===
--- ipa-inline.c(revision 222620)
+++ ipa-inline.c(working copy)
@@ -439,9 +439,6 @@ can_inline_edge_p (struct cgraph_edge *e
   == !opt_for_fn (callee->decl, optimize) || !always_inline))
  || check_match (flag_wrapv)
  || check_match (flag_trapv)
- /* Strictly speaking only when the callee contains memory
-accesses that are not using alias-set zero anyway.  */
- || check_maybe_down (flag_strict_aliasing)
  /* Strictly speaking only when the callee uses FP math.  */
  || check_maybe_up (flag_rounding_math)
  || check_maybe_up (flag_trapping_math)


Re: [PATCH, PR65915] Fix float conversion split.

2015-04-30 Thread Ilya Tocar
> Hi,
> 
> Looks like I missed some splits, which caused PR65915.
> Patch below fixes it.
> Ok for trunk?
> 
> 2015-04-28  Ilya Tocar  
> 
>   * config/i386/i386.md (define_split): Check for xmm16+,
>   when splitting scalar float conversion.
> 
> 
> ---
>  gcc/config/i386/i386.md | 8 ++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 937871a..af1cd9b 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4897,7 +4897,9 @@
>"TARGET_SSE2 && TARGET_SSE_MATH
> && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
> && reload_completed && SSE_REG_P (operands[0])
> -   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
> +   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +   || TARGET_AVX512VL)"
>[(const_int 0)]
>  {
>operands[3] = simplify_gen_subreg (mode, operands[0],
> @@ -4921,7 +4923,9 @@
>"TARGET_SSE2 && TARGET_SSE_MATH
> && TARGET_SSE_PARTIAL_REG_DEPENDENCY
> && optimize_function_for_speed_p (cfun)
> -   && reload_completed && SSE_REG_P (operands[0])"
> +   && reload_completed && SSE_REG_P (operands[0])
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +   || TARGET_AVX512VL)"
>[(const_int 0)]
>  {
>const machine_mode vmode = mode;
> -- 
> 1.8.3.1
>

Updated version below (now with test).

---
 gcc/config/i386/i386.md | 8 ++--
 gcc/config/i386/sse.md  | 6 +++---
 gcc/testsuite/gcc.target/i386/pr65915.c | 6 ++
 3 files changed, 15 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr65915.c

diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index 937871a..af1cd9b 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -4897,7 +4897,9 @@
   "TARGET_SSE2 && TARGET_SSE_MATH
&& TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
&& reload_completed && SSE_REG_P (operands[0])
-   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
+   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
+   && (!EXT_REX_SSE_REG_P (operands[0])
+   || TARGET_AVX512VL)"
   [(const_int 0)]
 {
   operands[3] = simplify_gen_subreg (mode, operands[0],
@@ -4921,7 +4923,9 @@
   "TARGET_SSE2 && TARGET_SSE_MATH
&& TARGET_SSE_PARTIAL_REG_DEPENDENCY
&& optimize_function_for_speed_p (cfun)
-   && reload_completed && SSE_REG_P (operands[0])"
+   && reload_completed && SSE_REG_P (operands[0])
+   && (!EXT_REX_SSE_REG_P (operands[0])
+   || TARGET_AVX512VL)"
   [(const_int 0)]
 {
   const machine_mode vmode = mode;
diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index 9b7009a..c61098d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -4258,11 +4258,11 @@
(set_attr "mode" "TI")])
 
 (define_insn "sse2_cvtsi2sd"
-  [(set (match_operand:V2DF 0 "register_operand" "=x,x,x")
+  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v")
(vec_merge:V2DF
  (vec_duplicate:V2DF
(float:DF (match_operand:SI 2 "nonimmediate_operand" "r,m,rm")))
- (match_operand:V2DF 1 "register_operand" "0,0,x")
+ (match_operand:V2DF 1 "register_operand" "0,0,v")
  (const_int 1)))]
   "TARGET_SSE2"
   "@
@@ -4275,7 +4275,7 @@
(set_attr "amdfam10_decode" "vector,double,*")
(set_attr "bdver1_decode" "double,direct,*")
(set_attr "btver2_decode" "double,double,double")
-   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "prefix" "orig,orig,maybe_evex")
(set_attr "mode" "DF")])
 
 (define_insn "sse2_cvtsi2sdq"
diff --git a/gcc/testsuite/gcc.target/i386/pr65915.c b/gcc/testsuite/gcc.target/i386/pr65915.c
new file mode 100644
index 000..990c5aa
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr65915.c
@@ -0,0 +1,6 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -mavx512f -fpic -mcmodel=medium" } */
+/* { dg-require-effective-target avx512f } */
+/* { dg-require-effective-target lp64 } */
+
+#include "avx512f-vrndscalepd-2.c"
-- 
1.8.3.1



Re: [PATCH, PR65915] Fix float conversion split.

2015-04-30 Thread H.J. Lu
On Thu, Apr 30, 2015 at 8:15 AM, Ilya Tocar  wrote:
>> Hi,
>>
>> Looks like I missed some splits, which caused PR65915.
>> Patch below fixes it.
>> Ok for trunk?
>>
>> 2015-04-28  Ilya Tocar  
>>
>>   * config/i386/i386.md (define_split): Check for xmm16+,
>>   when splitting scalar float conversion.
>>
>>
>> ---
>>  gcc/config/i386/i386.md | 8 ++--
>>  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
>> index 937871a..af1cd9b 100644
>> --- a/gcc/config/i386/i386.md
>> +++ b/gcc/config/i386/i386.md
>> @@ -4897,7 +4897,9 @@
>>"TARGET_SSE2 && TARGET_SSE_MATH
>> && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
>> && reload_completed && SSE_REG_P (operands[0])
>> -   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
>> +   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
>> +   && (!EXT_REX_SSE_REG_P (operands[0])
>> +   || TARGET_AVX512VL)"
>>[(const_int 0)]
>>  {
>>operands[3] = simplify_gen_subreg (mode, operands[0],
>> @@ -4921,7 +4923,9 @@
>>"TARGET_SSE2 && TARGET_SSE_MATH
>> && TARGET_SSE_PARTIAL_REG_DEPENDENCY
>> && optimize_function_for_speed_p (cfun)
>> -   && reload_completed && SSE_REG_P (operands[0])"
>> +   && reload_completed && SSE_REG_P (operands[0])
>> +   && (!EXT_REX_SSE_REG_P (operands[0])
>> +   || TARGET_AVX512VL)"
>>[(const_int 0)]
>>  {
>>const machine_mode vmode = mode;
>> --
>> 1.8.3.1
>>
>
> Updated version below (now with test).
>
> ---
>  gcc/config/i386/i386.md | 8 ++--
>  gcc/config/i386/sse.md  | 6 +++---
>  gcc/testsuite/gcc.target/i386/pr65915.c | 6 ++
>  3 files changed, 15 insertions(+), 5 deletions(-)
>  create mode 100644 gcc/testsuite/gcc.target/i386/pr65915.c
>
> diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
> index 937871a..af1cd9b 100644
> --- a/gcc/config/i386/i386.md
> +++ b/gcc/config/i386/i386.md
> @@ -4897,7 +4897,9 @@
>"TARGET_SSE2 && TARGET_SSE_MATH
> && TARGET_USE_VECTOR_CONVERTS && optimize_function_for_speed_p (cfun)
> && reload_completed && SSE_REG_P (operands[0])
> -   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)"
> +   && (MEM_P (operands[1]) || TARGET_INTER_UNIT_MOVES_TO_VEC)
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +   || TARGET_AVX512VL)"
>[(const_int 0)]
>  {
>operands[3] = simplify_gen_subreg (mode, operands[0],
> @@ -4921,7 +4923,9 @@
>"TARGET_SSE2 && TARGET_SSE_MATH
> && TARGET_SSE_PARTIAL_REG_DEPENDENCY
> && optimize_function_for_speed_p (cfun)
> -   && reload_completed && SSE_REG_P (operands[0])"
> +   && reload_completed && SSE_REG_P (operands[0])
> +   && (!EXT_REX_SSE_REG_P (operands[0])
> +   || TARGET_AVX512VL)"
>[(const_int 0)]
>  {
>const machine_mode vmode = mode;
> diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
> index 9b7009a..c61098d 100644
> --- a/gcc/config/i386/sse.md
> +++ b/gcc/config/i386/sse.md
> @@ -4258,11 +4258,11 @@
> (set_attr "mode" "TI")])
>
>  (define_insn "sse2_cvtsi2sd"
> -  [(set (match_operand:V2DF 0 "register_operand" "=x,x,x")
> +  [(set (match_operand:V2DF 0 "register_operand" "=x,x,v")
> (vec_merge:V2DF
>   (vec_duplicate:V2DF
> (float:DF (match_operand:SI 2 "nonimmediate_operand" "r,m,rm")))
> - (match_operand:V2DF 1 "register_operand" "0,0,x")
> + (match_operand:V2DF 1 "register_operand" "0,0,v")
>   (const_int 1)))]
>"TARGET_SSE2"
>"@
> @@ -4275,7 +4275,7 @@
> (set_attr "amdfam10_decode" "vector,double,*")
> (set_attr "bdver1_decode" "double,direct,*")
> (set_attr "btver2_decode" "double,double,double")
> -   (set_attr "prefix" "orig,orig,vex")
> +   (set_attr "prefix" "orig,orig,maybe_evex")
> (set_attr "mode" "DF")])
>
>  (define_insn "sse2_cvtsi2sdq"
> diff --git a/gcc/testsuite/gcc.target/i386/pr65915.c b/gcc/testsuite/gcc.target/i386/pr65915.c
> new file mode 100644
> index 000..990c5aa
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/i386/pr65915.c
> @@ -0,0 +1,6 @@
> +/* { dg-do run } */
> +/* { dg-options "-O2 -mavx512f -fpic -mcmodel=medium" } */
> +/* { dg-require-effective-target avx512f } */
> +/* { dg-require-effective-target lp64 } */
> +
> +#include "avx512f-vrndscalepd-2.c"

Missing testcases for

FAIL: gcc.target/i386/avx512f-vrndscaleps-2.c (test for excess errors)
FAIL: gcc.target/i386/avx512vl-vrndscaleps-2.c (internal compiler error)

as well as ChangeLog entries.

-- 
H.J.


Re: [PATCH][AARCH64]Use shl for vec_shr_ rtx pattern.

2015-04-30 Thread Marcus Shawcroft
On 30 April 2015 at 12:55, Renlin Li  wrote:

> 2015-04-30  Renlin Li  
>
> * config/aarch64/aarch64-simd.md (vec_shr): Defined as an unspec.
> * config/aarch64/iterators.md (unspec): Add UNSPEC_VEC_SHR.
>
> gcc/testsuite/ChangeLog:
>
> 2015-04-30  Renlin Li  
>
> * gcc.target/aarch64/vect-reduc-or_1.c: New.

+  __builtin_printf("Failed %d\n", sum);
+  abort();

Space before (
Otherwise OK /Marcus


Re: [PATCH][ARM] Do not lower cost of setting core reg to constant. It doesn't have any effect

2015-04-30 Thread Marcus Shawcroft
On 22 April 2015 at 17:18, Kyrill Tkachov  wrote:

> 2015-04-22  Kyrylo Tkachov  
>
> * config/arm/arm.c (arm_new_rtx_costs): Do not lower cost
> immediate moves.

OK
/Marcus


Re: [PATCH][ARM] Do not lower cost of setting core reg to constant. It doesn't have any effect

2015-04-30 Thread Marcus Shawcroft
On 30 April 2015 at 16:22, Marcus Shawcroft  wrote:
> On 22 April 2015 at 17:18, Kyrill Tkachov  wrote:
>
>> 2015-04-22  Kyrylo Tkachov  
>>
>> * config/arm/arm.c (arm_new_rtx_costs): Do not lower cost
>> immediate moves.
>
> OK
> /Marcus

Ignore that, I'm not allowed to make that call. Wait for Ramana.
/Marcus


Re: [PATCH][AArch64] Properly handle mvn-register and add EON+shift pattern and cost appropriately

2015-04-30 Thread Marcus Shawcroft
On 23 April 2015 at 17:57, Kyrill Tkachov  wrote:

> 2015-04-23  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.md
> (*eor_one_cmpl_3_alt):
> New pattern.
> (*eor_one_cmpl_sidi3_alt_ze): Likewise.
> * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle MVN-shift
> appropriately.  Handle alternative EON form.

OK /Marcus


Re: [PATCH][AArch64] Properly cost FABD pattern

2015-04-30 Thread Marcus Shawcroft
On 22 April 2015 at 17:01, Kyrill Tkachov  wrote:

> 2015-04-22  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_rtx_costs): Handle pattern for
> fabd in ABS case.

OK /Marcus


RE: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value to output_constant_pool_2

2015-04-30 Thread rohitarul...@freescale.com


> -Original Message-
> From: Jeff Law [mailto:l...@redhat.com]
> Sent: Thursday, April 30, 2015 8:32 PM
> To: Dharmakan Rohit-B30502; gcc-patches@gcc.gnu.org;
> rguent...@suse.de; Jakub Jelinek
> Cc: Alan Modra; David Edelsohn; Wienskoski Edmar-RA8797
> Subject: Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value
> to output_constant_pool_2
> 
> On 04/29/2015 04:30 AM, rohitarul...@freescale.com wrote:
> >>
> >
> > Jeff, I have made the changes as per your comments and attached the
> patch.
> > If the patch is OK, I will proceed with the regression tests.
> This patch refers back to 60158 and based on what I see in 60158, it appears I
> should be looking for a .data.rel.ro.local section which contains the address
> of a string constant.  But the constants are being put into .rodata.str1.4.
> And if the issue is we're putting bits into the wrong section and don't have an
> appropriate .fixup section, then ISTM that the test should be compiled, then
> objdump used to verify the sections and/or relocations.
> 
> An additional concern is that I get the same code for the included testcase
> with or without your changes.  This is with a powerpc-softfloat-linux-gnuspe
> configured compiler -- which matches what I saw in pr 60158.
> 
> So while the patch seems reasonable, I'm concerned that I've been unable to
> show it changing anything.
> 
> Thoughts?
> 

Jeff, the issue is still reproducible with GCC v4.8 branch but not with GCC 
v4.9 or trunk.

Regards,
Rohit


Re: [PATCH][AArch64] Add alternative 'extr' pattern, calculate rtx cost properly

2015-04-30 Thread Marcus Shawcroft
On 27 April 2015 at 11:01, Kyrill Tkachov  wrote:

> 2015-04-27  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.md (*extr5_insn_alt): New pattern.
> (*extrsi5_insn_uxtw_alt): Likewise.
> * config/aarch64/aarch64.c (aarch64_extr_rtx_p): New function.
> (aarch64_rtx_costs, IOR case): Use above to properly cost extr
> operations.

OK /Marcus


Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value to output_constant_pool_2

2015-04-30 Thread Jeff Law

On 04/30/2015 09:34 AM, rohitarul...@freescale.com wrote:




>> -Original Message-
>> From: Jeff Law [mailto:l...@redhat.com]
>> Sent: Thursday, April 30, 2015 8:32 PM
>> To: Dharmakan Rohit-B30502; gcc-patches@gcc.gnu.org;
>> rguent...@suse.de; Jakub Jelinek
>> Cc: Alan Modra; David Edelsohn; Wienskoski Edmar-RA8797
>> Subject: Re: [RFC: Patch, PR 60158] gcc/varasm.c : Pass actual alignment value
>> to output_constant_pool_2
>>
>> On 04/29/2015 04:30 AM, rohitarul...@freescale.com wrote:
>>>
>>> Jeff, I have made the changes as per your comments and attached the
>>> patch.
>>> If the patch is OK, I will proceed with the regression tests.
>>
>> This patch refers back to 60158 and based on what I see in 60158, it appears I
>> should be looking for a .data.rel.ro.local section which contains the address
>> of a string constant.  But the constants are being put into .rodata.str1.4.  And
>> if the issue is we're putting bits into the wrong section and don't have an
>> appropriate .fixup section, then ISTM that the test should be compiled, then
>> objdump used to verify the sections and/or relocations.
>>
>> An additional concern is that I get the same code for the included testcase
>> with or without your changes.  This is with a powerpc-softfloat-linux-gnuspe
>> configured compiler -- which matches what I saw in pr 60158.
>>
>> So while the patch seems reasonable, I'm concerned that I've been unable to
>> show it changing anything.
>>
>> Thoughts?
>
> Jeff, the issue is still reproducible with GCC v4.8 branch but not with GCC
> v4.9 or trunk.

Was it fixed by some other approach or has the bug gone latent?
Obviously if the former, then the patch is only relevant to gcc-4.8; if
the latter, then we'll still want to get it fixed on the trunk and
possibly in the release branches.


Can you please investigate if the bug has been fixed by other means or 
if it's just gone latent on the trunk?


Thanks,
jeff


Re: [PATCH] PR target/48904 x86_64-knetbsd-gnu missing defs

2015-04-30 Thread Jeff Law

On 04/30/2015 01:58 AM, Bernhard Reutner-Fischer wrote:

> Hi,
>
> On 30 April 2015 at 07:00, Jeff Law  wrote:
>> On 04/29/2015 02:01 AM, Bernhard Reutner-Fischer wrote:
>>>
>>> 2012-09-21  H.J. Lu
>>>
>>>  PR target/48904
>>>  * config.gcc (x86_64-*-knetbsd*-gnu): Add i386/knetbsd-gnu64.h.
>>>  * config/i386/knetbsd-gnu64.h: New file
>>
>> OK.  Please install on the trunk.
>
> hmz, according to https://www.debian.org/ports/netbsd/ the debian
> knetbsd port is abandoned since about 2002.
> If this is true (please confirm) then we should probably remove knetbsd from
> - upstream config repo
> - GCC
> - binutils-gdb
>
> instead of the above patchlet.
> This would work equally well for me WRT config-list.mk builds..
> [I should have checked this earlier, sorry..]

Given what Guillem indicated, I'd support removal.

It's often the case that we mark it as deprecated and issue an explicit 
error if someone tries to build the port.  That seems wise here.


jeff


Re: C/C++ PATCH to fix latest -Wbool-compare extension

2015-04-30 Thread Jeff Law

On 04/30/2015 07:17 AM, Marek Polacek wrote:

> The problem here was that the -Wbool-compare warning about always false/true
> comparisons with 0/1 was assuming that both operands are of a boolean type.
> That was wrong so check for that, but don't get confused about bools promoted
> to int.
>
> This bug is blocking aarch64 bootstrap, so I'm taking the liberty of committing
> it right away.
>
> Bootstrapped/regtested on x86_64-linux, applying to trunk.
>
> 2015-04-30  Marek Polacek
>
> * c-common.c (maybe_warn_bool_compare): When comparing with 0/1,
> require that the non-constant be of a boolean type.
>
> * c-c++-common/Wbool-compare-3.c: New test.

OK.

BTW, you may also want to consider warning for integer types with a 
precision of 1 bit.


Jeff
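
As an illustration of the distinction discussed in this message, here is a minimal sketch; the function names are hypothetical and are not taken from the patch or from Wbool-compare-3.c. It shows why a comparison with 0/1 is only "always true" or "always false" when the non-constant operand really is boolean. A compiler with -Wbool-compare enabled may warn on the first two functions, which is the intended diagnostic.

```cpp
// A bool operand can only hold 0 or 1, so these results never vary.
constexpr bool bool_ge_zero (bool b) { return b >= 0; }  // always true
constexpr bool bool_gt_one (bool b) { return b > 1; }    // always false

// An int operand has a wider range, so the very same comparisons are
// meaningful and should not be diagnosed as always true/false.
constexpr bool int_ge_zero (int i) { return i >= 0; }
constexpr bool int_gt_one (int i) { return i > 1; }
```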


Re: More type narrowing in match.pd

2015-04-30 Thread Jeff Law

On 04/30/2015 03:00 AM, Richard Biener wrote:


> Without looking too close at this patch I'll note that we might want to
> improve the previous one first to also handle a constant 2nd operand
> for the operation (your new one also misses that).

Yea, I think you mentioned that in the 47477 BZ as well.  If you've
got testcases, pass them along so that we can build testcases around
those forms as well.

> and it was noticed multiple times that the type comparison boiler-plate
> needs some helper function.  Like

Yes.  It's on the TODO list.  There's certainly more follow-ups in the
pipeline.  If we want to factor out the boiler-plate now, that works for me.

> And if you'd like to lend helping hands to adding patterns then transitioning
> patterns from fold-const.c to match.pd is more appreciated than inventing
> new ones ;)

The next round of work is much more likely to be reimplementing the
operand shortening code shared between the C/C++ front-ends in match.pd
and removal of the C/C++ operand shortening code.

This patch didn't fit into that work terribly well and seemed
self-contained enough to go forward now rather than waiting.


Jeff



Re: [PATCH][AArch64] Fix operand costing logic for MINUS

2015-04-30 Thread Marcus Shawcroft
On 27 April 2015 at 14:24, Kyrill Tkachov  wrote:

> 2015-04-27  Kyrylo Tkachov  
>
> * config/aarch64/aarch64.c (aarch64_rtx_costs, MINUS):
> Properly account for both operand costs in simple case.

OK /Marcus


Re: C/C++ PATCH to fix latest -Wbool-compare extension

2015-04-30 Thread Marek Polacek
On Thu, Apr 30, 2015 at 09:55:02AM -0600, Jeff Law wrote:
> OK.
 
Thanks.

> BTW, you may also want to consider warning for integer types with a
> precision of 1 bit.

Yup, this is being tracked in
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=49706#c2
I suppose I'll get back to that in this stage1.

Marek


Re: Mostly rewrite genrecog

2015-04-30 Thread Richard Sandiford
Richard Biener  writes:
> On Thu, Apr 30, 2015 at 2:08 PM, Andreas Schwab  wrote:
>> Richard Sandiford  writes:
>>
>>> Andreas Schwab  writes:
>>>> Richard Sandiford  writes:
>>>>
>>>>> /* Represents a test and the action that should be taken on the result.
>>>>>    If a transition exists for the test outcome, the machine switches
>>>>>    to the transition's target state.  If no suitable transition exists,
>>>>>    the machine either falls through to the next decision or, if there are no
>>>>>    more decisions to try, fails the match.  */
>>>>> struct decision : list_head 
>>>>> {
>>>>>   decision (const test &);
>>>>>
>>>>>   void set_parent (list_head  *s);
>>>>>   bool if_statement_p (uint64_t * = 0) const;
>>>>>
>>>>>   /* The state to which this decision belongs.  */
>>>>>   state *s;
>>>>>
>>>>>   /* Links to other decisions in the same state.  */
>>>>>   decision *prev, *next;
>>>>>
>>>>>   /* The test to perform.  */
>>>>>   struct test test;
>>>>> };
>>>>
>>>> ../../gcc/genrecog.c:1467: error: declaration of 'test decision::test'
>>>> ../../gcc/genrecog.c:1051: error: changes meaning of 'test' from 'struct test'
>>>>
>>>> Bootstrap compiler is gcc 4.3.4.
>>>
>>> Bah.  Does it like "::test test" instead of "struct test test"?
>>
>> Same error.
>
> You have to use a different name I believe (or -fpermissive).

Hmm, but then why does it work with more recent compilers?

Thanks,
Richard
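
The "use a different name" suggestion quoted above can be sketched as follows. This is a reduced, hypothetical stand-in for the genrecog.c declarations and not the actual fix; the member name `check` is an assumption chosen only for illustration. Because the member no longer shares its name with the type, `test` keeps the same meaning everywhere inside the class.

```cpp
// Reduced stand-in for the 'test' type in the quoted code.
struct test
{
  int kind;
};

struct decision
{
  // 'test' names the type here...
  explicit decision (const test &t) : check (t) {}

  /* The test to perform.  Named 'check' (hypothetical) rather than
     'test', so the constructor parameter above and this member both
     refer to the same declaration of 'test'.  */
  test check;
};
```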



Re: [PATCH] Fix for PR26702: Emit .size for BSS variables on arm-eabi

2015-04-30 Thread Kwok Cheung Yeung

Hello

The target of the pr26702.c testcase was changed while committing from:

{ target arm*-*-eabi* }

in my original patch to:

{ target arm_eabi }

The check_effective_target_arm_eabi test (in 
gcc/testsuite/lib/target-supports.exp) checks for the presence of the 
__ARM_EABI__ preprocessor define, which is also present for the 
arm-none-linux-gnueabi* targets. This test should only be run on 
bare-metal targets without an OS, so this change was incorrect.
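
To make the difference between the two selectors concrete, here is a small sketch based only on the statement above that `__ARM_EABI__` is defined on both bare-metal and Linux EABI targets. The function name and the returned strings are hypothetical, and the use of `__linux__` to tell the two apart is an assumption of this sketch, not something the testsuite does.

```cpp
// Classify the compilation target from predefined macros.
// __ARM_EABI__ alone cannot distinguish arm-none-eabi from
// arm-none-linux-gnueabi*, since both define it.
inline const char *
arm_eabi_kind ()
{
#if defined (__ARM_EABI__) && defined (__linux__)
  return "ARM EABI on Linux";
#elif defined (__ARM_EABI__)
  return "ARM EABI without __linux__ (possibly bare-metal)";
#else
  return "not ARM EABI";
#endif
}
```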


Ramana, could you please revert the target string to what it was originally?

Thanks

Kwok

ChangeLog (gcc/testsuite/):

* gcc.target/arm/pr26702.c: Change target to run only on
bare-metal configs.

On 30/04/2015 8:13 AM, Bin.Cheng wrote:

> Hi Kwok,
> The newly introduced test case failed on
> arm-none-linux-gnueabi & arm-none-linux-gnueabihf.  Could you please
> have a look at it?
>
> FAIL: gcc.target/arm/pr26702.c scan-assembler \\.size[\\t ]+static_foo, 4
>
> PR65937 is filed for tracking this.
>
> Thanks,
> bin





Re: [PATCH] Fix size & type for cold partition names (hot-cold function partitioning)

2015-04-30 Thread Caroline Tice
Done.  Here is the updated patch (with ChangeLog entries).  Only
change was to update tm.texi.in.

The bootstrap passed.  Is the patch ok to commit?

-- Caroline
cmt...@google.com

ChangeLog (gcc):

2015-04-30  Caroline Tice  

 PR 65929
* config/elfos.h (ASM_DECLARE_COLD_FUNCTION_NAME): New macro definition.
(ASM_DECLARE_COLD_FUNCTION_SIZE): New macro definition.
* doc/tm.texi.in (ASM_DECLARE_COLD_FUNCTION_NAME): Document new macro.
(ASM_DECLARE_COLD_FUNCTION_SIZE): Document new macro.
* final.c (final_scan_insn):  Use ASM_DECLARE_COLD_FUNCTION_NAME
instead of ASM_DECLARE_FUNCTION_NAME for cold partition name.
* varasm.c (assemble_end_function):  Use ASM_DECLARE_COLD_FUNCTION_SIZE
instead of ASM_DECLARE_FUNCTION_SIZE for cold partition size.

ChangeLog (gcc/testsuite):

2015-04-30  Caroline Tice  

PR  65929
* gcc.dg/tree-prof/cold_partition_label.c:  Only check for cold
partition size on certain targets.



On Wed, Apr 29, 2015 at 11:12 PM, Uros Bizjak  wrote:
> On Wed, Apr 29, 2015 at 11:22 PM, Caroline Tice  wrote:
>> Here is a new patch to update the cold name partition so that it will
>> only be treated like a function name and be given a size on the
>> architectures that specifically define macros for such.
>>
>> I also updated the test case to try to only test on the appropriate
>> architectures.  I am not sure I got the target triples correct for
>> this, so I would appreciate some extra attention to that in the
>> review.  I have tested this new patch on my workstation and it works
>> as intended.  I am in the process of bootstrapping with the new patch.
>> Assuming that the bootstrap passes, is this ok to commit?
>>
>> -- Caroline Tice
>> cmt...@google.com
>>
>> ChangeLog (gcc):
>>
>> 2015-04-29  Caroline Tice  
>>
>> PR 65929
>> * config/elfos.h (ASM_DECLARE_COLD_FUNCTION_NAME): New macro 
>> definition.
>> (ASM_DECLARE_COLD_FUNCTION_SIZE): New macro definition.
>> * final.c (final_scan_insn):  Use ASM_DECLARE_COLD_FUNCTION_NAME
>> instead of ASM_DECLARE_FUNCTION_NAME for cold partition name.
>> * varasm.c (assemble_end_function):  Use 
>> ASM_DECLARE_COLD_FUNCTION_SIZE
>> instead of ASM_DECLARE_FUNCTION_SIZE for cold partition size.
>>
>> ChangeLog (testsuite):
>>
>> 2015-04-29  Caroline Tice  
>>
>>PR  65929
>> * gcc.dg/tree-prof/cold_partition_label.c:  Only check for cold
>> partition size on certain targets.
>
> Documentation for new macros is missing (please see doc/tm.texi.in).
>
> Uros.
Index: gcc/config/elfos.h
===
--- gcc/config/elfos.h	(revision 222635)
+++ gcc/config/elfos.h	(working copy)
@@ -284,6 +284,22 @@
   while (0)
 #endif
 
+/* Write the extra assembler code needed to declare the name of a
+   cold function partition properly. Some svr4 assemblers need to also
+   have something extra said about the function's return value.  We
+   allow for that here.  */
+
+#ifndef ASM_DECLARE_COLD_FUNCTION_NAME
+#define ASM_DECLARE_COLD_FUNCTION_NAME(FILE, NAME, DECL)	\
+  do\
+{\
+  ASM_OUTPUT_TYPE_DIRECTIVE (FILE, NAME, "function");	\
+  ASM_DECLARE_RESULT (FILE, DECL_RESULT (DECL));		\
+  ASM_OUTPUT_FUNCTION_LABEL (FILE, NAME, DECL);		\
+}\
+  while (0)
+#endif
+
 /* Write the extra assembler code needed to declare an object properly.  */
 
 #ifdef HAVE_GAS_GNU_UNIQUE_OBJECT
@@ -358,6 +374,17 @@
   while (0)
 #endif
 
+/* This is how to declare the size of a cold function partition.  */
+#ifndef ASM_DECLARE_COLD_FUNCTION_SIZE
+#define ASM_DECLARE_COLD_FUNCTION_SIZE(FILE, FNAME, DECL)	\
+  do\
+{\
+  if (!flag_inhibit_size_directive)\
+	ASM_OUTPUT_MEASURED_SIZE (FILE, FNAME);			\
+}\
+  while (0)
+#endif
+
 /* A table of bytes codes used by the ASM_OUTPUT_ASCII and
ASM_OUTPUT_LIMITED_STRING macros.  Each byte in the table
corresponds to a particular byte value [0..255].  For any
Index: gcc/doc/tm.texi.in
===
--- gcc/doc/tm.texi.in	(revision 222635)
+++ gcc/doc/tm.texi.in	(working copy)
@@ -5574,6 +5574,34 @@
 of this macro.
 @end defmac
 
+@defmac ASM_DECLARE_COLD_FUNCTION_NAME (@var{stream}, @var{name}, @var{decl})
+A C statement (sans semicolon) to output to the stdio stream
+@var{stream} any text necessary for declaring the name @var{name} of a
+cold function partition which is being defined.  This macro is responsible
+for outputting the label definition (perhaps using
+@code{ASM_OUTPUT_FUNCTION_LABEL}).  The argument @var{decl} is the
+@code{FUNCTION_DECL} tree node representing the function.
+
+If this macro is not defined, then the cold partition name is defined in the
+usual manner as a label (by means of @code{ASM_OUTPUT_LABEL}).
+
+You may wish to use @code{ASM_OUTPUT_TYPE_DIRECTIVE} in the definition
+of this macro.
+@end defmac
+
+@defmac ASM_DECLARE_COLD_FUNCTION_SIZE (@var{stre

[C++ patch] PR 65858

2015-04-30 Thread Prathamesh Kulkarni
Hi,
The attached patch fixes ICE in PR65858.

For the test-case:
int x { 0.5 };
int main() { return 0; }

Compiling with: g++ -flto -Wno-narrowing -std=gnu++11
results in the following ICE:
lto1: internal compiler error: in get_constructor, at varpool.c:331
0xd22f73 varpool_node::get_constructor()
../../src/gcc/varpool.c:331
0xd23e28 varpool_node::assemble_decl()
../../src/gcc/varpool.c:602
0x6b8793 output_in_order
../../src/gcc/cgraphunit.c:2137
0x6b8c83 symbol_table::compile()
../../src/gcc/cgraphunit.c:2378
0x62b205 lto_main()
../../src/gcc/lto/lto.c:3496

The ICE happens because error_mark_node gets streamed in the
object file and hits the assert:
gcc_assert (DECL_INITIAL (decl) != error_mark_node);

It appears that r49, which fixed PR65801 introduced this issue.

For the above test-case convert_like_real() calls check_narrowing():
 if (convs->check_narrowing
  && !check_narrowing (totype, expr, complain))
return error_mark_node;

Here convert_like_real() returns error_mark_node, because
check_narrowing() returns false.

Consider this part of check_narrowing():

if (!ok)
  {
    //...
    else if (complain & tf_error)
      {
        global_dc->pedantic_errors = 1;
        pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
                 "narrowing conversion of %qE from %qT to %qT inside { }",
                 init, ftype, type);
        global_dc->pedantic_errors = flag_pedantic_errors;
      }
  }
return cxx_dialect == cxx98 || ok;

pedwarn() doesn't print a warning here and returns 0.
That's because the following condition becomes true in
diagnostic.c:diagnostic_report_diagnostic():

/* This tests if the user provided the appropriate -Wfoo or
   -Wno-foo option.  */
if (! context->option_enabled (diagnostic->option_index,
   context->option_state))
  return false;

So diagnostic_report_diagnostic() returns false to pedwarn(),
which then returns 0 to check_narrowing(), and the warning is not printed.

return cxx_dialect == cxx98 || ok;
Since cxx_dialect is not cxx98 and ok is false, it returns false.

The attached patch fixes the ICE by setting "ok = true" when
warn_narrowing is disabled (i.e. -Wno-narrowing is given), thereby
returning "true" to convert_like_real().
Bootstrapped and tested on x86_64-unknown-linux-gnu with no regressions.
OK for trunk ?
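
The control flow described above can be sketched as a small stand-alone model. This is only an illustration, not the code in typeck2.c: the function name and its boolean parameters are hypothetical, and only the "complain & tf_error" branch is modelled (the branches elided as "//..." above are left out).

```cpp
// Hypothetical model of the tail of check_narrowing(); illustration only.
//   ok             - whether the conversion was found to be non-narrowing
//   is_cxx98       - stands in for (cxx_dialect == cxx98)
//   complain_error - stands in for (complain & tf_error)
//   warn_narrowing - false when -Wno-narrowing is given
//   patched        - selects the behaviour with or without the patch
constexpr bool check_narrowing_model(bool ok, bool is_cxx98,
                                     bool complain_error,
                                     bool warn_narrowing, bool patched)
{
  return is_cxx98
         || ok
         // With the patch, a narrowing whose diagnostic is disabled is
         // treated as acceptable instead of being reported as a failure.
         || (!ok && complain_error && patched && !warn_narrowing);
}
```

Without the patch the model returns false for the test-case (C++11, -Wno-narrowing), which corresponds to convert_like_real() returning error_mark_node. With the patch it returns true, and with -Wnarrowing left enabled the result is unchanged.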

Thank you,
Prathamesh
/cp
2015-04-20  Prathamesh Kulkarni  

PR c++/65858
* typeck2.c (check_narrowing): Do not pedwarn if -Wno-narrowing is 
enabled.
Index: gcc/cp/typeck2.c
===
--- gcc/cp/typeck2.c	(revision 222573)
+++ gcc/cp/typeck2.c	(working copy)
@@ -958,11 +958,17 @@
 	}
   else if (complain & tf_error)
 	{
-	  global_dc->pedantic_errors = 1;
-	  pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
-		   "narrowing conversion of %qE from %qT to %qT inside { }",
-		   init, ftype, type);
-	  global_dc->pedantic_errors = flag_pedantic_errors;
+	  /* silence warning if -Wno-narrowing -is specified */
+	  if (!warn_narrowing)
+	ok = true;
+	  else
+	{ 
+	  global_dc->pedantic_errors = 1;
+	  pedwarn (EXPR_LOC_OR_LOC (init, input_location), OPT_Wnarrowing,
+		  "narrowing conversion of %qE from %qT to %qT inside { }",
+		   init, ftype, type);
+	  global_dc->pedantic_errors = flag_pedantic_errors;
+	}
 	}
 }
 


Re: [PATCH] Fix up tm_clone_hasher

2015-04-30 Thread Jeff Law

On 04/30/2015 03:30 AM, Marek Polacek wrote:

Ping.

On Wed, Apr 22, 2015 at 05:24:43PM +0200, Marek Polacek wrote:

handle_cache_entry in tm_clone_hasher looks wrong: the condition
if (e != HTAB_EMPTY_ENTRY || e != HTAB_DELETED_ENTRY) is always true.  While
it could be fixed by just changing || into &&, I decided to follow suit and
do what we do in handle_cache_entry's elsewhere in the codebase.  I've fixed
a formatting issue below while at it.
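
A minimal stand-alone sketch of why the original condition is always true. The sentinel constants and function names below are hypothetical stand-ins for HTAB_EMPTY_ENTRY and HTAB_DELETED_ENTRY; the sketch shows only the simpler "|| into &&" alternative mentioned above, not the approach actually taken in the patch.

```cpp
#include <cstdint>

// Hypothetical sentinel values standing in for HTAB_EMPTY_ENTRY and
// HTAB_DELETED_ENTRY.
constexpr std::uintptr_t kEmptyEntry = 0;
constexpr std::uintptr_t kDeletedEntry = 1;

// Original condition: no value can equal both sentinels at once, so at
// least one of the two inequalities always holds.
constexpr bool is_live_buggy(std::uintptr_t e)
{
  return e != kEmptyEntry || e != kDeletedEntry;
}

// With && the test is true only for entries that are neither sentinel.
constexpr bool is_live_fixed(std::uintptr_t e)
{
  return e != kEmptyEntry && e != kDeletedEntry;
}
```

is_live_buggy() returns true even for the two sentinel values, whereas is_live_fixed() rejects both and accepts any other entry.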

Bootstrapped/regtested on x86_64-linux, ok for trunk?
I think this should also go into 5.1.

2015-04-22  Marek Polacek  

* varasm.c (handle_cache_entry): Fix logic.
OK.  Though I do wonder if we should try to unify this with the other 
instances to avoid the useless duplication.


jeff



Go patch committed: Mark non-escaping variables whose address is not taken

2015-04-30 Thread Ian Lance Taylor
This patch from Chris Manghane marks variables whose address is not
taken as not escaping.  This is needed because for some types there
are ways for the variable to escape even without taking its address.
Bootstrapped and ran Go testsuite on x86_64-unknown-linux-gnu.
Committed to mainline.

Ian
diff -r c8ad01075b29 go/escape.cc
--- a/go/escape.cc  Wed Apr 29 15:39:43 2015 -0700
+++ b/go/escape.cc  Thu Apr 30 10:20:47 2015 -0700
@@ -1560,8 +1560,7 @@
 
   if (var->is_variable())
 {
-  if (var->var_value()->is_address_taken())
-   var->var_value()->set_does_not_escape();
+  var->var_value()->set_does_not_escape();
   if (var->var_value()->init() != NULL
  && var->var_value()->init()->allocation_expression() != NULL)
{
@@ -1570,9 +1569,6 @@
  alloc->set_allocate_on_stack();
}
 }
-  else if (var->is_result_variable()
-  && var->result_var_value()->is_address_taken())
-var->result_var_value()->set_does_not_escape();
 
   return TRAVERSE_CONTINUE;
 }


Re: More type narrowing in match.pd

2015-04-30 Thread Jeff Law

On 04/30/2015 01:17 AM, Marc Glisse wrote:


+/* This is another case of narrowing, specifically when there's an outer
+   BIT_AND_EXPR which masks off bits outside the type of the innermost
+   operands.   Like the previous case we have to convert the operands
+   to unsigned types to avoid introducing undefined behaviour for the
+   arithmetic operation.  */
+(for op (minus plus)

No mult? or widen_mult with a different pattern? (maybe that's already
done elsewhere)
No mult.  When I worked on the pattern for 47477, supporting mult 
clearly regressed the generated code -- presumably because we can often 
widen the operands for free.
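
For reference, the arithmetic identity behind this narrowing can be checked on concrete values. The sketch below is an assumption-laden illustration in plain C++, not match.pd code: the function names are hypothetical, the narrow type is fixed to 8 bits, and the mask is assumed to have no bits set above the low 8.

```cpp
#include <cstdint>

// Wide form: widen both operands, operate in int, then mask.
inline unsigned wide_add_masked(std::uint8_t a, std::uint8_t b, unsigned mask)
{
  return (static_cast<int>(a) + static_cast<int>(b)) & mask;
}

inline unsigned wide_sub_masked(std::uint8_t a, std::uint8_t b, unsigned mask)
{
  return (static_cast<int>(a) - static_cast<int>(b)) & mask;
}

// Narrow form: operate in the unsigned 8-bit type (wrapping modulo 256),
// then mask.  Using an unsigned type keeps the wrap-around well defined.
inline unsigned narrow_add_masked(std::uint8_t a, std::uint8_t b, unsigned mask)
{
  std::uint8_t sum = static_cast<std::uint8_t>(a + b);
  return sum & mask;
}

inline unsigned narrow_sub_masked(std::uint8_t a, std::uint8_t b, unsigned mask)
{
  std::uint8_t diff = static_cast<std::uint8_t>(a - b);
  return diff & mask;
}
```

As long as the mask fits in the narrow type, the wide and narrow forms agree for every pair of operands, because the mask discards exactly the bits in which the two results could differ.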




+  (simplify
+(bit_and (op (convert@2 @0) (convert@3 @1)) INTEGER_CST@4)

Maybe op@5 and then test single_use on @5? If I compute something, and
before using it I test if the result is odd, I may not want to recompute
it.

Sure.  That ought to be easy to add.



+(if (INTEGRAL_TYPE_P (type)

Can this be false, or is it for documentation?
Can't recall a case where we were presented with a non-integral type, 
but I haven't even tried to work through what might happen on 
non-integral types.  Better safe than sorry.




+/* We check for type compatibility between @0 and @1 below,
+   so there's no need to check that @1/@3 are integral types.  */
+&& INTEGRAL_TYPE_P (TREE_TYPE (@0))
+&& INTEGRAL_TYPE_P (TREE_TYPE (@2))
+/* The precision of the type of each operand must match the
+   precision of the mode of each operand, similarly for the
+   result.  */

A nicely named helper that does this test would be cool. Every time I
see it I have to think again why it is necessary, and if there was a
function, I could refer to the comment above its definition ;-)
Factoring helpers for this stuff is something I wanted to do a bit 
later as Kai and I build up the necessary patterns to eliminate the 
C/C++ specific operand shortening and hopefully the set of helpers 
needed becomes clearer.


The type_same_p helper clearly already makes sense as there's these two 
shortening patterns and two others that need it.  Given that & Richi's 
request, I'll go ahead and factor that one out.






+&& (TYPE_PRECISION (TREE_TYPE (@0))
+== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@0))))
+&& (TYPE_PRECISION (TREE_TYPE (@1))
+== GET_MODE_PRECISION (TYPE_MODE (TREE_TYPE (@1))))
+&& TYPE_PRECISION (type) == GET_MODE_PRECISION (TYPE_MODE (type))
+/* The inner conversion must be a widening conversion.  */
+&& TYPE_PRECISION (TREE_TYPE (@2)) > TYPE_PRECISION (TREE_TYPE (@0))
+&& ((GENERIC
+&& (TYPE_MAIN_VARIANT (TREE_TYPE (@0))
+== TYPE_MAIN_VARIANT (TREE_TYPE (@1))))
+|| (GIMPLE
+&& types_compatible_p (TREE_TYPE (@0), TREE_TYPE (@1))))

We don't need to be that strict, but this probably covers the most
common case.
Probably not.  The idea was to start with what we know is right & 
correct, then extend, particularly as we find/build testcases.  The 
obvious extensions are those Richi pointed out in 47477, then again on 
this thread.  I'd like to tackle them as a follow-up and do so with both 
patterns.


[ They weren't tackled as part of 47477 as I wanted to focus on fixing
  the regression and didn't want to go much beyond what was necessary
  to fix the regression.  Obviously with stage1 open, it's time to
  tackle the cases Richi pointed out that we can/should handle. ]

jeff



Re: [PATCH] Fix size & type for cold partition names (hot-cold function partitioning)

2015-04-30 Thread Uros Bizjak
On Thu, Apr 30, 2015 at 6:26 PM, Caroline Tice  wrote:
> Done.  Here is the updated patch (with ChangeLog entries).  Only
> change was to update tm.texi.in.
>
> The bootstrap passed.  Is the patch ok to commit?

FYI, the (previous) patch also looks good on alphaev68-linux-gnu
native bootstrap [1].

[1] https://gcc.gnu.org/ml/gcc-testresults/2015-04/msg03508.html

Uros.


[debug-early] document new world and fix non-dwarf debugging backends

2015-04-30 Thread Aldy Hernandez
This patch adjusts the non-dwarf debugging back-ends to work in the new 
world.


Particularly interesting is dbxout, which was generating specific debug 
information for constants not written to memory.  For example, the 
following in C++:


const int invisible = 0xC0FFEE;

The new infrastructure never sees `invisible' because it doesn't make it 
to the symbol table, especially by the time late_global_decl runs.  What 
I did for this case was to handle it through the early_global_decl() 
hook, which does see `invisible', though it lacks location information. 
 However, since we don't need location information to represent this 
corner case, everything works fine and the pr23205-2.C stabs regression 
is fixed.
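
The split between the two hooks can be summarised with a small model. This is a sketch only: the Decl struct and function names are hypothetical stand-ins, and the really_constant flag stands in for the decl_is_really_constant() predicate from the patch below (which itself implies a non-external variable).

```cpp
// Hypothetical stand-in for a declaration; illustration only.
struct Decl
{
  const char *name;
  bool is_var;           // stands in for TREE_CODE (decl) == VAR_DECL
  bool is_external;      // stands in for DECL_EXTERNAL (decl)
  bool really_constant;  // stands in for decl_is_really_constant (decl)
};

// Early hook: only true constants are emitted at this point.
inline bool emitted_early(const Decl &d)
{
  return d.really_constant;
}

// Late hook: non-external variables, except those already handled early.
inline bool emitted_late(const Decl &d)
{
  return d.is_var && !d.is_external && !d.really_constant;
}

// Number of times debug info is emitted for a declaration.
inline int times_emitted(const Decl &d)
{
  return (emitted_early(d) ? 1 : 0) + (emitted_late(d) ? 1 : 0);
}
```

In this model a constant such as `invisible' is emitted once by the early hook, an ordinary global variable once by the late hook, and an external declaration not at all.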


I have also updated the documentation for the hooks to reflect reality.

Committed to branch.

Down to 3 sets of individual regressions in the testsuite.  Yay.

Aldy
commit bb51ad83395536cc6efc151b6fe3f1fa0616d7a4
Author: Aldy Hernandez 
Date:   Thu Apr 30 10:10:27 2015 -0700

Adjust non dwarf debugging backends for the debug-early infrastructure.

diff --git a/gcc/dbxout.c b/gcc/dbxout.c
index 0c9a327..9f555c3 100644
--- a/gcc/dbxout.c
+++ b/gcc/dbxout.c
@@ -1346,24 +1346,65 @@ dbxout_function_decl (tree decl)
 
 #endif /* DBX_DEBUGGING_INFO  */
 
+/* Return true if a variable is really a constant and not written in
+   memory.  */
+
+static bool
+decl_is_really_constant (tree decl)
+{
+  return ((TREE_CODE (decl) == VAR_DECL
+  || TREE_CODE (decl) == RESULT_DECL)
+ && !DECL_EXTERNAL (decl)
+ && TREE_STATIC (decl)
+ && TREE_READONLY (decl)
+ && DECL_INITIAL (decl) != 0
+ && tree_fits_shwi_p (DECL_INITIAL (decl))
+ && ! TREE_ASM_WRITTEN (decl)
+ && (DECL_FILE_SCOPE_P (decl)
+ || TREE_CODE (DECL_CONTEXT (decl)) == BLOCK
+ || TREE_CODE (DECL_CONTEXT (decl)) == NAMESPACE_DECL)
+ && TREE_PUBLIC (decl) == 0);
+}
+
+/* Wrapper for dbxout_symbol that temporarily sets TREE_USED on the
+   DECL.  */
+
+static void
+dbxout_symbol_used (tree decl)
+{
+  int saved_tree_used = TREE_USED (decl);
+  TREE_USED (decl) = 1;
+  dbxout_symbol (decl, 0);
+  TREE_USED (decl) = saved_tree_used;
+}
+
+/* Output early debug information for a global DECL.  Called from
+   rest_of_decl_compilation during parsing.  */
+
 static void
-dbxout_early_global_decl (tree decl ATTRIBUTE_UNUSED)
+dbxout_early_global_decl (tree decl)
 {
-  /* NYI for non-dwarf.  */
+  /* True constant values may not appear in the symbol table, so they
+ will be missed by the late_global_decl hook.  Handle these cases
+ now, since early_global_decl will get unoptimized symbols early
+ enough-- and besides, true constants don't need location
+ information, so it's ok to handle them earlier.  */
+  if (decl_is_really_constant (decl))
+dbxout_symbol_used (decl);
 }
 
-/* Debug information for a global DECL.  Called from toplev.c after
-   compilation proper has finished.  */
+/* Output late debug information for a global DECL after location
+   information is available.  */
+
 static void
-dbxout_late_global_decl (tree decl)
+dbxout_late_global_decl (tree decl ATTRIBUTE_UNUSED)
 {
-  if (TREE_CODE (decl) == VAR_DECL && !DECL_EXTERNAL (decl))
-{
-  int saved_tree_used = TREE_USED (decl);
-  TREE_USED (decl) = 1;
-  dbxout_symbol (decl, 0);
-  TREE_USED (decl) = saved_tree_used;
-}
+  if (TREE_CODE (decl) == VAR_DECL
+  && !DECL_EXTERNAL (decl)
+  /* Read-only constants were handled in
+dbxout_early_global_decl.  */
+  && !decl_is_really_constant (decl))
+dbxout_symbol_used (decl);
 }
 
 /* This is just a function-type adapter; dbxout_symbol does exactly
@@ -2904,14 +2945,7 @@ dbxout_symbol (tree decl, int local ATTRIBUTE_UNUSED)
 and not written in memory, inform the debugger.
 
 ??? Why do we skip emitting the type and location in this case?  */
-  if (TREE_STATIC (decl) && TREE_READONLY (decl)
- && DECL_INITIAL (decl) != 0
- && tree_fits_shwi_p (DECL_INITIAL (decl))
- && ! TREE_ASM_WRITTEN (decl)
- && (DECL_FILE_SCOPE_P (decl)
- || TREE_CODE (DECL_CONTEXT (decl)) == BLOCK
- || TREE_CODE (DECL_CONTEXT (decl)) == NAMESPACE_DECL)
- && TREE_PUBLIC (decl) == 0)
+  if (decl_is_really_constant (decl))
{
  /* The sun4 assembler does not grok this.  */
 
diff --git a/gcc/debug.h b/gcc/debug.h
index 528ef3f..c360e5c 100644
--- a/gcc/debug.h
+++ b/gcc/debug.h
@@ -92,17 +92,37 @@ struct gcc_debug_hooks
   /* Debug information for a function DECL.  This might include the
  function name (a symbol), its parameters, and the block that
  makes up the function's body, and the local variables of the
- function.  */
+ function.
+
+ This is only called for FUNCTION_DECLs.  It is part of the late
+ debug pass and is called from rest_of_handle_final.
+
+ Location information

[patch] Implement ISO/IEC TS 18822 C++ File system TS

2015-04-30 Thread Jonathan Wakely

This is the complete  implementation I intend
to commit shortly. (It's also been pushed to the redi/filesystem-ts
branch in the git mirror).

As before, only a static library (libstdc++fs.a) is built, so there
are no symbols added to libstdc++.so and we can be a bit more risky
with regards to maintaining a stable ABI for this stuff.

Since the last updates to the branch I added the  header,
which meant I could add the missing path conversion features to
filesystem. I've also put everything relevant in a nested __cxx11
namespace, to avoid problems with the dual std::string ABI.

At this time the objects in libstdc++fs.a are all built with the
default std::string ABI (as set by configure). It will probably be
necessary to compile all of src/filesystem/*.cc twice when the dual
ABI is enabled, so both sets of definitions go in the archive.

I've tested this on GNU/Linux and DragonFly BSD, but as it's probably
not going to build everywhere I've added the configure option
--enable-libstdcxx-filesystem-ts which defaults to enabled on GNU, BSD
and Solaris targets, and disabled elsewhere for now. If it fails to
build on any of those targets we can change the default while we fix
the problem.

There are still quite a few operations (see src/filesystem/ops.cc)
without proper implementations, which might need to use the Win32 API.
I've compiled the _GLIBCXX_FILESYSTEM_IS_WINDOWS code by overriding
the macro, but not tested it any more than that. Someone will have to
try using --enable-libstdcxx-filesystem-ts on mingw.org, mingw-w64 and
cygwin to see what happens there. It's possible that macro could be
replaced by the existing _GLIBCXX_HAVE_DOS_BASED_FILESYSTEM macro, but
I'm not sure yet.




patch.txt.bz2
Description: BZip2 compressed data

