[Bug tree-optimization/63599] New: "wrong" branch optimization with Ofast in a loop

2014-10-20 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

Bug ID: 63599
   Summary: "wrong" branch optimization with Ofast in a loop
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: vincenzo.innocente at cern dot ch

Given this code:

#include <xmmintrin.h>  // declares _mm_movemask_ps (header name was lost in the archive)

typedef float __attribute__( ( vector_size( 16 ) ) ) float32x4_t;

inline
float32x4_t atan(float32x4_t t) {
  constexpr float PIO4F = 0.7853981633974483096f;
  float32x4_t high = t > 0.4142135623730950f;
  auto z = t;
  float32x4_t ret={0.f,0.f,0.f,0.f};
// if all low no need to blend
  if ( _mm_movemask_ps(high) != 0) {
    z   = ( t > 0.4142135623730950f ) ? (t-1.0f)/(t+1.0f) : t;
    ret = ( t > 0.4142135623730950f ) ? ret+PIO4F : ret;
  }
  /* polynomial removed */
  return  ret += z;
}


float32x4_t doAtan(float32x4_t z) { return atan(z);}

float32x4_t va[1024];
float32x4_t vb[1024];

void computeV() {
  for (int i=0;i!=1024;++i)
    vb[i]=atan(va[i]);
}


Compiled with -Ofast:
c++ -S -std=c++1y -Ofast bugmvmk.cc -march=nehalem; cat bugmvmk.s
this produces the following code, where the "movmskps %xmm8, %edx"
does not protect the code in the if block...

__Z8computeVv:
LFB2512:
    movaps LC0(%rip), %xmm4
    xorl %eax, %eax
    movaps LC1(%rip), %xmm7
    leaq _va(%rip), %rcx
    movaps LC2(%rip), %xmm6
    movaps LC3(%rip), %xmm5
    .align 4,0x90
L10:
    movaps (%rcx,%rax), %xmm2
    movaps %xmm4, %xmm8
    movaps %xmm2, %xmm3
    cmpltps %xmm2, %xmm8
    movaps %xmm2, %xmm1
    addps %xmm6, %xmm3
    addps %xmm7, %xmm1
    movmskps %xmm8, %edx
    andps %xmm5, %xmm8
    rcpps %xmm3, %xmm0
    mulps %xmm0, %xmm3
    mulps %xmm0, %xmm3
    addps %xmm0, %xmm0
    subps %xmm3, %xmm0
    mulps %xmm0, %xmm1
    movaps %xmm2, %xmm0
    cmpleps %xmm4, %xmm0
    blendvps %xmm0, %xmm2, %xmm1
    pxor %xmm0, %xmm0
    testl %edx, %edx
    je L7
    movaps %xmm8, %xmm0
L7:
    testl %edx, %edx
    je L9
    movaps %xmm1, %xmm2
L9:
    addps %xmm0, %xmm2
    leaq _vb(%rip), %rdx
    movaps %xmm2, (%rdx,%rax)
    addq $16, %rax
    cmpq $16384, %rax
    jne L10
    ret

while with -O2 it is OK:
__Z8computeVv:
LFB2512:
    movaps LC0(%rip), %xmm4
    xorl %eax, %eax
    movaps LC1(%rip), %xmm7
    leaq _va(%rip), %rsi
    movaps LC2(%rip), %xmm6
    leaq _vb(%rip), %rcx
    movaps LC3(%rip), %xmm5
    .align 4,0x90
L7:
    movaps (%rsi,%rax), %xmm1
    movaps %xmm4, %xmm0
    pxor %xmm2, %xmm2
    cmpltps %xmm1, %xmm0
    movmskps %xmm0, %edx
    testl %edx, %edx
    je L6
    movaps %xmm1, %xmm3
    movaps %xmm1, %xmm2
    addps %xmm6, %xmm2
    addps %xmm7, %xmm3
    divps %xmm2, %xmm3
    movaps %xmm0, %xmm2
    andps %xmm5, %xmm2
    blendvps %xmm0, %xmm3, %xmm1
L6:
    addps %xmm2, %xmm1
    movaps %xmm1, (%rcx,%rax)
    addq $16, %rax
    cmpq $16384, %rax
    jne L7
    ret

Note that the function that is not in the loop (doAtan) is OK with both -O2 and -Ofast.


[Bug target/63599] "wrong" branch optimization with Ofast in a loop

2014-10-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

--- Comment #1 from Andrew Pinski  ---
The tree level looks like this:
  t_13 = VEC_COND_EXPR ;
  ret_14 = VEC_COND_EXPR  { 4.142135679721832275390625e-1,
4.142135679721832275390625e-1, 4.142135679721832275390625e-1,
4.142135679721832275390625e-1 }, { 7.85398185253143310546875e-1,
7.85398185253143310546875e-1, 7.85398185253143310546875e-1,
7.85398185253143310546875e-1 }, { 0.0, 0.0, 0.0, 0.0 }>;
  t_16 = _9 != 0 ? t_13 : t_4;
  ret_15 = _9 != 0 ? ret_14 : { 0.0, 0.0, 0.0, 0.0 };


>"movmskps  %xmm8, %edx"
> does not protect the code in the if block...
Yes it does, just not the way you think it does.

Notice the last two statements are conditional expressions.

And that gets translated into the following:
    testl %edx, %edx
    jne .L9
    movaps %xmm3, %xmm1
    pxor %xmm2, %xmm2
.L9:

So if anything it is a missed optimization dealing with conditional moves with
vectors without a vector comparison.
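
To illustrate the point about the two conditional expressions, here is a minimal scalar sketch in plain C++. Four-element arrays stand in for the SSE vectors, and the names V4, any_high, blend, atan_branchy and atan_select are made up for this illustration; they are not GCC internals. The first shape skips the blend when no lane is above the threshold (roughly the -O2 code), the second always computes the blend and then selects on the scalar mask (roughly the -Ofast code). Both should give the same result.

```cpp
#include <array>
#include <cassert>

using V4 = std::array<float, 4>;

constexpr float kThreshold = 0.4142135623730950f;
constexpr float kPio4 = 0.7853981633974483096f;

// Scalar stand-in for `_mm_movemask_ps(t > threshold) != 0`.
inline bool any_high(const V4& t) {
  for (float x : t)
    if (x > kThreshold) return true;
  return false;
}

// Per-lane blend: the body of the `if` in the report (ret starts at 0).
inline void blend(const V4& t, V4& z, V4& ret) {
  for (int i = 0; i < 4; ++i) {
    bool high = t[i] > kThreshold;
    z[i] = high ? (t[i] - 1.0f) / (t[i] + 1.0f) : t[i];
    ret[i] = high ? kPio4 : 0.0f;
  }
}

// Shape of the -O2 code: the blend is skipped when no lane is high.
inline V4 atan_branchy(const V4& t) {
  V4 z = t, ret{};
  if (any_high(t)) blend(t, z, ret);
  for (int i = 0; i < 4; ++i) ret[i] += z[i];
  return ret;
}

// Shape of the -Ofast code: the blend is always computed, and the scalar
// mask only selects between the blended and the original values.
inline V4 atan_select(const V4& t) {
  V4 z_blend = t, ret_blend{};
  blend(t, z_blend, ret_blend);
  bool mask = any_high(t);
  V4 out;
  for (int i = 0; i < 4; ++i) {
    float z = mask ? z_blend[i] : t[i];
    float ret = mask ? ret_blend[i] : 0.0f;
    out[i] = ret + z;
  }
  return out;
}

// Usage: both shapes agree whether or not any lane is above the threshold.
inline void check_equivalence(const V4& t) {
  assert(atan_branchy(t) == atan_select(t));
}
```

The results match; the difference is only that the second shape pays for the blend on every iteration.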

[Bug tree-optimization/54488] tree loop invariant motion uses an excessive amount of memory

2014-10-20 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488

--- Comment #6 from rguenther at suse dot de  ---
On Sun, 19 Oct 2014, evgeniya.maenkova at gmail dot com wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488
> 
> --- Comment #5 from Evgeniya Maenkova  ---
> Also, I collected massif data and see no tree-ssa-lim in it (I mean among
> the top contributors).
> 
> So what do you think?
> 
> (How did you measure the 1.8 GB caused by LIM? This is for me to understand
> whether this bug is still relevant or not.)

I basically watched 'top' with breakpoints at the start and end of LIM.


[Bug tree-optimization/62031] [4.8 Regression] Different results between O2 and O2 -fpredictive-commoning

2014-10-20 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62031

--- Comment #14 from clyon at gcc dot gnu.org ---
I confirm what I observed is a testsuite harness problem, for which I proposed
a patch here:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01792.html

dejagnu-1.5 (as shipped with Ubuntu 14.04) masks the problem I was facing with
dejagnu-1.4.4-X (as shipped with RHEL5).


[Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112

2014-10-20 Thread rguenther at suse dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587

--- Comment #4 from rguenther at suse dot de  ---
On Sun, 19 Oct 2014, mliska at suse dot cz wrote:

> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587
> 
> --- Comment #2 from Martin Liška  ---
> Following two functions are merged:
> static boost::log::make_output_actor, RightT, ValueT>::type
> boost::log::make_output_actor, RightT,
> ValueT>::make(ActorT, RightT&) [with ActorT = boost::actor;
> LeftExprT = int; RightT = boost::log::attribute_actor boost::log::value_extractor, void, boost::actor>; ValueT = int;
> boost::log::make_output_actor, RightT, ValueT>::type =
> boost::actor,
> boost::log::to_log_fun> >] (struct actor left, struct attribute_actor & right)
> 
> 
> static boost::log::make_output_actor, RightT, ValueT>::type
> boost::log::make_output_actor, RightT,
> ValueT>::make(ActorT, RightT&) [with ActorT = boost::actor;
> LeftExprT = int; RightT = boost::log::attribute_actor<{anonymous}::my_class,
> boost::log::value_extractor, void, boost::actor>; ValueT = int;
> boost::log::make_output_actor, RightT, ValueT>::type =
> boost::actor,
> boost::log::to_log_fun> >] (struct actor left, struct attribute_actor & right)
> 
> with following body:
> {
>   struct type D.3826;
>   struct to_log_fun D.3825;
>   struct attribute_name D.3824;
>   int SR.9;
>   struct actor left;
> 
>   :
>   left = left;
>   SR.9_4 = MEM[(struct attribute_terminal *)right_2(D)];
>   MEM[(struct attribute_name *)&D.3824] = SR.9_4;
>   boost::log::attribute_output_terminal,
> boost::log::to_log_fun>::attribute_output_terminal (&D.3826, left, 
> D.3824,
> D.3825, 0);
>   D.3826 ={v} {CLOBBER};
>   return;
> 
> }
> 
> 
> 
> As I was debugging ao_ref_alias_sets, there's MEM_REF where we have different
> template arguments: attribute_actor vs.
> attribute_actor<{anonymous}::my_class,...>.
> What do you think Richard about these record_types from alias set perspective:
> 
> (gdb) p debug_tree(t1)
>   type  size 
> unit size 
> align 32 symtab 0 alias set 4 canonical type 0x76c33690 precision
> 32 min  max  0x76c51018
> 2147483647>
> pointer_to_this >
> 
> arg 0  type  attribute_actor>
> unsigned DI
> size 
> unit size 
> align 64 symtab 0 alias set 7 canonical type 0x76e20d20>
> visited var def_stmt GIMPLE_NOP
> 
> version 2
> ptr-info 0x76a7e3d8>
> arg 1 
> constant 0>>
> $1 = void
> (gdb) p debug_tree(t2)
>   type  size 
> unit size 
> align 32 symtab 0 alias set 4 canonical type 0x76c33690 precision
> 32 min  max  0x76c51018
> 2147483647>
> pointer_to_this >
> 
> arg 0  type  attribute_actor>
> unsigned DI
> size 
> unit size 
> align 64 symtab 0 alias set 7 canonical type 0x76e20540>
> visited var def_stmt GIMPLE_NOP
> 
> version 2
> ptr-info 0x76a7e300>
> arg 1 
> constant 0>>
> 
> these types are called for alias_set comparison, with following record_types:
> (gdb) p debug_tree((tree_node*)0x76de7dc8)
>   SI
> size  bitsizetype> constant 32>
> unit size  sizetype> constant 4>
> align 32 symtab 0 alias set 17 canonical type 0x76de7dc8
> fields  type  type_6
> SI size  unit size  4>
> align 32 symtab 0 alias set 15 canonical type 0x76dddb28 
> fields
>  context  boost>
> full-name "struct boost::actor"
> needs-constructor X() X(constX&) this=(X&) n_parents=0
> use_template=1 interface-unknown
> pointer_to_this  reference_to_this
>  chain >
> ignored decl_6 SI file ../../PR33754.c line 167 col 7 size 
>  0x76c51048 32> unit size 
> align 32 offset_align 128
> offset 
> bit offset  context 
>  0x76de7dc8 attribute_actor>
> chain  0x76de80a8 attribute_actor>
> external nonlocal suppress-debug decl_4 VOID file ../../PR33754.c
> line 168 col 1
> align 8 context  
> result
> 
> chain >> context
> 
> full-name "class boost::log::attribute_actor boost::log::value_extractor, void, boost::actor>"
> needs-constructor X() X(constX&) this=(X&) n_parents=1 use_template=1
> interface-unknown
> pointer_to_this  reference_to_this
>  chain  attribute_actor>>
> $3 = void
> (gdb) p debug_tree((tree_node*)0x76ddd888)
>   SI
> size  bitsizetype> constant 32>
> unit size  sizetype> constant 4>
> align 32 symtab 0 alias set 14 canonical type 0x76ddd888
> fields  type  type_6
> SI size  unit size  4>
> align 32 symtab 0 alias set 15 canonical type 0x76dddb28 
> fields
>  context  boost>
> full-name "struct boost::actor"
> needs-constructor X() X(constX&) this=(X&) n_parents=0
> use_template=1 interface-unknown
> pointer_to_this  reference_to_this
>  chain >
>

[Bug libfortran/63589] find_addr2line does not consider last PATH component

2014-10-20 Thread jb at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589

--- Comment #2 from Janne Blomqvist  ---
Author: jb
Date: Mon Oct 20 07:53:37 2014
New Revision: 216449

URL: https://gcc.gnu.org/viewcvs?rev=216449&root=gcc&view=rev
Log:
PR 63589 Fix splitting of PATH in find_addr2line.

2014-10-20  Janne Blomqvist  

PR libfortran/63589
* configure.ac: Check for strtok_r.
* runtime/main.c (gfstrtok_r): Fallback implementation of
strtok_r.
(find_addr2line): Use strtok_r to split PATH.
* config.h.in: Regenerated.
* configure: Regenerated.

Modified:
trunk/libgfortran/ChangeLog
trunk/libgfortran/config.h.in
trunk/libgfortran/configure
trunk/libgfortran/configure.ac
trunk/libgfortran/runtime/main.c
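
For readers unfamiliar with the approach named in the log, here is a minimal sketch of splitting a PATH-like string with POSIX strtok_r so that the last component is not dropped. This is not the libgfortran code; split_path and check_last_component are hypothetical helpers written for illustration, and the sketch assumes a platform that provides strtok_r.

```cpp
#include <cassert>
#include <string.h>  // strtok_r (POSIX)
#include <string>
#include <vector>

// Split a colon-separated PATH-like string into its components.
// strtok_r modifies its input, so work on a private, NUL-terminated copy.
inline std::vector<std::string> split_path(const std::string& path) {
  std::vector<std::string> parts;
  std::vector<char> buf(path.begin(), path.end());
  buf.push_back('\0');
  char* save = nullptr;
  for (char* tok = strtok_r(buf.data(), ":", &save); tok != nullptr;
       tok = strtok_r(nullptr, ":", &save)) {
    parts.emplace_back(tok);
  }
  return parts;
}

// Usage: the last component must be reported as well.
inline void check_last_component() {
  std::vector<std::string> parts = split_path("/usr/bin:/bin:/opt/tools");
  assert(parts.size() == 3 && parts.back() == "/opt/tools");
}
```

Note that strtok_r treats runs of delimiters as one, so empty PATH components are skipped in this sketch.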


[Bug libfortran/63589] find_addr2line does not consider last PATH component

2014-10-20 Thread jb at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589

--- Comment #3 from Janne Blomqvist  ---
Author: jb
Date: Mon Oct 20 08:04:39 2014
New Revision: 216450

URL: https://gcc.gnu.org/viewcvs?rev=216450&root=gcc&view=rev
Log:
PR 63589 Fix splitting of PATH in find_addr2line.

2014-10-20  Janne Blomqvist  

PR libfortran/63589
* configure.ac: Check for strtok_r.
* runtime/main.c (gfstrtok_r): Fallback implementation of
strtok_r.
(find_addr2line): Use strtok_r to split PATH.
* config.h.in: Regenerated.
* configure: Regenerated.

Modified:
branches/gcc-4_9-branch/libgfortran/ChangeLog
branches/gcc-4_9-branch/libgfortran/config.h.in
branches/gcc-4_9-branch/libgfortran/configure
branches/gcc-4_9-branch/libgfortran/configure.ac
branches/gcc-4_9-branch/libgfortran/runtime/main.c


[Bug libfortran/63589] find_addr2line does not consider last PATH component

2014-10-20 Thread jb at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589

--- Comment #4 from Janne Blomqvist  ---
Author: jb
Date: Mon Oct 20 08:16:06 2014
New Revision: 216451

URL: https://gcc.gnu.org/viewcvs?rev=216451&root=gcc&view=rev
Log:
PR 63589 Fix splitting of PATH in find_addr2line.

2014-10-20  Janne Blomqvist  

PR libfortran/63589
* configure.ac: Check for strtok_r.
* runtime/main.c (gfstrtok_r): Fallback implementation of
strtok_r.
(find_addr2line): Use strtok_r to split PATH.
* config.h.in: Regenerated.
* configure: Regenerated.

Modified:
branches/gcc-4_8-branch/libgfortran/ChangeLog
branches/gcc-4_8-branch/libgfortran/config.h.in
branches/gcc-4_8-branch/libgfortran/configure
branches/gcc-4_8-branch/libgfortran/configure.ac
branches/gcc-4_8-branch/libgfortran/runtime/main.c


[Bug libfortran/63589] find_addr2line does not consider last PATH component

2014-10-20 Thread jb at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589

Janne Blomqvist  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #5 from Janne Blomqvist  ---
Fixed, closing.


[Bug tree-optimization/63586] x+x+x+x -> 4*x in gimple

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63586

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #1 from Jakub Jelinek  ---
I'd expect reassoc should be the pass to do this.
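
As a small illustration of the transformation in the summary, here is a sketch of the two forms for an unsigned integer (chosen here only so that overflow is well defined; the report itself does not restrict the type). The function names are made up.

```cpp
#include <cassert>
#include <cstdint>

// The source pattern: three additions of the same operand.
inline std::uint32_t sum4(std::uint32_t x) { return x + x + x + x; }

// The form the summary asks for: a single multiplication.
inline std::uint32_t mul4(std::uint32_t x) { return 4u * x; }

// Usage: both forms agree, including when the result wraps around.
inline void check_same(std::uint32_t x) { assert(sum4(x) == mul4(x)); }
```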


[Bug tree-optimization/63563] [4.9/5 Regression] ICE: in vectorizable_store, at tree-vect-stmts.c:5106 with -mavx2

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63563

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
 CC||jakub at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with my r205856, will have a look.


[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread fei.yang0953 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

Fei Yang  changed:

   What|Removed |Added

 CC||fei.yang0953 at gmail dot com

--- Comment #3 from Fei Yang  ---
(In reply to ktkachov from comment #1)
> Confirmed.
> 
> Feel free to propose a patch for them on gcc-patches along the
> lines you described in:
> https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html

Hi,
  To let you know, we are currently working on this issue.
  We are implementing these with builtins.
  Hopefully, the patch will be posted this week. Thank you.


[Bug c/63600] New: ice in ix86_expand_sse2_abs

2014-10-20 Thread dcb314 at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63600

Bug ID: 63600
   Summary: ice in ix86_expand_sse2_abs
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c
  Assignee: unassigned at gcc dot gnu.org
  Reporter: dcb314 at hotmail dot com

Created attachment 33760
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33760&action=edit
C source code

I just tried to compile the attached code
on gcc trunk dated 20141019 on an AMD x86_64 box.

The compiler said

bug168.c: In function ‘long_unary_op’:
bug168.c:11345:32: internal compiler error: in ix86_expand_sse2_abs, at
config/i386/i386.c:45977
  for (n = 0; n < na; n++) b[n] = (((a[n]) >= 0) ? (a[n]) : -(a[n]));
^
0xf9ff5e ix86_expand_sse2_abs(rtx_def*, rtx_def*)
../../src/trunk/gcc/config/i386/i386.c:45977
0x10c707a gen_absv2di2(rtx_def*, rtx_def*)
../../src/trunk/gcc/config/i386/sse.md:13834
0xb1dc09 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
../../src/trunk/gcc/recog.h:308
0xb1dc09 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
../../src/trunk/gcc/optabs.c:8348
0xb1dc09 expand_unop_direct

Flag -O3 is required. The attached code is the same source
code that was provided for bug #53749.

[Bug c++/63601] New: Segfault on usage of 'this' in unevaluated context inside lambda

2014-10-20 Thread sneves at dei dot uc.pt
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601

Bug ID: 63601
   Summary: Segfault on usage of 'this' in unevaluated context
inside lambda
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: sneves at dei dot uc.pt

The following minimal example results in an 'ICE: Segmentation fault' in g++
4.8.1, 4.9.1, and 5.0.0 20141019:

auto f = []{ sizeof(this); };


[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
More thorough testcase (should be tested with different ISAs):
#define C1 c
#define C2 C1, C1
#define C4 C2, C2
#define C8 C4, C4
#define C16 C8, C8
#define C32 C16, C16
#define C64 C32, C32
#define C_(n) n
#define C(n) C_(C##n)

#define T(t,s) \
typedef t v##t##s __attribute__ ((__vector_size__ (s * sizeof (t)))); \
v##t##s test##t##s (t c) \
{ \
  v##t##s v = { C(s) }; \
  return v; \
}

typedef long long llong;

T(char, 64)
T(char, 32)
T(char, 16)
T(char, 8)
T(short, 32)
T(short, 16)
T(short, 8)
T(short, 4)
T(int, 16)
T(int, 8)
T(int, 4)
T(int, 2)
T(float, 16)
T(float, 8)
T(float, 4)
T(float, 2)
T(llong, 8)
T(llong, 4)
T(llong, 2)
T(double, 8)
T(double, 4)
T(double, 2)

Started with r216401, -mavx512f of course doesn't include the avx512bw
broadcast needed for the V64QI or V32HI duplicates.
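
For reference, this is what one instantiation of the macro, T(int, 4), expands to by hand, assuming the GCC/Clang vector extension. The check_broadcast helper is added here purely for illustration and is not part of the testcase.

```cpp
#include <cassert>

// Hand expansion of T(int, 4): a 4-element int vector type and a function
// that returns a vector with every lane set to c (C(4) expands to c, c, c, c).
typedef int vint4 __attribute__ ((__vector_size__ (4 * sizeof (int))));

vint4 testint4 (int c)
{
  vint4 v = { c, c, c, c };
  return v;
}

// Usage: every lane holds the broadcast value.
inline void check_broadcast (int c)
{
  vint4 v = testint4 (c);
  assert (v[0] == c && v[1] == c && v[2] == c && v[3] == c);
}
```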


[Bug target/63599] "wrong" branch optimization with Ofast in a loop

2014-10-20 Thread vincenzo.innocente at cern dot ch
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

--- Comment #2 from vincenzo Innocente  ---
I agree that the code produces correct results. It looks sub-optimal to me.
I understand that with -Ofast the sequence below will always be executed:

    andps %xmm5, %xmm8
    rcpps %xmm3, %xmm0
    mulps %xmm0, %xmm3
    mulps %xmm0, %xmm3
    addps %xmm0, %xmm0
    subps %xmm3, %xmm0
    mulps %xmm0, %xmm1
    movaps %xmm2, %xmm0
    cmpleps %xmm4, %xmm0
    blendvps %xmm0, %xmm2, %xmm1

while with -O2 it will not,
and this generates a performance penalty for samples where the test is often
false.
(I tried to add __builtin_expect(x, false), with no effect.)
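
As an aside on the sequence quoted above: the rcpps, mulps, mulps, addps, subps run appears to be one Newton-Raphson refinement of a reciprocal estimate, x1 = 2*x0 - d*x0*x0, followed by a multiply that takes the place of the divps seen at -O2. A scalar sketch of that step follows; it assumes an arbitrary rough estimate x0 in place of the hardware rcpps result, and the function names are illustrative only.

```cpp
#include <cassert>
#include <cmath>

// One Newton-Raphson step for 1/d, given a rough estimate x0:
//   x1 = 2*x0 - d*x0*x0
inline float refine_reciprocal(float d, float x0) {
  float dx0 = d * x0;      // mulps %xmm0, %xmm3
  float dx0x0 = dx0 * x0;  // mulps %xmm0, %xmm3
  float two_x0 = x0 + x0;  // addps %xmm0, %xmm0
  return two_x0 - dx0x0;   // subps %xmm3, %xmm0
}

// n / d computed as n * (refined 1/d), as in the final mulps.
inline float divide_via_reciprocal(float n, float d, float x0) {
  assert(d != 0.0f);
  return n * refine_reciprocal(d, x0);
}
```

With d = 3 and a rough estimate of 0.33, one step already brings the estimate much closer to 1/3, which is why a single refinement after rcpps is usually considered enough for fast-math float division.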


[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread venkataramanan.kumar at amd dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

--- Comment #4 from Venkataramanan  ---
(In reply to Fei Yang from comment #3)
> (In reply to ktkachov from comment #1)
> > Confirmed.
> 
> Feel free to propose a patch for them on gcc-patches along the
> > lines you described in:
> https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> 
> Hi,
>   To let you know, we are currently working on this issue.
>   We are implementing these with builtins.
>   Hopefully, the patch will be posted this week. Thank you.


Hi Fei Yang,

Ok, no issues. I will let you do this. Next time, please assign the Bugzilla
item to your name, so that we won't be duplicating the work.


[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread ramana at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

Ramana Radhakrishnan  changed:

   What|Removed |Added

 CC||ramana at gcc dot gnu.org

--- Comment #5 from Ramana Radhakrishnan  ---
(In reply to Venkataramanan from comment #4)
> (In reply to Fei Yang from comment #3)
> > (In reply to ktkachov from comment #1)
> > > Confirmed.
> > 
> > Feel free to propose a patch for them on gcc-patches along the
> > > lines you described in:
> > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > 
> > Hi,
> >   To let you know, we are currently working on this issue.
> >   We are implementing these with builtins.
> >   Hopefully, the patch will be posted this week. Thank you.
> 
> 
> Hi Fei Yang,
> 
> Ok no issues. I will let you do this. But please asign (In reply to Fei Yang
> from comment #3)
> > (In reply to ktkachov from comment #1)
> > > Confirmed.
> > 
> > Feel free to propose a patch for them on gcc-patches along the
> > > lines you described in:
> > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > 
> > Hi,
> >   To let you know, we are currently working on this issue.
> >   We are implementing these with builtins.
> >   Hopefully, the patch will be posted this week. Thank you.
> 
> Ok. Next time please assign the Bugzilla item to your name, so that we wont
> be duplicating the work.


Linaro / Charles Bayliss was already working on this - he had patches out in
September for this.


[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread clyon at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

clyon at gcc dot gnu.org changed:

   What|Removed |Added

 CC||cbaylis at gcc dot gnu.org

--- Comment #6 from clyon at gcc dot gnu.org ---
(In reply to Ramana Radhakrishnan from comment #5)
> (In reply to Venkataramanan from comment #4)
> > (In reply to Fei Yang from comment #3)
> > > (In reply to ktkachov from comment #1)
> > > > Confirmed.
> > > 
> > > Feel free to propose a patch for them on gcc-patches along the
> > > > lines you described in:
> > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > > 
> > > Hi,
> > >   To let you know, we are currently working on this issue.
> > >   We are implementing these with builtins.
> > >   Hopefully, the patch will be posted this week. Thank you.
> > 
> > 
> > Hi Fei Yang,
> > 
> > Ok no issues. I will let you do this. But please asign (In reply to Fei Yang
> > from comment #3)
> > > (In reply to ktkachov from comment #1)
> > > > Confirmed.
> > > 
> > > Feel free to propose a patch for them on gcc-patches along the
> > > > lines you described in:
> > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > > 
> > > Hi,
> > >   To let you know, we are currently working on this issue.
> > >   We are implementing these with builtins.
> > >   Hopefully, the patch will be posted this week. Thank you.
> > 
> > Ok. Next time please assign the Bugzilla item to your name, so that we wont
> > be duplicating the work.
> 
> 
> Linaro / Charles Bayliss was already working on this - he had patches out in
> September for this.

It seems that Charles' patches cover vldX_lane, but not vldX_dup.


[Bug target/63599] "wrong" branch optimization with Ofast in a loop

2014-10-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

--- Comment #3 from Marc Glisse  ---
ifcvt making a transformation that doesn't help vectorization and ends up
pessimizing the code... not really the first time this happens. I believe Jakub
had a big patch for that, but it never got in. Maybe vectors could be
special-cased if we never vectorize them anyway.


[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux

2014-10-20 Thread evstupac at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534

--- Comment #31 from Stupachenko Evgeny  ---
(In reply to Jeffrey A. Law from comment #29)
> I thought we had already dealt with the "hidden" GOT usages that show up
> during reload...  Is it IRA that's removing the SET_GOT?

That is not an EQUIV-related case. SET_GOT is removed by CSE called during IRA.
Here we have an insn that doesn't use the GOT register implicitly:

(insn 37 34 38 6 (set (mem:TF (pre_dec:SI (reg/f:SI 7 sp)) [0  S16 A8]) 
(const_double:TF 2.0769187434139310514121985316880384e+34
[0x0.8p+115])) frexpq.c:1316 121 {*pushtf} 
 (expr_list:REG_ARGS_SIZE (const_int 16 [0x10]) 
(nil)))

It appears that there are no other insns using the GOT, and no calls.
Therefore CSE is absolutely correct in removing SET_GOT.


[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux

2014-10-20 Thread evstupac at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534

--- Comment #32 from Stupachenko Evgeny  ---
(In reply to Iain Sandoe from comment #30)
> FWIW, I built a stage #1 with fortran, objc and ada enabled.
> 
> libgcc, libstdc++v3, libgomp, libobjc and libada build.
> 
> libgfortran & libquadmath fail (errors as per Dominique's post).

We got a Mac and are setting up a GCC build there to be able to reproduce all
the issues and publish a patch fixing the whole bootstrap.


[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #3 from Jakub Jelinek  ---
Created attachment 33761
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33761&action=edit
WIP patch for discussions

From what I see, if TARGET_AVX512BW is not defined, then we obviously can't use
ix86_vector_duplicate_value, but need two instructions (either it can be
QI->V32QI / HI->V16HI broadcast followed by concat of the two parts, or
QI->V16QI / HI->V8HI broadcast followed by concat of the 4 parts together).
But, it seems even for -mavx2 or -mavx we actually generate terrible code,
for -mavx2 there is no point in using 2 instructions when in theory
vpbroadcast{b,w} should handle it alone just fine.
The patch enables all of that, but unfortunately we generate perhaps not so
good code with it, e.g. for -mavx2 in testchar32, we spill the argument always
to memory, and then broadcast it from memory, even when vmovd + broadcast from
register could have been used.
And in testchar16, for some reason we spill into memory, and broadcast from
vmovd result (so the spill is totally useless).
Uros/Kyrill, any thoughts on this?


[Bug c++/63531] gcc segfaults on some sourcefiles when using '-Weffc++' and '-fsanitize=undefined' together

2014-10-20 Thread allizgubccg at reallysoft dot de
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63531

--- Comment #6 from Ralf  ---
(In reply to Marek Polacek from comment #5)
> I meant a GCC build, that contains r215459 fix (for that you'd have to build
> gcc, 5 nor 4.9.2 haven't been released yet).
> 
> But I'm pretty sure this is already fixed.

Yes, I can confirm my problem is fixed in snapshot gcc-4.9-20141015.
Thanks for your help :)


[Bug tree-optimization/63583] [5 Regression] ICF does not check that the template strings are the same

2014-10-20 Thread marxin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63583

--- Comment #2 from marxin at gcc dot gnu.org ---
Author: marxin
Date: Mon Oct 20 10:44:54 2014
New Revision: 216458

URL: https://gcc.gnu.org/viewcvs?rev=216458&root=gcc&view=rev
Log:
PR ipa/63583

* ipa-icf-gimple.c (func_checker::compare_gimple_asm):
Gimple template string is compared.

* gcc.dg/ipa/pr63595.c: New test.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/ipa-icf-gimple.c
trunk/gcc/testsuite/ChangeLog


[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf

2014-10-20 Thread fei.yang0953 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

--- Comment #7 from Fei Yang  ---
(In reply to clyon from comment #6)
> (In reply to Ramana Radhakrishnan from comment #5)
> > (In reply to Venkataramanan from comment #4)
> > > (In reply to Fei Yang from comment #3)
> > > > (In reply to ktkachov from comment #1)
> > > > > Confirmed.
> > > > > 
> > > > > Feel free to propose a patch for them on gcc-patches along the
> > > > > lines you described in:
> > > > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > > > 
> > > > Hi,
> > > >   To let you know, we are currently working on this issue.
> > > >   We are implementing these with builtins.
> > > >   Hopefully, the patch will be posted this week. Thank you.
> > > 
> > > 
> > > Hi Fei Yang,
> > > 
> > > Ok no issues. I will let you do this. But please asign (In reply to Fei Yang
> > > from comment #3)
> > > > (In reply to ktkachov from comment #1)
> > > > > Confirmed.
> > > > > 
> > > > > Feel free to propose a patch for them on gcc-patches along the
> > > > > lines you described in:
> > > > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > > > 
> > > > Hi,
> > > >   To let you know, we are currently working on this issue.
> > > >   We are implementing these with builtins.
> > > >   Hopefully, the patch will be posted this week. Thank you.
> > > 
> > > Ok. Next time please assign the Bugzilla item to your name, so that we wont
> > > be duplicating the work.
> > 
> > 
> > Linaro / Charles Bayliss was already working on this - he had patches out in
> > September for this.
> 
> It seems that Charles' patches cover vldX_lane, but not vldX_dup.

Hi Ramana,
  Do you mean this link:
https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00678.html


[Bug lto/61192] Conflict between register and function name for lto on sparc

2014-10-20 Thread i.palachev at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192

Ilya Palachev  changed:

   What|Removed |Added

 CC||i.palachev at samsung dot com

--- Comment #2 from Ilya Palachev  ---
(In reply to Daniel Cederman from comment #0)
> when using lto on sparc.

Daniel, can you also provide the original source code (not preprocessed)? It
would be interesting to see whether this error can be reproduced on other
architectures.


[Bug tree-optimization/63602] New: Wrong code w/ -O2 -ftree-loop-linear

2014-10-20 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63602

Bug ID: 63602
   Summary: Wrong code w/ -O2 -ftree-loop-linear
   Product: gcc
   Version: unknown
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: asolokha at gmx dot com

gcc produces wrong code w/ -ftree-loop-linear -O2 (and above) for the following
reduced case:

int sx;
int bn;
int vz = 1;
int *volatile n6 = &bn;

int
main(void)
{
  for (int i = 0; i < 3; ++i) {
    sx = vz;
    vz = bn;
  }
  return sx;
}

It struck me first w/ gcc-4.10.0-alpha20140810, but today I've reproduced it w/
4.8.3, 4.9.1 and 5-alpha20141019, so I'm not marking it as a regression.

Expected results:
% gcc-5.0_alpha20141019 -O2 -o good 963b8772.c
% ./good
% echo $?
0

Actual results:
% gcc-5.0_alpha20141019 -O2 -ftree-loop-linear -o bad 963b8772.c
% ./bad
% echo $?
1
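
To make the expected exit status explicit, here is a hand-traced model of the loop from the reduced case, with the globals replaced by locals (bn is never written in the testcase, so it stays 0). The function name is made up for illustration.

```cpp
#include <cassert>

// Model of the loop in the reduced case: returns the final value of sx.
inline int expected_exit_status() {
  int sx = 0, bn = 0, vz = 1;
  for (int i = 0; i < 3; ++i) {
    sx = vz;  // i=0: sx=1; i=1: sx=0; i=2: sx=0
    vz = bn;  // vz is 0 from the first iteration onwards
  }
  return sx;
}

// Usage: a correct compilation should exit with status 0.
inline void check_expected() { assert(expected_exit_status() == 0); }
```

An exit status of 1 is what the first iteration alone would leave in sx, which suggests that the later stores to sx are being lost or reordered with -ftree-loop-linear.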


[Bug lto/61052] g++ generated code segfaults when using LTO together with "extern template", non-LTO compiled files, and gold linker

2014-10-20 Thread i.palachev at samsung dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61052

Ilya Palachev  changed:

   What|Removed |Added

 CC||i.palachev at samsung dot com

--- Comment #1 from Ilya Palachev  ---
Hi,
I can see another error for the attached testcase.

$ gcc -c -O2 -flto a.cc
$ gcc -c -O2 -flto b.cc
$ gcc -c -Os e.cc
$ gcc -o a -fuse-ld=gold a.o e.o b.o

/usr/local/bin/ld.gold: -plugin: unknown option
/usr/local/bin/ld.gold: use the --help option for usage information
collect2: error: ld returned 1 exit status

$ ld -v
GNU gold (GNU Binutils 2.24.51.20141003) 1.11

It seems that this error is related to the option "-fuse-ld", since the error
disappears if this option is not specified.


[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
 CC||jakub at gcc dot gnu.org,
   ||jason at gcc dot gnu.org
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
This used to be rejected until r196550 where it started to ICE.


[Bug debug/60655] [4.9 Regression] ICE: output_operand: invalid expression as operand

2014-10-20 Thread amodra at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60655

--- Comment #22 from Alan Modra  ---
Author: amodra
Date: Mon Oct 20 11:54:22 2014
New Revision: 216462

URL: https://gcc.gnu.org/viewcvs?rev=216462&root=gcc&view=rev
Log:
PR debug/60655
* simplify-rtx.c (simplify_plus_minus): Delete unused "input_ops".
Increase "ops" array size.  Correct array size tests.  Init
n_constants in loop.  Break out of innermost loop when finding
a trivial CONST expression.


Modified:
trunk/gcc/ChangeLog
trunk/gcc/simplify-rtx.c


[Bug target/63600] [5 Regression] ice in ix86_expand_sse2_abs

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63600

Jakub Jelinek  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
 CC||jakub at gcc dot gnu.org
   Target Milestone|--- |5.0
Summary|ice in ix86_expand_sse2_abs |[5 Regression] ice in
   ||ix86_expand_sse2_abs
 Ever confirmed|0   |1

--- Comment #1 from Jakub Jelinek  ---
Started with r216255.  Reduced testcase for -O3:
long *a, b;
int c;
void
foo (void)
{
  for (c = 0; c < 64; c++)
    a[c] = b >= 0 ? b : -b;
}


[Bug c++/63531] gcc segfaults on some sourcefiles when using '-Weffc++' and '-fsanitize=undefined' together

2014-10-20 Thread mpolacek at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63531

Marek Polacek  changed:

   What|Removed |Added

 Status|WAITING |RESOLVED
 Resolution|--- |FIXED

--- Comment #7 from Marek Polacek  ---
.


[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f

2014-10-20 Thread kyukhin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #4 from Kirill Yukhin  ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 33761 [details]
> WIP patch for discussions
> 
> From what I see, if TARGET_AVX512BW is not defined, then we obviously can't
> use
> ix86_vector_duplicate_value, but need two instructions (either it can be
> QI->V32QI / HI->V16HI broadcast followed by concat of the two parts, or
> QI->V16QI / HI->V8HI broadcast followed by concat of the 4 parts together).
> But, it seems even for -mavx2 or -mavx we actually generate terrible code,
> for -mavx2 there is no point in using 2 instructions when in theory
> vpbroadcast{b,w} should handle it alone just fine.
Right!

> The patch enables all of that, but unfortunately we generate perhaps not so
> good code with it, e.g. for -mavx2 in testchar32, we spill the argument
> always to memory, and then broadcast it from memory, even when vmovd +
> broadcast from register could have been used.
> And in testchar16, for some reason we spill into memory, and broadcast from
> vmovd result (so the spill is totally useless).
I think this is because of subreg:QI of reg:SI.
Before reload we have (for testchar32):
(insn 2 5 3 2 (set (reg:SI 86 [ c ])
(reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal}
 (expr_list:REG_DEAD (reg:SI 5 di [ c ])
(nil)))
(insn 7 4 12 2 (set (reg:V32QI 88 [ v ])
(vec_duplicate:V32QI (subreg:QI (reg:SI 86 [ c ]) 0))) 1.c:22 4112
{vec_dupv32qi}
 (expr_list:REG_DEAD (reg:SI 86 [ c ])
(nil)))

After reload we need to get rid of the subreg:
(insn 2 5 3 2 (set (mem/c:SI (plus:DI (reg/f:DI 6 bp)
(const_int -20 [0xffec])) [8 %sfp+-4 S4 A32])
(reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal}
 (nil))
(insn 7 4 12 2 (set (reg:V32QI 21 xmm0 [orig:88 v ] [88])
(vec_duplicate:V32QI (mem/c:QI (plus:DI (reg/f:DI 6 bp)
(const_int -20 [0xffec])) [8 %sfp+-4 S1 A32])))
1.c:22 4112 {vec_dupv32qi}
 (nil))

> Uros/Kyrill, any thoughts on this?
I like the patch.


[Bug ipa/63598] [5.0 Regression] ICE: in ipa_merge_profiles at ipa-utils.c:396

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63598

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.0


[Bug target/63596] Saving of GPR/FPRs for stdarg even though the variable argument is not used

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63596

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed.


[Bug tree-optimization/63595] Segmentation faults inside kernel

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63595

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |WAITING
   Last reconfirmed||2014-10-20
 Ever confirmed|0   |1


[Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.0


[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.0


[Bug tree-optimization/63583] [5 Regression] ICF does not check that the template strings are the same

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63583

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |FIXED

--- Comment #3 from Richard Biener  ---
Fixed.


[Bug c++/63582] [5 Regression]: g++.dg/init/enum1.C ... (test for errors, line 12)

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63582

Richard Biener  changed:

   What|Removed |Added

 Target|cris-axis-elf   |cris-axis-elf,
   ||i?86-linux-gnu
   Target Milestone|--- |5.0

--- Comment #2 from Richard Biener  ---
Also fails on x86_64-linux with -m32.


[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf

2014-10-20 Thread ktkachov at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588

ktkachov at gcc dot gnu.org changed:

   What|Removed |Added

 CC||ktkachov at gcc dot gnu.org

--- Comment #1 from ktkachov at gcc dot gnu.org ---
So is there a reduced testcase for this?


[Bug ipa/63580] [5 Regression] ICE : error: invalid argument to gimple call

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63580

Richard Biener  changed:

   What|Removed |Added

 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
You forgot to mark p1 addressable in the alias decl (that is, copy
TREE_ADDRESSABLE).


[Bug rtl-optimization/63577] [4.8/4.9/5? Regression]: Huge compile time and memory usage with -O and not -fPIC

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Richard Biener  changed:

   What|Removed |Added

 Target||x86_64-*-*
 Status|UNCONFIRMED |NEW
   Last reconfirmed||2014-10-20
  Component|fortran |rtl-optimization
Version|unknown |4.9.1
 Blocks||47344
   Target Milestone|--- |4.8.4
 Ever confirmed|0   |1

--- Comment #1 from Richard Biener  ---
Confirmed with 4.9:

 combiner:  31.14 (79%) usr   0.46 (74%) sys  31.65 (78%) wall 1029289 kB (96%) ggc
 TOTAL   :  39.48             0.62            40.77            1071504 kB


[Bug ipa/63576] [5 Regression] ICE : in ipa_merge_profiles, at ipa-utils.c:540 during Firefox LTO/PGO build

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63576

Richard Biener  changed:

   What|Removed |Added

   Target Milestone|--- |5.0


[Bug rtl-optimization/63577] [4.8/4.9/5 Regression]: Huge compile time and memory usage with -O and not -fPIC

2014-10-20 Thread rguenth at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Richard Biener  changed:

   What|Removed |Added

Summary|[4.8/4.9/5? Regression]:|[4.8/4.9/5 Regression]:
   |Huge compile time and   |Huge compile time and
   |memory usage with -O and|memory usage with -O and
   |not -fPIC   |not -fPIC

--- Comment #2 from Richard Biener  ---
--param max-combine-insns=2 helps a bit compile-time wise but not fully
memory-usage-wise (I suppose log-links are expensive and of course still set
up).
Only available on trunk, of course.


[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf

2014-10-20 Thread doko at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588

--- Comment #2 from Matthias Klose  ---
yes, see above.


[Bug tree-optimization/54488] tree loop invariant motion uses an excessive amount of memory

2014-10-20 Thread evgeniya.maenkova at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488

--- Comment #7 from Evgeniya Maenkova  ---
I see only 317 MB reported by top.


[Bug lto/63603] New: [4.9/5 Regression] Linking with -fno-lto still invokes LTO

2014-10-20 Thread burnus at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63603

Bug ID: 63603
   Summary: [4.9/5 Regression] Linking with -fno-lto still invokes
LTO
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: burnus at gcc dot gnu.org

Running

  echo "int main() {return 0;}" >  foo.c
  gcc -flto -ffat-lto-objects -c foo.c
  gcc -v -fno-lto foo.o 2>&1|grep lto1

shows that the -fno-lto is ignored for linking as lto1 is always invoked with
GCC 4.9 and 5.

Using GCC 4.8, LTO is not automatically invoked for linking but has to be
passed manually. Hence, it works there.


[Bug tree-optimization/63563] [4.9/5 Regression] ICE: in vectorizable_store, at tree-vect-stmts.c:5106 with -mavx2

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63563

Jakub Jelinek  changed:

   What|Removed |Added

 Status|NEW |ASSIGNED
   Assignee|unassigned at gcc dot gnu.org  |jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek  ---
Created attachment 33762
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33762&action=edit
gcc5-pr63563.patch

Untested fix.


[Bug lto/61192] Conflict between register and function name for lto on sparc

2014-10-20 Thread cederman at gaisler dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192

--- Comment #3 from Daniel Cederman  ---
(In reply to Ilya Palachev from comment #2)
> (In reply to Daniel Cederman from comment #0)
> > when using lto on sparc.
> 
> Daniel, can you also provide original source code (not preprocessed)? It's
> interesting whether this error can be reproduced on other architectures.

I used creduce on the source code and this code triggers the error:

register int _SPARC_Per_CPU_current __asm__("g6");
int __getreent___trans_tmp_1;

__getreent() {
  int cpu_self = _SPARC_Per_CPU_current;
  __getreent___trans_tmp_1 = cpu_self;
}

g6() {}

I compiled with the same compiler as before; I have not tried a newer
version of gcc.


[Bug lto/63603] [4.9/5 Regression] Linking with -fno-lto still invokes LTO

2014-10-20 Thread burnus at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63603

Tobias Burnus  changed:

   What|Removed |Added

   Target Milestone|--- |4.9.2


[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap

2014-10-20 Thread iverbin at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307

--- Comment #6 from iverbin at gcc dot gnu.org ---
Author: iverbin
Date: Mon Oct 20 15:22:09 2014
New Revision: 216483

URL: https://gcc.gnu.org/viewcvs?rev=216483&root=gcc&view=rev
Log:
PR c/63307
gcc/c-family/
* cilk.c: Include vec.h.
(struct cilk_decls): New structure.
(wrapper_parm_cb): Split this function to...
(fill_decls_vec): ...this...
(create_parm_list): ...and this.
(compare_decls): New function.
(for_local_cb): Remove.
(wrapper_local_cb): Ditto.
(build_wrapper_type): For now first traverse and fill vector of
declarations then sort it and then deal with sorted vector.
(cilk_outline): Ditto.
(declare_one_free_variable): Ditto.

Modified:
trunk/gcc/c-family/ChangeLog
trunk/gcc/c-family/cilk.c


[Bug fortran/63553] [OOP] Wrong code when assigning a CLASS to a TYPE

2014-10-20 Thread patnel97269-gfortran at yahoo dot fr
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63553

--- Comment #5 from patnel97269-gfortran at yahoo dot fr ---
Thanks for the patch. 

Another similar case, in which the type contains an allocatable field,
produces an internal compiler error (without the patch applied):

 internal compiler error: in fold_convert_loc, at fold-const.c:2112


program toto
implicit none

type mother
integer :: i
double precision,dimension(:),allocatable :: values
end type mother


class(mother),allocatable :: cm,cm2

allocate(cm)
allocate(cm%values(10))
cm%i=3
cm%values=80d0
allocate(cm2)
select type(cm2)
type is (mother)
cm2=cm
end select
print *,cm2%i,cm2%values
end program


[Bug libquadmath/55821] Release tarballs (unconditionally) install libquadmath.info when libquadmath is not supported

2014-10-20 Thread sandra at codesourcery dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55821

--- Comment #9 from Sandra Loosemore  ---
Yes, that patch (with regenerated Makefile.in) did the trick.  Thanks.

config.log says my configure line is:

$ /scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath/configure
--srcdir=/scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath
--cache-file=./config.cache --enable-multilib
--with-cross-host=i686-pc-linux-gnu --enable-threads --disable-libmudflap
--disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --enable-shared
--enable-lto --enable-symvers=gnu --enable-__cxa_atexit
--with-glibc-version=2.19 --disable-nls
--prefix=/scratch/sandra/arm-fsf/install
--with-sysroot=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/libc
--with-host-libstdcxx=-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm
--enable-libgomp --enable-libitm --enable-libatomic --disable-libssp
--enable-poison-system-directories
--with-build-time-tools=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/bin
--enable-languages=c,c++,fortran,lto
--program-transform-name=s&^&arm-none-linux-gnueabi-&
--disable-option-checking --with-target-subdir=arm-none-linux-gnueabi
--build=i686-pc-linux-gnu --host=arm-none-linux-gnueabi
--target=arm-none-linux-gnueabi


[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #5 from Jakub Jelinek  ---
Better testcase that tests both broadcasts from GPRs and broadcasts from
memory:

#define C1 c
#define C2 C1, C1
#define C4 C2, C2
#define C8 C4, C4
#define C16 C8, C8
#define C32 C16, C16
#define C64 C32, C32
#define C_(n) n
#define C(n) C_(C##n)

#define T(t,s) \
typedef t v##t##s __attribute__ ((__vector_size__ (s * sizeof (t))));\
v##t##s test##t##s (t c)\
{\
  v##t##s v = { C(s) };\
  return v;\
}\
v##t##s test2##t##s (t *p)\
{\
  t c = *p;\
  v##t##s v = { C(s) };\
  return v;\
}

typedef long long llong;

T(char, 64)
T(char, 32)
T(char, 16)
T(char, 8)
T(short, 32)
T(short, 16)
T(short, 8)
T(short, 4)
T(int, 16)
T(int, 8)
T(int, 4)
T(int, 2)
T(float, 16)
T(float, 8)
T(float, 4)
T(float, 2)
T(llong, 8)
T(llong, 4)
T(llong, 2)
T(double, 8)
T(double, 4)
T(double, 2)
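
To make the macro easier to read, here is a hand-expanded sketch of what
T(int, 4) would produce, written out as plain functions.  It assumes the GCC
vector extension; the helper all_lanes_equal is hypothetical and is not part
of the testcase.

```cpp
#include <cassert>

// Hand expansion of T(int, 4): a 4-lane int vector type.
typedef int vint4 __attribute__ ((__vector_size__ (4 * sizeof (int))));

// Broadcast from a GPR argument: every lane is initialized with c.
vint4 testint4 (int c)
{
  vint4 v = { c, c, c, c };
  return v;
}

// Broadcast from memory: the scalar is loaded through a pointer first.
vint4 test2int4 (int *p)
{
  int c = *p;
  vint4 v = { c, c, c, c };
  return v;
}

// Hypothetical helper (not in the testcase): true if all four lanes equal c.
bool all_lanes_equal (vint4 v, int c)
{
  for (int i = 0; i < 4; ++i)
    if (v[i] != c)
      return false;
  return true;
}
```

Each T(t, s) line in the testcase yields one such pair of functions, so the
generated assembly can be inspected per element type and vector width.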


[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #6 from Jakub Jelinek  ---
Created attachment 33763
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33763&action=edit
gcc5-pr63594-wip2.patch

Updated WIP patch, which attempts to generate better code using inter-unit
moves but also offers memory as an alternative, so it allows RA to choose what
is best.  This still generates imperfect code for V2DI/V4DI loads from GPRs
without -mavx512f (but e.g. vec_concatv2di uses the Yi constraint).
Also, for AVX512-{F,BW,VL}, I'm surprised that the broadcasts from GPRs are done
as different instructions from broadcasts from memory or a vector reg; I would
have thought that would be done using a single insn with alternatives.


[Bug target/63599] "wrong" branch optimization with Ofast in a loop

2014-10-20 Thread jakub at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

Jakub Jelinek  changed:

   What|Removed |Added

 CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek  ---
The big patch was committed, but turning off tree if-conversion in general
did not turn out to be a win.  What ended up being committed is narrower: only
if there are any masked loads/stores does if-conversion apply solely to the
vectorized loop and nothing else.


[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda

2014-10-20 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601

--- Comment #2 from Jason Merrill  ---
Author: jason
Date: Mon Oct 20 17:29:02 2014
New Revision: 216488

URL: https://gcc.gnu.org/viewcvs?rev=216488&root=gcc&view=rev
Log:
PR c++/63601
* lambda.c (current_nonlambda_function): New.
* semantics.c (finish_this_expr): Use it.
* cp-tree.h: Declare it.

Added:
trunk/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this20.C
Modified:
trunk/gcc/cp/ChangeLog
trunk/gcc/cp/cp-tree.h
trunk/gcc/cp/lambda.c
trunk/gcc/cp/semantics.c


[Bug rtl-optimization/63577] [4.8/4.9/5 Regression]: Huge compile time and memory usage with -O and not -fPIC

2014-10-20 Thread segher at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Segher Boessenkool  changed:

   What|Removed |Added

 CC||segher at gcc dot gnu.org

--- Comment #3 from Segher Boessenkool  ---
The LOG_LINKS take up only a few hundred kB, tops; the gigantic memory
use comes from all the garbage RTL produced for all the failed combine
attempts.


[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda

2014-10-20 Thread jason at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601

Jason Merrill  changed:

   What|Removed |Added

 Status|NEW |RESOLVED
 Resolution|--- |FIXED
   Assignee|unassigned at gcc dot gnu.org  |jason at gcc dot gnu.org
   Target Milestone|--- |5.0

--- Comment #3 from Jason Merrill  ---
Fixed.


[Bug c++/63604] New: [C++11] A direct-initialization of a reference should use explicit conversion functions

2014-10-20 Thread kariya_mitsuru at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63604

Bug ID: 63604
   Summary: [C++11] A direct-initialization of a reference should
use explicit conversion functions
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: kariya_mitsuru at hotmail dot com

The sample code below should compile successfully, but gcc rejects it with a
compilation error.

==
struct T {};

struct S {
explicit operator T() { return T(); }
};

int main()
{
S s;
T&& t(s);
(void) t;
}
==
cf. http://melpon.org/wandbox/permlink/LHgajpAXzqTbpYDc

An initialization of a reference in a direct-initialization context should use
an explicit conversion function that converts to a class prvalue.


The latest C++ standard (n4140) 13.3.1.6 [over.match.ref]/p.1.1 says that

The conversion functions of S and its base classes are considered.  Those
non-explicit conversion functions that are not hidden within S and yield type
“lvalue reference to cv2 T2” (when initializing an lvalue reference or an
rvalue reference to function) or “cv2 T2” or “rvalue reference to cv2 T2” (when
initializing an rvalue reference or an lvalue reference to function), where
“cv1 T” is reference-compatible (8.5.3) with “cv2 T2”, are candidate functions.
 For direct-initialization, those explicit conversion functions that are not
hidden within S and yield type “lvalue reference to cv2 T2” or “cv2 T2” or
“rvalue reference to cv2 T2”, respectively, where T2 is same type as T or can
be converted to type T with a qualification conversion (4.4), are also
candidate functions.


I think that this sample code corresponds to the case “For
direct-initialization, ...”.


Note that this sample code is compiled successfully if the conversion function
returns an rvalue reference.
(cf. http://melpon.org/wandbox/permlink/kGpALX7zvzHzi7K5)
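
For illustration, a minimal sketch of that accepted variant follows.  It
assumes the conversion function returns a reference to a member object (the
member t and the helper binds_to_member are hypothetical; the linked sample
may be written differently).

```cpp
#include <cassert>

struct T {};

struct S {
    T t;  // hypothetical member, so the returned reference has a live target
    // Explicit conversion function yielding an rvalue reference.
    explicit operator T&&() { return static_cast<T&&>(t); }
};

// Returns true if the reference was bound directly to s.t (no temporary).
bool binds_to_member()
{
    S s;
    T&& r(s);   // direct-initialization: the explicit conversion is a candidate
    return &r == &s.t;
}
```

Because the reference binds directly to the result of the conversion
function, its address compares equal to that of the member.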

See also BUG 48453.

[Bug tree-optimization/63605] New: wrong code at -O3 on x86_64-linux-gnu

2014-10-20 Thread su at cs dot ucdavis.edu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63605

Bug ID: 63605
   Summary: wrong code at -O3 on x86_64-linux-gnu
   Product: gcc
   Version: 5.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: tree-optimization
  Assignee: unassigned at gcc dot gnu.org
  Reporter: su at cs dot ucdavis.edu

The current gcc trunk (as well as 4.8.x and 4.9.x) miscompiles the following
code on x86_64-linux at -O3 in both 32-bit and 64-bit modes.  

This is a regression from 4.7.x. 

The miscompilation seems to be caused by the tree vectorizer as
-fno-tree-vectorize makes it disappear. 

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk
--enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 5.0.0 20141018 (experimental) [trunk revision 216429] (GCC) 

$ gcc-trunk -O2 small.c; a.out
1
$ gcc-4.7 -O3 small.c; a.out
1
$ 
$ gcc-trunk -O3 small.c; a.out
0
$ 





int printf (const char *, ...);

int a, b[8] = { 2, 0, 0, 0, 0, 0, 0, 0 }, c[8];

int
main ()
{
  int d;
  for (; a < 8; a++)
    {
      d = b[a] >> 1;
      c[a] = d != 0;
    }
  printf ("%d\n", c[0]);
  return 0;
}
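
The expected output can be checked by hand: b[0] is 2, 2 >> 1 is 1, and
1 != 0 evaluates to 1, so c[0] must be 1.  A small scalar reference sketch of
the same loop is below; passing the arrays as parameters and the names
reference_loop and expected_c0 are assumptions made for the sketch, not part
of small.c.

```cpp
#include <cassert>

// Scalar reference for the loop in small.c, with the arrays passed in
// instead of being globals.
void reference_loop (const int *b, int *c, int n)
{
  for (int a = 0; a < n; a++)
    {
      int d = b[a] >> 1;
      c[a] = d != 0;
    }
}

// Runs the reference loop on the same input data as the testcase.
int expected_c0 ()
{
  int b[8] = { 2, 0, 0, 0, 0, 0, 0, 0 };
  int c[8] = { 0 };
  reference_loop (b, c, 8);
  return c[0];
}
```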


[Bug c++/57610] Reference initialized with temporary instead of sub-object of conversion result

2014-10-20 Thread kariya_mitsuru at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57610

Mitsuru Kariya  changed:

   What|Removed |Added

 CC||kariya_mitsuru at hotmail dot com

--- Comment #10 from Mitsuru Kariya  ---
Each status of the issues mentioned above is

CWG 1287: DRWP
CWG 1604: DR
CWG 1650: NAD

And, gcc HEAD (5.0.0) does not cause the slicing problem.

cf. 5.0.0 http://melpon.org/wandbox/permlink/xQQq1n98s7blSz8x
cf. 4.9.1 http://melpon.org/wandbox/permlink/l69tDXdptf1WVdAT

Note that these are compiled with the option "-fno-elide-constructors".

(Sorry, I don't know whether this issue should be "RESOLVED FIXED" or not,
however.)


[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap

2014-10-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307

--- Comment #7 from Andrew Pinski  ---
(In reply to iverbin from comment #6)
> Author: iverbin
> Date: Mon Oct 20 15:22:09 2014
> New Revision: 216483

This breaks the build as wd->decl_map will always contain a BLOCK which does
not have a UID.

Please revert it; it was evidently not tested, since a simple bootstrap
(with checking enabled, which is the default on trunk) would have found this
issue.


[Bug c++/63606] New: Missing a warning for binding a reference member to a stack allocated parameter

2014-10-20 Thread bcmpinc at hotmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63606

Bug ID: 63606
   Summary: Missing a warning for binding a reference member to a
stack allocated parameter
   Product: gcc
   Version: 4.8.2
Status: UNCONFIRMED
  Severity: enhancement
  Priority: P3
 Component: c++
  Assignee: unassigned at gcc dot gnu.org
  Reporter: bcmpinc at hotmail dot com

The code below should produce a warning, as it binds a stack allocated
parameter to a reference member. However, gcc currently does not produce such a
warning.

The code is error prone as it will always result in a dangling reference: the
object being pointed to is destructed when the constructor returns. Similar
buggy code can accidentally be written when one forgets to insert the '&' to
pass-by-reference. Note that the clang compiler does emit a warning, named
-Wdangling-field, for the code below.


struct Bar {
  int a;
};
struct Foo{
  Foo(Bar arg) : bar(arg) {}
  Bar & bar;
};
int main() {
  Bar k;
  Foo oops(k);
  return 0;
}
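
For contrast, a sketch of what the code presumably intended: with the '&'
in place, the member binds to the caller's object rather than to the
by-value parameter.  The helper refers_to_caller_object is hypothetical and
only there to make the difference observable.

```cpp
#include <cassert>

struct Bar {
  int a;
};

struct Foo {
  // Pass-by-reference: bar refers to the caller's object, which outlives
  // the constructor call.
  Foo(Bar &arg) : bar(arg) {}
  Bar &bar;
};

// Returns true if the reference member points at the caller's object.
bool refers_to_caller_object()
{
  Bar k = { 42 };
  Foo ok(k);
  return &ok.bar == &k && ok.bar.a == 42;
}
```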


[Bug c++/63181] GCC should warn about "obvious" bugs in binding a reference to temporary

2014-10-20 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63181

Jonathan Wakely  changed:

   What|Removed |Added

 CC||bcmpinc at hotmail dot com

--- Comment #3 from Jonathan Wakely  ---
*** Bug 63606 has been marked as a duplicate of this bug. ***


[Bug c++/63606] Missing a warning for binding a reference member to a stack allocated parameter

2014-10-20 Thread redi at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63606

Jonathan Wakely  changed:

   What|Removed |Added

 Status|UNCONFIRMED |RESOLVED
 Resolution|--- |DUPLICATE

--- Comment #1 from Jonathan Wakely  ---
dup

*** This bug has been marked as a duplicate of bug 63181 ***


[Bug c++/63582] [5 Regression]: g++.dg/init/enum1.C ... (test for errors, line 12)

2014-10-20 Thread dj at redhat dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63582

DJ Delorie  changed:

   What|Removed |Added

 CC||dj at redhat dot com
   Assignee|unassigned at gcc dot gnu.org  |dj at redhat dot com

--- Comment #3 from DJ Delorie  ---
Created attachment 33764
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33764&action=edit
proposed patch

As there are places in the code that scan all of integer_type_kind[]
without regard for whether those types are allowed or not, decline to
create said types in the first place if they're not enabled.

Unable to test at the moment due to PR 63307.


[Bug tree-optimization/63602] Wrong code w/ -O2 -ftree-loop-linear

2014-10-20 Thread asolokha at gmx dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63602

--- Comment #1 from Arseny Solokha  ---
It seems I've reduced the snippet too far. However, whether the global
variables are declared static or not doesn't change anything.


[Bug regression/61538] gcc after commit 39a8c5ea produces bad code for MIPS R1x000 CPUs

2014-10-20 Thread pinskia at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538

--- Comment #21 from Andrew Pinski  ---
(In reply to Joshua Kinard from comment #20)
> Created attachment 33166 [details]
> Disassembly of the ASM from 'sln' compiled by a non-working gcc-4.8.0.
> 
> This is the objdump disassembly of the '__lll_lock_wait_private()' function
> from the sln binary from glibc, statically compiled, by a BAD gcc-4.8.0
> checkout (7882e02e) no previous commits reversed.  This sln copy will hang
> trying to print usage instructions.

Do you have the preprocessed source for this?


[Bug regression/61538] gcc after commit 39a8c5ea produces bad code for MIPS R1x000 CPUs

2014-10-20 Thread kumba at gentoo dot org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538

--- Comment #22 from Joshua Kinard  ---
(In reply to Andrew Pinski from comment #21)
> (In reply to Joshua Kinard from comment #20)
> > Created attachment 33166 [details]
> > Disassembly of the ASM from 'sln' compiled by a non-working gcc-4.8.0.
> > 
> > This is the objdump disassembly of the '__lll_lock_wait_private()' function
> > from the sln binary from glibc, statically compiled, by a BAD gcc-4.8.0
> > checkout (7882e02e) no previous commits reversed.  This sln copy will hang
> > trying to print usage instructions.
> 
> Do you have the preprocessed source for this?

Not currently.  I'd have to intercept a glibc build and grab the compile string
for sln.c and use that to create the preprocessed source.  I'll see if I can
start a run tonight or tomorrow for this.

That said, I have worked out that it's got something to do with gcc's built-in
atomics added for 4.8.  In glibc's sysdeps/mips/bits/atomic.h, there are
conditional macros that pick whether to use the old __sync_* builtins if
gcc-4.7 and earlier, or the new __atomic_* builtins in gcc-4.8 or later.  This
is why there is a difference between the output assembler between the 4.7 and
4.8 sln files.

Under gcc-4.7, atomic_exchange_acq falls back to __sync_lock_test_and_set,
which is an acquire memmodel operation, and this works fine on an R14000
processor.  It's under gcc-4.8+, whatever atomic_exchange_acquire() maps to
there, that hangs up on the processor.  I checked the kernel side, and the
futex is getting lost in freezable_schedule() in include/linux/freezer.h.  I
haven't traced beyond that point yet.  The futex will exit the scheduler when
you ctrl+c it.

If you delete or comment out the gcc-4.8 defines for the atomic ops in
sysdeps/mips/bits/atomic.h in glibc to force it back to the older __sync_* ops,
it'll build with 4.8+ and the resulting sln WILL work.  So it's definitely a
gcc issue.  I got a hold of Maxim Kuvyrkov regarding commit 39a8c5ea, but I
haven't heard back from him since early September, despite sending two
follow-up e-mails.
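
As a rough sketch of the two builtin families being compared: the first
function uses the older __sync_* builtin named above, the second uses an
__atomic_* builtin with an explicit acquire memory model.  These wrappers are
hypothetical and are not glibc's macros; in particular, that the newer path
ends up as __atomic_exchange_n with __ATOMIC_ACQUIRE is an assumption, since
the actual mapping is not stated here.

```cpp
#include <cassert>

// Older builtin: atomic exchange documented by GCC as an acquire barrier.
int exchange_sync (int *mem, int value)
{
  return __sync_lock_test_and_set (mem, value);
}

// Newer builtin with an explicit memory model argument (assumed mapping).
int exchange_atomic (int *mem, int value)
{
  return __atomic_exchange_n (mem, value, __ATOMIC_ACQUIRE);
}
```

Both return the previous contents of *mem, so on a working target they
should be interchangeable for a simple lock word; comparing the assembly
generated for each on MIPS may help narrow down where the code differs.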


[Bug lto/63607] New: run fail with -flto -mfloat-abi=softfp for armeb-linux-gnueabi-gcc

2014-10-20 Thread fei.yang0953 at gmail dot com
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63607

Bug ID: 63607
   Summary: run fail with -flto -mfloat-abi=softfp for
armeb-linux-gnueabi-gcc
   Product: gcc
   Version: 4.7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: lto
  Assignee: unassigned at gcc dot gnu.org
  Reporter: fei.yang0953 at gmail dot com

testsuite/gcc.dg/torture/stackalign/builtin-apply-4.c:

/* PR tree-optimization/20076 */
/* { dg-do run } */

extern void abort (void);

double
foo (int arg)
{
  if (arg != 116)
    abort();
  return arg + 1;
}

inline double
bar (int arg)
{
  foo (arg);
  __builtin_return (__builtin_apply ((void (*) ()) foo,
 __builtin_apply_args (), 16));
}

int
main (int argc, char **argv)
{
  if (bar (116) != 117.0)
    abort ();

  return 0;
}

Compile option: armeb-linux-gnueabi-gcc builtin-apply-4.c -static
-mfloat-abi=softfp -flto 

Disassembly:
076c :
76c:  e92d4800 push   {fp, lr}
770:  e28db004 add    fp, sp, #4
774:  e3a02113 mov    r2, #-1073741820 ; 0xc004
778:  e30a3aaa movw   r3, #43690   ; 0x
77c:  e34a3aaa movt   r3, #43690   ; 0x
780:  e5823000 str    r3, [r2]
784:  e3a00074 mov    r0, #116 ; 0x74
788:  ebaa bl 638
78c:  eeb06b40 vmov.f64 d6, d0
790:  ed9f7b0a vldr   d7, [pc, #40] ; 7c0
794:  eeb46b47 vcmp.f64 d6, d7

Analysis: Return value is not passed correctly. As we can see from line 790,
main gets the return value from d0 register, which is wrong as we use
-mfloat-abi=softfp here.