[Bug tree-optimization/63599] New: "wrong" branch optimization with Ofast in a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

Bug ID: 63599
Summary: "wrong" branch optimization with Ofast in a loop
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: vincenzo.innocente at cern dot ch

given this code

#include
typedef float __attribute__( ( vector_size( 16 ) ) ) float32x4_t;

inline float32x4_t atan(float32x4_t t) {
  constexpr float PIO4F = 0.7853981633974483096f;
  float32x4_t high = t > 0.4142135623730950f;
  auto z = t;
  float32x4_t ret={0.f,0.f,0.f,0.f};
  // if all low no need to blend
  if ( _mm_movemask_ps(high) != 0) {
    z = ( t > 0.4142135623730950f ) ? (t-1.0f)/(t+1.0f) : t;
    ret = ( t > 0.4142135623730950f ) ? ret+PIO4F : ret;
  }
  /* polynomial removed */
  return ret += z;
}

float32x4_t doAtan(float32x4_t z) { return atan(z);}

float32x4_t va[1024];
float32x4_t vb[1024];
void computeV() {
  for (int i=0;i!=1024;++i)
    vb[i]=atan(va[i]);
}

compiled with -Ofast

c++ -S -std=c++1y -Ofast bugmvmk.cc -march=nehalem; cat bugmvmk.s

produces the following code where the "movmskps %xmm8, %edx" does not protect the code in the if block...

__Z8computeVv:
LFB2512:
    movaps LC0(%rip), %xmm4
    xorl %eax, %eax
    movaps LC1(%rip), %xmm7
    leaq _va(%rip), %rcx
    movaps LC2(%rip), %xmm6
    movaps LC3(%rip), %xmm5
    .align 4,0x90
L10:
    movaps (%rcx,%rax), %xmm2
    movaps %xmm4, %xmm8
    movaps %xmm2, %xmm3
    cmpltps %xmm2, %xmm8
    movaps %xmm2, %xmm1
    addps %xmm6, %xmm3
    addps %xmm7, %xmm1
    movmskps %xmm8, %edx
    andps %xmm5, %xmm8
    rcpps %xmm3, %xmm0
    mulps %xmm0, %xmm3
    mulps %xmm0, %xmm3
    addps %xmm0, %xmm0
    subps %xmm3, %xmm0
    mulps %xmm0, %xmm1
    movaps %xmm2, %xmm0
    cmpleps %xmm4, %xmm0
    blendvps %xmm0, %xmm2, %xmm1
    pxor %xmm0, %xmm0
    testl %edx, %edx
    je L7
    movaps %xmm8, %xmm0
L7:
    testl %edx, %edx
    je L9
    movaps %xmm1, %xmm2
L9:
    addps %xmm0, %xmm2
    leaq _vb(%rip), %rdx
    movaps %xmm2, (%rdx,%rax)
    addq $16, %rax
    cmpq $16384, %rax
    jne L10
    ret

while with O2 is ok

__Z8computeVv:
LFB2512:
    movaps LC0(%rip), %xmm4
    xorl %eax, %eax
    movaps LC1(%rip), %xmm7
    leaq _va(%rip), %rsi
    movaps LC2(%rip), %xmm6
    leaq _vb(%rip), %rcx
    movaps LC3(%rip), %xmm5
    .align 4,0x90
L7:
    movaps (%rsi,%rax), %xmm1
    movaps %xmm4, %xmm0
    pxor %xmm2, %xmm2
    cmpltps %xmm1, %xmm0
    movmskps %xmm0, %edx
    testl %edx, %edx
    je L6
    movaps %xmm1, %xmm3
    movaps %xmm1, %xmm2
    addps %xmm6, %xmm2
    addps %xmm7, %xmm3
    divps %xmm2, %xmm3
    movaps %xmm0, %xmm2
    andps %xmm5, %xmm2
    blendvps %xmm0, %xmm3, %xmm1
L6:
    addps %xmm2, %xmm1
    movaps %xmm1, (%rcx,%rax)
    addq $16, %rax
    cmpq $16384, %rax
    jne L7
    ret

note that the function not in the loop (doAtan) is ok with both O2 and Ofast
[Bug target/63599] "wrong" branch optimization with Ofast in a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

--- Comment #1 from Andrew Pinski ---
The tree level looks like this:

  t_13 = VEC_COND_EXPR ;
  ret_14 = VEC_COND_EXPR { 4.142135679721832275390625e-1, 4.142135679721832275390625e-1, 4.142135679721832275390625e-1, 4.142135679721832275390625e-1 }, { 7.85398185253143310546875e-1, 7.85398185253143310546875e-1, 7.85398185253143310546875e-1, 7.85398185253143310546875e-1 }, { 0.0, 0.0, 0.0, 0.0 }>;
  t_16 = _9 != 0 ? t_13 : t_4;
  ret_15 = _9 != 0 ? ret_14 : { 0.0, 0.0, 0.0, 0.0 };

>"movmskps %xmm8, %edx"
> does not protect the code in the if block...

Yes it does just not the way you think it does. Notice the last two statements are conditional expressions. And that gets translated into the following:

    testl %edx, %edx
    jne .L9
    movaps %xmm3, %xmm1
    pxor %xmm2, %xmm2
.L9:

So if anything it is a missed optimization dealing with conditional moves with vectors without a vector comparison.
[Bug tree-optimization/54488] tree loop invariant motion uses an excessive amount of memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488 --- Comment #6 from rguenther at suse dot de --- On Sun, 19 Oct 2014, evgeniya.maenkova at gmail dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488 > > --- Comment #5 from Evgeniya Maenkova --- > Also, I collect massif data and see no tree-ssa-lim in it (i mean in top > contributors). > > So what do you think? > > (How did you measured 1,8Gb caused by lim? - this is for me to understand > whether this bug is actual or not) I basically watched 'top' with breakpoints at the start and end of LIM.
[Bug tree-optimization/62031] [4.8 Regression] Different results between O2 and O2 -fpredictive-commoning
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=62031 --- Comment #14 from clyon at gcc dot gnu.org --- I confirm what I observed is a testsuite harness problem, for which I proposed a patch here: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg01792.html dejagnu-1.5 (as shipped with Ubuntu 14.04) masks the problem I was facing with dejagnu-1.4.4-X (as shipped with RHEL5).
[Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587 --- Comment #4 from rguenther at suse dot de --- On Sun, 19 Oct 2014, mliska at suse dot cz wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587 > > --- Comment #2 from Martin Liška --- > Following two functions are merged: > static boost::log::make_output_actor, RightT, ValueT>::type > boost::log::make_output_actor, RightT, > ValueT>::make(ActorT, RightT&) [with ActorT = boost::actor; > LeftExprT = int; RightT = boost::log::attribute_actor boost::log::value_extractor, void, boost::actor>; ValueT = int; > boost::log::make_output_actor, RightT, ValueT>::type = > boost::actor, > boost::log::to_log_fun> >] (struct actor left, struct attribute_actor & right) > > > static boost::log::make_output_actor, RightT, ValueT>::type > boost::log::make_output_actor, RightT, > ValueT>::make(ActorT, RightT&) [with ActorT = boost::actor; > LeftExprT = int; RightT = boost::log::attribute_actor<{anonymous}::my_class, > boost::log::value_extractor, void, boost::actor>; ValueT = int; > boost::log::make_output_actor, RightT, ValueT>::type = > boost::actor, > boost::log::to_log_fun> >] (struct actor left, struct attribute_actor & right) > > with following body: > { > struct type D.3826; > struct to_log_fun D.3825; > struct attribute_name D.3824; > int SR.9; > struct actor left; > > : > left = left; > SR.9_4 = MEM[(struct attribute_terminal *)right_2(D)]; > MEM[(struct attribute_name *)&D.3824] = SR.9_4; > boost::log::attribute_output_terminal, > boost::log::to_log_fun>::attribute_output_terminal (&D.3826, left, > D.3824, > D.3825, 0); > D.3826 ={v} {CLOBBER}; > return; > > } > > > > As I was debugging ao_ref_alias_sets, there's MEM_REF where we have different > template arguments: attribute_actor vs. > attribute_actor<{anonymous}::my_class,...>. 
> What do you think Richard about these record_types from alias set perspective: > > (gdb) p debug_tree(t1) > type size > unit size > align 32 symtab 0 alias set 4 canonical type 0x76c33690 precision > 32 min max 0x76c51018 > 2147483647> > pointer_to_this > > > arg 0 type attribute_actor> > unsigned DI > size > unit size > align 64 symtab 0 alias set 7 canonical type 0x76e20d20> > visited var def_stmt GIMPLE_NOP > > version 2 > ptr-info 0x76a7e3d8> > arg 1 > constant 0>> > $1 = void > (gdb) p debug_tree(t2) > type size > unit size > align 32 symtab 0 alias set 4 canonical type 0x76c33690 precision > 32 min max 0x76c51018 > 2147483647> > pointer_to_this > > > arg 0 type attribute_actor> > unsigned DI > size > unit size > align 64 symtab 0 alias set 7 canonical type 0x76e20540> > visited var def_stmt GIMPLE_NOP > > version 2 > ptr-info 0x76a7e300> > arg 1 > constant 0>> > > these types are called for alias_set comparison, with following record_types: > (gdb) p debug_tree((tree_node*)0x76de7dc8) > SI > size bitsizetype> constant 32> > unit size sizetype> constant 4> > align 32 symtab 0 alias set 17 canonical type 0x76de7dc8 > fields type type_6 > SI size unit size 4> > align 32 symtab 0 alias set 15 canonical type 0x76dddb28 > fields > context boost> > full-name "struct boost::actor" > needs-constructor X() X(constX&) this=(X&) n_parents=0 > use_template=1 interface-unknown > pointer_to_this reference_to_this > chain > > ignored decl_6 SI file ../../PR33754.c line 167 col 7 size > 0x76c51048 32> unit size > align 32 offset_align 128 > offset > bit offset context > 0x76de7dc8 attribute_actor> > chain 0x76de80a8 attribute_actor> > external nonlocal suppress-debug decl_4 VOID file ../../PR33754.c > line 168 col 1 > align 8 context > result > > chain >> context > > full-name "class boost::log::attribute_actor boost::log::value_extractor, void, boost::actor>" > needs-constructor X() X(constX&) this=(X&) n_parents=1 use_template=1 > interface-unknown > pointer_to_this 
reference_to_this > chain attribute_actor>> > $3 = void > (gdb) p debug_tree((tree_node*)0x76ddd888) > SI > size bitsizetype> constant 32> > unit size sizetype> constant 4> > align 32 symtab 0 alias set 14 canonical type 0x76ddd888 > fields type type_6 > SI size unit size 4> > align 32 symtab 0 alias set 15 canonical type 0x76dddb28 > fields > context boost> > full-name "struct boost::actor" > needs-constructor X() X(constX&) this=(X&) n_parents=0 > use_template=1 interface-unknown > pointer_to_this reference_to_this > chain > >
[Bug libfortran/63589] find_addr2line does not consider last PATH component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589 --- Comment #2 from Janne Blomqvist --- Author: jb Date: Mon Oct 20 07:53:37 2014 New Revision: 216449 URL: https://gcc.gnu.org/viewcvs?rev=216449&root=gcc&view=rev Log: PR 63589 Fix splitting of PATH in find_addr2line. 2014-10-20 Janne Blomqvist PR libfortran/63589 * configure.ac: Check for strtok_r. * runtime/main.c (gfstrtok_r): Fallback implementation of strtok_r. (find_addr2line): Use strtok_r to split PATH. * config.h.in: Regenerated. * configure: Regenerated. Modified: trunk/libgfortran/ChangeLog trunk/libgfortran/config.h.in trunk/libgfortran/configure trunk/libgfortran/configure.ac trunk/libgfortran/runtime/main.c
[Bug libfortran/63589] find_addr2line does not consider last PATH component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589 --- Comment #3 from Janne Blomqvist --- Author: jb Date: Mon Oct 20 08:04:39 2014 New Revision: 216450 URL: https://gcc.gnu.org/viewcvs?rev=216450&root=gcc&view=rev Log: PR 63589 Fix splitting of PATH in find_addr2line. 2014-10-20 Janne Blomqvist PR libfortran/63589 * configure.ac: Check for strtok_r. * runtime/main.c (gfstrtok_r): Fallback implementation of strtok_r. (find_addr2line): Use strtok_r to split PATH. * config.h.in: Regenerated. * configure: Regenerated. Modified: branches/gcc-4_9-branch/libgfortran/ChangeLog branches/gcc-4_9-branch/libgfortran/config.h.in branches/gcc-4_9-branch/libgfortran/configure branches/gcc-4_9-branch/libgfortran/configure.ac branches/gcc-4_9-branch/libgfortran/runtime/main.c
[Bug libfortran/63589] find_addr2line does not consider last PATH component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589 --- Comment #4 from Janne Blomqvist --- Author: jb Date: Mon Oct 20 08:16:06 2014 New Revision: 216451 URL: https://gcc.gnu.org/viewcvs?rev=216451&root=gcc&view=rev Log: PR 63589 Fix splitting of PATH in find_addr2line. 2014-10-20 Janne Blomqvist PR libfortran/63589 * configure.ac: Check for strtok_r. * runtime/main.c (gfstrtok_r): Fallback implementation of strtok_r. (find_addr2line): Use strtok_r to split PATH. * config.h.in: Regenerated. * configure: Regenerated. Modified: branches/gcc-4_8-branch/libgfortran/ChangeLog branches/gcc-4_8-branch/libgfortran/config.h.in branches/gcc-4_8-branch/libgfortran/configure branches/gcc-4_8-branch/libgfortran/configure.ac branches/gcc-4_8-branch/libgfortran/runtime/main.c
[Bug libfortran/63589] find_addr2line does not consider last PATH component
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63589 Janne Blomqvist changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #5 from Janne Blomqvist --- Fixed, closing.
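
The ChangeLog above says find_addr2line now uses strtok_r to split PATH. As a rough sketch of that technique only (this is not the libgfortran code, and `path_has_component` is a hypothetical helper invented for illustration), a strtok_r loop visits every colon-separated component, including the last one:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not taken from libgfortran: returns 1 if WANTED is
   one of the colon-separated components of PATH, 0 if it is not, and -1 if
   memory could not be allocated.  */
static int path_has_component(const char *path, const char *wanted)
{
    assert(path != NULL && wanted != NULL);

    /* strtok_r writes into its input, so split a private copy.  */
    size_t len = strlen(path);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, path, len + 1);

    int found = 0;
    char *state = NULL;
    for (char *dir = strtok_r(copy, ":", &state);
         dir != NULL && !found;
         dir = strtok_r(NULL, ":", &state))
        found = (strcmp(dir, wanted) == 0);  /* the final component is a token too */

    free(copy);
    return found;
}
```

Calling `path_has_component("/usr/local/bin:/usr/bin:/opt/tools", "/opt/tools")` should report the trailing directory as present. Note that strtok_r skips empty components, so a path such as "a::b" yields two tokens.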
[Bug tree-optimization/63586] x+x+x+x -> 4*x in gimple
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63586 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #1 from Jakub Jelinek --- I'd expect reassoc should be the pass to do this.
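
For readers skimming the title, a minimal sketch of the two source forms involved may help. The function names `sum4` and `mul4` are made up here, and unsigned arithmetic is an assumption chosen so that wraparound is well defined; the report itself may concern other types.

```c
/* Written the long way: three additions in the source.  */
unsigned sum4(unsigned x)
{
    return x + x + x + x;
}

/* The form the bug title asks for: a single multiplication.  */
unsigned mul4(unsigned x)
{
    return 4u * x;
}
```

For unsigned operands the two functions return the same value for every input, because both are computed modulo the width of the type. That equivalence is what would let a pass such as reassoc rewrite one into the other.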
[Bug tree-optimization/63563] [4.9/5 Regression] ICE: in vectorizable_store, at tree-vect-stmts.c:5106 with -mavx2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63563 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2014-10-20 CC||jakub at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jakub Jelinek --- Started with my r205856, will have a look.
[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173 Fei Yang changed: What|Removed |Added CC||fei.yang0953 at gmail dot com --- Comment #3 from Fei Yang --- (In reply to ktkachov from comment #1) > Confirmed. Feel free to propose a patch for them on gcc-patches along the > lines you described in: https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html Hi, To let you know, we are currently working on this issue. We are implementing these with builtins. Hopefully, the patch will be posted this week. Thank you.
[Bug c/63600] New: ice in ix86_expand_sse2_abs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63600

Bug ID: 63600
Summary: ice in ix86_expand_sse2_abs
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c
Assignee: unassigned at gcc dot gnu.org
Reporter: dcb314 at hotmail dot com

Created attachment 33760
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33760&action=edit
C source code

I just tried to compile the attached code on gcc trunk dated 20141019 on an AMD x86_64 box. The compiler said

bug168.c: In function ‘long_unary_op’:
bug168.c:11345:32: internal compiler error: in ix86_expand_sse2_abs, at config/i386/i386.c:45977
  for (n = 0; n < na; n++) b[n] = (((a[n]) >= 0) ? (a[n]) : -(a[n]));
  ^
0xf9ff5e ix86_expand_sse2_abs(rtx_def*, rtx_def*)
    ../../src/trunk/gcc/config/i386/i386.c:45977
0x10c707a gen_absv2di2(rtx_def*, rtx_def*)
    ../../src/trunk/gcc/config/i386/sse.md:13834
0xb1dc09 insn_gen_fn::operator()(rtx_def*, rtx_def*) const
    ../../src/trunk/gcc/recog.h:308
0xb1dc09 maybe_gen_insn(insn_code, unsigned int, expand_operand*)
    ../../src/trunk/gcc/optabs.c:8348
0xb1dc09 expand_unop_direct

Flag -O3 required.

The attached code is the same source code that was provided for bug #53749.
[Bug c++/63601] New: Segfault on usage of 'this' in unevaluated context inside lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601 Bug ID: 63601 Summary: Segfault on usage of 'this' in unevaluated context inside lambda Product: gcc Version: 5.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: sneves at dei dot uc.pt The following minimal example results in an 'ICE: Segmentation fault' in g++ 4.8.1, 4.9.1, and 5.0.0 20141019: auto f = []{ sizeof(this); };
[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

Jakub Jelinek changed:

What|Removed |Added
CC||jakub at gcc dot gnu.org

--- Comment #2 from Jakub Jelinek ---
More thorough testcase (should be tested with different ISAs):

#define C1 c
#define C2 C1, C1
#define C4 C2, C2
#define C8 C4, C4
#define C16 C8, C8
#define C32 C16, C16
#define C64 C32, C32
#define C_(n) n
#define C(n) C_(C##n)
#define T(t,s) \
typedef t v##t##s __attribute__ ((__vector_size__ (s * sizeof (t)))); \
v##t##s test##t##s (t c) \
{ \
  v##t##s v = { C(s) }; \
  return v; \
}
typedef long long llong;
T(char, 64)
T(char, 32)
T(char, 16)
T(char, 8)
T(short, 32)
T(short, 16)
T(short, 8)
T(short, 4)
T(int, 16)
T(int, 8)
T(int, 4)
T(int, 2)
T(float, 16)
T(float, 8)
T(float, 4)
T(float, 2)
T(llong, 8)
T(llong, 4)
T(llong, 2)
T(double, 8)
T(double, 4)
T(double, 2)

Started with r216401, -mavx512f of course doesn't include the avx512bw broadcast needed for the V64QI or V32HI duplicates.
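
To make the macro in the testcase easier to follow, here is a hand expansion of one instantiation, T(int, 4). It is a sketch that assumes the typedef takes `s * sizeof (t)` as the vector size and that the compiler supports GCC vector extensions.

```c
/* Hand expansion of T(int, 4): v##t##s becomes vint4 and test##t##s
   becomes testint4.  Requires GCC-style vector extensions.  */
typedef int vint4 __attribute__ ((__vector_size__ (4 * sizeof (int))));

vint4 testint4 (int c)
{
  /* C(4) -> C_(C4) -> C2, C2 -> C1, C1, C1, C1 -> c, c, c, c */
  vint4 v = { c, c, c, c };
  return v;
}
```

Every lane of the result holds the scalar argument, which is the "vector duplicate" (broadcast) operation that ix86_vector_duplicate_value is asked to emit.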
[Bug target/63599] "wrong" branch optimization with Ofast in a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

--- Comment #2 from vincenzo Innocente ---
I agree that the code produces correct results. It looks sub-optimal to me.
I understand that with Ofast the sequence below will always be executed

    andps %xmm5, %xmm8
    rcpps %xmm3, %xmm0
    mulps %xmm0, %xmm3
    mulps %xmm0, %xmm3
    addps %xmm0, %xmm0
    subps %xmm3, %xmm0
    mulps %xmm0, %xmm1
    movaps %xmm2, %xmm0
    cmpleps %xmm4, %xmm0
    blendvps %xmm0, %xmm2, %xmm1

while with O2 it will not, and this generates a performance penalty for samples where the test is often false.
( I tried to add __builtin_expect(x, false) with no effect. )
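
The difference between the two code shapes discussed in this thread can be sketched in plain C. This is only an illustration: it uses a four-element float array instead of the SSE vector type, and the names `atan_guarded`, `atan_selected`, `LANES` and `THRESH` are invented here. The first function skips the division when no lane exceeds the threshold (the O2 shape); the second always computes it and selects afterwards (the Ofast shape).

```c
#define LANES 4

static const float THRESH = 0.4142135623730950f;
static const float PIO4F = 0.7853981633974483096f;

/* Returns nonzero if any lane exceeds the threshold, like the movmskps test.  */
static int any_lane_high(const float t[LANES])
{
    int any = 0;
    for (int i = 0; i < LANES; i++)
        any |= (t[i] > THRESH);
    return any;
}

/* O2 shape: branch on the mask first and skip the work when it is zero.  */
static void atan_guarded(const float t[LANES], float out[LANES])
{
    int any = any_lane_high(t);
    for (int i = 0; i < LANES; i++) {
        float z = t[i], ret = 0.0f;
        if (any && t[i] > THRESH) {
            z = (t[i] - 1.0f) / (t[i] + 1.0f);
            ret = PIO4F;
        }
        out[i] = ret + z;
    }
}

/* Ofast shape: compute both candidates unconditionally, then select.  */
static void atan_selected(const float t[LANES], float out[LANES])
{
    int any = any_lane_high(t);
    for (int i = 0; i < LANES; i++) {
        float z_hi = (t[i] - 1.0f) / (t[i] + 1.0f);   /* runs even if unused */
        float z_vec = (t[i] > THRESH) ? z_hi : t[i];
        float ret_vec = (t[i] > THRESH) ? PIO4F : 0.0f;
        float z = any ? z_vec : t[i];                 /* scalar-condition select */
        float ret = any ? ret_vec : 0.0f;
        out[i] = ret + z;
    }
}
```

Both functions produce the same values, which matches the observation that the generated code is correct. The cost difference shows up only when most inputs have no lane above the threshold, since the second form pays for the division every time.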
[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

--- Comment #4 from Venkataramanan ---
(In reply to Fei Yang from comment #3)
> (In reply to ktkachov from comment #1)
> > Confirmed.
> > Feel free to propose a patch for them on gcc-patches along the
> > lines you described in:
> > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
>
> Hi,
> To let you know, we are currently working on this issue.
> We are implementing these with builtins.
> Hopefully, the patch will be posted this week. Thank you.

Hi Fei Yang,

Ok no issues. I will let you do this.
Next time please assign the Bugzilla item to your name, so that we won't be duplicating the work.
[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

Ramana Radhakrishnan changed:

What|Removed |Added
CC||ramana at gcc dot gnu.org

--- Comment #5 from Ramana Radhakrishnan ---
(In reply to Venkataramanan from comment #4)
> (In reply to Fei Yang from comment #3)
> > (In reply to ktkachov from comment #1)
> > > Confirmed.
> > > Feel free to propose a patch for them on gcc-patches along the
> > > lines you described in:
> > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> >
> > Hi,
> > To let you know, we are currently working on this issue.
> > We are implementing these with builtins.
> > Hopefully, the patch will be posted this week. Thank you.
>
> Hi Fei Yang,
>
> Ok no issues. I will let you do this.
> Next time please assign the Bugzilla item to your name, so that we won't
> be duplicating the work.

Linaro / Charles Bayliss was already working on this - he had patches out in September for this.
[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

clyon at gcc dot gnu.org changed:

What|Removed |Added
CC||cbaylis at gcc dot gnu.org

--- Comment #6 from clyon at gcc dot gnu.org ---
(In reply to Ramana Radhakrishnan from comment #5)
> (In reply to Venkataramanan from comment #4)
> > (In reply to Fei Yang from comment #3)
> > > (In reply to ktkachov from comment #1)
> > > > Confirmed.
> > > > Feel free to propose a patch for them on gcc-patches along the
> > > > lines you described in:
> > > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > >
> > > Hi,
> > > To let you know, we are currently working on this issue.
> > > We are implementing these with builtins.
> > > Hopefully, the patch will be posted this week. Thank you.
> >
> > Hi Fei Yang,
> >
> > Ok no issues. I will let you do this.
> > Next time please assign the Bugzilla item to your name, so that we won't
> > be duplicating the work.
>
> Linaro / Charles Bayliss was already working on this - he had patches out in
> September for this.

It seems that Charles' patches cover vldX_lane, but not vldX_dup.
[Bug target/63599] "wrong" branch optimization with Ofast in a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599 --- Comment #3 from Marc Glisse --- ifcvt making a transformation that doesn't help vectorization and ends up pessimizing the code... not really the first time this happens. I believe Jakub had a big patch for that, but it never got in. Maybe vectors could be special-cased if we never vectorize them anyway.
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534

--- Comment #31 from Stupachenko Evgeny ---
(In reply to Jeffrey A. Law from comment #29)
> I thought we had already dealt with the "hidden" GOT usages that show up
> during reload... Is it IRA that's removing the SET_GOT?

This is not an EQUIV-related case. SET_GOT is removed by the CSE pass called from IRA.
Here we have an insn that doesn't use the GOT register implicitly:

(insn 37 34 38 6 (set (mem:TF (pre_dec:SI (reg/f:SI 7 sp)) [0 S16 A8])
        (const_double:TF 2.0769187434139310514121985316880384e+34 [0x0.8p+115])) frexpq.c:1316 121 {*pushtf}
     (expr_list:REG_ARGS_SIZE (const_int 16 [0x10])
        (nil)))

It appears that there are no other insns using GOT, and no calls. Therefore CSE is absolutely correct in removing SET_GOT.
[Bug target/63534] [5 Regression] Bootstrap failure on x86_64/i686-linux
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63534

--- Comment #32 from Stupachenko Evgeny ---
(In reply to Iain Sandoe from comment #30)
> FWIW, I built a stage #1 with fortran, objc and ada enabled.
>
> libgcc, libstdc++v3, libgomp, libobjc and libada build.
>
> libgfortran & libquadmath fail (errors as per Dominique's post).

We got a Mac and are setting up a GCC build there so that we can reproduce all the issues and publish a patch fixing the whole bootstrap.
[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #3 from Jakub Jelinek ---
Created attachment 33761
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33761&action=edit
WIP patch for discussions

From what I see, if TARGET_AVX512BW is not defined, then we obviously can't use ix86_vector_duplicate_value, but need two instructions (either it can be QI->V32QI / HI->V16HI broadcast followed by concat of the two parts, or QI->V16QI / HI->V8HI broadcast followed by concat of the 4 parts together).
But, it seems even for -mavx2 or -mavx we actually generate terrible code, for -mavx2 there is no point in using 2 instructions when in theory vpbroadcast{b,w} should handle it alone just fine.
The patch enables all of that, but unfortunately we generate perhaps not so good code with it, e.g. for -mavx2 in testchar32, we spill the argument always to memory, and then broadcast it from memory, even when vmovd + broadcast from register could have been used.
And in testchar16, for some reason we spill into memory, and broadcast from vmovd result (so the spill is totally useless).

Uros/Kyrill, any thoughts on this?
[Bug c++/63531] gcc segfaults on some sourcefiles when using '-Weffc++' and '-fsanitize=undefined' together
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63531 --- Comment #6 from Ralf --- (In reply to Marek Polacek from comment #5) > I meant a GCC build, that contains r215459 fix (for that you'd have to build > gcc, 5 nor 4.9.2 haven't been released yet). > > But I'm pretty sure this is already fixed. Yes, I can confirm my problem is fixed in snapshot gcc-4.9-20141015. Thanks for your help :)
[Bug tree-optimization/63583] [5 Regression] ICF does not check that the template strings are the same
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63583 --- Comment #2 from marxin at gcc dot gnu.org --- Author: marxin Date: Mon Oct 20 10:44:54 2014 New Revision: 216458 URL: https://gcc.gnu.org/viewcvs?rev=216458&root=gcc&view=rev Log: PR ipa/63583 * ipa-icf-gimple.c (func_checker::compare_gimple_asm): Gimple template string is compared. * gcc.dg/ipa/pr63595.c: New test. Modified: trunk/gcc/ChangeLog trunk/gcc/ipa-icf-gimple.c trunk/gcc/testsuite/ChangeLog
[Bug target/63173] performance problem with simd intrinsics vld2_dup_* on aarch64-none-elf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63173

--- Comment #7 from Fei Yang ---
(In reply to clyon from comment #6)
> (In reply to Ramana Radhakrishnan from comment #5)
> > (In reply to Venkataramanan from comment #4)
> > > (In reply to Fei Yang from comment #3)
> > > > (In reply to ktkachov from comment #1)
> > > > > Confirmed.
> > > > > Feel free to propose a patch for them on gcc-patches along the
> > > > > lines you described in:
> > > > > https://gcc.gnu.org/ml/gcc/2014-09/msg00046.html
> > > >
> > > > Hi,
> > > > To let you know, we are currently working on this issue.
> > > > We are implementing these with builtins.
> > > > Hopefully, the patch will be posted this week. Thank you.
> > >
> > > Hi Fei Yang,
> > >
> > > Ok no issues. I will let you do this.
> > > Next time please assign the Bugzilla item to your name, so that we won't
> > > be duplicating the work.
> >
> > Linaro / Charles Bayliss was already working on this - he had patches
> > out in September for this.
>
> It seems that Charles' patches cover vldX_lane, but not vldX_dup.

Hi Ramana,
Do you mean this link: https://gcc.gnu.org/ml/gcc-patches/2014-10/msg00678.html
[Bug lto/61192] Conflict between register and function name for lto on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192 Ilya Palachev changed: What|Removed |Added CC||i.palachev at samsung dot com --- Comment #2 from Ilya Palachev --- (In reply to Daniel Cederman from comment #0) > when using lto on sparc. Daniel, can you also provide the original source code (not preprocessed)? It would be interesting to know whether this error can be reproduced on other architectures.
[Bug tree-optimization/63602] New: Wrong code w/ -O2 -ftree-loop-linear
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63602

Bug ID: 63602
Summary: Wrong code w/ -O2 -ftree-loop-linear
Product: gcc
Version: unknown
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: asolokha at gmx dot com

gcc produces wrong code w/ -ftree-loop-linear -O2 (and above) for the following reduced case:

int sx;
int bn;
int vz = 1;
int *volatile n6 = &bn;

int
main(void)
{
  for (int i = 0; i < 3; ++i) {
    sx = vz;
    vz = bn;
  }
  return sx;
}

It struck me first w/ gcc-4.10.0-alpha20140810, but today I've reproduced it w/ 4.8.3, 4.9.1 and 5-alpha20141019, so I'm not marking it as a regression.

Expected results:
% gcc-5.0_alpha20141019 -O2 -o good 963b8772.c
% ./good
% echo $?
0

Actual results:
% gcc-5.0_alpha20141019 -O2 -ftree-loop-linear -o bad 963b8772.c
% ./bad
% echo $?
1
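
To see why 0 is the expected exit status, the loop can be traced by hand. The sketch below mirrors the reduced case with the globals passed in as parameters; `run_loop` is a name invented for this illustration and is not part of the report.

```c
/* Mirrors the loop from the reduced case; vz and bn are passed in
   explicitly instead of being globals.  */
int run_loop(int vz, int bn)
{
    int sx = 0;
    for (int i = 0; i < 3; ++i) {
        sx = vz;   /* i = 0: sx = 1; i = 1 and i = 2: sx = 0 */
        vz = bn;   /* bn is never written, so vz becomes 0 */
    }
    return sx;
}
```

With the initial values from the report, `run_loop(1, 0)` returns 0. The observed status of 1 equals the value sx holds after the first iteration only, though the report does not say which transformation produces it.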
[Bug lto/61052] g++ generated code segfaults when using LTO together with "extern template", non-LTO compiled files, and gold linker
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61052

Ilya Palachev changed:

What|Removed |Added
CC||i.palachev at samsung dot com

--- Comment #1 from Ilya Palachev ---
Hi,

I can see another error for the attached testcase.

$ gcc -c -O2 -flto a.cc
$ gcc -c -O2 -flto b.cc
$ gcc -c -Os e.cc
$ gcc -o a -fuse-ld=gold a.o e.o b.o
/usr/local/bin/ld.gold: -plugin: unknown option
/usr/local/bin/ld.gold: use the --help option for usage information
collect2: error: ld returned 1 exit status
$ ld -v
GNU gold (GNU Binutils 2.24.51.20141003) 1.11

It seems that this error is related to the option "-fuse-ld", since the error disappears if this option is not specified.
[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601 Jakub Jelinek changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2014-10-20 CC||jakub at gcc dot gnu.org, ||jason at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Jakub Jelinek --- This used to be rejected until r196550 where it started to ICE.
[Bug debug/60655] [4.9 Regression] ICE: output_operand: invalid expression as operand
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60655 --- Comment #22 from Alan Modra --- Author: amodra Date: Mon Oct 20 11:54:22 2014 New Revision: 216462 URL: https://gcc.gnu.org/viewcvs?rev=216462&root=gcc&view=rev Log: PR debug/60655 * simplify-rtx.c (simplify_plus_minus): Delete unused "input_ops". Increase "ops" array size. Correct array size tests. Init n_constants in loop. Break out of innermost loop when finding a trivial CONST expression. Modified: trunk/gcc/ChangeLog trunk/gcc/simplify-rtx.c
[Bug target/63600] [5 Regression] ice in ix86_expand_sse2_abs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63600

Jakub Jelinek changed:

What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed||2014-10-20
CC||jakub at gcc dot gnu.org
Target Milestone|--- |5.0
Summary|ice in ix86_expand_sse2_abs |[5 Regression] ice in ix86_expand_sse2_abs
Ever confirmed|0 |1

--- Comment #1 from Jakub Jelinek ---
Started with r216255. Reduced testcase for -O3:

long *a, b;
int c;

void
foo (void)
{
  for (c = 0; c < 64; c++)
    a[c] = b >= 0 ? b : -b;
}
[Bug c++/63531] gcc segfaults on some sourcefiles when using '-Weffc++' and '-fsanitize=undefined' together
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63531 Marek Polacek changed: What|Removed |Added Status|WAITING |RESOLVED Resolution|--- |FIXED --- Comment #7 from Marek Polacek --- .
[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #4 from Kirill Yukhin ---
(In reply to Jakub Jelinek from comment #3)
> Created attachment 33761 [details]
> WIP patch for discussions
>
> From what I see, if TARGET_AVX512BW is not defined, then we obviously can't
> use
> ix86_vector_duplicate_value, but need two instructions (either it can be
> QI->V32QI / HI->V16HI broadcast followed by concat of the two parts, or
> QI->V16QI / HI->V8HI broadcast followed by concat of the 4 parts together).
> But, it seems even for -mavx2 or -mavx we actually generate terrible code,
> for -mavx2 there is no point in using 2 instructions when in theory
> vpbroadcast{b,w} should handle it alone just fine.
Right!

> The patch enables all of that, but unfortunately we generate perhaps not so
> good code with it, e.g. for -mavx2 in testchar32, we spill the argument
> always to memory, and then broadcast it from memory, even when vmovd +
> broadcast from register could have been used.
> And in testchar16, for some reason we spill into memory, and broadcast from
> vmovd result (so the spill is totally useless).
I think this is because of subreg:QI of reg:SI.

Before reload we have (for testchar32):

(insn 2 5 3 2 (set (reg:SI 86 [ c ]) (reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal} (expr_list:REG_DEAD (reg:SI 5 di [ c ]) (nil)))
(insn 7 4 12 2 (set (reg:V32QI 88 [ v ]) (vec_duplicate:V32QI (subreg:QI (reg:SI 86 [ c ]) 0))) 1.c:22 4112 {vec_dupv32qi} (expr_list:REG_DEAD (reg:SI 86 [ c ]) (nil)))

After reload we need to get rid of the subreg:

(insn 2 5 3 2 (set (mem/c:SI (plus:DI (reg/f:DI 6 bp) (const_int -20 [0xffec])) [8 %sfp+-4 S4 A32]) (reg:SI 5 di [ c ])) 1.c:22 90 {*movsi_internal} (nil))
(insn 7 4 12 2 (set (reg:V32QI 21 xmm0 [orig:88 v ] [88]) (vec_duplicate:V32QI (mem/c:QI (plus:DI (reg/f:DI 6 bp) (const_int -20 [0xffec])) [8 %sfp+-4 S1 A32]))) 1.c:22 4112 {vec_dupv32qi} (nil))

> Uros/Kyrill, any thoughts on this?
I like the patch.
[Bug ipa/63598] [5.0 Regression] ICE: in ipa_merge_profiles at ipa-utils.c:396
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63598 Richard Biener changed: What|Removed |Added Target Milestone|--- |5.0
[Bug target/63596] Saving of GPR/FPRs for stdarg even though the variable argument is not used
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63596 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2014-10-20 Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Confirmed.
[Bug tree-optimization/63595] Segmentation faults inside kernel
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63595 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |WAITING Last reconfirmed||2014-10-20 Ever confirmed|0 |1
[Bug ipa/63587] [5 Regression] ICE : tree check: expected var_decl, have result_decl in add_local_variables, at tree-inline.c:4112
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63587 Richard Biener changed: What|Removed |Added Target Milestone|--- |5.0
[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588 Richard Biener changed: What|Removed |Added Target Milestone|--- |5.0
[Bug tree-optimization/63583] [5 Regression] ICF does not check that the template strings are the same
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63583 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #3 from Richard Biener --- Fixed.
[Bug c++/63582] [5 Regression]: g++.dg/init/enum1.C ... (test for errors, line 12)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63582 Richard Biener changed: What|Removed |Added Target|cris-axis-elf |cris-axis-elf, ||i?86-linux-gnu Target Milestone|--- |5.0 --- Comment #2 from Richard Biener --- Also fails on x86_64-linux with -m32.
[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588 ktkachov at gcc dot gnu.org changed: What|Removed |Added CC||ktkachov at gcc dot gnu.org --- Comment #1 from ktkachov at gcc dot gnu.org --- So is there a reduced testcase for this?
[Bug ipa/63580] [5 Regression] ICE : error: invalid argument to gimple call
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63580

Richard Biener changed:
What|Removed |Added
Status|UNCONFIRMED |NEW
Last reconfirmed||2014-10-20
Ever confirmed|0 |1

--- Comment #1 from Richard Biener ---
You forgot to mark p1 as addressable in the alias decl (that is, to copy TREE_ADDRESSABLE).
[Bug rtl-optimization/63577] [4.8/4.9/5? Regression]: Huge compile time and memory usage with -O and not -fPIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Richard Biener changed:
What|Removed |Added
Target||x86_64-*-*
Status|UNCONFIRMED |NEW
Last reconfirmed||2014-10-20
Component|fortran |rtl-optimization
Version|unknown |4.9.1
Blocks||47344
Target Milestone|--- |4.8.4
Ever confirmed|0 |1

--- Comment #1 from Richard Biener ---
Confirmed with 4.9:
 combiner: 31.14 (79%) usr 0.46 (74%) sys 31.65 (78%) wall 1029289 kB (96%) ggc
 TOTAL   : 39.48 0.62 40.77 1071504 kB
[Bug ipa/63576] [5 Regression] ICE : in ipa_merge_profiles, at ipa-utils.c:540 during Firefox LTO/PGO build
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63576 Richard Biener changed: What|Removed |Added Target Milestone|--- |5.0
[Bug rtl-optimization/63577] [4.8/4.9/5 Regression]: Huge compile time and memory usage with -O and not -fPIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Richard Biener changed:
What|Removed |Added
Summary|[4.8/4.9/5? Regression]: Huge compile time and memory usage with -O and not -fPIC |[4.8/4.9/5 Regression]: Huge compile time and memory usage with -O and not -fPIC

--- Comment #2 from Richard Biener ---
--param max-combine-insns=2 helps a bit compile-time wise but not fully memory-usage-wise (I suppose log-links are expensive and of course still set up). Only available on trunk, of course.
[Bug c++/63588] [5 Regression] ICE (segfault) on arm-linux-gnueabihf
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63588 --- Comment #2 from Matthias Klose --- yes, see above.
[Bug tree-optimization/54488] tree loop invariant motion uses an excessive amount of memory
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54488

--- Comment #7 from Evgeniya Maenkova ---
I measured only 317 MB with top.
[Bug lto/63603] New: [4.9/5 Regression] Linking with -fno-lto still invokes LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63603

Bug ID: 63603
Summary: [4.9/5 Regression] Linking with -fno-lto still invokes LTO
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: burnus at gcc dot gnu.org

Running

  echo "int main() {return 0;}" > foo.c
  gcc -flto -ffat-lto-objects -c foo.c
  gcc -v -fno-lto foo.o 2>&1|grep lto1

shows that the -fno-lto is ignored for linking as lto1 is always invoked with GCC 4.9 and 5.

Using GCC 4.8, LTO is not automatically invoked for linking but has to be passed manually. Hence, it works there.
[Bug tree-optimization/63563] [4.9/5 Regression] ICE: in vectorizable_store, at tree-vect-stmts.c:5106 with -mavx2
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63563 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Created attachment 33762 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33762&action=edit gcc5-pr63563.patch Untested fix.
[Bug lto/61192] Conflict between register and function name for lto on sparc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61192

--- Comment #3 from Daniel Cederman ---
(In reply to Ilya Palachev from comment #2)
> (In reply to Daniel Cederman from comment #0)
> > when using lto on sparc.
>
> Daniel, can you also provide original source code (not preprocessed)? It's
> interesting whether this error can be reproduced on other architectures.

I used creduce on the source code and this code triggers the error:

register int _SPARC_Per_CPU_current __asm__("g6");
int __getreent___trans_tmp_1;
__getreent() {
  int cpu_self = _SPARC_Per_CPU_current;
  __getreent___trans_tmp_1 = cpu_self;
}
g6() {}

I compiled with the same compiler as before; I have not tried with a newer version of gcc.
[Bug lto/63603] [4.9/5 Regression] Linking with -fno-lto still invokes LTO
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63603 Tobias Burnus changed: What|Removed |Added Target Milestone|--- |4.9.2
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307

--- Comment #6 from iverbin at gcc dot gnu.org ---
Author: iverbin
Date: Mon Oct 20 15:22:09 2014
New Revision: 216483

URL: https://gcc.gnu.org/viewcvs?rev=216483&root=gcc&view=rev
Log:
    PR c/63307
gcc/c-family/
    * cilk.c: Include vec.h.
    (struct cilk_decls): New structure.
    (wrapper_parm_cb): Split this function to...
    (fill_decls_vec): ...this...
    (create_parm_list): ...and this.
    (compare_decls): New function.
    (for_local_cb): Remove.
    (wrapper_local_cb): Ditto.
    (build_wrapper_type): For now first traverse and fill vector of
    declarations then sort it and then deal with sorted vector.
    (cilk_outline): Ditto.
    (declare_one_free_variable): Ditto.

Modified:
    trunk/gcc/c-family/ChangeLog
    trunk/gcc/c-family/cilk.c
[Bug fortran/63553] [OOP] Wrong code when assigning a CLASS to a TYPE
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63553

--- Comment #5 from patnel97269-gfortran at yahoo dot fr ---
Thanks for the patch.

Another similar case, this time with a type that contains an allocatable field, produces an internal compiler error (without applying the patch):

internal compiler error: in fold_convert_loc, at fold-const.c:2112

program toto
implicit none
type mother
  integer :: i
  double precision,dimension(:),allocatable :: values
end type mother
class(mother),allocatable :: cm,cm2
allocate(cm)
allocate(cm%values(10))
cm%i=3
cm%values=80d0
allocate(cm2)
select type(cm2)
type is (mother)
  cm2=cm
end select
print *,cm2%i,cm2%values
end program
[Bug libquadmath/55821] Release tarballs (unconditionally) install libquadmath.info when libquadmath is not supported
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55821

--- Comment #9 from Sandra Loosemore ---
Yes, that patch (with regenerated Makefile.in) did the trick. Thanks.

config.log says my configure line is:

$ /scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath/configure --srcdir=/scratch/sandra/arm-fsf/src/gcc-mainline/libquadmath --cache-file=./config.cache --enable-multilib --with-cross-host=i686-pc-linux-gnu --enable-threads --disable-libmudflap --disable-libstdcxx-pch --with-gnu-as --with-gnu-ld --enable-shared --enable-lto --enable-symvers=gnu --enable-__cxa_atexit --with-glibc-version=2.19 --disable-nls --prefix=/scratch/sandra/arm-fsf/install --with-sysroot=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/libc --with-host-libstdcxx=-static-libgcc -Wl,-Bstatic,-lstdc++,-Bdynamic -lm --enable-libgomp --enable-libitm --enable-libatomic --disable-libssp --enable-poison-system-directories --with-build-time-tools=/scratch/sandra/arm-fsf/install/arm-none-linux-gnueabi/bin --enable-languages=c,c++,fortran,lto --program-transform-name=s&^&arm-none-linux-gnueabi-& --disable-option-checking --with-target-subdir=arm-none-linux-gnueabi --build=i686-pc-linux-gnu --host=arm-none-linux-gnueabi --target=arm-none-linux-gnueabi
[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #5 from Jakub Jelinek ---
Better testcase that tests both broadcasts from GPRs and broadcasts from memory:

#define C1 c
#define C2 C1, C1
#define C4 C2, C2
#define C8 C4, C4
#define C16 C8, C8
#define C32 C16, C16
#define C64 C32, C32
#define C_(n) n
#define C(n) C_(C##n)
#define T(t,s) \
typedef t v##t##s __attribute__ ((__vector_size__ (s * sizeof (t)))); \
v##t##s test##t##s (t c) \
{ \
  v##t##s v = { C(s) }; \
  return v; \
} \
v##t##s test2##t##s (t *p) \
{ \
  t c = *p; \
  v##t##s v = { C(s) }; \
  return v; \
}

typedef long long llong;

T(char, 64)
T(char, 32)
T(char, 16)
T(char, 8)
T(short, 32)
T(short, 16)
T(short, 8)
T(short, 4)
T(int, 16)
T(int, 8)
T(int, 4)
T(int, 2)
T(float, 16)
T(float, 8)
T(float, 4)
T(float, 2)
T(llong, 8)
T(llong, 4)
T(llong, 2)
T(double, 8)
T(double, 4)
T(double, 2)
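To make the macro testcase easier to read, here is a sketch of what a single instance, T(int, 4), would look like when written out by hand. The names vint4, testint4 and test2int4 simply follow the v##t##s / test##t##s / test2##t##s pattern of the macro; the code relies on the GCC vector extension and is only an illustration, not part of the attached patch or testsuite.

```cpp
// Hand expansion of T(int, 4); assumes the GCC/Clang vector extension.
typedef int vint4 __attribute__ ((__vector_size__ (4 * sizeof (int))));

// Broadcast of a scalar that arrives in a general-purpose register.
vint4 testint4 (int c)
{
  vint4 v = { c, c, c, c };   // C(4) expands to "c, c, c, c"
  return v;
}

// Broadcast of a scalar that is first loaded from memory.
vint4 test2int4 (int *p)
{
  int c = *p;
  vint4 v = { c, c, c, c };
  return v;
}
```

Both functions fill every lane with the same scalar, which is the kind of initializer the comments above discuss in terms of vec_duplicate; the two variants differ only in where the scalar comes from.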
[Bug target/63594] [5 Regression] ICE: in ix86_vector_duplicate_value, at config/i386/i386.c:39831 with -mavx512f
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63594

--- Comment #6 from Jakub Jelinek ---
Created attachment 33763
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33763&action=edit
gcc5-pr63594-wip2.patch

Updated WIP patch, which attempts to generate better code using inter-unit moves, but also has memory as an alternative, so it allows RA to choose what is best. This still generates non-perfect code for V2DI/V4DI loads from GPRs without -mavx512f (but e.g. vec_concatv2di uses Yi constraint).

And, for AVX512-{F,BW,VL}, I'm surprised that the broadcasts from GPRs are done as different instructions from broadcasts from memory or vector reg; I would have thought that must have been done using a single insn with alternatives.
[Bug target/63599] "wrong" branch optimization with Ofast in a loop
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63599

Jakub Jelinek changed:
What|Removed |Added
CC||jakub at gcc dot gnu.org

--- Comment #4 from Jakub Jelinek ---
The big patch got committed in, but generally turning off tree if-conversion didn't turn out to be a win, so what ended up being committed is only if there are any masked loads/stores, if-conversion applies only to vectorized loop and nothing else.
[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601

--- Comment #2 from Jason Merrill ---
Author: jason
Date: Mon Oct 20 17:29:02 2014
New Revision: 216488

URL: https://gcc.gnu.org/viewcvs?rev=216488&root=gcc&view=rev
Log:
    PR c++/63601
    * lambda.c (current_nonlambda_function): New.
    * semantics.c (finish_this_expr): Use it.
    * cp-tree.h: Declare it.

Added:
    trunk/gcc/testsuite/g++.dg/cpp0x/lambda/lambda-this20.C
Modified:
    trunk/gcc/cp/ChangeLog
    trunk/gcc/cp/cp-tree.h
    trunk/gcc/cp/lambda.c
    trunk/gcc/cp/semantics.c
[Bug rtl-optimization/63577] [4.8/4.9/5 Regression]: Huge compile time and memory usage with -O and not -fPIC
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63577

Segher Boessenkool changed:
What|Removed |Added
CC||segher at gcc dot gnu.org

--- Comment #3 from Segher Boessenkool ---
The LOG_LINKS take up only a few hundred kB, tops; the gigantic memory use comes from all the garbage RTL produced for all the failed combine attempts.
[Bug c++/63601] Segfault on usage of 'this' in unevaluated context inside lambda
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63601 Jason Merrill changed: What|Removed |Added Status|NEW |RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org Target Milestone|--- |5.0 --- Comment #3 from Jason Merrill --- Fixed.
[Bug c++/63604] New: [C++11] A direct-initialization of a reference should use explicit conversion functions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63604

Bug ID: 63604
Summary: [C++11] A direct-initialization of a reference should use explicit conversion functions
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: kariya_mitsuru at hotmail dot com

The sample code below should compile successfully, but gcc rejects it with a compilation error.

==
struct T {};

struct S {
  explicit operator T() { return T(); }
};

int main()
{
  S s;
  T&& t(s);
  (void) t;
}
==
cf. http://melpon.org/wandbox/permlink/LHgajpAXzqTbpYDc

An initialization of a reference in a direct-initialization context should use an explicit conversion function that converts to a class prvalue.

The latest C++ standard (n4140) 13.3.1.6 [over.match.ref]/p.1.1 says that

  The conversion functions of S and its base classes are considered. Those non-explicit conversion functions that are not hidden within S and yield type “lvalue reference to cv2 T2” (when initializing an lvalue reference or an rvalue reference to function) or “cv2 T2” or “rvalue reference to cv2 T2” (when initializing an rvalue reference or an lvalue reference to function), where “cv1 T” is reference-compatible (8.5.3) with “cv2 T2”, are candidate functions. For direct-initialization, those explicit conversion functions that are not hidden within S and yield type “lvalue reference to cv2 T2” or “cv2 T2” or “rvalue reference to cv2 T2”, respectively, where T2 is same type as T or can be converted to type T with a qualification conversion (4.4), are also candidate functions.

I think that this sample code corresponds to the case “For direct-initialization, ...”.

Note that this sample code compiles successfully if the conversion function returns an rvalue reference.
(cf. http://melpon.org/wandbox/permlink/kGpALX7zvzHzi7K5)

See also BUG 48453.
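For comparison, the variant that the report says is accepted (the explicit conversion function yields an rvalue reference instead of a prvalue) might look like the sketch below. The members value and held and the helper read_through_reference are hypothetical additions made only so the result can be observed; the report's own T is an empty struct.

```cpp
struct T { int value; };

struct S {
  T held{42};
  // Explicit conversion function yielding "rvalue reference to T".
  explicit operator T&&() { return static_cast<T&&>(held); }
};

// Direct-initialization of an rvalue reference from an S object.
// The explicit conversion function is a candidate in this context,
// so t binds to s.held and no temporary T is created.
int read_through_reference(S& s)
{
  T&& t(s);
  return t.value;
}
```

Copy-initialization (T&& t = s;) would not be expected to consider the explicit conversion function, which is why the report is specifically about the direct-initialization form.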
[Bug tree-optimization/63605] New: wrong code at -O3 on x86_64-linux-gnu
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63605

Bug ID: 63605
Summary: wrong code at -O3 on x86_64-linux-gnu
Product: gcc
Version: 5.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: tree-optimization
Assignee: unassigned at gcc dot gnu.org
Reporter: su at cs dot ucdavis.edu

The current gcc trunk (as well as 4.8.x and 4.9.x) miscompiles the following code on x86_64-linux at -O3 in both 32-bit and 64-bit modes.

This is a regression from 4.7.x.

The miscompilation seems to be caused by the tree vectorizer as -fno-tree-vectorize makes it disappear.

$ gcc-trunk -v
Using built-in specs.
COLLECT_GCC=gcc-trunk
COLLECT_LTO_WRAPPER=/usr/local/gcc-trunk/libexec/gcc/x86_64-unknown-linux-gnu/5.0.0/lto-wrapper
Target: x86_64-unknown-linux-gnu
Configured with: ../gcc-trunk/configure --prefix=/usr/local/gcc-trunk --enable-languages=c,c++ --disable-werror --enable-multilib
Thread model: posix
gcc version 5.0.0 20141018 (experimental) [trunk revision 216429] (GCC)
$
$ gcc-trunk -O2 small.c; a.out
1
$ gcc-4.7 -O3 small.c; a.out
1
$
$ gcc-trunk -O3 small.c; a.out
0
$

int printf (const char *, ...);

int a, b[8] = { 2, 0, 0, 0, 0, 0, 0, 0 }, c[8];

int
main ()
{
  int d;
  for (; a < 8; a++)
    {
      d = b[a] >> 1;
      c[a] = d != 0;
    }
  printf ("%d\n", c[0]);
  return 0;
}
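The expected output of 1 can be checked by working through the loop by hand: b[0] is 2, so d = 2 >> 1 = 1 and c[0] = (1 != 0) = 1. A small scalar sketch of the same computation follows; the function name expected_c0 and the use of local arrays are illustrative choices, not part of the report.

```cpp
// Scalar reference for the reported loop: c[i] = ((b[i] >> 1) != 0).
int expected_c0()
{
  const int b[8] = { 2, 0, 0, 0, 0, 0, 0, 0 };
  int c[8];
  for (int a = 0; a < 8; a++) {
    int d = b[a] >> 1;   // for a == 0: 2 >> 1 == 1
    c[a] = d != 0;       // for a == 0: 1 != 0, so c[0] is 1
  }
  return c[0];
}
```

Any correct compilation of the reported program should therefore print 1, matching the -O2 and gcc-4.7 results in the session above.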
[Bug c++/57610] Reference initialized with temporary instead of sub-object of conversion result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=57610

Mitsuru Kariya changed:
What|Removed |Added
CC||kariya_mitsuru at hotmail dot com

--- Comment #10 from Mitsuru Kariya ---
The status of each of the issues mentioned above is:

  CWG 1287: DRWP
  CWG 1604: DR
  CWG 1650: NAD

Also, gcc HEAD (5.0.0) does not cause the slicing problem.
cf. 5.0.0 http://melpon.org/wandbox/permlink/xQQq1n98s7blSz8x
cf. 4.9.1 http://melpon.org/wandbox/permlink/l69tDXdptf1WVdAT

Note that these are compiled with the option "-fno-elide-constructors".

(Sorry, I don't know whether this issue should be "RESOLVED FIXED" or not, however.)
[Bug c/63307] [4.9/5 Regression] Cilk+ breaks -fcompare-debug bootstrap
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63307

--- Comment #7 from Andrew Pinski ---
(In reply to iverbin from comment #6)
> Author: iverbin
> Date: Mon Oct 20 15:22:09 2014
> New Revision: 216483

This breaks the build, as wd->decl_map will always contain a BLOCK, which does not have a UID. Please revert it; it is obvious you did not test it, as a simple bootstrap (with checking enabled, which is the default on the trunk) would have found this issue.
[Bug c++/63606] New: Missing a warning for binding a reference member to a stack allocated parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63606

Bug ID: 63606
Summary: Missing a warning for binding a reference member to a stack allocated parameter
Product: gcc
Version: 4.8.2
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: bcmpinc at hotmail dot com

The code below should produce a warning, as it binds a reference member to a stack-allocated parameter. However, gcc currently does not produce such a warning. The code is error-prone, as it will always result in a dangling reference: the object being referred to is destroyed when the constructor returns. Similar buggy code can accidentally be written when one forgets to insert the '&' to pass by reference.

Note that the clang compiler does emit a warning, named -Wdangling-field, for the code below.

struct Bar { int a; };
struct Foo{
  Foo(Bar arg) : bar(arg) {}
  Bar & bar;
};
int main() {
  Bar k;
  Foo oops(k);
  return 0;
}
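For contrast, a sketch of the presumably intended code, with the '&' that the report says is easy to forget, is shown below. The helper refers_to_caller_object and the initial value 7 are hypothetical additions used only to demonstrate that the member now refers to the caller's object.

```cpp
struct Bar { int a; };

struct Foo {
  // Taking the argument by reference keeps `bar` bound to the caller's
  // object instead of to a by-value parameter that dies with the constructor.
  explicit Foo(Bar& arg) : bar(arg) {}
  Bar& bar;
};

// True when the reference member aliases the caller's object.
bool refers_to_caller_object()
{
  Bar k{7};
  Foo ok(k);
  return &ok.bar == &k && ok.bar.a == 7;
}
```

The two versions differ by a single character in the constructor signature, which is the argument made above for having the compiler diagnose the by-value form.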
[Bug c++/63181] GCC should warn about "obvious" bugs in binding a reference to temporary
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63181 Jonathan Wakely changed: What|Removed |Added CC||bcmpinc at hotmail dot com --- Comment #3 from Jonathan Wakely --- *** Bug 63606 has been marked as a duplicate of this bug. ***
[Bug c++/63606] Missing a warning for binding a reference member to a stack allocated parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63606 Jonathan Wakely changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #1 from Jonathan Wakely --- dup *** This bug has been marked as a duplicate of bug 63181 ***
[Bug c++/63582] [5 Regression]: g++.dg/init/enum1.C ... (test for errors, line 12)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63582 DJ Delorie changed: What|Removed |Added CC||dj at redhat dot com Assignee|unassigned at gcc dot gnu.org |dj at redhat dot com --- Comment #3 from DJ Delorie --- Created attachment 33764 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=33764&action=edit proposed patch As there are places in the code that scan all of integer_type_kind[] without regard for whether those types are allowed or not, decline to create said types in the first place if they're not enabled. Unable to test at the moment due to PR 63307.
[Bug tree-optimization/63602] Wrong code w/ -O2 -ftree-loop-linear
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63602

--- Comment #1 from Arseny Solokha ---
It seems I've reduced the snippet too far. However, whether the global variables are declared static or not doesn't change anything.
[Bug regression/61538] gcc after commit 39a8c5ea produces bad code for MIPS R1x000 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538

--- Comment #21 from Andrew Pinski ---
(In reply to Joshua Kinard from comment #20)
> Created attachment 33166 [details]
> Disassembly of the ASM from 'sln' compiled by a non-working gcc-4.8.0.
>
> This is the objdump disassembly of the '__lll_lock_wait_private()' function
> from the sln binary from glibc, statically compiled, by a BAD gcc-4.8.0
> checkout (7882e02e) no previous commits reversed. This sln copy will hang
> trying to print usage instructions.

Do you have the preprocessed source for this?
[Bug regression/61538] gcc after commit 39a8c5ea produces bad code for MIPS R1x000 CPUs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61538

--- Comment #22 from Joshua Kinard ---
(In reply to Andrew Pinski from comment #21)
> (In reply to Joshua Kinard from comment #20)
> > Created attachment 33166 [details]
> > Disassembly of the ASM from 'sln' compiled by a non-working gcc-4.8.0.
> >
> > This is the objdump disassembly of the '__lll_lock_wait_private()' function
> > from the sln binary from glibc, statically compiled, by a BAD gcc-4.8.0
> > checkout (7882e02e) no previous commits reversed. This sln copy will hang
> > trying to print usage instructions.
>
> Do you have the preprocessed source for this?

Not currently. I'd have to intercept a glibc build and grab the compile string for sln.c and use that to create the preprocessed source. I'll see if I can start a run tonight or tomorrow for this.

That said, I have worked out that it's got something to do with gcc's built-in atomics added for 4.8. In glibc's sysdeps/mips/bits/atomic.h, there are conditional macros that pick whether to use the old __sync_* builtins if gcc-4.7 and earlier, or the new __atomic_* builtins in gcc-4.8 or later. This is why there is a difference in the output assembler between the 4.7 and 4.8 sln files.

Under gcc-4.7, atomic_exchange_acq falls back to __sync_lock_test_and_set, which is an acquire memmodel operation, and this works fine on an R14000 processor. It's under gcc-4.8+, whatever atomic_exchange_acquire() maps to there, that hangs up on the processor.

I checked the kernel side, and the futex is getting lost in freezable_schedule() in include/linux/freezer.h. I haven't traced beyond that point yet. The futex will exit the scheduler when you ctrl+c it.

If you delete or comment out the gcc-4.8 defines for the atomic ops in sysdeps/mips/bits/atomic.h in glibc to force it back to the older __sync_* ops, it'll build with 4.8+ and the resulting sln WILL work. So it's definitely a gcc issue.

I got a hold of Maxim Kuvyrkov regarding commit 39a8c5ea, but I haven't heard back from him since early September, despite sending two follow-up e-mails.
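To make the two builtin families concrete, the sketch below shows an acquire exchange written once with the older __sync_* builtin and once with the newer __atomic_* builtin. The wrapper names are hypothetical, and the use of __atomic_exchange_n with __ATOMIC_ACQUIRE is only an assumption about what the glibc macro maps to; the comment above does not say which builtin is actually used.

```cpp
// Older style: __sync_lock_test_and_set stores newval and returns the
// previous contents; GCC documents it as an acquire barrier.
int exchange_acq_sync(int *mem, int newval)
{
  return __sync_lock_test_and_set(mem, newval);
}

// Newer style: the memory order is passed explicitly.
// ASSUMPTION: this is one plausible mapping, not confirmed by the report.
int exchange_acq_atomic(int *mem, int newval)
{
  return __atomic_exchange_n(mem, newval, __ATOMIC_ACQUIRE);
}
```

Both wrappers return the old value and leave newval in *mem, so comparing the code GCC emits for the two on the affected MIPS target could help narrow down where the generated sequences diverge.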
[Bug lto/63607] New: run fail with -flto -mfloat-abi=softfp for armeb-linux-gnueabi-gcc
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63607

Bug ID: 63607
Summary: run fail with -flto -mfloat-abi=softfp for armeb-linux-gnueabi-gcc
Product: gcc
Version: 4.7.0
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: lto
Assignee: unassigned at gcc dot gnu.org
Reporter: fei.yang0953 at gmail dot com

testsuite/gcc.dg/torture/stackalign/builtin-apply-4.c:

/* PR tree-optimization/20076 */
/* { dg-do run } */

extern void abort (void);

double
foo (int arg)
{
  if (arg != 116)
    abort();
  return arg + 1;
}

inline double
bar (int arg)
{
  foo (arg);
  __builtin_return (__builtin_apply ((void (*) ()) foo, __builtin_apply_args (), 16));
}

int
main (int argc, char **argv)
{
  if (bar (116) != 117.0)
    abort ();
  return 0;
}

Compile option:
armeb-linux-gnueabi-gcc builtin-apply-4.c -static -mfloat-abi=softfp -flto

Disassembly:
076c :
 76c: e92d4800 push {fp, lr}
 770: e28db004 add fp, sp, #4
 774: e3a02113 mov r2, #-1073741820 ; 0xc004
 778: e30a3aaa movw r3, #43690 ; 0x
 77c: e34a3aaa movt r3, #43690 ; 0x
 780: e5823000 str r3, [r2]
 784: e3a00074 mov r0, #116 ; 0x74
 788: ebaa bl 638
 78c: eeb06b40 vmov.f64 d6, d0
 790: ed9f7b0a vldr d7, [pc, #40] ; 7c0
 794: eeb46b47 vcmp.f64 d6, d7

Analysis:
Return value is not passed correctly. As we can see from line 790, main gets the return value from d0 register, which is wrong as we use -mfloat-abi=softfp here.