[Bug middle-end/91205] -fstack-protector-strong -D_FORTIFY_SOURCE=2 breaks tftpd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91205 --- Comment #5 from Ricardo Ribalda --- Thanks for the clarification! (And now, after using C/gcc daily for over 18 years I realise that I have no fucking clue about C and gcc :) )
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 Jakub Jelinek changed: What|Removed |Added Status|NEW |ASSIGNED Component|tree-optimization |target Assignee|unassigned at gcc dot gnu.org |jakub at gcc dot gnu.org
[Bug c++/90101] [P0732] Error using non-type template parameter in a template template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90101 --- Comment #3 from Jason Merrill --- Author: jason Date: Fri Jul 19 07:29:15 2019 New Revision: 273592 URL: https://gcc.gnu.org/viewcvs?rev=273592&root=gcc&view=rev Log: PR c++/90101 - dependent class non-type parameter. We shouldn't complain that a dependent type is incomplete. * pt.c (invalid_nontype_parm_type_p): Check for dependent class type. Added: trunk/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C trunk/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C Modified: trunk/gcc/cp/ChangeLog trunk/gcc/cp/pt.c
[Bug tree-optimization/91198] GCC not generating AVX-512 compress/expand instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198 Richard Biener changed: What|Removed |Added CC||rguenth at gcc dot gnu.org Blocks||53947 --- Comment #3 from Richard Biener --- How would a vectorized version with the intrinsic look like? I can see how it works for the first 16 floats (the first input vector) but after that? How does the compression work when it crosses vector size boundary? And yes, this isn't implemented yet, I wasn't even aware of such instructions. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug c++/90101] [P0732] Error using non-type template parameter in a template template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90101 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |10.0 --- Comment #4 from Jason Merrill --- Fixed for GCC 10.
[Bug c++/90100] [P0732] Cannot write a type-trait matching non-type class template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90100 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |10.0 --- Comment #1 from Jason Merrill --- Fixed for GCC 10.
[Bug tree-optimization/91200] ICE on valid code at -O1: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91200 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-07-19 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- Latent issue - mine.
[Bug c++/90097] [concepts] Error while comparing 2 non-type parameters in constraints
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90097 Jason Merrill changed: What|Removed |Added Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-19 CC||jason at gcc dot gnu.org Summary|[P0732] Error while |[concepts] Error while |comparing 2 non-type|comparing 2 non-type |parameters in constraints |parameters in constraints Ever confirmed|0 |1 --- Comment #1 from Jason Merrill --- This works on the concepts-cxx2a branch.
[Bug tree-optimization/91200] ICE on valid code at -O1: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91200 --- Comment #2 from Richard Biener --- So we're applying cselim to if (a.0_1 != 0) goto ; [50.00%] else goto ; [50.00%] [local count: 766958447]: # i_20 = PHI h[i_20] = &c; [local count: 168730858]: # i_21 = PHI where bb3 has only a single stmt but the PHI node precludes simply inserting the stmt elsewhere. The PHI node is left from non-iterating FRE which isn't able to do all simplification in one go.
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 Richard Biener changed: What|Removed |Added Keywords||missed-optimization Status|UNCONFIRMED |NEW Last reconfirmed||2019-07-19 Blocks||53947 Target Milestone|--- |7.5 Summary|[7~9 Regression] SIMD not |[7/8/9/10 Regression] SIMD |generated for horizontal|not generated for |sum of bytes in array |horizontal sum of bytes in ||array Ever confirmed|0 |1 --- Comment #2 from Richard Biener --- Thus confirmed. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 [Bug 53947] [meta-bug] vectorizer missed-optimizations
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 Richard Biener changed: What|Removed |Added Target|x86 |x86_64-*-* i?86-*-* CC||jakub at gcc dot gnu.org, ||rguenth at gcc dot gnu.org Component|c |middle-end --- Comment #3 from Richard Biener --- If you use signed adds it makes a semantic difference and we narrow it to (unsigned char)a + (unsigned char)b to make it well-defined in case of overflow. I think the same issue applies for the shift where C says that a >> b is really (int)a >> b and thus it is well-defined if b == 24. If we write it as GIMPLE (char)a >> b then we conclude b has to be in the range [0, 7] which would be wrong. So on GIMPLE we'd need to do (char)a >> (b&7) which I think wouldn't be nicely canonical. So this is probably best fixed at RTL expansion time or sometime shortly before RTL expansion in a "narrowing/widening" pass taking into account target capabilities (and ABIs). I think the vectorizer has similar tricks in the pattern recognizer.
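The promotion rule mentioned in comment #3 can be illustrated with a small C sketch. The function name `shift_promoted` is hypothetical, and the sketch assumes a target where int is 32 bits wide, so a count of 24 is in range even though the operand is only 8 bits wide.

```c
#include <assert.h>
#include <limits.h>

/* C promotes `a` to int before shifting, so the shift is carried out
   in int and any count below the width of int is well-defined.
   shift_promoted is a hypothetical name used only for illustration. */
unsigned char shift_promoted(unsigned char a, unsigned char b)
{
    /* A count at or above the width of int would be undefined. */
    assert(b < sizeof(int) * CHAR_BIT);
    return (unsigned char)(a >> b);
}
```

With a 32-bit int, a call such as shift_promoted(0x80, 24) is valid and returns 0, which is the case a naive 8-bit shift with a count range of [0, 7] would not cover.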
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 --- Comment #4 from Richard Biener --- (In reply to Jakub Jelinek from comment #3) > --- gcc/optabs.c.jj 2019-07-15 10:53:10.743205405 +0200 > +++ gcc/optabs.c 2019-07-19 00:38:20.271852242 +0200 > @@ -2972,6 +2972,17 @@ expand_unop (machine_mode mode, optab un >return target; > } > > + /* Emit ~op0 as op0 ^ -1. */ > + if (unoptab == one_cmpl_optab > + && (SCALAR_INT_MODE_P (mode) || GET_MODE_CLASS (mode) == > MODE_VECTOR_INT) > + && optab_handler (xor_optab, mode) != CODE_FOR_nothing) > +{ > + temp = expand_binop (mode, xor_optab, op0, CONSTM1_RTX (mode), > +target, unsignedp, OPTAB_DIRECT); > + if (temp) > + return temp; > +} > + >if (optab_to_code (unoptab) == NEG) > { >/* Try negating floating point values by flipping the sign bit. */ > > seems to work for me. Or of course something similar can be done in > config/i386/mmx.md, basically copy the sse.md one_cmpl2 pattern to > mmx.md with TARGET_MMX_WITH_SSE and MMXMODEI iterator. Doing it in the middle-end is better IMHO. But yeah, we're somewhat sloppy with vector optabs checks in match.pd where it seems "obvious" targets need to have support...
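The identity the quoted patch relies on (emit ~op0 as op0 ^ -1) can be checked in plain C. This is only a sketch of the arithmetic, not GCC code; the helper names `not_via_xor` and `not_lanes_via_xor` are hypothetical, and the array version merely stands in for a vector integer mode.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scalar form of the identity: ~x == x ^ -1 (all bits set). */
uint32_t not_via_xor(uint32_t x)
{
    return x ^ UINT32_MAX;
}

/* Lane-wise form, loosely modelling a vector integer mode as an
   array of 32-bit lanes. */
void not_lanes_via_xor(uint32_t *lanes, size_t n)
{
    assert(lanes != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        lanes[i] = not_via_xor(lanes[i]);
}
```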
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 rsandifo at gcc dot gnu.org changed: What|Removed |Added CC||rsandifo at gcc dot gnu.org --- Comment #5 from rsandifo at gcc dot gnu.org --- > Doing it in the middle-end is better IMHO. But yeah, we're somewhat sloppy > with vector optabs checks in match.pd where it seems "obvious" targets need > to have support... +1 FWIW. The patch in comment 3 looks good.
[Bug tree-optimization/91207] [10 Regression] Wrong code with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91207 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Known to work||9.1.0 Last reconfirmed||2019-07-19 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Blocks||91178 Ever confirmed|0 |1 Summary|Wrong code with -O3 |[10 Regression] Wrong code ||with -O3 Target Milestone|--- |10.0 --- Comment #1 from Richard Biener --- Confirmed. Works on the GCC 9 branch. Caused by the fix for PR91178. Referenced Bugs: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91178 [Bug 91178] [9 Regression] Infinite recursion in split_constant_offset in slp after r260289
[Bug bootstrap/91208] [10 Regression] bootstrap comparison failure for objc and obj-c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91208 Richard Biener changed: What|Removed |Added Target||x86-64-linux Component|objc|bootstrap Target Milestone|--- |10.0
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #4 from Jakub Jelinek --- Looking at x86 shl/shr instructions, it seems they don't do the SHIFT_COUNT_TRUNCATED masking, but actually mask always the shift count with & 31 (unless 64-bit shift, then it is indeed SHIFT_COUNT_TRUNCATED). So, I'd think we want alternate patterns to the QImode and HImode x86 shifts that represent it as zero (for shift left or logical shift right) or sign (for shift left maybe too or arithmetic shift right) extension of the QI/HI mode operand to SImode, performing a shift in SImode and a SUBREG back to QI/HImode and keep it expressed like that in the IL to say explicitly what the instruction does.
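A rough C model of the count masking described in comment #4 may help. It assumes an 8-bit logical right shift whose count is masked with & 31 rather than & 7, and a 32-bit int on the C side; `model_shr8` and `c_shr8` are hypothetical names, and this is not GCC or hardware documentation.

```c
#include <assert.h>
#include <stdint.h>

/* Model of an 8-bit logical right shift whose count is masked with
   & 31, as described in the comment.  Hypothetical helper. */
uint8_t model_shr8(uint8_t value, uint8_t count)
{
    unsigned masked = count & 31u;      /* 5-bit count mask */
    /* A masked count of 8..31 shifts every bit out of an 8-bit value. */
    return masked >= 8 ? 0 : (uint8_t)(value >> masked);
}

/* What C computes for (unsigned char)(a >> b) with a promoted to int. */
uint8_t c_shr8(uint8_t a, uint8_t b)
{
    assert(b < 32);                     /* assumes a 32-bit int */
    return (uint8_t)(a >> b);
}
```

Under these assumptions the two functions agree for every count from 0 to 31, which is why an 8-bit shift of this kind can stand in for the promoted SImode shift followed by truncation.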
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #5 from Jakub Jelinek --- Though, note the combiner doesn't try to match that, nor with the void foo (unsigned char a, unsigned char b, unsigned char *c) { *c = a >> b; } case, the final subreg is in some other instruction (e.g. the set of hard register to it, or memory store) and combiner doesn't try to split it at the point of the subreg but of the subreg operand. Dunno what options we have, hand recognize in some machine specific pass, or a new optab for that kind of thing, something else?
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #4) > Looking at x86 shl/shr instructions, it seems they don't do the > SHIFT_COUNT_TRUNCATED masking, but actually mask always the shift count with > & 31 (unless 64-bit shift, then it is indeed SHIFT_COUNT_TRUNCATED). Yes, this is the reason x86 doesn't define SHIFT_COUNT_TRUNCATED (also, the documentation is confusing to me, talking about some "real or pretended" bit field operations). OTOH, x86 has several combined patterns that remove masking from SImode and DImode shift/rotate instructions. It is possible to define TARGET_SHIFT_TRUNCATION_MASK, but when experimenting with this hook, I didn't find any effect on shifts.
[Bug rtl-optimization/53652] *andn* isn't used for vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652 Jakub Jelinek changed: What|Removed |Added CC||segher at gcc dot gnu.org --- Comment #3 from Jakub Jelinek --- Ran into this again in context of PR91204, there is another case that isn't matched for a different reason: int a, b, c[64]; void foo (void) { int i; for (i = 0; i < 64; i++) c[i] = ~c[i] & b; } In this case the loop has been unrolled and combiner even tries to match (set (reg:V4SI 137 [ vect__4.8 ]) (and:V4SI (not:V4SI (mem/c:V4SI (symbol_ref:DI ("c") [flags 0x2] ) [1 MEM [(int *)&c]+0 S16 A128])) (reg:V4SI 132))) but doesn't match that as memory operand is not allowed in the andnot patterns (perhaps it should and we should just wait for reload to cure it up).
[Bug tree-optimization/91178] [9 Regression] Infinite recursion in split_constant_offset in slp after r260289
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91178 Bug 91178 depends on bug 91207, which changed state. Bug 91207 Summary: [10 Regression] Wrong code with -O3 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91207 What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED
[Bug tree-optimization/91207] [10 Regression] Wrong code with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91207 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #2 from Richard Biener --- Fixed.
[Bug tree-optimization/91207] [10 Regression] Wrong code with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91207 --- Comment #3 from Richard Biener --- Author: rguenth Date: Fri Jul 19 08:47:41 2019 New Revision: 273593 URL: https://gcc.gnu.org/viewcvs?rev=273593&root=gcc&view=rev Log: 2019-07-19 Richard Biener PR tree-optimization/91207 Revert 2019-07-17 Richard Biener PR tree-optimization/91178 * tree-vect-stmts.c (get_group_load_store_type): For SLP loads with a gap larger than the vector size always use VMAT_STRIDED_SLP. (vectorizable_load): For VMAT_STRIDED_SLP with a permutation avoid loading vectors that are only contained in the gap and thus are not needed. * gcc.dg/torture/pr91207.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr91207.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vect-stmts.c
[Bug tree-optimization/91178] [9 Regression] Infinite recursion in split_constant_offset in slp after r260289
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91178 --- Comment #7 from Richard Biener --- Author: rguenth Date: Fri Jul 19 08:47:41 2019 New Revision: 273593 URL: https://gcc.gnu.org/viewcvs?rev=273593&root=gcc&view=rev Log: 2019-07-19 Richard Biener PR tree-optimization/91207 Revert 2019-07-17 Richard Biener PR tree-optimization/91178 * tree-vect-stmts.c (get_group_load_store_type): For SLP loads with a gap larger than the vector size always use VMAT_STRIDED_SLP. (vectorizable_load): For VMAT_STRIDED_SLP with a permutation avoid loading vectors that are only contained in the gap and thus are not needed. * gcc.dg/torture/pr91207.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr91207.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-vect-stmts.c
[Bug c++/90098] [P0732] Partial specialization of a class template with variadic parameter pack fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90098 --- Comment #2 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:53:07 2019 New Revision: 273597 URL: https://gcc.gnu.org/viewcvs?rev=273597&root=gcc&view=rev Log: PR c++/90098 - partial specialization and class non-type parms. A non-type template parameter of class type used in an expression has const-qualified type; the pt.c hunks deal with this difference from the unqualified type of the parameter declaration. When we use such a parameter as an argument to another template, we don't want to confuse things by copying it, we should pass it straight through. And we might as well skip copying other classes in constant evaluation context in a template, too; we'll get the copy semantics at instantiation time. PR c++/90099 PR c++/90101 * call.c (build_converted_constant_expr_internal): Don't copy. * pt.c (process_partial_specialization): Allow VIEW_CONVERT_EXPR around class non-type parameter. (unify) [TEMPLATE_PARM_INDEX]: Ignore cv-quals. (invalid_nontype_parm_type_p): Check for dependent class type. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C Modified: branches/gcc-9-branch/gcc/cp/ChangeLog branches/gcc-9-branch/gcc/cp/call.c branches/gcc-9-branch/gcc/cp/pt.c
[Bug c++/63149] wrong auto deduction from braced-init-list
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63149 --- Comment #9 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:52:50 2019 New Revision: 273595 URL: https://gcc.gnu.org/viewcvs?rev=273595&root=gcc&view=rev Log: PR c++/63149 - wrong auto deduction from braced-init-list 2019-06-04 Nina Dinka Ranns gcc/cp/ * pt.c (listify_autos): Use non cv qualified auto_node in std::initializer_list. testsuite/ * g++.dg/cpp0x/initlist-deduce2.C: New test. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp0x/initlist-deduce2.C Modified: branches/gcc-9-branch/gcc/cp/ChangeLog branches/gcc-9-branch/gcc/cp/pt.c
[Bug c++/90101] [P0732] Error using non-type template parameter in a template template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90101 --- Comment #5 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:53:07 2019 New Revision: 273597 URL: https://gcc.gnu.org/viewcvs?rev=273597&root=gcc&view=rev Log: PR c++/90098 - partial specialization and class non-type parms. A non-type template parameter of class type used in an expression has const-qualified type; the pt.c hunks deal with this difference from the unqualified type of the parameter declaration. When we use such a parameter as an argument to another template, we don't want to confuse things by copying it, we should pass it straight through. And we might as well skip copying other classes in constant evaluation context in a template, too; we'll get the copy semantics at instantiation time. PR c++/90099 PR c++/90101 * call.c (build_converted_constant_expr_internal): Don't copy. * pt.c (process_partial_specialization): Allow VIEW_CONVERT_EXPR around class non-type parameter. (unify) [TEMPLATE_PARM_INDEX]: Ignore cv-quals. (invalid_nontype_parm_type_p): Check for dependent class type. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C Modified: branches/gcc-9-branch/gcc/cp/ChangeLog branches/gcc-9-branch/gcc/cp/call.c branches/gcc-9-branch/gcc/cp/pt.c
[Bug c++/90099] [P0732] Partial specialization of a class template with variadic parameter pack fails after adding non-type template parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90099 --- Comment #2 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:53:07 2019 New Revision: 273597 URL: https://gcc.gnu.org/viewcvs?rev=273597&root=gcc&view=rev Log: PR c++/90098 - partial specialization and class non-type parms. A non-type template parameter of class type used in an expression has const-qualified type; the pt.c hunks deal with this difference from the unqualified type of the parameter declaration. When we use such a parameter as an argument to another template, we don't want to confuse things by copying it, we should pass it straight through. And we might as well skip copying other classes in constant evaluation context in a template, too; we'll get the copy semantics at instantiation time. PR c++/90099 PR c++/90101 * call.c (build_converted_constant_expr_internal): Don't copy. * pt.c (process_partial_specialization): Allow VIEW_CONVERT_EXPR around class non-type parameter. (unify) [TEMPLATE_PARM_INDEX]: Ignore cv-quals. (invalid_nontype_parm_type_p): Check for dependent class type. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class18.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class19.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class20.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class21.C branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp2a/nontype-class22.C Modified: branches/gcc-9-branch/gcc/cp/ChangeLog branches/gcc-9-branch/gcc/cp/call.c branches/gcc-9-branch/gcc/cp/pt.c
[Bug c++/82081] Tail call optimisation of noexcept function leads to exception allowed through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82081 --- Comment #10 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:52:41 2019 New Revision: 273594 URL: https://gcc.gnu.org/viewcvs?rev=273594&root=gcc&view=rev Log: PR c++/82081 - tail call optimization breaks noexcept If a noexcept function calls a function that might throw, doing the tail call optimization means that an exception thrown in the called function will propagate out, breaking the noexcept specification. So we need to prevent the optimization in that case. * tree-tailcall.c (find_tail_calls): Don't turn a call from a nothrow function to a might-throw function into a tail call. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/tree-ssa/tail-call-1.C Modified: branches/gcc-9-branch/gcc/ChangeLog branches/gcc-9-branch/gcc/tree-tailcall.c
[Bug c++/85552] Adding curly braces to the declaration of a std::unique_ptr to a forward declared class breaks compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85552 --- Comment #6 from Jason Merrill --- Author: jason Date: Fri Jul 19 08:52:58 2019 New Revision: 273596 URL: https://gcc.gnu.org/viewcvs?rev=273596&root=gcc&view=rev Log: PR c++/85552 - wrong instantiation of dtor for DMI. * typeck2.c (digest_nsdmi_init): Set tf_no_cleanup for direct-init. Added: branches/gcc-9-branch/gcc/testsuite/g++.dg/cpp0x/nsdmi-list5.C Modified: branches/gcc-9-branch/gcc/cp/ChangeLog branches/gcc-9-branch/gcc/cp/typeck2.c
[Bug c++/90100] [P0732] Cannot write a type-trait matching non-type class template parameters
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90100 Jason Merrill changed: What|Removed |Added Target Milestone|10.0|9.2 --- Comment #2 from Jason Merrill --- And 9.2.
[Bug c++/90099] [P0732] Partial specialization of a class template with variadic parameter pack fails after adding non-type template parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90099 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Assignee|unassigned at gcc dot gnu.org |jason at gcc dot gnu.org Target Milestone|--- |9.2 --- Comment #3 from Jason Merrill --- Fixed for 9.2/10.
[Bug c++/90098] [P0732] Partial specialization of a class template with variadic parameter pack fails
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90098 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |9.2 --- Comment #3 from Jason Merrill --- Fixed for 9.2/10.
[Bug c++/90101] [P0732] Error using non-type template parameter in a template template argument
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90101 Jason Merrill changed: What|Removed |Added Target Milestone|10.0|9.2
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #7 from Jakub Jelinek --- Perhaps we could define patterns for combine like: (set (match_operand:SI 0 "register_operand" "=q") (ashiftrt:SI (zero_extend:SI (match_operand:QI 1 "register_operand" "q")) (match_operand:QI 2 "nonmemory_operand" "cI"))) (clobber (reg:CC 17 flags)) etc. and split them before reload into an insn that does that with a subreg:QI followed by zero extension (or sign extension in certain cases) and then hope before reload the zero extension will be cancelled with the following subreg use (I guess we can't look at immediate uses during splitting). The disadvantage would be that this would match even if we aren't using a subreg at the end, so would match even the case of: unsigned int foo (unsigned char a, unsigned char b) { return a >> b; } and would change the movzbl %dil, %edi; shrq %cl, %edi into shrq %cl, %dil; movzbl %dil, %edi. Dunno if that is acceptable.
[Bug c++/85552] Adding curly braces to the declaration of a std::unique_ptr to a forward declared class breaks compilation
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85552 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |9.2 --- Comment #7 from Jason Merrill --- Fixed for 9.2/10.
[Bug tree-optimization/91201] [7/8/9/10 Regression] SIMD not generated for horizontal sum of bytes in array
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91201 --- Comment #3 from Joel Yliluoma --- For the record, for this particular case (8-bit checksum of an array, 16 bytes in this case) there exists even more optimal SIMD code, which ICC (version 18 or greater) generates automatically.

        vmovups   xmm0, XMMWORD PTR bytes[rip]    #5.9
        vpxor     xmm2, xmm2, xmm2                #4.41
        vpaddb    xmm0, xmm2, xmm0                #4.41
        vpsrldq   xmm1, xmm0, 8                   #4.41
        vpaddb    xmm3, xmm0, xmm1                #4.41
        vpsadbw   xmm4, xmm2, xmm3                #4.41
        vmovd     eax, xmm4                       #4.41
        movsx     rax, al                         #4.41
        ret                                       #7.16
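For reference, the kind of scalar loop under discussion can be sketched as follows. The report's actual test case is not reproduced in this message, so the function name `byte_checksum`, the unsigned 8-bit accumulator, and the signature are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Horizontal sum of bytes with an 8-bit accumulator (hypothetical
   stand-in for the loop in the report). */
uint8_t byte_checksum(const uint8_t *bytes, size_t n)
{
    assert(bytes != NULL || n == 0);
    uint8_t sum = 0;
    for (size_t i = 0; i < n; i++)
        sum = (uint8_t)(sum + bytes[i]);   /* wraps modulo 256 */
    return sum;
}
```

Because the accumulator wraps modulo 256, the partial sums can be combined in any order, which is the property a SIMD reduction of this loop depends on.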
[Bug rtl-optimization/53652] *andn* isn't used for vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652 --- Comment #4 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #3) > Ran into this again in context of PR91204, there is another case that isn't > matched for a different reason: > int a, b, c[64]; > > void > foo (void) > { > int i; > for (i = 0; i < 64; i++) > c[i] = ~c[i] & b; > } > In this case the loop has been unrolled and combiner even tries to match > (set (reg:V4SI 137 [ vect__4.8 ]) > (and:V4SI (not:V4SI (mem/c:V4SI (symbol_ref:DI ("c") [flags 0x2] > ) [1 MEM [(int *)&c]+0 S16 A128])) > (reg:V4SI 132))) > but doesn't match that as memory operand is not allowed in the andnot > patterns (perhaps it should and we should just wait for reload to cure it > up). It should also accept memory operand, this is the way we trick combiner in several other places.
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 --- Comment #6 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #3) > seems to work for me. Or of course something similar can be done in > config/i386/mmx.md, basically copy the sse.md one_cmpl2 pattern to > mmx.md with TARGET_MMX_WITH_SSE and MMXMODEI iterator. Like this? --cut here-- diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md index 4c71e66e6607..c78b33b510a6 100644 --- a/gcc/config/i386/mmx.md +++ b/gcc/config/i386/mmx.md @@ -1158,6 +1158,14 @@ ;; ; +(define_expand "one_cmpl2" + [(set (match_operand:MMXMODEI 0 "register_operand") + (xor:MMXMODEI + (match_operand:MMXMODEI 1 "register_operand") + (match_dup 2)))] + "TARGET_MMX_WITH_SSE" + "operands[2] = force_reg (mode, CONSTM1_RTX (mode));") + (define_insn "mmx_andnot3" [(set (match_operand:MMXMODEI 0 "register_operand" "=y,x,Yv") (and:MMXMODEI --cut here--
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 --- Comment #7 from Jakub Jelinek --- Yes. But #c3 does the same thing in the middle-end. Though, if there is a backend pattern, it will be used also by the vectorizer or taken into account by generic vector lowering. So maybe we want both?
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 --- Comment #8 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #7) > Yes. But #c3 does the same thing in the middle-end. > Though, if there is a backend pattern, it will be used also by the > vectorizer or taken into account by generic vector lowering. > So maybe we want both? Yes, this was also my intention. A named pattern can be used elsewhere, too.
[Bug tree-optimization/91198] GCC not generating AVX-512 compress/expand instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198 --- Comment #4 from Moritz Kreutzer --- > How would a vectorized version with the intrinsic look like?

Something along the lines of (assuming insize is a multiple of 16):

__mmask16 mask;
__m512 vin;
__m512 const thr = _mm512_set1_ps(threshold);
int o = 0;
for (int i = 0; i < insize; i+=16) {
  vin = _mm512_loadu_ps(&input[i]);
  mask = _mm512_cmplt_ps_mask(vin, thr);
  _mm512_mask_compressstoreu_ps(&output[o], mask, vin);
  o += __builtin_popcount(_mm512_mask2int(mask));
}
*outsize = o;

I don't really understand your other two questions, but maybe the intrinsics code will help.
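The intrinsic loop above can be modelled in portable C to show how the output position carries across vector boundaries. This is a sketch under assumptions: the scalar loop is a guess at the shape of the original source (it is not quoted in this thread), and `compress_scalar`, `compress_chunked`, and `LANES` are hypothetical names. No AVX-512 hardware is needed to run it.

```c
#include <assert.h>
#include <stddef.h>

enum { LANES = 16 };   /* floats per 512-bit vector */

/* Scalar reference: keep every element below the threshold. */
size_t compress_scalar(const float *in, size_t n, float thr, float *out)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++)
        if (in[i] < thr)
            out[o++] = in[i];
    return o;
}

/* Chunked model of the intrinsic loop: build a 16-bit mask per chunk,
   store the selected lanes contiguously at out + o, then advance o by
   the number of selected lanes. */
size_t compress_chunked(const float *in, size_t n, float thr, float *out)
{
    assert(n % LANES == 0);
    size_t o = 0;
    for (size_t i = 0; i < n; i += LANES) {
        unsigned mask = 0;
        for (int l = 0; l < LANES; l++)
            if (in[i + l] < thr)
                mask |= 1u << l;
        size_t k = 0;
        for (int l = 0; l < LANES; l++)
            if (mask & (1u << l))
                out[o + k++] = in[i + l];
        o += k;            /* k equals the population count of mask */
    }
    return o;
}
```

Each chunk writes only as many elements as its mask selects, so the next chunk starts writing immediately after them. The output is therefore packed without gaps regardless of where the input vector boundaries fall.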
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #8 from Uroš Bizjak --- (In reply to Jakub Jelinek from comment #7) > Perhaps we could define patterns for combine like: > (set (match_operand:SI 0 "register_operand" "=q") > (ashiftrt:SI (zero_extend:SI (match_operand:QI 1 > "register_operand" "q")) > (match_operand:QI 2 "nonmemory_operand" "cI"))) > (clobber (reg:CC 17 flags)) > etc. and split them before reload into an insn that does that with a > subreg:QI followed by zero extension (or sign extension in certain cases) > and then hope before reload the zero extension will be cancelled with the > following subreg use (I guess we can't look at immediate uses during > splitting). The split pass is performed quite late in the game, and the narrowed insn would miss the combine pass (cf. Comment #0 for the optimization opportunity). > The disadvantage would be that this would match even if we aren't using a > subreg at the end, so would match even the case of: > unsigned int foo (unsigned char a, unsigned char b) > { > return a >> b; > } > and would change the movzbl %dil, %edi; shrq %cl, %edi into shrq %cl, %dil; > movzbl %dil, %edi. Dunno if that is acceptable. I think that a much cleaner solution would be to do the transformation in a kind of "narrowing/widening" pass, as proposed by Richard in Comment #3. There, all type data can be easily determined and the operation can be narrowed if the target provides an adequate set of instructions.
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #9 from Jakub Jelinek --- We have several PRs for a narrowing/widening pass on late gimple, but I'm afraid this exact thing is not something that can be done there, because the semantics of what the x86 instructions do is quite weird and hard to express in a single gimple statement (and hard to explain to the middle-end that x86 has something like that). As I said, we could add a new optab for that kind of thing and pattern match it during expansion, or it can be done the above way to see how well that works, or in some machine specific pass.
[Bug middle-end/91190] [10 Regression] ICE on valid code: in hashtab_chk_error, at hash-table.c:137
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91190 --- Comment #2 from Jakub Jelinek --- Author: jakub Date: Fri Jul 19 10:26:23 2019 New Revision: 273599 URL: https://gcc.gnu.org/viewcvs?rev=273599&root=gcc&view=rev Log: PR middle-end/91190 * function.c (insert_temp_slot_address): Store into the hash table a copy of address to avoid RTL sharing issues. * gcc.c-torture/compile/pr91190.c: New test. Added: trunk/gcc/testsuite/gcc.c-torture/compile/pr91190.c Modified: trunk/gcc/ChangeLog trunk/gcc/function.c trunk/gcc/testsuite/ChangeLog
[Bug tree-optimization/91211] New: [10 Regression] wrong code with __builtin_memset() and __builtin_memcpy() at -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91211 Bug ID: 91211 Summary: [10 Regression] wrong code with __builtin_memset() and __builtin_memcpy() at -O1 and above Product: gcc Version: 10.0 Status: UNCONFIRMED Keywords: wrong-code Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: zsojka at seznam dot cz Target Milestone: --- Host: x86_64-pc-linux-gnu Created attachment 46612 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46612&action=edit reduced testcase Output: $ x86_64-pc-linux-gnu-gcc -O testcase.c $ ./a.out Aborted $ x86_64-pc-linux-gnu-gcc -v Using built-in specs. COLLECT_GCC=/repo/gcc-trunk/binary-latest-amd64/bin/x86_64-pc-linux-gnu-gcc COLLECT_LTO_WRAPPER=/repo/gcc-trunk/binary-trunk-273590-checking-yes-rtl-df-extra-nobootstrap-amd64/bin/../libexec/gcc/x86_64-pc-linux-gnu/10.0.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: /repo/gcc-trunk//configure --enable-languages=c,c++ --enable-valgrind-annotations --disable-nls --enable-checking=yes,rtl,df,extra --disable-bootstrap --with-cloog --with-ppl --with-isl --build=x86_64-pc-linux-gnu --host=x86_64-pc-linux-gnu --target=x86_64-pc-linux-gnu --with-ld=/usr/bin/x86_64-pc-linux-gnu-ld --with-as=/usr/bin/x86_64-pc-linux-gnu-as --disable-libstdcxx-pch --prefix=/repo/gcc-trunk//binary-trunk-273590-checking-yes-rtl-df-extra-nobootstrap-amd64 Thread model: posix Supported LTO compression algorithms: zlib zstd gcc version 10.0.0 20190719 (experimental) (GCC)
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #10 from Jakub Jelinek --- I've tried: --- gcc/config/i386/i386.md.jj 2019-07-19 11:56:10.475964435 +0200 +++ gcc/config/i386/i386.md 2019-07-19 12:43:52.461469500 +0200 @@ -10661,6 +10661,43 @@ "ix86_split_ (operands, NULL_RTX, mode); DONE;" [(set_attr "type" "multi")]) +(define_insn_and_split "*shrqi3_3" + [(set (match_operand:SI 0 "register_operand") + (ashiftrt:SI + (zero_extend:SI (match_operand:QI 1 "register_operand")) + (match_operand:QI 2 "nonmemory_operand"))) + (clobber (reg:CC FLAGS_REG))] + "" + "#" + "&& 1" + [(parallel + [(set (match_dup 3) + (subreg:QI (ashiftrt:SI + (zero_extend:SI (match_dup 1)) + (match_dup 2)) 0)) + (clobber (reg:CC FLAGS_REG))]) + (set (match_dup 0) (zero_extend:SI (match_dup 3)))] +{ + operands[3] = (can_create_pseudo_p () +? gen_reg_rtx (QImode) : gen_lowpart (QImode, operands[0])); +}) + +(define_insn "*shrqi3_4" + [(set (match_operand:QI 0 "register_operand" "=q") + (subreg:QI + (ashiftrt:SI + (zero_extend:SI (match_operand:QI 1 "register_operand" "0")) + (match_operand:QI 2 "nonmemory_operand" "cI")) 0)) + (clobber (reg:CC FLAGS_REG))] + "" +{ + if (operands[2] == const1_rtx + && (TARGET_SHIFT1 || optimize_function_for_size_p (cfun))) +return "shr{q}\t%0"; + else +return "shr{q}\t{%2, %0|%0, %2}"; +}) + ;; By default we don't ask for a scratch register, because when DWImode ;; values are manipulated, registers are already at a premium. But if ;; we have one handy, we won't turn it away. and surprisingly, not just before RA, but even after it nothing will optimize away the extra zero extend. unsigned char foo (unsigned char a, unsigned char b) { return a >> b; } void bar (unsigned char a, unsigned char b, unsigned char *c) { *c = a >> b; } changes with the patch as: - movzbl %dil, %eax movl%esi, %ecx - sarl%cl, %eax + shrq%cl, %dil + movzbl %dil, %eax and: - movzbl %dil, %edi movl%esi, %ecx - sarl%cl, %edi + shrq%cl, %dil + movzbl %dil, %edi movb%dil, (%rdx)
[Bug tree-optimization/91198] GCC not generating AVX-512 compress/expand instructions
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198 --- Comment #5 from rguenther at suse dot de --- On Fri, 19 Jul 2019, moritz.kreutzer at siemens dot com wrote: > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91198 > > --- Comment #4 from Moritz Kreutzer --- > > How would a vectorized version with the intrinsic look like? > > Something along the lines of (assuming insize is a multiple of 16): > > > __mmask16 mask; > __m512 vin; > __m512 const thr = _mm512_set1_ps(threshold); > int o = 0; > for (int i = 0; i < insize; i+=16) { > vin = _mm512_loadu_ps(&input[i]); > mask = _mm512_cmplt_ps_mask(vin, thr); > _mm512_mask_compressstoreu_ps(&output[o], mask, vin); > o += __builtin_popcount(_mm512_mask2int(mask)); > } > *outsize = o; > > > > I don't really understand your other two questions, but maybe the intrinsics > code will help. Yeah, it helps. I missed o += __builtin_popcount(_mm512_mask2int(mask)); so the output vector address is computed via a reduction over the number of 'true' conditions met.
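The reduction described here (the output offset advances by the number of 'true' lanes in each block) can be sketched in portable scalar code. This is only an illustration of the data flow, not what the vectorizer would emit: the names compress_below and popcount16 are hypothetical, the 16-lane block size mirrors one __m512 of floats, and unlike the quoted intrinsics loop the sketch also handles a final partial block.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Portable stand-in for __builtin_popcount on a 16-bit lane mask.
static int popcount16(unsigned mask) {
    int n = 0;
    for (; mask != 0; mask &= mask - 1) ++n;  // clear the lowest set bit
    return n;
}

// Scalar emulation of the intrinsics loop: per 16-element block, build a
// mask of lanes below `threshold`, store the selected lanes contiguously
// at out[o], then advance o by the number of set mask bits.
std::size_t compress_below(const std::vector<float>& in, float threshold,
                           std::vector<float>& out) {
    out.assign(in.size(), 0.0f);  // worst case: every element is kept
    std::size_t o = 0;
    for (std::size_t i = 0; i < in.size(); i += 16) {
        const std::size_t lanes = std::min<std::size_t>(16, in.size() - i);
        unsigned mask = 0;
        for (std::size_t l = 0; l < lanes; ++l)
            if (in[i + l] < threshold) mask |= 1u << l;       // compare
        std::size_t k = 0;
        for (std::size_t l = 0; l < lanes; ++l)
            if ((mask >> l) & 1u) out[o + k++] = in[i + l];   // compress-store
        o += static_cast<std::size_t>(popcount16(mask));      // reduction
    }
    out.resize(o);
    return o;
}
```

The loop-carried dependence on o is the part a vectorizer has to recognize: each block's store address depends on the popcounts of all earlier masks.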
[Bug middle-end/91202] Unnecessary promotion of shift operands
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91202 --- Comment #11 from Jakub Jelinek --- As for TARGET_SHIFT_TRUNCATION_MASK, I'm not sure it can be safely used, because different instructions on x86 work differently. The old scalar shifts do the & 31 masking for QImode/HImode, but e.g. vector shifts don't do any masking; out-of-bounds shift counts there behave like an infinite shift count. So, if we wanted to use, say, a truncation mask of 31 for QI/HI/SImode, we'd need to make sure we never do for those what, say, STV does and use vector instructions instead, because those have different behavior.
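The difference between the two behaviors can be modelled in a few lines. This is a sketch of the semantics only, with hypothetical function names; it models 32-bit logical right shifts (a scalar SHR versus one lane of a vector logical shift such as PSRLD) and says nothing about how GCC represents them internally.

```cpp
#include <cstdint>

// Scalar model: the hardware masks the count with & 31 before shifting,
// so a count of 32 behaves like a count of 0.
std::uint32_t scalar_shr32(std::uint32_t v, unsigned count) {
    return v >> (count & 31u);
}

// Vector-lane model: no masking; any count above 31 shifts out every bit,
// as if the shift count were infinite.
std::uint32_t vector_lane_shr32(std::uint32_t v, unsigned count) {
    return count > 31u ? 0u : v >> count;
}
```

For in-range counts the two models agree; they diverge as soon as the count reaches 32, which is why a single truncation mask cannot describe both.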
[Bug other/91209] gm2 bootstrap comparison failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91209 --- Comment #2 from Matthias Klose --- no, I'm not sure, but why not use the bug tracker with the recent proposal to merge the gm2 frontend?
[Bug bootstrap/91208] [10 Regression] bootstrap comparison failure for objc and obj-c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91208 --- Comment #1 from Matthias Klose --- my last successful bootstrap is from 20190706, r273162
[Bug c++/82081] Tail call optimisation of noexcept function leads to exception allowed through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82081 --- Comment #11 from Jason Merrill --- Author: jason Date: Fri Jul 19 11:53:41 2019 New Revision: 273601 URL: https://gcc.gnu.org/viewcvs?rev=273601&root=gcc&view=rev Log: PR c++/82081 - tail call optimization breaks noexcept If a noexcept function calls a function that might throw, doing the tail call optimization means that an exception thrown in the called function will propagate out, breaking the noexcept specification. So we need to prevent the optimization in that case. * tree-tailcall.c (find_tail_calls): Don't turn a call from a nothrow function to a might-throw function into a tail call. Added: branches/gcc-8-branch/gcc/testsuite/g++.dg/tree-ssa/tail-call-1.C Modified: branches/gcc-8-branch/gcc/ChangeLog branches/gcc-8-branch/gcc/tree-tailcall.c
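A minimal sketch of the source pattern this fix is about, using hypothetical names (may_throw, wrapper): a noexcept function whose last action is a call to a function that is not noexcept. It only illustrates the condition described in the log message; it is not the testcase added in tail-call-1.C.

```cpp
#include <stdexcept>

// Not declared noexcept, so the compiler must assume it might throw.
int may_throw(int x) {
    if (x < 0) throw std::runtime_error("negative input");
    return x + 1;
}

// The call below is in tail position. If it were turned into a tail call,
// an exception from may_throw would propagate out past wrapper, breaking
// the noexcept specification.
int wrapper(int x) noexcept {
    return may_throw(x);
}

// The two properties the check compares: nothrow caller, might-throw callee.
static_assert(noexcept(wrapper(0)), "caller is nothrow");
static_assert(!noexcept(may_throw(0)), "callee might throw");
```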
[Bug c++/82081] Tail call optimisation of noexcept function leads to exception allowed through
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82081 Jason Merrill changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED Target Milestone|--- |8.4 --- Comment #12 from Jason Merrill --- Fixed for 8.4/9.2/10.
[Bug tree-optimization/91211] [10 Regression] wrong code with __builtin_memset() and __builtin_memcpy() at -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91211 Richard Biener changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed||2019-07-19 Assignee|unassigned at gcc dot gnu.org |rguenth at gcc dot gnu.org Target Milestone|--- |10.0 Ever confirmed|0 |1 --- Comment #1 from Richard Biener --- I bet this is again mine.
[Bug tree-optimization/91200] ICE on valid code at -O1: verify_ssa failed
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91200 --- Comment #3 from Richard Biener --- Author: rguenth Date: Fri Jul 19 12:24:53 2019 New Revision: 273602 URL: https://gcc.gnu.org/viewcvs?rev=273602&root=gcc&view=rev Log: 2019-07-19 Richard Biener PR tree-optimization/91200 * tree-ssa-phiopt.c (cond_store_replacement): Check we have no PHI nodes in middle-bb. * gcc.dg/torture/pr91200.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr91200.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-phiopt.c
[Bug c++/91212] New: const ignored for ctor arguments within return
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91212 Bug ID: 91212 Summary: const ignored for ctor arguments within return Product: gcc Version: 9.1.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: m...@osm-ag.de Target Milestone: --- Created attachment 46613 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46613&action=edit preprocessed from -save-temps g++ 9.1.0, gmp-6.1.1, mpfr-3.1.4, mpc-1.0.3, isl-0.18, cloog-0.18.4 having two ctors: struct X{ template X(const char (&src)[N]) {} template X(char (&src)[N]) {} }; X f() { char buf[1]; return buf; } the second ctor with non-const buffer should be used. with g++ 9.1.0 the first ctor with const buffer is used. build-option: --prefix=/opt/local/gcc-9.1.0 --enable-languages=c,c++ --disable-multilib make-target: profiledbootstrap $ gcc --version gcc (GCC) 9.1.0 Copyright (C) 2019 Free Software Foundation, Inc. This is free software; see the source for copying conditions. There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. $ uname -a Linux refle15sp1 4.12.14-195-default #1 SMP Tue May 7 10:55:11 UTC 2019 (8fba516) x86_64 x86_64 x86_64 GNU/Linux $ cat /etc/os-release NAME="SLES" VERSION="15-SP1" VERSION_ID="15.1" PRETTY_NAME="SUSE Linux Enterprise Server 15 SP1" ID="sles" ID_LIKE="suse" ANSI_COLOR="0;32" CPE_NAME="cpe:/o:suse:sles:15:sp1" $ /opt/local/gcc-9.1.0/bin/g++ -v -save-temps -o ctor ctor.C Using built-in specs. 
COLLECT_GCC=/opt/local/gcc-9.1.0/bin/g++ COLLECT_LTO_WRAPPER=/opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/lto-wrapper Target: x86_64-pc-linux-gnu Configured with: ../gcc-9.1.0/configure --prefix=/opt/local/gcc-9.1.0 --enable-languages=c,c++ --disable-multilib Thread model: posix gcc version 9.1.0 (GCC) COLLECT_GCC_OPTIONS='-v' '-save-temps' '-o' 'ctor' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/cc1plus -E -quiet -v -D_GNU_SOURCE ctor.C -mtune=generic -march=x86-64 -fpch-preprocess -o ctor.ii ignoring nonexistent directory "/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../x86_64-pc-linux-gnu/include" #include "..." search starts here: #include <...> search starts here: /opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../include/c++/9.1.0 /opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../include/c++/9.1.0/x86_64-pc-linux-gnu /opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../include/c++/9.1.0/backward /opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/include /usr/local/include /opt/local/gcc-9.1.0/include /opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/include-fixed /usr/include End of search list. 
COLLECT_GCC_OPTIONS='-v' '-save-temps' '-o' 'ctor' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/cc1plus -fpreprocessed ctor.ii -quiet -dumpbase ctor.C -mtune=generic -march=x86-64 -auxbase ctor -version -o ctor.s GNU C++14 (GCC) version 9.1.0 (x86_64-pc-linux-gnu) compiled by GNU C version 9.1.0, GMP version 6.1.1, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 GNU C++14 (GCC) version 9.1.0 (x86_64-pc-linux-gnu) compiled by GNU C version 9.1.0, GMP version 6.1.1, MPFR version 3.1.4, MPC version 1.0.3, isl version isl-0.18-GMP GGC heuristics: --param ggc-min-expand=100 --param ggc-min-heapsize=131072 Compiler executable checksum: c5c2c4d89679134a97ef259dab7ccdbf COLLECT_GCC_OPTIONS='-v' '-save-temps' '-o' 'ctor' '-shared-libgcc' '-mtune=generic' '-march=x86-64' as -v --64 -o ctor.o ctor.s GNU assembler version 2.31.1 (x86_64-suse-linux) using BFD version (GNU Binutils; SUSE Linux Enterprise 15) 2.31.1.20180828-5 COMPILER_PATH=/opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/:/opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/:/opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/:/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/:/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/ LIBRARY_PATH=/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/:/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../../lib64/:/lib/../lib64/:/usr/lib/../lib64/:/opt/local/gcc-9.1.0/lib/gcc/x86_64-pc-linux-gnu/9.1.0/../../../:/lib/:/usr/lib/ COLLECT_GCC_OPTIONS='-v' '-save-temps' '-o' 'ctor' '-shared-libgcc' '-mtune=generic' '-march=x86-64' /opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/collect2 -plugin /opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/liblto_plugin.so -plugin-opt=/opt/local/gcc-9.1.0/libexec/gcc/x86_64-pc-linux-gnu/9.1.0/lto-wrapper 
-plugin-opt=-fresolution=ctor.res -plugin-opt=-pass-through=-lgcc_s -plugin-opt=-pass-through=-lgcc -plug
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 --- Comment #15 from Richard Biener --- So another idea would be to provide [us]{min,max} patterns for integer modes that split after reload into a compare&cmov or jumpy sequence if allocated using GPR regs but also allow SSE reg alternatives which would fit into existing SSE code if we'd allow SI-WITH-SSE similar to how we do TARGET_MMX_WITH_SSE operations. We already seem to have movsi patterns {r,m}<->v so that part is done, for the testcase that would leave addsi3 plus appropriate costing of the min/max case. The smaxsi3 "splitter" (well ;)) is Index: gcc/config/i386/i386.md === --- gcc/config/i386/i386.md (revision 273592) +++ gcc/config/i386/i386.md (working copy) @@ -1881,6 +1881,27 @@ (define_expand "mov" "" "ix86_expand_move (mode, operands); DONE;") +(define_insn "smaxsi3" + [(set (match_operand:SI 0 "register_operand" "=r,x") + (smax:SI (match_operand:SI 1 "register_operand" "%0,x") +(match_operand:SI 2 "register_operand" "r,x"))) + (clobber (reg:CC FLAGS_REG))] + "TARGET_AVX2" +{ + switch (get_attr_type (insn)) +{ +case TYPE_SSEADD: + return "vpmaxsd\t{%2, %1, %0|%0, %1, %2}"; +case TYPE_ICMOV: + return "cmpl\t{%2, %0|%0, %2}\n" +"cmovl\t{%2, %0|%0, %2}"; +default: + gcc_unreachable (); +} +} + [(set_attr "isa" "noavx,avx") + (set_attr "type" "icmov,sseadd")]) + (define_insn "*mov_xor" [(set (match_operand:SWI48 0 "register_operand" "=r") (match_operand:SWI48 1 "const0_operand")) with that we get the elision of the zeroing between the vpmaxsd but even -mtune=bdver2 doesn't disparage the cross-unit moves enough for the RA to choose the first alternative. Huh. But it then goes through the stack... 
.L3: vmovd %xmm0, %r8d addl(%rdx,%rax,4), %r8d movl%r8d, 4(%rdi,%rax,4) movl(%rcx,%rax,4), %r9d addl(%rsi,%rax,4), %r9d movl%r9d, -4(%rsp) vmovd -4(%rsp), %xmm2 movl%r8d, -4(%rsp) movq%rax, %r8 vmovd -4(%rsp), %xmm3 vpmaxsd %xmm3, %xmm2, %xmm0 vpmaxsd %xmm1, %xmm0, %xmm0 vmovd %xmm0, 4(%rdi,%rax,4) incq%rax cmpq%r8, %r10 jne .L3 not sure why RA selected the 2nd alternative at all... but yes, we'd want to slightly prefer it. But I expected the inter-unit moves to push us to the first alternative. As you can see we can automatically handle stores of SImode in SSE regs and I verified we can also handle loads by slightly altering the testcase. That leaves implementing the addsi3 alternative for SSE regs, like the following quite incomplete hack Index: gcc/config/i386/i386.md === --- gcc/config/i386/i386.md (revision 273592) +++ gcc/config/i386/i386.md (working copy) @@ -5368,10 +5389,10 @@ (define_insn_and_split "*add3_doubl }) (define_insn "*add_1" - [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r") + [(set (match_operand:SWI48 0 "nonimmediate_operand" "=rm,r,r,r,v") (plus:SWI48 - (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r") - (match_operand:SWI48 2 "x86_64_general_operand" "re,m,0,le"))) + (match_operand:SWI48 1 "nonimmediate_operand" "%0,0,r,r,v") + (match_operand:SWI48 2 "x86_64_general_operand" "re,m,0,le,v"))) (clobber (reg:CC FLAGS_REG))] "ix86_binary_operator_ok (PLUS, mode, operands)" { @@ -5390,6 +5411,9 @@ (define_insn "*add_1" return "dec{}\t%0"; } + case TYPE_SSEADD: + return "vpaddd\t{%2, %1, %0|%0, %1, %2}"; + default: /* For most processors, ADD is faster than LEA. This alternative was added to use ADD as much as possible. */ @@ -5406,6 +5430,8 @@ (define_insn "*add_1" [(set (attr "type") (cond [(eq_attr "alternative" "3") (const_string "lea") + (eq_attr "alternative" "4") + (const_string "sseadd") (match_operand:SWI48 2 "incdec_operand") (const_string "incdec") ] but somehow that doesn't trigger for me.
[Bug lto/84579] __gnu_lto_v1 should be removed when linking with -fno-lto
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84579 --- Comment #6 from Romain Geissler --- Hi, After trying to build our own set of open source components with this patch (among them sqlite, openssl, boost, tcmalloc), we have no link issues resulting from this change. Tested with gcc 8 and gcc 9. The only remaining problem is adapting the 2 binutils LTO tests which are now failing. Cheers, Romain
[Bug bootstrap/91208] [10 Regression] bootstrap comparison failure for objc and obj-c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91208 Jakub Jelinek changed: What|Removed |Added CC||jakub at gcc dot gnu.org --- Comment #2 from Jakub Jelinek --- Strange, I've bootstrapped r273585 with objc as well as obj-c++ just fine last night. --enable-languages=default,ada,obj-c++,lto,go,brig,d \ --enable-checking=yes,rtl,extra
[Bug bootstrap/91208] [10 Regression] bootstrap comparison failure for objc and obj-c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91208 --- Comment #3 from Jakub Jelinek --- The cc1*checksum.o warning appears normally (though I wonder why it is not printed for e.g. cc1-checksum.o etc., just for ObjC/ObjC++). What is more interesting is what is printed after the comparison failure message (i.e. content of .bad_compare) and what the differences are.
[Bug ipa/91194] A suspicious condition in recursive_inlining
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91194 --- Comment #3 from Jan Hubicka --- Author: hubicka Date: Fri Jul 19 14:31:09 2019 New Revision: 273603 URL: https://gcc.gnu.org/viewcvs?rev=273603&root=gcc&view=rev Log: PR ipa/91194 * ipa-inline.c (recursive_inlining): Fix limits check. Modified: trunk/gcc/ChangeLog trunk/gcc/ipa-inline.c
[Bug target/91204] [10 Regression] ICE in expand_expr_real_2, at expr.c:9215 with -O3
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91204 --- Comment #9 from uros at gcc dot gnu.org --- Author: uros Date: Fri Jul 19 14:36:49 2019 New Revision: 273604 URL: https://gcc.gnu.org/viewcvs?rev=273604&root=gcc&view=rev Log: PR target/91204 * config/i386/mmx.md (one_cmpl2): New expander. Modified: trunk/gcc/ChangeLog trunk/gcc/config/i386/mmx.md
[Bug middle-end/91205] -fstack-protector-strong -D_FORTIFY_SOURCE=2 breaks tftpd
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91205 Martin Sebor changed: What|Removed |Added CC||msebor at gcc dot gnu.org --- Comment #6 from Martin Sebor --- (In reply to Ricardo Ribalda from comment #5) The rule is to respect object and subobject boundaries. I.e., except for elements of the same array, don't assume that a pointer to one object (or subobject) can be used to derive a pointer to another object (or subobject), even an immediately adjacent one, or that such a pointer can be used to access the adjacent object. To get from subobject A to subobject B within the same enclosing object X, start with a pointer to X rather than with one to X.A, and increment it by the difference between the offsets of the two subobjects. GCC implements weaker rules for memcpy to try to accommodate code that doesn't respect this rule, but that comes at the cost of compromised buffer overflow detection (i.e., warnings) and protection (_FORTIFY_SOURCE).
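As an illustration of that rule, here is a small sketch under stated assumptions: the Packet layout, the name copy_payload and the constant kPayloadSize are hypothetical and are not taken from tftpd. The copy spans two adjacent members, so the source pointer is derived from the enclosing object plus an offsetof value rather than from the first member.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical layout: three adjacent char arrays, so there is no padding.
struct Packet {
    char magic[4];
    char opcode[2];
    char filename[10];
};

// Number of bytes from the start of opcode to the end of the struct.
constexpr std::size_t kPayloadSize = sizeof(Packet) - offsetof(Packet, opcode);

// Copies opcode and filename with one memcpy. The source pointer starts at
// the enclosing object and is advanced by offsetof, instead of starting at
// p.opcode and reading past the end of that 2-byte subobject.
// dst must have room for kPayloadSize bytes.
void copy_payload(const Packet& p, char* dst) {
    const char* base = reinterpret_cast<const char*>(&p);
    std::memcpy(dst, base + offsetof(Packet, opcode), kPayloadSize);
}
```

Writing std::memcpy(dst, p.opcode, kPayloadSize) instead would name a 2-byte source for a 12-byte copy, which is the kind of access the comment says the fortified checks are entitled to object to.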
[Bug rtl-optimization/53652] *andn* isn't used for vectorization
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53652 --- Comment #5 from Segher Boessenkool --- It might work a lot better if it didn't have to load that all-ones vector in a separate insn. Because it does, you need to do a 3->3 combination (which we do not currently support) if you need to do the memory load in a separate insn, as well as the insn needed to keep the constant load (it isn't dead yet; later insns use that same value again). So that would mean having insns (that split) for doing a NOT.
[Bug tree-optimization/91213] New: Missed optimization: (sub X Y) -> (xor X Y) when Y <= X and isPowerOf2(X + 1)
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91213 Bug ID: 91213 Summary: Missed optimization: (sub X Y) -> (xor X Y) when Y <= X and isPowerOf2(X + 1) Product: gcc Version: 10.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: nok.raven at gmail dot com Target Milestone: --- Proof: https://rise4fun.com/Alive/Xr3 unsigned foo(unsigned x) { if (x > 31) __builtin_unreachable(); return 31 - x; } unsigned bar(unsigned x) { if (x > 63) __builtin_unreachable(); return 63 - x; } The optimization seems to be manually applied to __builtin_clz, and because GCC cannot reverse it or apply it again, this function: unsigned bsrl(unsigned x) { return 31 - __builtin_clz(x); } becomes (sub 31 (xor 31 (bsr x))): bsrl(unsigned int): bsrl %edi, %edi movl $31, %eax xorl $31, %edi subl %edi, %eax ret https://godbolt.org/z/3nzi0z
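The identity behind the request can be checked by brute force for small constants. This is a sketch with hypothetical helper names, meant only to show why the precondition matters: when X + 1 is a power of two, X is all ones in its low bits, so subtracting any Y <= X never borrows and equals flipping Y's bits within X.

```cpp
// True when v is a power of two (used here to test x + 1).
bool is_power_of_two(unsigned v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Exhaustively checks x - y == (x ^ y) for every y with y <= x.
// Intended for small x only; the loop visits x + 1 values.
bool sub_matches_xor(unsigned x) {
    for (unsigned y = 0;; ++y) {
        if (x - y != (x ^ y)) return false;
        if (y == x) break;  // stop before y could wrap around
    }
    return true;
}
```

With x = 31 or x = 63 the check passes for every y; with x = 30 it already fails at y = 1 (29 versus 31), since 30 + 1 is not a power of two.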
[Bug tree-optimization/91211] [10 Regression] wrong code with __builtin_memset() and __builtin_memcpy() at -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91211 Richard Biener changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|--- |FIXED --- Comment #2 from Richard Biener --- Fixed.
[Bug tree-optimization/91211] [10 Regression] wrong code with __builtin_memset() and __builtin_memcpy() at -O1 and above
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91211 --- Comment #3 from Richard Biener --- Author: rguenth Date: Fri Jul 19 16:19:39 2019 New Revision: 273605 URL: https://gcc.gnu.org/viewcvs?rev=273605&root=gcc&view=rev Log: 2019-07-19 Richard Biener PR tree-optimization/91211 * tree-ssa-sccvn.c (vn_walk_cb_data::push_partial_def): Fix memset encoding size. * gcc.dg/torture/pr91211.c: New testcase. Added: trunk/gcc/testsuite/gcc.dg/torture/pr91211.c Modified: trunk/gcc/ChangeLog trunk/gcc/testsuite/ChangeLog trunk/gcc/tree-ssa-sccvn.c
[Bug rtl-optimization/91154] [10 Regression] 456.hmmer regression on Haswell caused by r272922
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91154 --- Comment #16 from Richard Biener --- Ah, because x86_64_general_operand allows memory but the v alternative does not, and reloading that is apparently more expensive than not doing so and reloading the general reg later. Fun. Changing that to x86_64_nonmemory_operand makes the whole thing work nearly fully (for this testcase, breaking everything else of course); there's one GPR op remaining again because we get memory, this time in the first operand, which I kept as nonimmediate_operand. Not sure how we make RA happier to reload a memory operand for the v,v,v alternative without doing that elsewhere.
        movl    $-987654321, %r10d
        vmovd   (%rdi), %xmm0
        leal    -1(%r8), %r9d
        xorl    %eax, %eax
        vmovd   %r10d, %xmm1
        .p2align 4,,10
        .p2align 3
.L3:
        vmovd   (%rdx,%rax,4), %xmm2
        vpaddd  %xmm2, %xmm0, %xmm0
        vmovd   %xmm0, 4(%rdi,%rax,4)
        movl    (%rcx,%rax,4), %r8d
        addl    (%rsi,%rax,4), %r8d
        vmovd   %r8d, %xmm3
        movq    %rax, %r8
        vpmaxsd %xmm0, %xmm3, %xmm0
        vpmaxsd %xmm1, %xmm0, %xmm0
        vmovd   %xmm0, 4(%rdi,%rax,4)
        addq    $1, %rax
        cmpq    %r9, %r8
        jne     .L3
[Bug c++/91214] New: first atof function call not return correct result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91214 Bug ID: 91214 Summary: first atof function call not return correct result Product: gcc Version: 9.1.1 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: xiaoyi_wu at yahoo dot com Target Milestone: --- The first call to atof in the program gives incorrect result. Here is the terminal session: lima:~$ uname -a Linux lima 5.1.17-300.fc30.x86_64 #1 SMP Wed Jul 10 15:20:27 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux lima:~$ g++ -v Using built-in specs. COLLECT_GCC=g++ COLLECT_LTO_WRAPPER=/usr/libexec/gcc/x86_64-redhat-linux/9/lto-wrapper OFFLOAD_TARGET_NAMES=nvptx-none OFFLOAD_TARGET_DEFAULT=1 Target: x86_64-redhat-linux Configured with: ../configure --enable-bootstrap --enable-languages=c,c++,fortran,objc,obj-c++,ada,go,d,lto --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-bugurl=http://bugzilla.redhat.com/bugzilla --enable-shared --enable-threads=posix --enable-checking=release --enable-multilib --with-system-zlib --enable-__cxa_atexit --disable-libunwind-exceptions --enable-gnu-unique-object --enable-linker-build-id --with-gcc-major-version-only --with-linker-hash-style=gnu --enable-plugin --enable-initfini-array --with-isl --enable-offload-targets=nvptx-none --without-cuda-driver --enable-gnu-indirect-function --enable-cet --with-tune=generic --with-arch_32=i686 --build=x86_64-redhat-linux Thread model: posix gcc version 9.1.1 20190503 (Red Hat 9.1.1-1) (GCC) lima:~$ cat main.cc #include #include int main(int argc, const char **argv) { if (argc != 2) return printf("usage$ %s input.txt\n", *argv); FILE *file = fopen(argv[1], "r"); char line[256]; while (fgets(line, 256, file)) { printf("{%s} %g\n", line, atof(line)); } fclose(file); } lima:~$ cat input.txt 123.45 234.56 345.67 lima:~$ g++ -Wall -O3 main.cc lima:~$ a.out input.txt {123.45 } 0 {234.56 } 234.56 {345.67 } 345.67 lima:~$ This small program read lines from input.txt in a 
loop, and print the line and the atof value of the line out. But the first atof returns a 0 instead of the expected 123.45.
[Bug c++/91215] New: Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 Bug ID: 91215 Summary: Compiled program loops endlessly because of -O2 with g++ 8.3.0 Product: gcc Version: 8.3.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c++ Assignee: unassigned at gcc dot gnu.org Reporter: sboisvert at gydle dot com Target Milestone: --- Created attachment 46614 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=46614&action=edit Minimum reproducer C++ source code Hello, I compile my program with -O2 with g++ 8.3.0. When I run my program, there is an infinite loop. With -O0, this is not the case. Steps to reproduce == Step 1: Acquire g++-8 (8.3.0) I used g++ (Ubuntu 8.3.0-6ubuntu1~18.04.1) 8.3.0. Step 2: build with O2 Command: /usr/bin/g++-8 -O2 g++-8.3.0-infinite-loop-bug.cpp -o g++-8.3.0-infinite-loop-bug Step 3: execute the program ./g++-8.3.0-infinite-loop-bug Expected result === The body of the for loop is done 50 times (i = 0, ..., i = 49). Then the program returns 0. Actual result = The body of the for loop is performed beyond i = 49. There is an infinite loop. Minimum reproducer == /* * Buggy for loop with g++ (Ubuntu 8.3.0-6ubuntu1~18.04.1) 8.3.0 * * /usr/bin/g++-8 -O2 g++-8.3.0-infinite-loop-bug.cpp -o g++-8.3.0-infinite-loop-bug ./g++-8.3.0-infinite-loop-bug */ #include int doY() { for (int i = 0; i < 50; ++i) { std::cout << "DEBUG []"; std::cout << " i: " << i; std::cout << std::endl; } // return 0; // This fixes the bug. } int main(int argc, char* argv[]){ doY(); return 0; }
[Bug c++/91215] Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 --- Comment #1 from Sebastien Boisvert --- As indicated in the minimum reproducer, returning 0 fixed this bug. g++ reports this warning: g++-8.3.0-infinite-loop-bug.cpp: In function 'int doY()': g++-8.3.0-infinite-loop-bug.cpp:20:1: warning: no return statement in function returning non-void [-Wreturn-type] } ^ Thanks
[Bug c++/91215] Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 Andrew Pinski changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #2 from Andrew Pinski --- In C++ (unlike C), flowing off the end of a function whose return type is not void is undefined behavior. In C, it is undefined only when the caller uses the returned value.
[Bug c++/91215] Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 --- Comment #3 from Sebastien Boisvert --- That was fast, thanks !
[Bug c++/91215] Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 --- Comment #4 from Andrew Pinski --- Also read https://gcc.gnu.org/gcc-8/porting_to.html : -Wreturn-type is enabled by default G++ now assumes that control never reaches the end of a non-void function (i.e. without reaching a return statement). This means that you should always pay attention to -Wreturn-type warnings, as they indicate code that can misbehave when optimized. To tell the compiler that control can never reach the end of a function (e.g. because all callers enforce its preconditions) you can suppress -Wreturn-type warnings by adding __builtin_unreachable: char signchar(int i) // precondition: i != 0 { if (i > 0) return '+'; else if (i < 0) return '-'; __builtin_unreachable(); } Because -Wreturn-type is now enabled by default, G++ will warn if main is declared with an implicit int return type (which is non-standard but allowed by GCC). To avoid the warning simply add a return type to main, which makes the code more portable anyway.
[Bug c++/91215] Compiled program loops endlessly because of -O2 with g++ 8.3.0
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91215 --- Comment #5 from Sebastien Boisvert --- OK, thanks for the link, this is interesting. A previous release, g++ 7.4.0, does not generate an infinite loop in the executable. Like g++ 8.3.0, it does print the warning: no return statement in function returning non-void [-Wreturn-type]. So, in a way, g++ 7.4.0 handled this better than g++ 8.3.0, from a software development standpoint. If the warning [-Wreturn-type] can generate an infinite loop in the resulting executable, then it should be an error, because the resulting executable is unusable. In the end, I should use -Werror=return-type. Thanks
[Bug bootstrap/91208] [10 Regression] bootstrap comparison failure for objc and obj-c++
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91208 Matthias Klose changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |DUPLICATE --- Comment #4 from Matthias Klose --- I'm wrong, this is a duplicate of PR91209. *** This bug has been marked as a duplicate of bug 91209 ***
[Bug other/91209] gm2 bootstrap comparison failure
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91209 --- Comment #3 from Matthias Klose --- *** Bug 91208 has been marked as a duplicate of this bug. ***
[Bug c++/91214] first atof function call not return correct result
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91214 xiaoyi_wu at yahoo dot com changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|--- |INVALID --- Comment #1 from xiaoyi_wu at yahoo dot com --- Never mind. False alarm. The text file has an invisible byte order mark (0xefbbbf) at the beginning, so atof correctly returns 0 for the first line.
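One way to make such a reader robust is to skip a leading UTF-8 byte order mark before converting. This is a minimal sketch with hypothetical helper names (skip_utf8_bom, parse_number); it is not part of the reporter's program, and in a line-by-line reader it only needs to be applied to the first line.

```cpp
#include <cstdlib>
#include <cstring>

// Returns a pointer past a leading UTF-8 byte order mark (EF BB BF),
// or the original pointer if the text does not start with one.
const char* skip_utf8_bom(const char* s) {
    static const char bom[] = "\xEF\xBB\xBF";
    return std::strncmp(s, bom, 3) == 0 ? s + 3 : s;
}

// atof stops at the first character that cannot start a number, so a BOM
// in front of "123.45" yields 0; skipping the BOM first restores the value.
double parse_number(const char* line) {
    return std::atof(skip_utf8_bom(line));
}
```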
[Bug tree-optimization/91183] strlen of a strcpy result with a conditional source not folded
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91183 Martin Sebor changed: What|Removed |Added Keywords||patch --- Comment #3 from Martin Sebor --- Patch: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01323.html
[Bug tree-optimization/86688] missing -Wstringop-overflow using a non-string local array in strnlen with excessive bound
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=86688 Martin Sebor changed: What|Removed |Added Keywords||patch See Also||https://gcc.gnu.org/bugzill ||a/show_bug.cgi?id=91183 --- Comment #2 from Martin Sebor --- Patch: https://gcc.gnu.org/ml/gcc-patches/2019-07/msg01323.html
[Bug c++/85784] False positive with -Wunused-but-set-parameter
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85784 trashyankes at wp dot pl changed: What|Removed |Added CC||trashyankes at wp dot pl --- Comment #1 from trashyankes at wp dot pl --- Another similar case: ``` template struct F : T... { F(int i) : T{i}... { } }; int main() { F<> z{2}; //warning: parameter 'i' set but not used [-Wunused-but-set-parameter] } ``` https://gcc.godbolt.org/z/lhGLZ6 I think that even when the variadic list is empty, the code still "uses" `i`. The error still exists in 9.1.