[Bug middle-end/96564] New maybe use of uninitialized variable warning since GCC >10

2020-08-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96564 --- Comment #1 from Marc Glisse --- I think there are duplicates about the fact that while gcc knows that a and x cannot alias (if you read *x, write to *a, then read from *x again, gcc reuses the first value), it does not use that information to

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565 Marc Glisse changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565 --- Comment #2 from Marc Glisse --- Actually, it isn't so much the alloca call itself, it seems to be __builtin_stack_save / __builtin_stack_restore that prevent DSE from removing arr[0] = 0 (without that write, DCE can remove __builtin_alloca_wi

[Bug tree-optimization/96513] building terminated with -O3

2020-08-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96513 Marc Glisse changed: What|Removed |Added Known to fail||10.1.0, 9.3.0 Ever confirmed|1

[Bug c/96586] suboptimal code generated for condition expression

2020-08-12 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96586 --- Comment #1 from Marc Glisse --- Probably a DUP of several issues about the lack of flow-sensitivity of our analysis. Int_1_Loc escapes (we don't pay attention that it only escapes late), so we don't realize that Proc_7 cannot modify it.

[Bug tree-optimization/96654] Failure to optimize vectorized conversion to `int` with AVX

2020-08-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96654 --- Comment #2 from Marc Glisse --- gcc doesn't seem very fond of using 2 different vector bitsizes at the same time, so VEC_PACK_FIX_TRUNC_EXPR takes 2 vectors of 2 double and gives one vector of 4 int. At the RTL level, we have a vec_concat:V4D

[Bug tree-optimization/96753] Missed optimization on modulo when left side value is known to be greater than right side value

2020-08-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96753 Marc Glisse changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug tree-optimization/96753] Missed optimization on modulo when left side value is known to be greater than right side value

2020-08-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96753 --- Comment #2 from Marc Glisse --- Maybe using vrp_evaluate_conditional (or some other similar helper) instead of manually comparing ranges in simplify_div_or_mod_using_ranges would help.

[Bug middle-end/96750] 10-12% performance decrease in benchmark going from GCC8 to GCC9/GCC10

2020-08-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96750 --- Comment #2 from Marc Glisse --- (In reply to Martin Liška from comment #1) > after: > 1794240.0 > > before: > 1802710.0 That's less than 1% of difference (with "after" better than "before"), not the 10% regression claimed, maybe there is an

[Bug tree-optimization/96565] Failure to optimize out VLA even though it is left unused

2020-08-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96565 --- Comment #4 from Marc Glisse --- (In reply to Richard Biener from comment #3) > I guess the "usual" way of dealing with this would be to have > CLOBBERs for all VLAs before the __builtin_stack_restore. That looks like a good idea. I didn't t

[Bug target/96528] [11 Regression] vector comparisons on ARM

2020-08-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96528 --- Comment #2 from Marc Glisse --- (In reply to Richard Biener from comment #1) > I think that eventually vector lowering should lower > > _1 = a == 5; > _2 = b == 7; > _3 = _1 | _2; > _4 = _3 ? -1 : 0; > > to > > _31 = _1 ? -1 : 0; > _

[Bug c/96804] Arguments are swapped in floating-point addition

2020-08-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96804 --- Comment #2 from Marc Glisse --- (In reply to Paweł Bylica from comment #0) > This is a problem because when both arguments are NaNs, the result may be > different than the one predicted by IEEE 754. Could you quote the sentence in IEEE 754 t

[Bug target/96793] __builtin_floor produces wrong result when rounding direction is FE_DOWNWARD

2020-08-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96793 --- Comment #13 from Marc Glisse --- x-x does depend on the rounding mode (the transformation in match.pd gets it wrong, by the way). If the sign of 0 is the only issue, maybe we can test flag_rounding_math && flag_signed_zeros or the correspondi

[Bug target/96793] __builtin_floor produces wrong result when rounding direction is FE_DOWNWARD

2020-08-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96793 --- Comment #14 from Marc Glisse --- (In reply to Marc Glisse from comment #13) > if (HONOR_SIGNED_ZEROS (mode)) > x2 = copysign (x2, x); Hmm, I misread the comment, sorry. We already do that, for both floor and ceil. But we do

[Bug c++/96862] -frounding-math -std=c++2a error: '(1.29e+2 * 6.9314718055994529e-1)' is not a constant expression

2020-08-31 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96862 --- Comment #5 from Marc Glisse --- "[Note: This document does not require an implementation to support the FENV_ACCESS pragma; it is implementation-defined (15.8) whether the pragma is supported. As a consequence, it is implementation-defined

[Bug c++/96862] -frounding-math -std=c++2a error: '(1.29e+2 * 6.9314718055994529e-1)' is not a constant expression

2020-08-31 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96862 --- Comment #8 from Marc Glisse --- Should we handle flag_trapping_math at the same time?

[Bug tree-optimization/92712] [8/9 Regression] Performance regression with assumed values

2020-09-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #25 from Marc Glisse --- (In reply to Feng Xue from comment #24) > Another point: if B+-C can be folded to an existing gimple value, we might > deduce B+-C does not overflow? We can deduce that loading this value that represents B+-C

[Bug tree-optimization/96897] Failure to optimize not+not+dec+and+not to add+or

2020-09-02 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96897 --- Comment #1 from Marc Glisse --- We already transform to return ~(-2 - x) | x; so this is really asking for ~(-2 - x) --> x + 1
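
The identity requested in this comment can be checked with a small C sketch. It assumes unsigned arithmetic, so every intermediate value wraps instead of overflowing; the names `lhs`, `rhs` and `check_identity` are illustrative and do not come from the bug report.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Form that GCC already produces, per the comment: ~(-2 - x). */
unsigned lhs(unsigned x) { return ~(-2u - x); }

/* Requested simplification: x + 1. */
unsigned rhs(unsigned x) { return x + 1u; }

/* Compare both forms on a few values, including the wrap-around edges. */
void check_identity(void) {
    const unsigned samples[] = {0u, 1u, 41u, UINT_MAX - 1u, UINT_MAX};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(lhs(samples[i]) == rhs(samples[i]));
}
```

The two forms agree because ~y equals -y - 1 in two's complement, so ~(-2 - x) is 2 + x - 1.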

[Bug tree-optimization/96912] Failure to optimize pblendvb pattern

2020-09-03 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96912 --- Comment #2 from Marc Glisse --- With consistent types, we recognize a VEC_COND_EXPR. With inconsistent types, I guess we would need to reinterpret x and y as v16i8, and reinterpret the result back to v2i64. (please keep #include in your tes

[Bug target/96918] Failure to optimize vector shift left+shift right+or to pshuf

2020-09-03 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96918 --- Comment #5 from Marc Glisse --- typedef unsigned short v8i16 __attribute__((vector_size(16))); v8i16 bswap_epi16(v8i16 x) { return (x << 8) | (x >> 8); } We do recognize a rotate already in GENERIC return x r<< 8; But this is ex
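
As a rough scalar illustration of the per-lane expression in this comment, the sketch below applies (x << 8) | (x >> 8) to a single 16-bit value and compares it with `__builtin_bswap16`, a GCC/Clang builtin. The function names are hypothetical, and the vector case discussed in the bug is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar version of the per-lane expression (x << 8) | (x >> 8).
   x is promoted to int, so the cast drops the bits shifted past bit 15. */
uint16_t swap_by_shifts(uint16_t x) {
    return (uint16_t)((x << 8) | (x >> 8));
}

/* Same operation through the compiler builtin, for comparison. */
uint16_t swap_by_builtin(uint16_t x) {
    return __builtin_bswap16(x);
}

/* A rotate by 8 on a 16-bit value is exactly a byte swap. */
void check_swap(void) {
    const uint16_t samples[] = {0x0000, 0x00FF, 0x1234, 0xFF00, 0xFFFF};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(swap_by_shifts(samples[i]) == swap_by_builtin(samples[i]));
}
```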

[Bug tree-optimization/96938] Failure to optimize bit-setting pattern when not using temporary

2020-09-04 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96938 --- Comment #1 from Marc Glisse --- With "char tmp" instead of "int tmp", we get the same code as the first function.

[Bug tree-optimization/97085] [11 Regression] aarch64, SVE: ICE in gimple_expand_vec_cond_expr since r11-2610-ga1ee6d507b

2020-09-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97085 --- Comment #4 from Marc Glisse --- I would be happy with a revert of that patch, if the ARM backend gets fixed, but indeed a missed optimization should not cause an ICE. (In reply to Richard Biener from comment #2) > At least we're not at all e

[Bug c++/85746] Premature evaluation of __builtin_constant_p?

2019-10-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85746 --- Comment #9 from Marc Glisse --- Author: glisse Date: Tue Oct 22 14:42:38 2019 New Revision: 277292 URL: https://gcc.gnu.org/viewcvs?rev=277292&root=gcc&view=rev Log: PR c++/85746: Don't fold __builtin_constant_p prematurely 2019-10-22 Marc

[Bug c++/85746] Premature evaluation of __builtin_constant_p?

2019-10-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=85746 Marc Glisse changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug c++/92194] maybe-uninitialized false positive with c++2a

2019-10-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92194 --- Comment #2 from Marc Glisse --- With -Wsystem-headers you also get the warning in C++17 (and it is actually a bit more informative, at least it says where it is used).

[Bug tree-optimization/92233] missed optimisation for multiplication when it's known that at least one of the arguments is 0

2019-10-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92233 --- Comment #1 from Marc Glisse --- (llvm doesn't do it either) Would some kind of threading be the most natural way to handle this? If the compiler duplicates the code as if (a==0) return a*b; else if (b==0) return a*b; then it becomes easy to

[Bug tree-optimization/71761] missing tailcall optimization

2019-11-18 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71761 --- Comment #5 from Marc Glisse --- The case "struct token { int i; };" was fixed in gcc-7. >= f (); [return slot optimization] [tail call] That's all you should see at this point, it is later that it gives up.

[Bug rtl-optimization/91333] [9/10 Regression] suboptimal register allocation for inline asm

2019-11-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91333 --- Comment #3 from Marc Glisse --- gcc version 9.2.1 20191109 (Debian 9.2.1-19) (current debian testing/unstable) gives me the 3 movapd, whether I use -O1, -O2 or -O3, and -Os gives 2 movapd. I didn't try with a vanilla gcc, not sure which debi

[Bug target/92592] Redundant comparison after subtraction on x86

2019-11-20 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92592 --- Comment #3 from Marc Glisse --- IFN_SUB_OVERFLOW recognition?

[Bug tree-optimization/92716] -Os doesn't inline byteswap function even though it's a single instruction

2019-11-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92716 --- Comment #4 from Marc Glisse --- Yes, the pass that recognizes bswap (unsurprisingly called bswap) happens much later than inlining in the pipeline. This kind of thing is unavoidable since cycling through optimization passes is considered unde

[Bug rtl-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #5 from Marc Glisse --- a*x+x -> (a+1)*x is unsafe (a=INT_MAX, x=0), but there are cases where we could prove that it is safe, in particular when a is actually b-1 (more generally for a*x+b*x when we can prove (with VRP?) that a+b can
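
The counterexample a=INT_MAX, x=0 from this comment can be made concrete with GCC's overflow-checking builtins. This is only a sketch: the helper names are made up for illustration, and it models "unsafe" as "some intermediate signed operation overflows".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* True if evaluating a*x + x in int would overflow at any step. */
bool original_overflows(int a, int x) {
    int prod, sum;
    return __builtin_mul_overflow(a, x, &prod)
        || __builtin_add_overflow(prod, x, &sum);
}

/* True if evaluating (a+1)*x in int would overflow at any step. */
bool factored_overflows(int a, int x) {
    int inc, prod;
    return __builtin_add_overflow(a, 1, &inc)
        || __builtin_mul_overflow(inc, x, &prod);
}

/* a = INT_MAX, x = 0: the original is fine, the factored form is not. */
void check_counterexample(void) {
    assert(!original_overflows(INT_MAX, 0));
    assert(factored_overflows(INT_MAX, 0));
}
```

With a = INT_MAX the original expression never leaves the int range when x is 0, while the rewritten form has to compute INT_MAX + 1 first, which is undefined for signed int.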

[Bug rtl-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #7 from Marc Glisse --- The first question could be why SCCP produces (const int) ((unsigned int) t_2(D) + 4294967295) * v_3(D) + v_3(D) and not directly t*v. Several loop passes do have this tendency to split out the last (or first)

[Bug rtl-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #8 from Marc Glisse --- (In reply to Jakub Jelinek from comment #6) > The suggestion I'll try to work on is to check if C isn't a plus/minus expr > with constant second operand that doesn't go in the other direction and thus > where t

[Bug rtl-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #11 from Marc Glisse --- (In reply to Jakub Jelinek from comment #10) > I know, it will be a small complication, sure, but it can be handled. Ah, I think I understand now. But still x=-1 a=INT_MAX a*x+x gives INT_MIN without overflo

[Bug c++/92727] Idea for better error messages

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92727 --- Comment #9 from Marc Glisse --- (In reply to Jonathan Wakely from comment #7) > I disagree. The static assert contains all you need to know, expert or not. > The benefit of seeing all the gory details is to figure out why something > didn't c

[Bug c++/92727] Idea for better error messages

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92727 --- Comment #11 from Marc Glisse --- (In reply to Jonathan Wakely from comment #10) > The bug in this example is that the push_back call needs to make a copy and > the type is not copyable. It's not a bug that the copy constructor is > implicitly

[Bug rtl-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #16 from Marc Glisse --- (In reply to Jakub Jelinek from comment #15) > I guess we could handle those cases by using something like > check_for_binary_op_overflow, except that for the case where A might be -1 > and plusminus equal to

[Bug tree-optimization/92712] [8/9/10 Regression] Performance regression with assumed values

2019-11-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92712 --- Comment #20 from Marc Glisse --- (In reply to Jakub Jelinek from comment #19) > Created attachment 47398 [details] > gcc10-pr92712.patch > > Full untested patch. The patch looks very good to me :-)

[Bug c++/65656] __builtin_constant_p should always be constexpr

2019-11-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65656 --- Comment #11 from Marc Glisse --- Comment #6 looks like it was probably fixed with bug 85746.

[Bug c/92826] Impossible to silence warning: non-standard suffix on floating constant [-Wpedantic]

2019-12-05 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92826 --- Comment #2 from Marc Glisse --- I thought the name "pedantic" made it clear that it is going to warn about things that are just fine, and you shouldn't use it... You can disable the warning by inserting __extension__ in your code. It might be

[Bug c++/93005] Redundant NEON loads/stores from stack are not eliminated

2019-12-19 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93005 Marc Glisse changed: What|Removed |Added Target||arm-linux-gnueabihf Status|UNC

[Bug target/93039] Fails to use SSE bitwise ops for float-as-int manipulations

2019-12-21 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93039 Marc Glisse changed: What|Removed |Added Target||x86_64-*-* --- Comment #1 from Marc Glisse

[Bug tree-optimization/93044] extra cast is not removed

2019-12-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93044 --- Comment #2 from Marc Glisse --- In match.pd && ((inter_unsignedp && inter_prec > inside_prec) == (final_unsignedp && final_prec > inter_prec)) looks suspicious.

[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin

2019-12-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #5 from Marc Glisse --- We could indeed relax a bit the "same type" condition. We could also make sure that __restrict appears somewhere in the call chain when using copy or uninitialized_*, which lets the compiler merge the 2 loads/s

[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin

2019-12-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #7 from Marc Glisse --- (In reply to fdlbxtqi from comment #6) > > > clearly incorrect > > > > Please distinguish between what is wrong (generated code crashes, or returns > > 3 instead of 2), and what is suboptimal. > > Suppose #if

[Bug tree-optimization/93063] New: Loop distribution and NOP conversions

2019-12-24 Thread glisse at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- This comes from PR 93059. void f(signed short*__restrict p,unsigned short*__restrict q,int n){ for(int i=0;i

[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin

2019-12-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #8 from Marc Glisse --- (In reply to fdlbxtqi from comment #6) > void copy_char_vector_with_iter(std::vector::iterator > out,std::vector const& bits) > { > std::copy_n(bits.begin(),bits.size(),out); > } > > https://godbolt.org/

[Bug tree-optimization/93063] Loop distribution and NOP conversions

2019-12-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93063 --- Comment #1 from Marc Glisse --- Created attachment 47549 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=47549&action=edit Untested patch Not ready, at the very least it misses a comment and a test, but it shows where the test needs rel

[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin

2019-12-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #13 from Marc Glisse --- (In reply to fdlbxtqi from comment #11) > TBH. I would rather see the library does the optimization instead of the > compiler. I do not trust the compiler can always optimize this stuff. If we have both, that

[Bug libstdc++/93059] char and char8_t does not talk with each other with memcpy. std::copy std::copy_n, std::fill, std::fill_n, std::uninitialized_copy std::uninitialized_copy_n, std::fill, std::unin

2019-12-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93059 --- Comment #17 from Marc Glisse --- (In reply to fdlbxtqi from comment #15) > What I am worried about is that whether revamping these functions would be a > new wave of ABI breaking. I don't foresee any ABI issue here. Do make sure your code d

[Bug libstdc++/93147] std::tuple of empty structs with member equality operators has ambiguous equality operator

2020-01-03 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93147 --- Comment #3 from Marc Glisse --- (In reply to Alexander Kondratskiy from comment #0) > My suspicion is that tuple indirectly inherits the types `A` and `B`, and > even though it may be private inheritance, the compiler still finds both > compa

[Bug libstdc++/93151] system_error header fails to compile with -D_XOPEN_SOURCE=600

2020-01-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93151 --- Comment #2 from Marc Glisse --- (In reply to Jonathan Wakely from comment #1) > I don't know what the advantage of testing for them at configure time is. Strange systems that define them as enum values and not macros?

[Bug libstdc++/87106] Group move and destruction of the source, where possible, for speed

2020-01-15 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87106 --- Comment #24 from Marc Glisse --- Something like that, yes. Essentially, I used trivial because I was convinced it was safe in that case, not because it looked like the perfect condition. If someone can convincingly argue for a weaker conditio

[Bug libstdc++/87106] Group move and destruction of the source, where possible, for speed

2020-01-16 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87106 --- Comment #29 from Marc Glisse --- Note that __is_bitwise_relocatable is specialized to true for deque, so we are not super consistent here ;-) The original patch used is_trivially_move_constructible, IIRC I changed it to is_trivial so the revi

[Bug rtl-optimization/91333] [9/10 Regression] suboptimal register allocation for inline asm

2020-01-17 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91333 --- Comment #5 from Marc Glisse --- With trunk (master?), compiling with -O3, h gives movapd %xmm1, %xmm3 addsd %xmm3, %xmm1 movapd %xmm0, %xmm2 addsd %xmm2, %xmm0 addsd %xmm1, %xmm0 which looks g

[Bug middle-end/26724] __builtin_constant_p fails to recognise function with constant return

2020-01-22 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26724 --- Comment #4 from Marc Glisse --- (In reply to pskocik from comment #3) > I don't know if this is related, It isn't, please file a separate bug report. memcmp is optimized to an integer comparison in strlen, much later than the lowering of __b

[Bug target/93459] ix86_fold_builtin should handle __builtin_ia32_pcmpeqd128_mask and similar builtins

2020-01-27 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93459 --- Comment #2 from Marc Glisse --- For __builtin_ia32_vec_ext_v4si, shouldn't we lower it to an array access in gimple, when the second argument is constant? I assume we don't want to do it directly in smmintrin.h for diagnostic purposes.

[Bug tree-optimization/93504] Missed reassociation with constants and not of that constant with IORs

2020-01-30 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93504 --- Comment #4 from Marc Glisse --- /* (x & ~m) | (y & m) -> ((x ^ y) & m) ^ x */ I guess several transformations like this one which match (unary m) could do with a second version for the case where m is constant (and thus (unary m) is already
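
The rewrite quoted from match.pd in this comment can be sanity-checked in plain C. The sketch below uses 32-bit unsigned values and hypothetical function names; it only tests the bitwise identity, not how GCC applies it.

```c
#include <assert.h>
#include <stdint.h>

/* Left-hand side: take bits of y where m is set, bits of x elsewhere. */
uint32_t blend_and_or(uint32_t x, uint32_t y, uint32_t m) {
    return (x & ~m) | (y & m);
}

/* Right-hand side of the rewrite: ((x ^ y) & m) ^ x. */
uint32_t blend_xor(uint32_t x, uint32_t y, uint32_t m) {
    return ((x ^ y) & m) ^ x;
}

/* Both forms should agree for any mask, including all-zeros and all-ones. */
void check_blend(void) {
    const uint32_t masks[] = {0x00000000u, 0x0000FFFFu, 0xF0F0F0F0u, 0xFFFFFFFFu};
    for (unsigned i = 0; i < sizeof masks / sizeof masks[0]; ++i)
        assert(blend_and_or(0xAAAA5555u, 0x12345678u, masks[i])
               == blend_xor(0xAAAA5555u, 0x12345678u, masks[i]));
}
```

Where a bit of m is 1, the second form yields (x ^ y) ^ x, which is y; where it is 0, it yields 0 ^ x, which is x.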

[Bug target/93535] slow float/double simple constant folding with -Ofast

2020-02-02 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93535 --- Comment #1 from Marc Glisse --- (testing clang++ -O2 -ffast-math is useful as well) It would be more helpful if you could isolate some specific transformations that gcc is missing, instead of one big benchmark. For instance: double f(int n)

[Bug rtl-optimization/90977] Inconsistencies with inline asm

2020-02-04 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90977 --- Comment #3 from Marc Glisse --- (In reply to Segher Boessenkool from comment #2) > Please open a separate bug for the powerpc ICE? I just tried and couldn't reproduce the ICE with the gcc-9 ppc64el cross-compiler I have now. Maybe that ICE i

[Bug target/93594] Missed optimization with _mm256_set/setr_m128i intrinsics

2020-02-06 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93594 --- Comment #4 from Marc Glisse --- The versions involving _mm256_cast* may be related to PR50829 and others (UNSPEC hiding the semantics of the operation).

[Bug target/93594] Missed optimization with _mm256_set/setr_m128i intrinsics

2020-02-06 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93594 --- Comment #9 from Marc Glisse --- (In reply to Jakub Jelinek from comment #6) > if we change the cast patterns so that they are a > vec_concat of the operand and UNSPEC_CAST that then represents just the > uninitialized higher part, simplify-rt

[Bug middle-end/93644] [10 Regression] -Wreturn-local-addr July regression: new false-positive warning

2020-02-09 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93644 --- Comment #2 from Marc Glisse --- # buffer_2 = PHI <&stack_bufD.1939(3), buffer_7(D)(9)> buffer_18 = ASSERT_EXPR ; Can't we deduce from this buffer_18 = buffer_7(D) ? Of course that's not a general solution, but it looks like a sensib

[Bug target/60181] constant folding of complex number incorrect

2020-02-11 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60181 --- Comment #10 from Marc Glisse --- Flags like -ftrapping-math can prevent gcc from folding at compile-time when the result is infinite (or maybe it always refuses to fold in that case). In your example, gcc generates a runtime call to __muldc3

[Bug tree-optimization/93745] New: Redundant store not eliminated with intermediate instruction

2020-02-14 Thread glisse at gcc dot gnu.org
-optimization Severity: normal Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- double d; void f(long*p){ long i=*p; d=3.; *p=i; } I would like *p=i to be

[Bug tree-optimization/93745] Redundant store not eliminated with intermediate instruction

2020-02-14 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93745 --- Comment #2 from Marc Glisse --- Ah, right :-( I thought the example rang a bell, but before your explanation I couldn't connect it, thanks.

[Bug tree-optimization/93745] [8/9/10 Regression] Redundant store not eliminated with intermediate instruction

2020-02-17 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93745 --- Comment #8 from Marc Glisse --- (In reply to Martin Sebor from comment #7) > But regardless of what language might have even looser rules than C/C++ in > this area, it would seem like a rather unfortunate design limitation for GCC > not to be

[Bug tree-optimization/93891] New: CSE where clobber writes the same value

2020-02-22 Thread glisse at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- void f(int**p,int**q){ ++**p; *q=*p; --**p; } produces _1 = *p_8(D); _2 = *_1; _3 = _2 + 1; *_1 = _3; *q_10(D

[Bug tree-optimization/93891] CSE where clobber writes the same value

2020-02-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93891 --- Comment #1 from Marc Glisse --- On the original code (I can attach it if needed, but it is large, it is resizing a std::vector with reference-counted elements) FRE3 fails to simplify MEM[(struct Handle_for *)__cur_16] ={v} {CLOBBER}; _17

[Bug tree-optimization/93896] Store merging uses SSE only for trivial types

2020-02-23 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93896 --- Comment #1 from Marc Glisse --- Without the constructor, we get plain *this_2(D).a = {}; which is expanded as an __int128 store. With the constructor, we get MEM[(struct M *)this_2(D)].p = 0B; MEM[(struct M *)this_2(D)].sz = 0; MEM

[Bug middle-end/93902] conversion from 64-bit long or unsigned long to double prevents simple optimization

2020-02-24 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93902 --- Comment #1 from Marc Glisse --- The current optimization in match.pd is an equivalence, it replaces (double)i==(double)j with i==j when the conversion is always exact. Here, what we would want is that inside the a==b branch, the compiler woul
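
To see why the replacement of (double)i==(double)j by i==j needs the conversion to be exact, here is a small sketch with 64-bit integers. It assumes an IEEE 754 binary64 `double` evaluated without excess precision and with round-to-nearest (as on typical x86-64 builds); the function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Compare after converting both operands to double. */
bool equal_as_double(int64_t i, int64_t j) {
    return (double)i == (double)j;
}

/* Compare the integers directly. */
bool equal_as_int(int64_t i, int64_t j) {
    return i == j;
}

/* 2^53 and 2^53 + 1 are distinct integers but round to the same double. */
void check_inexact_conversion(void) {
    const int64_t big = INT64_C(1) << 53;
    assert(equal_as_double(big, big + 1));
    assert(!equal_as_int(big, big + 1));
}
```

For 32-bit `int` every value fits in the 53-bit significand, so the two comparisons are equivalent there; for 64-bit types they are not.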

[Bug tree-optimization/93917] New: VRP forgets range of value read from memory

2020-02-24 Thread glisse at gcc dot gnu.org
Priority: P3 Component: tree-optimization Assignee: unassigned at gcc dot gnu.org Reporter: glisse at gcc dot gnu.org Target Milestone: --- First a case that works: void f(int n){ if(n<0)__builtin_unreachable(); } EVRP assigns a range to n, and VRP1 folds

[Bug c/29186] optimzation breaks floating point exception flag reading

2020-02-25 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=29186 --- Comment #23 from Marc Glisse --- (In reply to Richard B. Kreckel from comment #22) > I can't reproduce this bug any more, I think you are just lucky, I am sure it hasn't been fixed and gcc will still happily swap FP operations with function

[Bug middle-end/93939] missing optimization for floating-point expression converted to integer whose result is known at compile time

2020-02-26 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93939 --- Comment #1 from Marc Glisse --- typedef long T; also generates a comparison with 24. The main issue is that b is used outside of the branch controlled by if(b==8), so a naive substitution misses it. Repeating 3*b in the branch is useless, w

[Bug tree-optimization/93971] C++ containers considered to alias declared objects of incompatible types

2020-02-28 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93971 --- Comment #1 from Marc Glisse --- If x is NaN, you cannot simplify x!=x to false.
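
A minimal C sketch of the NaN point made in this comment follows. It assumes the default floating-point model (no -ffast-math or -ffinite-math-only, which allow the compiler to assume NaNs never occur); the function names are hypothetical.

```c
#include <assert.h>
#include <math.h>

/* Returns 1 when x compares unequal to itself, which happens only for NaN. */
int differs_from_itself(double x) {
    return x != x;
}

/* NaN is the one value for which x != x holds. */
void check_nan(void) {
    assert(differs_from_itself(NAN));
    assert(!differs_from_itself(1.0));
    assert(!differs_from_itself(0.0));
}
```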

[Bug tree-optimization/93971] std::string considered to alias declared objects of incompatible types

2020-02-29 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93971 --- Comment #5 from Marc Glisse --- It has never been very clear to me what restrict means on a struct member, but I believe adding it to the pointer in vector means that in a function: void f(vector*a, vector*b) the compiler could assume that a-

[Bug tree-optimization/93971] std::string considered to alias declared objects of incompatible types

2020-03-01 Thread glisse at gcc dot gnu.org
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93971 --- Comment #7 from Marc Glisse --- (In reply to Martin Sebor from comment #6) > (1) one that would make "std::string::ptr" on par with > that of any other pointer other than char (i.e., a char that's not allowed > to be used to access anything b

[Bug libstdc++/56111] [4.8 Regression] {float,double,long double} complex not accepted anymore

2013-02-12 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56111 --- Comment #17 from Marc Glisse 2013-02-12 20:55:09 UTC --- (In reply to comment #15) > Marc, any news on this? I landed less than 2 hours ago... I'll see if I can handle it tomorrow.

[Bug libstdc++/56111] [4.8 Regression] {float,double,long double} complex not accepted anymore

2013-02-13 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56111 --- Comment #18 from Marc Glisse 2013-02-13 21:58:57 UTC --- Author: glisse Date: Wed Feb 13 21:58:53 2013 New Revision: 196034 URL: http://gcc.gnu.org/viewcvs?root=gcc&view=rev&rev=196034 Log: 2013-02-13 Marc Glisse PR libs

[Bug libstdc++/56111] [4.8 Regression] {float,double,long double} complex not accepted anymore

2013-02-13 Thread glisse at gcc dot gnu.org
||FIXED AssignedTo|unassigned at gcc dot |glisse at gcc dot gnu.org |gnu.org | --- Comment #19 from Marc Glisse 2013-02-13 22:02:33 UTC --- Done. Users should still convert their code to use _Complex instead if they

[Bug tree-optimization/56355] New: abs and multiplication

2013-02-16 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56355 Bug #: 56355 Summary: abs and multiplication Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Keywords: missed-optimization Sev

[Bug tree-optimization/56355] abs and multiplication

2013-02-16 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56355 --- Comment #1 from Marc Glisse 2013-02-16 12:07:28 UTC --- Actually, for g/h with double, using __builtin_fabs instead of std::abs does it, so it is just the usual lack of combine at the tree level. But there is still f, and the builtin a

[Bug c++/51033] generic vector subscript and shuffle support was not added to C++

2013-02-17 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=51033 --- Comment #31 from Marc Glisse 2013-02-17 15:04:15 UTC --- (In reply to comment #30) > Another example is binary operators between scalar and vectors. In C the > scalar > is automatically treated as a vector, but in C++ it results in a

[Bug tree-optimization/55796] Comparison with a negated number vs sum

2013-02-28 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55796 --- Comment #2 from Marc Glisse 2013-02-28 16:57:35 UTC --- (In reply to comment #1) > DOM is the optimization that tries to optimize these cases. It's probably > overly careful for FP compares. Note that the compiler doesn't optimize

[Bug tree-optimization/56488] [4.7 Regression] wrong code for loop at -O3

2013-03-01 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56488 --- Comment #3 from Marc Glisse 2013-03-01 09:53:20 UTC --- Seems to me that 'e' is signed and the testcase relies on wrapping overflow (-fwrapv helps).

[Bug tree-optimization/56488] [4.7 Regression] wrong code for loop at -O3

2013-03-01 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56488 --- Comment #5 from Marc Glisse 2013-03-01 10:05:53 UTC --- You are right, of course. I remembered that gcc defined unsigned->signed conversion, but I had forgotten that it defined all narrowing conversions as well, sorry.
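
For readers skimming this thread, a minimal sketch of the distinction behind comments #3 and #5. This is not the testcase from the PR; the function names are made up for illustration, and the expected values assume an 8-bit char. Signed arithmetic overflow is undefined unless -fwrapv is given, whereas GCC documents the out-of-range conversion to a signed integer type as a reduction modulo 2^N, so a narrowing conversion does not rely on -fwrapv.

```c
#include <assert.h>
#include <limits.h>

/* Narrowing conversion (hypothetical helper): implementation-defined in
   ISO C when the value does not fit; GCC reduces it modulo 2^N. */
signed char narrow_to_schar(int v)
{
    return (signed char)v;
}

/* Wrapping addition without -fwrapv (hypothetical helper): the sum is
   computed in unsigned arithmetic, which is well defined, then converted
   back to int using the same modulo-2^N conversion rule. */
int wrapping_add(int a, int b)
{
    return (int)((unsigned)a + (unsigned)b);
}

/* Small self-check of the behaviour described above. */
void check_examples(void)
{
    assert(narrow_to_schar(200) == -56);          /* 200 - 256 */
    assert(narrow_to_schar(256) == 0);            /* 256 mod 256 */
    assert(wrapping_add(INT_MAX, 1) == INT_MIN);  /* wraps around */
}
```

By contrast, writing `a + b` directly on two `int` operands that overflow is the case where the optimizer may assume no overflow happens, which is what -fwrapv changes.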

[Bug rtl-optimization/55611] Operand swapping for commutative operators

2013-03-11 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55611 --- Comment #2 from Marc Glisse 2013-03-11 14:44:09 UTC --- The fortran test that fails is equivalent to the following (use -Ofast -g, surprisingly it only fails in var tracking) float f(double*a,double*b){ double x=a[0]*b[0]; x+=a

[Bug rtl-optimization/55611] Operand swapping for commutative operators

2013-03-11 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55611 --- Comment #3 from Marc Glisse 2013-03-11 15:14:46 UTC --- (In reply to comment #1) > I also tried the reverse order (just swap x and y in the GET_CODE comparison). > It got a crazy process during stage3 compiling tree-ssa-address.c (I ki

[Bug rtl-optimization/55611] Operand swapping for commutative operators

2013-03-12 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55611 Marc Glisse changed: What|Removed |Added Attachment #28931|0 |1 is obsolete|

[Bug tree-optimization/56355] abs and multiplication

2013-03-20 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56355 --- Comment #3 from Marc Glisse 2013-03-20 15:40:30 UTC --- Jeff Law mentions that it would be good to add this "squares are nonnegative" information to VRP, not just tree_binary_nonnegative_warnv_p. http://gcc.gnu.org/ml/gcc-patches/2013-

[Bug tree-optimization/56355] abs and multiplication

2013-03-20 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56355 --- Comment #4 from Marc Glisse 2013-03-20 16:19:33 UTC --- (In reply to comment #3) > Jeff Law mentions that it would be good to add this "squares are nonnegative" > information to VRP, not just tree_binary_nonnegative_warnv_p. > http://

[Bug tree-optimization/56695] ICE in expand_vec_cond_expr, at optabs.c:6751

2013-03-23 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56695 Marc Glisse changed: What|Removed |Added CC||glisse at gcc dot gnu.org

[Bug tree-optimization/56695] ICE in expand_vec_cond_expr, at optabs.c:6751

2013-03-23 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56695 --- Comment #3 from Marc Glisse 2013-03-23 17:06:30 UTC --- In any case, it would be good to add some checks on the first argument of VEC_COND_EXPR to the verifier, so the problem is detected earlier than expand.

[Bug tree-optimization/56695] [4.9 Regression] ICE in expand_vec_cond_expr, at optabs.c:6751

2013-03-26 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56695 --- Comment #6 from Marc Glisse 2013-03-26 16:03:04 UTC --- Created attachment 29733 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29733 Untested patch I was thinking about something like this. In 4.8, I added vec_cond_expr expan

[Bug libstdc++/56785] std::tuple of two elements does not apply empty base class optimization when one of its elements is a std::tuple with two elements

2013-03-30 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56785 --- Comment #1 from Marc Glisse 2013-03-30 09:06:44 UTC --- I assume (needs to be checked) this is related to the issue, reported elsewhere, that makes the size of std::tuple>>> grow linearly with the nesting depth.

[Bug target/50829] avx extra copy for _mm256_insertf128_pd

2013-03-30 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=50829 --- Comment #13 from Marc Glisse 2013-03-30 10:13:46 UTC --- (In reply to comment #10) > Created attachment 28846 [details] > Use subreg The patch was submitted at http://gcc.gnu.org/ml/gcc-patches/2013-03/msg00683.html and rejected, s

[Bug target/56788] New: _mm_frcz_sd and _mm_frcz_ss ignore their second argument

2013-03-30 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56788 Bug #: 56788 Summary: _mm_frcz_sd and _mm_frcz_ss ignore their second argument Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED

[Bug target/56788] _mm_frcz_sd and _mm_frcz_ss ignore their second argument

2013-03-30 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56788 --- Comment #1 from Marc Glisse 2013-03-30 20:42:51 UTC --- _mm_frcz_ss is inconsistent between compilers. Microsoft gives it 2 arguments and movss-like semantics, whereas clang gives it a single argument. AMD doesn't document intrinsics,

[Bug tree-optimization/56790] New: VEC_COND_EXPR not constant folded

2013-03-30 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=56790 Bug #: 56790 Summary: VEC_COND_EXPR not constant folded Classification: Unclassified Product: gcc Version: 4.8.0 Status: UNCONFIRMED Keywords: missed-optimization

[Bug target/55583] Extended shift instruction on x86-64 is not used, producing unoptimal code

2013-04-01 Thread glisse at gcc dot gnu.org
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55583 --- Comment #5 from Marc Glisse 2013-04-01 13:45:33 UTC --- Created attachment 29764 --> http://gcc.gnu.org/bugzilla/attachment.cgi?id=29764 Patch from comment #4 I apparently forgot to attach a patch when I posted comment #4. This is
