from:"amonakov at gcc dot gnu.org via Gcc-bugs"

[Bug target/106902] [12/13/14/15/16 Regression] Program compiled with -O3 -mfma produces different result

2025-05-30 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #39 from Alexander Monakov --- > I don't think we need any mass rebuilds there, I'm fine with approving > a change of the default now. The question is what to do for languages > other than C/C++, that complication might be a reason

[Bug c++/120456] __builtin_shuffle produces unnecessary vperm2i128

2025-05-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120456 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/120398] [15/16 Regression] vectorization emits shuffles followed by scalar adds

2025-05-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120398 --- Comment #2 from Alexander Monakov --- Right, aarch64 vectorizes at -O3 to the desired form, but not -O2: .L3: ldr d1, [x0, x2, lsl 3] add x2, x2, 1 fmla v31.2s, v1.2s, v1.2s cmp x1, x2

[Bug tree-optimization/109892] SLP failure with explicit fma

2025-05-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109892 --- Comment #3 from Alexander Monakov --- In the meantime codegen for 'g' substantially regressed (with or without -mfma), PR 120398.

[Bug tree-optimization/120398] New: vectorization emits shuffles followed by scalar adds

2025-05-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120398 Bug ID: 120398 Summary: vectorization emits shuffles followed by scalar adds Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Compon

[Bug tree-optimization/120396] New: unprofitable SLP vectorization, leaves scalar parts live

2025-05-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120396 Bug ID: 120396 Summary: unprofitable SLP vectorization, leaves scalar parts live Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Prio

[Bug target/120294] Missed DCE with xor when emulating __builtin_ctzg() with __builtin_ctz() and bitshift

2025-05-15 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=120294 Alexander Monakov changed: What|Removed |Added Status|RESOLVED|UNCONFIRMED Resolution|DUPL

[Bug c/90253] no warning for cv-qualified selectors in _Generic

2025-04-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90253 --- Comment #3 from Alexander Monakov --- clang-15 and newer warn for this, enabled by default: warning: due to lvalue conversion of the controlling expression, association of type 'const char' will never be selected because it is qualified [-Wu

[Bug c/119845] Something triggered by -march=native create code that is not compliant with floating point standards

2025-04-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119845 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org

[Bug c/119774] New: Missing -Wcast-align for reduced-alignment types

2025-04-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119774 Bug ID: 119774 Summary: Missing -Wcast-align for reduced-alignment types Product: gcc Version: unknown Status: UNCONFIRMED Severity: normal Priority: P3 Compone

[Bug tree-optimization/119733] store-merging increases alignment

2025-04-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119733 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/119596] x86: too eager use of rep movsq/rep stosq for inlined ops

2025-04-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119596 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-04-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #56 from Alexander Monakov --- I think you mean -fno-plt, not -mno-plt (here and your previous comment)? Ideally we would be able to express a relocation to PLT, but without admitting lazy binding (i.e. the trampoline will be in .pl

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-04-01 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #54 from Alexander Monakov --- I think the x86_64 behavior is simply copied as-is from i386. On i386, there was a time when Glibc wouldn't preserve eax+ecx+edx in the PLT trampoline, but preserving those became necessary when GCC ex

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-31 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #51 from Alexander Monakov --- Michael, can you give your ack/nack for Ard's proposal in comment #24 (the second variant, I guess keying off -m[no-]direct-extern-access doesn't make sense here). I think it properly addresses what you

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #49 from Alexander Monakov --- Aha, and I see the kernel employs the trick of preincluding a file containing '#pragma GCC visibility push(hidden)' when building PIE objects since 2020. So the mcount-emitting macro in the i386 backe

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #47 from Alexander Monakov --- (In reply to Ard Biesheuvel from comment #43) > Non-PIC might be more efficient, but there are cases where we cannot use it. > The early startup code on x86 runs from a different virtual mapping than it

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #46 from Alexander Monakov --- A small correction: -static-pie is not a linker option, it's a gcc (compiler driver) option, which it decomposes into -static -pie --no-dynamic-linker -z text for the linker; --no-dynamic-linker was add

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #44 from Alexander Monakov --- (In reply to Ard Biesheuvel from comment #43) > arch/arm64/Makefile specifies '-shared' for the linker flags, but does not > pass -fpic or -fpie to the compiler. We used to pass '-pie -shared' but that

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #42 from Alexander Monakov --- > In Linux, we don't even bother with PIC codegen, even though we link with > -pie. My git-grep for that is coming up empty, where should I look? > ... on x86_64, where PIC and non-PIC codegen are r

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #40 from Alexander Monakov --- > In Linux, we don't even bother with PIC codegen, even though we link with > -pie. Earlier you said that building with -fPIC may be desirable, and your patch was dealing with PIC codegen in GCC for m

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #38 from Alexander Monakov --- (In reply to Ard Biesheuvel from comment #37) > Yes, we can drop -mcmodel=kernel, and use -mcmodel=small instead. This is > why I'm not keen on relying on that - it is ill-defined and there is really >

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #36 from Alexander Monakov --- Today, gcc rejects -fpic -mcmodel=kernel on the command line though, and it doesn't look like you can drop -mcmodel=kernel, so... was there some plan for dealing with that? (considering new flags to co

[Bug c++/119387] [14/15 Regression] Regression in performance by a factor of 6 when building with debugging symbols since r14-5979

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119387 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #34 from Alexander Monakov --- We have -mcmodel=kernel already, which is incompatible with -fpic. Ard, where is -fpic in the kernel context coming from? Kernel's top-level Makefile passes -fno-PIE, and arch/x86/Makefile passes -mcmo

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #30 from Alexander Monakov --- Sorry, sent too soon: a.out had the concept of PLT as well as GOT: https://gcc.gnu.org/cgit/gcc/tree/gcc/config/i386/i386.c?id=c98f874233428d7e6ba83def7842fd703ac0ddf1#n820

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #31 from Alexander Monakov --- I am certainly missing some interesting history here, because on one hand, I see that a.out uses call-pop combo in function prologue to find out current PC, and then uses %ebx-relative addressing in PIC

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #29 from Alexander Monakov --- (In reply to Alexander Monakov from comment #21) > GOT indirection for mcount has been there from the very beginning: > https://gcc.gnu.org/cgit/gcc/tree/gcc/config/i386/i386. > h?id=c98f874233428d7e6ba

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #21 from Alexander Monakov --- GOT indirection for mcount has been there from the very beginning: https://gcc.gnu.org/cgit/gcc/tree/gcc/config/i386/i386.h?id=c98f874233428d7e6ba83def7842fd703ac0ddf1#n623

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #25 from Alexander Monakov --- (In reply to Ard Biesheuvel from comment #24) > - never emit 'call mcount' > - emit 'call *mcount@GOTPCREL(%rip)' if -fno-plt > - emit 'call mcount@PLT' otherwise As discussed, gcc was always using GOT

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #23 from Alexander Monakov --- That's probably just copying existing behavior from 32-bit x86. Can we preserve previous behavior that under -fpic -mno-direct-extern-access mcount is called via GOT (and else emit mcount@PLT if -fpic

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #19 from Alexander Monakov --- The question is why prior to your patch GCC emitted mcount@GOT (i.e. avoiding the PLT trampoline for mcount on purpose), going all the way back to gcc-3.4 (and probably further).

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 --- Comment #15 from Alexander Monakov --- Any idea why prior to introduction of this bug, gcc always emitted a GOT-indirect call for mcount? It looks like it is avoiding lazy PLT resolver, but why is that necessary?

[Bug target/119386] [14/15 Regression][x64] Shared libraries can no longer be compiled with profiling

2025-03-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119386 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/119368] immintrin code running slower with gcc than clang

2025-03-19 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119368 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug tree-optimization/119103] shift not demotated when shift amount range is known

2025-03-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=119103 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/106902] [12/13/14/15 Regression] Program compiled with -O3 -mfma produces different result

2025-02-28 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=106902 --- Comment #36 from Alexander Monakov --- We can flip the default from =fast to =on for -std=gnuXX any time we like, but it must remain at =off for -std=cXX as long as the STDC FP_CONTRACT pragma is not implemented. I have not attempted any mass r

[Bug c/118818] Optimization of divps to rcpps + newton can cause slow down

2025-02-10 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118818 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug tree-optimization/118570] -O2 much faster than -O3 for Romberg's method

2025-01-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118570 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug ipa/117432] [12/13/14/15 Regression] IPA ICF disregards types of variadic arguments since r10-4643-ga37f58f506e436

2025-01-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117432 --- Comment #10 from Alexander Monakov --- Yeah, I would expect compare_operand to be the proper place for a fix, not its callers.

[Bug target/118342] `a == 0 ? 32 : __builtin_ctz(a)` for Intel and AMD cores could be implemented even without BMI1

2025-01-09 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118342 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug tree-optimization/118198] tail merge/cross jump should not merge abort

2025-01-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=118198 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/117926] [14/15 Regression] emits 3dnow (MMX) instruction from autovectorized GIMPLE without emms at -O2 since r14-2786-gade30fad6669e5

2024-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117926 --- Comment #5 from Alexander Monakov --- Thanks, here's a variant of the small testcase that fails on gcc-14 too, just needed to make the integer field the first in the struct: struct s { int i[2]; float f[2]; double d; }; void f(s

[Bug target/117926] New: [15 Regression] emits MMX from autovectorized GIMPLE without emms at -O2

2024-12-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117926 Bug ID: 117926 Summary: [15 Regression] emits MMX from autovectorized GIMPLE without emms at -O2 Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: wrong

[Bug c/117469] returns_twice on defined functions

2024-11-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117469 --- Comment #4 from Alexander Monakov --- The code in comment #3 is invalid: siglongjmp is called when the state saved in env is no longer valid: plat_setjmp has returned (and the stack slot where its return address is stored is overwritten).
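
The validity rule stated in comment #4 can be illustrated with a minimal sketch in C. It does not reproduce the code from comment #3; the names `run_with_recovery` and `fail_path` are hypothetical, and plain setjmp/longjmp stand in for the signal-aware variants. The point it shows is that the frame which called setjmp is still live when longjmp runs, which is what a wrapper such as plat_setjmp cannot guarantee once it has returned.

```c
#include <setjmp.h>

static jmp_buf env;

/* Hypothetical callee that bails out through the saved context. */
static void fail_path(void)
{
    longjmp(env, 1);
}

/* setjmp is called directly in this frame, and the frame stays live
   until fail_path() jumps back, so the saved state is still valid. */
int run_with_recovery(void)
{
    if (setjmp(env) != 0)
        return 1;   /* reached a second time, via longjmp */
    fail_path();
    return 0;       /* not reached in this sketch */
}
```

Moving the setjmp call into a helper that returns before fail_path() runs would make the later longjmp invalid, which is the situation the comment describes.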

[Bug target/117421] [RISCV] Use byte comparison instead of word comparison

2024-11-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117421 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug rtl-optimization/117476] [15 regression] bad generated code at -O1 since r15-4991-g69bd93c167fefb

2024-11-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117476 --- Comment #22 from Alexander Monakov --- *** Bug 117532 has been marked as a duplicate of this bug. ***

[Bug rtl-optimization/117532] [15 Regression] Miscompile with -Os and -O0/1/2/3

2024-11-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117532 Alexander Monakov changed: What|Removed |Added Resolution|--- |DUPLICATE CC|

[Bug c/117469] returns_twice on defined functions

2024-11-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117469 --- Comment #2 from Alexander Monakov --- (In reply to Xi Ruoyao from comment #1) > So if the tail-call uses [[musttail]] the alternative 3 should be "fine"? Yes, plus annotating the callees that return twice with the attribute is still require

[Bug c/117469] New: returns_twice on defined functions

2024-11-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117469 Bug ID: 117469 Summary: returns_twice on defined functions Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: c A

[Bug ipa/117432] [12/13/14/15 Regression] IPA ICF disregards types of variadic arguments since r10-4643-ga37f58f506e436

2024-11-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117432 --- Comment #6 from Alexander Monakov --- compare_operand is used in compare_asm_inputs_outputs, so this is broken too: void foo32(void) { asm("" :: "r"(-1)); } void foo64(void) { asm("" :: "r"(-1LL)); }

[Bug ipa/117432] [11/12/13/14/15 Regression] IPA ICF disregards types of variadic arguments

2024-11-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117432 --- Comment #1 from Alexander Monakov --- Created attachment 59528 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=59528&action=edit executable testcase

[Bug ipa/117432] New: [11/12/13/14/15 Regression] IPA ICF disregards types of variadic arguments

2024-11-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117432 Bug ID: 117432 Summary: [11/12/13/14/15 Regression] IPA ICF disregards types of variadic arguments Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: wro

[Bug ipa/112601] [12/13/14/15 Regression] ICE in cgraph_node::verify_node(): error: invalid calls_comdat_local flag

2024-10-29 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112601 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug middle-end/117249] [12/13/14/15 Regression] --disable-checking is broken since r5-2450

2024-10-23 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117249 --- Comment #12 from Alexander Monakov --- On IRC Jakub mentioned gcc_assert (token() == TYPEDEF) in gengtype and Richi further noted tree-ssa-loop-ivopts.cc:gcc_assert (use->op_p = gimple_call_arg_ptr (call, 0)); cgraph.cc: gcc_assert (++edge

[Bug middle-end/117249] [12/13/14/15 Regression] --disable-checking is broken since r5-2450

2024-10-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117249 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug rtl-optimization/117239] [12/13/14/15 Regression] wrong code at -O{s,2} with "-fno-inline -fschedule-insns" on x86_64-linux-gnu

2024-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117239 --- Comment #4 from Alexander Monakov --- (In reply to Alexander Monakov from comment #2) > Alternatively, > changing 'if (o.i)' to 'if (o.i != 1)' allows to reproduce with PIE as well. ^ I meant 'if (o.i ==

[Bug rtl-optimization/117239] wrong code at -O{s,2} with "-fno-inline -fschedule-insns" on x86_64-linux-gnu

2024-10-20 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117239 --- Comment #2 from Alexander Monakov --- Amazing bug. Note that it depends on high-order bits of return address overwriting o.i, so may need -no-pie -fno-pie to reproduce. Alternatively, changing 'if (o.i)' to 'if (o.i != 1)' allows to reproduc

[Bug target/87832] AMD pipeline models are very costly size-wise

2024-10-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=87832 --- Comment #15 from Alexander Monakov --- No, I didn't do older AMDs (btver2 & bdver3) and newer AMD (znver4) regressed this once again. Here's the current picture of top 10: nm -CS -t d --defined-only gcc/insn-automata.o | sed 's/^[0-9]* 0*//'

[Bug other/116947] --enable-checking=valgrind ignores failures during bootstrap

2024-10-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116947 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/116738] Constant folding of _mm_min_ss and _mm_max_ss is wrong

2024-09-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116738 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to

2024-09-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 --- Comment #11 from Alexander Monakov --- > It only handles switch statements, not computed gotos. Oh, right, apologies for misunderstanding your question like that. For computed gotos it is indeed not so easy, especially if there is more than

[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to

2024-09-12 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 --- Comment #9 from Alexander Monakov --- (In reply to Xi Ruoyao from comment #8) > Is there any pointer how to implement this instead? It may be sufficient to change (define_insn "@tablejump" [(set (pc) (match_operand:P 0 "register_

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 Alexander Monakov changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug c/116483] RFE: a notion for asm goto to indicate all labels in the function may be jumped to

2024-08-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116483 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 --- Comment #12 from Alexander Monakov --- Thanks. It's probably nicer to deduplicate computation of required padding to a common header (libcpp/internal.h), I'll send a patch to that effect.

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 --- Comment #9 from Alexander Monakov --- Okay, if you take the addition and the branch from the inlined variant: addl %eax, %edx je .L3 and add a 'test' instruction: addl %eax, %edx test %edx, %edx je .L3 then Valgrind doesn't complain. So

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 --- Comment #8 from Alexander Monakov --- Thanks for the reference, but it doesn't help. Something more subtle is going on, because placing the shift-add combo in a separate function makes Valgrind properly compute known bits even without the ma

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 --- Comment #6 from Alexander Monakov --- As for Valgrind false positive, it handles this SSSE3 code really well and misses the key point by a very narrow margin. We have found = m1 + (m2 << 16); where both m1 and m2 hold 16-bit masks from p
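
The expression quoted in comment #6 can be sketched in C as follows. This is only an illustration under the assumption stated in the comment, namely that both inputs are 16-bit masks; the helper name `combine_masks` is hypothetical and is not taken from libcpp. Since one mask occupies bits 0..15 and the shifted one bits 16..31, the addition cannot carry and gives the same value as a bitwise OR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not from libcpp: merge two 16-bit masks into one
   32-bit value the same way as `found = m1 + (m2 << 16)`. */
static uint32_t combine_masks(uint32_t m1, uint32_t m2)
{
    /* Assumption: both inputs are 16-bit masks. */
    assert(m1 <= 0xFFFFu && m2 <= 0xFFFFu);
    /* The operands share no set bits, so the addition cannot carry
       and yields the same value as m1 | (m2 << 16). */
    return m1 + (m2 << 16);
}
```

A tool that tracks defined bits per operand needs to know that no carry can propagate here in order to treat the sum like an OR.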

[Bug preprocessor/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 Alexander Monakov changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Ever confirmed|0

[Bug c/116458] [15 regression] New valgrind error in search_line_ssse3

2024-08-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116458 --- Comment #3 from Alexander Monakov --- David, thanks for Cc'ing me and for running Valgrind builds! Richi, I'll check in more detail later today, I think we should unbreak Valgrind builds ASAP by initializing padding under #ifdef ENABLE_VALG

[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN

2024-07-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 --- Comment #15 from Alexander Monakov --- (In reply to Jakub Jelinek from comment #14) > (In reply to Alexander Monakov from comment #13) > > fldt does not convert (otherwise there's no way to spill/reload x87 > > registers). > > Doesn't it st

[Bug target/114659] gcc miscompiles a __builtin_memcpy on i386, leading to wrong results for SNaN

2024-07-26 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114659 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug ipa/115533] [12/13/14/15 regression] flac miscompiled with -O3 -march=znver2 -fipa-pta -fno-vect-cost-model since r12-3893-g6390c5047adb75

2024-07-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115533 --- Comment #26 from Alexander Monakov --- (In reply to Richard Biener from comment #24) > > That's because of -fno-vect-cost-model, it wouldn't be vectorized otherwise. Thanks, I forgot. The testcase in PR 106902 was vectorized at plain -O3 b

[Bug ipa/115533] [12/13/14/15 regression] flac miscompiled with -O3 -march=znver2 -fipa-pta -fno-vect-cost-model since r12-3893-g6390c5047adb75

2024-07-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115533 --- Comment #23 from Alexander Monakov --- I suggest closing this as a dup of PR 106902 if there are no better ideas. By the way, in both cases SLP introduces vectors in a loop where scalar computations it's attempting to replace are not elimi

[Bug ipa/115533] [12/13/14/15 regression] flac miscompiled with -O3 -march=znver2 -fipa-pta -fno-vect-cost-model since r12-3893-g6390c5047adb75

2024-06-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115533 --- Comment #22 from Alexander Monakov --- Similar to the RawTherapee issue, SLP opportunities are created by predcom, so either -fno-predictive-commoning or -fno-tree-slp-vectorize avoids numerical runaway on the small testcase.

[Bug ipa/115533] [12/13/14/15 regression] flac miscompiled with -O3 -march=znver2 -fipa-pta -fno-vect-cost-model since r12-3893-g6390c5047adb75

2024-06-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115533 --- Comment #20 from Alexander Monakov --- Sam, can you provide more context? It seems there is no downstream bugreport? How does the alleged miscompilation manifest? Note that effects of interplay of fp-contract=fast and vectorization can be p

[Bug target/115333] -march=native sets --param "l2-cache-size=1024" on Ryzen 7 7800X3D

2024-06-03 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115333 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics

2024-05-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161 --- Comment #23 from Alexander Monakov --- (In reply to Sergei Trofimovich from comment #22) > Here `pcmpeqd %xmm2,%xmm1` is a problematic instruction. Why does `gcc` use > `%xmm2` (result of `cvttps2dq`) instead of, say `%xmm0` which contains >

[Bug middle-end/115170] __cxa_atexit@plt even if -fno-plt

2024-05-24 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115170 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics

2024-05-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161 --- Comment #20 from Alexander Monakov --- (In reply to Jakub Jelinek from comment #19) > If we guarantee that we never constant fold FIX/UNSIGNED_FIX with > -ftrapping-math (we shouldn't, as the exceptions should be raised), then > using FIX/UN

[Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics

2024-05-22 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161 --- Comment #18 from Alexander Monakov --- No, allowing value-changing transformations under -ftrapping-math is really not appropriate. Invoking the intrinsic on a large floating-point value is not UB.

[Bug target/115161] [15 Regression] highway-1.0.7 miscompilation of some SSE2 intrinsics

2024-05-21 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115161 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug middle-end/115132] Sibling calls optim should not be performed when builtin_unwind_init is used

2024-05-17 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115132 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug middle-end/115091] Support value speculation in frontend

2024-05-16 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115091 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/115014] GCC generates incorrect instructions for addressing the data segment through EBP register

2024-05-10 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115014 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 --- Comment #4 from Alexander Monakov --- Like this: pand xmm1, XMMWORD PTR .LC0[rip] movaps XMMWORD PTR [rsp-40], xmm0 xor eax, eax xor edx, edx movaps XMMWORD PTR [rsp-24], xmm1 mov

[Bug target/114944] Codegen of __builtin_shuffle for an 16-byte uint8_t vector is suboptimal on SSE2

2024-05-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114944 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/114960] New: [12/13/14/15 Regression] fails to clean up vector casts

2024-05-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114960 Bug ID: 114960 Summary: [12/13/14/15 Regression] fails to clean up vector casts Product: gcc Version: 12.3.1 Status: UNCONFIRMED Severity: normal Pri

[Bug c/114923] gcc ignores escaping pointer and applies invalid optimization

2024-05-02 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114923 --- Comment #4 from Alexander Monakov --- You can place points of possible access outside of abstract machine in a fine-grained manner with volatile asms: asm volatile("" : "=m"(buf)); This cannot be reordered against accesses to volatile va

[Bug c/114923] gcc ignores escaping pointer and applies invalid optimization

2024-05-02 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114923 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug libgomp/114765] linking to libgomp and setting CPU_PROC_BIND causes affinity reset

2024-04-18 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114765 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus

2024-04-05 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480 --- Comment #21 from Alexander Monakov --- It is possible to reduce gcc_qsort workload by improving the presorted-ness of the array, but of course avoiding quadratic behavior would be much better. With the following change, we go from 261,2

[Bug c++/114480] g++: internal compiler error: Segmentation fault signal terminated program cc1plus

2024-04-04 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114480 --- Comment #20 from Alexander Monakov --- (note that if you uninclude the testcase and compile with -fno-exceptions it's much faster) On the smaller testcase from comment 14, prune_unused_phi_nodes invokes gcc_qsort 53386 times. There are two

[Bug lto/114337] LTO symbol table doesn't include builtin functions

2024-03-25 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114337 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug target/108866] Allow to pass Windows resource file (.rc) as input to gcc

2024-03-14 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=108866 Alexander Monakov changed: What|Removed |Added CC||amonakov at gcc dot gnu.org --- Com

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1

2024-03-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #10 from Alexander Monakov --- Indeed, but OTOH according to bug 84402 comment 58 it caused a noticeable hit on gimple-match.cc compilation: 733a1b777f16cd397b43a242d9c31761f66d3da8 13th January 2023 sched-deps: do not schedule pseu

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%) since r13-5154-g733a1b777f1

2024-03-13 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #8 from Alexander Monakov --- If we want to get rid of the compilation time regression sooner rather than later, I can suggest limiting my change only to functions that call setjmp: diff --git a/gcc/sched-deps.cc b/gcc/sched-deps.cc

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%)

2024-03-11 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 Alexander Monakov changed: What|Removed |Added CC||mkuvyrkov at gcc dot gnu.org --- Co

[Bug rtl-optimization/114261] [13/14 Regression] Scheduling takes excessive time (97%)

2024-03-06 Thread amonakov at gcc dot gnu.org via Gcc-bugs

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114261 --- Comment #3 from Alexander Monakov --- The first attachment is empty (perhaps you made a non-recursive archive when you meant to recursively zip a directory).


1 - 100 of 456 matches
