[Bug target/115755] mulx (with -mbmi2) does not show up with constant multiply

2024-07-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115755 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/115749] Non optimal assembly for integer modulo by a constant on x86-64 CPUs

2024-07-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115749 Hongtao Liu changed: What|Removed |Added CC||haochen.jiang at intel dot com,

[Bug target/115796] [15 Regression] build failure since double_u -> __double_u change

2024-07-08 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115796 Hongtao Liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/115115] [12/13/14/15 Regression] highway-1.0.7 wrong _mm_cvttps_epi32() constant fold

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115115 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/113733] Invalid APX TLS code squence

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113733 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/113312] Add __attribute__((no_callee_saved_registers)) for Intel FRED

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312 --- Comment #28 from Hongtao Liu --- __attribute__((no_callee_saved_registers)) is added in GCC14.

[Bug target/113312] Add __attribute__((no_callee_saved_registers)) for Intel FRED

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312 Hongtao Liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/115833] SLP of signed short multiply goes wrong

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/115833] SLP of signed short multiply goes wrong

2024-07-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115833 Hongtao Liu changed: What|Removed |Added CC||lin1.hu at intel dot com --- Comment #4 f

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2024-07-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 Hongtao Liu changed: What|Removed |Added Last reconfirmed||2024-07-11 Status|UNCONFIRMED

[Bug tree-optimization/115872] [12/13/14/15 regression] ICE in fab pass (error: missing definition with -g & -O3)

2024-07-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115872 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2024-07-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|UNCONFIRMED Ever confirmed|1

[Bug target/115842] [15 Regression] 6.5% slowdown of 548.exchange2_r on Intel Ice Lake

2024-07-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115842 --- Comment #3 from Hongtao Liu --- (In reply to Hongtao Liu from comment #2) > Bisected to r15-1673-gb8153b5417bed0, the commit fixed wrong rtx_cost of > r15-882-g1d6199e5f8c1c0 which happened to improve 548.exchange_r. Looks like wrong rtx_c

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-07-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 115889, which changed state. Bug 115889 Summary: [15 Regression] FAIL: gcc.dg/vect/vect-vfa-03.c execution test with -march=znver4 --param vect-partial-vector-usage=1 since r15-1368-g6d0b7b69d14302 https://gcc.gnu.org/bu

[Bug target/115889] [15 Regression] FAIL: gcc.dg/vect/vect-vfa-03.c execution test with -march=znver4 --param vect-partial-vector-usage=1 since r15-1368-g6d0b7b69d14302

2024-07-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115889 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/115872] [12/13/14/15 regression] ICE in fab pass (error: missing definition with -g & -O3)

2024-07-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115872 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/115843] [14/15 Regression] 531.deepsjeng_r fails to verify with -O3 -march=znver4 --param vect-partial-vector-usage=2

2024-07-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/115843] [14/15 Regression] 531.deepsjeng_r fails to verify with -O3 -march=znver4 --param vect-partial-vector-usage=2

2024-07-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115843 --- Comment #10 from Hongtao Liu --- > But using kmovw for QImode mask is not correct as we don't know the value in > gpr. Perhaps we'd consider restrict the kmovb under avx512dq only. Why? as long as we only care about lower 8 bits, vmovw sho

[Bug target/113733] Invalid APX TLS code squence

2024-07-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113733 Bug 113733 depends on bug 113711, which changed state. Bug 113711 Summary: APX instruction set and instructions longer than 15 bytes (assembly warning) https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113711 What|Removed

[Bug target/113711] APX instruction set and instructions longer than 15 bytes (assembly warning)

2024-07-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113711 Hongtao Liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug middle-end/115863] [15 Regression] zlib-1.3.1 miscompilation since r15-1936-g80e446e829d818

2024-07-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115863 Hongtao Liu changed: What|Removed |Added CC||lin1.hu at intel dot com --- Comment #16

[Bug tree-optimization/114966] fails to optimize avx2 in-register permute written with std::experimental::simd

2024-07-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114966 --- Comment #5 from Hongtao Liu --- I saw pass_eras optimize BIT_FIELD_REF of big memory into load from small memory Created a replacement for D.161366 offset: 0, size: 64: SR.20D.170101 Created a replacement for D.161366 offset: 64, size: 64:

[Bug target/115978] [x86] GCC issues an error when using -m32 -march=native on APX available machine

2024-07-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978 --- Comment #4 from Hongtao Liu --- To clarify, the question originally came from whether or not to report error for -m32,-march=native, and then LLVM folks said it's difficult for LLVM not issuing error for -march=native -m32, but issuing error

[Bug target/115978] [x86] GCC issues an error when using -m32 -march=native on APX available machine

2024-07-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978 --- Comment #6 from Hongtao Liu --- (In reply to H.J. Lu from comment #5) > (In reply to Hongtao Liu from comment #4) > > To clarify, the question originally came from whether or not to report error > > for -m32,-march=native, and then LLVM folk

[Bug tree-optimization/115994] New: Vectorizer failed to do vectorizaton for .sat_trunc when nunits_in / nunits_out > 2

2024-07-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115994 Bug ID: 115994 Summary: Vectorizer failed to do vectorizaton for .sat_trunc when nunits_in / nunits_out > 2 Product: gcc Version: 15.0 Status: UNCONFIRMED Seve

[Bug tree-optimization/115994] Vectorizer failed to do vectorizaton for .sat_trunc when nunits_in / nunits_out > 2

2024-07-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115994 --- Comment #1 from Hongtao Liu --- Also in vect_recog_sat_trunc_pattern 4700 tree v_itype = get_vectype_for_scalar_type (vinfo, itype); 4701 tree v_otype = get_vectype_for_scalar_type (vinfo, otype); 4702 internal_fn fn = IFN_S

[Bug target/115982] [15 Regression] ICE: unrecognizable insn in ira_remove_insn_scratches with -mavx512vl since r15-1742

2024-07-21 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115982 Hongtao Liu changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at

[Bug target/115982] [15 Regression] ICE: unrecognizable insn in ira_remove_insn_scratches with -mavx512vl since r15-1742

2024-07-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115982 --- Comment #5 from Hongtao Liu --- Fixed by r15-2217-ga3f03891065cb9, could be latent on release branch since GCC12

[Bug target/116043] [15 regression] TLS relocation issue when building glibc with -O3 -mavx512bf16

2024-07-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116043 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/116043] [15 regression] TLS relocation issue when building glibc with -O3 -mavx512bf16 by r15-1619

2024-07-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116043 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org

[Bug c++/116064] [15 Regression] SPEC 2017 523.xalancbmk_r failed to build

2024-07-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/98856] [12/13/14/15 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2024-07-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #4

[Bug tree-optimization/98856] [12/13/14/15 Regression] botan AES-128/XTS is slower by ~17% since r11-6649-g285fa338b06b804e72997c4d876ecf08a9c083af

2024-07-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98856 --- Comment #48 from Hongtao Liu --- (In reply to Hongtao Liu from comment #47) > Created attachment 58746 [details] > Accoate v2di with GPR > > The attached patch can allocate V2DI with GPR to avoid spill. > @Uros Is it a good idea to make G

[Bug target/115978] [x86] GCC issues an error when using -m32 -march=native on APX available machine

2024-07-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|WAITING

[Bug target/115978] [x86] GCC issues an error when using -m32 -march=native on APX available machine

2024-07-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115978 --- Comment #10 from Hongtao Liu --- (In reply to H.J. Lu from comment #9) > (In reply to Hongtao Liu from comment #8) > > Fixed in GCC15,thanks H.J. > > Does GCC 14 have the same issue with -m32 -march=native? Yes, will backport the patch.

[Bug target/96846] [x86] Prefer xor/test/setcc over test/setcc/movzx sequence

2024-07-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=96846 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #5

[Bug rtl-optimization/116096] [15 Regression] during RTL pass: cprop_hardreg ICE: in extract_insn, at recog.cc:2848 (unrecognizable insn ashift:TI?) with -O2 -flive-range-shrinkage -fno-peephole2 -mst

2024-07-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug rtl-optimization/116096] [15 Regression] during RTL pass: cprop_hardreg ICE: in extract_insn, at recog.cc:2848 (unrecognizable insn ashift:TI?) with -O2 -flive-range-shrinkage -fno-peephole2 -mst

2024-07-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org

[Bug rtl-optimization/116096] [15 Regression] during RTL pass: cprop_hardreg ICE: in extract_insn, at recog.cc:2848 (unrecognizable insn ashift:TI?) with -O2 -flive-range-shrinkage -fno-peephole2 -mst

2024-07-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116096 --- Comment #3 from Hongtao Liu --- > > (define_insn "ashl3_doubleword" >[(set (match_operand:DWI 0 "register_operand" "=&r,&r") > - (ashift:DWI (match_operand:DWI 1 "reg_or_pm1_operand" "0n,r") > + (ashift:DWI (match_operand:

[Bug target/116122] [14/15 regression] __FLT16_MAX__ is defined even with -mno-sse2 on 32-bit x86

2024-07-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116122 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |ASSIGNED Last reconfirmed|

[Bug target/117072] [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 --- Comment #12 from Hongtao Liu --- > > So the backend fix should at least add 8 patterns to handle that, in that > case, maybe the middle-end canonicalization would be better. And I will still submit a patch to make the FMA predicates more

[Bug target/117072] [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 --- Comment #11 from Hongtao Liu --- (In reply to Hongtao Liu from comment #10) > (In reply to rguent...@suse.de from comment #9) > > On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote: > > > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116 Hongtao Liu changed: What|Removed |Added Assignee|liuhongt at gcc dot gnu.org|uros at gcc dot gnu.org --- Commen

[Bug target/79786] [12/13/14/15 Regression] ICE tree check: expected class 'type', have 'declaration' (var_decl) in iamcu_alignment, at config/i386/i386.c:30263

2024-10-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=79786 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #7

[Bug target/117081] [15 Regression] FAIL: gcc.target/i386/pr91384.c since r15-1619-g3b9b8d6cfdf593

2024-10-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117081 --- Comment #5 from Hongtao Liu --- *** Bug 117082 has been marked as a duplicate of this bug. ***

[Bug target/117082] [15 Regression] FAIL: gcc.target/i386/stack-check-17.c since r15-1619-g3b9b8d6cfdf593

2024-10-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117082 Hongtao Liu changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-14 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116 Hongtao Liu changed: What|Removed |Added Last reconfirmed||2024-10-14 Ever confirmed|0

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116 --- Comment #3 from Hongtao Liu --- A simple testcase typedef long long v4di __attribute__((vector_size(32))); v4di foo (long long a) { return __extension__(v4di){(long long)foo, 1, 1, 1}; } reproduced with -O2 -mavx2, failed at least sin

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116 --- Comment #4 from Hongtao Liu --- (In reply to Hongtao Liu from comment #3) > A simple testcase > > typedef long long v4di __attribute__((vector_size(32))); > > v4di > foo (long long a) > { > return __extension__(v4di){(long long)foo, 1,

[Bug target/117072] [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867

2024-10-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 --- Comment #10 from Hongtao Liu --- (In reply to rguent...@suse.de from comment #9) > On Fri, 11 Oct 2024, liuhongt at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 > > > > --- Comment #8 from Hongtao Liu -

[Bug target/117088] New: [15 regression] 548.exchange_r regressed by 10% with -O2 -march=x86-64-v3 after enhance O2 vectorization

2024-10-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117088 Bug ID: 117088 Summary: [15 regression] 548.exchange_r regressed by 10% with -O2 -march=x86-64-v3 after enhance O2 vectorization Product: gcc Version: 15.0 Status: UNCON

[Bug target/117159] kmovw storing to memory is assumed to zero-extend

2024-10-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159 --- Comment #2 from Hongtao Liu --- typedef __attribute__((__vector_size__ (4))) unsigned char W; typedef __attribute__((__vector_size__ (64))) int V; typedef __attribute__((__vector_size__ (64))) long long Vq; W w; V v; Vq vq; static inline W

[Bug target/117159] kmovw storing to memory is assumed to zero-extend

2024-10-20 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/117232] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF when using cmov

2024-10-20 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org Las

[Bug tree-optimization/117055] New: [meta-bug] GCC15 O2 vectorization enhancement

2024-10-09 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117055 Bug ID: 117055 Summary: [meta-bug] GCC15 O2 vectorization enhancement Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tr

[Bug target/116940] [15 Regression] wrong code with -O -mavx512vl and vector compare and negation since r15-1742

2024-10-07 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116940 Hongtao Liu changed: What|Removed |Added Status|NEW |ASSIGNED Assignee|unassigned at

[Bug target/101017] ICE: Segmentation fault, convert_memory_address_addr_space_1 with vector_size(32) and target_clone arch=core-avx2/default

2024-10-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101017 --- Comment #11 from Hongtao Liu --- (In reply to David Binderman from comment #10) > Did this ever happen ? > > Similar test case gcc/testsuite/gcc.target/i386/avx10_1-26.c > still seems to cause a crash: > > testsuite $ ~/gcc/results/bin/gcc

[Bug target/117232] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF when using cmov

2024-10-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232 --- Comment #4 from Hongtao Liu --- (In reply to Andrew Pinski from comment #0) > This is expansion of PR 113609 which showed when I improved phiopt's factor > operations to handle more than just 1 operand operations. > > New reduced testcase t

[Bug target/117232] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF when using cmov

2024-10-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug tree-optimization/64700] Sink common code through PHI

2024-10-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=64700 Bug 64700 depends on bug 117232, which changed state. Bug 117232 Summary: EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF when using cmov https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117232

[Bug target/117072] [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867

2024-10-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 --- Comment #8 from Hongtao Liu --- (In reply to Richard Biener from comment #7) > OTOH I'll note that no other simplify_* treats canonicalization as > simplification and the existing swap_commutative_operands_p transform for FMA > is highly unc

[Bug c++/116064] [15 Regression] SPEC 2017 523.xalancbmk_r failed to build

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116064 --- Comment #11 from Hongtao Liu --- (In reply to Richard Biener from comment #10) > So - fixed? Yes.

[Bug target/117116] [15 regression] error: unrecognizable insn: with -march=znver3

2024-10-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117116 --- Comment #2 from Hongtao Liu --- Looks like it just exposes a backend bug; I'll take a look.

[Bug target/117240] ICE: in copy_to_mode_reg, at explow.cc:657 with __builtin_ia32_vaesenc_v32qi() or __builtin_ia32_vaesenc_v64qi()

2024-10-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117240 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org

[Bug target/117301] Many AVX10 tests fail

2024-10-29 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117301 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/117416] [15 Regression] ICE: in gen_prefetch, at config/i386/i386.md:28541 with __builtin_ia32_prefetch() by r15-4833-ge9ab41b79933d4

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug bootstrap/117407] [15 regression] bootstrap fails after r15-4847-g79a75b1f551821

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117407 Hongtao Liu changed: What|Removed |Added CC||zsojka at seznam dot cz --- Comment #5 fr

[Bug target/117416] [15 Regression] ICE: in gen_prefetch, at config/i386/i386.md:28541 with __builtin_ia32_prefetch() by r15-4833-ge9ab41b79933d4

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416 Hongtao Liu changed: What|Removed |Added Resolution|--- |DUPLICATE Status|NEW

[Bug target/117416] [15 Regression] ICE: in gen_prefetch, at config/i386/i386.md:28541 with __builtin_ia32_prefetch() by r15-4833-ge9ab41b79933d4

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416 Hongtao Liu changed: What|Removed |Added Resolution|DUPLICATE |--- Status|RESOLVED

[Bug target/117438] x86's pass_align_tight_loops may cause performance regression in nested loops

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/117304] ICE: in emit_move_insn, at expr.cc:4633 with -mavx10.1 and __builtin_ia32_cvtudq2ps512_mask()

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117304 --- Comment #4 from Hongtao Liu --- $: grep AVX512F i386-builtin.def | grep -v EVEX512 | grep -e V8DI -e V8DF -e V16SI -e V16SF -e V32HI -e V32HF -e V32BF -e V64QI BDESC (OPTION_MASK_ISA_AVX512F, 0, CODE_FOR_unspec_fix_truncv8dfv8si2_mask_roun

[Bug target/117416] [15 Regression] ICE: in gen_prefetch, at config/i386/i386.md:28541 with __builtin_ia32_prefetch() by r15-4833-ge9ab41b79933d4

2024-11-05 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117416 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED

[Bug target/117438] x86's pass_align_tight_loops may cause performance regression in nested loops

2024-11-04 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438 --- Comment #4 from Hongtao Liu --- (In reply to Mayshao-oc from comment #0) > Created attachment 59530 [details] > gcc -O1 loop.c > > Pass_align_tight_loops align the inner loop aggressively, this may cause > significant performance regression

[Bug target/117318] ICE: in expand_simple_unop, at optabs.cc:2585 with __builtin_ia32_pmovusqb512mem_mask()

2024-10-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117318 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/117323] GCC failed to optimize value / 128 to value >> 7 when the range of value must be positive

2024-10-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323 --- Comment #6 from Hongtao Liu --- (In reply to Andrew Pinski from comment #5) > Note the reasoning for the difference in arguments between aarch64 and > x86_64 is that x86_64 defines PUSH_ARGS_REVERSED to be 1. Interesting define min/max as m

[Bug middle-end/117323] New: GCC failed to optimize value / 128 to value >> 7 when the range of value must be positive

2024-10-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323 Bug ID: 117323 Summary: GCC failed to optimize value / 128 to value >> 7 when the range of value must be positive Product: gcc Version: 15.0 Status: UNCONFIRMED

[Bug target/117318] ICE: in expand_simple_unop, at optabs.cc:2585 with __builtin_ia32_pmovusqb512mem_mask()

2024-10-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117318 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org E

[Bug middle-end/117542] Missed loop vectorization for truncate from float to __bf16.

2024-11-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542 --- Comment #2 from Hongtao Liu --- (In reply to Richard Biener from comment #1) > It doesn't even unambiguously specify whether the mode is that of the source > or the destination. The original idea was of course that the size > unambiguously

[Bug middle-end/117542] New: Missed loop vectorization for truncate from float to __bf16.

2024-11-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117542 Bug ID: 117542 Summary: Missed loop vectorization for truncate from float to __bf16. Product: gcc Version: 15.0 Status: UNCONFIRMED Keywords: missed-optimizati

[Bug target/117240] ICE: in copy_to_mode_reg, at explow.cc:657 with __builtin_ia32_vaesenc_v32qi() or __builtin_ia32_vaesenc_v64qi()

2024-10-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117240 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/117301] Many AVX10 tests fail

2024-10-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117301 --- Comment #3 from Hongtao Liu --- Yes, the new instructions are still under review for binutils and have not landed on binutils trunk, but GCC check_effective_target_avx10_2 target with "old" _mm256_mask_vpdpbssd_epi32. The problem should be gone when

[Bug tree-optimization/117323] GCC failed to optimize value / 128 to value >> 7 when the range of value must be positive

2024-10-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117323 --- Comment #4 from Hongtao Liu --- Another missed optimization is that GCC fails to recognize max_expr for sum1, which generates a lot of pack/unpack code in the vectorizer prephitmp_66 = (int) _8; # DEBUG a => NULL # DEBUG b => NULL # DEBUG a

[Bug tree-optimization/116765] [12/13/14/15 regression] gcc generate wrong code with -O3 -march=skylake

2024-09-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116765 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/116800] std::simd: poor code generation of AVX512 fused multiply-add

2024-09-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116800 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/116738] Constant folding of _mm_min_ss and _mm_max_ss is wrong

2024-09-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116738 --- Comment #11 from Hongtao Liu --- Sure. > Hongtao, can you please take the patch forward?

[Bug target/117159] kmovw storing to memory is assumed to zero-extend

2024-10-15 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117159 Hongtao Liu changed: What|Removed |Added Assignee|unassigned at gcc dot gnu.org |liuhongt at gcc dot gnu.org

[Bug target/116940] [15 Regression] wrong code with -O -mavx512vl and vector compare and negation since r15-1742

2024-10-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116940 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug testsuite/115365] [15 regression] New test case gcc.dg/pr100927.c from r15-1022-gb05288d1f1e4b6 fails

2024-10-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365 Hongtao Liu changed: What|Removed |Added Status|REOPENED|RESOLVED Resolution|---

[Bug target/117072] [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867

2024-10-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072 Hongtao Liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-10-16 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 117072, which changed state. Bug 117072 Summary: [15 Regression] FAIL: gcc.target/i386/cond_op_fma_{float,double,_Float16}-1.c since r15-3509-gd34cda72098867 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117072

[Bug target/117438] x86's pass_align_tight_loops may cause performance regression in nested loops

2024-11-05 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117438 --- Comment #5 from Hongtao Liu --- I reproduced a 30% regression on CLX; the aligned case is more frontend-bound. It's uarch specific, so I will make it a uarch tune.

[Bug target/117304] ICE: in emit_move_insn, at expr.cc:4633 with -mavx10.1 and __builtin_ia32_cvtudq2ps512_mask()

2024-11-06 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117304 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/117839] Redundant vector XOR instructions

2024-11-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117839 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/73350] AVX512: GCC optimizes away rounding flags

2024-11-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350 Hongtao Liu changed: What|Removed |Added Status|NEW |RESOLVED CC|

[Bug target/80862] [x86] Wrong rounding results for some test cases

2024-11-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=80862 Bug 80862 depends on bug 73350, which changed state. Bug 73350 Summary: AVX512: GCC optimizes away rounding flags https://gcc.gnu.org/bugzilla/show_bug.cgi?id=73350 What|Removed |Added --

[Bug target/117562] [15 Regression] 40% slowdown of 482.sphinx3 on Zen4, Zen5 since r15-5120-g9a62c149589103

2024-11-22 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117562 --- Comment #10 from Hongtao Liu --- > > I do wonder about the usefulness of the memory alternative on the > sse_movhlps pattern though, there's the sse_storehps pattern which > also models the store part more precisely as V2SFmode. Is > sse_

[Bug middle-end/117823] sdot_prod pattern extended to floating point?

2024-11-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117823 --- Comment #1 from Hongtao Liu --- The vectorization may need -ffast-math.

[Bug middle-end/117823] New: sdot_prod pattern extended to floating point?

2024-11-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=117823 Bug ID: 117823 Summary: sdot_prod pattern extended to floating point? Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: mi

[Bug target/116675] No blend constant permute for V8HImode with just SSE2

2024-11-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|REOPENED

[Bug tree-optimization/53947] [meta-bug] vectorizer missed-optimizations

2024-11-28 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53947 Bug 53947 depends on bug 116675, which changed state. Bug 116675 Summary: No blend constant permute for V8HImode with just SSE2 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=116675 What|Removed |Added ---
