[Bug target/112904] [14 Regression] ICE in extract_insn, at recog.cc:2791 with -mxop since r14-4964-g7eed861e8ca3f5

2023-12-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112904 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/112943] [14 Regression] ICE: in gen_reg_rtx, at emit-rtl.cc:1176 with -O2 -march=westmere -mapxf

2023-12-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112943 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/111972] [14 regression] missed vectorization for bool a = j != 1; j = (long int)a;

2023-12-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=111972 --- Comment #20 from Hongtao Liu --- (In reply to Andrew Pinski from comment #19) > Fixed. Thanks.

[Bug target/112891] [11/12/13/14 Regression] Missing vzeroupper insert

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112891 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/112962] [14 Regression] ICE: SIGSEGV in operator() (recog.h:431) with -fexceptions -mssse3 and __builtin_ia32_pabsd128()

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112962 --- Comment #12 from Hongtao Liu --- (In reply to Jakub Jelinek from comment #8) > Of course, yet another option is: > --- gcc/config/i386/i386.cc 2023-12-12 08:54:39.821148670 +0100 > +++ gcc/config/i386/i386.cc 2023-12-12 11:07:03.79528636

[Bug target/112962] [14 Regression] ICE: SIGSEGV in operator() (recog.h:431) with -fexceptions -mssse3 and __builtin_ia32_pabsd128()

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112962 --- Comment #13 from Hongtao Liu --- > I prefer this solution, that's what we did for blendvps case. > I don't know either, just follow what we did before (with false) when > folding builtins. I mean when I was working on r14-1145-g1ede03e2d043

[Bug target/112992] Inefficient vector initialization using vec_duplicate/broadcast

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/112992] Inefficient vector initialization using vec_duplicate/broadcast

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992 --- Comment #4 from Hongtao Liu --- (In reply to Hongtao Liu from comment #3) > I think we need to also guard SImode and DImode case under AVX2 when > MODE_SIZE==256. Since there's vbroadcastss only support m alternative under avx

[Bug target/112992] Inefficient vector initialization using vec_duplicate/broadcast

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992 --- Comment #5 from Hongtao Liu --- (In reply to Roger Sayle from comment #0) > The following four functions should in theory all produce the same code: > > typedef unsigned long long v4di __attribute((vector_size(32))); > typedef unsigned int

[Bug target/112992] Inefficient vector initialization using vec_duplicate/broadcast

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992 --- Comment #6 from Hongtao Liu --- > Thoughts? Apologies if this is a dup. I'm happy to work up a patch if > someone could advise on where best this should be fixed. Perhaps RTL's > vec_duplicate could be canonicalized to the most appropriat

[Bug target/112992] Inefficient vector initialization using vec_duplicate/broadcast

2023-12-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=112992 --- Comment #7 from Hongtao Liu --- (In reply to Hongtao Liu from comment #6) > > Thoughts? Apologies if this is a dup. I'm happy to work up a patch if > > someone could advise on where best this should be fixed. Perhaps RTL's > > vec_duplica

[Bug target/113039] [14 Regression] -fcf-protection -fcf-protection=branch doesn't work

2023-12-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/104401] [x86] Failure to recognize min/max pattern using pcmp+pblendv

2023-12-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104401 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/113078] New: [14 regression] reduction of cond_sub is not vectorized.

2023-12-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113078 Bug ID: 113078 Summary: [14 regression] reduction of cond_sub is not vectorized. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Prio

[Bug target/113079] New: [x86] Fails to generate dot_prod instructions for 64-bit vector.

2023-12-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079 Bug ID: 113079 Summary: [x86] Fails to generate dot_prod instructions for 64-bit vector. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal

[Bug target/113079] [x86] Fails to generate dot_prod instructions for 64-bit vector.

2023-12-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079 --- Comment #1 from Hongtao Liu --- (In reply to Hongtao Liu from comment #0) > int > foo (int n, unsigned char* p, char* pi) > { > int sum = 0; > for (int i = 0; i != 8; i++) > { > sum += p[i] * pi[i]; > } > return s

[Bug target/113090] New: Suboptimal vector permutation for 64-bit vector.

2023-12-19 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113090 Bug ID: 113090 Summary: Suboptimal vector permutation for 64-bit vector. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug libfortran/110966] should matmul_c8_avx512f be updated with matmul_c8_x86-64-v4.

2024-01-07 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=110966 Hongtao Liu changed: What|Removed |Added Resolution|--- |INVALID Status|WAITING

[Bug tree-optimization/113261] New: missing vectorization for dot_prod chain.

2024-01-07 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113261 Bug ID: 113261 Summary: missing vectorization for dot_prod chain. Product: gcc Version: 14.0 Status: UNCONFIRMED Keywords: missed-optimization Severity: normal

[Bug tree-optimization/113261] missing vectorization for dot_prod chain.

2024-01-07 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113261 --- Comment #1 from Hongtao Liu --- For foo1, _99 = .REDUC_PLUS (vect_patt_79.51_97); _90 = .REDUC_PLUS (vect_patt_28.43_88); _19 = _90 + _99; can be optimized to _tmp = vect_patt_79.51_97 + vect_patt_28.43_88; _19 = .REDUC_PLUS
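The rewrite suggested in this comment (two .REDUC_PLUS results added together, versus one element-wise add followed by a single .REDUC_PLUS) can be sketched in scalar C. This is only an illustration under assumptions: the lane count, the unsigned element type, and all function names are hypothetical and are not taken from the bug report. Unsigned arithmetic is used so that reassociation is exact; for floating-point lanes the two shapes may round differently.

```c
#include <assert.h>
#include <stddef.h>

#define LANES 4 /* hypothetical lane count, chosen for illustration */

/* Scalar model of a horizontal add over one vector. */
unsigned reduc_plus(const unsigned v[LANES])
{
    unsigned sum = 0;
    for (size_t i = 0; i < LANES; i++)
        sum += v[i];
    return sum;
}

/* Original shape: two reductions, then one scalar add. */
unsigned two_reductions(const unsigned a[LANES], const unsigned b[LANES])
{
    return reduc_plus(a) + reduc_plus(b);
}

/* Suggested shape: one element-wise add, then a single reduction. */
unsigned one_reduction(const unsigned a[LANES], const unsigned b[LANES])
{
    unsigned tmp[LANES];
    for (size_t i = 0; i < LANES; i++)
        tmp[i] = a[i] + b[i];
    return reduc_plus(tmp);
}

/* Returns 1 when both shapes agree on a sample that includes wraparound. */
int reduc_shapes_agree(void)
{
    const unsigned a[LANES] = {1u, 2u, 3u, 0xFFFFFFFFu};
    const unsigned b[LANES] = {10u, 20u, 30u, 40u};
    assert(two_reductions(a, b) == one_reduction(a, b));
    return two_reductions(a, b) == one_reduction(a, b);
}
```

The second shape trades one horizontal reduction for one vertical add, which is usually the cheaper operation on SIMD hardware.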

[Bug target/113288] [i386] Missing #define for -mavx10.1-256 and -mavx10.1-512

2024-01-08 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113288 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/104401] [x86] Failure to recognize min/max pattern using pcmp+pblendv

2024-01-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104401 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/113312] Update __attribute__((interrupt)) for Intel FRED

2024-01-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113312 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/113345] New: miss optimization for psign{b,w,d}.

2024-01-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345 Bug ID: 113345 Summary: miss optimization for psign{b,w,d}. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target

[Bug target/113345] miss optimization for psign{b,w,d}.

2024-01-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345 --- Comment #1 from Hongtao Liu --- > > maybe we can just refactor the pattern as below, then combine can generate > the pattern for us. > > 22115(define_insn "_psign3" > 22116 [(set (match_operand:VI124_AVX2 0 "register_operand" "=x,x") > 22

[Bug target/113039] [14 Regression] -fcf-protection -fcf-protection=branch doesn't work

2024-01-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113039 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/113345] miss optimization for psign{b,w,d}.

2024-01-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113345 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/113126] [14 Regression] ICE: in gimple_expand_vec_cond_expr, at gimple-isel.cc:325 at -O1

2024-01-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113126 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug testsuite/113437] [14 Regression] gcc.dg/tree-ssa/pr95906.c fails on arm since g:6686e16fda4

2024-01-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437 --- Comment #2 from Hongtao Liu --- Maybe we can add target vect_int.

[Bug tree-optimization/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458 --- Comment #2 from Hongtao Liu --- > But if we reduce n to 4, the loop based vectorizer is not able to handle it > either. Do we support 1-element vectors (i.e. V1SI) in the vectorizer? It also relies on backend support of dot_prodv4qi.

[Bug target/113458] Missed SLP for reduction of multiplication/addition with promotion

2024-01-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113458 --- Comment #6 from Hongtao Liu --- > thus > > vect__16.5_40 = MEM [(short int *)a_22(D)]; > vect__17.6_41 = (vector(4) int) vect__16.5_40; > vect__18.9_44 = MEM [(signed char *)b_23(D)]; > vect_patt_36.10_45 = (vector(4) signed shor

[Bug testsuite/113437] [14 Regression] gcc.dg/tree-ssa/pr95906.c fails on arm since g:6686e16fda4

2024-01-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437 --- Comment #5 from Hongtao Liu --- (In reply to Andrew Pinski from comment #3) > (In reply to Hongtao Liu from comment #2) > > Maybe we can add target vect_int. > > Not really because vect_int depends on the vect.exp framework still. See PR >

[Bug testsuite/113437] [14 Regression] gcc.dg/tree-ssa/pr95906.c fails on arm since g:6686e16fda4

2024-01-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113437 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug tree-optimization/113539] [14 Regression] perlbench miscompiled on aarch64 since r14-8223-g1c1853a70f

2024-01-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113539 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/113576] New: [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 Bug ID: 113576 Summary: [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c Product: gcc Version: 14.0 Status: UNCONFIRMED Sever

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #1 from Hongtao Liu --- int __attribute__((noinline)) sbitmap_first_set_bit (const_sbitmap bmap) { unsigned int n = 0; sbitmap_iterator sbi; EXECUTE_IF_SET_IN_SBITMAP (bmap, 0, n, sbi) return n; return -1; } hangs on th

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #6 from Hongtao Liu --- Another potential buggy place is 240 vexit_reduc_67 = mask_patt_43.28_62 & mask_patt_43.28_63; 241 if (vexit_reduc_67 == { -1, -1, -1, -1 }) 242goto ; [94.50%] 243 else is expanded to 319(insn 69
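One way a comparison like `vexit_reduc_67 == { -1, -1, -1, -1 }` can go wrong is sketched below. The sketch assumes a 4-lane boolean mask held in an 8-bit integer, so that the upper bits are padding; that storage layout and all function names are assumptions made for illustration and are not taken from the truncated comment.

```c
#include <assert.h>
#include <stdint.h>

/* Bits that carry lane data for an nlanes-wide mask held in 8 bits. */
uint8_t lane_bits(unsigned nlanes)
{
    assert(nlanes >= 1 && nlanes <= 8);
    return (uint8_t)((1u << nlanes) - 1u);
}

/* Naive test: compares the whole byte, so padding bits change the answer. */
int all_lanes_set_naive(uint8_t mask)
{
    return mask == 0xFF;
}

/* Padding-aware test: only the low nlanes bits take part in the compare. */
int all_lanes_set_masked(uint8_t mask, unsigned nlanes)
{
    uint8_t live = lane_bits(nlanes);
    return (mask & live) == live;
}
```

With a 4-lane mask of 0x0F the naive test reports "not all set" while the masked test reports "all set", which is the kind of mismatch that could keep an early-exit loop from ever terminating.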

[Bug tree-optimization/113592] New: missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592 Bug ID: 113592 Summary: missed partial sum optimization in vectorizer Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tr

[Bug tree-optimization/113593] New: missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113593 Bug ID: 113593 Summary: missed partial sum optimization in vectorizer Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: tr

[Bug tree-optimization/113594] New: Missing partial sum optimization in the vectorizer.

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594 Bug ID: 113594 Summary: Missing partial sum optimization in the vectorizer. Product: gcc Version: 14.0 Status: UNCONFIRMED Severity: normal Priority: P3 Compone

[Bug tree-optimization/113592] missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592 --- Comment #1 from Hongtao Liu --- *** Bug 113593 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113593] missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113593 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug tree-optimization/113594] Missing partial sum optimization in the vectorizer.

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113594 Hongtao Liu changed: What|Removed |Added Resolution|--- |DUPLICATE Status|UNCONFIRMED

[Bug tree-optimization/113592] missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592 --- Comment #2 from Hongtao Liu --- *** Bug 113594 has been marked as a duplicate of this bug. ***

[Bug tree-optimization/113592] missed partial sum optimization in vectorizer

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113592 --- Comment #3 from Hongtao Liu --- This testcase is probably not a good example of a typical partial sum, which relies on loop unrolling. double foo (double* p, int n) { double sum = 0; for (int i = 0; i != n; i++) sum += p[i] * p[i]
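For readers unfamiliar with the term, a partial-sum transformation can be sketched by hand as follows. This is an illustration only: the function names, the unroll factor of four, and the tail handling are assumptions, and the reference loop is modelled on the visible part of the quoted testcase rather than on its truncated remainder.

```c
#include <assert.h>

/* Reference loop: one accumulator, so every add waits on the previous one. */
double sum_squares(const double *p, int n)
{
    assert(n >= 0);
    double sum = 0;
    for (int i = 0; i != n; i++)
        sum += p[i] * p[i];
    return sum;
}

/* Unrolled by four with independent partial sums, combined after the loop. */
double sum_squares_partial(const double *p, int n)
{
    assert(n >= 0);
    double s0 = 0, s1 = 0, s2 = 0, s3 = 0;
    int i = 0;
    for (; i + 4 <= n; i += 4) {
        s0 += p[i] * p[i];
        s1 += p[i + 1] * p[i + 1];
        s2 += p[i + 2] * p[i + 2];
        s3 += p[i + 3] * p[i + 3];
    }
    double sum = (s0 + s1) + (s2 + s3);
    for (; i != n; i++) /* scalar tail for the leftover elements */
        sum += p[i] * p[i];
    return sum;
}
```

Because floating-point addition is not associative, the two functions can differ in the last bits for general inputs, which is why a compiler normally needs something like -ffast-math or -fassociative-math before it may do this on its own.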

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #7 from Hongtao Liu --- diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 1fd957288d4..33a8d539b4d 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -8032,7 +8032,7 @@ native_encode_vector_part (const_tree expr, unsign

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #8 from Hongtao Liu --- maybe diff --git a/gcc/fold-const.cc b/gcc/fold-const.cc index 1fd957288d4..6d321f9baef 100644 --- a/gcc/fold-const.cc +++ b/gcc/fold-const.cc @@ -8035,6 +8035,9 @@ native_encode_vector_part (const_tree exp

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 Hongtao Liu changed: What|Removed |Added Resolution|FIXED |--- Status|RESOLVED

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600 --- Comment #1 from Hongtao Liu --- I guess it's the same issue as PR112879?

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600 --- Comment #2 from Hongtao Liu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2023-December/640276.html Could you give it a try to see if it fixes the regression? I don't currently have a znver4 machine for testing.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #20 from Hongtao Liu --- > Note that I wonder how to eliminate redundant maskings? I suppose > eventually combine tracking nonzero bits where obvious would do > that? For example for cmp:V4SI we know the bits will be zero but > I

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #21 from Hongtao Liu --- typedef unsigned long mp_limb_t; typedef long mp_size_t; typedef unsigned long mp_bitcnt_t; typedef mp_limb_t *mp_ptr; typedef const mp_limb_t *mp_srcptr; #define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8) #def

[Bug target/113609] New: EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF.

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609 Bug ID: 113609 Summary: EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF. Product: gcc Version: 14.0 Status: UNCONFIRMED

[Bug target/113609] EQ/NE comparison between avx512 kmask and -1 can be optimized with kxortest with checking CF.

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113609 --- Comment #1 from Hongtao Liu --- Since they're different modes, CCZ for cmp, but CCS for kortest, it could be difficult to optimize it in RA stage by adding alternatives (like we did for comparison with 0). So the easy way could be adding peephole
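The flag behaviour this bug relies on can be modelled in plain C. KORTEST ORs two mask registers and sets ZF when the result is all zeros and CF when it is all ones, so an EQ/NE test against -1 can read CF instead of doing a separate compare. The sketch below is a software model for a 16-bit mask only; the struct and function names are hypothetical and it does not use any real intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Flags produced by a 16-bit KORTEST-style operation (software model). */
struct kflags {
    int zf; /* set when (a | b) is all zeros */
    int cf; /* set when (a | b) is all ones  */
};

struct kflags kortest16_model(uint16_t a, uint16_t b)
{
    uint16_t r = (uint16_t)(a | b);
    struct kflags f = { r == 0, r == 0xFFFF };
    return f;
}

/* k == -1 for a 16-bit mask, expressed through the carry flag. */
int mask_is_all_ones(uint16_t k)
{
    return kortest16_model(k, k).cf;
}

/* k == 0, expressed through the zero flag. */
int mask_is_zero(uint16_t k)
{
    return kortest16_model(k, k).zf;
}

/* Small sanity check of the model; returns 1 on success. */
int kortest_model_self_check(void)
{
    assert(mask_is_all_ones(0xFFFF));
    assert(!mask_is_all_ones(0x7FFF));
    assert(mask_is_zero(0));
    return 1;
}
```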

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-25 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #22 from Hongtao Liu --- typedef unsigned long mp_limb_t; typedef long mp_size_t; typedef unsigned long mp_bitcnt_t; typedef mp_limb_t *mp_ptr; typedef const mp_limb_t *mp_srcptr; #define GMP_LIMB_BITS (sizeof(mp_limb_t) * 8) #def

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-29 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #25 from Hongtao Liu --- (In reply to Tamar Christina from comment #24) > Just to avoid confusion, are you still working on this one Richi? I'm working on a patch to add a target hook as #c18 mentioned.

[Bug tree-optimization/113576] [14 regression] 502.gcc_r hangs r14-8223-g1c1853a70f9422169190e65e568dcccbce02d95c

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113576 --- Comment #28 from Hongtao Liu --- I see we already mask off integral modes for vector masks in store_constructor /* Use sign-extension for uniform boolean vectors with integer modes and single-bit mask entries. Ef

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600 --- Comment #5 from Hongtao Liu --- It looks like x264_pixel_satd_16x16 consumes more time after my commit, an extracted case is as below, note there's no attribute((always_inline)) in the original x264_pixel_satd_8x4, it's added to force inline

[Bug target/113600] [14 regression] 525.x264_r run-time regresses by 8% with PGO -Ofast -march=znver4

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113600 --- Comment #6 from Hongtao Liu --- Guess explicit .REDUC_PLUS instead of original VEC_PERM_EXPR somehow impacts the store split decision.

[Bug target/113656] [x86] ICE in simplify_const_unary_operation, at simplify-rtx.cc:1954 with new -mavx10.1

2024-01-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113656 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org,

[Bug target/113729] Missing APX NDD optimization

2024-02-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/113744] Unnecessary "m" constraint in *adddi_4

2024-02-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113744 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/113729] Missing APX NDD optimization

2024-02-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113729 --- Comment #2 from Hongtao Liu --- extern unsigned char b; int foo (void) { return (unsigned char)(200 + b); } gcc -O2 -mapxf foo(): subb $56, b(%rip), %al movzbl %al, %eax ret And this can be optimized to foo(): subb $56, b(%ri
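The `subb $56` in the generated code may look surprising next to the `200 + b` in the source. In 8-bit arithmetic the two are the same operation, because 200 and -56 are congruent modulo 256. A minimal check, with hypothetical helper names that are not part of the bug report:

```c
#include <assert.h>

/* Form from the testcase: add 200, then truncate to 8 bits. */
unsigned char add_200(unsigned char b)
{
    return (unsigned char)(200 + b);
}

/* Form used by the emitted subb: subtract 56 in 8-bit arithmetic. */
unsigned char sub_56(unsigned char b)
{
    return (unsigned char)(b - 56);
}

/* Compares both forms for every 8-bit input; returns the mismatch count. */
int count_mismatches(void)
{
    int mismatches = 0;
    for (int b = 0; b < 256; b++) {
        if (add_200((unsigned char)b) != sub_56((unsigned char)b))
            mismatches++;
    }
    assert(mismatches == 0);
    return mismatches;
}
```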

[Bug rtl-optimization/115384] [15 Regression] ICE: RTL check: expected code 'const_int', have 'const_wide_int' in simplify_binary_operation_1, at simplify-rtx.cc:4088 since r15-1047-g7876cde25cbd2f

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384 Hongtao Liu changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #3 from Hongtao Liu

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 Hongtao Liu changed: What|Removed |Added Status|NEW |ASSIGNED --- Comment #2 from Hongtao Liu

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 --- Comment #3 from Hongtao Liu --- typedef __attribute__((__vector_size__ (1))) char V; char foo (V v) { return ((V) v == v)[0]; } int main () { char x = foo ((V) { }); if (x != -1) __builtin_abort (); } w/ vcond_mask_qiqi, it's no

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 --- Comment #4 from Hongtao Liu --- > > and for _2 = VIEW_CONVERT_EXPR(_1); we explicitly > clear the upper bits due to PR113576, and then we get 1 hit the abort. It's not VIEW_CONVERT_EXPR that clears the upper bits, but _1 = { -1 };

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 --- Comment #5 from Hongtao Liu --- > _2 = VEC_COND_EXPR <_1, { -1 }, { 0 }>; Hmm, it should check vcond_mask_qiv1qi instead of vcond_mask_qiqi, I guess since the backend doesn't support v1qi, TYPE_MODE of V is QImode, then it wrongly checke

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 --- Comment #6 from Hongtao Liu --- For a 1-element vector, when the backend doesn't support its vector mode, the scalar mode is used for the type, which makes expand_vec_cond_expr_p use QImode for the icode check (vcond_mask_qiqi). It could also be the

[Bug testsuite/115365] New test case gcc.dg/pr100927.c from r15-1022-gb05288d1f1e4b6 fails

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365 Hongtao Liu changed: What|Removed |Added Target|powerpc64le-linux-gnu, |powerpc64le-linux-gnu,

[Bug target/115418] Extra movapd emitted for MAX implementation

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115418 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug testsuite/115365] New test case gcc.dg/pr100927.c from r15-1022-gb05288d1f1e4b6 fails

2024-06-10 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115365 --- Comment #7 from Hongtao Liu --- +/* { dg-final { scan-rtl-dump-times {(?n)^(?!.*REG_EQUIV)(?=.*\(fix:SI)} 3 "final" } } */ Does this fix the testcase on solaris2?

[Bug rtl-optimization/115384] [15 Regression] ICE: RTL check: expected code 'const_int', have 'const_wide_int' in simplify_binary_operation_1, at simplify-rtx.cc:4088 since r15-1047-g7876cde25cbd2f

2024-06-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115384 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/115452] New: ICE when dump stv2 for gcc.target/i386/pr70322-2.c with -march=cascadelake

2024-06-11 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452 Bug ID: 115452 Summary: ICE when dump stv2 for gcc.target/i386/pr70322-2.c with -march=cascadelake Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: nor

[Bug target/115452] ICE when dump stv2 for gcc.target/i386/pr70322-2.c with -march=cascadelake

2024-06-12 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115452 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/115462] [15 regression] 416.gamess regressed 4-6% on x86_64 since r15-882-g1d6199e5f8c1c0

2024-06-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/115463] [15 regression] 526.blender_r regressed 5% on Zen2 with -Ofast -flto -march=native since r15-1058-gc989e59fc99d99

2024-06-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115463 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug rtl-optimization/115021] [14/15 regression] unnecessary spill for vpternlog

2024-06-13 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115021 --- Comment #5 from Hongtao Liu --- It's fixed by r15-1100-gec985bc97a0157

[Bug target/115517] New: Fix regression after dropping uses of vcond{,u,eq}_optab

2024-06-17 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 Bug ID: 115517 Summary: Fix regression after dropping uses of vcond{,u,eq}_optab Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Prio

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 --- Comment #2 from Hongtao Liu --- (In reply to Richard Biener from comment #1) > Btw, I had opened PR115490 with my results for this already. Some mitigation > should be from optimizing ISEL expansion to vcond_mask and I'd start with > lookin

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 --- Comment #4 from Hongtao Liu --- (In reply to rguent...@suse.de from comment #3) > On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 > > > > --- Comment #2 from Hongtao Liu --

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-18 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 --- Comment #6 from Hongtao Liu --- (In reply to rguent...@suse.de from comment #5) > On Tue, 18 Jun 2024, liuhongt at gcc dot gnu.org wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 > > > > --- Comment #4 from Hongtao Liu --

[Bug target/115406] [15 Regression] wrong code with vector compare at -O0 with -mavx512f since r15-920-gb6c6d5abf0d31c

2024-06-23 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115406 --- Comment #7 from Hongtao Liu --- > > BTW, when assign -1 to vector(1) , should the upper bit be > cleared? Look like only 1 element boolean vector is cleared, but not > vector(2) . > If the upper bits are not cleared, both 2 cases are equal

[Bug target/115610] -flate-combine disabled by default for x86 port

2024-06-24 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115610 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org Last reconf

[Bug tree-optimization/115450] [15 Regression] cpu2017 502.gcc runtime miscompute on aarch64 with SVE since r15-1006-gd93353e6423eca

2024-06-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115450 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug middle-end/26163] [meta-bug] missed optimization in SPEC (2k17, 2k and 2k6 and 95)

2024-06-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=26163 Bug 26163 depends on bug 115462, which changed state. Bug 115462 Summary: [15 regression] 416.gamess regressed 4-6% on x86_64 since r15-882-g1d6199e5f8c1c0 https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462 What|Removed

[Bug target/115462] [15 regression] 416.gamess regressed 4-6% on x86_64 since r15-882-g1d6199e5f8c1c0

2024-06-26 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115462 Hongtao Liu changed: What|Removed |Added Status|UNCONFIRMED |RESOLVED Resolution|---

[Bug target/115683] New: SSE2 regressions after obselete of vcond{,u,eq}.

2024-06-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115683 Bug ID: 115683 Summary: SSE2 regressions after obselete of vcond{,u,eq}. Product: gcc Version: 15.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component:

[Bug middle-end/115675] [15 Regression] truncv4hiv4qi affect r14-1402-gd8545fb2c71683's optimization.

2024-06-27 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115675 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug tree-optimization/115693] 8 std::byte std::array comparison potential missed optimization

2024-06-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115693 Hongtao Liu changed: What|Removed |Added CC||liuhongt at gcc dot gnu.org --- Comment #

[Bug target/115610] -flate-combine disabled by default for x86 port

2024-06-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115610 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|ASSIGNED

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 --- Comment #14 from Hongtao Liu --- Regressions above SSE4.1 are fixed in GCC 15; SSE2 regressions are tracked in PR115683.

[Bug target/115517] Fix x86 regressions after dropping uses of vcond{,u,eq}_optab

2024-06-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED Status|UNCONFIRMED

[Bug target/114189] Target implements obsolete vcond{,u,eq} expanders

2024-06-30 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=114189 Bug 114189 depends on bug 115517, which changed state. Bug 115517 Summary: Fix x86 regressions after dropping uses of vcond{,u,eq}_optab https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115517 What|Removed |Added

[Bug target/107432] __builtin_convertvector generates inefficient code

2024-07-02 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=107432 Hongtao Liu changed: What|Removed |Added Resolution|--- |FIXED CC|

[Bug target/115748] [15 Regression] gcc.target/i386/avx512bw-pr70509.c SIGILL with -m32

2024-07-02 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115748 Hongtao Liu changed: What|Removed |Added Ever confirmed|0 |1 Status|UNCONFIRMED

[Bug target/109812] GraphicsMagick resize is a lot slower in GCC 13.1 vs Clang 16 on Intel Raptor Lake

2024-07-02 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109812 --- Comment #23 from Hongtao Liu --- (In reply to edison from comment #22) > for 607.cactuBSSN_s,if use preENV_GOMP_CPU_AFFINITY = 0-23 in CPU2017 .cfg, > all p-core(i9-13900k) usage will down to 15%(the e-core almost 100%), if > comment out it

[Bug target/115748] [15 Regression] gcc.target/i386/avx512bw-pr70509.c SIGILL with -m32

2024-07-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115748 Hongtao Liu changed: What|Removed |Added Status|ASSIGNED|RESOLVED Resolution|---

[Bug target/115756] default tuning for x86_64 produces shifts for `*240`

2024-07-03 Thread liuhongt at gcc dot gnu.org via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=115756 --- Comment #3 from Hongtao Liu --- The current rtx_cost for imulq in generic_cost is COSTS_N_INSNS (4); changing it to COSTS_N_INSNS (3) would generate imulq. {COSTS_N_INSNS (3), /* cost of starting multiply for QI */ COSTS_N_INS
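For readers unfamiliar with the trade-off in PR115756, here is a minimal C sketch of the two forms a compiler can choose between for `*240`. The decomposition 240 = 256 - 16 is an assumption made for illustration; the snippet above does not show the exact instruction sequence GCC emits, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Plain multiply: a candidate for a single imulq when the cost model
   rates the multiply as cheap enough. */
static uint64_t mul240_imul(uint64_t x)
{
    return x * 240u;
}

/* Shift-and-subtract form, using 240 = 256 - 16 (an illustrative
   decomposition, not necessarily the one GCC picks). Unsigned
   arithmetic wraps modulo 2^64, so both forms agree for every x. */
static uint64_t mul240_shift(uint64_t x)
{
    return (x << 8) - (x << 4);
}

/* Helper that checks the two forms agree for one input. */
static void check_mul240(uint64_t x)
{
    assert(mul240_shift(x) == mul240_imul(x));
}
```

Which form the compiler picks depends on the relative costs it assigns to the multiply and to the shift/subtract sequence, which is what the COSTS_N_INSNS entry in the comment controls.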
