[Bug tree-optimization/103462] GCC failed to reduce bit clear in loop.

2021-11-28 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103462 --- Comment #1 from Hongtao.liu --- Should it be done in the vectorizer or ldist (just like memory op), or somewhere else?

[Bug tree-optimization/103462] GCC failed to reduce bit clear in loop.

2021-11-28 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103462 --- Comment #2 from Hongtao.liu --- bit clear and induction variable could be simplified to `& CONSTANT`
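
As a rough illustration of the comment above, here is a minimal sketch in which a loop clears one bit per iteration and the bit index is the induction variable. The function names, the stride of 3 and the resulting constant are assumptions chosen for illustration; they are not taken from the bug report's testcase.

```c
#include <stdint.h>

/* Hypothetical loop: clears bits 0, 3, 6, ..., 30, one per iteration. */
static uint32_t clear_bits_loop(uint32_t x)
{
    for (int bit = 0; bit < 32; bit += 3)
        x &= ~(UINT32_C(1) << bit);
    return x;
}

/* The cleared bits depend only on the induction variable, so the whole
   loop collapses to a single AND with a constant.
   Bits 0, 3, ..., 30 set give 0x49249249. */
static uint32_t clear_bits_mask(uint32_t x)
{
    return x & ~UINT32_C(0x49249249);
}
```

Both functions return the same value for every input, which is the `& CONSTANT` form the comment refers to.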

[Bug target/103463] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -Os -fno-tree-dominator-opts -fno-tree-vrp

2021-11-28 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103463 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1

[Bug target/95740] Failure to avoid using the stack when interpreting a float as an integer when it is modified afterwards

2021-11-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95740 --- Comment #4 from Hongtao.liu --- It can be fixed by 2 files changed, 4 insertions(+), 2 deletions(-) gcc/config/i386/i386.c | 2 +- gcc/config/i386/i386.h | 4 +++- modified gcc/config/i386/i386.c @@ -19194,7 +19194,7 @@ ix86_preferred_reloa

[Bug target/95740] Failure to avoid using the stack when interpreting a float as an integer when it is modified afterwards

2021-11-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95740 --- Comment #5 from Hongtao.liu --- (In reply to Hongtao.liu from comment #4) > It can be fixed by > > 2 files changed, 4 insertions(+), 2 deletions(-) > gcc/config/i386/i386.c | 2 +- > gcc/config/i386/i386.h | 4 +++- > > modified gcc/config/

[Bug target/103463] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -Os -fno-tree-dominator-opts -fno-tree-vrp

2021-11-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103463 --- Comment #3 from Hongtao.liu --- (In reply to Hongtao.liu from comment #1) > It should be fixed by > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585613.html Hmm, it looks to be broken again.

[Bug target/103463] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -Os -fno-tree-dominator-opts -fno-tree-vrp

2021-11-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103463 --- Comment #4 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3) > (In reply to Hongtao.liu from comment #1) > > It should be fixed by > > https://gcc.gnu.org/pipermail/gcc-patches/2021-November/585613.html > > Hmm, it looks to be

[Bug target/103463] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -Os -fno-tree-dominator-opts -fno-tree-vrp

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103463 --- Comment #5 from Hongtao.liu --- diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md index c88374c9d2b..4e9fae80479 100644 --- a/gcc/config/i386/i386.md +++ b/gcc/config/i386/i386.md @@ -11512,6 +11512,7 @@ (define_insn "*x86_64_sh

[Bug target/103484] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -O2 -fno-tree-bit-ccp

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103484 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1

[Bug target/100711] Miss optimization for pandn

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100711 Hongtao.liu changed: What|Removed |Added Resolution|--- |FIXED Status|NEW

[Bug target/103463] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -Os -fno-tree-dominator-opts -fno-tree-vrp

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103463 Hongtao.liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---

[Bug target/103484] [12 Regression] ICE: in ix86_attr_length_immediate_default, at config/i386/i386.c:16686 with -O2 -fno-tree-bit-ccp

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103484 --- Comment #6 from Hongtao.liu --- Fixed in GCC12.

[Bug tree-optimization/103144] vectorizer failed to recognize shift>>=1 in loop as shift>>i

2021-11-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103144 --- Comment #2 from Hongtao.liu --- Another issue is with SLP: when the trip count is small and the loop is completely unrolled, SLP fails to generate vlshr_optab. #include <stdint.h> void foo (uint64_t* __restrict pdst, uint64_t* psrc, uint64_t shift) { for
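
The equivalence named in the bug title (a running `shift >>= 1` versus `shift >> i`) can be sketched as follows. This is not the truncated testcase above; the function names and the loop body are hypothetical, chosen only to show the two forms side by side.

```c
#include <stdint.h>

/* Hypothetical running form: the shifted value is carried across
   iterations, which creates a loop-carried dependence. */
static void shift_running(uint64_t *dst, uint64_t shift, int n)
{
    for (int i = 0; i < n; i++) {
        dst[i] = shift;
        shift >>= 1;
    }
}

/* Indexed form: each element depends only on i, so lanes are independent.
   Shifting a 64-bit value by 64 or more is undefined in C, hence the guard. */
static void shift_indexed(uint64_t *dst, uint64_t shift, int n)
{
    for (int i = 0; i < n; i++)
        dst[i] = (i < 64) ? (shift >> i) : 0;
}
```

The indexed form is the one that maps naturally onto a per-lane vector shift.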

[Bug sanitizer/103519] Address sanitizer check missing for AVX512 masked load

2021-12-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103519 --- Comment #2 from Hongtao.liu --- get_mem_refs_of_builtin_call doesn't handle target-specific builtins.

[Bug target/95740] Failure to avoid using the stack when interpreting a float as an integer when it is modified afterwards

2021-12-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95740 --- Comment #7 from Hongtao.liu --- Fixed in GCC12.

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1

[Bug target/103554] -mavx generates worse code on scalar code

2021-12-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #8 from Hongtao.liu --- > but the x86 backend chooses to not let the vectorizer compare costs with > different vector sizes but instead asks it to pick the first working > solution from the vector of modes to consider (and in that or

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-06 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #2 from Hongtao.liu --- > > Also, baz is highly un-optimal for 32bit targets. Yes, it needs to be fixed. Note that w/ -mavx512fp16, codegen for baz is optimal on a 32-bit target; maybe it is related to vector_mode_supported_p, but then why code

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #5 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #4) > (In reply to Hongyu Wang from comment #3) > > > So we may need to support V8HFmode in VALID_SSE2_REG_MODE if we don't want > > to modify those function_args and fu

[Bug target/103554] -mavx generates worse code on scalar code

2021-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #10 from Hongtao.liu --- Got it, thanks for your detailed explanation. So there are 2 issues in this case: first, the x86 target didn't choose the vector size w/ the smallest cost; second, BB vectorization with gaps at the end of a load is not suppor

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #8 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #6) > (In reply to Hongtao.liu from comment #5) > > > There're several places in i386-expand.c which assume TARGET_AVX512FP16 for > > case V8HF/V16HF/V32HF, if we want to

[Bug middle-end/100738] Gimple failed to simplify ((v4si) ~a) < 0 ? c : d to ((v4si)a) >= 0 ? c : d

2021-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=100738 Hongtao.liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---
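
The simplification named in the bug title rests on the fact that `~a < 0` holds exactly when `a >= 0` for two's-complement integers. A minimal per-lane sketch in plain C follows; arrays stand in for the `v4si` vectors, and the function names are hypothetical.

```c
#include <stdint.h>

enum { LANES = 4 };

/* Original form: select on the sign of the complemented value. */
static void select_not_lt0(const int32_t *a, const int32_t *c,
                           const int32_t *d, int32_t *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (~a[i] < 0) ? c[i] : d[i];
}

/* Simplified form: complementing flips the sign bit, so the NOT can be
   dropped by inverting the comparison. */
static void select_ge0(const int32_t *a, const int32_t *c,
                       const int32_t *d, int32_t *out)
{
    for (int i = 0; i < LANES; i++)
        out[i] = (a[i] >= 0) ? c[i] : d[i];
}
```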

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-07 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #10 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #9) > (In reply to Hongtao.liu from comment #8) > > (In reply to Uroš Bizjak from comment #6) > > > (In reply to Hongtao.liu from comment #5) > > > > > > > There're seve

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #15 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #12) > (In reply to Hongtao.liu from comment #10) > > > Sure. > Please find attached the complete patch that enables HF vector modes in > Comment #11. The patch survives

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #16 from Hongtao.liu --- There're already testcases for vec_extract/vec_set/vec_duplicate, but those testcases are written under TARGET_AVX512FP16; I'll make a copy of them and test them w/o avx512fp16.

[Bug target/103554] -mavx generates worse code on scalar code

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 --- Comment #12 from Hongtao.liu --- (In reply to rguent...@suse.de from comment #11) > On Tue, 7 Dec 2021, crazylht at gmail dot com wrote: > > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103554 > > > > --- Comment #10 from Hongtao.liu --

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #17 from Hongtao.liu --- (In reply to Hongtao.liu from comment #16) > There're already testcases for vec_extract/vec_set/vec_duplicate, but those > testcases are written under TARGET_AVX512FP16; I'll make a copy of them and > test th

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #18 from Hongtao.liu --- Codegen for foo1/foo2 is suboptimal under -mavx2; I guess we can have vec_setv16hf_0 and with vpblendw. typedef _Float16 __v16hf __attribute__ ((__vector_size__ (32))); typedef _Float16 __m256h __attribute__

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #19 from Hongtao.liu --- (In reply to Hongtao.liu from comment #17) > (In reply to Hongtao.liu from comment #16) > > There're already testcases for vec_extract/vec_set/vec_duplicate, but those > > testcases are written under TARGET_A

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-08 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #20 from Hongtao.liu --- V2HF/V4HF should also be restricted under AVX512FP16.

[Bug target/103571] ABI: V2HF, V4HF and V8HFmode argument passing issues

2021-12-09 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103571 --- Comment #22 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #21) > (In reply to Hongtao.liu from comment #19) > > (In reply to Hongtao.liu from comment #17) > > > (In reply to Hongtao.liu from comment #16) > > > > There're already t

[Bug tree-optimization/103682] [12 regression] ICE on atomics: gimple check: expected gimple_assign(error_mark), have gimple_nop() in gimple_assign_rhs_code, at gimple.h:2852 since r12-5486-g7df89377a

2021-12-13 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103682 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #3

[Bug tree-optimization/103682] [12 regression] ICE on atomics: gimple check: expected gimple_assign(error_mark), have gimple_nop() in gimple_assign_rhs_code, at gimple.h:2852 since r12-5486-g7df89377a

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103682 --- Comment #5 from Hongtao.liu --- Fixed in GCC12.

[Bug target/92658] x86 lacks vector extend / truncate

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=92658 --- Comment #25 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #5) > Created attachment 47927 [details] > Prototype patch v2 > > A couple of typos fixed. > > Still doesn't vectorize v4qi->v4si, v2qi->v2di, v2hi->v2di and v4qi->v4di.

[Bug target/101846] Improve __builtin_shufflevector emitted code

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101846 --- Comment #9 from Hongtao.liu --- (In reply to Andrew Pinski from comment #7) > With just -mavx512f we produce a bunch of instructions (looking like we went > to scalar mode) while LLVM is able to produce: > foo(short __vector(16)):

[Bug tree-optimization/103462] GCC failed to reduce bit clear in loop.

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103462 --- Comment #5 from Hongtao.liu --- Created attachment 52004 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52004&action=edit Tested patch, wait for gcc13.

[Bug tree-optimization/103462] GCC failed to reduce bit clear in loop.

2021-12-14 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103462 --- Comment #6 from Hongtao.liu --- (In reply to Hongtao.liu from comment #5) > Created attachment 52004 [details] > Tested patch, wait for gcc13. Add error in the patch to see if there's any change in gcc which can be optimized, it turns out t

[Bug target/101796] Miss optimization to optimized (vashl op0, (op1: const_duplicate_vector)) to (ashl op0 op1_inner)

2021-12-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101796 Hongtao.liu changed: What|Removed |Added Status|NEW |RESOLVED Resolution|---
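
The optimization in the bug title replaces a per-lane variable shift whose count vector holds the same constant in every lane with a single uniform shift by that scalar. A minimal sketch with 16-bit lanes follows; the function names are hypothetical, and plain arrays stand in for vector registers.

```c
#include <stdint.h>

/* Per-lane variable shift: every lane carries its own count. */
static void shl_per_lane(const uint16_t *in, const uint16_t *counts,
                         uint16_t *out, int n)
{
    for (int i = 0; i < n; i++)
        /* Shift in 32-bit unsigned to avoid signed overflow, then truncate. */
        out[i] = (uint16_t)((uint32_t)in[i] << counts[i]);
}

/* Uniform shift: one scalar count applies to all lanes. When counts[] is
   a duplicated constant, this computes the same result. */
static void shl_uniform(const uint16_t *in, unsigned count,
                        uint16_t *out, int n)
{
    for (int i = 0; i < n; i++)
        out[i] = (uint16_t)((uint32_t)in[i] << count);
}
```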

[Bug tree-optimization/103194] [12 Regression] ice in optimize_atomic_bit_test_and with __sync_fetch_and_and since r12-5102-gfb161782545224f5

2021-12-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103194 --- Comment #23 from Hongtao.liu --- (In reply to Jakub Jelinek from comment #22) > (In reply to Hongtao.liu from comment #15) > > > Is the behavior well defined for n >= 64? I got > > > > > > foo.c:11:19: warning: left shift count >= width of

[Bug ipa/103734] IPA-CP opportunity for imagick in SPECCPU 2017

2021-12-15 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103734 --- Comment #2 from Hongtao.liu --- (In reply to Tamar Christina from comment #0) > When using --param ipa-cp-eval-threshold=1 --param ipa-cp-unit-growth=20 on > imagick the hot functions MorphologyApply and GetVirtualPixelsFromNexus get > repla

[Bug testsuite/102944] Many gcc.dg/Wstringop-overflow-*.c failures

2021-12-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102944 --- Comment #6 from Hongtao.liu --- (In reply to Martin Sebor from comment #5) > I don't see any of the FAILs or XFAILs listed in comment #0 with cross > compilers for any of the Targets. Can this report be resolved? I think so; now we only ha

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #1 from Hongtao.liu --- kmovw here is zero_extend, and at gimple level it's not redundant in loop. _31 = MEM[(const __m256i_u * {ref-all})n_5]; _30 = MEM[(const __m256i_u * {ref-all})n_5 + 32B]; _28 = VIEW_CONVERT_EXPR<__v16hi

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #2 from Hongtao.liu --- Failed here /* Allow propagations into a loop only for reg-to-reg copies, since replacing one register by another shouldn't increase the cost. */ struct loop *def_loop = def_insn->bb ()->cfg_bb ()->

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #3 from Hongtao.liu --- (In reply to Hongtao.liu from comment #2) > Failed here > > /* Allow propagations into a loop only for reg-to-reg copies, since > replacing one register by another shouldn't increase the cost. */ >

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-16 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #4 from Hongtao.liu --- (In reply to Hongtao.liu from comment #3) > (In reply to Hongtao.liu from comment #2) > > Failed here > > > > /* Allow propagations into a loop only for reg-to-reg copies, since > > replacing one regis

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #10 from Hongtao.liu --- (In reply to Uroš Bizjak from comment #9) > (In reply to Thiago Macieira from comment #0) > > Testcase: > ... > > The assembly for this produces: > > > > vmovdqu16 (%rdi), %ymm1 > > vmo

[Bug target/98648] Failure to optimize out no-op vector operation using andnot

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98648 --- Comment #6 from Hongtao.liu --- Fixed by r12-6071-g19dcecd963295b02b96c8cac57933657dbe3234a

[Bug target/98468] [9 regression] test case gcc.target/powerpc/rlwimi-2.c fails starting with r9-3594

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98468 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #6 f

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #11 from Hongtao.liu --- (In reply to Thiago Macieira from comment #6) > It got worse. Now I'm seeing: > > .L807: > vmovdqu16 (%rsi), %ymm2 > vmovdqu16 32(%rsi), %ymm3 > vpcmpuw $6, %ymm0, %ymm2,

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #12 from Hongtao.liu --- (In reply to Hongtao.liu from comment #11) > (In reply to Thiago Macieira from comment #6) > > It got worse. Now I'm seeing: > > > > .L807: > > vmovdqu16 (%rsi), %ymm2 > > vmovdqu16

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #13 from Hongtao.liu --- Created attachment 52031 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=52031&action=edit untested patch. Attached patch can optimize #c0 to vmovdqu (%rdi), %ymm1 vmovdqu16 32(%r

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-19 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 Hongtao.liu changed: What|Removed |Added Attachment #52031|0 |1 is obsolete|

[Bug target/103750] [i386] GCC schedules KMOV instructions that destroys performance in loop

2021-12-20 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103750 --- Comment #15 from Hongtao.liu --- (In reply to Hongtao.liu from comment #14) > Created attachment 52032 [details] > update patch > > Update patch, Now gcc can generate optimal code > current fix add define_insn_and_splitter for 3 things: 1

[Bug target/102080] [12 Regression] avx512vl related ICE, on firefox-92 gcc ICEs: in expand_insn, at optabs.c:7946 by r12-2679

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102080 --- Comment #3 from Hongtao.liu --- (In reply to H.J. Lu from comment #2) > It is caused by r12-2679. Mine.

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #13 from Hongtao.liu --- fold shufps to VEC_PERM_EXPR, but still 2 shufps are generated. __m128 f (__m128 a, __m128 b) { vector(4) float _3; vector(4) float _5; vector(4) float _6; ;; basic block 2, loop depth 0 ;; pred:

[Bug rtl-optimization/43147] SSE shuffle merge

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43147 --- Comment #20 from Hongtao.liu --- Fixed in GCC12; GCC now generates optimal code. main: .LFB532: .cfi_startproc subq $8, %rsp .cfi_def_cfa_offset 16 movaps .LC0(%rip), %xmm0 call printv

[Bug target/102080] [12 Regression] avx512vl related ICE, on firefox-92 gcc ICEs: in expand_insn, at optabs.c:7946 by r12-2679

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102080 --- Comment #4 from Hongtao.liu --- diff --git a/test.c.032t.ccp1 b/test.c.033t.forwprop1 index 5b18739..c6f0587 100644 --- a/test.c.032t.ccp1 +++ b/test.c.033t.forwprop1 @@ -31,11 +31,12 @@ void EncodedFromDisplay () __m256 __trans_tmp_11;

[Bug target/102080] [12 Regression] avx512vl related ICE, on firefox-92 gcc ICEs: in expand_insn, at optabs.c:7946 by r12-2679

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102080 --- Comment #9 from Hongtao.liu --- (In reply to Andrew Pinski from comment #8) > That is the mask is a vector mode still for these patterns according to the > internals doc. > Rather than the scalar mode you have: > (match_operand: 1 "register_

[Bug middle-end/102080] [12 Regression] avx512vl related ICE, on firefox-92 gcc ICEs: in expand_insn, at optabs.c:7946 by r12-2679

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102080 --- Comment #11 from Hongtao.liu --- Created attachment 51363 --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=51363&action=edit Proposed patch I'm testing this patch.

[Bug target/101472] AVX-512 wrong code for consecutive masked scatters

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101472 --- Comment #5 from Hongtao.liu --- Fixed in GCC12, backport to GCC11 and GCC10.

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #15 from Hongtao.liu --- (In reply to Andrew Pinski from comment #14) > (In reply to Hongtao.liu from comment #13) > > fold shufps to VEC_PERM_EXPR, but still 2 shufps are generated. > > > > __m128 f (__m128 a, __m128 b) > > { > >

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-26 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #16 from Hongtao.liu --- typedef int v4si __attribute__ ((vector_size(16))); v4si f(v4si a, v4si b) { v4si a1 = __builtin_shufflevector (a, a, 2, 3 ,1 ,0); v4si b1 = __builtin_shufflevector (b, a, 2, 3 ,1 ,0); return a1 *
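
The transformation this bug asks for relies on a lane-wise operation commuting with a permutation when both operands are shuffled identically: shuffle(a) * shuffle(b) equals shuffle(a * b), which saves one shuffle. A minimal sketch in plain C follows. Arrays stand in for vectors, unsigned lanes are used so that multiplication cannot overflow with undefined behavior, the permutation {2, 3, 1, 0} mirrors the indices in the snippet above, and the function names are hypothetical.

```c
#include <stdint.h>

static const int PERM[4] = {2, 3, 1, 0};

/* Apply the fixed permutation: out[i] = in[PERM[i]]. */
static void shuffle4(const uint32_t *in, uint32_t *out)
{
    for (int i = 0; i < 4; i++)
        out[i] = in[PERM[i]];
}

/* Two shuffles followed by a lane-wise multiply. */
static void mul_of_shuffles(const uint32_t *a, const uint32_t *b,
                            uint32_t *out)
{
    uint32_t a1[4], b1[4];
    shuffle4(a, a1);
    shuffle4(b, b1);
    for (int i = 0; i < 4; i++)
        out[i] = a1[i] * b1[i];
}

/* One lane-wise multiply followed by a single shuffle. */
static void shuffle_of_mul(const uint32_t *a, const uint32_t *b,
                           uint32_t *out)
{
    uint32_t prod[4];
    for (int i = 0; i < 4; i++)
        prod[i] = a[i] * b[i];
    shuffle4(prod, out);
}
```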

[Bug target/98167] [x86] Failure to optimize operation on indentically shuffled operands into a shuffle of the result of the operation

2021-08-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98167 --- Comment #18 from Hongtao.liu --- (In reply to Andrew Pinski from comment #17) > (In reply to Hongtao.liu from comment #16) > > typedef int v4si __attribute__ ((vector_size(16))); > > > > v4si f(v4si a, v4si b) { > > v4si a1 = __builtin_s

[Bug target/101796] Miss optimization to optimized (vashl op0, (op1: const_duplicate_vector)) to (ashl op0 op1_inner)

2021-08-27 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101796 --- Comment #4 from Hongtao.liu --- (In reply to Andrew Pinski from comment #3) > Combine is able to do the combine but it fails as it does not match: > Trying 10, 9 -> 14: >10: r92:HI=0x3 > 9: r91:V32HI=vec_duplicate(r92:HI) > REG

[Bug target/51838] Inefficient add of 128 bit quantity represented as 64 bit tuple to 128 bit integer.

2021-08-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51838 --- Comment #2 from Hongtao.liu --- (In reply to Andrew Pinski from comment #1) > We do get slightly better now: > xorl %eax, %eax > movq %rdi, %r8 > xorl %edi, %edi > addq %rsi, %rax > adcq
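
For context, the ideal code for this kind of addition is a single add followed by an add-with-carry. A minimal sketch of the same computation in portable C follows, with the 128-bit value held as a pair of 64-bit halves; the struct and function names are hypothetical and are not taken from the bug's testcase.

```c
#include <stdint.h>

/* Hypothetical 128-bit value stored as two 64-bit halves. */
typedef struct {
    uint64_t lo;
    uint64_t hi;
} u128_pair;

/* Add the low halves, then propagate the carry into the high halves.
   Unsigned wraparound makes (r.lo < x.lo) true exactly when the low
   addition carried out. */
static u128_pair add_pair(u128_pair x, u128_pair y)
{
    u128_pair r;
    r.lo = x.lo + y.lo;
    r.hi = x.hi + y.hi + (r.lo < x.lo);
    return r;
}
```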

[Bug rtl-optimization/97756] [9/10/11/12 Regression] Inefficient handling of 128-bit arguments

2021-08-29 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97756 --- Comment #7 from Hongtao.liu --- (In reply to Patrick Palka from comment #3) > Perhaps related to this PR: On x86_64, the following basic wrapper around > int128 addition > > __uint128_t f(__uint128_t x, __uint128_t y) { return x + y; } >

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #1

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #2 from Hongtao.liu --- I successfully reproduced the error related to 32-bit SPARC libgcc, but failed to configure for target mcore; I didn't find any reference in https://gcc.gnu.org/install/specific.html --target=mcore results in ***

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #3 from Hongtao.liu --- static inline void set_rtl (tree t, rtx x) { gcc_checking_assert (!x || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t)) || (use_register_for_decl (t)

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #4 from Hongtao.liu --- > > and it hit REG_P (XEXP (x, 1)), XEXP (x, 1) is invalid for subreg, so > set_rtl here doesn't accept subreg? typo, it hit gcc_assert that if X is not REG, it must be CONCAT or PARALLEL, but here is SUBR

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #5 from Hongtao.liu --- (In reply to Hongtao.liu from comment #4) > > > > and it hit REG_P (XEXP (x, 1)), XEXP (x, 1) is invalid for subreg, so > > set_rtl here doesn't accept subreg? > > typo, it hit gcc_assert that if X is not R

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #6 from Hongtao.liu --- The insn sequences differ as follows. Good one: (insn 5 4 6 (clobber (reg/v:DF 153)) "/scratch/jmyers/glibc/many12/src/gcc/libgcc/libgcc2.c":1948:1 -1 (nil)) (insn 6 5 7 (set (subreg:SI (reg/v:DF 153)

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #7 from Hongtao.liu --- Since we also allow something like (concat:(subreg) (subreg)), should we also allow subreg outside? gcc_checking_assert (!x || !(TREE_CODE (t) == SSA_NAME || is_gimple_reg (t))

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-30 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #8 from Hongtao.liu --- (In reply to Hongtao.liu from comment #7) > Since we also allow something like (concat:(subreg) (subreg)), should we > also allow subreg outside? > >gcc_checking_assert (!x > || !(TRE

[Bug middle-end/102133] [12 Regression] ICE in set_rtl building libgcc __muldc3 for 32-bit SPARC

2021-08-31 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102133 --- Comment #12 from Hongtao.liu --- Fixed in GCC12.

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 Hongtao.liu changed: What|Removed |Added CC||crazylht at gmail dot com --- Comment #3

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #6 from Hongtao.liu --- Reproduced with a simple testcase float foo (long a) { union{long a; float b[2];}c; c.a = a; return c.b[1]; }
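
The testcase above reads the second float-sized element of a union that overlays a 64-bit integer. The sketch below shows that this element is simply the 4 bytes at byte offset 4 of the integer, by comparing the union read against a direct byte copy. It assumes a 4-byte `float` and uses `int64_t` in place of `long` so the layout does not depend on the platform; the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Store the integer through the union, then copy out the raw bits of
   the second float element (bits are compared rather than float values,
   so NaN patterns cause no trouble). */
static uint32_t second_half_via_union(int64_t a)
{
    union { int64_t a; float b[2]; } c;
    uint32_t bits;
    c.a = a;
    memcpy(&bits, &c.b[1], sizeof bits);
    return bits;
}

/* Same bits, taken directly from byte offset sizeof(float) of the integer. */
static uint32_t second_half_via_offset(int64_t a)
{
    uint32_t bits;
    memcpy(&bits, (const unsigned char *)&a + sizeof(float), sizeof bits);
    return bits;
}
```

Which half of the integer these bytes hold (upper or lower 32 bits) depends on the target's endianness.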

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #7 from Hongtao.liu --- (In reply to Hongtao.liu from comment #6) > Reproduced with a simple testcase > > > float > foo (long a) > { > union{long a; > float b[2];}c; > c.a = a; > return c.b[1]; > } (subreg:SF (reg:DI) 4)

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #8 from Hongtao.liu --- (In reply to Hongtao.liu from comment #7) > (In reply to Hongtao.liu from comment #6) > > Reproduced with a simple testcase > > > > > > float > > foo (long a) > > { > > union{long a; > > float b[2];}c;

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #9 from Hongtao.liu --- > > (define_insn "movsf_hardfloat" > [(set (match_operand:SF 0 "nonimmediate_operand" >"=!r, f, v, wa,m, wY, > Z, m, wa, !r,

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #10 from Hongtao.liu --- (In reply to Hongtao.liu from comment #9) > > > > (define_insn "movsf_hardfloat" > > [(set (match_operand:SF 0 "nonimmediate_operand" > > "=!r, f, v, wa,m, wY, >

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #11 from Hongtao.liu --- (In reply to Hongtao.liu from comment #10) > (In reply to Hongtao.liu from comment #9) > > > > > > (define_insn "movsf_hardfloat" > > > [(set (match_operand:SF 0 "nonimmediate_operand" > > >"=!r,

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #12 from Hongtao.liu --- (In reply to Hongtao.liu from comment #10) > (In reply to Hongtao.liu from comment #9) > > > > > > (define_insn "movsf_hardfloat" > > > [(set (match_operand:SF 0 "nonimmediate_operand" > > >"=!r,

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #4 from Hongtao.liu --- Because _tile_loadd is implemented as embedded assembly plus macros, if __AMX_TILE__ is removed, no error will be reported if the user does not use the -mamx option. So this macro is added here, but obviously

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #7 from Hongtao.liu --- (In reply to Thiago Macieira from comment #5) > (In reply to Hongtao.liu from comment #4) > > Because _tile_loadd is implemented as embedded assembly plus macros, if > > __AMX_TILE__ is removed, no error will

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #8 from Hongtao.liu --- (In reply to Thiago Macieira from comment #6) > > I suggest doing as Clang did and make it an intrinsic. > > Or even a __builtin_ia32_markamxtile(); intrinsic, which produces the error > if misused and does a

[Bug target/102166] [i386] AMX intrinsics and macros not defined in C++

2021-09-01 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102166 --- Comment #10 from Hongtao.liu --- > > Anyway, I suggest at a minimum removing the #define check. There's little > harm in having no diagnostic on misuse: misuses are probably going to be > seen when testing. Until GCC is able to generate AMX

[Bug target/102182] New: Runtime error for gcc.dg/torture/fp-int-convert-float16.c

2021-09-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102182 Bug ID: 102182 Summary: Runtime error for gcc.dg/torture/fp-int-convert-float16.c Product: gcc Version: 12.0 Status: UNCONFIRMED Keywords: wrong-code

[Bug target/102182] Runtime error for gcc.dg/torture/fp-int-convert-float16.c

2021-09-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102182 Hongtao.liu changed: What|Removed |Added CC||hjl.tools at gmail dot com --- Comment #1

[Bug target/102182] Runtime error for gcc.dg/torture/fp-int-convert-float16.c

2021-09-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102182 --- Comment #2 from Hongtao.liu --- Reproduced case. #include int main (void) { static volatile unsigned int ivin, ivout; static volatile _Float16 fv1, fv2; ivin = ((unsigned int)1); fv1 = ((unsigned int)1); fv2 = ivin; ivout = fv2;

[Bug target/102182] Runtime error for gcc.dg/torture/fp-int-convert-float16.c

2021-09-02 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102182 --- Comment #3 from Hongtao.liu --- during pass_expand we got (debug_insn 24 23 0 (debug_marker) "test1.c":10:3 -1 (nil)) ;; fv2.1_3 ={v} fv2; (insn 25 24 0 (set (reg:HF 84 [ fv2.1_3 ]) (mem/v/c:HF (symbol_ref:SI ("fv2.1") [flags

[Bug target/102182] Runtime error for gcc.dg/torture/fp-int-convert-float16.c

2021-09-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102182 --- Comment #4 from Hongtao.liu --- After emitting the libcall in convert_to_mode, maybe_emit_unop_insn fails, so all insns are deleted, but `from` has already been overwritten by then; this seems to be a bug. if (icode != CODE_FOR_nothing) {

[Bug target/102186] [12 Regression] Broken bootstrap: soft-fp/half.h:62:1: error: unable to emulate ‘HF’ since r12-3308-ge42d2d2a20f2bb59928bc895ec9f46503a1b5c73

2021-09-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102186 --- Comment #3 from Hongtao.liu --- A patch is posted at https://gcc.gnu.org/pipermail/gcc-patches/2021-September/578746.html

[Bug target/102154] [12 Regression] ICE in extract_insn, at recog.c:2769 since r12-3277-gd2874d905647a1d146dafa60199d440e837adc4d

2021-09-03 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102154 --- Comment #15 from Hongtao.liu --- (In reply to Segher Boessenkool from comment #14) > (In reply to Jonathan Wakely from comment #13) > > Is this also the cause of several libstdc++ FAILs on ppc64le? > > > Yes. > > I have asked for reversio

[Bug target/102211] New: ICE introduced by r12-3277

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102211 Bug ID: 102211 Summary: ICE introduced by r12-3277 Product: gcc Version: 12.0 Status: UNCONFIRMED Severity: normal Priority: P3 Component: target Assi

[Bug target/102211] ICE introduced by r12-3277

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102211 --- Comment #1 from Hongtao.liu --- But it's ok for float foo (float a, long b) { union{float a[2]; long b;}c; c.b = b; return c.a[0]; } foo: fmv.w.x fa0,a0 ret Which means movement between gpr and float reg is allo

[Bug target/102211] ICE introduced by r12-3277

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=102211 --- Comment #2 from Hongtao.liu --- According to *movsi_internal and *movdi_64bit, SImode and DImode can be placed into FP_REGS, but riscv_hard_regno_mode_ok does not allow SImode/DImode to be allocated to FP_REGS; the mismatch here causes t

[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473 --- Comment #8 from Hongtao.liu --- (In reply to Andrew Pinski from comment #7) > The UNSPEC_MASKOP ones are still there. > > PR 93885 is the same issue. void test(void* data, void* data2) { __m128i v = _mm_load_si128((__m128i const*)data);

[Bug target/88473] AVX512: constant folding on mask does not remove unnecessary instructions

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88473 --- Comment #9 from Hongtao.liu --- (In reply to Hongtao.liu from comment #8) > (In reply to Andrew Pinski from comment #7) > > The UNSPEC_MASKOP ones are still there. > > > > PR 93885 is the same issue. > > void test(void* data, void* data2) >

[Bug target/82139] unnecessary movapd with _mm_castsi128_pd to use BLENDPD on __m128i results

2021-09-05 Thread crazylht at gmail dot com via Gcc-bugs
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82139 --- Comment #2 from Hongtao.liu --- (In reply to Andrew Pinski from comment #1) > It is worse on the trunk: > .L2: > movdqu (%rdi), %xmm1 > movdqu (%rdi), %xmm0 > addq$16, %rdi > paddd %xmm3, %xmm1 >
