[gcc r16-91] Accept allones or 0 operand for vcond_mask op1.

2025-04-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f72a2d221539cede358f2487b94bc370c6fc44b5 commit r16-91-gf72a2d221539cede358f2487b94bc370c6fc44b5 Author: liuhongt Date: Sun Mar 30 20:15:41 2025 -0700 Accept allones or 0 operand for vcond_mask op1. Since ix86_expand_sse_movcc will simplify them into a simple

[gcc r16-46] Generate 2 FMA instructions in ix86_expand_swdivsf.

2025-04-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e1098c7b08d9e6018f60dae7a14c5ad621618223 commit r16-46-ge1098c7b08d9e6018f60dae7a14c5ad621618223 Author: hongtao.liu Date: Thu Apr 17 09:07:55 2025 +0200 Generate 2 FMA instructions in ix86_expand_swdivsf. When FMA is available, N-R step can be rewritten with

[gcc r15-9473] Revert documents from r11-344-g0fec3f62b9bfc0

2025-04-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fa58ff249a0e63a721ccb6d770c86523d84a212a commit r15-9473-gfa58ff249a0e63a721ccb6d770c86523d84a212a Author: liuhongt Date: Sun Apr 13 19:40:51 2025 -0700 Revert documents from r11-344-g0fec3f62b9bfc0 gcc/ChangeLog: PR target/108134

[gcc r15-8461] Use ix86_fp_comparison_operator in cbranchbf4 to avoid ICE.

2025-03-19 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:62a6cafd7f55c6e88a9780b91039257572038535 commit r15-8461-g62a6cafd7f55c6e88a9780b91039257572038535 Author: liuhongt Date: Mon Mar 17 22:47:11 2025 -0700 Use ix86_fp_comparison_operator in cbranchbf4 to avoid ICE. *jcc only supports ix86_fp_comparison_operator

[gcc r15-8283] Mark gcc.target/i386/apx-ndd-tls-1b.c as xfail.

2025-03-18 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:be671ec1f30ecd55aaff09048afb2a619018cb8a commit r15-8283-gbe671ec1f30ecd55aaff09048afb2a619018cb8a Author: liuhongt Date: Sun Mar 16 22:28:44 2025 -0700 Mark gcc.target/i386/apx-ndd-tls-1b.c as xfail. It looks like the testcase is fragile, it's supposed to ch

[gcc r15-6940] Fix typo to avoid ICE.

2025-01-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:3872daa5767622d1f8b086050996c85604db7514 commit r15-6940-g3872daa5767622d1f8b086050996c85604db7514 Author: liuhongt Date: Wed Jan 15 19:09:24 2025 -0800 Fix typo to avoid ICE. gcc/ChangeLog: PR target/118489 * config/i386/sse.md (

[gcc r15-6844] Refactor ix86_expand_vecop_qihi2.

2025-01-12 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0e05b793fba2a9bea9f0fbb1f068679f5dadf514 commit r15-6844-g0e05b793fba2a9bea9f0fbb1f068679f5dadf514 Author: liuhongt Date: Wed Jan 8 23:11:17 2025 -0800 Refactor ix86_expand_vecop_qihi2. Since there's regression to use vpermq, and it's manually disabled by

[gcc r15-6097] Fix inaccuracy in cunroll/cunrolli when considering what's innermost loop.

2024-12-10 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ee2f19b0937b5efc0b23c4319cbd4a38b27eac6e commit r15-6097-gee2f19b0937b5efc0b23c4319cbd4a38b27eac6e Author: liuhongt Date: Mon Dec 2 01:54:59 2024 -0800 Fix inaccuracy in cunroll/cunrolli when considering what's innermost loop. r15-919-gef27b91b62c3aa removed

[gcc r12-10832] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:89a27cf6b1354cc80d834d71f7a3aa137d605e94 commit r12-10832-g89a27cf6b1354cc80d834d71f7a3aa137d605e94 Author: liuhongt Date: Thu Nov 21 23:57:38 2024 -0800 Fix uninitialized operands[2] in vec_unpacks_hi_v4sf. It could cause weired spill in RA when register pre

[gcc r13-9216] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:0eb8c19cb45fc004b7039fa22ff9021604d80dbc commit r13-9216-g0eb8c19cb45fc004b7039fa22ff9021604d80dbc Author: liuhongt Date: Thu Nov 21 23:57:38 2024 -0800 Fix uninitialized operands[2] in vec_unpacks_hi_v4sf. It could cause weired spill in RA when register pres

[gcc r14-10979] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4a63cc6de77481878ec31e1e6ac30e22c50b063a commit r14-10979-g4a63cc6de77481878ec31e1e6ac30e22c50b063a Author: liuhongt Date: Thu Nov 21 23:57:38 2024 -0800 Fix uninitialized operands[2] in vec_unpacks_hi_v4sf. It could cause weired spill in RA when register pre

[gcc r15-5639] Fix uninitialized operands[2] in vec_unpacks_hi_v4sf.

2024-11-24 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ba4cf2e296d8d5950c3d356fa6b6efcad00d0189 commit r15-5639-gba4cf2e296d8d5950c3d356fa6b6efcad00d0189 Author: liuhongt Date: Thu Nov 21 23:57:38 2024 -0800 Fix uninitialized operands[2] in vec_unpacks_hi_v4sf. It could cause weired spill in RA when register pres

[gcc r15-5489] Add microarchtecture tunable for pass_align_tight_loops [PR117438]

2024-11-19 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:6350e956d1a74963a62bedabef3d4a1a3f2d4852 commit r15-5489-g6350e956d1a74963a62bedabef3d4a1a3f2d4852 Author: MayShao-oc Date: Thu Nov 7 10:57:02 2024 +0800 Add microarchtecture tunable for pass_align_tight_loops [PR117438] Hi Hongtao: Add m_CASCADELAK, a

[gcc r15-5071] Guard truncate from vector float to vector __bf16 with !flag_rounding_math && HONOR_NANS (BFmode).

2024-11-10 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:de867e8da30bf5e0cb51c3946ec43c3c4778d4a0 commit r15-5071-gde867e8da30bf5e0cb51c3946ec43c3c4778d4a0 Author: liuhongt Date: Wed Nov 6 18:15:42 2024 -0800 Guard truncate from vector float to vector __bf16 with !flag_rounding_math && HONOR_NANS (BFmode). hw inst

[gcc r15-4955] Support vector float_extend from __bf16 to float.

2024-11-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:648bd1fcc6acfc56e08f4ad8146a80910cfacfd7 commit r15-4955-g648bd1fcc6acfc56e08f4ad8146a80910cfacfd7 Author: liuhongt Date: Wed Oct 23 23:51:20 2024 -0700 Support vector float_extend from __bf16 to float. It's supported by vector permutation with zero vector.

[gcc r15-4954] Support vector float_truncate for SF to BF.

2024-11-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a17acf4f25f0ce9b8dce24f25867500a3b093b57 commit r15-4954-ga17acf4f25f0ce9b8dce24f25867500a3b093b57 Author: liuhongt Date: Wed Oct 23 00:51:00 2024 -0700 Support vector float_truncate for SF to BF. Generate native instruction whenever possible, otherwise use v

[gcc r13-9157] Fix ICE due to subreg:us_truncate.

2024-10-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:28ea5a4ec3e9e49439fdb912ef4edeebfdae881d commit r13-9157-g28ea5a4ec3e9e49439fdb912ef4edeebfdae881d Author: liuhongt Date: Tue Oct 29 02:09:39 2024 -0700 Fix ICE due to subreg:us_truncate. Force_operand issues an ICE when input is (subreg:DI (us_truncate:V

[gcc r15-4775] Fix ICE due to subreg:us_truncate.

2024-10-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bc0eeccf27a084461a2d5661e23468350acb43da commit r15-4775-gbc0eeccf27a084461a2d5661e23468350acb43da Author: liuhongt Date: Tue Oct 29 02:09:39 2024 -0700 Fix ICE due to subreg:us_truncate. Force_operand issues an ICE when input is (subreg:DI (us_truncate:V

[gcc r14-10852] Fix ICE due to subreg:us_truncate.

2024-10-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:71a0cf699b6a2dc03abec53aeafab8b70db2bb07 commit r14-10852-g71a0cf699b6a2dc03abec53aeafab8b70db2bb07 Author: liuhongt Date: Tue Oct 29 02:09:39 2024 -0700 Fix ICE due to subreg:us_truncate. Force_operand issues an ICE when input is (subreg:DI (us_truncate:

[gcc r12-10793] Fix ICE due to subreg:us_truncate.

2024-10-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d0a932fb53ccdf5155db90632901c55446b8 commit r12-10793-gd0a932fb53ccdf5155db90632901c55446b8 Author: liuhongt Date: Tue Oct 29 02:09:39 2024 -0700 Fix ICE due to subreg:us_truncate. Force_operand issues an ICE when input is (subreg:DI (us_truncate:

[gcc r12-10784] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ab84a8a4b78990942e006e9f060dc2705f2c6d8f commit r12-10784-gab84a8a4b78990942e006e9f060dc2705f2c6d8f Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240

[gcc r15-4566] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:403e361d5aa620e77c9832578b2409a0fdd79d96 commit r15-4566-g403e361d5aa620e77c9832578b2409a0fdd79d96 Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240

[gcc r13-9145] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2452387468423882c0732e0fad3a83e887574ccc commit r13-9145-g2452387468423882c0732e0fad3a83e887574ccc Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240

[gcc r14-10831] Fix ICE due to isa mismatch for the builtins.

2024-10-23 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b718f6ec1674c0db30f26c65b7a9215e9388dd6c commit r14-10831-gb718f6ec1674c0db30f26c65b7a9215e9388dd6c Author: liuhongt Date: Tue Oct 22 01:54:40 2024 -0700 Fix ICE due to isa mismatch for the builtins. gcc/ChangeLog: PR target/117240

[gcc r15-4560] i386: Optimize EQ/NE comparison between avx512 kmask and -1.

2024-10-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ee7e77e9c121f5a6f27c92b6b24b2abf9cd66a4d commit r15-4560-gee7e77e9c121f5a6f27c92b6b24b2abf9cd66a4d Author: liuhongt Date: Mon Oct 21 02:22:08 2024 -0700 i386: Optimize EQ/NE comparison between avx512 kmask and -1. r15-974-gbf7745f887c765e06f2e75508f263debb60a

[gcc r12-10781] [GCC13/GCC12] Fix testcase.

2024-10-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:45bde60836d04cce4637b74ecadbb0aff90b832f commit r12-10781-g45bde60836d04cce4637b74ecadbb0aff90b832f Author: liuhongt Date: Tue Oct 22 11:24:23 2024 +0800 [GCC13/GCC12] Fix testcase. The optimization relies on other patterns which are only available at GCC

[gcc r13-9142] [GCC13/GCC12] Fix testcase.

2024-10-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b43518a01cbbbafe042b85a48fa09a32948380a commit r13-9142-g8b43518a01cbbbafe042b85a48fa09a32948380a Author: liuhongt Date: Tue Oct 22 11:24:23 2024 +0800 [GCC13/GCC12] Fix testcase. The optimization relies on other patterns which are only available at GCC1

[gcc r12-10778] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:91800a70a2af1349eefc5f3380be2b254b1db395 commit r12-10778-g91800a70a2af1349eefc5f3380be2b254b1db395 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines

[gcc r13-9139] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fca35b417c236e3448bc3666820fd1ba423fe6e9 commit r13-9139-gfca35b417c236e3448bc3666820fd1ba423fe6e9 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines v

[gcc r14-10807] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:79e7e02b7cc578d03eab2b50c029f44409ef8e26 commit r14-10807-g79e7e02b7cc578d03eab2b50c029f44409ef8e26 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines

[gcc r15-4510] Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw"

2024-10-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5259d3927c1c8e3a15b4b844adef59b48c241233 commit r15-4510-g5259d3927c1c8e3a15b4b844adef59b48c241233 Author: liuhongt Date: Wed Oct 16 13:43:48 2024 +0800 Refine splitters related to "combine vpcmpuw + zero_extend to vpcmpuw" r12-6103-g1a7ce8570997eb combines v

[gcc r15-4400] Don't lower vpcmpu to pcmpgt since the latter is for signed comparison.

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:21e2cd65add9070292313f8e12e8731d0aa2c869 commit r15-4400-g21e2cd65add9070292313f8e12e8731d0aa2c869 Author: liuhongt Date: Tue Oct 8 16:18:31 2024 +0800 Don't lower vpcmpu to pcmpgt since the latter is for signed comparison. r15-1737-gb06a108f0fbffe lower AVX5

[gcc r15-4399] Canonicalize (vec_merge (fma: op2 op1 op3) (match_dup 1)) mask) to (vec_merge (fma: op1 op2 op3) (ma

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:edf4db8355dead3413bad64f6a89bae82dabd0ad commit r15-4399-gedf4db8355dead3413bad64f6a89bae82dabd0ad Author: liuhongt Date: Mon Oct 14 13:09:59 2024 +0800 Canonicalize (vec_merge (fma: op2 op1 op3) (match_dup 1)) mask) to (vec_merge (fma: op1 op2 op3) (match_dup 1)) ma

[gcc r15-4398] Canonicalize (vec_merge (fma op2 op1 op3) op1 mask) to (vec_merge (fma op1 op2 op3) op1 mask).

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:330782a1b6cfe881ad884617ffab441aeb1c2b5c commit r15-4398-g330782a1b6cfe881ad884617ffab441aeb1c2b5c Author: liuhongt Date: Mon Oct 14 17:16:13 2024 +0800 Canonicalize (vec_merge (fma op2 op1 op3) op1 mask) to (vec_merge (fma op1 op2 op3) op1 mask). For x86 ma

[gcc r13-9118] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:eecd5f8ce1729a214bf0a1edfdd3ee1cf79be881 commit r13-9118-geecd5f8ce1729a214bf0a1edfdd3ee1cf79be881 Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Inte

[gcc r13-9117] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-16 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e9eadc29c1c57cd7be9ec8de231d8fb9e8ac0c7c commit r13-9117-ge9eadc29c1c57cd7be9ec8de231d8fb9e8ac0c7c Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come from

[gcc r15-4371] Adjust testcase to avoid scan FIX in REG_EQUIV.

2024-10-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a8b4ea1bcc10b5253992f4b932aec6862aef32fa commit r15-4371-ga8b4ea1bcc10b5253992f4b932aec6862aef32fa Author: liuhongt Date: Tue Oct 15 11:17:20 2024 +0800 Adjust testcase to avoid scan FIX in REG_EQUIV. Also add hard_float target to avoid failed on arm-eabi.

[gcc r14-10783] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9b7d5ecbecfbd193899648e411f1a9b2a77471e2 commit r14-10783-g9b7d5ecbecfbd193899648e411f1a9b2a77471e2 Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Int

[gcc r14-10782] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fe0692f689a18c432d6f59f404d4cd020cbebef2 commit r14-10782-gfe0692f689a18c432d6f59f404d4cd020cbebef2 Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come fro

[gcc r15-4234] Add a new tune avx256_avoid_vec_perm for SRF.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9eaecce3d8c1d9349adbf8c2cdaf8d87672ed29c commit r15-4234-g9eaecce3d8c1d9349adbf8c2cdaf8d87672ed29c Author: liuhongt Date: Wed Sep 25 13:11:11 2024 +0800 Add a new tune avx256_avoid_vec_perm for SRF. According to Intel SOM[1], For Crestmont, most 256-bit Inte

[gcc r15-4233] Add new microarchitecture tune for SRF/GRR/CWF.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9c8cea8feb6cd54ef73113a0b74f1df7b60d09dc commit r15-4233-g9c8cea8feb6cd54ef73113a0b74f1df7b60d09dc Author: liuhongt Date: Tue Sep 24 15:53:14 2024 +0800 Add new microarchitecture tune for SRF/GRR/CWF. For Crestmont, 4-operand vex blendv instructions come from

[gcc r15-4225] Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:70c3db511ba14ff5fa68cb41d0714a9fb957ea5d commit r15-4225-g70c3db511ba14ff5fa68cb41d0714a9fb957ea5d Author: liuhongt Date: Mon Mar 25 21:28:14 2024 -0700 Enable vectorization for unknown tripcount in very cheap cost model but disable epilog vectorization. gcc

[gcc r15-4226] Adjust testcase after relax O2 vectorization.

2024-10-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d5d1189c12199db79f6feb5cfcc7e6475c3a4d91 commit r15-4226-gd5d1189c12199db79f6feb5cfcc7e6475c3a4d91 Author: liuhongt Date: Thu Sep 19 13:38:34 2024 +0800 Adjust testcase after relax O2 vectorization. gcc/testsuite/ChangeLog: * gcc.dg/fstack-pr

[gcc r15-3885] Define VECTOR_STORE_FLAG_VALUE

2024-09-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:78eef8919e2f2973ed7750ba66f5726e70614d07 commit r15-3885-g78eef8919e2f2973ed7750ba66f5726e70614d07 Author: liuhongt Date: Mon Sep 23 11:06:04 2024 +0800 Define VECTOR_STORE_FLAG_VALUE gcc/ChangeLog: * config/i386/i386.h (VECTOR_STORE_FLAG_VAL

[gcc r15-3579] Enable tune fuse_move_and_alu for GNR.

2024-09-10 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f80e4ba94e41410219bdcdb1a0f204ea3f148666 commit r15-3579-gf80e4ba94e41410219bdcdb1a0f204ea3f148666 Author: liuhongt Date: Tue Sep 10 15:04:58 2024 +0800 Enable tune fuse_move_and_alu for GNR. According to Intel Software Optimization Manual[1], the Redwood cov

[gcc r15-3558] Don't force_reg operands[3] when it's not const0_rtx.

2024-09-09 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c726a6643125a59e2ba6f992924a2d0098104578 commit r15-3558-gc726a6643125a59e2ba6f992924a2d0098104578 Author: liuhongt Date: Fri Sep 6 15:03:16 2024 +0800 Don't force_reg operands[3] when it's not const0_rtx. It fix the regression by a51f2fc0d80869ab079

[gcc r15-3498] Handle const0_operand for *avx2_pcmp3_1.

2024-09-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a51f2fc0d80869ab079a93cc3858f24a1fd28237 commit r15-3498-ga51f2fc0d80869ab079a93cc3858f24a1fd28237 Author: liuhongt Date: Wed Sep 4 15:39:17 2024 +0800 Handle const0_operand for *avx2_pcmp3_1. *_eq3_1 supports nonimm_or_0_operand for op1 and op2, pass_com

[gcc r12-10694] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:6585b06303d8fd9da907f443fc0da9faed303712 commit r12-10694-g6585b06303d8fd9da907f443fc0da9faed303712 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r13-8999] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5e049ada87842947adaca5c607516396889f64d6 commit r13-8999-g5e049ada87842947adaca5c607516396889f64d6 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r14-10625] Check avx upper register for parallel.

2024-09-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ba9a3f105ea552a22d08f2d54dfdbef16af7c99e commit r14-10625-gba9a3f105ea552a22d08f2d54dfdbef16af7c99e Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r15-3314] Check avx upper register for parallel.

2024-08-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ab214ef734bfc3dcffcf79ff9e1dd651c2b40566 commit r15-3314-gab214ef734bfc3dcffcf79ff9e1dd651c2b40566 Author: liuhongt Date: Thu Aug 29 11:39:20 2024 +0800 Check avx upper register for parallel. For function arguments/return, when it's BLK mode, it's put in a

[gcc r12-10683] Fix testcase failure.

2024-08-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:141d8aa375ea32c05f0d437828e6a76f1a3ea4af commit r12-10683-g141d8aa375ea32c05f0d437828e6a76f1a3ea4af Author: liuhongt Date: Thu Aug 22 14:31:40 2024 +0800 Fix testcase failure. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Use

[gcc r13-8988] Fix testcase failure.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:ea9c508927ec032c6d67a24df59ffa429e4d3d95 commit r13-8988-gea9c508927ec032c6d67a24df59ffa429e4d3d95 Author: liuhongt Date: Thu Aug 22 14:31:40 2024 +0800 Fix testcase failure. gcc/testsuite/ChangeLog: * gcc.target/i386/pieces-memcpy-10.c: Use

[gcc r12-10682] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b4bc34db3f2948e37ad55a09870635e88c54c7d3 commit r12-10682-gb4bc34db3f2948e37ad55a09870635e88c54c7d3 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128

[gcc r13-8987] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:aea374238cec1a1e53fb79575d2f998e16926999 commit r13-8987-gaea374238cec1a1e53fb79575d2f998e16926999 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128_

[gcc r14-10608] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:27dc1533b6dfc49f3912c524db51d6c372a5ac3d commit r14-10608-g27dc1533b6dfc49f3912c524db51d6c372a5ac3d Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128

[gcc r15-3078] Align ix86_{move_max,store_max} with vectorizer.

2024-08-21 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 commit r15-3078-g6ea25c041964bf63014fcf7bb68fb1f5a0a4e123 Author: liuhongt Date: Thu Aug 15 12:54:07 2024 +0800 Align ix86_{move_max,store_max} with vectorizer. When none of mprefer-vector-width, avx256_optimal/avx128_

[gcc r15-3058] Align predicates for operands[1] between mov and *mov_internal.

2024-08-20 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bb42c551905024ea23095a0eb7b58fdbcfbcaef6 commit r15-3058-gbb42c551905024ea23095a0eb7b58fdbcfbcaef6 Author: liuhongt Date: Tue Aug 20 14:41:00 2024 +0800 Align predicates for operands[1] between mov and *mov_internal. > It's not obvious to me why movv16qi req

[gcc r14-10588] Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

2024-08-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4e7735a8d87559bbddfe3a985786996e22241f8d commit r14-10588-g4e7735a8d87559bbddfe3a985786996e22241f8d Author: liuhongt Date: Mon Aug 12 14:35:31 2024 +0800 Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area. gcc/

[gcc r15-2930] Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload.

2024-08-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f7e672da8fc3d416a6d07eb01f3be4400ef94fac commit r15-2930-gf7e672da8fc3d416a6d07eb01f3be4400ef94fac Author: liuhongt Date: Mon Aug 12 18:24:34 2024 +0800 Movement between GENERAL_REGS and SSE_REGS for TImode doesn't need secondary reload. It results in 2 fail

[gcc r15-2906] Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area.

2024-08-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c3c83d22d212a35cb1bfb8727477819463f0dcd8 commit r15-2906-gc3c83d22d212a35cb1bfb8727477819463f0dcd8 Author: liuhongt Date: Mon Aug 12 14:35:31 2024 +0800 Move ix86_align_loops into a separate pass and insert the pass after pass_endbr_and_patchable_area. gcc/C

[gcc r13-8971] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:617562e4e422c7bd282960b14abfffd994445009 commit r13-8971-g617562e4e422c7bd282960b14abfffd994445009 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r12-10668] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:c94738e2462ff46f3013f6270f6a955b749d82b2 commit r12-10668-gc94738e2462ff46f3013f6270f6a955b749d82b2 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r14-10551] Refine constraint "Bk" to define_special_memory_constraint.

2024-08-02 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a295076bee293aa3112c615f9af7a27231816a36 commit r14-10551-ga295076bee293aa3112c615f9af7a27231816a36 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r15-2539] Fix mismatch between constraint and predicate for ashl3_doubleword.

2024-08-01 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:64ca25aec4939aea79bd812b089fbb666ca6f2fd commit r15-2539-g64ca25aec4939aea79bd812b089fbb666ca6f2fd Author: liuhongt Date: Fri Jul 26 09:56:03 2024 +0800 Fix mismatch between constraint and predicate for ashl3_doubleword. (insn 98 94 387 2 (parallel [

[gcc r15-2395] Refine constraint "Bk" to define_special_memory_constraint.

2024-07-29 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:bc1fda00d5f20e2f3e77a50b2822562b6e0040b2 commit r15-2395-gbc1fda00d5f20e2f3e77a50b2822562b6e0040b2 Author: liuhongt Date: Wed Jul 24 11:29:23 2024 +0800 Refine constraint "Bk" to define_special_memory_constraint. For below pattern, RA may still allocate r162

[gcc r15-2217] Relax ix86_hardreg_mov_ok after split1.

2024-07-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a3f03891065cb9691f6e9cebce4d4542deb92a35 commit r15-2217-ga3f03891065cb9691f6e9cebce4d4542deb92a35 Author: liuhongt Date: Mon Jul 22 11:36:59 2024 +0800 Relax ix86_hardreg_mov_ok after split1. ix86_hardreg_mov_ok is added by r11-5066-gbe39636d9f68c4

[gcc r15-2127] Optimize maskstore when mask is 0 or -1 in UNSPEC_MASKMOV

2024-07-17 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:228972b2b7bf50f4776f8ccae0d7c2950827d0f1 commit r15-2127-g228972b2b7bf50f4776f8ccae0d7c2950827d0f1 Author: liuhongt Date: Tue Jul 16 15:29:01 2024 +0800 Optimize maskstore when mask is 0 or -1 in UNSPEC_MASKMOV gcc/ChangeLog: PR target/115843

[gcc r14-10425] x86: Update branch hint for Redwood Cove.

2024-07-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1fff665a51e221a578a92631fc8ea62dd79fa3b6 commit r14-10425-g1fff665a51e221a578a92631fc8ea62dd79fa3b6 Author: H.J. Lu Date: Tue Apr 26 11:08:55 2022 -0700 x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Refe

[gcc r15-2038] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-15 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f27bf48e0204524ead795fe618cd8b1224f72fd4 commit r15-2038-gf27bf48e0204524ead795fe618cd8b1224f72fd4 Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pendi

[gcc r14-10422] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:13bfc385b0baebd22aeabb0d90915f2e9b18febe commit r14-10422-g13bfc385b0baebd22aeabb0d90915f2e9b18febe Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pend

[gcc r13-8913] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:9a1cdaa5e8441394d613f5f3401e7aab21efe8f0 commit r13-8913-g9a1cdaa5e8441394d613f5f3401e7aab21efe8f0 Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pendi

[gcc r12-10617] Fix SSA_NAME leak due to def_stmt is removed before use_stmt.

2024-07-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e1427b39d28f382d21e7a0ea1714b3250e0a6e5d commit r12-10617-ge1427b39d28f382d21e7a0ea1714b3250e0a6e5d Author: liuhongt Date: Fri Jul 12 09:39:23 2024 +0800 Fix SSA_NAME leak due to def_stmt is removed before use_stmt. - _5 = __atomic_fetch_or_8 (&set_work_pend

[gcc r15-1905] Rename __{float,double}_u to __x86_{float,double}_u to avoid pulluting the namespace.

2024-07-08 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:23ab7f632f4f5bae67fb53cf7b18fea7ba7242c4 commit r15-1905-g23ab7f632f4f5bae67fb53cf7b18fea7ba7242c4 Author: liuhongt Date: Mon Jul 8 10:35:35 2024 +0800 Rename __{float,double}_u to __x86_{float,double}_u to avoid pulluting the namespace. I have a build failu

[gcc r15-1888] x86: Update branch hint for Redwood Cove.

2024-07-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:a910c30c7c27cd0f6d2d2694544a09fb11d611b9 commit r15-1888-ga910c30c7c27cd0f6d2d2694544a09fb11d611b9 Author: H.J. Lu Date: Tue Apr 26 11:08:55 2022 -0700 x86: Update branch hint for Redwood Cove. According to Intel® 64 and IA-32 Architectures Optimization Refer

[gcc r15-1836] Use __builtin_cpu_support instead of __get_cpuid_count.

2024-07-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:699087a16591adfdf21228876b6c48dbcd353faa commit r15-1836-g699087a16591adfdf21228876b6c48dbcd353faa Author: liuhongt Date: Thu Jul 4 13:57:32 2024 +0800 Use __builtin_cpu_support instead of __get_cpuid_count. gcc/testsuite/ChangeLog: PR target

[gcc r15-1806] Move runtime check into a separate function and guard it with target ("no-avx")

2024-07-03 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:239ad907b1fc08874042f8bea5f61eaf3ba2877d commit r15-1806-g239ad907b1fc08874042f8bea5f61eaf3ba2877d Author: liuhongt Date: Wed Jul 3 14:47:33 2024 +0800 Move runtime check into a separate function and guard it with target ("no-avx") The patch can avoid SIGILL

[gcc r15-1742] Remove vcond{,u,eq} expanders since they will be obsolete.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:55f80c690c5fa59836646565a9dee2a3f68374a0 commit r15-1742-g55f80c690c5fa59836646565a9dee2a3f68374a0 Author: liuhongt Date: Mon Jun 24 09:19:01 2024 +0800 Remove vcond{,u,eq} expanders since they will be obsolete. gcc/ChangeLog: PR target/11551

[gcc r15-1741] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb commit r15-1741-g2ccdd0f22312a14ac64bf944fdc4f8e7532eb0eb Author: liuhongt Date: Thu Jun 20 12:41:13 2024 +0800 Optimize a < 0 ? -1 : 0 to (signed)a >> 31. Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x
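
The identity named in the subject of r15-1741 can be illustrated with a scalar sketch. This is not code from the commit, which concerns GCC's own code generation; the function names below are hypothetical, and the shift form assumes that right-shifting a negative signed value is an arithmetic shift (implementation-defined in ISO C, documented behaviour in GCC).

```c
#include <assert.h>
#include <stdint.h>

/* Reference form: the conditional select from the commit subject. */
static int32_t sign_mask_select(int32_t a)
{
    return a < 0 ? -1 : 0;
}

/* Shift form: copies the sign bit into every bit position.
   Assumes >> on a negative int32_t sign-extends, as GCC documents. */
static int32_t sign_mask_shift(int32_t a)
{
    return a >> 31;
}

/* Hypothetical helper: asserts that both forms agree for one input. */
static void check_sign_mask(int32_t a)
{
    assert(sign_mask_select(a) == sign_mask_shift(a));
}
```

Calling check_sign_mask on values such as INT32_MIN, -1, 0 and INT32_MAX exercises both sides of the sign boundary.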

[gcc r15-1738] Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:09737d9605521df9232d9990006c44955064f44e commit r15-1738-g09737d9605521df9232d9990006c44955064f44e Author: liuhongt Date: Tue Jun 18 15:52:02 2024 +0800 Match IEEE min/max with UNSPEC_IEEE_{MIN,MAX}. These versions of the min/max patterns implement exactly th

[gcc r15-1740] Adjust testcase for the regressed testcases after obsolete of vcond{,u,eq}.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e94e6ee495d95f29355bbc017214228a5e367638 commit r15-1740-ge94e6ee495d95f29355bbc017214228a5e367638 Author: liuhongt Date: Wed Jun 19 16:05:58 2024 +0800 Adjust testcase for the regressed testcases after obsolete of vcond{,u,eq}. > Richard suggests that we imp

[gcc r15-1739] Add more splitter for mskmov with avx512 comparison.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:3cb204046c0db899750aee9480af4f1953a40ac3 commit r15-1739-g3cb204046c0db899750aee9480af4f1953a40ac3 Author: liuhongt Date: Wed Jun 19 13:12:00 2024 +0800 Add more splitter for mskmov with avx512 comparison. gcc/ChangeLog: PR target/115517

[gcc r15-1737] Lower AVX512 kmask comparison back to AVX2 comparison when op_{true, false} is vector -1/0.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b06a108f0fbffe12493b527224f6e4131a72beac commit r15-1737-gb06a108f0fbffe12493b527224f6e4131a72beac Author: liuhongt Date: Tue Jun 18 14:03:42 2024 +0800 Lower AVX512 kmask comparison back to AVX2 comparison when op_{true,false} is vector -1/0. gcc/ChangeLog

[gcc r15-1736] Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV)

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:2e2dfa0095c3326a0a5fc2ff175918b42eeb044f commit r15-1736-g2e2dfa0095c3326a0a5fc2ff175918b42eeb044f Author: liuhongt Date: Mon Jun 17 17:16:46 2024 +0800 Add more splitters to match (unspec [op1 op2 (gt op3 constm1_operand)] UNSPEC_BLENDV) These define_insn_a

[gcc r15-1735] Enable flate-combine.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e62ea4fb8ffcab06ddd02f26db91b29b7270743f commit r15-1735-ge62ea4fb8ffcab06ddd02f26db91b29b7270743f Author: liuhongt Date: Wed Jun 26 13:52:24 2024 +0800 Enable flate-combine. Move pass_stv2 and pass_rpad after pre_reload pass_late_combine, also define tar

[gcc r15-1734] Extend lshifrtsi3_1_zext to ?k alternative.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8e1fa107a63b2e160b6bf69de4fe163dd3cebd80 commit r15-1734-g8e1fa107a63b2e160b6bf69de4fe163dd3cebd80 Author: liuhongt Date: Wed Jun 26 13:07:31 2024 +0800 Extend lshifrtsi3_1_zext to ?k alternative. late_combine will combine lshift + zero into *lshifrtsi3_1_zex

[gcc r15-1733] Define mask as extern instead of uninitialized local variables.

2024-06-30 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5e1a9f4ccff390ae79a9b9d0d39b325f2b4ea925 commit r15-1733-g5e1a9f4ccff390ae79a9b9d0d39b325f2b4ea925 Author: liuhongt Date: Wed Jun 26 11:17:46 2024 +0800 Define mask as extern instead of uninitialized local variables. The testcases are supposed to scan for vpo

[gcc r15-1673] Fix wrong cost of MEM when addr is a lea.

2024-06-26 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b8153b5417bed02f47354a14ad36100785dfdc47 commit r15-1673-gb8153b5417bed02f47354a14ad36100785dfdc47 Author: liuhongt Date: Mon Jun 24 17:53:22 2024 +0800 Fix wrong cost of MEM when addr is a lea. 416.gamess regressed 4-6% on x86_64 since my r15-882-g1d6199e5f8

[gcc r15-1638] Optimize a < 0 ? -1 : 0 to (signed)a >> 31.

2024-06-25 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:aac00d09859cc5934bd0f7493d537b8430337773 commit r15-1638-gaac00d09859cc5934bd0f7493d537b8430337773 Author: liuhongt Date: Thu Jun 20 12:41:13 2024 +0800 Optimize a < 0 ? -1 : 0 to (signed)a >> 31. Try to optimize x < 0 ? -1 : 0 into (signed) x >> 31 and x

[gcc r15-1563] AVX-512: Pacify -Wshift-overflow=2. [PR115409]

2024-06-22 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:4c957d7ba84d8bbce6e778048f38e92ef71806c8 commit r15-1563-g4c957d7ba84d8bbce6e778048f38e92ef71806c8 Author: Collin Funk Date: Mon Jun 10 06:36:47 2024 + AVX-512: Pacify -Wshift-overflow=2. [PR115409] A shift of 31 on a signed int is undefined behavior. Si
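
To make the undefined-behavior remark concrete: with a 32-bit `int`, `1 << 31` does not fit in the signed type, while the same shift on an unsigned operand is well defined. The helper below is a hypothetical sketch of that general pattern, not the change made in this commit.

```c
#include <stdint.h>

/* Hypothetical helper: returns a mask with only bit n set, for n in 0..31.
   Shifting an unsigned 1 keeps n == 31 well defined, whereas 1 << 31
   would overflow a 32-bit signed int. */
static uint32_t bit_mask(unsigned n)
{
    return UINT32_C(1) << n;
}
```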

[gcc r15-1308] Adjust ix86_rtx_costs for pternlog_operand_p.

2024-06-14 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:d3fae2bea034edb001cd45d1d86c5ceef146899b commit r15-1308-gd3fae2bea034edb001cd45d1d86c5ceef146899b Author: liuhongt Date: Tue Jun 11 21:22:42 2024 +0800 Adjust ix86_rtx_costs for pternlog_operand_p. r15-1100-gec985bc97a0157 improves handling of ternlog instru

[gcc r15-1307] Remove one_if_conv for latest Intel processors.

2024-06-13 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:8b69efd9819f86b973d7a550e987ce455fce6d62 commit r15-1307-g8b69efd9819f86b973d7a550e987ce455fce6d62 Author: liuhongt Date: Mon Jun 3 10:38:19 2024 +0800 Remove one_if_conv for latest Intel processors. The tune is added by PR79390 for SciMark2 on Broadwell.

[gcc r15-1234] Fix ICE due to REGNO of a SUBREG.

2024-06-12 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:f8bf80a4e1682b2238baad8c44939682f96b1fe0 commit r15-1234-gf8bf80a4e1682b2238baad8c44939682f96b1fe0 Author: liuhongt Date: Thu Jun 13 09:53:58 2024 +0800 Fix ICE due to REGNO of a SUBREG. Use reg_or_subregno instead. gcc/ChangeLog: PR

[gcc r15-1191] Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P

2024-06-11 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:1d496d2cd1d5d8751a1637abca89339d6f9ddd3b commit r15-1191-g1d496d2cd1d5d8751a1637abca89339d6f9ddd3b Author: liuhongt Date: Tue Jun 11 10:23:27 2024 +0800 Fix ICE in rtl check due to CONST_WIDE_INT in CONST_VECTOR_DUPLICATE_P The patch add extra check to make s

[gcc r12-10497] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:5d52558a531130675329d72ca5c4713abf5bf885 commit r12-10497-g5d52558a531130675329d72ca5c4713abf5bf885 Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication l

[gcc r13-8825] Disable FMADD in chains for Zen4 and generic

2024-06-07 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:e4f85ea6271a10e13c6874709a05e04ab0508fbf commit r13-8825-ge4f85ea6271a10e13c6874709a05e04ab0508fbf Author: Jan Hubicka Date: Fri Dec 29 23:51:03 2023 +0100 Disable FMADD in chains for Zen4 and generic this patch disables use of FMA in matrix multiplication lo

[gcc r15-1088] Add additional option --param max-completely-peeled-insns=200 for power64*-*-*

2024-06-06 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:b24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 commit r15-1088-gb24f2954dbc13d85e9fb62e05a88e9df21e4d4f4 Author: liuhongt Date: Fri Jun 7 09:29:24 2024 +0800 Add additional option --param max-completely-peeled-insns=200 for power64*-*-* gcc/testsuite/ChangeLog:

[gcc r15-1050] Refine testcase for power10.

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:fcfce55c85f842ed843cbc4aabe744c6a004dead commit r15-1050-gfcfce55c85f842ed843cbc4aabe744c6a004dead Author: liuhongt Date: Thu Jun 6 11:27:53 2024 +0800 Refine testcase for power10. For power10, there're extra 3 REG_EQUIV notes with (fix:SI. to avoid the f

[gcc r15-1048] Adjust rtx_cost for MEM to enable more simplication

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:961dd0d635217c703a38c48903981e0d60962546 commit r15-1048-g961dd0d635217c703a38c48903981e0d60962546 Author: liuhongt Date: Fri Apr 19 10:39:53 2024 +0800 Adjust rtx_cost for MEM to enable more simplication For CONST_VECTOR_DUPLICATE_P in constant_pool, it is j

[gcc r15-1047] Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode.

2024-06-05 Thread hongtao Liu via Gcc-cvs
https://gcc.gnu.org/g:7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 commit r15-1047-g7876cde25cbd2f026a0ae488e5263e72f8e9bfa0 Author: liuhongt Date: Fri Apr 19 10:29:34 2024 +0800 Simplify (AND (ASHIFTRT A imm) mask) to (LSHIFTRT A imm) for vector mode. When mask is (1 << (prec - imm)
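
A scalar sketch of the simplification in this commit title, under stated assumptions: the commit applies to vector modes and its condition on the mask is truncated above, so the example assumes a 32-bit element and a mask that keeps exactly the low `32 - imm` bits, the case in which the identity holds. The function names are hypothetical, and `>>` on a negative signed value is assumed to be an arithmetic shift, as GCC documents.

```c
#include <stdint.h>

/* Arithmetic shift followed by a mask that clears the top imm bits.
   imm must be in 1..31 so that neither shift count reaches 32. */
static uint32_t ashiftrt_then_and(int32_t a, unsigned imm)
{
    uint32_t low_mask = (UINT32_C(1) << (32 - imm)) - 1;
    return (uint32_t)(a >> imm) & low_mask;
}

/* Logical shift: the top imm bits are zero-filled directly. */
static uint32_t lshiftrt(int32_t a, unsigned imm)
{
    return (uint32_t)a >> imm;
}
```

The mask discards exactly the sign-copy bits that the arithmetic shift introduced, so the pair of operations collapses to one logical shift.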
